From: Arvind Narayanan
Date: Tue, 11 Sep 2018 13:39:24 -0500
To: stephen@networkplumber.org
Cc: keith.wiles@intel.com, users@dpdk.org
Subject: Re: [dpdk-users] How to use software prefetching for custom structures to increase throughput on the fast path
In-Reply-To: <20180911110744.7ef55fc2@xeon-e3>

Stephen, thanks! That is it! Not sure if there is any workaround.

So, essentially, what I am doing is this: core 0 gets a burst of
my_packet(s) from its pre-allocated mempool and then (bulk) enqueues them
into an rte_ring. Core 1 then (bulk) dequeues from this ring, and when it
accesses the data pointed to by a ring element (i.e. my_packet->tag1), the
memory access latency issue shows up. I cannot issue the prefetch any
earlier. Is there any clever workaround (or hack) to overcome this issue,
other than using the same core for all the functions? For example, can I
prefetch the packets on core 0 into core 1's cache (could be a dumb
question!)?

Thanks,
Arvind

On Tue, Sep 11, 2018 at 1:07 PM Stephen Hemminger <stephen@networkplumber.org> wrote:

> On Tue, 11 Sep 2018 12:18:42 -0500
> Arvind Narayanan wrote:
>
> > If I don't do any processing, I easily get 10G. It is only when I access
> > the tag that the throughput drops.
> > What confuses me is that if I use the following snippet, it works at
> > line rate.
> >
> > ```
> > int temp_key = 1; // declared outside of the for loop
> >
> > for (i = 0; i < pkt_count; i++) {
> >     if (rte_hash_lookup_data(rx_table, &(temp_key), (void **)&val[i]) < 0) {
> >     }
> > }
> > ```
> >
> > But as soon as I replace `temp_key` with `my_packet->tag1`, I experience
> > a fall in throughput (which in a way confirms the issue is due to cache
> > misses).
>
> Your packet data is not in cache.
> Doing prefetch can help but it is very timing sensitive. If prefetch is done
> before data is available it won't help. And if prefetch is done just before
> data is used then there aren't enough cycles to get it from memory to the
> cache.
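
To illustrate the timing point above, here is a minimal sketch of prefetching a few elements ahead within one dequeued burst, so that the fetch for a later packet overlaps with the work on the current one. It does not answer the cross-core question; it only hides latency for elements further along in the burst. Everything except the `tag1` field name is an assumption made for illustration: the `struct my_packet` layout, `PREFETCH_OFFSET`, `lookup_stub()` and `process_burst()` are hypothetical, and `__builtin_prefetch` (a GCC/Clang builtin) stands in for DPDK's `rte_prefetch0()` so the snippet compiles without DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; only the tag1 field name comes from the thread. */
struct my_packet {
    uint32_t tag1;
    uint8_t  payload[60];
};

/* Assumed tuning knob: how many elements ahead of the current one to prefetch. */
#define PREFETCH_OFFSET 4

/* Placeholder for the real per-packet work (e.g. a hash lookup keyed on tag1). */
static inline uint64_t lookup_stub(uint32_t tag)
{
    return tag;
}

/* Process a burst of n packet pointers, prefetching PREFETCH_OFFSET ahead. */
static uint64_t process_burst(struct my_packet **pkts, size_t n)
{
    uint64_t acc = 0;
    size_t i;

    assert(pkts != NULL || n == 0);

    /* Warm-up: request the first few packets before touching any of them. */
    for (i = 0; i < PREFETCH_OFFSET && i < n; i++)
        __builtin_prefetch(pkts[i], 0, 3);

    for (i = 0; i < n; i++) {
        /* Request a later packet while working on the current one. */
        if (i + PREFETCH_OFFSET < n)
            __builtin_prefetch(pkts[i + PREFETCH_OFFSET], 0, 3);

        acc += lookup_stub(pkts[i]->tag1);
    }
    return acc;
}
```

The right value for `PREFETCH_OFFSET` depends on how long the per-packet work takes relative to memory latency, so it would need to be measured rather than guessed. The first few packets of each burst still pay the full miss cost, since nothing useful runs between their prefetch and their first access.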