From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yw0-f170.google.com (mail-yw0-f170.google.com [209.85.161.170]) by dpdk.org (Postfix) with ESMTP id 4C71FFFA for ; Sat, 4 Feb 2017 04:37:27 +0100 (CET) Received: by mail-yw0-f170.google.com with SMTP id u68so23470431ywg.0 for ; Fri, 03 Feb 2017 19:37:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=42SZd0VljKLHnOVtiFiGf9XagaYdAvxY2pguVkzQZCs=; b=MXQdNS/42Z8XFLQODo1E9kQC4iR3ROaF5m2MrFconz1MwIhmLXY2svUy47rkcylGw0 nznklvYfDpf2wME05weByMwzTfiT/PzPgs4TCEp7drMTeSxINsi7DweUf5MhuS6rGGQn 5z69sAAmT7Z4bGJ/SvWoE+yn57r7YV6Kn+kTA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=42SZd0VljKLHnOVtiFiGf9XagaYdAvxY2pguVkzQZCs=; b=uaqshT/nuoOGEHA+HyF22HhTAMzafH6B6cb3Z18XzgZxJBOgWtDAFtN2VVADpYTeM7 wzYp1bQYBNKWzLh2H1vqKE7h7deEm0pdSk+ez2g6c/sSbSi+YvCSXI9yXFsfBTSKCcDx t9uWtGYaDm7rd/qcfOlWECb0WbeRH+SMscLYSQuTgDPDcoMUOQJjKCKDhB/mytlkvxma 2p89JJO2CGt+IP6rUUVj9X6qDpcNem7091WEnHhfmuOFXKZWDZIis2FDp6hVcZ6/C49K HcQZj8oi+il8UTOqt/2sB3zKJXWiwJNjsNRBCwd5g83xzsHhfsUHsD4wkKCx/rLenx+n 0Gjg== X-Gm-Message-State: AIkVDXLLlKjz1fcxL7q1NoTtTJtpz0GeFuYxTYK32MGNeRWQ4KHrtKlKbEYj4HHibJXxGZwFax16h2Vv7rqSP3+O X-Received: by 10.129.122.73 with SMTP id v70mr205023ywc.89.1486179446588; Fri, 03 Feb 2017 19:37:26 -0800 (PST) MIME-Version: 1.0 Received: by 10.37.200.4 with HTTP; Fri, 3 Feb 2017 19:37:26 -0800 (PST) In-Reply-To: <2601191342CEEE43887BDE71AB9772583F1109B4@irsmsx105.ger.corp.intel.com> References: <1482127758-4904-1-git-send-email-jianbo.liu@linaro.org> <2601191342CEEE43887BDE71AB9772583F10FF87@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772583F1109B4@irsmsx105.ger.corp.intel.com> From: Jianbo Liu Date: Sat, 4 Feb 2017 11:37:26 +0800 Message-ID: To: "Ananyev, Konstantin" Cc: "dev@dpdk.org" , "Zhang, Helin" , "jerin.jacob@caviumnetworks.com" Content-Type: text/plain; charset=UTF-8 Subject: Re: [dpdk-dev] [PATCH 1/2] net/ixgbe: calculate the correct number of received packets in bulk alloc function X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Feb 2017 03:37:28 -0000 On 3 February 2017 at 19:38, Ananyev, Konstantin wrote: > > >> -----Original Message----- >> From: Jianbo Liu [mailto:jianbo.liu@linaro.org] >> Sent: Friday, February 3, 2017 6:22 AM >> To: Ananyev, Konstantin >> Cc: dev@dpdk.org; Zhang, Helin ; jerin.jacob@caviumnetworks.com >> Subject: Re: [PATCH 1/2] net/ixgbe: calculate the correct number of received packets in bulk alloc function >> >> On 2 February 2017 at 00:19, Ananyev, Konstantin >> wrote: >> > Hi, >> > >> >> -----Original Message----- >> >> From: Jianbo Liu [mailto:jianbo.liu@linaro.org] >> >> Sent: Monday, December 19, 2016 6:09 AM >> >> To: dev@dpdk.org; Zhang, Helin ; Ananyev, Konstantin ; >> >> jerin.jacob@caviumnetworks.com >> >> Cc: Jianbo Liu >> >> Subject: [PATCH 1/2] net/ixgbe: calculate the correct number of received packets in bulk alloc function >> >> >> >> To get better performance, Rx bulk alloc recv function will scan 8 descriptors >> >> in one time, but the statuses are not consistent on ARM platform because >> >> the memory allocated for Rx descriptors is cacheable hugepages. >> >> This patch is to calculate the number of received packets by scanning DD bit >> >> sequentially, and stops when meeting the first packet with DD bit unset. >> >> >> >> Signed-off-by: Jianbo Liu >> >> --- >> >> drivers/net/ixgbe/ixgbe_rxtx.c | 12 ++++++++---- >> >> 1 file changed, 8 insertions(+), 4 deletions(-) >> >> >> >> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c >> >> index b2d9f45..2866bdb 100644 >> >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c >> >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c >> >> @@ -1402,17 +1402,21 @@ ixgbe_rx_scan_hw_ring(struct ixgbe_rx_queue *rxq) >> >> for (i = 0; i < RTE_PMD_IXGBE_RX_MAX_BURST; >> >> i += LOOK_AHEAD, rxdp += LOOK_AHEAD, rxep += LOOK_AHEAD) { >> >> /* Read desc statuses backwards to avoid race condition */ >> >> - for (j = LOOK_AHEAD-1; j >= 0; --j) >> >> + for (j = LOOK_AHEAD - 1; j >= 0; --j) { >> >> s[j] = rte_le_to_cpu_32(rxdp[j].wb.upper.status_error); >> >> - >> >> - for (j = LOOK_AHEAD - 1; j >= 0; --j) >> >> pkt_info[j] = rte_le_to_cpu_32(rxdp[j].wb.lower. >> >> lo_dword.data); >> >> + } >> >> + >> >> + rte_smp_rmb(); >> > >> > If reads can be reordered, shouldn't we fill pkt_info[] after smp_rmb() here? >> >> The barrier is to forbid the reordering from the following readings, >> which will count the number of actual received packets. > > What I meant is that if you'll keep reading from both rxdp[].wb.lower and rxdp[].wb.upper > before rmb, then nothing would prevent cpu from reorder these reads in any way it likes > (if we are talking about cpus with read reordering allowed), right? > So it can end up with the following order: > > rxdp[N].wb.lower > rxdp[N].wb.upper > > or even: > > rxdp[N-1].wb.lower > rxdp[N].wb.lower > rxdp[N-1].wb.upper > rxdp[N].wb.upper > > In such cases pkt_info[] may contain invalid data. Yes, it's possible. I'll send v2. Thanks! > >> And as wb.uper and wb.lower of one descriptor are in the same >> cacheline, could it be better to read them at the same time?. > > It could be, but I think for the sake of data integrity we have to make sure that > cpu would never read any other RXD field before wb.upper. status_error, see above. > > BTW, the following code might re-read both wb.upper and wb.lower anyway. > So I don't think you'll save many cycles here anyway. > >> >> > As another nit - with rmb() in and because you are looking the first gap in s[] now, >> > no need to read TXDs in backward order. >> >> Reading backward is just to keep as it is for x86 platform. > > With the change you introducing, I don't think it is necessary any more. > > Konstantin > >> >> > How it looks to me (as a suggestion): >> > >> > for (j = 0; j != LOOK_AHEAD; j++) >> > s[j] = rte_le_to_cpu_32(rxdp[j].wb.upper.status_error); >> > >> > rte_smp_rmb(); >> > >> > for (j = 0; j < LOOK_AHEAD && (s[j] & IXGBE_RXDADV_STAT_DD) != 0; j++) >> > ; >> > >> > for (j = 0; j < nb_dd; ++j) { >> > pkt_info[j] = rte_le_to_cpu_32(rxdp[j].wb.lower.lo_dword.data); >> > .... >> > >> > Konstantin >> > >> > >> >> >> >> /* Compute how many status bits were set */ >> >> nb_dd = 0; >> >> for (j = 0; j < LOOK_AHEAD; ++j) >> >> - nb_dd += s[j] & IXGBE_RXDADV_STAT_DD; >> >> + if (s[j] & IXGBE_RXDADV_STAT_DD) >> >> + ++nb_dd; >> >> + else >> >> + break; >> >> >> >> nb_rx += nb_dd; >> >> >> >> -- >> >> 2.4.11 >> >