From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id 8B7FAB6D for ; Wed, 8 Feb 2017 20:53:16 +0100 (CET) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP; 08 Feb 2017 11:53:15 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.35,348,1484035200"; d="scan'208";a="818604453" Received: from irsmsx104.ger.corp.intel.com ([163.33.3.159]) by FMSMGA003.fm.intel.com with ESMTP; 08 Feb 2017 11:53:14 -0800 Received: from irsmsx156.ger.corp.intel.com (10.108.20.68) by IRSMSX104.ger.corp.intel.com (163.33.3.159) with Microsoft SMTP Server (TLS) id 14.3.248.2; Wed, 8 Feb 2017 19:53:13 +0000 Received: from irsmsx105.ger.corp.intel.com ([169.254.7.38]) by IRSMSX156.ger.corp.intel.com ([169.254.3.104]) with mapi id 14.03.0248.002; Wed, 8 Feb 2017 19:53:13 +0000 From: "Ananyev, Konstantin" To: "Ananyev, Konstantin" , "Yigit, Ferruh" , Jianbo Liu , "dev@dpdk.org" , "Zhang, Helin" , "jerin.jacob@caviumnetworks.com" Thread-Topic: [dpdk-dev] [PATCH v2 1/2] net/ixgbe: calculate the correct number of received packets in bulk alloc function Thread-Index: AQHSfspPXOE8A/6OI0iGKLtZsKL0AqFY1rxggAaWz4CAAAlogIAADIkQ Date: Wed, 8 Feb 2017 19:53:13 +0000 Message-ID: <2601191342CEEE43887BDE71AB9772583F114FBC@irsmsx105.ger.corp.intel.com> References: <1482127758-4904-1-git-send-email-jianbo.liu@linaro.org> <1486201024-32656-1-git-send-email-jianbo.liu@linaro.org> <2601191342CEEE43887BDE71AB9772583F110DC7@irsmsx105.ger.corp.intel.com> <1de7e1b9-cdf9-d53e-132b-30be97883af8@intel.com> <2601191342CEEE43887BDE71AB9772583F114EDA@irsmsx105.ger.corp.intel.com> In-Reply-To: <2601191342CEEE43887BDE71AB9772583F114EDA@irsmsx105.ger.corp.intel.com> Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [163.33.239.182] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v2 1/2] net/ixgbe: calculate the correct number of received packets in bulk alloc function X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Feb 2017 19:53:17 -0000 > -----Original Message----- > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin > Sent: Wednesday, February 8, 2017 6:54 PM > To: Yigit, Ferruh ; Jianbo Liu ; dev@dpdk.org; Zhang, Helin ; > jerin.jacob@caviumnetworks.com > Subject: Re: [dpdk-dev] [PATCH v2 1/2] net/ixgbe: calculate the correct n= umber of received packets in bulk alloc function >=20 > Hi Ferruh, >=20 > > > > On 2/4/2017 1:26 PM, Ananyev, Konstantin wrote: > > >> > > >> To get better performance, Rx bulk alloc recv function will scan 8 d= escs > > >> in one time, but the statuses are not consistent on ARM platform bec= ause > > >> the memory allocated for Rx descriptors is cacheable hugepages. > > >> This patch is to calculate the number of received packets by scan DD= bit > > >> sequentially, and stops when meeting the first packet with DD bit un= set. > > >> > > >> Signed-off-by: Jianbo Liu > > >> --- > > >> drivers/net/ixgbe/ixgbe_rxtx.c | 16 +++++++++------- > > >> 1 file changed, 9 insertions(+), 7 deletions(-) > > >> > > >> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgb= e_rxtx.c > > >> index 36f1c02..613890e 100644 > > >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c > > >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c > > >> @@ -1460,17 +1460,19 @@ static inline int __attribute__((always_inli= ne)) > > >> for (i =3D 0; i < RTE_PMD_IXGBE_RX_MAX_BURST; > > >> i +=3D LOOK_AHEAD, rxdp +=3D LOOK_AHEAD, rxep +=3D LOOK_AHEAD= ) { > > >> /* Read desc statuses backwards to avoid race condition */ > > >> - for (j =3D LOOK_AHEAD-1; j >=3D 0; --j) > > >> + for (j =3D 0; j < LOOK_AHEAD; j++) > > >> s[j] =3D rte_le_to_cpu_32(rxdp[j].wb.upper.status_error); > > >> > > >> - for (j =3D LOOK_AHEAD - 1; j >=3D 0; --j) > > >> - pkt_info[j] =3D rte_le_to_cpu_32(rxdp[j].wb.lower. > > >> - lo_dword.data); > > >> + rte_smp_rmb(); > > >> > > >> /* Compute how many status bits were set */ > > >> - nb_dd =3D 0; > > >> - for (j =3D 0; j < LOOK_AHEAD; ++j) > > >> - nb_dd +=3D s[j] & IXGBE_RXDADV_STAT_DD; > > >> + for (nb_dd =3D 0; nb_dd < LOOK_AHEAD && > > >> + (s[nb_dd] & IXGBE_RXDADV_STAT_DD); nb_dd++) > > >> + ; > > >> + > > >> + for (j =3D 0; j < nb_dd; j++) > > >> + pkt_info[j] =3D rte_le_to_cpu_32(rxdp[j].wb.lower. > > >> + lo_dword.data); > > >> > > >> nb_rx +=3D nb_dd; > > >> > > >> -- > > > > > > Acked-by: Konstantin Ananyev > > > > Hi Konstantin, > > > > Is the ack valid for v3 and both patches? >=20 > No, I didn't look into the second one in details. > It is ARM specific, and I left it for people who are more familiar with A= RM then me :) > Konstantin Actually, I had a quick look after your mail. + /* A.1 load 1 pkts desc */ + descs[0] =3D vld1q_u64((uint64_t *)(rxdp)); + rte_smp_rmb(); /* B.2 copy 2 mbuf point into rx_pkts */ vst1q_u64((uint64_t *)&rx_pkts[pos], mbp1); @@ -271,10 +270,11 @@=20 /* B.1 load 1 mbuf point */ mbp2 =3D vld1q_u64((uint64_t *)&sw_ring[pos + 2]); =20 - descs[2] =3D vld1q_u64((uint64_t *)(rxdp + 2)); - /* B.1 load 2 mbuf point */ descs[1] =3D vld1q_u64((uint64_t *)(rxdp + 1)); - descs[0] =3D vld1q_u64((uint64_t *)(rxdp)); + + /* A.1 load 2 pkts descs */ + descs[2] =3D vld1q_u64((uint64_t *)(rxdp + 2)); + descs[3] =3D vld1q_u64((uint64_t *)(rxdp + 3)); Assuming that on all ARM-NEON platforms 16B reads are atomic, I think there is no need for smp_rmb() after the desc[0] read. What looks more appropriate to me: descs[0] =3D vld1q_u64((uint64_t *)(rxdp)); descs[1] =3D vld1q_u64((uint64_t *)(rxdp + 1)); descs[2] =3D vld1q_u64((uint64_t *)(rxdp + 2)); descs[3] =3D vld1q_u64((uint64_t *)(rxdp + 3)); rte_smp_rmb(); ... But, as I said would be good if some ARM guys have a look here. Konstantin >=20 > > > > Thanks, > > ferruh > > > > > > > >> 1.8.3.1 > > >