From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id DD374A034F; Mon, 7 Jun 2021 16:55:56 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 5781A4067E; Mon, 7 Jun 2021 16:55:56 +0200 (CEST) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by mails.dpdk.org (Postfix) with ESMTP id 79BEE40147 for ; Mon, 7 Jun 2021 16:55:54 +0200 (CEST) IronPort-SDR: S2djQiMM+tQYZhLBcWPD5PJfs489VYMqA2JGUuXe43kCstjIyey3jJJS6T3NuN5AHS0ohTmRbQ csZGck9KaMcQ== X-IronPort-AV: E=McAfee;i="6200,9189,10008"; a="204604824" X-IronPort-AV: E=Sophos;i="5.83,255,1616482800"; d="scan'208";a="204604824" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Jun 2021 07:55:48 -0700 IronPort-SDR: u6FYyhAwUsNbB/nR2+CNw7873OimHC0LYwQhpdN++wruXkq1uyyA2C0gZ0wfkw22FwQ531ns1+ QoKiWFYETvuA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,255,1616482800"; d="scan'208";a="551255999" Received: from fmsmsx603.amr.corp.intel.com ([10.18.126.83]) by orsmga004.jf.intel.com with ESMTP; 07 Jun 2021 07:55:48 -0700 Received: from shsmsx604.ccr.corp.intel.com (10.109.6.214) by fmsmsx603.amr.corp.intel.com (10.18.126.83) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2242.4; Mon, 7 Jun 2021 07:55:47 -0700 Received: from shsmsx601.ccr.corp.intel.com (10.109.6.141) by SHSMSX604.ccr.corp.intel.com (10.109.6.214) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2242.4; Mon, 7 Jun 2021 22:55:45 +0800 Received: from shsmsx601.ccr.corp.intel.com ([10.109.6.141]) by SHSMSX601.ccr.corp.intel.com ([10.109.6.141]) with mapi id 15.01.2242.008; Mon, 7 Jun 2021 22:55:45 +0800 From: "Zhang, Qi Z" To: Honnappa Nagarahalli , Joyce Kong , "Xing, Beilei" , Ruifeng Wang CC: "dev@dpdk.org" , nd , nd Thread-Topic: [PATCH v1] net/i40e: remove the SMP barrier in HW scanning func Thread-Index: AQHXWRQSr5j0Y9t8tEik4BdoB++ri6sHBN4w///HsICAAdTTMA== Date: Mon, 7 Jun 2021 14:55:45 +0000 Message-ID: <2cb94e2e1bf74840acaadc389b4745f5@intel.com> References: <20210604073405.14880-1-joyce.kong@arm.com> <561469a10f13450bae9e857f186b0123@intel.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-reaction: no-action dlp-version: 11.5.1.3 dlp-product: dlpe-windows x-originating-ip: [10.239.127.36] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v1] net/i40e: remove the SMP barrier in HW scanning func X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > -----Original Message----- > From: Honnappa Nagarahalli > Sent: Monday, June 7, 2021 2:33 AM > To: Zhang, Qi Z ; Joyce Kong ; > Xing, Beilei ; Ruifeng Wang > Cc: dev@dpdk.org; nd ; Honnappa Nagarahalli > ; nd > Subject: RE: [PATCH v1] net/i40e: remove the SMP barrier in HW scanning > func >=20 > >=20 > > > > > > Add the logic to determine how many DD bits have been set for > > > contiguous packets, for removing the SMP barrier while reading descs. > > > > I didn't understand this. > > The current logic already guarantee the read out DD bits are from > > continue packets, as it read Rx descriptor in a reversed order from the= ring. > Qi, the comments in the code mention that there is a race condition if th= e > descriptors are not read in the reverse order. But, they do not mention w= hat > the race condition is and how it can occur. Appreciate if you could expla= in > that. The Race condition happens between the NIC and CPU, if write and read DD bi= t in the same order, there might be a hole (e.g. 1011) with the reverse re= ad order, we make sure no more "1" after the first "0" as the read address are declared as volatile, compiler will not re-ordered = them. >=20 > On x86, the reads are not re-ordered (though the compiler can re-order). = On > ARM, the reads can get re-ordered and hence the barriers are required. In > order to avoid the barriers, we are trying to process only those descript= ors > whose DD bits are set such that they are contiguous. i.e. if the DD bits = are > 1011, we process only the first descriptor. Ok, I see. thanks for the explanation. At this moment, I may prefer not change the behavior of x86, so compile opt= ion for arm can be added, in future when we observe no performance impact f= or x86 as well, we can consider to remove it, what do you think? >=20 > > So I didn't see the a new logic be added, would you describe more > > clear about the purpose of this patch? > > > > > > > > Signed-off-by: Joyce Kong > > > Reviewed-by: Ruifeng Wang > > > --- > > > drivers/net/i40e/i40e_rxtx.c | 13 ++++++++----- > > > 1 file changed, 8 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/net/i40e/i40e_rxtx.c > > > b/drivers/net/i40e/i40e_rxtx.c index > > > 6c58decec..410a81f30 100644 > > > --- a/drivers/net/i40e/i40e_rxtx.c > > > +++ b/drivers/net/i40e/i40e_rxtx.c > > > @@ -452,7 +452,7 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq) > > > uint16_t pkt_len; > > > uint64_t qword1; > > > uint32_t rx_status; > > > - int32_t s[I40E_LOOK_AHEAD], nb_dd; > > > + int32_t s[I40E_LOOK_AHEAD], var, nb_dd; > > > int32_t i, j, nb_rx =3D 0; > > > uint64_t pkt_flags; > > > uint32_t *ptype_tbl =3D rxq->vsi->adapter->ptype_tbl; @@ -482,11 > > > +482,14 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq) > > > I40E_RXD_QW1_STATUS_SHIFT; > > > } > > > > > > - rte_smp_rmb(); > > > > Any performance gain by removing this? and it is not necessary to be > > combined with below change, right? > > > > > - > > > /* Compute how many status bits were set */ > > > - for (j =3D 0, nb_dd =3D 0; j < I40E_LOOK_AHEAD; j++) > > > - nb_dd +=3D s[j] & (1 << > > I40E_RX_DESC_STATUS_DD_SHIFT); > > > + for (j =3D 0, nb_dd =3D 0; j < I40E_LOOK_AHEAD; j++) { > > > + var =3D s[j] & (1 << I40E_RX_DESC_STATUS_DD_SHIFT); > > > + if (var) > > > + nb_dd +=3D 1; > > > + else > > > + break; > > > + } > > > > > > nb_rx +=3D nb_dd; > > > > > > -- > > > 2.17.1