From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Ananyev, Konstantin"
To: "Richardson, Bruce"
CC: Yongseok Koh, Olivier Matz, "dev@dpdk.org"
Date: Wed, 14 Feb 2018 12:35:19 +0000
Message-ID: <2601191342CEEE43887BDE71AB97725890572EF7@irsmsx105.ger.corp.intel.com>
References: <97910E4F-11F5-4BDB-A460-2656B88EA87D@mellanox.com>
 <2601191342CEEE43887BDE71AB97725890572EA2@irsmsx105.ger.corp.intel.com>
 <2601191342CEEE43887BDE71AB97725890572EC6@irsmsx105.ger.corp.intel.com>
 <20180214121157.GA3116@bricha3-MOBL3.ger.corp.intel.com>
In-Reply-To: <20180214121157.GA3116@bricha3-MOBL3.ger.corp.intel.com>
Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Richardson, Bruce
> Sent: Wednesday, February 14, 2018 12:12 PM
> To: Ananyev, Konstantin
> Cc: Yongseok Koh; Olivier Matz; dev@dpdk.org
> Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
>
> On Wed, Feb 14, 2018 at 12:03:55PM +0000, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > Sent: Wednesday, February 14, 2018 11:48 AM
> > > To: Yongseok Koh; Olivier Matz
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> > >
> > > Hi Yongseok,
> > >
> > > > > On Feb 13, 2018, at 2:45 PM, Yongseok Koh wrote:
> > > > >
> > > > > Hi Olivier
> > > > >
> > > > > I'm wondering why rte_pktmbuf_prefree_seg() checks m->next instead of
> > > > > m->nb_segs? As 'next' is in the 2nd cacheline, checking nb_segs seems beneficial
> > > > > in the cases where almost all mbufs have a single segment.
> > > > >
> > > > > A customer reported a high rate of cache misses in the code and I thought the
> > > > > following patch could be helpful. I haven't had them try it yet but just wanted
> > > > > to hear from you.
> > > > >
> > > > > I'd appreciate it if you could review this idea.
> > > > >
> > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > > index 62740254d..96edbcb9e 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > @@ -1398,7 +1398,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > >  		if (RTE_MBUF_INDIRECT(m))
> > > > >  			rte_pktmbuf_detach(m);
> > > > >
> > > > > -		if (m->next != NULL) {
> > > > > +		if (m->nb_segs > 1) {
> > > > >  			m->next = NULL;
> > > > >  			m->nb_segs = 1;
> > > > >  		}
> > > > > @@ -1410,7 +1410,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > >  		if (RTE_MBUF_INDIRECT(m))
> > > > >  			rte_pktmbuf_detach(m);
> > > > >
> > > > > -		if (m->next != NULL) {
> > > > > +		if (m->nb_segs > 1) {
> > > > >  			m->next = NULL;
> > > > >  			m->nb_segs = 1;
> > > > >  		}
> > > >
> > > > Well, m->pool in the 2nd cacheline has to be accessed anyway in order to put it back to the mempool.
> > > > It looks like the cache miss is unavoidable.
> > >
> > > As a thought: in theory a PMD can store the pool pointer together with each mbuf it has to free,
> > > then it could be something like:
> > >
> > > if (rte_pktmbuf_prefree_seg(m[x]) != NULL)
> > > 	rte_mempool_put(pool[x], m[x]);
> > >
> > > Then what you suggested above might help.
> >
> > After another thought - we have to check m->next, not m->nb_segs.
> > There could be situations where nb_segs==1, but m->next != NULL
> > (the 2nd segment of a 3-segment packet, for example).
> > So probably we have to keep it as it is.
> > Sorry for the noise
> > Konstantin
>
> It's still worth considering as an option.
> We could check nb_segs for
> the first segment of a packet and thereafter iterate using the next
> pointer.

In the multi-seg case the PMD frees segments (not packets).
It could happen that the first segment has already been freed while the second
has not.

> It means that your idea of storing the pool pointer for each
> mbuf becomes useful for single-segment packets.

But then we'll have to support 2 different flavors of prefree_seg().
An alternative would be to change the multi-seg TX path in all PMDs so that when
the first segment is about to be freed we update nb_segs for the second, and so on.
Both options seem like too much hassle.
Konstantin