From: "Ananyev, Konstantin"
To: Jerin Jacob
Cc: "dev@dpdk.org"
Date: Tue, 8 Dec 2015 16:07:46 +0000
Message-ID: <2601191342CEEE43887BDE71AB97725836AD1BDB@irsmsx105.ger.corp.intel.com>
In-Reply-To: <20151208124527.GA18192@localhost.localdomain>
Subject: Re: [dpdk-dev] [PATCH 1/2] mbuf: fix performance/cache resource issue with 128-byte cache line targets

> 
> Hi Konstantin,
> 
> > Hi Jerin,
> >
> > > -----Original Message-----
> > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > Sent: Sunday, December 06, 2015 3:59 PM
> > > To: dev@dpdk.org
> > > Cc: thomas.monjalon@6wind.com; Richardson, Bruce; olivier.matz@6wind.com; Dumitrescu, Cristian; Ananyev, Konstantin; Jerin Jacob
> > > Subject: [dpdk-dev] [PATCH 1/2] mbuf: fix performance/cache resource issue with 128-byte cache line targets
> > >
> > > No need to split the mbuf structure into two cache lines on targets with a 128-byte
> > > cache line size, as it can fit on a single 128-byte cache line.
> > >
> > > Signed-off-by: Jerin Jacob
> > > ---
> > >  app/test/test_mbuf.c                                           | 4 ++++
> > >  lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h | 4 ++++
> > >  lib/librte_mbuf/rte_mbuf.h                                     | 2 ++
> > >  3 files changed, 10 insertions(+)
> > >
> > > diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
> > > index b32bef6..5e21075 100644
> > > --- a/app/test/test_mbuf.c
> > > +++ b/app/test/test_mbuf.c
> > > @@ -930,7 +930,11 @@ test_failing_mbuf_sanity_check(void)
> > >  static int
> > >  test_mbuf(void)
> > >  {
> > > +#if RTE_CACHE_LINE_SIZE == 64
> > >  	RTE_BUILD_BUG_ON(sizeof(struct rte_mbuf) != RTE_CACHE_LINE_SIZE * 2);
> > > +#elif RTE_CACHE_LINE_SIZE == 128
> > > +	RTE_BUILD_BUG_ON(sizeof(struct rte_mbuf) != RTE_CACHE_LINE_SIZE);
> > > +#endif
> > >
> > >  	/* create pktmbuf pool if it does not exist */
> > >  	if (pktmbuf_pool == NULL) {
> > > diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > index bd1cc09..e724af7 100644
> > > --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > @@ -121,8 +121,12 @@ struct rte_kni_mbuf {
> > >  	uint32_t pkt_len;       /**< Total pkt len: sum of all segment data_len. */
> > >  	uint16_t data_len;      /**< Amount of data in segment buffer. */
> > >
> > > +#if RTE_CACHE_LINE_SIZE == 64
> > >  	/* fields on second cache line */
> > >  	char pad3[8] __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
> > > +#elif RTE_CACHE_LINE_SIZE == 128
> > > +	char pad3[24];
> > > +#endif
> > >  	void *pool;
> > >  	void *next;
> > >  };
> > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > index f234ac9..0bf55e0 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -813,8 +813,10 @@ struct rte_mbuf {
> > >
> > >  	uint16_t vlan_tci_outer;  /**< Outer VLAN Tag Control Identifier (CPU order) */
> > >
> > > +#if RTE_CACHE_LINE_SIZE == 64
> > >  	/* second cache line - fields only used in slow path or on TX */
> > >  	MARKER cacheline1 __rte_cache_aligned;
> > > +#endif
> >
> > I suppose you'll need to keep the same space reserved for the first 64B even on systems with a 128B cache line.
> > Otherwise we can end up with a different mbuf format on systems with a 128B cache line.
> 
> Just to understand: is there any issue with the mbuf format being different
> across systems? I think we are not sending the mbuf over the wire
> or sharing it with a different system, right?

No, we don't have to support that. At least I am not aware of such cases.

> 
> Yes, I do understand the KNI dependency with mbuf.

Are you asking what would break (other than KNI) if the mbuf layout on IA and
ARM were different?
Probably nothing right now, except vector RX/TX.
But they are not supported on ARM anyway, and if someone implements them in
the future, the code might be completely different from the IA one.
It just seems wrong to me to have a different mbuf layout for each architecture.

> 
> > Another thing - now we have __rte_cache_aligned all over the place, and I don't know
> > whether doubling the sizes of all these structures is a good idea.
> 
> I thought so. The only concern I have: what if the struct is split at 64 bytes
> and one cache line is shared by two cores/two different structs with
> different access patterns (most likely)?
> One doing extensive writes and the other one reads: the writes keep making the line
> dirty and evicting it, so the reading core is going to suffer. Any thoughts?
> 
> If it is a tradeoff between the amount of memory and performance, I think it makes
> sense to stick with performance in the data plane. Hence the split option may not be
> useful, right?

I understand that for most cases you would like to have your data structures
CL aligned - to avoid performance penalties.
I just suggest to have RTE_CACHE_MIN_LINE_SIZE(==64) for the few cases where it
might be plausible.
As an example:

struct rte_mbuf {
	...
	MARKER cacheline1 __rte_cache_min_aligned;
	...
} __rte_cache_aligned;

So we would have an mbuf with the same size and layout, but different alignment
on IA and ARM.

Another example where RTE_CACHE_MIN_LINE_SIZE could be used:
struct rte_eth_(rxq|txq)_info.
There is no real need to have them 128B aligned for ARM.
The main purpose why they were defined as '__rte_cache_aligned' was just
to reserve some space for future expansion.

Konstantin

> 
> 
> > Again, #if RTE_CACHE_LINE_SIZE == 64 ... all over the place looks a bit clumsy.
> > I wonder: can we keep __rte_cache_aligned/RTE_CACHE_LINE_SIZE architecture specific,
> 
> I think it is architecture specific now.
> 
> > but introduce RTE_CACHE_MIN_LINE_SIZE(==64)/__rte_cache_min_aligned and use it for mbuf
> > (and maybe other places).
> 
> Yes, it will help in this specific mbuf case, and it makes sense
> if mbuf is going to stay within 2 x 64B cache lines.
> 
> AND/OR
> 
> Can we introduce something like below to reduce the clutter in
> other places? The macro name is not right, I am just trying to share the idea:
> 
> #define rte_cacheline_diff(for_64, for_128)\
> do {\
> #if RTE_CACHE_LINE_SIZE == 64\
> 	for_64;\
> #elif RTE_CACHE_LINE_SIZE == 128\
> 	for_128;\
> #endif
> 
> OR
> 
> Typedef struct rte_mbuf to a new abstract type and define it separately for
> 64-byte and 128-byte cache lines.
> 
> Jerin
> 
> > Konstantin
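
Below is a minimal sketch of the RTE_CACHE_MIN_LINE_SIZE / __rte_cache_min_aligned
idea discussed above, assuming GCC-style alignment attributes and a zero-length-array
marker similar in spirit to DPDK's MARKER typedef. The macro names follow this mail;
the definitions and the placeholder struct are illustrative assumptions, not actual
DPDK code.

/* Minimum cache line size shared by all supported architectures;
 * RTE_CACHE_LINE_SIZE itself stays architecture specific. */
#define RTE_CACHE_MIN_LINE_SIZE 64

#ifndef RTE_CACHE_LINE_SIZE
#define RTE_CACHE_LINE_SIZE 128	/* e.g. a 128-byte cache line target */
#endif

#define __rte_cache_aligned \
	__attribute__((__aligned__(RTE_CACHE_LINE_SIZE)))
#define __rte_cache_min_aligned \
	__attribute__((__aligned__(RTE_CACHE_MIN_LINE_SIZE)))

/* Placeholder for the mbuf layout: the second-half marker is aligned to the
 * 64-byte minimum, so the structure keeps the same 128-byte size and field
 * layout on both 64B and 128B cache line targets; only the alignment of the
 * object as a whole follows the real cache line size. */
struct mbuf_sketch {
	char rx_fields[64];                          /* first 64B: fast-path RX fields */
	char cacheline1[0] __rte_cache_min_aligned;  /* marker, as in rte_mbuf */
	char tx_fields[64];                          /* second 64B: TX/slow-path fields */
} __rte_cache_aligned;

With this arrangement sizeof(struct mbuf_sketch) stays 128 whether
RTE_CACHE_LINE_SIZE is 64 or 128, which is the property the thread is after.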