From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id 1062A5424 for ; Thu, 22 Jan 2015 13:47:08 +0100 (CET) Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga102.jf.intel.com with ESMTP; 22 Jan 2015 04:44:02 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.09,448,1418112000"; d="scan'208";a="654848721" Received: from irsmsx105.ger.corp.intel.com ([163.33.3.28]) by fmsmga001.fm.intel.com with ESMTP; 22 Jan 2015 04:47:01 -0800 Received: from irsmsx104.ger.corp.intel.com ([169.254.5.229]) by irsmsx105.ger.corp.intel.com ([169.254.7.81]) with mapi id 14.03.0195.001; Thu, 22 Jan 2015 12:45:22 +0000 From: "Walukiewicz, Miroslaw" To: "Liang, Cunming" , "dev@dpdk.org" Thread-Topic: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread Thread-Index: AQHQNhwPyqRd0DBtV0SaZJL7/aG+LZzL4v+ggAAsUgCAAAQXwA== Date: Thu, 22 Jan 2015 12:45:21 +0000 Message-ID: <7C4248CAE043B144B1CD242D275626532FE3C144@IRSMSX104.ger.corp.intel.com> References: <1417589628-43666-1-git-send-email-cunming.liang@intel.com> <1421914598-2747-1-git-send-email-cunming.liang@intel.com> <1421914598-2747-14-git-send-email-cunming.liang@intel.com> <7C4248CAE043B144B1CD242D275626532FE3BEE9@IRSMSX104.ger.corp.intel.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [163.33.239.181] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Jan 2015 12:47:09 -0000 > -----Original Message----- > From: Liang, Cunming > Sent: Thursday, January 22, 2015 1:20 PM > To: Walukiewicz, Miroslaw; dev@dpdk.org > Subject: RE: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL > thread >=20 >=20 >=20 > > -----Original Message----- > > From: Walukiewicz, Miroslaw > > Sent: Thursday, January 22, 2015 5:53 PM > > To: Liang, Cunming; dev@dpdk.org > > Subject: RE: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non- > EAL > > thread > > > > > > > > > -----Original Message----- > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang > > > Sent: Thursday, January 22, 2015 9:17 AM > > > To: dev@dpdk.org > > > Subject: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL > > > thread > > > > > > For non-EAL thread, bypass per lcore cache, directly use ring pool. > > > It allows using rte_mempool in either EAL thread or any user pthread. > > > As in non-EAL thread, it directly rely on rte_ring and it's none pree= mptive. > > > It doesn't suggest to run multi-pthread/cpu which compete the > > > rte_mempool. > > > It will get bad performance and has critical risk if scheduling polic= y is RT. > > > > > > Signed-off-by: Cunming Liang > > > --- > > > lib/librte_mempool/rte_mempool.h | 18 +++++++++++------- > > > 1 file changed, 11 insertions(+), 7 deletions(-) > > > > > > diff --git a/lib/librte_mempool/rte_mempool.h > > > b/lib/librte_mempool/rte_mempool.h > > > index 3314651..4845f27 100644 > > > --- a/lib/librte_mempool/rte_mempool.h > > > +++ b/lib/librte_mempool/rte_mempool.h > > > @@ -198,10 +198,12 @@ struct rte_mempool { > > > * Number to add to the object-oriented statistics. > > > */ > > > #ifdef RTE_LIBRTE_MEMPOOL_DEBUG > > > -#define __MEMPOOL_STAT_ADD(mp, name, n) do { > \ > > > - unsigned __lcore_id =3D rte_lcore_id(); \ > > > - mp->stats[__lcore_id].name##_objs +=3D n; \ > > > - mp->stats[__lcore_id].name##_bulk +=3D 1; \ > > > +#define __MEMPOOL_STAT_ADD(mp, name, n) do { \ > > > + unsigned __lcore_id =3D rte_lcore_id(); \ > > > + if (__lcore_id < RTE_MAX_LCORE) { \ > > > + mp->stats[__lcore_id].name##_objs +=3D n; \ > > > + mp->stats[__lcore_id].name##_bulk +=3D 1; \ > > > + } \ > > > } while(0) > > > #else > > > #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0) > > > @@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp, > > > void * const *obj_table, > > > __MEMPOOL_STAT_ADD(mp, put, n); > > > > > > #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 > > > - /* cache is not enabled or single producer */ > > > - if (unlikely(cache_size =3D=3D 0 || is_mp =3D=3D 0)) > > > + /* cache is not enabled or single producer or none EAL thread */ > > > > I don't understand this limitation. > > > > I see that the rte_membuf.h defines table per RTE_MAX_LCORE like below > > #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 > > /** Per-lcore local cache. */ > > struct rte_mempool_cache local_cache[RTE_MAX_LCORE]; > > #endif > > > > But why we cannot extent the size of the local cache table to something > like > > RTE_MAX_THREADS that does not exceed max value of rte_lcore_id() > > > > Keeping this condition here is a real performance killer!!!!!!. > > I saw in my test application spending more 95% of CPU time reading the > atomic > > in M C/MP ring utilizing access to mempool. > [Liang, Cunming] This is the first step to make it work. > By Konstantin's comments, shall prevent to allocate unique id by ourselve= s. > And the return value from gettid() is too large as an index. > For non-EAL thread performance gap, will think about additional fix patch > here. > If care about performance, still prefer to choose EAL thread now. In previous patch you had allocation of the thread id on base of unique get= tid() as number=20 not a potential pointer as we can expect from implementation getid() from L= inux or BSD. The another problem is that we compare here int with some unique thread ide= ntifier. How can you prevent that when implementation of gettid will change and uniq= ue thread identifier will be=20 Less than RTE_MAX_LCORE and will be still unique.=20 I think that your assumption will work for well-known operating systems but= will be very unportable. Regarding performance the DPDK can work efficiently in different environmen= ts including pthreads.=20 You can imagine running DPDK from pthread application where affinity will b= e made by application.=20 Effectiveness depends on application thread implementation comparable to EA= L threads.=20 I think that this is a goal for this change. > > > > Same comment for get operation below > > > > > + if (unlikely(cache_size =3D=3D 0 || is_mp =3D=3D 0 || > > > + lcore_id >=3D RTE_MAX_LCORE)) > > > goto ring_enqueue; > > > > > > /* Go straight to ring if put would overflow mem allocated for cach= e > > > */ > > > @@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, > void > > > **obj_table, > > > uint32_t cache_size =3D mp->cache_size; > > > > > > /* cache is not enabled or single consumer */ > > > - if (unlikely(cache_size =3D=3D 0 || is_mc =3D=3D 0 || n >=3D cache_= size)) > > > + if (unlikely(cache_size =3D=3D 0 || is_mc =3D=3D 0 || > > > + n >=3D cache_size || lcore_id >=3D RTE_MAX_LCORE)) > > > goto ring_dequeue; > > > > > > cache =3D &mp->local_cache[lcore_id]; > > > -- > > > 1.8.1.4