From: "Ananyev, Konstantin"
To: "Walukiewicz, Miroslaw", "Liang, Cunming", dev@dpdk.org
Date: Thu, 22 Jan 2015 14:04:39 +0000
Subject: Re: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread

Hi Miroslaw,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Walukiewicz, Miroslaw
> Sent: Thursday, January 22, 2015 12:45 PM
> To: Liang, Cunming; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL thread
>
> > -----Original Message-----
> > From: Liang, Cunming
> > Sent: Thursday, January 22, 2015 1:20 PM
> > To: Walukiewicz, Miroslaw; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL
> > thread
> >
> > > -----Original Message-----
> > > From: Walukiewicz, Miroslaw
> > > Sent: Thursday, January 22, 2015 5:53 PM
> > > To: Liang, Cunming; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL
> > > thread
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> > > > Sent: Thursday, January 22, 2015 9:17 AM
> > > > To: dev@dpdk.org
> > > > Subject: [dpdk-dev] [PATCH v1 13/15] mempool: add support to non-EAL
> > > > thread
> > > >
> > > > For a non-EAL thread, bypass the per-lcore cache and use the ring pool
> > > > directly. This allows rte_mempool to be used from either an EAL thread
> > > > or any user pthread. A non-EAL thread relies directly on the rte_ring,
> > > > which is non-preemptive, so running multiple pthreads per cpu that
> > > > compete for the same rte_mempool is not recommended:
> > > > it will get bad performance and carries a critical risk if the
> > > > scheduling policy is RT.
> > > >
> > > > Signed-off-by: Cunming Liang
> > > > ---
> > > >  lib/librte_mempool/rte_mempool.h | 18 +++++++++++-------
> > > >  1 file changed, 11 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> > > > index 3314651..4845f27 100644
> > > > --- a/lib/librte_mempool/rte_mempool.h
> > > > +++ b/lib/librte_mempool/rte_mempool.h
> > > > @@ -198,10 +198,12 @@ struct rte_mempool {
> > > >   * Number to add to the object-oriented statistics.
> > > >   */
> > > >  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
> > > > -#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
> > > > -	unsigned __lcore_id = rte_lcore_id();			\
> > > > -	mp->stats[__lcore_id].name##_objs += n;			\
> > > > -	mp->stats[__lcore_id].name##_bulk += 1;			\
> > > > +#define __MEMPOOL_STAT_ADD(mp, name, n) do {			\
> > > > +	unsigned __lcore_id = rte_lcore_id();			\
> > > > +	if (__lcore_id < RTE_MAX_LCORE) {			\
> > > > +		mp->stats[__lcore_id].name##_objs += n;		\
> > > > +		mp->stats[__lcore_id].name##_bulk += 1;		\
> > > > +	}							\
> > > >  } while(0)
> > > >  #else
> > > >  #define __MEMPOOL_STAT_ADD(mp, name, n) do {} while(0)
> > > > @@ -767,8 +769,9 @@ __mempool_put_bulk(struct rte_mempool *mp,
> > > > void * const *obj_table,
> > > >  	__MEMPOOL_STAT_ADD(mp, put, n);
> > > >
> > > >  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> > > > -	/* cache is not enabled or single producer */
> > > > -	if (unlikely(cache_size == 0 || is_mp == 0))
> > > > +	/* cache is not enabled or single producer or non-EAL thread */
> > >
> > > I don't understand this limitation.
> > >
> > > I see that rte_mempool.h defines a cache table with one entry per lcore,
> > > up to RTE_MAX_LCORE:
> > >
> > > #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
> > > 	/** Per-lcore local cache. */
> > > 	struct rte_mempool_cache local_cache[RTE_MAX_LCORE];
> > > #endif
> > >
> > > But why can't we extend the local cache table to something like
> > > RTE_MAX_THREADS, so that it is not exceeded by the max value of
> > > rte_lcore_id()?
> > >
> > > Keeping this condition here is a real performance killer!
> > > In my test application I saw more than 95% of CPU time spent reading the
> > > atomic in the MC/MP ring while accessing the mempool.
> >
> > [Liang, Cunming] This is the first step to make it work.
> > Per Konstantin's comments, we should not allocate a unique id by ourselves,
> > and the return value from gettid() is too large to use as an index.
> > For the non-EAL thread performance gap, we will think about an additional
> > fix patch here.
> > If you care about performance, it is still preferable to choose an EAL
> > thread for now.
>
> In the previous patch you had allocation of the thread id based on a unique
> gettid() as a number, not a potential pointer as we can expect from the
> implementation of gettid() on Linux or BSD.

I am really puzzled by your sentence above.
What 'potential pointer' are you talking about?
rte_lcore_id() returns an unsigned 32-bit integer (as it always did).
_lcore_id for each EAL thread is assigned at rte_eal_init().
For an EAL thread the _lcore_id value is in the interval [0, RTE_MAX_LCORE),
and it is up to the user to make sure that each _lcore_id is unique inside
the DPDK multi-process group.
That's all as it was before.

What's new with Steve's patch:

1) At startup the user can select a set of physical cpu(s) on which each EAL
thread (lcore) can run.
From the explanation at [PATCH v1 02/15]:
--lcores='1,2@(5-7),(3-5)@(0,2),(0,6)' means starting 7 EAL threads as below:
lcore 0 runs on cpuset 0x41 (cpu 0,6)
lcore 1 runs on cpuset 0x2 (cpu 1)
lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
lcore 3,4,5 run on cpuset 0x5 (cpu 0,2)
lcore 6 runs on cpuset 0x41 (cpu 0,6)

2) Non-EAL (dynamically created) threads will have their _lcore_id set to
LCORE_ID_ANY (== UINT32_MAX). That allows us to distinguish them from EAL
threads (lcores) and to make the DPDK functions work safely within such
threads too, though with some possible performance (rte_mempool get/put)
degradation compared to an EAL thread.
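To make that concrete, here is a minimal sketch (illustrative only; the helper
name is made up) of how a fast path can decide between the per-lcore cache and
the shared ring - it mirrors the idea of the checks in the diff above, not
their exact code:

#include <rte_lcore.h>
#include <rte_mempool.h>

static inline int
can_use_lcore_cache(const struct rte_mempool *mp)
{
	/* An EAL thread has _lcore_id in [0, RTE_MAX_LCORE); a non-EAL
	 * thread reports LCORE_ID_ANY (UINT32_MAX), so the bounds check
	 * sends it straight to the shared ring. */
	unsigned lcore_id = rte_lcore_id();

	return mp->cache_size != 0 && lcore_id < RTE_MAX_LCORE;
}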
> The other problem is that we compare an int here with some unique thread
> identifier.

We are not doing that here. _lcore_id is an unsigned int.

> How can you prevent that, when the implementation of gettid changes and the
> unique thread identifier happens to be less than RTE_MAX_LCORE while still
> unique?

We are not using gettid() to assign _lcore_id values, and never did.
Please read the patch set carefully.

> I think that your assumption will work for well-known operating systems but
> will be very unportable.
>
> Regarding performance, DPDK can work efficiently in different environments,
> including pthreads. You can imagine running DPDK from a pthread application
> where the affinity is set by the application. Effectiveness then depends on
> the application's thread implementation, comparable to EAL threads.

That was one of the goals of the patch: make the whole DPDK API work with
dynamically created threads. So now it is safe to do:

static void *thread_func1(void *arg)
{
	rte_pktmbuf_free((struct rte_mbuf *)arg);
	return NULL;
}

pthread_t tid;
struct rte_mbuf *m = rte_pktmbuf_alloc(mp);

pthread_create(&tid, NULL, thread_func1, m);

As Steve pointed out - if you depend on rte_mempool get/put performance inside
your threads, it is better to use EAL threads.
With the current patch you can have up to RTE_MAX_LCORE such threads and can
assign any cpu affinity to each of them.
Though, if you have an idea how to extend the mempool cache to any number of
dynamically created threads (make the cache for each mempool a sort of TLS?
see the rough sketch after the quoted diff below), then sure, we can discuss
it. But I suppose it should be a matter of a separate patch.

Konstantin

> I think that this is the goal of this change.
>
> > > Same comment for the get operation below.
> > >
> > > > +	if (unlikely(cache_size == 0 || is_mp == 0 ||
> > > > +		     lcore_id >= RTE_MAX_LCORE))
> > > >  		goto ring_enqueue;
> > > >
> > > >  	/* Go straight to ring if put would overflow mem allocated for cache */
> > > > @@ -952,7 +955,8 @@ __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
> > > >  	uint32_t cache_size = mp->cache_size;
> > > >
> > > >  	/* cache is not enabled or single consumer */
> > > > -	if (unlikely(cache_size == 0 || is_mc == 0 || n >= cache_size))
> > > > +	if (unlikely(cache_size == 0 || is_mc == 0 ||
> > > > +		     n >= cache_size || lcore_id >= RTE_MAX_LCORE))
> > > >  		goto ring_dequeue;
> > > >
> > > >  	cache = &mp->local_cache[lcore_id];
> > > > --
> > > > 1.8.1.4
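P.S. Since the TLS idea came up above, a very rough sketch of it (purely
illustrative: the names and sizes are made up, it assumes for brevity that
the thread only ever uses a single mempool, and the put/flush side is
omitted):

#include <rte_mempool.h>

#define TLS_CACHE_SIZE 32

/* Hypothetical per-thread object stash living outside struct rte_mempool,
 * so it is not limited to RTE_MAX_LCORE entries. */
struct tls_mempool_cache {
	unsigned len;                 /* number of objects currently cached */
	void *objs[TLS_CACHE_SIZE];   /* cached objects */
};

static __thread struct tls_mempool_cache tls_cache;

static inline void *
tls_mempool_get(struct rte_mempool *mp)
{
	struct tls_mempool_cache *c = &tls_cache;

	/* Touch the shared ring only when the per-thread stash runs dry. */
	if (c->len == 0) {
		if (rte_mempool_get_bulk(mp, c->objs, TLS_CACHE_SIZE / 2) < 0)
			return NULL;
		c->len = TLS_CACHE_SIZE / 2;
	}
	return c->objs[--c->len];
}

A real version would also need a per-(thread, mempool) lookup and a way to
flush the stash at thread exit, so cached objects do not get stranded - which
is probably why it deserves a separate patch.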