From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <konstantin.ananyev@intel.com>
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Liang, Cunming" <cunming.liang@intel.com>, Stephen Hemminger
 <stephen@networkplumber.org>, "Richardson, Bruce"
 <bruce.richardson@intel.com>
Thread-Topic: [dpdk-dev] [RFC PATCH 0/7] support multi-phtread per lcore
Thread-Index: AQHQFOb7rWYcSWZpW0S0EaABVQ+aIpyKJ7GAgAFMAwCABRIMAIAAC9OAgAS+ioCAANxagIAAj96AgAQtdYCAAISagIAAkhMAgAEB6QCAGXpCEA==
Date: Thu, 8 Jan 2015 17:05:48 +0000
Message-ID: <2601191342CEEE43887BDE71AB977258213D39EA@irsmsx105.ger.corp.intel.com>
References: <1418263490-21088-1-git-send-email-cunming.liang@intel.com>
 <7C4248CAE043B144B1CD242D275626532FE15298@IRSMSX104.ger.corp.intel.com>
 <D0158A423229094DA7ABF71CF2FA0DA31188B881@shsmsx102.ccr.corp.intel.com>
 <7C4248CAE043B144B1CD242D275626532FE232BA@IRSMSX104.ger.corp.intel.com>
 <D0158A423229094DA7ABF71CF2FA0DA31188C928@shsmsx102.ccr.corp.intel.com>
 <7C4248CAE043B144B1CD242D275626532FE27C3B@IRSMSX104.ger.corp.intel.com>
 <D0158A423229094DA7ABF71CF2FA0DA31188E454@shsmsx102.ccr.corp.intel.com>
 <20141219100342.GA3848@bricha3-MOBL3>
 <D0158A423229094DA7ABF71CF2FA0DA31188EF9F@shsmsx102.ccr.corp.intel.com>
 <20141222094603.GA1768@bricha3-MOBL3> <20141222102852.7e6d5e81@urahara>
 <D0158A423229094DA7ABF71CF2FA0DA31188F9AD@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <D0158A423229094DA7ABF71CF2FA0DA31188F9AD@shsmsx102.ccr.corp.intel.com>
Accept-Language: en-IE, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [163.33.239.180]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [RFC PATCH 0/7] support multi-phtread per lcore
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 08 Jan 2015 17:05:58 -0000


Hi Steve,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Liang, Cunming
> Sent: Tuesday, December 23, 2014 9:52 AM
> To: Stephen Hemminger; Richardson, Bruce
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [RFC PATCH 0/7] support multi-phtread per lcore
>
>
>
> > -----Original Message-----
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Tuesday, December 23, 2014 2:29 AM
> > To: Richardson, Bruce
> > Cc: Liang, Cunming; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [RFC PATCH 0/7] support multi-phtread per lcore
> >
> > On Mon, 22 Dec 2014 09:46:03 +0000
> > Bruce Richardson <bruce.richardson@intel.com> wrote:
> >
> > > On Mon, Dec 22, 2014 at 01:51:27AM +0000, Liang, Cunming wrote:
> > > > ...
> > > > > > I'm conflicted on this one. However, I think far more
> > > > > > applications would be broken to start having to use thread_id
> > > > > > in place of an lcore_id than would be broken by having the
> > > > > > lcore_id no longer actually correspond to a core.
> > > > > > I'm actually struggling to come up with a large number of
> > > > > > scenarios where it's important to an app to determine the cpu
> > > > > > it's running on, compared to the large number of cases where
> > > > > > you need to have a data-structure per thread. In DPDK libs
> > > > > > alone, you see this assumption that lcore_id == thread_id a
> > > > > > large number of times.
> > > > > >
> > > > > > Despite the slight logical inconsistency, I think it's better
> > > > > > to avoid introducing a thread-id and continue having lcore_id
> > > > > > representing a unique thread.
> > > > > >
> > > > > > /Bruce
> > > >
> > > > Ok, I understand it.
> > > > > I list the implicit meaning if using lcore_id representing the
> > > > > unique thread.
> > > > > 1). When lcore_id less than RTE_MAX_LCORE, it still represents
> > > > >     the logical core id.
> > > > > 2). When lcore_id large equal than RTE_MAX_LCORE, it represents
> > > > >     an unique id for thread.
> > > > > 3). Most of APIs(except rte_lcore_id()) in rte_lcore.h suggest to
> > > > >     be used only in CASE 1)
> > > > > 4). rte_lcore_id() can be used in CASE 2), but the return value
> > > > >     no matter represent a logical core id.
> > > > >
> > > > > If most of us feel it's acceptable, I'll prepare for the RFC v2
> > > > > base on this conclusion.
> > > >
> > > > /Cunming
> > >
> > > > Sorry, I don't like that suggestion either, as having lcore_id
> > > > values greater than RTE_MAX_LCORE is terrible, as how will people
> > > > know how to dimension arrays to be indexes by lcore id? Given the
> > > > choice, if we are not going to just use lcore_id as a generic
> > > > thread id, which is always between 0 and RTE_MAX_LCORE we can look
> > > > to define a new thread_id variable to hold that. However, it should
> > > > have a bounded range.
> > > > From an ease-of-porting perspective, I still think that the
> > > > simplest option is to use the existing lcore_id and accept the fact
> > > > that it's now a thread id rather than an actual physical lcore.
> > > > Question is, is would that cause us lots of issues in the future?
> > >
> > > /Bruce
> >
> > The current rte_lcore_id() has different meaning the thread. Your
> > proposal will break code that uses lcore_id to do per-cpu statistics
> > and the lcore_config code in the samples.
> >
> [Liang, Cunming] +1.

A few more thoughts on that subject:

Actually, there is one more place in the lib where lcore_id is used (and
where it must be unique):
rte_spinlock_recursive_lock() / rte_spinlock_recursive_trylock().
So if we are going to replace lcore_id with thread_id as the unique thread
index, then these functions have to be updated too.
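
To illustrate why the owner id must be unique per thread, here is a
minimal sketch of a recursive lock keyed by a caller-supplied id. It is
not the DPDK source: struct rec_lock, rec_trylock() and rec_unlock() are
hypothetical names, and the id argument stands in for whatever
lcore_id/thread_id the real functions would read.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a recursive spinlock keyed by a caller id. */
struct rec_lock {
	atomic_flag sl;    /* the underlying spinlock */
	atomic_int  user;  /* id of the current owner, -1 when free */
	int         count; /* recursion depth, touched only by the owner */
};

#define REC_LOCK_INIT { ATOMIC_FLAG_INIT, -1, 0 }

/* Returns 1 if the lock was taken (or re-entered), 0 if it is busy. */
static int rec_trylock(struct rec_lock *l, int id)
{
	if (atomic_load(&l->user) != id) {
		if (atomic_flag_test_and_set(&l->sl))
			return 0; /* held by a caller with a different id */
		atomic_store(&l->user, id);
	}
	/* Same id as the owner: treated as a nested acquisition. */
	l->count++;
	return 1;
}

static void rec_unlock(struct rec_lock *l)
{
	assert(l->count > 0);
	if (--l->count == 0) {
		atomic_store(&l->user, -1);
		atomic_flag_clear(&l->sl);
	}
}
```

Any caller presenting the owner's id is let straight through, so two
threads that share one id would both end up "holding" the lock.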

About maintaining our own unique thread_id inside shared memory
(_get_linear_tid()/_put_linear_tid()): there is one thing that worries me
with that approach.
In case of abnormal process termination, the TIDs used by that process
will remain 'reserved', and there is no way to know which TIDs were used
by the terminated process.
So with the DPDK multi-process model there could be a situation where,
after a secondary process terminates abnormally, it is not possible to
restart it: we simply run out of 'free' TIDs.
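
The failure mode can be sketched with a toy allocator. Everything here is
an assumption made for illustration: struct tid_pool, pool_get_tid(),
pool_put_tid() and run_process() are hypothetical names, the pool is a
plain bitmask rather than real shared memory, and no locking is shown.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TID 4 /* tiny pool, for illustration only */

/* Hypothetical pool that would live in shared memory. */
struct tid_pool {
	uint32_t used; /* bit i set => TID i is reserved */
};

/* Reserve the lowest free TID, or return -1 if the pool is exhausted. */
static int pool_get_tid(struct tid_pool *p)
{
	for (int i = 0; i < MAX_TID; i++) {
		if (!(p->used & (UINT32_C(1) << i))) {
			p->used |= UINT32_C(1) << i;
			return i;
		}
	}
	return -1;
}

static void pool_put_tid(struct tid_pool *p, int tid)
{
	p->used &= ~(UINT32_C(1) << tid);
}

/*
 * Model a process that reserves up to n TIDs. clean_exit == 0 models an
 * abnormal termination: nothing gets a chance to release the TIDs.
 * Returns how many TIDs the process managed to reserve.
 */
static int run_process(struct tid_pool *p, int n, int clean_exit)
{
	int tids[MAX_TID];
	int got = 0;

	for (int i = 0; i < n && i < MAX_TID; i++) {
		int t = pool_get_tid(p);
		if (t < 0)
			break;
		tids[got++] = t;
	}
	if (clean_exit)
		for (int i = 0; i < got; i++)
			pool_put_tid(p, tids[i]);
	return got;
}
```

With a 4-entry pool, two crashed runs of a 2-thread process leave every
bit set, and the next run cannot reserve a single TID.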

Which makes me think: probably there is no need to introduce a new
globally unique 'thread_id'?
Maybe just lcore_id is enough?
As Mirek and Bruce suggested, we can treat it as a sort of 'unique thread
id' inside EAL.
Or as a 'virtual' core id that can run on a set of physical cpus, where
these subsets for different 'virtual' cores can intersect.
Then basically we can keep the legacy behaviour with '-c <lcores_mask>',
where each lcore_id matches one to one with a physical cpu, and introduce
a new one, something like:
--lcores='(<lcore_set1>)=(<phys_cpu_set1>),..(<lcore_setN>)=(<phys_cpu_setN>)'.
So let's say: --lcores='(0-7)=(0,2-4),(10)=(7),(8)=(all)' would mean:
create 10 EAL threads, bind threads with lcore_id=[0-7] to cpuset
<0,2,3,4>, bind the thread with lcore_id=10 to cpu 7, and allow lcore_id=8
to run on any cpu in the system.
Of course '-c' and '--lcores' would be mutually exclusive, and we will
need to update rte_lcore_to_socket_id() and introduce
rte_lcore_(set|get)_affinity().
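
To make the proposed syntax concrete, here is a rough sketch of how such
a string could be turned into a per-lcore cpu mask. This is only an
illustration of the format described above, not an existing EAL parser:
parse_set(), parse_lcores(), MAX_ID and CPU_ALL are hypothetical, ids are
limited to 0..63 so that a set fits in one 64-bit mask, and 'all' is
represented as a mask with every bit set.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ID  64          /* ids 0..63 fit in one 64-bit mask */
#define CPU_ALL UINT64_MAX  /* stands for "(all)" */

/*
 * Parse one parenthesised set such as "(0,2-4)" or "(all)" into *mask.
 * Returns a pointer just past the closing ')', or NULL on a syntax error.
 */
static const char *parse_set(const char *s, uint64_t *mask)
{
	*mask = 0;
	if (*s++ != '(')
		return NULL;
	if (strncmp(s, "all)", 4) == 0) {
		*mask = CPU_ALL;
		return s + 4;
	}
	for (;;) {
		char *end;
		long lo = strtol(s, &end, 10), hi = lo;

		if (end == s)
			return NULL;
		s = end;
		if (*s == '-') { /* range "lo-hi" */
			hi = strtol(s + 1, &end, 10);
			if (end == s + 1)
				return NULL;
			s = end;
		}
		if (lo < 0 || hi < lo || hi >= MAX_ID)
			return NULL;
		for (long i = lo; i <= hi; i++)
			*mask |= UINT64_C(1) << i;
		if (*s == ')')
			return s + 1;
		if (*s++ != ',')
			return NULL;
	}
}

/*
 * Parse "(lcores)=(cpus),(lcores)=(cpus),..." and store the cpu mask of
 * every listed lcore in cpus[]. Returns the number of distinct lcores,
 * or -1 on a syntax error.
 */
static int parse_lcores(const char *s, uint64_t cpus[MAX_ID])
{
	int n = 0;

	memset(cpus, 0, MAX_ID * sizeof(cpus[0]));
	for (;;) {
		uint64_t lset, cset;

		if ((s = parse_set(s, &lset)) == NULL || *s++ != '=')
			return -1;
		if ((s = parse_set(s, &cset)) == NULL)
			return -1;
		for (int i = 0; i < MAX_ID; i++) {
			if (lset & (UINT64_C(1) << i)) {
				if (cpus[i] == 0)
					n++;
				cpus[i] = cset;
			}
		}
		if (*s == '\0')
			return n;
		if (*s++ != ',')
			return -1;
	}
}
```

For the example string above, lcores 0-7 share the mask for cpus
<0,2,3,4>, lcore 10 maps to cpu 7 only, and lcore 8 gets the 'all' mask.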

Does it make sense to you?

BTW, one more thing: while we are at it, it is probably a good time to do
something about our interrupt thread?
It is a bit strange that we can't use rte_pktmbuf_free() or
rte_spinlock_recursive_lock() from our own interrupt/alarm handlers.

Konstantin