From: "Walukiewicz, Miroslaw"
To: "Liang, Cunming", "dev@dpdk.org"
Date: Thu, 11 Dec 2014 09:56:35 +0000
Subject: Re: [dpdk-dev] [RFC PATCH 0/7] support multi-pthread per lcore
Message-ID: <7C4248CAE043B144B1CD242D275626532FE15298@IRSMSX104.ger.corp.intel.com>
In-Reply-To: <1418263490-21088-1-git-send-email-cunming.liang@intel.com>

Thank you Cunming for the explanation.

What about DPDK timers? They also depend on rte_lcore_id() to avoid
spinlocks.

Mirek

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Cunming Liang
> Sent: Thursday, December 11, 2014 3:05 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [RFC PATCH 0/7] support multi-pthread per lcore
>
>
> Scope & Usage Scenario
> ======================
>
> DPDK usually pins one pthread per core to avoid task-switch overhead. That
> gains a lot of performance, but it is not efficient in every case; sometimes
> it is too expensive to spend a whole core on a lightweight workload. It is a
> reasonable demand to run multiple threads per core, with each thread sharing
> the CPU at an assigned weight.
>
> In fact, nothing prevents the user from creating normal pthreads and using
> cgroups to control the CPU shares. One purpose of this patchset is to close
> the gaps in using DPDK libraries from normal pthreads. In addition, it
> demonstrates the performance gained by proactively yielding in an idle
> packet-IO loop, and it provides several 'rte_pthread_*' APIs to make life
> easier.
>
>
> Changes to DPDK libraries
> =========================
>
> Some DPDK libraries currently must run inside the DPDK (EAL) environment.
>
> # rte_mempool
>
> The rte_mempool documentation says that a thread not created by the EAL must
> not use mempools. The root cause is the per-lcore cache inside the mempool:
> for such a thread, 'rte_lcore_id()' does not return a correct value.
>
> The patchset changes this slightly. The mempool cache index is no longer an
> lcore_id but a linear number handed out by an allocator. Each legacy EAL
> per-lcore thread applies for a unique linear id during creation; a normal
> pthread that wants to use rte_mempool must apply for a linear id explicitly.
> The mempool cache thereby becomes per-thread, with the linear id identifying
> the thread.
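>
> To make this concrete, here is a minimal sketch of a non-EAL pthread using a
> mempool under this scheme. 'rte_pthread_assign_lid()' is a hypothetical name
> standing in for the linear-id registration call; only the flow matters here.
>
>   #include <rte_mempool.h>
>
>   static void *worker(void *arg)
>   {
>           struct rte_mempool *mp = arg;
>           void *obj;
>
>           /* Register first, so the mempool cache is indexed by a valid
>            * linear thread id instead of a meaningless rte_lcore_id(). */
>           if (rte_pthread_assign_lid() < 0)      /* hypothetical API */
>                   return NULL;
>
>           if (rte_mempool_get(mp, &obj) == 0)    /* hits per-thread cache */
>                   rte_mempool_put(mp, obj);
>           return NULL;
>   }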
>
> However, there is another problem: rte_mempool is not preemptible. The
> problem actually comes from rte_ring, so the two are discussed together in
> the next section.
>
> # rte_ring
>
> rte_ring supports multi-producer enqueue and multi-consumer dequeue, but it
> is not preemptible. There was a conversation about this before:
> http://dpdk.org/ml/archives/dev/2013-November/000714.html
>
> Say two pthreads run on the same core and enqueue on the same rte_ring. If
> the 1st pthread is preempted by the 2nd after it has already modified
> prod.head, the 2nd pthread will spin until the 1st one is scheduled again,
> wasting time. Worse, if the 2nd pthread has strictly higher priority, it may
> spin forever, because the 1st pthread never gets scheduled to complete its
> enqueue.
>
> That does not mean rte_ring cannot be used at all; we just need to narrow
> down the situations in which multiple pthreads on the same core use it:
> - It CAN be used in any single-producer or single-consumer situation.
> - It MAY be used by multi-producer/consumer pthreads whose scheduling
>   policies are all SCHED_OTHER (CFS). Users SHOULD be aware of the
>   performance penalty before using it.
> - It MUST NOT be used by multi-producer/consumer pthreads when any of their
>   scheduling policies is SCHED_FIFO or SCHED_RR.
>
>
> Performance
> ===========
>
> Introducing task switching loses performance. From the packet-IO
> perspective, some of it can be regained by improving the effective IO rate:
> when a pthread idle-loops on an empty rx queue, it should proactively yield.
> We can also slow down rx for a short while to take more advantage of bulk
> receiving in the next loop. In practice, increasing the rx ring size also
> helps the overall throughput.
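>
> For illustration, a sketch of the proactive yield in the rx loop follows
> (standard rte_ethdev and libc calls; the exact heuristic in the patches may
> differ):
>
>   #include <sched.h>
>   #include <rte_ethdev.h>
>   #include <rte_mbuf.h>
>
>   #define BURST 32
>
>   static void rx_loop(uint8_t port, uint16_t queue)
>   {
>           struct rte_mbuf *pkts[BURST];
>           uint16_t n;
>
>           for (;;) {
>                   n = rte_eth_rx_burst(port, queue, pkts, BURST);
>                   if (n == 0) {
>                           /* Empty queue: hand the CPU to sibling pthreads
>                            * on this core instead of busy polling; packets
>                            * accumulate meanwhile, so the next burst comes
>                            * closer to a full bulk receive. */
>                           sched_yield();
>                           continue;
>                   }
>                   /* ... process and free the n received packets ... */
>           }
>   }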
>
>
> Cgroup Control
> ==============
>
> Here is a simple example: four pthreads do packet IO on the same core, and
> we expect a CPU share ratio of 1:1:2:4.
> > mkdir /sys/fs/cgroup/cpu/dpdk
> > mkdir /sys/fs/cgroup/cpu/dpdk/thread0
> > mkdir /sys/fs/cgroup/cpu/dpdk/thread1
> > mkdir /sys/fs/cgroup/cpu/dpdk/thread2
> > mkdir /sys/fs/cgroup/cpu/dpdk/thread3
> > cd /sys/fs/cgroup/cpu/dpdk
> > echo 256 > thread0/cpu.shares
> > echo 256 > thread1/cpu.shares
> > echo 512 > thread2/cpu.shares
> > echo 1024 > thread3/cpu.shares
> Finally, each pthread's Linux tid has to be attached to its group (the tid
> below is a placeholder), and likewise for the other three threads:
> > echo <tid0> > thread0/tasks
>
>
> -END-
>
> Any comments are welcome.
>
> Thanks
>
> Cunming Liang (7):
>   eal: add linear thread id as pthread-local variable
>   mempool: use linear-tid as mempool cache index
>   ring: use linear-tid as ring debug stats index
>   eal: add simple API for multi-pthread
>   testpmd: support multi-pthread mode
>   sample: add new sample for multi-pthread
>   eal: macro for cpuset w/ or w/o CPU_ALLOC
>
>  app/test-pmd/cmdline.c                    |  41 +++++
>  app/test-pmd/testpmd.c                    |  84 ++++++++-
>  app/test-pmd/testpmd.h                    |   1 +
>  config/common_linuxapp                    |   1 +
>  examples/multi-pthread/Makefile           |  57 ++++++
>  examples/multi-pthread/main.c             | 232 ++++++++++++++++++++++++
>  examples/multi-pthread/main.h             |  46 +++++
>  lib/librte_eal/common/include/rte_eal.h   |  15 ++
>  lib/librte_eal/common/include/rte_lcore.h |  12 ++
>  lib/librte_eal/linuxapp/eal/eal_thread.c  | 282 +++++++++++++++++++++++++++---
>  lib/librte_mempool/rte_mempool.h          |  22 +--
>  lib/librte_ring/rte_ring.h                |   6 +-
>  12 files changed, 755 insertions(+), 44 deletions(-)
>  create mode 100644 examples/multi-pthread/Makefile
>  create mode 100644 examples/multi-pthread/main.c
>  create mode 100644 examples/multi-pthread/main.h
>
> --
> 1.8.1.4