From: Cunming Liang <cunming.liang@intel.com>
To: dev@dpdk.org
Subject: [dpdk-dev] [RFC PATCH 0/7] support multi-pthread per lcore
Date: Thu, 11 Dec 2014 10:04:43 +0800 [thread overview]
Message-ID: <1418263490-21088-1-git-send-email-cunming.liang@intel.com> (raw)
Scope & Usage Scenario
========================
DPDK usually pins one pthread per core to avoid task-switch overhead. This
gains a lot of performance, but it is not efficient in all cases: sometimes it
is too expensive to dedicate a whole core to a lightweight workload. It is a
reasonable demand to run multiple threads per core, with each thread sharing
the CPU at an assigned weight.
In fact, nothing prevents a user from creating normal pthreads and using
cgroups to control their CPU share. One purpose of this patchset is to close
the gaps that prevent normal pthreads from using more of the DPDK libraries.
In addition, it demonstrates a performance gain from proactively yielding in
the idle loop during packet IO, and it provides several 'rte_pthread_*' APIs
to make life easier.
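The basic setup is ordinary POSIX affinity: several normal pthreads pinned to
the same core, with the kernel scheduler (and cgroups) dividing the CPU among
them. A minimal sketch, using only standard pthread calls (the worker body and
core number 0 are illustrative):

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to one CPU; returns 0 on success. */
static int pin_to_cpu(int cpu)
{
	cpu_set_t set;
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

static void *worker(void *arg)
{
	pin_to_cpu(0);		/* all workers share core 0 */
	/* a lightweight packet-processing loop would run here */
	return arg;
}

/* Start n workers that all share core 0; returns 0 if all joined. */
static int run_shared_core(int n)
{
	pthread_t t[8];
	int i;
	if (n < 1 || n > 8)
		return -1;
	for (i = 0; i < n; i++)
		if (pthread_create(&t[i], NULL, worker, NULL) != 0)
			return -1;
	for (i = 0; i < n; i++)
		pthread_join(t[i], NULL);
	return 0;
}
```

With all workers on one core, their relative CPU time is then controlled by the
scheduler policy or by the cgroup shares shown later in this mail.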
Changes to DPDK libraries
==========================
Some DPDK libraries currently assume they run inside the DPDK (EAL)
environment.
# rte_mempool
The rte_mempool documentation states that a thread not created by the EAL must
not use mempools. The root cause is the per-lcore cache inside the mempool:
for a non-EAL thread, 'rte_lcore_id()' does not return a valid value.
The patchset changes this slightly. The mempool cache is no longer indexed by
lcore_id; instead it uses a linear id produced by an allocator. Each legacy
EAL per-lcore thread obtains a unique linear id at creation time, while a
normal pthread that wants to use rte_mempool must request a linear id
explicitly. The mempool cache thus becomes per-thread, with the linear id
identifying the thread.
However, there is another problem: rte_mempool is not preemptible. The problem
comes from rte_ring, so it is discussed in the next section.
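The per-thread linear id described above could look roughly like this. This is
a minimal sketch, not the patchset's actual code: the function name
`linear_tid_get`, the TLS layout, and the GCC atomic builtin are all
assumptions for illustration.

```c
/* Sketch of a linear thread-id allocator: each thread asks once for a
 * unique, densely packed id, caches it in thread-local storage, and
 * uses it (instead of rte_lcore_id()) to index the mempool cache. */

static int next_tid;			/* bumped atomically, never reused */
static __thread int linear_tid = -1;	/* this thread's cached id */

static int linear_tid_get(void)		/* name is illustrative only */
{
	if (linear_tid < 0)
		linear_tid = __atomic_fetch_add(&next_tid, 1,
						__ATOMIC_RELAXED);
	return linear_tid;
}
```

Because the id is dense (0, 1, 2, ...) rather than a core number, the mempool
cache array can be sized by the number of threads instead of the number of
lcores.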
# rte_ring
rte_ring supports multi-producer enqueue and multi-consumer dequeue, but it is
not preemptible. There was a conversation about this before:
http://dpdk.org/ml/archives/dev/2013-November/000714.html
Say two pthreads run on the same core and enqueue on the same rte_ring. If the
1st pthread is preempted by the 2nd after it has already moved prod.head, the
2nd pthread will spin until the 1st is scheduled again, wasting time. It is
even worse if the 2nd pthread has strictly higher priority.
But this does not mean rte_ring cannot be used. We just need to narrow down
the situations in which multiple pthreads on the same core may use it:
- It CAN be used in any single-producer or single-consumer situation.
- It MAY be used by multi-producer/consumer pthreads whose scheduling policies
are all SCHED_OTHER (CFS). Users SHOULD be aware of the performance penalty
before using it.
- It MUST NOT be used by multi-producer/consumer pthreads when any of their
scheduling policies is SCHED_FIFO or SCHED_RR.
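The preemption hazard can be seen in a much-simplified sketch of the
multi-producer head move (field and function names are illustrative, not
rte_ring's real layout):

```c
#include <stdatomic.h>

struct ring {
	_Atomic unsigned head;	/* claimed by producers via CAS */
	_Atomic unsigned tail;	/* published in claim order */
};

/* Claim one slot and publish it; returns the claimed index. */
static unsigned claim_slot(struct ring *r)
{
	unsigned old, next;

	do {
		old = atomic_load(&r->head);
		next = old + 1;
		/* losers of the CAS race simply retry */
	} while (!atomic_compare_exchange_weak(&r->head, &old, next));

	/* ... write the entry at index 'old' ... */

	/* Wait for earlier producers to publish. If the producer that
	 * claimed the previous slot was preempted here and never runs
	 * again (e.g. a SCHED_FIFO thread now owns the core), this loop
	 * spins forever. */
	while (atomic_load(&r->tail) != old)
		;
	atomic_store(&r->tail, next);
	return old;
}
```

The second wait loop is why the SCHED_FIFO/SCHED_RR combination above is
forbidden: the spinner can starve the very thread it is waiting for.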
Performance
==============
Introducing task switching loses some performance. From the packet-IO
perspective, we can win some of it back by improving the effective IO rate.
When a pthread polls an empty rx queue, it should proactively yield. We can
also slow down rx for a short while to take more advantage of bulk receiving
in the next loop. In practice, increasing the rx ring size also helps to
improve the overall throughput.
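The proactive-yield idea can be sketched as below. `rx_burst` here is a stand-in
for rte_eth_rx_burst() (the stub alternates empty and full polls purely so the
yield path is exercised); only the yield-on-idle shape is the point:

```c
#include <sched.h>

#define BURST 32

/* Stub for rte_eth_rx_burst(): every other poll returns no packets. */
static int rx_burst(int *state, void **pkts, int n)
{
	(void)pkts;
	return (++*state % 2) ? 0 : n;
}

/* One iteration of the IO loop: poll once, and if the queue was empty,
 * yield the core to sibling pthreads instead of busy-spinning.
 * Returns the number of packets fetched. */
static int poll_once(int *rxq_state)
{
	void *pkts[BURST];
	int n = rx_burst(rxq_state, pkts, BURST);

	if (n == 0) {
		sched_yield();	/* idle: let co-scheduled threads run */
		return 0;
	}
	/* ... process n packets ... */
	return n;
}
```

Yielding only on empty polls keeps the hot path untouched: a thread with
traffic never pays the syscall, while an idle thread stops burning the cycles
its siblings need.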
Cgroup Control
================
Here is a simple example: four pthreads do packet IO on the same core, and we
want their CPU shares in the ratio 1:1:2:4.
> mkdir /sys/fs/cgroup/cpu/dpdk
> mkdir /sys/fs/cgroup/cpu/dpdk/thread0
> mkdir /sys/fs/cgroup/cpu/dpdk/thread1
> mkdir /sys/fs/cgroup/cpu/dpdk/thread2
> mkdir /sys/fs/cgroup/cpu/dpdk/thread3
> cd /sys/fs/cgroup/cpu/dpdk
> echo 256 > thread0/cpu.shares
> echo 256 > thread1/cpu.shares
> echo 512 > thread2/cpu.shares
> echo 1024 > thread3/cpu.shares
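The shares above take effect only once each pthread's kernel TID (as returned
by gettid()) is attached to its group. A sketch of that last step; the TIDs
below are placeholders, not real values:

```shell
# Attach one pthread per group (replace the TIDs with gettid() results).
echo 10001 > /sys/fs/cgroup/cpu/dpdk/thread0/tasks
echo 10002 > /sys/fs/cgroup/cpu/dpdk/thread1/tasks
echo 10003 > /sys/fs/cgroup/cpu/dpdk/thread2/tasks
echo 10004 > /sys/fs/cgroup/cpu/dpdk/thread3/tasks
```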
-END-
Any comments are welcome.
Thanks
Cunming Liang (7):
eal: add linear thread id as pthread-local variable
mempool: use linear-tid as mempool cache index
ring: use linear-tid as ring debug stats index
eal: add simple API for multi-pthread
testpmd: support multi-pthread mode
sample: add new sample for multi-pthread
eal: macro for cpuset w/ or w/o CPU_ALLOC
app/test-pmd/cmdline.c | 41 +++++
app/test-pmd/testpmd.c | 84 ++++++++-
app/test-pmd/testpmd.h | 1 +
config/common_linuxapp | 1 +
examples/multi-pthread/Makefile | 57 ++++++
examples/multi-pthread/main.c | 232 ++++++++++++++++++++++++
examples/multi-pthread/main.h | 46 +++++
lib/librte_eal/common/include/rte_eal.h | 15 ++
lib/librte_eal/common/include/rte_lcore.h | 12 ++
lib/librte_eal/linuxapp/eal/eal_thread.c | 282 +++++++++++++++++++++++++++---
lib/librte_mempool/rte_mempool.h | 22 +--
lib/librte_ring/rte_ring.h | 6 +-
12 files changed, 755 insertions(+), 44 deletions(-)
create mode 100644 examples/multi-pthread/Makefile
create mode 100644 examples/multi-pthread/main.c
create mode 100644 examples/multi-pthread/main.h
--
1.8.1.4