DPDK patches and discussions
From: Cunming Liang <cunming.liang@intel.com>
To: dev@dpdk.org
Subject: [dpdk-dev] [RFC PATCH 0/7] support multi-phtread per lcore
Date: Thu, 11 Dec 2014 10:04:43 +0800
Message-ID: <1418263490-21088-1-git-send-email-cunming.liang@intel.com>

Scope & Usage Scenario 

DPDK usually pins one pthread per core to avoid task-switch overhead. This
improves performance considerably, but it is not efficient in all cases. In
some cases it may be too expensive to dedicate a whole core to a lightweight
workload. It is a reasonable demand to run multiple threads per core, with
each thread sharing the CPU according to an assigned weight.

In fact, nothing prevents a user from creating normal pthreads and using
cgroups to control the CPU share. One purpose of this patchset is to close
the gaps that limit the use of DPDK libraries in normal pthreads. In addition,
it demonstrates the performance gain from a proactive 'yield' when
idle-looping in packet IO. It also provides several 'rte_pthread_*' APIs to
make life easier.

Changes to DPDK libraries

Some DPDK libraries must run in the DPDK environment.

# rte_mempool

The rte_mempool documentation states that a thread not created by EAL must
not use mempools. The root cause is that the mempool uses a per-lcore cache
internally, and 'rte_lcore_id()' will not return a correct value in such a
thread.

The patchset changes this slightly. The mempool cache is no longer indexed by
lcore_id; instead it is indexed by a linear number generated by an allocator.
Each legacy EAL per-lcore thread applies for a unique linear id during
creation. A normal pthread that expects to use rte_mempool must apply for a
linear id explicitly. The mempool cache thus becomes per-thread, and the
linear id effectively serves as a linear thread id.
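
To make the idea concrete, here is a minimal sketch of a linear-id allocator
feeding a per-thread cache array. It is not the code from the patches: every
'sketch_*' name, the 64-id limit and the cache layout are assumptions made up
for illustration, and '__thread' is the GCC/Clang thread-local extension.

```c
#include <pthread.h>
#include <stdint.h>
#include <stddef.h>

#define SKETCH_MAX_THREADS 64    /* hypothetical upper bound on linear ids */
#define SKETCH_TID_INVALID (-1)

/* Hypothetical per-thread object cache, indexed by linear id. */
struct sketch_cache {
    unsigned len;
    void *objs[32];
};

static pthread_mutex_t sketch_tid_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t sketch_tid_bitmap;                 /* bit i set => id i in use */
static __thread int sketch_linear_tid = SKETCH_TID_INVALID;
static struct sketch_cache sketch_caches[SKETCH_MAX_THREADS];

/* Claim the lowest free id for the calling thread; repeated calls return
 * the same id. Returns SKETCH_TID_INVALID when all ids are taken. */
int sketch_linear_tid_get(void)
{
    if (sketch_linear_tid != SKETCH_TID_INVALID)
        return sketch_linear_tid;

    pthread_mutex_lock(&sketch_tid_lock);
    for (int i = 0; i < SKETCH_MAX_THREADS; i++) {
        if (!(sketch_tid_bitmap & (UINT64_C(1) << i))) {
            sketch_tid_bitmap |= UINT64_C(1) << i;
            sketch_linear_tid = i;
            break;
        }
    }
    pthread_mutex_unlock(&sketch_tid_lock);
    return sketch_linear_tid;
}

/* Release the calling thread's id so another thread can reuse it. */
void sketch_linear_tid_put(void)
{
    if (sketch_linear_tid == SKETCH_TID_INVALID)
        return;
    pthread_mutex_lock(&sketch_tid_lock);
    sketch_tid_bitmap &= ~(UINT64_C(1) << sketch_linear_tid);
    pthread_mutex_unlock(&sketch_tid_lock);
    sketch_linear_tid = SKETCH_TID_INVALID;
}

/* Look up the cache by linear id instead of by lcore_id. */
struct sketch_cache *sketch_cache_for_caller(void)
{
    int id = sketch_linear_tid_get();
    return id == SKETCH_TID_INVALID ? NULL : &sketch_caches[id];
}
```

In this sketch an EAL thread would call sketch_linear_tid_get() once at
creation, while a normal pthread would call it explicitly before touching a
mempool and call sketch_linear_tid_put() before exiting.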

However, there is another problem: rte_mempool is not preemptable. The
problem comes from rte_ring, so the two are discussed together in the next
section.

# rte_ring

rte_ring supports multi-producer enqueue and multi-consumer dequeue, but it
is not preemptable. This has been discussed on the list before.

Let's say two pthreads run on the same core and enqueue on the same rte_ring.
If the 1st pthread is preempted by the 2nd after it has already modified
prod.head, the 2nd pthread will spin until the 1st is scheduled again. That
wastes time. If the 2nd pthread has strictly higher priority, it is even
worse: the 1st may never get the chance to run and finish.

That does not mean the ring cannot be used. We just need to narrow down the
situations in which it is used by multiple pthreads on the same core.
- It CAN be used in any single-producer or single-consumer situation.
- It MAY be used by multi-producer/consumer pthreads whose scheduling
  policies are all SCHED_OTHER (cfs). Users SHOULD be aware of the
  performance penalty before using it.
- It MUST NOT be used by multi-producer/consumer pthreads when any of their
  scheduling policies is SCHED_FIFO or SCHED_RR.
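
The preemption hazard described above can be sketched with C11 atomics. This
is a simplified model of the two-step multi-producer enqueue (reserve by
moving the head, then publish by moving the tail in order), not the rte_ring
source: the 'sketch_*' names are hypothetical, and ring capacity and the
actual object copy are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Producer side of a ring, reduced to its two indexes. */
struct sketch_ring {
    _Atomic uint32_t prod_head;  /* next slot a producer may reserve */
    _Atomic uint32_t prod_tail;  /* slots below this are visible to consumers */
};

/* Step 1: reserve one slot by advancing prod_head with compare-and-swap.
 * Returns the head value this producer owns. */
uint32_t sketch_reserve(struct sketch_ring *r)
{
    uint32_t old = atomic_load(&r->prod_head);
    while (!atomic_compare_exchange_weak(&r->prod_head, &old, old + 1))
        ;  /* on failure, old is refreshed with the current head */
    return old;
}

/* A producer may publish only after every earlier reservation is published. */
bool sketch_can_publish(struct sketch_ring *r, uint32_t my_head)
{
    return atomic_load(&r->prod_tail) == my_head;
}

/* Step 2: publish in reservation order. The wait loop is the spin that
 * burns CPU when the owner of an earlier slot has been preempted; under
 * SCHED_FIFO/SCHED_RR on one core it may never terminate. */
void sketch_publish(struct sketch_ring *r, uint32_t my_head)
{
    while (!sketch_can_publish(r, my_head))
        ;
    atomic_store(&r->prod_tail, my_head + 1);
}
```

If pthread A reserves slot 0 and is preempted before publishing, pthread B
(holding slot 1) finds sketch_can_publish() false and spins in
sketch_publish() until A runs again.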


Introducing task switching costs performance. From the packet IO perspective,
we can gain some of it back by improving the effective IO rate. When a pthread
idle-loops on an empty rx queue, it should proactively yield. We can also
slow down rx for a little while to take more advantage of bulk receiving in
the next loop. In practice, increasing the rx ring size also helps to improve
the overall
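
A minimal sketch of the proactive yield on an empty rx queue follows. The rx
callback type, the fake rx function and the burst size of 32 are assumptions
for illustration only; real code would call the driver's rx burst function
and process the packets.

```c
#include <sched.h>
#include <stddef.h>

/* Hypothetical rx callback: fills pkts[] and returns how many were received. */
typedef unsigned (*sketch_rx_fn)(void *queue, void **pkts, unsigned max);

/* Poll the queue `iterations` times. When a poll returns nothing, give the
 * core to another pthread instead of spinning. Returns the total number of
 * packets received; *yields (if non-NULL) counts the yields performed. */
unsigned sketch_poll_loop(sketch_rx_fn rx, void *queue,
                          unsigned iterations, unsigned *yields)
{
    void *burst[32];
    unsigned total = 0;

    for (unsigned i = 0; i < iterations; i++) {
        unsigned n = rx(queue, burst, 32);
        if (n == 0) {
            sched_yield();          /* empty queue: let a sibling thread run */
            if (yields != NULL)
                (*yields)++;
            continue;
        }
        total += n;                 /* real code would process burst[0..n-1] */
    }
    return total;
}

/* Stand-in for a device queue: `queue` points to a count of pending packets. */
unsigned sketch_fake_rx(void *queue, void **pkts, unsigned max)
{
    unsigned *pending = queue;
    unsigned n = *pending < max ? *pending : max;

    for (unsigned i = 0; i < n; i++)
        pkts[i] = NULL;             /* no real packet buffers in this sketch */
    *pending -= n;
    return n;
}
```

Yielding also gives the queue time to fill, so the next poll is more likely
to return a full burst.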

Cgroup Control

Here is a simple example: four pthreads do packet IO on the same core, and we
expect a CPU share ratio of 1:1:2:4.
> mkdir /sys/fs/cgroup/cpu/dpdk
> mkdir /sys/fs/cgroup/cpu/dpdk/thread0
> mkdir /sys/fs/cgroup/cpu/dpdk/thread1
> mkdir /sys/fs/cgroup/cpu/dpdk/thread2
> mkdir /sys/fs/cgroup/cpu/dpdk/thread3
> cd /sys/fs/cgroup/cpu/dpdk
> echo 256 > thread0/cpu.shares
> echo 256 > thread1/cpu.shares
> echo 512 > thread2/cpu.shares
> echo 1024 > thread3/cpu.shares
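
As a quick check of the arithmetic, the helper below (a hypothetical name,
not part of the patchset) turns cpu.shares values into the percentage of the
core each group receives. The result only applies while all four threads are
runnable, since shares are relative weights and an idle group's time is given
to the others.

```c
#include <stddef.h>

/* Percentage of the core that group `idx` gets when every group is busy:
 * its share divided by the sum of all shares. Returns 0.0 on bad input. */
double sketch_share_percent(const unsigned *shares, size_t n, size_t idx)
{
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++)
        total += shares[i];
    if (total == 0 || idx >= n)
        return 0.0;
    return 100.0 * shares[idx] / total;
}
```

With shares of 256, 256, 512 and 1024 the sum is 2048, so thread0 and thread1
each get 12.5%, thread2 gets 25% and thread3 gets 50% of the core.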


Any comments are welcome.



Cunming Liang (7):
  eal: add linear thread id as pthread-local variable
  mempool: use linear-tid as mempool cache index
  ring: use linear-tid as ring debug stats index
  eal: add simple API for multi-pthread
  testpmd: support multi-pthread mode
  sample: add new sample for multi-pthread
  eal: macro for cpuset w/ or w/o CPU_ALLOC

 app/test-pmd/cmdline.c                    |  41 +++++
 app/test-pmd/testpmd.c                    |  84 ++++++++-
 app/test-pmd/testpmd.h                    |   1 +
 config/common_linuxapp                    |   1 +
 examples/multi-pthread/Makefile           |  57 ++++++
 examples/multi-pthread/main.c             | 232 ++++++++++++++++++++++++
 examples/multi-pthread/main.h             |  46 +++++
 lib/librte_eal/common/include/rte_eal.h   |  15 ++
 lib/librte_eal/common/include/rte_lcore.h |  12 ++
 lib/librte_eal/linuxapp/eal/eal_thread.c  | 282 +++++++++++++++++++++++++++---
 lib/librte_mempool/rte_mempool.h          |  22 +--
 lib/librte_ring/rte_ring.h                |   6 +-
 12 files changed, 755 insertions(+), 44 deletions(-)
 create mode 100644 examples/multi-pthread/Makefile
 create mode 100644 examples/multi-pthread/main.c
 create mode 100644 examples/multi-pthread/main.h


