From: Bruce Richardson <bruce.richardson@intel.com>
To: "Chi, Xiaobo (NSN - CN/Hangzhou)" <xiaobo.chi@nsn.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
Date: Tue, 16 Dec 2014 10:03:44 +0000
Message-ID: <20141216100344.GA9152@bricha3-MOBL3>
In-Reply-To: <EF703E8970265941A1316EEFE0AA7B6C40A7A04D@SGSIMBX004.nsn-intra.net>
On Tue, Dec 16, 2014 at 09:26:48AM +0000, Chi, Xiaobo (NSN - CN/Hangzhou) wrote:
> Hi, Bruce,
> How about this patch, can it be merged to master branch? Thanks.
>
> Brgs,
> Chi Xiaobo
>
At this point, I think we are well past code-freeze for new features for 1.8,
but this looks a good candidate for 2.0 once the merge window for that opens.
/Bruce
>
> -----Original Message-----
> From: Chi, Xiaobo (NSN - CN/Hangzhou)
> Sent: Monday, December 15, 2014 5:58 PM
> To: 'ext Hiroshi Shimamoto'; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
>
> Hi, Hiroshi,
> Yes, there should be performance degradation, not only because of the mempool cache, but also because of process scheduling overhead (caused by not pinning the processes to CPUs).
> I have not done any performance testing. In our project, the SECONDARY processes only send/receive messages to/from the PRIMARY process via mempool/ring, and the throughput is not very high, so the performance degradation is not critical for us. However, there are dozens of SECONDARY processes in our system, and it would be hard to pin them all properly to different CPU cores by hand; what we want is to let the standard Linux scheduler balance the load across CPU cores.
>
> Brgs,
> Chi Xiaobo
>
>
> -----Original Message-----
> From: ext Hiroshi Shimamoto [mailto:h-shimamoto@ct.jp.nec.com]
> Sent: Thursday, December 11, 2014 11:03 AM
> To: Chi, Xiaobo (NSN - CN/Hangzhou); dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
>
> Hi,
>
> sorry for the delay.
>
> > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> >
> > Hi, Hiroshi,
> > Yes, you are right. To avoid this problem, when creating a mempool that will be shared between the primary
> > process and the secondary processes, we need to set the cache_size parameter to zero. And to make the
> > system more stable, it's better to define RTE_MEMPOOL_CACHE_MAX_SIZE as 0 in rte_config.h.
>
> Yes, it prevents the data corruption, but it also hurts the performance.
> I think, if we use the mbuf w/o cache for PMD, we will see the performance degradation.
>
> Do you have any numbers?
>
> thanks,
> Hiroshi
>
> >
> > /* create the mempool */
> > struct rte_mempool *
> > rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
> > unsigned cache_size, unsigned private_data_size,
> > rte_mempool_ctor_t *mp_init, void *mp_init_arg,
> > rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
> > int socket_id, unsigned flags);
> >
> >
> > Brgs,
> > Chi xiaobo
> >
> >
> > -----Original Message-----
> > From: ext Hiroshi Shimamoto [mailto:h-shimamoto@ct.jp.nec.com]
> > Sent: Wednesday, December 03, 2014 6:54 PM
> > To: Chi, Xiaobo (NSN - CN/Hangzhou); dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> >
> > Hi,
> >
> > > Subject: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > >
> > > From: Chi Xiaobo <xiaobo.chi@nsn.com>
> > >
> > > Problem: There is a common DPDK process deployment scenario: one primary process and several (even hundreds of) secondary
> > > processes; all outside packets/messages are sent/received by the primary process, which then distributes them to the secondary
> > > processes via DPDK's ring/shared-memory mechanism. In such scenarios, the SECONDARY processes need only the hugepage-based
> > > shared-memory mechanism and its upper-layer libs (such as ring, mempool, etc.); they do not need CPU core pinning, iopl privilege
> > > changes, PCI devices, timers, alarms, interrupts, shared_driver_list, core_info, threads for each core, etc. For
> > > this kind of SECONDARY process, the current rte_eal_init() is too heavy.
> > >
> > > Solution: One new EAL initialization argument, --memory-only, is added. It is only for SECONDARY processes that only
> > > want to share memory with other processes. If this argument is given, users need not supply the otherwise mandatory arguments,
> > > such as -c and -n, because we don't want to pin this kind of process to any CPUs.
> >
> > However, we need a per-thread lcore_id to use mempool.
> > If the lcore_id is not initialized, it will be 0, and multiple threads will corrupt
> > the per-thread mempool caches because of a race condition.
> > We have to either assign an lcore_id to each thread, with no overlap between the ids, or disable
> > mempool handling in the SECONDARY process.
> >
> > thanks,
> > Hiroshi
> >
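To illustrate the point above about non-overlapping ids, here is a minimal sketch (not DPDK code) of giving each thread its own id from a shared counter. The names worker_id_get, MAX_WORKERS and WORKER_ID_INVALID are hypothetical and only exist for this example. It covers threads inside a single process only; for ids to be unique across several secondary processes, the counter would presumably have to live in shared memory instead.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical limit on concurrent workers (illustration only). */
#define MAX_WORKERS 128
#define WORKER_ID_INVALID (-1)

static_assert(MAX_WORKERS > 0, "need at least one worker slot");

/* Shared by all threads in the process; zero-initialized. */
static atomic_int next_worker_id;

/* One copy per thread; stays invalid until the thread claims an id. */
static _Thread_local int worker_id = WORKER_ID_INVALID;

/*
 * Return the calling thread's id, claiming a fresh one on first use.
 * Returns WORKER_ID_INVALID once all MAX_WORKERS ids are taken.
 */
int worker_id_get(void)
{
	if (worker_id == WORKER_ID_INVALID) {
		/* Atomic increment: no two threads can observe the same value. */
		int id = atomic_fetch_add(&next_worker_id, 1);

		if (id >= MAX_WORKERS)
			return WORKER_ID_INVALID;
		worker_id = id;
	}
	return worker_id;
}
```

A thread that calls worker_id_get() repeatedly keeps getting the same value, so the id could index a per-thread cache slot without two threads ever sharing one.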
> > > Signed-off-by: Chi Xiaobo <xiaobo.chi@nsn.com>
> > > ---
> > > lib/librte_eal/common/eal_common_options.c | 17 ++++++++++++---
> > > lib/librte_eal/common/eal_internal_cfg.h | 1 +
> > > lib/librte_eal/common/eal_options.h | 2 ++
> > > lib/librte_eal/linuxapp/eal/eal.c | 34 +++++++++++++++++-------------
> > > 4 files changed, 36 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> > > index e2810ab..7b18498 100644
> > > --- a/lib/librte_eal/common/eal_common_options.c
> > > +++ b/lib/librte_eal/common/eal_common_options.c
> > > @@ -85,6 +85,7 @@ eal_long_options[] = {
> > > {OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
> > > {OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
> > > {OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> > > + {OPT_MEMORY_ONLY, 0, NULL, OPT_MEMORY_ONLY_NUM},
> > > {0, 0, 0, 0}
> > > };
> > >
> > > @@ -126,6 +127,7 @@ eal_reset_internal_config(struct internal_config *internal_cfg)
> > > internal_cfg->no_hpet = 1;
> > > #endif
> > > internal_cfg->vmware_tsc_map = 0;
> > > + internal_cfg->memory_only= 0;
> > > }
> > >
> > > /*
> > > @@ -454,6 +456,10 @@ eal_parse_common_option(int opt, const char *optarg,
> > > conf->process_type = eal_parse_proc_type(optarg);
> > > break;
> > >
> > > + case OPT_MEMORY_ONLY_NUM:
> > > + conf->memory_only= 1;
> > > + break;
> > > +
> > > case OPT_MASTER_LCORE_NUM:
> > > if (eal_parse_master_lcore(optarg) < 0) {
> > > RTE_LOG(ERR, EAL, "invalid parameter for --"
> > > @@ -525,9 +531,9 @@ eal_check_common_options(struct internal_config *internal_cfg)
> > > {
> > > struct rte_config *cfg = rte_eal_get_configuration();
> > >
> > > - if (!lcores_parsed) {
> > > - RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
> > > - "-c or -l\n");
> > > + if (!lcores_parsed && !(internal_cfg->process_type == RTE_PROC_SECONDARY&& internal_cfg->memory_only) ) {
> > > + RTE_LOG(ERR, EAL, "For those processes without memory-only option, CPU cores "
> > > + "must be enabled with options -c or -l\n");
> > > return -1;
> > > }
> > > if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
> > > @@ -545,6 +551,10 @@ eal_check_common_options(struct internal_config *internal_cfg)
> > > "specified\n");
> > > return -1;
> > > }
> > > + if ( internal_cfg->process_type != RTE_PROC_SECONDARY && internal_cfg->memory_only ) {
> > > + RTE_LOG(ERR, EAL, "only secondary processes can specify memory-only option.\n");
> > > + return -1;
> > > + }
> > > if (index(internal_cfg->hugefile_prefix, '%') != NULL) {
> > > RTE_LOG(ERR, EAL, "Invalid char, '%%', in --"OPT_FILE_PREFIX" "
> > > "option\n");
> > > @@ -590,6 +600,7 @@ eal_common_usage(void)
> > > " --"OPT_SYSLOG" : set syslog facility\n"
> > > " --"OPT_LOG_LEVEL" : set default log level\n"
> > > " --"OPT_PROC_TYPE" : type of this process\n"
> > > + " --"OPT_MEMORY_ONLY": only use shared memory, valid only for secondary process.\n"
> > > " --"OPT_PCI_BLACKLIST", -b: add a PCI device in black list.\n"
> > > " Prevent EAL from using this PCI device. The argument\n"
> > > " format is <domain:bus:devid.func>.\n"
> > > diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
> > > index aac6abf..f51f0a2 100644
> > > --- a/lib/librte_eal/common/eal_internal_cfg.h
> > > +++ b/lib/librte_eal/common/eal_internal_cfg.h
> > > @@ -85,6 +85,7 @@ struct internal_config {
> > >
> > > unsigned num_hugepage_sizes; /**< how many sizes on this system */
> > > struct hugepage_info hugepage_info[MAX_HUGEPAGE_SIZES];
> > > + volatile unsigned memory_only; /**< whether the secondary process needs shared memory only */
> > > };
> > > extern struct internal_config internal_config; /**< Global EAL configuration. */
> > >
> > > diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
> > > index e476f8d..87cc5db 100644
> > > --- a/lib/librte_eal/common/eal_options.h
> > > +++ b/lib/librte_eal/common/eal_options.h
> > > @@ -77,6 +77,8 @@ enum {
> > > OPT_CREATE_UIO_DEV_NUM,
> > > #define OPT_VFIO_INTR "vfio-intr"
> > > OPT_VFIO_INTR_NUM,
> > > +#define OPT_MEMORY_ONLY "memory-only"
> > > + OPT_MEMORY_ONLY_NUM,
> > > OPT_LONG_MAX_NUM
> > > };
> > >
> > > diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> > > index 89f3b5e..c160771 100644
> > > --- a/lib/librte_eal/linuxapp/eal/eal.c
> > > +++ b/lib/librte_eal/linuxapp/eal/eal.c
> > > @@ -752,14 +752,6 @@ rte_eal_init(int argc, char **argv)
> > >
> > > rte_config_init();
> > >
> > > - if (rte_eal_pci_init() < 0)
> > > - rte_panic("Cannot init PCI\n");
> > > -
> > > -#ifdef RTE_LIBRTE_IVSHMEM
> > > - if (rte_eal_ivshmem_init() < 0)
> > > - rte_panic("Cannot init IVSHMEM\n");
> > > -#endif
> > > -
> > > if (rte_eal_memory_init() < 0)
> > > rte_panic("Cannot init memory\n");
> > >
> > > @@ -772,14 +764,30 @@ rte_eal_init(int argc, char **argv)
> > > if (rte_eal_tailqs_init() < 0)
> > > rte_panic("Cannot init tail queues for objects\n");
> > >
> > > + if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
> > > + rte_panic("Cannot init logs\n");
> > > +
> > > + eal_check_mem_on_local_socket();
> > > +
> > > + rte_eal_mcfg_complete();
> > > +
> > > + /*with memory-only option, we need not cpu affinity, pci device, alarm, external devices, interrupt, etc. */
> > > + if( internal_config.memory_only ){
> > > + RTE_LOG (DEBUG, EAL, "memory-only defined, so only memory being initialized.\n");
> > > + return 0;
> > > + }
> > > +
> > > + if (rte_eal_pci_init() < 0)
> > > + rte_panic("Cannot init PCI\n");
> > > +
> > > #ifdef RTE_LIBRTE_IVSHMEM
> > > + if (rte_eal_ivshmem_init() < 0)
> > > + rte_panic("Cannot init IVSHMEM\n");
> > > +
> > > if (rte_eal_ivshmem_obj_init() < 0)
> > > rte_panic("Cannot init IVSHMEM objects\n");
> > > #endif
> > >
> > > - if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
> > > - rte_panic("Cannot init logs\n");
> > > -
> > > if (rte_eal_alarm_init() < 0)
> > > rte_panic("Cannot init interrupt-handling thread\n");
> > >
> > > @@ -789,10 +797,6 @@ rte_eal_init(int argc, char **argv)
> > > if (rte_eal_timer_init() < 0)
> > > rte_panic("Cannot init HPET or TSC timers\n");
> > >
> > > - eal_check_mem_on_local_socket();
> > > -
> > > - rte_eal_mcfg_complete();
> > > -
> > > TAILQ_FOREACH(solib, &solib_list, next) {
> > > RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name);
> > > solib->lib_handle = dlopen(solib->name, RTLD_NOW);
> > > --
> > > 1.9.4.msysgit.2
>
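For reference, the option check that the patch adds to eal_check_common_options() can be modelled on its own. This is a minimal sketch under stated assumptions: struct opts, enum proc_type and check_opts are hypothetical stand-ins for the EAL internals, and the other checks done by the real function (master lcore role and so on) are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the EAL types touched by the patch. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

struct opts {
	enum proc_type process_type;
	int memory_only;	/* 1 if --memory-only was given */
	int lcores_parsed;	/* 1 if -c or -l was given */
};

/* Return 0 if the option combination is accepted, -1 otherwise. */
int check_opts(const struct opts *o)
{
	assert(o != NULL);

	/* A core list is mandatory unless this is a memory-only secondary. */
	if (!o->lcores_parsed &&
	    !(o->process_type == PROC_SECONDARY && o->memory_only))
		return -1;

	/* Only secondary processes may ask for memory-only. */
	if (o->process_type != PROC_SECONDARY && o->memory_only)
		return -1;

	return 0;
}
```

With this model, a secondary process passing --memory-only and no core list is accepted, while a primary process passing --memory-only is rejected whether or not it also supplies -c or -l.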