DPDK patches and discussions
 help / color / mirror / Atom feed
From: Bruce Richardson <bruce.richardson@intel.com>
To: "Chi, Xiaobo (NSN - CN/Hangzhou)" <xiaobo.chi@nsn.com>,
	thomas.monjalon@6wind.com
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
Date: Thu, 22 Jan 2015 11:17:33 +0000	[thread overview]
Message-ID: <20150122111732.GA4580@bricha3-MOBL3> (raw)
In-Reply-To: <EF703E8970265941A1316EEFE0AA7B6C40A92F4B@SGSIMBX004.nsn-intra.net>

On Thu, Jan 22, 2015 at 09:05:34AM +0000, Chi, Xiaobo (NSN - CN/Hangzhou) wrote:
> Hi, Bruce,
> Since the DPDK2.0 merge window is opened now, so is it possible for this patch to be one candidate for v2.0?
> I searched in the DPDK patchwork(http://www.dpdk.org/dev/patchwork/project/dpdk/list/?state=*&q=memory-only&archive=both ), but can not find this V2 patch. Can you please help to check why? Thanks a lot.
> 
> Filters: Search = memory-only  remove filter
> Patch	 Date	Submitter	Delegate	State
> [dpdk-dev] add one option memory-only for those secondary PRBs	2014-12-02	chixiaobo		Not Applicable
> [dpdk-dev] add one option memory-only for those secondary PRBs	2014-12-02	chixiaobo		Changes Requested
> 
> Brgs,
> Chi Xiaobo
> 
That's a question that Thomas is better able to answer than me, since he is the
man with control over patchwork! :-)

Thomas, any feedback here?

Thanks,
/Bruce
> 
> -----Original Message-----
> 
> From: ext Bruce Richardson [mailto:bruce.richardson@intel.com] 
> Sent: Tuesday, December 16, 2014 6:04 PM
> To: Chi, Xiaobo (NSN - CN/Hangzhou)
> Cc: ext Hiroshi Shimamoto; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> 
> On Tue, Dec 16, 2014 at 09:26:48AM +0000, Chi, Xiaobo (NSN - CN/Hangzhou) wrote:
> > Hi, Bruce,
> > How about this patch, can it be merged to master branch? Thanks.
> > 
> > Brgs,
> > Chi Xiaobo
> > 
> 
> At this point, I think we are well past code-freeze for new features for 1.8,
> but this looks a good candidate for 2.0 once the merge window for that opens.
> 
> /Bruce
> 
> > 
> > -----Original Message-----
> > From: Chi, Xiaobo (NSN - CN/Hangzhou) 
> > Sent: Monday, December 15, 2014 5:58 PM
> > To: 'ext Hiroshi Shimamoto'; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > 
> > Hi, Hiroshi,
> > Yes, the should be performance degradation, not only due to the mempool cache, but also due to process scheduling overhead (lead by no CPU pin.)
> > I have not done the performance testing. In my project scenarios, those SECONDARY processes only send/receive messages to/from the PRIMARY process via mempool/ring, the throughput is not so high, so the performance degradation is not critical to us. but there are dozens of SECONDARY processes in our system, it will be hard to manually properly pin them to different CPU cores, what we want is to apply linux standard scheduling mechanism to do load balance between CPU cores.
> > 
> > Brgs,
> > Chi Xiaobo
> > 
> > 
> > -----Original Message-----
> > From: ext Hiroshi Shimamoto [mailto:h-shimamoto@ct.jp.nec.com] 
> > Sent: Thursday, December 11, 2014 11:03 AM
> > To: Chi, Xiaobo (NSN - CN/Hangzhou); dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > 
> > Hi,
> > 
> > sorry for the delay.
> > 
> > > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > > 
> > > Hi, Hiroshi,
> > > Yes, you are right, in order to avoid such problem, while create the mempool, which shall be shared between the primary
> > > process and those secondary Processes, we need to assign the cache_size param value to be zero. And in order to make the
> > > system more stable, it's better to define the RTE_MEMPOOL_CACHE_MAX_SIZE to be 0 in rte_config.h.
> > 
> > Yes, it prevents the data corruption, but it also hurts the performance.
> > I think, if we use the mbuf w/o cache for PMD, we will see the performance degradation.
> > 
> > Don't you have any number?
> > 
> > thanks,
> > Hiroshi
> > 
> > > 
> > > /* create the mempool */
> > > struct rte_mempool *
> > > rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
> > > 		   unsigned cache_size, unsigned private_data_size,
> > > 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
> > > 		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
> > > 		   int socket_id, unsigned flags);
> > > 
> > > 
> > > Brgs,
> > > Chi xiaobo
> > > 
> > > 
> > > -----Original Message-----
> > > From: ext Hiroshi Shimamoto [mailto:h-shimamoto@ct.jp.nec.com]
> > > Sent: Wednesday, December 03, 2014 6:54 PM
> > > To: Chi, Xiaobo (NSN - CN/Hangzhou); dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > > 
> > > Hi,
> > > 
> > > > Subject: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes
> > > >
> > > > From: Chi Xiaobo <xiaobo.chi@nsn.com>
> > > >
> > > > Problem: There is one normal DPDK processes deployment scenarios: one primary process and several (even hundreds) secondary
> > > > processes; all outside packets/messages are sent/received by primary process and then distribute them to those secondary
> > > > processes by DPDK's ring/sharedmemory mechanism. In such scenarios, those SECONDARY processes need only hugepage based
> > > > sharememory mechanism and it?��s upper libs (such as ring, mempool, etc.), they need not cpu core pinning, iopl privilege
> > > > changing , pci device, timer, alarm, interrupt, shared_driver_list,  core_info, threads for each core, etc. Then, for
> > > > such kind of SECONDARY processes, the current rte_eal_init() is too heavy.
> > > >
> > > > Solution:One new EAL initializing argument, --memory-only, is added. It is only for those SECONDARY processes which
> > > only
> > > > want to share memory with other processes. if this argument is defined, users need not define those mandatory arguments,
> > > > such as -c and -n, due to we don't want to pin such kind of processes to any CPUs.
> > > 
> > > however, we need the lcore_id per thread to use mempool.
> > > If the lcore_id is not initialized, it must be 0, and multiple threads will break
> > > mempool caches per thread, because of race condition.
> > > We have to assign lcore_id per thread, these ids must not be overlapped, or disable
> > > mempool handling in SECONDARY process.
> > > 
> > > thanks,
> > > Hiroshi
> > > 
> > > > Signed-off-by: Chi Xiaobo <xiaobo.chi@nsn.com>
> > > > ---
> > > >  lib/librte_eal/common/eal_common_options.c | 17 ++++++++++++---
> > > >  lib/librte_eal/common/eal_internal_cfg.h   |  1 +
> > > >  lib/librte_eal/common/eal_options.h        |  2 ++
> > > >  lib/librte_eal/linuxapp/eal/eal.c          | 34 +++++++++++++++++-------------
> > > >  4 files changed, 36 insertions(+), 18 deletions(-)
> > > >
> > > > diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
> > > > index e2810ab..7b18498 100644
> > > > --- a/lib/librte_eal/common/eal_common_options.c
> > > > +++ b/lib/librte_eal/common/eal_common_options.c
> > > > @@ -85,6 +85,7 @@ eal_long_options[] = {
> > > >  	{OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM},
> > > >  	{OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM},
> > > >  	{OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
> > > > +	{OPT_MEMORY_ONLY, 0, NULL, OPT_MEMORY_ONLY_NUM},
> > > >  	{0, 0, 0, 0}
> > > >  };
> > > >
> > > > @@ -126,6 +127,7 @@ eal_reset_internal_config(struct internal_config *internal_cfg)
> > > >  	internal_cfg->no_hpet = 1;
> > > >  #endif
> > > >  	internal_cfg->vmware_tsc_map = 0;
> > > > +	internal_cfg->memory_only= 0;
> > > >  }
> > > >
> > > >  /*
> > > > @@ -454,6 +456,10 @@ eal_parse_common_option(int opt, const char *optarg,
> > > >  		conf->process_type = eal_parse_proc_type(optarg);
> > > >  		break;
> > > >
> > > > +	case OPT_MEMORY_ONLY_NUM:
> > > > +		conf->memory_only= 1;
> > > > +		break;
> > > > +
> > > >  	case OPT_MASTER_LCORE_NUM:
> > > >  		if (eal_parse_master_lcore(optarg) < 0) {
> > > >  			RTE_LOG(ERR, EAL, "invalid parameter for --"
> > > > @@ -525,9 +531,9 @@ eal_check_common_options(struct internal_config *internal_cfg)
> > > >  {
> > > >  	struct rte_config *cfg = rte_eal_get_configuration();
> > > >
> > > > -	if (!lcores_parsed) {
> > > > -		RTE_LOG(ERR, EAL, "CPU cores must be enabled with options "
> > > > -			"-c or -l\n");
> > > > +	if (!lcores_parsed && !(internal_cfg->process_type == RTE_PROC_SECONDARY&& internal_cfg->memory_only) ) {
> > > > +		RTE_LOG(ERR, EAL, "For those processes without memory-only option, CPU cores "
> > > > +							"must be enabled with options -c or -l\n");
> > > >  		return -1;
> > > >  	}
> > > >  	if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) {
> > > > @@ -545,6 +551,10 @@ eal_check_common_options(struct internal_config *internal_cfg)
> > > >  			"specified\n");
> > > >  		return -1;
> > > >  	}
> > > > +	if ( internal_cfg->process_type != RTE_PROC_SECONDARY && internal_cfg->memory_only ) {
> > > > +		RTE_LOG(ERR, EAL, "only secondary processes can specify memory-only option.\n");
> > > > +		return -1;
> > > > +	}
> > > >  	if (index(internal_cfg->hugefile_prefix, '%') != NULL) {
> > > >  		RTE_LOG(ERR, EAL, "Invalid char, '%%', in --"OPT_FILE_PREFIX" "
> > > >  			"option\n");
> > > > @@ -590,6 +600,7 @@ eal_common_usage(void)
> > > >  	       "  --"OPT_SYSLOG"     : set syslog facility\n"
> > > >  	       "  --"OPT_LOG_LEVEL"  : set default log level\n"
> > > >  	       "  --"OPT_PROC_TYPE"  : type of this process\n"
> > > > +	       "  --"OPT_MEMORY_ONLY": only use shared memory, valid only for secondary process.\n"
> > > >  	       "  --"OPT_PCI_BLACKLIST", -b: add a PCI device in black list.\n"
> > > >  	       "               Prevent EAL from using this PCI device. The argument\n"
> > > >  	       "               format is <domain:bus:devid.func>.\n"
> > > > diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
> > > > index aac6abf..f51f0a2 100644
> > > > --- a/lib/librte_eal/common/eal_internal_cfg.h
> > > > +++ b/lib/librte_eal/common/eal_internal_cfg.h
> > > > @@ -85,6 +85,7 @@ struct internal_config {
> > > >
> > > >  	unsigned num_hugepage_sizes;      /**< how many sizes on this system */
> > > >  	struct hugepage_info hugepage_info[MAX_HUGEPAGE_SIZES];
> > > > +	volatile unsigned memory_only;    /**<wheter the seconday process only need shared momory only or not */
> > > >  };
> > > >  extern struct internal_config internal_config; /**< Global EAL configuration. */
> > > >
> > > > diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
> > > > index e476f8d..87cc5db 100644
> > > > --- a/lib/librte_eal/common/eal_options.h
> > > > +++ b/lib/librte_eal/common/eal_options.h
> > > > @@ -77,6 +77,8 @@ enum {
> > > >  	OPT_CREATE_UIO_DEV_NUM,
> > > >  #define OPT_VFIO_INTR    "vfio-intr"
> > > >  	OPT_VFIO_INTR_NUM,
> > > > +#define OPT_MEMORY_ONLY  "memory-only"
> > > > +	OPT_MEMORY_ONLY_NUM,
> > > >  	OPT_LONG_MAX_NUM
> > > >  };
> > > >
> > > > diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> > > > index 89f3b5e..c160771 100644
> > > > --- a/lib/librte_eal/linuxapp/eal/eal.c
> > > > +++ b/lib/librte_eal/linuxapp/eal/eal.c
> > > > @@ -752,14 +752,6 @@ rte_eal_init(int argc, char **argv)
> > > >
> > > >  	rte_config_init();
> > > >
> > > > -	if (rte_eal_pci_init() < 0)
> > > > -		rte_panic("Cannot init PCI\n");
> > > > -
> > > > -#ifdef RTE_LIBRTE_IVSHMEM
> > > > -	if (rte_eal_ivshmem_init() < 0)
> > > > -		rte_panic("Cannot init IVSHMEM\n");
> > > > -#endif
> > > > -
> > > >  	if (rte_eal_memory_init() < 0)
> > > >  		rte_panic("Cannot init memory\n");
> > > >
> > > > @@ -772,14 +764,30 @@ rte_eal_init(int argc, char **argv)
> > > >  	if (rte_eal_tailqs_init() < 0)
> > > >  		rte_panic("Cannot init tail queues for objects\n");
> > > >
> > > > +	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
> > > > +		rte_panic("Cannot init logs\n");
> > > > +
> > > > +	eal_check_mem_on_local_socket();
> > > > +
> > > > +	rte_eal_mcfg_complete();
> > > > +
> > > > +    /*with memory-only option, we need not cpu affinity, pci device, alarm, external devices, interrupt, etc. */
> > > > +	if( internal_config.memory_only ){
> > > > +		RTE_LOG (DEBUG, EAL, "memory-only defined, so only memory being initialized.\n");
> > > > +		return 0;
> > > > +	}
> > > > +
> > > > +	if (rte_eal_pci_init() < 0)
> > > > +		rte_panic("Cannot init PCI\n");
> > > > +
> > > >  #ifdef RTE_LIBRTE_IVSHMEM
> > > > +	if (rte_eal_ivshmem_init() < 0)
> > > > +		rte_panic("Cannot init IVSHMEM\n");
> > > > +
> > > >  	if (rte_eal_ivshmem_obj_init() < 0)
> > > >  		rte_panic("Cannot init IVSHMEM objects\n");
> > > >  #endif
> > > >
> > > > -	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
> > > > -		rte_panic("Cannot init logs\n");
> > > > -
> > > >  	if (rte_eal_alarm_init() < 0)
> > > >  		rte_panic("Cannot init interrupt-handling thread\n");
> > > >
> > > > @@ -789,10 +797,6 @@ rte_eal_init(int argc, char **argv)
> > > >  	if (rte_eal_timer_init() < 0)
> > > >  		rte_panic("Cannot init HPET or TSC timers\n");
> > > >
> > > > -	eal_check_mem_on_local_socket();
> > > > -
> > > > -	rte_eal_mcfg_complete();
> > > > -
> > > >  	TAILQ_FOREACH(solib, &solib_list, next) {
> > > >  		RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name);
> > > >  		solib->lib_handle = dlopen(solib->name, RTLD_NOW);
> > > > --
> > > > 1.9.4.msysgit.2
> > 

  reply	other threads:[~2015-01-22 11:17 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-03 10:11 chixiaobo
2014-12-03 10:53 ` Hiroshi Shimamoto
2014-12-04  7:21   ` Chi, Xiaobo (NSN - CN/Hangzhou)
2014-12-11  3:02     ` Hiroshi Shimamoto
2014-12-15  9:57       ` Chi, Xiaobo (NSN - CN/Hangzhou)
2014-12-16  9:26       ` Chi, Xiaobo (NSN - CN/Hangzhou)
2014-12-16 10:03         ` Bruce Richardson
2015-01-22  9:05           ` Chi, Xiaobo (NSN - CN/Hangzhou)
2015-01-22 11:17             ` Bruce Richardson [this message]
2015-01-22 13:00               ` Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150122111732.GA4580@bricha3-MOBL3 \
    --to=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=thomas.monjalon@6wind.com \
    --cc=xiaobo.chi@nsn.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).