From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id B25935934 for ; Thu, 22 Jan 2015 12:17:39 +0100 (CET) Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga103.fm.intel.com with ESMTP; 22 Jan 2015 03:11:52 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.09,448,1418112000"; d="scan'208";a="515950286" Received: from bricha3-mobl3.ger.corp.intel.com ([10.243.20.25]) by orsmga003.jf.intel.com with SMTP; 22 Jan 2015 03:10:50 -0800 Received: by (sSMTP sendmail emulation); Thu, 22 Jan 2015 11:17:33 +0025 Date: Thu, 22 Jan 2015 11:17:33 +0000 From: Bruce Richardson To: "Chi, Xiaobo (NSN - CN/Hangzhou)" , thomas.monjalon@6wind.com Message-ID: <20150122111732.GA4580@bricha3-MOBL3> References: <1417601518-16852-1-git-send-email-xiaobo.chi@nsn.com> <7F861DC0615E0C47A872E6F3C5FCDDBD05DCB1C7@BPXM14GP.gisp.nec.co.jp> <7F861DC0615E0C47A872E6F3C5FCDDBD05DD6E68@BPXM14GP.gisp.nec.co.jp> <20141216100344.GA9152@bricha3-MOBL3> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Organization: Intel Shannon Ltd. User-Agent: Mutt/1.5.23 (2014-03-12) Cc: "dev@dpdk.org" Subject: Re: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Jan 2015 11:17:40 -0000 On Thu, Jan 22, 2015 at 09:05:34AM +0000, Chi, Xiaobo (NSN - CN/Hangzhou) wrote: > Hi, Bruce, > Since the DPDK2.0 merge window is opened now, so is it possible for this patch to be one candidate for v2.0? > I searched in the DPDK patchwork(http://www.dpdk.org/dev/patchwork/project/dpdk/list/?state=*&q=memory-only&archive=both ), but can not find this V2 patch. Can you please help to check why? Thanks a lot. > > Filters: Search = memory-only remove filter > Patch Date Submitter Delegate State > [dpdk-dev] add one option memory-only for those secondary PRBs 2014-12-02 chixiaobo Not Applicable > [dpdk-dev] add one option memory-only for those secondary PRBs 2014-12-02 chixiaobo Changes Requested > > Brgs, > Chi Xiaobo > That's a question that Thomas is better able to answer than me, since he is the man with control over patchwork! :-) Thomas, any feedback here? Thanks, /Bruce > > -----Original Message----- > > From: ext Bruce Richardson [mailto:bruce.richardson@intel.com] > Sent: Tuesday, December 16, 2014 6:04 PM > To: Chi, Xiaobo (NSN - CN/Hangzhou) > Cc: ext Hiroshi Shimamoto; dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes > > On Tue, Dec 16, 2014 at 09:26:48AM +0000, Chi, Xiaobo (NSN - CN/Hangzhou) wrote: > > Hi, Bruce, > > How about this patch, can it be merged to master branch? Thanks. > > > > Brgs, > > Chi Xiaobo > > > > At this point, I think we are well past code-freeze for new features for 1.8, > but this looks a good candidate for 2.0 once the merge window for that opens. > > /Bruce > > > > > -----Original Message----- > > From: Chi, Xiaobo (NSN - CN/Hangzhou) > > Sent: Monday, December 15, 2014 5:58 PM > > To: 'ext Hiroshi Shimamoto'; dev@dpdk.org > > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes > > > > Hi, Hiroshi, > > Yes, the should be performance degradation, not only due to the mempool cache, but also due to process scheduling overhead (lead by no CPU pin.) > > I have not done the performance testing. In my project scenarios, those SECONDARY processes only send/receive messages to/from the PRIMARY process via mempool/ring, the throughput is not so high, so the performance degradation is not critical to us. but there are dozens of SECONDARY processes in our system, it will be hard to manually properly pin them to different CPU cores, what we want is to apply linux standard scheduling mechanism to do load balance between CPU cores. > > > > Brgs, > > Chi Xiaobo > > > > > > -----Original Message----- > > From: ext Hiroshi Shimamoto [mailto:h-shimamoto@ct.jp.nec.com] > > Sent: Thursday, December 11, 2014 11:03 AM > > To: Chi, Xiaobo (NSN - CN/Hangzhou); dev@dpdk.org > > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes > > > > Hi, > > > > sorry for the delay. > > > > > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes > > > > > > Hi, Hiroshi, > > > Yes, you are right, in order to avoid such problem, while create the mempool, which shall be shared between the primary > > > process and those secondary Processes, we need to assign the cache_size param value to be zero. And in order to make the > > > system more stable, it's better to define the RTE_MEMPOOL_CACHE_MAX_SIZE to be 0 in rte_config.h. > > > > Yes, it prevents the data corruption, but it also hurts the performance. > > I think, if we use the mbuf w/o cache for PMD, we will see the performance degradation. > > > > Don't you have any number? > > > > thanks, > > Hiroshi > > > > > > > > /* create the mempool */ > > > struct rte_mempool * > > > rte_mempool_create(const char *name, unsigned n, unsigned elt_size, > > > unsigned cache_size, unsigned private_data_size, > > > rte_mempool_ctor_t *mp_init, void *mp_init_arg, > > > rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg, > > > int socket_id, unsigned flags); > > > > > > > > > Brgs, > > > Chi xiaobo > > > > > > > > > -----Original Message----- > > > From: ext Hiroshi Shimamoto [mailto:h-shimamoto@ct.jp.nec.com] > > > Sent: Wednesday, December 03, 2014 6:54 PM > > > To: Chi, Xiaobo (NSN - CN/Hangzhou); dev@dpdk.org > > > Subject: RE: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes > > > > > > Hi, > > > > > > > Subject: [dpdk-dev] [PATCH v2] add one option memory-only for secondary processes > > > > > > > > From: Chi Xiaobo > > > > > > > > Problem: There is one normal DPDK processes deployment scenarios: one primary process and several (even hundreds) secondary > > > > processes; all outside packets/messages are sent/received by primary process and then distribute them to those secondary > > > > processes by DPDK's ring/sharedmemory mechanism. In such scenarios, those SECONDARY processes need only hugepage based > > > > sharememory mechanism and it?��s upper libs (such as ring, mempool, etc.), they need not cpu core pinning, iopl privilege > > > > changing , pci device, timer, alarm, interrupt, shared_driver_list, core_info, threads for each core, etc. Then, for > > > > such kind of SECONDARY processes, the current rte_eal_init() is too heavy. > > > > > > > > Solution:One new EAL initializing argument, --memory-only, is added. It is only for those SECONDARY processes which > > > only > > > > want to share memory with other processes. if this argument is defined, users need not define those mandatory arguments, > > > > such as -c and -n, due to we don't want to pin such kind of processes to any CPUs. > > > > > > however, we need the lcore_id per thread to use mempool. > > > If the lcore_id is not initialized, it must be 0, and multiple threads will break > > > mempool caches per thread, because of race condition. > > > We have to assign lcore_id per thread, these ids must not be overlapped, or disable > > > mempool handling in SECONDARY process. > > > > > > thanks, > > > Hiroshi > > > > > > > Signed-off-by: Chi Xiaobo > > > > --- > > > > lib/librte_eal/common/eal_common_options.c | 17 ++++++++++++--- > > > > lib/librte_eal/common/eal_internal_cfg.h | 1 + > > > > lib/librte_eal/common/eal_options.h | 2 ++ > > > > lib/librte_eal/linuxapp/eal/eal.c | 34 +++++++++++++++++------------- > > > > 4 files changed, 36 insertions(+), 18 deletions(-) > > > > > > > > diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c > > > > index e2810ab..7b18498 100644 > > > > --- a/lib/librte_eal/common/eal_common_options.c > > > > +++ b/lib/librte_eal/common/eal_common_options.c > > > > @@ -85,6 +85,7 @@ eal_long_options[] = { > > > > {OPT_XEN_DOM0, 0, 0, OPT_XEN_DOM0_NUM}, > > > > {OPT_CREATE_UIO_DEV, 1, NULL, OPT_CREATE_UIO_DEV_NUM}, > > > > {OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM}, > > > > + {OPT_MEMORY_ONLY, 0, NULL, OPT_MEMORY_ONLY_NUM}, > > > > {0, 0, 0, 0} > > > > }; > > > > > > > > @@ -126,6 +127,7 @@ eal_reset_internal_config(struct internal_config *internal_cfg) > > > > internal_cfg->no_hpet = 1; > > > > #endif > > > > internal_cfg->vmware_tsc_map = 0; > > > > + internal_cfg->memory_only= 0; > > > > } > > > > > > > > /* > > > > @@ -454,6 +456,10 @@ eal_parse_common_option(int opt, const char *optarg, > > > > conf->process_type = eal_parse_proc_type(optarg); > > > > break; > > > > > > > > + case OPT_MEMORY_ONLY_NUM: > > > > + conf->memory_only= 1; > > > > + break; > > > > + > > > > case OPT_MASTER_LCORE_NUM: > > > > if (eal_parse_master_lcore(optarg) < 0) { > > > > RTE_LOG(ERR, EAL, "invalid parameter for --" > > > > @@ -525,9 +531,9 @@ eal_check_common_options(struct internal_config *internal_cfg) > > > > { > > > > struct rte_config *cfg = rte_eal_get_configuration(); > > > > > > > > - if (!lcores_parsed) { > > > > - RTE_LOG(ERR, EAL, "CPU cores must be enabled with options " > > > > - "-c or -l\n"); > > > > + if (!lcores_parsed && !(internal_cfg->process_type == RTE_PROC_SECONDARY&& internal_cfg->memory_only) ) { > > > > + RTE_LOG(ERR, EAL, "For those processes without memory-only option, CPU cores " > > > > + "must be enabled with options -c or -l\n"); > > > > return -1; > > > > } > > > > if (cfg->lcore_role[cfg->master_lcore] != ROLE_RTE) { > > > > @@ -545,6 +551,10 @@ eal_check_common_options(struct internal_config *internal_cfg) > > > > "specified\n"); > > > > return -1; > > > > } > > > > + if ( internal_cfg->process_type != RTE_PROC_SECONDARY && internal_cfg->memory_only ) { > > > > + RTE_LOG(ERR, EAL, "only secondary processes can specify memory-only option.\n"); > > > > + return -1; > > > > + } > > > > if (index(internal_cfg->hugefile_prefix, '%') != NULL) { > > > > RTE_LOG(ERR, EAL, "Invalid char, '%%', in --"OPT_FILE_PREFIX" " > > > > "option\n"); > > > > @@ -590,6 +600,7 @@ eal_common_usage(void) > > > > " --"OPT_SYSLOG" : set syslog facility\n" > > > > " --"OPT_LOG_LEVEL" : set default log level\n" > > > > " --"OPT_PROC_TYPE" : type of this process\n" > > > > + " --"OPT_MEMORY_ONLY": only use shared memory, valid only for secondary process.\n" > > > > " --"OPT_PCI_BLACKLIST", -b: add a PCI device in black list.\n" > > > > " Prevent EAL from using this PCI device. The argument\n" > > > > " format is .\n" > > > > diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h > > > > index aac6abf..f51f0a2 100644 > > > > --- a/lib/librte_eal/common/eal_internal_cfg.h > > > > +++ b/lib/librte_eal/common/eal_internal_cfg.h > > > > @@ -85,6 +85,7 @@ struct internal_config { > > > > > > > > unsigned num_hugepage_sizes; /**< how many sizes on this system */ > > > > struct hugepage_info hugepage_info[MAX_HUGEPAGE_SIZES]; > > > > + volatile unsigned memory_only; /** > > > }; > > > > extern struct internal_config internal_config; /**< Global EAL configuration. */ > > > > > > > > diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h > > > > index e476f8d..87cc5db 100644 > > > > --- a/lib/librte_eal/common/eal_options.h > > > > +++ b/lib/librte_eal/common/eal_options.h > > > > @@ -77,6 +77,8 @@ enum { > > > > OPT_CREATE_UIO_DEV_NUM, > > > > #define OPT_VFIO_INTR "vfio-intr" > > > > OPT_VFIO_INTR_NUM, > > > > +#define OPT_MEMORY_ONLY "memory-only" > > > > + OPT_MEMORY_ONLY_NUM, > > > > OPT_LONG_MAX_NUM > > > > }; > > > > > > > > diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c > > > > index 89f3b5e..c160771 100644 > > > > --- a/lib/librte_eal/linuxapp/eal/eal.c > > > > +++ b/lib/librte_eal/linuxapp/eal/eal.c > > > > @@ -752,14 +752,6 @@ rte_eal_init(int argc, char **argv) > > > > > > > > rte_config_init(); > > > > > > > > - if (rte_eal_pci_init() < 0) > > > > - rte_panic("Cannot init PCI\n"); > > > > - > > > > -#ifdef RTE_LIBRTE_IVSHMEM > > > > - if (rte_eal_ivshmem_init() < 0) > > > > - rte_panic("Cannot init IVSHMEM\n"); > > > > -#endif > > > > - > > > > if (rte_eal_memory_init() < 0) > > > > rte_panic("Cannot init memory\n"); > > > > > > > > @@ -772,14 +764,30 @@ rte_eal_init(int argc, char **argv) > > > > if (rte_eal_tailqs_init() < 0) > > > > rte_panic("Cannot init tail queues for objects\n"); > > > > > > > > + if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) > > > > + rte_panic("Cannot init logs\n"); > > > > + > > > > + eal_check_mem_on_local_socket(); > > > > + > > > > + rte_eal_mcfg_complete(); > > > > + > > > > + /*with memory-only option, we need not cpu affinity, pci device, alarm, external devices, interrupt, etc. */ > > > > + if( internal_config.memory_only ){ > > > > + RTE_LOG (DEBUG, EAL, "memory-only defined, so only memory being initialized.\n"); > > > > + return 0; > > > > + } > > > > + > > > > + if (rte_eal_pci_init() < 0) > > > > + rte_panic("Cannot init PCI\n"); > > > > + > > > > #ifdef RTE_LIBRTE_IVSHMEM > > > > + if (rte_eal_ivshmem_init() < 0) > > > > + rte_panic("Cannot init IVSHMEM\n"); > > > > + > > > > if (rte_eal_ivshmem_obj_init() < 0) > > > > rte_panic("Cannot init IVSHMEM objects\n"); > > > > #endif > > > > > > > > - if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) > > > > - rte_panic("Cannot init logs\n"); > > > > - > > > > if (rte_eal_alarm_init() < 0) > > > > rte_panic("Cannot init interrupt-handling thread\n"); > > > > > > > > @@ -789,10 +797,6 @@ rte_eal_init(int argc, char **argv) > > > > if (rte_eal_timer_init() < 0) > > > > rte_panic("Cannot init HPET or TSC timers\n"); > > > > > > > > - eal_check_mem_on_local_socket(); > > > > - > > > > - rte_eal_mcfg_complete(); > > > > - > > > > TAILQ_FOREACH(solib, &solib_list, next) { > > > > RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name); > > > > solib->lib_handle = dlopen(solib->name, RTLD_NOW); > > > > -- > > > > 1.9.4.msysgit.2 > >