From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.droids-corp.org (zoll.droids-corp.org [94.23.50.67]) by dpdk.org (Postfix) with ESMTP id B6621590E for ; Fri, 14 Oct 2016 14:11:04 +0200 (CEST) Received: from lfbn-1-5996-232.w90-110.abo.wanadoo.fr ([90.110.195.232] helo=[192.168.1.13]) by mail.droids-corp.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1bv1NO-00072q-LF; Fri, 14 Oct 2016 14:14:15 +0200 To: Hemant Agrawal References: <1473959607-1951-1-git-send-email-hemant.agrawal@nxp.com> <1474044395-11627-1-git-send-email-hemant.agrawal@nxp.com> <1474044395-11627-2-git-send-email-hemant.agrawal@nxp.com> <7e26965e-173b-1cf6-0beb-c94712ad615a@nxp.com> Cc: dev@dpdk.org, jerin.jacob@caviumnetworks.com, david.hunt@intel.com From: Olivier Matz Message-ID: <412b3d71-f6df-7ee5-6189-ed7af1268fed@6wind.com> Date: Fri, 14 Oct 2016 14:10:54 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.2.0 MIME-Version: 1.0 In-Reply-To: <7e26965e-173b-1cf6-0beb-c94712ad615a@nxp.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v3 2/2] mempool: pktmbuf pool default fallback for mempool ops error X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Oct 2016 12:11:04 -0000 Hi Hemant, Sorry for the late answer. Please see some comments inline. On 10/13/2016 03:15 PM, Hemant Agrawal wrote: > Hi Olivier, > Any updates w.r.t this patch set? > > Regards > Hemant > On 9/22/2016 6:42 PM, Hemant Agrawal wrote: >> Hi Olivier >> >> On 9/19/2016 7:27 PM, Olivier Matz wrote: >>> Hi Hemant, >>> >>> On 09/16/2016 06:46 PM, Hemant Agrawal wrote: >>>> In the rte_pktmbuf_pool_create, if the default external mempool is >>>> not available, the implementation can default to "ring_mp_mc", which >>>> is an software implementation. >>>> >>>> Signed-off-by: Hemant Agrawal >>>> --- >>>> Changes in V3: >>>> * adding warning message to say that falling back to default sw pool >>>> --- >>>> lib/librte_mbuf/rte_mbuf.c | 8 ++++++++ >>>> 1 file changed, 8 insertions(+) >>>> >>>> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c >>>> index 4846b89..8ab0eb1 100644 >>>> --- a/lib/librte_mbuf/rte_mbuf.c >>>> +++ b/lib/librte_mbuf/rte_mbuf.c >>>> @@ -176,6 +176,14 @@ rte_pktmbuf_pool_create(const char *name, >>>> unsigned n, >>>> >>>> rte_errno = rte_mempool_set_ops_byname(mp, >>>> RTE_MBUF_DEFAULT_MEMPOOL_OPS, NULL); >>>> + >>>> + /* on error, try falling back to the software based default >>>> pool */ >>>> + if (rte_errno == -EOPNOTSUPP) { >>>> + RTE_LOG(WARNING, MBUF, "Default HW Mempool not supported. " >>>> + "falling back to sw mempool \"ring_mp_mc\""); >>>> + rte_errno = rte_mempool_set_ops_byname(mp, "ring_mp_mc", >>>> NULL); >>>> + } >>>> + >>>> if (rte_errno != 0) { >>>> RTE_LOG(ERR, MBUF, "error setting mempool handler\n"); >>>> return NULL; >>>> >>> >>> Without adding a new method ".supported()", the first call to >>> rte_mempool_populate() could return the same error ENOTSUP. In this >>> case, it is still possible to fallback. >>> >> It will be bit late. >> >> On failure, than we have to set the default ops and do a goto before >> rte_pktmbuf_pool_init(mp, &mbp_priv); I still think we can do the job without adding the .supported() method. The following code is just an (untested) example: struct rte_mempool * rte_pktmbuf_pool_create(const char *name, unsigned n, unsigned cache_size, uint16_t priv_size, uint16_t data_room_size, int socket_id) { struct rte_mempool *mp; struct rte_pktmbuf_pool_private mbp_priv; unsigned elt_size; int ret; const char *ops[] = { RTE_MBUF_DEFAULT_MEMPOOL_OPS, "ring_mp_mc", NULL, }; const char **op; if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) { RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n", priv_size); rte_errno = EINVAL; return NULL; } elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size + (unsigned)data_room_size; mbp_priv.mbuf_data_room_size = data_room_size; mbp_priv.mbuf_priv_size = priv_size; for (op = &ops[0]; *op != NULL; op++) { mp = rte_mempool_create_empty(name, n, elt_size, cache_size, sizeof(struct rte_pktmbuf_pool_private), socket_id, 0); if (mp == NULL) return NULL; ret = rte_mempool_set_ops_byname(mp, *op, NULL); if (ret != 0) { RTE_LOG(ERR, MBUF, "error setting mempool handler\n"); rte_mempool_free(mp); if (ret == -ENOTSUP) continue; rte_errno = -ret; return NULL; } rte_pktmbuf_pool_init(mp, &mbp_priv); ret = rte_mempool_populate_default(mp); if (ret < 0) { rte_mempool_free(mp); if (ret == -ENOTSUP) continue; rte_errno = -ret; return NULL; } } rte_mempool_obj_iter(mp, rte_pktmbuf_init, NULL); return mp; } >>> I've just submitted an RFC, which I think is quite linked: >>> http://dpdk.org/ml/archives/dev/2016-September/046974.html >>> Assuming a new parameter "mempool_ops" is added to >>> rte_pktmbuf_pool_create(), would it make sense to fallback to >>> "ring_mp_mc"? What about just returning ENOTSUP? The application could >>> do the job and decide which sw fallback to use. >> >> We ran into this issue when trying to run the standard DPDK examples >> (l3fwd) in VM. Do you think, is it practical to add fallback handling in >> each of the DPDK examples? OK. What is still unclear for me, is how the software is aware of the different hardware-assisted handlers. Moreover, we could imagine more software handlers, which could be used depending on the use case. I think this choice has to be made by the user or the application: - the application may want to use a specific (sw or hw) handler: in this case, it want to be notified if it fails, instead of having a quiet fallback to ring_mp_mc - if several handlers are available, the application may want to try them in a specific order - maybe some handlers will have some limitations with some configurations or driver? The application could decide to use a different handler according the configuration. So that's why I think this is an application decision. >> Typically when someone is writing a application on host, he need not >> worry non-availability of the hw offloaded mempool. He may also want to >> run the same binary in virtual machine. In VM, it is not guaranteed that >> hw offloaded mempools will be available. Running the same binary is of course a need. But if your VM does not provide the same virtualized hardware than the host, I think the command line option makes sense. AFAIU, on the host, you can use a hw mempool handler or a sw one, and on the guest, only a sw one. So you need an option to select the behavior you want on the host, without recompiling. I understand that modifying all the applications is not a good option either. I'm thinking a a EAL parameter that would allow to configure a library (mbuf in this case). Something like kernel boot options. Example: testpmd -l 2,4 --opt mbuf.handler="ring_mp_mc" -- [app args] I don't know if this is feasible, this has to be discussed first, but what do you think on the principle? The value could also be an ordered list if we want a fallback, and this option could be overriden by the application before creating the mbuf pool. There was some discussions about a kind of dpdk global configuration some time ago. Regards, Olivier