From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR01-HE1-obe.outbound.protection.outlook.com (mail-he1eur01on0082.outbound.protection.outlook.com [104.47.0.82]) by dpdk.org (Postfix) with ESMTP id 53EF736E for ; Thu, 8 Dec 2016 09:57:52 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nxp.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=oHsKn8zIDe12MhyRVT420NoDsfWFltYXzrAPNjh9wzA=; b=oTzDlNpPuF9plPVjs0q4sfipQ4pTSHK4fKuerKr00m6Uqet3P1zqV89AcMDOJnIlG+rBl0yCxhfBqToagUOcoZxCxdCa6YPG4xW7E+DkU/cR4LXFAL9mbyOhT3vJEz9T1QwB+48NgVV3/Ku95i8TLXbDmg93WXl6hxdCEcQ1UP4= Received: from DB5PR04MB1605.eurprd04.prod.outlook.com (10.164.38.147) by DB5PR04MB1608.eurprd04.prod.outlook.com (10.164.38.150) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P384) id 15.1.761.9; Thu, 8 Dec 2016 08:57:49 +0000 Received: from DB5PR04MB1605.eurprd04.prod.outlook.com ([10.164.38.147]) by DB5PR04MB1605.eurprd04.prod.outlook.com ([10.164.38.147]) with mapi id 15.01.0771.008; Thu, 8 Dec 2016 08:57:49 +0000 From: Hemant Agrawal To: Olivier Matz CC: "dev@dpdk.org" , "jerin.jacob@caviumnetworks.com" , "david.hunt@intel.com" Thread-Topic: [PATCH v3 2/2] mempool: pktmbuf pool default fallback for mempool ops error Thread-Index: AQHSJhQK19GhEpRKXUi0B/Amkg4mI6DNlr0wgBdhogCAGRvAQA== Date: Thu, 8 Dec 2016 08:57:49 +0000 Message-ID: References: <1473959607-1951-1-git-send-email-hemant.agrawal@nxp.com> <1474044395-11627-1-git-send-email-hemant.agrawal@nxp.com> <1474044395-11627-2-git-send-email-hemant.agrawal@nxp.com> <7e26965e-173b-1cf6-0beb-c94712ad615a@nxp.com> <412b3d71-f6df-7ee5-6189-ed7af1268fed@6wind.com> <376759b0-5d49-b904-b793-4f907bea8537@6wind.com> In-Reply-To: <376759b0-5d49-b904-b793-4f907bea8537@6wind.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=hemant.agrawal@nxp.com; x-originating-ip: [192.88.169.1] x-ms-office365-filtering-correlation-id: 5615cba4-bd44-4ef2-f66b-08d41f484974 x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:(22001);SRVR:DB5PR04MB1608; x-microsoft-exchange-diagnostics: 1; DB5PR04MB1608; 7:Q47EzGNq2GgtThwMzr1elBk68fe34gh5uZnLatKQ/uNSJXpymJQwJxJZ4nL8u2T1SGxnEk1p90pNPbImbSFCsUhldjxFFZiXkNsnKcmNjklfyNg8Npjf0jB7iimAloEw06MBwxOTjadKzdk6U2MQZZ9RKs46O0Ugt8id4IhFhmAWNMMCcK4Wr4IKNH6IJoH9BGxywtoyvS0nwRrT/e16ntDZ0iYMuw/TZCX8aXgziSeAIuAwEJC80/7Dc7qCb9o65MT3jG8bQXpubQh6E0G/SbcZcNu2ztydmP5+3XhvdWvpF3957ZJVdl9ccWzFLYSYMdy48EKrv99KYKgDCD/mvJmVrvBzb/yJ3FvO+7gNC7rDDg+KeBfnT6r9Ol8JgzMGvFpndR1bUq1RfiJNvO6XsIws9abdwxSHaL69IRVLu1qAgrfihmQhDMf9HtLRT9OpTMjSPkOYpuaU/U2krnwrYg== x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(278428928389397)(185117386973197)(228905959029699); x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(6040375)(601004)(2401047)(5005006)(8121501046)(10201501046)(3002001)(6055026)(6041248)(20161123555025)(20161123560025)(20161123562025)(20161123564025)(6047074)(6072148); SRVR:DB5PR04MB1608; BCL:0; PCL:0; RULEID:; SRVR:DB5PR04MB1608; x-forefront-prvs: 0150F3F97D x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(6009001)(7916002)(39850400002)(39860400002)(39410400002)(39840400002)(39450400003)(76104003)(189002)(377454003)(43544003)(13464003)(199003)(24454002)(68736007)(3846002)(102836003)(93886004)(33656002)(122556002)(5660300001)(7846002)(110136003)(7736002)(66066001)(6116002)(74316002)(106356001)(305945005)(81156014)(15395725005)(81166006)(575784001)(86362001)(8676002)(7696004)(3660700001)(8936002)(3280700002)(101416001)(106116001)(77096006)(76576001)(229853002)(6916009)(54356999)(4326007)(76176999)(2906002)(9686002)(92566002)(97736004)(2950100002)(38730400001)(105586002)(50986999)(2900100001)(189998001)(6506006); DIR:OUT; SFP:1101; SCL:1; SRVR:DB5PR04MB1608; H:DB5PR04MB1605.eurprd04.prod.outlook.com; FPR:; SPF:None; PTR:InfoNoRecords; MX:1; A:1; LANG:en; received-spf: None (protection.outlook.com: nxp.com does not designate permitted sender hosts) spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: nxp.com X-MS-Exchange-CrossTenant-originalarrivaltime: 08 Dec 2016 08:57:49.5119 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 686ea1d3-bc2b-4c6f-a92c-d99c5c301635 X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB5PR04MB1608 Subject: Re: [dpdk-dev] [PATCH v3 2/2] mempool: pktmbuf pool default fallback for mempool ops error X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Dec 2016 08:57:53 -0000 Hi Olivier, Apology for a delayed response. > -----Original Message----- > From: Olivier Matz [mailto:olivier.matz@6wind.com] > Sent: Tuesday, November 22, 2016 2:55 PM > To: Hemant Agrawal > Cc: dev@dpdk.org; jerin.jacob@caviumnetworks.com; david.hunt@intel.com > Subject: Re: [PATCH v3 2/2] mempool: pktmbuf pool default fallback for > mempool ops error >=20 > Hi Hemant, >=20 > Back on this topic, please see some comments below. >=20 > On 11/07/2016 01:30 PM, Hemant Agrawal wrote: > > Hi Olivier, > > > >> -----Original Message----- > >> From: Olivier Matz [mailto:olivier.matz@6wind.com] > >> Sent: Friday, October 14, 2016 5:41 PM > >>> On 9/22/2016 6:42 PM, Hemant Agrawal wrote: > >>>> Hi Olivier > >>>> > >>>> On 9/19/2016 7:27 PM, Olivier Matz wrote: > >>>>> Hi Hemant, > >>>>> > >>>>> On 09/16/2016 06:46 PM, Hemant Agrawal wrote: > >>>>>> In the rte_pktmbuf_pool_create, if the default external mempool > >>>>>> is not available, the implementation can default to "ring_mp_mc", > >>>>>> which is an software implementation. > >>>>>> > >>>>>> Signed-off-by: Hemant Agrawal > >>>>>> --- > >>>>>> Changes in V3: > >>>>>> * adding warning message to say that falling back to default sw > >>>>>> pool > >>>>>> --- > >>>>>> lib/librte_mbuf/rte_mbuf.c | 8 ++++++++ > >>>>>> 1 file changed, 8 insertions(+) > >>>>>> > >>>>>> diff --git a/lib/librte_mbuf/rte_mbuf.c > >>>>>> b/lib/librte_mbuf/rte_mbuf.c index 4846b89..8ab0eb1 100644 > >>>>>> --- a/lib/librte_mbuf/rte_mbuf.c > >>>>>> +++ b/lib/librte_mbuf/rte_mbuf.c > >>>>>> @@ -176,6 +176,14 @@ rte_pktmbuf_pool_create(const char *name, > >>>>>> unsigned n, > >>>>>> > >>>>>> rte_errno =3D rte_mempool_set_ops_byname(mp, > >>>>>> RTE_MBUF_DEFAULT_MEMPOOL_OPS, NULL); > >>>>>> + > >>>>>> + /* on error, try falling back to the software based default > >>>>>> pool */ > >>>>>> + if (rte_errno =3D=3D -EOPNOTSUPP) { > >>>>>> + RTE_LOG(WARNING, MBUF, "Default HW Mempool not > supported. " > >>>>>> + "falling back to sw mempool \"ring_mp_mc\""); > >>>>>> + rte_errno =3D rte_mempool_set_ops_byname(mp, "ring_mp_mc"= , > >>>>>> NULL); > >>>>>> + } > >>>>>> + > >>>>>> if (rte_errno !=3D 0) { > >>>>>> RTE_LOG(ERR, MBUF, "error setting mempool handler\n"); > >>>>>> return NULL; > >>>>>> > >>>>> > >>>>> Without adding a new method ".supported()", the first call to > >>>>> rte_mempool_populate() could return the same error ENOTSUP. In > >>>>> this case, it is still possible to fallback. > >>>>> > >>>> It will be bit late. > >>>> > >>>> On failure, than we have to set the default ops and do a goto > >>>> before rte_pktmbuf_pool_init(mp, &mbp_priv); > >> > >> I still think we can do the job without adding the .supported() method= . > >> The following code is just an (untested) example: > >> > >> struct rte_mempool * > >> rte_pktmbuf_pool_create(const char *name, unsigned n, > >> unsigned cache_size, uint16_t priv_size, uint16_t data_room_size, > >> int socket_id) > >> { > >> struct rte_mempool *mp; > >> struct rte_pktmbuf_pool_private mbp_priv; > >> unsigned elt_size; > >> int ret; > >> const char *ops[] =3D { > >> RTE_MBUF_DEFAULT_MEMPOOL_OPS, "ring_mp_mc", NULL, > >> }; > >> const char **op; > >> > >> if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) !=3D priv_size) { > >> RTE_LOG(ERR, MBUF, "mbuf priv_size=3D%u is not aligned\n", > >> priv_size); > >> rte_errno =3D EINVAL; > >> return NULL; > >> } > >> elt_size =3D sizeof(struct rte_mbuf) + (unsigned)priv_size + > >> (unsigned)data_room_size; > >> mbp_priv.mbuf_data_room_size =3D data_room_size; > >> mbp_priv.mbuf_priv_size =3D priv_size; > >> > >> for (op =3D &ops[0]; *op !=3D NULL; op++) { > >> mp =3D rte_mempool_create_empty(name, n, elt_size, cache_size, > >> sizeof(struct rte_pktmbuf_pool_private), socket_id, 0); > >> if (mp =3D=3D NULL) > >> return NULL; > >> > >> ret =3D rte_mempool_set_ops_byname(mp, *op, NULL); > >> if (ret !=3D 0) { > >> RTE_LOG(ERR, MBUF, "error setting mempool handler\n"); > >> rte_mempool_free(mp); > >> if (ret =3D=3D -ENOTSUP) > >> continue; > >> rte_errno =3D -ret; > >> return NULL; > >> } > >> rte_pktmbuf_pool_init(mp, &mbp_priv); > >> > >> ret =3D rte_mempool_populate_default(mp); > >> if (ret < 0) { > >> rte_mempool_free(mp); > >> if (ret =3D=3D -ENOTSUP) > >> continue; > >> rte_errno =3D -ret; > >> return NULL; > >> } > >> } > >> > >> rte_mempool_obj_iter(mp, rte_pktmbuf_init, NULL); > >> > >> return mp; > >> } > >> > >> > > [Hemant] This look fine to me. Please submit a patch for the same. > > > >>>>> I've just submitted an RFC, which I think is quite linked: > >>>>> http://dpdk.org/ml/archives/dev/2016-September/046974.html > >>>>> Assuming a new parameter "mempool_ops" is added to > >>>>> rte_pktmbuf_pool_create(), would it make sense to fallback to > >>>>> "ring_mp_mc"? What about just returning ENOTSUP? The application > >>>>> could do the job and decide which sw fallback to use. > >>>> > >>>> We ran into this issue when trying to run the standard DPDK > >>>> examples > >>>> (l3fwd) in VM. Do you think, is it practical to add fallback > >>>> handling in each of the DPDK examples? > >> > >> OK. What is still unclear for me, is how the software is aware of the > >> different hardware-assisted handlers. Moreover, we could imagine more > >> software handlers, which could be used depending on the use case. > >> > >> I think this choice has to be made by the user or the application: > >> > >> - the application may want to use a specific (sw or hw) handler: in > >> this case, it want to be notified if it fails, instead of having > >> a quiet fallback to ring_mp_mc > >> - if several handlers are available, the application may want to > >> try them in a specific order > >> - maybe some handlers will have some limitations with some > >> configurations or driver? The application could decide to use > >> a different handler according the configuration. > >> > >> So that's why I think this is an application decision. > >> > > [Hemant] We should simplify it: if the application has supplied the h= andler, it > is application's responsibility to take care of failure. Only if the appl= ication want > to use the default handler, the implementation can fallback. The fallbac= k > handler can again be configurable. >=20 > Honestly, I'm not very convinced that having a quiet fallback is the prop= er > solution: the application won't be notified that the fallback occured. >=20 > I understand your patch solves your use-case, but my fear is that we just > integrate this minimal patch and never do the harder work of solving all = the use- > cases. I think we need to give the tools to the applications to control w= hat > occurs because we also need to solve these issues: > - you have a hardware handler, but you want to use the software > handler > - you have several software handlers, and you (as an administrator) or > the application knows which one is the best to use in your case. >=20 > An alternative (which I don't like either) to solve your issue without mo= difying > the mbuf code is to have your own handler fallbacking to ring_mp_mc. >=20 [Hemant] are you suggesting that we also configure a fallback handler in ad= dition to default?=20 I agree that we need to provide full flexibility to the applications, who a= re willing to implement the fallback mechanism.=20 >=20 > >>>> Typically when someone is writing a application on host, he need > >>>> not worry non-availability of the hw offloaded mempool. He may also > >>>> want to run the same binary in virtual machine. In VM, it is not > >>>> guaranteed that hw offloaded mempools will be available. > >> > >> Running the same binary is of course a need. But if your VM does not > >> provide the same virtualized hardware than the host, I think the > >> command line option makes sense. > >> > >> AFAIU, on the host, you can use a hw mempool handler or a sw one, and > >> on the guest, only a sw one. So you need an option to select the > >> behavior you want on the host, without recompiling. > >> > >> I understand that modifying all the applications is not a good option > >> either. I'm thinking a a EAL parameter that would allow to configure > >> a library (mbuf in this case). Something like kernel boot options. > >> Example: testpmd -l 2,4 --opt mbuf.handler=3D"ring_mp_mc" -- [app args= ] > >> > >> I don't know if this is feasible, this has to be discussed first, but > >> what do you think on the principle? The value could also be an > >> ordered list if we want a fallback, and this option could be > >> overriden by the application before creating the mbuf pool. There was > >> some discussions about a kind of dpdk global configuration some time a= go. > >> > > [Hemant] I agree that command line option provide a better control in t= his > case. > > On the flipside, We need to be careful that we do not end up having too= many > command line options. >=20 > Yes, we should avoid having too many command line arguments. > I think we should go in the direction of having a sort of key/value datab= ase in > EAL. It could be set by: > - a generic command line argument > - a config file (maybe later) > - the application through an API >=20 > Then, it would be used by: > - dpdk libraries > - the application (maybe) >=20 > This has been discussed some times, the latest was probably: > http://dpdk.org/ml/archives/dev/2016-June/040079.html >=20 > I think it would be a good tool for dpdk configurability, and it would he= lp to > remove some compile-time options and maybe some eal options. > Unfortunately, I don't have the time to work on this at the moment. >=20 > Regards, > Olivier