From: Olivier Matz <olivier.matz@6wind.com>
To: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Cc: Andrew Rybchenko <arybchenko@solarflare.com>,
Jerin Jacob <jerinjacobk@gmail.com>, dpdk-dev <dev@dpdk.org>,
Gage Eads <gage.eads@intel.com>,
"Artem V. Andreev" <artem.andreev@oktetlabs.ru>,
Jerin Jacob <jerinj@marvell.com>,
Nithin Dabilpuram <ndabilpuram@marvell.com>,
Vamsi Attunuru <vattunuru@marvell.com>,
Hemant Agrawal <hemant.agrawal@nxp.com>
Subject: Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
Date: Mon, 9 Mar 2020 10:05:57 +0100
Message-ID: <20200309090557.GP13822@platinum>
In-Reply-To: <CAMDZJNU7B9qhJp0Hq49fSzivqf2budAYj+YENVkbsqPjNJxSRw@mail.gmail.com>
On Mon, Mar 09, 2020 at 04:55:28PM +0800, Tonghao Zhang wrote:
> On Mon, Mar 9, 2020 at 4:27 PM Olivier Matz <olivier.matz@6wind.com> wrote:
> >
> > Hi,
> >
> > On Mon, Mar 09, 2020 at 11:01:25AM +0800, Tonghao Zhang wrote:
> > > On Sat, Mar 7, 2020 at 8:54 PM Andrew Rybchenko
> > > <arybchenko@solarflare.com> wrote:
> > > >
> > > > On 3/7/20 3:51 PM, Andrew Rybchenko wrote:
> > > > > On 3/6/20 4:37 PM, Jerin Jacob wrote:
> > > > >> On Fri, Mar 6, 2020 at 7:06 PM <xiangxia.m.yue@gmail.com> wrote:
> > > > >>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > >>>
> > > > >>> The order of mempool initialization affects the mempool index in the
> > > > >>> rte_mempool_ops_table. For example, when building apps with:
> > > > >>>
> > > > >>> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > > >>>
> > > > >>> The "bucket" mempool will be registered first, so its index
> > > > >>> in the table is 0 while the index of the "ring" mempool is 1. DPDK
> > > > >>> uses mk/rte.app.mk to build apps, while others, for example
> > > > >>> Open vSwitch, build against libdpdk.a or libdpdk.so.
> > > > >>> The mempool libs linked in DPDK apps and in Open vSwitch can differ.
> > > > >>>
> > > > >>> A mempool can be shared between primary and secondary processes,
> > > > >>> such as dpdk-pdump and pdump-pmd/Open vSwitch (pdump enabled).
> > > > >>> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > > > >>> mempool, whose index in its table is 0, but index 0 refers to the
> > > > >>> "bucket" ops in Open vSwitch. If Open vSwitch uses index 0 to get the
> > > > >>> mempool ops and allocates memory from the mempool, the crash occurs:
> > > > >>>
> > > > >>> bucket_dequeue (access null and crash)
> > > > >>> rte_mempool_get_ops (should get "ring_mp_mc",
> > > > >>> but get "bucket" mempool)
> > > > >>> rte_mempool_ops_dequeue_bulk
> > > > >>> ...
> > > > >>> rte_pktmbuf_alloc
> > > > >>> rte_pktmbuf_copy
> > > > >>> pdump_copy
> > > > >>> pdump_rx
> > > > >>> rte_eth_rx_burst
> > > > >>>
> > > > >>> To avoid the crash, there are several possible solutions:
> > > > >>> * constructor priority: Each mempool driver uses a different
> > > > >>> priority in RTE_INIT, but that is not easy to maintain.
> > > > >>>
> > > > >>> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> > > > >>> be the same as libdpdk.a/libdpdk.so, but when adding a new mempool
> > > > >>> driver in the future, we must make sure the order is kept.
> > > > >>>
> > > > >>> * register mempool orderly: Sort the mempool ops when registering,
> > > > >>> so the link order will not affect the index in the mempool table.
> > > > >>>
> > > > >>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > >>> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > > > >> Acked-by: Jerin Jacob <jerinj@marvell.com>
> > > > >
> > > > > The patch is OK, but the fact that ops index changes during
> > > > > mempool driver lifetime is frightening. In fact it breaks
> > > > > rte_mempool_register_ops() return value semantics (read
> > > > > as API break). The return value is not used in DPDK, but it
> > > > > is a public function. If I'm not mistaken it should be taken
> > > > > into account.
> >
> > Good points.
> >
> > The fact that the ops index changes during mempool driver lifetime is
> > indeed frightening, especially knowing that this is a dynamic
> > registration that could happen at any moment in the life of the
> > application. Also, breaking the ABI is not desirable.
> That solution is better.
>
> > Let me try to propose something else to solve your issue:
> >
> > 1/ At init, the primary process allocates a struct in shared memory
> > (named memzone):
> >
> > struct rte_mempool_shared_ops {
> > size_t num_mempool_ops;
> > struct {
> > char name[RTE_MEMPOOL_OPS_NAMESIZE];
> > } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> > char *mempool_ops_name[RTE_MEMPOOL_MAX_OPS_IDX];
Oops, I forgot to remove this line (it is replaced by the mini-struct just above).
> > rte_spinlock_t mempool;
> > };
> >
> > 2/ When we register a mempool ops, we first get a name and id from the
> > shared struct: with the lock held, look up the registered name and
> > return its index; otherwise take the next free id and copy the name
> > into the struct.
> >
> > 3/ Then do as before (in the per-process global table), except that we
> > reuse the registered id.
> >
> > We can remove the num_ops field from rte_mempool_ops_table.
> >
> > Thoughts?
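
To make steps 2/ and 3/ above more concrete, here is a minimal standalone
sketch of the "look up by name, else take the next id" logic. It is only an
illustration under stated assumptions, not a patch: the names shared_ops,
shared_ops_get_id, OPS_NAMESIZE and MAX_OPS_IDX are hypothetical, a static
variable stands in for the shared memzone, and a C11 atomic_flag stands in
for rte_spinlock_t so that it compiles without DPDK.

```c
#include <stdatomic.h>
#include <string.h>

#define OPS_NAMESIZE 32 /* hypothetical stand-in for RTE_MEMPOOL_OPS_NAMESIZE */
#define MAX_OPS_IDX  16 /* hypothetical stand-in for RTE_MEMPOOL_MAX_OPS_IDX */

/* Stand-in for the struct that the primary would place in a memzone. */
struct shared_ops {
	size_t num_ops;
	struct {
		char name[OPS_NAMESIZE];
	} ops[MAX_OPS_IDX];
	atomic_flag lock; /* rte_spinlock_t in the proposal */
};

/* A static object replaces the shared memzone in this sketch. */
static struct shared_ops shared = { .lock = ATOMIC_FLAG_INIT };

/*
 * Return the id bound to 'name', allocating the next free id the first
 * time the name is seen. Returns -1 if the name does not fit or the
 * table is full.
 */
static int
shared_ops_get_id(struct shared_ops *s, const char *name)
{
	int id = -1;
	size_t i;

	if (strlen(name) >= OPS_NAMESIZE)
		return -1;

	/* Take the lock (simple spin, as a spinlock would). */
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;

	/* Step 2/, first half: reuse the id if the name is already known. */
	for (i = 0; i < s->num_ops; i++) {
		if (strcmp(s->ops[i].name, name) == 0) {
			id = (int)i;
			break;
		}
	}

	/* Step 2/, second half: otherwise append the name and take its id. */
	if (id < 0 && s->num_ops < MAX_OPS_IDX) {
		id = (int)s->num_ops++;
		strcpy(s->ops[id].name, name);
	}

	atomic_flag_clear_explicit(&s->lock, memory_order_release);
	return id;
}
```

With this, a process that registers "bucket" then "ring_mp_mc" and another
that registers them in the reverse order both end up with the same ids,
because the id depends only on which name reached the shared struct first.
Each process would then store its ops at that id in its own table (step 3/).
In DPDK itself the struct would presumably be reserved by the primary and
looked up by secondaries through the memzone API; that part is not shown.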
> >
> >
> > > Yes, the doc should be updated. How about this:
> > >
> > > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> > > index c90cf31..5a9c8a7 100644
> > > --- a/lib/librte_mempool/rte_mempool.h
> > > +++ b/lib/librte_mempool/rte_mempool.h
> > > @@ -904,7 +904,9 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> > > * @param ops
> > > * Pointer to an ops structure to register.
> > > * @return
> > > - * - >=0: Success; return the index of the ops struct in the table.
> > > + * - >=0: Success; return the index of the last ops struct in the table.
> > > + * The number of the ops struct registered is equal to index
> > > + * returned + 1.
> > > * - -EINVAL - some missing callbacks while registering ops struct.
> > > * - -ENOSPC - the maximum number of ops structs has been reached.
> > > */
> > > diff --git a/lib/librte_mempool/rte_mempool_ops.c
> > > b/lib/librte_mempool/rte_mempool_ops.c
> > > index b0da096..053f340 100644
> > > --- a/lib/librte_mempool/rte_mempool_ops.c
> > > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > > @@ -26,7 +26,11 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > return strcmp(m_a->name, m_b->name);
> > > }
> > >
> > > -/* add a new ops struct in rte_mempool_ops_table, return its index. */
> > > +/*
> > > + * add a new ops struct in rte_mempool_ops_table.
> > > + * on success, return the index of the last ops
> > > + * struct in the table.
> > > + */
> > > int
> > > rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > > {
> > > > > Also I remember patches which warn about above behaviour
> > > > > in documentation. If behaviour changes, corresponding
> > > > > documentation must be updated.
> > > >
> > > > One more point. If the patch is finally accepted it definitely
> > > > deserves few lines in release notes.
> > > OK, should a separate patch be sent before the DPDK 20.05 release?
> > > >
> > >
> > >
> > > --
> > > Thanks,
> > > Tonghao
>
>
>
> --
> Thanks,
> Tonghao