* Re: [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name
2020-03-02 1:57 [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name xiangxia.m.yue
@ 2020-03-02 13:45 ` Jerin Jacob
2020-03-04 13:17 ` Tonghao Zhang
2020-03-05 8:20 ` [dpdk-dev] [PATCH dpdk-dev v2] " xiangxia.m.yue
` (4 subsequent siblings)
5 siblings, 1 reply; 44+ messages in thread
From: Jerin Jacob @ 2020-03-02 13:45 UTC (permalink / raw)
To: xiangxia.m.yue
Cc: dpdk-dev, Olivier Matz, Andrew Rybchenko, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal
On Mon, Mar 2, 2020 at 7:27 AM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> The order of mempool initialization affects each mempool's index in
> the rte_mempool_ops_table. For example, when building APPs with:
>
> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>
> The "bucket" mempool will be registered first, so its index
> in the table is 0 while the index of the "ring" mempool is 1. DPDK
> uses mk/rte.app.mk to build APPs, while others, for example
> Open vSwitch, link against libdpdk.a or libdpdk.so instead.
> The order of the mempool libs linked into DPDK-built APPs and
> into Open vSwitch can therefore differ.
>
> A mempool can be shared between a primary and a secondary process,
> such as dpdk-pdump and pdump-pmd/Open vSwitch (pdump enabled).
> A crash will occur because dpdk-pdump creates the "ring_mp_mc"
> ring, whose index in its table is 0, while in Open vSwitch the
> index of the "bucket" ops is 0. When Open vSwitch uses index 0
> to look up the mempool ops and allocate memory from the mempool,
> the crash occurs:
>
> bucket_dequeue (access null and crash)
> rte_mempool_get_ops (should get "ring_mp_mc",
> but get "bucket" mempool)
> rte_mempool_ops_dequeue_bulk
> ...
> rte_pktmbuf_alloc
> rte_pktmbuf_copy
> pdump_copy
> pdump_rx
> rte_eth_rx_burst
>
> To avoid the crash, there are several possible solutions:
> * constructor priority: give each mempool driver a different
> priority in RTE_INIT, but that is not easy to maintain.
>
> * change mk/rte.app.mk: make the link order in mk/rte.app.mk the
> same as in libdpdk.a/libdpdk.so, but every new mempool driver
> added in the future would have to preserve that order.
>
> * register mempools in sorted order: sort the ops by name when
> registering, so the link order no longer affects the index in
> the mempool ops table.
>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> ---
> lib/librte_mempool/rte_mempool_ops.c | 18 ++++++++++++++++--
> 1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> index 22c5251..06dfe16 100644
> --- a/lib/librte_mempool/rte_mempool_ops.c
> +++ b/lib/librte_mempool/rte_mempool_ops.c
> @@ -22,7 +22,7 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> rte_mempool_register_ops(const struct rte_mempool_ops *h)
> {
> struct rte_mempool_ops *ops;
> - int16_t ops_index;
> + unsigned ops_index, i;
>
> rte_spinlock_lock(&rte_mempool_ops_table.sl);
>
> @@ -50,7 +50,19 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> return -EEXIST;
> }
>
> - ops_index = rte_mempool_ops_table.num_ops++;
> + /* sort the rte_mempool_ops by name. the order of the mempool
> + * lib initiation will not affect rte_mempool_ops index. */
+1 for the fix.
For the implementation, why not use qsort_r() for sorting?
> + ops_index = rte_mempool_ops_table.num_ops;
> + for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> + if (strcmp(h->name, rte_mempool_ops_table.ops[i].name) < 0) {
> + do {
> + rte_mempool_ops_table.ops[ops_index] =
> + rte_mempool_ops_table.ops[ops_index -1];
> + } while (--ops_index > i);
> + break;
> + }
> + }
> +
> ops = &rte_mempool_ops_table.ops[ops_index];
> strlcpy(ops->name, h->name, sizeof(ops->name));
> ops->alloc = h->alloc;
> @@ -63,6 +75,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> ops->get_info = h->get_info;
> ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
>
> + rte_mempool_ops_table.num_ops++;
> +
> rte_spinlock_unlock(&rte_mempool_ops_table.sl);
>
> return ops_index;
> --
> 1.8.3.1
>
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name
2020-03-02 13:45 ` Jerin Jacob
@ 2020-03-04 13:17 ` Tonghao Zhang
2020-03-04 13:33 ` Jerin Jacob
0 siblings, 1 reply; 44+ messages in thread
From: Tonghao Zhang @ 2020-03-04 13:17 UTC (permalink / raw)
To: Jerin Jacob
Cc: dpdk-dev, Olivier Matz, Andrew Rybchenko, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal
On Mon, Mar 2, 2020 at 9:45 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> On Mon, Mar 2, 2020 at 7:27 AM <xiangxia.m.yue@gmail.com> wrote:
> >
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > The order of mempool initiation affects mempool index in the
> > rte_mempool_ops_table. For example, when building APPs with:
> >
> > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> >
> > The "bucket" mempool will be registered firstly, and its index
> > in table is 0 while the index of "ring" mempool is 1. DPDK
> > uses the mk/rte.app.mk to build APPs, and others, for example,
> > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > The mempool lib linked in dpdk and Open vSwitch is different.
> >
> > The mempool can be used between primary and secondary process,
> > such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> > There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > ring which index in table is 0, but the index of "bucket" ring
> > is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> > mempool ops and malloc memory from mempool. The crash will occur:
> >
> > bucket_dequeue (access null and crash)
> > rte_mempool_get_ops (should get "ring_mp_mc",
> > but get "bucket" mempool)
> > rte_mempool_ops_dequeue_bulk
> > ...
> > rte_pktmbuf_alloc
> > rte_pktmbuf_copy
> > pdump_copy
> > pdump_rx
> > rte_eth_rx_burst
> >
> > To avoid the crash, there are some solution:
> > * constructor priority: Different mempool uses different
> > priority in RTE_INIT, but it's not easy to maintain.
> >
> > * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> > be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> > driver in future, we must make sure the order.
> >
> > * register mempool orderly: Sort the mempool when registering,
> > so the lib linked will not affect the index in mempool table.
> >
> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > ---
> > lib/librte_mempool/rte_mempool_ops.c | 18 ++++++++++++++++--
> > 1 file changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> > index 22c5251..06dfe16 100644
> > --- a/lib/librte_mempool/rte_mempool_ops.c
> > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > @@ -22,7 +22,7 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > {
> > struct rte_mempool_ops *ops;
> > - int16_t ops_index;
> > + unsigned ops_index, i;
> >
> > rte_spinlock_lock(&rte_mempool_ops_table.sl);
> >
> > @@ -50,7 +50,19 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > return -EEXIST;
> > }
> >
> > - ops_index = rte_mempool_ops_table.num_ops++;
> > + /* sort the rte_mempool_ops by name. the order of the mempool
> > + * lib initiation will not affect rte_mempool_ops index. */
>
> +1 for the fix.
> For the implementation, why not use qsort_r() for sorting?
This implementation is simple, and the number of mempool drivers is not large.
But we could use qsort_r() to implement it instead.
>
> > + ops_index = rte_mempool_ops_table.num_ops;
> > + for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> > + if (strcmp(h->name, rte_mempool_ops_table.ops[i].name) < 0) {
> > + do {
> > + rte_mempool_ops_table.ops[ops_index] =
> > + rte_mempool_ops_table.ops[ops_index -1];
> > + } while (--ops_index > i);
> > + break;
> > + }
> > + }
> > +
> > ops = &rte_mempool_ops_table.ops[ops_index];
> > strlcpy(ops->name, h->name, sizeof(ops->name));
> > ops->alloc = h->alloc;
> > @@ -63,6 +75,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > ops->get_info = h->get_info;
> > ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
> >
> > + rte_mempool_ops_table.num_ops++;
> > +
> > rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> >
> > return ops_index;
> > --
> > 1.8.3.1
> >
--
Thanks,
Tonghao
* Re: [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name
2020-03-04 13:17 ` Tonghao Zhang
@ 2020-03-04 13:33 ` Jerin Jacob
2020-03-04 14:46 ` Tonghao Zhang
0 siblings, 1 reply; 44+ messages in thread
From: Jerin Jacob @ 2020-03-04 13:33 UTC (permalink / raw)
To: Tonghao Zhang
Cc: dpdk-dev, Olivier Matz, Andrew Rybchenko, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal
On Wed, Mar 4, 2020 at 6:48 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Mon, Mar 2, 2020 at 9:45 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> >
> > On Mon, Mar 2, 2020 at 7:27 AM <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > >
> > > The order of mempool initiation affects mempool index in the
> > > rte_mempool_ops_table. For example, when building APPs with:
> > >
> > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > >
> > > The "bucket" mempool will be registered firstly, and its index
> > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > The mempool lib linked in dpdk and Open vSwitch is different.
> > >
> > > The mempool can be used between primary and secondary process,
> > > such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> > > There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > > ring which index in table is 0, but the index of "bucket" ring
> > > is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> > > mempool ops and malloc memory from mempool. The crash will occur:
> > >
> > > bucket_dequeue (access null and crash)
> > > rte_mempool_get_ops (should get "ring_mp_mc",
> > > but get "bucket" mempool)
> > > rte_mempool_ops_dequeue_bulk
> > > ...
> > > rte_pktmbuf_alloc
> > > rte_pktmbuf_copy
> > > pdump_copy
> > > pdump_rx
> > > rte_eth_rx_burst
> > >
> > > To avoid the crash, there are some solution:
> > > * constructor priority: Different mempool uses different
> > > priority in RTE_INIT, but it's not easy to maintain.
> > >
> > > * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> > > be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> > > driver in future, we must make sure the order.
> > >
> > > * register mempool orderly: Sort the mempool when registering,
> > > so the lib linked will not affect the index in mempool table.
> > >
> > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > ---
> > > lib/librte_mempool/rte_mempool_ops.c | 18 ++++++++++++++++--
> > > 1 file changed, 16 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> > > index 22c5251..06dfe16 100644
> > > --- a/lib/librte_mempool/rte_mempool_ops.c
> > > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > > @@ -22,7 +22,7 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > > {
> > > struct rte_mempool_ops *ops;
> > > - int16_t ops_index;
> > > + unsigned ops_index, i;
> > >
> > > rte_spinlock_lock(&rte_mempool_ops_table.sl);
> > >
> > > @@ -50,7 +50,19 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > return -EEXIST;
> > > }
> > >
> > > - ops_index = rte_mempool_ops_table.num_ops++;
> > > + /* sort the rte_mempool_ops by name. the order of the mempool
> > > + * lib initiation will not affect rte_mempool_ops index. */
> >
> > +1 for the fix.
> > For the implementation, why not use qsort_r() for sorting?
> The implementation is easy, and the number of mempool driver is not too large.
> But we can use the qsort_r to implement it.
Since it is in a slow path, IMO, it is better to use standard sort
functions for readability.
> >
> > > + ops_index = rte_mempool_ops_table.num_ops;
> > > + for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> > > + if (strcmp(h->name, rte_mempool_ops_table.ops[i].name) < 0) {
> > > + do {
> > > + rte_mempool_ops_table.ops[ops_index] =
> > > + rte_mempool_ops_table.ops[ops_index -1];
> > > + } while (--ops_index > i);
> > > + break;
> > > + }
> > > + }
> > > +
> > > ops = &rte_mempool_ops_table.ops[ops_index];
> > > strlcpy(ops->name, h->name, sizeof(ops->name));
> > > ops->alloc = h->alloc;
> > > @@ -63,6 +75,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > ops->get_info = h->get_info;
> > > ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
> > >
> > > + rte_mempool_ops_table.num_ops++;
> > > +
> > > rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > >
> > > return ops_index;
> > > --
> > > 1.8.3.1
> > >
>
>
>
> --
> Thanks,
> Tonghao
* Re: [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name
2020-03-04 13:33 ` Jerin Jacob
@ 2020-03-04 14:46 ` Tonghao Zhang
2020-03-04 15:14 ` Jerin Jacob
0 siblings, 1 reply; 44+ messages in thread
From: Tonghao Zhang @ 2020-03-04 14:46 UTC (permalink / raw)
To: Jerin Jacob
Cc: dpdk-dev, Olivier Matz, Andrew Rybchenko, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal
On Wed, Mar 4, 2020 at 9:33 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> On Wed, Mar 4, 2020 at 6:48 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Mon, Mar 2, 2020 at 9:45 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > >
> > > On Mon, Mar 2, 2020 at 7:27 AM <xiangxia.m.yue@gmail.com> wrote:
> > > >
> > > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > >
> > > > The order of mempool initiation affects mempool index in the
> > > > rte_mempool_ops_table. For example, when building APPs with:
> > > >
> > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > >
> > > > The "bucket" mempool will be registered firstly, and its index
> > > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > The mempool lib linked in dpdk and Open vSwitch is different.
> > > >
> > > > The mempool can be used between primary and secondary process,
> > > > such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> > > > There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > > > ring which index in table is 0, but the index of "bucket" ring
> > > > is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> > > > mempool ops and malloc memory from mempool. The crash will occur:
> > > >
> > > > bucket_dequeue (access null and crash)
> > > > rte_mempool_get_ops (should get "ring_mp_mc",
> > > > but get "bucket" mempool)
> > > > rte_mempool_ops_dequeue_bulk
> > > > ...
> > > > rte_pktmbuf_alloc
> > > > rte_pktmbuf_copy
> > > > pdump_copy
> > > > pdump_rx
> > > > rte_eth_rx_burst
> > > >
> > > > To avoid the crash, there are some solution:
> > > > * constructor priority: Different mempool uses different
> > > > priority in RTE_INIT, but it's not easy to maintain.
> > > >
> > > > * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> > > > be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> > > > driver in future, we must make sure the order.
> > > >
> > > > * register mempool orderly: Sort the mempool when registering,
> > > > so the lib linked will not affect the index in mempool table.
> > > >
> > > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > ---
> > > > lib/librte_mempool/rte_mempool_ops.c | 18 ++++++++++++++++--
> > > > 1 file changed, 16 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> > > > index 22c5251..06dfe16 100644
> > > > --- a/lib/librte_mempool/rte_mempool_ops.c
> > > > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > > > @@ -22,7 +22,7 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > > rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > > > {
> > > > struct rte_mempool_ops *ops;
> > > > - int16_t ops_index;
> > > > + unsigned ops_index, i;
> > > >
> > > > rte_spinlock_lock(&rte_mempool_ops_table.sl);
> > > >
> > > > @@ -50,7 +50,19 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > > return -EEXIST;
> > > > }
> > > >
> > > > - ops_index = rte_mempool_ops_table.num_ops++;
> > > > + /* sort the rte_mempool_ops by name. the order of the mempool
> > > > + * lib initiation will not affect rte_mempool_ops index. */
> > >
> > > +1 for the fix.
> > > For the implementation, why not use qsort_r() for sorting?
> > The implementation is easy, and the number of mempool driver is not too large.
> > But we can use the qsort_r to implement it.
>
> Since it is in a slow path, IMO, better to use standard sort functions
> for better readability.
Agreed. Can you help review this patch:
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
index 22c5251..1acee58 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -17,6 +17,15 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
.num_ops = 0
};
+static int
+compare_mempool_ops(const void *a, const void *b)
+{
+ const struct rte_mempool_ops *m_a = a;
+ const struct rte_mempool_ops *m_b = b;
+
+ return strcmp(m_a->name, m_b->name);
+}
+
/* add a new ops struct in rte_mempool_ops_table, return its index. */
int
rte_mempool_register_ops(const struct rte_mempool_ops *h)
@@ -63,6 +72,9 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
ops->get_info = h->get_info;
ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
+ qsort(rte_mempool_ops_table.ops, rte_mempool_ops_table.num_ops,
+ sizeof(rte_mempool_ops_table.ops[0]), compare_mempool_ops);
+
rte_spinlock_unlock(&rte_mempool_ops_table.sl);
return ops_index;
>
> > >
> > > > + ops_index = rte_mempool_ops_table.num_ops;
> > > > + for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> > > > + if (strcmp(h->name, rte_mempool_ops_table.ops[i].name) < 0) {
> > > > + do {
> > > > + rte_mempool_ops_table.ops[ops_index] =
> > > > + rte_mempool_ops_table.ops[ops_index -1];
> > > > + } while (--ops_index > i);
> > > > + break;
> > > > + }
> > > > + }
> > > > +
> > > > ops = &rte_mempool_ops_table.ops[ops_index];
> > > > strlcpy(ops->name, h->name, sizeof(ops->name));
> > > > ops->alloc = h->alloc;
> > > > @@ -63,6 +75,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > > ops->get_info = h->get_info;
> > > > ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
> > > >
> > > > + rte_mempool_ops_table.num_ops++;
> > > > +
> > > > rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > > >
> > > > return ops_index;
> > > > --
> > > > 1.8.3.1
> > > >
> >
> >
> >
> > --
> > Thanks,
> > Tonghao
--
Thanks,
Tonghao
* Re: [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name
2020-03-04 14:46 ` Tonghao Zhang
@ 2020-03-04 15:14 ` Jerin Jacob
2020-03-04 15:25 ` Tonghao Zhang
0 siblings, 1 reply; 44+ messages in thread
From: Jerin Jacob @ 2020-03-04 15:14 UTC (permalink / raw)
To: Tonghao Zhang
Cc: dpdk-dev, Olivier Matz, Andrew Rybchenko, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal
On Wed, Mar 4, 2020 at 8:17 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Wed, Mar 4, 2020 at 9:33 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> >
> > On Wed, Mar 4, 2020 at 6:48 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > On Mon, Mar 2, 2020 at 9:45 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > > >
> > > > On Mon, Mar 2, 2020 at 7:27 AM <xiangxia.m.yue@gmail.com> wrote:
> > > > >
> > > > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > >
> > > > > The order of mempool initiation affects mempool index in the
> > > > > rte_mempool_ops_table. For example, when building APPs with:
> > > > >
> > > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > > >
> > > > > The "bucket" mempool will be registered firstly, and its index
> > > > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > > The mempool lib linked in dpdk and Open vSwitch is different.
> > > > >
> > > > > The mempool can be used between primary and secondary process,
> > > > > such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> > > > > There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > > > > ring which index in table is 0, but the index of "bucket" ring
> > > > > is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> > > > > mempool ops and malloc memory from mempool. The crash will occur:
> > > > >
> > > > > bucket_dequeue (access null and crash)
> > > > > rte_mempool_get_ops (should get "ring_mp_mc",
> > > > > but get "bucket" mempool)
> > > > > rte_mempool_ops_dequeue_bulk
> > > > > ...
> > > > > rte_pktmbuf_alloc
> > > > > rte_pktmbuf_copy
> > > > > pdump_copy
> > > > > pdump_rx
> > > > > rte_eth_rx_burst
> > > > >
> > > > > To avoid the crash, there are some solution:
> > > > > * constructor priority: Different mempool uses different
> > > > > priority in RTE_INIT, but it's not easy to maintain.
> > > > >
> > > > > * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> > > > > be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> > > > > driver in future, we must make sure the order.
> > > > >
> > > > > * register mempool orderly: Sort the mempool when registering,
> > > > > so the lib linked will not affect the index in mempool table.
> > > > >
> > > > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > > ---
> > > > > lib/librte_mempool/rte_mempool_ops.c | 18 ++++++++++++++++--
> > > > > 1 file changed, 16 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> > > > > index 22c5251..06dfe16 100644
> > > > > --- a/lib/librte_mempool/rte_mempool_ops.c
> > > > > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > > > > @@ -22,7 +22,7 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > > > rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > > > > {
> > > > > struct rte_mempool_ops *ops;
> > > > > - int16_t ops_index;
> > > > > + unsigned ops_index, i;
> > > > >
> > > > > rte_spinlock_lock(&rte_mempool_ops_table.sl);
> > > > >
> > > > > @@ -50,7 +50,19 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > > > return -EEXIST;
> > > > > }
> > > > >
> > > > > - ops_index = rte_mempool_ops_table.num_ops++;
> > > > > + /* sort the rte_mempool_ops by name. the order of the mempool
> > > > > + * lib initiation will not affect rte_mempool_ops index. */
> > > >
> > > > +1 for the fix.
> > > > For the implementation, why not use qsort_r() for sorting?
> > > The implementation is easy, and the number of mempool driver is not too large.
> > > But we can use the qsort_r to implement it.
> >
> > Since it is in a slow path, IMO, better to use standard sort functions
> > for better readability.
> Agree, can you help me review the patch:
>
> diff --git a/lib/librte_mempool/rte_mempool_ops.c
> b/lib/librte_mempool/rte_mempool_ops.c
> index 22c5251..1acee58 100644
> --- a/lib/librte_mempool/rte_mempool_ops.c
> +++ b/lib/librte_mempool/rte_mempool_ops.c
> @@ -17,6 +17,15 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> .num_ops = 0
> };
>
> +static int
> +compare_mempool_ops(const void *a, const void *b)
> +{
> + const struct rte_mempool_ops *m_a = a;
> + const struct rte_mempool_ops *m_b = b;
> +
> + return strcmp(m_a->name, m_b->name);
> +}
> +
> /* add a new ops struct in rte_mempool_ops_table, return its index. */
> int
> rte_mempool_register_ops(const struct rte_mempool_ops *h)
> @@ -63,6 +72,9 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> ops->get_info = h->get_info;
> ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
>
> + qsort(rte_mempool_ops_table.ops, rte_mempool_ops_table.num_ops,
> + sizeof(rte_mempool_ops_table.ops[0]), compare_mempool_ops);
Looks good.
Not tested. Please check the qsort behavior for the
rte_mempool_ops_table.num_ops == 0 case.
> +
> rte_spinlock_unlock(&rte_mempool_ops_table.sl);
>
> return ops_index;
>
>
> >
> > > >
> > > > > + ops_index = rte_mempool_ops_table.num_ops;
> > > > > + for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> > > > > + if (strcmp(h->name, rte_mempool_ops_table.ops[i].name) < 0) {
> > > > > + do {
> > > > > + rte_mempool_ops_table.ops[ops_index] =
> > > > > + rte_mempool_ops_table.ops[ops_index -1];
> > > > > + } while (--ops_index > i);
> > > > > + break;
> > > > > + }
> > > > > + }
> > > > > +
> > > > > ops = &rte_mempool_ops_table.ops[ops_index];
> > > > > strlcpy(ops->name, h->name, sizeof(ops->name));
> > > > > ops->alloc = h->alloc;
> > > > > @@ -63,6 +75,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > > > ops->get_info = h->get_info;
> > > > > ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
> > > > >
> > > > > + rte_mempool_ops_table.num_ops++;
> > > > > +
> > > > > rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > > > >
> > > > > return ops_index;
> > > > > --
> > > > > 1.8.3.1
> > > > >
> > >
> > >
> > >
> > > --
> > > Thanks,
> > > Tonghao
>
>
>
> --
> Thanks,
> Tonghao
* Re: [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name
2020-03-04 15:14 ` Jerin Jacob
@ 2020-03-04 15:25 ` Tonghao Zhang
0 siblings, 0 replies; 44+ messages in thread
From: Tonghao Zhang @ 2020-03-04 15:25 UTC (permalink / raw)
To: Jerin Jacob
Cc: dpdk-dev, Olivier Matz, Andrew Rybchenko, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal
On Wed, Mar 4, 2020 at 11:14 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> On Wed, Mar 4, 2020 at 8:17 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Wed, Mar 4, 2020 at 9:33 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > >
> > > On Wed, Mar 4, 2020 at 6:48 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > >
> > > > On Mon, Mar 2, 2020 at 9:45 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > > > >
> > > > > On Mon, Mar 2, 2020 at 7:27 AM <xiangxia.m.yue@gmail.com> wrote:
> > > > > >
> > > > > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > > >
> > > > > > The order of mempool initiation affects mempool index in the
> > > > > > rte_mempool_ops_table. For example, when building APPs with:
> > > > > >
> > > > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > > > >
> > > > > > The "bucket" mempool will be registered firstly, and its index
> > > > > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > > > The mempool lib linked in dpdk and Open vSwitch is different.
> > > > > >
> > > > > > The mempool can be used between primary and secondary process,
> > > > > > such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> > > > > > There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > > > > > ring which index in table is 0, but the index of "bucket" ring
> > > > > > is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> > > > > > mempool ops and malloc memory from mempool. The crash will occur:
> > > > > >
> > > > > > bucket_dequeue (access null and crash)
> > > > > > rte_mempool_get_ops (should get "ring_mp_mc",
> > > > > > but get "bucket" mempool)
> > > > > > rte_mempool_ops_dequeue_bulk
> > > > > > ...
> > > > > > rte_pktmbuf_alloc
> > > > > > rte_pktmbuf_copy
> > > > > > pdump_copy
> > > > > > pdump_rx
> > > > > > rte_eth_rx_burst
> > > > > >
> > > > > > To avoid the crash, there are some solution:
> > > > > > * constructor priority: Different mempool uses different
> > > > > > priority in RTE_INIT, but it's not easy to maintain.
> > > > > >
> > > > > > * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> > > > > > be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> > > > > > driver in future, we must make sure the order.
> > > > > >
> > > > > > * register mempool orderly: Sort the mempool when registering,
> > > > > > so the lib linked will not affect the index in mempool table.
> > > > > >
> > > > > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > > > ---
> > > > > > lib/librte_mempool/rte_mempool_ops.c | 18 ++++++++++++++++--
> > > > > > 1 file changed, 16 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> > > > > > index 22c5251..06dfe16 100644
> > > > > > --- a/lib/librte_mempool/rte_mempool_ops.c
> > > > > > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > > > > > @@ -22,7 +22,7 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > > > > rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > > > > > {
> > > > > > struct rte_mempool_ops *ops;
> > > > > > - int16_t ops_index;
> > > > > > + unsigned ops_index, i;
> > > > > >
> > > > > > rte_spinlock_lock(&rte_mempool_ops_table.sl);
> > > > > >
> > > > > > @@ -50,7 +50,19 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > > > > return -EEXIST;
> > > > > > }
> > > > > >
> > > > > > - ops_index = rte_mempool_ops_table.num_ops++;
> > > > > > + /* sort the rte_mempool_ops by name. the order of the mempool
> > > > > > + * lib initiation will not affect rte_mempool_ops index. */
> > > > >
> > > > > +1 for the fix.
> > > > > For the implementation, why not use qsort_r() for sorting?
> > > > The implementation is easy, and the number of mempool driver is not too large.
> > > > But we can use the qsort_r to implement it.
> > >
> > > Since it is in a slow path, IMO, better to use standard sort functions
> > > for better readability.
> > Agree, can you help me review the patch:
> >
> > diff --git a/lib/librte_mempool/rte_mempool_ops.c
> > b/lib/librte_mempool/rte_mempool_ops.c
> > index 22c5251..1acee58 100644
> > --- a/lib/librte_mempool/rte_mempool_ops.c
> > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > @@ -17,6 +17,15 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > .num_ops = 0
> > };
> >
> > +static int
> > +compare_mempool_ops(const void *a, const void *b)
> > +{
> > + const struct rte_mempool_ops *m_a = a;
> > + const struct rte_mempool_ops *m_b = b;
> > +
> > + return strcmp(m_a->name, m_b->name);
> > +}
> > +
> > /* add a new ops struct in rte_mempool_ops_table, return its index. */
> > int
> > rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > @@ -63,6 +72,9 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > ops->get_info = h->get_info;
> > ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
> >
> > + qsort(rte_mempool_ops_table.ops, rte_mempool_ops_table.num_ops,
> > + sizeof(rte_mempool_ops_table.ops[0]), compare_mempool_ops);
>
> Looks good.
>
> Not tested. Please check qsort behavior when
> rte_mempool_ops_table.num_ops == 0 case.
If nmemb (qsort arg) == 0, qsort will do nothing.
>
>
> > +
> > rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> >
> > return ops_index;
> >
> >
> > >
> > > > >
> > > > > > + ops_index = rte_mempool_ops_table.num_ops;
> > > > > > + for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> > > > > > + if (strcmp(h->name, rte_mempool_ops_table.ops[i].name) < 0) {
> > > > > > + do {
> > > > > > + rte_mempool_ops_table.ops[ops_index] =
> > > > > > + rte_mempool_ops_table.ops[ops_index -1];
> > > > > > + } while (--ops_index > i);
> > > > > > + break;
> > > > > > + }
> > > > > > + }
> > > > > > +
> > > > > > ops = &rte_mempool_ops_table.ops[ops_index];
> > > > > > strlcpy(ops->name, h->name, sizeof(ops->name));
> > > > > > ops->alloc = h->alloc;
> > > > > > @@ -63,6 +75,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > > > > ops->get_info = h->get_info;
> > > > > > ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
> > > > > >
> > > > > > + rte_mempool_ops_table.num_ops++;
> > > > > > +
> > > > > > rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > > > > >
> > > > > > return ops_index;
> > > > > > --
> > > > > > 1.8.3.1
> > > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Thanks,
> > > > Tonghao
> >
> >
> >
> > --
> > Thanks,
> > Tonghao
--
Thanks,
Tonghao
^ permalink raw reply [flat|nested] 44+ messages in thread
* [dpdk-dev] [PATCH dpdk-dev v2] mempool: sort the rte_mempool_ops by name
2020-03-02 1:57 [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name xiangxia.m.yue
2020-03-02 13:45 ` Jerin Jacob
@ 2020-03-05 8:20 ` xiangxia.m.yue
2020-03-05 16:57 ` Olivier Matz
2020-03-06 13:36 ` [dpdk-dev] [PATCH dpdk-dev v3] " xiangxia.m.yue
` (3 subsequent siblings)
5 siblings, 1 reply; 44+ messages in thread
From: xiangxia.m.yue @ 2020-03-05 8:20 UTC (permalink / raw)
To: dev, olivier.matz, arybchenko, gage.eads, artem.andreev, jerinj,
ndabilpuram, vattunuru, hemant.agrawal
Cc: Tonghao Zhang
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
The order of mempool initiation affects mempool index in the
rte_mempool_ops_table. For example, when building APPs with:
$ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
The "bucket" mempool will be registered firstly, and its index
in table is 0 while the index of "ring" mempool is 1. DPDK
uses the mk/rte.app.mk to build APPs, and others, for example,
Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
The mempool lib linked in dpdk and Open vSwitch is different.
The mempool can be used between primary and secondary process,
such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
There will be a crash because dpdk-pdump creates the "ring_mp_mc"
ring which index in table is 0, but the index of "bucket" ring
is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
mempool ops and malloc memory from mempool. The crash will occur:
bucket_dequeue (access null and crash)
rte_mempool_get_ops (should get "ring_mp_mc",
but get "bucket" mempool)
rte_mempool_ops_dequeue_bulk
...
rte_pktmbuf_alloc
rte_pktmbuf_copy
pdump_copy
pdump_rx
rte_eth_rx_burst
To avoid the crash, there are some solution:
* constructor priority: Different mempool uses different
priority in RTE_INIT, but it's not easy to maintain.
* change mk/rte.app.mk: Change the order in mk/rte.app.mk to
be same as libdpdk.a/libdpdk.so, but when adding a new mempool
driver in future, we must make sure the order.
* register mempool orderly: Sort the mempool when registering,
so the lib linked will not affect the index in mempool table.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
v2:
1. use the qsort to sort the mempool_ops.
2. tested: https://travis-ci.com/ovn-open-virtual-networks/dpdk-next-net/builds/151894026
---
lib/librte_mempool/rte_mempool_ops.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
index 22c5251..e9113cf 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -17,6 +17,15 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
.num_ops = 0
};
+static int
+compare_mempool_ops(const void *a, const void *b)
+{
+ const struct rte_mempool_ops *m_a = a;
+ const struct rte_mempool_ops *m_b = b;
+
+ return strcmp(m_a->name, m_b->name);
+}
+
/* add a new ops struct in rte_mempool_ops_table, return its index. */
int
rte_mempool_register_ops(const struct rte_mempool_ops *h)
@@ -63,6 +72,11 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
ops->get_info = h->get_info;
ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
+ /* sort the rte_mempool_ops by name. the order of the mempool
+ * lib initiation will not affect rte_mempool_ops index. */
+ qsort(rte_mempool_ops_table.ops, rte_mempool_ops_table.num_ops,
+ sizeof(rte_mempool_ops_table.ops[0]), compare_mempool_ops);
+
rte_spinlock_unlock(&rte_mempool_ops_table.sl);
return ops_index;
--
1.8.3.1
* Re: [dpdk-dev] [PATCH dpdk-dev v2] mempool: sort the rte_mempool_ops by name
2020-03-05 8:20 ` [dpdk-dev] [PATCH dpdk-dev v2] " xiangxia.m.yue
@ 2020-03-05 16:57 ` Olivier Matz
0 siblings, 0 replies; 44+ messages in thread
From: Olivier Matz @ 2020-03-05 16:57 UTC (permalink / raw)
To: xiangxia.m.yue
Cc: dev, arybchenko, gage.eads, artem.andreev, jerinj, ndabilpuram,
vattunuru, hemant.agrawal
Hi,
On Thu, Mar 05, 2020 at 04:20:40PM +0800, xiangxia.m.yue@gmail.com wrote:
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> The order of mempool initiation affects mempool index in the
> rte_mempool_ops_table. For example, when building APPs with:
>
> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>
> The "bucket" mempool will be registered firstly, and its index
> in table is 0 while the index of "ring" mempool is 1. DPDK
> uses the mk/rte.app.mk to build APPs, and others, for example,
> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> The mempool lib linked in dpdk and Open vSwitch is different.
>
> The mempool can be used between primary and secondary process,
> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> ring which index in table is 0, but the index of "bucket" ring
> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> mempool ops and malloc memory from mempool. The crash will occur:
>
> bucket_dequeue (access null and crash)
> rte_mempool_get_ops (should get "ring_mp_mc",
> but get "bucket" mempool)
> rte_mempool_ops_dequeue_bulk
> ...
> rte_pktmbuf_alloc
> rte_pktmbuf_copy
> pdump_copy
> pdump_rx
> rte_eth_rx_burst
>
> To avoid the crash, there are some solution:
> * constructor priority: Different mempool uses different
> priority in RTE_INIT, but it's not easy to maintain.
>
> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> driver in future, we must make sure the order.
>
> * register mempool orderly: Sort the mempool when registering,
> so the lib linked will not affect the index in mempool table.
>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Sorting the pool drivers certainly makes things better than they
are today. However, there is still an issue as soon as the mempool
list is not the same on both sides.
I don't see any better solution anyway. Storing the ops pointers in the
shared memory won't work, since function addresses won't be the same in
the 2 processes.
Just one minor comment below.
> ---
> v2:
> 1. use the qsort to sort the mempool_ops.
> 2. tested: https://travis-ci.com/ovn-open-virtual-networks/dpdk-next-net/builds/151894026
> ---
> lib/librte_mempool/rte_mempool_ops.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> index 22c5251..e9113cf 100644
> --- a/lib/librte_mempool/rte_mempool_ops.c
> +++ b/lib/librte_mempool/rte_mempool_ops.c
> @@ -17,6 +17,15 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> .num_ops = 0
> };
>
> +static int
> +compare_mempool_ops(const void *a, const void *b)
> +{
> + const struct rte_mempool_ops *m_a = a;
> + const struct rte_mempool_ops *m_b = b;
> +
> + return strcmp(m_a->name, m_b->name);
> +}
> +
> /* add a new ops struct in rte_mempool_ops_table, return its index. */
> int
> rte_mempool_register_ops(const struct rte_mempool_ops *h)
> @@ -63,6 +72,11 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> ops->get_info = h->get_info;
> ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
>
> + /* sort the rte_mempool_ops by name. the order of the mempool
> + * lib initiation will not affect rte_mempool_ops index. */
initiation -> initialization
there is also a checkpatch comment:
WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line
#153: FILE: lib/librte_mempool/rte_mempool_ops.c:76:
+ * lib initiation will not affect rte_mempool_ops index. */
> + qsort(rte_mempool_ops_table.ops, rte_mempool_ops_table.num_ops,
> + sizeof(rte_mempool_ops_table.ops[0]), compare_mempool_ops);
> +
> rte_spinlock_unlock(&rte_mempool_ops_table.sl);
>
> return ops_index;
> --
> 1.8.3.1
>
Then,
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Thanks!
* [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-02 1:57 [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name xiangxia.m.yue
2020-03-02 13:45 ` Jerin Jacob
2020-03-05 8:20 ` [dpdk-dev] [PATCH dpdk-dev v2] " xiangxia.m.yue
@ 2020-03-06 13:36 ` xiangxia.m.yue
2020-03-06 13:37 ` Jerin Jacob
2020-04-09 10:52 ` [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization xiangxia.m.yue
` (2 subsequent siblings)
5 siblings, 1 reply; 44+ messages in thread
From: xiangxia.m.yue @ 2020-03-06 13:36 UTC (permalink / raw)
To: dev, olivier.matz, arybchenko, gage.eads, artem.andreev, jerinj,
ndabilpuram, vattunuru, hemant.agrawal
Cc: Tonghao Zhang
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
The order of mempool initiation affects mempool index in the
rte_mempool_ops_table. For example, when building APPs with:
$ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
The "bucket" mempool will be registered firstly, and its index
in table is 0 while the index of "ring" mempool is 1. DPDK
uses the mk/rte.app.mk to build APPs, and others, for example,
Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
The mempool lib linked in dpdk and Open vSwitch is different.
The mempool can be used between primary and secondary process,
such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
There will be a crash because dpdk-pdump creates the "ring_mp_mc"
ring which index in table is 0, but the index of "bucket" ring
is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
mempool ops and malloc memory from mempool. The crash will occur:
bucket_dequeue (access null and crash)
rte_mempool_get_ops (should get "ring_mp_mc",
but get "bucket" mempool)
rte_mempool_ops_dequeue_bulk
...
rte_pktmbuf_alloc
rte_pktmbuf_copy
pdump_copy
pdump_rx
rte_eth_rx_burst
To avoid the crash, there are some solution:
* constructor priority: Different mempool uses different
priority in RTE_INIT, but it's not easy to maintain.
* change mk/rte.app.mk: Change the order in mk/rte.app.mk to
be same as libdpdk.a/libdpdk.so, but when adding a new mempool
driver in future, we must make sure the order.
* register mempool orderly: Sort the mempool when registering,
so the lib linked will not affect the index in mempool table.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
v3:
* fix checkpatches.sh WARNING
* change "initiation -> initialization"
v2:
* use the qsort to sort the mempool_ops.
* tested: https://travis-ci.com/ovn-open-virtual-networks/dpdk-next-net/builds/151894026
---
lib/librte_mempool/rte_mempool_ops.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
index 22c5251..b0da096 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -17,6 +17,15 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
.num_ops = 0
};
+static int
+compare_mempool_ops(const void *a, const void *b)
+{
+ const struct rte_mempool_ops *m_a = a;
+ const struct rte_mempool_ops *m_b = b;
+
+ return strcmp(m_a->name, m_b->name);
+}
+
/* add a new ops struct in rte_mempool_ops_table, return its index. */
int
rte_mempool_register_ops(const struct rte_mempool_ops *h)
@@ -63,6 +72,13 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
ops->get_info = h->get_info;
ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
+ /*
+ * sort the rte_mempool_ops by name. the order of the mempool
+ * lib initialization will not affect rte_mempool_ops index.
+ */
+ qsort(rte_mempool_ops_table.ops, rte_mempool_ops_table.num_ops,
+ sizeof(rte_mempool_ops_table.ops[0]), compare_mempool_ops);
+
rte_spinlock_unlock(&rte_mempool_ops_table.sl);
return ops_index;
--
1.8.3.1
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-06 13:36 ` [dpdk-dev] [PATCH dpdk-dev v3] " xiangxia.m.yue
@ 2020-03-06 13:37 ` Jerin Jacob
2020-03-07 12:51 ` Andrew Rybchenko
0 siblings, 1 reply; 44+ messages in thread
From: Jerin Jacob @ 2020-03-06 13:37 UTC (permalink / raw)
To: Tonghao Zhang
Cc: dpdk-dev, Olivier Matz, Andrew Rybchenko, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal
On Fri, Mar 6, 2020 at 7:06 PM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> The order of mempool initiation affects mempool index in the
> rte_mempool_ops_table. For example, when building APPs with:
>
> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>
> The "bucket" mempool will be registered firstly, and its index
> in table is 0 while the index of "ring" mempool is 1. DPDK
> uses the mk/rte.app.mk to build APPs, and others, for example,
> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> The mempool lib linked in dpdk and Open vSwitch is different.
>
> The mempool can be used between primary and secondary process,
> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> ring which index in table is 0, but the index of "bucket" ring
> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> mempool ops and malloc memory from mempool. The crash will occur:
>
> bucket_dequeue (access null and crash)
> rte_mempool_get_ops (should get "ring_mp_mc",
> but get "bucket" mempool)
> rte_mempool_ops_dequeue_bulk
> ...
> rte_pktmbuf_alloc
> rte_pktmbuf_copy
> pdump_copy
> pdump_rx
> rte_eth_rx_burst
>
> To avoid the crash, there are some solution:
> * constructor priority: Different mempool uses different
> priority in RTE_INIT, but it's not easy to maintain.
>
> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> driver in future, we must make sure the order.
>
> * register mempool orderly: Sort the mempool when registering,
> so the lib linked will not affect the index in mempool table.
>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
> ---
> v3:
> * fix checkpatches.sh WARNING
> * change "initiation -> initialization"
>
> v2:
> * use the qsort to sort the mempool_ops.
> * tested: https://travis-ci.com/ovn-open-virtual-networks/dpdk-next-net/builds/151894026
> ---
> lib/librte_mempool/rte_mempool_ops.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> index 22c5251..b0da096 100644
> --- a/lib/librte_mempool/rte_mempool_ops.c
> +++ b/lib/librte_mempool/rte_mempool_ops.c
> @@ -17,6 +17,15 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> .num_ops = 0
> };
>
> +static int
> +compare_mempool_ops(const void *a, const void *b)
> +{
> + const struct rte_mempool_ops *m_a = a;
> + const struct rte_mempool_ops *m_b = b;
> +
> + return strcmp(m_a->name, m_b->name);
> +}
> +
> /* add a new ops struct in rte_mempool_ops_table, return its index. */
> int
> rte_mempool_register_ops(const struct rte_mempool_ops *h)
> @@ -63,6 +72,13 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> ops->get_info = h->get_info;
> ops->dequeue_contig_blocks = h->dequeue_contig_blocks;
>
> + /*
> + * sort the rte_mempool_ops by name. the order of the mempool
> + * lib initialization will not affect rte_mempool_ops index.
> + */
> + qsort(rte_mempool_ops_table.ops, rte_mempool_ops_table.num_ops,
> + sizeof(rte_mempool_ops_table.ops[0]), compare_mempool_ops);
> +
> rte_spinlock_unlock(&rte_mempool_ops_table.sl);
>
> return ops_index;
> --
> 1.8.3.1
>
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-06 13:37 ` Jerin Jacob
@ 2020-03-07 12:51 ` Andrew Rybchenko
2020-03-07 12:54 ` Andrew Rybchenko
0 siblings, 1 reply; 44+ messages in thread
From: Andrew Rybchenko @ 2020-03-07 12:51 UTC (permalink / raw)
To: Jerin Jacob, Tonghao Zhang
Cc: dpdk-dev, Olivier Matz, Gage Eads, Artem V. Andreev, Jerin Jacob,
Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal
On 3/6/20 4:37 PM, Jerin Jacob wrote:
> On Fri, Mar 6, 2020 at 7:06 PM <xiangxia.m.yue@gmail.com> wrote:
>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>
>> The order of mempool initiation affects mempool index in the
>> rte_mempool_ops_table. For example, when building APPs with:
>>
>> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>>
>> The "bucket" mempool will be registered firstly, and its index
>> in table is 0 while the index of "ring" mempool is 1. DPDK
>> uses the mk/rte.app.mk to build APPs, and others, for example,
>> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
>> The mempool lib linked in dpdk and Open vSwitch is different.
>>
>> The mempool can be used between primary and secondary process,
>> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
>> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
>> ring which index in table is 0, but the index of "bucket" ring
>> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
>> mempool ops and malloc memory from mempool. The crash will occur:
>>
>> bucket_dequeue (access null and crash)
>> rte_mempool_get_ops (should get "ring_mp_mc",
>> but get "bucket" mempool)
>> rte_mempool_ops_dequeue_bulk
>> ...
>> rte_pktmbuf_alloc
>> rte_pktmbuf_copy
>> pdump_copy
>> pdump_rx
>> rte_eth_rx_burst
>>
>> To avoid the crash, there are some solution:
>> * constructor priority: Different mempool uses different
>> priority in RTE_INIT, but it's not easy to maintain.
>>
>> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
>> be same as libdpdk.a/libdpdk.so, but when adding a new mempool
>> driver in future, we must make sure the order.
>>
>> * register mempool orderly: Sort the mempool when registering,
>> so the lib linked will not affect the index in mempool table.
>>
>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> Acked-by: Jerin Jacob <jerinj@marvell.com>
The patch is OK, but the fact that ops index changes during
mempool driver lifetime is frightening. In fact it breaks
rte_mempool_register_ops() return value semantics (read
as API break). The return value is not used in DPDK, but it
is a public function. If I'm not mistaken it should be taken
into account.
Also I remember patches which warn about above behaviour
in documentation. If behaviour changes, corresponding
documentation must be updated.
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-07 12:51 ` Andrew Rybchenko
@ 2020-03-07 12:54 ` Andrew Rybchenko
2020-03-09 3:01 ` Tonghao Zhang
0 siblings, 1 reply; 44+ messages in thread
From: Andrew Rybchenko @ 2020-03-07 12:54 UTC (permalink / raw)
To: Jerin Jacob, Tonghao Zhang
Cc: dpdk-dev, Olivier Matz, Gage Eads, Artem V. Andreev, Jerin Jacob,
Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal
On 3/7/20 3:51 PM, Andrew Rybchenko wrote:
> On 3/6/20 4:37 PM, Jerin Jacob wrote:
>> On Fri, Mar 6, 2020 at 7:06 PM <xiangxia.m.yue@gmail.com> wrote:
>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>
>>> The order of mempool initiation affects mempool index in the
>>> rte_mempool_ops_table. For example, when building APPs with:
>>>
>>> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>>>
>>> The "bucket" mempool will be registered firstly, and its index
>>> in table is 0 while the index of "ring" mempool is 1. DPDK
>>> uses the mk/rte.app.mk to build APPs, and others, for example,
>>> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
>>> The mempool lib linked in dpdk and Open vSwitch is different.
>>>
>>> The mempool can be used between primary and secondary process,
>>> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
>>> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
>>> ring which index in table is 0, but the index of "bucket" ring
>>> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
>>> mempool ops and malloc memory from mempool. The crash will occur:
>>>
>>> bucket_dequeue (access null and crash)
>>> rte_mempool_get_ops (should get "ring_mp_mc",
>>> but get "bucket" mempool)
>>> rte_mempool_ops_dequeue_bulk
>>> ...
>>> rte_pktmbuf_alloc
>>> rte_pktmbuf_copy
>>> pdump_copy
>>> pdump_rx
>>> rte_eth_rx_burst
>>>
>>> To avoid the crash, there are some solution:
>>> * constructor priority: Different mempool uses different
>>> priority in RTE_INIT, but it's not easy to maintain.
>>>
>>> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
>>> be same as libdpdk.a/libdpdk.so, but when adding a new mempool
>>> driver in future, we must make sure the order.
>>>
>>> * register mempool orderly: Sort the mempool when registering,
>>> so the lib linked will not affect the index in mempool table.
>>>
>>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>> Acked-by: Olivier Matz <olivier.matz@6wind.com>
>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>
> The patch is OK, but the fact that ops index changes during
> mempool driver lifetime is frightening. In fact it breaks
> rte_mempool_register_ops() return value semantics (read
> as API break). The return value is not used in DPDK, but it
> is a public function. If I'm not mistaken it should be taken
> into account.
>
> Also I remember patches which warn about above behaviour
> in documentation. If behaviour changes, corresponding
> documentation must be updated.
One more point. If the patch is finally accepted, it definitely
deserves a few lines in the release notes.
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-07 12:54 ` Andrew Rybchenko
@ 2020-03-09 3:01 ` Tonghao Zhang
2020-03-09 8:27 ` Olivier Matz
0 siblings, 1 reply; 44+ messages in thread
From: Tonghao Zhang @ 2020-03-09 3:01 UTC (permalink / raw)
To: Andrew Rybchenko
Cc: Jerin Jacob, dpdk-dev, Olivier Matz, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal
On Sat, Mar 7, 2020 at 8:54 PM Andrew Rybchenko
<arybchenko@solarflare.com> wrote:
>
> On 3/7/20 3:51 PM, Andrew Rybchenko wrote:
> > On 3/6/20 4:37 PM, Jerin Jacob wrote:
> >> On Fri, Mar 6, 2020 at 7:06 PM <xiangxia.m.yue@gmail.com> wrote:
> >>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >>>
> >>> The order of mempool initiation affects mempool index in the
> >>> rte_mempool_ops_table. For example, when building APPs with:
> >>>
> >>> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> >>>
> >>> The "bucket" mempool will be registered firstly, and its index
> >>> in table is 0 while the index of "ring" mempool is 1. DPDK
> >>> uses the mk/rte.app.mk to build APPs, and others, for example,
> >>> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> >>> The mempool lib linked in dpdk and Open vSwitch is different.
> >>>
> >>> The mempool can be used between primary and secondary process,
> >>> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> >>> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> >>> ring which index in table is 0, but the index of "bucket" ring
> >>> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> >>> mempool ops and malloc memory from mempool. The crash will occur:
> >>>
> >>> bucket_dequeue (access null and crash)
> >>> rte_mempool_get_ops (should get "ring_mp_mc",
> >>> but get "bucket" mempool)
> >>> rte_mempool_ops_dequeue_bulk
> >>> ...
> >>> rte_pktmbuf_alloc
> >>> rte_pktmbuf_copy
> >>> pdump_copy
> >>> pdump_rx
> >>> rte_eth_rx_burst
> >>>
> >>> To avoid the crash, there are some solution:
> >>> * constructor priority: Different mempool uses different
> >>> priority in RTE_INIT, but it's not easy to maintain.
> >>>
> >>> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> >>> be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> >>> driver in future, we must make sure the order.
> >>>
> >>> * register mempool orderly: Sort the mempool when registering,
> >>> so the lib linked will not affect the index in mempool table.
> >>>
> >>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >>> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> >> Acked-by: Jerin Jacob <jerinj@marvell.com>
> >
> > The patch is OK, but the fact that ops index changes during
> > mempool driver lifetime is frightening. In fact it breaks
> > rte_mempool_register_ops() return value semantics (read
> > as API break). The return value is not used in DPDK, but it
> > is a public function. If I'm not mistaken it should be taken
> > into account.
Yes, we should update the doc. How about this:
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index c90cf31..5a9c8a7 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -904,7 +904,9 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
* @param ops
* Pointer to an ops structure to register.
* @return
- * - >=0: Success; return the index of the ops struct in the table.
+ * - >=0: Success; return the index of the last ops struct in the table.
+ * The number of the ops struct registered is equal to index
+ * returned + 1.
* - -EINVAL - some missing callbacks while registering ops struct.
* - -ENOSPC - the maximum number of ops structs has been reached.
*/
diff --git a/lib/librte_mempool/rte_mempool_ops.c
b/lib/librte_mempool/rte_mempool_ops.c
index b0da096..053f340 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -26,7 +26,11 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
return strcmp(m_a->name, m_b->name);
}
-/* add a new ops struct in rte_mempool_ops_table, return its index. */
+/*
+ * add a new ops struct in rte_mempool_ops_table.
+ * on success, return the index of the last ops
+ * struct in the table.
+ */
int
rte_mempool_register_ops(const struct rte_mempool_ops *h)
{
> > Also I remember patches which warn about above behaviour
> > in documentation. If behaviour changes, corresponding
> > documentation must be updated.
>
> One more point. If the patch is finally accepted it definitely
> deserves few lines in release notes.
OK, should a separate patch be sent before the DPDK 20.05 release?
>
--
Thanks,
Tonghao
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-09 3:01 ` Tonghao Zhang
@ 2020-03-09 8:27 ` Olivier Matz
2020-03-09 8:55 ` Tonghao Zhang
2020-03-24 9:35 ` Andrew Rybchenko
0 siblings, 2 replies; 44+ messages in thread
From: Olivier Matz @ 2020-03-09 8:27 UTC (permalink / raw)
To: Tonghao Zhang
Cc: Andrew Rybchenko, Jerin Jacob, dpdk-dev, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal
Hi,
On Mon, Mar 09, 2020 at 11:01:25AM +0800, Tonghao Zhang wrote:
> On Sat, Mar 7, 2020 at 8:54 PM Andrew Rybchenko
> <arybchenko@solarflare.com> wrote:
> >
> > On 3/7/20 3:51 PM, Andrew Rybchenko wrote:
> > > On 3/6/20 4:37 PM, Jerin Jacob wrote:
> > >> On Fri, Mar 6, 2020 at 7:06 PM <xiangxia.m.yue@gmail.com> wrote:
> > >>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > >>>
> > >>> The order of mempool initiation affects mempool index in the
> > >>> rte_mempool_ops_table. For example, when building APPs with:
> > >>>
> > >>> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > >>>
> > >>> The "bucket" mempool will be registered firstly, and its index
> > >>> in table is 0 while the index of "ring" mempool is 1. DPDK
> > >>> uses the mk/rte.app.mk to build APPs, and others, for example,
> > >>> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > >>> The mempool lib linked in dpdk and Open vSwitch is different.
> > >>>
> > >>> The mempool can be used between primary and secondary process,
> > >>> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> > >>> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > >>> ring which index in table is 0, but the index of "bucket" ring
> > >>> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> > >>> mempool ops and malloc memory from mempool. The crash will occur:
> > >>>
> > >>> bucket_dequeue (access null and crash)
> > >>> rte_mempool_get_ops (should get "ring_mp_mc",
> > >>> but get "bucket" mempool)
> > >>> rte_mempool_ops_dequeue_bulk
> > >>> ...
> > >>> rte_pktmbuf_alloc
> > >>> rte_pktmbuf_copy
> > >>> pdump_copy
> > >>> pdump_rx
> > >>> rte_eth_rx_burst
> > >>>
> > >>> To avoid the crash, there are some solution:
> > >>> * constructor priority: Different mempool uses different
> > >>> priority in RTE_INIT, but it's not easy to maintain.
> > >>>
> > >>> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> > >>> be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> > >>> driver in future, we must make sure the order.
> > >>>
> > >>> * register mempool orderly: Sort the mempool when registering,
> > >>> so the lib linked will not affect the index in mempool table.
> > >>>
> > >>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > >>> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > >> Acked-by: Jerin Jacob <jerinj@marvell.com>
> > >
> > > The patch is OK, but the fact that ops index changes during
> > > mempool driver lifetime is frightening. In fact it breaks
> > > rte_mempool_register_ops() return value semantics (read
> > > as API break). The return value is not used in DPDK, but it
> > > is a public function. If I'm not mistaken it should be taken
> > > into account.
Good points.
The fact that the ops index changes during the mempool driver lifetime is
indeed frightening, especially knowing that this is a dynamic
registration that could happen at any moment in the life of the
application. Also, breaking the ABI is not desirable.
Let me try to propose something else to solve your issue:
1/ At init, the primary process allocates a struct in shared memory
(named memzone):
struct rte_mempool_shared_ops {
	size_t num_mempool_ops;
	struct {
		char name[RTE_MEMPOOL_OPS_NAMESIZE];
	} mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
	char *mempool_ops_name[RTE_MEMPOOL_MAX_OPS_IDX];
	rte_spinlock_t mempool;
}
2/ When we register a mempool ops, we first get a name and id from the
shared struct: with the lock held, look up the registered name and
return its index; otherwise take the next free id and copy the name into the struct.
3/ Then do as before (in the per-process global table), except that we
reuse the registered id.
We can remove the num_ops field from rte_mempool_ops_table.
Thoughts?
> Yes, should update the doc: how about this:
>
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index c90cf31..5a9c8a7 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -904,7 +904,9 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> * @param ops
> * Pointer to an ops structure to register.
> * @return
> - * - >=0: Success; return the index of the ops struct in the table.
> + * - >=0: Success; return the index of the last ops struct in the table.
> + * The number of the ops struct registered is equal to index
> + * returned + 1.
> * - -EINVAL - some missing callbacks while registering ops struct.
> * - -ENOSPC - the maximum number of ops structs has been reached.
> */
> diff --git a/lib/librte_mempool/rte_mempool_ops.c
> b/lib/librte_mempool/rte_mempool_ops.c
> index b0da096..053f340 100644
> --- a/lib/librte_mempool/rte_mempool_ops.c
> +++ b/lib/librte_mempool/rte_mempool_ops.c
> @@ -26,7 +26,11 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> return strcmp(m_a->name, m_b->name);
> }
>
> -/* add a new ops struct in rte_mempool_ops_table, return its index. */
> +/*
> + * add a new ops struct in rte_mempool_ops_table.
> + * on success, return the index of the last ops
> + * struct in the table.
> + */
> int
> rte_mempool_register_ops(const struct rte_mempool_ops *h)
> {
> > > Also I remember patches which warn about above behaviour
> > > in documentation. If behaviour changes, corresponding
> > > documentation must be updated.
> >
> > One more point. If the patch is finally accepted it definitely
> > deserves few lines in release notes.
> OK, a separate patch should be sent before DPDK 20.05 release ?
> >
>
>
> --
> Thanks,
> Tonghao
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-09 8:27 ` Olivier Matz
@ 2020-03-09 8:55 ` Tonghao Zhang
2020-03-09 9:05 ` Olivier Matz
2020-03-09 13:15 ` David Marchand
2020-03-24 9:35 ` Andrew Rybchenko
1 sibling, 2 replies; 44+ messages in thread
From: Tonghao Zhang @ 2020-03-09 8:55 UTC (permalink / raw)
To: Olivier Matz
Cc: Andrew Rybchenko, Jerin Jacob, dpdk-dev, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal
On Mon, Mar 9, 2020 at 4:27 PM Olivier Matz <olivier.matz@6wind.com> wrote:
>
> Hi,
>
> On Mon, Mar 09, 2020 at 11:01:25AM +0800, Tonghao Zhang wrote:
> > On Sat, Mar 7, 2020 at 8:54 PM Andrew Rybchenko
> > <arybchenko@solarflare.com> wrote:
> > >
> > > On 3/7/20 3:51 PM, Andrew Rybchenko wrote:
> > > > On 3/6/20 4:37 PM, Jerin Jacob wrote:
> > > >> On Fri, Mar 6, 2020 at 7:06 PM <xiangxia.m.yue@gmail.com> wrote:
> > > >>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > >>>
> > > >>> The order of mempool initiation affects mempool index in the
> > > >>> rte_mempool_ops_table. For example, when building APPs with:
> > > >>>
> > > >>> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > >>>
> > > >>> The "bucket" mempool will be registered firstly, and its index
> > > >>> in table is 0 while the index of "ring" mempool is 1. DPDK
> > > >>> uses the mk/rte.app.mk to build APPs, and others, for example,
> > > >>> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > >>> The mempool lib linked in dpdk and Open vSwitch is different.
> > > >>>
> > > >>> The mempool can be used between primary and secondary process,
> > > >>> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> > > >>> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > > >>> ring which index in table is 0, but the index of "bucket" ring
> > > >>> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> > > >>> mempool ops and malloc memory from mempool. The crash will occur:
> > > >>>
> > > >>> bucket_dequeue (access null and crash)
> > > >>> rte_mempool_get_ops (should get "ring_mp_mc",
> > > >>> but get "bucket" mempool)
> > > >>> rte_mempool_ops_dequeue_bulk
> > > >>> ...
> > > >>> rte_pktmbuf_alloc
> > > >>> rte_pktmbuf_copy
> > > >>> pdump_copy
> > > >>> pdump_rx
> > > >>> rte_eth_rx_burst
> > > >>>
> > > >>> To avoid the crash, there are some solution:
> > > >>> * constructor priority: Different mempool uses different
> > > >>> priority in RTE_INIT, but it's not easy to maintain.
> > > >>>
> > > >>> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> > > >>> be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> > > >>> driver in future, we must make sure the order.
> > > >>>
> > > >>> * register mempool orderly: Sort the mempool when registering,
> > > >>> so the lib linked will not affect the index in mempool table.
> > > >>>
> > > >>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > >>> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > > >> Acked-by: Jerin Jacob <jerinj@marvell.com>
> > > >
> > > > The patch is OK, but the fact that ops index changes during
> > > > mempool driver lifetime is frightening. In fact it breaks
> > > > rte_mempool_register_ops() return value semantics (read
> > > > as API break). The return value is not used in DPDK, but it
> > > > is a public function. If I'm not mistaken it should be taken
> > > > into account.
>
> Good points.
>
> The fact that the ops index changes during mempool driver lifetime is
> indeed frightening, especially knowing that this is a dynamic
> registration that could happen at any moment in the life of the
> application. Also, breaking the ABI is not desirable.
That solution is better.
> Let me try to propose something else to solve your issue:
>
> 1/ At init, the primary process allocates a struct in shared memory
> (named memzone):
>
> struct rte_mempool_shared_ops {
> size_t num_mempool_ops;
> struct {
> char name[RTE_MEMPOOL_OPS_NAMESIZE];
> } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> char *mempool_ops_name[RTE_MEMPOOL_MAX_OPS_IDX];
> rte_spinlock_t mempool;
> }
>
> 2/ When we register a mempool ops, we first get a name and id from the
> shared struct: with the lock held, lookup for the registered name and
> return its index, else get the last id and copy the name in the struct.
>
> 3/ Then do as before (in the per-process global table), except that we
> reuse the registered id.
>
> We can remove the num_ops field from rte_mempool_ops_table.
>
> Thoughts?
>
>
> > Yes, should update the doc: how about this:
> >
> > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> > index c90cf31..5a9c8a7 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -904,7 +904,9 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> > * @param ops
> > * Pointer to an ops structure to register.
> > * @return
> > - * - >=0: Success; return the index of the ops struct in the table.
> > + * - >=0: Success; return the index of the last ops struct in the table.
> > + * The number of the ops struct registered is equal to index
> > + * returned + 1.
> > * - -EINVAL - some missing callbacks while registering ops struct.
> > * - -ENOSPC - the maximum number of ops structs has been reached.
> > */
> > diff --git a/lib/librte_mempool/rte_mempool_ops.c
> > b/lib/librte_mempool/rte_mempool_ops.c
> > index b0da096..053f340 100644
> > --- a/lib/librte_mempool/rte_mempool_ops.c
> > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > @@ -26,7 +26,11 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > return strcmp(m_a->name, m_b->name);
> > }
> >
> > -/* add a new ops struct in rte_mempool_ops_table, return its index. */
> > +/*
> > + * add a new ops struct in rte_mempool_ops_table.
> > + * on success, return the index of the last ops
> > + * struct in the table.
> > + */
> > int
> > rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > {
> > > > Also I remember patches which warn about above behaviour
> > > > in documentation. If behaviour changes, corresponding
> > > > documentation must be updated.
> > >
> > > One more point. If the patch is finally accepted it definitely
> > > deserves few lines in release notes.
> > OK, a separate patch should be sent before DPDK 20.05 release ?
> > >
> >
> >
> > --
> > Thanks,
> > Tonghao
--
Thanks,
Tonghao
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-09 8:55 ` Tonghao Zhang
@ 2020-03-09 9:05 ` Olivier Matz
2020-03-09 13:15 ` David Marchand
1 sibling, 0 replies; 44+ messages in thread
From: Olivier Matz @ 2020-03-09 9:05 UTC (permalink / raw)
To: Tonghao Zhang
Cc: Andrew Rybchenko, Jerin Jacob, dpdk-dev, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal
On Mon, Mar 09, 2020 at 04:55:28PM +0800, Tonghao Zhang wrote:
> On Mon, Mar 9, 2020 at 4:27 PM Olivier Matz <olivier.matz@6wind.com> wrote:
> >
> > Hi,
> >
> > On Mon, Mar 09, 2020 at 11:01:25AM +0800, Tonghao Zhang wrote:
> > > On Sat, Mar 7, 2020 at 8:54 PM Andrew Rybchenko
> > > <arybchenko@solarflare.com> wrote:
> > > >
> > > > On 3/7/20 3:51 PM, Andrew Rybchenko wrote:
> > > > > On 3/6/20 4:37 PM, Jerin Jacob wrote:
> > > > >> On Fri, Mar 6, 2020 at 7:06 PM <xiangxia.m.yue@gmail.com> wrote:
> > > > >>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > >>>
> > > > >>> The order of mempool initiation affects mempool index in the
> > > > >>> rte_mempool_ops_table. For example, when building APPs with:
> > > > >>>
> > > > >>> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > > >>>
> > > > >>> The "bucket" mempool will be registered firstly, and its index
> > > > >>> in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > >>> uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > >>> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > >>> The mempool lib linked in dpdk and Open vSwitch is different.
> > > > >>>
> > > > >>> The mempool can be used between primary and secondary process,
> > > > >>> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> > > > >>> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > > > >>> ring which index in table is 0, but the index of "bucket" ring
> > > > >>> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> > > > >>> mempool ops and malloc memory from mempool. The crash will occur:
> > > > >>>
> > > > >>> bucket_dequeue (access null and crash)
> > > > >>> rte_mempool_get_ops (should get "ring_mp_mc",
> > > > >>> but get "bucket" mempool)
> > > > >>> rte_mempool_ops_dequeue_bulk
> > > > >>> ...
> > > > >>> rte_pktmbuf_alloc
> > > > >>> rte_pktmbuf_copy
> > > > >>> pdump_copy
> > > > >>> pdump_rx
> > > > >>> rte_eth_rx_burst
> > > > >>>
> > > > >>> To avoid the crash, there are some solution:
> > > > >>> * constructor priority: Different mempool uses different
> > > > >>> priority in RTE_INIT, but it's not easy to maintain.
> > > > >>>
> > > > >>> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> > > > >>> be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> > > > >>> driver in future, we must make sure the order.
> > > > >>>
> > > > >>> * register mempool orderly: Sort the mempool when registering,
> > > > >>> so the lib linked will not affect the index in mempool table.
> > > > >>>
> > > > >>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > >>> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> > > > >> Acked-by: Jerin Jacob <jerinj@marvell.com>
> > > > >
> > > > > The patch is OK, but the fact that ops index changes during
> > > > > mempool driver lifetime is frightening. In fact it breaks
> > > > > rte_mempool_register_ops() return value semantics (read
> > > > > as API break). The return value is not used in DPDK, but it
> > > > > is a public function. If I'm not mistaken it should be taken
> > > > > into account.
> >
> > Good points.
> >
> > The fact that the ops index changes during mempool driver lifetime is
> > indeed frightening, especially knowing that this is a dynamic
> > registration that could happen at any moment in the life of the
> > application. Also, breaking the ABI is not desirable.
> That solution is better.
>
> > Let me try to propose something else to solve your issue:
> >
> > 1/ At init, the primary process allocates a struct in shared memory
> > (named memzone):
> >
> > struct rte_mempool_shared_ops {
> > size_t num_mempool_ops;
> > struct {
> > char name[RTE_MEMPOOL_OPS_NAMESIZE];
> > } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> > char *mempool_ops_name[RTE_MEMPOOL_MAX_OPS_IDX];
Oops, I forgot to remove this line (it is replaced by the mini-struct just above).
> > rte_spinlock_t mempool;
> > }
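For clarity, the shared struct without that leftover line would look like the following sketch. The RTE_ constants and the spinlock type are stubbed here so the snippet stands alone; the real definitions come from the DPDK headers.

```c
#include <stddef.h>

#define RTE_MEMPOOL_OPS_NAMESIZE 32  /* stub: real value is in rte_mempool.h */
#define RTE_MEMPOOL_MAX_OPS_IDX  16  /* stub */
typedef int rte_spinlock_t;          /* stub for the rte_spinlock.h type */

struct rte_mempool_shared_ops {
	size_t num_mempool_ops;
	struct {
		char name[RTE_MEMPOOL_OPS_NAMESIZE];
	} mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
	rte_spinlock_t mempool;          /* protects name/id assignment */
};
```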
> >
> > 2/ When we register a mempool ops, we first get a name and id from the
> > shared struct: with the lock held, lookup for the registered name and
> > return its index, else get the last id and copy the name in the struct.
> >
> > 3/ Then do as before (in the per-process global table), except that we
> > reuse the registered id.
> >
> > We can remove the num_ops field from rte_mempool_ops_table.
> >
> > Thoughts?
> >
> >
> > > Yes, should update the doc: how about this:
> > >
> > > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> > > index c90cf31..5a9c8a7 100644
> > > --- a/lib/librte_mempool/rte_mempool.h
> > > +++ b/lib/librte_mempool/rte_mempool.h
> > > @@ -904,7 +904,9 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> > > * @param ops
> > > * Pointer to an ops structure to register.
> > > * @return
> > > - * - >=0: Success; return the index of the ops struct in the table.
> > > + * - >=0: Success; return the index of the last ops struct in the table.
> > > + * The number of the ops struct registered is equal to index
> > > + * returned + 1.
> > > * - -EINVAL - some missing callbacks while registering ops struct.
> > > * - -ENOSPC - the maximum number of ops structs has been reached.
> > > */
> > > diff --git a/lib/librte_mempool/rte_mempool_ops.c
> > > b/lib/librte_mempool/rte_mempool_ops.c
> > > index b0da096..053f340 100644
> > > --- a/lib/librte_mempool/rte_mempool_ops.c
> > > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > > @@ -26,7 +26,11 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > > return strcmp(m_a->name, m_b->name);
> > > }
> > >
> > > -/* add a new ops struct in rte_mempool_ops_table, return its index. */
> > > +/*
> > > + * add a new ops struct in rte_mempool_ops_table.
> > > + * on success, return the index of the last ops
> > > + * struct in the table.
> > > + */
> > > int
> > > rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > > {
> > > > > Also I remember patches which warn about above behaviour
> > > > > in documentation. If behaviour changes, corresponding
> > > > > documentation must be updated.
> > > >
> > > > One more point. If the patch is finally accepted it definitely
> > > > deserves few lines in release notes.
> > > OK, a separate patch should be sent before DPDK 20.05 release ?
> > > >
> > >
> > >
> > > --
> > > Thanks,
> > > Tonghao
>
>
>
> --
> Thanks,
> Tonghao
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-09 8:55 ` Tonghao Zhang
2020-03-09 9:05 ` Olivier Matz
@ 2020-03-09 13:15 ` David Marchand
2020-03-16 7:43 ` Tonghao Zhang
1 sibling, 1 reply; 44+ messages in thread
From: David Marchand @ 2020-03-09 13:15 UTC (permalink / raw)
To: Tonghao Zhang
Cc: Olivier Matz, Andrew Rybchenko, Jerin Jacob, dpdk-dev, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal, Burakov, Anatoly, Bruce Richardson
On Mon, Mar 9, 2020 at 9:56 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> On Mon, Mar 9, 2020 at 4:27 PM Olivier Matz <olivier.matz@6wind.com> wrote:
> > The fact that the ops index changes during mempool driver lifetime is
> > indeed frightening, especially knowing that this is a dynamic
> > registration that could happen at any moment in the life of the
> > application. Also, breaking the ABI is not desirable.
> That solution is better.
>
> > Let me try to propose something else to solve your issue:
> >
> > 1/ At init, the primary process allocates a struct in shared memory
> > (named memzone):
> >
> > struct rte_mempool_shared_ops {
> > size_t num_mempool_ops;
> > struct {
> > char name[RTE_MEMPOOL_OPS_NAMESIZE];
> > } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> > char *mempool_ops_name[RTE_MEMPOOL_MAX_OPS_IDX];
> > rte_spinlock_t mempool;
> > }
> >
> > 2/ When we register a mempool ops, we first get a name and id from the
> > shared struct: with the lock held, lookup for the registered name and
> > return its index, else get the last id and copy the name in the struct.
> >
> > 3/ Then do as before (in the per-process global table), except that we
> > reuse the registered id.
> >
> > We can remove the num_ops field from rte_mempool_ops_table.
> >
> > Thoughts?
It seems better; adding Anatoly and Bruce, who know more about multi-process support.
Tonghao, could you add a unit test to exhibit the issue as part of this work?
Thanks.
--
David Marchand
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-09 13:15 ` David Marchand
@ 2020-03-16 7:43 ` Tonghao Zhang
2020-03-16 7:55 ` Olivier Matz
0 siblings, 1 reply; 44+ messages in thread
From: Tonghao Zhang @ 2020-03-16 7:43 UTC (permalink / raw)
To: David Marchand
Cc: Olivier Matz, Andrew Rybchenko, Jerin Jacob, dpdk-dev, Gage Eads,
Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru,
Hemant Agrawal, Burakov, Anatoly, Bruce Richardson
On Mon, Mar 9, 2020 at 9:15 PM David Marchand <david.marchand@redhat.com> wrote:
>
> On Mon, Mar 9, 2020 at 9:56 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > On Mon, Mar 9, 2020 at 4:27 PM Olivier Matz <olivier.matz@6wind.com> wrote:
> > > The fact that the ops index changes during mempool driver lifetime is
> > > indeed frightening, especially knowing that this is a dynamic
> > > registration that could happen at any moment in the life of the
> > > application. Also, breaking the ABI is not desirable.
> > That solution is better.
> >
> > > Let me try to propose something else to solve your issue:
> > >
> > > 1/ At init, the primary process allocates a struct in shared memory
> > > (named memzone):
> > >
> > > struct rte_mempool_shared_ops {
> > > size_t num_mempool_ops;
> > > struct {
> > > char name[RTE_MEMPOOL_OPS_NAMESIZE];
> > > } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> > > char *mempool_ops_name[RTE_MEMPOOL_MAX_OPS_IDX];
> > > rte_spinlock_t mempool;
> > > }
> > >
> > > 2/ When we register a mempool ops, we first get a name and id from the
> > > shared struct: with the lock held, lookup for the registered name and
> > > return its index, else get the last id and copy the name in the struct.
> > >
> > > 3/ Then do as before (in the per-process global table), except that we
> > > reuse the registered id.
> > >
> > > We can remove the num_ops field from rte_mempool_ops_table.
> > >
> > > Thoughts?
>
> It seems better, just adding Anatoly and Bruce who know more about multiprocess.
>
> Tonghao, could you add a unit test to exhibit the issue as part of this work?
>
> Thanks.
Hi Olivier,
Any progress? Should we apply this patch, or wait for your patch?
>
> --
> David Marchand
>
--
Thanks,
Tonghao
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-16 7:43 ` Tonghao Zhang
@ 2020-03-16 7:55 ` Olivier Matz
0 siblings, 0 replies; 44+ messages in thread
From: Olivier Matz @ 2020-03-16 7:55 UTC (permalink / raw)
To: Tonghao Zhang
Cc: David Marchand, Andrew Rybchenko, Jerin Jacob, dpdk-dev,
Gage Eads, Artem V. Andreev, Jerin Jacob, Nithin Dabilpuram,
Vamsi Attunuru, Hemant Agrawal, Burakov, Anatoly,
Bruce Richardson
Hi Tonghao,
On Mon, Mar 16, 2020 at 03:43:40PM +0800, Tonghao Zhang wrote:
> On Mon, Mar 9, 2020 at 9:15 PM David Marchand <david.marchand@redhat.com> wrote:
> >
> > On Mon, Mar 9, 2020 at 9:56 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > On Mon, Mar 9, 2020 at 4:27 PM Olivier Matz <olivier.matz@6wind.com> wrote:
> > > > The fact that the ops index changes during mempool driver lifetime is
> > > > indeed frightening, especially knowing that this is a dynamic
> > > > registration that could happen at any moment in the life of the
> > > > application. Also, breaking the ABI is not desirable.
> > > That solution is better.
> > >
> > > > Let me try to propose something else to solve your issue:
> > > >
> > > > 1/ At init, the primary process allocates a struct in shared memory
> > > > (named memzone):
> > > >
> > > > struct rte_mempool_shared_ops {
> > > > size_t num_mempool_ops;
> > > > struct {
> > > > char name[RTE_MEMPOOL_OPS_NAMESIZE];
> > > > } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> > > > char *mempool_ops_name[RTE_MEMPOOL_MAX_OPS_IDX];
> > > > rte_spinlock_t mempool;
> > > > }
> > > >
> > > > 2/ When we register a mempool ops, we first get a name and id from the
> > > > shared struct: with the lock held, lookup for the registered name and
> > > > return its index, else get the last id and copy the name in the struct.
> > > >
> > > > 3/ Then do as before (in the per-process global table), except that we
> > > > reuse the registered id.
> > > >
> > > > We can remove the num_ops field from rte_mempool_ops_table.
> > > >
> > > > Thoughts?
> >
> > It seems better, just adding Anatoly and Bruce who know more about multiprocess.
> >
> > Tonghao, could you add a unit test to exhibit the issue as part of this work?
> >
> > Thanks.
> Hi Olivier
> any progress?will apply this patch or wait your patch?
Sorry if it wasn't clear, but my proposition was just an idea; I have
not had time to work on it yet. Feel free to implement it yourself,
and I'll help review the patches.
Thanks,
Olivier
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-09 8:27 ` Olivier Matz
2020-03-09 8:55 ` Tonghao Zhang
@ 2020-03-24 9:35 ` Andrew Rybchenko
2020-03-24 12:41 ` Tonghao Zhang
1 sibling, 1 reply; 44+ messages in thread
From: Andrew Rybchenko @ 2020-03-24 9:35 UTC (permalink / raw)
To: Olivier Matz, Tonghao Zhang
Cc: Jerin Jacob, dpdk-dev, Gage Eads, Artem V. Andreev, Jerin Jacob,
Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal
On 3/9/20 11:27 AM, Olivier Matz wrote:
> Hi,
>
> On Mon, Mar 09, 2020 at 11:01:25AM +0800, Tonghao Zhang wrote:
>> On Sat, Mar 7, 2020 at 8:54 PM Andrew Rybchenko
>> <arybchenko@solarflare.com> wrote:
>>>
>>> On 3/7/20 3:51 PM, Andrew Rybchenko wrote:
>>>> On 3/6/20 4:37 PM, Jerin Jacob wrote:
>>>>> On Fri, Mar 6, 2020 at 7:06 PM <xiangxia.m.yue@gmail.com> wrote:
>>>>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>>>>
>>>>>> The order of mempool initiation affects mempool index in the
>>>>>> rte_mempool_ops_table. For example, when building APPs with:
>>>>>>
>>>>>> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>>>>>>
>>>>>> The "bucket" mempool will be registered firstly, and its index
>>>>>> in table is 0 while the index of "ring" mempool is 1. DPDK
>>>>>> uses the mk/rte.app.mk to build APPs, and others, for example,
>>>>>> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
>>>>>> The mempool lib linked in dpdk and Open vSwitch is different.
>>>>>>
>>>>>> The mempool can be used between primary and secondary process,
>>>>>> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
>>>>>> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
>>>>>> ring which index in table is 0, but the index of "bucket" ring
>>>>>> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
>>>>>> mempool ops and malloc memory from mempool. The crash will occur:
>>>>>>
>>>>>> bucket_dequeue (access null and crash)
>>>>>> rte_mempool_get_ops (should get "ring_mp_mc",
>>>>>> but get "bucket" mempool)
>>>>>> rte_mempool_ops_dequeue_bulk
>>>>>> ...
>>>>>> rte_pktmbuf_alloc
>>>>>> rte_pktmbuf_copy
>>>>>> pdump_copy
>>>>>> pdump_rx
>>>>>> rte_eth_rx_burst
>>>>>>
>>>>>> To avoid the crash, there are some solution:
>>>>>> * constructor priority: Different mempool uses different
>>>>>> priority in RTE_INIT, but it's not easy to maintain.
>>>>>>
>>>>>> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
>>>>>> be same as libdpdk.a/libdpdk.so, but when adding a new mempool
>>>>>> driver in future, we must make sure the order.
>>>>>>
>>>>>> * register mempool orderly: Sort the mempool when registering,
>>>>>> so the lib linked will not affect the index in mempool table.
>>>>>>
>>>>>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>>>> Acked-by: Olivier Matz <olivier.matz@6wind.com>
>>>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
>>>>
>>>> The patch is OK, but the fact that ops index changes during
>>>> mempool driver lifetime is frightening. In fact it breaks
>>>> rte_mempool_register_ops() return value semantics (read
>>>> as API break). The return value is not used in DPDK, but it
>>>> is a public function. If I'm not mistaken it should be taken
>>>> into account.
>
> Good points.
>
> The fact that the ops index changes during mempool driver lifetime is
> > indeed frightening, especially knowing that this is a dynamic
> registration that could happen at any moment in the life of the
> application. Also, breaking the ABI is not desirable.
>
> Let me try to propose something else to solve your issue:
>
> 1/ At init, the primary process allocates a struct in shared memory
> (named memzone):
>
> struct rte_mempool_shared_ops {
> size_t num_mempool_ops;
> struct {
> char name[RTE_MEMPOOL_OPS_NAMESIZE];
> } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> char *mempool_ops_name[RTE_MEMPOOL_MAX_OPS_IDX];
> rte_spinlock_t mempool;
> }
>
> 2/ When we register a mempool ops, we first get a name and id from the
> shared struct: with the lock held, lookup for the registered name and
> return its index, else get the last id and copy the name in the struct.
>
> 3/ Then do as before (in the per-process global table), except that we
> reuse the registered id.
>
> We can remove the num_ops field from rte_mempool_ops_table.
>
> Thoughts?
I like the solution.
* Re: [dpdk-dev] [PATCH dpdk-dev v3] mempool: sort the rte_mempool_ops by name
2020-03-24 9:35 ` Andrew Rybchenko
@ 2020-03-24 12:41 ` Tonghao Zhang
0 siblings, 0 replies; 44+ messages in thread
From: Tonghao Zhang @ 2020-03-24 12:41 UTC (permalink / raw)
To: Andrew Rybchenko
Cc: Olivier Matz, Jerin Jacob, dpdk-dev, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal
On Tue, Mar 24, 2020 at 5:36 PM Andrew Rybchenko
<arybchenko@solarflare.com> wrote:
>
> On 3/9/20 11:27 AM, Olivier Matz wrote:
> > Hi,
> >
> > On Mon, Mar 09, 2020 at 11:01:25AM +0800, Tonghao Zhang wrote:
> >> On Sat, Mar 7, 2020 at 8:54 PM Andrew Rybchenko
> >> <arybchenko@solarflare.com> wrote:
> >>>
> >>> On 3/7/20 3:51 PM, Andrew Rybchenko wrote:
> >>>> On 3/6/20 4:37 PM, Jerin Jacob wrote:
> >>>>> On Fri, Mar 6, 2020 at 7:06 PM <xiangxia.m.yue@gmail.com> wrote:
> >>>>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >>>>>>
> >>>>>> The order of mempool initiation affects mempool index in the
> >>>>>> rte_mempool_ops_table. For example, when building APPs with:
> >>>>>>
> >>>>>> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> >>>>>>
> >>>>>> The "bucket" mempool will be registered firstly, and its index
> >>>>>> in table is 0 while the index of "ring" mempool is 1. DPDK
> >>>>>> uses the mk/rte.app.mk to build APPs, and others, for example,
> >>>>>> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> >>>>>> The mempool lib linked in dpdk and Open vSwitch is different.
> >>>>>>
> >>>>>> The mempool can be used between primary and secondary process,
> >>>>>> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> >>>>>> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> >>>>>> ring which index in table is 0, but the index of "bucket" ring
> >>>>>> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> >>>>>> mempool ops and malloc memory from mempool. The crash will occur:
> >>>>>>
> >>>>>> bucket_dequeue (access null and crash)
> >>>>>> rte_mempool_get_ops (should get "ring_mp_mc",
> >>>>>> but get "bucket" mempool)
> >>>>>> rte_mempool_ops_dequeue_bulk
> >>>>>> ...
> >>>>>> rte_pktmbuf_alloc
> >>>>>> rte_pktmbuf_copy
> >>>>>> pdump_copy
> >>>>>> pdump_rx
> >>>>>> rte_eth_rx_burst
> >>>>>>
> >>>>>> To avoid the crash, there are some solution:
> >>>>>> * constructor priority: Different mempool uses different
> >>>>>> priority in RTE_INIT, but it's not easy to maintain.
> >>>>>>
> >>>>>> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> >>>>>> be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> >>>>>> driver in future, we must make sure the order.
> >>>>>>
> >>>>>> * register mempool orderly: Sort the mempool when registering,
> >>>>>> so the lib linked will not affect the index in mempool table.
> >>>>>>
> >>>>>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >>>>>> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> >>>>> Acked-by: Jerin Jacob <jerinj@marvell.com>
> >>>>
> >>>> The patch is OK, but the fact that ops index changes during
> >>>> mempool driver lifetime is frightening. In fact it breaks
> >>>> rte_mempool_register_ops() return value semantics (read
> >>>> as API break). The return value is not used in DPDK, but it
> >>>> is a public function. If I'm not mistaken it should be taken
> >>>> into account.
> >
> > Good points.
> >
> > The fact that the ops index changes during mempool driver lifetime is
> > indeed frightening, especially knowing that this is a dynamic
> > registration that could happen at any moment in the life of the
> > application. Also, breaking the ABI is not desirable.
> >
> > Let me try to propose something else to solve your issue:
> >
> > 1/ At init, the primary process allocates a struct in shared memory
> > (named memzone):
> >
> > struct rte_mempool_shared_ops {
> > size_t num_mempool_ops;
> > struct {
> > char name[RTE_MEMPOOL_OPS_NAMESIZE];
> > } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> > char *mempool_ops_name[RTE_MEMPOOL_MAX_OPS_IDX];
> > rte_spinlock_t mempool;
> > }
> >
> > 2/ When we register a mempool ops, we first get a name and id from the
> > shared struct: with the lock held, lookup for the registered name and
> > return its index, else get the last id and copy the name in the struct.
> >
> > 3/ Then do as before (in the per-process global table), except that we
> > reuse the registered id.
> >
> > We can remove the num_ops field from rte_mempool_ops_table.
> >
> > Thoughts?
>
> I like the solution.
The patch will be sent, thanks.
--
Best regards, Tonghao
* [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization
2020-03-02 1:57 [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name xiangxia.m.yue
` (2 preceding siblings ...)
2020-03-06 13:36 ` [dpdk-dev] [PATCH dpdk-dev v3] " xiangxia.m.yue
@ 2020-04-09 10:52 ` xiangxia.m.yue
2020-04-09 10:53 ` [dpdk-dev] [PATCH dpdk-dev 2/2] mempool: use shared memzone for rte_mempool_ops xiangxia.m.yue
2020-04-09 11:31 ` [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization Jerin Jacob
2020-04-09 15:02 ` [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init " xiangxia.m.yue
2020-04-13 14:21 ` [dpdk-dev] [PATCH dpdk-dev v3 " xiangxia.m.yue
5 siblings, 2 replies; 44+ messages in thread
From: xiangxia.m.yue @ 2020-04-09 10:52 UTC (permalink / raw)
To: olivier.matz, arybchenko, gage.eads, artem.andreev, jerinj,
ndabilpuram, vattunuru, hemant.agrawal, david.marchand,
anatoly.burakov, bruce.richardson
Cc: dev, Tonghao Zhang
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
This patch introduces a last-init queue with which users can register
a callback for their initialization. When rte_last_init_run() is
executed, most DPDK resources, such as memzones and rings, are
already available. This way, users don't need to add extra code in
the eal layer.
[This patch will be used by the next patch.]
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
lib/librte_eal/common/eal_common_last_init.c | 43 ++++++++++++++++++++++++++++
lib/librte_eal/common/meson.build | 1 +
lib/librte_eal/freebsd/Makefile | 1 +
lib/librte_eal/freebsd/eal.c | 7 +++++
lib/librte_eal/include/rte_last_init.h | 36 +++++++++++++++++++++++
lib/librte_eal/linux/Makefile | 1 +
lib/librte_eal/linux/eal.c | 8 ++++++
7 files changed, 97 insertions(+)
create mode 100644 lib/librte_eal/common/eal_common_last_init.c
create mode 100644 lib/librte_eal/include/rte_last_init.h
diff --git a/lib/librte_eal/common/eal_common_last_init.c b/lib/librte_eal/common/eal_common_last_init.c
new file mode 100644
index 0000000..4f168e9
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_last_init.c
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 DPDK Community
+ */
+
+#include <sys/queue.h>
+
+#include <rte_last_init.h>
+#include <rte_debug.h>
+
+static struct rte_last_init_list rte_last_init_list =
+ TAILQ_HEAD_INITIALIZER(rte_last_init_list);
+
+void
+rte_last_init_register(rte_last_init_cb cb, const void *arg)
+{
+ struct rte_last_init *last;
+
+ RTE_VERIFY(cb);
+
+ last = malloc(sizeof(*last));
+ if (last == NULL)
+ rte_panic("Alloc memory for rte_last_init node failed\n");
+
+ last->cb = cb;
+ last->arg = arg;
+
+ TAILQ_INSERT_TAIL(&rte_last_init_list, last, next);
+}
+
+int
+rte_last_init_run(void)
+{
+ struct rte_last_init *init;
+ int ret;
+
+ TAILQ_FOREACH(init, &rte_last_init_list, next) {
+ ret = init->cb(init->arg);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 02d9280..8107b24 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -43,6 +43,7 @@ sources += files(
'eal_common_thread.c',
'eal_common_timer.c',
'eal_common_uuid.c',
+ 'eal_common_last_init.c',
'hotplug_mp.c',
'malloc_elem.c',
'malloc_heap.c',
diff --git a/lib/librte_eal/freebsd/Makefile b/lib/librte_eal/freebsd/Makefile
index e5d4d8f..63cb0b9 100644
--- a/lib/librte_eal/freebsd/Makefile
+++ b/lib/librte_eal/freebsd/Makefile
@@ -60,6 +60,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_thread.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_proc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_fbarray.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_uuid.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_last_init.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += rte_malloc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += hotplug_mp.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += malloc_elem.c
diff --git a/lib/librte_eal/freebsd/eal.c b/lib/librte_eal/freebsd/eal.c
index 6ae37e7..b51615f 100644
--- a/lib/librte_eal/freebsd/eal.c
+++ b/lib/librte_eal/freebsd/eal.c
@@ -44,6 +44,7 @@
#include <rte_option.h>
#include <rte_atomic.h>
#include <malloc_heap.h>
+#include <rte_last_init.h>
#include "eal_private.h"
#include "eal_thread.h"
@@ -874,6 +875,12 @@ static void rte_eal_init_alert(const char *msg)
eal_check_mem_on_local_socket();
+ if (rte_last_init_run() < 0) {
+ rte_eal_init_alert("Cannot init objects in last-init queue");
+ rte_errno = EFAULT;
+ return -1;
+ }
+
eal_thread_init_master(rte_config.master_lcore);
ret = eal_thread_dump_affinity(cpuset, sizeof(cpuset));
diff --git a/lib/librte_eal/include/rte_last_init.h b/lib/librte_eal/include/rte_last_init.h
new file mode 100644
index 0000000..8c91a97
--- /dev/null
+++ b/lib/librte_eal/include/rte_last_init.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 DPDK Community
+ */
+
+#ifndef _RTE_LAST_INIT_H_
+#define _RTE_LAST_INIT_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdio.h>
+#include <sys/queue.h>
+
+typedef int (*rte_last_init_cb)(const void *arg);
+
+/**
+ * A structure describing a generic initialization.
+ */
+struct rte_last_init {
+ TAILQ_ENTRY(rte_last_init) next; /**< Next init object in linked list */
+ const void *arg;
+ rte_last_init_cb cb;
+};
+
+/** Doubly linked list of rte_last_init objects */
+TAILQ_HEAD(rte_last_init_list, rte_last_init);
+
+void rte_last_init_register(rte_last_init_cb cb, const void *arg);
+int rte_last_init_run(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LAST_INIT_H_ */
diff --git a/lib/librte_eal/linux/Makefile b/lib/librte_eal/linux/Makefile
index e5f4495..3a476a7 100644
--- a/lib/librte_eal/linux/Makefile
+++ b/lib/librte_eal/linux/Makefile
@@ -67,6 +67,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_thread.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_proc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_fbarray.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_uuid.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_last_init.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += rte_malloc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += hotplug_mp.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += malloc_elem.c
diff --git a/lib/librte_eal/linux/eal.c b/lib/librte_eal/linux/eal.c
index 9530ee5..61fc540 100644
--- a/lib/librte_eal/linux/eal.c
+++ b/lib/librte_eal/linux/eal.c
@@ -51,6 +51,7 @@
#include <malloc_heap.h>
#include <rte_vfio.h>
#include <rte_option.h>
+#include <rte_last_init.h>
#include "eal_private.h"
#include "eal_thread.h"
@@ -1203,6 +1204,13 @@ static void rte_eal_init_alert(const char *msg)
eal_check_mem_on_local_socket();
+
+ if (rte_last_init_run() < 0) {
+ rte_eal_init_alert("Cannot init objects in last-init queue");
+ rte_errno = EFAULT;
+ return -1;
+ }
+
eal_thread_init_master(rte_config.master_lcore);
ret = eal_thread_dump_affinity(cpuset, sizeof(cpuset));
--
1.8.3.1
^ permalink raw reply [flat|nested] 44+ messages in thread
* [dpdk-dev] [PATCH dpdk-dev 2/2] mempool: use shared memzone for rte_mempool_ops
2020-04-09 10:52 ` [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization xiangxia.m.yue
@ 2020-04-09 10:53 ` xiangxia.m.yue
2020-04-09 11:31 ` [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization Jerin Jacob
1 sibling, 0 replies; 44+ messages in thread
From: xiangxia.m.yue @ 2020-04-09 10:53 UTC (permalink / raw)
To: olivier.matz, arybchenko, gage.eads, artem.andreev, jerinj,
ndabilpuram, vattunuru, hemant.agrawal, david.marchand,
anatoly.burakov, bruce.richardson
Cc: dev, Tonghao Zhang
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
The order of mempool initialization affects the mempool index in the
rte_mempool_ops_table. For example, when building applications with:
$ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
the "bucket" mempool will be registered first, so its index in the
table is 0, while the index of the "ring" mempool is 1. DPDK uses
mk/rte.app.mk to build applications, while others, for example
Open vSwitch, link against libdpdk.a or libdpdk.so. The mempool
libraries linked into a DPDK application and into Open vSwitch may
therefore be ordered differently.
A mempool can be shared between primary and secondary processes,
such as dpdk-pdump and a pdump-enabled PMD/Open vSwitch. A crash
occurs because dpdk-pdump creates the "ring_mp_mc" ring, whose index
in its table is 0, while index 0 in Open vSwitch refers to the
"bucket" ops. When Open vSwitch uses index 0 to get the mempool ops
and allocate memory from the mempool, the crash occurs:
bucket_dequeue (access null and crash)
rte_mempool_get_ops (should get "ring_mp_mc",
but get "bucket" mempool)
rte_mempool_ops_dequeue_bulk
...
rte_pktmbuf_alloc
rte_pktmbuf_copy
pdump_copy
pdump_rx
rte_eth_rx_burst
To avoid the crash, there are several possible solutions:
* constructor priority: different mempools use different
priorities in RTE_INIT, but this is not easy to maintain.
* change mk/rte.app.mk: change the order in mk/rte.app.mk to
be the same as in libdpdk.a/libdpdk.so, but whenever a new mempool
driver is added in the future, the order must be kept in sync.
* register mempools in order: sort the mempool ops by name when
registering, so the libraries linked do not affect the index in the
mempool table; but the number of mempool libraries linked may still
differ between processes.
* shared memzone: the primary process allocates a struct in a
shared memzone. When we register a mempool ops, we first get a
name and id from the shared struct: with the lock held, look up
the registered name and return its index; otherwise take the next
free id and copy the name into the struct.
previous discussion:
https://mails.dpdk.org/archives/dev/2020-March/159354.html
Suggested-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
lib/librte_mempool/rte_mempool.h | 22 ++++++++-
lib/librte_mempool/rte_mempool_ops.c | 89 ++++++++++++++++++++++++++++--------
2 files changed, 90 insertions(+), 21 deletions(-)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index c90cf31..fdd1057 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -50,6 +50,7 @@
#include <rte_ring.h>
#include <rte_memcpy.h>
#include <rte_common.h>
+#include <rte_last_init.h>
#ifdef __cplusplus
extern "C" {
@@ -678,7 +679,6 @@ struct rte_mempool_ops {
*/
struct rte_mempool_ops_table {
rte_spinlock_t sl; /**< Spinlock for add/delete. */
- uint32_t num_ops; /**< Number of used ops structs in the table. */
/**
* Storage for all possible ops structs.
*/
@@ -910,6 +910,23 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
*/
int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
+struct rte_mempool_shared_ops {
+ size_t num_mempool_ops;
+ struct {
+ char name[RTE_MEMPOOL_OPS_NAMESIZE];
+ } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
+
+ rte_spinlock_t mempool;
+};
+
+static inline int
+rte_mempool_register_ops_cb(const void *arg)
+{
+ const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
+
+ return rte_mempool_register_ops(h);
+}
+
/**
* Macro to statically register the ops of a mempool handler.
* Note that the rte_mempool_register_ops fails silently here when
@@ -918,7 +935,8 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
#define MEMPOOL_REGISTER_OPS(ops) \
RTE_INIT(mp_hdlr_init_##ops) \
{ \
- rte_mempool_register_ops(&ops); \
+ rte_last_init_register(rte_mempool_register_ops_cb, \
+ (const void *)&ops); \
}
/**
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
index 22c5251..48aa999 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -14,43 +14,95 @@
/* indirect jump table to support external memory pools. */
struct rte_mempool_ops_table rte_mempool_ops_table = {
.sl = RTE_SPINLOCK_INITIALIZER,
- .num_ops = 0
};
-/* add a new ops struct in rte_mempool_ops_table, return its index. */
-int
-rte_mempool_register_ops(const struct rte_mempool_ops *h)
+static int
+rte_mempool_register_shared_ops(const char *name)
{
- struct rte_mempool_ops *ops;
- int16_t ops_index;
+ static bool mempool_shared_ops_inited = false;
+ struct rte_mempool_shared_ops *shared_ops;
+ const struct rte_memzone* mz;
+ uint32_t ops_index = 0;
+
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
+ !mempool_shared_ops_inited) {
+
+ mz = rte_memzone_reserve("mempool_ops_shared",
+ sizeof(*shared_ops), 0, 0);
+ if (mz == NULL)
+ return -ENOMEM;
+
+ shared_ops = mz->addr;
+ shared_ops->num_mempool_ops = 0;
+ rte_spinlock_init(&shared_ops->mempool);
+
+ mempool_shared_ops_inited = true;
+ } else {
+ mz = rte_memzone_lookup("mempool_ops_shared");
+ if (mz == NULL)
+ return -ENOMEM;
+
+ shared_ops = mz->addr;
+ }
- rte_spinlock_lock(&rte_mempool_ops_table.sl);
+ rte_spinlock_lock(&shared_ops->mempool);
- if (rte_mempool_ops_table.num_ops >=
- RTE_MEMPOOL_MAX_OPS_IDX) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
+ if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
+ rte_spinlock_unlock(&shared_ops->mempool);
RTE_LOG(ERR, MEMPOOL,
"Maximum number of mempool ops structs exceeded\n");
return -ENOSPC;
}
+ while (shared_ops->mempool_ops[ops_index].name[0]) {
+ if(!strcmp(name, shared_ops->mempool_ops[ops_index].name)) {
+ rte_spinlock_unlock(&shared_ops->mempool);
+ return ops_index;
+ }
+
+ ops_index ++;
+ }
+
+ strlcpy(shared_ops->mempool_ops[ops_index].name, name,
+ sizeof(shared_ops->mempool_ops[0].name));
+
+ shared_ops->num_mempool_ops ++;
+
+ rte_spinlock_unlock(&shared_ops->mempool);
+ return ops_index;
+}
+
+/* add a new ops struct in rte_mempool_ops_table, return its index. */
+int
+rte_mempool_register_ops(const struct rte_mempool_ops *h)
+{
+ struct rte_mempool_ops *ops;
+ int16_t ops_index;
+
if (h->alloc == NULL || h->enqueue == NULL ||
- h->dequeue == NULL || h->get_count == NULL) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
+ h->dequeue == NULL || h->get_count == NULL) {
RTE_LOG(ERR, MEMPOOL,
"Missing callback while registering mempool ops\n");
+ rte_errno = EINVAL;
return -EINVAL;
}
if (strlen(h->name) >= sizeof(ops->name) - 1) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
- RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
- __func__, h->name);
+ RTE_LOG(ERR, MEMPOOL,
+ "The registering mempool_ops <%s>: name too long\n",
+ h->name);
rte_errno = EEXIST;
return -EEXIST;
}
- ops_index = rte_mempool_ops_table.num_ops++;
+ ops_index = rte_mempool_register_shared_ops(h->name);
+ if (ops_index < 0) {
+ rte_errno = -ops_index;
+ return ops_index;
+ }
+
+ rte_spinlock_lock(&rte_mempool_ops_table.sl);
+
ops = &rte_mempool_ops_table.ops[ops_index];
strlcpy(ops->name, h->name, sizeof(ops->name));
ops->alloc = h->alloc;
@@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
if (mp->flags & MEMPOOL_F_POOL_CREATED)
return -EEXIST;
- for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
- if (!strcmp(name,
- rte_mempool_ops_table.ops[i].name)) {
+ for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
+ if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {
ops = &rte_mempool_ops_table.ops[i];
break;
}
--
1.8.3.1
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization
2020-04-09 10:52 ` [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization xiangxia.m.yue
2020-04-09 10:53 ` [dpdk-dev] [PATCH dpdk-dev 2/2] mempool: use shared memzone for rte_mempool_ops xiangxia.m.yue
@ 2020-04-09 11:31 ` Jerin Jacob
2020-04-09 15:04 ` Tonghao Zhang
1 sibling, 1 reply; 44+ messages in thread
From: Jerin Jacob @ 2020-04-09 11:31 UTC (permalink / raw)
To: Tonghao Zhang
Cc: Olivier Matz, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Anatoly Burakov, Richardson, Bruce, dpdk-dev
On Thu, Apr 9, 2020 at 4:24 PM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> This patch introduces a last-init queue on which users can register a
> callback for their own initialization. When rte_last_init_run() is
> invoked, most DPDK resources, such as memzones and rings, are already
> available. This way, users do not need to add extra code to the EAL layer.
>
> [This patch will be used by the next patch.]
>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> ---
> lib/librte_eal/common/eal_common_last_init.c | 43 ++++++++++++++++++++++++++++
> lib/librte_eal/common/meson.build | 1 +
> lib/librte_eal/freebsd/Makefile | 1 +
> lib/librte_eal/freebsd/eal.c | 7 +++++
> lib/librte_eal/include/rte_last_init.h | 36 +++++++++++++++++++++++
Update doc/api/doxy-api-index.md and hook this into the documentation.
> +void
> +rte_last_init_register(rte_last_init_cb cb, const void *arg)
> +{
> + struct rte_last_init *last;
> +
> + RTE_VERIFY(cb);
> +
> + last = malloc(sizeof(*last));
Does not look like this memory is freed anywhere.
> + if (last == NULL)
> + rte_panic("Alloc memory for rte_last_init node failed\n");
> +
> + last->cb = cb;
> + last->arg = arg;
> +
> + TAILQ_INSERT_TAIL(&rte_last_init_list, last, next);
> +}
> +
> +int
> +rte_last_init_run(void)
> +{
> + struct rte_last_init *init;
> + int ret;
> +
> + TAILQ_FOREACH(init, &rte_last_init_list, next) {
> + ret = init->cb(init->arg);
> + if (ret)
> + return ret;
> + }
> +
> + return 0;
> +}
> diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
> +typedef int (*rte_last_init_cb)(const void *arg);
> +
> +/**
> + * A structure describing a generic initialization.
> + */
> +struct rte_last_init {
> > + TAILQ_ENTRY(rte_last_init) next; /**< Next init object in linked list */
> + const void *arg;
> + rte_last_init_cb cb;
> +};
No need to expose this structure. Move to eal_private.h
> +
> > +/** Doubly linked list of rte_last_init objects */
> +TAILQ_HEAD(rte_last_init_list, rte_last_init);
No need to expose this structure. Move to eal_private.h
> +
> +void rte_last_init_register(rte_last_init_cb cb, const void *arg);
Add Doxygen comment.
Should we change this to rte_init_register()? Add an enum to specify where
to call it, e.g. RTE_INIT_PRE and RTE_INIT_POST, to take care of future needs.
Just thought of bringing this point up to make sure we can use this
scheme for another use case IF ANY.
> +int rte_last_init_run(void);
No need to expose this function in the public API. Move it to eal_private.h.
Please start internal functions with eal_ (don't use rte_ for
internal functions).
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization
2020-04-09 11:31 ` [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization Jerin Jacob
@ 2020-04-09 15:04 ` Tonghao Zhang
0 siblings, 0 replies; 44+ messages in thread
From: Tonghao Zhang @ 2020-04-09 15:04 UTC (permalink / raw)
To: Jerin Jacob
Cc: Olivier Matz, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Anatoly Burakov, Richardson, Bruce, dpdk-dev
On Thu, Apr 9, 2020 at 7:31 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> On Thu, Apr 9, 2020 at 4:24 PM <xiangxia.m.yue@gmail.com> wrote:
> >
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > This patch introduces a last-init queue on which users can register a
> > callback for their own initialization. When rte_last_init_run() is
> > invoked, most DPDK resources, such as memzones and rings, are already
> > available. This way, users do not need to add extra code to the EAL layer.
> >
> > [This patch will be used by the next patch.]
> >
> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > ---
> > lib/librte_eal/common/eal_common_last_init.c | 43 ++++++++++++++++++++++++++++
> > lib/librte_eal/common/meson.build | 1 +
> > lib/librte_eal/freebsd/Makefile | 1 +
> > lib/librte_eal/freebsd/eal.c | 7 +++++
> > lib/librte_eal/include/rte_last_init.h | 36 +++++++++++++++++++++++
>
> Update doc/api/doxy-api-index.md and hook this into the documentation.
>
> > +void
> > +rte_last_init_register(rte_last_init_cb cb, const void *arg)
> > +{
> > + struct rte_last_init *last;
> > +
> > + RTE_VERIFY(cb);
> > +
> > + last = malloc(sizeof(*last));
>
> Does not look like this memory is freed anywhere.
>
> > + if (last == NULL)
> > + rte_panic("Alloc memory for rte_last_init node failed\n");
> > +
> > + last->cb = cb;
> > + last->arg = arg;
> > +
> > + TAILQ_INSERT_TAIL(&rte_last_init_list, last, next);
> > +}
> > +
> > +int
> > +rte_last_init_run(void)
> > +{
> > + struct rte_last_init *init;
> > + int ret;
> > +
> > + TAILQ_FOREACH(init, &rte_last_init_list, next) {
> > + ret = init->cb(init->arg);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + return 0;
> > +}
> > diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
>
> > +typedef int (*rte_last_init_cb)(const void *arg);
> > +
> > +/**
> > + * A structure describing a generic initialization.
> > + */
> > +struct rte_last_init {
> > + TAILQ_ENTRY(rte_last_init) next; /**< Next init object in linked list */
> > + const void *arg;
> > + rte_last_init_cb cb;
> > +};
>
> No need to expose this structure. Move to eal_private.h
>
>
> > +
> > +/** Double linked list of buses */
> > +TAILQ_HEAD(rte_last_init_list, rte_last_init);
>
> No need to expose this structure. Move to eal_private.h
>
> > +
> > +void rte_last_init_register(rte_last_init_cb cb, const void *arg);
>
> Add Doxygen comment.
>
> Should we change this to rte_init_register()? Add an enum to specify where
> to call it, e.g. RTE_INIT_PRE and RTE_INIT_POST, to take care of future needs.
> Just thought of bringing this point up to make sure we can use this
> scheme for another use case IF ANY.
>
>
> > +int rte_last_init_run(void);
>
> No need to expose this function in the public API. Move it to eal_private.h.
> Please start internal functions with eal_ (don't use rte_ for
> internal functions).
Thanks for your review, v2 is sent. thanks.
--
Best regards, Tonghao
^ permalink raw reply [flat|nested] 44+ messages in thread
* [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init queue for libraries initialization
2020-03-02 1:57 [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name xiangxia.m.yue
` (3 preceding siblings ...)
2020-04-09 10:52 ` [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization xiangxia.m.yue
@ 2020-04-09 15:02 ` xiangxia.m.yue
2020-04-09 15:02 ` [dpdk-dev] [PATCH dpdk-dev v2 2/2] mempool: use shared memzone for rte_mempool_ops xiangxia.m.yue
2020-04-10 6:18 ` [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init queue for libraries initialization Jerin Jacob
2020-04-13 14:21 ` [dpdk-dev] [PATCH dpdk-dev v3 " xiangxia.m.yue
5 siblings, 2 replies; 44+ messages in thread
From: xiangxia.m.yue @ 2020-04-09 15:02 UTC (permalink / raw)
To: olivier.matz, arybchenko, gage.eads, artem.andreev, jerinj,
ndabilpuram, vattunuru, hemant.agrawal, david.marchand,
anatoly.burakov, bruce.richardson
Cc: dev, Tonghao Zhang
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
This patch introduces a last-init queue on which users can register a
callback for their own initialization. When rte_last_init_run() is
invoked, most DPDK resources, such as memzones and rings, are already
available. This way, users do not need to add extra code to the EAL layer.
[This patch will be used by the next patch.]
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
v2:
* rename rte_last_init_register ->rte_init_register
* rename rte_last_init struct ->rte_init
* rename rte_init_cb ->rte_init_cb_t
* free the rte_init node when not used.
* remove rte_init and others to eal_private.h
* add comments
* fix checkpatch warning
---
lib/librte_eal/common/eal_common_init.c | 74 +++++++++++++++++++++++++++++++++
lib/librte_eal/common/eal_private.h | 23 ++++++++++
lib/librte_eal/common/meson.build | 1 +
lib/librte_eal/freebsd/Makefile | 1 +
lib/librte_eal/freebsd/eal.c | 6 +++
lib/librte_eal/include/rte_init.h | 59 ++++++++++++++++++++++++++
lib/librte_eal/linux/Makefile | 1 +
lib/librte_eal/linux/eal.c | 6 +++
8 files changed, 171 insertions(+)
create mode 100644 lib/librte_eal/common/eal_common_init.c
create mode 100644 lib/librte_eal/include/rte_init.h
diff --git a/lib/librte_eal/common/eal_common_init.c b/lib/librte_eal/common/eal_common_init.c
new file mode 100644
index 0000000..1d89db3
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_init.c
@@ -0,0 +1,74 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 DPDK Community
+ */
+
+#include <sys/queue.h>
+
+#include <rte_init.h>
+#include <rte_debug.h>
+#include <rte_tailq.h>
+
+#include "eal_private.h"
+
+static struct rte_init_list rte_init_list =
+ TAILQ_HEAD_INITIALIZER(rte_init_list);
+
+void
+rte_init_register(rte_init_cb_t cb, const void *arg, enum rte_init_type type)
+{
+ struct rte_init *last;
+
+ RTE_VERIFY(cb);
+
+ last = malloc(sizeof(*last));
+ if (last == NULL)
+ rte_panic("Alloc memory for rte_init node failed\n");
+
+ last->type = type;
+ last->arg = arg;
+ last->cb = cb;
+
+ TAILQ_INSERT_TAIL(&rte_init_list, last, next);
+}
+
+static int
+eal_rte_init_run_type(enum rte_init_type type)
+{
+ struct rte_init *last;
+ int ret;
+
+ TAILQ_FOREACH(last, &rte_init_list, next) {
+ if (last->type != type)
+ continue;
+
+ ret = last->cb(last->arg);
+ if (ret < 0)
+ return ret;
+ }
+
+ return 0;
+}
+
+int
+eal_rte_init_run(void)
+{
+ struct rte_init *last;
+ struct rte_init *tmp;
+ int ret;
+
+ ret = eal_rte_init_run_type(RTE_INIT_PRE);
+ if (ret < 0)
+ return ret;
+
+ ret = eal_rte_init_run_type(RTE_INIT_POST);
+ if (ret < 0)
+ return ret;
+
+ /* Free rte_init node, not used anymore. */
+ TAILQ_FOREACH_SAFE(last, &rte_init_list, next, tmp) {
+ TAILQ_REMOVE(&rte_init_list, last, next);
+ free(last);
+ }
+
+ return 0;
+}
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index ddcfbe2..f0bcc97 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -11,6 +11,7 @@
#include <rte_dev.h>
#include <rte_lcore.h>
+#include <rte_init.h>
/**
* Structure storing internal configuration (per-lcore)
@@ -60,6 +61,28 @@ struct rte_config {
} __attribute__((__packed__));
/**
+ * A structure describing a generic initialization.
+ */
+struct rte_init {
+ TAILQ_ENTRY(rte_init) next;
+ enum rte_init_type type;
+ rte_init_cb_t cb;
+ const void *arg;
+};
+
+/** Doubly linked list of rte_init. */
+TAILQ_HEAD(rte_init_list, rte_init);
+
+/**
+ * Run the callback registered in the global double linked list.
+ *
+ * @return
+ * - 0 on success
+ * - Negative on error
+ */
+int eal_rte_init_run(void);
+
+/**
* Get the global configuration structure.
*
* @return
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 02d9280..ad0ce6a 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -43,6 +43,7 @@ sources += files(
'eal_common_thread.c',
'eal_common_timer.c',
'eal_common_uuid.c',
+ 'eal_common_init.c',
'hotplug_mp.c',
'malloc_elem.c',
'malloc_heap.c',
diff --git a/lib/librte_eal/freebsd/Makefile b/lib/librte_eal/freebsd/Makefile
index e5d4d8f..89c5649 100644
--- a/lib/librte_eal/freebsd/Makefile
+++ b/lib/librte_eal/freebsd/Makefile
@@ -60,6 +60,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_thread.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_proc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_fbarray.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_uuid.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_init.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += rte_malloc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += hotplug_mp.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += malloc_elem.c
diff --git a/lib/librte_eal/freebsd/eal.c b/lib/librte_eal/freebsd/eal.c
index 6ae37e7..63af98c 100644
--- a/lib/librte_eal/freebsd/eal.c
+++ b/lib/librte_eal/freebsd/eal.c
@@ -874,6 +874,12 @@ static void rte_eal_init_alert(const char *msg)
eal_check_mem_on_local_socket();
+ if (eal_rte_init_run() < 0) {
+ rte_eal_init_alert("Cannot init objects in rte-init queue");
+ rte_errno = EFAULT;
+ return -1;
+ }
+
eal_thread_init_master(rte_config.master_lcore);
ret = eal_thread_dump_affinity(cpuset, sizeof(cpuset));
diff --git a/lib/librte_eal/include/rte_init.h b/lib/librte_eal/include/rte_init.h
new file mode 100644
index 0000000..636efff
--- /dev/null
+++ b/lib/librte_eal/include/rte_init.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 DPDK Community
+ */
+
+#ifndef _RTE_INIT_H_
+#define _RTE_INIT_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdio.h>
+#include <sys/queue.h>
+
+/**
+ * Implementation-specific callback function which is
+ * responsible for a specific initialization step.
+ *
+ * This is called when most resources are already available.
+ *
+ * @return
+ * 0 for successful callback
+ * Negative for unsuccessful callback with error value
+ */
+typedef int (*rte_init_cb_t)(const void *arg);
+
+/**
+ * rte_init type.
+ *
+ * The rte_init callbacks of type RTE_INIT_PRE are called first,
+ * then those of type RTE_INIT_POST.
+ */
+enum rte_init_type {
+ RTE_INIT_PRE,
+ RTE_INIT_POST
+};
+
+/**
+ * Register a rte_init callback.
+ *
+ * @param cb
+ * The callback function to invoke, which will be called
+ * during rte_eal_init().
+ *
+ * @param arg
+ * The argument passed to the callback.
+ *
+ * @param type
+ * The type of rte_init registered.
+ */
+
+void rte_init_register(rte_init_cb_t cb, const void *arg,
+ enum rte_init_type type);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_INIT_H_ */
diff --git a/lib/librte_eal/linux/Makefile b/lib/librte_eal/linux/Makefile
index e5f4495..918d94b 100644
--- a/lib/librte_eal/linux/Makefile
+++ b/lib/librte_eal/linux/Makefile
@@ -67,6 +67,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_thread.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_proc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_fbarray.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_uuid.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_init.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += rte_malloc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += hotplug_mp.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += malloc_elem.c
diff --git a/lib/librte_eal/linux/eal.c b/lib/librte_eal/linux/eal.c
index 9530ee5..dd0c258 100644
--- a/lib/librte_eal/linux/eal.c
+++ b/lib/librte_eal/linux/eal.c
@@ -1203,6 +1203,12 @@ static void rte_eal_init_alert(const char *msg)
eal_check_mem_on_local_socket();
+ if (eal_rte_init_run() < 0) {
+ rte_eal_init_alert("Cannot init objects in rte-init queue");
+ rte_errno = EFAULT;
+ return -1;
+ }
+
eal_thread_init_master(rte_config.master_lcore);
ret = eal_thread_dump_affinity(cpuset, sizeof(cpuset));
--
1.8.3.1
^ permalink raw reply [flat|nested] 44+ messages in thread
* [dpdk-dev] [PATCH dpdk-dev v2 2/2] mempool: use shared memzone for rte_mempool_ops
2020-04-09 15:02 ` [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init " xiangxia.m.yue
@ 2020-04-09 15:02 ` xiangxia.m.yue
2020-04-10 6:18 ` [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init queue for libraries initialization Jerin Jacob
1 sibling, 0 replies; 44+ messages in thread
From: xiangxia.m.yue @ 2020-04-09 15:02 UTC (permalink / raw)
To: olivier.matz, arybchenko, gage.eads, artem.andreev, jerinj,
ndabilpuram, vattunuru, hemant.agrawal, david.marchand,
anatoly.burakov, bruce.richardson
Cc: dev, Tonghao Zhang
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
The order of mempool initialization affects the mempool index in the
rte_mempool_ops_table. For example, when building applications with:
$ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
the "bucket" mempool will be registered first, so its index in the
table is 0, while the index of the "ring" mempool is 1. DPDK uses
mk/rte.app.mk to build applications, while others, for example
Open vSwitch, link against libdpdk.a or libdpdk.so. The mempool
libraries linked into a DPDK application and into Open vSwitch may
therefore be ordered differently.
A mempool can be shared between primary and secondary processes,
such as dpdk-pdump and a pdump-enabled PMD/Open vSwitch. A crash
occurs because dpdk-pdump creates the "ring_mp_mc" ring, whose index
in its table is 0, while index 0 in Open vSwitch refers to the
"bucket" ops. When Open vSwitch uses index 0 to get the mempool ops
and allocate memory from the mempool, the crash occurs:
bucket_dequeue (access null and crash)
rte_mempool_get_ops (should get "ring_mp_mc",
but get "bucket" mempool)
rte_mempool_ops_dequeue_bulk
...
rte_pktmbuf_alloc
rte_pktmbuf_copy
pdump_copy
pdump_rx
rte_eth_rx_burst
To avoid the crash, there are several possible solutions:

* constructor priority: each mempool driver uses a different priority
in RTE_INIT, but this is not easy to maintain.

* change mk/rte.app.mk: change the link order in mk/rte.app.mk to match
libdpdk.a/libdpdk.so, but whenever a new mempool driver is added in the
future, we must take care to keep the order in sync.

* register mempools in sorted order: sort the mempool ops when
registering, so the link order no longer affects the index in the
mempool table; however, the number of mempool libraries linked into
each process may still differ.

* shared memzone: the primary process allocates a struct in shared
memory (a memzone). When registering a mempool ops, we first obtain a
name and index from the shared struct: with the lock held, look up the
registered name and return its index if found; otherwise take the next
free index and copy the name into the struct.
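The lookup-or-append behaviour of the shared-memzone approach can be
sketched in plain C. This is a simplified model only: there is no
memzone and no locking here, and `register_shared_ops`, `MAX_OPS` and
the table layout are illustrative names, not the patch's actual API:

```c
#include <assert.h>
#include <string.h>

#define MAX_OPS  16
#define NAMESIZE 32

/* Stand-in for the shared table: in the real patch this struct lives
 * in a memzone shared by primary and secondary processes and is
 * protected by a spinlock. */
static struct {
	size_t num_ops;
	char name[MAX_OPS][NAMESIZE];
} shared_tbl;

/* Look up `name`; return its existing index, or append it and return
 * the newly assigned index. Mirrors the lookup-or-append logic of the
 * "shared memzone" solution (locking omitted for brevity). */
static int register_shared_ops(const char *name)
{
	size_t i;

	for (i = 0; i < shared_tbl.num_ops; i++)
		if (strcmp(name, shared_tbl.name[i]) == 0)
			return (int)i;

	if (shared_tbl.num_ops >= MAX_OPS)
		return -1; /* table full, like -ENOSPC in the patch */

	strncpy(shared_tbl.name[shared_tbl.num_ops], name, NAMESIZE - 1);
	return (int)shared_tbl.num_ops++;
}
```

Because the index is derived from the shared name table rather than from
per-process link order, both processes agree on every index.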
Previous discussion: https://mails.dpdk.org/archives/dev/2020-March/159354.html
Suggested-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
v2:
* fix checkpatch warning
---
lib/librte_mempool/rte_mempool.h | 28 +++++++++++-
lib/librte_mempool/rte_mempool_ops.c | 89 ++++++++++++++++++++++++++++--------
2 files changed, 96 insertions(+), 21 deletions(-)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index c90cf31..2709b9e 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -50,6 +50,7 @@
#include <rte_ring.h>
#include <rte_memcpy.h>
#include <rte_common.h>
+#include <rte_init.h>
#ifdef __cplusplus
extern "C" {
@@ -678,7 +679,6 @@ struct rte_mempool_ops {
*/
struct rte_mempool_ops_table {
rte_spinlock_t sl; /**< Spinlock for add/delete. */
- uint32_t num_ops; /**< Number of used ops structs in the table. */
/**
* Storage for all possible ops structs.
*/
@@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
*/
int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
+struct rte_mempool_shared_ops {
+ size_t num_mempool_ops;
+ struct {
+ char name[RTE_MEMPOOL_OPS_NAMESIZE];
+ } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
+
+ rte_spinlock_t mempool;
+};
+
+static inline int
+mempool_ops_register_cb(const void *arg)
+{
+ const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
+
+ return rte_mempool_register_ops(h);
+}
+
+static inline void
+mempool_ops_register(const struct rte_mempool_ops *ops)
+{
+ rte_init_register(mempool_ops_register_cb, (const void *)ops,
+ RTE_INIT_PRE);
+}
+
/**
* Macro to statically register the ops of a mempool handler.
* Note that the rte_mempool_register_ops fails silently here when
@@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
#define MEMPOOL_REGISTER_OPS(ops) \
RTE_INIT(mp_hdlr_init_##ops) \
{ \
- rte_mempool_register_ops(&ops); \
+ mempool_ops_register(&ops); \
}
/**
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
index 22c5251..bc01964 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -14,43 +14,95 @@
/* indirect jump table to support external memory pools. */
struct rte_mempool_ops_table rte_mempool_ops_table = {
.sl = RTE_SPINLOCK_INITIALIZER,
- .num_ops = 0
};
-/* add a new ops struct in rte_mempool_ops_table, return its index. */
-int
-rte_mempool_register_ops(const struct rte_mempool_ops *h)
+static int
+rte_mempool_register_shared_ops(const char *name)
{
- struct rte_mempool_ops *ops;
- int16_t ops_index;
+ static bool mempool_shared_ops_inited = false;
+ struct rte_mempool_shared_ops *shared_ops;
+ const struct rte_memzone *mz;
+ uint32_t ops_index = 0;
+
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
+ !mempool_shared_ops_inited) {
+
+ mz = rte_memzone_reserve("mempool_ops_shared",
+ sizeof(*shared_ops), 0, 0);
+ if (mz == NULL)
+ return -ENOMEM;
+
+ shared_ops = mz->addr;
+ shared_ops->num_mempool_ops = 0;
+ rte_spinlock_init(&shared_ops->mempool);
+
+ mempool_shared_ops_inited = true;
+ } else {
+ mz = rte_memzone_lookup("mempool_ops_shared");
+ if (mz == NULL)
+ return -ENOMEM;
+
+ shared_ops = mz->addr;
+ }
- rte_spinlock_lock(&rte_mempool_ops_table.sl);
+ rte_spinlock_lock(&shared_ops->mempool);
- if (rte_mempool_ops_table.num_ops >=
- RTE_MEMPOOL_MAX_OPS_IDX) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
+ if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
+ rte_spinlock_unlock(&shared_ops->mempool);
RTE_LOG(ERR, MEMPOOL,
"Maximum number of mempool ops structs exceeded\n");
return -ENOSPC;
}
+ while (shared_ops->mempool_ops[ops_index].name[0]) {
+ if (!strcmp(name, shared_ops->mempool_ops[ops_index].name)) {
+ rte_spinlock_unlock(&shared_ops->mempool);
+ return ops_index;
+ }
+
+ ops_index++;
+ }
+
+ strlcpy(shared_ops->mempool_ops[ops_index].name, name,
+ sizeof(shared_ops->mempool_ops[0].name));
+
+ shared_ops->num_mempool_ops++;
+
+ rte_spinlock_unlock(&shared_ops->mempool);
+ return ops_index;
+}
+
+/* add a new ops struct in rte_mempool_ops_table, return its index. */
+int
+rte_mempool_register_ops(const struct rte_mempool_ops *h)
+{
+ struct rte_mempool_ops *ops;
+ int16_t ops_index;
+
if (h->alloc == NULL || h->enqueue == NULL ||
- h->dequeue == NULL || h->get_count == NULL) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
+ h->dequeue == NULL || h->get_count == NULL) {
RTE_LOG(ERR, MEMPOOL,
"Missing callback while registering mempool ops\n");
+ rte_errno = EINVAL;
return -EINVAL;
}
if (strlen(h->name) >= sizeof(ops->name) - 1) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
- RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
- __func__, h->name);
+ RTE_LOG(ERR, MEMPOOL,
+ "The registering mempool_ops <%s>: name too long\n",
+ h->name);
rte_errno = EEXIST;
return -EEXIST;
}
- ops_index = rte_mempool_ops_table.num_ops++;
+ ops_index = rte_mempool_register_shared_ops(h->name);
+ if (ops_index < 0) {
+ rte_errno = -ops_index;
+ return ops_index;
+ }
+
+ rte_spinlock_lock(&rte_mempool_ops_table.sl);
+
ops = &rte_mempool_ops_table.ops[ops_index];
strlcpy(ops->name, h->name, sizeof(ops->name));
ops->alloc = h->alloc;
@@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
if (mp->flags & MEMPOOL_F_POOL_CREATED)
return -EEXIST;
- for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
- if (!strcmp(name,
- rte_mempool_ops_table.ops[i].name)) {
+ for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
+ if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {
ops = &rte_mempool_ops_table.ops[i];
break;
}
--
1.8.3.1
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init queue for libraries initialization
2020-04-09 15:02 ` [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init " xiangxia.m.yue
2020-04-09 15:02 ` [dpdk-dev] [PATCH dpdk-dev v2 2/2] mempool: use shared memzone for rte_mempool_ops xiangxia.m.yue
@ 2020-04-10 6:18 ` Jerin Jacob
2020-04-10 13:11 ` Jerin Jacob
1 sibling, 1 reply; 44+ messages in thread
From: Jerin Jacob @ 2020-04-10 6:18 UTC (permalink / raw)
To: Tonghao Zhang
Cc: Olivier Matz, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Anatoly Burakov, Richardson, Bruce, dpdk-dev
On Thu, Apr 9, 2020 at 8:33 PM <xiangxia.m.yue@gmail.com> wrote:
>
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> This patch introduces last-init queue, user can register a
> callback for theirs initialization. Running rte_last_init_run(),
The above section needs to be rewritten wrt v2 changes.
> the almost resource of DPDK are available, such as memzone, ring.
> With this way, user don't introduce additional codes in eal layer.
>
> [This patch will be used for next patch.]
See below
>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> ---
See above
Move [This patch will be used for next patch.] here to avoid
unnecessary information in the git commit.
> v2:
> * rename rte_last_init_register ->rte_init_register
> * rename rte_last_init struct ->rte_init
> * rename rte_init_cb ->rte_init_cb_t
> * free the rte_init node when not used.
> * remove rte_init and others to eal_private.h
> * add comments
> * fix checkpatch warning
> ---
> diff --git a/lib/librte_eal/include/rte_init.h b/lib/librte_eal/include/rte_init.h
> new file mode 100644
> index 0000000..636efff
> --- /dev/null
> +++ b/lib/librte_eal/include/rte_init.h
> @@ -0,0 +1,59 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 DPDK Community
> + */
> +
> +#ifndef _RTE_INIT_H_
> +#define _RTE_INIT_H_
@file section is missing. See
lib/librte_eal/common/include/rte_errno.h as example.
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <stdio.h>
> +#include <sys/queue.h>
<sys/queue.h> is not required in public API header file.
> +
> +/**
> + * Implementation specific callback function which is
> + * responsible for specificed initialization.
> + *
> + * This is called when almost resources are available.
> + *
> + * @return
> + * 0 for successful callback
> + * Negative for unsuccessful callback with error value
> + */
> +typedef int (*rte_init_cb_t)(const void *arg);
> +
> +/**
> + * rte_init type.
> + *
> + * The rte_init of RTE_INIT_PRE are called firstly,
> + * and then RTE_INIT_POST.
> + */
> +enum rte_init_type {
> + RTE_INIT_PRE,
Type specific comment is missing.
Example as reference for formatting.
/**
* Enumerate trace mode operation.
*/
enum rte_trace_mode_e {
/**
* In this mode, When no space left in trace buffer, the subsequent
* events overwrite the old events in the trace buffer.
*/
RTE_TRACE_MODE_OVERWRITE,
/**
* In this mode, When no space left on trace buffer, the subsequent
* events shall not be recorded in the trace buffer.
*/
RTE_TRACE_MODE_DISCARD,
};
> + RTE_INIT_POST
> +};
> +
> +/**
> + * Register a rte_init callback.
> + *
> + * @param cb
> + * A pointer to a rte_init_cb structure, which will be used
s/used/invoked?
> + * in rte_eal_init().
> + *
> + * @param arg
> + * The cb will use that as param.
> + *
> + * @param type
> + * The type of rte_init registered.
> + */
> +
> +void rte_init_register(rte_init_cb_t cb, const void *arg,
> + enum rte_init_type type);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_INIT_H_ */
* Re: [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init queue for libraries initialization
2020-04-10 6:18 ` [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init queue for libraries initialization Jerin Jacob
@ 2020-04-10 13:11 ` Jerin Jacob
2020-04-12 3:20 ` Tonghao Zhang
0 siblings, 1 reply; 44+ messages in thread
From: Jerin Jacob @ 2020-04-10 13:11 UTC (permalink / raw)
To: Tonghao Zhang
Cc: Olivier Matz, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Anatoly Burakov, Richardson, Bruce, dpdk-dev
On Fri, Apr 10, 2020 at 11:48 AM Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
Three more items are missing in this patch
1) Unit test case for new API
2) Make the new API __rte_experimental
3) Update the .map file
> On Thu, Apr 9, 2020 at 8:33 PM <xiangxia.m.yue@gmail.com> wrote:
> >
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > This patch introduces last-init queue, user can register a
> > callback for theirs initialization. Running rte_last_init_run(),
>
> The above section needs to be rewritten wrt v2 changes.
>
> > the almost resource of DPDK are available, such as memzone, ring.
> > With this way, user don't introduce additional codes in eal layer.
> >
> > [This patch will be used for next patch.]
>
> See below
>
>
> >
> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > ---
> See above
>
> Move [This patch will be used for next patch.] here to avoid
> unnecessary information in the git commit.
>
> > v2:
> > * rename rte_last_init_register ->rte_init_register
> > * rename rte_last_init struct ->rte_init
> > * rename rte_init_cb ->rte_init_cb_t
> > * free the rte_init node when not used.
> > * remove rte_init and others to eal_private.h
> > * add comments
> > * fix checkpatch warning
> > ---
> > diff --git a/lib/librte_eal/include/rte_init.h b/lib/librte_eal/include/rte_init.h
> > new file mode 100644
> > index 0000000..636efff
> > --- /dev/null
> > +++ b/lib/librte_eal/include/rte_init.h
> > @@ -0,0 +1,59 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright 2020 DPDK Community
> > + */
> > +
> > +#ifndef _RTE_INIT_H_
> > +#define _RTE_INIT_H_
>
> @file section is missing. See
> lib/librte_eal/common/include/rte_errno.h as example.
>
>
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include <stdio.h>
> > +#include <sys/queue.h>
>
> <sys/queue.h> is not required in public API header file.
>
> > +
> > +/**
> > + * Implementation specific callback function which is
> > + * responsible for specificed initialization.
> > + *
> > + * This is called when almost resources are available.
> > + *
> > + * @return
> > + * 0 for successful callback
> > + * Negative for unsuccessful callback with error value
> > + */
> > +typedef int (*rte_init_cb_t)(const void *arg);
> > +
> > +/**
> > + * rte_init type.
> > + *
> > + * The rte_init of RTE_INIT_PRE are called firstly,
> > + * and then RTE_INIT_POST.
> > + */
> > +enum rte_init_type {
> > + RTE_INIT_PRE,
>
> Type specific comment is missing.
>
> Example as reference for formatting.
>
> /**
> * Enumerate trace mode operation.
> */
> enum rte_trace_mode_e {
> /**
> * In this mode, When no space left in trace buffer, the subsequent
> * events overwrite the old events in the trace buffer.
> */
> RTE_TRACE_MODE_OVERWRITE,
> /**
> * In this mode, When no space left on trace buffer, the subsequent
> * events shall not be recorded in the trace buffer.
> */
> RTE_TRACE_MODE_DISCARD,
> };
>
> > + RTE_INIT_POST
> > +};
>
>
> > +
> > +/**
> > + * Register a rte_init callback.
> > + *
> > + * @param cb
> > + * A pointer to a rte_init_cb structure, which will be used
>
> s/used/invoked?
>
> > + * in rte_eal_init().
> > + *
> > + * @param arg
> > + * The cb will use that as param.
> > + *
> > + * @param type
> > + * The type of rte_init registered.
> > + */
> > +
> > +void rte_init_register(rte_init_cb_t cb, const void *arg,
> > + enum rte_init_type type);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_INIT_H_ */
* Re: [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init queue for libraries initialization
2020-04-10 13:11 ` Jerin Jacob
@ 2020-04-12 3:20 ` Tonghao Zhang
2020-04-12 3:32 ` Tonghao Zhang
0 siblings, 1 reply; 44+ messages in thread
From: Tonghao Zhang @ 2020-04-12 3:20 UTC (permalink / raw)
To: Jerin Jacob
Cc: Olivier Matz, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Anatoly Burakov, Richardson, Bruce, dpdk-dev
On Fri, Apr 10, 2020 at 9:11 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> On Fri, Apr 10, 2020 at 11:48 AM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> >
>
> Three more items are missing in this patch
>
> 1) Unit test case for new API
> 2) Make the new API __rte_experimal
Hi Jerin
This API will be invoked in mempool; if we use that prefix, there is a
compile warning:
include/rte_mempool.h:933:2: error: ‘rte_init_register’ is deprecated
(declared at x86_64-native-linuxapp-gcc/include/rte_init.h:61): Symbol
is not yet part of stable ABI [-Werror=deprecated-declarations]
Should we remove the __rte_experimental in patch 2?
> 3) Update the .map file
>
>
> > On Thu, Apr 9, 2020 at 8:33 PM <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > >
> > > This patch introduces last-init queue, user can register a
> > > callback for theirs initialization. Running rte_last_init_run(),
> >
> > The above section needs to be rewritten wrt v2 changes.
> >
> > > the almost resource of DPDK are available, such as memzone, ring.
> > > With this way, user don't introduce additional codes in eal layer.
> > >
> > > [This patch will be used for next patch.]
> >
> > See below
> >
> >
> > >
> > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > ---
> > See above
> >
> > Move [This patch will be used for next patch.] here to avoid
> > unnecessary information in the git commit.
> >
> > > v2:
> > > * rename rte_last_init_register ->rte_init_register
> > > * rename rte_last_init struct ->rte_init
> > > * rename rte_init_cb ->rte_init_cb_t
> > > * free the rte_init node when not used.
> > > * remove rte_init and others to eal_private.h
> > > * add comments
> > > * fix checkpatch warning
> > > ---
> > > diff --git a/lib/librte_eal/include/rte_init.h b/lib/librte_eal/include/rte_init.h
> > > new file mode 100644
> > > index 0000000..636efff
> > > --- /dev/null
> > > +++ b/lib/librte_eal/include/rte_init.h
> > > @@ -0,0 +1,59 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + * Copyright 2020 DPDK Community
> > > + */
> > > +
> > > +#ifndef _RTE_INIT_H_
> > > +#define _RTE_INIT_H_
> >
> > @file section is missing. See
> > lib/librte_eal/common/include/rte_errno.h as example.
> >
> >
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +#include <stdio.h>
> > > +#include <sys/queue.h>
> >
> > <sys/queue.h> is not required in public API header file.
> >
> > > +
> > > +/**
> > > + * Implementation specific callback function which is
> > > + * responsible for specificed initialization.
> > > + *
> > > + * This is called when almost resources are available.
> > > + *
> > > + * @return
> > > + * 0 for successful callback
> > > + * Negative for unsuccessful callback with error value
> > > + */
> > > +typedef int (*rte_init_cb_t)(const void *arg);
> > > +
> > > +/**
> > > + * rte_init type.
> > > + *
> > > + * The rte_init of RTE_INIT_PRE are called firstly,
> > > + * and then RTE_INIT_POST.
> > > + */
> > > +enum rte_init_type {
> > > + RTE_INIT_PRE,
> >
> > Type specific comment is missing.
> >
> > Example as reference for formatting.
> >
> > /**
> > * Enumerate trace mode operation.
> > */
> > enum rte_trace_mode_e {
> > /**
> > * In this mode, When no space left in trace buffer, the subsequent
> > * events overwrite the old events in the trace buffer.
> > */
> > RTE_TRACE_MODE_OVERWRITE,
> > /**
> > * In this mode, When no space left on trace buffer, the subsequent
> > * events shall not be recorded in the trace buffer.
> > */
> > RTE_TRACE_MODE_DISCARD,
> > };
> >
> > > + RTE_INIT_POST
> > > +};
> >
> >
> > > +
> > > +/**
> > > + * Register a rte_init callback.
> > > + *
> > > + * @param cb
> > > + * A pointer to a rte_init_cb structure, which will be used
> >
> > s/used/invoked?
> >
> > > + * in rte_eal_init().
> > > + *
> > > + * @param arg
> > > + * The cb will use that as param.
> > > + *
> > > + * @param type
> > > + * The type of rte_init registered.
> > > + */
> > > +
> > > +void rte_init_register(rte_init_cb_t cb, const void *arg,
> > > + enum rte_init_type type);
> > > +
> > > +#ifdef __cplusplus
> > > +}
> > > +#endif
> > > +
> > > +#endif /* _RTE_INIT_H_ */
--
Best regards, Tonghao
* Re: [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init queue for libraries initialization
2020-04-12 3:20 ` Tonghao Zhang
@ 2020-04-12 3:32 ` Tonghao Zhang
2020-04-13 11:32 ` Jerin Jacob
0 siblings, 1 reply; 44+ messages in thread
From: Tonghao Zhang @ 2020-04-12 3:32 UTC (permalink / raw)
To: Jerin Jacob
Cc: Olivier Matz, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Anatoly Burakov, Richardson, Bruce, dpdk-dev
On Sun, Apr 12, 2020 at 11:20 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Fri, Apr 10, 2020 at 9:11 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> >
> > On Fri, Apr 10, 2020 at 11:48 AM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > >
> >
> > Three more items are missing in this patch
Can I add "Co-authored-by" for you?
> > 1) Unit test case for new API
> > 2) Make the new API __rte_experimal
> Hi Jerin
> This API will be invoked in mempool, if use that prefix, there is a
> compile warning:
> include/rte_mempool.h:933:2: error: ‘rte_init_register’ is deprecated
> (declared at x86_64-native-linuxapp-gcc/include/rte_init.h:61): Symbol
> is not yet part of stable ABI [-Werror=deprecated-declarations]
>
> remove the __rte_experimal in the patch 2?
>
> > 3) Update the .map file
> >
> >
> > > On Thu, Apr 9, 2020 at 8:33 PM <xiangxia.m.yue@gmail.com> wrote:
> > > >
> > > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > >
> > > > This patch introduces last-init queue, user can register a
> > > > callback for theirs initialization. Running rte_last_init_run(),
> > >
> > > The above section needs to be rewritten wrt v2 changes.
> > >
> > > > the almost resource of DPDK are available, such as memzone, ring.
> > > > With this way, user don't introduce additional codes in eal layer.
> > > >
> > > > [This patch will be used for next patch.]
> > >
> > > See below
> > >
> > >
> > > >
> > > > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > > ---
> > > See above
> > >
> > > Move [This patch will be used for next patch.] here to avoid
> > > unnecessary information in the git commit.
> > >
> > > > v2:
> > > > * rename rte_last_init_register ->rte_init_register
> > > > * rename rte_last_init struct ->rte_init
> > > > * rename rte_init_cb ->rte_init_cb_t
> > > > * free the rte_init node when not used.
> > > > * remove rte_init and others to eal_private.h
> > > > * add comments
> > > > * fix checkpatch warning
> > > > ---
> > > > diff --git a/lib/librte_eal/include/rte_init.h b/lib/librte_eal/include/rte_init.h
> > > > new file mode 100644
> > > > index 0000000..636efff
> > > > --- /dev/null
> > > > +++ b/lib/librte_eal/include/rte_init.h
> > > > @@ -0,0 +1,59 @@
> > > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > > + * Copyright 2020 DPDK Community
> > > > + */
> > > > +
> > > > +#ifndef _RTE_INIT_H_
> > > > +#define _RTE_INIT_H_
> > >
> > > @file section is missing. See
> > > lib/librte_eal/common/include/rte_errno.h as example.
> > >
> > >
> > > > +#ifdef __cplusplus
> > > > +extern "C" {
> > > > +#endif
> > > > +
> > > > +#include <stdio.h>
> > > > +#include <sys/queue.h>
> > >
> > > <sys/queue.h> is not required in public API header file.
> > >
> > > > +
> > > > +/**
> > > > + * Implementation specific callback function which is
> > > > + * responsible for specificed initialization.
> > > > + *
> > > > + * This is called when almost resources are available.
> > > > + *
> > > > + * @return
> > > > + * 0 for successful callback
> > > > + * Negative for unsuccessful callback with error value
> > > > + */
> > > > +typedef int (*rte_init_cb_t)(const void *arg);
> > > > +
> > > > +/**
> > > > + * rte_init type.
> > > > + *
> > > > + * The rte_init of RTE_INIT_PRE are called firstly,
> > > > + * and then RTE_INIT_POST.
> > > > + */
> > > > +enum rte_init_type {
> > > > + RTE_INIT_PRE,
> > >
> > > Type specific comment is missing.
> > >
> > > Example as reference for formatting.
> > >
> > > /**
> > > * Enumerate trace mode operation.
> > > */
> > > enum rte_trace_mode_e {
> > > /**
> > > * In this mode, When no space left in trace buffer, the subsequent
> > > * events overwrite the old events in the trace buffer.
> > > */
> > > RTE_TRACE_MODE_OVERWRITE,
> > > /**
> > > * In this mode, When no space left on trace buffer, the subsequent
> > > * events shall not be recorded in the trace buffer.
> > > */
> > > RTE_TRACE_MODE_DISCARD,
> > > };
> > >
> > > > + RTE_INIT_POST
> > > > +};
> > >
> > >
> > > > +
> > > > +/**
> > > > + * Register a rte_init callback.
> > > > + *
> > > > + * @param cb
> > > > + * A pointer to a rte_init_cb structure, which will be used
> > >
> > > s/used/invoked?
> > >
> > > > + * in rte_eal_init().
> > > > + *
> > > > + * @param arg
> > > > + * The cb will use that as param.
> > > > + *
> > > > + * @param type
> > > > + * The type of rte_init registered.
> > > > + */
> > > > +
> > > > +void rte_init_register(rte_init_cb_t cb, const void *arg,
> > > > + enum rte_init_type type);
> > > > +
> > > > +#ifdef __cplusplus
> > > > +}
> > > > +#endif
> > > > +
> > > > +#endif /* _RTE_INIT_H_ */
>
>
>
> --
> Best regards, Tonghao
--
Best regards, Tonghao
* Re: [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init queue for libraries initialization
2020-04-12 3:32 ` Tonghao Zhang
@ 2020-04-13 11:32 ` Jerin Jacob
0 siblings, 0 replies; 44+ messages in thread
From: Jerin Jacob @ 2020-04-13 11:32 UTC (permalink / raw)
To: Tonghao Zhang
Cc: Olivier Matz, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Anatoly Burakov, Richardson, Bruce, dpdk-dev
On Sun, Apr 12, 2020 at 9:03 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Sun, Apr 12, 2020 at 11:20 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Fri, Apr 10, 2020 at 9:11 PM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > >
> > > On Fri, Apr 10, 2020 at 11:48 AM Jerin Jacob <jerinjacobk@gmail.com> wrote:
> > > >
> > >
> > > Three more items are missing in this patch
> Can I add "Co-authored-by" for you ?
We are not using that tag in DPDK. Feel free to add: Suggested-by:
Jerin Jacob <jerinj@marvell.com>
>
> > > 1) Unit test case for new API
> > > 2) Make the new API __rte_experimal
> > Hi Jerin
> > This API will be invoked in mempool, if use that prefix, there is a
> > compile warning:
> > include/rte_mempool.h:933:2: error: ‘rte_init_register’ is deprecated
> > (declared at x86_64-native-linuxapp-gcc/include/rte_init.h:61): Symbol
> > is not yet part of stable ABI [-Werror=deprecated-declarations]
http://patches.dpdk.org/patch/68119/ will be applied soon, so it will
not be a problem.
* [dpdk-dev] [PATCH dpdk-dev v3 1/2] eal: introduce rte-init queue for libraries initialization
2020-03-02 1:57 [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name xiangxia.m.yue
` (4 preceding siblings ...)
2020-04-09 15:02 ` [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init " xiangxia.m.yue
@ 2020-04-13 14:21 ` xiangxia.m.yue
2020-04-13 14:21 ` [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops xiangxia.m.yue
5 siblings, 1 reply; 44+ messages in thread
From: xiangxia.m.yue @ 2020-04-13 14:21 UTC (permalink / raw)
To: olivier.matz, arybchenko, gage.eads, artem.andreev, jerinj,
ndabilpuram, vattunuru, hemant.agrawal, david.marchand,
anatoly.burakov, bruce.richardson
Cc: dev, Tonghao Zhang
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
This patch introduces an rte-init queue with which users can register
a callback for their own initialization. Once most DPDK resources are
available (e.g. memzone, ring), eal_rte_init_run() is invoked. This
way, users do not need to add extra code to the EAL layer.
Suggested-by: Olivier Matz <olivier.matz@6wind.com>
Suggested-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
This patch will be used by the next patch,
and should be applied after http://patches.dpdk.org/patch/68119/
because of __rte_experimental.
v3:
* add rte-init autotest
* add __rte_experimental prefix
* change return type of rte_init_register to int
* update doc/api/doxy-api-index.md
* update rte_eal_version.map
* update comments
* remove unused *.h in rte_init.h
v2:
* rename rte_last_init_register ->rte_init_register
* rename rte_last_init struct ->rte_init
* rename rte_init_cb ->rte_init_cb_t
* free the rte_init node when not used
* remove rte_init and others to eal_private.h
* add comments
* fix checkpatch warning
---
app/test/Makefile | 1 +
app/test/autotest_data.py | 6 +++
app/test/meson.build | 1 +
app/test/test_rte_init.c | 35 +++++++++++++
doc/api/doxy-api-index.md | 1 +
lib/librte_eal/common/eal_common_init.c | 87 +++++++++++++++++++++++++++++++++
lib/librte_eal/common/eal_private.h | 23 +++++++++
lib/librte_eal/common/meson.build | 1 +
lib/librte_eal/freebsd/Makefile | 1 +
lib/librte_eal/freebsd/eal.c | 6 +++
lib/librte_eal/include/rte_init.h | 68 ++++++++++++++++++++++++++
lib/librte_eal/linux/Makefile | 1 +
lib/librte_eal/linux/eal.c | 6 +++
lib/librte_eal/rte_eal_version.map | 1 +
14 files changed, 238 insertions(+)
create mode 100644 app/test/test_rte_init.c
create mode 100644 lib/librte_eal/common/eal_common_init.c
create mode 100644 lib/librte_eal/include/rte_init.h
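Before the diff itself, the mechanism it implements can be sketched
standalone. This is a simplified model without EAL types or logging;
the names here (`init_register`, `init_run`, `INIT_PRE`/`INIT_POST`)
are illustrative stand-ins for the patch's API, not the API itself:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

enum init_type { INIT_PRE, INIT_POST };
typedef int (*init_cb_t)(const void *arg);

struct init_node {
	TAILQ_ENTRY(init_node) next;
	enum init_type type;
	init_cb_t cb;
	const void *arg;
};

static TAILQ_HEAD(, init_node) init_list = TAILQ_HEAD_INITIALIZER(init_list);

/* Mirrors rte_init_register(): validate, allocate a node, enqueue. */
static int init_register(init_cb_t cb, const void *arg, enum init_type type)
{
	struct init_node *n;

	if (cb == NULL || (type != INIT_PRE && type != INIT_POST))
		return -1; /* -EINVAL in the patch */

	n = malloc(sizeof(*n));
	if (n == NULL)
		return -1; /* -ENOMEM in the patch */

	n->type = type;
	n->cb = cb;
	n->arg = arg;
	TAILQ_INSERT_TAIL(&init_list, n, next);
	return 0;
}

/* Mirrors eal_rte_init_run(): run all PRE callbacks, then all POST
 * callbacks, then free the nodes since they are no longer needed. */
static int init_run(void)
{
	enum init_type passes[2] = { INIT_PRE, INIT_POST };
	struct init_node *n;
	int i;

	for (i = 0; i < 2; i++) {
		TAILQ_FOREACH(n, &init_list, next) {
			if (n->type != passes[i])
				continue;
			if (n->cb(n->arg) < 0)
				return -1;
		}
	}

	while ((n = TAILQ_FIRST(&init_list)) != NULL) {
		TAILQ_REMOVE(&init_list, n, next);
		free(n);
	}
	return 0;
}

/* Demonstration helpers: record the order callbacks run in. */
static int run_order[4];
static int run_count;
static int record_cb(const void *arg)
{
	run_order[run_count++] = *(const int *)arg;
	return 0;
}
static const int pre_tag = 1, post_tag = 2;
```

Note that PRE callbacks run before POST callbacks regardless of
registration order, which is what lets mempool ops registration be
deferred until the memzone subsystem is up.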
diff --git a/app/test/Makefile b/app/test/Makefile
index 1f080d162659..494606ad4226 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -148,6 +148,7 @@ SRCS-y += test_alarm.c
SRCS-y += test_interrupts.c
SRCS-y += test_version.c
SRCS-y += test_func_reentrancy.c
+SRCS-y += test_rte_init.c
SRCS-y += test_service_cores.c
diff --git a/app/test/autotest_data.py b/app/test/autotest_data.py
index 7b1d01389be4..d06c27d68b4a 100644
--- a/app/test/autotest_data.py
+++ b/app/test/autotest_data.py
@@ -267,6 +267,12 @@
"Report": None,
},
{
+ "Name": "RTE INIT autotest",
+ "Command": "rte_init_autotest",
+ "Func": default_autotest,
+ "Report": None,
+ },
+ {
"Name": "Atomics autotest",
"Command": "atomic_autotest",
"Func": default_autotest,
diff --git a/app/test/meson.build b/app/test/meson.build
index 351d29cb657d..98a952a9285d 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -79,6 +79,7 @@ test_sources = files('commands.c',
'test_mempool.c',
'test_mempool_perf.c',
'test_memzone.c',
+ 'test_rte_init.c',
'test_meter.c',
'test_metrics.c',
'test_mcslock.c',
diff --git a/app/test/test_rte_init.c b/app/test/test_rte_init.c
new file mode 100644
index 000000000000..f6143bc76310
--- /dev/null
+++ b/app/test/test_rte_init.c
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 DPDK Community
+ */
+
+#include <stdio.h>
+
+#include <rte_init.h>
+
+#include "test.h"
+
+static int
+rte_init_cb(__rte_unused const void *arg)
+{
+ return 0;
+}
+
+static int
+test_rte_init(void)
+{
+ printf("test rte-init register API\n");
+ if (rte_init_register(rte_init_cb, NULL, RTE_INIT_PRE) != 0)
+ return -1;
+
+ printf("test rte-init cb\n");
+ if (rte_init_register(NULL, NULL, RTE_INIT_PRE) != -EINVAL)
+ return -1;
+
+ printf("test rte-init type\n");
+ if (rte_init_register(NULL, NULL, 10) != -EINVAL)
+ return -1;
+
+ return 0;
+}
+
+REGISTER_TEST_COMMAND(rte_init_autotest, test_rte_init);
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index dff496be0980..fa02c5f9f287 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -178,6 +178,7 @@ The public API headers are grouped by topics:
- **misc**:
[EAL config] (@ref rte_eal.h),
+ [RTE INIT] (@ref rte_init.h),
[common] (@ref rte_common.h),
[experimental APIs] (@ref rte_compat.h),
[ABI versioning] (@ref rte_function_versioning.h),
diff --git a/lib/librte_eal/common/eal_common_init.c b/lib/librte_eal/common/eal_common_init.c
new file mode 100644
index 000000000000..f5ae5c51b667
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_init.c
@@ -0,0 +1,87 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 DPDK Community
+ */
+
+#include <sys/queue.h>
+
+#include <rte_init.h>
+#include <rte_tailq.h>
+#include <rte_log.h>
+
+#include "eal_private.h"
+
+static struct rte_init_list rte_init_list =
+ TAILQ_HEAD_INITIALIZER(rte_init_list);
+
+int
+rte_init_register(rte_init_cb_t cb, const void *arg, enum rte_init_type type)
+{
+ struct rte_init *last;
+
+ if (cb == NULL) {
+ RTE_LOG(ERR, EAL, "RTE INIT cb is NULL.\n");
+ return -EINVAL;
+ }
+
+ if (type != RTE_INIT_PRE && type != RTE_INIT_POST) {
+ RTE_LOG(ERR, EAL, "RTE INIT type is invalid.\n");
+ return -EINVAL;
+ }
+
+ last = malloc(sizeof(*last));
+ if (last == NULL) {
+ RTE_LOG(ERR, EAL,
+ "Allocate memory for rte_init node failed.\n");
+ return -ENOMEM;
+ }
+
+ last->type = type;
+ last->arg = arg;
+ last->cb = cb;
+
+ TAILQ_INSERT_TAIL(&rte_init_list, last, next);
+
+ return 0;
+}
+
+static int
+eal_rte_init_run_type(enum rte_init_type type)
+{
+ struct rte_init *last;
+ int ret;
+
+ TAILQ_FOREACH(last, &rte_init_list, next) {
+ if (last->type != type)
+ continue;
+
+ ret = last->cb(last->arg);
+ if (ret < 0)
+ return ret;
+ }
+
+ return 0;
+}
+
+int
+eal_rte_init_run(void)
+{
+ struct rte_init *last;
+ struct rte_init *tmp;
+ int ret;
+
+ ret = eal_rte_init_run_type(RTE_INIT_PRE);
+ if (ret < 0)
+ return ret;
+
+ ret = eal_rte_init_run_type(RTE_INIT_POST);
+ if (ret < 0)
+ return ret;
+
+ /* Free rte_init node, not used anymore. */
+ TAILQ_FOREACH_SAFE(last, &rte_init_list, next, tmp) {
+ TAILQ_REMOVE(&rte_init_list, last, next);
+ free(last);
+ }
+
+ return 0;
+}
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index ddcfbe2e4442..f0bcc977eb9d 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -11,6 +11,7 @@
#include <rte_dev.h>
#include <rte_lcore.h>
+#include <rte_init.h>
/**
* Structure storing internal configuration (per-lcore)
@@ -60,6 +61,28 @@ struct rte_config {
} __attribute__((__packed__));
/**
+ * A structure describing a generic initialization.
+ */
+struct rte_init {
+ TAILQ_ENTRY(rte_init) next;
+ enum rte_init_type type;
+ rte_init_cb_t cb;
+ const void *arg;
+};
+
+/** Doubly linked list of rte_init nodes. */
+TAILQ_HEAD(rte_init_list, rte_init);
+
+/**
+ * Run the callbacks registered in the global doubly linked list.
+ *
+ * @return
+ * - 0 on success
+ * - Negative on error
+ */
+int eal_rte_init_run(void);
+
+/**
* Get the global configuration structure.
*
* @return
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 02d9280cc351..ad0ce6a19015 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -43,6 +43,7 @@ sources += files(
'eal_common_thread.c',
'eal_common_timer.c',
'eal_common_uuid.c',
+ 'eal_common_init.c',
'hotplug_mp.c',
'malloc_elem.c',
'malloc_heap.c',
diff --git a/lib/librte_eal/freebsd/Makefile b/lib/librte_eal/freebsd/Makefile
index e5d4d8ff260b..89c5649f16e6 100644
--- a/lib/librte_eal/freebsd/Makefile
+++ b/lib/librte_eal/freebsd/Makefile
@@ -60,6 +60,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_thread.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_proc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_fbarray.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_uuid.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_common_init.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += rte_malloc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += hotplug_mp.c
SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += malloc_elem.c
diff --git a/lib/librte_eal/freebsd/eal.c b/lib/librte_eal/freebsd/eal.c
index 6ae37e7e69be..63af98cb6ca7 100644
--- a/lib/librte_eal/freebsd/eal.c
+++ b/lib/librte_eal/freebsd/eal.c
@@ -874,6 +874,12 @@ static void rte_eal_init_alert(const char *msg)
eal_check_mem_on_local_socket();
+ if (eal_rte_init_run() < 0) {
+ rte_eal_init_alert("Cannot init objects in rte-init queue");
+ rte_errno = EFAULT;
+ return -1;
+ }
+
eal_thread_init_master(rte_config.master_lcore);
ret = eal_thread_dump_affinity(cpuset, sizeof(cpuset));
diff --git a/lib/librte_eal/include/rte_init.h b/lib/librte_eal/include/rte_init.h
new file mode 100644
index 000000000000..83c7fd47daec
--- /dev/null
+++ b/lib/librte_eal/include/rte_init.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 DPDK Community
+ */
+
+#ifndef _RTE_INIT_H_
+#define _RTE_INIT_H_
+
+#include <rte_compat.h>
+
+/**
+ * @file
+ *
+ * RTE INIT Registration Interface
+ *
+ * This file exposes an API for registering initialization callbacks
+ * from other libraries.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Implementation-specific callback function which is
+ * responsible for the specified initialization.
+ *
+ * This is invoked once most EAL resources are available.
+ *
+ * @return
+ * 0 on success.
+ * Negative error value on failure.
+ */
+typedef int (*rte_init_cb_t)(const void *arg);
+
+/**
+ * The RTE INIT type of callback function registered.
+ */
+enum rte_init_type {
+ RTE_INIT_PRE, /**< RTE INITs are invoked with high priority. */
+ RTE_INIT_POST /**< Last RTE INITs invoked. */
+};
+
+/**
+ * Register a rte_init callback.
+ *
+ * @param cb
+ * A rte_init_cb_t callback function, which will be invoked
+ * in rte_eal_init().
+ *
+ * @param arg
+ * Argument passed to the callback.
+ *
+ * @param type
+ * The type of the rte_init being registered.
+ *
+ * @return
+ * 0 on successful registration.
+ * -EINVAL if one of the parameters was invalid.
+ * -ENOMEM if memory allocation failed.
+ */
+__rte_experimental
+int rte_init_register(rte_init_cb_t cb, const void *arg,
+ enum rte_init_type type);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_INIT_H_ */
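The deferred-init mechanism added above (a registration list filled from constructors, then drained in two phases from rte_eal_init()) can be modeled as a small standalone C program. This is an illustrative sketch only: it uses no DPDK headers, and all names here are invented stand-ins for the rte_init_* API.

```c
#include <errno.h>
#include <stdlib.h>

enum init_type { INIT_PRE, INIT_POST };
typedef int (*init_cb_t)(const void *arg);

struct init_node {
	struct init_node *next;
	enum init_type type;
	init_cb_t cb;
	const void *arg;
};

static struct init_node *init_head, *init_tail;

/* Stand-in for rte_init_register(): queue a callback for later. */
static int
init_register(init_cb_t cb, const void *arg, enum init_type type)
{
	struct init_node *n;

	if (cb == NULL)
		return -EINVAL;
	n = malloc(sizeof(*n));
	if (n == NULL)
		return -ENOMEM;
	n->type = type;
	n->cb = cb;
	n->arg = arg;
	n->next = NULL;
	if (init_tail != NULL)
		init_tail->next = n;
	else
		init_head = n;
	init_tail = n;
	return 0;
}

/* Stand-in for eal_rte_init_run_type(): run callbacks of one type. */
static int
init_run_type(enum init_type type)
{
	const struct init_node *n;
	int ret;

	for (n = init_head; n != NULL; n = n->next) {
		if (n->type != type)
			continue;
		ret = n->cb(n->arg);
		if (ret < 0)
			return ret;
	}
	return 0;
}

/* Stand-in for eal_rte_init_run(): PRE first, then POST, then free. */
static int
init_run(void)
{
	struct init_node *n;
	int ret;

	if ((ret = init_run_type(INIT_PRE)) < 0)
		return ret;
	if ((ret = init_run_type(INIT_POST)) < 0)
		return ret;
	while (init_head != NULL) {
		n = init_head;
		init_head = n->next;
		free(n);
	}
	init_tail = NULL;
	return 0;
}

/* Self-check: a POST callback registered before a PRE callback
 * must still run after it. Returns 0 when the order is correct. */
static char run_order[2];
static int run_pos;

static int
record_cb(const void *arg)
{
	run_order[run_pos++] = *(const char *)arg;
	return 0;
}

int
demo_run_order(void)
{
	if (init_register(record_cb, "B", INIT_POST) != 0)
		return -1;
	if (init_register(record_cb, "A", INIT_PRE) != 0)
		return -1;
	if (init_run() != 0)
		return -1;
	return (run_pos == 2 && run_order[0] == 'A' &&
		run_order[1] == 'B') ? 0 : -1;
}
```

The point of the two-phase run is that registration order across libraries stops mattering for PRE vs. POST work, which is exactly what the mempool patch relies on.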
diff --git a/lib/librte_eal/linux/Makefile b/lib/librte_eal/linux/Makefile
index e5f44959c66c..918d94bec91e 100644
--- a/lib/librte_eal/linux/Makefile
+++ b/lib/librte_eal/linux/Makefile
@@ -67,6 +67,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_thread.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_proc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_fbarray.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_uuid.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += eal_common_init.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += rte_malloc.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += hotplug_mp.c
SRCS-$(CONFIG_RTE_EXEC_ENV_LINUX) += malloc_elem.c
diff --git a/lib/librte_eal/linux/eal.c b/lib/librte_eal/linux/eal.c
index 9530ee55f8e3..dd0c2589f6c4 100644
--- a/lib/librte_eal/linux/eal.c
+++ b/lib/librte_eal/linux/eal.c
@@ -1203,6 +1203,12 @@ static void rte_eal_init_alert(const char *msg)
eal_check_mem_on_local_socket();
+ if (eal_rte_init_run() < 0) {
+ rte_eal_init_alert("Cannot init objects in rte-init queue");
+ rte_errno = EFAULT;
+ return -1;
+ }
+
eal_thread_init_master(rte_config.master_lcore);
ret = eal_thread_dump_affinity(cpuset, sizeof(cpuset));
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index f9ede5b41c69..23aa30f542db 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -338,4 +338,5 @@ EXPERIMENTAL {
# added in 20.05
rte_log_can_log;
+ rte_init_register;
};
--
1.8.3.1
^ permalink raw reply [flat|nested] 44+ messages in thread
* [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
2020-04-13 14:21 ` [dpdk-dev] [PATCH dpdk-dev v3 " xiangxia.m.yue
@ 2020-04-13 14:21 ` xiangxia.m.yue
2020-04-16 22:27 ` Thomas Monjalon
2020-04-23 13:38 ` Andrew Rybchenko
0 siblings, 2 replies; 44+ messages in thread
From: xiangxia.m.yue @ 2020-04-13 14:21 UTC (permalink / raw)
To: olivier.matz, arybchenko, gage.eads, artem.andreev, jerinj,
ndabilpuram, vattunuru, hemant.agrawal, david.marchand,
anatoly.burakov, bruce.richardson
Cc: dev, Tonghao Zhang
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
The order of mempool driver initialization affects the mempool ops index in
rte_mempool_ops_table. For example, when building applications with:
$ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
The "bucket" mempool will be registered first, so its index
in the table is 0 while the index of the "ring" mempool is 1. DPDK
uses mk/rte.app.mk to build its own applications, while others, for
example Open vSwitch, link against libdpdk.a or libdpdk.so. As a
result, the mempool registration order in DPDK applications and in
Open vSwitch can differ.
A mempool can be shared between primary and secondary processes,
such as dpdk-pdump and pdump-pmd/Open vSwitch (with pdump enabled).
A crash occurs because dpdk-pdump creates the "ring_mp_mc"
ring, whose index in its table is 0, while index 0 belongs to the
"bucket" ops in Open vSwitch. When Open vSwitch uses index 0 to get
the mempool ops and allocate memory from the mempool, the crash occurs:
bucket_dequeue (access null and crash)
rte_mempool_get_ops (should get "ring_mp_mc",
but get "bucket" mempool)
rte_mempool_ops_dequeue_bulk
...
rte_pktmbuf_alloc
rte_pktmbuf_copy
pdump_copy
pdump_rx
rte_eth_rx_burst
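The mismatch driving this call chain can be shown with a minimal standalone sketch. The table contents below are illustrative, not the exact registration order of any build: only the numeric ops index is shared between processes, so the same index can resolve to different drivers.

```c
#include <string.h>

/* Each process fills its ops table in its own link/registration
 * order, but only the numeric index is stored in the shared mempool
 * structure. Index 0 then names different drivers in each process. */
static const char *primary_ops[]   = { "ring_mp_mc", "bucket" }; /* e.g. dpdk-pdump */
static const char *secondary_ops[] = { "bucket", "ring_mp_mc" }; /* e.g. Open vSwitch */

/* Returns nonzero when a shared index resolves to different ops
 * in the two processes. */
int
index_mismatch(int shared_idx)
{
	return strcmp(primary_ops[shared_idx],
		      secondary_ops[shared_idx]) != 0;
}
```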
To avoid the crash, there are several possible solutions:
* constructor priority: each mempool driver uses a different
priority in RTE_INIT, but that is not easy to maintain.
* change mk/rte.app.mk: change the order in mk/rte.app.mk to
match libdpdk.a/libdpdk.so, but every new mempool driver added
in the future must preserve that order.
* register mempools in sorted order: sort the mempool ops by name
when registering, so the link order no longer affects the index
in the table; however, the set of linked mempool libraries may
still differ between processes.
* shared memzone: the primary process allocates a struct in
shared memory (a memzone). When a mempool ops is registered,
its name and id are resolved from the shared struct: with the
lock held, look up the registered name and return its index if
found; otherwise take the next free id and copy the name into
the struct.
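The shared-memzone approach boils down to a name-to-index lookup in shared memory. Below is a minimal standalone sketch of that logic: a static array stands in for the real memzone, and the spinlock and error codes of the actual patch are omitted.

```c
#include <string.h>

#define OPS_NAMESIZE 32
#define MAX_OPS 16

/* In the patch this table lives in a memzone and the lookup runs
 * under a spinlock; a static array stands in for the shared memory
 * here so the name-to-index logic can be shown on its own. */
static struct {
	size_t num_ops;
	struct { char name[OPS_NAMESIZE]; } ops[MAX_OPS];
} shared;

/* Return the index already bound to 'name', or bind the next free
 * slot to it. Processes that resolve ops by name therefore agree on
 * the index regardless of their own link/registration order. */
int
register_shared_ops(const char *name)
{
	unsigned int i;

	for (i = 0; i < MAX_OPS && shared.ops[i].name[0] != '\0'; i++) {
		if (strcmp(name, shared.ops[i].name) == 0)
			return (int)i;
	}

	if (i == MAX_OPS)
		return -1; /* table full; -ENOSPC in the real code */

	strncpy(shared.ops[i].name, name, OPS_NAMESIZE - 1);
	shared.num_ops++;
	return (int)i;
}
```

A second process registering the same names in a different order gets the same indexes back, which is the property the crash scenario above lacks.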
Previous discussion: https://mails.dpdk.org/archives/dev/2020-March/159354.html
Suggested-by: Olivier Matz <olivier.matz@6wind.com>
Suggested-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
v2:
* fix checkpatch warning
---
lib/librte_mempool/rte_mempool.h | 28 +++++++++++-
lib/librte_mempool/rte_mempool_ops.c | 89 ++++++++++++++++++++++++++++--------
2 files changed, 96 insertions(+), 21 deletions(-)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index c90cf31467b2..2709b9e1d51b 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -50,6 +50,7 @@
#include <rte_ring.h>
#include <rte_memcpy.h>
#include <rte_common.h>
+#include <rte_init.h>
#ifdef __cplusplus
extern "C" {
@@ -678,7 +679,6 @@ struct rte_mempool_ops {
*/
struct rte_mempool_ops_table {
rte_spinlock_t sl; /**< Spinlock for add/delete. */
- uint32_t num_ops; /**< Number of used ops structs in the table. */
/**
* Storage for all possible ops structs.
*/
@@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
*/
int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
+struct rte_mempool_shared_ops {
+ size_t num_mempool_ops;
+ struct {
+ char name[RTE_MEMPOOL_OPS_NAMESIZE];
+ } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
+
+ rte_spinlock_t mempool;
+};
+
+static inline int
+mempool_ops_register_cb(const void *arg)
+{
+ const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
+
+ return rte_mempool_register_ops(h);
+}
+
+static inline void
+mempool_ops_register(const struct rte_mempool_ops *ops)
+{
+ rte_init_register(mempool_ops_register_cb, (const void *)ops,
+ RTE_INIT_PRE);
+}
+
/**
* Macro to statically register the ops of a mempool handler.
* Note that the rte_mempool_register_ops fails silently here when
@@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
#define MEMPOOL_REGISTER_OPS(ops) \
RTE_INIT(mp_hdlr_init_##ops) \
{ \
- rte_mempool_register_ops(&ops); \
+ mempool_ops_register(&ops); \
}
/**
diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
index 22c5251eb068..b10fda662db6 100644
--- a/lib/librte_mempool/rte_mempool_ops.c
+++ b/lib/librte_mempool/rte_mempool_ops.c
@@ -14,43 +14,95 @@
/* indirect jump table to support external memory pools. */
struct rte_mempool_ops_table rte_mempool_ops_table = {
.sl = RTE_SPINLOCK_INITIALIZER,
- .num_ops = 0
};
-/* add a new ops struct in rte_mempool_ops_table, return its index. */
-int
-rte_mempool_register_ops(const struct rte_mempool_ops *h)
+static int
+rte_mempool_register_shared_ops(const char *name)
{
- struct rte_mempool_ops *ops;
- int16_t ops_index;
+ static bool mempool_shared_ops_inited;
+ struct rte_mempool_shared_ops *shared_ops;
+ const struct rte_memzone *mz;
+ uint32_t ops_index = 0;
+
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
+ !mempool_shared_ops_inited) {
+
+ mz = rte_memzone_reserve("mempool_ops_shared",
+ sizeof(*shared_ops), 0, 0);
+ if (mz == NULL)
+ return -ENOMEM;
+
+ shared_ops = mz->addr;
+ shared_ops->num_mempool_ops = 0;
+ rte_spinlock_init(&shared_ops->mempool);
+
+ mempool_shared_ops_inited = true;
+ } else {
+ mz = rte_memzone_lookup("mempool_ops_shared");
+ if (mz == NULL)
+ return -ENOMEM;
+
+ shared_ops = mz->addr;
+ }
- rte_spinlock_lock(&rte_mempool_ops_table.sl);
+ rte_spinlock_lock(&shared_ops->mempool);
- if (rte_mempool_ops_table.num_ops >=
- RTE_MEMPOOL_MAX_OPS_IDX) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
+ if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
+ rte_spinlock_unlock(&shared_ops->mempool);
RTE_LOG(ERR, MEMPOOL,
"Maximum number of mempool ops structs exceeded\n");
return -ENOSPC;
}
+ while (shared_ops->mempool_ops[ops_index].name[0]) {
+ if (!strcmp(name, shared_ops->mempool_ops[ops_index].name)) {
+ rte_spinlock_unlock(&shared_ops->mempool);
+ return ops_index;
+ }
+
+ ops_index++;
+ }
+
+ strlcpy(shared_ops->mempool_ops[ops_index].name, name,
+ sizeof(shared_ops->mempool_ops[0].name));
+
+ shared_ops->num_mempool_ops++;
+
+ rte_spinlock_unlock(&shared_ops->mempool);
+ return ops_index;
+}
+
+/* add a new ops struct in rte_mempool_ops_table, return its index. */
+int
+rte_mempool_register_ops(const struct rte_mempool_ops *h)
+{
+ struct rte_mempool_ops *ops;
+ int16_t ops_index;
+
if (h->alloc == NULL || h->enqueue == NULL ||
- h->dequeue == NULL || h->get_count == NULL) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
+ h->dequeue == NULL || h->get_count == NULL) {
RTE_LOG(ERR, MEMPOOL,
"Missing callback while registering mempool ops\n");
+ rte_errno = EINVAL;
return -EINVAL;
}
if (strlen(h->name) >= sizeof(ops->name) - 1) {
- rte_spinlock_unlock(&rte_mempool_ops_table.sl);
- RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
- __func__, h->name);
+ RTE_LOG(ERR, MEMPOOL,
+ "The registering mempool_ops <%s>: name too long\n",
+ h->name);
rte_errno = EEXIST;
return -EEXIST;
}
- ops_index = rte_mempool_ops_table.num_ops++;
+ ops_index = rte_mempool_register_shared_ops(h->name);
+ if (ops_index < 0) {
+ rte_errno = -ops_index;
+ return ops_index;
+ }
+
+ rte_spinlock_lock(&rte_mempool_ops_table.sl);
+
ops = &rte_mempool_ops_table.ops[ops_index];
strlcpy(ops->name, h->name, sizeof(ops->name));
ops->alloc = h->alloc;
@@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
if (mp->flags & MEMPOOL_F_POOL_CREATED)
return -EEXIST;
- for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
- if (!strcmp(name,
- rte_mempool_ops_table.ops[i].name)) {
+ for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
+ if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {
ops = &rte_mempool_ops_table.ops[i];
break;
}
--
1.8.3.1
* Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
2020-04-13 14:21 ` [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops xiangxia.m.yue
@ 2020-04-16 22:27 ` Thomas Monjalon
2020-04-27 8:03 ` Tonghao Zhang
2020-04-23 13:38 ` Andrew Rybchenko
1 sibling, 1 reply; 44+ messages in thread
From: Thomas Monjalon @ 2020-04-16 22:27 UTC (permalink / raw)
To: Tonghao Zhang
Cc: olivier.matz, arybchenko, gage.eads, artem.andreev, jerinj,
ndabilpuram, vattunuru, hemant.agrawal, david.marchand,
anatoly.burakov, bruce.richardson, dev, xiangxia.m.yue
13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> The order of mempool initiation affects mempool index in the
> rte_mempool_ops_table. For example, when building APPs with:
>
> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>
> The "bucket" mempool will be registered firstly, and its index
> in table is 0 while the index of "ring" mempool is 1. DPDK
> uses the mk/rte.app.mk to build APPs, and others, for example,
> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> The mempool lib linked in dpdk and Open vSwitch is different.
We are supposed to use pkg-config to link DPDK.
Does the problem appear between a DPDK compiled with meson
and an application linked with pkg-config information?
If the problem really needs to be solved,
the EAL patch (first of this series) needs to be discussed
and reviewed carefully. I don't imagine it being done in 20.05.
* Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
2020-04-16 22:27 ` Thomas Monjalon
@ 2020-04-27 8:03 ` Tonghao Zhang
2020-04-27 11:40 ` Thomas Monjalon
0 siblings, 1 reply; 44+ messages in thread
From: Tonghao Zhang @ 2020-04-27 8:03 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Olivier Matz, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Burakov, Anatoly, Bruce Richardson, dpdk-dev
On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > The order of mempool initiation affects mempool index in the
> > rte_mempool_ops_table. For example, when building APPs with:
> >
> > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> >
> > The "bucket" mempool will be registered firstly, and its index
> > in table is 0 while the index of "ring" mempool is 1. DPDK
> > uses the mk/rte.app.mk to build APPs, and others, for example,
> > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > The mempool lib linked in dpdk and Open vSwitch is different.
>
> We are supposed to use pkg-config to link DPDK.
> Does the problem appear between a DPDK compiled with meson
> and an application linked with pkg-config information?
>
> If the problem really needs to be solved,
> the EAL patch (first of this series) needs to be discussed
> and reviewed carefully. I don't imagine it being done in 20.05.
Should I stop updating the patches, or keep updating them until the
patch is ready and then decide whether it is applied?
>
>
--
Best regards, Tonghao
* Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
2020-04-27 8:03 ` Tonghao Zhang
@ 2020-04-27 11:40 ` Thomas Monjalon
2020-04-27 12:51 ` Tonghao Zhang
0 siblings, 1 reply; 44+ messages in thread
From: Thomas Monjalon @ 2020-04-27 11:40 UTC (permalink / raw)
To: Tonghao Zhang
Cc: Olivier Matz, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Burakov, Anatoly, Bruce Richardson, dpdk-dev
27/04/2020 10:03, Tonghao Zhang:
> On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > > The order of mempool initiation affects mempool index in the
> > > rte_mempool_ops_table. For example, when building APPs with:
> > >
> > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > >
> > > The "bucket" mempool will be registered firstly, and its index
> > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > The mempool lib linked in dpdk and Open vSwitch is different.
> >
> > We are supposed to use pkg-config to link DPDK.
> > Does the problem appear between a DPDK compiled with meson
> > and an application linked with pkg-config information?
> >
> > If the problem really needs to be solved,
> > the EAL patch (first of this series) needs to be discussed
> > and reviewed carefully. I don't imagine it being done in 20.05.
>
> will I stop update the patches or when the patch is ready and then
> decide applied or not?
As I said, it requires more discussion.
Please start by answering my question above.
* Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
2020-04-27 11:40 ` Thomas Monjalon
@ 2020-04-27 12:51 ` Tonghao Zhang
2020-04-28 13:22 ` Tonghao Zhang
0 siblings, 1 reply; 44+ messages in thread
From: Tonghao Zhang @ 2020-04-27 12:51 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Olivier Matz, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Burakov, Anatoly, Bruce Richardson, dpdk-dev
On Mon, Apr 27, 2020 at 7:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 27/04/2020 10:03, Tonghao Zhang:
> > On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > > > The order of mempool initiation affects mempool index in the
> > > > rte_mempool_ops_table. For example, when building APPs with:
> > > >
> > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > >
> > > > The "bucket" mempool will be registered firstly, and its index
> > > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > The mempool lib linked in dpdk and Open vSwitch is different.
> > >
> > > We are supposed to use pkg-config to link DPDK.
> > > Does the problem appear between a DPDK compiled with meson
> > > and an application linked with pkg-config information?
Hi Thomas,
The mempool library link order can trigger that problem, but the
problem is also triggered when the library is loaded dynamically.
as Olivier Matz said:
The fact that the ops index changes during mempool driver lifetime is
indeed frightening, especially knowning that this is a dynamic
registration that could happen at any moment in the life of the
application.
the message in https://mails.dpdk.org/archives/dev/2020-March/159354.html
> > > If the problem really needs to be solved,
> > > the EAL patch (first of this series) needs to be discussed
> > > and reviewed carefully. I don't imagine it being done in 20.05.
> >
> > will I stop update the patches or when the patch is ready and then
> > decide applied or not?
>
> As I said, it requires more discussion.
> Please start by answering my question above.
I got it, Thanks.
>
--
Best regards, Tonghao
* Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
2020-04-27 12:51 ` Tonghao Zhang
@ 2020-04-28 13:22 ` Tonghao Zhang
2020-05-04 7:42 ` Olivier Matz
0 siblings, 1 reply; 44+ messages in thread
From: Tonghao Zhang @ 2020-04-28 13:22 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Olivier Matz, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Burakov, Anatoly, Bruce Richardson, dpdk-dev
On Mon, Apr 27, 2020 at 8:51 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Mon, Apr 27, 2020 at 7:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 27/04/2020 10:03, Tonghao Zhang:
> > > On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > > > > The order of mempool initiation affects mempool index in the
> > > > > rte_mempool_ops_table. For example, when building APPs with:
> > > > >
> > > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > > >
> > > > > The "bucket" mempool will be registered firstly, and its index
> > > > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > > The mempool lib linked in dpdk and Open vSwitch is different.
> > > >
> > > > We are supposed to use pkg-config to link DPDK.
> > > > Does the problem appear between a DPDK compiled with meson
> > > > and an application linked with pkg-config information?
> Hi Thomas,
> The library mempool linked order can trigger that problem. but when
> the library is loaded
> dynamically, trigger that problem too.
> as Olivier Matz said:
> The fact that the ops index changes during mempool driver lifetime is
> indeed frightening, especially knowning that this is a dynamic
> registration that could happen at any moment in the life of the
> application.
>
> the message in https://mails.dpdk.org/archives/dev/2020-March/159354.html
Hi Thomas,
For the first patch, I think supporting a callback for other
libraries makes the code much cleaner at the EAL layer. Otherwise,
if EAL initialized each library directly, it would have to include
their header files.
Is there a better solution?
> > > > If the problem really needs to be solved,
> > > > the EAL patch (first of this series) needs to be discussed
> > > > and reviewed carefully. I don't imagine it being done in 20.05.
> > >
> > > will I stop update the patches or when the patch is ready and then
> > > decide applied or not?
> >
> > As I said, it requires more discussion.
> > Please start by answering my question above.
> I got it, Thanks.
> >
>
>
> --
> Best regards, Tonghao
--
Best regards, Tonghao
* Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
2020-04-28 13:22 ` Tonghao Zhang
@ 2020-05-04 7:42 ` Olivier Matz
2021-03-25 14:24 ` David Marchand
0 siblings, 1 reply; 44+ messages in thread
From: Olivier Matz @ 2020-05-04 7:42 UTC (permalink / raw)
To: Tonghao Zhang
Cc: Thomas Monjalon, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Burakov, Anatoly, Bruce Richardson, dpdk-dev
Hi,
On Tue, Apr 28, 2020 at 09:22:37PM +0800, Tonghao Zhang wrote:
> On Mon, Apr 27, 2020 at 8:51 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Mon, Apr 27, 2020 at 7:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 27/04/2020 10:03, Tonghao Zhang:
> > > > On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > >
> > > > > 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > > > > > The order of mempool initiation affects mempool index in the
> > > > > > rte_mempool_ops_table. For example, when building APPs with:
> > > > > >
> > > > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > > > >
> > > > > > The "bucket" mempool will be registered firstly, and its index
> > > > > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > > > The mempool lib linked in dpdk and Open vSwitch is different.
> > > > >
> > > > > We are supposed to use pkg-config to link DPDK.
> > > > > Does the problem appear between a DPDK compiled with meson
> > > > > and an application linked with pkg-config information?
> > Hi Thomas,
> > The library mempool linked order can trigger that problem. but when
> > the library is loaded
> > dynamically, trigger that problem too.
> > as Olivier Matz said:
> > The fact that the ops index changes during mempool driver lifetime is
> > indeed frightening, especially knowning that this is a dynamic
> > registration that could happen at any moment in the life of the
> > application.
> >
> > the message in https://mails.dpdk.org/archives/dev/2020-March/159354.html
> Hi Thomas,
> For first patch, I guess we support a callback for other library, it
> make the codes much cleaner
> at eal layer. Otherwise, if we init for library, we may include their
> header file.
> There is a better solution ?
To summarize my understanding of the issue encountered by Tonghao:
Currently, it is not possible to call memzone_register() from an init
function (registered with RTE_INIT()). This is needed if we want to
store the list of registered mempool ops in a shared memory, available
from multiprocess.
Tonghao's patch 1/2 solves this issue. I tried to find alternatives
to this approach, but none of them seems satisfying:
- use RTE_PMD_REGISTER_VDEV() and rte_vdev_add_custom_scan() instead of
RTE_INIT() in the MEMPOOL_REGISTER_OPS() macro to delay the
initialization until after eal_init(). This looks too complex (I made
a POC of it, if someone is interested).
- synchronize mempool ops in shared memory when mempool_create() is
called in the primary: this would probably work most of the time, but
it is not a perfect solution as we cannot ensure that the primary
application will create a mempool before the secondary comes up.
- introduce a mandatory call to rte_mempool_lib_init(): although it is the
usual way to initialize libs, this would break compatibility.
> > > > > If the problem really needs to be solved,
> > > > > the EAL patch (first of this series) needs to be discussed
> > > > > and reviewed carefully. I don't imagine it being done in 20.05.
> > > >
OK, let's discuss it once 20.05 is out.
Regards,
Olivier
* Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
2020-05-04 7:42 ` Olivier Matz
@ 2021-03-25 14:24 ` David Marchand
0 siblings, 0 replies; 44+ messages in thread
From: David Marchand @ 2021-03-25 14:24 UTC (permalink / raw)
To: Olivier Matz, Tonghao Zhang
Cc: Thomas Monjalon, Andrew Rybchenko, Gage Eads, Artem V. Andreev,
Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
Burakov, Anatoly, Bruce Richardson, dpdk-dev
On Mon, May 4, 2020 at 9:42 AM Olivier Matz <olivier.matz@6wind.com> wrote:
>
> Hi,
>
> On Tue, Apr 28, 2020 at 09:22:37PM +0800, Tonghao Zhang wrote:
> > On Mon, Apr 27, 2020 at 8:51 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > On Mon, Apr 27, 2020 at 7:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > >
> > > > 27/04/2020 10:03, Tonghao Zhang:
> > > > > On Fri, Apr 17, 2020 at 6:27 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > > >
> > > > > > 13/04/2020 16:21, xiangxia.m.yue@gmail.com:
> > > > > > > The order of mempool initiation affects mempool index in the
> > > > > > > rte_mempool_ops_table. For example, when building APPs with:
> > > > > > >
> > > > > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> > > > > > >
> > > > > > > The "bucket" mempool will be registered firstly, and its index
> > > > > > > in table is 0 while the index of "ring" mempool is 1. DPDK
> > > > > > > uses the mk/rte.app.mk to build APPs, and others, for example,
> > > > > > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > > > > > > The mempool lib linked in dpdk and Open vSwitch is different.
> > > > > >
> > > > > > We are supposed to use pkg-config to link DPDK.
> > > > > > Does the problem appear between a DPDK compiled with meson
> > > > > > and an application linked with pkg-config information?
> > > Hi Thomas,
> > > The library mempool linked order can trigger that problem. but when
> > > the library is loaded
> > > dynamically, trigger that problem too.
> > > as Olivier Matz said:
> > > The fact that the ops index changes during mempool driver lifetime is
> > > indeed frightening, especially knowning that this is a dynamic
> > > registration that could happen at any moment in the life of the
> > > application.
> > >
> > > the message in https://mails.dpdk.org/archives/dev/2020-March/159354.html
> > Hi Thomas,
> > For first patch, I guess we support a callback for other library, it
> > make the codes much cleaner
> > at eal layer. Otherwise, if we init for library, we may include their
> > header file.
> > There is a better solution ?
>
> To summarize my understanding of the issu encountered by Tonghao:
>
> Currently, it is not possible to call memzone_register() from an init
> function (registered with RTE_INIT()). This is needed if we want to
> store the list of registered mempool ops in a shared memory, available
> from multiprocess.
>
> Tonghao's patch 1/2 solves this issue. I tried to find alternatives
> to this approach, but none of them seems satisfying:
>
> - use RTE_PMD_REGISTER_VDEV() and rte_vdev_add_custom_scan() instead of
> RTE_INIT() in the MEMPOOL_REGISTER_OPS() macro to delay the
> initialization after eal_init(). This looks too complex (I made a POC
> of it, it someone is interested).
>
> - synchronize mempool ops in shared memory when mempool_create() is
> called in the primary: this would probably works most of the time, but
> it is not a perfect solution as we cannot ensure that the primary
> application will create a mempool before the secondary comes up.
>
> - introduce a mandatory call to rte_mempool_lib_init(): despite it's the
> usual way to initialize libs, this will break compatibility.
>
> > > > > > If the problem really needs to be solved,
> > > > > > the EAL patch (first of this series) needs to be discussed
> > > > > > and reviewed carefully. I don't imagine it being done in 20.05.
> > > > >
>
> OK, let's discuss it once 20.05 is out.
>
Any news on this topic?
Is this issue still a problem?
--
David Marchand
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
2020-04-13 14:21 ` [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops xiangxia.m.yue
2020-04-16 22:27 ` Thomas Monjalon
@ 2020-04-23 13:38 ` Andrew Rybchenko
2020-04-27 5:23 ` Tonghao Zhang
1 sibling, 1 reply; 44+ messages in thread
From: Andrew Rybchenko @ 2020-04-23 13:38 UTC (permalink / raw)
To: xiangxia.m.yue, olivier.matz, gage.eads, artem.andreev, jerinj,
ndabilpuram, vattunuru, hemant.agrawal, david.marchand,
anatoly.burakov, bruce.richardson
Cc: dev
On 4/13/20 5:21 PM, xiangxia.m.yue@gmail.com wrote:
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> The order of mempool initialization affects the mempool index in the
> rte_mempool_ops_table. For example, when building apps with:
>
> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
>
> The "bucket" mempool will be registered first, and its index
> in the table is 0, while the index of the "ring" mempool is 1. DPDK
> uses mk/rte.app.mk to build apps, while others, for example
> Open vSwitch, link against libdpdk.a or libdpdk.so.
> The mempool link order in DPDK and Open vSwitch thus differs.
>
> A mempool can be shared between primary and secondary processes,
> such as dpdk-pdump and a pdump-enabled PMD/Open vSwitch.
> A crash occurs because dpdk-pdump creates the "ring_mp_mc"
> ring, whose index in its table is 0, while in Open vSwitch
> index 0 belongs to the "bucket" ops. When Open vSwitch uses index 0
> to get the mempool ops and allocate memory from the mempool, it crashes:
>
> bucket_dequeue (access null and crash)
> rte_mempool_get_ops (should get "ring_mp_mc",
> but get "bucket" mempool)
> rte_mempool_ops_dequeue_bulk
> ...
> rte_pktmbuf_alloc
> rte_pktmbuf_copy
> pdump_copy
> pdump_rx
> rte_eth_rx_burst
>
> To avoid the crash, there are several possible solutions:
> * constructor priority: different mempool drivers use different
> priorities in RTE_INIT, but that is not easy to maintain.
>
> * change mk/rte.app.mk: change the order in mk/rte.app.mk to
> be the same as in libdpdk.a/libdpdk.so, but when adding a new mempool
> driver in the future, we must take care of the order.
>
> * register mempools in order: sort the mempool ops when registering,
> so the link order will not affect the index in the mempool table;
> but the number of linked mempool libraries may still differ.
>
> * shared memzone: the primary process allocates a struct in
> shared memory (a memzone). When we register mempool ops,
> we first get the name and id from the shared struct: with the lock held,
> look up the registered name and return its index; otherwise,
> take the next id and copy the name into the struct.
>
> Previous discussion: https://mails.dpdk.org/archives/dev/2020-March/159354.html
>
> Suggested-by: Olivier Matz <olivier.matz@6wind.com>
> Suggested-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> ---
> v2:
> * fix checkpatch warning
> ---
> lib/librte_mempool/rte_mempool.h | 28 +++++++++++-
> lib/librte_mempool/rte_mempool_ops.c | 89 ++++++++++++++++++++++++++++--------
> 2 files changed, 96 insertions(+), 21 deletions(-)
>
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index c90cf31467b2..2709b9e1d51b 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -50,6 +50,7 @@
> #include <rte_ring.h>
> #include <rte_memcpy.h>
> #include <rte_common.h>
> +#include <rte_init.h>
>
> #ifdef __cplusplus
> extern "C" {
> @@ -678,7 +679,6 @@ struct rte_mempool_ops {
> */
> struct rte_mempool_ops_table {
> rte_spinlock_t sl; /**< Spinlock for add/delete. */
> - uint32_t num_ops; /**< Number of used ops structs in the table. */
> /**
> * Storage for all possible ops structs.
> */
> @@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> */
> int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
>
> +struct rte_mempool_shared_ops {
> + size_t num_mempool_ops;
Is there any specific reason to change type from uint32_t used
above to size_t? I think that uint32_t is better here since
it is just a number, not a size of memory or related value.
> + struct {
> + char name[RTE_MEMPOOL_OPS_NAMESIZE];
> + } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> +
> + rte_spinlock_t mempool;
> +};
> +
> +static inline int
> +mempool_ops_register_cb(const void *arg)
> +{
> + const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
> +
> + return rte_mempool_register_ops(h);
> +}
> +
> +static inline void
> +mempool_ops_register(const struct rte_mempool_ops *ops)
> +{
> + rte_init_register(mempool_ops_register_cb, (const void *)ops,
> + RTE_INIT_PRE);
> +}
> +
> /**
> * Macro to statically register the ops of a mempool handler.
> * Note that the rte_mempool_register_ops fails silently here when
> @@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> #define MEMPOOL_REGISTER_OPS(ops) \
> RTE_INIT(mp_hdlr_init_##ops) \
> { \
> - rte_mempool_register_ops(&ops); \
> + mempool_ops_register(&ops); \
> }
>
> /**
> diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> index 22c5251eb068..b10fda662db6 100644
> --- a/lib/librte_mempool/rte_mempool_ops.c
> +++ b/lib/librte_mempool/rte_mempool_ops.c
> @@ -14,43 +14,95 @@
> /* indirect jump table to support external memory pools. */
> struct rte_mempool_ops_table rte_mempool_ops_table = {
> .sl = RTE_SPINLOCK_INITIALIZER,
> - .num_ops = 0
> };
>
> -/* add a new ops struct in rte_mempool_ops_table, return its index. */
> -int
> -rte_mempool_register_ops(const struct rte_mempool_ops *h)
> +static int
> +rte_mempool_register_shared_ops(const char *name)
> {
> - struct rte_mempool_ops *ops;
> - int16_t ops_index;
> + static bool mempool_shared_ops_inited;
> + struct rte_mempool_shared_ops *shared_ops;
> + const struct rte_memzone *mz;
> + uint32_t ops_index = 0;
> +
I think we should sanity check 'name' here to make sure it is not an
empty string (see review notes below).
> + if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> + !mempool_shared_ops_inited) {
> +
> + mz = rte_memzone_reserve("mempool_ops_shared",
> + sizeof(*shared_ops), 0, 0);
> + if (mz == NULL)
> + return -ENOMEM;
> +
> + shared_ops = mz->addr;
> + shared_ops->num_mempool_ops = 0;
> + rte_spinlock_init(&shared_ops->mempool);
> +
> + mempool_shared_ops_inited = true;
> + } else {
> + mz = rte_memzone_lookup("mempool_ops_shared");
> + if (mz == NULL)
> + return -ENOMEM;
> +
> + shared_ops = mz->addr;
> + }
>
> - rte_spinlock_lock(&rte_mempool_ops_table.sl);
> + rte_spinlock_lock(&shared_ops->mempool);
>
> - if (rte_mempool_ops_table.num_ops >=
> - RTE_MEMPOOL_MAX_OPS_IDX) {
> - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> + if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
> + rte_spinlock_unlock(&shared_ops->mempool);
> RTE_LOG(ERR, MEMPOOL,
> "Maximum number of mempool ops structs exceeded\n");
> return -ENOSPC;
> }
>
> + while (shared_ops->mempool_ops[ops_index].name[0]) {
Please compare with '\0', as the DPDK style guide says.
> + if (!strcmp(name, shared_ops->mempool_ops[ops_index].name)) {
> + rte_spinlock_unlock(&shared_ops->mempool);
> + return ops_index;
> + }
> +
> + ops_index++;
> + }
> +
> + strlcpy(shared_ops->mempool_ops[ops_index].name, name,
> + sizeof(shared_ops->mempool_ops[0].name));
> +
> + shared_ops->num_mempool_ops++;
> +
> + rte_spinlock_unlock(&shared_ops->mempool);
> + return ops_index;
> +}
> +
> +/* add a new ops struct in rte_mempool_ops_table, return its index. */
> +int
> +rte_mempool_register_ops(const struct rte_mempool_ops *h)
> +{
> + struct rte_mempool_ops *ops;
> + int16_t ops_index;
> +
> if (h->alloc == NULL || h->enqueue == NULL ||
> - h->dequeue == NULL || h->get_count == NULL) {
> - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> + h->dequeue == NULL || h->get_count == NULL) {
Changing the formatting just makes review a bit harder.
> RTE_LOG(ERR, MEMPOOL,
> "Missing callback while registering mempool ops\n");
> + rte_errno = EINVAL;
Why is it done in this patch? To me it looks like a logically
different change, if it is really required (it should be properly motivated).
> return -EINVAL;
> }
>
> if (strlen(h->name) >= sizeof(ops->name) - 1) {
> - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> - RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
> - __func__, h->name);
> + RTE_LOG(ERR, MEMPOOL,
> + "The registering mempool_ops <%s>: name too long\n",
> + h->name);
Why do you change from DEBUG to ERR here? It is not
directly related to the purpose of the patch.
> rte_errno = EEXIST;
> return -EEXIST;
> }
>
> - ops_index = rte_mempool_ops_table.num_ops++;
> + ops_index = rte_mempool_register_shared_ops(h->name);
> + if (ops_index < 0) {
> + rte_errno = -ops_index;
> + return ops_index;
> + }
> +
> + rte_spinlock_lock(&rte_mempool_ops_table.sl);
> +
> ops = &rte_mempool_ops_table.ops[ops_index];
> strlcpy(ops->name, h->name, sizeof(ops->name));
> ops->alloc = h->alloc;
> @@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> if (mp->flags & MEMPOOL_F_POOL_CREATED)
> return -EEXIST;
>
> - for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> - if (!strcmp(name,
> - rte_mempool_ops_table.ops[i].name)) {
> + for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
> + if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {
Since rte_mempool_ops_table is zero-filled, the name
string is empty by default. So, a request with an empty name
will match the first unused entry. I guess that is not what we
want here. I think we should handle the empty string before the
loop and return -EINVAL.
> ops = &rte_mempool_ops_table.ops[i];
> break;
> }
>
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
2020-04-23 13:38 ` Andrew Rybchenko
@ 2020-04-27 5:23 ` Tonghao Zhang
0 siblings, 0 replies; 44+ messages in thread
From: Tonghao Zhang @ 2020-04-27 5:23 UTC (permalink / raw)
To: Andrew Rybchenko
Cc: Olivier Matz, Gage Eads, Artem V. Andreev, Jerin Jacob,
Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal,
David Marchand, Burakov, Anatoly, Bruce Richardson, dpdk-dev
On Thu, Apr 23, 2020 at 9:38 PM Andrew Rybchenko
<arybchenko@solarflare.com> wrote:
>
> On 4/13/20 5:21 PM, xiangxia.m.yue@gmail.com wrote:
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > The order of mempool initialization affects the mempool index in the
> > rte_mempool_ops_table. For example, when building apps with:
> >
> > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> >
> > The "bucket" mempool will be registered first, and its index
> > in the table is 0, while the index of the "ring" mempool is 1. DPDK
> > uses mk/rte.app.mk to build apps, while others, for example
> > Open vSwitch, link against libdpdk.a or libdpdk.so.
> > The mempool link order in DPDK and Open vSwitch thus differs.
> >
> > A mempool can be shared between primary and secondary processes,
> > such as dpdk-pdump and a pdump-enabled PMD/Open vSwitch.
> > A crash occurs because dpdk-pdump creates the "ring_mp_mc"
> > ring, whose index in its table is 0, while in Open vSwitch
> > index 0 belongs to the "bucket" ops. When Open vSwitch uses index 0
> > to get the mempool ops and allocate memory from the mempool, it crashes:
> >
> > bucket_dequeue (access null and crash)
> > rte_mempool_get_ops (should get "ring_mp_mc",
> > but get "bucket" mempool)
> > rte_mempool_ops_dequeue_bulk
> > ...
> > rte_pktmbuf_alloc
> > rte_pktmbuf_copy
> > pdump_copy
> > pdump_rx
> > rte_eth_rx_burst
> >
> > To avoid the crash, there are several possible solutions:
> > * constructor priority: different mempool drivers use different
> > priorities in RTE_INIT, but that is not easy to maintain.
> >
> > * change mk/rte.app.mk: change the order in mk/rte.app.mk to
> > be the same as in libdpdk.a/libdpdk.so, but when adding a new mempool
> > driver in the future, we must take care of the order.
> >
> > * register mempools in order: sort the mempool ops when registering,
> > so the link order will not affect the index in the mempool table;
> > but the number of linked mempool libraries may still differ.
> >
> > * shared memzone: the primary process allocates a struct in
> > shared memory (a memzone). When we register mempool ops,
> > we first get the name and id from the shared struct: with the lock held,
> > look up the registered name and return its index; otherwise,
> > take the next id and copy the name into the struct.
> >
> > Previous discussion: https://mails.dpdk.org/archives/dev/2020-March/159354.html
> >
> > Suggested-by: Olivier Matz <olivier.matz@6wind.com>
> > Suggested-by: Jerin Jacob <jerinj@marvell.com>
> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > ---
> > v2:
> > * fix checkpatch warning
> > ---
> > lib/librte_mempool/rte_mempool.h | 28 +++++++++++-
> > lib/librte_mempool/rte_mempool_ops.c | 89 ++++++++++++++++++++++++++++--------
> > 2 files changed, 96 insertions(+), 21 deletions(-)
> >
> > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> > index c90cf31467b2..2709b9e1d51b 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -50,6 +50,7 @@
> > #include <rte_ring.h>
> > #include <rte_memcpy.h>
> > #include <rte_common.h>
> > +#include <rte_init.h>
> >
> > #ifdef __cplusplus
> > extern "C" {
> > @@ -678,7 +679,6 @@ struct rte_mempool_ops {
> > */
> > struct rte_mempool_ops_table {
> > rte_spinlock_t sl; /**< Spinlock for add/delete. */
> > - uint32_t num_ops; /**< Number of used ops structs in the table. */
> > /**
> > * Storage for all possible ops structs.
> > */
> > @@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> > */
> > int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
> >
> > +struct rte_mempool_shared_ops {
> > + size_t num_mempool_ops;
>
> Is there any specific reason to change type from uint32_t used
> above to size_t? I think that uint32_t is better here since
> it is just a number, not a size of memory or related value.
Thanks for your review. I was busy with other patches, so this was delayed.
I will change that in the next version.
> > + struct {
> > + char name[RTE_MEMPOOL_OPS_NAMESIZE];
> > + } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> > +
> > + rte_spinlock_t mempool;
> > +};
> > +
> > +static inline int
> > +mempool_ops_register_cb(const void *arg)
> > +{
> > + const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
> > +
> > + return rte_mempool_register_ops(h);
> > +}
> > +
> > +static inline void
> > +mempool_ops_register(const struct rte_mempool_ops *ops)
> > +{
> > + rte_init_register(mempool_ops_register_cb, (const void *)ops,
> > + RTE_INIT_PRE);
> > +}
> > +
> > /**
> > * Macro to statically register the ops of a mempool handler.
> > * Note that the rte_mempool_register_ops fails silently here when
> > @@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> > #define MEMPOOL_REGISTER_OPS(ops) \
> > RTE_INIT(mp_hdlr_init_##ops) \
> > { \
> > - rte_mempool_register_ops(&ops); \
> > + mempool_ops_register(&ops); \
> > }
> >
> > /**
> > diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> > index 22c5251eb068..b10fda662db6 100644
> > --- a/lib/librte_mempool/rte_mempool_ops.c
> > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > @@ -14,43 +14,95 @@
> > /* indirect jump table to support external memory pools. */
> > struct rte_mempool_ops_table rte_mempool_ops_table = {
> > .sl = RTE_SPINLOCK_INITIALIZER,
> > - .num_ops = 0
> > };
> >
> > -/* add a new ops struct in rte_mempool_ops_table, return its index. */
> > -int
> > -rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > +static int
> > +rte_mempool_register_shared_ops(const char *name)
> > {
> > - struct rte_mempool_ops *ops;
> > - int16_t ops_index;
> > + static bool mempool_shared_ops_inited;
> > + struct rte_mempool_shared_ops *shared_ops;
> > + const struct rte_memzone *mz;
> > + uint32_t ops_index = 0;
> > +
>
> I think we should sanity check 'name' here to make sure it is not an
> empty string (see review notes below).
OK
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> > + !mempool_shared_ops_inited) {
> > +
> > + mz = rte_memzone_reserve("mempool_ops_shared",
> > + sizeof(*shared_ops), 0, 0);
> > + if (mz == NULL)
> > + return -ENOMEM;
> > +
> > + shared_ops = mz->addr;
> > + shared_ops->num_mempool_ops = 0;
> > + rte_spinlock_init(&shared_ops->mempool);
> > +
> > + mempool_shared_ops_inited = true;
> > + } else {
> > + mz = rte_memzone_lookup("mempool_ops_shared");
> > + if (mz == NULL)
> > + return -ENOMEM;
> > +
> > + shared_ops = mz->addr;
> > + }
> >
> > - rte_spinlock_lock(&rte_mempool_ops_table.sl);
> > + rte_spinlock_lock(&shared_ops->mempool);
> >
> > - if (rte_mempool_ops_table.num_ops >=
> > - RTE_MEMPOOL_MAX_OPS_IDX) {
> > - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > + if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
> > + rte_spinlock_unlock(&shared_ops->mempool);
> > RTE_LOG(ERR, MEMPOOL,
> > "Maximum number of mempool ops structs exceeded\n");
> > return -ENOSPC;
> > }
> >
> > + while (shared_ops->mempool_ops[ops_index].name[0]) {
>
> Please compare with '\0', as the DPDK style guide says.
>
> > + if (!strcmp(name, shared_ops->mempool_ops[ops_index].name)) {
> > + rte_spinlock_unlock(&shared_ops->mempool);
> > + return ops_index;
> > + }
> > +
> > + ops_index++;
> > + }
> > +
> > + strlcpy(shared_ops->mempool_ops[ops_index].name, name,
> > + sizeof(shared_ops->mempool_ops[0].name));
> > +
> > + shared_ops->num_mempool_ops++;
> > +
> > + rte_spinlock_unlock(&shared_ops->mempool);
> > + return ops_index;
> > +}
> > +
> > +/* add a new ops struct in rte_mempool_ops_table, return its index. */
> > +int
> > +rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > +{
> > + struct rte_mempool_ops *ops;
> > + int16_t ops_index;
> > +
> > if (h->alloc == NULL || h->enqueue == NULL ||
> > - h->dequeue == NULL || h->get_count == NULL) {
> > - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > + h->dequeue == NULL || h->get_count == NULL) {
>
> Changing the formatting just makes review a bit harder.
>
> > RTE_LOG(ERR, MEMPOOL,
> > "Missing callback while registering mempool ops\n");
> > + rte_errno = EINVAL;
>
> Why is it done in this patch? To me it looks like a logically
> different change, if it is really required (it should be properly motivated).
I guess this should set the error code in rte_errno? But I am fine with
not doing that.
> > return -EINVAL;
> > }
> >
> > if (strlen(h->name) >= sizeof(ops->name) - 1) {
> > - rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > - RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
> > - __func__, h->name);
> > + RTE_LOG(ERR, MEMPOOL,
> > + "The registering mempool_ops <%s>: name too long\n",
> > + h->name);
>
> Why do you change from DEBUG to ERR here? It is not
> directly related to the purpose of the patch.
Yes, because it is an error, so I changed the log level. Should I change
it in a separate patch?
> > rte_errno = EEXIST;
> > return -EEXIST;
> > }
> >
> > - ops_index = rte_mempool_ops_table.num_ops++;
> > + ops_index = rte_mempool_register_shared_ops(h->name);
> > + if (ops_index < 0) {
> > + rte_errno = -ops_index;
> > + return ops_index;
> > + }
> > +
> > + rte_spinlock_lock(&rte_mempool_ops_table.sl);
> > +
> > ops = &rte_mempool_ops_table.ops[ops_index];
> > strlcpy(ops->name, h->name, sizeof(ops->name));
> > ops->alloc = h->alloc;
> > @@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> > if (mp->flags & MEMPOOL_F_POOL_CREATED)
> > return -EEXIST;
> >
> > - for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> > - if (!strcmp(name,
> > - rte_mempool_ops_table.ops[i].name)) {
> > + for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
> > + if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {
>
> Since rte_mempool_ops_table is zero-filled, the name
> string is empty by default. So, a request with an empty name
> will match the first unused entry. I guess that is not what we
> want here. I think we should handle the empty string before the
> loop and return -EINVAL.
OK, thanks.
> > ops = &rte_mempool_ops_table.ops[i];
> > break;
> > }
> >
>
--
Best regards, Tonghao
^ permalink raw reply [flat|nested] 44+ messages in thread