From: Tonghao Zhang
To: Andrew Rybchenko
Cc: Olivier Matz, Gage Eads, "Artem V. Andreev", Jerin Jacob, Nithin Dabilpuram, Vamsi Attunuru, Hemant Agrawal, David Marchand, "Burakov, Anatoly", Bruce Richardson, dpdk-dev
Date: Mon, 27 Apr 2020 13:23:17 +0800
Subject: Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
List-Id: DPDK patches and discussions

On Thu, Apr 23, 2020 at 9:38 PM Andrew Rybchenko wrote:
>
> On 4/13/20 5:21 PM, xiangxia.m.yue@gmail.com wrote:
> > From: Tonghao Zhang
> >
> > The order of mempool initialization affects the mempool index in
> > rte_mempool_ops_table. For example, when building APPs with:
> >
> >   $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> >
> > the "bucket" mempool ops will be registered first, and its index
> > in the table is 0, while the index of the "ring" ops is 1. DPDK
> > uses mk/rte.app.mk to build APPs, while others, for example
> > Open vSwitch, link against libdpdk.a or libdpdk.so instead, so
> > the mempool link order in DPDK and in Open vSwitch differs.
> >
> > A mempool can be shared between primary and secondary processes,
> > such as dpdk-pdump and pdump-pmd/Open vSwitch (pdump enabled).
> > There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > ring, whose index in its table is 0, while index 0 in Open vSwitch
> > belongs to the "bucket" ops. If Open vSwitch uses index 0 to look
> > up the mempool ops and allocate memory from the mempool, the crash
> > occurs:
> >
> >   bucket_dequeue (accesses NULL and crashes)
> >   rte_mempool_get_ops (should get "ring_mp_mc",
> >                        but gets the "bucket" ops)
> >   rte_mempool_ops_dequeue_bulk
> >   ...
> >   rte_pktmbuf_alloc
> >   rte_pktmbuf_copy
> >   pdump_copy
> >   pdump_rx
> >   rte_eth_rx_burst
> >
> > To avoid the crash, there are several possible solutions:
> > * constructor priority: each mempool driver uses a different
> >   priority in RTE_INIT, but this is not easy to maintain.
> >
> > * change mk/rte.app.mk: change the order in mk/rte.app.mk to
> >   match libdpdk.a/libdpdk.so, but whenever a new mempool driver
> >   is added in the future, we must keep the order in sync.
> >
> > * register mempool ops in order: sort the ops when registering,
> >   so the libraries linked do not affect the index in the mempool
> >   table; but the number of linked mempool libraries may differ.
> >
> > * shared memzone: the primary process allocates a struct in
> >   shared memory (a memzone). When we register a mempool ops,
> >   we first get a name and id from the shared struct: with the
> >   lock held, look up the registered name and return its index if
> >   found; otherwise take the next id and copy the name into the
> >   struct.
> >
> > Previous discussion: https://mails.dpdk.org/archives/dev/2020-March/159354.html
> >
> > Suggested-by: Olivier Matz
> > Suggested-by: Jerin Jacob
> > Signed-off-by: Tonghao Zhang
> > ---
> > v2:
> > * fix checkpatch warning
> > ---
> >  lib/librte_mempool/rte_mempool.h     | 28 +++++++++++-
> >  lib/librte_mempool/rte_mempool_ops.c | 89 ++++++++++++++++++++++++++++--------
> >  2 files changed, 96 insertions(+), 21 deletions(-)
> >
> > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> > index c90cf31467b2..2709b9e1d51b 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -50,6 +50,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  #ifdef __cplusplus
> >  extern "C" {
> > @@ -678,7 +679,6 @@ struct rte_mempool_ops {
> >   */
> >  struct rte_mempool_ops_table {
> >  	rte_spinlock_t sl;	/**< Spinlock for add/delete. */
> > -	uint32_t num_ops;	/**< Number of used ops structs in the table. */
> >  	/**
> >  	 * Storage for all possible ops structs.
> >  	 */
> > @@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> >   */
> >  int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
> >
> > +struct rte_mempool_shared_ops {
> > +	size_t num_mempool_ops;
>
> Is there any specific reason to change the type from uint32_t used
> above to size_t? I think that uint32_t is better here since
> it is just a number, not a size of memory or a related value.

Thanks for your review. I was busy with another patch, so this was
delayed. I will change that in the next version.
> > +	struct {
> > +		char name[RTE_MEMPOOL_OPS_NAMESIZE];
> > +	} mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> > +
> > +	rte_spinlock_t mempool;
> > +};
> > +
> > +static inline int
> > +mempool_ops_register_cb(const void *arg)
> > +{
> > +	const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
> > +
> > +	return rte_mempool_register_ops(h);
> > +}
> > +
> > +static inline void
> > +mempool_ops_register(const struct rte_mempool_ops *ops)
> > +{
> > +	rte_init_register(mempool_ops_register_cb, (const void *)ops,
> > +			  RTE_INIT_PRE);
> > +}
> > +
> >  /**
> >   * Macro to statically register the ops of a mempool handler.
> >   * Note that the rte_mempool_register_ops fails silently here when
> > @@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
> >  #define MEMPOOL_REGISTER_OPS(ops)				\
> >  	RTE_INIT(mp_hdlr_init_##ops)				\
> >  	{							\
> > -		rte_mempool_register_ops(&ops);			\
> > +		mempool_ops_register(&ops);			\
> >  	}
> >
> >  /**
> > diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> > index 22c5251eb068..b10fda662db6 100644
> > --- a/lib/librte_mempool/rte_mempool_ops.c
> > +++ b/lib/librte_mempool/rte_mempool_ops.c
> > @@ -14,43 +14,95 @@
> >  /* indirect jump table to support external memory pools. */
> >  struct rte_mempool_ops_table rte_mempool_ops_table = {
> >  	.sl = RTE_SPINLOCK_INITIALIZER,
> > -	.num_ops = 0
> >  };
> >
> > -/* add a new ops struct in rte_mempool_ops_table, return its index. */
> > -int
> > -rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > +static int
> > +rte_mempool_register_shared_ops(const char *name)
> >  {
> > -	struct rte_mempool_ops *ops;
> > -	int16_t ops_index;
> > +	static bool mempool_shared_ops_inited;
> > +	struct rte_mempool_shared_ops *shared_ops;
> > +	const struct rte_memzone *mz;
> > +	uint32_t ops_index = 0;
> > +
>
> I think we should sanity check 'name' here to be not an
> empty string (see review notes below).
OK

> > +	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> > +	    !mempool_shared_ops_inited) {
> > +
> > +		mz = rte_memzone_reserve("mempool_ops_shared",
> > +					 sizeof(*shared_ops), 0, 0);
> > +		if (mz == NULL)
> > +			return -ENOMEM;
> > +
> > +		shared_ops = mz->addr;
> > +		shared_ops->num_mempool_ops = 0;
> > +		rte_spinlock_init(&shared_ops->mempool);
> > +
> > +		mempool_shared_ops_inited = true;
> > +	} else {
> > +		mz = rte_memzone_lookup("mempool_ops_shared");
> > +		if (mz == NULL)
> > +			return -ENOMEM;
> > +
> > +		shared_ops = mz->addr;
> > +	}
> >
> > -	rte_spinlock_lock(&rte_mempool_ops_table.sl);
> > +	rte_spinlock_lock(&shared_ops->mempool);
> >
> > -	if (rte_mempool_ops_table.num_ops >=
> > -		RTE_MEMPOOL_MAX_OPS_IDX) {
> > -		rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > +	if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
> > +		rte_spinlock_unlock(&shared_ops->mempool);
> >  		RTE_LOG(ERR, MEMPOOL,
> >  			"Maximum number of mempool ops structs exceeded\n");
> >  		return -ENOSPC;
> >  	}
> >
> > +	while (shared_ops->mempool_ops[ops_index].name[0]) {
>
> Please, compare with '\0' as the DPDK style guide says.
>
> > +		if (!strcmp(name, shared_ops->mempool_ops[ops_index].name)) {
> > +			rte_spinlock_unlock(&shared_ops->mempool);
> > +			return ops_index;
> > +		}
> > +
> > +		ops_index++;
> > +	}
> > +
> > +	strlcpy(shared_ops->mempool_ops[ops_index].name, name,
> > +		sizeof(shared_ops->mempool_ops[0].name));
> > +
> > +	shared_ops->num_mempool_ops++;
> > +
> > +	rte_spinlock_unlock(&shared_ops->mempool);
> > +	return ops_index;
> > +}
> > +
> > +/* add a new ops struct in rte_mempool_ops_table, return its index.
> >   */
> > +int
> > +rte_mempool_register_ops(const struct rte_mempool_ops *h)
> > +{
> > +	struct rte_mempool_ops *ops;
> > +	int16_t ops_index;
> > +
> >  	if (h->alloc == NULL || h->enqueue == NULL ||
> > -	    h->dequeue == NULL || h->get_count == NULL) {
> > -		rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > +	    h->dequeue == NULL || h->get_count == NULL) {
>
> Changing formatting just makes review a bit harder.
>
> >  		RTE_LOG(ERR, MEMPOOL,
> >  			"Missing callback while registering mempool ops\n");
> > +		rte_errno = EINVAL;
>
> Why is it done in the patch? For me it looks like a logically
> different change, if it is really required (it should be properly
> motivated).

I guess we should add the error code to rte_errno?
But I am fine with not doing that.

> >  		return -EINVAL;
> >  	}
> >
> >  	if (strlen(h->name) >= sizeof(ops->name) - 1) {
> > -		rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> > -		RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
> > -				__func__, h->name);
> > +		RTE_LOG(ERR, MEMPOOL,
> > +			"The registering mempool_ops <%s>: name too long\n",
> > +			h->name);
>
> Why do you change from DEBUG to ERR here? It is not
> directly related to the purpose of the patch.

Yes, because it is an error, so I changed the log level.
Should I change it in a separate patch?
> >  		rte_errno = EEXIST;
> >  		return -EEXIST;
> >  	}
> >
> > -	ops_index = rte_mempool_ops_table.num_ops++;
> > +	ops_index = rte_mempool_register_shared_ops(h->name);
> > +	if (ops_index < 0) {
> > +		rte_errno = -ops_index;
> > +		return ops_index;
> > +	}
> > +
> > +	rte_spinlock_lock(&rte_mempool_ops_table.sl);
> > +
> >  	ops = &rte_mempool_ops_table.ops[ops_index];
> >  	strlcpy(ops->name, h->name, sizeof(ops->name));
> >  	ops->alloc = h->alloc;
> > @@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
> >  	if (mp->flags & MEMPOOL_F_POOL_CREATED)
> >  		return -EEXIST;
> >
> > -	for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> > -		if (!strcmp(name,
> > -				rte_mempool_ops_table.ops[i].name)) {
> > +	for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
> > +		if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {
>
> Since rte_mempool_ops_table is filled in with zeros, the
> name string is empty by default. So, a request with an empty name
> will match the first unused entry. I guess it is not what we
> want here. I think we should handle the empty string before the
> loop and return -EINVAL.

OK, thanks.

> >  			ops = &rte_mempool_ops_table.ops[i];
> >  			break;
> >  		}
> >

--
Best regards, Tonghao