Date: Thu, 17 Oct 2019 09:54:34 +0200
From: Olivier Matz
To: "Ananyev, Konstantin"
Cc: "dev@dpdk.org", Thomas Monjalon, "Wang, Haiyue", Stephen Hemminger,
 Andrew Rybchenko, "Wiles, Keith", Jerin Jacob Kollanukkaran
Subject: Re: [dpdk-dev] [PATCH] mbuf: support dynamic fields and flags
Message-ID: <20191017075434.dk4flyktbbe3lxxd@platinum>
References: <20190710092907.5565-1-olivier.matz@6wind.com>
 <20190918165448.22409-1-olivier.matz@6wind.com>
 <2601191342CEEE43887BDE71AB977258019196E0B7@irsmsx105.ger.corp.intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB977258019196E0B7@irsmsx105.ger.corp.intel.com>

Hi Konstantin,

Thanks for the feedback. Please see my answers below.

On Tue, Oct 01, 2019 at 10:49:39AM +0000, Ananyev, Konstantin wrote:
> Hi Olivier,
>
> > Many features require to store data inside the mbuf. As the room in mbuf
> > structure is limited, it is not possible to have a field for each
> > feature. Also, changing fields in the mbuf structure can break the API
> > or ABI.
> >
> > This commit addresses these issues, by enabling the dynamic registration
> > of fields or flags:
> >
> > - a dynamic field is a named area in the rte_mbuf structure, with a
> >   given size (>= 1 byte) and alignment constraint.
> > - a dynamic flag is a named bit in the rte_mbuf structure.
> >
> > The typical use case is a PMD that registers space for an offload
> > feature, when the application requests to enable this feature. As
> > the space in mbuf is limited, the space should only be reserved if it
> > is going to be used (i.e when the application explicitly asks for it).
> >
> > The registration can be done at any moment, but it is not possible
> > to unregister fields or flags for now.
>
> Looks ok to me in general.
> Some comments/suggestions inline.
> Konstantin
>
> >
> > Signed-off-by: Olivier Matz
> > Acked-by: Thomas Monjalon
> > ---
> >
> > rfc -> v1
> >
> > * Rebase on top of master
> > * Change registration API to use a structure instead of
> >   variables, getting rid of #defines (Stephen's comment)
> > * Update flag registration to use a similar API as fields.
> > * Change max name length from 32 to 64 (sugg. by Thomas)
> > * Enhance API documentation (Haiyue's and Andrew's comments)
> > * Add a debug log at registration
> > * Add some words in release note
> > * Did some performance tests (sugg. by Andrew):
> >   On my platform, reading a dynamic field takes ~3 cycles more
> >   than a static field, and ~2 cycles more for writing.
> >
> >  app/test/test_mbuf.c                   | 114 ++++++-
> >  doc/guides/rel_notes/release_19_11.rst |   7 +
> >  lib/librte_mbuf/Makefile               |   2 +
> >  lib/librte_mbuf/meson.build            |   6 +-
> >  lib/librte_mbuf/rte_mbuf.h             |  25 +-
> >  lib/librte_mbuf/rte_mbuf_dyn.c         | 408 +++++++++++++++++++++++++
> >  lib/librte_mbuf/rte_mbuf_dyn.h         | 163 ++++++++++
> >  lib/librte_mbuf/rte_mbuf_version.map   |   4 +
> >  8 files changed, 724 insertions(+), 5 deletions(-)
> >  create mode 100644 lib/librte_mbuf/rte_mbuf_dyn.c
> >  create mode 100644 lib/librte_mbuf/rte_mbuf_dyn.h
> >
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -198,9 +198,12 @@ extern "C" {
> >  #define PKT_RX_OUTER_L4_CKSUM_GOOD (1ULL << 22)
> >  #define PKT_RX_OUTER_L4_CKSUM_INVALID ((1ULL << 21) | (1ULL << 22))
> >
> > -/* add new RX flags here */
> > +/* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> >
> > -/* add new TX flags here */
> > +#define PKT_FIRST_FREE (1ULL << 23)
> > +#define PKT_LAST_FREE (1ULL << 39)
> > +
> > +/* add new TX flags here, don't forget to update PKT_LAST_FREE */
> >
> >  /**
> >   * Indicate that the metadata field in the mbuf is in use.
> > @@ -738,6 +741,8 @@ struct rte_mbuf {
> >  	 */
> >  	struct rte_mbuf_ext_shared_info *shinfo;
> >
> > +	uint64_t dynfield1; /**< Reserved for dynamic fields. */
> > +	uint64_t dynfield2; /**< Reserved for dynamic fields. */
>
> Wonder why just not one field:
> 	union {
> 		uint8_t u8[16];
> 		...
> 		uint64_t u64[2];
> 	} dyn_field1;
> ?
> Probably would be a bit handy, to refer, register, etc. no?

I didn't find any place where we need an access through u8, so I
just changed it into uint64_t dynfield1[2].

>
> >  } __rte_cache_aligned;
> >
> >  /**
> > @@ -1684,6 +1689,21 @@ rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
> >   */
> >  #define rte_pktmbuf_detach_extbuf(m) rte_pktmbuf_detach(m)
> >
> > +/**
> > + * Copy dynamic fields from m_src to m_dst.
> > + *
> > + * @param m_dst
> > + *   The destination mbuf.
> > + * @param m_src
> > + *   The source mbuf.
> > + */
> > +static inline void
> > +rte_mbuf_dynfield_copy(struct rte_mbuf *m_dst, const struct rte_mbuf *m_src)
> > +{
> > +	m_dst->dynfield1 = m_src->dynfield1;
> > +	m_dst->dynfield2 = m_src->dynfield2;
> > +}
> > +
> >  /**
> >   * Attach packet mbuf to another packet mbuf.
> >   *
> > @@ -1732,6 +1752,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
> >  	mi->vlan_tci_outer = m->vlan_tci_outer;
> >  	mi->tx_offload = m->tx_offload;
> >  	mi->hash = m->hash;
> > +	rte_mbuf_dynfield_copy(mi, m);
> >
> >  	mi->next = NULL;
> >  	mi->pkt_len = mi->data_len;
> > diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
> > new file mode 100644
> > index 000000000..13b8742d0
> > --- /dev/null
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.c
> > @@ -0,0 +1,408 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright 2019 6WIND S.A.
> > + */
> > +
> > +#include
> > +
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +
> > +#define RTE_MBUF_DYN_MZNAME "rte_mbuf_dyn"
> > +
> > +struct mbuf_dynfield_elt {
> > +	TAILQ_ENTRY(mbuf_dynfield_elt) next;
> > +	struct rte_mbuf_dynfield params;
> > +	int offset;
>
> Why not 'size_t offset', to avoid any explicit conversions, etc?

Fixed.

> > +};
> > +TAILQ_HEAD(mbuf_dynfield_list, rte_tailq_entry);
> > +
> > +static struct rte_tailq_elem mbuf_dynfield_tailq = {
> > +	.name = "RTE_MBUF_DYNFIELD",
> > +};
> > +EAL_REGISTER_TAILQ(mbuf_dynfield_tailq);
> > +
> > +struct mbuf_dynflag_elt {
> > +	TAILQ_ENTRY(mbuf_dynflag_elt) next;
> > +	struct rte_mbuf_dynflag params;
> > +	int bitnum;
> > +};
> > +TAILQ_HEAD(mbuf_dynflag_list, rte_tailq_entry);
> > +
> > +static struct rte_tailq_elem mbuf_dynflag_tailq = {
> > +	.name = "RTE_MBUF_DYNFLAG",
> > +};
> > +EAL_REGISTER_TAILQ(mbuf_dynflag_tailq);
> > +
> > +struct mbuf_dyn_shm {
> > +	/** For each mbuf byte, free_space[i] == 1 if space is free. */
> > +	uint8_t free_space[sizeof(struct rte_mbuf)];
> > +	/** Bitfield of available flags. */
> > +	uint64_t free_flags;
> > +};
> > +static struct mbuf_dyn_shm *shm;
> > +
> > +/* allocate and initialize the shared memory */
> > +static int
> > +init_shared_mem(void)
> > +{
> > +	const struct rte_memzone *mz;
> > +	uint64_t mask;
> > +
> > +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > +		mz = rte_memzone_reserve_aligned(RTE_MBUF_DYN_MZNAME,
> > +						sizeof(struct mbuf_dyn_shm),
> > +						SOCKET_ID_ANY, 0,
> > +						RTE_CACHE_LINE_SIZE);
> > +	} else {
> > +		mz = rte_memzone_lookup(RTE_MBUF_DYN_MZNAME);
> > +	}
> > +	if (mz == NULL)
> > +		return -1;
> > +
> > +	shm = mz->addr;
> > +
> > +#define mark_free(field) \
> > +	memset(&shm->free_space[offsetof(struct rte_mbuf, field)], \
> > +		0xff, sizeof(((struct rte_mbuf *)0)->field))
>
> I think you can avoid defining/unedifying macros here by something like that:
>
> static const struct {
> 	size_t offset;
> 	size_t size;
> } dyn_syms[] = {
> 	[0] = {.offset = offsetof(struct rte_mbuf, dynfield1), sizeof((struct rte_mbuf *)0)->dynfield1),
> 	[1] = {.offset = offsetof(struct rte_mbuf, dynfield2), sizeof((struct rte_mbuf *)0)->dynfield2),
> };
> ...
>
> for (i = 0; i != RTE_DIM(dyn_syms); i++)
> 	memset(shm->free_space + dym_syms[i].offset, UINT8_MAX, dym_syms[i].size);
>

I tried it, but the following lines are too long:

	[0] = {offsetof(struct rte_mbuf, dynfield1), sizeof((struct rte_mbuf *)0)->dynfield1),
	[1] = {offsetof(struct rte_mbuf, dynfield2), sizeof((struct rte_mbuf *)0)->dynfield2),

To make them shorter, we can use a macro... but... wait :)

> > +
> > +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > +		/* init free_space, keep it sync'd with
> > +		 * rte_mbuf_dynfield_copy().
> > +		 */
> > +		memset(shm, 0, sizeof(*shm));
> > +		mark_free(dynfield1);
> > +		mark_free(dynfield2);
> > +
> > +		/* init free_flags */
> > +		for (mask = PKT_FIRST_FREE; mask <= PKT_LAST_FREE; mask <<= 1)
> > +			shm->free_flags |= mask;
> > +	}
> > +#undef mark_free
> > +
> > +	return 0;
> > +}
> > +
> > +/* check if this offset can be used */
> > +static int
> > +check_offset(size_t offset, size_t size, size_t align, unsigned int flags)
> > +{
> > +	size_t i;
> > +
> > +	(void)flags;
>
>
> We have RTE_SET_USED() for such cases...
> Though as it is an internal function probably better not to introduce
> unused parameters at all.

I removed the flags parameter as you suggested.
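
For illustration only, here is a minimal self-contained sketch of the
free-byte map and the first-fit search discussed here, with the unused
flags parameter dropped from check_offset(). It is not the DPDK code:
struct fake_mbuf, init_free_space() and reserve_field() are hypothetical
names, and the 32-byte layout is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_mbuf: 16 static bytes followed
 * by a 16-byte area reserved for dynamic fields. */
struct fake_mbuf {
	uint8_t static_part[16];
	uint64_t dynfield1[2];
};

/* free_space[i] != 0 means that byte i of fake_mbuf is available. */
static uint8_t free_space[sizeof(struct fake_mbuf)];

/* Mark only the dynamic area as free, everything else as used. */
static void init_free_space(void)
{
	memset(free_space, 0, sizeof(free_space));
	memset(&free_space[offsetof(struct fake_mbuf, dynfield1)], 0xff,
	       sizeof(((struct fake_mbuf *)0)->dynfield1));
}

/* Return 0 if [offset, offset + size) is aligned, in bounds and free. */
static int check_offset(size_t offset, size_t size, size_t align)
{
	size_t i;

	/* the mask test below is only valid for a power-of-2 alignment */
	assert(align != 0 && (align & (align - 1)) == 0);

	if ((offset & (align - 1)) != 0)
		return -1;
	if (offset + size > sizeof(struct fake_mbuf))
		return -1;
	for (i = 0; i < size; i++) {
		if (!free_space[offset + i])
			return -1;
	}
	return 0;
}

/* First-fit reservation: return the offset, or -1 if no room is left. */
static int reserve_field(size_t size, size_t align)
{
	size_t offset, i;

	for (offset = 0; offset < sizeof(struct fake_mbuf); offset++) {
		if (check_offset(offset, size, align) == 0) {
			for (i = 0; i < size; i++)
				free_space[offset + i] = 0;
			return (int)offset;
		}
	}
	return -1;
}
```

With this layout, a 4-byte field with 4-byte alignment lands at offset
16, the first byte of the dynamic area, and later requests fill the
remaining holes in first-fit order.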
> > +
> > +	if ((offset & (align - 1)) != 0)
> > +		return -1;
> > +	if (offset + size > sizeof(struct rte_mbuf))
> > +		return -1;
> > +
> > +	for (i = 0; i < size; i++) {
> > +		if (!shm->free_space[i + offset])
> > +			return -1;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +/* assume tailq is locked */
> > +static struct mbuf_dynfield_elt *
> > +__mbuf_dynfield_lookup(const char *name)
> > +{
> > +	struct mbuf_dynfield_list *mbuf_dynfield_list;
> > +	struct mbuf_dynfield_elt *mbuf_dynfield;
> > +	struct rte_tailq_entry *te;
> > +
> > +	mbuf_dynfield_list = RTE_TAILQ_CAST(
> > +		mbuf_dynfield_tailq.head, mbuf_dynfield_list);
> > +
> > +	TAILQ_FOREACH(te, mbuf_dynfield_list, next) {
> > +		mbuf_dynfield = (struct mbuf_dynfield_elt *)te->data;
> > +		if (strcmp(name, mbuf_dynfield->params.name) == 0)
> > +			break;
> > +	}
> > +
> > +	if (te == NULL) {
> > +		rte_errno = ENOENT;
> > +		return NULL;
> > +	}
> > +
> > +	return mbuf_dynfield;
> > +}
> > +
> > +int
> > +rte_mbuf_dynfield_lookup(const char *name, struct rte_mbuf_dynfield *params)
> > +{
> > +	struct mbuf_dynfield_elt *mbuf_dynfield;
> > +
> > +	if (shm == NULL) {
> > +		rte_errno = ENOENT;
> > +		return -1;
> > +	}
> > +
> > +	rte_mcfg_tailq_read_lock();
> > +	mbuf_dynfield = __mbuf_dynfield_lookup(name);
> > +	rte_mcfg_tailq_read_unlock();
> > +
> > +	if (mbuf_dynfield == NULL) {
> > +		rte_errno = ENOENT;
> > +		return -1;
> > +	}
> > +
> > +	if (params != NULL)
> > +		memcpy(params, &mbuf_dynfield->params, sizeof(*params));
> > +
> > +	return mbuf_dynfield->offset;
> > +}
> > +
> > +static int mbuf_dynfield_cmp(const struct rte_mbuf_dynfield *params1,
> > +		const struct rte_mbuf_dynfield *params2)
> > +{
> > +	if (strcmp(params1->name, params2->name))
> > +		return -1;
> > +	if (params1->size != params2->size)
> > +		return -1;
> > +	if (params1->align != params2->align)
> > +		return -1;
> > +	if (params1->flags != params2->flags)
> > +		return -1;
> > +	return 0;
> > +}
> > +
> > +int
> > +rte_mbuf_dynfield_register(const struct rte_mbuf_dynfield *params)
>
> What I meant at user-space - if we can also have another function that would allow
> user to specify required offset for dynfield explicitly, then user can define it as constant
> value and let compiler do optimization work and hopefully generate faster code to access
> this field.
> Something like that:
>
> int rte_mbuf_dynfiled_register_offset(const struct rte_mbuf_dynfield *params, size_t offset);
>
> #define RTE_MBUF_DYNFIELD_OFFSET(fld, off) (offsetof(struct rte_mbuf, fld) + (off))
>
> And then somewhere in user code:
>
> /* to let say reserve first 4B in dynfield1*/
> #define MBUF_DYNFIELD_A RTE_MBUF_DYNFIELD_OFFSET(dynfiled1, 0)
> ...
> params.name = RTE_STR(MBUF_DYNFIELD_A);
> params.size = sizeof(uint32_t);
> params.align = sizeof(uint32_t);
> ret = rte_mbuf_dynfiled_register_offset(&params, MBUF_DYNFIELD_A);
> if (ret != MBUF_DYNFIELD_A) {
> 	/* handle it somehow, probably just terminate gracefully... */
> }
> ...
>
> /* to let say reserve last 2B in dynfield2*/
> #define MBUF_DYNFIELD_B RTE_MBUF_DYNFIELD_OFFSET(dynfiled2, 6)
> ...
> params.name = RTE_STR(MBUF_DYNFIELD_B);
> params.size = sizeof(uint16_t);
> params.align = sizeof(uint16_t);
> ret = rte_mbuf_dynfiled_register_offset(&params, MBUF_DYNFIELD_B);
>
> After that user can use constant offsets MBUF_DYNFIELD_A/ MBUF_DYNFIELD_B
> to access these fields.
> Same thoughts for DYNFLAG.

I added the feature in v2.

> > +	struct mbuf_dynfield_list *mbuf_dynfield_list;
> > +	struct mbuf_dynfield_elt *mbuf_dynfield = NULL;
> > +	struct rte_tailq_entry *te = NULL;
> > +	int offset, ret;
>
> size_t offset
> to avoid explicit conversions, etc.?
>

Fixed.

> > +	size_t i;
> > +
> > +	if (shm == NULL && init_shared_mem() < 0)
> > +		goto fail;
>
> As I understand, here you allocate/initialize your shm without any lock protection,
> though later you protect it via rte_mcfg_tailq_write_lock().
> That seems a bit flakey to me.
> Why not to store information about free dynfield bytes inside mbuf_dynfield_tailq?
> Let say at init() create and add an entry into that list with some reserved name.
> Then at register - grab mcfg_tailq_write_lock and do lookup
> for such entry and then read/update it as needed.
> It would help to avoid racing problem, plus you wouldn't need to
> allocate/lookup for memzone.

I don't quite like the idea of having a special entry with a different
type in an element list. Although it is simpler from a locking
perspective, it is less obvious for the developer.

Also, I changed the way a zone is reserved so that it returns the one
that has the least impact on the next reservation, and I feel this is
easier to implement with the shared memory.

So I just moved the init_shared_mem() call inside the section protected
by rte_mcfg_tailq_write_lock(); it should do the job.

> > +	if (params->size >= sizeof(struct rte_mbuf)) {
> > +		rte_errno = EINVAL;
> > +		goto fail;
> > +	}
> > +	if (!rte_is_power_of_2(params->align)) {
> > +		rte_errno = EINVAL;
> > +		goto fail;
> > +	}
> > +	if (params->flags != 0) {
> > +		rte_errno = EINVAL;
> > +		goto fail;
> > +	}
> > +
> > +	rte_mcfg_tailq_write_lock();
> > +
>
> I think it probably would be cleaner and easier to read/maintain, if you'll put actual
> code under lock protection into a separate function - as you did for __mbuf_dynfield_lookup().

Yes, I did that, it should be clearer now.
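
As a rough sketch of the locking pattern described here (lazy
initialization done with the write lock held, and the locked work moved
into a helper), assuming a plain pthread mutex instead of the tailq
rwlock. All names below (registry_lock, init_shared_state_locked(),
register_field_locked(), register_field()) are hypothetical and only
serve the example; this is not the v2 code.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;
static int shared_state_ready;	/* read and written with registry_lock held */
static int init_calls;		/* counts how often initialization really ran */
static int next_offset;

/* Assume registry_lock is held: initialize the shared state once. */
static int init_shared_state_locked(void)
{
	if (shared_state_ready)
		return 0;
	init_calls++;
	next_offset = 0;
	shared_state_ready = 1;
	return 0;
}

/* Assume registry_lock is held: every state change lives in this helper. */
static int register_field_locked(int size)
{
	int offset;

	assert(size > 0);
	if (init_shared_state_locked() < 0)
		return -1;
	offset = next_offset;
	next_offset += size;
	return offset;
}

/* Public entry point: take the lock, delegate, release the lock. */
int register_field(int size)
{
	int ret;

	pthread_mutex_lock(&registry_lock);
	ret = register_field_locked(size);
	pthread_mutex_unlock(&registry_lock);
	return ret;
}
```

Because the "is it initialized?" test and the initialization itself both
happen under the same lock, two concurrent callers cannot both run the
initialization, and the public function keeps a single lock/unlock pair.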
> > +	mbuf_dynfield = __mbuf_dynfield_lookup(params->name);
> > +	if (mbuf_dynfield != NULL) {
> > +		if (mbuf_dynfield_cmp(params, &mbuf_dynfield->params) < 0) {
> > +			rte_errno = EEXIST;
> > +			goto fail_unlock;
> > +		}
> > +		offset = mbuf_dynfield->offset;
> > +		goto out_unlock;
> > +	}
> > +
> > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > +		rte_errno = EPERM;
> > +		goto fail_unlock;
> > +	}
> > +
> > +	for (offset = 0;
> > +	     offset < (int)sizeof(struct rte_mbuf);
> > +	     offset++) {
> > +		if (check_offset(offset, params->size, params->align,
> > +				params->flags) == 0)
> > +			break;
> > +	}
> > +
> > +	if (offset == sizeof(struct rte_mbuf)) {
> > +		rte_errno = ENOENT;
> > +		goto fail_unlock;
> > +	}
> > +
> > +	mbuf_dynfield_list = RTE_TAILQ_CAST(
> > +		mbuf_dynfield_tailq.head, mbuf_dynfield_list);
> > +
> > +	te = rte_zmalloc("MBUF_DYNFIELD_TAILQ_ENTRY", sizeof(*te), 0);
> > +	if (te == NULL)
> > +		goto fail_unlock;
> > +
> > +	mbuf_dynfield = rte_zmalloc("mbuf_dynfield", sizeof(*mbuf_dynfield), 0);
> > +	if (mbuf_dynfield == NULL)
> > +		goto fail_unlock;
> > +
> > +	ret = strlcpy(mbuf_dynfield->params.name, params->name,
> > +		sizeof(mbuf_dynfield->params.name));
> > +	if (ret < 0 || ret >= (int)sizeof(mbuf_dynfield->params.name)) {
> > +		rte_errno = ENAMETOOLONG;
> > +		goto fail_unlock;
> > +	}
> > +	memcpy(&mbuf_dynfield->params, params, sizeof(mbuf_dynfield->params));
> > +	mbuf_dynfield->offset = offset;
> > +	te->data = mbuf_dynfield;
> > +
> > +	TAILQ_INSERT_TAIL(mbuf_dynfield_list, te, next);
> > +
> > +	for (i = offset; i < offset + params->size; i++)
> > +		shm->free_space[i] = 0;
> > +
> > +	RTE_LOG(DEBUG, MBUF, "Registered dynamic field %s (sz=%zu, al=%zu, fl=0x%x) -> %d\n",
> > +		params->name, params->size, params->align, params->flags,
> > +		offset);
> > +
> > +out_unlock:
> > +	rte_mcfg_tailq_write_unlock();
> > +
> > +	return offset;
> > +
> > +fail_unlock:
> > +	rte_mcfg_tailq_write_unlock();
> > +fail:
> > +	rte_free(mbuf_dynfield);
> > +	rte_free(te);
> > +	return -1;
> > +}
> > +
> > +/* assume tailq is locked */
> > +static struct mbuf_dynflag_elt *
> > +__mbuf_dynflag_lookup(const char *name)
> > +{
> > +	struct mbuf_dynflag_list *mbuf_dynflag_list;
> > +	struct mbuf_dynflag_elt *mbuf_dynflag;
> > +	struct rte_tailq_entry *te;
> > +
> > +	mbuf_dynflag_list = RTE_TAILQ_CAST(
> > +		mbuf_dynflag_tailq.head, mbuf_dynflag_list);
> > +
> > +	TAILQ_FOREACH(te, mbuf_dynflag_list, next) {
> > +		mbuf_dynflag = (struct mbuf_dynflag_elt *)te->data;
> > +		if (strncmp(name, mbuf_dynflag->params.name,
> > +				RTE_MBUF_DYN_NAMESIZE) == 0)
> > +			break;
> > +	}
> > +
> > +	if (te == NULL) {
> > +		rte_errno = ENOENT;
> > +		return NULL;
> > +	}
> > +
> > +	return mbuf_dynflag;
> > +}
> > +
> > +int
> > +rte_mbuf_dynflag_lookup(const char *name,
> > +			struct rte_mbuf_dynflag *params)
> > +{
> > +	struct mbuf_dynflag_elt *mbuf_dynflag;
> > +
> > +	if (shm == NULL) {
> > +		rte_errno = ENOENT;
> > +		return -1;
> > +	}
> > +
> > +	rte_mcfg_tailq_read_lock();
> > +	mbuf_dynflag = __mbuf_dynflag_lookup(name);
> > +	rte_mcfg_tailq_read_unlock();
> > +
> > +	if (mbuf_dynflag == NULL) {
> > +		rte_errno = ENOENT;
> > +		return -1;
> > +	}
> > +
> > +	if (params != NULL)
> > +		memcpy(params, &mbuf_dynflag->params, sizeof(*params));
> > +
> > +	return mbuf_dynflag->bitnum;
> > +}
> > +
> > +static int mbuf_dynflag_cmp(const struct rte_mbuf_dynflag *params1,
> > +		const struct rte_mbuf_dynflag *params2)
> > +{
> > +	if (strcmp(params1->name, params2->name))
> > +		return -1;
> > +	if (params1->flags != params2->flags)
> > +		return -1;
> > +	return 0;
> > +}
> > +
> > +int
> > +rte_mbuf_dynflag_register(const struct rte_mbuf_dynflag *params)
> > +{
> > +	struct mbuf_dynflag_list *mbuf_dynflag_list;
> > +	struct mbuf_dynflag_elt *mbuf_dynflag = NULL;
> > +	struct rte_tailq_entry *te = NULL;
> > +	int bitnum, ret;
> > +
> > +	if (shm == NULL && init_shared_mem() < 0)
> > +		goto fail;
> > +
> > +	rte_mcfg_tailq_write_lock();
> > +
> > +	mbuf_dynflag = __mbuf_dynflag_lookup(params->name);
> > +	if (mbuf_dynflag != NULL) {
> > +		if (mbuf_dynflag_cmp(params, &mbuf_dynflag->params) < 0) {
> > +			rte_errno = EEXIST;
> > +			goto fail_unlock;
> > +		}
> > +		bitnum = mbuf_dynflag->bitnum;
> > +		goto out_unlock;
> > +	}
> > +
> > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > +		rte_errno = EPERM;
> > +		goto fail_unlock;
> > +	}
> > +
> > +	if (shm->free_flags == 0) {
> > +		rte_errno = ENOENT;
> > +		goto fail_unlock;
> > +	}
> > +	bitnum = rte_bsf64(shm->free_flags);
> > +
> > +	mbuf_dynflag_list = RTE_TAILQ_CAST(
> > +		mbuf_dynflag_tailq.head, mbuf_dynflag_list);
> > +
> > +	te = rte_zmalloc("MBUF_DYNFLAG_TAILQ_ENTRY", sizeof(*te), 0);
> > +	if (te == NULL)
> > +		goto fail_unlock;
> > +
> > +	mbuf_dynflag = rte_zmalloc("mbuf_dynflag", sizeof(*mbuf_dynflag), 0);
> > +	if (mbuf_dynflag == NULL)
> > +		goto fail_unlock;
> > +
> > +	ret = strlcpy(mbuf_dynflag->params.name, params->name,
> > +		sizeof(mbuf_dynflag->params.name));
> > +	if (ret < 0 || ret >= (int)sizeof(mbuf_dynflag->params.name)) {
> > +		rte_errno = ENAMETOOLONG;
> > +		goto fail_unlock;
> > +	}
> > +	mbuf_dynflag->bitnum = bitnum;
> > +	te->data = mbuf_dynflag;
> > +
> > +	TAILQ_INSERT_TAIL(mbuf_dynflag_list, te, next);
> > +
> > +	shm->free_flags &= ~(1ULL << bitnum);
> > +
> > +	RTE_LOG(DEBUG, MBUF, "Registered dynamic flag %s (fl=0x%x) -> %u\n",
> > +		params->name, params->flags, bitnum);
> > +
> > +out_unlock:
> > +	rte_mcfg_tailq_write_unlock();
> > +
> > +	return bitnum;
> > +
> > +fail_unlock:
> > +	rte_mcfg_tailq_write_unlock();
> > +fail:
> > +	rte_free(mbuf_dynflag);
> > +	rte_free(te);
> > +	return -1;
> > +}
> > diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> > new file mode 100644
> > index 000000000..6e2c81654
> > --- /dev/null
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -0,0 +1,163 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright 2019 6WIND S.A.
> > + */
> > +
> > +#ifndef _RTE_MBUF_DYN_H_
> > +#define _RTE_MBUF_DYN_H_
> > +
> > +/**
> > + * @file
> > + * RTE Mbuf dynamic fields and flags
> > + *
> > + * Many features require to store data inside the mbuf. As the room in
> > + * mbuf structure is limited, it is not possible to have a field for
> > + * each feature. Also, changing fields in the mbuf structure can break
> > + * the API or ABI.
> > + *
> > + * This module addresses this issue, by enabling the dynamic
> > + * registration of fields or flags:
> > + *
> > + * - a dynamic field is a named area in the rte_mbuf structure, with a
> > + *   given size (>= 1 byte) and alignment constraint.
> > + * - a dynamic flag is a named bit in the rte_mbuf structure, stored
> > + *   in mbuf->ol_flags.
> > + *
> > + * The typical use case is when a specific offload feature requires to
> > + * register a dedicated offload field in the mbuf structure, and adding
> > + * a static field or flag is not justified.
> > + *
> > + * Example of use:
> > + *
> > + * - A rte_mbuf_dynfield structure is defined, containing the parameters
> > + *   of the dynamic field to be registered:
> > + *   const struct rte_mbuf_dynfield rte_dynfield_my_feature = { ... };
> > + * - The application initializes the PMD, and asks for this feature
> > + *   at port initialization by passing DEV_RX_OFFLOAD_MY_FEATURE in
> > + *   rxconf. This will make the PMD to register the field by calling
> > + *   rte_mbuf_dynfield_register(&rte_dynfield_my_feature). The PMD
> > + *   stores the returned offset.
> > + * - The application that uses the offload feature also registers
> > + *   the field to retrieve the same offset.
> > + * - When the PMD receives a packet, it can set the field:
> > + *   *RTE_MBUF_DYNFIELD(m, offset, ) = value;
> > + * - In the main loop, the application can retrieve the value with
> > + *   the same macro.
> > + *
> > + * To avoid wasting space, the dynamic fields or flags must only be
> > + * reserved on demand, when an application asks for the related feature.
> > + *
> > + * The registration can be done at any moment, but it is not possible
> > + * to unregister fields or flags for now.
> > + *
> > + * A dynamic field can be reserved and used by an application only.
> > + * It can for instance be a packet mark.
> > + */
> > +
> > +#include
> > +/**
> > + * Maximum length of the dynamic field or flag string.
> > + */
> > +#define RTE_MBUF_DYN_NAMESIZE 64
> > +
> > +/**
> > + * Structure describing the parameters of a mbuf dynamic field.
> > + */
> > +struct rte_mbuf_dynfield {
> > +	char name[RTE_MBUF_DYN_NAMESIZE]; /**< Name of the field. */
> > +	size_t size; /**< The number of bytes to reserve. */
> > +	size_t align; /**< The alignment constraint (power of 2). */
> > +	unsigned int flags; /**< Reserved for future use, must be 0. */
> > +};
> > +
> > +/**
> > + * Structure describing the parameters of a mbuf dynamic flag.
> > + */
> > +struct rte_mbuf_dynflag {
> > +	char name[RTE_MBUF_DYN_NAMESIZE]; /**< Name of the dynamic flag. */
> > +	unsigned int flags; /**< Reserved for future use, must be 0. */
> > +};
> > +
> > +/**
> > + * Register space for a dynamic field in the mbuf structure.
> > + *
> > + * If the field is already registered (same name and parameters), its
> > + * offset is returned.
> > + *
> > + * @param params
> > + *   A structure containing the requested parameters (name, size,
> > + *   alignment constraint and flags).
> > + * @return
> > + *   The offset in the mbuf structure, or -1 on error.
> > + *   Possible values for rte_errno:
> > + *   - EINVAL: invalid parameters (size, align, or flags).
> > + *   - EEXIST: this name is already register with different parameters.
> > + *   - EPERM: called from a secondary process.
> > + *   - ENOENT: not enough room in mbuf.
> > + *   - ENOMEM: allocation failure.
> > + *   - ENAMETOOLONG: name does not ends with \0.
> > + */
> > +__rte_experimental
> > +int rte_mbuf_dynfield_register(const struct rte_mbuf_dynfield *params);
> > +
> > +/**
> > + * Lookup for a registered dynamic mbuf field.
> > + *
> > + * @param name
> > + *   A string identifying the dynamic field.
> > + * @param params
> > + *   If not NULL, and if the lookup is successful, the structure is
> > + *   filled with the parameters of the dynamic field.
> > + * @return
> > + *   The offset of this field in the mbuf structure, or -1 on error.
> > + *   Possible values for rte_errno:
> > + *   - ENOENT: no dynamic field matches this name.
> > + */
> > +__rte_experimental
> > +int rte_mbuf_dynfield_lookup(const char *name,
> > +			struct rte_mbuf_dynfield *params);
> > +
> > +/**
> > + * Register a dynamic flag in the mbuf structure.
> > + *
> > + * If the flag is already registered (same name and parameters), its
> > + * offset is returned.
> > + *
> > + * @param params
> > + *   A structure containing the requested parameters of the dynamic
> > + *   flag (name and options).
> > + * @return
> > + *   The number of the reserved bit, or -1 on error.
> > + *   Possible values for rte_errno:
> > + *   - EINVAL: invalid parameters (size, align, or flags).
> > + *   - EEXIST: this name is already register with different parameters.
> > + *   - EPERM: called from a secondary process.
> > + *   - ENOENT: no more flag available.
> > + *   - ENOMEM: allocation failure.
> > + *   - ENAMETOOLONG: name is longer than RTE_MBUF_DYN_NAMESIZE - 1.
> > + */
> > +__rte_experimental
> > +int rte_mbuf_dynflag_register(const struct rte_mbuf_dynflag *params);
> > +
> > +/**
> > + * Lookup for a registered dynamic mbuf flag.
> > + *
> > + * @param name
> > + *   A string identifying the dynamic flag.
> > + * @param params
> > + *   If not NULL, and if the lookup is successful, the structure is
> > + *   filled with the parameters of the dynamic flag.
> > + * @return
> > + *   The offset of this flag in the mbuf structure, or -1 on error.
> > + *   Possible values for rte_errno:
> > + *   - ENOENT: no dynamic flag matches this name.
> > + */
> > +__rte_experimental
> > +int rte_mbuf_dynflag_lookup(const char *name,
> > +			struct rte_mbuf_dynflag *params);
> > +
> > +/**
> > + * Helper macro to access to a dynamic field.
> > + */
> > +#define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))
> > +
> > +#endif
> > diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
> > index 2662a37bf..a98310570 100644
> > --- a/lib/librte_mbuf/rte_mbuf_version.map
> > +++ b/lib/librte_mbuf/rte_mbuf_version.map
> > @@ -50,4 +50,8 @@ EXPERIMENTAL {
> >  global:
> >
> >  	rte_mbuf_check;
> > +	rte_mbuf_dynfield_lookup;
> > +	rte_mbuf_dynfield_register;
> > +	rte_mbuf_dynflag_lookup;
> > +	rte_mbuf_dynflag_register;
> >  } DPDK_18.08;
> > --
> > 2.20.1
>

I will send a v2 shortly.

Thanks,
Olivier
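
For readers who want to see how a registered offset and flag bit are
meant to be used together, here is a small stand-alone sketch. It does
not call DPDK: struct fake_mbuf, FAKE_MBUF_DYNFIELD(), fake_mbuf_alloc(),
set_mark() and get_mark() are hypothetical, and the macro only mirrors
the shape of the RTE_MBUF_DYNFIELD helper quoted in the patch (the type
argument is a pointer type). The offset and bit number would normally
come from the register or lookup functions; here they are passed in by
the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rte_mbuf, NOT the real layout. */
struct fake_mbuf {
	uint64_t ol_flags;
	uint64_t dynfield1[2];
};

/* Same shape as the quoted helper: 'type' must be a pointer type. */
#define FAKE_MBUF_DYNFIELD(m, offset, type) \
	((type)((uintptr_t)(m) + (offset)))

/* Heap-allocate a zeroed fake mbuf (NULL on failure). Allocated storage
 * has no declared type, so the typed store in set_mark() is what gives
 * those bytes their type. */
static struct fake_mbuf *fake_mbuf_alloc(void)
{
	return calloc(1, sizeof(struct fake_mbuf));
}

/* Producer side (e.g. a driver): store the mark, then raise the flag. */
static void set_mark(struct fake_mbuf *m, size_t offset, int bitnum,
		uint32_t mark)
{
	assert(bitnum >= 0 && bitnum < 64);
	assert(offset % sizeof(uint32_t) == 0);
	*FAKE_MBUF_DYNFIELD(m, offset, uint32_t *) = mark;
	m->ol_flags |= 1ULL << bitnum;
}

/* Consumer side (e.g. the application): read the mark only if the flag
 * says it is valid. Returns 0 on success, -1 if the flag is not set. */
static int get_mark(const struct fake_mbuf *m, size_t offset, int bitnum,
		uint32_t *mark)
{
	if ((m->ol_flags & (1ULL << bitnum)) == 0)
		return -1;
	*mark = *FAKE_MBUF_DYNFIELD(m, offset, const uint32_t *);
	return 0;
}
```

Pairing the field with a flag lets the consumer tell "mark is 0" apart
from "no mark was set", which the field value alone cannot express.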