From: Olivier Matz <olivier.matz@6wind.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
Thomas Monjalon <thomas@monjalon.net>,
"Wang, Haiyue" <haiyue.wang@intel.com>,
Stephen Hemminger <stephen@networkplumber.org>,
Andrew Rybchenko <arybchenko@solarflare.com>,
"Wiles, Keith" <keith.wiles@intel.com>,
Jerin Jacob Kollanukkaran <jerinj@marvell.com>
Subject: Re: [dpdk-dev] [PATCH] mbuf: support dynamic fields and flags
Date: Thu, 17 Oct 2019 09:54:34 +0200 [thread overview]
Message-ID: <20191017075434.dk4flyktbbe3lxxd@platinum> (raw)
In-Reply-To: <2601191342CEEE43887BDE71AB977258019196E0B7@irsmsx105.ger.corp.intel.com>
Hi Konstantin,
Thanks for the feedback. Please see my answers below.
On Tue, Oct 01, 2019 at 10:49:39AM +0000, Ananyev, Konstantin wrote:
> Hi Olivier,
>
> > Many features require to store data inside the mbuf. As the room in mbuf
> > structure is limited, it is not possible to have a field for each
> > feature. Also, changing fields in the mbuf structure can break the API
> > or ABI.
> >
> > This commit addresses these issues, by enabling the dynamic registration
> > of fields or flags:
> >
> > - a dynamic field is a named area in the rte_mbuf structure, with a
> > given size (>= 1 byte) and alignment constraint.
> > - a dynamic flag is a named bit in the rte_mbuf structure.
> >
> > The typical use case is a PMD that registers space for an offload
> > feature, when the application requests to enable this feature. As
> > the space in mbuf is limited, the space should only be reserved if it
> > is going to be used (i.e when the application explicitly asks for it).
> >
> > The registration can be done at any moment, but it is not possible
> > to unregister fields or flags for now.
>
> Looks ok to me in general.
> Some comments/suggestions inline.
> Konstantin
>
> >
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > Acked-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
> >
> > rfc -> v1
> >
> > * Rebase on top of master
> > * Change registration API to use a structure instead of
> > variables, getting rid of #defines (Stephen's comment)
> > * Update flag registration to use a similar API as fields.
> > * Change max name length from 32 to 64 (sugg. by Thomas)
> > * Enhance API documentation (Haiyue's and Andrew's comments)
> > * Add a debug log at registration
> > * Add some words in release note
> > * Did some performance tests (sugg. by Andrew):
> > On my platform, reading a dynamic field takes ~3 cycles more
> > than a static field, and ~2 cycles more for writing.
> >
> > app/test/test_mbuf.c | 114 ++++++-
> > doc/guides/rel_notes/release_19_11.rst | 7 +
> > lib/librte_mbuf/Makefile | 2 +
> > lib/librte_mbuf/meson.build | 6 +-
> > lib/librte_mbuf/rte_mbuf.h | 25 +-
> > lib/librte_mbuf/rte_mbuf_dyn.c | 408 +++++++++++++++++++++++++
> > lib/librte_mbuf/rte_mbuf_dyn.h | 163 ++++++++++
> > lib/librte_mbuf/rte_mbuf_version.map | 4 +
> > 8 files changed, 724 insertions(+), 5 deletions(-)
> > create mode 100644 lib/librte_mbuf/rte_mbuf_dyn.c
> > create mode 100644 lib/librte_mbuf/rte_mbuf_dyn.h
> >
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -198,9 +198,12 @@ extern "C" {
> > #define PKT_RX_OUTER_L4_CKSUM_GOOD (1ULL << 22)
> > #define PKT_RX_OUTER_L4_CKSUM_INVALID ((1ULL << 21) | (1ULL << 22))
> >
> > -/* add new RX flags here */
> > +/* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> >
> > -/* add new TX flags here */
> > +#define PKT_FIRST_FREE (1ULL << 23)
> > +#define PKT_LAST_FREE (1ULL << 39)
> > +
> > +/* add new TX flags here, don't forget to update PKT_LAST_FREE */
> >
> > /**
> > * Indicate that the metadata field in the mbuf is in use.
> > @@ -738,6 +741,8 @@ struct rte_mbuf {
> > */
> > struct rte_mbuf_ext_shared_info *shinfo;
> >
> > + uint64_t dynfield1; /**< Reserved for dynamic fields. */
> > + uint64_t dynfield2; /**< Reserved for dynamic fields. */
>
> Wonder why just not one field:
> union {
> uint8_t u8[16];
> ...
> uint64_t u64[2];
> } dyn_field1;
> ?
> Probably would be a bit handy, to refer, register, etc. no?
I didn't find any place where we need an access through u8, so I
just changed it into uint64_t dynfield1[2].
>
> > } __rte_cache_aligned;
> >
> > /**
> > @@ -1684,6 +1689,21 @@ rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
> > */
> > #define rte_pktmbuf_detach_extbuf(m) rte_pktmbuf_detach(m)
> >
> > +/**
> > + * Copy dynamic fields from m_src to m_dst.
> > + *
> > + * @param m_dst
> > + * The destination mbuf.
> > + * @param m_src
> > + * The source mbuf.
> > + */
> > +static inline void
> > +rte_mbuf_dynfield_copy(struct rte_mbuf *m_dst, const struct rte_mbuf *m_src)
> > +{
> > + m_dst->dynfield1 = m_src->dynfield1;
> > + m_dst->dynfield2 = m_src->dynfield2;
> > +}
> > +
> > /**
> > * Attach packet mbuf to another packet mbuf.
> > *
> > @@ -1732,6 +1752,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
> > mi->vlan_tci_outer = m->vlan_tci_outer;
> > mi->tx_offload = m->tx_offload;
> > mi->hash = m->hash;
> > + rte_mbuf_dynfield_copy(mi, m);
> >
> > mi->next = NULL;
> > mi->pkt_len = mi->data_len;
> > diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
> > new file mode 100644
> > index 000000000..13b8742d0
> > --- /dev/null
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.c
> > @@ -0,0 +1,408 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright 2019 6WIND S.A.
> > + */
> > +
> > +#include <sys/queue.h>
> > +
> > +#include <rte_common.h>
> > +#include <rte_eal.h>
> > +#include <rte_eal_memconfig.h>
> > +#include <rte_tailq.h>
> > +#include <rte_errno.h>
> > +#include <rte_malloc.h>
> > +#include <rte_string_fns.h>
> > +#include <rte_mbuf.h>
> > +#include <rte_mbuf_dyn.h>
> > +
> > +#define RTE_MBUF_DYN_MZNAME "rte_mbuf_dyn"
> > +
> > +struct mbuf_dynfield_elt {
> > + TAILQ_ENTRY(mbuf_dynfield_elt) next;
> > + struct rte_mbuf_dynfield params;
> > + int offset;
>
> Why not 'size_t offset', to avoid any explicit conversions, etc?
Fixed
> > +};
> > +TAILQ_HEAD(mbuf_dynfield_list, rte_tailq_entry);
> > +
> > +static struct rte_tailq_elem mbuf_dynfield_tailq = {
> > + .name = "RTE_MBUF_DYNFIELD",
> > +};
> > +EAL_REGISTER_TAILQ(mbuf_dynfield_tailq);
> > +
> > +struct mbuf_dynflag_elt {
> > + TAILQ_ENTRY(mbuf_dynflag_elt) next;
> > + struct rte_mbuf_dynflag params;
> > + int bitnum;
> > +};
> > +TAILQ_HEAD(mbuf_dynflag_list, rte_tailq_entry);
> > +
> > +static struct rte_tailq_elem mbuf_dynflag_tailq = {
> > + .name = "RTE_MBUF_DYNFLAG",
> > +};
> > +EAL_REGISTER_TAILQ(mbuf_dynflag_tailq);
> > +
> > +struct mbuf_dyn_shm {
> > + /** For each mbuf byte, free_space[i] == 1 if space is free. */
> > + uint8_t free_space[sizeof(struct rte_mbuf)];
> > + /** Bitfield of available flags. */
> > + uint64_t free_flags;
> > +};
> > +static struct mbuf_dyn_shm *shm;
> > +
> > +/* allocate and initialize the shared memory */
> > +static int
> > +init_shared_mem(void)
> > +{
> > + const struct rte_memzone *mz;
> > + uint64_t mask;
> > +
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > + mz = rte_memzone_reserve_aligned(RTE_MBUF_DYN_MZNAME,
> > + sizeof(struct mbuf_dyn_shm),
> > + SOCKET_ID_ANY, 0,
> > + RTE_CACHE_LINE_SIZE);
> > + } else {
> > + mz = rte_memzone_lookup(RTE_MBUF_DYN_MZNAME);
> > + }
> > + if (mz == NULL)
> > + return -1;
> > +
> > + shm = mz->addr;
> > +
> > +#define mark_free(field) \
> > + memset(&shm->free_space[offsetof(struct rte_mbuf, field)], \
> > + 0xff, sizeof(((struct rte_mbuf *)0)->field))
>
> I think you can avoid defining/unedifying macros here by something like that:
>
> static const struct {
> size_t offset;
> size_t size;
> } dyn_syms[] = {
> [0] = {.offset = offsetof(struct rte_mbuf, dynfield1), sizeof((struct rte_mbuf *)0)->dynfield1),
> [1] = {.offset = offsetof(struct rte_mbuf, dynfield2), sizeof((struct rte_mbuf *)0)->dynfield2),
> };
> ...
>
> for (i = 0; i != RTE_DIM(dyn_syms); i++)
> memset(shm->free_space + dym_syms[i].offset, UINT8_MAX, dym_syms[i].size);
>
I tried it, but the following lines are too long
[0] = {offsetof(struct rte_mbuf, dynfield1), sizeof((struct rte_mbuf *)0)->dynfield1),
[1] = {offsetof(struct rte_mbuf, dynfield2), sizeof((struct rte_mbuf *)0)->dynfield2),
To make them shorter, we can use a macro... but... wait :)
> > +
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > + /* init free_space, keep it sync'd with
> > + * rte_mbuf_dynfield_copy().
> > + */
> > + memset(shm, 0, sizeof(*shm));
> > + mark_free(dynfield1);
> > + mark_free(dynfield2);
> > +
> > + /* init free_flags */
> > + for (mask = PKT_FIRST_FREE; mask <= PKT_LAST_FREE; mask <<= 1)
> > + shm->free_flags |= mask;
> > + }
> > +#undef mark_free
> > +
> > + return 0;
> > +}
> > +
> > +/* check if this offset can be used */
> > +static int
> > +check_offset(size_t offset, size_t size, size_t align, unsigned int flags)
> > +{
> > + size_t i;
> > +
> > + (void)flags;
>
>
> We have RTE_SET_USED() for such cases...
> Though as it is an internal function probably better not to introduce
> unused parameters at all.
I removed the flag parameter as you suggested.
> > +
> > + if ((offset & (align - 1)) != 0)
> > + return -1;
> > + if (offset + size > sizeof(struct rte_mbuf))
> > + return -1;
> > +
> > + for (i = 0; i < size; i++) {
> > + if (!shm->free_space[i + offset])
> > + return -1;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/* assume tailq is locked */
> > +static struct mbuf_dynfield_elt *
> > +__mbuf_dynfield_lookup(const char *name)
> > +{
> > + struct mbuf_dynfield_list *mbuf_dynfield_list;
> > + struct mbuf_dynfield_elt *mbuf_dynfield;
> > + struct rte_tailq_entry *te;
> > +
> > + mbuf_dynfield_list = RTE_TAILQ_CAST(
> > + mbuf_dynfield_tailq.head, mbuf_dynfield_list);
> > +
> > + TAILQ_FOREACH(te, mbuf_dynfield_list, next) {
> > + mbuf_dynfield = (struct mbuf_dynfield_elt *)te->data;
> > + if (strcmp(name, mbuf_dynfield->params.name) == 0)
> > + break;
> > + }
> > +
> > + if (te == NULL) {
> > + rte_errno = ENOENT;
> > + return NULL;
> > + }
> > +
> > + return mbuf_dynfield;
> > +}
> > +
> > +int
> > +rte_mbuf_dynfield_lookup(const char *name, struct rte_mbuf_dynfield *params)
> > +{
> > + struct mbuf_dynfield_elt *mbuf_dynfield;
> > +
> > + if (shm == NULL) {
> > + rte_errno = ENOENT;
> > + return -1;
> > + }
> > +
> > + rte_mcfg_tailq_read_lock();
> > + mbuf_dynfield = __mbuf_dynfield_lookup(name);
> > + rte_mcfg_tailq_read_unlock();
> > +
> > + if (mbuf_dynfield == NULL) {
> > + rte_errno = ENOENT;
> > + return -1;
> > + }
> > +
> > + if (params != NULL)
> > + memcpy(params, &mbuf_dynfield->params, sizeof(*params));
> > +
> > + return mbuf_dynfield->offset;
> > +}
> > +
> > +static int mbuf_dynfield_cmp(const struct rte_mbuf_dynfield *params1,
> > + const struct rte_mbuf_dynfield *params2)
> > +{
> > + if (strcmp(params1->name, params2->name))
> > + return -1;
> > + if (params1->size != params2->size)
> > + return -1;
> > + if (params1->align != params2->align)
> > + return -1;
> > + if (params1->flags != params2->flags)
> > + return -1;
> > + return 0;
> > +}
> > +
> > +int
> > +rte_mbuf_dynfield_register(const struct rte_mbuf_dynfield *params)
>
> What I meant at user-space - if we can also have another function that would allow
> user to specify required offset for dynfield explicitly, then user can define it as constant
> value and let compiler do optimization work and hopefully generate faster code to access
> this field.
> Something like that:
>
> int rte_mbuf_dynfiled_register_offset(const struct rte_mbuf_dynfield *params, size_t offset);
>
> #define RTE_MBUF_DYNFIELD_OFFSET(fld, off) (offsetof(struct rte_mbuf, fld) + (off))
>
> And then somewhere in user code:
>
> /* to let say reserve first 4B in dynfield1*/
> #define MBUF_DYNFIELD_A RTE_MBUF_DYNFIELD_OFFSET(dynfiled1, 0)
> ...
> params.name = RTE_STR(MBUF_DYNFIELD_A);
> params.size = sizeof(uint32_t);
> params.align = sizeof(uint32_t);
> ret = rte_mbuf_dynfiled_register_offset(¶ms, MBUF_DYNFIELD_A);
> if (ret != MBUF_DYNFIELD_A) {
> /* handle it somehow, probably just terminate gracefully... */
> }
> ...
>
> /* to let say reserve last 2B in dynfield2*/
> #define MBUF_DYNFIELD_B RTE_MBUF_DYNFIELD_OFFSET(dynfiled2, 6)
> ...
> params.name = RTE_STR(MBUF_DYNFIELD_B);
> params.size = sizeof(uint16_t);
> params.align = sizeof(uint16_t);
> ret = rte_mbuf_dynfiled_register_offset(¶ms, MBUF_DYNFIELD_B);
>
> After that user can use constant offsets MBUF_DYNFIELD_A/ MBUF_DYNFIELD_B
> to access these fields.
> Same thoughts for DYNFLAG.
I added the feature in v2.
> > + struct mbuf_dynfield_list *mbuf_dynfield_list;
> > + struct mbuf_dynfield_elt *mbuf_dynfield = NULL;
> > + struct rte_tailq_entry *te = NULL;
> > + int offset, ret;
>
> size_t offset
> to avoid explicit conversions, etc.?
>
Fixed.
> > + size_t i;
> > +
> > + if (shm == NULL && init_shared_mem() < 0)
> > + goto fail;
>
> As I understand, here you allocate/initialize your shm without any lock protection,
> though later you protect it via rte_mcfg_tailq_write_lock().
> That seems a bit flakey to me.
> Why not to store information about free dynfield bytes inside mbuf_dynfield_tailq?
> Let say at init() create and add an entry into that list with some reserved name.
> Then at register - grab mcfg_tailq_write_lock and do lookup
> for such entry and then read/update it as needed.
> It would help to avoid racing problem, plus you wouldn't need to
> allocate/lookup for memzone.
I don't quite like the idea of having a special entry with a different type
in an element list. Despite it is simpler for a locking perspective, it is
less obvious for the developper.
Also, I changed the way a zone is reserved to return the one that have the
less impact on next reservation, and I feel it is easier to implement with
the shared memory.
So, I just moved the init_shared_mem() inside the rte_mcfg_tailq_write_lock(),
it should do the job.
> > + if (params->size >= sizeof(struct rte_mbuf)) {
> > + rte_errno = EINVAL;
> > + goto fail;
> > + }
> > + if (!rte_is_power_of_2(params->align)) {
> > + rte_errno = EINVAL;
> > + goto fail;
> > + }
> > + if (params->flags != 0) {
> > + rte_errno = EINVAL;
> > + goto fail;
> > + }
> > +
> > + rte_mcfg_tailq_write_lock();
> > +
>
> I think it probably would be cleaner and easier to read/maintain, if you'll put actual
> code under lock protection into a separate function - as you did for __mbuf_dynfield_lookup().
Yes, I did that, it should be clearer now.
> > + mbuf_dynfield = __mbuf_dynfield_lookup(params->name);
> > + if (mbuf_dynfield != NULL) {
> > + if (mbuf_dynfield_cmp(params, &mbuf_dynfield->params) < 0) {
> > + rte_errno = EEXIST;
> > + goto fail_unlock;
> > + }
> > + offset = mbuf_dynfield->offset;
> > + goto out_unlock;
> > + }
> > +
> > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > + rte_errno = EPERM;
> > + goto fail_unlock;
> > + }
> > +
> > + for (offset = 0;
> > + offset < (int)sizeof(struct rte_mbuf);
> > + offset++) {
> > + if (check_offset(offset, params->size, params->align,
> > + params->flags) == 0)
> > + break;
> > + }
> > +
> > + if (offset == sizeof(struct rte_mbuf)) {
> > + rte_errno = ENOENT;
> > + goto fail_unlock;
> > + }
> > +
> > + mbuf_dynfield_list = RTE_TAILQ_CAST(
> > + mbuf_dynfield_tailq.head, mbuf_dynfield_list);
> > +
> > + te = rte_zmalloc("MBUF_DYNFIELD_TAILQ_ENTRY", sizeof(*te), 0);
> > + if (te == NULL)
> > + goto fail_unlock;
> > +
> > + mbuf_dynfield = rte_zmalloc("mbuf_dynfield", sizeof(*mbuf_dynfield), 0);
> > + if (mbuf_dynfield == NULL)
> > + goto fail_unlock;
> > +
> > + ret = strlcpy(mbuf_dynfield->params.name, params->name,
> > + sizeof(mbuf_dynfield->params.name));
> > + if (ret < 0 || ret >= (int)sizeof(mbuf_dynfield->params.name)) {
> > + rte_errno = ENAMETOOLONG;
> > + goto fail_unlock;
> > + }
> > + memcpy(&mbuf_dynfield->params, params, sizeof(mbuf_dynfield->params));
> > + mbuf_dynfield->offset = offset;
> > + te->data = mbuf_dynfield;
> > +
> > + TAILQ_INSERT_TAIL(mbuf_dynfield_list, te, next);
> > +
> > + for (i = offset; i < offset + params->size; i++)
> > + shm->free_space[i] = 0;
> > +
> > + RTE_LOG(DEBUG, MBUF, "Registered dynamic field %s (sz=%zu, al=%zu, fl=0x%x) -> %d\n",
> > + params->name, params->size, params->align, params->flags,
> > + offset);
> > +
> > +out_unlock:
> > + rte_mcfg_tailq_write_unlock();
> > +
> > + return offset;
> > +
> > +fail_unlock:
> > + rte_mcfg_tailq_write_unlock();
> > +fail:
> > + rte_free(mbuf_dynfield);
> > + rte_free(te);
> > + return -1;
> > +}
> > +
> > +/* assume tailq is locked */
> > +static struct mbuf_dynflag_elt *
> > +__mbuf_dynflag_lookup(const char *name)
> > +{
> > + struct mbuf_dynflag_list *mbuf_dynflag_list;
> > + struct mbuf_dynflag_elt *mbuf_dynflag;
> > + struct rte_tailq_entry *te;
> > +
> > + mbuf_dynflag_list = RTE_TAILQ_CAST(
> > + mbuf_dynflag_tailq.head, mbuf_dynflag_list);
> > +
> > + TAILQ_FOREACH(te, mbuf_dynflag_list, next) {
> > + mbuf_dynflag = (struct mbuf_dynflag_elt *)te->data;
> > + if (strncmp(name, mbuf_dynflag->params.name,
> > + RTE_MBUF_DYN_NAMESIZE) == 0)
> > + break;
> > + }
> > +
> > + if (te == NULL) {
> > + rte_errno = ENOENT;
> > + return NULL;
> > + }
> > +
> > + return mbuf_dynflag;
> > +}
> > +
> > +int
> > +rte_mbuf_dynflag_lookup(const char *name,
> > + struct rte_mbuf_dynflag *params)
> > +{
> > + struct mbuf_dynflag_elt *mbuf_dynflag;
> > +
> > + if (shm == NULL) {
> > + rte_errno = ENOENT;
> > + return -1;
> > + }
> > +
> > + rte_mcfg_tailq_read_lock();
> > + mbuf_dynflag = __mbuf_dynflag_lookup(name);
> > + rte_mcfg_tailq_read_unlock();
> > +
> > + if (mbuf_dynflag == NULL) {
> > + rte_errno = ENOENT;
> > + return -1;
> > + }
> > +
> > + if (params != NULL)
> > + memcpy(params, &mbuf_dynflag->params, sizeof(*params));
> > +
> > + return mbuf_dynflag->bitnum;
> > +}
> > +
> > +static int mbuf_dynflag_cmp(const struct rte_mbuf_dynflag *params1,
> > + const struct rte_mbuf_dynflag *params2)
> > +{
> > + if (strcmp(params1->name, params2->name))
> > + return -1;
> > + if (params1->flags != params2->flags)
> > + return -1;
> > + return 0;
> > +}
> > +
> > +int
> > +rte_mbuf_dynflag_register(const struct rte_mbuf_dynflag *params)
> > +{
> > + struct mbuf_dynflag_list *mbuf_dynflag_list;
> > + struct mbuf_dynflag_elt *mbuf_dynflag = NULL;
> > + struct rte_tailq_entry *te = NULL;
> > + int bitnum, ret;
> > +
> > + if (shm == NULL && init_shared_mem() < 0)
> > + goto fail;
> > +
> > + rte_mcfg_tailq_write_lock();
> > +
> > + mbuf_dynflag = __mbuf_dynflag_lookup(params->name);
> > + if (mbuf_dynflag != NULL) {
> > + if (mbuf_dynflag_cmp(params, &mbuf_dynflag->params) < 0) {
> > + rte_errno = EEXIST;
> > + goto fail_unlock;
> > + }
> > + bitnum = mbuf_dynflag->bitnum;
> > + goto out_unlock;
> > + }
> > +
> > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > + rte_errno = EPERM;
> > + goto fail_unlock;
> > + }
> > +
> > + if (shm->free_flags == 0) {
> > + rte_errno = ENOENT;
> > + goto fail_unlock;
> > + }
> > + bitnum = rte_bsf64(shm->free_flags);
> > +
> > + mbuf_dynflag_list = RTE_TAILQ_CAST(
> > + mbuf_dynflag_tailq.head, mbuf_dynflag_list);
> > +
> > + te = rte_zmalloc("MBUF_DYNFLAG_TAILQ_ENTRY", sizeof(*te), 0);
> > + if (te == NULL)
> > + goto fail_unlock;
> > +
> > + mbuf_dynflag = rte_zmalloc("mbuf_dynflag", sizeof(*mbuf_dynflag), 0);
> > + if (mbuf_dynflag == NULL)
> > + goto fail_unlock;
> > +
> > + ret = strlcpy(mbuf_dynflag->params.name, params->name,
> > + sizeof(mbuf_dynflag->params.name));
> > + if (ret < 0 || ret >= (int)sizeof(mbuf_dynflag->params.name)) {
> > + rte_errno = ENAMETOOLONG;
> > + goto fail_unlock;
> > + }
> > + mbuf_dynflag->bitnum = bitnum;
> > + te->data = mbuf_dynflag;
> > +
> > + TAILQ_INSERT_TAIL(mbuf_dynflag_list, te, next);
> > +
> > + shm->free_flags &= ~(1ULL << bitnum);
> > +
> > + RTE_LOG(DEBUG, MBUF, "Registered dynamic flag %s (fl=0x%x) -> %u\n",
> > + params->name, params->flags, bitnum);
> > +
> > +out_unlock:
> > + rte_mcfg_tailq_write_unlock();
> > +
> > + return bitnum;
> > +
> > +fail_unlock:
> > + rte_mcfg_tailq_write_unlock();
> > +fail:
> > + rte_free(mbuf_dynflag);
> > + rte_free(te);
> > + return -1;
> > +}
> > diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> > new file mode 100644
> > index 000000000..6e2c81654
> > --- /dev/null
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -0,0 +1,163 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright 2019 6WIND S.A.
> > + */
> > +
> > +#ifndef _RTE_MBUF_DYN_H_
> > +#define _RTE_MBUF_DYN_H_
> > +
> > +/**
> > + * @file
> > + * RTE Mbuf dynamic fields and flags
> > + *
> > + * Many features require to store data inside the mbuf. As the room in
> > + * mbuf structure is limited, it is not possible to have a field for
> > + * each feature. Also, changing fields in the mbuf structure can break
> > + * the API or ABI.
> > + *
> > + * This module addresses this issue, by enabling the dynamic
> > + * registration of fields or flags:
> > + *
> > + * - a dynamic field is a named area in the rte_mbuf structure, with a
> > + * given size (>= 1 byte) and alignment constraint.
> > + * - a dynamic flag is a named bit in the rte_mbuf structure, stored
> > + * in mbuf->ol_flags.
> > + *
> > + * The typical use case is when a specific offload feature requires to
> > + * register a dedicated offload field in the mbuf structure, and adding
> > + * a static field or flag is not justified.
> > + *
> > + * Example of use:
> > + *
> > + * - A rte_mbuf_dynfield structure is defined, containing the parameters
> > + * of the dynamic field to be registered:
> > + * const struct rte_mbuf_dynfield rte_dynfield_my_feature = { ... };
> > + * - The application initializes the PMD, and asks for this feature
> > + * at port initialization by passing DEV_RX_OFFLOAD_MY_FEATURE in
> > + * rxconf. This will make the PMD to register the field by calling
> > + * rte_mbuf_dynfield_register(&rte_dynfield_my_feature). The PMD
> > + * stores the returned offset.
> > + * - The application that uses the offload feature also registers
> > + * the field to retrieve the same offset.
> > + * - When the PMD receives a packet, it can set the field:
> > + * *RTE_MBUF_DYNFIELD(m, offset, <type *>) = value;
> > + * - In the main loop, the application can retrieve the value with
> > + * the same macro.
> > + *
> > + * To avoid wasting space, the dynamic fields or flags must only be
> > + * reserved on demand, when an application asks for the related feature.
> > + *
> > + * The registration can be done at any moment, but it is not possible
> > + * to unregister fields or flags for now.
> > + *
> > + * A dynamic field can be reserved and used by an application only.
> > + * It can for instance be a packet mark.
> > + */
> > +
> > +#include <sys/types.h>
> > +/**
> > + * Maximum length of the dynamic field or flag string.
> > + */
> > +#define RTE_MBUF_DYN_NAMESIZE 64
> > +
> > +/**
> > + * Structure describing the parameters of a mbuf dynamic field.
> > + */
> > +struct rte_mbuf_dynfield {
> > + char name[RTE_MBUF_DYN_NAMESIZE]; /**< Name of the field. */
> > + size_t size; /**< The number of bytes to reserve. */
> > + size_t align; /**< The alignment constraint (power of 2). */
> > + unsigned int flags; /**< Reserved for future use, must be 0. */
> > +};
> > +
> > +/**
> > + * Structure describing the parameters of a mbuf dynamic flag.
> > + */
> > +struct rte_mbuf_dynflag {
> > + char name[RTE_MBUF_DYN_NAMESIZE]; /**< Name of the dynamic flag. */
> > + unsigned int flags; /**< Reserved for future use, must be 0. */
> > +};
> > +
> > +/**
> > + * Register space for a dynamic field in the mbuf structure.
> > + *
> > + * If the field is already registered (same name and parameters), its
> > + * offset is returned.
> > + *
> > + * @param params
> > + * A structure containing the requested parameters (name, size,
> > + * alignment constraint and flags).
> > + * @return
> > + * The offset in the mbuf structure, or -1 on error.
> > + * Possible values for rte_errno:
> > + * - EINVAL: invalid parameters (size, align, or flags).
> > + * - EEXIST: this name is already register with different parameters.
> > + * - EPERM: called from a secondary process.
> > + * - ENOENT: not enough room in mbuf.
> > + * - ENOMEM: allocation failure.
> > + * - ENAMETOOLONG: name does not ends with \0.
> > + */
> > +__rte_experimental
> > +int rte_mbuf_dynfield_register(const struct rte_mbuf_dynfield *params);
> > +
> > +/**
> > + * Lookup for a registered dynamic mbuf field.
> > + *
> > + * @param name
> > + * A string identifying the dynamic field.
> > + * @param params
> > + * If not NULL, and if the lookup is successful, the structure is
> > + * filled with the parameters of the dynamic field.
> > + * @return
> > + * The offset of this field in the mbuf structure, or -1 on error.
> > + * Possible values for rte_errno:
> > + * - ENOENT: no dynamic field matches this name.
> > + */
> > +__rte_experimental
> > +int rte_mbuf_dynfield_lookup(const char *name,
> > + struct rte_mbuf_dynfield *params);
> > +
> > +/**
> > + * Register a dynamic flag in the mbuf structure.
> > + *
> > + * If the flag is already registered (same name and parameters), its
> > + * offset is returned.
> > + *
> > + * @param params
> > + * A structure containing the requested parameters of the dynamic
> > + * flag (name and options).
> > + * @return
> > + * The number of the reserved bit, or -1 on error.
> > + * Possible values for rte_errno:
> > + * - EINVAL: invalid parameters (size, align, or flags).
> > + * - EEXIST: this name is already register with different parameters.
> > + * - EPERM: called from a secondary process.
> > + * - ENOENT: no more flag available.
> > + * - ENOMEM: allocation failure.
> > + * - ENAMETOOLONG: name is longer than RTE_MBUF_DYN_NAMESIZE - 1.
> > + */
> > +__rte_experimental
> > +int rte_mbuf_dynflag_register(const struct rte_mbuf_dynflag *params);
> > +
> > +/**
> > + * Lookup for a registered dynamic mbuf flag.
> > + *
> > + * @param name
> > + * A string identifying the dynamic flag.
> > + * @param params
> > + * If not NULL, and if the lookup is successful, the structure is
> > + * filled with the parameters of the dynamic flag.
> > + * @return
> > + * The offset of this flag in the mbuf structure, or -1 on error.
> > + * Possible values for rte_errno:
> > + * - ENOENT: no dynamic flag matches this name.
> > + */
> > +__rte_experimental
> > +int rte_mbuf_dynflag_lookup(const char *name,
> > + struct rte_mbuf_dynflag *params);
> > +
> > +/**
> > + * Helper macro to access to a dynamic field.
> > + */
> > +#define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))
> > +
> > +#endif
> > diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
> > index 2662a37bf..a98310570 100644
> > --- a/lib/librte_mbuf/rte_mbuf_version.map
> > +++ b/lib/librte_mbuf/rte_mbuf_version.map
> > @@ -50,4 +50,8 @@ EXPERIMENTAL {
> > global:
> >
> > rte_mbuf_check;
> > + rte_mbuf_dynfield_lookup;
> > + rte_mbuf_dynfield_register;
> > + rte_mbuf_dynflag_lookup;
> > + rte_mbuf_dynflag_register;
> > } DPDK_18.08;
> > --
> > 2.20.1
>
I will send a v2 shortly, thanks
Olivier
next prev parent reply other threads:[~2019-10-17 7:54 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-10 9:29 [dpdk-dev] [RFC] " Olivier Matz
2019-07-10 17:14 ` Wang, Haiyue
2019-07-11 7:26 ` Olivier Matz
2019-07-11 8:04 ` Wang, Haiyue
2019-07-11 8:20 ` Olivier Matz
2019-07-11 8:34 ` Wang, Haiyue
2019-07-11 15:31 ` Stephen Hemminger
2019-07-12 9:18 ` Olivier Matz
2019-07-10 17:49 ` Stephen Hemminger
2019-07-10 18:12 ` Wiles, Keith
2019-07-11 7:53 ` Olivier Matz
2019-07-11 14:37 ` Wiles, Keith
2019-07-12 9:06 ` Olivier Matz
2019-07-11 7:36 ` Olivier Matz
2019-07-12 12:23 ` Jerin Jacob Kollanukkaran
2019-07-16 9:39 ` Olivier Matz
2019-07-16 14:43 ` Stephen Hemminger
2019-07-11 9:24 ` Thomas Monjalon
2019-07-12 14:54 ` Andrew Rybchenko
2019-07-16 9:49 ` Olivier Matz
2019-07-16 11:31 ` [dpdk-dev] ***Spam*** " Andrew Rybchenko
2019-09-18 16:54 ` [dpdk-dev] [PATCH] " Olivier Matz
2019-09-21 4:54 ` Wang, Haiyue
2019-09-23 8:31 ` Olivier Matz
2019-09-23 11:01 ` Wang, Haiyue
2019-09-21 8:28 ` Wiles, Keith
2019-09-23 8:56 ` Morten Brørup
2019-09-23 9:41 ` Olivier Matz
2019-09-23 9:13 ` Olivier Matz
2019-09-23 15:14 ` Wiles, Keith
2019-09-23 16:16 ` Olivier Matz
2019-09-23 17:14 ` Wiles, Keith
2019-09-23 16:09 ` Wiles, Keith
2019-10-01 10:49 ` Ananyev, Konstantin
2019-10-17 7:54 ` Olivier Matz [this message]
2019-10-17 11:58 ` Ananyev, Konstantin
2019-10-17 12:58 ` Olivier Matz
2019-10-17 14:42 ` [dpdk-dev] [PATCH v2] " Olivier Matz
2019-10-18 2:47 ` Wang, Haiyue
2019-10-18 7:53 ` Olivier Matz
2019-10-18 8:28 ` Wang, Haiyue
2019-10-18 9:47 ` Olivier Matz
2019-10-18 11:24 ` Wang, Haiyue
2019-10-22 22:51 ` Ananyev, Konstantin
2019-10-23 3:16 ` Wang, Haiyue
2019-10-23 10:21 ` Olivier Matz
2019-10-23 15:00 ` Stephen Hemminger
2019-10-23 15:12 ` Wang, Haiyue
2019-10-23 10:19 ` Olivier Matz
2019-10-23 11:45 ` Olivier Matz
2019-10-23 11:49 ` Ananyev, Konstantin
2019-10-23 12:00 ` Shahaf Shuler
2019-10-23 13:33 ` Olivier Matz
2019-10-24 4:54 ` Shahaf Shuler
2019-10-24 7:07 ` Olivier Matz
2019-10-24 7:38 ` Slava Ovsiienko
2019-10-24 7:56 ` Olivier Matz
2019-10-24 8:13 ` [dpdk-dev] [PATCH v3] " Olivier Matz
2019-10-24 15:30 ` Stephen Hemminger
2019-10-24 15:44 ` Thomas Monjalon
2019-10-24 17:07 ` Stephen Hemminger
2019-10-24 16:40 ` Thomas Monjalon
2019-10-26 12:39 ` [dpdk-dev] [PATCH v4] " Olivier Matz
2019-10-26 17:04 ` Thomas Monjalon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191017075434.dk4flyktbbe3lxxd@platinum \
--to=olivier.matz@6wind.com \
--cc=arybchenko@solarflare.com \
--cc=dev@dpdk.org \
--cc=haiyue.wang@intel.com \
--cc=jerinj@marvell.com \
--cc=keith.wiles@intel.com \
--cc=konstantin.ananyev@intel.com \
--cc=stephen@networkplumber.org \
--cc=thomas@monjalon.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).