DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: Olivier Matz <olivier.matz@6wind.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: Thomas Monjalon <thomas@monjalon.net>,
	"Wang, Haiyue" <haiyue.wang@intel.com>,
	Stephen Hemminger <stephen@networkplumber.org>,
	Andrew Rybchenko <arybchenko@solarflare.com>,
	"Wiles, Keith" <keith.wiles@intel.com>,
	Jerin Jacob Kollanukkaran <jerinj@marvell.com>
Subject: Re: [dpdk-dev] [PATCH] mbuf: support dynamic fields and flags
Date: Tue, 1 Oct 2019 10:49:39 +0000	[thread overview]
Message-ID: <2601191342CEEE43887BDE71AB977258019196E0B7@irsmsx105.ger.corp.intel.com> (raw)
In-Reply-To: <20190918165448.22409-1-olivier.matz@6wind.com>

Hi Olivier,

> Many features require to store data inside the mbuf. As the room in mbuf
> structure is limited, it is not possible to have a field for each
> feature. Also, changing fields in the mbuf structure can break the API
> or ABI.
> 
> This commit addresses these issues, by enabling the dynamic registration
> of fields or flags:
> 
> - a dynamic field is a named area in the rte_mbuf structure, with a
>   given size (>= 1 byte) and alignment constraint.
> - a dynamic flag is a named bit in the rte_mbuf structure.
> 
> The typical use case is a PMD that registers space for an offload
> feature, when the application requests to enable this feature.  As
> the space in mbuf is limited, the space should only be reserved if it
> is going to be used (i.e when the application explicitly asks for it).
> 
> The registration can be done at any moment, but it is not possible
> to unregister fields or flags for now.

Looks ok to me in general.
Some comments/suggestions inline.
Konstantin

> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> ---
> 
> rfc -> v1
> 
> * Rebase on top of master
> * Change registration API to use a structure instead of
>   variables, getting rid of #defines (Stephen's comment)
> * Update flag registration to use a similar API as fields.
> * Change max name length from 32 to 64 (sugg. by Thomas)
> * Enhance API documentation (Haiyue's and Andrew's comments)
> * Add a debug log at registration
> * Add some words in release note
> * Did some performance tests (sugg. by Andrew):
>   On my platform, reading a dynamic field takes ~3 cycles more
>   than a static field, and ~2 cycles more for writing.
> 
>  app/test/test_mbuf.c                   | 114 ++++++-
>  doc/guides/rel_notes/release_19_11.rst |   7 +
>  lib/librte_mbuf/Makefile               |   2 +
>  lib/librte_mbuf/meson.build            |   6 +-
>  lib/librte_mbuf/rte_mbuf.h             |  25 +-
>  lib/librte_mbuf/rte_mbuf_dyn.c         | 408 +++++++++++++++++++++++++
>  lib/librte_mbuf/rte_mbuf_dyn.h         | 163 ++++++++++
>  lib/librte_mbuf/rte_mbuf_version.map   |   4 +
>  8 files changed, 724 insertions(+), 5 deletions(-)
>  create mode 100644 lib/librte_mbuf/rte_mbuf_dyn.c
>  create mode 100644 lib/librte_mbuf/rte_mbuf_dyn.h
> 
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -198,9 +198,12 @@ extern "C" {
>  #define PKT_RX_OUTER_L4_CKSUM_GOOD	(1ULL << 22)
>  #define PKT_RX_OUTER_L4_CKSUM_INVALID	((1ULL << 21) | (1ULL << 22))
> 
> -/* add new RX flags here */
> +/* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> 
> -/* add new TX flags here */
> +#define PKT_FIRST_FREE (1ULL << 23)
> +#define PKT_LAST_FREE (1ULL << 39)
> +
> +/* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> 
>  /**
>   * Indicate that the metadata field in the mbuf is in use.
> @@ -738,6 +741,8 @@ struct rte_mbuf {
>  	 */
>  	struct rte_mbuf_ext_shared_info *shinfo;
> 
> +	uint64_t dynfield1; /**< Reserved for dynamic fields. */
> +	uint64_t dynfield2; /**< Reserved for dynamic fields. */

Wonder why just not one field:
	union {
		uint8_t u8[16];
		...
		uint64_t u64[2];
	} dyn_field1;
?
Probably would be a bit handy, to refer, register, etc. no?

>  } __rte_cache_aligned;
> 
>  /**
> @@ -1684,6 +1689,21 @@ rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
>   */
>  #define rte_pktmbuf_detach_extbuf(m) rte_pktmbuf_detach(m)
> 
> +/**
> + * Copy dynamic fields from m_src to m_dst.
> + *
> + * @param m_dst
> + *   The destination mbuf.
> + * @param m_src
> + *   The source mbuf.
> + */
> +static inline void
> +rte_mbuf_dynfield_copy(struct rte_mbuf *m_dst, const struct rte_mbuf *m_src)
> +{
> +	m_dst->dynfield1 = m_src->dynfield1;
> +	m_dst->dynfield2 = m_src->dynfield2;
> +}
> +
>  /**
>   * Attach packet mbuf to another packet mbuf.
>   *
> @@ -1732,6 +1752,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
>  	mi->vlan_tci_outer = m->vlan_tci_outer;
>  	mi->tx_offload = m->tx_offload;
>  	mi->hash = m->hash;
> +	rte_mbuf_dynfield_copy(mi, m);
> 
>  	mi->next = NULL;
>  	mi->pkt_len = mi->data_len;
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
> new file mode 100644
> index 000000000..13b8742d0
> --- /dev/null
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.c
> @@ -0,0 +1,408 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2019 6WIND S.A.
> + */
> +
> +#include <sys/queue.h>
> +
> +#include <rte_common.h>
> +#include <rte_eal.h>
> +#include <rte_eal_memconfig.h>
> +#include <rte_tailq.h>
> +#include <rte_errno.h>
> +#include <rte_malloc.h>
> +#include <rte_string_fns.h>
> +#include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
> +
> +#define RTE_MBUF_DYN_MZNAME "rte_mbuf_dyn"
> +
> +struct mbuf_dynfield_elt {
> +	TAILQ_ENTRY(mbuf_dynfield_elt) next;
> +	struct rte_mbuf_dynfield params;
> +	int offset;

Why not 'size_t offset', to avoid any explicit conversions, etc?

> +};
> +TAILQ_HEAD(mbuf_dynfield_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem mbuf_dynfield_tailq = {
> +	.name = "RTE_MBUF_DYNFIELD",
> +};
> +EAL_REGISTER_TAILQ(mbuf_dynfield_tailq);
> +
> +struct mbuf_dynflag_elt {
> +	TAILQ_ENTRY(mbuf_dynflag_elt) next;
> +	struct rte_mbuf_dynflag params;
> +	int bitnum;
> +};
> +TAILQ_HEAD(mbuf_dynflag_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem mbuf_dynflag_tailq = {
> +	.name = "RTE_MBUF_DYNFLAG",
> +};
> +EAL_REGISTER_TAILQ(mbuf_dynflag_tailq);
> +
> +struct mbuf_dyn_shm {
> +	/** For each mbuf byte, free_space[i] == 1 if space is free. */
> +	uint8_t free_space[sizeof(struct rte_mbuf)];
> +	/** Bitfield of available flags. */
> +	uint64_t free_flags;
> +};
> +static struct mbuf_dyn_shm *shm;
> +
> +/* allocate and initialize the shared memory */
> +static int
> +init_shared_mem(void)
> +{
> +	const struct rte_memzone *mz;
> +	uint64_t mask;
> +
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +		mz = rte_memzone_reserve_aligned(RTE_MBUF_DYN_MZNAME,
> +						sizeof(struct mbuf_dyn_shm),
> +						SOCKET_ID_ANY, 0,
> +						RTE_CACHE_LINE_SIZE);
> +	} else {
> +		mz = rte_memzone_lookup(RTE_MBUF_DYN_MZNAME);
> +	}
> +	if (mz == NULL)
> +		return -1;
> +
> +	shm = mz->addr;
> +
> +#define mark_free(field)						\
> +	memset(&shm->free_space[offsetof(struct rte_mbuf, field)],	\
> +		0xff, sizeof(((struct rte_mbuf *)0)->field))

I think you can avoid defining/unedifying macros here by something like that:

static const struct {
      size_t offset;
      size_t size;
} dyn_syms[] = {
    [0] = {.offset = offsetof(struct rte_mbuf, dynfield1), sizeof((struct rte_mbuf *)0)->dynfield1),
    [1] = {.offset = offsetof(struct rte_mbuf, dynfield2), sizeof((struct rte_mbuf *)0)->dynfield2),
};
...

for (i = 0; i != RTE_DIM(dyn_syms); i++)
    memset(shm->free_space + dym_syms[i].offset, UINT8_MAX, dym_syms[i].size);

> +
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +		/* init free_space, keep it sync'd with
> +		 * rte_mbuf_dynfield_copy().
> +		 */
> +		memset(shm, 0, sizeof(*shm));
> +		mark_free(dynfield1);
> +		mark_free(dynfield2);
> +
> +		/* init free_flags */
> +		for (mask = PKT_FIRST_FREE; mask <= PKT_LAST_FREE; mask <<= 1)
> +			shm->free_flags |= mask;
> +	}
> +#undef mark_free
> +
> +	return 0;
> +}
> +
> +/* check if this offset can be used */
> +static int
> +check_offset(size_t offset, size_t size, size_t align, unsigned int flags)
> +{
> +	size_t i;
> +
> +	(void)flags;


We have RTE_SET_USED() for such cases...
Though as it is an internal function probably better not to introduce
unused parameters at all.

> +
> +	if ((offset & (align - 1)) != 0)
> +		return -1;
> +	if (offset + size > sizeof(struct rte_mbuf))
> +		return -1;
> +
> +	for (i = 0; i < size; i++) {
> +		if (!shm->free_space[i + offset])
> +			return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +/* assume tailq is locked */
> +static struct mbuf_dynfield_elt *
> +__mbuf_dynfield_lookup(const char *name)
> +{
> +	struct mbuf_dynfield_list *mbuf_dynfield_list;
> +	struct mbuf_dynfield_elt *mbuf_dynfield;
> +	struct rte_tailq_entry *te;
> +
> +	mbuf_dynfield_list = RTE_TAILQ_CAST(
> +		mbuf_dynfield_tailq.head, mbuf_dynfield_list);
> +
> +	TAILQ_FOREACH(te, mbuf_dynfield_list, next) {
> +		mbuf_dynfield = (struct mbuf_dynfield_elt *)te->data;
> +		if (strcmp(name, mbuf_dynfield->params.name) == 0)
> +			break;
> +	}
> +
> +	if (te == NULL) {
> +		rte_errno = ENOENT;
> +		return NULL;
> +	}
> +
> +	return mbuf_dynfield;
> +}
> +
> +int
> +rte_mbuf_dynfield_lookup(const char *name, struct rte_mbuf_dynfield *params)
> +{
> +	struct mbuf_dynfield_elt *mbuf_dynfield;
> +
> +	if (shm == NULL) {
> +		rte_errno = ENOENT;
> +		return -1;
> +	}
> +
> +	rte_mcfg_tailq_read_lock();
> +	mbuf_dynfield = __mbuf_dynfield_lookup(name);
> +	rte_mcfg_tailq_read_unlock();
> +
> +	if (mbuf_dynfield == NULL) {
> +		rte_errno = ENOENT;
> +		return -1;
> +	}
> +
> +	if (params != NULL)
> +		memcpy(params, &mbuf_dynfield->params, sizeof(*params));
> +
> +	return mbuf_dynfield->offset;
> +}
> +
> +static int mbuf_dynfield_cmp(const struct rte_mbuf_dynfield *params1,
> +		const struct rte_mbuf_dynfield *params2)
> +{
> +	if (strcmp(params1->name, params2->name))
> +		return -1;
> +	if (params1->size != params2->size)
> +		return -1;
> +	if (params1->align != params2->align)
> +		return -1;
> +	if (params1->flags != params2->flags)
> +		return -1;
> +	return 0;
> +}
> +
> +int
> +rte_mbuf_dynfield_register(const struct rte_mbuf_dynfield *params)

What I meant at user-space - if we can also have another function that would allow
user to specify required offset for dynfield explicitly, then user can define it as constant
value and let compiler do optimization work and hopefully generate faster code to access
this field.
Something like that:

int rte_mbuf_dynfiled_register_offset(const struct rte_mbuf_dynfield *params, size_t offset);

#define RTE_MBUF_DYNFIELD_OFFSET(fld, off)  (offsetof(struct rte_mbuf, fld) + (off))

And then somewhere in user code:

/* to let say reserve first 4B in dynfield1*/
#define MBUF_DYNFIELD_A	RTE_MBUF_DYNFIELD_OFFSET(dynfiled1, 0)
...
params.name = RTE_STR(MBUF_DYNFIELD_A);
params.size = sizeof(uint32_t);
params.align = sizeof(uint32_t);
ret = rte_mbuf_dynfiled_register_offset(&params, MBUF_DYNFIELD_A);
if (ret != MBUF_DYNFIELD_A)  {
     /* handle it somehow, probably just terminate gracefully... */
}
...

/* to let say reserve last 2B in dynfield2*/
#define MBUF_DYNFIELD_B	RTE_MBUF_DYNFIELD_OFFSET(dynfiled2, 6)
...
params.name = RTE_STR(MBUF_DYNFIELD_B);
params.size = sizeof(uint16_t);
params.align = sizeof(uint16_t);
ret = rte_mbuf_dynfiled_register_offset(&params, MBUF_DYNFIELD_B);

After that user can use constant offsets MBUF_DYNFIELD_A/ MBUF_DYNFIELD_B
to access these fields.
Same thoughts for DYNFLAG.

> +{
> +	struct mbuf_dynfield_list *mbuf_dynfield_list;
> +	struct mbuf_dynfield_elt *mbuf_dynfield = NULL;
> +	struct rte_tailq_entry *te = NULL;
> +	int offset, ret;

size_t offset
to avoid explicit conversions, etc.?

> +	size_t i;
> +
> +	if (shm == NULL && init_shared_mem() < 0)
> +		goto fail;

As I understand, here you allocate/initialize your shm without any lock protection,
though later you protect it via  rte_mcfg_tailq_write_lock().
That seems a bit flakey to me.
Why not to store information about free dynfield bytes inside mbuf_dynfield_tailq?
Let say  at init() create and add an entry into that list with some reserved name.
Then at register - grab mcfg_tailq_write_lock and do lookup
for such entry and then read/update it as needed.
It would help to avoid racing problem, plus you wouldn't need to
allocate/lookup for memzone.  


> +	if (params->size >= sizeof(struct rte_mbuf)) {
> +		rte_errno = EINVAL;
> +		goto fail;
> +	}
> +	if (!rte_is_power_of_2(params->align)) {
> +		rte_errno = EINVAL;
> +		goto fail;
> +	}
> +	if (params->flags != 0) {
> +		rte_errno = EINVAL;
> +		goto fail;
> +	}
> +
> +	rte_mcfg_tailq_write_lock();
> +

I think it probably would be cleaner and easier to read/maintain, if you'll put actual
code under lock protection into a separate function - as you did for __mbuf_dynfield_lookup().

> +	mbuf_dynfield = __mbuf_dynfield_lookup(params->name);
> +	if (mbuf_dynfield != NULL) {
> +		if (mbuf_dynfield_cmp(params, &mbuf_dynfield->params) < 0) {
> +			rte_errno = EEXIST;
> +			goto fail_unlock;
> +		}
> +		offset = mbuf_dynfield->offset;
> +		goto out_unlock;
> +	}
> +
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +		rte_errno = EPERM;
> +		goto fail_unlock;
> +	}
> +
> +	for (offset = 0;
> +	     offset < (int)sizeof(struct rte_mbuf);
> +	     offset++) {
> +		if (check_offset(offset, params->size, params->align,
> +					params->flags) == 0)
> +			break;
> +	}
> +
> +	if (offset == sizeof(struct rte_mbuf)) {
> +		rte_errno = ENOENT;
> +		goto fail_unlock;
> +	}
> +
> +	mbuf_dynfield_list = RTE_TAILQ_CAST(
> +		mbuf_dynfield_tailq.head, mbuf_dynfield_list);
> +
> +	te = rte_zmalloc("MBUF_DYNFIELD_TAILQ_ENTRY", sizeof(*te), 0);
> +	if (te == NULL)
> +		goto fail_unlock;
> +
> +	mbuf_dynfield = rte_zmalloc("mbuf_dynfield", sizeof(*mbuf_dynfield), 0);
> +	if (mbuf_dynfield == NULL)
> +		goto fail_unlock;
> +
> +	ret = strlcpy(mbuf_dynfield->params.name, params->name,
> +		sizeof(mbuf_dynfield->params.name));
> +	if (ret < 0 || ret >= (int)sizeof(mbuf_dynfield->params.name)) {
> +		rte_errno = ENAMETOOLONG;
> +		goto fail_unlock;
> +	}
> +	memcpy(&mbuf_dynfield->params, params, sizeof(mbuf_dynfield->params));
> +	mbuf_dynfield->offset = offset;
> +	te->data = mbuf_dynfield;
> +
> +	TAILQ_INSERT_TAIL(mbuf_dynfield_list, te, next);
> +
> +	for (i = offset; i < offset + params->size; i++)
> +		shm->free_space[i] = 0;
> +
> +	RTE_LOG(DEBUG, MBUF, "Registered dynamic field %s (sz=%zu, al=%zu, fl=0x%x) -> %d\n",
> +		params->name, params->size, params->align, params->flags,
> +		offset);
> +
> +out_unlock:
> +	rte_mcfg_tailq_write_unlock();
> +
> +	return offset;
> +
> +fail_unlock:
> +	rte_mcfg_tailq_write_unlock();
> +fail:
> +	rte_free(mbuf_dynfield);
> +	rte_free(te);
> +	return -1;
> +}
> +
> +/* assume tailq is locked */
> +static struct mbuf_dynflag_elt *
> +__mbuf_dynflag_lookup(const char *name)
> +{
> +	struct mbuf_dynflag_list *mbuf_dynflag_list;
> +	struct mbuf_dynflag_elt *mbuf_dynflag;
> +	struct rte_tailq_entry *te;
> +
> +	mbuf_dynflag_list = RTE_TAILQ_CAST(
> +		mbuf_dynflag_tailq.head, mbuf_dynflag_list);
> +
> +	TAILQ_FOREACH(te, mbuf_dynflag_list, next) {
> +		mbuf_dynflag = (struct mbuf_dynflag_elt *)te->data;
> +		if (strncmp(name, mbuf_dynflag->params.name,
> +				RTE_MBUF_DYN_NAMESIZE) == 0)
> +			break;
> +	}
> +
> +	if (te == NULL) {
> +		rte_errno = ENOENT;
> +		return NULL;
> +	}
> +
> +	return mbuf_dynflag;
> +}
> +
> +int
> +rte_mbuf_dynflag_lookup(const char *name,
> +			struct rte_mbuf_dynflag *params)
> +{
> +	struct mbuf_dynflag_elt *mbuf_dynflag;
> +
> +	if (shm == NULL) {
> +		rte_errno = ENOENT;
> +		return -1;
> +	}
> +
> +	rte_mcfg_tailq_read_lock();
> +	mbuf_dynflag = __mbuf_dynflag_lookup(name);
> +	rte_mcfg_tailq_read_unlock();
> +
> +	if (mbuf_dynflag == NULL) {
> +		rte_errno = ENOENT;
> +		return -1;
> +	}
> +
> +	if (params != NULL)
> +		memcpy(params, &mbuf_dynflag->params, sizeof(*params));
> +
> +	return mbuf_dynflag->bitnum;
> +}
> +
> +static int mbuf_dynflag_cmp(const struct rte_mbuf_dynflag *params1,
> +		const struct rte_mbuf_dynflag *params2)
> +{
> +	if (strcmp(params1->name, params2->name))
> +		return -1;
> +	if (params1->flags != params2->flags)
> +		return -1;
> +	return 0;
> +}
> +
> +int
> +rte_mbuf_dynflag_register(const struct rte_mbuf_dynflag *params)
> +{
> +	struct mbuf_dynflag_list *mbuf_dynflag_list;
> +	struct mbuf_dynflag_elt *mbuf_dynflag = NULL;
> +	struct rte_tailq_entry *te = NULL;
> +	int bitnum, ret;
> +
> +	if (shm == NULL && init_shared_mem() < 0)
> +		goto fail;
> +
> +	rte_mcfg_tailq_write_lock();
> +
> +	mbuf_dynflag = __mbuf_dynflag_lookup(params->name);
> +	if (mbuf_dynflag != NULL) {
> +		if (mbuf_dynflag_cmp(params, &mbuf_dynflag->params) < 0) {
> +			rte_errno = EEXIST;
> +			goto fail_unlock;
> +		}
> +		bitnum = mbuf_dynflag->bitnum;
> +		goto out_unlock;
> +	}
> +
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +		rte_errno = EPERM;
> +		goto fail_unlock;
> +	}
> +
> +	if (shm->free_flags == 0) {
> +		rte_errno = ENOENT;
> +		goto fail_unlock;
> +	}
> +	bitnum = rte_bsf64(shm->free_flags);
> +
> +	mbuf_dynflag_list = RTE_TAILQ_CAST(
> +		mbuf_dynflag_tailq.head, mbuf_dynflag_list);
> +
> +	te = rte_zmalloc("MBUF_DYNFLAG_TAILQ_ENTRY", sizeof(*te), 0);
> +	if (te == NULL)
> +		goto fail_unlock;
> +
> +	mbuf_dynflag = rte_zmalloc("mbuf_dynflag", sizeof(*mbuf_dynflag), 0);
> +	if (mbuf_dynflag == NULL)
> +		goto fail_unlock;
> +
> +	ret = strlcpy(mbuf_dynflag->params.name, params->name,
> +		sizeof(mbuf_dynflag->params.name));
> +	if (ret < 0 || ret >= (int)sizeof(mbuf_dynflag->params.name)) {
> +		rte_errno = ENAMETOOLONG;
> +		goto fail_unlock;
> +	}
> +	mbuf_dynflag->bitnum = bitnum;
> +	te->data = mbuf_dynflag;
> +
> +	TAILQ_INSERT_TAIL(mbuf_dynflag_list, te, next);
> +
> +	shm->free_flags &= ~(1ULL << bitnum);
> +
> +	RTE_LOG(DEBUG, MBUF, "Registered dynamic flag %s (fl=0x%x) -> %u\n",
> +		params->name, params->flags, bitnum);
> +
> +out_unlock:
> +	rte_mcfg_tailq_write_unlock();
> +
> +	return bitnum;
> +
> +fail_unlock:
> +	rte_mcfg_tailq_write_unlock();
> +fail:
> +	rte_free(mbuf_dynflag);
> +	rte_free(te);
> +	return -1;
> +}
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> new file mode 100644
> index 000000000..6e2c81654
> --- /dev/null
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -0,0 +1,163 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2019 6WIND S.A.
> + */
> +
> +#ifndef _RTE_MBUF_DYN_H_
> +#define _RTE_MBUF_DYN_H_
> +
> +/**
> + * @file
> + * RTE Mbuf dynamic fields and flags
> + *
> + * Many features require to store data inside the mbuf. As the room in
> + * mbuf structure is limited, it is not possible to have a field for
> + * each feature. Also, changing fields in the mbuf structure can break
> + * the API or ABI.
> + *
> + * This module addresses this issue, by enabling the dynamic
> + * registration of fields or flags:
> + *
> + * - a dynamic field is a named area in the rte_mbuf structure, with a
> + *   given size (>= 1 byte) and alignment constraint.
> + * - a dynamic flag is a named bit in the rte_mbuf structure, stored
> + *   in mbuf->ol_flags.
> + *
> + * The typical use case is when a specific offload feature requires to
> + * register a dedicated offload field in the mbuf structure, and adding
> + * a static field or flag is not justified.
> + *
> + * Example of use:
> + *
> + * - A rte_mbuf_dynfield structure is defined, containing the parameters
> + *   of the dynamic field to be registered:
> + *   const struct rte_mbuf_dynfield rte_dynfield_my_feature = { ... };
> + * - The application initializes the PMD, and asks for this feature
> + *   at port initialization by passing DEV_RX_OFFLOAD_MY_FEATURE in
> + *   rxconf. This will make the PMD to register the field by calling
> + *   rte_mbuf_dynfield_register(&rte_dynfield_my_feature). The PMD
> + *   stores the returned offset.
> + * - The application that uses the offload feature also registers
> + *   the field to retrieve the same offset.
> + * - When the PMD receives a packet, it can set the field:
> + *   *RTE_MBUF_DYNFIELD(m, offset, <type *>) = value;
> + * - In the main loop, the application can retrieve the value with
> + *   the same macro.
> + *
> + * To avoid wasting space, the dynamic fields or flags must only be
> + * reserved on demand, when an application asks for the related feature.
> + *
> + * The registration can be done at any moment, but it is not possible
> + * to unregister fields or flags for now.
> + *
> + * A dynamic field can be reserved and used by an application only.
> + * It can for instance be a packet mark.
> + */
> +
> +#include <sys/types.h>
> +/**
> + * Maximum length of the dynamic field or flag string.
> + */
> +#define RTE_MBUF_DYN_NAMESIZE 64
> +
> +/**
> + * Structure describing the parameters of a mbuf dynamic field.
> + */
> +struct rte_mbuf_dynfield {
> +	char name[RTE_MBUF_DYN_NAMESIZE]; /**< Name of the field. */
> +	size_t size;        /**< The number of bytes to reserve. */
> +	size_t align;       /**< The alignment constraint (power of 2). */
> +	unsigned int flags; /**< Reserved for future use, must be 0. */
> +};
> +
> +/**
> + * Structure describing the parameters of a mbuf dynamic flag.
> + */
> +struct rte_mbuf_dynflag {
> +	char name[RTE_MBUF_DYN_NAMESIZE]; /**< Name of the dynamic flag. */
> +	unsigned int flags; /**< Reserved for future use, must be 0. */
> +};
> +
> +/**
> + * Register space for a dynamic field in the mbuf structure.
> + *
> + * If the field is already registered (same name and parameters), its
> + * offset is returned.
> + *
> + * @param params
> + *   A structure containing the requested parameters (name, size,
> + *   alignment constraint and flags).
> + * @return
> + *   The offset in the mbuf structure, or -1 on error.
> + *   Possible values for rte_errno:
> + *   - EINVAL: invalid parameters (size, align, or flags).
> + *   - EEXIST: this name is already register with different parameters.
> + *   - EPERM: called from a secondary process.
> + *   - ENOENT: not enough room in mbuf.
> + *   - ENOMEM: allocation failure.
> + *   - ENAMETOOLONG: name does not ends with \0.
> + */
> +__rte_experimental
> +int rte_mbuf_dynfield_register(const struct rte_mbuf_dynfield *params);
> +
> +/**
> + * Lookup for a registered dynamic mbuf field.
> + *
> + * @param name
> + *   A string identifying the dynamic field.
> + * @param params
> + *   If not NULL, and if the lookup is successful, the structure is
> + *   filled with the parameters of the dynamic field.
> + * @return
> + *   The offset of this field in the mbuf structure, or -1 on error.
> + *   Possible values for rte_errno:
> + *   - ENOENT: no dynamic field matches this name.
> + */
> +__rte_experimental
> +int rte_mbuf_dynfield_lookup(const char *name,
> +			struct rte_mbuf_dynfield *params);
> +
> +/**
> + * Register a dynamic flag in the mbuf structure.
> + *
> + * If the flag is already registered (same name and parameters), its
> + * offset is returned.
> + *
> + * @param params
> + *   A structure containing the requested parameters of the dynamic
> + *   flag (name and options).
> + * @return
> + *   The number of the reserved bit, or -1 on error.
> + *   Possible values for rte_errno:
> + *   - EINVAL: invalid parameters (size, align, or flags).
> + *   - EEXIST: this name is already register with different parameters.
> + *   - EPERM: called from a secondary process.
> + *   - ENOENT: no more flag available.
> + *   - ENOMEM: allocation failure.
> + *   - ENAMETOOLONG: name is longer than RTE_MBUF_DYN_NAMESIZE - 1.
> + */
> +__rte_experimental
> +int rte_mbuf_dynflag_register(const struct rte_mbuf_dynflag *params);
> +
> +/**
> + * Lookup for a registered dynamic mbuf flag.
> + *
> + * @param name
> + *   A string identifying the dynamic flag.
> + * @param params
> + *   If not NULL, and if the lookup is successful, the structure is
> + *   filled with the parameters of the dynamic flag.
> + * @return
> + *   The offset of this flag in the mbuf structure, or -1 on error.
> + *   Possible values for rte_errno:
> + *   - ENOENT: no dynamic flag matches this name.
> + */
> +__rte_experimental
> +int rte_mbuf_dynflag_lookup(const char *name,
> +			struct rte_mbuf_dynflag *params);
> +
> +/**
> + * Helper macro to access to a dynamic field.
> + */
> +#define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))
> +
> +#endif
> diff --git a/lib/librte_mbuf/rte_mbuf_version.map b/lib/librte_mbuf/rte_mbuf_version.map
> index 2662a37bf..a98310570 100644
> --- a/lib/librte_mbuf/rte_mbuf_version.map
> +++ b/lib/librte_mbuf/rte_mbuf_version.map
> @@ -50,4 +50,8 @@ EXPERIMENTAL {
>  	global:
> 
>  	rte_mbuf_check;
> +	rte_mbuf_dynfield_lookup;
> +	rte_mbuf_dynfield_register;
> +	rte_mbuf_dynflag_lookup;
> +	rte_mbuf_dynflag_register;
>  } DPDK_18.08;
> --
> 2.20.1


  parent reply	other threads:[~2019-10-01 10:49 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-10  9:29 [dpdk-dev] [RFC] " Olivier Matz
2019-07-10 17:14 ` Wang, Haiyue
2019-07-11  7:26   ` Olivier Matz
2019-07-11  8:04     ` Wang, Haiyue
2019-07-11  8:20       ` Olivier Matz
2019-07-11  8:34         ` Wang, Haiyue
2019-07-11 15:31     ` Stephen Hemminger
2019-07-12  9:18       ` Olivier Matz
2019-07-10 17:49 ` Stephen Hemminger
2019-07-10 18:12   ` Wiles, Keith
2019-07-11  7:53     ` Olivier Matz
2019-07-11 14:37       ` Wiles, Keith
2019-07-12  9:06         ` Olivier Matz
2019-07-11  7:36   ` Olivier Matz
2019-07-12 12:23     ` Jerin Jacob Kollanukkaran
2019-07-16  9:39       ` Olivier Matz
2019-07-16 14:43         ` Stephen Hemminger
2019-07-11  9:24 ` Thomas Monjalon
2019-07-12 14:54 ` Andrew Rybchenko
2019-07-16  9:49   ` Olivier Matz
2019-07-16 11:31     ` [dpdk-dev] ***Spam*** " Andrew Rybchenko
2019-09-18 16:54 ` [dpdk-dev] [PATCH] " Olivier Matz
2019-09-21  4:54   ` Wang, Haiyue
2019-09-23  8:31     ` Olivier Matz
2019-09-23 11:01       ` Wang, Haiyue
2019-09-21  8:28   ` Wiles, Keith
2019-09-23  8:56     ` Morten Brørup
2019-09-23  9:41       ` Olivier Matz
2019-09-23  9:13     ` Olivier Matz
2019-09-23 15:14       ` Wiles, Keith
2019-09-23 16:16         ` Olivier Matz
2019-09-23 17:14           ` Wiles, Keith
2019-09-23 16:09       ` Wiles, Keith
2019-10-01 10:49   ` Ananyev, Konstantin [this message]
2019-10-17  7:54     ` Olivier Matz
2019-10-17 11:58       ` Ananyev, Konstantin
2019-10-17 12:58         ` Olivier Matz
2019-10-17 14:42 ` [dpdk-dev] [PATCH v2] " Olivier Matz
2019-10-18  2:47   ` Wang, Haiyue
2019-10-18  7:53     ` Olivier Matz
2019-10-18  8:28       ` Wang, Haiyue
2019-10-18  9:47         ` Olivier Matz
2019-10-18 11:24           ` Wang, Haiyue
2019-10-22 22:51   ` Ananyev, Konstantin
2019-10-23  3:16     ` Wang, Haiyue
2019-10-23 10:21       ` Olivier Matz
2019-10-23 15:00         ` Stephen Hemminger
2019-10-23 15:12           ` Wang, Haiyue
2019-10-23 10:19     ` Olivier Matz
2019-10-23 11:45       ` Olivier Matz
2019-10-23 11:49         ` Ananyev, Konstantin
2019-10-23 12:00   ` Shahaf Shuler
2019-10-23 13:33     ` Olivier Matz
2019-10-24  4:54       ` Shahaf Shuler
2019-10-24  7:07         ` Olivier Matz
2019-10-24  7:38   ` Slava Ovsiienko
2019-10-24  7:56     ` Olivier Matz
2019-10-24  8:13 ` [dpdk-dev] [PATCH v3] " Olivier Matz
2019-10-24 15:30   ` Stephen Hemminger
2019-10-24 15:44     ` Thomas Monjalon
2019-10-24 17:07       ` Stephen Hemminger
2019-10-24 16:40   ` Thomas Monjalon
2019-10-26 12:39 ` [dpdk-dev] [PATCH v4] " Olivier Matz
2019-10-26 17:04   ` Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2601191342CEEE43887BDE71AB977258019196E0B7@irsmsx105.ger.corp.intel.com \
    --to=konstantin.ananyev@intel.com \
    --cc=arybchenko@solarflare.com \
    --cc=dev@dpdk.org \
    --cc=haiyue.wang@intel.com \
    --cc=jerinj@marvell.com \
    --cc=keith.wiles@intel.com \
    --cc=olivier.matz@6wind.com \
    --cc=stephen@networkplumber.org \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).