DPDK patches and discussions
 help / color / mirror / Atom feed
From: Olivier MATZ <olivier.matz@6wind.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	 "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v2 1/5] mbuf: fix clone support when application uses private mbuf data
Date: Fri, 27 Mar 2015 10:07:42 +0100	[thread overview]
Message-ID: <55151DDE.8040301@6wind.com> (raw)
In-Reply-To: <2601191342CEEE43887BDE71AB97725821407D4D@irsmsx105.ger.corp.intel.com>

Hi Konstantin,

Thank you for your comments.

On 03/27/2015 01:24 AM, Ananyev, Konstantin wrote:
> Hi Olivier,
> 
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz
>> Sent: Thursday, March 26, 2015 4:00 PM
>> To: dev@dpdk.org
>> Cc: zoltan.kiss@linaro.org
>> Subject: [dpdk-dev] [PATCH v2 1/5] mbuf: fix clone support when application uses private mbuf data
>>
>> Add a new private_size field in mbuf structure that should
>> be initialized at mbuf pool creation. This field contains the
>> size of the application private data in mbufs.
>>
>> Introduce new static inline functions rte_mbuf_from_indirect()
>> and rte_mbuf_to_baddr() to replace the existing macros, which
>> take the private size in account when attaching and detaching
>> mbufs.
>>
>> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>> ---
>>  app/test-pmd/testpmd.c     |  1 +
>>  examples/vhost/main.c      |  2 +-
>>  lib/librte_mbuf/rte_mbuf.c |  1 +
>>  lib/librte_mbuf/rte_mbuf.h | 44 ++++++++++++++++++++++++++++++++++----------
>>  4 files changed, 37 insertions(+), 11 deletions(-)
>>
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index 3057791..c5a195a 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -425,6 +425,7 @@ testpmd_mbuf_ctor(struct rte_mempool *mp,
>>  	mb->tx_offload   = 0;
>>  	mb->vlan_tci     = 0;
>>  	mb->hash.rss     = 0;
>> +	mb->priv_size    = 0;
>>  }
>>
>>  static void
>> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
>> index c3fcb80..d542461 100644
>> --- a/examples/vhost/main.c
>> +++ b/examples/vhost/main.c
>> @@ -139,7 +139,7 @@
>>  /* Number of descriptors per cacheline. */
>>  #define DESC_PER_CACHELINE (RTE_CACHE_LINE_SIZE / sizeof(struct vring_desc))
>>
>> -#define MBUF_EXT_MEM(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != (mb))
>> +#define MBUF_EXT_MEM(mb)   (rte_mbuf_from_indirect(mb) != (mb))
>>
>>  /* mask of enabled ports */
>>  static uint32_t enabled_port_mask = 0;
>> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
>> index 526b18d..e095999 100644
>> --- a/lib/librte_mbuf/rte_mbuf.c
>> +++ b/lib/librte_mbuf/rte_mbuf.c
>> @@ -125,6 +125,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
>>  	m->pool = mp;
>>  	m->nb_segs = 1;
>>  	m->port = 0xff;
>> +	m->priv_size = 0;
>>  }
>>
>>  /* do some sanity checks on a mbuf: panic if it fails */
>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>> index 17ba791..45ac948 100644
>> --- a/lib/librte_mbuf/rte_mbuf.h
>> +++ b/lib/librte_mbuf/rte_mbuf.h
>> @@ -317,18 +317,42 @@ struct rte_mbuf {
>>  			/* uint64_t unused:8; */
>>  		};
>>  	};
>> +
>> +	uint16_t priv_size;       /**< size of the application private data */
>>  } __rte_cache_aligned;
>>
>>  /**
>> - * Given the buf_addr returns the pointer to corresponding mbuf.
>> + * Return the mbuf owning the data buffer address of an indirect mbuf.
>> + *
>> + * @param mi
>> + *   The pointer to the indirect mbuf.
>> + * @return
>> + *   The address of the direct mbuf corresponding to buffer_addr.
>>   */
>> -#define RTE_MBUF_FROM_BADDR(ba)     (((struct rte_mbuf *)(ba)) - 1)
>> +static inline struct rte_mbuf *
>> +rte_mbuf_from_indirect(struct rte_mbuf *mi)
>> +{
>> +       struct rte_mbuf *md;
>> +       md = (struct rte_mbuf *)((char *)mi->buf_addr - sizeof(*mi) -
>> +	       mi->priv_size);
>> +       return md;
>> +}
>>
>>  /**
>> - * Given the pointer to mbuf returns an address where it's  buf_addr
>> - * should point to.
>> + * Return the buffer address embedded in the given mbuf.
>> + *
>> + * @param md
>> + *   The pointer to the mbuf.
>> + * @return
>> + *   The address of the data buffer owned by the mbuf.
>>   */
>> -#define RTE_MBUF_TO_BADDR(mb)       (((struct rte_mbuf *)(mb)) + 1)
>> +static inline char *
>> +rte_mbuf_to_baddr(struct rte_mbuf *md)
>> +{
>> +       char *buffer_addr;
>> +       buffer_addr = (char *)md + sizeof(*md) + md->priv_size;
>> +       return buffer_addr;
>> +}
>>
> 
> I am a bit puzzled here, so for indirect mbuf, what value priv_size should hold?
> Is that priv_size of indirect mfuf, or priv_size of direct mbuf, that mbuf is attached to?
> If it is first one, then your changes in __rte_pktmbuf_prefree_seg() wouldn't work properly,
> If second one then  changes in rte_pktmbuf_detach() looks wrong to me.
> Unless, of course priv_size for all mbufs in the system should always be the same value.
> But I suppose, that's not what your intention was, otherwise we don't need priv_size inside mbuf at all -
> just a new macro in config file seems enough, plus it would be easier and faster.

Yes, my assumption was that the priv_size isi the same on all mbufs
of a pool, and probably in most cases all mbufs of the system.
To me, a config macro is not th best solution because it prevents the
application to choose the proper size at run-time, especially if the
dpdk is distributed in binary format.

That's why I decided to have priv_size in mbufs, although having it in
the mempool private area is possible but it adds an additional
indirection (see the cover letter).

> I think that to support ability to setup priv_size on a mempool basis,
> and reserve private space between struct rte_mbuf and rte_mbuf. buf_addr,
> we need to: 
> 
> 1. Store priv_size both inside the mempool and inside the mbuf.
> 
> 2. rte_pktmbuf_attach() should change the value of priv_size to the priv_size of the direct mbuf we are going to attach to:
> rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *md) {... mi->priv_size = md->priv_size; ...}
> 
> 3. rte_pktmbuf_detach() should restore original value of mbuf's priv_size:
> rte_pktmbuf_detach(struct rte_mbuf *m) 
> {
>   ...
>    m->priv_size = rte_mempool_get_privsize(m->pool);
>    m->buf _addr= rte_mbuf_to_baddr(m);
>    ...
> }
> 
> Also I think we need to provide a way to specify priv_size for all mbufs of the mepool at init time:
> - either force people to specify it at rte_mempool_create() time (probably use init_arg for that),
> - or provide separate function that could be called straight after rte_mempool_create() , that
> would setup priv_size for the  pool and for all its mbufs.
> - or some sort of combination of these 2 approaches - introduce a wrapper function 
> (rte_mbuf_pool_create() or something) that would take priv_size as parameter, 
> create a new mempool and then setup priv_size.

Your solution looks good to me, and even if I'm not absolutely
convinced that there is a use case where a packet is attached to
another one with a different priv_size, it's stronger from an
API perspective to support this. Maybe it could happen in a
virtualization or secondary process context where there are 2
different mbuf pools.

The mbuf_priv_size could go in struct rte_pktmbuf_pool_private, but
it can already be deducted from what we have today without changing
anything:

priv_size = pool->elt_size - RTE_PKTMBUF_HEADROOM -
	pool_private->mbuf_data_room_size -
	sizeof(rte_mbuf)

Introducing rte_mbuf_pool_create() seems a good idea to me, it
would hide 'rte_pktmbuf_pool_private' from the user and force
to initialize all the required fields (mbuf_data_room_size only
today, and maybe mbuf_priv_size).

The API would be:

struct rte_mempool *
rte_mbuf_pool_create(const char *name, unsigned n, unsigned elt_size,
	unsigned cache_size, size_t mbuf_priv_size,
	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
	int socket_id, unsigned flags)

I can give it a try and send a patch for this.

> Though, I still think that the better approach is to reserve private space before struct rte_mbuf, not after.
> In that case, user got his private space, and we keep buf_addr straight after  rte_mbuf, without any whole.
> So we don't need steps 2 and 3, above,
> plus we don't need rte_mbuf_to_baddr() and rte_mbuf_from_indirect() - 
> RTE_MBUF_TO_BADDR/ RTE_MBUF_FROM_BADDR would keep working correctly.
> In fact, with this scheme - we don't even need priv_size for mbuf management (attach/detach/free).
> 
> Wonder did you try that approach?

Yes, I though about this approach too. But I think it would require
more changes. When an application or a driver allocate a mbuf with
mempool_get(), it expects that the returned pointer is the rte_mbuf *.

With this solution, there are 2 options:
- no mempool modification, so each application/driver has to add
  priv_size bytes to the object to get the mbuf pointer. This does not
  look feasible.
- change the mempool to support private area before each object. I
  think mempool API is already quite complex, and I think it's not
  the role of the mempool library to provide such features.

In any case, I think it would require more modifications (in terms of
lines of code, but also in terms of "allocation logic"). At the end,
the patch I suggested does not break any paradygm and just add ~10 lines
of code.

Regards,
Olivier

  reply	other threads:[~2015-03-27  9:07 UTC|newest]

Thread overview: 101+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-25 17:00 [dpdk-dev] [PATCH 0/5] mbuf: enhancements of mbuf clones Olivier Matz
2015-03-25 17:00 ` [dpdk-dev] [PATCH 1/5] mbuf: fix clone support when application uses private mbuf data Olivier Matz
2015-03-26 13:35   ` Bruce Richardson
2015-03-26 15:30     ` Olivier MATZ
2015-03-25 17:00 ` [dpdk-dev] [PATCH 2/5] mbuf: allow to clone an indirect mbuf Olivier Matz
2015-03-25 17:00 ` [dpdk-dev] [PATCH 3/5] test/mbuf: rename mc variable in m Olivier Matz
2015-03-25 17:00 ` [dpdk-dev] [PATCH 4/5] test/mbuf: enhance mbuf refcnt test Olivier Matz
2015-03-25 17:00 ` [dpdk-dev] [PATCH 5/5] test/mbuf: verify that cloning a clone works properly Olivier Matz
2015-03-26  8:48   ` Olivier MATZ
2015-03-26 15:59 ` [dpdk-dev] [PATCH v2 0/5] mbuf: enhancements of mbuf clones Olivier Matz
2015-03-26 15:59   ` [dpdk-dev] [PATCH v2 1/5] mbuf: fix clone support when application uses private mbuf data Olivier Matz
2015-03-26 17:13     ` Zoltan Kiss
2015-03-27  0:24     ` Ananyev, Konstantin
2015-03-27  9:07       ` Olivier MATZ [this message]
2015-03-27 13:56         ` Olivier MATZ
2015-03-27 14:25           ` Ananyev, Konstantin
2015-03-27 15:17             ` Olivier MATZ
2015-03-27 18:11               ` Zoltan Kiss
2015-03-28 21:19                 ` Olivier MATZ
2015-03-30 12:34               ` Ananyev, Konstantin
2015-03-30 19:55                 ` Olivier MATZ
2015-03-30 23:17                   ` Ananyev, Konstantin
2015-03-31 19:01                     ` Olivier MATZ
2015-04-01 13:48                       ` Ananyev, Konstantin
2015-04-01 15:18                         ` Olivier MATZ
2015-03-26 15:59   ` [dpdk-dev] [PATCH v2 2/5] mbuf: allow to clone an indirect mbuf Olivier Matz
2015-03-26 15:59   ` [dpdk-dev] [PATCH v2 3/5] test/mbuf: rename mc variable in m Olivier Matz
2015-03-26 15:59   ` [dpdk-dev] [PATCH v2 4/5] test/mbuf: enhance mbuf refcnt test Olivier Matz
2015-03-26 15:59   ` [dpdk-dev] [PATCH v2 5/5] test/mbuf: verify that cloning a clone works properly Olivier Matz
2015-03-31 19:22   ` [dpdk-dev] [PATCH v3 0/5] mbuf: enhancements of mbuf clones Olivier Matz
2015-03-31 19:23     ` [dpdk-dev] [PATCH v3 1/5] mbuf: fix clone support when application uses private mbuf data Olivier Matz
2015-04-02 14:32       ` Zoltan Kiss
2015-04-02 17:21       ` Ananyev, Konstantin
2015-04-06 21:49         ` Olivier MATZ
2015-04-07 12:40           ` Ananyev, Konstantin
2015-04-07 15:45             ` Olivier MATZ
2015-04-07 17:17               ` Ananyev, Konstantin
2015-04-08  9:44                 ` Olivier MATZ
2015-04-08 13:45                   ` Ananyev, Konstantin
2015-04-09 13:06                     ` Olivier MATZ
2015-04-20 15:41       ` [dpdk-dev] [PATCH v4 00/12] mbuf: enhancements of mbuf clones Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 01/12] mbuf: fix mbuf data room size calculation rte_pktmbuf_pool_init Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 02/12] examples: always initialize mbuf_pool private area Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 03/12] mbuf: add accessors to get data room size and private size Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 04/12] mbuf: fix rte_pktmbuf_init when mbuf private size is not zero Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 05/12] testpmd: use standard functions to initialize mbufs and mbuf pool Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 06/12] mbuf: introduce a new helper to create a " Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 07/12] apps: use rte_pktmbuf_pool_create to create mbuf pools Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 08/12] mbuf: fix clone support when application uses private mbuf data Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 09/12] mbuf: allow to clone an indirect mbuf Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 10/12] test/mbuf: rename mc variable in m Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 11/12] test/mbuf: enhance mbuf refcnt test Olivier Matz
2015-04-20 15:41         ` [dpdk-dev] [PATCH v4 12/12] test/mbuf: verify that cloning a clone works properly Olivier Matz
2015-04-20 16:53         ` [dpdk-dev] [PATCH v4 00/12] mbuf: enhancements of mbuf clones Neil Horman
2015-04-20 17:07           ` Olivier MATZ
2015-04-20 17:21             ` Neil Horman
2015-04-20 18:24               ` Olivier MATZ
2015-04-21  9:55         ` [dpdk-dev] [PATCH v5 " Olivier Matz
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 01/12] mbuf: fix mbuf data room size calculation rte_pktmbuf_pool_init Olivier Matz
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 02/12] examples: always initialize mbuf_pool private area Olivier Matz
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 03/12] mbuf: add accessors to get data room size and private size Olivier Matz
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 04/12] mbuf: fix rte_pktmbuf_init when mbuf private size is not zero Olivier Matz
2015-04-21 15:07             ` Ananyev, Konstantin
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 05/12] testpmd: use standard functions to initialize mbufs and mbuf pool Olivier Matz
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 06/12] mbuf: introduce a new helper to create a " Olivier Matz
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 07/12] apps: use rte_pktmbuf_pool_create to create mbuf pools Olivier Matz
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 08/12] mbuf: fix clone support when application uses private mbuf data Olivier Matz
2015-04-21 15:01             ` Ananyev, Konstantin
2015-04-21 15:26               ` Olivier MATZ
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 09/12] mbuf: allow to clone an indirect mbuf Olivier Matz
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 10/12] test/mbuf: rename mc variable in m Olivier Matz
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 11/12] test/mbuf: enhance mbuf refcnt test Olivier Matz
2015-04-21  9:55           ` [dpdk-dev] [PATCH v5 12/12] test/mbuf: verify that cloning a clone works properly Olivier Matz
2015-04-21 11:50           ` [dpdk-dev] [PATCH v5 00/12] mbuf: enhancements of mbuf clones Neil Horman
2015-04-22  9:57           ` [dpdk-dev] [PATCH v6 00/13] " Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 01/13] mbuf: fix mbuf data room size calculation rte_pktmbuf_pool_init Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 02/13] examples: always initialize mbuf_pool private area Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 03/13] mbuf: add accessors to get data room size and private size Olivier Matz
2015-04-28  9:15               ` Thomas Monjalon
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 04/13] mbuf: fix rte_pktmbuf_init when mbuf private size is not zero Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 05/13] testpmd: use standard functions to initialize mbufs and mbuf pool Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 06/13] mbuf: introduce a new helper to create a " Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 07/13] apps: use rte_pktmbuf_pool_create to create mbuf pools Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 08/13] mbuf: fix clone support when application uses private mbuf data Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 09/13] mbuf: allow to clone an indirect mbuf Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 10/13] test/mbuf: rename mc variable in m Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 11/13] test/mbuf: enhance mbuf refcnt test Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 12/13] test/mbuf: verify that cloning a clone works properly Olivier Matz
2015-04-22  9:57             ` [dpdk-dev] [PATCH v6 13/13] test/mbuf: add a test case for clone with different priv size Olivier Matz
2015-04-22 11:59             ` [dpdk-dev] [PATCH v6 00/13] mbuf: enhancements of mbuf clones Ananyev, Konstantin
2015-04-24 10:38               ` Zoltan Kiss
2015-04-27 17:38                 ` Thomas Monjalon
2015-04-28 11:15                   ` Zoltan Kiss
2015-05-07  1:57                 ` Xu, HuilongX
2015-05-07  7:32                   ` Olivier MATZ
2015-05-07  9:39                     ` Ananyev, Konstantin
2015-04-28  9:50               ` Thomas Monjalon
2015-03-31 19:23     ` [dpdk-dev] [PATCH v3 2/5] mbuf: allow to clone an indirect mbuf Olivier Matz
2015-03-31 19:23     ` [dpdk-dev] [PATCH v3 3/5] test/mbuf: rename mc variable in m Olivier Matz
2015-03-31 19:23     ` [dpdk-dev] [PATCH v3 4/5] test/mbuf: enhance mbuf refcnt test Olivier Matz
2015-03-31 19:23     ` [dpdk-dev] [PATCH v3 5/5] test/mbuf: verify that cloning a clone works properly Olivier Matz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55151DDE.8040301@6wind.com \
    --to=olivier.matz@6wind.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).