Message-ID: <55151DDE.8040301@6wind.com>
Date: Fri, 27 Mar 2015 10:07:42 +0100
From: Olivier MATZ
To: "Ananyev, Konstantin" , "dev@dpdk.org"
In-Reply-To: <2601191342CEEE43887BDE71AB97725821407D4D@irsmsx105.ger.corp.intel.com>
Subject: Re: [dpdk-dev] [PATCH v2 1/5] mbuf: fix clone support when application uses private mbuf data

Hi Konstantin,

Thank you for your comments.

On 03/27/2015 01:24 AM, Ananyev, Konstantin wrote:
> Hi Olivier,
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz
>> Sent: Thursday, March 26, 2015 4:00 PM
>> To: dev@dpdk.org
>> Cc: zoltan.kiss@linaro.org
>> Subject: [dpdk-dev] [PATCH v2 1/5] mbuf: fix clone support when application uses private mbuf data
>>
>> Add a new private_size field in mbuf structure that should
>> be initialized at mbuf pool creation.
>> This field contains the size of the application private data in mbufs.
>>
>> Introduce new static inline functions rte_mbuf_from_indirect()
>> and rte_mbuf_to_baddr() to replace the existing macros, which
>> take the private size into account when attaching and detaching
>> mbufs.
>>
>> Signed-off-by: Olivier Matz
>> ---
>>  app/test-pmd/testpmd.c     |  1 +
>>  examples/vhost/main.c      |  2 +-
>>  lib/librte_mbuf/rte_mbuf.c |  1 +
>>  lib/librte_mbuf/rte_mbuf.h | 44 ++++++++++++++++++++++++++++++++++----------
>>  4 files changed, 37 insertions(+), 11 deletions(-)
>>
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index 3057791..c5a195a 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -425,6 +425,7 @@ testpmd_mbuf_ctor(struct rte_mempool *mp,
>>  	mb->tx_offload = 0;
>>  	mb->vlan_tci = 0;
>>  	mb->hash.rss = 0;
>> +	mb->priv_size = 0;
>>  }
>>
>>  static void
>> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
>> index c3fcb80..d542461 100644
>> --- a/examples/vhost/main.c
>> +++ b/examples/vhost/main.c
>> @@ -139,7 +139,7 @@
>>  /* Number of descriptors per cacheline.
>>  */
>>  #define DESC_PER_CACHELINE (RTE_CACHE_LINE_SIZE / sizeof(struct vring_desc))
>>
>> -#define MBUF_EXT_MEM(mb) (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != (mb))
>> +#define MBUF_EXT_MEM(mb) (rte_mbuf_from_indirect(mb) != (mb))
>>
>>  /* mask of enabled ports */
>>  static uint32_t enabled_port_mask = 0;
>> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
>> index 526b18d..e095999 100644
>> --- a/lib/librte_mbuf/rte_mbuf.c
>> +++ b/lib/librte_mbuf/rte_mbuf.c
>> @@ -125,6 +125,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
>>  	m->pool = mp;
>>  	m->nb_segs = 1;
>>  	m->port = 0xff;
>> +	m->priv_size = 0;
>>  }
>>
>>  /* do some sanity checks on a mbuf: panic if it fails */
>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>> index 17ba791..45ac948 100644
>> --- a/lib/librte_mbuf/rte_mbuf.h
>> +++ b/lib/librte_mbuf/rte_mbuf.h
>> @@ -317,18 +317,42 @@ struct rte_mbuf {
>>  		/* uint64_t unused:8; */
>>  		};
>>  	};
>> +
>> +	uint16_t priv_size; /**< size of the application private data */
>>  } __rte_cache_aligned;
>>
>>  /**
>> - * Given the buf_addr returns the pointer to corresponding mbuf.
>> + * Return the mbuf owning the data buffer address of an indirect mbuf.
>> + *
>> + * @param mi
>> + *   The pointer to the indirect mbuf.
>> + * @return
>> + *   The address of the direct mbuf corresponding to buffer_addr.
>>   */
>> -#define RTE_MBUF_FROM_BADDR(ba) (((struct rte_mbuf *)(ba)) - 1)
>> +static inline struct rte_mbuf *
>> +rte_mbuf_from_indirect(struct rte_mbuf *mi)
>> +{
>> +	struct rte_mbuf *md;
>> +	md = (struct rte_mbuf *)((char *)mi->buf_addr - sizeof(*mi) -
>> +		mi->priv_size);
>> +	return md;
>> +}
>>
>>  /**
>> - * Given the pointer to mbuf returns an address where it's buf_addr
>> - * should point to.
>> + * Return the buffer address embedded in the given mbuf.
>> + *
>> + * @param md
>> + *   The pointer to the mbuf.
>> + * @return
>> + *   The address of the data buffer owned by the mbuf.
>>   */
>> -#define RTE_MBUF_TO_BADDR(mb) (((struct rte_mbuf *)(mb)) + 1)
>> +static inline char *
>> +rte_mbuf_to_baddr(struct rte_mbuf *md)
>> +{
>> +	char *buffer_addr;
>> +	buffer_addr = (char *)md + sizeof(*md) + md->priv_size;
>> +	return buffer_addr;
>> +}
>>
>
> I am a bit puzzled here: for an indirect mbuf, what value should priv_size hold?
> Is it the priv_size of the indirect mbuf, or the priv_size of the direct mbuf it is attached to?
> If it is the first one, then your changes in __rte_pktmbuf_prefree_seg() wouldn't work properly;
> if the second one, then the changes in rte_pktmbuf_detach() look wrong to me.
> Unless, of course, priv_size for all mbufs in the system should always be the same value.
> But I suppose that's not what your intention was, otherwise we wouldn't need priv_size inside the mbuf at all -
> a new macro in the config file would seem enough, plus it would be easier and faster.

Yes, my assumption was that the priv_size is the same on all mbufs of a
pool, and probably in most cases on all mbufs of the system. To me, a
config macro is not the best solution because it prevents the application
from choosing the proper size at run-time, especially if the dpdk is
distributed in binary format. That's why I decided to have priv_size in
mbufs, although having it in the mempool private area is possible but it
adds an additional indirection (see the cover letter).

> I think that to support the ability to set up priv_size on a per-mempool basis,
> and reserve private space between struct rte_mbuf and rte_mbuf.buf_addr,
> we need to:
>
> 1. Store priv_size both inside the mempool and inside the mbuf.
>
> 2. rte_pktmbuf_attach() should change the value of priv_size to the priv_size of the direct mbuf we are going to attach to:
>    rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *md) {... mi->priv_size = md->priv_size; ...}
>
> 3. rte_pktmbuf_detach() should restore the original value of the mbuf's priv_size:
>    rte_pktmbuf_detach(struct rte_mbuf *m)
>    {
>    ...
>    m->priv_size = rte_mempool_get_privsize(m->pool);
>    m->buf_addr = rte_mbuf_to_baddr(m);
>    ...
>    }
>
> Also I think we need to provide a way to specify priv_size for all mbufs of the mempool at init time:
> - either force people to specify it at rte_mempool_create() time (probably use init_arg for that),
> - or provide a separate function that could be called straight after rte_mempool_create(), that
>   would set up priv_size for the pool and for all its mbufs,
> - or some sort of combination of these 2 approaches - introduce a wrapper function
>   (rte_mbuf_pool_create() or something) that would take priv_size as a parameter,
>   create a new mempool and then set up priv_size.

Your solution looks good to me, and even if I'm not absolutely convinced
that there is a use case where a packet is attached to another one with a
different priv_size, it's stronger from an API perspective to support
this. Maybe it could happen in a virtualization or secondary process
context where there are 2 different mbuf pools.

The mbuf_priv_size could go in struct rte_pktmbuf_pool_private, but it can
already be deduced from what we have today without changing anything:

  priv_size = pool->elt_size - RTE_PKTMBUF_HEADROOM -
      pool_private->mbuf_data_room_size - sizeof(struct rte_mbuf)

Introducing rte_mbuf_pool_create() seems a good idea to me: it would hide
'rte_pktmbuf_pool_private' from the user and force the initialization of
all the required fields (mbuf_data_room_size only today, and maybe
mbuf_priv_size). The API would be:

  struct rte_mempool *
  rte_mbuf_pool_create(const char *name, unsigned n, unsigned elt_size,
      unsigned cache_size, size_t mbuf_priv_size,
      rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
      int socket_id, unsigned flags)

I can give it a try and send a patch for this.

> Though, I still think that the better approach is to reserve private space before struct rte_mbuf, not after.
> In that case, the user gets his private space, and we keep buf_addr straight after rte_mbuf, without any hole.
> So we don't need steps 2 and 3 above,
> plus we don't need rte_mbuf_to_baddr() and rte_mbuf_from_indirect() -
> RTE_MBUF_TO_BADDR/RTE_MBUF_FROM_BADDR would keep working correctly.
> In fact, with this scheme we don't even need priv_size for mbuf management (attach/detach/free).
>
> I wonder, did you try that approach?

Yes, I thought about this approach too, but I think it would require more
changes. When an application or a driver allocates a mbuf with
mempool_get(), it expects that the returned pointer is the rte_mbuf *.
With this solution, there are 2 options:

- no mempool modification, so each application/driver has to add priv_size
  bytes to the object to get the mbuf pointer. This does not look feasible.

- change the mempool to support a private area before each object. The
  mempool API is already quite complex, and I think it's not the role of
  the mempool library to provide such a feature.

In any case, I think it would require more modifications (in terms of
lines of code, but also in terms of "allocation logic"). In the end, the
patch I suggested does not break any paradigm and just adds ~10 lines of
code.

Regards,
Olivier
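[Editor's note: the pointer arithmetic debated in this thread (a private
area placed between the mbuf header and its data buffer) can be sketched
in standalone C. The struct below is a deliberately simplified,
hypothetical stand-in for struct rte_mbuf, keeping only the two fields
relevant here; the function names are toy versions of the patch's
rte_mbuf_to_baddr()/rte_mbuf_from_indirect().]

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct rte_mbuf: only the fields involved in the
 * attach/detach arithmetic discussed above. */
struct toy_mbuf {
	void *buf_addr;      /* start of the data buffer */
	uint16_t priv_size;  /* size of the application private area */
};

/* Layout assumed by the patch: [header][private area][data buffer].
 * The buffer address is found by skipping the header and the private
 * area (toy version of rte_mbuf_to_baddr()). */
static char *
toy_mbuf_to_baddr(struct toy_mbuf *md)
{
	return (char *)md + sizeof(*md) + md->priv_size;
}

/* Walk back from the shared buffer to the direct mbuf that owns it
 * (toy version of rte_mbuf_from_indirect()). Note this only yields the
 * direct mbuf if mi->priv_size equals the *direct* mbuf's priv_size,
 * which is exactly the point raised in the review. */
static struct toy_mbuf *
toy_mbuf_from_indirect(struct toy_mbuf *mi)
{
	return (struct toy_mbuf *)((char *)mi->buf_addr - sizeof(*mi) -
	    mi->priv_size);
}
```

With this layout, an indirect mbuf that copies both buf_addr and
priv_size from its direct mbuf round-trips correctly, mirroring step 2 of
the proposal (mi->priv_size = md->priv_size).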