From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id 7CCD65686 for ; Tue, 7 Apr 2015 19:22:03 +0200 (CEST)
Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga101.fm.intel.com with ESMTP; 07 Apr 2015 10:17:04 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.11,538,1422950400"; d="scan'208";a="676452368"
Received: from irsmsx108.ger.corp.intel.com ([163.33.3.3]) by orsmga001.jf.intel.com with ESMTP; 07 Apr 2015 10:17:03 -0700
Received: from irsmsx105.ger.corp.intel.com ([169.254.7.2]) by IRSMSX108.ger.corp.intel.com ([169.254.11.216]) with mapi id 14.03.0224.002; Tue, 7 Apr 2015 18:17:02 +0100
From: "Ananyev, Konstantin"
To: Olivier MATZ , "dev@dpdk.org"
Thread-Topic: [PATCH v3 1/5] mbuf: fix clone support when application uses private mbuf data
Thread-Index: AQHQa+gyjaiCAXJ/uUmIkTCvT5y80Z058RBAgAaNe4CAAQPAcIAAKOKAgAAVpxA=
Date: Tue, 7 Apr 2015 17:17:01 +0000
Message-ID: <2601191342CEEE43887BDE71AB9772582141451F@irsmsx105.ger.corp.intel.com>
References: <1427385595-15011-1-git-send-email-olivier.matz@6wind.com> <1427829784-12323-1-git-send-email-zer0@droids-corp.org> <1427829784-12323-2-git-send-email-zer0@droids-corp.org> <2601191342CEEE43887BDE71AB97725821413A2D@irsmsx105.ger.corp.intel.com> <5522FF6B.1030503@6wind.com> <2601191342CEEE43887BDE71AB97725821414310@irsmsx105.ger.corp.intel.com> <5523FB9B.2060508@6wind.com>
In-Reply-To: <5523FB9B.2060508@6wind.com>
Accept-Language: en-IE, en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [163.33.239.180]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH v3 1/5] mbuf: fix clone support when application uses private mbuf data
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 07 Apr 2015 17:22:04 -0000

Hi Olivier,

> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Tuesday, April 07, 2015 4:46 PM
> To: Ananyev, Konstantin; dev@dpdk.org
> Cc: zoltan.kiss@linaro.org; Richardson, Bruce
> Subject: Re: [PATCH v3 1/5] mbuf: fix clone support when application uses private mbuf data
>
> Hi Konstantin,
>
> On 04/07/2015 02:40 PM, Ananyev, Konstantin wrote:
> > Hi Olivier,
> >
> >> -----Original Message-----
> >> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> >> Sent: Monday, April 06, 2015 10:50 PM
> >> To: Ananyev, Konstantin; dev@dpdk.org
> >> Cc: zoltan.kiss@linaro.org; Richardson, Bruce
> >> Subject: Re: [PATCH v3 1/5] mbuf: fix clone support when application uses private mbuf data
> >>
> >> Hi Konstantin,
> >>
> >> Thanks for your comments.
> >>
> >> On 04/02/2015 07:21 PM, Ananyev, Konstantin wrote:
> >>> Hi Olivier,
> >>>
> >>>> -----Original Message-----
> >>>> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> >>>> Sent: Tuesday, March 31, 2015 8:23 PM
> >>>> To: dev@dpdk.org
> >>>> Cc: Ananyev, Konstantin; zoltan.kiss@linaro.org; Richardson, Bruce; Olivier Matz
> >>>> Subject: [PATCH v3 1/5] mbuf: fix clone support when application uses private mbuf data
> >>>>
> >>>> From: Olivier Matz
> >>>>
> >>>> Add a new priv_size field in mbuf structure that should
> >>>> be initialized at mbuf pool creation. This field contains the
> >>>> size of the application private data in mbufs.
> >>>>
> >>>> Introduce new static inline functions rte_mbuf_from_indirect()
> >>>> and rte_mbuf_to_baddr() to replace the existing macros, which
> >>>> take the private size into account when attaching and detaching
> >>>> mbufs.
> >>>>
> >>>> Signed-off-by: Olivier Matz
> >>>> ---
> >>>>  app/test-pmd/testpmd.c     |  1 +
> >>>>  examples/vhost/main.c      |  4 +--
> >>>>  lib/librte_mbuf/rte_mbuf.c |  1 +
> >>>>  lib/librte_mbuf/rte_mbuf.h | 77 +++++++++++++++++++++++++++++++++++-----------
> >>>>  4 files changed, 63 insertions(+), 20 deletions(-)
> >>>>
> >>>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> >>>> index 3057791..c5a195a 100644
> >>>> --- a/app/test-pmd/testpmd.c
> >>>> +++ b/app/test-pmd/testpmd.c
> >>>> @@ -425,6 +425,7 @@ testpmd_mbuf_ctor(struct rte_mempool *mp,
> >>>>  	mb->tx_offload = 0;
> >>>>  	mb->vlan_tci = 0;
> >>>>  	mb->hash.rss = 0;
> >>>> +	mb->priv_size = 0;
> >>>>  }
> >>>>
> >>>>  static void
> >>>> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> >>>> index c3fcb80..e44e82f 100644
> >>>> --- a/examples/vhost/main.c
> >>>> +++ b/examples/vhost/main.c
> >>>> @@ -139,7 +139,7 @@
> >>>>  /* Number of descriptors per cacheline. */
> >>>>  #define DESC_PER_CACHELINE (RTE_CACHE_LINE_SIZE / sizeof(struct vring_desc))
> >>>>
> >>>> -#define MBUF_EXT_MEM(mb) (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != (mb))
> >>>> +#define MBUF_EXT_MEM(mb) (rte_mbuf_from_indirect(mb) != (mb))
> >>>>
> >>>>  /* mask of enabled ports */
> >>>>  static uint32_t enabled_port_mask = 0;
> >>>> @@ -1550,7 +1550,7 @@ attach_rxmbuf_zcp(struct virtio_net *dev)
> >>>>  static inline void pktmbuf_detach_zcp(struct rte_mbuf *m)
> >>>>  {
> >>>>  	const struct rte_mempool *mp = m->pool;
> >>>> -	void *buf = RTE_MBUF_TO_BADDR(m);
> >>>> +	void *buf = rte_mbuf_to_baddr(m);
> >>>>  	uint32_t buf_ofs;
> >>>>  	uint32_t buf_len = mp->elt_size - sizeof(*m);
> >>>>  	m->buf_physaddr = rte_mempool_virt2phy(mp, m) + sizeof(*m);
> >>>> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> >>>> index 526b18d..e095999 100644
> >>>> --- a/lib/librte_mbuf/rte_mbuf.c
> >>>> +++ b/lib/librte_mbuf/rte_mbuf.c
> >>>> @@ -125,6 +125,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
> >>>>  	m->pool = mp;
> >>>>  	m->nb_segs = 1;
> >>>>  	m->port = 0xff;
> >>>> +	m->priv_size = 0;
> >>>
> >>> Why is it 0?
> >>> Shouldn't it be the same calculations as in detach() below:
> >>> m->priv_size = /*get private size from mempool private*/;
> >>> m->buf_addr = (char *)m + sizeof(struct rte_mbuf) + m->priv_size;
> >>> m->buf_len = mp->elt_size - sizeof(struct rte_mbuf) - m->priv_size;
> >>> ?
> >>
> >> It's 0 because we also have in the function (not visible in the
> >> patch):
> >>
> >>   m->buf_addr = (char *)m + sizeof(struct rte_mbuf);
> >
> > Yep, that's why as I wrote above, I think we need to set up here all 3 fields:
> > priv_size, buf_addr, buf_len exactly in the same way as in detach().
> >
> >>
> >> It means that an application that wants to use a private area has
> >> to provide another init function derived from this default function.
> >
> > After your changes, attach/free and other functions from public mbuf API
> > rely on priv_size being set properly.
> > So I suppose 'official' pktmbuf_init() should also set it in a proper manner.
> >
> >> This was already the case before the patch series.
> >
> > Before this patch series, we didn't have priv_size, so we had nothing to set up.
> >
> >>
> >> As we discussed in previous mail, I plan to propose a rework of
> >> mbuf pool initialization in another series, and my initial idea was to
> >> change this at the same time. But on the other hand it does not hurt
> >> to do this change now. I'll include it in next version.
> >
> > Ok.
>
> Just to be sure we're on the same line:
>
> - before the patch series
>
>   - private area was working before that patch series if clones were not
>     used. To use a private area, the user had to provide another
>     function derived from pktmbuf_init() to change m->buf_addr and
>     m->buf_len.
>   - using both private area + clones was broken
>
> - after the patch series
>
>   - private area is working with or without clone.
>     But to use it,
>     the user still has to provide another function to change
>     m->buf_addr, m->buf_len *and m->priv_size*.
>
> The series just fixes the fact that "clones + priv" was not working.
> It does not address the problem that providing a new pktmbuf_init()
> function is required to use a private area. To fix this, I think it
> could require an API evolution that should be part of another series.

I don't think we need new pktmbuf_init().
We just need to update it, so both pktmbuf_init() and detach() set up
buf_addr, buf_len (and priv_size) to exactly the same values.
If they don't do that, it means that you can't use attach/detach with mempools
created with pktmbuf_init() any more.

BTW, another thing that I just realised:
examples/ipv4_multicast and examples/ip_fragmentation/ -
both create a pool of mbufs with elem_size < 2K and don't populate mempool's private area -
so mbp_priv->mbuf_data_room_size == 0, for them.

So that code in detach():

+	mbp_priv = rte_mempool_get_priv(mp);
+	m->priv_size = mp->elt_size - RTE_PKTMBUF_HEADROOM -
+		mbp_priv->mbuf_data_room_size -
+		sizeof(struct rte_mbuf);

would break both these samples.
I suppose we need to handle the situation when mp->elt_size < RTE_PKTMBUF_HEADROOM + sizeof(struct rte_mbuf),
(and probably also when mbuf_data_room_size == 0) correctly.

Konstantin

>
> I'll send a v4 addressing the comments soon, thanks.
>
> Regards,
> Olivier
>
>
>
> >
> >>
> >>
> >>> BTW, don't see changes in rte_pktmbuf_pool_init() to set up
> >>> mbp_priv->mbuf_data_room_size properly.
> >>> Without those changes, how can people start using that feature?
> >>> It seems that the only way now - set up priv_size and buf_len for each mbuf manually.
> >>
> >> It's the same reason as above. To use a private area, the user has
> >> to provide its own function that sets up data_room_size, derived from
> >> this pool_init default function. This was also the case before the
> >> patch series.
> >>
> >>
> >>>
> >>>>  }
> >>>>
> >>>>  /* do some sanity checks on a mbuf: panic if it fails */
> >>>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> >>>> index 17ba791..932fe58 100644
> >>>> --- a/lib/librte_mbuf/rte_mbuf.h
> >>>> +++ b/lib/librte_mbuf/rte_mbuf.h
> >>>> @@ -317,18 +317,51 @@ struct rte_mbuf {
> >>>>  			/* uint64_t unused:8; */
> >>>>  		};
> >>>>  	};
> >>>> +
> >>>> +	/** Size of the application private data. In case of an indirect
> >>>> +	 * mbuf, it stores the direct mbuf private data size. */
> >>>> +	uint16_t priv_size;
> >>>>  } __rte_cache_aligned;
> >>>>
> >>>>  /**
> >>>> - * Given the buf_addr returns the pointer to corresponding mbuf.
> >>>> + * Return the mbuf owning the data buffer address of an indirect mbuf.
> >>>> + *
> >>>> + * @param mi
> >>>> + *   The pointer to the indirect mbuf.
> >>>> + * @return
> >>>> + *   The address of the direct mbuf corresponding to buffer_addr.
> >>>>   */
> >>>> -#define RTE_MBUF_FROM_BADDR(ba) (((struct rte_mbuf *)(ba)) - 1)
> >>>> +static inline struct rte_mbuf *
> >>>> +rte_mbuf_from_indirect(struct rte_mbuf *mi)
> >>>> +{
> >>>> +	struct rte_mbuf *md;
> >>>> +
> >>>> +	/* mi->buf_addr and mi->priv_size correspond to buffer and
> >>>> +	 * private size of the direct mbuf */
> >>>> +	md = (struct rte_mbuf *)((char *)mi->buf_addr - sizeof(*mi) -
> >>>> +			mi->priv_size);
> >>>
> >>> (uintptr_t)mi->buf_addr?
> >>
> >> Any clue why (uintptr_t) would be better than (char *) ?
> >
> > No big difference really, just looks a bit better to me :)
> >
> >> By the way, I added this cast because it would not compile with
> >> g++ (and probably with icc too).
> >>
> >>>
> >>>> +	return md;
> >>>> +}
> >>>>
> >>>>  /**
> >>>> - * Given the pointer to mbuf returns an address where it's buf_addr
> >>>> - * should point to.
> >>>> + * Return the buffer address embedded in the given mbuf.
> >>>> + *
> >>>> + * The user must ensure that m->priv_size corresponds to the
> >>>> + * private size of this mbuf, which is not the case for indirect
> >>>> + * mbufs.
> >>>> + *
> >>>> + * @param md
> >>>> + *   The pointer to the mbuf.
> >>>> + * @return
> >>>> + *   The address of the data buffer owned by the mbuf.
> >>>>   */
> >>>> -#define RTE_MBUF_TO_BADDR(mb) (((struct rte_mbuf *)(mb)) + 1)
> >>>> +static inline char *
> >>>
> >>> Might be better to return 'void *' here.
> >>
> >> Ok, as m->buf_addr is a (void *).
> >>
> >>>
> >>>> +rte_mbuf_to_baddr(struct rte_mbuf *md)
> >>>> +{
> >>>> +	char *buffer_addr;
> >>>
> >>> uintptr_t buffer_addr?
> >>
> >> Same question as above, I don't really see why it's better than
> >> (char *).
> >>
> >>>
> >>>> +	buffer_addr = (char *)md + sizeof(*md) + md->priv_size;
> >>>> +	return buffer_addr;
> >>>> +}
> >>>>
> >>>>  /**
> >>>>   * Returns TRUE if given mbuf is indirect, or FALSE otherwise.
> >>>> @@ -688,6 +721,7 @@ static inline struct rte_mbuf *rte_pktmbuf_alloc(struct rte_mempool *mp)
> >>>>
> >>>>  /**
> >>>>   * Attach packet mbuf to another packet mbuf.
> >>>> + *
> >>>>   * After attachment we refer the mbuf we attached as 'indirect',
> >>>>   * while mbuf we attached to as 'direct'.
> >>>>   * Right now, not supported:
> >>>> @@ -701,7 +735,6 @@ static inline struct rte_mbuf *rte_pktmbuf_alloc(struct rte_mempool *mp)
> >>>>   * @param md
> >>>>   *   The direct packet mbuf.
> >>>>   */
> >>>> -
> >>>>  static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *md)
> >>>>  {
> >>>>  	RTE_MBUF_ASSERT(RTE_MBUF_DIRECT(md) &&
> >>>> @@ -712,6 +745,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *md)
> >>>>  	mi->buf_physaddr = md->buf_physaddr;
> >>>>  	mi->buf_addr = md->buf_addr;
> >>>>  	mi->buf_len = md->buf_len;
> >>>> +	mi->priv_size = md->priv_size;
> >>>>
> >>>>  	mi->next = md->next;
> >>>>  	mi->data_off = md->data_off;
> >>>> @@ -732,7 +766,8 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *md)
> >>>>  }
> >>>>
> >>>>  /**
> >>>> - * Detach an indirect packet mbuf -
> >>>> + * Detach an indirect packet mbuf.
> >>>> + *
> >>>>   * - restore original mbuf address and length values.
> >>>>   * - reset pktmbuf data and data_len to their default values.
> >>>>   * All other fields of the given packet mbuf will be left intact.
> >>>> @@ -740,22 +775,28 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *md)
> >>>>   * @param m
> >>>>   *   The indirect attached packet mbuf.
> >>>>   */
> >>>> -
> >>>>  static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
> >>>>  {
> >>>> -	const struct rte_mempool *mp = m->pool;
> >>>> -	void *buf = RTE_MBUF_TO_BADDR(m);
> >>>> -	uint32_t buf_len = mp->elt_size - sizeof(*m);
> >>>> -	m->buf_physaddr = rte_mempool_virt2phy(mp, m) + sizeof (*m);
> >>>> -
> >>>> +	struct rte_pktmbuf_pool_private *mbp_priv;
> >>>> +	struct rte_mempool *mp = m->pool;
> >>>> +	void *buf;
> >>>> +	unsigned mhdr_size;
> >>>> +
> >>>> +	/* first, restore the priv_size, this is needed before calling
> >>>> +	 * rte_mbuf_to_baddr() */
> >>>> +	mbp_priv = rte_mempool_get_priv(mp);
> >>>> +	m->priv_size = mp->elt_size - RTE_PKTMBUF_HEADROOM -
> >>>> +		mbp_priv->mbuf_data_room_size -
> >>>> +		sizeof(struct rte_mbuf);
> >>>
> >>> I think it is better to put this priv_size calculation above into a separate function -
> >>> rte_mbuf_get_priv_size(m) or something.
> >>> We need it in a few places, and users would probably need it anyway.
> >>
> >> yep, good idea
> >>
> >>>
> >>>> +
> >>>> +	buf = rte_mbuf_to_baddr(m);
> >>>> +	mhdr_size = (char *)buf - (char *)m;
> >>>
> >>> Why do you need to recalculate mhdr_size here?
> >>> As I understand, it is m->priv_size, and you just retrieved it, 2 lines above.
> >>>
> >>
> >> It's not m->priv_size but (sizeof(rte_mbuf) + m->priv_size).
> >
> > Ah yes, sorry for the confusion.
> >
> >> In both cases, it requires an operation, but maybe
> >>   mhdr_size = (sizeof(rte_mbuf) + m->priv_size)
> >> is clearer than
> >>   mhdr_size = (char *)buf - (char *)m
> >>
> >>
> >>>> +	m->buf_physaddr = rte_mempool_virt2phy(mp, m) + mhdr_size;
> >>>
> >>> Actually I think it could just be:
> >>> m->buf_physaddr = rte_mempool_virt2phy(mp, buf);
> >>
> >> Even if it would work, the API of rte_mempool_virt2phy()
> >> says that the second argument should be "A pointer (virtual address)
> >> to the element of the pool."
> >> I think we should keep the initial code.
> >
> > Ok.
> > Konstantin
> >
> >>
> >> Regards,
> >> Olivier
> >>
> >
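
The address arithmetic discussed in this thread can be sketched in plain C. This is a minimal sketch, not DPDK code: struct sketch_mbuf is a hypothetical, cut-down stand-in for struct rte_mbuf, and the sketch_* helper names are invented for illustration. It assumes the element layout implied by the patch (mbuf header, then the application private area, then the data buffer). Falling back to a private size of 0 when the pool's data room size is unset or the element is too small is only one possible way to handle the case raised above for examples/ipv4_multicast and examples/ip_fragmentation; the thread does not settle on a fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_mbuf: only the fields
 * that take part in the address arithmetic are kept. */
struct sketch_mbuf {
	void *buf_addr;     /* start of the data buffer */
	uint16_t buf_len;   /* length of the data buffer */
	uint16_t priv_size; /* size of the application private area */
};

/* Direct mbuf -> address of the buffer embedded in the same pool element.
 * Assumed layout: [ header | private area | data buffer ]. */
static inline void *
sketch_mbuf_to_baddr(struct sketch_mbuf *md)
{
	return (char *)md + sizeof(*md) + md->priv_size;
}

/* Indirect mbuf -> direct mbuf owning the buffer. After an attach,
 * mi->buf_addr and mi->priv_size describe the direct mbuf, so the
 * header is found by stepping back over private area and header. */
static inline struct sketch_mbuf *
sketch_mbuf_from_indirect(struct sketch_mbuf *mi)
{
	assert(mi->buf_addr != NULL);
	return (struct sketch_mbuf *)((char *)mi->buf_addr -
			sizeof(*mi) - mi->priv_size);
}

/* Guarded form of the priv_size computation quoted from detach().
 * Assumption (not decided in the thread): report 0 when the data room
 * size was never populated or when the element is too small, instead
 * of letting the unsigned subtraction wrap around. */
static inline uint16_t
sketch_mbuf_priv_size(size_t elt_size, size_t headroom, size_t data_room_size)
{
	size_t fixed = sizeof(struct sketch_mbuf) + headroom + data_room_size;

	if (data_room_size == 0 || elt_size < fixed)
		return 0;
	return (uint16_t)(elt_size - fixed);
}
```

With this layout, a direct mbuf at address A with private size P has its buffer at A + sizeof(header) + P, and an indirect mbuf that copied buf_addr and priv_size from it maps back to A.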