From: Zoltan Kiss
Date: Fri, 27 Mar 2015 18:11:05 +0000
To: Olivier MATZ, "Ananyev, Konstantin", dev@dpdk.org
Message-ID: <55159D39.1040608@linaro.org>
In-Reply-To: <55157477.7090207@6wind.com>
Subject: Re: [dpdk-dev] [PATCH v2 1/5] mbuf: fix clone support when application uses private mbuf data

On 27/03/15 15:17, Olivier MATZ wrote:
> Hi Konstantin,
>
> On 03/27/2015 03:25 PM, Ananyev, Konstantin wrote:
>> Hi Olivier,
>>
>>> -----Original Message-----
>>> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
>>> Sent: Friday, March 27, 2015 1:56 PM
>>> To: Ananyev, Konstantin; dev@dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH v2 1/5] mbuf: fix clone support when
>>> application uses private mbuf data
>>>
>>> Hi Konstantin,
>>>
>>> On 03/27/2015 10:07 AM, Olivier MATZ wrote:
>>>>> I think that to support the ability to set up priv_size on a
>>>>> per-mempool basis, and to reserve private space between struct
>>>>> rte_mbuf and rte_mbuf.buf_addr, we need to:
>>>>>
>>>>> 1. Store priv_size both inside the mempool and inside the mbuf.
>>>>>
>>>>> 2. rte_pktmbuf_attach() should change the value of priv_size to
>>>>> the priv_size of the direct mbuf we are going to attach to:
>>>>>
>>>>> rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *md)
>>>>> { ... mi->priv_size = md->priv_size; ... }
>>>>>
>>>>> 3. rte_pktmbuf_detach() should restore the original value of the
>>>>> mbuf's priv_size:
>>>>>
>>>>> rte_pktmbuf_detach(struct rte_mbuf *m)
>>>>> {
>>>>>     ...
>>>>>     m->priv_size = rte_mempool_get_privsize(m->pool);
>>>>>     m->buf_addr = rte_mbuf_to_baddr(m);
>>>>>     ...
>>>>> }
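
If I read 2. and 3. right, the pair would end up looking roughly like
the sketch below (untested, and rte_mempool_get_privsize() is the new
accessor proposed here, it doesn't exist today):

static inline void
rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *md)
{
	/* ... existing code taking a reference on md and sharing
	 * its buffer ... */
	mi->buf_addr = md->buf_addr;
	mi->buf_len = md->buf_len;
	/* the indirect mbuf inherits the private size of the direct
	 * mbuf whose buffer it now points into */
	mi->priv_size = md->priv_size;
}

static inline void
rte_pktmbuf_detach(struct rte_mbuf *m)
{
	/* restore the pool-wide value and point back at our own buffer */
	m->priv_size = rte_mempool_get_privsize(m->pool);
	m->buf_addr = rte_mbuf_to_baddr(m);
	/* ... buf_len, data_off, flags ... */
}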
>>>>>
>>>>> Also I think we need to provide a way to specify priv_size for
>>>>> all mbufs of the mempool at init time:
>>>>> - either force people to specify it at rte_mempool_create() time
>>>>>   (probably use init_arg for that),
>>>>> - or provide a separate function that could be called straight
>>>>>   after rte_mempool_create(), and that would setup priv_size for
>>>>>   the pool and for all its mbufs,
>>>>> - or some sort of combination of these 2 approaches - introduce
>>>>>   a wrapper function (rte_mbuf_pool_create() or something) that
>>>>>   would take priv_size as a parameter, create a new mempool and
>>>>>   then setup priv_size.
>>>
>>> I thought a bit more about this solution, and I realized that doing
>>> mi->priv_size = md->priv_size in rte_pktmbuf_attach() is probably
>>> not a good idea, as there is no guarantee that the size of mi is
>>> large enough to store the priv of md.
>>>
>>> Having the same priv_size for mi and md is maybe a good constraint.
>>> I can add this in the API comments and assertions in the code to
>>> check this condition, what do you think?
>>
>> Probably we have different concepts of what the mbuf's private space
>> is in mind. From my point of view, even an indirect buffer should
>> use its own private space and leave the contents of the direct mbuf
>> it is attached to unmodified. After the attach() operation, only the
>> contents of the buffer are shared between mbufs, not the mbuf's
>> metadata.
>
> Sorry if it was not clear in my previous messages, but I agree
> with your description. When attaching a mbuf, only data, not
> metadata, should be shared.
>
> In the solution you are suggesting (quoted above), you say we need
> to set mi->priv_size to md->priv_size in rte_pktmbuf_attach(). I felt
> this was not possible, but it depends on the meaning we give to
> priv_size:
>
> 1. If the meaning is "the size of the private data embedded in this
>    mbuf", which is the most logical meaning, we cannot do this
>    assignment.
>
> 2. If the meaning is "the size of the private data embedded in the
>    mbuf that buf_addr is pointing to" (which is harder to get), the
>    assignment makes sense.
>
> From what I understand, you feel we should use 2. as the priv_size
> definition. Is that correct?
>
> In my previous message, the definition of m->priv_size was 1., so
> that's why I felt assigning mi->priv_size to md->priv_size was not
> possible.
>
> I agree 2. is probably a good choice, as it would allow attaching to
> a mbuf with a different priv_size. It may require some additional
> comments above the field in the structure to explain that.

I think we need to document very well in the comments that you can
attach mbufs with different private area sizes to each other, and that
applications should care about this difference. And we should provide
a macro to get the private area size, which would read
rte_mbuf.pool->priv_size. Actually, we should give a better name to
rte_mbuf.priv_size, as it's a bit misleading now. Maybe
direct_priv_size?
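
Something like this (just a sketch, and it assumes the per-pool value
ends up as a new priv_size field in struct rte_mempool; the name is
only a suggestion):

/* Private area size of every mbuf of this pool; valid no matter
 * whether the mbuf is currently direct or indirect. */
#define RTE_MBUF_POOL_PRIV_SIZE(m) ((m)->pool->priv_size)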
>
>> Otherwise on detach(), you'll have to copy the contents of the
>> private space back from the direct to the indirect mbuf. And again,
>> how do we deal with the case when 2 or more mbufs attach to the same
>> direct one?
>>
>> So let's say we'll have a macro:
>>
>> #define RTE_MBUF_PRIV_PTR(mb) ((void *)(((struct rte_mbuf *)(mb)) + 1))
>>
>> no matter whether mb is a direct or an indirect mbuf.
>> Do you have something else in mind here?
>
> I completely agree with this macro. We should consider the private
> data as an extension of the mbuf structure.
>
>>>> Introducing rte_mbuf_pool_create() seems a good idea to me, it
>>>> would hide 'rte_pktmbuf_pool_private' from the user and force them
>>>> to initialize all the required fields (mbuf_data_room_size only
>>>> today, and maybe mbuf_priv_size).
>>>>
>>>> The API would be:
>>>>
>>>> struct rte_mempool *
>>>> rte_mbuf_pool_create(const char *name, unsigned n, unsigned elt_size,
>>>>         unsigned cache_size, size_t mbuf_priv_size,
>>>>         rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
>>>>         int socket_id, unsigned flags)
>>>>
>>>> I can give it a try and send a patch for this.
>>>
>>> About this, it is not required anymore for this patch series if we
>>> agree with my comment above.
>>
>> I still think we need some way to set up priv_size on a per-mempool
>> basis. Doing that in rte_mbuf_pool_create() seems like a good thing
>> to me. Not sure why you decided to drop it?
>
> I think we can already do it without changing the API by providing
> our own rte_pktmbuf_init and rte_pktmbuf_pool_init.
>
> rte_pktmbuf_init() has to set:
>
>     m->buf_len = mp->elt_size - sizeof(struct mbuf);
>     m->priv_size = sizeof(struct mbuf) - sizeof(struct rte_mbuf);

What's struct mbuf? If we take my assumption above, direct_priv_size
could go uninitialized, and we could set it when attaching.

>
> rte_pktmbuf_pool_init() has to set:
>
>     /* we can use the default function */
>     mbp_priv->mbuf_data_room_size = MBUF_RXDATA_SIZE +
>             RTE_PKTMBUF_HEADROOM;
>
> In this case, it is possible to calculate the mbuf_priv_size from the
> pool object alone:
>
>     mbuf_priv_size = pool->elt_size - RTE_PKTMBUF_HEADROOM -
>             pool_private->mbuf_data_room_size -
>             sizeof(struct rte_mbuf);

My understanding is that the pool's private data is something
completely different from the private data of the mbufs. I think
rte_mempool.priv_size should be initialized in *mp_init.

>
> I agree it's not ideal, but I think the mbuf pool initialization
> is another problem. That's why I suggested to change this in a
> separate series that will add rte_mbuf_pool_create() with the
> API described above. Thoughts?
>
> Thanks,
> Olivier
>
>> Konstantin
>>
>>> I'll send a separate patch for that. It's probably a good occasion
>>> to get rid of the pointer casted into an integer for
>>> mbuf_data_room_size.
>>>
>>> Regards,
>>> Olivier
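
P.S.: for reference, Olivier's calculation above as a helper (untested
sketch; rte_mempool_get_priv() and struct rte_pktmbuf_pool_private
already exist, the rest is just the arithmetic). Note that I dropped
the separate RTE_PKTMBUF_HEADROOM term: as set in rte_pktmbuf_pool_init()
above, mbuf_data_room_size already includes the headroom, so
subtracting it again would count it twice:

#include <rte_mbuf.h>
#include <rte_mempool.h>

static inline size_t
pktmbuf_priv_size(struct rte_mempool *mp)
{
	/* the pool's own private metadata, where mbuf_data_room_size
	 * lives; not to be confused with the mbufs' private area */
	struct rte_pktmbuf_pool_private *mbp_priv =
		rte_mempool_get_priv(mp);

	return mp->elt_size - mbp_priv->mbuf_data_room_size -
		sizeof(struct rte_mbuf);
}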