From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f179.google.com (mail-ob0-f179.google.com [209.85.214.179]) by dpdk.org (Postfix) with ESMTP id 02F34234 for ; Fri, 27 Feb 2015 18:09:32 +0100 (CET) Received: by mail-ob0-f179.google.com with SMTP id wp4so19567705obc.10 for ; Fri, 27 Feb 2015 09:09:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=esFKoa3198TGxCuSKiXcUbwCXYWLyzqP9kPLOjnQIZY=; b=p50sgeMZZb/Iu76X19Ri1RH3OW1MA1hXmb7Pd4j843/ralsMxLgp49ETg71JY8t4u7 DwmkqFzb78ERMIDFqallC9BUp7txGNR7t3lnu5TeIzq2W2HW8mNZ7Vkv7NQ/U/lnQPkg m7oYqh5sQAYV0GXsi/4xtAfZEumVz0K7ZVNkEDGG9zpkbT8v3bdrDdoA0W1RajphdiDh 9BTL+YEjOYI6lcjkoKx2pzJy4NxRLIXHk2M0EMTodHBqnKnNdhQSWjCvbQ0PaF0dgFBF xkYa2cB1btrxK54vZTuMx7KAU2CkSgVuiG+Xp5osY6H2bJOPQPm1/Y8+FzEvVxhgqGyU rwoA== MIME-Version: 1.0 X-Received: by 10.182.125.130 with SMTP id mq2mr10729008obb.52.1425056972300; Fri, 27 Feb 2015 09:09:32 -0800 (PST) Received: by 10.202.105.138 with HTTP; Fri, 27 Feb 2015 09:09:32 -0800 (PST) In-Reply-To: <54F06F3A.40401@6wind.com> References: <1424992506-20484-1-git-send-email-vadim.suraev@gmail.com> <2601191342CEEE43887BDE71AB977258213F2C93@irsmsx105.ger.corp.intel.com> <54F06F3A.40401@6wind.com> Date: Fri, 27 Feb 2015 19:09:32 +0200 Message-ID: From: Vadim Suraev To: Olivier MATZ Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.15 Cc: dev@dpdk.org Subject: Re: [dpdk-dev] [PATCH] rte_mbuf: scattered pktmbufs freeing optimization X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Feb 2015 17:09:33 -0000 Hi, Olivier, Hi Konstantin, >Indeed, this function looks useful, and I also have a work in progress >on this topic, but currently it is not well tested. I'm sorry, I didn't know. I'll not interfere with my patch)) >About the inlining, I have no objection now, although Stephen may be >right. I think we can consider un-inline some functions, based on >performance measurements. I've also noticed in many cases it makes no difference. Seems to be some trade-off. >- clarify the difference between raw_alloc/raw_free and > mempool_get/mempool_put: For instance, I think that the reference > counter initialization should be managed by rte_pktmbuf_reset() like > the rest of the fields, therefore raw_alloc/raw_free could be replaced It looks useful to me since not all of the fields are used in every particular application so if the allocation is decoupled from reset, one can save some cycles. Regards, Vadim. On Fri, Feb 27, 2015 at 3:20 PM, Olivier MATZ wrote: > Hi Vadim, Hi Konstantin, > > On 02/27/2015 01:18 PM, Vadim Suraev wrote: > > Hi, Konstantin, > > > >> Seems really useful. > > Indeed, this function looks useful, and I also have a work in progress > on this topic, but currently it is not well tested. > > As we are on the mbuf subject, for 2.1, I would like to discuss some > points of mbuf API: > > - support mbuf and segment bulk allocation/free > - clarify the difference between raw_alloc/raw_free and > mempool_get/mempool_put: For instance, I think that the reference > counter initialization should be managed by rte_pktmbuf_reset() like > the rest of the fields, therefore raw_alloc/raw_free could be replaced > by mempool functions (this would imply some modifications in drivers). > - clarify which functions are internal and which are not. Some could be > made public and hence should be documented: from what I see, the > vhost example currently uses internal APIs and I suspect it may > not work when MBUF_DEBUG=y. > - factorize rte_rxmbuf_alloc() which is duplicated in almost any driver > - discuss if driver should be calling pktmbuf_reset(). > It would avoid some bugs where fields are not properly initialized. > - check if rte_pktmbuf_free() is called on mbufs with m->next or > refcount not initialized > - support clones of clones > - support attach/detach of mbufs embedding a private part > - move m->next in the first cache line as it's initialized in rx > - rename "__rte_pktmbuf_prefree_seg" in something clearer like "decref". > > To avoid doing the work twice, please note that I already have an > implementation for: > - rte_mbuf_raw_free_bulk(): each mbuf return in its own pool > - rte_mbuf_free_seg_bulk(): similar to call rte_pktmbuf_free_seg() > on each mbuf > but they are not tested at all now. > > I will submit a RFC series in the coming weeks that addresses these > issues. It could be used as a basis for a discussion. > > Sorry Vadim for polluting the topic, but I think the subjects are > quite linked together. Please find some comments on your patch below. > > >> One thought - why to introduce the limitation that all mbufs have to be > > from the same mempool? > >> I think you can reorder it a bit, so it can handle situation when > chained > > mbufs belong to different mempools. > > > > I had a doubt, my concern was how practical is that (multiple mempools) > > case? > > I think having several mempool is common on multi-socket machines. > > > Do you think there should be two versions: lightweight (with the > > restriction) and generic? > > I think having the version that supports different mempools is more > important. Checking if the mempool is the same may not cost too much > compared to the rest of the work. But if we find a good use case > where having the lightweight version is preferable, we could have > 2 versions (ex: if performance is really different). > > >>> +/* This macro defines the size of max bulk of mbufs to free for > >> rte_pktmbuf_free_bulk */ > >>> +#define MAX_MBUF_FREE_SIZE 32 > > I wonder if it's required to have this macro. > > In my implementation, I use a dynamic-sized table: > > static inline void > rte_mbuf_free_seg_bulk(struct rte_mbuf **m_table, unsigned n) > { > void *obj_table[n]; > [...] > > > >>> + * > >>> + * @param head > >>> + * The head of mbufs to be freed chain > >>> + */ > >>> + > >>> +static inline void __attribute__((always_inline)) > >>> +rte_pktmbuf_free_bulk(struct rte_mbuf *head) > > About the inlining, I have no objection now, although Stephen may be > right. I think we can consider un-inline some functions, based on > performance measurements. > > >>> +{ > >>> + void *mbufs[MAX_MBUF_FREE_SIZE]; > >>> + unsigned mbufs_count = 0; > >>> + struct rte_mbuf *next; > >>> + > >>> + RTE_MBUF_MEMPOOL_CHECK1(head); > >>> + > >>> + while(head) { > >>> + next = head->next; > >>> + head->next = NULL; > >>> + if(__rte_pktmbuf_prefree_seg(head)) { > >>> + RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(head) == 0); > >>> + RTE_MBUF_MEMPOOL_CHECK2(head); > >>> + mbufs[mbufs_count++] = head; > >>> + } > >>> + head = next; > >>> + if(mbufs_count == MAX_MBUF_FREE_SIZE) { > >>> + rte_mempool_put_bulk(((struct rte_mbuf > >> *)mbufs[0])->pool,mbufs,mbufs_count); > >>> + mbufs_count = 0; > >>> + } > >>> + } > >>> + if(mbufs_count > 0) { > >>> + rte_mempool_put_bulk(((struct rte_mbuf > >> *)mbufs[0])->pool,mbufs,mbufs_count); > >>> + } > >>> +} > >>> + > > Also, I think a unit test should be added. > > Thanks, > Olivier >