From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.droids-corp.org (zoll.droids-corp.org [94.23.50.67]) by dpdk.org (Postfix) with ESMTP id CB9505A6A for ; Fri, 27 Feb 2015 14:21:10 +0100 (CET) Received: from was59-1-82-226-113-214.fbx.proxad.net ([82.226.113.214] helo=[192.168.0.10]) by mail.droids-corp.org with esmtpsa (TLS1.2:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.80) (envelope-from ) id 1YRKub-0004Ii-M7; Fri, 27 Feb 2015 14:25:05 +0100 Message-ID: <54F06F3A.40401@6wind.com> Date: Fri, 27 Feb 2015 14:20:58 +0100 From: Olivier MATZ User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.3.0 MIME-Version: 1.0 To: Vadim Suraev , "Ananyev, Konstantin" References: <1424992506-20484-1-git-send-email-vadim.suraev@gmail.com> <2601191342CEEE43887BDE71AB977258213F2C93@irsmsx105.ger.corp.intel.com> In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Cc: dev@dpdk.org Subject: Re: [dpdk-dev] [PATCH] rte_mbuf: scattered pktmbufs freeing optimization X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Feb 2015 13:21:11 -0000 Hi Vadim, Hi Konstantin, On 02/27/2015 01:18 PM, Vadim Suraev wrote: > Hi, Konstantin, > >> Seems really useful. Indeed, this function looks useful, and I also have a work in progress on this topic, but currently it is not well tested. As we are on the mbuf subject, for 2.1, I would like to discuss some points of mbuf API: - support mbuf and segment bulk allocation/free - clarify the difference between raw_alloc/raw_free and mempool_get/mempool_put: For instance, I think that the reference counter initialization should be managed by rte_pktmbuf_reset() like the rest of the fields, therefore raw_alloc/raw_free could be replaced by mempool functions (this would imply some modifications in drivers). - clarify which functions are internal and which are not. Some could be made public and hence should be documented: from what I see, the vhost example currently uses internal APIs and I suspect it may not work when MBUF_DEBUG=y. - factorize rte_rxmbuf_alloc() which is duplicated in almost any driver - discuss if driver should be calling pktmbuf_reset(). It would avoid some bugs where fields are not properly initialized. - check if rte_pktmbuf_free() is called on mbufs with m->next or refcount not initialized - support clones of clones - support attach/detach of mbufs embedding a private part - move m->next in the first cache line as it's initialized in rx - rename "__rte_pktmbuf_prefree_seg" in something clearer like "decref". To avoid doing the work twice, please note that I already have an implementation for: - rte_mbuf_raw_free_bulk(): each mbuf return in its own pool - rte_mbuf_free_seg_bulk(): similar to call rte_pktmbuf_free_seg() on each mbuf but they are not tested at all now. I will submit a RFC series in the coming weeks that addresses these issues. It could be used as a basis for a discussion. Sorry Vadim for polluting the topic, but I think the subjects are quite linked together. Please find some comments on your patch below. >> One thought - why to introduce the limitation that all mbufs have to be > from the same mempool? >> I think you can reorder it a bit, so it can handle situation when chained > mbufs belong to different mempools. > > I had a doubt, my concern was how practical is that (multiple mempools) > case? I think having several mempool is common on multi-socket machines. > Do you think there should be two versions: lightweight (with the > restriction) and generic? I think having the version that supports different mempools is more important. Checking if the mempool is the same may not cost too much compared to the rest of the work. But if we find a good use case where having the lightweight version is preferable, we could have 2 versions (ex: if performance is really different). >>> +/* This macro defines the size of max bulk of mbufs to free for >> rte_pktmbuf_free_bulk */ >>> +#define MAX_MBUF_FREE_SIZE 32 I wonder if it's required to have this macro. In my implementation, I use a dynamic-sized table: static inline void rte_mbuf_free_seg_bulk(struct rte_mbuf **m_table, unsigned n) { void *obj_table[n]; [...] >>> + * >>> + * @param head >>> + * The head of mbufs to be freed chain >>> + */ >>> + >>> +static inline void __attribute__((always_inline)) >>> +rte_pktmbuf_free_bulk(struct rte_mbuf *head) About the inlining, I have no objection now, although Stephen may be right. I think we can consider un-inline some functions, based on performance measurements. >>> +{ >>> + void *mbufs[MAX_MBUF_FREE_SIZE]; >>> + unsigned mbufs_count = 0; >>> + struct rte_mbuf *next; >>> + >>> + RTE_MBUF_MEMPOOL_CHECK1(head); >>> + >>> + while(head) { >>> + next = head->next; >>> + head->next = NULL; >>> + if(__rte_pktmbuf_prefree_seg(head)) { >>> + RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(head) == 0); >>> + RTE_MBUF_MEMPOOL_CHECK2(head); >>> + mbufs[mbufs_count++] = head; >>> + } >>> + head = next; >>> + if(mbufs_count == MAX_MBUF_FREE_SIZE) { >>> + rte_mempool_put_bulk(((struct rte_mbuf >> *)mbufs[0])->pool,mbufs,mbufs_count); >>> + mbufs_count = 0; >>> + } >>> + } >>> + if(mbufs_count > 0) { >>> + rte_mempool_put_bulk(((struct rte_mbuf >> *)mbufs[0])->pool,mbufs,mbufs_count); >>> + } >>> +} >>> + Also, I think a unit test should be added. Thanks, Olivier