From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 10 May 2017 17:01:43 +0200
From: Olivier Matz
To: Gregory Etelson
Cc: dev
Message-ID: <20170510170143.023e3e16@platinum>
References: <4486793.msUSBAJSWm@polaris> <20170505132602.13731734@platinum>
Subject: Re: [dpdk-dev] custom align for mempool elements
List-Id: DPDK patches and discussions
X-List-Received-Date: Wed, 10 May 2017 15:01:46 -0000

Hi Gregory,

On Fri, 5 May 2017 20:23:15 +0300, Gregory Etelson wrote:
> Hello Olivier,
>
> Our application writes data from incoming MBUFs to a storage device.
> For performance reasons we use O_DIRECT storage access and work in
> 'zero copy' data mode. To achieve zero copy we MUST arrange the data
> in all MBUFs to be 512-byte aligned.
> With a pre-calculated custom pool element alignment and the right
> RTE_PKTMBUF_HEADROOM value we can set the MBUF data alignment to any
> value; in our case, 512 bytes.
>
> The current implementation sets a custom mempool alignment like this:
>
> struct rte_mempool *mp = rte_mempool_create_empty("MBUF pool",
>         mbufs_count, elt_size, cache_size,
>         sizeof(struct rte_pktmbuf_pool_private), rte_socket_id(), 0);
>
> rte_pktmbuf_pool_init(mp, &mbp_priv);
> mp->elt_align = align;
> rte_mempool_populate_default(mp);

I think we should try to avoid modifying mp->elt_align directly.
A new API could be added for that, maybe something like:

  int rte_mempool_set_obj_align(struct rte_mempool *mp, size_t align);
  size_t rte_mempool_get_obj_align(struct rte_mempool *mp);

The set() function would have to be called before mempool_populate().

We need to take care about conflicts with the MEMPOOL_F_NO_CACHE_ALIGN
flag. I think we should keep compatibility for users of this flag; it
could be deprecated and removed later. There may also be some conflicts
with MEMPOOL_F_NO_SPREAD.

As I said in my previous mail, if the patch breaks the ABI (the mempool
structure), it has to follow the ABI deprecation process (= a notice in
17.05 and the patch for 17.08). I'm afraid it would be quite late for
the notice, though.

Regards,
Olivier

>
> Regards,
> Gregory
>
> On Fri, May 5, 2017 at 2:26 PM, Olivier Matz wrote:
>
> > Hi Gregory,
> >
> > On Wed, 26 Apr 2017 07:00:49 +0300, Gregory Etelson wrote:
> > > Signed-off-by: Gregory Etelson
> > > ---
> > >  lib/librte_mempool/rte_mempool.c | 27 ++++++++++++++++++++-------
> > >  lib/librte_mempool/rte_mempool.h |  1 +
> > >  2 files changed, 21 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
> > > index f65310f..c780df3 100644
> > > --- a/lib/librte_mempool/rte_mempool.c
> > > +++ b/lib/librte_mempool/rte_mempool.c
> > > @@ -382,7 +382,7 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
> > >  	if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
> > >  		off = RTE_PTR_ALIGN_CEIL(vaddr, 8) - vaddr;
> > >  	else
> > > -		off = RTE_PTR_ALIGN_CEIL(vaddr, RTE_CACHE_LINE_SIZE) - vaddr;
> > > +		off = RTE_PTR_ALIGN_CEIL(vaddr, mp->elt_align) - vaddr;
> > >
> > >  	while (off + total_elt_sz <= len && mp->populated_size < mp->size) {
> > >  		off += mp->header_size;
> > > @@ -392,6 +392,7 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
> > >  		else
> > >  			mempool_add_elem(mp, (char *)vaddr + off, paddr + off);
> > >  		off += mp->elt_size + mp->trailer_size;
> > > +		off = RTE_ALIGN_CEIL(off, mp->elt_align);
> > >  		i++;
> > >  	}
> > >
> > > @@ -508,6 +509,20 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
> > >  	return ret;
> > >  }
> > >
> > > +static uint32_t
> > > +mempool_default_elt_alignment(void)
> > > +{
> > > +	uint32_t align;
> > > +	if (rte_xen_dom0_supported()) {
> > > +		align = RTE_PGSIZE_2M;
> > > +	} else if (rte_eal_has_hugepages()) {
> > > +		align = RTE_CACHE_LINE_SIZE;
> > > +	} else {
> > > +		align = getpagesize();
> > > +	}
> > > +	return align;
> > > +}
> > > +
> > >  /* Default function to populate the mempool: allocate memory in memzones,
> > >   * and populate them. Return the number of objects added, or a negative
> > >   * value on error.
> > > @@ -518,7 +533,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
> > >  	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
> > >  	char mz_name[RTE_MEMZONE_NAMESIZE];
> > >  	const struct rte_memzone *mz;
> > > -	size_t size, total_elt_sz, align, pg_sz, pg_shift;
> > > +	size_t size, total_elt_sz, pg_sz, pg_shift;
> > >  	phys_addr_t paddr;
> > >  	unsigned mz_id, n;
> > >  	int ret;
> > > @@ -530,15 +545,12 @@ rte_mempool_populate_default(struct rte_mempool *mp)
> > >  	if (rte_xen_dom0_supported()) {
> > >  		pg_sz = RTE_PGSIZE_2M;
> > >  		pg_shift = rte_bsf32(pg_sz);
> > > -		align = pg_sz;
> > >  	} else if (rte_eal_has_hugepages()) {
> > >  		pg_shift = 0; /* not needed, zone is physically contiguous */
> > >  		pg_sz = 0;
> > > -		align = RTE_CACHE_LINE_SIZE;
> > >  	} else {
> > >  		pg_sz = getpagesize();
> > >  		pg_shift = rte_bsf32(pg_sz);
> > > -		align = pg_sz;
> > >  	}
> > >
> > >  	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
> > > @@ -553,11 +565,11 @@ rte_mempool_populate_default(struct rte_mempool *mp)
> > >  	}
> > >
> > >  	mz = rte_memzone_reserve_aligned(mz_name, size,
> > > -		mp->socket_id, mz_flags, align);
> > > +		mp->socket_id, mz_flags, mp->elt_align);
> > >  	/* not enough memory, retry with the biggest zone we have */
> > >  	if (mz == NULL)
> > >  		mz = rte_memzone_reserve_aligned(mz_name, 0,
> > > -			mp->socket_id, mz_flags, align);
> > > +			mp->socket_id, mz_flags, mp->elt_align);
> > >  	if (mz == NULL) {
> > >  		ret = -rte_errno;
> > >  		goto fail;
> > > @@ -827,6 +839,7 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
> > >  	/* Size of default caches, zero means disabled. */
> > >  	mp->cache_size = cache_size;
> > >  	mp->private_data_size = private_data_size;
> > > +	mp->elt_align = mempool_default_elt_alignment();
> > >  	STAILQ_INIT(&mp->elt_list);
> > >  	STAILQ_INIT(&mp->mem_list);
> > >
> > > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> > > index 48bc8ea..6631973 100644
> > > --- a/lib/librte_mempool/rte_mempool.h
> > > +++ b/lib/librte_mempool/rte_mempool.h
> > > @@ -245,6 +245,7 @@ struct rte_mempool {
> > >  	 * this mempool.
> > >  	 */
> > >  	int32_t ops_index;
> > > +	uint32_t elt_align;
> > >
> > >  	struct rte_mempool_cache *local_cache; /**< Per-lcore local cache */
> > >
> >
> > It looks like the patch will break the ABI (the mempool structure), so
> > it has to follow the ABI deprecation process (= a notice in 17.05 and
> > the patch for 17.08).
> >
> > Could you give us some details about why you need such a feature, and
> > how you use it (since no API is added)?
> >
> > Thanks,
> > Olivier
> >