DPDK patches and discussions
 help / color / mirror / Atom feed
From: Gregory Etelson <gregory@weka.io>
To: Olivier Matz <olivier.matz@6wind.com>
Cc: dev <dev@dpdk.org>
Subject: Re: [dpdk-dev] custom align for mempool elements
Date: Fri, 5 May 2017 20:23:15 +0300	[thread overview]
Message-ID: <CAO--2fH6cuomaPkP4rEJfE_1QUaN=sZOB_RQdGG7kB=8UFWRWQ@mail.gmail.com> (raw)
In-Reply-To: <20170505132602.13731734@platinum>

Hello Oliver,

Our application writes data from incoming MBUFs to storage device.
For performance considerations we use O_DIRECT storage access
and work in 'zero copy' data mode.
To achieve the 'zero copy' we MUST arrange data in all MBUFs to be 512
bytes aligned
With pre-calculated custom pool element alignment and right
 RTE_PKTMBUF_HEADROOM value
we can set MBUF data alignment to any value. In our case, 512 bytes

Current implementation sets custom mempool alignment like this:

    struct rte_mempool *mp = rte_mempool_create_empty("MBUF pool",
mbufs_count,
                 elt_size, cache_size, sizeof(struct
rte_pktmbuf_pool_private), rte_socket_id(), 0);

    rte_pktmbuf_pool_init(mp, &mbp_priv);
  *mp->elt_align = align*;
    rte_mempool_populate_default(mp);

Regards,
Gregory

On Fri, May 5, 2017 at 2:26 PM, Olivier Matz <olivier.matz@6wind.com> wrote:

> Hi Gregory,
>
> On Wed, 26 Apr 2017 07:00:49 +0300, Gregory Etelson <gregory@weka.io>
> wrote:
> > Signed-off-by: Gregory Etelson <gregory@weka.io>
> > ---
> >  lib/librte_mempool/rte_mempool.c | 27 ++++++++++++++++++++-------
> >  lib/librte_mempool/rte_mempool.h |  1 +
> >  2 files changed, 21 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_
> mempool.c
> > index f65310f..c780df3 100644
> > --- a/lib/librte_mempool/rte_mempool.c
> > +++ b/lib/librte_mempool/rte_mempool.c
> > @@ -382,7 +382,7 @@ rte_mempool_populate_phys(struct rte_mempool *mp,
> char *vaddr,
> >       if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
> >               off = RTE_PTR_ALIGN_CEIL(vaddr, 8) - vaddr;
> >       else
> > -             off = RTE_PTR_ALIGN_CEIL(vaddr, RTE_CACHE_LINE_SIZE) -
> vaddr;
> > +             off = RTE_PTR_ALIGN_CEIL(vaddr, mp->elt_align) - vaddr;
> >
> >       while (off + total_elt_sz <= len && mp->populated_size < mp->size)
> {
> >               off += mp->header_size;
> > @@ -392,6 +392,7 @@ rte_mempool_populate_phys(struct rte_mempool *mp,
> char *vaddr,
> >               else
> >                       mempool_add_elem(mp, (char *)vaddr + off, paddr +
> off);
> >               off += mp->elt_size + mp->trailer_size;
> > +             off = RTE_ALIGN_CEIL(off, mp->elt_align);
> >               i++;
> >       }
> >
> > @@ -508,6 +509,20 @@ rte_mempool_populate_virt(struct rte_mempool *mp,
> char *addr,
> >       return ret;
> >  }
> >
> > +static uint32_t
> > +mempool_default_elt_aligment(void)
> > +{
> > +     uint32_t align;
> > +     if (rte_xen_dom0_supported()) {
> > +                align = RTE_PGSIZE_2M;;
> > +     } else if (rte_eal_has_hugepages()) {
> > +                align = RTE_CACHE_LINE_SIZE;
> > +     } else {
> > +                align = getpagesize();
> > +     }
> > +     return align;
> > +}
> > +
> >  /* Default function to populate the mempool: allocate memory in
> memzones,
> >   * and populate them. Return the number of objects added, or a negative
> >   * value on error.
> > @@ -518,7 +533,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
> >       int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
> >       char mz_name[RTE_MEMZONE_NAMESIZE];
> >       const struct rte_memzone *mz;
> > -     size_t size, total_elt_sz, align, pg_sz, pg_shift;
> > +     size_t size, total_elt_sz, pg_sz, pg_shift;
> >       phys_addr_t paddr;
> >       unsigned mz_id, n;
> >       int ret;
> > @@ -530,15 +545,12 @@ rte_mempool_populate_default(struct rte_mempool
> *mp)
> >       if (rte_xen_dom0_supported()) {
> >               pg_sz = RTE_PGSIZE_2M;
> >               pg_shift = rte_bsf32(pg_sz);
> > -             align = pg_sz;
> >       } else if (rte_eal_has_hugepages()) {
> >               pg_shift = 0; /* not needed, zone is physically contiguous
> */
> >               pg_sz = 0;
> > -             align = RTE_CACHE_LINE_SIZE;
> >       } else {
> >               pg_sz = getpagesize();
> >               pg_shift = rte_bsf32(pg_sz);
> > -             align = pg_sz;
> >       }
> >
> >       total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
> > @@ -553,11 +565,11 @@ rte_mempool_populate_default(struct rte_mempool
> *mp)
> >               }
> >
> >               mz = rte_memzone_reserve_aligned(mz_name, size,
> > -                     mp->socket_id, mz_flags, align);
> > +                     mp->socket_id, mz_flags, mp->elt_align);
> >               /* not enough memory, retry with the biggest zone we have
> */
> >               if (mz == NULL)
> >                       mz = rte_memzone_reserve_aligned(mz_name, 0,
> > -                             mp->socket_id, mz_flags, align);
> > +                             mp->socket_id, mz_flags, mp->elt_align);
> >               if (mz == NULL) {
> >                       ret = -rte_errno;
> >                       goto fail;
> > @@ -827,6 +839,7 @@ rte_mempool_create_empty(const char *name, unsigned
> n, unsigned elt_size,
> >       /* Size of default caches, zero means disabled. */
> >       mp->cache_size = cache_size;
> >       mp->private_data_size = private_data_size;
> > +     mp->elt_align = mempool_default_elt_aligment();
> >       STAILQ_INIT(&mp->elt_list);
> >       STAILQ_INIT(&mp->mem_list);
> >
> > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_
> mempool.h
> > index 48bc8ea..6631973 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -245,6 +245,7 @@ struct rte_mempool {
> >        * this mempool.
> >        */
> >       int32_t ops_index;
> > +     uint32_t elt_align;
> >
> >       struct rte_mempool_cache *local_cache; /**< Per-lcore local cache
> */
> >
>
> It looks the patch will break the ABI (the mempool structure), so it
> has to follow the abi deprecation process (= a notice in 17.05 and
> the patch for 17.08).
>
> Could you give us some details about why you need such feature, how you
> use it (since no API is added)?
>
> Thanks,
> Olivier
>

  reply	other threads:[~2017-05-05 17:23 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-26  4:00 Gregory Etelson
2017-05-05 11:26 ` Olivier Matz
2017-05-05 17:23   ` Gregory Etelson [this message]
2017-05-10 15:01     ` Olivier Matz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAO--2fH6cuomaPkP4rEJfE_1QUaN=sZOB_RQdGG7kB=8UFWRWQ@mail.gmail.com' \
    --to=gregory@weka.io \
    --cc=dev@dpdk.org \
    --cc=olivier.matz@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).