DPDK patches and discussions
From: "Harris, James R" <james.r.harris@intel.com>
To: "Howell, Seth" <seth.howell@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Aligned rte_mempool for storage applications
Date: Mon, 25 Mar 2019 21:13:44 +0000
Message-ID: <FD6F1EFA-9ABB-4FFD-89D3-861F522A1769@intel.com>
In-Reply-To: <EA913ED399BBA34AA4EAC2EDC24CDD00AAA70A5B@FMSMSX105.amr.corp.intel.com>



On 3/25/19, 2:06 PM, "Howell, Seth" <seth.howell@intel.com> wrote:

    Hello,
    
    In SPDK, we use the rte_mempool struct for many collections of internal structures. The per-thread cache and the ease of allocating mempools are very useful features.
    Some of the collections we store in SPDK are pools of I/O buffers. Typically, these pools contain elements of at least 4096 bytes, and we would like them to be aligned to 4k for performance reasons.

[Jim] Just to clarify Seth's point: the performance reasons are specifically to avoid wasteful memcopies.  The vast majority of NVMe SSDs on the market today do not have full scatter/gather support; rather, they only support something called PRP (Physical Region Pages), which requires every scatter/gather element except the first to be 4KB-aligned.  Other storage interfaces, such as Linux AIO, also impose alignment restrictions.
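
A rough sketch of that constraint, for illustration only (sgl_is_prp_compatible is a hypothetical helper, not an SPDK or DPDK API):

#include <stdint.h>
#include <sys/uio.h>

/* PRP rule: every element after the first must start on a 4 KiB
 * boundary, and every element except the last must also end on one
 * (so middle elements are whole 4 KiB pages). */
static int sgl_is_prp_compatible(const struct iovec *iov, int n)
{
	for (int i = 0; i < n; i++) {
		uintptr_t start = (uintptr_t)iov[i].iov_base;
		uintptr_t end = start + iov[i].iov_len;

		if (i != 0 && (start & 4095) != 0)
			return 0;	/* later element not 4 KiB-aligned */
		if (i != n - 1 && (end & 4095) != 0)
			return 0;	/* element ends mid-page */
	}
	return 1;
}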

-Jim


    Currently, the rte_mempool API doesn't support aligned mempool objects. This means that when we allocate a 4k buffer and want it aligned to 4k, we actually need to allocate an 8k buffer and calculate an offset into it each time we want to use it.
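    As a sketch of that workaround (get_aligned_buf is an illustrative helper, not an existing API, and it assumes the pool was created with 8k elements):
    
    #include <stdint.h>
    
    #include <rte_mempool.h>
    
    #define IO_BUF_ALIGN 4096
    
    /* Dequeue an oversized element and round its address up to the next
     * 4 KiB boundary. The raw pointer must be kept so the element can be
     * returned to the pool later with rte_mempool_put(). */
    static void *get_aligned_buf(struct rte_mempool *mp, void **raw)
    {
    	uintptr_t addr;
    
    	if (rte_mempool_get(mp, raw) != 0)
    		return NULL;
    	addr = (uintptr_t)*raw;
    	return (void *)((addr + IO_BUF_ALIGN - 1) &
    			~((uintptr_t)IO_BUF_ALIGN - 1));
    }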
    We recently did a proof of concept using the rte_mempool_ops hook where we allocated a mempool and populated it with aligned entries. This allowed us to retrieve aligned addresses directly from rte_mempool_get(), but didn't help with the allocation size.
    Because the rte_mempool struct assumes that each element has a header attached to it, we still need to live up to that assumption for each object we create in a mempool. This means that the actual size of a buffer becomes 4k + 24 bytes. In order to get to our next aligned address, we need to add about 4k of padding to each element.
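    To put numbers on that, the per-element layout works out to roughly:
    
    	[header 24 B][data 4096 B][padding ~4072 B]  ->  8192 B per element
    
    since the next element's data can only start at the following 4 KiB boundary. In other words, roughly half of every element is padding.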
    Modifying the current rte_mempool struct to allow entries without headers seems impossible, since it would break rte_mempool_obj_iter and rte_mempool_from_obj. However, I still think there is a lot of benefit to be gained from a mempool structure that supports aligned objects without headers.
    I am wondering if DPDK would be open to us introducing an rte_mempool_aligned structure. This structure would essentially be a wrapper around a regular mempool struct. However, it would not require headers or trailers for each object in the pool.
    
    This structure would only be applicable to a subset of mempools with the following characteristics:
    	1. mempools for which the following flags are set: MEMPOOL_F_NO_CACHE_ALIGN, MEMPOOL_F_NO_IOVA_CONTIG, MEMPOOL_F_NO_SPREAD (see the sketch after this list).
    	2. mempools that do not require the use of the following functions: rte_mempool_from_obj (which requires a pointer to the mp in each object's header) and rte_mempool_obj_iter.
    	3. Any attempt to create this structure when RTE_LIBRTE_MEMPOOL_DEBUG is enabled would necessarily fail, since we can't check the header cookies.
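    
    For item 1, creating such a pool against the current API might look like this (a sketch; create_candidate_pool is illustrative, and no elements are populated yet):
    
    #include <rte_mempool.h>
    
    /* An empty pool created with the three flags from item 1. */
    static struct rte_mempool *create_candidate_pool(void)
    {
    	return rte_mempool_create_empty("io_bufs",
    			1024,		/* element count */
    			4096,		/* element size */
    			32,		/* per-lcore cache size */
    			0,		/* private data size */
    			SOCKET_ID_ANY,
    			MEMPOOL_F_NO_CACHE_ALIGN |
    			MEMPOOL_F_NO_IOVA_CONTIG |
    			MEMPOOL_F_NO_SPREAD);
    }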
    
    My thought is that we could implement this data structure in a header file, and it would look something like this:
    
    struct rte_mempool_aligned {
    	struct rte_mempool mp;
    	size_t obj_alignment;
    };
    
    The rest of the functions in the header would primarily be wrappers around the original functions. Most (alloc, free, get/put, enqueue/dequeue, get_count, etc.) could be implemented directly as wrappers, while others, such as rte_mempool_create and the populate functions, would have to be re-implemented to some degree in the new header. The remaining functions (check_cookies, obj_iter) would not be implemented in the rte_mempool_aligned.h file.
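    
    For example, the get/put wrappers might look something like this (names illustrative, not a final API):
    
    #include <rte_mempool.h>
    
    /* Thin pass-throughs to the embedded rte_mempool, assuming the
     * struct rte_mempool_aligned defined above. */
    static inline int
    rte_mempool_aligned_get(struct rte_mempool_aligned *amp, void **obj_p)
    {
    	return rte_mempool_get(&amp->mp, obj_p);
    }
    
    static inline void
    rte_mempool_aligned_put(struct rte_mempool_aligned *amp, void *obj)
    {
    	rte_mempool_put(&amp->mp, obj);
    }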
    
    Would the community be welcoming of a new rte_mempool_aligned struct? If you don't feel this is the right way to go, are there other options in DPDK for creating a pool of pre-allocated aligned objects?
    
    Thank you,
    
    Seth Howell
    
    
    


Thread overview:
2019-03-25 21:06 Howell, Seth
2019-03-25 21:13 ` Harris, James R [this message]
2019-03-26  2:52   ` Varghese, Vipin
2019-03-26 18:34     ` Howell, Seth
2019-03-26 18:59       ` Harris, James R
2019-03-27  2:33         ` Varghese, Vipin
2019-03-27  8:28         ` Varghese, Vipin
