DPDK patches and discussions
From: "Varghese, Vipin" <vipin.varghese@intel.com>
To: "Harris, James R" <james.r.harris@intel.com>,
	"Howell, Seth" <seth.howell@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Aligned rte_mempool for storage applications
Date: Tue, 26 Mar 2019 02:52:42 +0000
Message-ID: <4C9E0AB70F954A408CC4ADDBF0F8FA7D4D325E91@BGSMSX101.gar.corp.intel.com>
In-Reply-To: <FD6F1EFA-9ABB-4FFD-89D3-861F522A1769@intel.com>

Hi Seth,

If I may, I would like to offer a suggestion and ask a few questions about the mempool alignment details. Please find them inline below.

Snipped
> 
>     In SPDK, we use the rte_mempool struct for many internal structure
> collections. The per-thread cache and ease of allocation of mempools are very
> useful features.
>     Some of the collections we store in SPDK are pools of I/O buffers. Typically,
> these pools contain elements of at least 4096 bytes, and we would like them to be
> aligned to 4k for performance reasons.
Query-1> Is 4096 bytes the total memory required per element, counting only the data portion?

> 
> [Jim] Just to clarify Seth's point - the performance reasons are specifically to avoid
> wasteful memcopies.  The vast majority of NVMe SSDs in the market today do not
> have full scatter/gather support - rather they only support something called PRP
> (Physical Region Pages) which require all scatter gather elements except the first
> to be 4KB aligned.  There are other storage interfaces such as Linux AIO that also
> impose alignment restrictions.
> 
> -Jim
> 
> 
>     Currently, the rte_mempool API doesn't support aligned mempool objects. This
> means that when we allocate a 4k buffer and want it aligned to 4k, we actually
> need to allocate an 8k buffer and calculate an offset into it each time we want to
> use it.
Query-2> Why not allocate contiguous, 4K-aligned memory with rte_malloc?
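
For reference, the workaround described in the quoted paragraph (allocate 8k, then compute an offset to the first 4k boundary on each use) can be sketched in plain C as below. This is only an illustration under stated assumptions: align_up() and aligned_window() are hypothetical helper names, not DPDK or SPDK functions, and the element is treated as any raw 8k region rather than a real mempool object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of align.
 * align must be a power of two (hypothetical helper, not a DPDK API). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Given an element of at least 2 * align bytes, return the first
 * align-aligned address inside it. The align bytes starting at the
 * returned address are guaranteed to lie within the element. */
static void *aligned_window(void *elem, size_t align)
{
    return (void *)align_up((uintptr_t)elem, (uintptr_t)align);
}
```

With a 4k alignment the offset is always below 4096, so a 4k window always fits inside the 8k element, at the cost of roughly half of the allocated memory.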

>     We recently did a proof of concept using the rte_mempool_ops hook where
> we allocated a mempool and populated it with aligned entries. This allowed us to
> retrieve aligned addresses directly from rte_mempool_get(), but didn't help with
> the allocation size.
>     Because the rte_mempool struct assumes that each element has a header
> attached to it, we still need to live up to that assumption for each object we
> create in a mempool. This means that the actual size of a buffer becomes 4k + 24
> bytes. In order to get to our next aligned address, we need to add about 4k of
> padding to each element.
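
To make the arithmetic in the paragraph above concrete: if every object must start on a 4k boundary and is preceded by a 24-byte header, the distance between consecutive objects has to be a multiple of 4k that covers header plus object. The sketch below models only that; elt_stride() and elt_padding() are hypothetical names, and it does not reproduce the actual rte_mempool layout computation.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to a multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Distance in bytes between the starts of two consecutive objects when
 * each object is align-aligned and preceded by hdr_size bytes of header. */
static size_t elt_stride(size_t hdr_size, size_t obj_size, size_t align)
{
    return round_up(hdr_size + obj_size, align);
}

/* Bytes per element that hold neither header nor object data. */
static size_t elt_padding(size_t hdr_size, size_t obj_size, size_t align)
{
    return elt_stride(hdr_size, obj_size, align) - hdr_size - obj_size;
}
```

For hdr_size = 24, obj_size = 4096 and align = 4096 this gives a stride of 8192 bytes with 4072 bytes of padding, which is the "about 4k" mentioned above. With hdr_size = 0 the stride drops to 4096 and the padding to 0.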
>     Modifying the current rte_mempool struct to allow entries without headers
> seems impossible since it would break rte_mempool_for_obj_iter and
> rte_mempool_from_obj. However I still think there is a lot of benefit to be gained
> from a mempool structure that supports aligned objects without headers.
>     I am wondering if DPDK would be open to us introducing an
> rte_mempool_aligned structure. This structure would essentially be a wrapper
> around a regular mempool struct. However, it would not require headers or
> trailers for each object in the pool.
Query-3> With a mempool whose data portion has size 0, we can either create an indirect buffer or use an external-buffer mbuf to attach each mbuf to a 4K-aligned rte_malloc area.

Note: we did something similar for the AF_XDP_ZC_PMD prototype (presented at the BLR summit 2019).

Advantage: no changes to the mempool library, the mbuf library, or rte_malloc, and the application works unchanged.

> 
>     This structure would only be applicable to a subset of mempools with the
> following characteristics:
>     	1. mempools for which the following flags were set:
> MEMPOOL_F_NO_CACHE_ALIGN, MEMPOOL_F_NO_IOVA_CONTIG,
> MEMPOOL_F_NO_SPREAD
>     	2. mempools that do not require the use of the following functions
> rte_mempool_from_obj (requires a pointer to the mp in the header of each obj),
> rte_mempool_for_obj_iter.
>     	3. Any attempt to create this object when
> RTE_LIBRTE_MEMPOOL_DEBUG was enabled would necessarily fail since we
> can't check the header cookies.
> 
>     My thought would be that we could implement this data structure in a header
> and it would look something like this:
> 
>     struct rte_mempool_aligned {
>     	struct rte_mempool mp;
>     	size_t obj_alignment;
>     };
> 
>     The rest of the functions in the header would primarily be wrappers around the
> original functions. Most functions (rte_mempool_alloc, rte_mempool_free,
> rte_mempool_enqueue/dequeue, rte_mempool_get_count, etc.) could be
> implemented directly as wrappers, and others such as rte_mempool_create and
> the populate functions would have to be re-implemented to some degree in the
> new header. The remaining functions (check_cookies, for_obj_iter) would not be
> implemented in the rte_mempool_aligned.h file.
> 
>     Would the community be welcoming of a new rte_mempool_aligned struct? If
> you don't feel like this would be the way to go, are there other options in DPDK
> for creating a pool of pre-allocated aligned objects?
> 
>     Thank you,
> 
>     Seth Howell
> 
> 
> 

