DPDK usage discussions
* [dpdk-users] Huge pages to be allocated based on number of mbufs
@ 2016-03-14 17:54 Saurabh Mishra
  2016-03-15  1:47 ` John Boyle
  2016-03-17 17:25 ` [dpdk-users] [dpdk-dev] " Zoltan Kiss
  0 siblings, 2 replies; 3+ messages in thread
From: Saurabh Mishra @ 2016-03-14 17:54 UTC (permalink / raw)
  To: users, dev

Hi,

We are planning to support virtio, vmxnet3, ixgbe, i40e, bnx2x, and SR-IOV
on some of them with DPDK.

We have seen that even if we size the number of mbufs correctly for the
number of hugepages reserved, rte_eth_tx_queue_setup() may still fail with
not enough memory (I saw this on i40evf, but it worked on virtio and
vmxnet3).

We'd like to know the recommended way to determine how many hugepages we
should allocate for a given number of mbufs such that the queue setup APIs
also don't fail.

Since we will also be running on low-end systems, we need to be careful
about how many hugepages we reserve.

Thanks,
/Saurabh


* Re: [dpdk-users] Huge pages to be allocated based on number of mbufs
  2016-03-14 17:54 [dpdk-users] Huge pages to be allocated based on number of mbufs Saurabh Mishra
@ 2016-03-15  1:47 ` John Boyle
  2016-03-17 17:25 ` [dpdk-users] [dpdk-dev] " Zoltan Kiss
  1 sibling, 0 replies; 3+ messages in thread
From: John Boyle @ 2016-03-15  1:47 UTC (permalink / raw)
  To: Saurabh Mishra; +Cc: users, dev

Hi Saurabh,

I don't know all the details of your setup, but I'm guessing that you may
have run into the hugepage fragmentation issue.

Try calling rte_malloc_dump_stats(stdout, "dummy") right before the mempool
creation.  Output might look like this:

Socket:0
Heap_size:2147472192,
Free_size:2047523584,
Alloc_size:99948608,
Greatest_free_size:130023360,
Alloc_count:82,
Free_count:179,

(That would be after a successful allocation of a ~99MB mbuf pool.)
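
For concreteness, a minimal sketch of where that call could sit relative to
the pool creation (the pool name, mbuf count and mempool parameters below
are placeholders, not a recommendation):

#include <stdio.h>
#include <rte_lcore.h>
#include <rte_malloc.h>
#include <rte_mbuf.h>

#define NB_MBUF 65536   /* placeholder mbuf count */

static struct rte_mempool *
create_pool_with_diagnostics(void)
{
        /* Dump heap statistics right before the big allocation so a
         * failure can be correlated with Greatest_free_size above. */
        rte_malloc_dump_stats(stdout, "dummy");

        return rte_pktmbuf_pool_create("mbuf_pool", NB_MBUF,
                                       250 /* per-lcore cache */, 0,
                                       RTE_MBUF_DEFAULT_BUF_SIZE,
                                       rte_socket_id());
}

If the creation fails it returns NULL with rte_errno set (ENOMEM in the
fragmented case), so you can log the stats and the error together.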

The mbuf_pool gets allocated with a single giant call to the internal
malloc_heap_alloc function.  If the "Greatest_free_size" is smaller than
the mbuf_pool you're trying to create, then the allocation will fail.  If
the total free size is smaller than, or not much larger than, what you're
trying to allocate, then you should simply give it more hugepages.

On the other hand, if the total "Free_size" is much larger than what you
need, but the "Greatest_free_size" is considerably smaller (in the above
example, the largest free slab is 130 MB despite nearly 2GB being
available), then you have a considerably fragmented heap.
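
If you'd rather make that comparison in code than by eyeballing the dump,
the same counters are available programmatically.  A rough sketch (the
function name and the per-mbuf overhead constant are made up for the
example; the real footprint also includes mempool and memzone headers, so
treat the estimate as approximate):

#include <stdio.h>
#include <rte_lcore.h>
#include <rte_malloc.h>
#include <rte_mbuf.h>

/* Returns 1 if the largest free chunk on this socket looks big enough for
 * an nb_mbufs pool, 0 if not, -1 on error. */
static int
pool_likely_to_fit(unsigned int nb_mbufs)
{
        struct rte_malloc_socket_stats stats;
        size_t approx_obj = sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM +
                            RTE_MBUF_DEFAULT_DATAROOM + 64; /* assumed overhead */
        size_t needed = (size_t)nb_mbufs * approx_obj;

        if (rte_malloc_get_socket_stats(rte_socket_id(), &stats) != 0)
                return -1;

        printf("greatest free chunk: %zu bytes, pool needs roughly %zu bytes\n",
               stats.greatest_free_size, needed);
        return stats.greatest_free_size >= needed;
}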

How do you get a fragmented heap during the initialization phase of the
program?  The heap is created by mmapping a bunch of hugepages, noticing
which ones happen to have adjacent physical addresses, and then the
contiguous chunks become the separate available slabs in the heap.  If the
system has just been booted, then you are likely to end up with a nice
large slab into which you can fit a huge mbuf_pool.  If the system's been
running for a while, it's more likely to be fragmented, in which case you
may get something like the example I pasted above.

At Pure Storage, we ended up solving this by reserving a single 1GB
hugepage, which can't be fragmented.
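
For reference, reserving such a page is plain kernel configuration rather
than anything DPDK-specific.  One typical setup (assuming the CPU
advertises 1GB page support, e.g. the pdpe1gb flag; the mount path is just
an example) is to put

    default_hugepagesz=1G hugepagesz=1G hugepages=1

on the kernel command line, mount a hugetlbfs instance for DPDK to use:

    mkdir -p /mnt/huge_1GB
    mount -t hugetlbfs -o pagesize=1G nodev /mnt/huge_1GB

and point the EAL at it with --huge-dir /mnt/huge_1GB.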

-- John Boyle
*Science is what we understand well enough to explain to a computer. Art is
everything else we do.* --Knuth

On Mon, Mar 14, 2016 at 10:54 AM, Saurabh Mishra <saurabh.globe@gmail.com>
wrote:

> Hi,
>
> We are planning to support virtio, vmxnet3, ixgbe, i40e, bnx2x, and SR-IOV
> on some of them with DPDK.
>
> We have seen that even if we size the number of mbufs correctly for the
> number of hugepages reserved, rte_eth_tx_queue_setup() may still fail with
> not enough memory (I saw this on i40evf, but it worked on virtio and
> vmxnet3).
>
> We'd like to know the recommended way to determine how many hugepages we
> should allocate for a given number of mbufs such that the queue setup APIs
> also don't fail.
>
> Since we will also be running on low-end systems, we need to be careful
> about how many hugepages we reserve.
>
> Thanks,
> /Saurabh
>


* Re: [dpdk-users] [dpdk-dev] Huge pages to be allocated based on number of mbufs
  2016-03-14 17:54 [dpdk-users] Huge pages to be allocated based on number of mbufs Saurabh Mishra
  2016-03-15  1:47 ` John Boyle
@ 2016-03-17 17:25 ` Zoltan Kiss
  1 sibling, 0 replies; 3+ messages in thread
From: Zoltan Kiss @ 2016-03-17 17:25 UTC (permalink / raw)
  To: Saurabh Mishra, users, dev



On 14/03/16 17:54, Saurabh Mishra wrote:
> Hi,
>
> We are planning to support virtio, vmxnet3, ixgbe, i40e, bnx2x, and SR-IOV
> on some of them with DPDK.
>
> We have seen that even if we size the number of mbufs correctly for the
> number of hugepages reserved, rte_eth_tx_queue_setup() may still fail with
> not enough memory (I saw this on i40evf, but it worked on virtio and
> vmxnet3).
>
> We'd like to know the recommended way to determine how many hugepages we
> should allocate for a given number of mbufs such that the queue setup APIs
> also don't fail.

I think you've run into a fragmentation problem. If you allocate the
hugepages after startup, chances are they are fragmented in physical
memory. When you allocate a pool, DPDK needs a contiguous area of memory
on the hugepages.
You should allocate them through the kernel boot parameters so they'll be
as contiguous as possible.
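
For example (the counts are placeholders; pick whatever your low-end
systems can spare), reserving 512 x 2MB pages at boot means putting

    hugepages=512

on the kernel command line, and you can verify what was actually reserved
with

    grep Huge /proc/meminfo

Pages reserved at boot are much more likely to end up physically adjacent
than pages requested later through
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages.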


>
> Since we will also be running on low-end systems, we need to be careful
> about how many hugepages we reserve.
>
> Thanks,
> /Saurabh
>


