* rte_pktmbuf_alloc() out of rte_mbufs
[not found] <67781150.1429748.1732243135675.ref@mail.yahoo.com>
@ 2024-11-22 2:38 ` amit sehas
2024-11-22 16:45 ` Stephen Hemminger
0 siblings, 1 reply; 6+ messages in thread
From: amit sehas @ 2024-11-22 2:38 UTC (permalink / raw)
To: users
I am frequently running out of mbufs when allocating packets. When this happens, is there a way to dump counts of where the buffers are being held, so we know what is going on?
I know that each rte_mbuf pool also has a per-core cache to speed up alloc/free, and some of the buffers will end up there; if a particular core never uses a particular mempool again, perhaps the mbufs parked in its cache are effectively lost ... that is my rough guess ...
How do you debug an out-of-mbufs issue?
regards
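For background, a minimal sketch of where that per-core cache is set, at pool creation time via rte_pktmbuf_pool_create(); the pool name and sizes below are illustrative only. Every lcore that allocates or frees from the pool may park up to cache_size mbufs in its local cache, so those need to be budgeted into the pool size:

#include <stdio.h>
#include <rte_errno.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

/* Illustrative sizes only. The cache_size argument is the per-core
 * cache referred to above: every lcore that allocates or frees from
 * this pool may hold up to CACHE_SIZE mbufs in its local cache. */
#define NB_MBUF    4095
#define CACHE_SIZE 256

static struct rte_mempool *
create_pktmbuf_pool(int socket_id)
{
    struct rte_mempool *mp;

    mp = rte_pktmbuf_pool_create("mbuf_pool3", NB_MBUF, CACHE_SIZE,
                                 0 /* priv size */,
                                 RTE_MBUF_DEFAULT_BUF_SIZE, socket_id);
    if (mp == NULL)
        fprintf(stderr, "mbuf pool creation failed: %s\n",
                rte_strerror(rte_errno));
    return mp;
}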
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: rte_pktmbuf_alloc() out of rte_mbufs
2024-11-22 2:38 ` rte_pktmbuf_alloc() out of rte_mbufs amit sehas
@ 2024-11-22 16:45 ` Stephen Hemminger
2024-11-26 17:57 ` amit sehas
0 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2024-11-22 16:45 UTC (permalink / raw)
To: amit sehas; +Cc: users
On Fri, 22 Nov 2024 02:38:55 +0000 (UTC)
amit sehas <cun23@yahoo.com> wrote:
> I am frequently running out of mbufs when allocating packets. When this happens, is there a way to dump counts of where the buffers are being held, so we know what is going on?
>
> I know that each rte_mbuf pool also has a per-core cache to speed up alloc/free, and some of the buffers will end up there; if a particular core never uses a particular mempool again, perhaps the mbufs parked in its cache are effectively lost ... that is my rough guess ...
>
> How do you debug an out-of-mbufs issue?
>
> regards
The function rte_mempool_dump() will tell you some information about the status of a particular mempool.
If you enable mempool statistics you can get more info.
The best way to size a memory pool is to account for all the possible places mbufs can be waiting.
Something like:
Num Ports * Num RxQ * Num RxD + Num Ports * Num TxQ * Num TxD + Num Lcores * Burst Size + Num Lcores * Cache Size
Often running out of mbufs is caused by failing to free a received mbuf, or by a buggy driver.
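As a rough sketch of that accounting (all values in the example are placeholders, not taken from this thread):

#include <stdint.h>

/* Worst-case accounting of where mbufs can sit at any one time:
 * RX descriptor rings, TX descriptor rings, bursts currently being
 * processed by lcores, and the per-lcore mempool caches. */
static inline unsigned int
mbuf_pool_size(unsigned int nb_ports, unsigned int nb_rxq, unsigned int nb_rxd,
               unsigned int nb_txq, unsigned int nb_txd,
               unsigned int nb_lcores, unsigned int burst_size,
               unsigned int cache_size)
{
    return nb_ports * nb_rxq * nb_rxd +   /* mbufs parked in RX rings        */
           nb_ports * nb_txq * nb_txd +   /* mbufs awaiting TX completion    */
           nb_lcores * burst_size +       /* bursts in flight on each lcore  */
           nb_lcores * cache_size;        /* per-lcore mempool caches        */
}

/* Example: 2 ports, 4 RX/TX queues of 1024 descriptors each, 8 lcores,
 * bursts of 32, cache of 256:
 * 2*4*1024 + 2*4*1024 + 8*32 + 8*256 = 18688 mbufs minimum. */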
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: rte_pktmbuf_alloc() out of rte_mbufs
2024-11-22 16:45 ` Stephen Hemminger
@ 2024-11-26 17:57 ` amit sehas
2024-11-26 23:50 ` amit sehas
0 siblings, 1 reply; 6+ messages in thread
From: amit sehas @ 2024-11-26 17:57 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
rte_mempool_dump() with debugging enabled reports the following data. Below, I see that put_bulk is 40671864 and get_success_bulk is 40675959; the difference between them is 4095, which is exactly the number of buffers in the pool. I will try to dig into the meaning of put_bulk and get_success_bulk to determine whether some kind of buffer leak is occurring ... a round of code review did not turn up an obvious issue.
mempool <mbuf_pool3>@0x16c4e2b00
flags=10
socket_id=-1
pool=0x16c4da840
iova=0x3ac4e2b00
nb_mem_chunks=1
size=4095
populated_size=4095
header_size=64
elt_size=2176
trailer_size=128
total_obj_size=2368
private_data_size=64
ops_index=0
ops_name: <ring_mp_mc>
avg bytes/object=2368.578266
stats:
put_bulk=40671864
put_objs=40671864
put_common_pool_bulk=4095
put_common_pool_objs=4095
get_common_pool_bulk=455
get_common_pool_objs=4095
get_success_bulk=40675959
get_success_objs=40675959
get_fail_bulk=1
get_fail_objs=1
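For reference, a small sketch (using only the generic mempool counters, nothing specific to this application; the helper name is mine) of how the in-flight count can be sampled directly instead of diffing the put/get statistics:

#include <stdio.h>
#include <rte_mempool.h>

/* Sample how many mbufs are currently checked out of the pool.
 * rte_mempool_avail_count() includes objects sitting in the per-core
 * caches, so in_use only counts mbufs genuinely held by the
 * application, the descriptor rings, or the driver. */
static void
report_pool_usage(struct rte_mempool *mp)
{
    unsigned int avail  = rte_mempool_avail_count(mp);
    unsigned int in_use = rte_mempool_in_use_count(mp);

    printf("%s: size=%u avail=%u in_use=%u\n",
           mp->name, mp->size, avail, in_use);

    /* Full dump; the per-operation counters shown above only appear
     * when the mempool statistics build option is enabled. */
    rte_mempool_dump(stdout, mp);
}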
On Friday, November 22, 2024 at 08:46:00 AM PST, Stephen Hemminger <stephen@networkplumber.org> wrote:
On Fri, 22 Nov 2024 02:38:55 +0000 (UTC)
amit sehas <cun23@yahoo.com> wrote:
> I am frequently running out of mbufs when allocating packets. When this happens, is there a way to dump counts of where the buffers are being held, so we know what is going on?
>
> I know that each rte_mbuf pool also has a per-core cache to speed up alloc/free, and some of the buffers will end up there; if a particular core never uses a particular mempool again, perhaps the mbufs parked in its cache are effectively lost ... that is my rough guess ...
>
> How do you debug an out-of-mbufs issue?
>
> regards
The function rte_mempool_dump() will tell you some information about the status of a particular mempool.
If you enable mempool statistics you can get more info.
The best way to size a memory pool is to account for all the possible places mbufs can be waiting.
Something like:
Num Ports * Num RxQ * Num RxD + Num Ports * Num TxQ * Num TxD + Num Lcores * Burst Size + Num Lcores * Cache Size
Often running out of mbufs is caused by failing to free a received mbuf, or by a buggy driver.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: rte_pktmbuf_alloc() out of rte_mbufs
2024-11-26 17:57 ` amit sehas
@ 2024-11-26 23:50 ` amit sehas
2024-11-27 0:51 ` Stephen Hemminger
0 siblings, 1 reply; 6+ messages in thread
From: amit sehas @ 2024-11-26 23:50 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
Dumping the stats every 10 minutes suggests that there is no slow leak of buffers. The problem arises when the system is under stress and starts performing extra disk I/O. In that situation DPDK holds on to the buffers and does not return them to the mempool right away, eventually tying up all 4k buffers allocated to the queue.
rte_eth_tx_buffer_flush() should be flushing the buffers and returning them to the mempool ... is there any additional API that can make sure this happens?
regards
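For context, the transmit-side pattern relied on here is roughly the following sketch (the burst size, helper names, and port/queue handling are illustrative, not the actual application code):

#include <rte_ethdev.h>
#include <rte_malloc.h>
#include <rte_mbuf.h>

#define TX_BURST 32

/* One-time setup: allocate and initialize a TX buffer that aggregates
 * up to TX_BURST packets before they are pushed to the NIC. */
static struct rte_eth_dev_tx_buffer *
tx_buffer_create(int socket_id)
{
    struct rte_eth_dev_tx_buffer *txb;

    txb = rte_zmalloc_socket("tx_buffer",
                             RTE_ETH_TX_BUFFER_SIZE(TX_BURST), 0, socket_id);
    if (txb != NULL)
        rte_eth_tx_buffer_init(txb, TX_BURST);
    return txb;
}

/* Per-burst path: queue packets, then flush the remainder. Packets
 * that cannot be sent are handed to the buffer's error callback
 * (the default callback frees them back to their mempool). */
static void
tx_path(uint16_t port_id, uint16_t queue_id,
        struct rte_eth_dev_tx_buffer *txb,
        struct rte_mbuf *pkts[], uint16_t nb_pkts)
{
    uint16_t i;

    for (i = 0; i < nb_pkts; i++)
        rte_eth_tx_buffer(port_id, queue_id, txb, pkts[i]);

    rte_eth_tx_buffer_flush(port_id, queue_id, txb);
}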
On Tuesday, November 26, 2024 at 09:57:59 AM PST, amit sehas <cun23@yahoo.com> wrote:
rte_mempool_dump() with debugging enabled reports the following data. Below, I see that put_bulk is 40671864 and get_success_bulk is 40675959; the difference between them is 4095, which is exactly the number of buffers in the pool. I will try to dig into the meaning of put_bulk and get_success_bulk to determine whether some kind of buffer leak is occurring ... a round of code review did not turn up an obvious issue.
mempool <mbuf_pool3>@0x16c4e2b00
flags=10
socket_id=-1
pool=0x16c4da840
iova=0x3ac4e2b00
nb_mem_chunks=1
size=4095
populated_size=4095
header_size=64
elt_size=2176
trailer_size=128
total_obj_size=2368
private_data_size=64
ops_index=0
ops_name: <ring_mp_mc>
avg bytes/object=2368.578266
stats:
put_bulk=40671864
put_objs=40671864
put_common_pool_bulk=4095
put_common_pool_objs=4095
get_common_pool_bulk=455
get_common_pool_objs=4095
get_success_bulk=40675959
get_success_objs=40675959
get_fail_bulk=1
get_fail_objs=1
On Friday, November 22, 2024 at 08:46:00 AM PST, Stephen Hemminger <stephen@networkplumber.org> wrote:
On Fri, 22 Nov 2024 02:38:55 +0000 (UTC)
amit sehas <cun23@yahoo.com> wrote:
> I am frequently running out of mbufs when allocating packets. When this happens, is there a way to dump counts of where the buffers are being held, so we know what is going on?
>
> I know that each rte_mbuf pool also has a per-core cache to speed up alloc/free, and some of the buffers will end up there; if a particular core never uses a particular mempool again, perhaps the mbufs parked in its cache are effectively lost ... that is my rough guess ...
>
> How do you debug an out-of-mbufs issue?
>
> regards
The function rte_mempool_dump() will tell you some information about the status of a particular mempool.
If you enable mempool statistics you can get more info.
The best way to size a memory pool is to account for all the possible places mbufs can be waiting.
Something like:
Num Ports * Num RxQ * Num RxD + Num Ports * Num TxQ * Num TxD + Num Lcores * Burst Size + Num Lcores * Cache Size
Often running out of mbufs is caused by failing to free a received mbuf, or by a buggy driver.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: rte_pktmbuf_alloc() out of rte_mbufs
2024-11-26 23:50 ` amit sehas
@ 2024-11-27 0:51 ` Stephen Hemminger
2024-11-27 1:21 ` amit sehas
0 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2024-11-27 0:51 UTC (permalink / raw)
To: amit sehas; +Cc: users
On Tue, 26 Nov 2024 23:50:25 +0000 (UTC)
amit sehas <cun23@yahoo.com> wrote:
> Dumping the stats every 10 minutes suggests that there is no slow leak of buffers. The problem arises when the system is under stress and starts performing extra disk I/O. In that situation DPDK holds on to the buffers and does not return them to the mempool right away, eventually tying up all 4k buffers allocated to the queue.
>
> rte_eth_tx_buffer_flush() should be flushing the buffers and returning them to the mempool ... is there any additional API that can make sure this happens?
If you read the code in rte_ethdev.h, rte_eth_tx_buffer_flush() just sends the packets that the application has aggregated via rte_eth_tx_buffer().
It does nothing vis-a-vis mempools, nor does it cause the driver (PMD) to complete transmits.
There are some tunables, such as tx_free_thresh, which control when the driver should start freeing sent mbufs.
Have you isolated the CPUs used by the DPDK threads?
Is the application stalling because it starts swapping? You may have to call mlockall() to keep the application's pages from being swapped out.
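To make those two suggestions concrete, a rough sketch (the threshold value is illustrative and PMD-dependent; check the limits the driver reports in rte_eth_dev_info):

#include <sys/mman.h>
#include <rte_ethdev.h>

/* Lock the process's pages so the data path cannot stall on swapping. */
static int
lock_memory(void)
{
    return mlockall(MCL_CURRENT | MCL_FUTURE);
}

/* Configure a TX queue with an explicit tx_free_thresh so the PMD
 * starts reclaiming (freeing) transmitted mbufs once that many
 * descriptors are in use, rather than waiting for its default. */
static int
setup_tx_queue(uint16_t port_id, uint16_t queue_id, uint16_t nb_txd,
               unsigned int socket_id)
{
    struct rte_eth_dev_info dev_info;
    struct rte_eth_txconf txconf;
    int ret;

    ret = rte_eth_dev_info_get(port_id, &dev_info);
    if (ret != 0)
        return ret;

    txconf = dev_info.default_txconf;   /* start from the PMD defaults */
    txconf.tx_free_thresh = 64;         /* illustrative value */

    return rte_eth_tx_queue_setup(port_id, queue_id, nb_txd,
                                  socket_id, &txconf);
}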
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: rte_pktmbuf_alloc() out of rte_mbufs
2024-11-27 0:51 ` Stephen Hemminger
@ 2024-11-27 1:21 ` amit sehas
0 siblings, 0 replies; 6+ messages in thread
From: amit sehas @ 2024-11-27 1:21 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: users
Looking at the code for rte_eth_tx_buffer_flush(): in the error case, where some buffers were not sent, it has a default callback that frees those buffers back to the mempool; and in the non-error case as well, once the call is done, the buffers end up back in the mempool ...
This is what we have relied on forever, otherwise we would not be able to use it at all ... on the transmit path we never explicitly free the buffers ... and we are able to run the product through hundreds of millions of packet transmits; it is only under special circumstances that we run into this issue ...
I hope I am not misunderstanding something.
regards
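If it helps, my understanding of the callback side is roughly the sketch below; the counting callback is the stock one from rte_ethdev.h, and the drop counter is just an illustrative variable:

#include <rte_ethdev.h>

/* Count of packets the TX buffer could not send; illustrative only. */
static uint64_t tx_dropped;

/* By default a TX buffer uses rte_eth_tx_buffer_drop_callback(), which
 * frees unsent mbufs back to their mempool. The stock counting variant
 * does the same but also bumps a user-supplied counter, which makes
 * "flush failed to send N packets" visible. */
static void
register_tx_drop_counter(struct rte_eth_dev_tx_buffer *txb)
{
    rte_eth_tx_buffer_set_err_callback(txb,
                                       rte_eth_tx_buffer_count_callback,
                                       &tx_dropped);
}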
On Tuesday, November 26, 2024 at 04:51:09 PM PST, Stephen Hemminger <stephen@networkplumber.org> wrote:
On Tue, 26 Nov 2024 23:50:25 +0000 (UTC)
amit sehas <cun23@yahoo.com> wrote:
> Dumping the stats every 10 minutes suggests that there is no slow leak of buffers. The problem arises when the system is under stress and starts performing extra disk I/O. In that situation DPDK holds on to the buffers and does not return them to the mempool right away, eventually tying up all 4k buffers allocated to the queue.
>
> rte_eth_tx_buffer_flush() should be flushing the buffers and returning them to the mempool ... is there any additional API that can make sure this happens?
If you read the code in rte_ethdev.h, rte_eth_tx_buffer_flush() just sends the packets that the application has aggregated via rte_eth_tx_buffer().
It does nothing vis-a-vis mempools, nor does it cause the driver (PMD) to complete transmits.
There are some tunables, such as tx_free_thresh, which control when the driver should start freeing sent mbufs.
Have you isolated the CPUs used by the DPDK threads?
Is the application stalling because it starts swapping? You may have to call mlockall() to keep the application's pages from being swapped out.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-11-27 1:21 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <67781150.1429748.1732243135675.ref@mail.yahoo.com>
2024-11-22 2:38 ` rte_pktmbuf_alloc() out of rte_mbufs amit sehas
2024-11-22 16:45 ` Stephen Hemminger
2024-11-26 17:57 ` amit sehas
2024-11-26 23:50 ` amit sehas
2024-11-27 0:51 ` Stephen Hemminger
2024-11-27 1:21 ` amit sehas