DPDK patches and discussions
* [dpdk-dev] When are mbufs released back to the mempool?
@ 2013-12-17 18:13 Schumm, Ken
  2013-12-18  9:02 ` Olivier MATZ
  0 siblings, 1 reply; 5+ messages in thread
From: Schumm, Ken @ 2013-12-17 18:13 UTC (permalink / raw)
  To: dev

When running l2fwd the number of available mbufs returned by rte_mempool_count() starts at 7680 on an idle system.

As traffic commences the count declines from 7680 to 5632 (expected).

When traffic stops the count does not climb back to the starting value, indicating that idle mbufs are not returned to the mempool.

For the lcore cache, the documentation states:

"While this may mean a number of buffers may sit idle on some core's cache, the speed
at which a core can access its own cache for a specific memory pool without locks
provides performance gains"

which makes sense.

Is this also true of ring buffers?

We need to understand when packets are released back to the mempool; with l2fwd it appears that they never are, at least not all of them.

Thanks!

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [dpdk-dev] When are mbufs released back to the mempool?
  2013-12-17 18:13 [dpdk-dev] When are mbufs released back to the mempool? Schumm, Ken
@ 2013-12-18  9:02 ` Olivier MATZ
  2013-12-19 19:09   ` Schumm, Ken
  0 siblings, 1 reply; 5+ messages in thread
From: Olivier MATZ @ 2013-12-18  9:02 UTC (permalink / raw)
  To: Schumm, Ken; +Cc: dev

Hello Ken,

On 12/17/2013 07:13 PM, Schumm, Ken wrote:
 > When running l2fwd the number of available mbufs returned by
 > rte_mempool_count() starts at 7680 on an idle system.
 >
 > As traffic commences the count declines from 7680 to
 > 5632 (expected).

You are right, some mbufs are kept in two places:

- in mempool per-core cache: as you noticed, each lcore has
   a cache to avoid a (more) costly access to the common pool.

- also, the mbufs stay in the hardware transmission ring of the
   NIC. Let's say the size of your hw ring is 512, it means that
   when transmitting the 513th mbuf, you will free the first mbuf
   given to your NIC. Therefore, (hw-tx-ring-size * nb-tx-queue)
   mbufs can be stored in tx hw rings.
   Of course, the same applies to rx rings, but it's easier to see
   it as they are filled when initializing the driver.

When choosing the number of mbufs, you need to take a value
greater than (hw-rx-ring-size * nb-rx-queue) + (hw-tx-ring-size *
nb-tx-queue) + (nb-lcores * mbuf-pool-cache-size)

 > Is this also true of ring buffers?

No, if you talk about rte_ring, there is no cache in this
structure.

Regards,
Olivier


* Re: [dpdk-dev] When are mbufs released back to the mempool?
  2013-12-18  9:02 ` Olivier MATZ
@ 2013-12-19 19:09   ` Schumm, Ken
  2013-12-19 19:35     ` Stephen Hemminger
  0 siblings, 1 reply; 5+ messages in thread
From: Schumm, Ken @ 2013-12-19 19:09 UTC (permalink / raw)
  To: Olivier MATZ; +Cc: dev

Hello Olivier,

Do you know what the reason is for the tx rings filling up and holding on to mbufs?

It seems they could be freed when the DMA xfer is acknowledged instead of waiting until the ring is full.

Thanks!
Ken Schumm



* Re: [dpdk-dev] When are mbufs released back to the mempool?
  2013-12-19 19:09   ` Schumm, Ken
@ 2013-12-19 19:35     ` Stephen Hemminger
  2013-12-19 23:56       ` Schumm, Ken
  0 siblings, 1 reply; 5+ messages in thread
From: Stephen Hemminger @ 2013-12-19 19:35 UTC (permalink / raw)
  To: Schumm, Ken; +Cc: dev

On Thu, 19 Dec 2013 19:09:48 +0000
"Schumm, Ken" <ken.schumm@intel.com> wrote:

> Hello Olivier,
> 
> Do you know what the reason is for the tx rings filling up and holding on to mbufs?

It is an optimization to defer freeing.
Note that there are no interrupts with DPDK, so transmit completion cannot
be detected until the next transmit.

> 
> It seems they could be freed when the DMA xfer is acknowledged instead of waiting until the ring was full.

You should also look at the tx_free_thresh value in the rte_eth_txconf structure.
Several drivers use it to control when to free as in:

ixgbe_rxtx.c:
 
static inline uint16_t
tx_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
	     uint16_t nb_pkts)
{
	struct igb_tx_queue *txq = (struct igb_tx_queue *)tx_queue;
	volatile union ixgbe_adv_tx_desc *tx_r = txq->tx_ring;
	uint16_t n = 0;

	/*
	 * Begin scanning the H/W ring for done descriptors when the
	 * number of available descriptors drops below tx_free_thresh.  For
	 * each done descriptor, free the associated buffer.
	 */
	if (txq->nb_tx_free < txq->tx_free_thresh)
		ixgbe_tx_free_bufs(txq);
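
In other words, the driver only reclaims TX mbufs once the number of free descriptors drops below tx_free_thresh, so a higher threshold makes it scan for completed descriptors (and free mbufs) sooner. The threshold is set per queue through rte_eth_txconf at queue setup time; the port id, queue id, and ring size below are placeholder values for illustration:

```c
/* Sketch: raise tx_free_thresh so completed TX mbufs are reclaimed
 * earlier. Port/queue/ring-size values are placeholders. */
struct rte_eth_txconf txconf;

memset(&txconf, 0, sizeof(txconf));
/* With a 512-descriptor ring, a threshold of 256 means the driver starts
 * freeing done mbufs once fewer than 256 descriptors remain free, rather
 * than waiting until the ring is nearly exhausted. */
txconf.tx_free_thresh = 256;

rte_eth_tx_queue_setup(0 /* port */, 0 /* queue */, 512 /* nb_tx_desc */,
		       rte_eth_dev_socket_id(0), &txconf);
```

Note that some drivers treat tx_free_thresh slightly differently (e.g. it can also select between TX code paths), so check the driver you are using.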


* Re: [dpdk-dev] When are mbufs released back to the mempool?
  2013-12-19 19:35     ` Stephen Hemminger
@ 2013-12-19 23:56       ` Schumm, Ken
  0 siblings, 0 replies; 5+ messages in thread
From: Schumm, Ken @ 2013-12-19 23:56 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

Thanks very much, that clears it all up.


end of thread, other threads:[~2013-12-19 23:55 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-17 18:13 [dpdk-dev] When are mbufs released back to the mempool? Schumm, Ken
2013-12-18  9:02 ` Olivier MATZ
2013-12-19 19:09   ` Schumm, Ken
2013-12-19 19:35     ` Stephen Hemminger
2013-12-19 23:56       ` Schumm, Ken
