DPDK usage discussions
* [dpdk-users] ethdev: issues with tx_free_thresh + ixgbe
@ 2020-06-26 13:54 Julien Meunier
From: Julien Meunier @ 2020-06-26 13:54 UTC
  To: users

Hello DPDK community,

I have a DPDK application running on a system with a tight memory
constraint: each application has a memory budget to respect.
The number of mbufs in the mempool is therefore sized to fit
this budget.

However, I saw strange behavior in the way the PMDs free
the mbufs that have been sent on the network.

To simplify my explanation, I built a small DPDK application which only
sends packets (testpmd could probably be used as well).
I ran it on top of one VF (fm10kvf or ixgbevf). The VF has 1 RxQ with
1024 RxD (not used) and 1 TxQ with 128 TxD.
Also, let's assume that 1 mbuf = 1 descriptor.

===========================
First point: tx_free_thresh
===========================

According to the DPDK Programmer's Guide, section 11.4.5,
``Configuration of Transmit Queues``::

   The minimum transmit packets to free threshold (tx_free_thresh).
     When the number of descriptors used to transmit packets exceeds
     this threshold, the network adaptor should be checked to see if it
     has written back descriptors.
     The default value for tx_free_thresh is 32.
     This ensures that the PMD does not search for completed descriptors
     **until at least 32 have been processed by the NIC for this queue.**

However, in the DPDK headers, the ``rte_eth_txconf`` struct says::

   uint16_t tx_free_thresh; /**< Start freeing TX buffers if there are
                                 less free descriptors than this value. */

And in the docstring of ``rte_eth_tx_queue_setup``::

   - The *tx_free_thresh* value indicates the [minimum] number of network
     buffers that must be pending in the transmit ring to trigger their
     [implicit] freeing by the driver transmit function.

After a code review and tests on target (fm10kvf), my understanding is:
* tx_free_thresh is set to 32
- if I send 32 packets, all mbufs are locked in the TxQ.
- if I send 33 packets, all mbufs are locked in the TxQ.
- if I send 96 packets, all mbufs are locked in the TxQ.
- if I send 97 packets, the PMD tries to clean the TxQ.

* tx_free_thresh is set to 128-32=96
- if I send 32 packets, all mbufs are locked in the TxQ.
- if I send 33 packets, the PMD tries to clean the TxQ.
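
The observations above are consistent with a rule where the PMD attempts a
cleanup only once the number of free descriptors drops below
``tx_free_thresh``. The following is a minimal C model of that rule, not
driver code. The function name ``first_cleanup_packet`` is hypothetical, and
the model assumes that one descriptor is always kept unused (so a ring of N
descriptors starts with N - 1 free), that the check runs before each packet
is queued, and that 1 mbuf = 1 descriptor.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model (an assumption, not PMD code): returns the 1-based index
 * of the first packet whose transmit call sees fewer free descriptors than
 * tx_free_thresh, i.e. the first call that would attempt a cleanup.
 * Returns 0 if no call would attempt one before the ring fills up.
 */
uint32_t first_cleanup_packet(uint16_t nb_desc, uint16_t tx_free_thresh)
{
    assert(nb_desc > 0 && tx_free_thresh <= nb_desc);

    /* Assumption: one descriptor stays unused, so N - 1 are free at start. */
    uint32_t nb_free = (uint32_t)nb_desc - 1;

    for (uint32_t pkt = 1; pkt <= nb_desc; pkt++) {
        if (nb_free < tx_free_thresh)
            return pkt;      /* a cleanup would be attempted on this call */
        if (nb_free == 0)
            break;           /* ring is full and no cleanup was triggered */
        nb_free--;           /* 1 mbuf = 1 descriptor */
    }
    return 0;
}
```

With the values from the tests, ``first_cleanup_packet(128, 32)`` gives 97
and ``first_cleanup_packet(128, 96)`` gives 33 in this model.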

Is there a mistake in the DPDK Programmer's Guide or in the
docstring, or am I misunderstanding them?
Should ``tx_free_thresh`` instead be described as follows?

   This ensures that the PMD does not search for completed descriptors
   until fewer than 32 descriptors **are still available** for
   this queue.

======================================
Second point: tx_free_thresh and ixgbe
======================================

My application runs on two platforms: one with fm10kvf, one with
ixgbevf.

I ran the following tests:
- fm10kvf / no offload / TxD = 128 / tx_free_thresh = 96
=> after 33 packets, the PMD tries to clean up the mbufs, as expected.

- fm10kvf / offload (TX multiseg ON) / TxD = 128 / tx_free_thresh = 96
=> after 33 packets, the PMD tries to clean up the mbufs, as expected.

- ixgbevf / no offload / TxD = 128 / tx_free_thresh = 96
=> after 33 packets, the PMD tries to clean up the mbufs, as expected.

- ixgbevf / offload (TX multiseg ON) / TxD = 128 / tx_free_thresh = 96
=> after 33 packets, all mbufs are locked in the TxQ.
=> after 97 packets, all mbufs are locked in the TxQ.
=> after 128 packets, only the first mbuf sent is freed.

I analyzed this PMD. The TX function differs depending on whether
offload is enabled (see ixgbe_set_tx_function):
- when no offload is used:
   * The TX function is ixgbe_xmit_pkts_vec.
   * ixgbe_xmit_pkts_vec correctly honors tx_free_thresh and calls
ixgbe_tx_free_bufs in order to free the mbufs.
- when offload is enabled:
   * The TX function is ixgbe_xmit_pkts.
   * ixgbe_xmit_pkts calls ixgbe_xmit_cleanup (instead of freeing), which
seems to only update internal pointers.
   * An mbuf is freed only when ixgbe detects that the descriptor was
previously used: http://git.dpdk.org/dpdk/tree/drivers/net/ixgbe/ixgbe_rxtx.c#n890
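
The difference between the two TX paths can be illustrated with a small
self-contained model. This is only a sketch of the behavior described above,
not ixgbe code: ``mbufs_held`` and its parameters are hypothetical, the NIC
is assumed to complete every descriptor instantly, 1 mbuf = 1 descriptor, and
the threshold-driven path is simplified to release every completed mbuf at
once when the threshold check fires.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model (assumptions, not ixgbe code): number of mbufs still
 * referenced by one TX ring after `sent` packets.
 *  - eager == true : when free descriptors < tx_free_thresh, all completed
 *                    mbufs are released (threshold-driven free).
 *  - eager == false: an mbuf is released only when its ring slot is reused.
 * The NIC is assumed to complete every descriptor immediately.
 */
uint32_t mbufs_held(uint16_t nb_desc, uint16_t tx_free_thresh,
                    uint32_t sent, bool eager)
{
    if (nb_desc == 0)
        return 0;

    uint32_t nb_free = (uint32_t)nb_desc - 1; /* one descriptor kept unused */
    uint32_t held = 0;

    for (uint32_t pkt = 0; pkt < sent; pkt++) {
        if (eager) {
            if (nb_free < tx_free_thresh) {
                held = 0;                     /* release completed mbufs */
                nb_free = (uint32_t)nb_desc - 1;
            }
            if (nb_free == 0)
                break;                        /* ring full: cannot queue */
            nb_free--;
        } else if (held == nb_desc) {
            held--;   /* slot reused: its old mbuf is released first */
        }
        held++;       /* the new mbuf now occupies a slot */
    }
    return held;
}
```

In this model, a 128-descriptor ring with ``tx_free_thresh = 96`` never holds
more than 32 mbufs on the threshold-driven path, while the free-on-reuse path
converges to 128 held mbufs, one per ring slot.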

To sum up:
- when offload is disabled, the TX ring is cleaned up quickly, which
improves mbuf circulation inside the application.
- when offload is enabled, the TX ring is not cleaned up quickly, and in
the end all mbufs are locked in the TxQ.

I had a use case with 6 VFs / 4 TxQ / 1024 descriptors. After a few seconds
of testing, 6 * 4 * 1024 = 24576 mbufs were unavailable because they were
locked in the TxQs.
As our mempool is quite small, we cannot absorb this behavior, which exists
only because the PMD implementations differ.
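
As a sanity check on the figure above, the worst case for a TX path that only
frees on slot reuse is one pinned mbuf per TX descriptor. A minimal sketch,
assuming 1 mbuf = 1 descriptor; the helper names ``worst_case_pinned_mbufs``
and ``pool_can_absorb`` are hypothetical and not part of DPDK:

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper bound on the mbufs pinned in TX rings when every slot keeps its
 * mbuf until it is reused (assumes 1 mbuf = 1 descriptor). */
uint64_t worst_case_pinned_mbufs(uint32_t nb_ports, uint32_t nb_txq,
                                 uint32_t nb_txd)
{
    return (uint64_t)nb_ports * nb_txq * nb_txd;
}

/* True if the pool still has at least `reserve` mbufs left for the rest of
 * the application once all TX rings are full. */
bool pool_can_absorb(uint64_t pool_size, uint64_t pinned, uint64_t reserve)
{
    return pool_size >= pinned && pool_size - pinned >= reserve;
}
```

For the use case above, ``worst_case_pinned_mbufs(6, 4, 1024)`` returns 24576.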

Is this the expected and intended behavior for ixgbe with full TX features
enabled?

Did I miss something in my configuration?

Thanks in advance !
Best regards,

-- 
Julien Meunier
