DPDK patches and discussions
From: Ye Xiaolong <xiaolong.ye@intel.com>
To: Ciara Loftus <ciara.loftus@intel.com>
Cc: dev@dpdk.org, magnus.karlsson@intel.com
Subject: Re: [dpdk-dev] [PATCH] net/af_xdp: optimisations to improve packet loss
Date: Wed, 17 Jun 2020 10:34:06 +0800	[thread overview]
Message-ID: <20200617023406.GA42160@intel.com> (raw)
In-Reply-To: <20200612141746.9450-1-ciara.loftus@intel.com>

Hi Ciara,

Thanks for the work, this is a really good improvement.

On 06/12, Ciara Loftus wrote:
>This commit makes some changes to the AF_XDP PMD in an effort to improve
>its packet loss characteristics.
>
>1. In the case of failed transmission due to inability to reserve a tx
>descriptor, the PMD now pulls from the completion ring, issues a syscall
>in which the kernel attempts to complete outstanding tx operations, then
>tries to reserve the tx descriptor again. Prior to this we dropped the
>packet after the syscall and didn't try to re-reserve.
>
>2. During completion ring cleanup, always pull as many entries as possible
>from the ring as opposed to the batch size or just how many packets
>we're going to attempt to send. Keeping the completion ring emptier should
>reduce failed transmissions in the kernel, as the kernel requires space in
>the completion ring to successfully tx.
>
>3. Size the fill ring as twice the receive ring size which may help reduce
>allocation failures in the driver.
>
>With these changes, a benchmark which measured the packet rate at which
>0.01% packet loss could be reached improved from ~0.1 Gbps to ~3 Gbps.
>
>Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
>---
> drivers/net/af_xdp/rte_eth_af_xdp.c | 18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
>diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
>index 06124ba789..4c23bbdf7d 100644
>--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
>+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
>@@ -396,6 +396,8 @@ kick_tx(struct pkt_tx_queue *txq)
> {
> 	struct xsk_umem_info *umem = txq->umem;
> 
>+	pull_umem_cq(umem, XSK_RING_CONS__DEFAULT_NUM_DESCS);
>+
> #if defined(XDP_USE_NEED_WAKEUP)
> 	if (xsk_ring_prod__needs_wakeup(&txq->tx))
> #endif
>@@ -407,11 +409,9 @@ kick_tx(struct pkt_tx_queue *txq)
> 
> 			/* pull from completion queue to leave more space */
> 			if (errno == EAGAIN)
>-				pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
>+				pull_umem_cq(umem,
>+					     XSK_RING_CONS__DEFAULT_NUM_DESCS);
> 		}
>-#ifndef XDP_UMEM_UNALIGNED_CHUNK_FLAG
>-	pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
>-#endif
> }
> 
> #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
>@@ -428,7 +428,7 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
> 	struct xdp_desc *desc;
> 	uint64_t addr, offset;
> 
>-	pull_umem_cq(umem, nb_pkts);
>+	pull_umem_cq(umem, XSK_RING_CONS__DEFAULT_NUM_DESCS);
> 
> 	for (i = 0; i < nb_pkts; i++) {
> 		mbuf = bufs[i];
>@@ -436,7 +436,9 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
> 		if (mbuf->pool == umem->mb_pool) {
> 			if (!xsk_ring_prod__reserve(&txq->tx, 1, &idx_tx)) {
> 				kick_tx(txq);
>-				goto out;
>+				if (!xsk_ring_prod__reserve(&txq->tx, 1,
>+							    &idx_tx))
>+					goto out;
> 			}
> 			desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
> 			desc->len = mbuf->pkt_len;
>@@ -758,7 +760,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals __rte_unused,
> 	struct xsk_umem_info *umem;
> 	int ret;
> 	struct xsk_umem_config usr_config = {
>-		.fill_size = ETH_AF_XDP_DFLT_NUM_DESCS,
>+		.fill_size = ETH_AF_XDP_DFLT_NUM_DESCS * 2,
> 		.comp_size = ETH_AF_XDP_DFLT_NUM_DESCS,
> 		.flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG};
> 	void *base_addr = NULL;
>@@ -867,7 +869,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
> 	struct xsk_socket_config cfg;
> 	struct pkt_tx_queue *txq = rxq->pair;
> 	int ret = 0;
>-	int reserve_size = ETH_AF_XDP_DFLT_NUM_DESCS / 2;
>+	int reserve_size = ETH_AF_XDP_DFLT_NUM_DESCS;
> 	struct rte_mbuf *fq_bufs[reserve_size];
> 
> 	rxq->umem = xdp_umem_configure(internals, rxq);
>-- 
>2.17.1
>

Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>

Thread overview: 4+ messages
2020-06-12 14:17 Ciara Loftus
2020-06-17  2:34 ` Ye Xiaolong [this message]
2020-06-17  3:42 ` Stephen Hemminger
2020-06-23 14:50   ` Loftus, Ciara
