DPDK patches and discussions
* [dpdk-dev] [PATCH] net/af_xdp: optimisations to improve packet loss
@ 2020-06-12 14:17 Ciara Loftus
  2020-06-17  2:34 ` Ye Xiaolong
  2020-06-17  3:42 ` Stephen Hemminger
  0 siblings, 2 replies; 4+ messages in thread
From: Ciara Loftus @ 2020-06-12 14:17 UTC (permalink / raw)
  To: dev; +Cc: xiaolong.ye, magnus.karlsson, Ciara Loftus

This commit makes some changes to the AF_XDP PMD in an effort to improve
its packet loss characteristics.

1. In the case of failed transmission due to inability to reserve a tx
descriptor, the PMD now pulls from the completion ring, issues a syscall
in which the kernel attempts to complete outstanding tx operations, then
tries to reserve the tx descriptor again. Prior to this we dropped the
packet after the syscall and didn't try to re-reserve.

2. During completion ring cleanup, always pull as many entries as possible
from the ring as opposed to the batch size or just how many packets
we're going to attempt to send. Keeping the completion ring emptier should
reduce failed transmissions in the kernel, as the kernel requires space in
the completion ring to successfully tx.

3. Size the fill ring as twice the receive ring size which may help reduce
allocation failures in the driver.

With these changes, a benchmark which measured the packet rate at which
0.01% packet loss could be reached improved from ~0.1Gbps to ~3Gbps.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 06124ba789..4c23bbdf7d 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -396,6 +396,8 @@ kick_tx(struct pkt_tx_queue *txq)
 {
 	struct xsk_umem_info *umem = txq->umem;
 
+	pull_umem_cq(umem, XSK_RING_CONS__DEFAULT_NUM_DESCS);
+
 #if defined(XDP_USE_NEED_WAKEUP)
 	if (xsk_ring_prod__needs_wakeup(&txq->tx))
 #endif
@@ -407,11 +409,9 @@ kick_tx(struct pkt_tx_queue *txq)
 
 			/* pull from completion queue to leave more space */
 			if (errno == EAGAIN)
-				pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
+				pull_umem_cq(umem,
+					     XSK_RING_CONS__DEFAULT_NUM_DESCS);
 		}
-#ifndef XDP_UMEM_UNALIGNED_CHUNK_FLAG
-	pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
-#endif
 }
 
 #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
@@ -428,7 +428,7 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 	struct xdp_desc *desc;
 	uint64_t addr, offset;
 
-	pull_umem_cq(umem, nb_pkts);
+	pull_umem_cq(umem, XSK_RING_CONS__DEFAULT_NUM_DESCS);
 
 	for (i = 0; i < nb_pkts; i++) {
 		mbuf = bufs[i];
@@ -436,7 +436,9 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		if (mbuf->pool == umem->mb_pool) {
 			if (!xsk_ring_prod__reserve(&txq->tx, 1, &idx_tx)) {
 				kick_tx(txq);
-				goto out;
+				if (!xsk_ring_prod__reserve(&txq->tx, 1,
+							    &idx_tx))
+					goto out;
 			}
 			desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
 			desc->len = mbuf->pkt_len;
@@ -758,7 +760,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals __rte_unused,
 	struct xsk_umem_info *umem;
 	int ret;
 	struct xsk_umem_config usr_config = {
-		.fill_size = ETH_AF_XDP_DFLT_NUM_DESCS,
+		.fill_size = ETH_AF_XDP_DFLT_NUM_DESCS * 2,
 		.comp_size = ETH_AF_XDP_DFLT_NUM_DESCS,
 		.flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG};
 	void *base_addr = NULL;
@@ -867,7 +869,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
 	struct xsk_socket_config cfg;
 	struct pkt_tx_queue *txq = rxq->pair;
 	int ret = 0;
-	int reserve_size = ETH_AF_XDP_DFLT_NUM_DESCS / 2;
+	int reserve_size = ETH_AF_XDP_DFLT_NUM_DESCS;
 	struct rte_mbuf *fq_bufs[reserve_size];
 
 	rxq->umem = xdp_umem_configure(internals, rxq);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [dpdk-dev] [PATCH] net/af_xdp: optimisations to improve packet loss
  2020-06-12 14:17 [dpdk-dev] [PATCH] net/af_xdp: optimisations to improve packet loss Ciara Loftus
@ 2020-06-17  2:34 ` Ye Xiaolong
  2020-06-17  3:42 ` Stephen Hemminger
  1 sibling, 0 replies; 4+ messages in thread
From: Ye Xiaolong @ 2020-06-17  2:34 UTC (permalink / raw)
  To: Ciara Loftus; +Cc: dev, magnus.karlsson

Hi, Ciara

Thanks for the work, really good improvement.

On 06/12, Ciara Loftus wrote:
>This commit makes some changes to the AF_XDP PMD in an effort to improve
>its packet loss characteristics.
>
>1. In the case of failed transmission due to inability to reserve a tx
>descriptor, the PMD now pulls from the completion ring, issues a syscall
>in which the kernel attempts to complete outstanding tx operations, then
>tries to reserve the tx descriptor again. Prior to this we dropped the
>packet after the syscall and didn't try to re-reserve.
>
>2. During completion ring cleanup, always pull as many entries as possible
>from the ring as opposed to the batch size or just how many packets
>we're going to attempt to send. Keeping the completion ring emptier should
>reduce failed transmissions in the kernel, as the kernel requires space in
>the completion ring to successfully tx.
>
>3. Size the fill ring as twice the receive ring size which may help reduce
>allocation failures in the driver.
>
>With these changes, a benchmark which measured the packet rate at which
>0.01% packet loss could be reached improved from ~0.1Gbps to ~3Gbps.
>
>Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
>---
> drivers/net/af_xdp/rte_eth_af_xdp.c | 18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
>diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
>index 06124ba789..4c23bbdf7d 100644
>--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
>+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
>@@ -396,6 +396,8 @@ kick_tx(struct pkt_tx_queue *txq)
> {
> 	struct xsk_umem_info *umem = txq->umem;
> 
>+	pull_umem_cq(umem, XSK_RING_CONS__DEFAULT_NUM_DESCS);
>+
> #if defined(XDP_USE_NEED_WAKEUP)
> 	if (xsk_ring_prod__needs_wakeup(&txq->tx))
> #endif
>@@ -407,11 +409,9 @@ kick_tx(struct pkt_tx_queue *txq)
> 
> 			/* pull from completion queue to leave more space */
> 			if (errno == EAGAIN)
>-				pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
>+				pull_umem_cq(umem,
>+					     XSK_RING_CONS__DEFAULT_NUM_DESCS);
> 		}
>-#ifndef XDP_UMEM_UNALIGNED_CHUNK_FLAG
>-	pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
>-#endif
> }
> 
> #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
>@@ -428,7 +428,7 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
> 	struct xdp_desc *desc;
> 	uint64_t addr, offset;
> 
>-	pull_umem_cq(umem, nb_pkts);
>+	pull_umem_cq(umem, XSK_RING_CONS__DEFAULT_NUM_DESCS);
> 
> 	for (i = 0; i < nb_pkts; i++) {
> 		mbuf = bufs[i];
>@@ -436,7 +436,9 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
> 		if (mbuf->pool == umem->mb_pool) {
> 			if (!xsk_ring_prod__reserve(&txq->tx, 1, &idx_tx)) {
> 				kick_tx(txq);
>-				goto out;
>+				if (!xsk_ring_prod__reserve(&txq->tx, 1,
>+							    &idx_tx))
>+					goto out;
> 			}
> 			desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
> 			desc->len = mbuf->pkt_len;
>@@ -758,7 +760,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals __rte_unused,
> 	struct xsk_umem_info *umem;
> 	int ret;
> 	struct xsk_umem_config usr_config = {
>-		.fill_size = ETH_AF_XDP_DFLT_NUM_DESCS,
>+		.fill_size = ETH_AF_XDP_DFLT_NUM_DESCS * 2,
> 		.comp_size = ETH_AF_XDP_DFLT_NUM_DESCS,
> 		.flags = XDP_UMEM_UNALIGNED_CHUNK_FLAG};
> 	void *base_addr = NULL;
>@@ -867,7 +869,7 @@ xsk_configure(struct pmd_internals *internals, struct pkt_rx_queue *rxq,
> 	struct xsk_socket_config cfg;
> 	struct pkt_tx_queue *txq = rxq->pair;
> 	int ret = 0;
>-	int reserve_size = ETH_AF_XDP_DFLT_NUM_DESCS / 2;
>+	int reserve_size = ETH_AF_XDP_DFLT_NUM_DESCS;
> 	struct rte_mbuf *fq_bufs[reserve_size];
> 
> 	rxq->umem = xdp_umem_configure(internals, rxq);
>-- 
>2.17.1
>

Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>


* Re: [dpdk-dev] [PATCH] net/af_xdp: optimisations to improve packet loss
  2020-06-12 14:17 [dpdk-dev] [PATCH] net/af_xdp: optimisations to improve packet loss Ciara Loftus
  2020-06-17  2:34 ` Ye Xiaolong
@ 2020-06-17  3:42 ` Stephen Hemminger
  2020-06-23 14:50   ` Loftus, Ciara
  1 sibling, 1 reply; 4+ messages in thread
From: Stephen Hemminger @ 2020-06-17  3:42 UTC (permalink / raw)
  To: Ciara Loftus; +Cc: dev, xiaolong.ye, magnus.karlsson

On Fri, 12 Jun 2020 14:17:46 +0000
Ciara Loftus <ciara.loftus@intel.com> wrote:

> This commit makes some changes to the AF_XDP PMD in an effort to improve
> its packet loss characteristics.
> 
> 1. In the case of failed transmission due to inability to reserve a tx
> descriptor, the PMD now pulls from the completion ring, issues a syscall
> in which the kernel attempts to complete outstanding tx operations, then
> tries to reserve the tx descriptor again. Prior to this we dropped the
> packet after the syscall and didn't try to re-reserve.
> 
> 2. During completion ring cleanup, always pull as many entries as possible
> from the ring as opposed to the batch size or just how many packets
> we're going to attempt to send. Keeping the completion ring emptier should
> reduce failed transmissions in the kernel, as the kernel requires space in
> the completion ring to successfully tx.
> 
> 3. Size the fill ring as twice the receive ring size which may help reduce
> allocation failures in the driver.
> 
> With these changes, a benchmark which measured the packet rate at which
> 0.01% packet loss could be reached improved from ~0.1Gbps to ~3Gbps.
> 
> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>

You might want to add the ability to emulate a tx_free threshold
by pulling more completions earlier.


* Re: [dpdk-dev] [PATCH] net/af_xdp: optimisations to improve packet loss
  2020-06-17  3:42 ` Stephen Hemminger
@ 2020-06-23 14:50   ` Loftus, Ciara
  0 siblings, 0 replies; 4+ messages in thread
From: Loftus, Ciara @ 2020-06-23 14:50 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Ye, Xiaolong, Karlsson, Magnus

> 
> On Fri, 12 Jun 2020 14:17:46 +0000
> Ciara Loftus <ciara.loftus@intel.com> wrote:
> 
> > This commit makes some changes to the AF_XDP PMD in an effort to
> > improve its packet loss characteristics.
> >
> > 1. In the case of failed transmission due to inability to reserve a tx
> > descriptor, the PMD now pulls from the completion ring, issues a syscall
> > in which the kernel attempts to complete outstanding tx operations, then
> > tries to reserve the tx descriptor again. Prior to this we dropped the
> > packet after the syscall and didn't try to re-reserve.
> >
> > 2. During completion ring cleanup, always pull as many entries as possible
> > from the ring as opposed to the batch size or just how many packets
> > we're going to attempt to send. Keeping the completion ring emptier
> > should reduce failed transmissions in the kernel, as the kernel
> > requires space in the completion ring to successfully tx.
> >
> > 3. Size the fill ring as twice the receive ring size which may help reduce
> > allocation failures in the driver.
> >
> > With these changes, a benchmark which measured the packet rate at
> > which 0.01% packet loss could be reached improved from ~0.1Gbps to
> > ~3Gbps.
> >
> > Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
> 
> You might want to add the ability to emulate a tx_free threshold
> by pulling more completions earlier.

Thanks for the suggestion. I've implemented it in the v2.

Ciara



