patches for DPDK stable branches
 help / color / mirror / Atom feed
From: "Zhang, Qi Z" <qi.z.zhang@intel.com>
To: "Ye, Xiaolong" <xiaolong.ye@intel.com>
Cc: "Yigit, Ferruh" <ferruh.yigit@intel.com>,
	"Loftus, Ciara" <ciara.loftus@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"stable@dpdk.org" <stable@dpdk.org>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>
Subject: Re: [dpdk-stable] [PATCH] net/af_xdp: fix Tx halt when no recv packets
Date: Wed, 11 Sep 2019 23:30:43 +0000	[thread overview]
Message-ID: <039ED4275CED7440929022BC67E7061153D9478C@SHSMSX105.ccr.corp.intel.com> (raw)
In-Reply-To: <20190911125625.GE45267@intel.com>



> -----Original Message-----
> From: Ye, Xiaolong
> Sent: Wednesday, September 11, 2019 8:56 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: Yigit, Ferruh <ferruh.yigit@intel.com>; Loftus, Ciara
> <ciara.loftus@intel.com>; dev@dpdk.org; stable@dpdk.org; Karlsson,
> Magnus <magnus.karlsson@intel.com>
> Subject: Re: [PATCH] net/af_xdp: fix Tx halt when no recv packets
> 
> On 09/11, Zhang, Qi Z wrote:
> >
> >
> >> -----Original Message-----
> >> From: Ye, Xiaolong
> >> Sent: Tuesday, September 10, 2019 11:09 PM
> >> To: Zhang, Qi Z <qi.z.zhang@intel.com>
> >> Cc: Yigit, Ferruh <ferruh.yigit@intel.com>; Loftus, Ciara
> >> <ciara.loftus@intel.com>; dev@dpdk.org; stable@dpdk.org; Karlsson,
> >> Magnus <magnus.karlsson@intel.com>
> >> Subject: Re: [PATCH] net/af_xdp: fix Tx halt when no recv packets
> >>
> >> On 09/10, Zhang, Qi Z wrote:
> >> >
> >> >
> >> >> -----Original Message-----
> >> >> From: Ye, Xiaolong
> >> >> Sent: Tuesday, September 10, 2019 9:54 PM
> >> >> To: Zhang, Qi Z <qi.z.zhang@intel.com>
> >> >> Cc: Yigit, Ferruh <ferruh.yigit@intel.com>; Loftus, Ciara
> >> >> <ciara.loftus@intel.com>; dev@dpdk.org; stable@dpdk.org; Karlsson,
> >> >> Magnus <magnus.karlsson@intel.com>
> >> >> Subject: Re: [PATCH] net/af_xdp: fix Tx halt when no recv packets
> >> >>
> >> >> On 09/10, Zhang, Qi Z wrote:
> >> >> >
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: Ye, Xiaolong
> >> >> >> Sent: Tuesday, September 10, 2019 12:13 AM
> >> >> >> To: Yigit, Ferruh <ferruh.yigit@intel.com>; Loftus, Ciara
> >> >> >> <ciara.loftus@intel.com>; Ye, Xiaolong <xiaolong.ye@intel.com>;
> >> >> >> Zhang, Qi Z <qi.z.zhang@intel.com>
> >> >> >> Cc: dev@dpdk.org; stable@dpdk.org
> >> >> >> Subject: [PATCH] net/af_xdp: fix Tx halt when no recv packets
> >> >> >>
> >> >> >> The kernel only consumes Tx packets if we have some Rx traffic
> >> >> >> on specified queue or we have called send(). So we need to
> >> >> >> issue a
> >> >> >> send() even when the allocation fails so that kernel will start
> >> >> >> to consume
> >> >> packets again.
> >> >> >
> >> >> >So "allocation fails" means " xsk_ring_prod__reserve" fail right?
> >> >>
> >> >> Yes.
> >> >>
> >> >> >I don't understand when xsk_ring_prod__needs_wakeup is true why
> >> >> >kernel will stop Tx packet at this situation would you share more
> >> insight?
> >> >>
> >> >> Actually, the fail case is xsk_ring_prod__needs_wakeup is false,
> >> >> then we can't issue a send() when xsk_ring_prod__reserve fails.
> >> >
> >> >Sorry, I think my question should be for the case when
> >> >xsk_ring_prod__needs_wakeup is false, I don't understand why we need
> >> >to handle different at below two situations 1. when
> >> >xsk_ring_prod__reserve fails 2. normal tx scenario.
> >> >My understanding is when xsk_ring_prod__needs_wakeup(tx) is false,
> >> which means Tx is ongoing, we don't need to wake up kernel to continue.
> >> >
> >>
> >> The problem is that kernel does not guarantee that all entries are sent for
> Tx.
> >> There are a number of reasons that this might not happen, but usually
> >> some Rx packet will at some point in time in the very short future
> >> trigger further Tx processing and the packets will be sent. But if
> >> you only have Tx processing and no Rx at all, you have to trigger a sento()
> again.
> >
> >Ok , so the question is why we have below code.
> >#if defined(XDP_USE_NEED_WAKEUP)
> >if (xsk_ring_prod__needs_wakeup(&txq->tx))
> >#endif
> >	kick_tx(txq);
> >
> >Here, when xsk_ring_prod__needs_wakeup is false, we can skip kick_tx
> (send), but why same "if check" can't be applied to the case when
> xsk_ring_prod__reserve failed?
> 
> When the system is running out of Tx entries, it needs some explicit action to
> trigger kernel consumes the Tx buffers.
> 
> >
> >Btw, think about below case
> >when xsk_ring_prod_reserve failed, if we don't kick_tx, and no
> >following rx happens, does that mean the remain packets in tx queue will
> never get chance be transmitted?, what happen if the last tx_burst is never
> be called?
> 
> This is exactly the issue this patch try to fix, in this case,
> xsk_ring_prod__reserve failure means there is no more available entries in tx
> queue, if we don't call send/sendto or there is no rx traffic, Tx just halts.

the problem I saw is if a packet is transmitted in a tx_burst, it may get chance not be transmitted immediately by kernel (wait for a trigger), it may rely on a "send" call from next tx_burst.(that's the patch does). But this does not looks like a correct driver behavior, since we can't assume application will do tx_burst all the time.
I think in every tx_burst we need to guarantee kernel tx should be activated immediately, trigger a Tx for a previous tx_burst should not be the case.

> 
> Thanks,
> Xiaolong
> 
> >
> >>
> >> Thanks,
> >> Xiaolong
> >>
> >> >>
> >> >> Thanks,
> >> >> Xiaolong
> >> >>
> >> >> >
> >> >> >Thanks
> >> >> >Qi
> >> >> >
> >> >> >>
> >> >> >> Commit 45bba02c95b0 ("net/af_xdp: support need wakeup
> feature")
> >> >> >> breaks above rule by adding some condition to send, this patch
> >> >> >> fixes it while still keeps the need_wakeup feature for Tx.
> >> >> >>
> >> >> >> Fixes: 45bba02c95b0 ("net/af_xdp: support need wakeup feature")
> >> >> >> Cc: stable@dpdk.org
> >> >> >>
> >> >> >> Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
> >> >> >> ---
> >> >> >>  drivers/net/af_xdp/rte_eth_af_xdp.c | 28
> >> >> >> ++++++++++++++--------------
> >> >> >>  1 file changed, 14 insertions(+), 14 deletions(-)
> >> >> >>
> >> >> >> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c
> >> >> >> b/drivers/net/af_xdp/rte_eth_af_xdp.c
> >> >> >> index 41ed5b2af..e496e9aaa 100644
> >> >> >> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> >> >> >> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> >> >> >> @@ -286,19 +286,16 @@ kick_tx(struct pkt_tx_queue *txq)  {
> >> >> >>  	struct xsk_umem_info *umem = txq->pair->umem;
> >> >> >>
> >> >> >> -#if defined(XDP_USE_NEED_WAKEUP)
> >> >> >> -	if (xsk_ring_prod__needs_wakeup(&txq->tx))
> >> >> >> -#endif
> >> >> >> -		while (send(xsk_socket__fd(txq->pair->xsk), NULL,
> >> >> >> -			    0, MSG_DONTWAIT) < 0) {
> >> >> >> -			/* some thing unexpected */
> >> >> >> -			if (errno != EBUSY && errno != EAGAIN && errno !=
> >> EINTR)
> >> >> >> -				break;
> >> >> >> -
> >> >> >> -			/* pull from completion queue to leave more space
> */
> >> >> >> -			if (errno == EAGAIN)
> >> >> >> -				pull_umem_cq(umem,
> >> ETH_AF_XDP_TX_BATCH_SIZE);
> >> >> >> -		}
> >> >> >> +	while (send(xsk_socket__fd(txq->pair->xsk), NULL,
> >> >> >> +		    0, MSG_DONTWAIT) < 0) {
> >> >> >> +		/* some thing unexpected */
> >> >> >> +		if (errno != EBUSY && errno != EAGAIN && errno !=
> EINTR)
> >> >> >> +			break;
> >> >> >> +
> >> >> >> +		/* pull from completion queue to leave more space */
> >> >> >> +		if (errno == EAGAIN)
> >> >> >> +			pull_umem_cq(umem,
> ETH_AF_XDP_TX_BATCH_SIZE);
> >> >> >> +	}
> >> >> >>  	pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);  }
> >> >> >>
> >> >> >> @@ -367,7 +364,10 @@ eth_af_xdp_tx(void *queue, struct
> rte_mbuf
> >> >> >> **bufs, uint16_t nb_pkts)
> >> >> >>
> >> >> >>  	xsk_ring_prod__submit(&txq->tx, nb_pkts);
> >> >> >>
> >> >> >> -	kick_tx(txq);
> >> >> >> +#if defined(XDP_USE_NEED_WAKEUP)
> >> >> >> +	if (xsk_ring_prod__needs_wakeup(&txq->tx))
> >> >> >> +#endif
> >> >> >> +		kick_tx(txq);
> >> >> >>
> >> >> >>  	txq->stats.tx_pkts += nb_pkts;
> >> >> >>  	txq->stats.tx_bytes += tx_bytes;
> >> >> >> --
> >> >> >> 2.17.1
> >> >> >

  reply	other threads:[~2019-09-11 23:30 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-09 16:12 Xiaolong Ye
2019-09-10  4:14 ` Zhang, Qi Z
2019-09-10 13:53   ` Ye Xiaolong
2019-09-10 14:41     ` Zhang, Qi Z
2019-09-10 15:09       ` Ye Xiaolong
2019-09-11  2:05         ` Zhang, Qi Z
2019-09-11 12:56           ` Ye Xiaolong
2019-09-11 23:30             ` Zhang, Qi Z [this message]
2019-09-11  0:12     ` Zhang, Qi Z
2019-09-17  9:13 ` Loftus, Ciara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=039ED4275CED7440929022BC67E7061153D9478C@SHSMSX105.ccr.corp.intel.com \
    --to=qi.z.zhang@intel.com \
    --cc=ciara.loftus@intel.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=magnus.karlsson@intel.com \
    --cc=stable@dpdk.org \
    --cc=xiaolong.ye@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).