DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Dumitrescu, Cristian" <cristian.dumitrescu@intel.com>
To: Robert Sanford <rsanford2@gmail.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "Liang, Cunming" <cunming.liang@intel.com>
Subject: Re: [dpdk-dev] [PATCH 4/4] port: fix ethdev writer burst too big
Date: Thu, 31 Mar 2016 13:22:47 +0000	[thread overview]
Message-ID: <3EB4FA525960D640B5BDFFD6A3D8912647974F2E@IRSMSX108.ger.corp.intel.com> (raw)
In-Reply-To: <1459198297-49854-5-git-send-email-rsanford@akamai.com>



> -----Original Message-----
> From: Robert Sanford [mailto:rsanford2@gmail.com]
> Sent: Monday, March 28, 2016 9:52 PM
> To: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Subject: [PATCH 4/4] port: fix ethdev writer burst too big
> 
> For f_tx_bulk functions in rte_port_ethdev.c, we may unintentionally
> send bursts larger than tx_burst_sz to the underlying ethdev.
> Some PMDs (e.g., ixgbe) may truncate this request to their maximum
> burst size, resulting in unnecessary enqueuing failures or ethdev
> writer retries.

Sending bursts larger than tx_burst_sz is actually intentional. The assumption is that NIC performance benefits from larger burst size. So the tx_burst_sz is used as a minimal burst size requirement, not as a maximal or fixed burst size requirement.

I agree with you that a while ago the vector version of IXGBE driver used to work the way you describe it, but I don't think this is the case anymore. As an example, if TX burst size is set to 32 and 48 packets are transmitted, than the PMD will TX all the 48 packets (internally it can work in batches of 4, 8, 32, etc, should not matter) rather than TXing just 32 packets out of 48 and user having to either discard or retry with the remaining 16 packets. I am CC-ing Steve Liang for confirming this.

Is there any PMD that people can name that currently behaves the opposite, i.e. given a burst of 48 pkts for TX, accept 32 pkts and discard the other 16?

> 
> We propose to fix this by moving the tx buffer flushing logic from
> *after* the loop that puts all packets into the tx buffer, to *inside*
> the loop, testing for a full burst when adding each packet.
> 

The issue I have with this approach is the introduction of a branch that has to be tested for each iteration of the loop rather than once for the entire loop.

The code branch where you add this is actually the slow(er) code path (where local variable expr != 0), which is used for non-contiguous or bursts smaller than tx_burst_sz. Is there a particular reason you are only interested of enabling this strategy (of using tx_burst_sz as a fixed burst size requirement) only on this code path? The reason I am asking is the other fast(er) code path (where expr == 0) also uses tx_burst_sz as a minimal requirement and therefore it can send burst sizes bigger than tx_burst_sz.


> Signed-off-by: Robert Sanford <rsanford@akamai.com>
> ---
>  lib/librte_port/rte_port_ethdev.c |   20 ++++++++++----------
>  1 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/librte_port/rte_port_ethdev.c
> b/lib/librte_port/rte_port_ethdev.c
> index 3fb4947..1283338 100644
> --- a/lib/librte_port/rte_port_ethdev.c
> +++ b/lib/librte_port/rte_port_ethdev.c
> @@ -151,7 +151,7 @@ static int rte_port_ethdev_reader_stats_read(void
> *port,
>  struct rte_port_ethdev_writer {
>  	struct rte_port_out_stats stats;
> 
> -	struct rte_mbuf *tx_buf[2 * RTE_PORT_IN_BURST_SIZE_MAX];
> +	struct rte_mbuf *tx_buf[RTE_PORT_IN_BURST_SIZE_MAX];
>  	uint32_t tx_burst_sz;
>  	uint16_t tx_buf_count;
>  	uint64_t bsz_mask;
> @@ -257,11 +257,11 @@ rte_port_ethdev_writer_tx_bulk(void *port,
>  			p->tx_buf[tx_buf_count++] = pkt;
> 
> 	RTE_PORT_ETHDEV_WRITER_STATS_PKTS_IN_ADD(p, 1);
>  			pkts_mask &= ~pkt_mask;
> -		}
> 
> -		p->tx_buf_count = tx_buf_count;
> -		if (tx_buf_count >= p->tx_burst_sz)
> -			send_burst(p);
> +			p->tx_buf_count = tx_buf_count;
> +			if (tx_buf_count >= p->tx_burst_sz)
> +				send_burst(p);
> +		}
>  	}

One observation here: if we enable this proposal (which I have an issue with due to the executing the branch per loop iteration rather than once per entire loop), it also eliminates the buffer overflow issue flagged by you in the other email :), so no need to e.g. doble the size of the port internal buffer (tx_buf).

> 
>  	return 0;
> @@ -328,7 +328,7 @@ static int rte_port_ethdev_writer_stats_read(void
> *port,
>  struct rte_port_ethdev_writer_nodrop {
>  	struct rte_port_out_stats stats;
> 
> -	struct rte_mbuf *tx_buf[2 * RTE_PORT_IN_BURST_SIZE_MAX];
> +	struct rte_mbuf *tx_buf[RTE_PORT_IN_BURST_SIZE_MAX];
>  	uint32_t tx_burst_sz;
>  	uint16_t tx_buf_count;
>  	uint64_t bsz_mask;
> @@ -466,11 +466,11 @@ rte_port_ethdev_writer_nodrop_tx_bulk(void
> *port,
>  			p->tx_buf[tx_buf_count++] = pkt;
> 
> 	RTE_PORT_ETHDEV_WRITER_NODROP_STATS_PKTS_IN_ADD(p, 1);
>  			pkts_mask &= ~pkt_mask;
> -		}
> 
> -		p->tx_buf_count = tx_buf_count;
> -		if (tx_buf_count >= p->tx_burst_sz)
> -			send_burst_nodrop(p);
> +			p->tx_buf_count = tx_buf_count;
> +			if (tx_buf_count >= p->tx_burst_sz)
> +				send_burst_nodrop(p);
> +		}
>  	}
> 
>  	return 0;
> --
> 1.7.1

  reply	other threads:[~2016-03-31 13:22 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-28 20:51 [dpdk-dev] [PATCH 0/4] port: fix and test bugs in tx_bulk ops Robert Sanford
2016-03-28 20:51 ` [dpdk-dev] [PATCH 1/4] app/test: enhance test_port_ring_writer Robert Sanford
2016-04-01 19:42   ` Sanford, Robert
2016-04-06 16:46     ` Dumitrescu, Cristian
2016-03-28 20:51 ` [dpdk-dev] [PATCH 2/4] port: fix ring writer buffer overflow Robert Sanford
2016-03-31 11:21   ` Dumitrescu, Cristian
2016-03-28 20:51 ` [dpdk-dev] [PATCH 3/4] port: fix full burst checks in f_tx_bulk ops Robert Sanford
2016-03-31 15:41   ` Dumitrescu, Cristian
2016-04-01 19:31     ` Sanford, Robert
2016-03-28 20:51 ` [dpdk-dev] [PATCH 4/4] port: fix ethdev writer burst too big Robert Sanford
2016-03-31 13:22   ` Dumitrescu, Cristian [this message]
2016-03-30 11:00 ` [dpdk-dev] [PATCH 0/4] port: fix and test bugs in tx_bulk ops Thomas Monjalon
2016-03-30 11:58   ` Dumitrescu, Cristian
2016-04-01 16:22 [dpdk-dev] [PATCH 4/4] port: fix ethdev writer burst too big Sanford, Robert
2016-04-11 19:21 ` Dumitrescu, Cristian
2016-04-11 20:36   ` Sanford, Robert
2016-04-12 15:40     ` Dumitrescu, Cristian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3EB4FA525960D640B5BDFFD6A3D8912647974F2E@IRSMSX108.ger.corp.intel.com \
    --to=cristian.dumitrescu@intel.com \
    --cc=cunming.liang@intel.com \
    --cc=dev@dpdk.org \
    --cc=rsanford2@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).