From: Bruce Richardson <bruce.richardson@intel.com>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: Jeff Guo <jia.guo@intel.com>,
Thomas Monjalon <thomas@monjalon.net>,
Ferruh Yigit <ferruh.yigit@intel.com>,
Andrew Rybchenko <arybchenko@solarflare.com>,
qiming.yang@intel.com, beilei.xing@intel.com,
wei.zhao1@intel.com, qi.z.zhang@intel.com, jingjing.wu@intel.com,
dev@dpdk.org, helin.zhang@intel.com, barbette@kth.se
Subject: Re: [dpdk-dev] [RFC] ethdev: rte_eth_rx_burst() requirements for nb_pkts
Date: Thu, 27 Aug 2020 12:41:17 +0100
Message-ID: <20200827114117.GD569@bricha3-MOBL.ger.corp.intel.com>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35C6125B@smartserver.smartshare.dk>
On Thu, Aug 27, 2020 at 12:13:51PM +0200, Morten Brørup wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Thursday, August 27, 2020 11:44 AM
> >
> > On Thu, Aug 27, 2020 at 11:31:15AM +0200, Morten Brørup wrote:
> > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > Sent: Thursday, August 27, 2020 11:10 AM
> > > >
> > > > On Thu, Aug 27, 2020 at 10:40:11AM +0200, Morten Brørup wrote:
> > > > > Jeff and Ethernet API maintainers Thomas, Ferruh and Andrew,
> > > > >
> > > > > I'm hijacking this patch thread to propose a small API modification that prevents unnecessary performance degradation.
> > > > >
> > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jeff Guo
> > > > > > Sent: Thursday, August 27, 2020 9:55 AM
> > > > > >
> > > > > > The limitation of burst size in vector rx was removed, since it should
> > > > > > retrieve as many received packets as possible. And also the scattered
> > > > > > receive path should use a wrapper function to achieve the goal of
> > > > > > burst maximizing.
> > > > > >
> > > > > > This patch set aims to maximize vector rx burst for
> > > > > > ixgbe/i40e/ice/iavf PMDs.
> > > > > >
> > > > >
> > > > > Now I'm going to be pedantic and say that it still doesn't conform to the rte_eth_rx_burst() API, because the API does not specify any minimum requirement for nb_pkts.
> > > > >
> > > > > In theory, that could also be fixed in the driver by calling the non-vector function from the vector functions if nb_pkts is too small for the vector implementation.
> > > > >
> > > > > However, I think that calling rte_eth_rx_burst() with a small nb_pkts is silly and not in the spirit of DPDK, and introducing an additional comparison for a small nb_pkts in the driver vector functions would degrade their performance (only slightly, but anyway).
> > > > >
> > > >
> > > > Actually, I'd like to see a confirmed measurement showing a slowdown
> > > > before we discard such an option. :-)
> > >
> > > Good point!
> > >
> > > > While I agree that using small bursts is not in keeping with the
> > > > design approach of DPDK of using large bursts to amortize costs and
> > > > allow prefetching, there are cases where a user/app may want a small
> > > > burst size, e.g. 4, for latency reasons, and we need a way to support
> > > > that.
> > > >
> > > I assume that calling rte_eth_rx_burst() with nb_pkts=32 returns 4 packets if only 4 packets are available, so you would need to be extremely latency sensitive to call it with a smaller nb_pkts. I guess that high frequency trading is the only real life scenario here.
> > >
> > Yes, it really boils down to whether you are prepared to accept lower
> > max throughput or dropped packets in order to gain lower latency.
> >
> > > > Since the path selection is dynamic, we need to either:
> > > > a) provide a way for the user to specify that they will use smaller
> > > >    bursts, so that vector functions should not be used
> > > > b) have the vector functions transparently fall back to the scalar
> > > >    ones if used with smaller bursts
> > > >
> > > > Of these, option b) is simpler, and should be low cost since any check
> > > > is just once per burst, and - assuming an app is written using the same
> > > > request-size each time - should be entirely predictable after the first
> > > > call.
> > > >
> > > Why does everyone assume that DPDK applications are so simple that the branch predictor will cover the entire data path? I hear this argument over and over again, and by principle I disagree with it!
> > >
> >
> > Fair enough, that was an assumption on my part. Do you see in your apps
> > many cases where branches are getting mispredicted despite going the same
> > way each time through the code?
> >
> We haven't looked deeply into this, but I don't think so.
>
> My objection is of a more general nature. As a library, DPDK cannot assume that applications using it are simple, and - based on that assumption - take away resources that could have been available for the application.
>
> The Intel general optimization guidelines specify that code should be arranged to be consistent with the static branch prediction algorithm: make the fall-through code following a conditional branch be the likely target for a branch with a forward target, and make the fall-through code following a conditional branch be the unlikely target for a branch with a backward target.
>
> It also says: Conditional branches that are never taken do not consume BTB resources.
>
> Somehow this last detail is completely ignored by DPDK developers.
>
> We put a lot of effort into conserving resources in most areas in DPDK, but when it comes to the branch target buffer (BTB), we gladly organize code with branches turning the wrong way, thus unnecessarily consuming BTB entries. And the argument goes: The branch predictor will catch it after the first time.
>
Looks like something to investigate more. Thanks for bringing this up.
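
As a rough illustration of the layout rule quoted above, here is a minimal sketch in C. It assumes a GCC- or Clang-style compiler with __builtin_expect (DPDK offers equivalent likely()/unlikely() macros in rte_branch_prediction.h). The function name, its parameters and the oversize-length scenario are hypothetical and not taken from any driver. The rare case is marked unlikely so that the compiler can keep the common case as the fall-through path; whether that actually saves BTB entries on a given CPU would still need to be measured.

```c
#include <stddef.h>
#include <stdint.h>

/* Compiler hints (GCC/Clang builtin). Defined locally so the sketch is
 * self-contained. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical helper: sum the lengths of a burst of packets, skipping
 * and counting any length above max_len.
 *
 * The oversize case is expected to be rare, so it is marked unlikely().
 * This lets the compiler lay out the common case as straight-line,
 * fall-through code and move the rare case out of the way.
 */
static uint32_t
sum_valid_lengths(const uint16_t *len, size_t n, uint16_t max_len,
		  unsigned int *nb_bad)
{
	uint32_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (unlikely(len[i] > max_len)) {
			(*nb_bad)++;	/* rare path */
			continue;
		}
		total += len[i];	/* common, fall-through path */
	}
	return total;
}
```

The hint does not change what the function computes, only how the compiler is encouraged to arrange the generated branches.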

> > > How about c): add rte_eth_rx() and rte_eth_tx() functions for receiving/transmitting a single packet. The ring library has such functions.
> > >
> > > Optimized single-packet functions might even perform better than calling the burst functions with nb_pkts=1. Great for latency focused applications. :-)
> > >
> > That is another option, yes.
> > A further option is to add to the vector code a one-off switch to check,
> > the first time it's called, that the request size is not lower than the
> > min supported (again based on the assumption that one is not going to be
> > varying the burst size asked - which may not be true in all cases but
> > won't leave us any worse off than we are now!).
>
> I certainly don't support this option. But it was worth mentioning.
>
Right. For now then, it seems like just documenting a minimum burst size is
reasonable.
/Bruce
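
For reference, option b) from the discussion above (the vector receive function falling back to the scalar one for small bursts) could look roughly like the sketch below. This is not code from any DPDK PMD: the queue structure, the two stand-in receive functions and the VEC_MIN_BURST value of 4 are all hypothetical, and real receive functions take an mbuf array rather than just a count. It only shows where the single per-burst comparison would sit.

```c
#include <stdint.h>

#define unlikely(x) __builtin_expect(!!(x), 0)	/* GCC/Clang builtin */

/* Hypothetical minimum burst size the vector path can handle. */
#define VEC_MIN_BURST 4

enum rx_path { RX_PATH_NONE, RX_PATH_SCALAR, RX_PATH_VECTOR };

/* Hypothetical stand-in for a driver receive queue. */
struct demo_rxq {
	uint16_t available;	/* packets waiting in the queue */
	enum rx_path last_path;	/* which path served the last call */
};

/* Stand-in scalar path: can return any number of packets. */
static uint16_t
demo_rx_scalar(struct demo_rxq *q, uint16_t nb_pkts)
{
	uint16_t n = nb_pkts < q->available ? nb_pkts : q->available;

	q->available = (uint16_t)(q->available - n);
	q->last_path = RX_PATH_SCALAR;
	return n;
}

/* Stand-in vector path: rounds the request down to a multiple of
 * VEC_MIN_BURST, so a request below VEC_MIN_BURST would return 0. */
static uint16_t
demo_rx_vector(struct demo_rxq *q, uint16_t nb_pkts)
{
	uint16_t n = (uint16_t)(nb_pkts - nb_pkts % VEC_MIN_BURST);

	if (n > q->available)
		n = q->available;
	q->available = (uint16_t)(q->available - n);
	q->last_path = RX_PATH_VECTOR;
	return n;
}

/* Option b): one comparison per burst, with the small-burst case marked
 * unlikely so the vector path stays the fall-through path. */
static uint16_t
demo_rx_burst(struct demo_rxq *q, uint16_t nb_pkts)
{
	if (unlikely(nb_pkts < VEC_MIN_BURST))
		return demo_rx_scalar(q, nb_pkts);
	return demo_rx_vector(q, nb_pkts);
}
```

With 10 packets queued, demo_rx_burst(&q, 2) is served by the scalar stand-in and returns 2, and a following demo_rx_burst(&q, 32) is served by the vector stand-in and returns the remaining 8. Documenting a minimum burst size instead, as concluded above, would remove the comparison from the fast path and leave small requests as the caller's responsibility.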
Thread overview: 48+ messages
2020-08-27 7:54 [dpdk-dev] [PATCH v1 0/4] maximize vector rx burst for PMDs Jeff Guo
2020-08-27 7:54 ` [dpdk-dev] [PATCH v1 1/4] net/ixgbe: maximize vector rx burst for ixgbe Jeff Guo
2020-08-27 7:54 ` [dpdk-dev] [PATCH v1 2/4] net/i40e: maximize vector rx burst for i40e Jeff Guo
2020-08-27 7:54 ` [dpdk-dev] [PATCH v1 3/4] net/ice: maximize vector rx burst for ice Jeff Guo
2020-08-27 7:54 ` [dpdk-dev] [PATCH v1 4/4] net/iavf: maximize vector rx burst for iavf Jeff Guo
2020-08-27 8:40 ` [dpdk-dev] [RFC] ethdev: rte_eth_rx_burst() requirements for nb_pkts Morten Brørup
2020-08-27 9:09 ` Bruce Richardson
2020-08-27 9:31 ` Morten Brørup
2020-08-27 9:43 ` Bruce Richardson
2020-08-27 10:13 ` [dpdk-dev] [RFC] ethdev: rte_eth_rx_burst() requirements for nb_pkts Morten Brørup
2020-08-27 11:41 ` Bruce Richardson [this message]
2020-08-28 9:03 ` Morten Brørup
2020-08-28 10:07 ` Bruce Richardson
2020-08-28 10:50 ` Morten Brørup
2020-08-29 10:15 ` Morten Brørup
2020-09-09 6:36 ` [dpdk-dev] [PATCH v3 0/5] fix vector rx burst for PMDs Jeff Guo
2020-09-09 6:36 ` [dpdk-dev] [PATCH v3 1/5] net/iavf: fix vector rx burst for iavf Jeff Guo
2020-09-09 6:36 ` [dpdk-dev] [PATCH v3 2/5] net/ixgbe: fix vector rx burst for ixgbe Jeff Guo
[not found] ` <VI1PR0802MB23518C6B517B6EAD8E018CD49E260@VI1PR0802MB2351.eurprd08.prod.outlook.com>
2020-09-09 9:54 ` [dpdk-dev] Re: " Feifei Wang
2020-09-09 6:36 ` [dpdk-dev] [PATCH v3 3/5] net/i40e: fix vector rx burst for i40e Jeff Guo
2020-09-09 6:36 ` [dpdk-dev] [PATCH v3 4/5] net/ice: fix vector rx burst for ice Jeff Guo
2020-09-15 7:10 ` Han, YingyaX
2020-09-09 6:36 ` [dpdk-dev] [PATCH v3 5/5] net/fm10k: fix vector rx burst for fm10k Jeff Guo
2020-09-09 6:45 ` [dpdk-dev] [PATCH v3 0/5] fix vector rx burst for PMDs Wang, Haiyue
2020-09-09 7:03 ` Guo, Jia
2020-09-09 7:05 ` Wang, Haiyue
2020-09-09 7:43 ` Morten Brørup
2020-09-09 7:55 ` Wang, Haiyue
2020-09-09 8:01 ` Guo, Jia
2020-09-17 7:58 ` [dpdk-dev] [PATCH v4 " Jeff Guo
2020-09-17 7:58 ` [dpdk-dev] [PATCH v4 1/5] net/iavf: fix vector rx burst for iavf Jeff Guo
2020-09-17 7:58 ` [dpdk-dev] [PATCH v4 2/5] net/ixgbe: fix vector rx burst for ixgbe Jeff Guo
2020-09-17 7:58 ` [dpdk-dev] [PATCH v4 3/5] net/i40e: fix vector rx burst for i40e Jeff Guo
2020-09-17 7:58 ` [dpdk-dev] [PATCH v4 4/5] net/ice: fix vector rx burst for ice Jeff Guo
2020-09-17 11:03 ` Zhang, Qi Z
2020-09-18 3:20 ` Guo, Jia
2020-09-18 3:41 ` Zhang, Qi Z
2020-09-18 4:41 ` Guo, Jia
2020-09-18 5:39 ` Zhang, Qi Z
2020-09-17 7:58 ` [dpdk-dev] [PATCH v4 5/5] net/fm10k: fix vector rx burst for fm10k Jeff Guo
2020-10-16 9:44 ` [dpdk-dev] [PATCH v5 0/5] fix vector rx burst for PMDs Jeff Guo
2020-10-16 9:44 ` [dpdk-dev] [PATCH v5 1/5] net/ixgbe: fix vector rx burst for ixgbe Jeff Guo
2020-10-16 9:44 ` [dpdk-dev] [PATCH v5 2/5] net/i40e: fix vector rx burst for i40e Jeff Guo
2020-10-16 9:44 ` [dpdk-dev] [PATCH v5 3/5] net/ice: fix vector rx burst for ice Jeff Guo
2020-10-16 9:44 ` [dpdk-dev] [PATCH v5 4/5] net/fm10k: fix vector rx burst for fm10k Jeff Guo
2020-10-16 9:44 ` [dpdk-dev] [PATCH v5 5/5] net/iavf: fix vector rx burst for iavf Jeff Guo
2020-10-23 5:09 ` Ling, WeiX
2020-10-23 10:11 ` [dpdk-dev] [PATCH v5 0/5] fix vector rx burst for PMDs Zhang, Qi Z