DPDK patches and discussions
* [dpdk-dev] Minutes of Technical Board Meeting, 2021-03-10
@ 2021-03-31  0:44 Honnappa Nagarahalli
  2021-03-31  8:52 ` Tom Barbette
  0 siblings, 1 reply; 7+ messages in thread
From: Honnappa Nagarahalli @ 2021-03-31  0:44 UTC (permalink / raw)
  To: dev; +Cc: nd, nd

Minutes of Technical Board Meeting, 2021-03-10

Members Attending:
   - Aaron Conole
   - Bruce Richardson
   - Ferruh Yigit
   - Hemant Agrawal
   - Honnappa Nagarahalli (Chair)
   - Jerin Jacob
   - Kevin Traynor
   - Konstantin Ananyev
   - Maxime Coquelin
   - Olivier Matz
   - Stephen Hemminger
   - Thomas Monjalon

NOTE: The Technical Board meetings take place every second Wednesday on https://meet.jit.si/DPDK at 3 pm UTC.
Meetings are public, and DPDK community members are welcome to attend.
Agenda and minutes can be found at http://core.dpdk.org/techboard/minutes

NOTE: Next meeting will be on Wednesday 2021-03-24 @3pm UTC, and will be chaired by Jerin.

#1 Build config improvement proposal
   * Discussion centered around the following issues
	- Ability to enable/disable compilation of libraries and PMDs. There was no objection from the techboard members on the requirement.
	- Ability to enable/disable #defines
	- Ability to tune the values of #defines
   * A few prominent points discussed
	- This will result in #ifdefs in the code (for ex: in testpmd)
	- One option is for all the PMDs to document their configurable #defines in PMD specific header files. Having these distributed is much easier to search.
	- Can some of the existing #defines be converted to runtime configurations? For ex: RTE_MAX_LCORE? This might impact ABI.
   * Bruce to think about converting the doc to a blog or an email on the mailing list. But soliciting feedback is most important.

#2 DMARC on mailing list
    * Ali Alnubani is working on the issue and will have a mailing list to test

#3 Documenting criteria on adding/removing members to technical board
    * Thomas has sent patches to the Techboard. Board members to review and provide feedback.

#4 Criteria for adding new members to security team?
    * Being known to members of the Techboard would be one criterion.
    * AI: Ferruh to put out a document or email to propose the criteria

#5 Testing leader's role definition
    * AI: Aaron Conole has it completed and will send it to the Techboard.

#6 Approval for blog on C11 adoption
    * Konstantin has provided the comments
    * Awaiting comments from PPC maintainers

#7 Update on DTS usability
    * Intel engineers working on DTS are involved in providing the feedback
    * If the test cases can be added to the unit tests, it should be allowed as well. i.e. DTS should not be made mandatory. This will provide flexibility for developers.


* Re: [dpdk-dev] Minutes of Technical Board Meeting, 2021-03-10
  2021-03-31  0:44 [dpdk-dev] Minutes of Technical Board Meeting, 2021-03-10 Honnappa Nagarahalli
@ 2021-03-31  8:52 ` Tom Barbette
  2021-04-06 13:13   ` Morten Brørup
  0 siblings, 1 reply; 7+ messages in thread
From: Tom Barbette @ 2021-03-31  8:52 UTC (permalink / raw)
  To: Honnappa Nagarahalli, dev; +Cc: nd, Alireza Farshin

On 31-03-21 at 02:44, Honnappa Nagarahalli wrote:
> 	- Ability to tune the values of #defines
>     * Few prominent points discussed
> 	- This will result in #ifdefs in the code (for ex: in testpmd)
> 	- One option is for all the PMDs to document their configurable #defines in PMD specific header files. Having these distributed is much easier to search.
> 	- Can some of the existing #defines be converted to runtime configurations? For ex: RTE_MAX_LCORE? This might impact ABI.
>     * Bruce to think about converting the doc to a blog or an email on the mailing list. But soliciting feedback is most important.

One alternative path worth looking at is to encourage the use of LTO, 
and modify APIs so the configuration can be provided at linking time, 
and propagated by the compiler.

E.g. one can define rte_max_lcore as a weak constant symbol, equal to 
128. At linking time the user may provide a rte_max_lcore that is more 
tailored, and still, dynamic arrays[rte_max_lcore] will be allocatable 
on the .bss section, avoiding an indirection. The compiler will be able 
to optimize loops etc which is impossible with pure runtime configuration.
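
A minimal sketch of the weak-symbol idea above, assuming GCC or Clang on an ELF platform. The name example_max_lcore and the helper example_count_lcores() are hypothetical and are not existing DPDK symbols. The library ships a weak default of 128, and an application may link its own strong definition to replace it. Plain C does not accept such a symbol as a file-scope array size, so the static-array part of the proposal is not shown here.

```c
/*
 * Library side (hypothetical): a weak default value. A strong definition
 * of the same symbol in any other object file replaces it at link time.
 */
__attribute__((weak)) const unsigned int example_max_lcore = 128;

/*
 * Counts loop iterations bounded by the link-time value. With LTO the
 * final value is visible to the optimizer, so the bound can be treated
 * as a constant; without LTO it is loaded from memory.
 */
unsigned int example_count_lcores(void)
{
	unsigned int n = 0;

	for (unsigned int i = 0; i < example_max_lcore; i++)
		n++;
	return n;
}

/*
 * Application side (in a separate object file, not compiled here):
 *     const unsigned int example_max_lcore = 16;
 */
```

Built on its own, example_count_lcores() uses the weak default; linking the application-side definition changes the bound without recompiling the library object.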

In packetmill.io we actually pushed this to the next level where the 
driver can completely change its behavior without recompiling DPDK 
itself and spawning ifdefs everywhere.

However the price is the slowness of LTO...

My 2 cents.

Tom




* Re: [dpdk-dev] Minutes of Technical Board Meeting, 2021-03-10
  2021-03-31  8:52 ` Tom Barbette
@ 2021-04-06 13:13   ` Morten Brørup
  2021-04-07  0:47     ` Honnappa Nagarahalli
  0 siblings, 1 reply; 7+ messages in thread
From: Morten Brørup @ 2021-04-06 13:13 UTC (permalink / raw)
  To: Tom Barbette, Honnappa Nagarahalli, dev
  Cc: nd, Alireza Farshin, Van Haaren, Harry, Bruce Richardson

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tom Barbette
> Sent: Wednesday, March 31, 2021 10:53 AM
> 
> On 31-03-21 at 02:44, Honnappa Nagarahalli wrote:
> > 	- Ability to tune the values of #defines
> >     * Few prominent points discussed
> > 	- This will result in #ifdefs in the code (for ex: in testpmd)
> > 	- One option is for all the PMDs to document their configurable #defines in PMD specific header files. Having these distributed is much easier to search.
> > 	- Can some of the existing #defines be converted to runtime configurations? For ex: RTE_MAX_LCORE? This might impact ABI.
> >     * Bruce to think about converting the doc to a blog or an email on the mailing list. But soliciting feedback is most important.
> 
> One alternative path worth looking at is to encourage the use of LTO,
> and modify APIs so the configuration can be provided at linking time,
> and propagated by the compiler.
> 
> E.g. one can define rte_max_lcore as a weak constant symbol, equal to
> 128. At linking time the user may provide a rte_max_lcore that is more
> tailored, and still, dynamic arrays[rte_max_lcore] will be allocatable
> on the .bss section, avoiding an indirection. The compiler will be able
> to optimize loops etc which is impossible with pure runtime
> configuration.
> 
> In packetmill.io we actually pushed this to the next level where the
> driver can completely change its behavior without recompiling DPDK
> itself and spawning ifdefs everywhere.
> 
> However the price is the slowness of LTO...
> 
> My 2 cents.
> 
> Tom
> 

If we are moving away from Compile Time parameters, I certainly prefer Tom's suggestion of Link Time parameters, rather than Run Time parameters.

This might also provide a middle ground for optimizations where Compile Time parameters are considered unacceptable by the DPDK community. I'm thinking about something along the lines of the "constant size" rte_event array presented at the 2020 Userspace Summit by Harry (https://static.sched.com/hosted_files/dpdkuserspace2020/d3/dpdk_userspace_20_api_performance_hvh.pdf). Taking this thinking even further out, a Link Time parameter could perhaps replace the nb_pkts parameter in an optimized rte_eth_rx_burst() function.


Med venlig hilsen / kind regards
- Morten Brørup





* Re: [dpdk-dev] Minutes of Technical Board Meeting, 2021-03-10
  2021-04-06 13:13   ` Morten Brørup
@ 2021-04-07  0:47     ` Honnappa Nagarahalli
  2021-04-07  7:11       ` Morten Brørup
  0 siblings, 1 reply; 7+ messages in thread
From: Honnappa Nagarahalli @ 2021-04-07  0:47 UTC (permalink / raw)
  To: Morten Brørup, Tom Barbette, dev
  Cc: nd, Alireza Farshin, Van Haaren, Harry, Bruce Richardson,
	Honnappa Nagarahalli, nd

<snip>
> 
> If we are moving away from Compile Time parameters, I certainly prefer Tom's
> suggestion of Link Time parameters, rather than Run Time parameters.
I think compile time constants are fine if they are not used in #ifdef. For ex: if they are used in 'if (...)', it will help eliminate code and branches.

> 
> This might also provide a middle ground for optimizations where Compile
> Time parameters are considered unacceptable by the DPDK community. I'm
> thinking about something along the lines of the "constant size" rte_event
> array presented at the 2020 Userspace Summit by Harry
> (https://static.sched.com/hosted_files/dpdkuserspace2020/d3/dpdk_userspace_20_api_performance_hvh.pdf).
> Taking this thinking even further out, a Link Time parameter could perhaps
> replace the nb_pkts parameter in an optimized rte_eth_rx_burst() function.
> 
> 
> Med venlig hilsen / kind regards
> - Morten Brørup
> 
> 



* Re: [dpdk-dev] Minutes of Technical Board Meeting, 2021-03-10
  2021-04-07  0:47     ` Honnappa Nagarahalli
@ 2021-04-07  7:11       ` Morten Brørup
  2021-04-07  9:58         ` Bruce Richardson
  0 siblings, 1 reply; 7+ messages in thread
From: Morten Brørup @ 2021-04-07  7:11 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Tom Barbette, dev
  Cc: nd, Alireza Farshin, Van Haaren, Harry, Bruce Richardson, nd

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Honnappa
> Nagarahalli
> Sent: Wednesday, April 7, 2021 2:48 AM
> 
> <snip>
> > If we are moving away from Compile Time parameters, I certainly prefer
> > Tom's suggestion of Link Time parameters, rather than Run Time parameters.
> I think compile time constants are fine if they are not used in #ifdef.
> For ex: if they are used in 'if (...)', it will help eliminate code and
> branches.

Yes!

And "if (...)" is more flexible than #ifdef/#if because it allows the expression to be mixed with non-constants.
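
To illustrate the two points above, here is a small sketch. All names (EXAMPLE_ENABLE_STATS, example_enable_stats, example_account) are hypothetical and only stand in for a build-time option. The option is exposed as a C constant and tested with 'if (...)', so an optimizing compiler can drop the dead branch, and the same constant can be combined with a runtime flag in one expression.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical build-time option, e.g. generated from the build config. */
#define EXAMPLE_ENABLE_STATS 0

/* The same option exposed as a C constant instead of an #ifdef guard. */
static const bool example_enable_stats = EXAMPLE_ENABLE_STATS;

struct example_counters {
	uint64_t packets;
	uint64_t traced;
};

/*
 * Both branches are always parsed and type-checked. With the constant set
 * to 0, an optimizing compiler can remove the first branch entirely, while
 * the runtime flag 'trace' still takes part in the second expression.
 */
void example_account(struct example_counters *c, uint16_t nb_pkts, bool trace)
{
	if (example_enable_stats)
		c->packets += nb_pkts;
	if (example_enable_stats || trace)
		c->traced += nb_pkts;
}
```

Unlike an #ifdef block, the disabled code is still compiled, so it cannot silently rot when the option is off.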

Then perhaps Bruce's script to automatically make C constants out of #defines was not so silly anyway. :-)

> >
> > This might also provide a middle ground for optimizations where Compile
> > Time parameters are considered unacceptable by the DPDK community. I'm
> > thinking about something along the lines of the "constant size" rte_event
> > array presented at the 2020 Userspace Summit by Harry
> > (https://static.sched.com/hosted_files/dpdkuserspace2020/d3/dpdk_userspace_20_api_performance_hvh.pdf).
> > Taking this thinking even further out, a Link Time parameter could perhaps
> > replace the nb_pkts parameter in an optimized rte_eth_rx_burst() function.
> >

Optimally, I would like to see e.g. the RX burst size being so constant that the PMD's RX function knows it and can use vector functions and possibly loop unrolling, without having to implement a pre-check on nb_pkts and a trailing non-vector loop for receiving any remaining odd nb_pkts. All the DPDK examples use #define MAX_PKT_BURST 32 or similar, and I assume most DPDK applications do too.

I do not trust the compiler to be clever enough to realize that the PMD's RX function is always called with a specific nb_pkts and optimize all this cruft away at compile time (or at link time), unless it is a #define or a compile time constant.
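
The difference between a variable and a fixed burst size can be sketched as follows. This is not PMD code: the names are hypothetical, and summing packet lengths stands in for the real per-packet work. The variable-size version needs a block loop plus a trailing scalar loop for the remainder, while the fixed-size version has a constant trip count that is a multiple of the block size.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_BLOCK 8   /* hypothetical vector width, in packets */
#define EXAMPLE_BURST 32  /* hypothetical fixed burst size */

/* Stand-in for a vectorized step that handles exactly EXAMPLE_BLOCK items. */
static inline uint32_t example_sum_block(const uint16_t *len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < EXAMPLE_BLOCK; i++)
		sum += len[i];
	return sum;
}

/* Variable burst: block loop plus a scalar loop for the remainder. */
uint32_t example_sum_any(const uint16_t *len, uint16_t nb_pkts)
{
	uint32_t sum = 0;
	uint16_t i = 0;

	for (; i + EXAMPLE_BLOCK <= nb_pkts; i += EXAMPLE_BLOCK)
		sum += example_sum_block(&len[i]);
	for (; i < nb_pkts; i++)	/* trailing non-vector loop */
		sum += len[i];
	return sum;
}

/* Fixed burst: constant trip count, no remainder handling needed. */
uint32_t example_sum_fixed(const uint16_t *len)
{
	uint32_t sum = 0;

	for (uint16_t i = 0; i < EXAMPLE_BURST; i += EXAMPLE_BLOCK)
		sum += example_sum_block(&len[i]);
	return sum;
}
```

Whether the fixed-size variant is measurably faster in a real driver would have to be shown by testing.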

> >
> > Med venlig hilsen / kind regards
> > - Morten Brørup
> >
> >



* Re: [dpdk-dev] Minutes of Technical Board Meeting, 2021-03-10
  2021-04-07  7:11       ` Morten Brørup
@ 2021-04-07  9:58         ` Bruce Richardson
  2021-04-07 10:29           ` Morten Brørup
  0 siblings, 1 reply; 7+ messages in thread
From: Bruce Richardson @ 2021-04-07  9:58 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Honnappa Nagarahalli, Tom Barbette, dev, nd, Alireza Farshin,
	Van Haaren, Harry

On Wed, Apr 07, 2021 at 09:11:23AM +0200, Morten Brørup wrote:
> Optimally, I would like to see e.g. the RX burst size being so constant that the PMD's RX function knows it and can use vector functions and possibly loop unrolling, without having to implement a pre-check on nb_pkts and a trailing non-vector loop for receiving any remaining odd nb_pkts. All the DPDK examples use #define MAX_PKT_BURST 32 or similar, and I assume most DPDK applications do too.
> 
> I do not trust the compiler to be clever enough to realize that the PMD's RX function is always called with a specific nb_pkts and optimize all this cruft away at compile time (or at link time), unless it is a #define or a compile time constant.
> 

It certainly is not possible to do at compile time, because the calls are
in a different compilation unit from the functions themselves, not to
mention that at link time the RX functions are called via a function
pointer. Therefore the only way to do this that I am aware of is to have a
wrapper function used for the common values inside the drivers themselves.

For example, inside the i40e driver (which I'm using because it's the one
I'm most familiar with), the main receive function is already a wrapper
around a raw receive function, using constant-expansion by the compiler of
the final parameter (NULL) to automatically remove the code for tracking
scattered packets.

uint16_t
i40e_recv_pkts_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts,
                   uint16_t nb_pkts)
{
	return _recv_raw_pkts_vec_avx2(rx_queue, rx_pkts, nb_pkts, NULL);
}

We can produce a version of this optimized for 32-element dequeues by
special-casing where nb_pkts == 32:

uint16_t
i40e_recv_pkts_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts,
                   uint16_t nb_pkts)
{
	if (nb_pkts == 32)
		return _recv_raw_pkts_vec_avx2(rx_queue, rx_pkts, 32, NULL);
	return _recv_raw_pkts_vec_avx2(rx_queue, rx_pkts, nb_pkts, NULL);
}

Now the compiler, when inlining the _raw_ function, can see that it needs
two copies, and for the first, that nb_pkts is a compile-time constant of 32.

However, I'm not sure how useful an optimization like this is, and I'd be
interested to see what benefits testing shows. Beyond the loop iteration
count of 32, there is also the check after each burst of 8 dequeues in the
driver to check that we have a full set of 8 - and abort the loop if not.
It's also the case that unless an app is already at maximum load (or
overloaded), one would probably not expect to always get a full set of 32
packets each time, as you have no additional headroom for more.

/Bruce


* Re: [dpdk-dev] Minutes of Technical Board Meeting, 2021-03-10
  2021-04-07  9:58         ` Bruce Richardson
@ 2021-04-07 10:29           ` Morten Brørup
  0 siblings, 0 replies; 7+ messages in thread
From: Morten Brørup @ 2021-04-07 10:29 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Honnappa Nagarahalli, Tom Barbette, dev, nd, Alireza Farshin,
	Van Haaren, Harry

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Wednesday, April 7, 2021 11:59 AM
> 
> It certainly is not possible to do at compile time, because the calls are
> in a different compilation unit from the functions themselves, not to
> mention that at link time the RX functions are called via a function
> pointer.

Exactly. It would only work with a global #define or global compile time constant.

> Therefore the only way to do this that I am aware of is to have a
> wrapper function used for the common values inside the drivers themselves.
> 
> For example, inside the i40e driver (which I'm using because it's the one
> I'm most familiar with), the main receive function is already a wrapper
> around a raw receive function, using constant-expansion by the compiler of
> the final parameter (NULL) to automatically remove the code for tracking
> scattered packets.
> 
> uint16_t
> i40e_recv_pkts_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts,
>                    uint16_t nb_pkts)
> {
> 	return _recv_raw_pkts_vec_avx2(rx_queue, rx_pkts, nb_pkts, NULL);
> }
> 
> We can produce a version of this optimized for 32-element dequeues by
> special-casing where nb_pkts == 32:
> 
> uint16_t
> i40e_recv_pkts_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts,
>                    uint16_t nb_pkts)
> {
> 	if (nb_pkts == 32)
> 		return _recv_raw_pkts_vec_avx2(rx_queue, rx_pkts, 32, NULL);
> 	return _recv_raw_pkts_vec_avx2(rx_queue, rx_pkts, nb_pkts, NULL);
> }
> 
> Now the compiler, when inlining the _raw_ function, can see that it needs
> two copies, and for the first, that nb_pkts is a compile-time constant of 32.

Great example! Now, consider this modification to your example, where rte_eth_rx_burst_size is a global constant variable that can be evaluated at compile time:

uint16_t
i40e_recv_pkts_vec_avx2(void *rx_queue, struct rte_mbuf **rx_pkts,
                   uint16_t nb_pkts)
{
-	if (nb_pkts == 32)
-		return _recv_raw_pkts_vec_avx2(rx_queue, rx_pkts, 32, NULL);
+	if (rte_eth_rx_burst_size)
+		return _recv_raw_pkts_vec_avx2(rx_queue, rx_pkts, rte_eth_rx_burst_size, NULL);
	return _recv_raw_pkts_vec_avx2(rx_queue, rx_pkts, nb_pkts, NULL);
}

> However, I'm not sure how useful an optimization like this is, and I'd
> be interested to see what benefits testing shows.

Yes, performance test results would be beneficial.

> Beyond the loop iteration count of 32, there is also the check after each
> burst of 8 dequeues in the driver to check that we have a full set of 8 -
> and abort the loop if not.

That loop could be optimized too, going from 8 to 32 - or going all the way to rte_eth_rx_burst_size.

> It's also the case that unless an app is already at maximum load (or
> overloaded), one would probably not expect to always get a full set of 32
> packets each time, as you have no additional headroom for more.
> 

Except if the application is designed to call rte_eth_rx_burst() less frequently to get a full burst as often as possible, using the NIC's many RX descriptors as burst buffer. Working with full bursts throughout the application provides higher total performance per clock cycle.

> /Bruce


