From: Thomas Monjalon <thomas@monjalon.net>
To: Ray Kinsella <mdr@ashroe.eu>
Cc: dev@dpdk.org, "Yigit, Ferruh" <ferruh.yigit@intel.com>,
'Damjan Marion' <dmarion@me.com>,
"Wang, Haiyue" <haiyue.wang@intel.com>,
Jerin Jacob Kollanukkaran <jerinj@marvell.com>,
viacheslavo@mellanox.com, stephen@networkplumber.org,
arybchenko@solarflare.com
Subject: Re: [dpdk-dev] [PATCH v4 1/4] ethdev: add the API for getting burst mode information
Date: Sun, 03 Nov 2019 23:41:05 +0100 [thread overview]
Message-ID: <6097021.DVUxeYL0oo@xps> (raw)
In-Reply-To: <e4b31f81-3c21-c702-c599-aecda721598a@ashroe.eu>
03/11/2019 21:35, Ray Kinsella:
> On 29/10/2019 14:27, Ferruh Yigit wrote:
> > On 10/26/2019 5:23 PM, Thomas Monjalon wrote:
> >> 26/10/2019 11:23, Wang, Haiyue:
> >>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> >>>> 26/10/2019 06:40, Wang, Haiyue:
> >>>>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> >>>>>> 25/10/2019 18:02, Jerin Jacob:
> >>>>>>> On Fri, Oct 25, 2019 at 9:15 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >>>>>>>> 25/10/2019 16:08, Ferruh Yigit:
> >>>>>>>>> On 10/25/2019 10:36 AM, Thomas Monjalon wrote:
> >>>>>>>>>> 15/10/2019 09:51, Haiyue Wang:
> >>>>>>>>>>> Some PMDs have more than one RX/TX burst paths, add the ethdev API
> >>>>>>>>>>> that allows an application to retrieve the mode information about
> >>>>>>>>>>> Rx/Tx packet burst such as Scalar or Vector, and Vector technology
> >>>>>>>>>>> like AVX2.
> >>>>>>>>>>
> >>>>>>>>>> I missed this patch. I and Andrew, maintainers of ethdev, were not CC'ed.
> >>>>>>>>>> Ferruh, I would expect to be Cc'ed and/or get a notification before merging.
> >>>>>>>>>
> >>>>>>>>> It has been discussed in the mail list and went through multiple discussions,
> >>>>>>>>> patch is out since the August, +1 to cc all maintainers I missed that part,
> >>>>>>>>> but when the patch is reviewed and there is no objection, why block the merge?
> >>>>>>>>
> >>>>>>>> I'm not saying blocking the merge.
> >>>>>>>> My bad is that I missed the patch and I am asking for help with a notification
> >>>>>>>> in this case. Same for Andrew I guess.
> >>>>>>>> Note: it is merged in master and I am looking to improve this feature.
> >>>>>>>
> >>>>>>>>>>> +/**
> >>>>>>>>>>> + * Ethernet device RX/TX queue packet burst mode information structure.
> >>>>>>>>>>> + * Used to retrieve information about packet burst mode setting.
> >>>>>>>>>>> + */
> >>>>>>>>>>> +struct rte_eth_burst_mode {
> >>>>>>>>>>> + uint64_t options;
> >>>>>>>>>>> +};
> >>>>>>>>>>
> >>>>>>>>>> Why a struct for an integer?
> >>>>>>>>>
> >>>>>>>>> Again by a request from me, to not need to break the API if we need to add more
> >>>>>>>>> thing in the future.
> >>>>>>>>
> >>>>>>>> I would replace it with a string. This is the most flexible API.
> >>>>>>>
> >>>>>>> IMO, Probably, best of both worlds make a good option here,
> >>>>>>> as Haiyue suggested if we have an additional dev_specific[1] in structure.
> >>>>>>> and when a pass to the application, let common code make final string as
> >>>>>>> (options flags to string + dev_specific)
> >>>>>>>
> >>>>>>> options flag can be zero if PMD does not have any generic flags nor
> >>>>>>> interested in such a scheme.
> >>>>>>> Generic flags will help at least to have some common code.
> >>>>>>>
> >>>>>>> [1]
> >>>>>>> struct rte_eth_burst_mode {
> >>>>>>> uint64_t options;
> >>>>>>> char dev_specific[128]; /* PMD has specific burst mode information */
> >>>>>>> };
> >>>>>>
> >>>>>> I really don't see how we can have generic flags.
> >>>>>> The flags which are proposed are just matching
> >>>>>> the functions implemented in Intel PMDs.
> >>>>>> And this is a complicate solution.
> >>>>>> Why not just returning a name for the selected Rx/Tx mode?
> >>>>>
> >>>>> Intel PMDs use the *generic* methods like x86 SSE, AVX2, ARM NEON, PPC ALTIVEC,
> >>>>> 'dev->data->scattered_rx' etc for the target : "DPDK is the Data Plane Development Kit
> >>>>> that consists of libraries to accelerate packet processing workloads running on a wide
> >>>>> variety of CPU architectures."
> >>>>
> >>>> How RTE_ETH_BURST_SCATTERED and RTE_ETH_BURST_BULK_ALLOC are generic?
> >>>> They just match some features of the Intel PMDs.
> >>>> Why not exposing other optimizations of the Rx/Tx implementations?
> >>>> You totally missed the point of generic burst mode description.
> >>>>
> >>>>> If understand these new experimental APIs from above, then bit options is the best,
> >>>>> and we didn't invent new words to describe them, just from the CPU & other *generic*
> >>>>> technology. And the application can loop to check which kind of burst is running by
> >>>>> just simple bit test.
> >>>>>
> >>>>> If PMDs missed these, they can update them in future roadmaps to enhance their PMDs,
> >>>>> like MLX5 supports ARM NEON, x86 SSE.
> >>>>
> >>>> I have no word!
> >>>> You really think other PMDs should learn from Intel how to "enhance" their PMD?
> >>>> You talk about mlx5, did you look at its code? Did you see the burst modes
> >>>> depending on which specific hardware path is used (MPRQ, EMPW, inline)?
> >>>> Or depending on which offloads are handled?
> >>>>
> >>>> Again, the instruction set used by the function is a small part
> >>>> of the burst mode optimization.
> >>>>
> >>>> So you did not reply to my question:
> >>>> Why not just returning a name for the selected Rx/Tx mode?
> >>>
> >>> In fact, RFC v1/v2 returns the *name*, but the *name* is hard for
> >>> application to do further processing, strcmp, strstr ? Not so nice
> >>> for C code, and it is not so standard, So switch it to bit definition.
> >>
> >> Again, please answer my question: why do you need it?
> >> I think it is just informative, that's why a string should be enough.
> >> I am clearly against the bitmap because it is way too much restrictive.
> >> I disagree that knowing it is using AVX2 or AVX512 is so interesting.
> >> What you would like to know is whether it is processing packets 4 by 4,
> >> for instance, or to know which offload is supported, or what hardware trick
> >> is used in the datapath design.
> >> There are so many options in a datapath design that it cannot be
> >> represented with a bitmap. And it makes no sense to have some design
> >> criterias more important than others.
> >> I Cc an Intel architect (Edwin) who could explain you how much
> >> a datapath design is more complicate than just using AVX instructions.
> >
> > As I understand this is to let applications to give informed decision based on
> > what vectorization is used in the driver, currently this is not know by the
> > application.
> >
> > And as previously replied, the main target of the API is to define the vector
> > path, not all optimizations, so the number is limited.
No!
The name of this API is "burst mode information",
not "vector instructions used".
I think the main error is that in Intel PMDs,
each Rx/Tx function use different vector instructions.
So you generalize that knowing the vectors instructions
will give you a good information about the performance.
But this is generally wrong!
The right level of infos is much more complex.
> > There are many optimization in the data path, I agree we may not represent all
> > of them, and agreed existing enum having "RTE_ETH_BURST_BULK_ALLOC" and similar
> > causing this confusion, perhaps we can remove them.
> >
> > And if the requirement from the application is just informative, I would agree
> > that free text string will be better, right now 'rte_eth_rx/tx_burst_mode_get()'
> > is the main API to provide the information and
> > 'rte_eth_burst_mode_option_name()' is a helper for application/driver to log
> > this information.
> >
>
> Well look we have a general deficit of information about what is happening under
> the covers in DPDK. The end user may get wildly different performance characteristics
> based on the DPDK configuration. Simple example is using flow director causes the i40e
> PMD to switch to using a scalar code path, and performance may as much as half.
>
> This can cause no end of head-scratching in consuming products, I have done some
> of that head scratching myself, it is a usability nightmare.
>
> FD.io VPP tries to work around this by mining the call stack, to give the user _some_
> kind of information about what is happening. These kind of heroics should not be necessary.
>
> For exactly the same reasons as telemetry, we should be trying to give the users as much
> information as possible, in as standard as format as possible. Otherwise DPDK
> becomes arcane leaving the user running gdb to understand what is going on, as I
> frequently do.
I agree we must provide a clue to understand the performance result.
As Stephen commented at the very beginning, a log is enough for such debug.
But his comment was ignored. You wanted an API, fine.
I am OK to have an API to request infos which are also in logs.
> Finally, again for the same reasons as telemetry, I would say that machine readable is the
> ideal here.
I disagree here. There is no need to make this info machine readable.
We want a clue about the optimizations which are all about creativity.
And we cannot make creativity of developers "machine readable".
next prev parent reply other threads:[~2019-11-03 22:41 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-15 7:51 [dpdk-dev] [PATCH v4 0/4] get Rx/Tx packet " Haiyue Wang
2019-10-15 7:51 ` [dpdk-dev] [PATCH v4 1/4] ethdev: add the API for getting " Haiyue Wang
2019-10-15 10:45 ` Ferruh Yigit
2019-10-15 11:23 ` Wang, Haiyue
2019-10-15 11:13 ` Ferruh Yigit
2019-10-25 9:36 ` Thomas Monjalon
2019-10-25 10:26 ` Jerin Jacob
2019-10-25 13:59 ` Wang, Haiyue
2019-10-25 14:08 ` Ferruh Yigit
2019-10-25 15:45 ` Thomas Monjalon
2019-10-25 16:02 ` Jerin Jacob
2019-10-25 22:27 ` Thomas Monjalon
2019-10-26 4:40 ` Wang, Haiyue
2019-10-26 6:24 ` Thomas Monjalon
2019-10-26 9:23 ` Wang, Haiyue
2019-10-26 16:23 ` Thomas Monjalon
2019-10-29 14:27 ` Ferruh Yigit
2019-11-03 20:35 ` Ray Kinsella
2019-11-03 22:41 ` Thomas Monjalon [this message]
2019-11-04 9:49 ` Ray Kinsella
2019-11-04 9:54 ` Thomas Monjalon
2019-11-04 10:03 ` Ray Kinsella
2019-11-04 10:46 ` Wang, Haiyue
2019-11-04 11:30 ` Thomas Monjalon
2019-11-04 12:07 ` Ray Kinsella
2019-11-04 13:09 ` Thomas Monjalon
2019-11-04 13:48 ` Ray Kinsella
2019-11-04 14:17 ` Wang, Haiyue
2019-10-26 6:40 ` Slava Ovsiienko
2019-10-26 9:31 ` Wang, Haiyue
2019-10-26 6:58 ` Jerin Jacob
2019-10-26 9:37 ` Wang, Haiyue
2019-10-29 3:37 ` Jerin Jacob
2019-10-29 4:44 ` Wang, Haiyue
2019-10-29 5:19 ` Jerin Jacob
2019-10-29 5:42 ` Wang, Haiyue
2019-10-29 5:47 ` Jerin Jacob
2019-10-29 5:56 ` Wang, Haiyue
2019-10-29 6:00 ` Wang, Haiyue
2019-10-29 8:34 ` Jerin Jacob
2019-10-29 11:26 ` Wang, Haiyue
2019-10-29 12:56 ` Jerin Jacob
2019-10-29 13:51 ` Wang, Haiyue
2019-10-29 14:08 ` Jerin Jacob
2019-10-29 15:42 ` Wang, Haiyue
2019-10-29 15:59 ` Jerin Jacob
2019-10-29 16:16 ` Wang, Haiyue
2019-10-29 16:59 ` Ferruh Yigit
2019-10-30 4:38 ` Jerin Jacob
2019-10-30 4:43 ` Wang, Haiyue
2019-10-30 8:14 ` Wang, Haiyue
2019-10-31 10:46 ` Jerin Jacob
2019-10-31 11:15 ` Ray Kinsella
2019-10-31 11:16 ` Wang, Haiyue
2019-10-31 14:58 ` Ferruh Yigit
2019-10-31 15:07 ` Wang, Haiyue
2019-10-31 15:29 ` Ferruh Yigit
2019-10-31 15:54 ` Wang, Haiyue
2019-10-31 11:09 ` Ray Kinsella
2019-10-15 7:51 ` [dpdk-dev] [PATCH v4 2/4] net/i40e: add Rx/Tx burst mode get callbacks Haiyue Wang
2019-10-15 7:51 ` [dpdk-dev] [PATCH v4 3/4] net/ice: " Haiyue Wang
2019-10-15 7:51 ` [dpdk-dev] [PATCH v4 4/4] app/testpmd: show the Rx/Tx burst mode description Haiyue Wang
2019-10-15 12:11 ` [dpdk-dev] [PATCH v4 0/4] get Rx/Tx packet burst mode information Ferruh Yigit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6097021.DVUxeYL0oo@xps \
--to=thomas@monjalon.net \
--cc=arybchenko@solarflare.com \
--cc=dev@dpdk.org \
--cc=dmarion@me.com \
--cc=ferruh.yigit@intel.com \
--cc=haiyue.wang@intel.com \
--cc=jerinj@marvell.com \
--cc=mdr@ashroe.eu \
--cc=stephen@networkplumber.org \
--cc=viacheslavo@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).