DPDK patches and discussions
From: Ray Kinsella <mdr@ashroe.eu>
To: dev@dpdk.org, Thomas Monjalon <thomas@monjalon.net>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>,
	'Damjan Marion' <dmarion@me.com>,
	"Wang, Haiyue" <haiyue.wang@intel.com>,
	Jerin Jacob Kollanukkaran <jerinj@marvell.com>
Subject: Re: [dpdk-dev] [PATCH v4 1/4] ethdev: add the API for getting burst mode information
Date: Sun, 3 Nov 2019 20:35:02 +0000
Message-ID: <e4b31f81-3c21-c702-c599-aecda721598a@ashroe.eu>
In-Reply-To: <e6b91daf-ec77-3061-b24c-0afc8e2005df@intel.com>



On 29/10/2019 14:27, Ferruh Yigit wrote:
> On 10/26/2019 5:23 PM, Thomas Monjalon wrote:
>> 26/10/2019 11:23, Wang, Haiyue:
>>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
>>>> 26/10/2019 06:40, Wang, Haiyue:
>>>>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
>>>>>> 25/10/2019 18:02, Jerin Jacob:
>>>>>>> On Fri, Oct 25, 2019 at 9:15 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>>>>>>>> 25/10/2019 16:08, Ferruh Yigit:
>>>>>>>>> On 10/25/2019 10:36 AM, Thomas Monjalon wrote:
>>>>>>>>>> 15/10/2019 09:51, Haiyue Wang:
>>>>>>>>>>> Some PMDs have more than one RX/TX burst paths, add the ethdev API
>>>>>>>>>>> that allows an application to retrieve the mode information about
>>>>>>>>>>> Rx/Tx packet burst such as Scalar or Vector, and Vector technology
>>>>>>>>>>> like AVX2.
>>>>>>>>>>
>>>>>>>>>> I missed this patch. I and Andrew, maintainers of ethdev, were not CC'ed.
>>>>>>>>>> Ferruh, I would expect to be Cc'ed and/or get a notification before merging.
>>>>>>>>>
>>>>>>>>> It has been discussed in the mail list and went through multiple discussions,
>>>>>>>>> patch is out since the August, +1 to cc all maintainers I missed that part,
>>>>>>>>> but when the patch is reviewed and there is no objection, why block the merge?
>>>>>>>>
>>>>>>>> I'm not saying blocking the merge.
>>>>>>>> My bad is that I missed the patch and I am asking for help with a notification
>>>>>>>> in this case. Same for Andrew I guess.
>>>>>>>> Note: it is merged in master and I am looking to improve this feature.
>>>>>>>
>>>>>>>>>>> +/**
>>>>>>>>>>> + * Ethernet device RX/TX queue packet burst mode information structure.
>>>>>>>>>>> + * Used to retrieve information about packet burst mode setting.
>>>>>>>>>>> + */
>>>>>>>>>>> +struct rte_eth_burst_mode {
>>>>>>>>>>> +  uint64_t options;
>>>>>>>>>>> +};
>>>>>>>>>>
>>>>>>>>>> Why a struct for an integer?
>>>>>>>>>
>>>>>>>>> Again by a request from me, to not need to break the API if we need to add more
>>>>>>>>> thing in the future.
>>>>>>>>
>>>>>>>> I would replace it with a string. This is the most flexible API.
>>>>>>>
>>>>>>> IMO, Probably, best of both worlds make a good option here,
>>>>>>> as Haiyue suggested if we have an additional dev_specific[1] in structure.
>>>>>>> and when a pass to the application, let common code make final string as
>>>>>>> (options flags to string + dev_specific)
>>>>>>>
>>>>>>> options flag can be zero if PMD does not have any generic flags nor
>>>>>>> interested in such a scheme.
>>>>>>> Generic flags will help at least to have some common code.
>>>>>>>
>>>>>>> [1]
>>>>>>> struct rte_eth_burst_mode {
>>>>>>>         uint64_t options;
>>>>>>>         char dev_specific[128]; /* PMD has specific burst mode information */
>>>>>>> };
>>>>>>
>>>>>> I really don't see how we can have generic flags.
>>>>>> The flags which are proposed are just matching
>>>>>> the functions implemented in Intel PMDs.
>>>>>> And this is a complicate solution.
>>>>>> Why not just returning a name for the selected Rx/Tx mode?
>>>>>
>>>>> Intel PMDs use the *generic* methods like x86 SSE, AVX2, ARM NEON, PPC ALTIVEC,
>>>>> 'dev->data->scattered_rx' etc for the target : "DPDK is the Data Plane Development Kit
>>>>> that consists of libraries to accelerate packet processing workloads running on a wide
>>>>> variety of CPU architectures."
>>>>
>>>> How RTE_ETH_BURST_SCATTERED and RTE_ETH_BURST_BULK_ALLOC are generic?
>>>> They just match some features of the Intel PMDs.
>>>> Why not exposing other optimizations of the Rx/Tx implementations?
>>>> You totally missed the point of generic burst mode description.
>>>>
>>>>> If understand these new experimental APIs from above, then bit options is the best,
>>>>> and we didn't invent new words to describe them, just from the CPU & other *generic*
>>>>> technology. And the application can loop to check which kind of burst is running by
>>>>> just simple bit test.
>>>>>
>>>>> If PMDs missed these, they can update them in future roadmaps to enhance their PMDs,
>>>>> like MLX5 supports ARM NEON, x86 SSE.
>>>>
>>>> I have no word!
>>>> You really think other PMDs should learn from Intel how to "enhance" their PMD?
>>>> You talk about mlx5, did you look at its code? Did you see the burst modes
>>>> depending on which specific hardware path is used (MPRQ, EMPW, inline)?
>>>> Or depending on which offloads are handled?
>>>>
>>>> Again, the instruction set used by the function is a small part
>>>> of the burst mode optimization.
>>>>
>>>> So you did not reply to my question:
>>>> Why not just returning a name for the selected Rx/Tx mode?
>>>
>>> In fact, RFC v1/v2 returns the *name*, but the *name* is hard for
>>> application to do further processing, strcmp, strstr ? Not so nice
>>> for C code, and it is not so standard, So switch it to bit definition.
>>
>> Again, please answer my question: why do you need it?
>> I think it is just informative, that's why a string should be enough.
>> I am clearly against the bitmap because it is way too much restrictive.
>> I disagree that knowing it is using AVX2 or AVX512 is so interesting.
>> What you would like to know is whether it is processing packets 4 by 4,
>> for instance, or to know which offload is supported, or what hardware trick
>> is used in the datapath design.
>> There are so many options in a datapath design that it cannot be
>> represented with a bitmap. And it makes no sense to have some design
>> criterias more important than others.
>> I Cc an Intel architect (Edwin) who could explain you how much
>> a datapath design is more complicate than just using AVX instructions.
> 
> As I understand it, this is to let applications make informed decisions based on
> what vectorization is used in the driver; currently this is not known by the
> application.
> 
> And as previously replied, the main target of the API is to define the vector
> path, not all optimizations, so the number is limited.
> There are many optimizations in the data path. I agree we may not represent all
> of them, and I agree that the existing enum having "RTE_ETH_BURST_BULK_ALLOC" and similar
> is causing this confusion; perhaps we can remove them.
> 
> And if the requirement from the application is just informative, I would agree
> that free text string will be better, right now 'rte_eth_rx/tx_burst_mode_get()'
> is the main API to provide the information and
> 'rte_eth_burst_mode_option_name()' is a helper for application/driver to log
> this information.
> 

Well, look, we have a general deficit of information about what is happening under
the covers in DPDK. The end user may get wildly different performance characteristics
depending on the DPDK configuration. A simple example: using flow director causes the i40e
PMD to switch to a scalar code path, and performance may drop by as much as half.

This can cause no end of head-scratching in consuming products. I have done some
of that head-scratching myself; it is a usability nightmare.

FD.io VPP tries to work around this by mining the call stack to give the user _some_
kind of information about what is happening. This kind of heroics should not be necessary.

For exactly the same reasons as telemetry, we should be trying to give users as much
information as possible, in as standard a format as possible. Otherwise DPDK
becomes arcane, leaving the user running gdb to understand what is going on, as I
frequently do.

Finally, again for the same reasons as telemetry, I would say that machine-readable is the
ideal here.
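
To make the "best of both worlds" idea quoted above ([1]) concrete, here is a
rough sketch of a bitmask plus a PMD-specific string, with common code building
the final description. The flag names and values, the struct name and the helper
burst_mode_to_string() are all made up for illustration; they are not the
ethdev definitions, and dev_specific is assumed to be NUL-terminated.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag values for illustration only; not the ethdev ones. */
#define BURST_SCALAR (UINT64_C(1) << 0)
#define BURST_VECTOR (UINT64_C(1) << 1)
#define BURST_AVX2   (UINT64_C(1) << 2)
#define BURST_NEON   (UINT64_C(1) << 3)

/* Same shape as the struct in [1]: generic flags plus PMD free text. */
struct burst_mode {
	uint64_t options;
	char dev_specific[128]; /* assumed NUL-terminated */
};

static const struct {
	uint64_t flag;
	const char *name;
} flag_names[] = {
	{ BURST_SCALAR, "Scalar" },
	{ BURST_VECTOR, "Vector" },
	{ BURST_AVX2,   "AVX2"   },
	{ BURST_NEON,   "NEON"   },
};

/* Common code: write "<flag names> <dev_specific>" into buf, return buf.
 * Output is silently truncated if buf is too small. */
static char *
burst_mode_to_string(const struct burst_mode *m, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
		int n;

		if (!(m->options & flag_names[i].flag))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " " : "", flag_names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			return buf; /* truncated, stop here */
		used += (size_t)n;
	}

	if (m->dev_specific[0] != '\0')
		snprintf(buf + used, len - used, "%s%s",
			 used ? " " : "", m->dev_specific);
	return buf;
}
```

With something like this, an application that wants to act on the mode can do
a plain bit test (m.options & BURST_AVX2), while a log line or telemetry
endpoint can still print the full string, including whatever the PMD chose to
put in dev_specific.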


              Name                Idx   Link  Hardware
FortyGigabitEthernet86/0/0         1     up   FortyGigabitEthernet86/0/0
  Link speed: 40 Gbps
  Ethernet address 3c:fd:fe:bc:b2:b0
  Intel X710/XL710 Family
    carrier up full duplex mtu 9206
    flags: admin-up pmd rx-ip4-cksum
    rx: queues 2 (max 320), desc 1024 (min 64 max 4096 align 32)
    tx: queues 3 (max 320), desc 1024 (min 64 max 4096 align 32)
    pci: device 8086:1583 subsystem 8086:0001 address 0000:86:00.00 numa 1
    max rx packet len: 9728
    promiscuous: unicast off all-multicast on
    vlan offload: strip off filter off qinq off
    rx offload avail:  vlan-strip ipv4-cksum udp-cksum tcp-cksum qinq-strip
                       outer-ipv4-cksum vlan-filter vlan-extend jumbo-frame
                       scatter keep-crc
    rx offload active: ipv4-cksum
    tx offload avail:  vlan-insert ipv4-cksum udp-cksum tcp-cksum sctp-cksum
                       tcp-tso outer-ipv4-cksum qinq-insert vxlan-tnl-tso
                       gre-tnl-tso ipip-tnl-tso geneve-tnl-tso multi-segs
                       mbuf-fast-free
    tx offload active: none
    rss avail:         ipv4-frag ipv4-tcp ipv4-udp ipv4-sctp ipv4-other ipv6-frag
                       ipv6-tcp ipv6-udp ipv6-sctp ipv6-other l2-payload
    rss active:        ipv4-frag ipv4-tcp ipv4-udp ipv4-other ipv6-frag ipv6-tcp
                       ipv6-udp ipv6-other
    tx burst function: i40e_xmit_pkts_vec_avx2 <--------------------------------
    rx burst function: i40e_recv_pkts_vec_avx2 <--------------------------------

    tx frames ok                               3303819336013
    tx bytes ok                              198229160160780
    rx frames ok                              17503181209363
    rx bytes ok                              205944346015764
    rx missed                                     2973393096
