From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1263CA0352; Mon, 4 Nov 2019 14:48:47 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id D5B8E374E; Mon, 4 Nov 2019 14:48:46 +0100 (CET) Received: from relay0196.mxlogin.com (relay0196.mxlogin.com [199.181.239.196]) by dpdk.org (Postfix) with ESMTP id A829B34F3 for ; Mon, 4 Nov 2019 14:48:45 +0100 (CET) Received: from filter004.mxroute.com (unknown [94.130.183.33]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay0196.mxlogin.com (Postfix) with ESMTPS id 92E47CC80248; Mon, 4 Nov 2019 07:48:44 -0600 (CST) Received: from galaxy.mxroute.com (unknown [23.92.70.113]) by filter004.mxroute.com (Postfix) with ESMTPS id DBFF13E9E1; Mon, 4 Nov 2019 13:48:41 +0000 (UTC) Received: from irdmzpr02-ext.ir.intel.com ([192.198.151.37]) by galaxy.mxroute.com with esmtpsa (TLSv1.2:ECDHE-RSA-AES128-GCM-SHA256:128) (Exim 4.91) (envelope-from ) id 1iRcWm-0008Lp-Vc; Mon, 04 Nov 2019 08:36:17 -0500 To: Thomas Monjalon Cc: dev@dpdk.org, "Yigit, Ferruh" , 'Damjan Marion' , "Wang, Haiyue" , Jerin Jacob Kollanukkaran , viacheslavo@mellanox.com, stephen@networkplumber.org, arybchenko@solarflare.com References: <20191015075133.38560-1-haiyue.wang@intel.com> <3333162.RZseXh5CjJ@xps> <932da3d3-34b9-79a4-1a68-fb06056ee93f@ashroe.eu> <2680328.hM31vkk1Ea@xps> From: Ray Kinsella Openpgp: preference=signencrypt Autocrypt: addr=mdr@ashroe.eu; keydata= mQINBFv8B3wBEAC+5ImcgbIvadt3axrTnt7Sxch3FsmWTTomXfB8YiuHT8KL8L/bFRQSL1f6 ASCHu3M89EjYazlY+vJUWLr0BhK5t/YI7bQzrOuYrl9K94vlLwzD19s/zB/g5YGGR5plJr0s JtJsFGEvF9LL3e+FKMRXveQxBB8A51nAHfwG0WSyx53d61DYz7lp4/Y4RagxaJoHp9lakn8j HV2N6rrnF+qt5ukj5SbbKWSzGg5HQF2t0QQ5tzWhCAKTfcPlnP0GymTBfNMGOReWivi3Qqzr S51Xo7hoGujUgNAM41sxpxmhx8xSwcQ5WzmxgAhJ/StNV9cb3HWIoE5StCwQ4uXOLplZNGnS uxNdegvKB95NHZjRVRChg/uMTGpg9PqYbTIFoPXjuk27sxZLRJRrueg4tLbb3HM39CJwSB++ YICcqf2N+GVD48STfcIlpp12/HI+EcDSThzfWFhaHDC0hyirHxJyHXjnZ8bUexI/5zATn/ux TpMbc/vicJxeN+qfaVqPkCbkS71cHKuPluM3jE8aNCIBNQY1/j87k5ELzg3qaesLo2n1krBH bKvFfAmQuUuJT84/IqfdVtrSCTabvDuNBDpYBV0dGbTwaRfE7i+LiJJclUr8lOvHUpJ4Y6a5 0cxEPxm498G12Z3NoY/mP5soItPIPtLR0rA0fage44zSPwp6cQARAQABtBxSYXkgS2luc2Vs bGEgPG1kckBhc2hyb2UuZXU+iQJUBBMBCAA+FiEEcDUDlKDJaDuJlfZfdJdaH/sCCpsFAlv8 B3wCGyMFCQlmAYAFCwkIBwIGFQoJCAsCBBYCAwECHgECF4AACgkQdJdaH/sCCptdtRAAl0oE msa+djBVYLIsax+0f8acidtWg2l9f7kc2hEjp9h9aZCpPchQvhhemtew/nKavik3RSnLTAyn B3C/0GNlmvI1l5PFROOgPZwz4xhJKGN7jOsRrbkJa23a8ly5UXwF3Vqnlny7D3z+7cu1qq/f VRK8qFyWkAb+xgqeZ/hTcbJUWtW+l5Zb+68WGEp8hB7TuJLEWb4+VKgHTpQ4vElYj8H3Z94a 04s2PJMbLIZSgmKDASnyrKY0CzTpPXx5rSJ1q+B1FCsfepHLqt3vKSALa3ld6bJ8fSJtDUJ7 JLiU8dFZrywgDIVme01jPbjJtUScW6jONLvhI8Z2sheR71UoKqGomMHNQpZ03ViVWBEALzEt TcjWgJFn8yAmxqM4nBnZ+hE3LbMo34KCHJD4eg18ojDt3s9VrDLa+V9fNxUHPSib9FD9UX/1 +nGfU/ZABmiTuUDM7WZdXri7HaMpzDRJUKI6b+/uunF8xH/h/MHW16VuMzgI5dkOKKv1LejD dT5mA4R+2zBS+GsM0oa2hUeX9E5WwjaDzXtVDg6kYq8YvEd+m0z3M4e6diFeLS77/sAOgaYL 92UcoKD+Beym/fVuC6/55a0e12ksTmgk5/ZoEdoNQLlVgd2INtvnO+0k5BJcn66ZjKn3GbEC VqFbrnv1GnA58nEInRCTzR1k26h9nmS5Ag0EW/wHfAEQAMth1vHr3fOZkVOPfod3M6DkQir5 xJvUW5EHgYUjYCPIa2qzgIVVuLDqZgSCCinyooG5dUJONVHj3nCbITCpJp4eB3PI84RPfDcC hf/V34N/Gx5mTeoymSZDBmXT8YtvV/uJvn+LvHLO4ZJdvq5ZxmDyxfXFmkm3/lLw0+rrNdK5 pt6OnVlCqEU9tcDBezjUwDtOahyV20XqxtUttN4kQWbDRkhT+HrA9WN9l2HX91yEYC+zmF1S OhBqRoTPLrR6g4sCWgFywqztpvZWhyIicJipnjac7qL/wRS+wrWfsYy6qWLIV80beN7yoa6v ccnuy4pu2uiuhk9/edtlmFE4dNdoRf7843CV9k1yRASTlmPkU59n0TJbw+okTa9fbbQgbIb1 pWsAuicRHyLUIUz4f6kPgdgty2FgTKuPuIzJd1s8s6p2aC1qo+Obm2gnBTduB+/n1Jw+vKpt 07d+CKEKu4CWwvZZ8ktJJLeofi4hMupTYiq+oMzqH+V1k6QgNm0Da489gXllU+3EFC6W1qKj tkvQzg2rYoWeYD1Qn8iXcO4Fpk6wzylclvatBMddVlQ6qrYeTmSbCsk+m2KVrz5vIyja0o5Y yfeN29s9emXnikmNfv/dA5fpi8XCANNnz3zOfA93DOB9DBf0TQ2/OrSPGjB3op7RCfoPBZ7u AjJ9dM7VABEBAAGJAjwEGAEIACYWIQRwNQOUoMloO4mV9l90l1of+wIKmwUCW/wHfAIbDAUJ CWYBgAAKCRB0l1of+wIKm3KlD/9w/LOG5rtgtCUWPl4B3pZvGpNym6XdK8cop9saOnE85zWf u+sKWCrxNgYkYP7aZrYMPwqDvilxhbTsIJl5HhPgpTO1b0i+c0n1Tij3EElj5UCg3q8mEc17 c+5jRrY3oz77g7E3oPftAjaq1ybbXjY4K32o3JHFR6I8wX3m9wJZJe1+Y+UVrrjY65gZFxcA thNVnWKErarVQGjeNgHV4N1uF3pIx3kT1N4GSnxhoz4Bki91kvkbBhUgYfNflGURfZT3wIKK +d50jd7kqRouXUCzTdzmDh7jnYrcEFM4nvyaYu0JjSS5R672d9SK5LVIfWmoUGzqD4AVmUW8 pcv461+PXchuS8+zpltR9zajl72Q3ymlT4BTAQOlCWkD0snBoKNUB5d2EXPNV13nA0qlm4U2 GpROfJMQXjV6fyYRvttKYfM5xYKgRgtP0z5lTAbsjg9WFKq0Fndh7kUlmHjuAIwKIV4Tzo75 QO2zC0/NTaTjmrtiXhP+vkC4pcrOGNsbHuaqvsc/ZZ0siXyYsqbctj/sCd8ka2r94u+c7o4l BGaAm+FtwAfEAkXHu4y5Phuv2IRR+x1wTey1U1RaEPgN8xq0LQ1OitX4t2mQwjdPihZQBCnZ wzOrkbzlJMNrMKJpEgulmxAHmYJKgvZHXZXtLJSejFjR0GdHJcL5rwVOMWB8cg== Message-ID: Date: Mon, 4 Nov 2019 13:48:32 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <2680328.hM31vkk1Ea@xps> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-AuthUser: mdr@ashroe.eu Subject: Re: [dpdk-dev] [PATCH v4 1/4] ethdev: add the API for getting burst mode information X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 04/11/2019 13:09, Thomas Monjalon wrote: > 04/11/2019 13:07, Ray Kinsella: >> >> On 04/11/2019 11:30, Thomas Monjalon wrote: >>> 04/11/2019 11:03, Ray Kinsella: >>>> On 04/11/2019 09:54, Thomas Monjalon wrote: >>>>> 04/11/2019 10:49, Ray Kinsella: >>>>>> On 03/11/2019 22:41, Thomas Monjalon wrote: >>>>>>> 03/11/2019 21:35, Ray Kinsella: >>>>>>>> On 29/10/2019 14:27, Ferruh Yigit wrote: >>>>>>>>> On 10/26/2019 5:23 PM, Thomas Monjalon wrote: >>>>>>>>>> 26/10/2019 11:23, Wang, Haiyue: >>>>>>>>>>> From: Thomas Monjalon [mailto:thomas@monjalon.net] >>>>>>>>>>>> 26/10/2019 06:40, Wang, Haiyue: >>>>>>>>>>>>> From: Thomas Monjalon [mailto:thomas@monjalon.net] >>>>>>>>>>>>>> 25/10/2019 18:02, Jerin Jacob: >>>>>>>>>>>>>>> On Fri, Oct 25, 2019 at 9:15 PM Thomas Monjalon wrote: >>>>>>>>>>>>>>>> 25/10/2019 16:08, Ferruh Yigit: >>>>>>>>>>>>>>>>> On 10/25/2019 10:36 AM, Thomas Monjalon wrote: >>>>>>>>>>>>>>>>>> 15/10/2019 09:51, Haiyue Wang: >>>>>>>>>>>>>>>>>>> Some PMDs have more than one RX/TX burst paths, add the ethdev API >>>>>>>>>>>>>>>>>>> that allows an application to retrieve the mode information about >>>>>>>>>>>>>>>>>>> Rx/Tx packet burst such as Scalar or Vector, and Vector technology >>>>>>>>>>>>>>>>>>> like AVX2. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I missed this patch. I and Andrew, maintainers of ethdev, were not CC'ed. >>>>>>>>>>>>>>>>>> Ferruh, I would expect to be Cc'ed and/or get a notification before merging. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> It has been discussed in the mail list and went through multiple discussions, >>>>>>>>>>>>>>>>> patch is out since the August, +1 to cc all maintainers I missed that part, >>>>>>>>>>>>>>>>> but when the patch is reviewed and there is no objection, why block the merge? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I'm not saying blocking the merge. >>>>>>>>>>>>>>>> My bad is that I missed the patch and I am asking for help with a notification >>>>>>>>>>>>>>>> in this case. Same for Andrew I guess. >>>>>>>>>>>>>>>> Note: it is merged in master and I am looking to improve this feature. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> +/** >>>>>>>>>>>>>>>>>>> + * Ethernet device RX/TX queue packet burst mode information structure. >>>>>>>>>>>>>>>>>>> + * Used to retrieve information about packet burst mode setting. >>>>>>>>>>>>>>>>>>> + */ >>>>>>>>>>>>>>>>>>> +struct rte_eth_burst_mode { >>>>>>>>>>>>>>>>>>> + uint64_t options; >>>>>>>>>>>>>>>>>>> +}; >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Why a struct for an integer? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Again by a request from me, to not need to break the API if we need to add more >>>>>>>>>>>>>>>>> thing in the future. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I would replace it with a string. This is the most flexible API. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> IMO, Probably, best of both worlds make a good option here, >>>>>>>>>>>>>>> as Haiyue suggested if we have an additional dev_specific[1] in structure. >>>>>>>>>>>>>>> and when a pass to the application, let common code make final string as >>>>>>>>>>>>>>> (options flags to string + dev_specific) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> options flag can be zero if PMD does not have any generic flags nor >>>>>>>>>>>>>>> interested in such a scheme. >>>>>>>>>>>>>>> Generic flags will help at least to have some common code. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>> struct rte_eth_burst_mode { >>>>>>>>>>>>>>> uint64_t options; >>>>>>>>>>>>>>> char dev_specific[128]; /* PMD has specific burst mode information */ >>>>>>>>>>>>>>> }; >>>>>>>>>>>>>> >>>>>>>>>>>>>> I really don't see how we can have generic flags. >>>>>>>>>>>>>> The flags which are proposed are just matching >>>>>>>>>>>>>> the functions implemented in Intel PMDs. >>>>>>>>>>>>>> And this is a complicate solution. >>>>>>>>>>>>>> Why not just returning a name for the selected Rx/Tx mode? >>>>>>>>>>>>> >>>>>>>>>>>>> Intel PMDs use the *generic* methods like x86 SSE, AVX2, ARM NEON, PPC ALTIVEC, >>>>>>>>>>>>> 'dev->data->scattered_rx' etc for the target : "DPDK is the Data Plane Development Kit >>>>>>>>>>>>> that consists of libraries to accelerate packet processing workloads running on a wide >>>>>>>>>>>>> variety of CPU architectures." >>>>>>>>>>>> >>>>>>>>>>>> How RTE_ETH_BURST_SCATTERED and RTE_ETH_BURST_BULK_ALLOC are generic? >>>>>>>>>>>> They just match some features of the Intel PMDs. >>>>>>>>>>>> Why not exposing other optimizations of the Rx/Tx implementations? >>>>>>>>>>>> You totally missed the point of generic burst mode description. >>>>>>>>>>>> >>>>>>>>>>>>> If understand these new experimental APIs from above, then bit options is the best, >>>>>>>>>>>>> and we didn't invent new words to describe them, just from the CPU & other *generic* >>>>>>>>>>>>> technology. And the application can loop to check which kind of burst is running by >>>>>>>>>>>>> just simple bit test. >>>>>>>>>>>>> >>>>>>>>>>>>> If PMDs missed these, they can update them in future roadmaps to enhance their PMDs, >>>>>>>>>>>>> like MLX5 supports ARM NEON, x86 SSE. >>>>>>>>>>>> >>>>>>>>>>>> I have no word! >>>>>>>>>>>> You really think other PMDs should learn from Intel how to "enhance" their PMD? >>>>>>>>>>>> You talk about mlx5, did you look at its code? Did you see the burst modes >>>>>>>>>>>> depending on which specific hardware path is used (MPRQ, EMPW, inline)? >>>>>>>>>>>> Or depending on which offloads are handled? >>>>>>>>>>>> >>>>>>>>>>>> Again, the instruction set used by the function is a small part >>>>>>>>>>>> of the burst mode optimization. >>>>>>>>>>>> >>>>>>>>>>>> So you did not reply to my question: >>>>>>>>>>>> Why not just returning a name for the selected Rx/Tx mode? >>>>>>>>>>> >>>>>>>>>>> In fact, RFC v1/v2 returns the *name*, but the *name* is hard for >>>>>>>>>>> application to do further processing, strcmp, strstr ? Not so nice >>>>>>>>>>> for C code, and it is not so standard, So switch it to bit definition. >>>>>>>>>> >>>>>>>>>> Again, please answer my question: why do you need it? >>>>>>>>>> I think it is just informative, that's why a string should be enough. >>>>>>>>>> I am clearly against the bitmap because it is way too much restrictive. >>>>>>>>>> I disagree that knowing it is using AVX2 or AVX512 is so interesting. >>>>>>>>>> What you would like to know is whether it is processing packets 4 by 4, >>>>>>>>>> for instance, or to know which offload is supported, or what hardware trick >>>>>>>>>> is used in the datapath design. >>>>>>>>>> There are so many options in a datapath design that it cannot be >>>>>>>>>> represented with a bitmap. And it makes no sense to have some design >>>>>>>>>> criterias more important than others. >>>>>>>>>> I Cc an Intel architect (Edwin) who could explain you how much >>>>>>>>>> a datapath design is more complicate than just using AVX instructions. >>>>>>>>> >>>>>>>>> As I understand this is to let applications to give informed decision based on >>>>>>>>> what vectorization is used in the driver, currently this is not know by the >>>>>>>>> application. >>>>>>>>> >>>>>>>>> And as previously replied, the main target of the API is to define the vector >>>>>>>>> path, not all optimizations, so the number is limited. >>>>>>> >>>>>>> No! >>>>>>> The name of this API is "burst mode information", >>>>>>> not "vector instructions used". >>>>>>> I think the main error is that in Intel PMDs, >>>>>>> each Rx/Tx function use different vector instructions. >>>>>>> So you generalize that knowing the vectors instructions >>>>>>> will give you a good information about the performance. >>>>>>> But this is generally wrong! >>>>>>> The right level of infos is much more complex. >>>>>> >>>>>> I don't think anyone was suggesting limiting it to purely describing PMD optimization >>>>>> with vector instructions. If there are other commonalities let's describe those also. >>>>>> >>>>>> Vectorization was thought to be a good starting point - IMHO it is. >>>>>> >>>>>>> >>>>>>>>> There are many optimization in the data path, I agree we may not represent all >>>>>>>>> of them, and agreed existing enum having "RTE_ETH_BURST_BULK_ALLOC" and similar >>>>>>>>> causing this confusion, perhaps we can remove them. >>>>>>>>> >>>>>>>>> And if the requirement from the application is just informative, I would agree >>>>>>>>> that free text string will be better, right now 'rte_eth_rx/tx_burst_mode_get()' >>>>>>>>> is the main API to provide the information and >>>>>>>>> 'rte_eth_burst_mode_option_name()' is a helper for application/driver to log >>>>>>>>> this information. >>>>>>>>> >>>>>>>> >>>>>>>> Well look we have a general deficit of information about what is happening under >>>>>>>> the covers in DPDK. The end user may get wildly different performance characteristics >>>>>>>> based on the DPDK configuration. Simple example is using flow director causes the i40e >>>>>>>> PMD to switch to using a scalar code path, and performance may as much as half. >>>>>>>> >>>>>>>> This can cause no end of head-scratching in consuming products, I have done some >>>>>>>> of that head scratching myself, it is a usability nightmare. >>>>>>>> >>>>>>>> FD.io VPP tries to work around this by mining the call stack, to give the user _some_ >>>>>>>> kind of information about what is happening. These kind of heroics should not be necessary. >>>>>>>> >>>>>>>> For exactly the same reasons as telemetry, we should be trying to give the users as much >>>>>>>> information as possible, in as standard as format as possible. Otherwise DPDK >>>>>>>> becomes arcane leaving the user running gdb to understand what is going on, as I >>>>>>>> frequently do. >>>>>>> >>>>>>> I agree we must provide a clue to understand the performance result. >>>>>>> As Stephen commented at the very beginning, a log is enough for such debug. >>>>>>> But his comment was ignored. >>>>>> >>>>>> Do we expect applications built on DPDK to have to grep it's log to make such discoveries? >>>>>> It's very brittle and arcane way to provide information, if nothing else. >>>>>> >>>>>>> You wanted an API, fine. >>>>>>> I am OK to have an API to request infos which are also in logs. >>>>>> >>>>>> I would point out that an API to query meta-data is common practice else where. >>>>>> GStreamer GstCaps and Linux Sysfs are the closest example I can think of. >>>>>> >>>>>>> >>>>>>>> Finally, again for the same reasons as telemetry, I would say that machine readable is the >>>>>>>> ideal here. >>>>>>> >>>>>>> I disagree here. There is no need to make this info machine readable. >>>>>>> We want a clue about the optimizations which are all about creativity. >>>>>>> And we cannot make creativity of developers "machine readable". >>>>>> >>>>>> I am more concerned about the creativity in how developers describe optimizations. >>>>>> If there is no standardization of strings (or bits), the API will be challenging to use. >>>>> >>>>> No it won't be challenging because it will be just a string to print. >>>> >>>> Well the challenge is getting everyone to use the same set of strings, >>>> such that what is returned by the API has common meaning. >>>> >>>> I am fine with strings. >>>> So long as we have a method of encouraging folks to use a standard set were possible. >>> >>> I don't understand why you insist on standardizing. >>> Every drivers are different. >>> The routine names will have a sense only in the context of the driver. >> >> The more diversity in description, the more the user is reaching for documentation. >> If the user is lucky enough that description has documentation. > > I would go even further: > The documentation will not explain each Rx/Tx routine. > The user can have some guess, but the main information is to see > that the routine changed from one run to the other, so he can expect > a change in the performance result. > And as a bonus, he can ask more explanation to the maintainers > by giving the routine name he got from the API. > >> We can argue about this indefinitely, instead of proposing a standard. :-) >> The best way to this is to leave as the API as experimental for some period - as Haiyue suggests. >> And then review as the API drops experimental status, with a view to standardizing if possible? > > Yes we can argue indefinitely. > My position is against standardizing this information. > (not even talking about the ABI breaks it could cause) So here is my proposed work-around. We are probably to early to describe any commonalities beyond vectorization. Vectorization is seen as too Intel specific. Instead let's go with the PMD-specific strings for v20 ABI. Let's review the PMD-specific strings that emerge during v20 + Experimental, and see if it possible to start to standardize for v21 ABI? My position is that some degree of standardization is necessary here, for usability if no other reason. I am happy to give it time for that to emerge instead of trying to dictate it. So go with PMD-specific strings, and review after 1 year to see if we can improve? > >>>>> The challenge is trying to fix the design characteristics in an API. >>>> >>>> I thought Haiyue's patch with a fair degree of input from Ferruh and others is a pretty solid start. >>>> Let's describe those commonalities that _do_ exist today - it may not be enough, but it's better than we had. >>> >>> The initial requirement is to describe what makes the performance >>> change from one routine to the other. >>> We don't want to describe the commonalities but the specific differences. >>> Please let's focus on the real requirement and build an API which really helps. >>> As said several times, a PMD-specific string would be a good API. > > >