From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id B6653A0352; Mon, 4 Nov 2019 13:08:02 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 541F637A2; Mon, 4 Nov 2019 13:08:02 +0100 (CET) Received: from qrelay154.mxroute.com (qrelay154.mxroute.com [172.82.139.154]) by dpdk.org (Postfix) with ESMTP id 594F23576 for ; Mon, 4 Nov 2019 13:08:01 +0100 (CET) Received: from filter004.mxroute.com (unknown [94.130.183.33]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by qrelay154.mxroute.com (Postfix) with ESMTPS id 8A17414095D; Mon, 4 Nov 2019 07:08:00 -0500 (EST) Received: from galaxy.mxroute.com (unknown [23.92.70.113]) by filter004.mxroute.com (Postfix) with ESMTPS id 7F30D3E809; Mon, 4 Nov 2019 12:07:53 +0000 (UTC) Received: from irdmzpr01-ext.ir.intel.com ([192.198.151.36]) by galaxy.mxroute.com with esmtpsa (TLSv1.2:ECDHE-RSA-AES128-GCM-SHA256:128) (Exim 4.91) (envelope-from ) id 1iRaxF-00021L-1b; Mon, 04 Nov 2019 06:55:29 -0500 To: Thomas Monjalon Cc: dev@dpdk.org, "Yigit, Ferruh" , 'Damjan Marion' , "Wang, Haiyue" , Jerin Jacob Kollanukkaran , viacheslavo@mellanox.com, stephen@networkplumber.org, arybchenko@solarflare.com References: <20191015075133.38560-1-haiyue.wang@intel.com> <4310025.8zu0hhGBfT@xps> <31ae4f6c-f55b-ae3f-4527-4870156b74ff@ashroe.eu> <3333162.RZseXh5CjJ@xps> From: Ray Kinsella Openpgp: preference=signencrypt Autocrypt: addr=mdr@ashroe.eu; keydata= mQINBFv8B3wBEAC+5ImcgbIvadt3axrTnt7Sxch3FsmWTTomXfB8YiuHT8KL8L/bFRQSL1f6 ASCHu3M89EjYazlY+vJUWLr0BhK5t/YI7bQzrOuYrl9K94vlLwzD19s/zB/g5YGGR5plJr0s JtJsFGEvF9LL3e+FKMRXveQxBB8A51nAHfwG0WSyx53d61DYz7lp4/Y4RagxaJoHp9lakn8j HV2N6rrnF+qt5ukj5SbbKWSzGg5HQF2t0QQ5tzWhCAKTfcPlnP0GymTBfNMGOReWivi3Qqzr S51Xo7hoGujUgNAM41sxpxmhx8xSwcQ5WzmxgAhJ/StNV9cb3HWIoE5StCwQ4uXOLplZNGnS uxNdegvKB95NHZjRVRChg/uMTGpg9PqYbTIFoPXjuk27sxZLRJRrueg4tLbb3HM39CJwSB++ YICcqf2N+GVD48STfcIlpp12/HI+EcDSThzfWFhaHDC0hyirHxJyHXjnZ8bUexI/5zATn/ux TpMbc/vicJxeN+qfaVqPkCbkS71cHKuPluM3jE8aNCIBNQY1/j87k5ELzg3qaesLo2n1krBH bKvFfAmQuUuJT84/IqfdVtrSCTabvDuNBDpYBV0dGbTwaRfE7i+LiJJclUr8lOvHUpJ4Y6a5 0cxEPxm498G12Z3NoY/mP5soItPIPtLR0rA0fage44zSPwp6cQARAQABtBxSYXkgS2luc2Vs bGEgPG1kckBhc2hyb2UuZXU+iQJUBBMBCAA+FiEEcDUDlKDJaDuJlfZfdJdaH/sCCpsFAlv8 B3wCGyMFCQlmAYAFCwkIBwIGFQoJCAsCBBYCAwECHgECF4AACgkQdJdaH/sCCptdtRAAl0oE msa+djBVYLIsax+0f8acidtWg2l9f7kc2hEjp9h9aZCpPchQvhhemtew/nKavik3RSnLTAyn B3C/0GNlmvI1l5PFROOgPZwz4xhJKGN7jOsRrbkJa23a8ly5UXwF3Vqnlny7D3z+7cu1qq/f VRK8qFyWkAb+xgqeZ/hTcbJUWtW+l5Zb+68WGEp8hB7TuJLEWb4+VKgHTpQ4vElYj8H3Z94a 04s2PJMbLIZSgmKDASnyrKY0CzTpPXx5rSJ1q+B1FCsfepHLqt3vKSALa3ld6bJ8fSJtDUJ7 JLiU8dFZrywgDIVme01jPbjJtUScW6jONLvhI8Z2sheR71UoKqGomMHNQpZ03ViVWBEALzEt TcjWgJFn8yAmxqM4nBnZ+hE3LbMo34KCHJD4eg18ojDt3s9VrDLa+V9fNxUHPSib9FD9UX/1 +nGfU/ZABmiTuUDM7WZdXri7HaMpzDRJUKI6b+/uunF8xH/h/MHW16VuMzgI5dkOKKv1LejD dT5mA4R+2zBS+GsM0oa2hUeX9E5WwjaDzXtVDg6kYq8YvEd+m0z3M4e6diFeLS77/sAOgaYL 92UcoKD+Beym/fVuC6/55a0e12ksTmgk5/ZoEdoNQLlVgd2INtvnO+0k5BJcn66ZjKn3GbEC VqFbrnv1GnA58nEInRCTzR1k26h9nmS5Ag0EW/wHfAEQAMth1vHr3fOZkVOPfod3M6DkQir5 xJvUW5EHgYUjYCPIa2qzgIVVuLDqZgSCCinyooG5dUJONVHj3nCbITCpJp4eB3PI84RPfDcC hf/V34N/Gx5mTeoymSZDBmXT8YtvV/uJvn+LvHLO4ZJdvq5ZxmDyxfXFmkm3/lLw0+rrNdK5 pt6OnVlCqEU9tcDBezjUwDtOahyV20XqxtUttN4kQWbDRkhT+HrA9WN9l2HX91yEYC+zmF1S OhBqRoTPLrR6g4sCWgFywqztpvZWhyIicJipnjac7qL/wRS+wrWfsYy6qWLIV80beN7yoa6v ccnuy4pu2uiuhk9/edtlmFE4dNdoRf7843CV9k1yRASTlmPkU59n0TJbw+okTa9fbbQgbIb1 pWsAuicRHyLUIUz4f6kPgdgty2FgTKuPuIzJd1s8s6p2aC1qo+Obm2gnBTduB+/n1Jw+vKpt 07d+CKEKu4CWwvZZ8ktJJLeofi4hMupTYiq+oMzqH+V1k6QgNm0Da489gXllU+3EFC6W1qKj tkvQzg2rYoWeYD1Qn8iXcO4Fpk6wzylclvatBMddVlQ6qrYeTmSbCsk+m2KVrz5vIyja0o5Y yfeN29s9emXnikmNfv/dA5fpi8XCANNnz3zOfA93DOB9DBf0TQ2/OrSPGjB3op7RCfoPBZ7u AjJ9dM7VABEBAAGJAjwEGAEIACYWIQRwNQOUoMloO4mV9l90l1of+wIKmwUCW/wHfAIbDAUJ CWYBgAAKCRB0l1of+wIKm3KlD/9w/LOG5rtgtCUWPl4B3pZvGpNym6XdK8cop9saOnE85zWf u+sKWCrxNgYkYP7aZrYMPwqDvilxhbTsIJl5HhPgpTO1b0i+c0n1Tij3EElj5UCg3q8mEc17 c+5jRrY3oz77g7E3oPftAjaq1ybbXjY4K32o3JHFR6I8wX3m9wJZJe1+Y+UVrrjY65gZFxcA thNVnWKErarVQGjeNgHV4N1uF3pIx3kT1N4GSnxhoz4Bki91kvkbBhUgYfNflGURfZT3wIKK +d50jd7kqRouXUCzTdzmDh7jnYrcEFM4nvyaYu0JjSS5R672d9SK5LVIfWmoUGzqD4AVmUW8 pcv461+PXchuS8+zpltR9zajl72Q3ymlT4BTAQOlCWkD0snBoKNUB5d2EXPNV13nA0qlm4U2 GpROfJMQXjV6fyYRvttKYfM5xYKgRgtP0z5lTAbsjg9WFKq0Fndh7kUlmHjuAIwKIV4Tzo75 QO2zC0/NTaTjmrtiXhP+vkC4pcrOGNsbHuaqvsc/ZZ0siXyYsqbctj/sCd8ka2r94u+c7o4l BGaAm+FtwAfEAkXHu4y5Phuv2IRR+x1wTey1U1RaEPgN8xq0LQ1OitX4t2mQwjdPihZQBCnZ wzOrkbzlJMNrMKJpEgulmxAHmYJKgvZHXZXtLJSejFjR0GdHJcL5rwVOMWB8cg== Message-ID: <932da3d3-34b9-79a4-1a68-fb06056ee93f@ashroe.eu> Date: Mon, 4 Nov 2019 12:07:48 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <3333162.RZseXh5CjJ@xps> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-AuthUser: mdr@ashroe.eu Subject: Re: [dpdk-dev] [PATCH v4 1/4] ethdev: add the API for getting burst mode information X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 04/11/2019 11:30, Thomas Monjalon wrote: > 04/11/2019 11:03, Ray Kinsella: >> On 04/11/2019 09:54, Thomas Monjalon wrote: >>> 04/11/2019 10:49, Ray Kinsella: >>>> On 03/11/2019 22:41, Thomas Monjalon wrote: >>>>> 03/11/2019 21:35, Ray Kinsella: >>>>>> On 29/10/2019 14:27, Ferruh Yigit wrote: >>>>>>> On 10/26/2019 5:23 PM, Thomas Monjalon wrote: >>>>>>>> 26/10/2019 11:23, Wang, Haiyue: >>>>>>>>> From: Thomas Monjalon [mailto:thomas@monjalon.net] >>>>>>>>>> 26/10/2019 06:40, Wang, Haiyue: >>>>>>>>>>> From: Thomas Monjalon [mailto:thomas@monjalon.net] >>>>>>>>>>>> 25/10/2019 18:02, Jerin Jacob: >>>>>>>>>>>>> On Fri, Oct 25, 2019 at 9:15 PM Thomas Monjalon wrote: >>>>>>>>>>>>>> 25/10/2019 16:08, Ferruh Yigit: >>>>>>>>>>>>>>> On 10/25/2019 10:36 AM, Thomas Monjalon wrote: >>>>>>>>>>>>>>>> 15/10/2019 09:51, Haiyue Wang: >>>>>>>>>>>>>>>>> Some PMDs have more than one RX/TX burst paths, add the ethdev API >>>>>>>>>>>>>>>>> that allows an application to retrieve the mode information about >>>>>>>>>>>>>>>>> Rx/Tx packet burst such as Scalar or Vector, and Vector technology >>>>>>>>>>>>>>>>> like AVX2. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I missed this patch. I and Andrew, maintainers of ethdev, were not CC'ed. >>>>>>>>>>>>>>>> Ferruh, I would expect to be Cc'ed and/or get a notification before merging. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It has been discussed in the mail list and went through multiple discussions, >>>>>>>>>>>>>>> patch is out since the August, +1 to cc all maintainers I missed that part, >>>>>>>>>>>>>>> but when the patch is reviewed and there is no objection, why block the merge? >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm not saying blocking the merge. >>>>>>>>>>>>>> My bad is that I missed the patch and I am asking for help with a notification >>>>>>>>>>>>>> in this case. Same for Andrew I guess. >>>>>>>>>>>>>> Note: it is merged in master and I am looking to improve this feature. >>>>>>>>>>>>> >>>>>>>>>>>>>>>>> +/** >>>>>>>>>>>>>>>>> + * Ethernet device RX/TX queue packet burst mode information structure. >>>>>>>>>>>>>>>>> + * Used to retrieve information about packet burst mode setting. >>>>>>>>>>>>>>>>> + */ >>>>>>>>>>>>>>>>> +struct rte_eth_burst_mode { >>>>>>>>>>>>>>>>> + uint64_t options; >>>>>>>>>>>>>>>>> +}; >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Why a struct for an integer? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Again by a request from me, to not need to break the API if we need to add more >>>>>>>>>>>>>>> thing in the future. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I would replace it with a string. This is the most flexible API. >>>>>>>>>>>>> >>>>>>>>>>>>> IMO, Probably, best of both worlds make a good option here, >>>>>>>>>>>>> as Haiyue suggested if we have an additional dev_specific[1] in structure. >>>>>>>>>>>>> and when a pass to the application, let common code make final string as >>>>>>>>>>>>> (options flags to string + dev_specific) >>>>>>>>>>>>> >>>>>>>>>>>>> options flag can be zero if PMD does not have any generic flags nor >>>>>>>>>>>>> interested in such a scheme. >>>>>>>>>>>>> Generic flags will help at least to have some common code. >>>>>>>>>>>>> >>>>>>>>>>>>> [1] >>>>>>>>>>>>> struct rte_eth_burst_mode { >>>>>>>>>>>>> uint64_t options; >>>>>>>>>>>>> char dev_specific[128]; /* PMD has specific burst mode information */ >>>>>>>>>>>>> }; >>>>>>>>>>>> >>>>>>>>>>>> I really don't see how we can have generic flags. >>>>>>>>>>>> The flags which are proposed are just matching >>>>>>>>>>>> the functions implemented in Intel PMDs. >>>>>>>>>>>> And this is a complicate solution. >>>>>>>>>>>> Why not just returning a name for the selected Rx/Tx mode? >>>>>>>>>>> >>>>>>>>>>> Intel PMDs use the *generic* methods like x86 SSE, AVX2, ARM NEON, PPC ALTIVEC, >>>>>>>>>>> 'dev->data->scattered_rx' etc for the target : "DPDK is the Data Plane Development Kit >>>>>>>>>>> that consists of libraries to accelerate packet processing workloads running on a wide >>>>>>>>>>> variety of CPU architectures." >>>>>>>>>> >>>>>>>>>> How RTE_ETH_BURST_SCATTERED and RTE_ETH_BURST_BULK_ALLOC are generic? >>>>>>>>>> They just match some features of the Intel PMDs. >>>>>>>>>> Why not exposing other optimizations of the Rx/Tx implementations? >>>>>>>>>> You totally missed the point of generic burst mode description. >>>>>>>>>> >>>>>>>>>>> If understand these new experimental APIs from above, then bit options is the best, >>>>>>>>>>> and we didn't invent new words to describe them, just from the CPU & other *generic* >>>>>>>>>>> technology. And the application can loop to check which kind of burst is running by >>>>>>>>>>> just simple bit test. >>>>>>>>>>> >>>>>>>>>>> If PMDs missed these, they can update them in future roadmaps to enhance their PMDs, >>>>>>>>>>> like MLX5 supports ARM NEON, x86 SSE. >>>>>>>>>> >>>>>>>>>> I have no word! >>>>>>>>>> You really think other PMDs should learn from Intel how to "enhance" their PMD? >>>>>>>>>> You talk about mlx5, did you look at its code? Did you see the burst modes >>>>>>>>>> depending on which specific hardware path is used (MPRQ, EMPW, inline)? >>>>>>>>>> Or depending on which offloads are handled? >>>>>>>>>> >>>>>>>>>> Again, the instruction set used by the function is a small part >>>>>>>>>> of the burst mode optimization. >>>>>>>>>> >>>>>>>>>> So you did not reply to my question: >>>>>>>>>> Why not just returning a name for the selected Rx/Tx mode? >>>>>>>>> >>>>>>>>> In fact, RFC v1/v2 returns the *name*, but the *name* is hard for >>>>>>>>> application to do further processing, strcmp, strstr ? Not so nice >>>>>>>>> for C code, and it is not so standard, So switch it to bit definition. >>>>>>>> >>>>>>>> Again, please answer my question: why do you need it? >>>>>>>> I think it is just informative, that's why a string should be enough. >>>>>>>> I am clearly against the bitmap because it is way too much restrictive. >>>>>>>> I disagree that knowing it is using AVX2 or AVX512 is so interesting. >>>>>>>> What you would like to know is whether it is processing packets 4 by 4, >>>>>>>> for instance, or to know which offload is supported, or what hardware trick >>>>>>>> is used in the datapath design. >>>>>>>> There are so many options in a datapath design that it cannot be >>>>>>>> represented with a bitmap. And it makes no sense to have some design >>>>>>>> criterias more important than others. >>>>>>>> I Cc an Intel architect (Edwin) who could explain you how much >>>>>>>> a datapath design is more complicate than just using AVX instructions. >>>>>>> >>>>>>> As I understand this is to let applications to give informed decision based on >>>>>>> what vectorization is used in the driver, currently this is not know by the >>>>>>> application. >>>>>>> >>>>>>> And as previously replied, the main target of the API is to define the vector >>>>>>> path, not all optimizations, so the number is limited. >>>>> >>>>> No! >>>>> The name of this API is "burst mode information", >>>>> not "vector instructions used". >>>>> I think the main error is that in Intel PMDs, >>>>> each Rx/Tx function use different vector instructions. >>>>> So you generalize that knowing the vectors instructions >>>>> will give you a good information about the performance. >>>>> But this is generally wrong! >>>>> The right level of infos is much more complex. >>>> >>>> I don't think anyone was suggesting limiting it to purely describing PMD optimization >>>> with vector instructions. If there are other commonalities let's describe those also. >>>> >>>> Vectorization was thought to be a good starting point - IMHO it is. >>>> >>>>> >>>>>>> There are many optimization in the data path, I agree we may not represent all >>>>>>> of them, and agreed existing enum having "RTE_ETH_BURST_BULK_ALLOC" and similar >>>>>>> causing this confusion, perhaps we can remove them. >>>>>>> >>>>>>> And if the requirement from the application is just informative, I would agree >>>>>>> that free text string will be better, right now 'rte_eth_rx/tx_burst_mode_get()' >>>>>>> is the main API to provide the information and >>>>>>> 'rte_eth_burst_mode_option_name()' is a helper for application/driver to log >>>>>>> this information. >>>>>>> >>>>>> >>>>>> Well look we have a general deficit of information about what is happening under >>>>>> the covers in DPDK. The end user may get wildly different performance characteristics >>>>>> based on the DPDK configuration. Simple example is using flow director causes the i40e >>>>>> PMD to switch to using a scalar code path, and performance may as much as half. >>>>>> >>>>>> This can cause no end of head-scratching in consuming products, I have done some >>>>>> of that head scratching myself, it is a usability nightmare. >>>>>> >>>>>> FD.io VPP tries to work around this by mining the call stack, to give the user _some_ >>>>>> kind of information about what is happening. These kind of heroics should not be necessary. >>>>>> >>>>>> For exactly the same reasons as telemetry, we should be trying to give the users as much >>>>>> information as possible, in as standard as format as possible. Otherwise DPDK >>>>>> becomes arcane leaving the user running gdb to understand what is going on, as I >>>>>> frequently do. >>>>> >>>>> I agree we must provide a clue to understand the performance result. >>>>> As Stephen commented at the very beginning, a log is enough for such debug. >>>>> But his comment was ignored. >>>> >>>> Do we expect applications built on DPDK to have to grep it's log to make such discoveries? >>>> It's very brittle and arcane way to provide information, if nothing else. >>>> >>>>> You wanted an API, fine. >>>>> I am OK to have an API to request infos which are also in logs. >>>> >>>> I would point out that an API to query meta-data is common practice else where. >>>> GStreamer GstCaps and Linux Sysfs are the closest example I can think of. >>>> >>>>> >>>>>> Finally, again for the same reasons as telemetry, I would say that machine readable is the >>>>>> ideal here. >>>>> >>>>> I disagree here. There is no need to make this info machine readable. >>>>> We want a clue about the optimizations which are all about creativity. >>>>> And we cannot make creativity of developers "machine readable". >>>> >>>> I am more concerned about the creativity in how developers describe optimizations. >>>> If there is no standardization of strings (or bits), the API will be challenging to use. >>> >>> No it won't be challenging because it will be just a string to print. >> >> Well the challenge is getting everyone to use the same set of strings, >> such that what is returned by the API has common meaning. >> >> I am fine with strings. >> So long as we have a method of encouraging folks to use a standard set were possible. > > I don't understand why you insist on standardizing. > Every drivers are different. > The routine names will have a sense only in the context of the driver. The more diversity in description, the more the user is reaching for documentation. If the user is lucky enough that description has documentation. We can argue about this indefinitely, instead of proposing a standard. :-) The best way to this is to leave as the API as experimental for some period - as Haiyue suggests. And then review as the API drops experimental status, with a view to standardizing if possible? > >>> The challenge is trying to fix the design characteristics in an API. >> >> I thought Haiyue's patch with a fair degree of input from Ferruh and others is a pretty solid start. >> Let's describe those commonalities that _do_ exist today - it may not be enough, but it's better than we had. > > The initial requirement is to describe what makes the performance > change from one routine to the other. > We don't want to describe the commonalities but the specific differences. > Please let's focus on the real requirement and build an API which really helps. > As said several times, a PMD-specific string would be a good API.