From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 88D67A0524; Fri, 5 Feb 2021 10:37:15 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 0CBF140682; Fri, 5 Feb 2021 10:37:15 +0100 (CET) Received: from shelob.oktetlabs.ru (shelob.oktetlabs.ru [91.220.146.113]) by mails.dpdk.org (Postfix) with ESMTP id 285BE4067B for ; Fri, 5 Feb 2021 10:37:13 +0100 (CET) Received: from [192.168.38.17] (aros.oktetlabs.ru [192.168.38.17]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by shelob.oktetlabs.ru (Postfix) with ESMTPSA id A27607F517; Fri, 5 Feb 2021 12:37:12 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 shelob.oktetlabs.ru A27607F517 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=oktetlabs.ru; s=default; t=1612517832; bh=rrdHiVqPtacFxVzBuWRbCZqyUwjBTjsYkJuepEyR2tM=; h=Subject:To:Cc:References:From:Date:In-Reply-To; b=A/uPcKZBxpkZ62Sxyr7+dnz0zN73N6QCp85OGDqHX7f1/V3BOCA2O+8UrgAzHSozp w+M9jbJevzDmyPHV4yUdhs+yEufkqh48eYv011jCFluixYEMeJPf16urtfMdPslq2z veOkZ1IuDEVyGs7KdfIoF9Ir42pCXI0tqzKtpgJ8= To: "Xueming(Steven) Li" Cc: "dev@dpdk.org" , Slava Ovsiienko , Asaf Penso , Thomas Monjalon References: <1608303356-13089-2-git-send-email-xuemingl@nvidia.com> <1611040409-11548-1-git-send-email-xuemingl@nvidia.com> <27ae1146-50a9-7a56-bce8-b93740927389@oktetlabs.ru> <8afffbdd-c822-bf63-8c74-a77fe78002f8@oktetlabs.ru> <700f53fd-d327-aa5d-13e6-df7afc5331eb@oktetlabs.ru> <6de02e03-2ab1-240e-1767-74bcc4403080@oktetlabs.ru> From: Andrew Rybchenko Organization: OKTET Labs Message-ID: <0ba359d9-724c-0431-013f-d868ccfbb466@oktetlabs.ru> Date: Fri, 5 Feb 2021 12:37:12 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.6.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v5 0/9] ethdev: support SubFunction representor X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 2/5/21 12:13 PM, Xueming(Steven) Li wrote: > >> -----Original Message----- >> From: Andrew Rybchenko >> Sent: Friday, February 5, 2021 3:35 PM >> To: Xueming(Steven) Li >> Cc: dev@dpdk.org; Slava Ovsiienko ; Asaf Penso ; Thomas Monjalon >> >> Subject: Re: [dpdk-dev] [PATCH v5 0/9] ethdev: support SubFunction representor >> >> On 2/4/21 5:15 PM, Xueming(Steven) Li wrote: >>> >>>> -----Original Message----- >>>> From: Andrew Rybchenko >>>> Sent: Monday, February 1, 2021 4:39 PM >>>> To: Xueming(Steven) Li >>>> Cc: dev@dpdk.org; Slava Ovsiienko ; Asaf >>>> Penso >>>> Subject: Re: [dpdk-dev] [PATCH v5 0/9] ethdev: support SubFunction >>>> representor >>>> >>>> On 1/28/21 5:31 PM, Xueming(Steven) Li wrote: >>>>> >>>>>>> The patch of device SF capability, but seems I misunderstood your suggestion. >>>>>>> Let me explain process to create a SF: >>>>>>> 1. SF can be created on the fly with scripts, unlike VF which is statically pre-created. >>>>>> >>>>>> Is there a maximum index and maximum total number of SF's created? How to find it? >>>>> >>>>> The maximum index is defined by firmware configuration, all SF's >>>>> information could be found from sysfs. To create a SF, both PCI and sfnum have to be specified. >>>> >>>> sysfs is obviously Linux specific. I think the information should be available via DPDK API. >>> >>> Yes, the new api discussed below should resolve this issue. >>> >>>> >>>>>> >>>>>>> 2. SF is created on a PF with a SF number. SF number is named per PF, different PF may have same SF number. >>>>>>> 3. For standalone PF, hot plug to DPDK using "PF#_BDF,representor=sf#", no need to use pf#sf# here. >>>>>>> 4. For bonding netdev, hot plug to DPDK using "PF0_BDF,representor=pf#sf#" >>>>>>> If using new api to return all representor IDs, need some way >>>>>>> locate the new created SF by PF and SF number, that's why "pf#sf#" is used in this patch set. >>>>>> >>>>>> I think the API should simply reserve/report space for maximum >>>>>> number of SFs. So, IDs are stable across restart/reboot in >>>>>> assumption that NIC is not reconfigured (changed maximum number of >>>>>> VF or >>>> maximum number of SFs of any PF). >>>>> >>>>> Yes, IDs should be stable as long as no NIC firmware configuration change. >>>>> >>>>> Just clarify, this api should be common enough to report all devices that a bus device supports: >>>>> 1. name, might contains controller and pf info, example: "eth:representor:c0pf1vf" >>>>> 2. ID range, example: 0-127 >>>>> The api describes ID ranges for each sub device type, users have to query the api and choose representor ID to probe. >>>>> >>>>> Prototype: >>>>> struct rte_bus_device_range { >>>>> char name[64]; >>>>> uint32_t base; >>>>> uint32_t number; >>>>> } >>>>> /* return number of ranges filled, or number of ranges if list is >>>>> NULL. */ int rte_bus_ dev_range_get(struct rte_bus_device_range >>>>> *list, int n); >>>> >>>> Hm, I thought about more port representor specific API. >>>> For me it is hard to tell if such generic naming is good or bad. I >>>> think it should be proven that such generic API makes sense. Any other potential users / use cases? >>> >>> I was thinking about SF, but SF is PCI specific, not suitable for this api. So I'm fine to make it as ethdev api. >>> To append new api into eth_dev_ops, is there ABI concern? >> >> No, eth_dev_ops are internal >> >>>> I've considered ethdev API which returns (in similar way as >>>> above) list of possible port representors which could be controlled >>>> by the device. Also I think it would be useful to include type information (enum with PF, VF, SF), controller ID. >>> >>> Agree. >>> >>> There is a new concern from orchestration side, currently, no >>> interface in openstack and OVS to retrieve representor ID range info, >>> It will take time to adapt this solution. To probe a representor, >>> orchestration need to know how to calculate representor ID, and the ID might vary on different max SF number, i.e. VF4 on PP1 >> might got different ID. Representor ID change before that will break the product. >> >> I see. >> >>> Considering both orchestration and testpmd users, how about keeping both solution together? This will bring max flexibility IMHO. >> >> As I said before I don't mind and I really think it is a good idea to add suggested interface to specify representor (i.e. cXpfYvfZ), but the >> problem is making bitmap from representor ID. >> >> ethdev API should use new representor info API to make a representor ID from controller/PF/{VF,SF}. >> Or do you see any problems with such approach? > > Sorry I thought the user to figure out representor ID from api. > This combination look good, thanks for clarification :) > > So the new api looks like this: Roughly speaking - yes > struct rte_eth_representor_info { > Enum representor_type; > Uint16_t controller; // -1 for any I'm not sure that I understand what does "any" mean in this case. I think it should be the zero in examples below. I think that API should return caller controller ID and PF ID. It would allow to interpret "vf5" correctly when caller is not controller #0 and/or PF #0. > Uint16_t port; // -1 for any port sounds like physical port, but it should be PF (pf, phys_fn or something like this). It could be many PFs per physical network port. > Uint16_t representor_id; May be base_id? Or rep_base_id? The question is what to do if range for VF or SF is not contiguous. Should we have one more index after phys_fn to represent it? E.g. union { uint16_t vf; uint16_t sf; }; > Uint16_t count; May be id_range which should be 1 to show one function. It could be convenient to treat 0 this way as well, but I doubt that it is a good idea. > char name[N]; > > int rte_eth_representor_info_get(struct rte_eth_representor_info *infos); > - Return number of entries. > - NULL infos just return number of entries supported. > Sample outputs: > VF, -1, 0, 0, 128, "pf0vf" > SF, -1, 0, 128, 2048, "pf0sf" > PF, -1, 0, 32767, 1, "pf" > VF, -1, 1, 32768, 128, "pf1vf" > SF, -1, 0, (32768+128), 2048, "pf1sf" > PF, -1, 0, 65535, 1, "pf" > >> >>> In struct rte_eth_dev_data, reserved bits could be used to define controller and port, this will avoid bitmap. How do you think? >> >> Could you add a bit more on it? Just a bit more details to the idea since I don't understand what exactly you mean and how it could >> help. > > The idea is replacing reserved_64s and adding more device location info in rte_eth-dev_data like this: > Uint16_t representor_id; > Uint16_t port_id; > Uint16_t controller_id; > Enum representor_type; > Compare them all when matching a device, this will also avoid bitmap encoding. > Reserved_64s[] was added to mitigate ABI conflicts, IIRC. > But seems no need if making representor info API to make ID. > >> >>>> >>>> There is one more bit which is not in the picture yet - >>>> switch_info.port_id. Should it be equal to representor ID? Or different and provided in the info structure? >>> >>> Not exactly same AFAIK, the id used in e-switch. >>> >>> >