From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 67CF9A0524; Wed, 2 Jun 2021 12:50:46 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1F72F4069F; Wed, 2 Jun 2021 12:50:46 +0200 (CEST) Received: from shelob.oktetlabs.ru (shelob.oktetlabs.ru [91.220.146.113]) by mails.dpdk.org (Postfix) with ESMTP id AA4D440689 for ; Wed, 2 Jun 2021 12:50:44 +0200 (CEST) Received: from [192.168.38.17] (aros.oktetlabs.ru [192.168.38.17]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by shelob.oktetlabs.ru (Postfix) with ESMTPSA id 2C69D7F4F3; Wed, 2 Jun 2021 13:50:44 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 shelob.oktetlabs.ru 2C69D7F4F3 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=oktetlabs.ru; s=default; t=1622631044; bh=7FpBKaKQ3M/YpRcxuUklV7OMXtpGBlBVfeFOSapOb+0=; h=Subject:To:Cc:References:From:Date:In-Reply-To; b=NHON8X9lzJVV4tH/L2zy1CVazh3fIweiqd9adWMcTfnJWiODiUc8gY8Oo4LcFnygA IGvmTYmaREseA95/t3dpUtswC/dt49AwhKTfB31KWCq2NFo34kjm3wBaXb77Jtj7Ou Wb05F4/FAomDqE/aQDbMsDhJvtdWdbzkU5+1TgjE= To: Eli Britstein , Ilya Maximets , Ivan Malov , dev@dpdk.org Cc: Smadar Fuks , Hyong Youb Kim , Kishore Padmanabha , Ori Kam , Ajit Khaparde , Jerin Jacob , John Daley , Thomas Monjalon , Ferruh Yigit References: <20210601111420.5549-1-ivan.malov@oktetlabs.ru> <8c4f559e-3430-e2d5-1199-f1d4f4a8546d@ovn.org> <9ed06b56-26e1-5812-e357-81147e699b3b@nvidia.com> <11ed17c8-a3f4-3fcb-b11f-7c4ee903b991@oktetlabs.ru> <4e53dced-2e88-3fde-f2b7-cb2e1368c1c8@nvidia.com> From: Andrew Rybchenko Organization: OKTET Labs Message-ID: <4447be7e-ce2a-a7a0-68e3-1de22a46cf80@oktetlabs.ru> Date: Wed, 2 Jun 2021 13:50:43 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.0 MIME-Version: 1.0 In-Reply-To: <4e53dced-2e88-3fde-f2b7-cb2e1368c1c8@nvidia.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [RFC PATCH] ethdev: clarify flow action PORT ID semantics X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 6/2/21 12:57 PM, Eli Britstein wrote: > > On 6/1/2021 5:53 PM, Andrew Rybchenko wrote: >> External email: Use caution opening links or attachments >> >> >> On 6/1/21 5:44 PM, Eli Britstein wrote: >>> On 6/1/2021 5:35 PM, Andrew Rybchenko wrote: >>>> External email: Use caution opening links or attachments >>>> >>>> >>>> On 6/1/21 4:24 PM, Eli Britstein wrote: >>>>> On 6/1/2021 3:10 PM, Ilya Maximets wrote: >>>>>> External email: Use caution opening links or attachments >>>>>> >>>>>> >>>>>> On 6/1/21 1:14 PM, Ivan Malov wrote: >>>>>>> By its very name, action PORT_ID means that packets hit an ethdev >>>>>>> with the >>>>>>> given DPDK port ID. At least the current comments don't state the >>>>>>> opposite. >>>>>>> That said, since port representors had been adopted, applications >>>>>>> like OvS >>>>>>> have been misusing the action. They misread its purpose as sending >>>>>>> packets >>>>>>> to the opposite end of the "wire" plugged to the given ethdev, for >>>>>>> example, >>>>>>> redirecting packets to the VF itself rather than to its representor >>>>>>> ethdev. >>>>>>> Another example: OvS relies on this action with the admin PF's >>>>>>> ethdev >>>>>>> port >>>>>>> ID specified in it in order to send offloaded packets to the >>>>>>> physical >>>>>>> port. >>>>>>> >>>>>>> Since there might be applications which use this action in its valid >>>>>>> sense, >>>>>>> one can't just change the documentation to greenlight the opposite >>>>>>> meaning. >>>>>>> This patch adds an explicit bit to the action configuration which >>>>>>> will let >>>>>>> applications, depending on their needs, leverage the two meanings >>>>>>> properly. >>>>>>> Applications like OvS, as well as PMDs, will have to be corrected >>>>>>> when the >>>>>>> patch has been applied. But the improved clarity of the action is >>>>>>> worth it. >>>>>>> >>>>>>> The proposed change is not the only option. One could avoid changes >>>>>>> in OvS >>>>>>> and PMDs if the new configuration field had the opposite meaning, >>>>>>> with the >>>>>>> action itself meaning delivery to the represented port and not to >>>>>>> DPDK one. >>>>>>> Alternatively, one could define a brand new action with the said >>>>>>> behaviour. >>>>> It doesn't make any sense to attach the VF itself to OVS, but only its >>>>> representor. >>>> OvS is not the only DPDK application. >>> True. It is just the focus of this commit message is OVS. >>>>> For the PF, when in switchdev mode, it is the "uplink representor", so >>>>> it is also a representor. >>>> Strictly speaking it is not a representor from DPDK point of >>>> view. E.g. representors have corresponding flag set which is >>>> definitely clear in the case of PF. >>> This is the per-PMD responsibility. The API should not care. >>>>> That said, OVS does not care of the type of the port. It doesn't >>>>> matter >>>>> if it's an "upstream" or not, or if it's a representor or not. >>>> Yes, it is clear, but let's put OvS aside. Let's consider a >>>> DPDK application which has a number of ethdev port. Some may >>>> belong to single switch domain, some may be from different >>>> switch domains (i.e. different NICs). Can I use PORT_ID action >>>> to redirect ingress traffic to a specified ethdev port using >>>> PORT_ID action? It looks like no, but IMHO it is the definition >>>> of the PORT_ID action. >>> Let's separate API from implementation. By API point of view, yes, the >>> user may request it. Nothing wrong with it. >>> >>>  From implementation point of view - yes, it might fail, but not for >>> sure, even if on different NICs. Maybe the HW of a certain vendor has >>> the capability to do it? >>> >>> We can't know, so I think the API should allow it. >> Hold on. What should it allow? It is two opposite meanings: >>   1. Direct traffic to DPDK ethdev port specified using ID to be >>      received and processed by the DPDK application. >>   2. Direct traffic to an upstream port represented by the >>      DPDK port. >> >> The patch tries to address the ambiguity, misuse it in OvS >> (from my point of view in accordance with the action >> documentation), mis-implementation in a number of PMDs >> (to work in OvS) and tries to sort it out with an explanation >> why proposed direction is chosen. I realize that it could be >> painful, but IMHO it is the best option here. Yes, it is a >> point to discuss. >> >> To start with we should agree that that problem exists. >> Second, we should agree on direction how to solve it. > > I agree. Suppose port 0 is the PF, and port 1 is a VF representor. > > IIUC, there are two options: > > 1. flow create 1 ingress transfer pattern eth / end action port_id id 0 > upstream 1 / end > > 2. flow create 1 ingress transfer pattern eth / end action port_id id 0 > upstream 0 / end > > [1] is the same behavior as today. > > [2] is a new behavior, the packet received by port 0 as if it arrived > from the wire. > > Then, let's have more: > > 3. flow create 0 ingress transfer pattern eth / end action port_id id 1 > upstream 1 / end > > 4. flow create 0 ingress transfer pattern eth / end action port_id id 1 > upstream 0 / end > > if we have [2] and [4], the packet going from the VF will hit [2], then > hit [4] and then [2] again in an endless loop? As I understand PORT_ID is a fate action. So, no more lookups are done. If the packet is loop back from applications, loop is possible. In fact, it is a good question if "flow creare 0 ingress transfer" or "flow create 1 ingress transfer" assume any implicit filtering. I always thought that no. i.e. if we have two network ports rule like flow create 0 ingress transfer pattern eth / end \ action port_id id 1 upstream 1 / end will match packets incoming from any port into the switch (network port 0, network port 1, VF or PF itself (???)). The topic also requires explicit clarification. PF itself is really a hard question because of "ingress" since traffic from PF is a traffic from DPDK application and it is egress, not ingress. I think that port ID used to created flow rule should not apply any filtering in the case of transfer since we have corresponding items to do it explicitly. If we do it implicitly as well, we need some priorities and a way to avoid implicit rules which makes things much harder to understand and implement. > If this is your meaning, maybe what you are looking for is an action to > change the in_port and continue processing? > > Please comment on the examples I gave or clarify the use case you are > trying to do. > > > Thanks, > > Eli > >> >>>>>> We had already very similar discussions regarding the >>>>>> understanding of >>>>>> what >>>>>> the representor really is from the DPDK API's point of view, and the >>>>>> last >>>>>> time, IIUC, it was concluded by a tech. board that representor >>>>>> should be >>>>>> a "ghost of a VF", i.e. DPDK APIs should apply configuration by >>>>>> default to >>>>>> VF and not to the representor device: >>>>>> >>>>>> https://patches.dpdk.org/project/dpdk/cover/20191029185051.32203-1-thomas@monjalon.net/#104376 >>>>>> >>>>>> >>>>>> >>>>>> This wasn't enforced though, IIUC, for existing code and semantics is >>>>>> still mixed. >>>>> I am not sure how this is related. >>>>>> I still think that configuration should be applied to VF, and the >>>>>> same >>>>>> applies >>>>>> to rte_flow API.  IMHO, average application should not care if >>>>>> device is >>>>>> a VF itself or its representor.  Everything should work exactly the >>>>>> same. >>>>>> I think this matches with the original idea/design of the switchdev >>>>>> functionality >>>>>> in the linux kernel and also matches with how the average user thinks >>>>>> about >>>>>> representor devices. >>>>> Right. This is the way representors work. It is fully aligned with >>>>> configuration of OVS-kernel. >>>>>> If some specific use-case requires to distinguish VF from the >>>>>> representor, >>>>>> there should probably be a separate special API/flag for that. >>>>>> >>>>>> Best regards, Ilya Maximets.