Re: [dpdk-dev] [RFC PATCH] ethdev: clarify flow action PORT ID semantics

DPDK patches and discussions
 help / color / mirror / Atom feed

From: Eli Britstein <elibr@nvidia.com>
To: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	Ilya Maximets <i.maximets@ovn.org>,
	Ivan Malov <ivan.malov@oktetlabs.ru>, <dev@dpdk.org>
Cc: Smadar Fuks <smadarf@marvell.com>,
	Hyong Youb Kim <hyonkim@cisco.com>,
	Kishore Padmanabha <kishore.padmanabha@broadcom.com>,
	Ori Kam <orika@nvidia.com>,
	Ajit Khaparde <ajit.khaparde@broadcom.com>,
	Jerin Jacob <jerinj@marvell.com>, John Daley <johndale@cisco.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	Ferruh Yigit <ferruh.yigit@intel.com>
Subject: Re: [dpdk-dev] [RFC PATCH] ethdev: clarify flow action PORT ID semantics
Date: Wed, 2 Jun 2021 14:21:51 +0300	[thread overview]
Message-ID: <adaab74c-8e08-4b06-41e9-9cde2ec67e1e@nvidia.com> (raw)
In-Reply-To: <4447be7e-ce2a-a7a0-68e3-1de22a46cf80@oktetlabs.ru>


On 6/2/2021 1:50 PM, Andrew Rybchenko wrote:
> External email: Use caution opening links or attachments
>
>
> On 6/2/21 12:57 PM, Eli Britstein wrote:
>> On 6/1/2021 5:53 PM, Andrew Rybchenko wrote:
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On 6/1/21 5:44 PM, Eli Britstein wrote:
>>>> On 6/1/2021 5:35 PM, Andrew Rybchenko wrote:
>>>>> External email: Use caution opening links or attachments
>>>>>
>>>>>
>>>>> On 6/1/21 4:24 PM, Eli Britstein wrote:
>>>>>> On 6/1/2021 3:10 PM, Ilya Maximets wrote:
>>>>>>> External email: Use caution opening links or attachments
>>>>>>>
>>>>>>>
>>>>>>> On 6/1/21 1:14 PM, Ivan Malov wrote:
>>>>>>>> By its very name, action PORT_ID means that packets hit an ethdev
>>>>>>>> with the
>>>>>>>> given DPDK port ID. At least the current comments don't state the
>>>>>>>> opposite.
>>>>>>>> That said, since port representors had been adopted, applications
>>>>>>>> like OvS
>>>>>>>> have been misusing the action. They misread its purpose as sending
>>>>>>>> packets
>>>>>>>> to the opposite end of the "wire" plugged to the given ethdev, for
>>>>>>>> example,
>>>>>>>> redirecting packets to the VF itself rather than to its representor
>>>>>>>> ethdev.
>>>>>>>> Another example: OvS relies on this action with the admin PF's
>>>>>>>> ethdev
>>>>>>>> port
>>>>>>>> ID specified in it in order to send offloaded packets to the
>>>>>>>> physical
>>>>>>>> port.
>>>>>>>>
>>>>>>>> Since there might be applications which use this action in its valid
>>>>>>>> sense,
>>>>>>>> one can't just change the documentation to greenlight the opposite
>>>>>>>> meaning.
>>>>>>>> This patch adds an explicit bit to the action configuration which
>>>>>>>> will let
>>>>>>>> applications, depending on their needs, leverage the two meanings
>>>>>>>> properly.
>>>>>>>> Applications like OvS, as well as PMDs, will have to be corrected
>>>>>>>> when the
>>>>>>>> patch has been applied. But the improved clarity of the action is
>>>>>>>> worth it.
>>>>>>>>
>>>>>>>> The proposed change is not the only option. One could avoid changes
>>>>>>>> in OvS
>>>>>>>> and PMDs if the new configuration field had the opposite meaning,
>>>>>>>> with the
>>>>>>>> action itself meaning delivery to the represented port and not to
>>>>>>>> DPDK one.
>>>>>>>> Alternatively, one could define a brand new action with the said
>>>>>>>> behaviour.
>>>>>> It doesn't make any sense to attach the VF itself to OVS, but only its
>>>>>> representor.
>>>>> OvS is not the only DPDK application.
>>>> True. It is just the focus of this commit message is OVS.
>>>>>> For the PF, when in switchdev mode, it is the "uplink representor", so
>>>>>> it is also a representor.
>>>>> Strictly speaking it is not a representor from DPDK point of
>>>>> view. E.g. representors have corresponding flag set which is
>>>>> definitely clear in the case of PF.
>>>> This is the per-PMD responsibility. The API should not care.
>>>>>> That said, OVS does not care of the type of the port. It doesn't
>>>>>> matter
>>>>>> if it's an "upstream" or not, or if it's a representor or not.
>>>>> Yes, it is clear, but let's put OvS aside. Let's consider a
>>>>> DPDK application which has a number of ethdev port. Some may
>>>>> belong to single switch domain, some may be from different
>>>>> switch domains (i.e. different NICs). Can I use PORT_ID action
>>>>> to redirect ingress traffic to a specified ethdev port using
>>>>> PORT_ID action? It looks like no, but IMHO it is the definition
>>>>> of the PORT_ID action.
>>>> Let's separate API from implementation. By API point of view, yes, the
>>>> user may request it. Nothing wrong with it.
>>>>
>>>>   From implementation point of view - yes, it might fail, but not for
>>>> sure, even if on different NICs. Maybe the HW of a certain vendor has
>>>> the capability to do it?
>>>>
>>>> We can't know, so I think the API should allow it.
>>> Hold on. What should it allow? It is two opposite meanings:
>>>    1. Direct traffic to DPDK ethdev port specified using ID to be
>>>       received and processed by the DPDK application.
>>>    2. Direct traffic to an upstream port represented by the
>>>       DPDK port.
>>>
>>> The patch tries to address the ambiguity, misuse it in OvS
>>> (from my point of view in accordance with the action
>>> documentation), mis-implementation in a number of PMDs
>>> (to work in OvS) and tries to sort it out with an explanation
>>> why proposed direction is chosen. I realize that it could be
>>> painful, but IMHO it is the best option here. Yes, it is a
>>> point to discuss.
>>>
>>> To start with we should agree that that problem exists.
>>> Second, we should agree on direction how to solve it.
>> I agree. Suppose port 0 is the PF, and port 1 is a VF representor.
>>
>> IIUC, there are two options:
>>
>> 1. flow create 1 ingress transfer pattern eth / end action port_id id 0
>> upstream 1 / end
>>
>> 2. flow create 1 ingress transfer pattern eth / end action port_id id 0
>> upstream 0 / end
>>
>> [1] is the same behavior as today.
>>
>> [2] is a new behavior, the packet received by port 0 as if it arrived
>> from the wire.
>>
>> Then, let's have more:
>>
>> 3. flow create 0 ingress transfer pattern eth / end action port_id id 1
>> upstream 1 / end
>>
>> 4. flow create 0 ingress transfer pattern eth / end action port_id id 1
>> upstream 0 / end
>>
>> if we have [2] and [4], the packet going from the VF will hit [2], then
>> hit [4] and then [2] again in an endless loop?
> As I understand PORT_ID is a fate action. So, no more lookups
> are done. If the packet is loop back from applications, loop is
> possible.

I referred a HW loop, not SW. For example with JUMP action (also fate):

flow create 0 group 0 ingress transfer pattern eth / end action jump 
group 1 / end

flow create 0 group 1 ingress transfer pattern eth / end action jump 
group 0 / end

>
> In fact, it is a good question if "flow creare 0 ingress
> transfer" or "flow create 1 ingress transfer" assume any
> implicit filtering. I always thought that no.
> i.e. if we have two network ports rule like
>    flow create 0 ingress transfer pattern eth / end \
>         action port_id id 1 upstream 1 / end
> will match packets incoming from any port into the switch
> (network port 0, network port 1, VF or PF itself (???)).
> The topic also requires explicit clarification.
rte_flow is port based. It implicitly filters only packets for the 
provided port (0).

Maybe need to clarify documentation and have a "no filtering" API if needed.

> PF itself is really a hard question because of "ingress"
> since traffic from PF is a traffic from DPDK application and
> it is egress, not ingress.

Ingress means the direction. Hit on packets otherwise provided to the SW 
by rte_eth_rx_burst().

Same goes for the PF. Packets by rte_eth_rx_burst are the ones arriving 
from the wire, so ingress is that direction and egress is from the app.

>
> I think that port ID used to created flow rule should not
> apply any filtering in the case of transfer since we have
> corresponding items to do it explicitly. If we do it implicitly
> as well, we need some priorities and a way to avoid implicit
> rules which makes things much harder to understand and
> implement.

If "upstream 0" means what I thought it means (comments?) maybe a better 
way to do it is expose another port for that, so there will be 2 "PF" 
ports - one as the wire representor and the other one as the "PF" (or 
clearer naming...).

This would be a vendor decision, and there would be no need to change 
PORT_ID API.

>
>> If this is your meaning, maybe what you are looking for is an action to
>> change the in_port and continue processing?
>>
>> Please comment on the examples I gave or clarify the use case you are
>> trying to do.
>>
>>
>> Thanks,
>>
>> Eli
>>
>>>>>>> We had already very similar discussions regarding the
>>>>>>> understanding of
>>>>>>> what
>>>>>>> the representor really is from the DPDK API's point of view, and the
>>>>>>> last
>>>>>>> time, IIUC, it was concluded by a tech. board that representor
>>>>>>> should be
>>>>>>> a "ghost of a VF", i.e. DPDK APIs should apply configuration by
>>>>>>> default to
>>>>>>> VF and not to the representor device:
>>>>>>>
>>>>>>> https://patches.dpdk.org/project/dpdk/cover/20191029185051.32203-1-thomas@monjalon.net/#104376
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This wasn't enforced though, IIUC, for existing code and semantics is
>>>>>>> still mixed.
>>>>>> I am not sure how this is related.
>>>>>>> I still think that configuration should be applied to VF, and the
>>>>>>> same
>>>>>>> applies
>>>>>>> to rte_flow API.  IMHO, average application should not care if
>>>>>>> device is
>>>>>>> a VF itself or its representor.  Everything should work exactly the
>>>>>>> same.
>>>>>>> I think this matches with the original idea/design of the switchdev
>>>>>>> functionality
>>>>>>> in the linux kernel and also matches with how the average user thinks
>>>>>>> about
>>>>>>> representor devices.
>>>>>> Right. This is the way representors work. It is fully aligned with
>>>>>> configuration of OVS-kernel.
>>>>>>> If some specific use-case requires to distinguish VF from the
>>>>>>> representor,
>>>>>>> there should probably be a separate special API/flag for that.
>>>>>>>
>>>>>>> Best regards, Ilya Maximets.

next prev parent reply	other threads:[~2021-06-02 11:22 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-01 11:14 Ivan Malov
2021-06-01 12:10 ` Ilya Maximets
2021-06-01 13:24   ` Eli Britstein
2021-06-01 14:35     ` Andrew Rybchenko
2021-06-01 14:44       ` Eli Britstein
2021-06-01 14:50         ` Ivan Malov
2021-06-01 14:53         ` Andrew Rybchenko
2021-06-02  9:57           ` Eli Britstein
2021-06-02 10:50             ` Andrew Rybchenko
2021-06-02 11:21               ` Eli Britstein [this message]
2021-06-02 11:57                 ` Andrew Rybchenko
2021-06-02 12:36                 ` Ivan Malov
2021-06-03  9:18                   ` Ori Kam
2021-06-03  9:55                     ` Andrew Rybchenko
2021-06-07  8:28                       ` Thomas Monjalon
2021-06-07  9:42                         ` Andrew Rybchenko
2021-06-07 12:08                           ` Ori Kam
2021-06-07 13:21                             ` Ilya Maximets
2021-06-07 16:07                               ` Thomas Monjalon
2021-06-08 16:13                                 ` Thomas Monjalon
2021-06-08 16:32                                   ` Andrew Rybchenko
2021-06-08 18:49                                     ` Thomas Monjalon
2021-06-09 14:31                                       ` Andrew Rybchenko
2021-06-01 14:49     ` Ivan Malov
2021-06-01 14:28   ` Ivan Malov
2021-06-02 12:46     ` Ilya Maximets
2021-06-02 16:26       ` Andrew Rybchenko
2021-06-02 17:35         ` Ilya Maximets
2021-06-02 19:35           ` Ivan Malov
2021-06-03  9:29             ` Ilya Maximets
2021-06-03 10:33               ` Andrew Rybchenko
2021-06-03 11:05                 ` Ilya Maximets
2021-06-03 11:29               ` Ivan Malov
2021-06-07 19:27                 ` Ilya Maximets
2021-06-07 20:39                   ` Ivan Malov
2021-06-25 13:04       ` Ferruh Yigit
2021-06-02 12:16   ` Thomas Monjalon
2021-06-02 12:53     ` Ilya Maximets
2021-06-02 13:10     ` Andrew Rybchenko
2021-09-03  7:46 ` [dpdk-dev] [PATCH v1] " Andrew Rybchenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=adaab74c-8e08-4b06-41e9-9cde2ec67e1e@nvidia.com \
    --to=elibr@nvidia.com \
    --cc=ajit.khaparde@broadcom.com \
    --cc=andrew.rybchenko@oktetlabs.ru \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=hyonkim@cisco.com \
    --cc=i.maximets@ovn.org \
    --cc=ivan.malov@oktetlabs.ru \
    --cc=jerinj@marvell.com \
    --cc=johndale@cisco.com \
    --cc=kishore.padmanabha@broadcom.com \
    --cc=orika@nvidia.com \
    --cc=smadarf@marvell.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).