DPDK patches and discussions
 help / color / mirror / Atom feed
From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
To: Eli Britstein <elibr@nvidia.com>,
	Ilya Maximets <i.maximets@ovn.org>,
	Ivan Malov <ivan.malov@oktetlabs.ru>,
	dev@dpdk.org
Cc: Smadar Fuks <smadarf@marvell.com>,
	Hyong Youb Kim <hyonkim@cisco.com>,
	Kishore Padmanabha <kishore.padmanabha@broadcom.com>,
	Ori Kam <orika@nvidia.com>,
	Ajit Khaparde <ajit.khaparde@broadcom.com>,
	Jerin Jacob <jerinj@marvell.com>, John Daley <johndale@cisco.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	Ferruh Yigit <ferruh.yigit@intel.com>
Subject: Re: [dpdk-dev] [RFC PATCH] ethdev: clarify flow action PORT ID semantics
Date: Wed, 2 Jun 2021 13:50:43 +0300	[thread overview]
Message-ID: <4447be7e-ce2a-a7a0-68e3-1de22a46cf80@oktetlabs.ru> (raw)
In-Reply-To: <4e53dced-2e88-3fde-f2b7-cb2e1368c1c8@nvidia.com>

On 6/2/21 12:57 PM, Eli Britstein wrote:
> 
> On 6/1/2021 5:53 PM, Andrew Rybchenko wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> On 6/1/21 5:44 PM, Eli Britstein wrote:
>>> On 6/1/2021 5:35 PM, Andrew Rybchenko wrote:
>>>> External email: Use caution opening links or attachments
>>>>
>>>>
>>>> On 6/1/21 4:24 PM, Eli Britstein wrote:
>>>>> On 6/1/2021 3:10 PM, Ilya Maximets wrote:
>>>>>> External email: Use caution opening links or attachments
>>>>>>
>>>>>>
>>>>>> On 6/1/21 1:14 PM, Ivan Malov wrote:
>>>>>>> By its very name, action PORT_ID means that packets hit an ethdev
>>>>>>> with the
>>>>>>> given DPDK port ID. At least the current comments don't state the
>>>>>>> opposite.
>>>>>>> That said, since port representors had been adopted, applications
>>>>>>> like OvS
>>>>>>> have been misusing the action. They misread its purpose as sending
>>>>>>> packets
>>>>>>> to the opposite end of the "wire" plugged to the given ethdev, for
>>>>>>> example,
>>>>>>> redirecting packets to the VF itself rather than to its representor
>>>>>>> ethdev.
>>>>>>> Another example: OvS relies on this action with the admin PF's
>>>>>>> ethdev
>>>>>>> port
>>>>>>> ID specified in it in order to send offloaded packets to the
>>>>>>> physical
>>>>>>> port.
>>>>>>>
>>>>>>> Since there might be applications which use this action in its valid
>>>>>>> sense,
>>>>>>> one can't just change the documentation to greenlight the opposite
>>>>>>> meaning.
>>>>>>> This patch adds an explicit bit to the action configuration which
>>>>>>> will let
>>>>>>> applications, depending on their needs, leverage the two meanings
>>>>>>> properly.
>>>>>>> Applications like OvS, as well as PMDs, will have to be corrected
>>>>>>> when the
>>>>>>> patch has been applied. But the improved clarity of the action is
>>>>>>> worth it.
>>>>>>>
>>>>>>> The proposed change is not the only option. One could avoid changes
>>>>>>> in OvS
>>>>>>> and PMDs if the new configuration field had the opposite meaning,
>>>>>>> with the
>>>>>>> action itself meaning delivery to the represented port and not to
>>>>>>> DPDK one.
>>>>>>> Alternatively, one could define a brand new action with the said
>>>>>>> behaviour.
>>>>> It doesn't make any sense to attach the VF itself to OVS, but only its
>>>>> representor.
>>>> OvS is not the only DPDK application.
>>> True. It is just the focus of this commit message is OVS.
>>>>> For the PF, when in switchdev mode, it is the "uplink representor", so
>>>>> it is also a representor.
>>>> Strictly speaking it is not a representor from DPDK point of
>>>> view. E.g. representors have corresponding flag set which is
>>>> definitely clear in the case of PF.
>>> This is the per-PMD responsibility. The API should not care.
>>>>> That said, OVS does not care of the type of the port. It doesn't
>>>>> matter
>>>>> if it's an "upstream" or not, or if it's a representor or not.
>>>> Yes, it is clear, but let's put OvS aside. Let's consider a
>>>> DPDK application which has a number of ethdev port. Some may
>>>> belong to single switch domain, some may be from different
>>>> switch domains (i.e. different NICs). Can I use PORT_ID action
>>>> to redirect ingress traffic to a specified ethdev port using
>>>> PORT_ID action? It looks like no, but IMHO it is the definition
>>>> of the PORT_ID action.
>>> Let's separate API from implementation. By API point of view, yes, the
>>> user may request it. Nothing wrong with it.
>>>
>>>  From implementation point of view - yes, it might fail, but not for
>>> sure, even if on different NICs. Maybe the HW of a certain vendor has
>>> the capability to do it?
>>>
>>> We can't know, so I think the API should allow it.
>> Hold on. What should it allow? It is two opposite meanings:
>>   1. Direct traffic to DPDK ethdev port specified using ID to be
>>      received and processed by the DPDK application.
>>   2. Direct traffic to an upstream port represented by the
>>      DPDK port.
>>
>> The patch tries to address the ambiguity, misuse it in OvS
>> (from my point of view in accordance with the action
>> documentation), mis-implementation in a number of PMDs
>> (to work in OvS) and tries to sort it out with an explanation
>> why proposed direction is chosen. I realize that it could be
>> painful, but IMHO it is the best option here. Yes, it is a
>> point to discuss.
>>
>> To start with we should agree that that problem exists.
>> Second, we should agree on direction how to solve it.
> 
> I agree. Suppose port 0 is the PF, and port 1 is a VF representor.
> 
> IIUC, there are two options:
> 
> 1. flow create 1 ingress transfer pattern eth / end action port_id id 0
> upstream 1 / end
> 
> 2. flow create 1 ingress transfer pattern eth / end action port_id id 0
> upstream 0 / end
> 
> [1] is the same behavior as today.
> 
> [2] is a new behavior, the packet received by port 0 as if it arrived
> from the wire.
> 
> Then, let's have more:
> 
> 3. flow create 0 ingress transfer pattern eth / end action port_id id 1
> upstream 1 / end
> 
> 4. flow create 0 ingress transfer pattern eth / end action port_id id 1
> upstream 0 / end
> 
> if we have [2] and [4], the packet going from the VF will hit [2], then
> hit [4] and then [2] again in an endless loop?

As I understand PORT_ID is a fate action. So, no more lookups
are done. If the packet is loop back from applications, loop is
possible.

In fact, it is a good question if "flow creare 0 ingress
transfer" or "flow create 1 ingress transfer" assume any
implicit filtering. I always thought that no.
i.e. if we have two network ports rule like
  flow create 0 ingress transfer pattern eth / end \
       action port_id id 1 upstream 1 / end
will match packets incoming from any port into the switch
(network port 0, network port 1, VF or PF itself (???)).
The topic also requires explicit clarification.
PF itself is really a hard question because of "ingress"
since traffic from PF is a traffic from DPDK application and
it is egress, not ingress.

I think that port ID used to created flow rule should not
apply any filtering in the case of transfer since we have
corresponding items to do it explicitly. If we do it implicitly
as well, we need some priorities and a way to avoid implicit
rules which makes things much harder to understand and
implement.

> If this is your meaning, maybe what you are looking for is an action to
> change the in_port and continue processing?
> 
> Please comment on the examples I gave or clarify the use case you are
> trying to do.
> 
> 
> Thanks,
> 
> Eli
> 
>>
>>>>>> We had already very similar discussions regarding the
>>>>>> understanding of
>>>>>> what
>>>>>> the representor really is from the DPDK API's point of view, and the
>>>>>> last
>>>>>> time, IIUC, it was concluded by a tech. board that representor
>>>>>> should be
>>>>>> a "ghost of a VF", i.e. DPDK APIs should apply configuration by
>>>>>> default to
>>>>>> VF and not to the representor device:
>>>>>>
>>>>>> https://patches.dpdk.org/project/dpdk/cover/20191029185051.32203-1-thomas@monjalon.net/#104376
>>>>>>
>>>>>>
>>>>>>
>>>>>> This wasn't enforced though, IIUC, for existing code and semantics is
>>>>>> still mixed.
>>>>> I am not sure how this is related.
>>>>>> I still think that configuration should be applied to VF, and the
>>>>>> same
>>>>>> applies
>>>>>> to rte_flow API.  IMHO, average application should not care if
>>>>>> device is
>>>>>> a VF itself or its representor.  Everything should work exactly the
>>>>>> same.
>>>>>> I think this matches with the original idea/design of the switchdev
>>>>>> functionality
>>>>>> in the linux kernel and also matches with how the average user thinks
>>>>>> about
>>>>>> representor devices.
>>>>> Right. This is the way representors work. It is fully aligned with
>>>>> configuration of OVS-kernel.
>>>>>> If some specific use-case requires to distinguish VF from the
>>>>>> representor,
>>>>>> there should probably be a separate special API/flag for that.
>>>>>>
>>>>>> Best regards, Ilya Maximets.


  reply	other threads:[~2021-06-02 10:50 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-01 11:14 Ivan Malov
2021-06-01 12:10 ` Ilya Maximets
2021-06-01 13:24   ` Eli Britstein
2021-06-01 14:35     ` Andrew Rybchenko
2021-06-01 14:44       ` Eli Britstein
2021-06-01 14:50         ` Ivan Malov
2021-06-01 14:53         ` Andrew Rybchenko
2021-06-02  9:57           ` Eli Britstein
2021-06-02 10:50             ` Andrew Rybchenko [this message]
2021-06-02 11:21               ` Eli Britstein
2021-06-02 11:57                 ` Andrew Rybchenko
2021-06-02 12:36                 ` Ivan Malov
2021-06-03  9:18                   ` Ori Kam
2021-06-03  9:55                     ` Andrew Rybchenko
2021-06-07  8:28                       ` Thomas Monjalon
2021-06-07  9:42                         ` Andrew Rybchenko
2021-06-07 12:08                           ` Ori Kam
2021-06-07 13:21                             ` Ilya Maximets
2021-06-07 16:07                               ` Thomas Monjalon
2021-06-08 16:13                                 ` Thomas Monjalon
2021-06-08 16:32                                   ` Andrew Rybchenko
2021-06-08 18:49                                     ` Thomas Monjalon
2021-06-09 14:31                                       ` Andrew Rybchenko
2021-06-01 14:49     ` Ivan Malov
2021-06-01 14:28   ` Ivan Malov
2021-06-02 12:46     ` Ilya Maximets
2021-06-02 16:26       ` Andrew Rybchenko
2021-06-02 17:35         ` Ilya Maximets
2021-06-02 19:35           ` Ivan Malov
2021-06-03  9:29             ` Ilya Maximets
2021-06-03 10:33               ` Andrew Rybchenko
2021-06-03 11:05                 ` Ilya Maximets
2021-06-03 11:29               ` Ivan Malov
2021-06-07 19:27                 ` Ilya Maximets
2021-06-07 20:39                   ` Ivan Malov
2021-06-25 13:04       ` Ferruh Yigit
2021-06-02 12:16   ` Thomas Monjalon
2021-06-02 12:53     ` Ilya Maximets
2021-06-02 13:10     ` Andrew Rybchenko
2021-09-03  7:46 ` [dpdk-dev] [PATCH v1] " Andrew Rybchenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4447be7e-ce2a-a7a0-68e3-1de22a46cf80@oktetlabs.ru \
    --to=andrew.rybchenko@oktetlabs.ru \
    --cc=ajit.khaparde@broadcom.com \
    --cc=dev@dpdk.org \
    --cc=elibr@nvidia.com \
    --cc=ferruh.yigit@intel.com \
    --cc=hyonkim@cisco.com \
    --cc=i.maximets@ovn.org \
    --cc=ivan.malov@oktetlabs.ru \
    --cc=jerinj@marvell.com \
    --cc=johndale@cisco.com \
    --cc=kishore.padmanabha@broadcom.com \
    --cc=orika@nvidia.com \
    --cc=smadarf@marvell.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).