DPDK patches and discussions
From: Andrew Rybchenko <arybchenko@solarflare.com>
To: Ori Kam <orika@mellanox.com>, Thomas Monjalon <thomas@monjalon.net>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"pbhagavatula@marvell.com" <pbhagavatula@marvell.com>,
	"ferruh.yigit@intel.com" <ferruh.yigit@intel.com>,
	"jerinj@marvell.com" <jerinj@marvell.com>,
	"John McNamara" <john.mcnamara@intel.com>,
	Marko Kovacevic <marko.kovacevic@intel.com>,
	Adrien Mazarguil <adrien.mazarguil@6wind.com>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>,
	"ktraynor@redhat.com" <ktraynor@redhat.com>
Subject: Re: [dpdk-dev] [PATCH 1/2] ethdev: add flow action type update as an offload
Date: Tue, 5 Nov 2019 09:50:13 +0300
Message-ID: <a8221c73-96c8-a103-5637-7b357d2df219@solarflare.com> (raw)
In-Reply-To: <AM4PR05MB3425A53840715A9BA01ACB3CDB7F0@AM4PR05MB3425.eurprd05.prod.outlook.com>

On 11/4/19 9:37 PM, Ori Kam wrote:
>> -----Original Message-----
>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>> Sent: Sunday, November 3, 2019 1:41 PM
>> To: Ori Kam <orika@mellanox.com>; Thomas Monjalon
>> <thomas@monjalon.net>
>> Cc: dev@dpdk.org; pbhagavatula@marvell.com; ferruh.yigit@intel.com;
>> jerinj@marvell.com; John McNamara <john.mcnamara@intel.com>; Marko
>> Kovacevic <marko.kovacevic@intel.com>; Adrien Mazarguil
>> <adrien.mazarguil@6wind.com>; david.marchand@redhat.com;
>> ktraynor@redhat.com
>> Subject: Re: [dpdk-dev] [PATCH 1/2] ethdev: add flow action type update as an
>> offload
>> On 11/3/19 1:22 PM, Ori Kam wrote:
>>> Hi,
>>>> -----Original Message-----
>>>> From: dev <dev-bounces@dpdk.org> On Behalf Of Andrew Rybchenko
>>>> Sent: Friday, November 1, 2019 1:35 PM
>>>> To: Thomas Monjalon <thomas@monjalon.net>
>>>> Cc: dev@dpdk.org; Ori Kam <orika@mellanox.com>;
>>>> pbhagavatula@marvell.com; ferruh.yigit@intel.com; jerinj@marvell.com;
>> John
>>>> McNamara <john.mcnamara@intel.com>; Marko Kovacevic
>>>> <marko.kovacevic@intel.com>; Adrien Mazarguil
>>>> <adrien.mazarguil@6wind.com>; david.marchand@redhat.com;
>>>> ktraynor@redhat.com
>>>> Subject: Re: [dpdk-dev] [PATCH 1/2] ethdev: add flow action type update as
>> an
>>>> offload
>>>> On 10/31/19 5:49 PM, Thomas Monjalon wrote:
>>>>> 31/10/2019 10:49, Andrew Rybchenko:
>>>>>> On 10/28/19 5:00 PM, Ori Kam wrote:
>>>>>>>> -----Original Message-----
>>>>>>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>>>>>>>> On 10/28/19 1:50 PM, Ori Kam wrote:
>>>>>>>>> Hi Pavan,
>>>>>>>>> Sorry for jumping in late.
>>>>>>>>> I don't understand why we need this feature. If the user didn't set any
>>>> flow
>>>>>>>> with MARK
>>>>>>>>> then the user doesn't need to check it.
>>>>>>>> There is pretty long discussion on the topic already, please, read [1].
>>>>>>>> [1] http://inbox.dpdk.org/dev/3251fc00-7598-1c4f-fc2a-380065f0a435@solarflare.com/
>>>>>>>>> Best,
>>>>>>>>> Ori
>>>>>>> Thanks for the link, it was an interesting reading.
>>>>>>>>> Also it breaks compatibility.
>>>>>>>> Yes, there is a deprecation notice for it.
>>>>>>>>> If my understanding is correct the MARK field is going to be moved to
>>>>>>>> dynamic field, and this
>>>>>>>>> will be way to control the use of MARK.
>>>>>>>> Yes, and I think the offload should be used to request dynamic
>>>>>>>> field registration. Similar to timestamp in dynamic mbuf examples:
>>>>>>>> the application requests the Rx timestamp offload, and the PMD
>>>>>>>> registers the dynamic field.
>>>>>>> In general it was decided that there will be no capability for rte_flow
>> API,
>>>> due to the fact that
>>>>>>> it is impossible to support all possible combinations. For example a PMD
>>>> can allow mark on Rx
>>>>>>> while not supporting it on e-switch (transfer) or on Tx.
>>>>>>> The only way to validate it is validating a flow. If the flow is validated
>> then
>>>> the action is supported.
>>>>>>> This is the exact approach we are implementing with the Meta feature.
>>>>>>> So as I see it, the logic should be something like this:
>>>>>>> 1. run devconfigure.
>>>>>>> 2. allocate mempool
>>>>>>> 3. setup queues.
>>>>>>> 4. run rte_flow_validate with mark action.
>>>>>>> If flow validated register mark in mbuf else don't register.
>>>>>>> If the PMD needs some special setting for mark he can update the queue
>>>> when he gets the flow to validate.
>>>>>>> At this stage the device is not started so any change is allowed.
>>>>>> I understand why there is capability reporting in rte_flow API when
>>>>>> it is about rte_flow API itself. The problem appears when rte_flow
>>>>>> API starts to interact with other functionality.
>>>>>> Which pattern/actions should application try in order to decide
>>>>>> if MARK is supported or not.
>>>>> Why application should decide whether MARK is supported or not?
>>>>> In my understanding it can be enabled dynamically per flow.
>>>> Yes, it is per flow right now, but it is resource consuming to
>>>> make a flow rule just to discard it and work without offload.
>>>> The application already suffers and attempt to use hardware
>>>> offload makes it suffer even more. Of course, hardware offload
>>>> in application may be simply globally disabled, but presence of
>>>> MARK offload allows to do it dynamically based on offload
>>>> reported by PMD.
>>>> Also I think that Qi has a good example for vPMD why
>>>> MARK offload would be useful.
>>> I don't think that creating a simple flow during startup is resource consuming.
>> It is not about startup. It is for every rule which will be rejected since
>> MARK is not supported.
> I'm sorry but I don't understand. Like everything else in rte_flow, if flow validation failed,
> the flow should not be tested again. It is pointless. If the flow got rejected then the application
> should set it's logic accordingly and not use MARK flows.
> Same for example the pmd doesn't support encap, so after the application tries and fails, it should
> understand that it must do the encap in SW.

The problem is that flow validation may fail for various reasons.
The first validation may fail because the pattern is not supported,
because the pattern+action combination is not supported, or because
pattern+action conflicts with another rule that is installed right now
but would be supported without that rule.
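
To make the point concrete, here is a minimal, self-contained sketch in C.
It does not use the real DPDK API: struct toy_port, toy_flow_validate()
and the verdict names are hypothetical, invented only for illustration.
It models a device that can deliver MARK but still rejects a
"pattern + MARK" rule for unrelated reasons, so a single failed probe
says little about MARK support itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reasons a toy validation call can reject a rule. */
enum toy_verdict {
	TOY_OK = 0,
	TOY_PATTERN_UNSUPPORTED, /* the pattern alone is not supported */
	TOY_COMBO_UNSUPPORTED,   /* pattern + MARK is never supported */
	TOY_CONFLICTS_WITH_RULE, /* rejected only because of an installed rule */
};

/* Hypothetical device model; this is not a DPDK structure. */
struct toy_port {
	bool mark_supported;       /* can the HW deliver MARK at all? */
	bool vlan_match_supported; /* can it match on VLAN? */
	bool conflicting_rule;     /* is a rule installed that blocks IPv4 + MARK? */
};

enum toy_pattern { TOY_PAT_VLAN, TOY_PAT_IPV4 };

/* Validate "pattern + MARK action" against the toy device. */
static enum toy_verdict
toy_flow_validate(const struct toy_port *port, enum toy_pattern pat)
{
	assert(port != NULL);
	if (pat == TOY_PAT_VLAN && !port->vlan_match_supported)
		return TOY_PATTERN_UNSUPPORTED;
	if (!port->mark_supported)
		return TOY_COMBO_UNSUPPORTED;
	if (pat == TOY_PAT_IPV4 && port->conflicting_rule)
		return TOY_CONFLICTS_WITH_RULE;
	return TOY_OK;
}

/* Naive probe: treats any validation failure as "MARK is not supported". */
static bool
toy_naive_mark_probe(const struct toy_port *port, enum toy_pattern pat)
{
	return toy_flow_validate(port, pat) == TOY_OK;
}
```

With mark_supported = true, vlan_match_supported = false and
conflicting_rule = true, the naive probe reports "no MARK" for both
patterns, although the toy device delivers MARK fine once the
conflicting rule is removed.
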

>>> I think as we move more and more to rte_flow we can't continue using
>> offloads.
>>> The fact that one PMD doesn't support mark first should be listed in a release
>> notes,
>> Release notes are nice, but it is nothing for automated processing.
> Agree, this is why is should be tested using rte_flow.
>>> In Qi example the application can start with it's preferred PMD and test if its
>> support the mark action,
>>> if not try other PMD or use some fallback.
>> Pretty often there is no direct control over PMD to use. It is either
>> vendor specific
>> or no control at all. PMD choice is the result of requested offloads.
> We have different definition for PMD 😊. If I understand you correctly you mean
> which function the PMD uses internally, am I correct?

Yes, e.g. vPMD is a vector-based implementation of the Rx/Tx routines.

> If so it is the responsibility of the PMD that when application tries to validate a flow with mark
> to switch the functions / PMD to a PMD that supports this kind of action.

It is impossible once queues are set up and the port is running.
Flow rules may be, and should be, installed while traffic is running.

> Again think about it if the user want meta for example, the PMD should assume that the application wants this
> kind of action and use the matching PMD.  
>>> Think about it like this, assume that one PMD support some other
>>> rte_flow while the second PMD doesn't support it. so the application should
>> decide which is more important
>>> to it and enable the best PMD.
>> Again, it is not about filtering only. It is delivery of the extra
>> information which requires
>> extra processing and extra resources.
> Please see my comment above regarding metadata.
> I understand your point, we have 2 different components, the rte_flow
> and the mbuf. One is used by HW and one is used by SW.
> A feature like type is right in the middle. I had related argument with Slava
> about where to place the dynamic mbuf function for the meta. It was decided
> it is more rte_flow than mbuf.
> Also think about it like this, if (and I hope we can) in next generation we will be able to 
> write data to the mbuf from the HW. This will mean it will be only used in rte_flow, and not 
> in the rx_burst function. 

I think that will not make flow mark an rte_flow-only entity.

>>>>>> The right answer is a pattern/action
>>>>>> which will be really used, but what to do if there are many
>>>>>> combinations or if these combinations are not know in advance.
>>>>>> Minimal? But I easily imagine cases when minimal is not supported,
>>>>>> but more complex real life patterns are supported.
>>>>>> The main idea behind the offload is as much as you know in advance
>>>>>> as much you can optimize without overcomplicating drivers and HW.
>>>>>> In the case of OVS, absence MARK offload would mean that OVS
>>>>>> should not even try to use partial offload even if it is enabled.
>>>>>> So, no efforts are required to try to convert flow into pattern and
>>>>>> validate the flow rule.
>>>>> That's an interesting feedback.
>>>>> I would like to understand why OVS cannot adapt its datapath on demand
>>>>> per port, per queue and per flow?
>>>> I guess there is a misunderstanding here. What I'm trying to say
>>>> is that introduction of MARK offload would make code a bit more
>>>> simple and efficient. Basically it would be possible to enable
>>>> so-called hardware offload in OVS by default, but finally make
>>>> a decision per port based on MARK offload availability
>>>> (should it try to make rte_flow rule by flow and insert it?)
>>> Like I said the PMD can check if it mark is avail in the buff. So he can selected
>> the
>>> best function.
>> Which buff?
> Sorry mbuf. I thought it was decided also to move the mark to dynamic mbuf.
> Just to summarize I'm afraid that we will have more fields like the mark (we already do with the 
> metadata) and we might even get more.
> I don't want to have capability over rte_flow actions, since it can be very tricky for example 
> PMD can support only mark or meta but not both, while other will be able to support both.
> some will have limitation on the mark action some PMD will not (for example size)
> so in any case the application must validate it.
> Or if you want other example we can support mark both in transfer and normal mode. This is only
> true from this DPDK version. So what should PMD that supports mark only on non transfer flows report?

The MARK offload is just the ability to deliver MARK from HW to the
mbuf. All rte_flow specifics are out of scope. The mark should be
delivered with the packet (or associated with the delivered packet
in some way).

The above also highlights problems with the meta vs mark design. They are
very similar, and there is no good definition of the difference or of
rules for which one should be used/supported under which conditions.
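
As a rough illustration of how an application could use a MARK offload
bit, here is a minimal sketch in C. All names are hypothetical:
TOY_RX_OFFLOAD_FLOW_MARK is not an existing DPDK macro, and struct
toy_dev_info / struct toy_port_cfg only mimic the idea of reported and
requested Rx offloads. The decision is made once per port, before
queues are set up, instead of once per rejected flow rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offload bit: "PMD can deliver MARK from HW to the mbuf". */
#define TOY_RX_OFFLOAD_FLOW_MARK (UINT64_C(1) << 0)

/* Hypothetical stand-in for the capabilities a PMD reports. */
struct toy_dev_info {
	uint64_t rx_offload_capa;
};

/* Hypothetical per-port decisions made by the application. */
struct toy_port_cfg {
	uint64_t rx_offloads;   /* offloads requested at configure time */
	bool try_hw_flow_rules; /* should the app build MARK rules at all? */
};

/*
 * Decide once per port whether MARK-based partial offload is worth
 * attempting. Without the offload, the application never spends time
 * converting flows into rules that would be rejected anyway.
 */
static void
toy_configure_port(const struct toy_dev_info *info, bool hw_offload_enabled,
		   struct toy_port_cfg *cfg)
{
	assert(info != NULL && cfg != NULL);
	cfg->rx_offloads = 0;
	cfg->try_hw_flow_rules = false;
	if (hw_offload_enabled &&
	    (info->rx_offload_capa & TOY_RX_OFFLOAD_FLOW_MARK) != 0) {
		cfg->rx_offloads |= TOY_RX_OFFLOAD_FLOW_MARK;
		cfg->try_hw_flow_rules = true;
	}
}
```

Individual rules may still be rejected later for pattern or resource
reasons; the bit only answers whether delivery of MARK to the mbuf is
possible on this port.
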

