DPDK patches and discussions
From: Ori Kam <orika@mellanox.com>
To: Andrew Rybchenko <arybchenko@solarflare.com>,
	Thomas Monjalon <thomas@monjalon.net>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"pbhagavatula@marvell.com" <pbhagavatula@marvell.com>,
	"ferruh.yigit@intel.com" <ferruh.yigit@intel.com>,
	"jerinj@marvell.com" <jerinj@marvell.com>,
	John McNamara <john.mcnamara@intel.com>,
	Marko Kovacevic <marko.kovacevic@intel.com>,
	Adrien Mazarguil <adrien.mazarguil@6wind.com>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>,
	"ktraynor@redhat.com" <ktraynor@redhat.com>
Subject: Re: [dpdk-dev] [PATCH 1/2] ethdev: add flow action type update as an offload
Date: Tue, 5 Nov 2019 08:35:20 +0000
Message-ID: <AM4PR05MB3425C16041A4243E73ECD11CDB7E0@AM4PR05MB3425.eurprd05.prod.outlook.com>
In-Reply-To: <a8221c73-96c8-a103-5637-7b357d2df219@solarflare.com>



> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Tuesday, November 5, 2019 8:50 AM
> To: Ori Kam <orika@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>
> Cc: dev@dpdk.org; pbhagavatula@marvell.com; ferruh.yigit@intel.com;
> jerinj@marvell.com; John McNamara <john.mcnamara@intel.com>; Marko
> Kovacevic <marko.kovacevic@intel.com>; Adrien Mazarguil
> <adrien.mazarguil@6wind.com>; david.marchand@redhat.com;
> ktraynor@redhat.com
> Subject: Re: [dpdk-dev] [PATCH 1/2] ethdev: add flow action type update as an
> offload
> 
> On 11/4/19 9:37 PM, Ori Kam wrote:
> >> -----Original Message-----
> >> From: Andrew Rybchenko <arybchenko@solarflare.com>
> >> Sent: Sunday, November 3, 2019 1:41 PM
> >> To: Ori Kam <orika@mellanox.com>; Thomas Monjalon
> >> <thomas@monjalon.net>
> >> Cc: dev@dpdk.org; pbhagavatula@marvell.com; ferruh.yigit@intel.com;
> >> jerinj@marvell.com; John McNamara <john.mcnamara@intel.com>; Marko
> >> Kovacevic <marko.kovacevic@intel.com>; Adrien Mazarguil
> >> <adrien.mazarguil@6wind.com>; david.marchand@redhat.com;
> >> ktraynor@redhat.com
> >> Subject: Re: [dpdk-dev] [PATCH 1/2] ethdev: add flow action type update as an
> >> offload
> >>
> >> On 11/3/19 1:22 PM, Ori Kam wrote:
> >>> Hi,
> >>>
> >>>> -----Original Message-----
> >>>> From: dev <dev-bounces@dpdk.org> On Behalf Of Andrew Rybchenko
> >>>> Sent: Friday, November 1, 2019 1:35 PM
> >>>> To: Thomas Monjalon <thomas@monjalon.net>
> >>>> Cc: dev@dpdk.org; Ori Kam <orika@mellanox.com>;
> >>>> pbhagavatula@marvell.com; ferruh.yigit@intel.com; jerinj@marvell.com;
> >>>> John McNamara <john.mcnamara@intel.com>; Marko Kovacevic
> >>>> <marko.kovacevic@intel.com>; Adrien Mazarguil
> >>>> <adrien.mazarguil@6wind.com>; david.marchand@redhat.com;
> >>>> ktraynor@redhat.com
> >>>> Subject: Re: [dpdk-dev] [PATCH 1/2] ethdev: add flow action type update as an
> >>>> offload
> >>>>
> >>>> On 10/31/19 5:49 PM, Thomas Monjalon wrote:
> >>>>> 31/10/2019 10:49, Andrew Rybchenko:
> >>>>>> On 10/28/19 5:00 PM, Ori Kam wrote:
> >>>>>>>> -----Original Message-----
> >>>>>>> From: Andrew Rybchenko <arybchenko@solarflare.com>
> >>>>>>>> On 10/28/19 1:50 PM, Ori Kam wrote:
> >>>>>>>>> Hi Pavan,
> >>>>>>>>>
> >>>>>>>>> Sorry for jumping in late.
> >>>>>>>>>
> >>>>>>>>> I don't understand why we need this feature. If the user didn't set any flow
> >>>>>>>>> with MARK then the user doesn't need to check it.
> >>>>>>>> There is pretty long discussion on the topic already, please, read [1].
> >>>>>>>>
> >>>>>>>> [1] http://inbox.dpdk.org/dev/3251fc00-7598-1c4f-fc2a-380065f0a435@solarflare.com/
> >>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Ori
> >>>>>>>>
> >>>>>>> Thanks for the link, it was an interesting reading.
> >>>>>>>
> >>>>>>>>> Also it breaks compatibility.
> >>>>>>>> Yes, there is a deprecation notice for it.
> >>>>>>>>
> >>>>>>>>> If my understanding is correct the MARK field is going to be moved to
> >>>>>>>>> a dynamic field, and this will be the way to control the use of MARK.
> >>>>>>>> Yes, and I think the offload should be used to request dynamic
> >>>>>>>> field registration. Similar to timestamp in dynamic mbuf examples.
> >>>>>>>> Application requests Rx timestamp offload, PMD registers dynamic
> >>>>>>>> field.
> >>>>>>>>
> >>>>>>> In general it was decided that there will be no capability for rte_flow API,
> >>>>>>> due to the fact that it is impossible to support all possible combinations.
> >>>>>>> For example a PMD can allow mark on Rx while not supporting it on
> >>>>>>> e-switch (transfer) or on Tx.
> >>>>>>> The only way to validate it is validating a flow. If the flow is validated
> >>>>>>> then the action is supported.
> >>>>>>> This is the exact approach we are implementing with the Meta feature.
> >>>>>>> So as I see it, the logic should be something like this:
> >>>>>>> 1. run devconfigure.
> >>>>>>> 2. allocate mempool
> >>>>>>> 3. setup queues.
> >>>>>>> 4. run rte_flow_validate with mark action.
> >>>>>>> If flow validated register mark in mbuf else don't register.
> >>>>>>> If the PMD needs some special setting for mark he can update the queue
> >>>>>>> when he gets the flow to validate.
> >>>>>>> At this stage the device is not started so any change is allowed.
> >>>>>> I understand why there is capability reporting in rte_flow API when
> >>>>>> it is about rte_flow API itself. The problem appears when rte_flow
> >>>>>> API starts to interact with other functionality.
> >>>>>> Which pattern/actions should application try in order to decide
> >>>>>> if MARK is supported or not.
> >>>>> Why application should decide whether MARK is supported or not?
> >>>>> In my understanding it can be enabled dynamically per flow.
> >>>> Yes, it is per flow right now, but it is resource consuming to
> >>>> make a flow rule just to discard it and work without offload.
> >>>> The application already suffers and attempt to use hardware
> >>>> offload makes it suffer even more. Of course, hardware offload
> >>>> in application may be simply globally disabled, but presence of
> >>>> MARK offload allows to do it dynamically based on offload
> >>>> reported by PMD.
> >>>>
> >>>> Also I think that Qi has a good example for vPMD why
> >>>> MARK offload would be useful.
> >>>>
> >>> I don't think that creating a simple flow during startup is resource consuming.
> >> It is not about startup. It is for every rule which will be rejected since
> >> MARK is not supported.
> >>
> > I'm sorry but I don't understand. Like everything else in rte_flow, if flow validation failed,
> > the flow should not be tested again. It is pointless. If the flow got rejected then the application
> > should set its logic accordingly and not use MARK flows.
> > Same for example the pmd doesn't support encap, so after the application tries and fails, it should
> > understand that it must do the encap in SW.
> 
> The problem is that flow validation may fail because of various reasons.
> Maybe the first validation fails because the pattern is not supported, or
> pattern+action is not supported, or pattern+action is not supported because
> another rule is installed right now, but would be supported without that
> rule installed.
> 

You are right, so the application should test with the most basic flow, or with the flow that
best represents what the application wants to do.
Since at this stage it is only validation, there is no problem with conflicting flows.
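
A minimal sketch of that probing logic, under stated assumptions. The types and names below (candidate_flow, validate_fn, mark_probe_run, toy_validate) are hypothetical and not part of DPDK; in a real application the validator callback would wrap rte_flow_validate() for the port. The point is only to show the shape: try the most representative flow first, fall back to the most basic one, and cache the answer so that a rejected MARK rule is not retried later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one candidate rule (pattern + MARK action). */
struct candidate_flow {
    const char *name;   /* label for logging only */
    int pattern_id;     /* opaque handle the validator understands */
};

/*
 * Hypothetical validator: returns 0 when the rule would be accepted and a
 * negative value otherwise. In a DPDK application this would wrap
 * rte_flow_validate() for the given port.
 */
typedef int (*validate_fn)(int port_id, const struct candidate_flow *flow);

enum mark_state { MARK_UNKNOWN = 0, MARK_SUPPORTED, MARK_UNSUPPORTED };

struct mark_probe {
    enum mark_state state;  /* cached result, filled in once */
    int calls;              /* validator calls made, for illustration */
};

/*
 * Probe once, before the port is started. Candidates are ordered from the
 * most representative flow to the most basic one; the first accepted rule
 * settles the question. The result is cached so the probe never runs twice.
 */
static bool
mark_probe_run(struct mark_probe *probe, int port_id, validate_fn validate,
               const struct candidate_flow *flows, size_t n_flows)
{
    assert(probe != NULL && validate != NULL);

    if (probe->state != MARK_UNKNOWN)
        return probe->state == MARK_SUPPORTED;

    probe->state = MARK_UNSUPPORTED;
    for (size_t i = 0; i < n_flows; i++) {
        probe->calls++;
        if (validate(port_id, &flows[i]) == 0) {
            probe->state = MARK_SUPPORTED;
            break;
        }
    }
    return probe->state == MARK_SUPPORTED;
}

/* Toy validator used only to exercise the sketch: accepts pattern 2. */
static int
toy_validate(int port_id, const struct candidate_flow *flow)
{
    (void)port_id;
    return flow->pattern_id == 2 ? 0 : -1;
}
```

If the probe reports no support, the application would skip registering the mark field and would not build MARK rules for that port at all.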

> >>> I think as we move more and more to rte_flow we can't continue using offloads.
> >>> The fact that one PMD doesn't support mark first should be listed in a release notes,
> >>
> >> Release notes are nice, but it is nothing for automated processing.
> >>
> > Agree, this is why it should be tested using rte_flow.
> >
> >>> In Qi's example the application can start with its preferred PMD and test if it supports the mark action,
> >>> if not try other PMD or use some fallback.
> >> Pretty often there is no direct control over PMD to use. It is either vendor specific
> >> or no control at all. PMD choice is the result of requested offloads.
> >>
> > We have different definitions for PMD 😊. If I understand you correctly you mean
> > which function the PMD uses internally, am I correct?
> 
> Yes, like vPMD is vector-based implementation of Rx/Tx routines.
> 
> > If so it is the responsibility of the PMD, when the application tries to validate a flow with mark,
> > to switch the functions / PMD to a PMD that supports this kind of action.
> 
> It is impossible when queues are setup and port is running.
> Flow rules may be and should be installed when traffic is running.
> 

I'm not suggesting changing the PMD while traffic is running. The validation of a flow
can be done before starting traffic.

> > Again think about it: if the user wants meta for example, the PMD should assume that the application wants this
> > kind of action and use the matching PMD.
> >
> >>> Think about it like this, assume that one PMD supports some other
> >>> rte_flow while the second PMD doesn't support it. So the application should decide which is more important
> >>> to it and enable the best PMD.
> >> Again, it is not about filtering only. It is delivery of the extra information which requires
> >> extra processing and extra resources.
> >>
> > Please see my comment above regarding metadata.
> > I understand your point, we have 2 different components, the rte_flow
> > and the mbuf. One is used by HW and one is used by SW.
> > A feature like type is right in the middle. I had related argument with Slava
> > about where to place the dynamic mbuf function for the meta. It was decided
> > it is more rte_flow than mbuf.
> > Also think about it like this, if (and I hope we can) in next generation we will be able to
> > write data to the mbuf from the HW. This will mean it will be only used in rte_flow, and not
> > in the rx_burst function.
> 
> I think it will not make flow mark rte_flow entity only.
> 

But it will be set only through rte_flow and used by the application, so the Tx/Rx functions
play no part in this case.

> >>>>>> The right answer is a pattern/action
> >>>>>> which will be really used, but what to do if there are many
> >>>>>> combinations or if these combinations are not known in advance.
> >>>>>> Minimal? But I easily imagine cases when minimal is not supported,
> >>>>>> but more complex real life patterns are supported.
> >>>>>>
> >>>>>> The main idea behind the offload is that the more you know in advance,
> >>>>>> the more you can optimize without overcomplicating drivers and HW.
> >>>>>>
> >>>>>> In the case of OVS, absence of MARK offload would mean that OVS
> >>>>>> should not even try to use partial offload even if it is enabled.
> >>>>>> So, no efforts are required to try to convert flow into pattern and
> >>>>>> validate the flow rule.
> >>>>> That's an interesting feedback.
> >>>>> I would like to understand why OVS cannot adapt its datapath on
> demand
> >>>>> per port, per queue and per flow?
> >>>> I guess there is a misunderstanding here. What I'm trying to say
> >>>> is that introduction of MARK offload would make code a bit more
> >>>> simple and efficient. Basically it would be possible to enable
> >>>> so-called hardware offload in OVS by default, but finally make
> >>>> a decision per port based on MARK offload availability
> >>>> (should it try to make rte_flow rule by flow and insert it?)
> >>> Like I said the PMD can check if the mark is available in the buff. So it can select the
> >>> best function.
> >> Which buff?
> > Sorry mbuf. I thought it was decided also to move the mark to dynamic mbuf.
> >
> > Just to summarize I'm afraid that we will have more fields like the mark (we already do with the
> > metadata) and we might even get more.
> > I don't want to have capability over rte_flow actions, since it can be very tricky: for example
> > a PMD can support only mark or meta but not both, while others will be able to support both.
> > Some will have limitations on the mark action, some PMDs will not (for example size),
> > so in any case the application must validate it.
> > Or if you want another example, we can support mark both in transfer and normal mode. This is only
> > true from this DPDK version. So what should a PMD that supports mark only on non-transfer flows report?
> 
> MARK offload is just a possibility to deliver MARK from HW to mbuf. All rte_flow specifics
> are out-of-scope. It should be delivered with the packet (or associated with the delivered packet
> in some way).
> 

Yes, but as I said, in the Mellanox PMD for example we supported mark only on non-transfer flows until this release,
so when the user set mark on a transfer flow it was invalid. (In a transfer flow, if we have a miss, we send the packet back to the Rx
port so the application can tell in which table the miss happened.)
In this version we added support for mark in transfer (E-Switch) flows as well.
So my question is: what should the PMD report before this release? What should the PMD report after this release?

Your idea was our first thought when adding the Tx meta. In that case the meta was always set in the application,
so we thought that this offload would enable better function selection, but as you know we removed this capability
since it is no longer correct.



> The above also highlights problems of the meta vs mark design. They are very
> similar and there is no good definition of the difference or rules for which
> one should be used/supported in which conditions.
> 

Mark and Meta are exactly the same; the meta is just another value that the application can use.
This is why both should behave the same.

And maybe this is the winning argument: the rte_flow validation approach was used and accepted for the meta.
So let's try it with the mark as well. (Please also remember that we didn't have this mark until now, so somehow the
PMD worked 😊)

As I said before, I understand your approach, and each approach has its own advantages and drawbacks.
Let's start with the rte_flow approach and see how it goes. I promise you that if I see that it doesn't scale or causes more
issues, I will be the first one to submit changes.
 

> Andrew.
> 

Best,
Ori


Thread overview: 39+ messages
2019-10-25 15:21 pbhagavatula
2019-10-25 15:21 ` [dpdk-dev] [PATCH 2/2] drivers/net: update Rx flow flag and mark capabilities pbhagavatula
2019-10-28 10:50 ` [dpdk-dev] [PATCH 1/2] ethdev: add flow action type update as an offload Ori Kam
2019-10-28 11:53   ` Andrew Rybchenko
2019-10-28 14:00     ` Ori Kam
2019-10-31  9:49       ` Andrew Rybchenko
2019-10-31 14:49         ` Thomas Monjalon
2019-10-31 23:59           ` Zhang, Qi Z
2019-11-01 11:35           ` Andrew Rybchenko
2019-11-03 10:22             ` Ori Kam
2019-11-03 11:41               ` Andrew Rybchenko
2019-11-04 18:37                 ` Ori Kam
2019-11-05  6:50                   ` Andrew Rybchenko
2019-11-05  8:35                     ` Ori Kam [this message]
2019-11-05 11:30                       ` Andrew Rybchenko
2019-11-05 16:37                         ` Ori Kam
2019-11-06  6:40                           ` Andrew Rybchenko
2019-11-06  7:42                             ` Ori Kam
2019-11-08  8:35                               ` Andrew Rybchenko
2019-11-08  9:00                                 ` Tom Barbette
2019-11-08 10:28                                 ` Thomas Monjalon
2019-11-08 10:42                                   ` Andrew Rybchenko
2019-11-08 11:03                                     ` Thomas Monjalon
2019-11-08 11:40                                       ` Zhang, Qi Z
2019-11-08 12:12                                         ` Ori Kam
2019-11-08 12:20                                           ` Andrew Rybchenko
2019-11-08 12:42                                             ` Ori Kam
2019-11-08 13:16                                               ` Zhang, Qi Z
2019-11-08 13:26                                                 ` Thomas Monjalon
2019-11-08 13:06                                         ` Thomas Monjalon
2019-11-08 12:00                                       ` Andrew Rybchenko
2019-11-08 13:17                                         ` Thomas Monjalon
2019-11-08 13:27                                           ` Andrew Rybchenko
2019-11-08 13:30                                             ` Thomas Monjalon
2019-11-19  9:24                                               ` Andrew Rybchenko
2019-11-19  9:50                                                 ` Thomas Monjalon
2019-11-19 10:59                                                   ` Andrew Rybchenko
2019-11-19 11:09                                                     ` Thomas Monjalon
2020-07-03 14:34                                                       ` Ferruh Yigit
