DPDK patches and discussions
From: Ori Kam <orika@nvidia.com>
To: Stephen Hemminger <stephen@networkplumber.org>,
	Ivan Malov <ivan.malov@oktetlabs.ru>
Cc: "NBU-Contact-Thomas Monjalon (EXTERNAL)" <thomas@monjalon.net>,
	"NBU-Contact-Adrien Mazarguil (EXTERNAL)"
	<adrien.mazarguil@6wind.com>, "dev@dpdk.org" <dev@dpdk.org>,
	Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Subject: RE: Understanding Flow API action RSS
Date: Sun, 9 Jan 2022 12:23:49 +0000
Message-ID: <MW2PR12MB466610FF3790168E10C462D0D64F9@MW2PR12MB4666.namprd12.prod.outlook.com>
In-Reply-To: <20220104135612.4e5c8143@hermes.local>

Hi Stephen and Ivan,

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, January 4, 2022 11:56 PM
> Subject: Re: Understanding Flow API action RSS
> 
> On Tue, 4 Jan 2022 21:29:14 +0300 (MSK)
> Ivan Malov <ivan.malov@oktetlabs.ru> wrote:
> 
> > Hi Stephen,
> >
> > On Tue, 4 Jan 2022, Stephen Hemminger wrote:
> >
> > > On Tue, 04 Jan 2022 13:41:55 +0100
> > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > >> +Cc Ori Kam, rte_flow maintainer
> > >>
> > >> 29/12/2021 15:34, Ivan Malov:
> > >>> Hi all,
> > >>>
> > >>> In 'rte_flow.h', there is 'struct rte_flow_action_rss'. In it, 'queue' is
> > >>> to provide "Queue indices to use". But it is unclear whether the order of
> > >>> elements is meaningful or not. Does that matter? Can queue indices repeat?
> > >
> > > The order probably doesn't matter, it is like the RSS indirection table.
> >
> > Sorry, but RSS indirection table (RETA) assumes some structure. In it,
> > queue indices can repeat, and the order is meaningful. In DPDK, RETA
> > may comprise multiple "groups", each one comprising 64 entries.
> >
> > This 'queue' array in flow action RSS does not stick with the same
> > terminology, it does not reuse the definition of RETA "group", etc.
> > Just "queue indices to use". No definition of order, no structure.
> >
> > The API contract is not clear. Neither to users, nor to PMDs.
> >
From the API's point of view, the queues in the RSS action are simply queue IDs; the order doesn't matter.
Duplicating a queue may affect the spread, depending on the HW/PMD.
In the common case each queue should appear only once, and the PMD may duplicate
entries to get the best performance.
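
As a rough illustration of the duplication mentioned above (a sketch only, not
taken from any PMD): a driver could expand the action's queue list into a
fixed-size indirection table by repeating it round-robin, and then apply the
lookup formula quoted in this thread. The helper names `fill_reta` and
`reta_lookup` and the round-robin policy are assumptions made for this example;
real HW/PMDs may do something else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a DPDK API: expand the 'queue' array of an RSS
 * action into an indirection table of 'reta_size' entries by repeating the
 * queue list round-robin. Returns 0 on success, -1 on bad arguments.
 */
static int
fill_reta(uint16_t *reta, size_t reta_size,
	  const uint16_t *queues, size_t queue_num)
{
	size_t i;

	if (reta == NULL || queues == NULL || reta_size == 0 || queue_num == 0)
		return -1;

	for (i = 0; i < reta_size; i++)
		reta[i] = queues[i % queue_num];

	return 0;
}

/* Lookup as quoted in this thread: table[hash % table_size]. */
static uint16_t
reta_lookup(const uint16_t *reta, size_t reta_size, uint32_t hash)
{
	assert(reta != NULL && reta_size != 0);

	return reta[hash % reta_size];
}
```

With queues {3, 5, 7} and an 8-entry table, this fills the table as
3, 5, 7, 3, 5, 7, 3, 5, so the spread is not perfectly even when the table size
is not a multiple of the number of queues.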

> > >
> > >    rx queue = RSS_indirection_table[ RSS_hash_value % RSS_indirection_table_size ]
> > >
> > > So you could play with multiple queues matching same hash value, but that
> > > would be uncommon.
> > >
> > >>> An ethdev may have "global" RSS setting with an indirection table of some
> > >>> fixed size (say, 512). In what comes to flow rules, does that size matter?
> > >
> > > Global RSS is only used if the incoming packet does not match any rte_flow
> > > > action. If there is a RTE_FLOW_ACTION_TYPE_QUEUE or RTE_FLOW_ACTION_TYPE_RSS
> > > these take precedence.
> >
> > Yes, I know all of that. The question is how does the PMD select RETA size
> > for this action? Can it select an arbitrary value? Or should it stick with
> > the "global" one (eg. 512)? How does the user know the table size?
> >
> > If the user simply wants to spread traffic across the given queues,
> > the effective table size is a don't care to them, and the existing
> > API contract is fine. But if the user expects that certain packets
> > hit some precise queues, they need to know the table size for that.
> >
Just as you said, RSS simply spreads the traffic across the given queues.
If the application wants to send traffic to a specific queue, it should use the queue action.
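
A small model of why RSS is a poor fit for precise steering (a sketch under
stated assumptions, not how any particular PMD behaves): if the driver repeats
the queue list round-robin over its table and picks table[hash % table_size],
the queue chosen for a given hash depends on a table size the application does
not control. The function name `rss_queue_for_hash` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model, not a DPDK API. Assumes the PMD repeats the action's
 * queue list round-robin over a table of 'reta_size' entries and then picks
 * table[hash % reta_size], as in the formula quoted in this thread.
 */
static uint16_t
rss_queue_for_hash(const uint16_t *queues, size_t queue_num,
		   size_t reta_size, uint32_t hash)
{
	size_t slot;

	assert(queues != NULL && queue_num != 0 && reta_size != 0);

	slot = hash % reta_size;         /* indirection table slot */
	return queues[slot % queue_num]; /* queue that slot was filled with */
}
```

With queues {3, 5, 7} and hash value 100, this model selects queue 3 for a
64-entry table and queue 5 for a 512-entry table. The queue action has no such
dependency, since it names the target queue directly.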

> > So, the question is whether the users should or should not build
> > any expectations of the effective table size and, if they should,
> > are they supposed to use the "global" table size for that?
> 
> You are right this area is completely undocumented. Personally would really like
> it if rte_flow had a reference software implementation and all the HW vendors
> had to make sure their HW matched the SW reference version. But this is a case
> where the funding is all on the HW side, and no one has time or resources
> to do a complete SW version.
> 
> A sane implementation would configure RSS indirection across all
> rx queues that were available when the device was started; ie all queues
> that did not have deferred start set. Then the application would start/stop
> queues and use rte_flow to reach them.
> 
> But it doesn't appear the HW follows that model.
> 
> 
> > >>> When the user selects 'RTE_ETH_HASH_FUNCTION_DEFAULT' in action RSS, does
> > >>> that allow the PMD to configure an arbitrary, non-Toeplitz hash algorithm?
> > >
> > > > No, the default is always Toeplitz.  This goes back to the original definition
> > > of RSS which is in Microsoft NDIS and uses Toeplitz.
> >
> > Then why have a dedicated enum named TOEPLITZ? Also, once again, the
> > documentation should be more specific to say which algorithm exactly
> > this DEFAULT choice provides. Otherwise, it is very vague.
> >
> > >
> > > DPDK should have more examples of using rte_flow, I have some samples
> > > but they aren't that useful.
> > >
> >
> > I could not agree more.

Feel free to add or suggest which examples are missing.

> >
> > Thanks,
> > Ivan M.

Best,
Ori
