From: Ori Kam <orika@mellanox.com>
To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Cc: Andrew Rybchenko <arybchenko@solarflare.com>,
Dekel Peled <dekelp@mellanox.com>,
"ferruh.yigit@intel.com" <ferruh.yigit@intel.com>,
"john.mcnamara@intel.com" <john.mcnamara@intel.com>,
"marko.kovacevic@intel.com" <marko.kovacevic@intel.com>,
Asaf Penso <asafp@mellanox.com>, Matan Azrad <matan@mellanox.com>,
Eli Britstein <elibr@mellanox.com>, "dev@dpdk.org" <dev@dpdk.org>,
Ivan Malov <Ivan.Malov@oktetlabs.ru>
Subject: Re: [dpdk-dev] [RFC] ethdev: add fragment attribute to IPv6 item
Date: Wed, 3 Jun 2020 08:16:11 +0000 [thread overview]
Message-ID: <AM6PR05MB51765D970F62CB461DEFA210DB880@AM6PR05MB5176.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <20200602190410.GU26320@6wind.com>
Hi Adrien,
Great to hear from you again.
> -----Original Message-----
> From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Sent: Tuesday, June 2, 2020 10:04 PM
> To: Ori Kam <orika@mellanox.com>
> Cc: Andrew Rybchenko <arybchenko@solarflare.com>; Dekel Peled
> <dekelp@mellanox.com>; ferruh.yigit@intel.com; john.mcnamara@intel.com;
> marko.kovacevic@intel.com; Asaf Penso <asafp@mellanox.com>; Matan Azrad
> <matan@mellanox.com>; Eli Britstein <elibr@mellanox.com>; dev@dpdk.org;
> Ivan Malov <Ivan.Malov@oktetlabs.ru>
> Subject: Re: [RFC] ethdev: add fragment attribute to IPv6 item
>
> Hi Ori, Andrew, Delek,
>
> (been a while eh?)
>
> On Tue, Jun 02, 2020 at 06:28:41PM +0000, Ori Kam wrote:
> > Hi Andrew,
> >
> > PSB,
> [...]
> > > > diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > > > index b0e4199..3bc8ce1 100644
> > > > --- a/lib/librte_ethdev/rte_flow.h
> > > > +++ b/lib/librte_ethdev/rte_flow.h
> > > > @@ -787,6 +787,8 @@ struct rte_flow_item_ipv4 {
> > > > */
> > > > struct rte_flow_item_ipv6 {
> > > > struct rte_ipv6_hdr hdr; /**< IPv6 header definition. */
> > > > + uint32_t is_frag:1; /**< Is IPv6 packet fragmented/non-fragmented. */
> > > > + uint32_t reserved:31; /**< Reserved, must be zero. */
> > >
> > > The solution is simple, but hardly generic and adds an
> > > example for the future extensions. I doubt that it is a
> > > right way to go.
> > >
> > I agree with you that this is not the most generic way possible,
> > but the IPV6 extensions are very unique. So the solution is also unique.
> > In general, I'm always in favor of finding the most generic way, but
> sometimes
> > it is better to keep things simple, and see how it goes.
>
> Same feeling here, it doesn't look right.
>
> > > May be we should add 256-bit string with one bit for each
> > > IP protocol number and apply it to extension headers only?
> > > If bit A is set in the mask:
> > > - if bit A is set in spec as well, extension header with
> > > IP protocol (1 << A) number must present
> > > - if bit A is clear in spec, extension header with
> > > IP protocol (1 << A) number must absent
> > > If bit is clear in the mask, corresponding extension header
> > > may present and may absent (i.e. don't care).
> > >
> > There are only 12 possible extension headers and currently none of them
> > are supported in rte_flow. So adding a logic to parse the 256 just to get a max
> of 12
> > possible values is an overkill. Also, if we disregard the case of the extension,
> > the application must select only one next proto. For example, the application
> > can't select udp + tcp. There is the option to add a flag for each of the
> > possible extensions, does it makes more sense to you?
>
> Each of these extension headers has its own structure, we first need the
> ability to match them properly by adding the necessary pattern items.
>
> > > The RFC indirectly touches IPv6 proto (next header) matching
> > > logic.
> > >
> > > If logic used in ETH+VLAN is applied on IPv6 as well, it would
> > > make pattern specification and handling complicated. E.g.:
> > > eth / ipv6 / udp / end
> > > should match UDP over IPv6 without any extension headers only.
> > >
> > The issue with VLAN I agree is different since by definition VLAN is
> > layer 2.5. We can add the same logic also to the VLAN case, maybe it will
> > be easier.
> > In any case, in your example above and according to the RFC we will
> > get all ipv6 udp traffic with and without extensions.
> >
> > > And how to specify UPD over IPv6 regardless extension headers?
> >
> > Please see above the rule will be eth / ipv6 /udp.
> >
> > > eth / ipv6 / ipv6_ext / udp / end
> > > with a convention that ipv6_ext is optional if spec and mask
> > > are NULL (or mask is empty).
> > >
> > I would guess that this flow should match all ipv6 that has one ext and the
> next
> > proto is udp.
>
> In my opinion RTE_FLOW_ITEM_TYPE_IPV6_EXT is a bit useless on its own. It's
> only for matching packets that contain some kind of extension header, not a
> specific one, more about that below.
>
> > > I'm wondering if any driver treats it this way?
> > >
> > I'm not sure, we can support only the frag ext by default, but if required we
> can support other
> > ext.
> >
> > > I agree that the problem really comes when we'd like match
> > > IPv6 frags or even worse not fragments.
> > >
> > > Two patterns for fragments:
> > > eth / ipv6 (proto=FRAGMENT) / end
> > > eth / ipv6 / ipv6_ext (next_hdr=FRAGMENT) / end
> > >
> > > Any sensible solution for not-fragments with any other
> > > extension headers?
> > >
> > The one propose in this mail 😊
> >
> > > INVERT exists, but hardly useful, since it simply says
> > > that patches which do not match pattern without INVERT
> > > matches the pattern with INVERT and
> > > invert / eth / ipv6 (proto=FRAGMENT) / end
> > > will match ARP, IPv4, IPv6 with an extension header before
> > > fragment header and so on.
> > >
> > I agree with you, INVERT in this doesn’t help.
> > We were considering adding some kind of not mask / item per item.
> > some think around this line:
> > user request ipv6 unfragmented udp packets. The flow would look something
> > like this:
> > Eth / ipv6 / Not (Ipv6.proto = frag_proto) / udp
> > But it makes the rules much harder to use, and I don't think that there
> > is any HW that support not, and adding such feature to all items is overkill.
> >
> >
> > > Bit string suggested above will allow to match:
> > > - UDP over IPv6 with any extension headers:
> > > eth / ipv6 (ext_hdrs mask empty) / udp / end
> > > - UDP over IPv6 without any extension headers:
> > > eth / ipv6 (ext_hdrs mask full, spec empty) / udp / end
> > > - UDP over IPv6 without fragment header:
> > > eth / ipv6 (ext.spec & ~FRAGMENT, ext.mask | FRAGMENT) / udp / end
> > > - UDP over IPv6 with fragment header
> > > eth / ipv6 (ext.spec | FRAGMENT, ext.mask | FRAGMENT) / udp / end
> > >
> > > where FRAGMENT is 1 << IPPROTO_FRAGMENT.
> > >
> > Please see my response regarding this above.
> >
> > > Above I intentionally keep 'proto' unspecified in ipv6
> > > since otherwise it would specify the next header after IPv6
> > > header.
> > >
> > > Extension headers mask should be empty by default.
>
> This is a deliberate design choice/issue with rte_flow: an empty pattern
> matches everything; adding items only narrows the selection. As Andrew said
> there is currently no way to provide a specific item to reject, it can only
> be done globally on a pattern through INVERT that no PMD implements so far.
>
> So we have two requirements here: the ability to specifically match IPv6
> fragment headers and the ability to reject them.
>
> To match IPv6 fragment headers, we need a dedicated pattern item. The
> generic RTE_FLOW_ITEM_TYPE_IPV6_EXT is useless for that on its own, it must
> be completed with RTE_FLOW_ITEM_TYPE_IPV6_EXT_FRAG and associated
> object
Yes, we must add EXT_FRAG to be able to match on the FRAG bits.
> to match individual fields if needed (like all the others
> protocols/headers).
>
> Then to reject a pattern item... My preference goes to a new "NOT" meta item
> affecting the meaning of the item coming immediately after in the pattern
> list. That would be ultra generic, wouldn't break any ABI/API and like
> INVERT, wouldn't even require a new object associated with it.
>
> To match UDPv6 traffic when there is no fragment header, one could then do
> something like:
>
> eth / ipv6 / not / ipv6_ext_frag / udp
>
> PMD support would be trivial to implement (I'm sure!)
>
I agree with you as I said above. The issue is not PMD, the issues are:
1. think about the rule you stated above from logic point there is some contradiction,
you are saying ipv6 next proto udp but you also say not frag, this is logic only for IPV6 ext.
2. HW issue, I don't know of HW that knows how to support not on an item.
So adding something for all items for only one case is overkill.
> We may later implement other kinds of "operator" items as Andrew suggested,
> for bit-wise stuff and so on. Let's keep adding features on a needed basis
> though.
>
> --
> Adrien Mazarguil
> 6WIND
Best,
Ori
next prev parent reply other threads:[~2020-06-03 8:16 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-31 14:43 Dekel Peled
2020-06-01 5:38 ` Stephen Hemminger
2020-06-01 6:11 ` Dekel Peled
2020-06-02 14:32 ` Andrew Rybchenko
2020-06-02 18:28 ` Ori Kam
2020-06-02 19:04 ` Adrien Mazarguil
2020-06-03 8:16 ` Ori Kam [this message]
2020-06-03 12:10 ` Dekel Peled
2020-06-18 6:58 ` Dekel Peled
2020-06-28 14:52 ` Dekel Peled
2020-07-05 13:13 ` Andrew Rybchenko
2020-08-03 17:01 ` [dpdk-dev] [RFC v2] ethdev: add extensions attributes " Dekel Peled
2020-08-03 17:11 ` [dpdk-dev] [RFC v3] " Dekel Peled
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=AM6PR05MB51765D970F62CB461DEFA210DB880@AM6PR05MB5176.eurprd05.prod.outlook.com \
--to=orika@mellanox.com \
--cc=Ivan.Malov@oktetlabs.ru \
--cc=adrien.mazarguil@6wind.com \
--cc=arybchenko@solarflare.com \
--cc=asafp@mellanox.com \
--cc=dekelp@mellanox.com \
--cc=dev@dpdk.org \
--cc=elibr@mellanox.com \
--cc=ferruh.yigit@intel.com \
--cc=john.mcnamara@intel.com \
--cc=marko.kovacevic@intel.com \
--cc=matan@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).