From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
To: Ori Kam <orika@mellanox.com>
Cc: Andrew Rybchenko <arybchenko@solarflare.com>,
Dekel Peled <dekelp@mellanox.com>,
"ferruh.yigit@intel.com" <ferruh.yigit@intel.com>,
"john.mcnamara@intel.com" <john.mcnamara@intel.com>,
"marko.kovacevic@intel.com" <marko.kovacevic@intel.com>,
Asaf Penso <asafp@mellanox.com>, Matan Azrad <matan@mellanox.com>,
Eli Britstein <elibr@mellanox.com>, "dev@dpdk.org" <dev@dpdk.org>,
Ivan Malov <Ivan.Malov@oktetlabs.ru>
Subject: Re: [dpdk-dev] [RFC] ethdev: add fragment attribute to IPv6 item
Date: Tue, 2 Jun 2020 21:04:10 +0200 [thread overview]
Message-ID: <20200602190410.GU26320@6wind.com> (raw)
In-Reply-To: <AM6PR05MB51769EE4CE6BAB8B557E8325DB8B0@AM6PR05MB5176.eurprd05.prod.outlook.com>
Hi Ori, Andrew, Delek,
(been a while eh?)
On Tue, Jun 02, 2020 at 06:28:41PM +0000, Ori Kam wrote:
> Hi Andrew,
>
> PSB,
[...]
> > > diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > > index b0e4199..3bc8ce1 100644
> > > --- a/lib/librte_ethdev/rte_flow.h
> > > +++ b/lib/librte_ethdev/rte_flow.h
> > > @@ -787,6 +787,8 @@ struct rte_flow_item_ipv4 {
> > > */
> > > struct rte_flow_item_ipv6 {
> > > struct rte_ipv6_hdr hdr; /**< IPv6 header definition. */
> > > + uint32_t is_frag:1; /**< Is IPv6 packet fragmented/non-fragmented. */
> > > + uint32_t reserved:31; /**< Reserved, must be zero. */
> >
> > The solution is simple, but hardly generic and adds an
> > example for the future extensions. I doubt that it is a
> > right way to go.
> >
> I agree with you that this is not the most generic way possible,
> but the IPV6 extensions are very unique. So the solution is also unique.
> In general, I'm always in favor of finding the most generic way, but sometimes
> it is better to keep things simple, and see how it goes.
Same feeling here, it doesn't look right.
> > May be we should add 256-bit string with one bit for each
> > IP protocol number and apply it to extension headers only?
> > If bit A is set in the mask:
> > - if bit A is set in spec as well, extension header with
> > IP protocol (1 << A) number must present
> > - if bit A is clear in spec, extension header with
> > IP protocol (1 << A) number must absent
> > If bit is clear in the mask, corresponding extension header
> > may present and may absent (i.e. don't care).
> >
> There are only 12 possible extension headers and currently none of them
> are supported in rte_flow. So adding a logic to parse the 256 just to get a max of 12
> possible values is an overkill. Also, if we disregard the case of the extension,
> the application must select only one next proto. For example, the application
> can't select udp + tcp. There is the option to add a flag for each of the
> possible extensions, does it makes more sense to you?
Each of these extension headers has its own structure, we first need the
ability to match them properly by adding the necessary pattern items.
> > The RFC indirectly touches IPv6 proto (next header) matching
> > logic.
> >
> > If logic used in ETH+VLAN is applied on IPv6 as well, it would
> > make pattern specification and handling complicated. E.g.:
> > eth / ipv6 / udp / end
> > should match UDP over IPv6 without any extension headers only.
> >
> The issue with VLAN I agree is different since by definition VLAN is
> layer 2.5. We can add the same logic also to the VLAN case, maybe it will
> be easier.
> In any case, in your example above and according to the RFC we will
> get all ipv6 udp traffic with and without extensions.
>
> > And how to specify UPD over IPv6 regardless extension headers?
>
> Please see above the rule will be eth / ipv6 /udp.
>
> > eth / ipv6 / ipv6_ext / udp / end
> > with a convention that ipv6_ext is optional if spec and mask
> > are NULL (or mask is empty).
> >
> I would guess that this flow should match all ipv6 that has one ext and the next
> proto is udp.
In my opinion RTE_FLOW_ITEM_TYPE_IPV6_EXT is a bit useless on its own. It's
only for matching packets that contain some kind of extension header, not a
specific one, more about that below.
> > I'm wondering if any driver treats it this way?
> >
> I'm not sure, we can support only the frag ext by default, but if required we can support other
> ext.
>
> > I agree that the problem really comes when we'd like match
> > IPv6 frags or even worse not fragments.
> >
> > Two patterns for fragments:
> > eth / ipv6 (proto=FRAGMENT) / end
> > eth / ipv6 / ipv6_ext (next_hdr=FRAGMENT) / end
> >
> > Any sensible solution for not-fragments with any other
> > extension headers?
> >
> The one propose in this mail 😊
>
> > INVERT exists, but hardly useful, since it simply says
> > that patches which do not match pattern without INVERT
> > matches the pattern with INVERT and
> > invert / eth / ipv6 (proto=FRAGMENT) / end
> > will match ARP, IPv4, IPv6 with an extension header before
> > fragment header and so on.
> >
> I agree with you, INVERT in this doesn’t help.
> We were considering adding some kind of not mask / item per item.
> some think around this line:
> user request ipv6 unfragmented udp packets. The flow would look something
> like this:
> Eth / ipv6 / Not (Ipv6.proto = frag_proto) / udp
> But it makes the rules much harder to use, and I don't think that there
> is any HW that support not, and adding such feature to all items is overkill.
>
>
> > Bit string suggested above will allow to match:
> > - UDP over IPv6 with any extension headers:
> > eth / ipv6 (ext_hdrs mask empty) / udp / end
> > - UDP over IPv6 without any extension headers:
> > eth / ipv6 (ext_hdrs mask full, spec empty) / udp / end
> > - UDP over IPv6 without fragment header:
> > eth / ipv6 (ext.spec & ~FRAGMENT, ext.mask | FRAGMENT) / udp / end
> > - UDP over IPv6 with fragment header
> > eth / ipv6 (ext.spec | FRAGMENT, ext.mask | FRAGMENT) / udp / end
> >
> > where FRAGMENT is 1 << IPPROTO_FRAGMENT.
> >
> Please see my response regarding this above.
>
> > Above I intentionally keep 'proto' unspecified in ipv6
> > since otherwise it would specify the next header after IPv6
> > header.
> >
> > Extension headers mask should be empty by default.
This is a deliberate design choice/issue with rte_flow: an empty pattern
matches everything; adding items only narrows the selection. As Andrew said
there is currently no way to provide a specific item to reject, it can only
be done globally on a pattern through INVERT that no PMD implements so far.
So we have two requirements here: the ability to specifically match IPv6
fragment headers and the ability to reject them.
To match IPv6 fragment headers, we need a dedicated pattern item. The
generic RTE_FLOW_ITEM_TYPE_IPV6_EXT is useless for that on its own, it must
be completed with RTE_FLOW_ITEM_TYPE_IPV6_EXT_FRAG and associated object
to match individual fields if needed (like all the others
protocols/headers).
Then to reject a pattern item... My preference goes to a new "NOT" meta item
affecting the meaning of the item coming immediately after in the pattern
list. That would be ultra generic, wouldn't break any ABI/API and like
INVERT, wouldn't even require a new object associated with it.
To match UDPv6 traffic when there is no fragment header, one could then do
something like:
eth / ipv6 / not / ipv6_ext_frag / udp
PMD support would be trivial to implement (I'm sure!)
We may later implement other kinds of "operator" items as Andrew suggested,
for bit-wise stuff and so on. Let's keep adding features on a needed basis
though.
--
Adrien Mazarguil
6WIND
next prev parent reply other threads:[~2020-06-02 19:04 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-31 14:43 Dekel Peled
2020-06-01 5:38 ` Stephen Hemminger
2020-06-01 6:11 ` Dekel Peled
2020-06-02 14:32 ` Andrew Rybchenko
2020-06-02 18:28 ` Ori Kam
2020-06-02 19:04 ` Adrien Mazarguil [this message]
2020-06-03 8:16 ` Ori Kam
2020-06-03 12:10 ` Dekel Peled
2020-06-18 6:58 ` Dekel Peled
2020-06-28 14:52 ` Dekel Peled
2020-07-05 13:13 ` Andrew Rybchenko
2020-08-03 17:01 ` [dpdk-dev] [RFC v2] ethdev: add extensions attributes " Dekel Peled
2020-08-03 17:11 ` [dpdk-dev] [RFC v3] " Dekel Peled
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200602190410.GU26320@6wind.com \
--to=adrien.mazarguil@6wind.com \
--cc=Ivan.Malov@oktetlabs.ru \
--cc=arybchenko@solarflare.com \
--cc=asafp@mellanox.com \
--cc=dekelp@mellanox.com \
--cc=dev@dpdk.org \
--cc=elibr@mellanox.com \
--cc=ferruh.yigit@intel.com \
--cc=john.mcnamara@intel.com \
--cc=marko.kovacevic@intel.com \
--cc=matan@mellanox.com \
--cc=orika@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).