DPDK patches and discussions
 help / color / mirror / Atom feed
From: Ori Kam <orika@mellanox.com>
To: Andrew Rybchenko <arybchenko@solarflare.com>,
	Dekel Peled <dekelp@mellanox.com>,
	"ferruh.yigit@intel.com" <ferruh.yigit@intel.com>,
	"john.mcnamara@intel.com" <john.mcnamara@intel.com>,
	"marko.kovacevic@intel.com" <marko.kovacevic@intel.com>
Cc: Asaf Penso <asafp@mellanox.com>, Matan Azrad <matan@mellanox.com>,
	Eli Britstein <elibr@mellanox.com>, "dev@dpdk.org" <dev@dpdk.org>,
	Adrien Mazarguil <adrien.mazarguil@6wind.com>,
	Ivan Malov <Ivan.Malov@oktetlabs.ru>
Subject: Re: [dpdk-dev] [RFC] ethdev: add fragment attribute to IPv6 item
Date: Tue, 2 Jun 2020 18:28:41 +0000
Message-ID: <AM6PR05MB51769EE4CE6BAB8B557E8325DB8B0@AM6PR05MB5176.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <f4cbfef3-2c3d-fba8-308c-77c525a25299@solarflare.com>

Hi Andrew,

PSB,

Best,
Ori

> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Tuesday, June 2, 2020 5:33 PM
> Subject: Re: [RFC] ethdev: add fragment attribute to IPv6 item
> 
> On 5/31/20 5:43 PM, Dekel Peled wrote:
> > Using the current implementation of DPDK, an application cannot
> > match on fragmented/non-fragmented IPv6 packets in a simple way.
> >
> > In current implementation:
> > IPv6 header doesn't contain information regarding the packet
> > fragmentation.
> > Fragmented IPv6 packets contain a dedicated extension header, as
> > detailed in RFC [1], which is not yet supported in rte_flow.
> > Non-fragmented packets don't contain the fragment extension header.
> > For an application to match on non-fragmented IPv6 packets, the
> > current implementation doesn't provide a suitable solution.
> > Matching on the Next Header field is not sufficient, since additional
> > extension headers might be present in the same packet.
> > To match on fragmented IPv6 packets, the same difficulty exists.
> >
> > Proposed update:
> > An additional value will be added to IPv6 header struct.
> > This value will contain the fragmentation attribute of the packet,
> > providing simple means for identification of fragmented and
> > non-fragmented packets.
> >
> > This update changes ABI, and is proposed for the 20.11 LTS version.
> >
> > [1]
> https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmails.dpdk
> .org%2Farchives%2Fdev%2F2020-
> March%2F160255.html&amp;data=02%7C01%7Corika%40mellanox.com%7C42
> 1376a4759b4b9550fd08d80701d24c%7Ca652971c7d2e4d9ba6a4d149256f461b
> %7C0%7C0%7C637267051695278770&amp;sdata=%2F1HKQPZUVwU199ERL2S
> HPFBYj%2BLFmx8%2BtW8ZBiDL%2FTw%3D&amp;reserved=0
> >
> > Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> > ---
> >  lib/librte_ethdev/rte_flow.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > index b0e4199..3bc8ce1 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -787,6 +787,8 @@ struct rte_flow_item_ipv4 {
> >   */
> >  struct rte_flow_item_ipv6 {
> >  	struct rte_ipv6_hdr hdr; /**< IPv6 header definition. */
> > +	uint32_t is_frag:1; /**< Is IPv6 packet fragmented/non-fragmented. */
> > +	uint32_t reserved:31; /**< Reserved, must be zero. */
> 
> The solution is simple, but hardly generic and adds an
> example for the future extensions. I doubt that it is a
> right way to go.
> 
I agree with you that this is not the most generic way possible,
but the IPV6 extensions are very unique. So the solution is also unique.
In general, I'm always in favor of finding the most generic way, but sometimes
it is better to keep things simple, and see how it goes.

> May be we should add 256-bit string with one bit for each
> IP protocol number and apply it to extension headers only?
> If bit A is set in the mask:
>  - if bit A is set in spec as well, extension header with
>    IP protocol (1 << A) number must present
>  - if bit A is clear in spec, extension header with
>    IP protocol (1 << A) number must absent
> If bit is clear in the mask, corresponding extension header
> may present and may absent (i.e. don't care).
> 
There are only 12 possible extension headers and currently none of them
are supported in rte_flow. So adding a logic to parse the 256 just to get a max of 12 
possible values is an overkill. Also, if we disregard the case of the extension, 
the application must select only one next proto. For example, the application
can't select udp + tcp. There is the option to add a flag for each of the
possible extensions, does it makes more sense to you?

> The RFC indirectly touches IPv6 proto (next header) matching
> logic.
> 
> If logic used in ETH+VLAN is applied on IPv6 as well, it would
> make pattern specification and handling complicated. E.g.:
>   eth / ipv6 / udp / end
> should match UDP over IPv6 without any extension headers only.
> 
The issue with VLAN I agree is different since by definition VLAN is 
layer 2.5. We can add the same logic also to the VLAN case, maybe it will
be easier. 
In any case, in your example above and according to the RFC we will
get all ipv6 udp traffic with and without extensions.

> And how to specify UPD over IPv6 regardless extension headers?

Please see above the rule will be eth / ipv6 /udp.

>   eth / ipv6 / ipv6_ext / udp / end
> with a convention that ipv6_ext is optional if spec and mask
> are NULL (or mask is empty).
> 
I would guess that this flow should match all ipv6 that has one ext and the next 
proto is udp.

> I'm wondering if any driver treats it this way?
>
I'm not sure, we can support only the frag ext by default, but if required we can support other 
ext.
 
> I agree that the problem really comes when we'd like match
> IPv6 frags or even worse not fragments.
> 
> Two patterns for fragments:
>   eth / ipv6 (proto=FRAGMENT) / end
>   eth / ipv6 / ipv6_ext (next_hdr=FRAGMENT) / end
> 
> Any sensible solution for not-fragments with any other
> extension headers?
> 
The one propose in this mail 😊 

> INVERT exists, but hardly useful, since it simply says
> that patches which do not match pattern without INVERT
> matches the pattern with INVERT and
>   invert / eth / ipv6 (proto=FRAGMENT) / end
> will match ARP, IPv4, IPv6 with an extension header before
> fragment header and so on.
>
I agree with you, INVERT in this doesn’t help.
We were considering adding some kind of not mask / item per item.
some think around this line:
user request ipv6 unfragmented udp packets. The flow would look something
like this:
Eth / ipv6 / Not (Ipv6.proto = frag_proto) / udp
But it makes the rules much harder to use, and I don't think that there
is any HW that support not, and adding such feature to all items is overkill.

 
> Bit string suggested above will allow to match:
>  - UDP over IPv6 with any extension headers:
>     eth / ipv6 (ext_hdrs mask empty) / udp / end
>  - UDP over IPv6 without any extension headers:
>     eth / ipv6 (ext_hdrs mask full, spec empty) / udp / end
>  - UDP over IPv6 without fragment header:
>     eth / ipv6 (ext.spec & ~FRAGMENT, ext.mask | FRAGMENT) / udp / end
>  - UDP over IPv6 with fragment header
>     eth / ipv6 (ext.spec | FRAGMENT, ext.mask | FRAGMENT) / udp / end
> 
> where FRAGMENT is 1 << IPPROTO_FRAGMENT.
> 
Please see my response regarding this above.

> Above I intentionally keep 'proto' unspecified in ipv6
> since otherwise it would specify the next header after IPv6
> header.
> 
> Extension headers mask should be empty by default.
> 
> Andrew.
Ori

  reply	other threads:[~2020-06-02 18:28 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-31 14:43 Dekel Peled
2020-06-01  5:38 ` Stephen Hemminger
2020-06-01  6:11   ` Dekel Peled
2020-06-02 14:32 ` Andrew Rybchenko
2020-06-02 18:28   ` Ori Kam [this message]
2020-06-02 19:04     ` Adrien Mazarguil
2020-06-03  8:16       ` Ori Kam
2020-06-03 12:10         ` Dekel Peled
2020-06-18  6:58           ` Dekel Peled
2020-06-28 14:52             ` Dekel Peled
2020-07-05 13:13       ` Andrew Rybchenko
2020-08-03 17:01 ` [dpdk-dev] [RFC v2] ethdev: add extensions attributes " Dekel Peled
2020-08-03 17:11   ` [dpdk-dev] [RFC v3] " Dekel Peled

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AM6PR05MB51769EE4CE6BAB8B557E8325DB8B0@AM6PR05MB5176.eurprd05.prod.outlook.com \
    --to=orika@mellanox.com \
    --cc=Ivan.Malov@oktetlabs.ru \
    --cc=adrien.mazarguil@6wind.com \
    --cc=arybchenko@solarflare.com \
    --cc=asafp@mellanox.com \
    --cc=dekelp@mellanox.com \
    --cc=dev@dpdk.org \
    --cc=elibr@mellanox.com \
    --cc=ferruh.yigit@intel.com \
    --cc=john.mcnamara@intel.com \
    --cc=marko.kovacevic@intel.com \
    --cc=matan@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git