DPDK patches and discussions
 help / color / mirror / Atom feed
From: Shahaf Shuler <shahafs@mellanox.com>
To: Adrien Mazarguil <adrien.mazarguil@6wind.com>,
	Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
Cc: Declan Doherty <declan.doherty@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	 Alex Rosenbaum <alexr@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH v3 2/4] ethdev: Add tunnel encap/decap actions
Date: Tue, 10 Apr 2018 11:06:43 +0000	[thread overview]
Message-ID: <DB7PR05MB4426CBB6F19DE38EAB15E131C3BE0@DB7PR05MB4426.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <20180410101958.GF4957@6wind.com>

Hi,

Adding small comment on top of Adrien's

Tuesday, April 10, 2018 1:20 PM, Adrien Mazarguil:
> On Mon, Apr 09, 2018 at 05:10:35PM +0100, Mohammad Abdul Awal wrote:
> > On 06/04/2018 21:26, Adrien Mazarguil wrote:
> > > On Fri, Apr 06, 2018 at 01:24:01PM +0100, Declan Doherty wrote:
> > > > Add new flow action types and associated action data structures to
> > > > support the encapsulation and decapsulation of the virtual tunnel
> > > > endpoints.
> > > >
> > > > The RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP action will cause the
> > > > matching flow to be encapsulated in the virtual tunnel endpoint
> > > > overlay defined in the tunnel_encap action data.
> > > >
> > > > The RTE_FLOW_ACTION_TYPE_TUNNEL_DECAP action will cause all
> > > > virtual tunnel endpoint overlays up to and including the first
> > > > instance of the flow item type defined in the tunnel_decap action
> > > > data for the matching flows.
> > > >
> > > > Signed-off-by: Declan Doherty <declan.doherty@intel.com>
> > > This generic approach looks flexible enough to cover the use cases
> > > that immediately come to mind (VLAN, VXLAN), its design is sound.
> > >
> > > However, while I'm aware it's not a concern at this point, it won't
> > > be able to deal with stateful tunnel or encapsulation types (e.g.
> > > IPsec or TCP) which will require additional meta data or some
> > > run-time assistance from the application.
> > >
> > > Eventually for more complex use cases, dedicated encap/decap actions
> > > will have to appear, so the issue I wanted to raise before going further is
> this:
> > >
> > > Going generic inevitably trades some of the usability; flat
> > > structures dedicated to VXLAN encap/decap with only the needed info
> > > to get the job done would likely be easier to implement in PMDs and
> > > use in applications. Any number of such actions can be added to rte_flow
> without ABI impact.
> > >
> > > If VXLAN is the only use case at this point, my suggestion would be
> > > to go with simpler RTE_FLOW_ACTION_TYPE_VXLAN_(ENCAP|DECAP)
> actions,
> > > with fixed
> > > L2/L3/L4/L5 header definitions to prepend according to RFC 7348.
> > We can go this way and this will increase the action for more and more
> > tunneling protocols being added. Current proposal is already a generic
> > approach which specifies as a tunnel for all the tunneling protocols.
> 
> Right, on the other hand there are not that many standard encapsulations
> offloaded by existing devices. rte_flow could easily handle dedicated actions
> for each of them without problem.
> 
> My point is that many of those (will eventually) have their own quirks to
> manage when doing encap/decap, it's not just a matter of prepending or
> removing a bunch of header definitions, otherwise we could as well let
> applications simply provide an arbitrary buffer to prepend.
> 
> Consider that the "generic" part is already built into rte_flow as the way
> patterns and action are handled. Adding another generic layer on top of that
> could make things more inconvenient than necessary to applications (my
> main concern).
> 
> You'd need another layer of validation/error reporting machinery to properly
> let applications know they cannot encap VXLAN on top of TCP on top of
> QinQinQinQinQ for instance. Either a single bounded encapsulation definition
> or a combination at the action list level is needed to avoid that.
> 
> > > Now we can start with the generic approach, see how it fares and add
> > > dedicated encap/decap later as needed.
> > >
> > > More comments below.
> <snip>
> > > > +Action: ``TUNNEL_ENCAP``
> > > > +^^^^^^^^^^^^^^^^^^^^^^

The ENCAP/DECAP doesn't have to be in the context of tunnel.
For example - let's take GRE - application may want to decap the GRE and encap it with L2. The L2 encapsulation is not related to any tunnel. 
Same for the other direction - VM sends Eth frame, and we want to decap the Eth and encap with GRE.

I think those action should be free from the tunnel association and just provide flow items we want to encap/decap or in a more generic way offset to the packet headers and buffer to encap (not sure how many devices supports that, may be overkill at this point). 

> > > > +
> > > > +Performs an encapsulation action by encapsulating the flows
> > > > +matched by the pattern items according to the network overlay
> > > > +defined in the ``rte_flow_action_tunnel_encap`` pattern items.
> > > > +
> > > > +This action modifies the payload of matched flows. The pattern
> > > > +items specified in the ``rte_flow_action_tunnel_encap`` action
> > > > +structure must defined a valid set of overlay headers, from the
> > > > +Ethernet header up to the overlay header. The pattern must be
> terminated with the RTE_FLOW_ITEM_TYPE_END item type.
> > > Regarding the use of a pattern list, if you consider PMDs are
> > > already iterating on a list of actions when encountering
> > > RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP, it adds yet another inner
> loop.
> > We understand that it is implementation specifics. If we do not go for
> > another inner loop, all the bundling need to be handled in the same
> > function, which seems more clumsy to me. This also breaks the tunnel
> > endpoint concept.
> > >
> > > How about making each encountered
> RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP
> > > provide exactly one item instead (in encap, i.e. reverse order)?
> > Again, if we have tunnel action, security action, and other actions,
> > all the processing and tracking need to be done in one function. Now
> > we will need ETH_ENCAP/DECAP, UDP_ENCAP/DECAP,
> NVGRE_ENCAP/DECAP, etc.
> 
> Well, the number of DECAP actions doesn't need to perfectly reflect that of
> ENCAP since it implies all preceding layers. No problem with that.
> 
> Regarding multiple dedicated actions, my suggestion was for a single generic
> one as in this patch, but each instance on the ENCAP side would deal with a
> single protocol layer, instead of having a single ENCAP action with multiple
> inner layers (and thus an inner loop).
> 
> PMDs also gain the ability to precisely report which encap step fails by making
> rte_flow_error point to the problematic object to ease debugging of flow
> rules on the application side.
> 
> Why would that break the tunnel idea and more importantly, how would it
> prevent PMD developers from splitting their processing into multiple
> functions?
> 
> > >
> > > In which case perhaps "GENERIC" would be a better fit than "TUNNEL".
> > >
> <snip>
> > > > +
> > > > +   +-------+--------------------------+------------+
> > > > +   | Index | Flow Item Type           | Flow Item  |
> > > > +   +=======+==========================+============+
> > > > +   | 0     | RTE_FLOW_ITEM_TYPE_ETH   | eth item   |
> > > > +   +-------+--------------------------+------------+
> > > > +   | 1     | RTE_FLOW_ITEM_TYPE_IPV4  | ipv4 item  |
> > > > +   +-------+--------------------------+------------+
> > > > +   | 2     | RTE_FLOW_ITEM_TYPE_UDP   | udp item   |
> > > > +   +-------+--------------------------+------------+
> > > > +   | 3     | RTE_FLOW_ITEM_TYPE_VXLAN | vxlan item |
> > > > +   +-------+--------------------------+------------+
> > > > +   | 4     | RTE_FLOW_ITEM_TYPE_END   | NULL       |
> > > > +   +-------+--------------------------+------------+
> > > One possible issue is that it relies on objects normally found on
> > > the pattern side of flow rules. Those are supposed to match
> > > something, they are not intended for packet header generation. While
> their "spec" and "mask"
> > > fields might make sense in this context, the "last" field is odd.
> > >
> > > You must define them without leaving anything open for
> > > interpretation by PMDs and users alike. Defining things as
> > > "undefined" is fine as long as it's covered.
> > Please note that the "void *item" in the
> > "rte_flow_action_tunnel_encap.pattern" points to the data structure
> > defined for the corresponding rte_flow_item_type instead of a
> > rte_flow_item structure. As an example, for the rte_flow_item_eth type,
> the "void *item"
> > will point to a "struct rte_flow_item_eth" instance. Thats why we have
> > defined struct rte_flow_action_item inside struct
> > rte_flow_action_tunnel_encap. So, no question of spec, mask, last
> anymore.
> 
> Right, I noticed that after commenting its structure definition below.
> 
> I think I won't be the only one confused by this approach, also because a
> mask is needed in addition to a specification structure, otherwise how do you
> plan to tell what fields are relevant in application-provided protocol headers?
> 
> An application might set unusual IPv4/UDP/VXLAN fields and expect them to
> be part of the encapsulated traffic. Without a mask, a PMD must take
> headers verbatim, and I don't think many devices are ready for that yet.
> 
> Hence my other suggestion: defining inflexible $PROTOCOL_(ENCAP|DECAP)
> actions that do not allow more than what's defined by official RFCs for
> $PROTOCOL.
> 
> <snip>
> > > > + */
> > > > +struct rte_flow_action_tunnel_encap {
> > > > +	struct rte_flow_action_item {
> > > > +		enum rte_flow_item_type type;
> > > > +		/**< Flow item type. */
> > > > +		const void *item;
> > > > +		/**< Flow item definition which points to the data of
> > > > +		 * corresponding rte_flow_item_type.
> > > > +		 */
> > > I see it's a new action type, albeit a bit confusing (there is no
> > > RTE_FLOW_ACTION_TYPE_ITEM).
> > >
> > > I suggest the standard pattern item type since you're going with
> > > enum rte_flow_item_type anyway. Keep in mind you need some kind of
> > > mask to tell what fields are relevant. An application might
> > > otherwise want to encap with unsupported properties (e.g. specific IPv4
> ToS field and whatnot).
> > >
> > > How about a single "struct rte_flow_pattern_item item", neither
> > > const and neither a pointer. It's generic enough, enclosed
> > > spec/last/mask pointers take care of the specifics. You just need to
> > > define what's supposed to happen when "last" is set.
> > Please see the comment above regarding this field.
> 
> Point still stands, if you need to distinguish spec and mask, a more complete
> structure is needed. Instead of adding a new (confusing) type, you should
> use rte_flow_item and define what happens when "last" is set.
> 
> You should define it as reserved for now, any non-NULL value is an error. This
> field might also be useful later.
> 
> <snip>
> > > > +};
> > > > +
> > > > +/**
> > > > + * RTE_FLOW_ACTION_TYP_TUNNEL_DECAP
> > > > + *
> > > > + * Virtual tunnel end-point decapsulation action data.
> > > > + *
> > > > + * Non-terminating action by default.
> > > > + */
> > > > +struct rte_flow_action_tunnel_decap {
> > > > +	enum rte_flow_item_type type;
> > > > +	/**<
> > > > +	 * Flow item type of virtual tunnel end-point to be decapsulated
> > > > +	 */
> > > > +};
> > > Note that contrary to ENCAP, DECAP wouldn't necessarily need
> > > repeated actions to peel each layer off. The current definition is fine.
> > To clarify, the the decap is upto the PMD to remove all the header for
> > a specified type. For example, for
> >
> > rte_flow_item_type type=RTE_FLOW_ITEM_TYPE_VXLAN, the PMD will
> peel off (ETH, IPV4, UDP, VXLAN) header all together.
> 
> Yep, that's fine, whether we use multiple actions or a single one doing
> multiple things, a single DECAP can peel them off all at once :)
> 
> > >
> > > > +
> > > > +/**
> > > >    * Definition of a single action.
> > > >    *
> > > >    * A list of actions is terminated by a END action.
> > > > --
> > > > 2.7.4
> > > >
> 
> If the reasons I gave did not manage to convince you about choosing
> between
> either fixed (VLAN|VXLAN)_(ENCAP|DECAP) actions or generic encap/decap
> actions that deal with a single protocol layer at once instead of the
> proposed approach, I'm ready to try it out as an experimental API (all new
> objects tagged as experimental) *if* you address the lack of mask, which
> remains an open issue.
> 
> --
> Adrien Mazarguil
> 6WIND

  reply	other threads:[~2018-04-10 11:06 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-06 12:23 [dpdk-dev] [PATCH v3 0/4] ethdev: Additions to support tunnel encap/decap offload Declan Doherty
2018-04-06 12:24 ` [dpdk-dev] [PATCH v3 1/4] ethdev: add group counter support to rte_flow Declan Doherty
2018-04-06 20:26   ` Adrien Mazarguil
2018-04-09 14:22     ` Mohammad Abdul Awal
2018-04-09 15:23       ` Adrien Mazarguil
2018-04-06 12:24 ` [dpdk-dev] [PATCH v3 2/4] ethdev: Add tunnel encap/decap actions Declan Doherty
2018-04-06 20:26   ` Adrien Mazarguil
2018-04-09 16:10     ` Mohammad Abdul Awal
2018-04-10 10:19       ` Adrien Mazarguil
2018-04-10 11:06         ` Shahaf Shuler [this message]
2018-04-17 14:58     ` Doherty, Declan
2018-04-06 12:24 ` [dpdk-dev] [PATCH v3 3/4] ethdev: Add group action type to rte_flow Declan Doherty
2018-04-06 20:26   ` Adrien Mazarguil
2018-04-17 14:40     ` Doherty, Declan
2018-04-06 12:24 ` [dpdk-dev] [PATCH v3 4/4] ethdev: Add metadata flow and action items support Declan Doherty
2018-04-06 20:27   ` Adrien Mazarguil
2018-04-17 14:40     ` Doherty, Declan
2018-04-27  7:41   ` Xueming(Steven) Li
2018-04-18 21:04 ` [dpdk-dev] [PATCH v4 0/6] additions to support tunnel encap/decap Declan Doherty
2018-04-18 21:04   ` [dpdk-dev] [PATCH v4 1/6] ethdev: Add tunnel encap/decap actions Declan Doherty
2018-04-19 13:03     ` Adrien Mazarguil
2018-04-23 11:00       ` Shahaf Shuler
2018-04-23 11:17         ` Doherty, Declan
2018-04-23 11:49           ` Adrien Mazarguil
2018-04-18 21:04   ` [dpdk-dev] [PATCH v4 2/6] ethdev: Add jump action type to rte_flow Declan Doherty
2018-04-19 13:03     ` Adrien Mazarguil
2018-04-18 21:04   ` [dpdk-dev] [PATCH v4 3/6] testpmd: add jump action Declan Doherty
2018-04-19 13:03     ` Adrien Mazarguil
2018-04-18 21:04   ` [dpdk-dev] [PATCH v4 4/6] ethdev: add mark flow item to flow item types Declan Doherty
2018-04-19 13:03     ` Adrien Mazarguil
2018-04-23 11:10       ` Shahaf Shuler
2018-04-23 11:49         ` Adrien Mazarguil
2018-04-18 21:04   ` [dpdk-dev] [PATCH v4 5/6] testpmd: add support for MARK flow item Declan Doherty
2018-04-19 13:03     ` Adrien Mazarguil
2018-04-18 21:04   ` [dpdk-dev] [PATCH v4 6/6] ethdev: add shared counter support to rte_flow Declan Doherty
2018-04-19 13:03     ` Adrien Mazarguil
2018-04-23 11:11   ` [dpdk-dev] [PATCH v4 0/6] additions to support tunnel encap/decap Shahaf Shuler
2018-04-23 11:13     ` Doherty, Declan
2018-04-23 15:56   ` [dpdk-dev] [PATCH v5 0/4] ethdev " Declan Doherty
2018-04-23 15:56     ` [dpdk-dev] [PATCH v5 1/4] ethdev: Add tunnel encap/decap actions Declan Doherty
2018-04-23 15:56     ` [dpdk-dev] [PATCH v5 2/4] ethdev: Add group JUMP action Declan Doherty
2018-04-23 15:56     ` [dpdk-dev] [PATCH v5 3/4] ethdev: add mark flow item to rte_flow_item_types Declan Doherty
2018-04-23 15:56     ` [dpdk-dev] [PATCH v5 4/4] ethdev: add shared counter support to rte_flow Declan Doherty
2018-04-24 16:26     ` [dpdk-dev] [PATCH v5 0/4] ethdev additions to support tunnel encap/decap Thomas Monjalon
2018-04-30 13:54       ` Thomas Monjalon
2018-04-25 22:05     ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DB7PR05MB4426CBB6F19DE38EAB15E131C3BE0@DB7PR05MB4426.eurprd05.prod.outlook.com \
    --to=shahafs@mellanox.com \
    --cc=adrien.mazarguil@6wind.com \
    --cc=alexr@mellanox.com \
    --cc=declan.doherty@intel.com \
    --cc=dev@dpdk.org \
    --cc=mohammad.abdul.awal@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).