DPDK patches and discussions
From: Yongseok Koh <yskoh@mellanox.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: Dekel Peled <dekelp@mellanox.com>, "dev@dpdk.org" <dev@dpdk.org>,
	Adrien Mazarguil <adrien.mazarguil@6wind.com>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	Ori Kam <orika@mellanox.com>,
	Shahaf Shuler <shahafs@mellanox.com>
Subject: Re: [dpdk-dev] [RFC] ethdev: support metadata as flow rule criteria
Date: Thu, 23 Aug 2018 14:34:58 -0700
Message-ID: <20180823213457.GC31847@yongseok-MBP.local>
In-Reply-To: <2601191342CEEE43887BDE71AB977258E9FA454D@IRSMSX102.ger.corp.intel.com>

On Wed, Aug 22, 2018 at 12:13:19PM +0000, Ananyev, Konstantin wrote:
> Hi Dekel,
> 
> > >
> > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Dekel Peled [mailto:dekelp@mellanox.com]
> > > > > Sent: Monday, August 13, 2018 10:47 AM
> > > > > To: dev@dpdk.org
> > > > > Cc: Ori Kam <orika@mellanox.com>; Shahaf Shuler
> > > > > <shahafs@mellanox.com>
> > > > > Subject: [RFC] ethdev: support metadata as flow rule criteria
> > > > >
> > > > > Current implementation of rte_flow allows match pattern of flow
> > > > > rule, based on packet data or header fields.
> > > > > This limits the application use of match patterns.
> > > > >
> > > > > For example, consider a vswitch application which controls a set of
> > > > > VMs, connected with virtio, in a fabric with overlay of VXLAN.
> > > > > Several VMs can have the same inner tuple, while the outer tuple is
> > > > > different and controlled by the vswitch (encap action).
> > > > > > For the vswitch to be able to offload the rule to the NIC, it must
> > > > > > use a unique match criterion, independent of the inner tuple, to
> > > > > perform the encap action.
> > > > >
> > > > > This RFC adds support for additional metadata to use as match pattern.
> > > > > The metadata is an opaque item, fully controlled by the application.
> > > > >
> > > > > The use of metadata is relevant for egress rules only.
> > > > > It can be set in the flow rule using the RTE_FLOW_ITEM_META.
> > > > >
> > > > > > Application should set the packet metadata in the mbuf->metadata
> > > > > > field, and set the PKT_TX_METADATA flag in the mbuf->ol_flags.
> > > > > > The NIC will use the packet metadata as match criteria for relevant
> > > > > > flow rules.
> > > > >
> > > > > For example, to do an encap action depending on the VM id, the
> > > > > application needs to configure 'match on metadata' rte_flow rule
> > > > > with VM id as metadata, along with desired encap action.
> > > > > When preparing an egress data packet, application will set VM id
> > > > > data in mbuf metadata field and set PKT_TX_METADATA flag.
> > > > >
> > > > > PMD will send data packets to NIC, with VM id as metadata.
> > > > > Egress flow on NIC will match metadata as done with other criteria.
> > > > > Upon match on metadata (VM id) the appropriate encap action will be
> > > > > performed.
> > > > >
> > > > > This RFC introduces metadata item type for rte_flow
> > > > > RTE_FLOW_ITEM_META, along with corresponding struct
> > > > > rte_flow_item_meta and ol_flag PKT_TX_METADATA.
> > > > > It also enhances struct rte_mbuf with new data item, uint64_t metadata.
> > > > >
> > > > > Comments are welcome.
> > > > >
> > > > > Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> > > > > ---
> > > > >  doc/guides/prog_guide/rte_flow.rst | 21 +++++++++++++++++++++
> > > > >  lib/librte_ethdev/rte_flow.c       |  1 +
> > > > >  lib/librte_ethdev/rte_flow.h       | 25 +++++++++++++++++++++++++
> > > > >  lib/librte_mbuf/rte_mbuf.h         | 11 +++++++++++
> > > > >  4 files changed, 58 insertions(+)
> > > > >
> > > > > > diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> > > > > index b305a72..b6e35f1 100644
> > > > > --- a/doc/guides/prog_guide/rte_flow.rst
> > > > > +++ b/doc/guides/prog_guide/rte_flow.rst
> > > > > @@ -1191,6 +1191,27 @@ Normally preceded by any of:
> > > > >  - `Item: ICMP6_ND_NS`_
> > > > >  - `Item: ICMP6_ND_OPT`_
> > > > >
> > > > > +Item: ``META``
> > > > > +^^^^^^^^^^^^^^
> > > > > +
> > > > > +Matches an application specific 64 bit metadata item.
> > > > > +
> > > > > +- Default ``mask`` matches any 64 bit value.
> > > > > +
> > > > > +.. _table_rte_flow_item_meta:
> > > > > +
> > > > > +.. table:: META
> > > > > +
> > > > > +   +----------+----------+---------------------------+
> > > > > +   | Field    | Subfield | Value                     |
> > > > > +   +==========+==========+===========================+
> > > > > +   | ``spec`` | ``data`` | 64 bit metadata value     |
> > > > > +   +----------+--------------------------------------+
> > > > > +   | ``last`` | ``data`` | upper range value         |
> > > > > +   +----------+----------+---------------------------+
> > > > > +   | ``mask`` | ``data`` | zeroed to match any value |
> > > > > +   +----------+----------+---------------------------+
> > > > > +
> > > > >  Actions
> > > > >  ~~~~~~~
> > > > >
> > > > > > diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
> > > > > > index cff4b52..54e5ef8 100644
> > > > > --- a/lib/librte_ethdev/rte_flow.c
> > > > > +++ b/lib/librte_ethdev/rte_flow.c
> > > > > @@ -66,6 +66,7 @@ struct rte_flow_desc_data {
> > > > >  		     sizeof(struct rte_flow_item_icmp6_nd_opt_sla_eth)),
> > > > >  	MK_FLOW_ITEM(ICMP6_ND_OPT_TLA_ETH,
> > > > >  		     sizeof(struct rte_flow_item_icmp6_nd_opt_tla_eth)),
> > > > > +	MK_FLOW_ITEM(META, sizeof(struct rte_flow_item_meta)),
> > > > >  };
> > > > >
> > > > > >  /** Generate flow_action[] entry. */
> > > > > > diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > > > > > index f8ba71c..b81c816 100644
> > > > > --- a/lib/librte_ethdev/rte_flow.h
> > > > > +++ b/lib/librte_ethdev/rte_flow.h
> > > > > @@ -413,6 +413,15 @@ enum rte_flow_item_type {
> > > > >  	 * See struct rte_flow_item_mark.
> > > > >  	 */
> > > > >  	RTE_FLOW_ITEM_TYPE_MARK,
> > > > > +
> > > > > +	/**
> > > > > +	 * [META]
> > > > > +	 *
> > > > > +	 * Matches a metadata value specified in mbuf metadata field.
> > > > > +	 *
> > > > > +	 * See struct rte_flow_item_meta.
> > > > > +	 */
> > > > > +	RTE_FLOW_ITEM_TYPE_META,
> > > > >  };
> > > > >
> > > > >  /**
> > > > > @@ -849,6 +858,22 @@ struct rte_flow_item_gre {  #endif
> > > > >
> > > > >  /**
> > > > > + * RTE_FLOW_ITEM_TYPE_META.
> > > > > + *
> > > > > + * Matches a specified metadata value.
> > > > > + */
> > > > > +struct rte_flow_item_meta {
> > > > > +	uint64_t data;
> > > > > +};
> > > > > +
> > > > > > +/** Default mask for RTE_FLOW_ITEM_TYPE_META. */
> > > > > > +#ifndef __cplusplus
> > > > > > +static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
> > > > > +	.data = RTE_BE64(UINT64_MAX),
> > > > > +};
> > > > > +#endif
> > > > > +
> > > > > +/**
> > > > >   * RTE_FLOW_ITEM_TYPE_FUZZY
> > > > >   *
> > > > >   * Fuzzy pattern match, expect faster than default.
> > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > > > index 9ce5d76..8f06a78 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > @@ -182,6 +182,11 @@
> > > > >  /* add new TX flags here */
> > > > >
> > > > >  /**
> > > > > + * This flag indicates that the metadata field in the mbuf is in use.
> > > > > + */
> > > > > +#define PKT_TX_METADATA		(1ULL << 41)
> > > > > +
> > > > > +/**
> > > > >   * UDP Fragmentation Offload flag. This flag is used for enabling UDP
> > > > > >   * fragmentation in SW or in HW. When use UFO, mbuf->tso_segsz is used
> > > > >   * to store the MSS of UDP fragments.
> > > > > @@ -593,6 +598,12 @@ struct rte_mbuf {
> > > > >  	 */
> > > > >  	struct rte_mbuf_ext_shared_info *shinfo;
> > > > >
> > > > > +	/**
> > > > > +	 * Application specific metadata value for flow rule match.
> > > > > +	 * Valid if PKT_TX_METADATA is set.
> > > > > +	 */
> > > > > +	uint64_t metadata;
> > > > > +
> > >
> > > Just one thought - with that change we'll have only 8 free bytes left inside
> > > rte_mbuf.
> > > I wonder whether this metadata field can be combined with tx_offload or
> > > perhaps the hash fields?
> > > Konstantin
> > 
> > The match on metadata feature is currently implemented for egress, but is planned to be extended for ingress use in the future.
> > Hence the need for dedicated field, detached from Tx specific or Rx specific fields.
> 
> Could you probably explain a bit more how it will be used for ingress?
> As I understand it would be some user defined value associated with particular HW filter.
> Right now mbuf's hash might be used for similar purposes - it can contain flow filter ID.
> Do you expect HW to provide both rss/flow and this new metadata info simultaneously
> for the same packet?

As I replied to Andrew, that would be possible. Metadata can even be used
for flow matching. A flow ID is the result of classification, whereas metadata
carries meaning of its own and could come from another entity.
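
To make the distinction concrete, here is a rough, self-contained sketch in
plain C. It is not DPDK code: every sketch_* name is a hypothetical stand-in
for what the RFC proposes (mbuf->metadata, PKT_TX_METADATA and struct
rte_flow_item_meta), the "NIC" is modelled as a simple rule table, and byte
order is ignored. The application writes a value that is meaningful to it
(a VM id here) into the packet, and the rule matches on that value with the
usual spec/mask comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the RFC's PKT_TX_METADATA flag. */
#define SKETCH_TX_METADATA (1ULL << 41)

/* Hypothetical stand-in for the two mbuf fields involved. */
struct sketch_mbuf {
	uint64_t ol_flags;
	uint64_t metadata; /* meaningful only if SKETCH_TX_METADATA is set */
};

/* Hypothetical stand-in for struct rte_flow_item_meta. */
struct sketch_item_meta {
	uint64_t data;
};

/* One egress rule: match on metadata, then apply an encap action. */
struct sketch_rule {
	struct sketch_item_meta spec;
	struct sketch_item_meta mask;
	int encap_id; /* hypothetical handle for the encap action */
};

/* Application side: tag an egress packet with its VM id. */
static inline void
sketch_set_metadata(struct sketch_mbuf *m, uint64_t vm_id)
{
	assert(m != NULL);
	m->metadata = vm_id;
	m->ol_flags |= SKETCH_TX_METADATA;
}

/*
 * Model of the NIC-side lookup: return the encap_id of the first rule
 * whose masked spec equals the masked packet metadata, or -1 if the
 * packet carries no metadata or no rule matches.
 */
static inline int
sketch_match(const struct sketch_rule *rules, size_t n,
	     const struct sketch_mbuf *m)
{
	size_t i;

	assert(rules != NULL && m != NULL);
	if (!(m->ol_flags & SKETCH_TX_METADATA))
		return -1;
	for (i = 0; i < n; i++) {
		uint64_t mask = rules[i].mask.data;

		if ((m->metadata & mask) == (rules[i].spec.data & mask))
			return rules[i].encap_id;
	}
	return -1;
}
```

With two rules for VM ids 7 and 9, each using a full 64-bit mask, a packet
tagged with 9 selects the second rule's encap action, while a packet without
the flag set matches nothing. The value is chosen by the application rather
than produced by classification, which is the difference from a flow ID.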

Yongseok

> Konstantin
> 
> > Dekel
> > 
> > >
> > >
> > > > >  } __rte_cache_aligned;
> > > > >
> > > > >  /**
> > > > > --
> > > > > 1.8.3.1
> 

Thread overview: 16+ messages
2018-08-13  7:46 Dekel Peled
2018-08-13  8:03 ` Dekel Peled
2018-08-21 13:08   ` Ananyev, Konstantin
2018-08-22  7:59     ` Dekel Peled
2018-08-22 12:13       ` Ananyev, Konstantin
2018-08-23 21:34         ` Yongseok Koh [this message]
2018-08-23 15:34   ` Ferruh Yigit
2018-08-22 13:31 ` Andrew Rybchenko
2018-08-23 21:31   ` Yongseok Koh
2018-08-24 10:11     ` Ananyev, Konstantin
2018-08-28 19:15       ` Yongseok Koh
2018-08-26 14:09 ` [dpdk-dev] [RFC v2] " Dekel Peled
2018-08-28 19:44   ` Yongseok Koh
2018-08-29  6:33     ` Dekel Peled
2018-08-29 12:06       ` Somnath Kotur
2018-08-30  6:02         ` Dekel Peled
