DPDK patches and discussions
From: Yongseok Koh <yskoh@mellanox.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: Andrew Rybchenko <arybchenko@solarflare.com>,
	Dekel Peled <dekelp@mellanox.com>, "dev@dpdk.org" <dev@dpdk.org>,
	Ori Kam <orika@mellanox.com>,
	Shahaf Shuler <shahafs@mellanox.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>,
	Adrien Mazarguil <adrien.mazarguil@6wind.com>,
	Olivier Matz <olivier.matz@6wind.com>
Subject: Re: [dpdk-dev] [RFC] ethdev: support metadata as flow rule criteria
Date: Tue, 28 Aug 2018 19:15:42 +0000
Message-ID: <F596B1F3-B1DA-4DC2-9EA4-1D78A9D351B0@mellanox.com>
In-Reply-To: <2601191342CEEE43887BDE71AB977258E9FA51C7@IRSMSX102.ger.corp.intel.com>

> On Aug 24, 2018, at 3:11 AM, Ananyev, Konstantin <konstantin.ananyev@intel.com> wrote:
> 
> 
> 
>> -----Original Message-----
>> From: Yongseok Koh [mailto:yskoh@mellanox.com]
>> Sent: Thursday, August 23, 2018 10:32 PM
>> To: Andrew Rybchenko <arybchenko@solarflare.com>
>> Cc: Dekel Peled <dekelp@mellanox.com>; dev@dpdk.org; orika@mellanox.com; shahafs@mellanox.com; Thomas Monjalon
>> <thomas@monjalon.net>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>; Adrien
>> Mazarguil <adrien.mazarguil@6wind.com>; Olivier Matz <olivier.matz@6wind.com>
>> Subject: Re: [dpdk-dev] [RFC] ethdev: support metadata as flow rule criteria
>> 
>> On Wed, Aug 22, 2018 at 04:31:14PM +0300, Andrew Rybchenko wrote:
>>> On 13.08.2018 10:46, Dekel Peled wrote:
>>>> Current implementation of rte_flow allows match pattern of flow rule,
>>>> based on packet data or header fields.
>>>> This limits the application use of match patterns.
>>>> 
>>>> For example, consider a vswitch application which controls a set of VMs,
>>>> connected with virtio, in a fabric with overlay of VXLAN.
>>>> Several VMs can have the same inner tuple, while the outer tuple is
>>>> different and controlled by the vswitch (encap action).
>>>> For the vswitch to be able to offload the rule to the NIC, it must use a
>>>> unique match criterion, independent from the inner tuple, to perform the
>>>> encap action.
>>>> 
>>>> This RFC adds support for additional metadata to use as match pattern.
>>>> The metadata is an opaque item, fully controlled by the application.
>>>> 
>>>> The use of metadata is relevant for egress rules only.
>>>> It can be set in the flow rule using the RTE_FLOW_ITEM_TYPE_META item.
>>>> 
>>>> Application should set the packet metadata in the mbuf->metadata field,
>>>> and set the PKT_TX_METADATA flag in the mbuf->ol_flags.
>>>> The NIC will use the packet metadata as match criteria for relevant flow
>>>> rules.
>>>> 
>>>> For example, to do an encap action depending on the VM id, the
>>>> application needs to configure 'match on metadata' rte_flow rule with
>>>> VM id as metadata, along with desired encap action.
>>>> When preparing an egress data packet, application will set VM id data in
>>>> mbuf metadata field and set PKT_TX_METADATA flag.
>>>> 
>>>> PMD will send data packets to NIC, with VM id as metadata.
>>>> Egress flow on NIC will match metadata as done with other criteria.
>>>> Upon match on metadata (VM id) the appropriate encap action will be
>>>> performed.
>>>> 
>>>> This RFC introduces metadata item type for rte_flow RTE_FLOW_ITEM_TYPE_META,
>>>> along with corresponding struct rte_flow_item_meta and ol_flag
>>>> PKT_TX_METADATA.
>>>> It also enhances struct rte_mbuf with new data item, uint64_t metadata.
>>>> 
>>>> Comments are welcome.
>>>> 
>>>> Signed-off-by: Dekel Peled <dekelp@mellanox.com>
>>>> ---
>>>>  doc/guides/prog_guide/rte_flow.rst | 21 +++++++++++++++++++++
>>>>  lib/librte_ethdev/rte_flow.c       |  1 +
>>>>  lib/librte_ethdev/rte_flow.h       | 25 +++++++++++++++++++++++++
>>>>  lib/librte_mbuf/rte_mbuf.h         | 11 +++++++++++
>>>>  4 files changed, 58 insertions(+)
>>>> 
>>>> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
>>>> index b305a72..b6e35f1 100644
>>>> --- a/doc/guides/prog_guide/rte_flow.rst
>>>> +++ b/doc/guides/prog_guide/rte_flow.rst
>>>> @@ -1191,6 +1191,27 @@ Normally preceded by any of:
>>>>  - `Item: ICMP6_ND_NS`_
>>>>  - `Item: ICMP6_ND_OPT`_
>>>> +Item: ``META``
>>>> +^^^^^^^^^^^^^^
>>>> +
>>>> +Matches an application specific 64 bit metadata item.
>>>> +
>>>> +- Default ``mask`` matches any 64 bit value.
>>>> +
>>>> +.. _table_rte_flow_item_meta:
>>>> +
>>>> +.. table:: META
>>>> +
>>>> +   +----------+----------+---------------------------+
>>>> +   | Field    | Subfield | Value                     |
>>>> +   +==========+==========+===========================+
>>>> +   | ``spec`` | ``data`` | 64 bit metadata value     |
>>>> +   +----------+----------+---------------------------+
>>>> +   | ``last`` | ``data`` | upper range value         |
>>>> +   +----------+----------+---------------------------+
>>>> +   | ``mask`` | ``data`` | zeroed to match any value |
>>>> +   +----------+----------+---------------------------+
>>>> +
>>>>  Actions
>>>>  ~~~~~~~
>>>> diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
>>>> index cff4b52..54e5ef8 100644
>>>> --- a/lib/librte_ethdev/rte_flow.c
>>>> +++ b/lib/librte_ethdev/rte_flow.c
>>>> @@ -66,6 +66,7 @@ struct rte_flow_desc_data {
>>>>  		     sizeof(struct rte_flow_item_icmp6_nd_opt_sla_eth)),
>>>>  	MK_FLOW_ITEM(ICMP6_ND_OPT_TLA_ETH,
>>>>  		     sizeof(struct rte_flow_item_icmp6_nd_opt_tla_eth)),
>>>> +	MK_FLOW_ITEM(META, sizeof(struct rte_flow_item_meta)),
>>>>  };
>>>>  /** Generate flow_action[] entry. */
>>>> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
>>>> index f8ba71c..b81c816 100644
>>>> --- a/lib/librte_ethdev/rte_flow.h
>>>> +++ b/lib/librte_ethdev/rte_flow.h
>>>> @@ -413,6 +413,15 @@ enum rte_flow_item_type {
>>>>  	 * See struct rte_flow_item_mark.
>>>>  	 */
>>>>  	RTE_FLOW_ITEM_TYPE_MARK,
>>>> +
>>>> +	/**
>>>> +	 * [META]
>>>> +	 *
>>>> +	 * Matches a metadata value specified in mbuf metadata field.
>>>> +	 *
>>>> +	 * See struct rte_flow_item_meta.
>>>> +	 */
>>>> +	RTE_FLOW_ITEM_TYPE_META,
>>>>  };
>>>>  /**
>>>> @@ -849,6 +858,22 @@ struct rte_flow_item_gre {
>>>>  #endif
>>>>  /**
>>>> + * RTE_FLOW_ITEM_TYPE_META.
>>>> + *
>>>> + * Matches a specified metadata value.
>>>> + */
>>>> +struct rte_flow_item_meta {
>>>> +	uint64_t data;
>>>> +};
>>>> +
>>>> +/** Default mask for RTE_FLOW_ITEM_TYPE_META. */
>>>> +#ifndef __cplusplus
>>>> +static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
>>>> +	.data = RTE_BE64(UINT64_MAX),
>>>> +};
>>>> +#endif
>>>> +
>>>> +/**
>>>>   * RTE_FLOW_ITEM_TYPE_FUZZY
>>>>   *
>>>>   * Fuzzy pattern match, expect faster than default.
>>>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>>>> index 9ce5d76..8f06a78 100644
>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>> @@ -182,6 +182,11 @@
>>>>  /* add new TX flags here */
>>>>  /**
>>>> + * This flag indicates that the metadata field in the mbuf is in use.
>>>> + */
>>>> +#define PKT_TX_METADATA		(1ULL << 41)
>>>> +
>>>> +/**
>>>>   * UDP Fragmentation Offload flag. This flag is used for enabling UDP
>>>>   * fragmentation in SW or in HW. When use UFO, mbuf->tso_segsz is used
>>>>   * to store the MSS of UDP fragments.
>>>> @@ -593,6 +598,12 @@ struct rte_mbuf {
>>>>  	 */
>>>>  	struct rte_mbuf_ext_shared_info *shinfo;
>>>> +	/**
>>>> +	 * Application specific metadata value for flow rule match.
>>>> +	 * Valid if PKT_TX_METADATA is set.
>>>> +	 */
>>>> +	uint64_t metadata;
>>>> +
>>> 
>>> I don't see the difference from hash union which is 64-bit wide as well.
>>> hash.fdir.hi is used by flow mark action and mark match item (but just
>>> 32-bit).
>> 
>> Rx metadata would be different from flow mark ID. Mark ID is set when the flow
>> is created (it is a kind of marking classification result) but metadata could be
>> sent by other entity, e.g. VM-to-VM traffic or VM-to-HV traffic.
> 
> Ok, but it could be either rss OR flow id OR metadata (based on ol_flags) -
> hash is a union after all.
> Konstantin

I'm not sure I follow. Why can't both (flow ID and metadata) be set in an mbuf?
Why do you think it has to be exclusive?

Thanks,
Yongseok
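
To make the question above concrete, here is a minimal sketch of the two
layouts being discussed. It is not DPDK code: struct sketch_mbuf is a
hypothetical, heavily simplified stand-in for struct rte_mbuf, and the helper
names and the meta64 union member are invented for illustration. Only the
PKT_TX_METADATA value and the separate 64-bit metadata field come from the
quoted patch. The sketch assumes the mark ID lives in hash.fdir.hi, as noted
earlier in the thread.

```c
#include <stdint.h>

/* Flag value taken from the quoted patch. */
#define PKT_TX_METADATA (1ULL << 41)

/* Hypothetical simplified mbuf; the real struct rte_mbuf is much larger. */
struct sketch_mbuf {
	uint64_t ol_flags;
	union {
		uint32_t rss;           /* RSS hash result */
		struct {
			uint32_t lo;
			uint32_t hi;    /* flow mark ID lives here */
		} fdir;
		uint64_t meta64;        /* hypothetical: metadata inside the union */
	} hash;
	uint64_t metadata;              /* separate field, as proposed in the RFC */
};

/* Separate-field layout: store metadata and flag it as valid for Tx. */
static void sketch_set_tx_metadata(struct sketch_mbuf *m, uint64_t value)
{
	m->metadata = value;
	m->ol_flags |= PKT_TX_METADATA;
}

/*
 * Union layout: writing 64-bit metadata into the hash union overwrites
 * the mark ID, so the two values become mutually exclusive.
 * Returns 1 if the mark ID was clobbered, 0 otherwise.
 */
static int sketch_union_write_clobbers_mark(void)
{
	struct sketch_mbuf m = {0};

	m.hash.fdir.hi = 0x1234;        /* mark ID set first */
	m.hash.meta64 = UINT64_MAX;     /* metadata stored in the same union */
	return m.hash.fdir.hi != 0x1234;
}
```

With the separate field, a caller can set hash.fdir.hi and then call
sketch_set_tx_metadata() without losing the mark ID. With the union member,
the second write replaces the first, which is the exclusivity Konstantin
describes.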
