DPDK patches and discussions
 help / color / mirror / Atom feed
From: Slava Ovsiienko <viacheslavo@mellanox.com>
To: Andrew Rybchenko <arybchenko@solarflare.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, Matan Azrad <matan@mellanox.com>,
	Ori Kam <orika@mellanox.com>, Yongseok Koh <yskoh@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
Date: Wed, 30 Oct 2019 10:03:50 +0000	[thread overview]
Message-ID: <AM4PR05MB3265C716B1F9534E7E48F2D1D2600@AM4PR05MB3265.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <AM4PR05MB3265063D6BBC352B1E580086D2600@AM4PR05MB3265.eurprd05.prod.outlook.com>

> -----Original Message-----
> From: Slava Ovsiienko
> Sent: Wednesday, October 30, 2019 11:00
> To: Andrew Rybchenko <arybchenko@solarflare.com>; Thomas Monjalon
> <thomas@monjalon.net>; olivier.matz@6wind.com
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Ori Kam
> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> Subject: RE: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> 
> > -----Original Message-----
> > From: Andrew Rybchenko <arybchenko@solarflare.com>
> > Sent: Wednesday, October 30, 2019 9:35
> > To: Slava Ovsiienko <viacheslavo@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; olivier.matz@6wind.com
> > Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Ori Kam
> > <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> > Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> >
> > @Olivier, please, take a look at the end of the mail.
> >
> > On 10/29/19 8:19 PM, Slava Ovsiienko wrote:
> > > Hi, Andrew
> > >
> > > Thank you for the review.
> > >
> > >> -----Original Message-----
> > >> From: Andrew Rybchenko <arybchenko@solarflare.com>
> > >> Sent: Tuesday, October 29, 2019 18:22
> > >> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> > >> Cc: Thomas Monjalon <thomas@monjalon.net>; Matan Azrad
> > >> <matan@mellanox.com>; olivier.matz@6wind.com; Ori Kam
> > >> <orika@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>
> > >> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: extend flow metadata
> > >>
> > >> On 10/27/19 9:40 PM, Viacheslav Ovsiienko wrote:
> > >>> Currently, metadata can be set on egress path via mbuf tx_metadata
> > >>> field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META
> > >> matches metadata.
> > >>> This patch extends the metadata feature usability.
> > >>>
> > >>> 1) RTE_FLOW_ACTION_TYPE_SET_META
> > >>>
> > >>> When supporting multiple tables, Tx metadata can also be set by a
> > >>> rule and matched by another rule. This new action allows metadata
> > >>> to be set as a result of flow match.
> > >>>
> > >>> 2) Metadata on ingress
> > >>>
> > >>> There's also need to support metadata on ingress. Metadata can be
> > >>> set by SET_META action and matched by META item like Tx. The final
> > >>> value set by the action will be delivered to application via
> > >>> metadata dynamic field of mbuf which can be accessed by
> > >> RTE_FLOW_DYNF_METADATA().
> > >>> PKT_RX_DYNF_METADATA flag will be set along with the data.
> > >>>
> > >>> The mbuf dynamic field must be registered by calling
> > >>> rte_flow_dynf_metadata_register() prior to use SET_META action.
> > >>>
> > >>> The availability of dynamic mbuf metadata field can be checked
> > >>> with
> > >>> rte_flow_dynf_metadata_avail() routine.
> > >>>
> > >>> For loopback/hairpin packet, metadata set on Rx/Tx may or may not
> > >>> be propagated to the other path depending on hardware capability.
> > >>>
> > >>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > >>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > >> Above explanations lack information about "meta" vs "mark" which
> > >> may be set on Rx as well and delivered in other mbuf field.
> > >> It should be explained by one more field is required and rules defined.
> > > There is some story about metadata features.
> > > Initially, there were proposed two metadata related actions:
> > >
> > > - RTE_FLOW_ACTION_TYPE_FLAG
> > > - RTE_FLOW_ACTION_TYPE_MARK
> > >
> > > These actions set the special flag in the packet metadata, MARK
> > > action stores some specified value in the metadata storage, and, on
> > > the packet receiving PMD puts the flag and value to the mbuf and
> > > applications can see the packet was threated inside flow engine
> > > according to the appropriate RTE flow(s). MARK and FLAG are like
> > > some kind of gateway to transfer some per-packet information from
> > > the flow
> > engine to the application via receiving datapath.
> > >
> > >  From the datapath point of view, the MARK and FLAG are related to
> > > the
> > receiving side only.
> > > It would useful to have the same gateway on the transmitting side
> > > and there was the feature of type RTE_FLOW_ITEM_TYPE_META was
> > proposed.
> > > The application can fill the field in mbuf and this value will be
> > > transferred to
> > some field in the packet metadata inside the flow engine.
> > > It did not matter whether these metadata fields are shared because
> > > of MARK and META items belonged to different domains (receiving and
> > transmitting) and could be vendor-specific.
> > >
> > > So far, so good, DPDK proposes some entities to control metadata
> > > inside the flow engine and gateways to exchange these values on a
> > > per-
> > packet basis via datapaths.
> > >
> > > As we can see, the MARK and META means are not symmetric, there is
> > > absent action which would allow us to set META value on the
> > > transmitting
> > path. So, the action of type:
> > > - RTE_FLOW_ACTION_TYPE_SET_META is proposed.
> > >
> > > The next, applications raise the new requirements for packet metadata.
> > > The flow engines are getting more complex, internal switches are
> > > introduced, multiple ports might be supported within the same flow
> > > engine namespace. From the DPDK points of view, it means the packets
> > > might be sent on one eth_dev port and received on the other one, and
> > > the
> > packet path inside the flow engine entirely belongs to the same
> > hardware device. The simplest example is SR-IOV with PF, VFs and the
> representors.
> > > And there is a brilliant opportunity to provide some out-of-band
> > > channel to
> > transfer some extra data
> > >   from one port to another one, besides the packet data itself.
> > >
> > >
> > >> Above explanations lack information about "meta" vs "mark" which
> > >> may be set on Rx as well and delivered in other mbuf field.
> > >> It should be explained by one more field is required and rules defined.
> > >> Otherwise we can endup in half PMDs supporting mark only, half PMDs
> > >> supporting meta only and applications in an interesting situation
> > >> to make a choice which one to use.
> > > There is no "mark" vs "meta". MARK and META means are kept for
> > > compatibility issues and legacy part works exactly as before. The
> > > trials (with flow_validate)  is supposed to check whether PMD
> > > supports MARK or META feature on appropriate domain. It depends on
> > > PMD implementation, configuration and underlaying HW/FW/kernel
> > > capabilities
> > and should be resolved in runtime.
> >
> > The trials a way, but very tricky way. My imagination draws me
> > pictures how an application code could look like in attempt to use
> > either mark or meta for Rx only and these pictures are not nice.
> Agree, trials is not the best way.
> For now there is no application trying to choose mark or meta, because
> these ones belonged to other domains, and extension is newly introduced.
> So, the applications using mark will continue use mark, the same is regarding
> meta.
> The new application definitely will ask for both mark and of them, we have
> the requirements from customers  - "give us as many through bits as you
> can".
> This new application just may refuse to work if metadata features are not
> detected, because relays on it strongly.
> BTW, trials are not so complex: rte_flow_validate(mark),
> rte_flow_validate_metadata() and that's it.
> 
> > May be it will look acceptable when mark becomes a dynamic since usage
> > of either one or another dynamic field is definitely easier than usage
> > of either fixed or dynamic field.
> At least in PMD datapath it does not look very ugly.
> >
> > May be dynamic field for mark at fixed offset should be introduced in
> > the release or the nearest future? It will allow to preserve ABI up to
> > 20.11 and provide future proof API.
> > The trick is to register dynamic meta field at fixed offset at start
> > of a day to be sure that it is guaranteed to succeed.
> > It sounds like it is a transition mechanism from fixed to dynamic fields.
> >
> > [snip]
> >
> > >>> diff --git a/lib/librte_ethdev/rte_flow.h
> > >>> b/lib/librte_ethdev/rte_flow.h index 4fee105..b821557 100644
> > >>> --- a/lib/librte_ethdev/rte_flow.h
> > >>> +++ b/lib/librte_ethdev/rte_flow.h
> > >>> @@ -28,6 +28,8 @@
> > >>>    #include <rte_byteorder.h>
> > >>>    #include <rte_esp.h>
> > >>>    #include <rte_higig.h>
> > >>> +#include <rte_mbuf.h>
> > >>> +#include <rte_mbuf_dyn.h>
> > >>>
> > >>>    #ifdef __cplusplus
> > >>>    extern "C" {
> > >>> @@ -418,7 +420,8 @@ enum rte_flow_item_type {
> > >>>    	/**
> > >>>    	 * [META]
> > >>>    	 *
> > >>> -	 * Matches a metadata value specified in mbuf metadata field.
> > >>> +	 * Matches a metadata value.
> > >>> +	 *
> > >>>    	 * See struct rte_flow_item_meta.
> > >>>    	 */
> > >>>    	RTE_FLOW_ITEM_TYPE_META,
> > >>> @@ -1263,9 +1266,17 @@ struct
> rte_flow_item_icmp6_nd_opt_tla_eth
> > {
> > >>>    #endif
> > >>>
> > >>>    /**
> > >>> - * RTE_FLOW_ITEM_TYPE_META.
> > >>> + * @warning
> > >>> + * @b EXPERIMENTAL: this structure may change without prior
> > >>> + notice
> > >> Is it allowed to make experimental back?
> > > I think we should remove EXPERIMENTAL here. We do not introduce new
> > > feature, but just extend the apply area.
> >
> > Agreed.
> >
> > >>>     *
> > >>> - * Matches a specified metadata value.
> > >>> + * RTE_FLOW_ITEM_TYPE_META
> > >>> + *
> > >>> + * Matches a specified metadata value. On egress, metadata can be
> > >>> + set either by
> > >>> + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > >>> + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > >>> + RTE_FLOW_ACTION_TYPE_SET_META sets
> > >>> + * metadata for a packet and the metadata will be reported via
> > >>> + mbuf metadata
> > >>> + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic
> > mbuf
> > >>> + field must be
> > >>> + * registered in advance by rte_flow_dynf_metadata_register().
> > >>>     */
> > >>>    struct rte_flow_item_meta {
> > >>>    	rte_be32_t data;
> > >> [snip]
> > >>
> > >>> @@ -2429,6 +2447,55 @@ struct rte_flow_action_set_mac {
> > >>>    	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
> > >>>    };
> > >>>
> > >>> +/**
> > >>> + * @warning
> > >>> + * @b EXPERIMENTAL: this structure may change without prior
> > >>> +notice
> > >>> + *
> > >>> + * RTE_FLOW_ACTION_TYPE_SET_META
> > >>> + *
> > >>> + * Set metadata. Metadata set by mbuf tx_metadata field with
> > >>> + * PKT_TX_METADATA flag on egress will be overridden by this action.
> > >>> +On
> > >>> + * ingress, the metadata will be carried by mbuf metadata dynamic
> > >>> +field
> > >>> + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> > >>> +must be
> > >>> + * registered in advance by rte_flow_dynf_metadata_register().
> > >>> + *
> > >>> + * Altering partial bits is supported with mask. For bits which
> > >>> +have never
> > >>> + * been set, unpredictable value will be seen depending on driver
> > >>> + * implementation. For loopback/hairpin packet, metadata set on
> > >>> +Rx/Tx may
> > >>> + * or may not be propagated to the other path depending on HW
> > >> capability.
> > >>> + *
> > >>> + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > >>> + */
> > >>> +struct rte_flow_action_set_meta {
> > >>> +	rte_be32_t data;
> > >>> +	rte_be32_t mask;
> > >> As I understand tx_metadata is host endian. Just double-checking.
> > >> Is a new dynamic field host endian or big endian?
> > >> I definitely would like to see motivation in comments why data/mask
> > >> are big- endian here.
> > > metadata is opaque value, endianness does not matter, there are no
> > > some special motivations for choosing endiannes.
> > > rte_flow_item_meta() structure provides data with rte_be32_t type,
> > > so meta related action does
> > the same.
> >
> > Endianness of meta in mbuf and flow API should match and it must be
> Yes, and they match (for meta), both are rte_be32_t. OK, will emphasize it in
> docs.
> 
> > documented. Endianness is important if a HW supports less bits since
> > it makes a hit for application to use LSB first if the bit space is sufficient.
> > mark is defined as host-endian (uint32_t) and I think meta should be
> > the same. Otherwise it complicates even more either mark or meta usage
> > as discussed above .
> mark is mark, meta is meta, these ones are not interrelated (despite they are
> becoming similar). And there is no choice between mark and meta.
> It is not obvious we should have the same endianness for both.
> There are some applications using this legacy features, so there might be
> compatibility issues either.
> 
> >
> > Yes, I think that rte_flow_item_meta should be fixed since both mark
> > and tx_metadata are host-endian.
> >
> > (it says nothing about HW interface which is vendor specific and
> > vendor PMDs should care about it)
> Handling this in PMD might introduce the extra endianness conversion in
> datapath, impacting performance. Not nice, IMO.

Update: I've reviewed other [META] items/actions, all of them are in
host endianness. OK, we are moving tx_metadata to dynfield, it seems to
be good point to alter metadata endianness. And I see the way to avoid
conversions in data path

With best regards, Slava

> >
> > > I could assume the origin of selecting bigendian type was the
> > > endianness of metadata field in Tx descriptor of ConnectX NICs.
> > >
> > >>> +};
> > >>> +
> > >>> +/* Mbuf dynamic field offset for metadata. */ extern int
> > >>> +rte_flow_dynf_metadata_offs;
> > >>> +
> > >>> +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > >>> +rte_flow_dynf_metadata_mask;
> > >> These two global variables look frightening to me.
> > >> It does not look good to me.
> > > For me too. But we need the performance, these ones are intended for
> > > usage in datapath, any overhead is painful.
> >
> > @Olivier, could you share your thoughts, please.
> >
> > Andrew.
> 
> With best regards, Slava


  parent reply	other threads:[~2019-10-30 10:03 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-03 21:32 [dpdk-dev] [RFC 1/3] " Yongseok Koh
2019-06-03 21:32 ` [dpdk-dev] [RFC 2/3] ethdev: add flow modify mark action Yongseok Koh
2019-06-06 10:35   ` Jerin Jacob Kollanukkaran
2019-06-06 18:33     ` Yongseok Koh
2019-06-03 21:32 ` [dpdk-dev] [RFC 3/3] ethdev: add flow tag Yongseok Koh
2019-07-04 23:23   ` [dpdk-dev] [PATCH] " Yongseok Koh
2019-07-05 13:54     ` Adrien Mazarguil
2019-07-05 18:05       ` Yongseok Koh
2019-07-08 23:32         ` Yongseok Koh
2019-07-09  8:38         ` Adrien Mazarguil
2019-07-11  1:59           ` Yongseok Koh
2019-10-08 12:57             ` Yigit, Ferruh
2019-10-08 13:18               ` Slava Ovsiienko
2019-10-10 16:09     ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
2019-10-24 13:12       ` [dpdk-dev] [PATCH v3] " Viacheslav Ovsiienko
2019-10-27 16:38         ` Ori Kam
2019-10-27 18:42         ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
2019-10-27 19:11           ` Ori Kam
2019-10-31 18:57             ` Ferruh Yigit
2019-06-09 14:23 ` [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata Andrew Rybchenko
2019-06-10  3:19   ` Wang, Haiyue
2019-06-10  7:20     ` Andrew Rybchenko
2019-06-11  0:06       ` Yongseok Koh
2019-06-19  9:05         ` Andrew Rybchenko
2019-07-04 23:21 ` [dpdk-dev] [PATCH] " Yongseok Koh
2019-07-10  9:31   ` Olivier Matz
2019-07-10  9:55     ` Bruce Richardson
2019-07-10 10:07       ` Olivier Matz
2019-07-10 12:01         ` Bruce Richardson
2019-07-10 12:26           ` Thomas Monjalon
2019-07-10 16:37             ` Yongseok Koh
2019-07-11  7:44               ` Adrien Mazarguil
2019-07-14 11:46                 ` Andrew Rybchenko
2019-07-29 15:06                   ` Adrien Mazarguil
2019-10-08 12:51                     ` Yigit, Ferruh
2019-10-08 13:17                       ` Slava Ovsiienko
2019-10-10 16:02   ` [dpdk-dev] [PATCH v2] " Viacheslav Ovsiienko
2019-10-18  9:22     ` Olivier Matz
2019-10-19 19:47       ` Slava Ovsiienko
2019-10-21 16:37         ` Olivier Matz
2019-10-24  6:49           ` Slava Ovsiienko
2019-10-24  9:22             ` Olivier Matz
2019-10-24 12:30               ` Slava Ovsiienko
2019-10-24 13:08     ` [dpdk-dev] [PATCH v3] " Viacheslav Ovsiienko
2019-10-27 16:56       ` Ori Kam
2019-10-27 18:40       ` [dpdk-dev] [PATCH v4] " Viacheslav Ovsiienko
2019-10-27 19:10         ` Ori Kam
2019-10-29 16:22         ` Andrew Rybchenko
2019-10-29 17:19           ` Slava Ovsiienko
2019-10-29 18:30             ` Thomas Monjalon
2019-10-29 18:35               ` Slava Ovsiienko
2019-10-30  6:28               ` Andrew Rybchenko
2019-10-30  7:35             ` Andrew Rybchenko
2019-10-30  8:59               ` Slava Ovsiienko
2019-10-30  9:20                 ` Andrew Rybchenko
2019-10-30 10:05                   ` Slava Ovsiienko
2019-10-30 10:03                 ` Slava Ovsiienko [this message]
2019-10-30 15:49               ` Olivier Matz
2019-10-31  9:25                 ` Andrew Rybchenko
2019-10-29 16:25         ` Olivier Matz
2019-10-29 16:33           ` Olivier Matz
2019-10-29 17:53             ` Slava Ovsiienko
2019-10-29 17:43           ` Slava Ovsiienko
2019-10-29 19:31         ` [dpdk-dev] [PATCH v5] " Viacheslav Ovsiienko
2019-10-30  8:02           ` Andrew Rybchenko
2019-10-30 14:40             ` Slava Ovsiienko
2019-10-30 14:46               ` Slava Ovsiienko
2019-10-30 15:20                 ` Olivier Matz
2019-10-30 15:57                   ` Thomas Monjalon
2019-10-30 15:58                   ` Slava Ovsiienko
2019-10-30 16:13                     ` Olivier Matz
2019-10-30  8:35           ` Ori Kam
2019-10-30 17:12           ` [dpdk-dev] [PATCH v6 0/2] extend flow metadata feature Viacheslav Ovsiienko
2019-10-30 17:12             ` [dpdk-dev] [PATCH v6 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
2019-10-31  9:19               ` Andrew Rybchenko
2019-10-31 13:05               ` [dpdk-dev] [PATCH v7 0/2] extend flow metadata feature Viacheslav Ovsiienko
2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
2019-10-31 15:47                   ` Olivier Matz
2019-10-31 16:13                     ` Slava Ovsiienko
2019-10-31 16:48                   ` [dpdk-dev] [PATCH v8 0/2] extend flow metadata feature Viacheslav Ovsiienko
2019-10-31 16:48                     ` [dpdk-dev] [PATCH v8 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
2019-11-04  6:13                       ` [dpdk-dev] [PATCH v9 0/2] extend flow metadata feature Viacheslav Ovsiienko
2019-11-04  6:13                         ` [dpdk-dev] [PATCH v9 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
2019-11-05 14:19                           ` [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature Viacheslav Ovsiienko
2019-11-05 14:19                             ` [dpdk-dev] [PATCH v10 1/2] ethdev: extend flow metadata Viacheslav Ovsiienko
2019-11-05 14:19                             ` [dpdk-dev] [PATCH v10 2/2] ethdev: move egress metadata to dynamic field Viacheslav Ovsiienko
2019-11-06 15:49                             ` [dpdk-dev] [PATCH v10 0/2] extend flow metadata feature Ferruh Yigit
2019-11-04  6:13                         ` [dpdk-dev] [PATCH v9 2/2] ethdev: move egress metadata to dynamic field Viacheslav Ovsiienko
2019-10-31 16:48                     ` [dpdk-dev] [PATCH v8 " Viacheslav Ovsiienko
2019-10-31 17:21                       ` Olivier Matz
2019-11-01 12:34                       ` Andrew Rybchenko
2019-10-31 13:05                 ` [dpdk-dev] [PATCH v7 " Viacheslav Ovsiienko
2019-10-31 13:33                   ` Ori Kam
2019-10-31 15:51                   ` Olivier Matz
2019-10-31 16:07                     ` Slava Ovsiienko
2019-10-30 17:12             ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
2019-10-31  9:01               ` Andrew Rybchenko
2019-10-31 10:54                 ` Slava Ovsiienko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AM4PR05MB3265C716B1F9534E7E48F2D1D2600@AM4PR05MB3265.eurprd05.prod.outlook.com \
    --to=viacheslavo@mellanox.com \
    --cc=arybchenko@solarflare.com \
    --cc=dev@dpdk.org \
    --cc=matan@mellanox.com \
    --cc=olivier.matz@6wind.com \
    --cc=orika@mellanox.com \
    --cc=thomas@monjalon.net \
    --cc=yskoh@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).