DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Dumitrescu, Cristian" <cristian.dumitrescu@intel.com>,
	"Zhang, Qi Z" <qi.z.zhang@intel.com>, <thomas@monjalon.net>,
	<orika@nvidia.com>, <david.marchand@redhat.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	<jerinj@marvell.com>, <ferruh.yigit@amd.com>
Cc: <cristian.dumiterscu@intel.com>, <techboard@dpdk.org>,
	"Mcnamara, John" <john.mcnamara@intel.com>,
	"Zhang, Helin" <helin.zhang@intel.com>, <dev@dpdk.org>
Subject: RE: [PATCH] ethdev: introduce generic flow item and action
Date: Wed, 2 Aug 2023 16:54:11 +0200	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D87ABE@smartserver.smartshare.dk> (raw)
In-Reply-To: <DS0PR11MB744284E43AC27900FCD3FE26EB0BA@DS0PR11MB7442.namprd11.prod.outlook.com>

> From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
> Sent: Wednesday, 2 August 2023 16.06
> 
> Hi Morten,
> 
> Thanks for your reply, comments inline below.
> 
> > From: Morten Brørup <mb@smartsharesystems.com>
> > Sent: Wednesday, August 2, 2023 11:25 AM
> >
> > > From: Qi Zhang [mailto:qi.z.zhang@intel.com]
> > > Sent: Wednesday, 2 August 2023 19.35
> > >
> > > From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > >
> > > For network devices that are programmable through languages such as
> > > the P4 language, there are no pre-defined flow items and actions.
> > >
> > > The format of the protocol header and metadata fields that are used to
> > > specify the flow items that make up the flow pattern, as well as the
> > > flow actions, are all defined by the program, with an infinity of
> > > possible combinations, as opposed to being selected from a finite
> > > pre-defined list.
> > >
> > > It is virtually impossible to pre-define all the flow items and the
> > > flow actions that programs might ever use, as these are only limited
> > > by the set of HW resources and the program developer's imagination.
> > >
> > > To support the programmable network devices, we are introducing:
> > >
> > > * A generic flow item: The flow item is expressed as an array of bytes
> > > of a given length, whose meaning is defined by the program loaded by
> > > the network device.
> >
> > The flow item is not "generic", it is "opaque": Only the application knows
> what
> this flow item does.
> 
> This item is definitely not opaque, as it does not contain any
> pointers/handles.
> It actually contains the exact bytes of the field in clear text, so how is
> this opaque?

The flow item is opaque in the sense that the DPDK library has no way to interpret it. E.g., unlike the RAW item, it cannot be debug-dumped to some human readable text format by a future DPDK library function.

> 
> We already have flow items in RTE_FLOW such as RAW, FUZZY, META, TAG, FLEX
> whose meaning is fully defined by the application, as you state, with some of
> them
> (like FLEX) containing pointers or indices that are completely "opaque". We
> could
> use one of these existing items, but we think it would be cleaner to define a
> new
> one that is more aligned to this specific purpose.
> 
> The way I would read your comment is that the _format_ of the flow item is
> defined by the application. But this is how the P4-programmable HW devices
> from
> Intel and other vendors work: the P4 program loaded by the device (i.e. the
> data
> plane application) defines the _format_ of the flow items.

Yes, the _format_ is opaque to DPDK, and only understood by the application.

> 
> >
> > I hate the concept for two reasons:
> > 1. The inability for applications to detect which flow items the underlying
> > hardware supports.
> 
> Does RTE_FLOW have any capability to detect what flows are supported by the
> underlying HW support?

I don't know. But a capability query API could be implemented.

> How is this a problem introduced by this proposal?

With this proposal, querying for capabilities seems impossible.

Maybe not.

> 
> > 2. The risk that vendors will use this instead of introducing new flow item
> > types, available for anyone to implement.
> 
> As stated above, there are already many existing flow items in RTE_FLOW that
> do just
> thisc(and even worse, as some of them contain opaque pointers or indices),
> such as
> RAW, FUZZY, META, TAG, FLEX.
> 
> >
> > If you proceed anyway, there's a few comments inline below.
> >
> > > The device is still expected to accept the
> > > existing pre-defined flow items such as Ethernet, IPv4/IPv6 headers,
> > > etc, as long as the program currently loaded on the device is defining
> > > and using flow items with identical format, but the device is not
> > > limited to the current set of pre-defined RTE_FLOW flow items.
> > >
> > > * A generic flow action: The flow action exact processing is defined
> > > by the program loaded by the network device, with the user expected to
> > > know the set of program actions for the purpose of assigning actions
> > > to flows. The device is still expected to accept the existing
> > > pre-defined flow actions such as packet drop, packet redirection to
> > > output queues, packet modifications, etc, as long as the program
> > > currently loaded on the device is defining and using flow actions that
> > > perform identical processing, but the device is not limited to the
> > > current set of pre-defined RTE_FLOW flow actions.
> > >
> > > Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > > Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> > > ---
> > >  lib/ethdev/rte_flow.h | 82
> > +++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 82 insertions(+)
> > >
> > > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > > index 3fe57140f9..f7889d7dd0 100644
> > > --- a/lib/ethdev/rte_flow.h
> > > +++ b/lib/ethdev/rte_flow.h
> > > @@ -688,6 +688,15 @@ enum rte_flow_item_type {
> > >  	 * @see struct rte_flow_item_ib_bth.
> > >  	 */
> > >  	RTE_FLOW_ITEM_TYPE_IB_BTH,
> > > +
> > > +	/**
> > > +	 * Matches a custom protocol header or metadata field represented
> > > +	 * as a byte string of a given length, whose meaning is typically
> > > +	 * defined by the data plane program.
> > > +	 *
> > > +	 * @see struct rte_flow_item_generic.
> > > +	 */
> > > +	RTE_FLOW_ITEM_TYPE_GENERIC,
> >
> > Rename: _GENERIC -> _OPAQUE, everywhere.
> >
> > >  };
> > >
> > >  /**
> > > @@ -2311,6 +2320,32 @@ static const struct rte_flow_item_tx_queue
> > > rte_flow_item_tx_queue_mask = {
> > >  };
> > >  #endif
> > >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice
> > > + *
> > > + * RTE_FLOW_ITEM_TYPE_GENERIC
> > > + *
> > > + * Matches a custom protocol header or metadata field represented as a
> > byte
> > > + * array of a given length, whose meaning is typically defined by the
> data
> > > + * plane program.
> >
> > "byte array" -> "opaque data type"
> 
> This flow item does not contain any pointer or index with hidden meaning,
> this flow item contains the exact byte array in clear text, so it is not
> opaque.

The meaning of the byte array is opaque. It doesn't matter that the byte array is in clear text.

If you look at the other flow items (e.g. RAW, FLEX), anyone can parse the byte array and tell what it means.

> The only difference is that you don't have a predefined name for its protocol,
> as the field format is defined by the P4 program loaded by the HW device.
> 
> 
> >
> > > + *
> > > + * The field length must be non-zero. The field value must be non-NULL,
> > with
> > > the
> > > + * value bytes specified in network byte order.
> > > + */
> > > +struct rte_flow_item_generic {
> > > +	uint32_t length; /**< Field length. */
> > > +	const uint8_t *value; /**< Field value. */
> >
> > Suggestion: Change the value type from "const uint8_t *" to "const void *".
> It
> > makes it easier for the application to use a hierarchy of structures
> internally
> > for this.
> 
> The "value" should be an array of exactly "length" bytes. Changing the data
> type to
> "void *" is making this field opaque for no good reason IMO. We are simply
> following
> the same pattern used by other flow items such as RAW and FLEX, which are also
> specified as array of bytes, right?

OK. It was only a suggestion. It seems you are doing the same as FLEX here.

NB: I wonder why the flow item points to the value (in RAW and FLEX), instead of directly holding the value in a variable length array... but it's not important.

> 
> >
> > > +};
> > > +
> > > +/** Default mask for RTE_FLOW_ITEM_TYPE_GENERIC. */
> > > +#ifndef __cplusplus
> > > +static const struct rte_flow_item_generic rte_flow_item_generic_mask = {
> > > +	.length = 0xffffffff,
> > > +	.value = NULL,
> > > +};
> > > +#endif
> > > +
> > >  /**
> > >   * Action types.
> > >   *
> > > @@ -2989,6 +3024,14 @@ enum rte_flow_action_type {
> > >  	 * @see struct rte_flow_action_indirect_list
> > >  	 */
> > >  	RTE_FLOW_ACTION_TYPE_INDIRECT_LIST,
> > > +
> > > +	/**
> > > +	 * Custom action whose processing is typically defined by the data
> > plane
> > > +	 * program.
> > > +	 *
> > > +	 * @see struct rte_flow_action_generic.
> > > +	 */
> > > +	RTE_FLOW_ACTION_TYPE_GENERIC,
> > >  };
> > >
> > >  /**
> > > @@ -4064,6 +4107,45 @@ struct
> > rte_flow_indirect_update_flow_meter_mark {
> > >  	enum rte_color init_color;
> > >  };
> > >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice.
> > > + *
> > > + * Generic action argument configuration parameters.
> > > + *
> > > + * The action argument field length must be non-zero. The action argument
> > > field
> > > + * value must be non-NULL, with the value bytes specified in network byte
> > > order.
> > > + *
> > > + * @see struct rte_flow_action_generic
> > > + */
> > > +struct rte_flow_action_generic_argument {
> > > +	/** Argument field length. */
> > > +	uint32_t length;
> > > +	/** Argument field value. */
> > > +	const uint8_t *value;
> > > +};
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this structure may change without prior notice.
> > > + *
> > > + * RTE_FLOW_ACTION_TYPE_GENERIC
> > > + *
> > > + * Generic action configuration parameters.
> > > + *
> > > + * Each action can have zero or more arguments.
> > > + *
> > > + * @see RTE_FLOW_ACTION_TYPE_GENERIC
> > > + */
> > > +struct rte_flow_action_generic {
> > > +	/** Action ID. */
> > > +	uint32_t action_id;
> > > +	/** Number of action arguments. */
> > > +	uint32_t action_args_num;
> > > +	/** Action arguments array. */
> > > +	const struct rte_flow_action_generic_argument *action_args;
> > > +};
> >
> > This is a list of arguments, not one argument. You should append "_array"
> > somewhere in the name, and add a normal (non-list) action, e.g.:
> >
> > struct rte_flow_action_opaque {
> > //Or: "struct rte_flow_action_generic", if you want to keep that name.
> > 	/** Action ID. */
> > 	uint32_t action_id;
> > 	/** Argument length. */
> > 	uint32_t length;
> > 	/** Argument value. */
> > 	const uint8_t *value;
> > // Or: "const void *value;" for the same reason as for the flow item.
> > };
> >
> 
> Did you somehow overlooked the "struct rte_flow_action_generic" from just
> above?
> 
> An action can have zero, one or more arguments, which are expressed
> in the "action_args" array; we are requiring the user to provide the action
> arguments as an array of "struct rte_flow_action_generic" as opposed to
> a single raw buffer with all the arguments concatenated there, makes sense?

Makes sense. I was thinking that you should have two different flow items, one item with a single argument, and one item with an array of arguments. However, it seems that FLEX also does what you do here, i.e. it has one flow item to cover zero, one or more arguments.

> 
> 
> >
> > > +
> > >  /* Mbuf dynamic field offset for metadata. */
> > >  extern int32_t rte_flow_dynf_metadata_offs;
> > >
> > > --
> > > 2.31.1
> 
> Regards,
> Cristian

      reply	other threads:[~2023-08-02 14:54 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-02 17:34 Qi Zhang
2023-08-02  9:37 ` Zhang, Qi Z
2023-08-02 10:25 ` Morten Brørup
2023-08-02 11:01   ` Morten Brørup
2023-08-02 11:21     ` Jerin Jacob
2023-08-02 14:06       ` Dumitrescu, Cristian
2023-08-02 15:24         ` Morten Brørup
2023-08-02 15:47           ` Ori Kam
2023-08-02 16:06             ` Ori Kam
2023-08-02 17:22               ` Dumitrescu, Cristian
2023-08-02 17:56                 ` Morten Brørup
2023-08-03  1:05                   ` Zhang, Qi Z
2023-08-03 13:57                     ` Morten Brørup
2023-08-16 13:23                       ` Dumitrescu, Cristian
2023-08-16 14:20                         ` Morten Brørup
2023-08-16 17:08                           ` Ferruh Yigit
2023-08-16 17:18                             ` Dumitrescu, Cristian
2023-08-16 16:19                         ` DPDK community: RTE_FLOW support for P4-programmable devices Dumitrescu, Cristian
2023-08-27  7:48                           ` Ori Kam
2023-08-28 16:12                             ` Dumitrescu, Cristian
2023-08-29  7:38                               ` Jerin Jacob
2023-08-29 10:18                                 ` Ferruh Yigit
2023-08-31 10:32                                   ` Ori Kam
2023-08-31 13:42                                     ` Stephen Hemminger
2023-09-01  2:07                                     ` Jerin Jacob
2023-09-01  6:57                                       ` Ori Kam
2023-09-01  9:52                                         ` Jerin Jacob
2023-09-04  7:45                                           ` Ferruh Yigit
2023-09-06  8:30                                             ` Ori Kam
2023-09-11 20:20                                         ` Dumitrescu, Cristian
2023-08-02 16:56             ` [PATCH] ethdev: introduce generic flow item and action Dumitrescu, Cristian
2023-08-03  8:39               ` Ori Kam
2023-08-16 13:12                 ` Dumitrescu, Cristian
2023-08-02 16:14           ` Dumitrescu, Cristian
2023-08-02 14:06   ` Dumitrescu, Cristian
2023-08-02 14:54     ` Morten Brørup [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35D87ABE@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=bruce.richardson@intel.com \
    --cc=cristian.dumiterscu@intel.com \
    --cc=cristian.dumitrescu@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@amd.com \
    --cc=helin.zhang@intel.com \
    --cc=jerinj@marvell.com \
    --cc=john.mcnamara@intel.com \
    --cc=orika@nvidia.com \
    --cc=qi.z.zhang@intel.com \
    --cc=techboard@dpdk.org \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).