DPDK patches and discussions
 help / color / mirror / Atom feed
From: Olivier Matz <olivier.matz@6wind.com>
To: Bing Zhao <bingz@mellanox.com>
Cc: Ori Kam <orika@mellanox.com>,
	"john.mcnamara@intel.com" <john.mcnamara@intel.com>,
	"marko.kovacevic@intel.com" <marko.kovacevic@intel.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	"ferruh.yigit@intel.com" <ferruh.yigit@intel.com>,
	"arybchenko@solarflare.com" <arybchenko@solarflare.com>,
	"akhil.goyal@nxp.com" <akhil.goyal@nxp.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"wenzhuo.lu@intel.com" <wenzhuo.lu@intel.com>,
	"beilei.xing@intel.com" <beilei.xing@intel.com>,
	"bernard.iremonger@intel.com" <bernard.iremonger@intel.com>
Subject: Re: [dpdk-dev] [PATCH v5 1/2] rte_flow: add eCPRI key fields to flow API
Date: Sun, 12 Jul 2020 16:43:53 +0200	[thread overview]
Message-ID: <20200712144353.GJ5869@platinum> (raw)
In-Reply-To: <VI1PR05MB4192DCA65199DAE241671DB5DD630@VI1PR05MB4192.eurprd05.prod.outlook.com>

On Sun, Jul 12, 2020 at 02:28:03PM +0000, Bing Zhao wrote:
> Hi Olivier,
> Thanks
> 
> BR. Bing
> 
> > -----Original Message-----
> > From: Olivier Matz <olivier.matz@6wind.com>
> > Sent: Sunday, July 12, 2020 9:18 PM
> > To: Bing Zhao <bingz@mellanox.com>
> > Cc: Ori Kam <orika@mellanox.com>; john.mcnamara@intel.com;
> > marko.kovacevic@intel.com; Thomas Monjalon
> > <thomas@monjalon.net>; ferruh.yigit@intel.com;
> > arybchenko@solarflare.com; akhil.goyal@nxp.com; dev@dpdk.org;
> > wenzhuo.lu@intel.com; beilei.xing@intel.com;
> > bernard.iremonger@intel.com
> > Subject: Re: [PATCH v5 1/2] rte_flow: add eCPRI key fields to flow API
> > 
> > Hi Bing,
> > 
> > On Sat, Jul 11, 2020 at 04:25:49AM +0000, Bing Zhao wrote:
> > > Hi Olivier,
> > > Many thanks for your comments.
> > 
> > [...]
> > 
> > > > > +/**
> > > > > + * eCPRI Common Header
> > > > > + */
> > > > > +RTE_STD_C11
> > > > > +struct rte_ecpri_common_hdr {
> > > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > > +	uint32_t size:16;		/**< Payload Size */
> > > > > +	uint32_t type:8;		/**< Message Type */
> > > > > +	uint32_t c:1;			/**< Concatenation Indicator
> > > > */
> > > > > +	uint32_t res:3;			/**< Reserved */
> > > > > +	uint32_t revision:4;		/**< Protocol Revision */
> > > > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > > +	uint32_t revision:4;		/**< Protocol Revision */
> > > > > +	uint32_t res:3;			/**< Reserved */
> > > > > +	uint32_t c:1;			/**< Concatenation Indicator
> > > > */
> > > > > +	uint32_t type:8;		/**< Message Type */
> > > > > +	uint32_t size:16;		/**< Payload Size */
> > > > > +#endif
> > > > > +} __rte_packed;
> > > >
> > > > Does it really need to be packed? Why next types do not need it?
> > > > It looks only those which have bitfields are.
> > > >
> > >
> > > Nice catch, thanks. For the common header, there is no need to use
> > the
> > > packed attribute, because it is a u32 and the bitfields will be
> > > aligned.
> > > I checked all the definitions again. Only " Type #4: Remote Memory
> > Access"
> > > needs to use the packed attribute.
> > > For other sub-types, "sub-header" part of the message payload will
> > get
> > > aligned by nature. For example, u16 after u16, u8 after u16, these
> > > should be OK.
> > > But in type #4, the address is 48bits wide, with 16bits MSB and 32bits
> > > LSB (no detailed description in the specification, correct me if
> > > anything wrong.) Usually, the 48bits address will be devided as this
> > > in a system. And there is no 48-bits type at all. So we need to define
> > two parts for it: 32b LSB follows 16b MSB.
> > > u32 after u16 should be with packed attribute. Thanks
> > 
> > What about using a bitfield into a uint64_t ? I mean:
> > 
> > 	struct rte_ecpri_msg_rm_access {
> > if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > 		...
> > 		uint64_t length:16;		/**< number of bytes
> > */
> > 		uint64_t addr:48;		/**< address */
> > #else
> > 		...
> > 		uint64_t addr:48;		/**< address */
> > 		uint64_t length:16;		/**< number of bytes
> > */
> > #endif
> > 	};
> > 
> 
> Thanks for your suggestion.
> https://stackoverflow.com/questions/10906238/warning-when-using-bitfield-with-unsigned-char
> AFAIK (from this page), bitfields support only support bool and int. uint64_t is some type of "long"
> and most of the compilers should support it. But I am not sure if it is a standard implementation.

The uint8_t[6], as in your v6, is a good idea.


> > >
> > > >
> > > > I wonder if the 'dw0' could be in this definition instead of in
> > > > struct rte_ecpri_msg_hdr?
> > > >
> > > > Something like this:
> > > >
> > > > struct rte_ecpri_common_hdr {
> > > > 	union {
> > > > 		uint32_t u32;
> > > > 		struct {
> > > > 			...
> > > > 		};
> > > > 	};
> > > > };
> > > >
> > > > I see 2 advantages:
> > > >
> > > > - someone that only uses the common_hdr struct can use the .u32
> > > >   in its software
> > > > - when using it in messages, it looks clearer to me:
> > > >     msg.common_hdr.u32 = value;
> > > >   instead of:
> > > >     msg.dw0 = value;
> > > >
> > > > What do you think?
> > >
> > > Thanks for the suggestion, this is much better, I will change it.
> > > Indeed, in my original version, no DW(u32) is defined for the header.
> > > After that, I noticed that it would not be easy for the static casting
> > > to a u32 from bitfield(the compiler will complain), and it would not
> > > be clear to swap the endian if the user wants to use this header. I
> > > added this DW(u32) to simplify the usage of this header. But yes, if I
> > > do not add it here, it would be not easy or clear for users who just
> > use this header structure.
> > > I will change it. Is it OK if I use the name "dw0"?
> > 
> > In my opinion, u32 is more usual than dw0.
> > 
> 
> I sent patch set v6 with this change, thanks.
> 
> > >
> > > >
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #0: IQ Data  */ struct
> > > > > +rte_ecpri_msg_iq_data {
> > > > > +	rte_be16_t pc_id;		/**< Physical channel ID */
> > > > > +	rte_be16_t seq_id;		/**< Sequence ID */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #1: Bit Sequence  */ struct
> > > > > +rte_ecpri_msg_bit_seq {
> > > > > +	rte_be16_t pc_id;		/**< Physical channel ID */
> > > > > +	rte_be16_t seq_id;		/**< Sequence ID */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #2: Real-Time Control Data  */
> > > > struct
> > > > > +rte_ecpri_msg_rtc_ctrl {
> > > > > +	rte_be16_t rtc_id;		/**< Real-Time Control Data ID
> > > > */
> > > > > +	rte_be16_t seq_id;		/**< Sequence ID */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #3: Generic Data Transfer  */
> > > > struct
> > > > > +rte_ecpri_msg_gen_data {
> > > > > +	rte_be32_t pc_id;		/**< Physical channel ID */
> > > > > +	rte_be32_t seq_id;		/**< Sequence ID */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #4: Remote Memory Access
> > */
> > > > > +RTE_STD_C11
> > > > > +struct rte_ecpri_msg_rm_access {
> > > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > > +	uint32_t ele_id:16;		/**< Element ID */
> > > > > +	uint32_t rr:4;			/**< Req/Resp */
> > > > > +	uint32_t rw:4;			/**< Read/Write */
> > > > > +	uint32_t rma_id:8;		/**< Remote Memory Access
> > > > ID */
> > > > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > > +	uint32_t rma_id:8;		/**< Remote Memory Access
> > > > ID */
> > > > > +	uint32_t rw:4;			/**< Read/Write */
> > > > > +	uint32_t rr:4;			/**< Req/Resp */
> > > > > +	uint32_t ele_id:16;		/**< Element ID */
> > > > > +#endif
> > > > > +	rte_be16_t addr_m;		/**< 48-bits address (16 MSB)
> > > > */
> > > > > +	rte_be32_t addr_l;		/**< 48-bits address (32 LSB) */
> > > > > +	rte_be16_t length;		/**< number of bytes */
> > > > > +} __rte_packed;
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #5: One-Way Delay
> > Measurement
> > > > */
> > > > > +struct rte_ecpri_msg_delay_measure {
> > > > > +	uint8_t msr_id;			/**< Measurement ID */
> > > > > +	uint8_t act_type;		/**< Action Type */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #6: Remote Reset  */ struct
> > > > > +rte_ecpri_msg_remote_reset {
> > > > > +	rte_be16_t rst_id;		/**< Reset ID */
> > > > > +	uint8_t rst_op;			/**< Reset Code Op */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #7: Event Indication  */ struct
> > > > > +rte_ecpri_msg_event_ind {
> > > > > +	uint8_t evt_id;			/**< Event ID */
> > > > > +	uint8_t evt_type;		/**< Event Type */
> > > > > +	uint8_t seq;			/**< Sequence Number */
> > > > > +	uint8_t number;			/**< Number of
> > > > Faults/Notif */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header Format: Common Header + Message
> > > > Types  */
> > > > > +RTE_STD_C11
> > > > > +struct rte_ecpri_msg_hdr {
> > > > > +	union {
> > > > > +		struct rte_ecpri_common_hdr common;
> > > > > +		uint32_t dw0;
> > > > > +	};
> > > > > +	union {
> > > > > +		struct rte_ecpri_msg_iq_data type0;
> > > > > +		struct rte_ecpri_msg_bit_seq type1;
> > > > > +		struct rte_ecpri_msg_rtc_ctrl type2;
> > > > > +		struct rte_ecpri_msg_bit_seq type3;
> > > > > +		struct rte_ecpri_msg_rm_access type4;
> > > > > +		struct rte_ecpri_msg_delay_measure type5;
> > > > > +		struct rte_ecpri_msg_remote_reset type6;
> > > > > +		struct rte_ecpri_msg_event_ind type7;
> > > > > +		uint32_t dummy[3];
> > > > > +	};
> > > > > +};
> > > >
> > > > What is the point in having this struct?
> > > >
> > > > From a software point of view, I think it is a bit risky, because
> > > > its size is the size of the largest message. This is probably what
> > > > you want in your case, but when a software will rx or tx such
> > > > packet, I think they shouldn't use this one. My understanding is
> > > > that you only need this structure for the mask in rte_flow.
> > > >
> > > > Also, I'm not sure to understand the purpose of dummy[3], even
> > after
> > > > reading your answer to Akhil's question.
> > > >
> > >
> > > Basically YES and no. To my understanding, the eCPRI message
> > format is
> > > something like the ICMP packet format. The message (packet) itself
> > > will be parsed into different formats based on the type of the
> > common
> > > header. In the message payload part, there is no distinct definition
> > > of the "sub-header". We can divide them into the sub-header and
> > data parts based on the specification.
> > > E.g. physical channel ID / real-time control ID / Event ID + type are
> > > the parts that datapath forwarding will only care about. The
> > following
> > > timestamp or user data parts are the parts that the higher layer in
> > the application will use.
> > > 1. If an application wants to create some offload flow, or even
> > handle
> > > it in the SW, the common header + first several bytes in the payload
> > > should be enough. BUT YES, it is not good or safe to use it in the
> > higher layer of the application.
> > > 2. A higher layer of the application should have its own definition of
> > > the whole payload of a specific sub-type, including the parsing of the
> > user data part after the "sub-header".
> > > It is better for them just skip the first 4 bytes of the eCPRI message or
> > a known offset.
> > > We do not need to cover the upper layers.
> > 
> > Let me explain what is my vision of how an application would use the
> > structures (these are completly dummy examples, as I don't know
> > ecpri protocol at all).
> > 
> > Rx:
> > 
> > 	int ecpri_input(struct rte_mbuf *m)
> > 	{
> > 		struct rte_ecpri_common_hdr hdr_copy, *hdr;
> > 		struct rte_ecpri_msg_event_ind event_copy, *event;
> > 		struct app_specific app_copy, *app;
> > 
> > 		hdr = rte_pktmbuf_read(m, 0, sizeof(*hdr),
> > &hdr_copy);
> > 		if (unlikely(hdr == NULL))
> > 			return -1;
> > 		switch (hdr->type) {
> > 		...
> > 		case RTE_ECPRI_EVT_IND_NTFY_IND:
> > 			event = rte_pktmbuf_read(m, sizeof(*hdr),
> > sizeof(*event),
> > 				&event_copy);
> > 			if (unlikely(event == NULL))
> > 				return -1;
> > 			...
> > 			app = rte_pktmbuf_read(m, sizeof(*app),
> > 				sizeof(*hdr) + sizeof(*event),
> > 				&app_copy);
> > 			...
> > 
> > Tx:
> > 
> > 	int ecpri_output(void)
> > 	{
> > 		struct rte_ecpri_common_hdr *hdr;
> > 		struct rte_ecpri_msg_event_ind *event;
> > 		struct app_specific *app;
> > 
> > 		m = rte_pktmbuf_alloc(mp);
> > 		if (unlikely(m == NULL))
> > 			return -1;
> > 
> > 		app = rte_pktmbuf_append(m, sizeof(*data));
> > 		if (app == NULL)
> > 			...
> > 		app->... = ...;
> > 		...
> > 		event = rte_pktmbuf_prepend(m, sizeof(*event));
> > 		if (event == NULL)
> > 			...
> > 		event->... = ...;
> > 		...
> > 		hdr = rte_pktmbuf_prepend(m, sizeof(*hdr));
> > 		if (hdr == NULL)
> > 			...
> > 		hdr->... = ...;
> > 
> > 		return packet_send(m);
> > 	}
> > 
> > In these 2 examples, we never need the unioned structure (struct
> > rte_ecpri_msg_hdr).
> > 
> > Using it does not look possible to me, because its size is corresponds to
> > the largest message, not to the one we are parsing/building.
> > 
> 
> Yes, in the cases, we do not need the unioned structures at all.
> Since the common header is always 4 bytes, an application could use the
> sub-types structures started from an offset of 4 of eCPRI layer, as in your example.
> This is in the datapath. My original purpose is for some "control path", typically
> the flow (not sure if any other new lib implementation) API, then the union
> could be used there w/o treating the common header and message payload
> header in a separate way and then combine them together. In this case, only
> the first several bytes will be accessed and checked, there will be no change
> of message itself, and then just passing it to datapath for further handling as
> in your example.
> 
> > > I think some comments could be added here, is it OK?
> > > 3. Regarding this structure, I add it because I do not want to
> > > introduce a lot of new items in the rte_flow: new items with
> > > structures, new enum types. I prefer one single structure will cover
> > most of the cases (subtypes). What do you think?
> > > 4. About the *dummy* u32, I calculated all the "subheaders" and
> > choose
> > > the maximal value of the length. Two purposes (same as the u32 in
> > the common header):
> > >   a. easy to swap the endianness, but not quite useful. Because some
> > parts are u16 and u8,
> > >     and should not be swapped in a u32. (some physical channel ID
> > and address LSB have 32bits width)
> > >     But if some HW parsed the header u32 by u32, then it would be
> > helpful.
> > >   b. easy for checking in flow API, if the user wants to insert a flow.
> > Some checking should
> > >       be done to confirm if it is wildcard flow (all eCPRI messages or
> > eCPRI message in some specific type),
> > >       or some precise flow (to match the pc id or rtc id, for example).
> > With these fields, 3 DW
> > >       of the masks only need to be check before continuing. Or else, the
> > code needs to check the type
> > >       and a lot of switch-case conditions and go through all different
> > members of each header.
> > 
> > Thanks for the clarification.
> > 
> > I'll tend to say that if the rte_ecpri_msg_hdr structure is only useful for
> > rte_flow, it should be defined inside rte_flow.
> > 
> 
> Right now, yes it will be used in rte_flow. But I checked the current implementations
> of each item, almost all the headers are defined in their own protocol files. So in v6,
> I change the name of it, in order not to confuse the users of this API, would it be OK?
> Thanks

OK

> 
> > However, I have some fears about the dummy[3]. You said it could be
> > enlarged later: I think it is dangerous to change the size of a structure
> > that may be used to parse data (and this would be an ABI change).
> > Also, it seems dummy[3] cannot be used easily to swap endianness, so
> > is it really useful?
> > 
> 
> It might be enlarger but not for now, until a new revision of this specification. For
> all the subtypes it has right now, the specification will remain them as same as today.
> Only new types will be added then. So after several years, we may consider to change it
> in the LTS. Is it OK?

OK, I think in this case it may even be another structure

> In most cases, the endianness swap could be easy, we will swap it in each DW / u32. Tome
> the exception is that some field crosses the u32 boundary, like the address in type#4, we may
> treat it separately. And the most useful case is for the mask checking, it could simplify the
> checking to at most 3 (u32==0?) without going through each member of different types.
> 
> And v6 already sent, I change the code based on your suggestion. Would you please
> help to give some comments also?
> 
> Appreciate for your help and suggestion.
> 
> > 
> > Thanks,
> > Olivier
> > 
> > 
> > > > > +
> > > > > +#ifdef __cplusplus
> > > > > +}
> > > > > +#endif
> > > > > +
> > > > > +#endif /* _RTE_ECPRI_H_ */
> > > > > diff --git a/lib/librte_net/rte_ether.h
> > > > > b/lib/librte_net/rte_ether.h index 0ae4e75..184a3f9 100644
> > > > > --- a/lib/librte_net/rte_ether.h
> > > > > +++ b/lib/librte_net/rte_ether.h
> > > > > @@ -304,6 +304,7 @@ struct rte_vlan_hdr {  #define
> > > > RTE_ETHER_TYPE_LLDP
> > > > > 0x88CC /**< LLDP Protocol. */  #define RTE_ETHER_TYPE_MPLS
> > > > 0x8847 /**<
> > > > > MPLS ethertype. */  #define RTE_ETHER_TYPE_MPLSM 0x8848
> > /**<
> > > > MPLS
> > > > > multicast ethertype. */
> > > > > +#define RTE_ETHER_TYPE_ECPRI 0xAEFE /**< eCPRI ethertype
> > (.1Q
> > > > > +supported). */
> > > > >
> > > > >  /**
> > > > >   * Extract VLAN tag information into mbuf
> > > > > --
> > > > > 1.8.3.1
> > > > >
> > >

  reply	other threads:[~2020-07-12 14:43 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-28 16:20 [dpdk-dev] [PATCH] " Bing Zhao
2020-07-02  6:46 ` [dpdk-dev] [PATCH v2] " Bing Zhao
2020-07-02  8:06   ` Ori Kam
2020-07-02 12:53   ` [dpdk-dev] [PATCH v3 0/2] rte_flow: introduce eCPRI item for rte_flow Bing Zhao
2020-07-02 12:53     ` [dpdk-dev] [PATCH v3 1/2] rte_flow: add eCPRI key fields to flow API Bing Zhao
2020-07-05 11:34       ` Ori Kam
2020-07-02 12:53     ` [dpdk-dev] [PATCH v3 2/2] app/testpmd: add eCPRI in flow creation patterns Bing Zhao
2020-07-05 11:36       ` Ori Kam
2020-07-07 15:36     ` [dpdk-dev] [PATCH v4 0/2] rte_flow: introduce eCPRI item for rte_flow Bing Zhao
2020-07-07 15:36       ` [dpdk-dev] [PATCH v4 1/2] rte_flow: add eCPRI key fields to flow API Bing Zhao
2020-07-08 18:49         ` Akhil Goyal
2020-07-09  3:58           ` Bing Zhao
2020-07-07 15:36       ` [dpdk-dev] [PATCH v4 2/2] app/testpmd: add eCPRI in flow creation patterns Bing Zhao
2020-07-10  8:45       ` [dpdk-dev] [PATCH v5 0/2] rte_flow: introduce eCPRI item for rte_flow Bing Zhao
2020-07-10  8:45         ` [dpdk-dev] [PATCH v5 1/2] rte_flow: add eCPRI key fields to flow API Bing Zhao
2020-07-10 14:31           ` Olivier Matz
2020-07-11  4:25             ` Bing Zhao
2020-07-12 13:17               ` Olivier Matz
2020-07-12 14:28                 ` Bing Zhao
2020-07-12 14:43                   ` Olivier Matz [this message]
2020-07-10  8:45         ` [dpdk-dev] [PATCH v5 2/2] app/testpmd: add eCPRI in flow creation patterns Bing Zhao
2020-07-12 13:35         ` [dpdk-dev] [PATCH v6 0/2] rte_flow: introduce eCPRI item for rte_flow Bing Zhao
2020-07-12 13:35           ` [dpdk-dev] [PATCH v6 1/2] rte_flow: add eCPRI key fields to flow API Bing Zhao
2020-07-12 14:45             ` Olivier Matz
2020-07-12 14:50               ` Bing Zhao
2020-07-13  0:50               ` Thomas Monjalon
2020-07-13  8:30                 ` Ferruh Yigit
2020-07-12 13:35           ` [dpdk-dev] [PATCH v6 2/2] app/testpmd: add eCPRI in flow creation patterns Bing Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200712144353.GJ5869@platinum \
    --to=olivier.matz@6wind.com \
    --cc=akhil.goyal@nxp.com \
    --cc=arybchenko@solarflare.com \
    --cc=beilei.xing@intel.com \
    --cc=bernard.iremonger@intel.com \
    --cc=bingz@mellanox.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=john.mcnamara@intel.com \
    --cc=marko.kovacevic@intel.com \
    --cc=orika@mellanox.com \
    --cc=thomas@monjalon.net \
    --cc=wenzhuo.lu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).