* [RFC] ethdev: introduce entropy calculation
@ 2023-12-10 8:30 Ori Kam
2023-12-12 12:19 ` Dariusz Sosnowski
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Ori Kam @ 2023-12-10 8:30 UTC (permalink / raw)
To: dsosnowski, ferruh.yigit, cristian.dumitrescu, Thomas Monjalon,
Andrew Rybchenko
Cc: dev, orika, rasland
When offloading rules with the encap action, the HW may calculate entropy based on the encap protocol.
Each HW can implement a different algorithm.
When the application receives packets that should have been
encaped by the HW, but didn't reach this stage yet (for example TCP SYN packets),
then when encap is done in SW, application must apply
the same entropy calculation algorithm.
Using the new API application can request the PMD to calculate the
value as if the packet passed in the HW.
Signed-off-by: Ori Kam <orika@nvidia.com>
---
lib/ethdev/rte_flow.h | 49 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index affdc8121b..3989b089dd 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -6753,6 +6753,55 @@ rte_flow_calc_table_hash(uint16_t port_id, const struct rte_flow_template_table
const struct rte_flow_item pattern[], uint8_t pattern_template_index,
uint32_t *hash, struct rte_flow_error *error);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Destination field type for the entropy calculation.
+ *
+ * @see function rte_flow_calc_encap_entropy
+ */
+enum rte_flow_entropy_dest {
+ /* Calculate entropy placed in UDP source port field. */
+ RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT,
+ /* Calculate entropy placed in NVGRE flow ID field. */
+ RTE_FLOW_ENTROPY_DEST_NVGRE_FLOW_ID,
+};
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Calculate the entropy generated by the HW for a given pattern,
+ * when encapsulation flow action is executed.
+ *
+ * @param[in] port_id
+ * Port identifier of Ethernet device.
+ * @param[in] pattern
+ * The values to be used in the entropy calculation.
+ * @param[in] dest_field
+ * Type of destination field for entropy calculation.
+ * @param[out] entropy
+ * Used to return the calculated entropy. It will be written in network order,
+ * so entropy[0] is the MSB.
+ * The number of bytes is based on the destination field type.
+ * @param[out] error
+ * Perform verbose error reporting if not NULL.
+ * PMDs initialize this structure in case of error only.
+ *
+ * @return
+ * - (0) if success.
+ * - (-ENODEV) if *port_id* invalid.
+ * - (-ENOTSUP) if underlying device does not support this functionality.
+ * - (-EINVAL) if *pattern* doesn't hold enough information to calculate the entropy
+ * or the dest is not supported.
+ */
+__rte_experimental
+int
+rte_flow_calc_encap_entropy(uint16_t port_id, const struct rte_flow_item pattern[],
+ enum rte_flow_entropy_dest dest_field, uint8_t *entropy,
+ struct rte_flow_error *error);
+
#ifdef __cplusplus
}
#endif
--
2.34.1
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [RFC] ethdev: introduce entropy calculation
2023-12-10 8:30 [RFC] ethdev: introduce entropy calculation Ori Kam
@ 2023-12-12 12:19 ` Dariusz Sosnowski
2023-12-14 11:34 ` Ferruh Yigit
2023-12-16 9:19 ` Andrew Rybchenko
2 siblings, 0 replies; 19+ messages in thread
From: Dariusz Sosnowski @ 2023-12-12 12:19 UTC (permalink / raw)
To: Ori Kam, ferruh.yigit, cristian.dumitrescu,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Andrew Rybchenko
Cc: dev, Raslan Darawsheh
> Subject: [RFC] ethdev: introduce entropy calculation
>
> When offloading rules with the encap action, the HW may calculate entropy
> based on the encap protocol.
> Each HW can implement a different algorithm.
>
> When the application receives packets that should have been encaped by the
> HW, but didn't reach this stage yet (for example TCP SYN packets), then when
> encap is done in SW, application must apply the same entropy calculation
> algorithm.
>
> Using the new API application can request the PMD to calculate the value as if
> the packet passed in the HW.
>
> Signed-off-by: Ori Kam <orika@nvidia.com>
Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Best regards,
Dariusz Sosnowski
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC] ethdev: introduce entropy calculation
2023-12-10 8:30 [RFC] ethdev: introduce entropy calculation Ori Kam
2023-12-12 12:19 ` Dariusz Sosnowski
@ 2023-12-14 11:34 ` Ferruh Yigit
2023-12-14 14:16 ` Ori Kam
2023-12-16 9:19 ` Andrew Rybchenko
2 siblings, 1 reply; 19+ messages in thread
From: Ferruh Yigit @ 2023-12-14 11:34 UTC (permalink / raw)
To: Ori Kam, dsosnowski, cristian.dumitrescu, Thomas Monjalon,
Andrew Rybchenko
Cc: dev, rasland
On 12/10/2023 8:30 AM, Ori Kam wrote:
> When offloading rules with the encap action, the HW may calculate entropy based on the encap protocol.
> Each HW can implement a different algorithm.
>
Hi Ori,
Can you please provide more details what this 'entropy' is used for,
what is the usecase?
> When the application receives packets that should have been
> encaped by the HW, but didn't reach this stage yet (for example TCP SYN packets),
> then when encap is done in SW, application must apply
> the same entropy calculation algorithm.
>> Using the new API application can request the PMD to calculate the
> value as if the packet passed in the HW.
>
So is this new API a datapath API? Is the intention that application
call this API per packet that is missing 'entropy' information?
> Signed-off-by: Ori Kam <orika@nvidia.com>
> ---
> lib/ethdev/rte_flow.h | 49 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 49 insertions(+)
>
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index affdc8121b..3989b089dd 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -6753,6 +6753,55 @@ rte_flow_calc_table_hash(uint16_t port_id, const struct rte_flow_template_table
> const struct rte_flow_item pattern[], uint8_t pattern_template_index,
> uint32_t *hash, struct rte_flow_error *error);
>
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Destination field type for the entropy calculation.
> + *
> + * @see function rte_flow_calc_encap_entropy
> + */
> +enum rte_flow_entropy_dest {
> + /* Calculate entropy placed in UDP source port field. */
> + RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT,
> + /* Calculate entropy placed in NVGRE flow ID field. */
> + RTE_FLOW_ENTROPY_DEST_NVGRE_FLOW_ID,
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Calculate the entropy generated by the HW for a given pattern,
> + * when encapsulation flow action is executed.
> + *
> + * @param[in] port_id
> + * Port identifier of Ethernet device.
> + * @param[in] pattern
> + * The values to be used in the entropy calculation.
> + * @param[in] dest_field
> + * Type of destination field for entropy calculation.
> + * @param[out] entropy
> + * Used to return the calculated entropy. It will be written in network order,
> + * so entropy[0] is the MSB.
> + * The number of bytes is based on the destination field type.
>
Is the size same as field size in the 'dest_field'?
Like for 'RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT' is it two bytes?
> + * @param[out] error
> + * Perform verbose error reporting if not NULL.
> + * PMDs initialize this structure in case of error only.
> + *
> + * @return
> + * - (0) if success.
> + * - (-ENODEV) if *port_id* invalid.
> + * - (-ENOTSUP) if underlying device does not support this functionality.
> + * - (-EINVAL) if *pattern* doesn't hold enough information to calculate the entropy
> + * or the dest is not supported.
> + */
> +__rte_experimental
> +int
> +rte_flow_calc_encap_entropy(uint16_t port_id, const struct rte_flow_item pattern[],
> + enum rte_flow_entropy_dest dest_field, uint8_t *entropy,
> + struct rte_flow_error *error);
> +
> #ifdef __cplusplus
> }
> #endif
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [RFC] ethdev: introduce entropy calculation
2023-12-14 11:34 ` Ferruh Yigit
@ 2023-12-14 14:16 ` Ori Kam
2023-12-14 15:25 ` Dumitrescu, Cristian
0 siblings, 1 reply; 19+ messages in thread
From: Ori Kam @ 2023-12-14 14:16 UTC (permalink / raw)
To: Ferruh Yigit, Dariusz Sosnowski, cristian.dumitrescu,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Andrew Rybchenko
Cc: dev, Raslan Darawsheh
Hi Ferruh,
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Thursday, December 14, 2023 1:35 PM
>
> On 12/10/2023 8:30 AM, Ori Kam wrote:
> > When offloading rules with the encap action, the HW may calculate entropy
> based on the encap protocol.
> > Each HW can implement a different algorithm.
> >
>
> Hi Ori,
>
> Can you please provide more details what this 'entropy' is used for,
> what is the usecase?
Sure, in some tunnel protocols, for example, VXLAN, NVGE it is possible to add entropy value in one of the
fields of the tunnel. In VXLAN for example it is in the source port,
From the VXLAN protocol:
Source Port: It is recommended that the UDP source port number
be calculated using a hash of fields from the inner packet --
one example being a hash of the inner Ethernet frame's headers.
This is to enable a level of entropy for the ECMP/load-
balancing of the VM-to-VM traffic across the VXLAN overlay.
When calculating the UDP source port number in this manner, it
is RECOMMENDED that the value be in the dynamic/private port
range 49152-65535 [RFC6335].
Since encap groups number of different 5 tuples together, if HW doesn’t know how to RSS
based on the inner application will not be able to get any distribution of packets.
This value is used to reflect the inner packet on the outer header, so distribution will be possible.
The main use case is, if application does full offload and implements the encap on the RX.
For example:
Ingress/FDB match on 5 tuple encap send to hairpin / different port in case of switch.
The issue starts when there is a miss on the 5 tuple table for example, due to syn packet.
A packet arrives at the application, and then the application offloads the rule.
So the application must encap the packet and set the same entropy as the HW will do for all the rest
of the packets.
>
>
> > When the application receives packets that should have been
> > encaped by the HW, but didn't reach this stage yet (for example TCP SYN
> packets),
> > then when encap is done in SW, application must apply
> > the same entropy calculation algorithm.
> >> Using the new API application can request the PMD to calculate the
> > value as if the packet passed in the HW.
> >
>
> So is this new API a datapath API? Is the intention that application
> call this API per packet that is missing 'entropy' information?
The application will call this API when it gets a packet that it knows, that the rest of the
packets from this connection will be offloaded and encaped by the HW.
(see above explanation)
>
> > Signed-off-by: Ori Kam <orika@nvidia.com>
> > ---
> > lib/ethdev/rte_flow.h | 49
> +++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 49 insertions(+)
> >
> > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > index affdc8121b..3989b089dd 100644
> > --- a/lib/ethdev/rte_flow.h
> > +++ b/lib/ethdev/rte_flow.h
> > @@ -6753,6 +6753,55 @@ rte_flow_calc_table_hash(uint16_t port_id, const
> struct rte_flow_template_table
> > const struct rte_flow_item pattern[], uint8_t
> pattern_template_index,
> > uint32_t *hash, struct rte_flow_error *error);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Destination field type for the entropy calculation.
> > + *
> > + * @see function rte_flow_calc_encap_entropy
> > + */
> > +enum rte_flow_entropy_dest {
> > + /* Calculate entropy placed in UDP source port field. */
> > + RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT,
> > + /* Calculate entropy placed in NVGRE flow ID field. */
> > + RTE_FLOW_ENTROPY_DEST_NVGRE_FLOW_ID,
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Calculate the entropy generated by the HW for a given pattern,
> > + * when encapsulation flow action is executed.
> > + *
> > + * @param[in] port_id
> > + * Port identifier of Ethernet device.
> > + * @param[in] pattern
> > + * The values to be used in the entropy calculation.
> > + * @param[in] dest_field
> > + * Type of destination field for entropy calculation.
> > + * @param[out] entropy
> > + * Used to return the calculated entropy. It will be written in network order,
> > + * so entropy[0] is the MSB.
> > + * The number of bytes is based on the destination field type.
> >
>
>
> Is the size same as field size in the 'dest_field'?
> Like for 'RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT' is it two bytes?
Yes,
>
>
> > + * @param[out] error
> > + * Perform verbose error reporting if not NULL.
> > + * PMDs initialize this structure in case of error only.
> > + *
> > + * @return
> > + * - (0) if success.
> > + * - (-ENODEV) if *port_id* invalid.
> > + * - (-ENOTSUP) if underlying device does not support this functionality.
> > + * - (-EINVAL) if *pattern* doesn't hold enough information to calculate the
> entropy
> > + * or the dest is not supported.
> > + */
> > +__rte_experimental
> > +int
> > +rte_flow_calc_encap_entropy(uint16_t port_id, const struct rte_flow_item
> pattern[],
> > + enum rte_flow_entropy_dest dest_field, uint8_t
> *entropy,
> > + struct rte_flow_error *error);
> > +
> > #ifdef __cplusplus
> > }
> > #endif
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [RFC] ethdev: introduce entropy calculation
2023-12-14 14:16 ` Ori Kam
@ 2023-12-14 15:25 ` Dumitrescu, Cristian
2023-12-14 17:18 ` Ori Kam
0 siblings, 1 reply; 19+ messages in thread
From: Dumitrescu, Cristian @ 2023-12-14 15:25 UTC (permalink / raw)
To: Ori Kam, Ferruh Yigit, Dariusz Sosnowski,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Andrew Rybchenko
Cc: dev, Raslan Darawsheh
Hi Ori,
A few questions on top of Ferruh's questions for better understanding the concept inlined below:
> -----Original Message-----
> From: Ori Kam <orika@nvidia.com>
> Sent: Thursday, December 14, 2023 2:17 PM
> To: Ferruh Yigit <ferruh.yigit@amd.com>; Dariusz Sosnowski
> <dsosnowski@nvidia.com>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <thomas@monjalon.net>; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: RE: [RFC] ethdev: introduce entropy calculation
>
> Hi Ferruh,
>
> > -----Original Message-----
> > From: Ferruh Yigit <ferruh.yigit@amd.com>
> > Sent: Thursday, December 14, 2023 1:35 PM
> >
> > On 12/10/2023 8:30 AM, Ori Kam wrote:
> > > When offloading rules with the encap action, the HW may calculate entropy
> > based on the encap protocol.
> > > Each HW can implement a different algorithm.
> > >
> >
> > Hi Ori,
> >
> > Can you please provide more details what this 'entropy' is used for,
> > what is the usecase?
>
> Sure, in some tunnel protocols, for example, VXLAN, NVGE it is possible to add
> entropy value in one of the
> fields of the tunnel. In VXLAN for example it is in the source port,
> From the VXLAN protocol:
> Source Port: It is recommended that the UDP source port number
> be calculated using a hash of fields from the inner packet --
> one example being a hash of the inner Ethernet frame's headers.
> This is to enable a level of entropy for the ECMP/load-
> balancing of the VM-to-VM traffic across the VXLAN overlay.
> When calculating the UDP source port number in this manner, it
> is RECOMMENDED that the value be in the dynamic/private port
> range 49152-65535 [RFC6335].
>
> Since encap groups number of different 5 tuples together, if HW doesn’t know
> how to RSS
> based on the inner application will not be able to get any distribution of packets.
>
> This value is used to reflect the inner packet on the outer header, so distribution
> will be possible.
>
> The main use case is, if application does full offload and implements the encap on
> the RX.
> For example:
> Ingress/FDB match on 5 tuple encap send to hairpin / different port in case of
> switch.
>
Smart idea! So basically the user is able to get an idea on how good the RSS distribution is, correct?
Can you elaborate a bit on how the entropy is measured: is it a number, what is the range of values, does higher value means better, etc.
> The issue starts when there is a miss on the 5 tuple table for example, due to syn
> packet.
> A packet arrives at the application, and then the application offloads the rule.
> So the application must encap the packet and set the same entropy as the HW
> will do for all the rest
> of the packets.
>
How can the app set the entropy?
> >
> >
> > > When the application receives packets that should have been
> > > encaped by the HW, but didn't reach this stage yet (for example TCP SYN
> > packets),
> > > then when encap is done in SW, application must apply
> > > the same entropy calculation algorithm.
> > >> Using the new API application can request the PMD to calculate the
> > > value as if the packet passed in the HW.
> > >
> >
> > So is this new API a datapath API? Is the intention that application
> > call this API per packet that is missing 'entropy' information?
>
> The application will call this API when it gets a packet that it knows, that the rest
> of the
> packets from this connection will be offloaded and encaped by the HW.
> (see above explanation)
>
> >
> > > Signed-off-by: Ori Kam <orika@nvidia.com>
> > > ---
> > > lib/ethdev/rte_flow.h | 49
> > +++++++++++++++++++++++++++++++++++++++++++
> > > 1 file changed, 49 insertions(+)
> > >
> > > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > > index affdc8121b..3989b089dd 100644
> > > --- a/lib/ethdev/rte_flow.h
> > > +++ b/lib/ethdev/rte_flow.h
> > > @@ -6753,6 +6753,55 @@ rte_flow_calc_table_hash(uint16_t port_id, const
> > struct rte_flow_template_table
> > > const struct rte_flow_item pattern[], uint8_t
> > pattern_template_index,
> > > uint32_t *hash, struct rte_flow_error *error);
> > >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Destination field type for the entropy calculation.
> > > + *
> > > + * @see function rte_flow_calc_encap_entropy
> > > + */
> > > +enum rte_flow_entropy_dest {
> > > + /* Calculate entropy placed in UDP source port field. */
> > > + RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT,
> > > + /* Calculate entropy placed in NVGRE flow ID field. */
> > > + RTE_FLOW_ENTROPY_DEST_NVGRE_FLOW_ID,
> > > +};
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Calculate the entropy generated by the HW for a given pattern,
> > > + * when encapsulation flow action is executed.
> > > + *
> > > + * @param[in] port_id
> > > + * Port identifier of Ethernet device.
> > > + * @param[in] pattern
> > > + * The values to be used in the entropy calculation.
> > > + * @param[in] dest_field
> > > + * Type of destination field for entropy calculation.
> > > + * @param[out] entropy
> > > + * Used to return the calculated entropy. It will be written in network order,
> > > + * so entropy[0] is the MSB.
> > > + * The number of bytes is based on the destination field type.
> > >
> >
> >
> > Is the size same as field size in the 'dest_field'?
> > Like for 'RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT' is it two bytes?
>
> Yes,
> >
> >
> > > + * @param[out] error
> > > + * Perform verbose error reporting if not NULL.
> > > + * PMDs initialize this structure in case of error only.
> > > + *
> > > + * @return
> > > + * - (0) if success.
> > > + * - (-ENODEV) if *port_id* invalid.
> > > + * - (-ENOTSUP) if underlying device does not support this functionality.
> > > + * - (-EINVAL) if *pattern* doesn't hold enough information to calculate the
> > entropy
> > > + * or the dest is not supported.
> > > + */
> > > +__rte_experimental
> > > +int
> > > +rte_flow_calc_encap_entropy(uint16_t port_id, const struct rte_flow_item
> > pattern[],
> > > + enum rte_flow_entropy_dest dest_field, uint8_t
> > *entropy,
> > > + struct rte_flow_error *error);
> > > +
> > > #ifdef __cplusplus
> > > }
> > > #endif
Thanks,
Cristian
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [RFC] ethdev: introduce entropy calculation
2023-12-14 15:25 ` Dumitrescu, Cristian
@ 2023-12-14 17:18 ` Ori Kam
2023-12-14 17:26 ` Stephen Hemminger
0 siblings, 1 reply; 19+ messages in thread
From: Ori Kam @ 2023-12-14 17:18 UTC (permalink / raw)
To: Dumitrescu, Cristian, Ferruh Yigit, Dariusz Sosnowski,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Andrew Rybchenko
Cc: dev, Raslan Darawsheh
Hi Andrew,
> -----Original Message-----
> From: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Sent: Thursday, December 14, 2023 5:25 PM
>
> Hi Ori,
>
> A few questions on top of Ferruh's questions for better understanding the
> concept inlined below:
>
> > -----Original Message-----
> > From: Ori Kam <orika@nvidia.com>
> > Sent: Thursday, December 14, 2023 2:17 PM
> > To: Ferruh Yigit <ferruh.yigit@amd.com>; Dariusz Sosnowski
> > <dsosnowski@nvidia.com>; Dumitrescu, Cristian
> > <cristian.dumitrescu@intel.com>; NBU-Contact-Thomas Monjalon
> (EXTERNAL)
> > <thomas@monjalon.net>; Andrew Rybchenko
> > <andrew.rybchenko@oktetlabs.ru>
> > Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> > Subject: RE: [RFC] ethdev: introduce entropy calculation
> >
> > Hi Ferruh,
> >
> > > -----Original Message-----
> > > From: Ferruh Yigit <ferruh.yigit@amd.com>
> > > Sent: Thursday, December 14, 2023 1:35 PM
> > >
> > > On 12/10/2023 8:30 AM, Ori Kam wrote:
> > > > When offloading rules with the encap action, the HW may calculate
> entropy
> > > based on the encap protocol.
> > > > Each HW can implement a different algorithm.
> > > >
> > >
> > > Hi Ori,
> > >
> > > Can you please provide more details what this 'entropy' is used for,
> > > what is the usecase?
> >
> > Sure, in some tunnel protocols, for example, VXLAN, NVGE it is possible to add
> > entropy value in one of the
> > fields of the tunnel. In VXLAN for example it is in the source port,
> > From the VXLAN protocol:
> > Source Port: It is recommended that the UDP source port number
> > be calculated using a hash of fields from the inner packet --
> > one example being a hash of the inner Ethernet frame's headers.
> > This is to enable a level of entropy for the ECMP/load-
> > balancing of the VM-to-VM traffic across the VXLAN overlay.
> > When calculating the UDP source port number in this manner, it
> > is RECOMMENDED that the value be in the dynamic/private port
> > range 49152-65535 [RFC6335].
> >
> > Since encap groups number of different 5 tuples together, if HW doesn’t know
> > how to RSS
> > based on the inner application will not be able to get any distribution of
> packets.
> >
> > This value is used to reflect the inner packet on the outer header, so
> distribution
> > will be possible.
> >
> > The main use case is, if application does full offload and implements the encap
> on
> > the RX.
> > For example:
> > Ingress/FDB match on 5 tuple encap send to hairpin / different port in case of
> > switch.
> >
>
> Smart idea! So basically the user is able to get an idea on how good the RSS
> distribution is, correct?
>
Not exactly, this simply allows the distribution.
Maybe entropy is a bad name, this is the name they use in the protocol, but in reality
this is some hash calculated on the packet header before the encap and set in the encap header.
Using this hash results in entropy for the packets. Which can be used for load balancing.
Maybe better name would be:
Rte_flow_calc_entropy_hash?
or maybe rte_flow_calc_encap_hash (I like it less since it looks like we calculate the hash on the encap data and not the inner part)
what do you think?
> Can you elaborate a bit on how the entropy is measured: is it a number, what is
> the range of values, does higher value means better, etc.
>
Please see my above answer, this is not entropy value but value used for entropy.
> > The issue starts when there is a miss on the 5 tuple table for example, due to
> syn
> > packet.
> > A packet arrives at the application, and then the application offloads the rule.
> > So the application must encap the packet and set the same entropy as the HW
> > will do for all the rest
> > of the packets.
> >
>
> How can the app set the entropy?
It can't, it is assumed that the HW is configured ether hardcoded or by using other means
with the algorithm. The application doesn't know and doesn't care about what is the calculation
only about the result.
>
> > >
> > >
> > > > When the application receives packets that should have been
> > > > encaped by the HW, but didn't reach this stage yet (for example TCP SYN
> > > packets),
> > > > then when encap is done in SW, application must apply
> > > > the same entropy calculation algorithm.
> > > >> Using the new API application can request the PMD to calculate the
> > > > value as if the packet passed in the HW.
> > > >
> > >
> > > So is this new API a datapath API? Is the intention that application
> > > call this API per packet that is missing 'entropy' information?
> >
> > The application will call this API when it gets a packet that it knows, that the
> rest
> > of the
> > packets from this connection will be offloaded and encaped by the HW.
> > (see above explanation)
> >
> > >
> > > > Signed-off-by: Ori Kam <orika@nvidia.com>
> > > > ---
> > > > lib/ethdev/rte_flow.h | 49
> > > +++++++++++++++++++++++++++++++++++++++++++
> > > > 1 file changed, 49 insertions(+)
> > > >
> > > > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > > > index affdc8121b..3989b089dd 100644
> > > > --- a/lib/ethdev/rte_flow.h
> > > > +++ b/lib/ethdev/rte_flow.h
> > > > @@ -6753,6 +6753,55 @@ rte_flow_calc_table_hash(uint16_t port_id,
> const
> > > struct rte_flow_template_table
> > > > const struct rte_flow_item pattern[], uint8_t
> > > pattern_template_index,
> > > > uint32_t *hash, struct rte_flow_error *error);
> > > >
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > + *
> > > > + * Destination field type for the entropy calculation.
> > > > + *
> > > > + * @see function rte_flow_calc_encap_entropy
> > > > + */
> > > > +enum rte_flow_entropy_dest {
> > > > + /* Calculate entropy placed in UDP source port field. */
> > > > + RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT,
> > > > + /* Calculate entropy placed in NVGRE flow ID field. */
> > > > + RTE_FLOW_ENTROPY_DEST_NVGRE_FLOW_ID,
> > > > +};
> > > > +
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > + *
> > > > + * Calculate the entropy generated by the HW for a given pattern,
> > > > + * when encapsulation flow action is executed.
> > > > + *
> > > > + * @param[in] port_id
> > > > + * Port identifier of Ethernet device.
> > > > + * @param[in] pattern
> > > > + * The values to be used in the entropy calculation.
> > > > + * @param[in] dest_field
> > > > + * Type of destination field for entropy calculation.
> > > > + * @param[out] entropy
> > > > + * Used to return the calculated entropy. It will be written in network
> order,
> > > > + * so entropy[0] is the MSB.
> > > > + * The number of bytes is based on the destination field type.
> > > >
> > >
> > >
> > > Is the size same as field size in the 'dest_field'?
> > > Like for 'RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT' is it two bytes?
> >
> > Yes,
> > >
> > >
> > > > + * @param[out] error
> > > > + * Perform verbose error reporting if not NULL.
> > > > + * PMDs initialize this structure in case of error only.
> > > > + *
> > > > + * @return
> > > > + * - (0) if success.
> > > > + * - (-ENODEV) if *port_id* invalid.
> > > > + * - (-ENOTSUP) if underlying device does not support this functionality.
> > > > + * - (-EINVAL) if *pattern* doesn't hold enough information to calculate
> the
> > > entropy
> > > > + * or the dest is not supported.
> > > > + */
> > > > +__rte_experimental
> > > > +int
> > > > +rte_flow_calc_encap_entropy(uint16_t port_id, const struct
> rte_flow_item
> > > pattern[],
> > > > + enum rte_flow_entropy_dest dest_field, uint8_t
> > > *entropy,
> > > > + struct rte_flow_error *error);
> > > > +
> > > > #ifdef __cplusplus
> > > > }
> > > > #endif
>
> Thanks,
> Cristian
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC] ethdev: introduce entropy calculation
2023-12-14 17:18 ` Ori Kam
@ 2023-12-14 17:26 ` Stephen Hemminger
2023-12-15 13:44 ` Ferruh Yigit
0 siblings, 1 reply; 19+ messages in thread
From: Stephen Hemminger @ 2023-12-14 17:26 UTC (permalink / raw)
To: Ori Kam
Cc: Dumitrescu, Cristian, Ferruh Yigit, Dariusz Sosnowski,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Andrew Rybchenko, dev, Raslan Darawsheh
On Thu, 14 Dec 2023 17:18:25 +0000
Ori Kam <orika@nvidia.com> wrote:
> > > Since encap groups number of different 5 tuples together, if HW doesn’t know
> > > how to RSS
> > > based on the inner application will not be able to get any distribution of
> > packets.
> > >
> > > This value is used to reflect the inner packet on the outer header, so
> > distribution
> > > will be possible.
> > >
> > > The main use case is, if application does full offload and implements the encap
> > on
> > > the RX.
> > > For example:
> > > Ingress/FDB match on 5 tuple encap send to hairpin / different port in case of
> > > switch.
> > >
> >
> > Smart idea! So basically the user is able to get an idea on how good the RSS
> > distribution is, correct?
> >
>
> Not exactly, this simply allows the distribution.
> Maybe entropy is a bad name, this is the name they use in the protocol, but in reality
> this is some hash calculated on the packet header before the encap and set in the encap header.
> Using this hash results in entropy for the packets. Which can be used for load balancing.
>
> Maybe better name would be:
> Rte_flow_calc_entropy_hash?
>
> or maybe rte_flow_calc_encap_hash (I like it less since it looks like we calculate the hash on the encap data and not the inner part)
>
> what do you think?
Entropy has meaning in crypto and random numbers generators that is different from
this usage. So entropy is bad name to use. Maybe rte_flow_hash_distribution?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC] ethdev: introduce entropy calculation
2023-12-14 17:26 ` Stephen Hemminger
@ 2023-12-15 13:44 ` Ferruh Yigit
2023-12-15 16:21 ` Thomas Monjalon
0 siblings, 1 reply; 19+ messages in thread
From: Ferruh Yigit @ 2023-12-15 13:44 UTC (permalink / raw)
To: Stephen Hemminger, Ori Kam
Cc: Dumitrescu, Cristian, Dariusz Sosnowski,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Andrew Rybchenko, dev, Raslan Darawsheh
On 12/14/2023 5:26 PM, Stephen Hemminger wrote:
> On Thu, 14 Dec 2023 17:18:25 +0000
> Ori Kam <orika@nvidia.com> wrote:
>
>>>> Since encap groups number of different 5 tuples together, if HW doesn’t know
>>>> how to RSS
>>>> based on the inner application will not be able to get any distribution of
>>> packets.
>>>>
>>>> This value is used to reflect the inner packet on the outer header, so
>>> distribution
>>>> will be possible.
>>>>
>>>> The main use case is, if application does full offload and implements the encap
>>> on
>>>> the RX.
>>>> For example:
>>>> Ingress/FDB match on 5 tuple encap send to hairpin / different port in case of
>>>> switch.
>>>>
>>>
>>> Smart idea! So basically the user is able to get an idea on how good the RSS
>>> distribution is, correct?
>>>
>>
>> Not exactly, this simply allows the distribution.
>> Maybe entropy is a bad name, this is the name they use in the protocol, but in reality
>> this is some hash calculated on the packet header before the encap and set in the encap header.
>> Using this hash results in entropy for the packets. Which can be used for load balancing.
>>
>> Maybe better name would be:
>> Rte_flow_calc_entropy_hash?
>>
>> or maybe rte_flow_calc_encap_hash (I like it less since it looks like we calculate the hash on the encap data and not the inner part)
>>
>> what do you think?
>
> Entropy has meaning in crypto and random numbers generators that is different from
> this usage. So entropy is bad name to use. Maybe rte_flow_hash_distribution?
>
Hi Ori,
Thank you for the description, it is more clear now.
And unless this is specifically defined as 'entropy' in spec, I am too
for rename.
At least in VXLAN spec, it is mentioned that this field is to "enable a
level of entropy", but not exactly names it as entropy.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC] ethdev: introduce entropy calculation
2023-12-15 13:44 ` Ferruh Yigit
@ 2023-12-15 16:21 ` Thomas Monjalon
2023-12-16 9:03 ` Andrew Rybchenko
0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2023-12-15 16:21 UTC (permalink / raw)
To: Stephen Hemminger, Ori Kam, Ferruh Yigit
Cc: Dumitrescu, Cristian, Dariusz Sosnowski, Andrew Rybchenko, dev,
Raslan Darawsheh
15/12/2023 14:44, Ferruh Yigit:
> On 12/14/2023 5:26 PM, Stephen Hemminger wrote:
> > On Thu, 14 Dec 2023 17:18:25 +0000
> > Ori Kam <orika@nvidia.com> wrote:
> >
> >>>> Since encap groups number of different 5 tuples together, if HW doesn’t know
> >>>> how to RSS
> >>>> based on the inner application will not be able to get any distribution of
> >>> packets.
> >>>>
> >>>> This value is used to reflect the inner packet on the outer header, so
> >>> distribution
> >>>> will be possible.
> >>>>
> >>>> The main use case is, if application does full offload and implements the encap
> >>> on
> >>>> the RX.
> >>>> For example:
> >>>> Ingress/FDB match on 5 tuple encap send to hairpin / different port in case of
> >>>> switch.
> >>>>
> >>>
> >>> Smart idea! So basically the user is able to get an idea on how good the RSS
> >>> distribution is, correct?
> >>>
> >>
> >> Not exactly, this simply allows the distribution.
> >> Maybe entropy is a bad name, this is the name they use in the protocol, but in reality
> >> this is some hash calculated on the packet header before the encap and set in the encap header.
> >> Using this hash results in entropy for the packets. Which can be used for load balancing.
> >>
> >> Maybe better name would be:
> >> Rte_flow_calc_entropy_hash?
> >>
> >> or maybe rte_flow_calc_encap_hash (I like it less since it looks like we calculate the hash on the encap data and not the inner part)
> >>
> >> what do you think?
> >
> > Entropy has meaning in crypto and random numbers generators that is different from
> > this usage. So entropy is bad name to use. Maybe rte_flow_hash_distribution?
> >
>
> Hi Ori,
>
> Thank you for the description, it is more clear now.
>
> And unless this is specifically defined as 'entropy' in spec, I am too
> for rename.
>
> At least in VXLAN spec, it is mentioned that this field is to "enable a
> level of entropy", but not exactly names it as entropy.
Exactly my thought about the naming.
Good to see I am not alone thinking this naming is disturbing :)
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC] ethdev: introduce entropy calculation
2023-12-15 16:21 ` Thomas Monjalon
@ 2023-12-16 9:03 ` Andrew Rybchenko
2023-12-27 15:20 ` Ori Kam
0 siblings, 1 reply; 19+ messages in thread
From: Andrew Rybchenko @ 2023-12-16 9:03 UTC (permalink / raw)
To: Thomas Monjalon, Stephen Hemminger, Ori Kam, Ferruh Yigit
Cc: Dumitrescu, Cristian, Dariusz Sosnowski, dev, Raslan Darawsheh
On 12/15/23 19:21, Thomas Monjalon wrote:
> 15/12/2023 14:44, Ferruh Yigit:
>> On 12/14/2023 5:26 PM, Stephen Hemminger wrote:
>>> On Thu, 14 Dec 2023 17:18:25 +0000
>>> Ori Kam <orika@nvidia.com> wrote:
>>>
>>>>>> Since encap groups number of different 5 tuples together, if HW doesn’t know
>>>>>> how to RSS
>>>>>> based on the inner application will not be able to get any distribution of
>>>>> packets.
>>>>>>
>>>>>> This value is used to reflect the inner packet on the outer header, so
>>>>> distribution
>>>>>> will be possible.
>>>>>>
>>>>>> The main use case is, if application does full offload and implements the encap
>>>>> on
>>>>>> the RX.
>>>>>> For example:
>>>>>> Ingress/FDB match on 5 tuple encap send to hairpin / different port in case of
>>>>>> switch.
>>>>>>
>>>>>
>>>>> Smart idea! So basically the user is able to get an idea on how good the RSS
>>>>> distribution is, correct?
>>>>>
>>>>
>>>> Not exactly, this simply allows the distribution.
>>>> Maybe entropy is a bad name, this is the name they use in the protocol, but in reality
>>>> this is some hash calculated on the packet header before the encap and set in the encap header.
>>>> Using this hash results in entropy for the packets. Which can be used for load balancing.
>>>>
>>>> Maybe better name would be:
>>>> Rte_flow_calc_entropy_hash?
>>>>
>>>> or maybe rte_flow_calc_encap_hash (I like it less since it looks like we calculate the hash on the encap data and not the inner part)
>>>>
>>>> what do you think?
>>>
>>> Entropy has meaning in crypto and random numbers generators that is different from
>>> this usage. So entropy is bad name to use. Maybe rte_flow_hash_distribution?
>>>
>>
>> Hi Ori,
>>
>> Thank you for the description, it is more clear now.
>>
>> And unless this is specifically defined as 'entropy' in spec, I am too
>> for rename.
>>
>> At least in VXLAN spec, it is mentioned that this field is to "enable a
>> level of entropy", but not exactly names it as entropy.
>
> Exactly my thought about the naming.
> Good to see I am not alone thinking this naming is disturbing :)
I'd avoid usage of term "entropy" in this patch. It is very confusing.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC] ethdev: introduce entropy calculation
2023-12-10 8:30 [RFC] ethdev: introduce entropy calculation Ori Kam
2023-12-12 12:19 ` Dariusz Sosnowski
2023-12-14 11:34 ` Ferruh Yigit
@ 2023-12-16 9:19 ` Andrew Rybchenko
2023-12-17 10:07 ` Ori Kam
2 siblings, 1 reply; 19+ messages in thread
From: Andrew Rybchenko @ 2023-12-16 9:19 UTC (permalink / raw)
To: Ori Kam, dsosnowski, ferruh.yigit, cristian.dumitrescu, Thomas Monjalon
Cc: dev, rasland
On 12/10/23 11:30, Ori Kam wrote:
> When offloading rules with the encap action, the HW may calculate entropy based on the encap protocol.
> Each HW can implement a different algorithm.
>
> When the application receives packets that should have been
> encaped by the HW, but didn't reach this stage yet (for example TCP SYN packets),
> then when encap is done in SW, application must apply
> the same entropy calculation algorithm.
>
> Using the new API application can request the PMD to calculate the
> value as if the packet passed in the HW.
I'm still wondering why the API is required. Does app install encap
rule when the first packet is processed? The rule can contain all
outer headers (as it is calculated in SW anyway) and HW does not need
to calculate anything.
>
> Signed-off-by: Ori Kam <orika@nvidia.com>
> ---
> lib/ethdev/rte_flow.h | 49 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 49 insertions(+)
>
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index affdc8121b..3989b089dd 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -6753,6 +6753,55 @@ rte_flow_calc_table_hash(uint16_t port_id, const struct rte_flow_template_table
> const struct rte_flow_item pattern[], uint8_t pattern_template_index,
> uint32_t *hash, struct rte_flow_error *error);
>
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Destination field type for the entropy calculation.
> + *
> + * @see function rte_flow_calc_encap_entropy
> + */
> +enum rte_flow_entropy_dest {
> + /* Calculate entropy placed in UDP source port field. */
> + RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT,
And source and destination are used together but for different
purposes it is very hard to follow which is used for which purpose.
I'd avoid term "dest" in the enum naming. May be present "flow" is
already good enough to highlight that it is per-flow?
rte_flow_encap_hash? rte_flow_encap_field_hash?
> + /* Calculate entropy placed in NVGRE flow ID field. */
> + RTE_FLOW_ENTROPY_DEST_NVGRE_FLOW_ID,
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Calculate the entropy generated by the HW for a given pattern,
> + * when encapsulation flow action is executed.
> + *
> + * @param[in] port_id
> + * Port identifier of Ethernet device.
> + * @param[in] pattern
> + * The values to be used in the entropy calculation.
Is it inner packet fields? Should all fields be used for hash
calculation? May be it is better to describe as inner packet
headers? How does app know which headers to extract and provide?
> + * @param[in] dest_field
> + * Type of destination field for entropy calculation.
> + * @param[out] entropy
> + * Used to return the calculated entropy. It will be written in network order,
> + * so entropy[0] is the MSB.
> + * The number of bytes is based on the destination field type.f
API contract is a bit unclear here. May be it is safer to provide
buffer size explicitly?
> + * @param[out] error
> + * Perform verbose error reporting if not NULL.
> + * PMDs initialize this structure in case of error only.
> + *
> + * @return
> + * - (0) if success.
> + * - (-ENODEV) if *port_id* invalid.
> + * - (-ENOTSUP) if underlying device does not support this functionality.
> + * - (-EINVAL) if *pattern* doesn't hold enough information to calculate the entropy
> + * or the dest is not supported.
> + */
> +__rte_experimental
> +int
> +rte_flow_calc_encap_entropy(uint16_t port_id, const struct rte_flow_item pattern[],
> + enum rte_flow_entropy_dest dest_field, uint8_t *entropy,
> + struct rte_flow_error *error);
> +
> #ifdef __cplusplus
> }
> #endif
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [RFC] ethdev: introduce entropy calculation
2023-12-16 9:19 ` Andrew Rybchenko
@ 2023-12-17 10:07 ` Ori Kam
2024-01-12 7:46 ` Andrew Rybchenko
0 siblings, 1 reply; 19+ messages in thread
From: Ori Kam @ 2023-12-17 10:07 UTC (permalink / raw)
To: Andrew Rybchenko, Dariusz Sosnowski, ferruh.yigit,
cristian.dumitrescu, NBU-Contact-Thomas Monjalon (EXTERNAL)
Cc: dev, Raslan Darawsheh
Hi Andrew,
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Saturday, December 16, 2023 11:19 AM
>
> On 12/10/23 11:30, Ori Kam wrote:
> > When offloading rules with the encap action, the HW may calculate entropy
> based on the encap protocol.
> > Each HW can implement a different algorithm.
> >
> > When the application receives packets that should have been
> > encaped by the HW, but didn't reach this stage yet (for example TCP SYN
> packets),
> > then when encap is done in SW, application must apply
> > the same entropy calculation algorithm.
> >
> > Using the new API application can request the PMD to calculate the
> > value as if the packet passed in the HW.
>
> I'm still wondering why the API is required. Does app install encap
> rule when the first packet is processed? The rule can contain all
> outer headers (as it is calculated in SW anyway) and HW does not need
> to calculate anything.
Yes, the application installs a rule based on the first packet.
as a result, all the rest of the packets are encaped by the HW.
This API allows the application to use the same value as the HW will use when encapsulating the packet.
In other words, we have 2 paths:
Path 1 SW, for the first packet
Path 2 HW, all the rest of the packest
>
> >
> > Signed-off-by: Ori Kam <orika@nvidia.com>
> > ---
> > lib/ethdev/rte_flow.h | 49
> +++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 49 insertions(+)
> >
> > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > index affdc8121b..3989b089dd 100644
> > --- a/lib/ethdev/rte_flow.h
> > +++ b/lib/ethdev/rte_flow.h
> > @@ -6753,6 +6753,55 @@ rte_flow_calc_table_hash(uint16_t port_id, const
> struct rte_flow_template_table
> > const struct rte_flow_item pattern[], uint8_t
> pattern_template_index,
> > uint32_t *hash, struct rte_flow_error *error);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Destination field type for the entropy calculation.
> > + *
> > + * @see function rte_flow_calc_encap_entropy
> > + */
> > +enum rte_flow_entropy_dest {
> > + /* Calculate entropy placed in UDP source port field. */
> > + RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT,
>
> And source and destination are used together but for different
> purposes it is very hard to follow which is used for which purpose.
> I'd avoid term "dest" in the enum naming. May be present "flow" is
> already good enough to highlight that it is per-flow?
> rte_flow_encap_hash? rte_flow_encap_field_hash?
I'm open to any suggestions, this enum is supposed to show to which
field the HW insert the calculated value. This field is defined by the encapsulation
protocol. For example, VXLAN the hash is stored in source port, while in NVGRE it is stored in
flow_id field. The destination field also impact the size.
What do you think about:
RTE_FLOW_ENCAP_HASH_SRC_PORT?
What about if we change the destination to enum that will hold the destination tunnel type
RTE_FLOW_TUNNEL_TYPE_VXLAN,
RTE_FLOW_TUNNEL_TYPE_NVGRE
>
> > + /* Calculate entropy placed in NVGRE flow ID field. */
> > + RTE_FLOW_ENTROPY_DEST_NVGRE_FLOW_ID,
> > +};
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Calculate the entropy generated by the HW for a given pattern,
> > + * when encapsulation flow action is executed.
> > + *
> > + * @param[in] port_id
> > + * Port identifier of Ethernet device.
> > + * @param[in] pattern
> > + * The values to be used in the entropy calculation.
>
> Is it inner packet fields? Should all fields be used for hash
> calculation? May be it is better to describe as inner packet
> headers? How does app know which headers to extract and provide?
The fields are the outer fields that will become inner fields, after the encapsulation.
The fields are dependent on the HW (in standard case 5 tuple). but applications can /should set
all the fields he got from the packet, at the end those are the fields that the HW will see.
>
> > + * @param[in] dest_field
> > + * Type of destination field for entropy calculation.
> > + * @param[out] entropy
> > + * Used to return the calculated entropy. It will be written in network order,
> > + * so entropy[0] is the MSB.
> > + * The number of bytes is based on the destination field type.f
>
> API contract is a bit unclear here. May be it is safer to provide
> buffer size explicitly?
The size of the field is defined by the destination field, which in turn is defined by the
protocol.
I don't think adding size has any meaning when you know that the value is going to be set
in source port which has the size of 16 bites.
Based on enum suggestion of tunnel type. I think it will also explain the dest and size. What do you think?
>
> > + * @param[out] error
> > + * Perform verbose error reporting if not NULL.
> > + * PMDs initialize this structure in case of error only.
> > + *
> > + * @return
> > + * - (0) if success.
> > + * - (-ENODEV) if *port_id* invalid.
> > + * - (-ENOTSUP) if underlying device does not support this functionality.
> > + * - (-EINVAL) if *pattern* doesn't hold enough information to calculate the
> entropy
> > + * or the dest is not supported.
> > + */
> > +__rte_experimental
> > +int
> > +rte_flow_calc_encap_entropy(uint16_t port_id, const struct rte_flow_item
> pattern[],
> > + enum rte_flow_entropy_dest dest_field, uint8_t
> *entropy,
> > + struct rte_flow_error *error);
> > +
> > #ifdef __cplusplus
> > }
> > #endif
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [RFC] ethdev: introduce entropy calculation
2023-12-16 9:03 ` Andrew Rybchenko
@ 2023-12-27 15:20 ` Ori Kam
2024-01-04 12:57 ` Dumitrescu, Cristian
0 siblings, 1 reply; 19+ messages in thread
From: Ori Kam @ 2023-12-27 15:20 UTC (permalink / raw)
To: Andrew Rybchenko, NBU-Contact-Thomas Monjalon (EXTERNAL),
Stephen Hemminger, Ferruh Yigit
Cc: Dumitrescu, Cristian, Dariusz Sosnowski, dev, Raslan Darawsheh
Hi Andrew, Stephen, Ferruh and Thomas,
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Saturday, December 16, 2023 11:04 AM
>
> On 12/15/23 19:21, Thomas Monjalon wrote:
> > 15/12/2023 14:44, Ferruh Yigit:
> >> On 12/14/2023 5:26 PM, Stephen Hemminger wrote:
> >>> On Thu, 14 Dec 2023 17:18:25 +0000
> >>> Ori Kam <orika@nvidia.com> wrote:
> >>>
> >>>>>> Since encap groups number of different 5 tuples together, if HW
> doesn’t know
> >>>>>> how to RSS
> >>>>>> based on the inner application will not be able to get any distribution of
> >>>>> packets.
> >>>>>>
> >>>>>> This value is used to reflect the inner packet on the outer header, so
> >>>>> distribution
> >>>>>> will be possible.
> >>>>>>
> >>>>>> The main use case is, if application does full offload and implements
> the encap
> >>>>> on
> >>>>>> the RX.
> >>>>>> For example:
> >>>>>> Ingress/FDB match on 5 tuple encap send to hairpin / different port in
> case of
> >>>>>> switch.
> >>>>>>
> >>>>>
> >>>>> Smart idea! So basically the user is able to get an idea on how good the
> RSS
> >>>>> distribution is, correct?
> >>>>>
> >>>>
> >>>> Not exactly, this simply allows the distribution.
> >>>> Maybe entropy is a bad name, this is the name they use in the protocol,
> but in reality
> >>>> this is some hash calculated on the packet header before the encap and
> set in the encap header.
> >>>> Using this hash results in entropy for the packets. Which can be used for
> load balancing.
> >>>>
> >>>> Maybe better name would be:
> >>>> Rte_flow_calc_entropy_hash?
> >>>>
> >>>> or maybe rte_flow_calc_encap_hash (I like it less since it looks like we
> calculate the hash on the encap data and not the inner part)
> >>>>
> >>>> what do you think?
> >>>
> >>> Entropy has meaning in crypto and random numbers generators that is
> different from
> >>> this usage. So entropy is bad name to use. Maybe
> rte_flow_hash_distribution?
> >>>
> >>
> >> Hi Ori,
> >>
> >> Thank you for the description, it is more clear now.
> >>
> >> And unless this is specifically defined as 'entropy' in spec, I am too
> >> for rename.
> >>
> >> At least in VXLAN spec, it is mentioned that this field is to "enable a
> >> level of entropy", but not exactly names it as entropy.
> >
> > Exactly my thought about the naming.
> > Good to see I am not alone thinking this naming is disturbing :)
>
> I'd avoid usage of term "entropy" in this patch. It is very confusing.
What about rte_flow_calc_encap_hash?
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [RFC] ethdev: introduce entropy calculation
2023-12-27 15:20 ` Ori Kam
@ 2024-01-04 12:57 ` Dumitrescu, Cristian
2024-01-04 14:33 ` Ori Kam
0 siblings, 1 reply; 19+ messages in thread
From: Dumitrescu, Cristian @ 2024-01-04 12:57 UTC (permalink / raw)
To: Ori Kam, Andrew Rybchenko, NBU-Contact-Thomas Monjalon (EXTERNAL),
Stephen Hemminger, Ferruh Yigit
Cc: Dariusz Sosnowski, dev, Raslan Darawsheh
> -----Original Message-----
> From: Ori Kam <orika@nvidia.com>
> Sent: Wednesday, December 27, 2023 3:20 PM
> To: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>; Stephen Hemminger
> <stephen@networkplumber.org>; Ferruh Yigit <ferruh.yigit@amd.com>
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Dariusz Sosnowski
> <dsosnowski@nvidia.com>; dev@dpdk.org; Raslan Darawsheh
> <rasland@nvidia.com>
> Subject: RE: [RFC] ethdev: introduce entropy calculation
>
> Hi Andrew, Stephen, Ferruh and Thomas,
>
> > -----Original Message-----
> > From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> > Sent: Saturday, December 16, 2023 11:04 AM
> >
> > On 12/15/23 19:21, Thomas Monjalon wrote:
> > > 15/12/2023 14:44, Ferruh Yigit:
> > >> On 12/14/2023 5:26 PM, Stephen Hemminger wrote:
> > >>> On Thu, 14 Dec 2023 17:18:25 +0000
> > >>> Ori Kam <orika@nvidia.com> wrote:
> > >>>
> > >>>>>> Since encap groups number of different 5 tuples together, if HW
> > doesn’t know
> > >>>>>> how to RSS
> > >>>>>> based on the inner application will not be able to get any distribution of
> > >>>>> packets.
> > >>>>>>
> > >>>>>> This value is used to reflect the inner packet on the outer header, so
> > >>>>> distribution
> > >>>>>> will be possible.
> > >>>>>>
> > >>>>>> The main use case is, if application does full offload and implements
> > the encap
> > >>>>> on
> > >>>>>> the RX.
> > >>>>>> For example:
> > >>>>>> Ingress/FDB match on 5 tuple encap send to hairpin / different port in
> > case of
> > >>>>>> switch.
> > >>>>>>
> > >>>>>
> > >>>>> Smart idea! So basically the user is able to get an idea on how good the
> > RSS
> > >>>>> distribution is, correct?
> > >>>>>
> > >>>>
> > >>>> Not exactly, this simply allows the distribution.
> > >>>> Maybe entropy is a bad name, this is the name they use in the protocol,
> > but in reality
> > >>>> this is some hash calculated on the packet header before the encap and
> > set in the encap header.
> > >>>> Using this hash results in entropy for the packets. Which can be used for
> > load balancing.
> > >>>>
> > >>>> Maybe better name would be:
> > >>>> Rte_flow_calc_entropy_hash?
> > >>>>
> > >>>> or maybe rte_flow_calc_encap_hash (I like it less since it looks like we
> > calculate the hash on the encap data and not the inner part)
> > >>>>
> > >>>> what do you think?
> > >>>
> > >>> Entropy has meaning in crypto and random numbers generators that is
> > different from
> > >>> this usage. So entropy is bad name to use. Maybe
> > rte_flow_hash_distribution?
> > >>>
> > >>
> > >> Hi Ori,
> > >>
> > >> Thank you for the description, it is more clear now.
> > >>
> > >> And unless this is specifically defined as 'entropy' in spec, I am too
> > >> for rename.
> > >>
> > >> At least in VXLAN spec, it is mentioned that this field is to "enable a
> > >> level of entropy", but not exactly names it as entropy.
> > >
> > > Exactly my thought about the naming.
> > > Good to see I am not alone thinking this naming is disturbing :)
> >
> > I'd avoid usage of term "entropy" in this patch. It is very confusing.
>
> What about rte_flow_calc_encap_hash?
>
>
How about simply rte_flow_calc_hash? My understanding is this is a general-purpose hash that is not limited to encapsulation work.
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [RFC] ethdev: introduce entropy calculation
2024-01-04 12:57 ` Dumitrescu, Cristian
@ 2024-01-04 14:33 ` Ori Kam
2024-01-04 18:18 ` Thomas Monjalon
0 siblings, 1 reply; 19+ messages in thread
From: Ori Kam @ 2024-01-04 14:33 UTC (permalink / raw)
To: Dumitrescu, Cristian, Andrew Rybchenko,
NBU-Contact-Thomas Monjalon (EXTERNAL),
Stephen Hemminger, Ferruh Yigit
Cc: Dariusz Sosnowski, dev, Raslan Darawsheh
Hi Cristian,
> -----Original Message-----
> From: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Sent: Thursday, January 4, 2024 2:57 PM
> > > >>
> > > >> And unless this is specifically defined as 'entropy' in spec, I am too
> > > >> for rename.
> > > >>
> > > >> At least in VXLAN spec, it is mentioned that this field is to "enable a
> > > >> level of entropy", but not exactly names it as entropy.
> > > >
> > > > Exactly my thought about the naming.
> > > > Good to see I am not alone thinking this naming is disturbing :)
> > >
> > > I'd avoid usage of term "entropy" in this patch. It is very confusing.
> >
> > What about rte_flow_calc_encap_hash?
> >
> >
> How about simply rte_flow_calc_hash? My understanding is this is a general-
> purpose hash that is not limited to encapsulation work.
Unfortunately, this is not a general-purpose hash. HW may implement a different hash for each use case.
also, the hash result is length differs depending on the feature and even the target field.
We can take your naming idea and change the parameters a bit:
rte_flow_calc_hash(port, feature, *attribute, pattern, hash_len, *hash)
For the feature we will have at this point:
NVGRE_HASH,
SPORT_HASH
The attribute parameter will be empty for now, but it may be used later to add extra information
for the hash if more information is required, for example, some key.
In addition, we will also be able to merge the current function rte_flow_calc_table_hash,
if we pass the missing parameters (table id, template id) in the attribute field.
What do you think?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC] ethdev: introduce entropy calculation
2024-01-04 14:33 ` Ori Kam
@ 2024-01-04 18:18 ` Thomas Monjalon
2024-01-07 9:37 ` Ori Kam
0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2024-01-04 18:18 UTC (permalink / raw)
To: Dumitrescu, Cristian, Andrew Rybchenko, Stephen Hemminger,
Ferruh Yigit, Ori Kam
Cc: Dariusz Sosnowski, dev, Raslan Darawsheh
04/01/2024 15:33, Ori Kam:
> Hi Cristian,
>
> > From: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> > Sent: Thursday, January 4, 2024 2:57 PM
> > > > >>
> > > > >> And unless this is specifically defined as 'entropy' in spec, I am too
> > > > >> for rename.
> > > > >>
> > > > >> At least in VXLAN spec, it is mentioned that this field is to "enable a
> > > > >> level of entropy", but not exactly names it as entropy.
> > > > >
> > > > > Exactly my thought about the naming.
> > > > > Good to see I am not alone thinking this naming is disturbing :)
> > > >
> > > > I'd avoid usage of term "entropy" in this patch. It is very confusing.
> > >
> > > What about rte_flow_calc_encap_hash?
> > >
> > >
> > How about simply rte_flow_calc_hash? My understanding is this is a general-
> > purpose hash that is not limited to encapsulation work.
>
> Unfortunately, this is not a general-purpose hash. HW may implement a different hash for each use case.
> also, the hash result is length differs depending on the feature and even the target field.
>
> We can take your naming idea and change the parameters a bit:
> rte_flow_calc_hash(port, feature, *attribute, pattern, hash_len, *hash)
>
> For the feature we will have at this point:
> NVGRE_HASH,
> SPORT_HASH
>
> The attribute parameter will be empty for now, but it may be used later to add extra information
> for the hash if more information is required, for example, some key.
> In addition, we will also be able to merge the current function rte_flow_calc_table_hash,
> if we pass the missing parameters (table id, template id) in the attribute field.
>
> What do you think?
I like the idea of having a single function for HW hashes.
Is there an impact on performance? How much is it sensitive?
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [RFC] ethdev: introduce entropy calculation
2024-01-04 18:18 ` Thomas Monjalon
@ 2024-01-07 9:37 ` Ori Kam
0 siblings, 0 replies; 19+ messages in thread
From: Ori Kam @ 2024-01-07 9:37 UTC (permalink / raw)
To: NBU-Contact-Thomas Monjalon (EXTERNAL),
Dumitrescu, Cristian, Andrew Rybchenko, Stephen Hemminger,
Ferruh Yigit
Cc: Dariusz Sosnowski, dev, Raslan Darawsheh
Hi
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, January 4, 2024 8:19 PM
>
> 04/01/2024 15:33, Ori Kam:
> > Hi Cristian,
> >
> > > From: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> > > Sent: Thursday, January 4, 2024 2:57 PM
> > > > > >>
> > > > > >> And unless this is specifically defined as 'entropy' in spec, I am too
> > > > > >> for rename.
> > > > > >>
> > > > > >> At least in VXLAN spec, it is mentioned that this field is to "enable a
> > > > > >> level of entropy", but not exactly names it as entropy.
> > > > > >
> > > > > > Exactly my thought about the naming.
> > > > > > Good to see I am not alone thinking this naming is disturbing :)
> > > > >
> > > > > I'd avoid usage of term "entropy" in this patch. It is very confusing.
> > > >
> > > > What about rte_flow_calc_encap_hash?
> > > >
> > > >
> > > How about simply rte_flow_calc_hash? My understanding is this is a
> general-
> > > purpose hash that is not limited to encapsulation work.
> >
> > Unfortunately, this is not a general-purpose hash. HW may implement a
> different hash for each use case.
> > also, the hash result is length differs depending on the feature and even the
> target field.
> >
> > We can take your naming idea and change the parameters a bit:
> > rte_flow_calc_hash(port, feature, *attribute, pattern, hash_len, *hash)
> >
> > For the feature we will have at this point:
> > NVGRE_HASH,
> > SPORT_HASH
> >
> > The attribute parameter will be empty for now, but it may be used later to
> add extra information
> > for the hash if more information is required, for example, some key.
> > In addition, we will also be able to merge the current function
> rte_flow_calc_table_hash,
> > if we pass the missing parameters (table id, template id) in the attribute field.
> >
> > What do you think?
>
> I like the idea of having a single function for HW hashes.
> Is there an impact on performance? How much is it sensitive?
>
>
It is sensitive since we expect this function to be called for each packet.
This may not be such a great idea.
So, back to square one and keep the rte_flow_calc_encap_hash
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [RFC] ethdev: introduce entropy calculation
2023-12-17 10:07 ` Ori Kam
@ 2024-01-12 7:46 ` Andrew Rybchenko
2024-01-21 9:36 ` Ori Kam
0 siblings, 1 reply; 19+ messages in thread
From: Andrew Rybchenko @ 2024-01-12 7:46 UTC (permalink / raw)
To: Ori Kam, Dariusz Sosnowski, ferruh.yigit, cristian.dumitrescu,
NBU-Contact-Thomas Monjalon (EXTERNAL)
Cc: dev, Raslan Darawsheh
Hi Ori,
sorry for delay with reply.
On 12/17/23 13:07, Ori Kam wrote:
> Hi Andrew,
>
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Sent: Saturday, December 16, 2023 11:19 AM
>>
>> On 12/10/23 11:30, Ori Kam wrote:
>>> When offloading rules with the encap action, the HW may calculate entropy
>> based on the encap protocol.
>>> Each HW can implement a different algorithm.
>>>
>>> When the application receives packets that should have been
>>> encaped by the HW, but didn't reach this stage yet (for example TCP SYN
>> packets),
>>> then when encap is done in SW, application must apply
>>> the same entropy calculation algorithm.
>>>
>>> Using the new API application can request the PMD to calculate the
>>> value as if the packet passed in the HW.
>>
>> I'm still wondering why the API is required. Does app install encap
>> rule when the first packet is processed? The rule can contain all
>> outer headers (as it is calculated in SW anyway) and HW does not need
>> to calculate anything.
>
> Yes, the application installs a rule based on the first packet.
> as a result, all the rest of the packets are encaped by the HW.
> This API allows the application to use the same value as the HW will use when encapsulating the packet.
> In other words, we have 2 paths:
> Path 1 SW, for the first packet
> Path 2 HW, all the rest of the packest
I get it, but I still don't understand why HW should calculate
something. Can it simply use value provided by SW in encap rule?
If so, calculation becomes not HW-specific and does not require
driver callback.
>>> Signed-off-by: Ori Kam <orika@nvidia.com>
>>> ---
>>> lib/ethdev/rte_flow.h | 49
>> +++++++++++++++++++++++++++++++++++++++++++
>>> 1 file changed, 49 insertions(+)
>>>
>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>>> index affdc8121b..3989b089dd 100644
>>> --- a/lib/ethdev/rte_flow.h
>>> +++ b/lib/ethdev/rte_flow.h
>>> @@ -6753,6 +6753,55 @@ rte_flow_calc_table_hash(uint16_t port_id, const
>> struct rte_flow_template_table
>>> const struct rte_flow_item pattern[], uint8_t
>> pattern_template_index,
>>> uint32_t *hash, struct rte_flow_error *error);
>>>
>>> +/**
>>> + * @warning
>>> + * @b EXPERIMENTAL: this API may change without prior notice.
>>> + *
>>> + * Destination field type for the entropy calculation.
>>> + *
>>> + * @see function rte_flow_calc_encap_entropy
>>> + */
>>> +enum rte_flow_entropy_dest {
>>> + /* Calculate entropy placed in UDP source port field. */
>>> + RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT,
>>
>> And source and destination are used together but for different
>> purposes it is very hard to follow which is used for which purpose.
>> I'd avoid term "dest" in the enum naming. May be present "flow" is
>> already good enough to highlight that it is per-flow?
>> rte_flow_encap_hash? rte_flow_encap_field_hash?
>
> I'm open to any suggestions, this enum is supposed to show to which
> field the HW insert the calculated value. This field is defined by the encapsulation
> protocol. For example, VXLAN the hash is stored in source port, while in NVGRE it is stored in
> flow_id field. The destination field also impact the size.
>
> What do you think about:
> RTE_FLOW_ENCAP_HASH_SRC_PORT?
Sounds better.
> What about if we change the destination to enum that will hold the destination tunnel type
> RTE_FLOW_TUNNEL_TYPE_VXLAN,
> RTE_FLOW_TUNNEL_TYPE_NVGRE
It could be an option as well, but binds tunnel type to size of hash to
be calculated. It looks OK right now, but may be wrong in the future.
>>> + /* Calculate entropy placed in NVGRE flow ID field. */
>>> + RTE_FLOW_ENTROPY_DEST_NVGRE_FLOW_ID,
>>> +};
>>> +
>>> +/**
>>> + * @warning
>>> + * @b EXPERIMENTAL: this API may change without prior notice.
>>> + *
>>> + * Calculate the entropy generated by the HW for a given pattern,
>>> + * when encapsulation flow action is executed.
>>> + *
>>> + * @param[in] port_id
>>> + * Port identifier of Ethernet device.
>>> + * @param[in] pattern
>>> + * The values to be used in the entropy calculation.
>>
>> Is it inner packet fields? Should all fields be used for hash
>> calculation? May be it is better to describe as inner packet
>> headers? How does app know which headers to extract and provide?
>
> The fields are the outer fields that will become inner fields, after the encapsulation.
> The fields are dependent on the HW (in standard case 5 tuple). but applications can /should set
> all the fields he got from the packet, at the end those are the fields that the HW will see.
OK I see.
>>
>>> + * @param[in] dest_field
>>> + * Type of destination field for entropy calculation.
>>> + * @param[out] entropy
>>> + * Used to return the calculated entropy. It will be written in network order,
>>> + * so entropy[0] is the MSB.
>>> + * The number of bytes is based on the destination field type.f
>>
>> API contract is a bit unclear here. May be it is safer to provide
>> buffer size explicitly?
>
> The size of the field is defined by the destination field, which in turn is defined by the
> protocol.
>
> I don't think adding size has any meaning when you know that the value is going to be set
> in source port which has the size of 16 bites.
> Based on enum suggestion of tunnel type. I think it will also explain the dest and size. What do you think?
I see, but IMHO it is still very unsafe. Simply too error prone and
there is not way to catch it. IMHO buffer size will not overload it
too much, but will make API clearer and safer.
>
>>
>>> + * @param[out] error
>>> + * Perform verbose error reporting if not NULL.
>>> + * PMDs initialize this structure in case of error only.
>>> + *
>>> + * @return
>>> + * - (0) if success.
>>> + * - (-ENODEV) if *port_id* invalid.
>>> + * - (-ENOTSUP) if underlying device does not support this functionality.
>>> + * - (-EINVAL) if *pattern* doesn't hold enough information to calculate the
>> entropy
>>> + * or the dest is not supported.
>>> + */
>>> +__rte_experimental
>>> +int
>>> +rte_flow_calc_encap_entropy(uint16_t port_id, const struct rte_flow_item
>> pattern[],
>>> + enum rte_flow_entropy_dest dest_field, uint8_t
>> *entropy,
>>> + struct rte_flow_error *error);
>>> +
>>> #ifdef __cplusplus
>>> }
>>> #endif
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [RFC] ethdev: introduce entropy calculation
2024-01-12 7:46 ` Andrew Rybchenko
@ 2024-01-21 9:36 ` Ori Kam
0 siblings, 0 replies; 19+ messages in thread
From: Ori Kam @ 2024-01-21 9:36 UTC (permalink / raw)
To: Andrew Rybchenko, Dariusz Sosnowski, ferruh.yigit,
cristian.dumitrescu, NBU-Contact-Thomas Monjalon (EXTERNAL)
Cc: dev, Raslan Darawsheh
Hi Andrew,
> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Friday, January 12, 2024 9:46 AM
> To: Ori Kam <orika@nvidia.com>; Dariusz Sosnowski
> <dsosnowski@nvidia.com>; ferruh.yigit@amd.com;
> cristian.dumitrescu@intel.com; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <thomas@monjalon.net>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> Subject: Re: [RFC] ethdev: introduce entropy calculation
>
> Hi Ori,
>
> sorry for delay with reply.
>
> On 12/17/23 13:07, Ori Kam wrote:
> > Hi Andrew,
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >> Sent: Saturday, December 16, 2023 11:19 AM
> >>
> >> On 12/10/23 11:30, Ori Kam wrote:
> >>> When offloading rules with the encap action, the HW may calculate entropy
> >> based on the encap protocol.
> >>> Each HW can implement a different algorithm.
> >>>
> >>> When the application receives packets that should have been
> >>> encaped by the HW, but didn't reach this stage yet (for example TCP SYN
> >> packets),
> >>> then when encap is done in SW, application must apply
> >>> the same entropy calculation algorithm.
> >>>
> >>> Using the new API application can request the PMD to calculate the
> >>> value as if the packet passed in the HW.
> >>
> >> I'm still wondering why the API is required. Does app install encap
> >> rule when the first packet is processed? The rule can contain all
> >> outer headers (as it is calculated in SW anyway) and HW does not need
> >> to calculate anything.
> >
> > Yes, the application installs a rule based on the first packet.
> > as a result, all the rest of the packets are encaped by the HW.
> > This API allows the application to use the same value as the HW will use when
> encapsulating the packet.
> > In other words, we have 2 paths:
> > Path 1 SW, for the first packet
> > Path 2 HW, all the rest of the packest
>
> I get it, but I still don't understand why HW should calculate
> something. Can it simply use value provided by SW in encap rule?
> If so, calculation becomes not HW-specific and does not require
> driver callback.
The value is calculated per 5 tuple, it is possible that the SW will create
the encap flow with the required data. But this means that this rule should be per tuple.
On the other hand, if the application wants to save HW resources and it can create a single encapsulation
rule that will encapsulate all flows. In the last case the HW must calculate the value since the SW can't configure the same value
for all flows.
Example for such a use case:
Group 0:
Match 5 tuple
Actions: NAT + counter + jump to group 2
Group 2
Match *
Actions: VXLAN encap
In the above case the application only use one HW resource for encapsulation.
I hope the use case is clearer to you
>
>
> >>> Signed-off-by: Ori Kam <orika@nvidia.com>
> >>> ---
> >>> lib/ethdev/rte_flow.h | 49
> >> +++++++++++++++++++++++++++++++++++++++++++
> >>> 1 file changed, 49 insertions(+)
> >>>
> >>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> >>> index affdc8121b..3989b089dd 100644
> >>> --- a/lib/ethdev/rte_flow.h
> >>> +++ b/lib/ethdev/rte_flow.h
> >>> @@ -6753,6 +6753,55 @@ rte_flow_calc_table_hash(uint16_t port_id,
> const
> >> struct rte_flow_template_table
> >>> const struct rte_flow_item pattern[], uint8_t
> >> pattern_template_index,
> >>> uint32_t *hash, struct rte_flow_error *error);
> >>>
> >>> +/**
> >>> + * @warning
> >>> + * @b EXPERIMENTAL: this API may change without prior notice.
> >>> + *
> >>> + * Destination field type for the entropy calculation.
> >>> + *
> >>> + * @see function rte_flow_calc_encap_entropy
> >>> + */
> >>> +enum rte_flow_entropy_dest {
> >>> + /* Calculate entropy placed in UDP source port field. */
> >>> + RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT,
> >>
> >> And source and destination are used together but for different
> >> purposes it is very hard to follow which is used for which purpose.
> >> I'd avoid term "dest" in the enum naming. May be present "flow" is
> >> already good enough to highlight that it is per-flow?
> >> rte_flow_encap_hash? rte_flow_encap_field_hash?
> >
> > I'm open to any suggestions, this enum is supposed to show to which
> > field the HW insert the calculated value. This field is defined by the
> encapsulation
> > protocol. For example, VXLAN the hash is stored in source port, while in
> NVGRE it is stored in
> > flow_id field. The destination field also impact the size.
> >
> > What do you think about:
> > RTE_FLOW_ENCAP_HASH_SRC_PORT?
>
> Sounds better.
Great
>
> > What about if we change the destination to enum that will hold the
> destination tunnel type
> > RTE_FLOW_TUNNEL_TYPE_VXLAN,
> > RTE_FLOW_TUNNEL_TYPE_NVGRE
>
> It could be an option as well, but binds tunnel type to size of hash to
> be calculated. It looks OK right now, but may be wrong in the future.
>
Lets start with this and if needed update.
> >>> + /* Calculate entropy placed in NVGRE flow ID field. */
> >>> + RTE_FLOW_ENTROPY_DEST_NVGRE_FLOW_ID,
> >>> +};
> >>> +
> >>> +/**
> >>> + * @warning
> >>> + * @b EXPERIMENTAL: this API may change without prior notice.
> >>> + *
> >>> + * Calculate the entropy generated by the HW for a given pattern,
> >>> + * when encapsulation flow action is executed.
> >>> + *
> >>> + * @param[in] port_id
> >>> + * Port identifier of Ethernet device.
> >>> + * @param[in] pattern
> >>> + * The values to be used in the entropy calculation.
> >>
> >> Is it inner packet fields? Should all fields be used for hash
> >> calculation? May be it is better to describe as inner packet
> >> headers? How does app know which headers to extract and provide?
> >
> > The fields are the outer fields that will become inner fields, after the
> encapsulation.
> > The fields are dependent on the HW (in standard case 5 tuple). but applications
> can /should set
> > all the fields he got from the packet, at the end those are the fields that the
> HW will see.
>
> OK I see.
>
> >>
> >>> + * @param[in] dest_field
> >>> + * Type of destination field for entropy calculation.
> >>> + * @param[out] entropy
> >>> + * Used to return the calculated entropy. It will be written in network
> order,
> >>> + * so entropy[0] is the MSB.
> >>> + * The number of bytes is based on the destination field type.f
> >>
> >> API contract is a bit unclear here. May be it is safer to provide
> >> buffer size explicitly?
> >
> > The size of the field is defined by the destination field, which in turn is defined
> by the
> > protocol.
> >
> > I don't think adding size has any meaning when you know that the value is
> going to be set
> > in source port which has the size of 16 bites.
> > Based on enum suggestion of tunnel type. I think it will also explain the dest
> and size. What do you think?
>
> I see, but IMHO it is still very unsafe. Simply too error prone and
> there is not way to catch it. IMHO buffer size will not overload it
> too much, but will make API clearer and safer.
O.K we can add buffer size, even just for the ability to validate.
>
> >
> >>
> >>> + * @param[out] error
> >>> + * Perform verbose error reporting if not NULL.
> >>> + * PMDs initialize this structure in case of error only.
> >>> + *
> >>> + * @return
> >>> + * - (0) if success.
> >>> + * - (-ENODEV) if *port_id* invalid.
> >>> + * - (-ENOTSUP) if underlying device does not support this functionality.
> >>> + * - (-EINVAL) if *pattern* doesn't hold enough information to calculate
> the
> >> entropy
> >>> + * or the dest is not supported.
> >>> + */
> >>> +__rte_experimental
> >>> +int
> >>> +rte_flow_calc_encap_entropy(uint16_t port_id, const struct rte_flow_item
> >> pattern[],
> >>> + enum rte_flow_entropy_dest dest_field, uint8_t
> >> *entropy,
> >>> + struct rte_flow_error *error);
> >>> +
> >>> #ifdef __cplusplus
> >>> }
> >>> #endif
> >
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2024-01-21 9:36 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-10 8:30 [RFC] ethdev: introduce entropy calculation Ori Kam
2023-12-12 12:19 ` Dariusz Sosnowski
2023-12-14 11:34 ` Ferruh Yigit
2023-12-14 14:16 ` Ori Kam
2023-12-14 15:25 ` Dumitrescu, Cristian
2023-12-14 17:18 ` Ori Kam
2023-12-14 17:26 ` Stephen Hemminger
2023-12-15 13:44 ` Ferruh Yigit
2023-12-15 16:21 ` Thomas Monjalon
2023-12-16 9:03 ` Andrew Rybchenko
2023-12-27 15:20 ` Ori Kam
2024-01-04 12:57 ` Dumitrescu, Cristian
2024-01-04 14:33 ` Ori Kam
2024-01-04 18:18 ` Thomas Monjalon
2024-01-07 9:37 ` Ori Kam
2023-12-16 9:19 ` Andrew Rybchenko
2023-12-17 10:07 ` Ori Kam
2024-01-12 7:46 ` Andrew Rybchenko
2024-01-21 9:36 ` Ori Kam
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).