DPDK patches and discussions
From: Ori Kam <orika@nvidia.com>
To: "Dumitrescu, Cristian" <cristian.dumitrescu@intel.com>,
	Ferruh Yigit <ferruh.yigit@amd.com>,
	Dariusz Sosnowski <dsosnowski@nvidia.com>,
	"NBU-Contact-Thomas Monjalon (EXTERNAL)" <thomas@monjalon.net>,
	Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Cc: "dev@dpdk.org" <dev@dpdk.org>, Raslan Darawsheh <rasland@nvidia.com>
Subject: RE: [RFC] ethdev: introduce entropy calculation
Date: Thu, 14 Dec 2023 17:18:25 +0000
Message-ID: <MW2PR12MB4666765248588E7A10FEF510D68CA@MW2PR12MB4666.namprd12.prod.outlook.com>
In-Reply-To: <DS0PR11MB7442F093A8E041BDF4C2F2F3EB8CA@DS0PR11MB7442.namprd11.prod.outlook.com>

Hi Cristian,

> -----Original Message-----
> From: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Sent: Thursday, December 14, 2023 5:25 PM
> 
> Hi Ori,
> 
> A few questions on top of Ferruh's questions for better understanding the
> concept inlined below:
> 
> > -----Original Message-----
> > From: Ori Kam <orika@nvidia.com>
> > Sent: Thursday, December 14, 2023 2:17 PM
> > To: Ferruh Yigit <ferruh.yigit@amd.com>; Dariusz Sosnowski
> > <dsosnowski@nvidia.com>; Dumitrescu, Cristian
> > <cristian.dumitrescu@intel.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
> > <thomas@monjalon.net>; Andrew Rybchenko
> > <andrew.rybchenko@oktetlabs.ru>
> > Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>
> > Subject: RE: [RFC] ethdev: introduce entropy calculation
> >
> > Hi Ferruh,
> >
> > > -----Original Message-----
> > > From: Ferruh Yigit <ferruh.yigit@amd.com>
> > > Sent: Thursday, December 14, 2023 1:35 PM
> > >
> > > On 12/10/2023 8:30 AM, Ori Kam wrote:
> > > > When offloading rules with the encap action, the HW may calculate entropy
> > > > based on the encap protocol.
> > > > Each HW can implement a different algorithm.
> > > >
> > >
> > > Hi Ori,
> > >
> > > Can you please provide more details what this 'entropy' is used for,
> > > what is the usecase?
> >
> > Sure, in some tunnel protocols, for example VXLAN and NVGRE, it is possible to add
> > an entropy value in one of the fields of the tunnel header.
> > In VXLAN, for example, it is in the UDP source port.
> > From the VXLAN protocol:
> > Source Port:  It is recommended that the UDP source port number
> >          be calculated using a hash of fields from the inner packet --
> >          one example being a hash of the inner Ethernet frame's headers.
> >          This is to enable a level of entropy for the ECMP/load-
> >          balancing of the VM-to-VM traffic across the VXLAN overlay.
> >          When calculating the UDP source port number in this manner, it
> >          is RECOMMENDED that the value be in the dynamic/private port
> >          range 49152-65535 [RFC6335].
> >
> > Since encap groups a number of different 5-tuples together, if the HW doesn't know
> > how to RSS based on the inner headers, the application will not be able to get any
> > distribution of packets.
> >
> > This value is used to reflect the inner packet on the outer header, so
> > distribution will be possible.
> >
> > The main use case is when the application does full offload and implements the
> > encap on the RX.
> > For example:
> > Ingress/FDB: match on 5-tuple, encap, send to hairpin / a different port in case of
> > a switch.
> >
> 
> Smart idea! So basically the user is able to get an idea on how good the RSS
> distribution is, correct?
> 

Not exactly; this simply makes the distribution possible.
Maybe "entropy" is a bad name. It is the name used in the protocol, but in reality
this is a hash calculated on the packet headers before the encap and set in the encap header.
Using this hash gives the encapsulated packets entropy, which can be used for load balancing.
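
To illustrate the idea only, here is a minimal sketch in C. It assumes an
FNV-1a hash over the inner header bytes, which is purely a stand-in: the real
algorithm is HW specific and unknown to the application. The names inner_hash()
and hash_to_src_port() are hypothetical and not part of any DPDK API. The fold
keeps the result in the dynamic/private port range 49152-65535 recommended by
the VXLAN text quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * FNV-1a (32-bit) over a byte buffer.
 * Assumption: stand-in for whatever hash the HW really applies
 * to the inner packet headers.
 */
static uint32_t inner_hash(const uint8_t *buf, size_t len)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	assert(buf != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 16777619u;		/* FNV prime */
	}
	return h;
}

/*
 * Fold a 32-bit hash into 49152..65535.
 * The range holds 16384 values, so keep the low 14 bits of the hash
 * and force the two top bits of the 16-bit port (0xC000 == 49152).
 */
static uint16_t hash_to_src_port(uint32_t h)
{
	return (uint16_t)(0xC000u | (h & 0x3FFFu));
}
```

Two packets with the same inner headers get the same outer source port, while
different inner flows are likely to get different ports, which is what lets
ECMP/RSS on the outer header spread the traffic.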

Maybe a better name would be:
rte_flow_calc_entropy_hash?

Or maybe rte_flow_calc_encap_hash (I like it less, since it looks like we calculate the hash on the encap data and not on the inner part).

What do you think?

> Can you elaborate a bit on how the entropy is measured: is it a number, what is
> the range of values, does higher value means better, etc.
> 

Please see my answer above: this is not an entropy value but a value used to provide entropy.

> > The issue starts when there is a miss on the 5-tuple table, for example due to
> > a SYN packet.
> > A packet arrives at the application, and then the application offloads the rule.
> > So the application must encap the packet and set the same entropy as the HW
> > will do for all the rest of the packets.
> >
> 
> How can the app set the entropy?

It can't. It is assumed that the HW is configured with the algorithm, either hardcoded
or by some other means. The application doesn't know and doesn't care what the calculation is,
only about the result.
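
As a rough sketch of how the application would consume that result: the RFC
says the value is returned in network order with entropy[0] as the MSB, and
that for the UDP source port destination it is two bytes. In the C below,
calc_entropy_stub() is a hypothetical stand-in for the PMD call (it returns a
fixed value so the example is self-contained), and struct udp_hdr and
set_outer_src_port() are illustrative names, not DPDK definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Minimal UDP header; all fields are kept in network byte order. */
struct udp_hdr {
	uint16_t src_port;
	uint16_t dst_port;
	uint16_t len;
	uint16_t cksum;
};

/*
 * Hypothetical stand-in for the proposed PMD query.
 * A real implementation would ask the device; here a fixed
 * two-byte value is written, MSB first, and 0 means success.
 */
static int calc_entropy_stub(uint8_t *entropy)
{
	assert(entropy != NULL);
	entropy[0] = 0xC1;	/* most significant byte */
	entropy[1] = 0x23;	/* least significant byte */
	return 0;
}

/*
 * SW encap path: place the HW-equivalent value in the outer UDP
 * source port. The bytes are already in network order, so they are
 * copied as-is with no byte swap.
 */
static int set_outer_src_port(struct udp_hdr *udp)
{
	uint8_t entropy[2];
	int ret = calc_entropy_stub(entropy);

	if (ret != 0)
		return ret;
	memcpy(&udp->src_port, entropy, sizeof(entropy));
	return 0;
}
```

The point of the design is that the application never interprets the value;
it only copies it into the field the HW would have filled.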

> 
> > >
> > >
> > > > When the application receives packets that should have been
> > > > encaped by the HW, but didn't reach this stage yet (for example TCP SYN
> > > > packets), then when encap is done in SW, application must apply
> > > > the same entropy calculation algorithm.
> > > > Using the new API application can request the PMD to calculate the
> > > > value as if the packet passed in the HW.
> > > >
> > >
> > > So is this new API a datapath API? Is the intention that application
> > > call this API per packet that is missing 'entropy' information?
> >
> > The application will call this API when it gets a packet and knows that the
> > rest of the packets from this connection will be offloaded and encaped by the HW.
> > (see above explanation)
> >
> > >
> > > > Signed-off-by: Ori Kam <orika@nvidia.com>
> > > > ---
> > > >  lib/ethdev/rte_flow.h | 49 +++++++++++++++++++++++++++++++++++++++++++
> > > >  1 file changed, 49 insertions(+)
> > > >
> > > > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > > > index affdc8121b..3989b089dd 100644
> > > > --- a/lib/ethdev/rte_flow.h
> > > > +++ b/lib/ethdev/rte_flow.h
> > > > @@ -6753,6 +6753,55 @@ rte_flow_calc_table_hash(uint16_t port_id, const struct rte_flow_template_table
> > > >  			 const struct rte_flow_item pattern[], uint8_t pattern_template_index,
> > > >  			 uint32_t *hash, struct rte_flow_error *error);
> > > >
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > + *
> > > > + * Destination field type for the entropy calculation.
> > > > + *
> > > > + * @see function rte_flow_calc_encap_entropy
> > > > + */
> > > > +enum rte_flow_entropy_dest {
> > > > +	/* Calculate entropy placed in UDP source port field. */
> > > > +	RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT,
> > > > +	/* Calculate entropy placed in NVGRE flow ID field. */
> > > > +	RTE_FLOW_ENTROPY_DEST_NVGRE_FLOW_ID,
> > > > +};
> > > > +
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > + *
> > > > + * Calculate the entropy generated by the HW for a given pattern,
> > > > + * when encapsulation flow action is executed.
> > > > + *
> > > > + * @param[in] port_id
> > > > + *   Port identifier of Ethernet device.
> > > > + * @param[in] pattern
> > > > + *   The values to be used in the entropy calculation.
> > > > + * @param[in] dest_field
> > > > + *   Type of destination field for entropy calculation.
> > > > + * @param[out] entropy
> > > > + *   Used to return the calculated entropy. It will be written in network order,
> > > > + *   so entropy[0] is the MSB.
> > > > + *   The number of bytes is based on the destination field type.
> > > >
> > >
> > >
> > > Is the size same as field size in the 'dest_field'?
> > > Like for 'RTE_FLOW_ENTROPY_DEST_UDP_SRC_PORT' is it two bytes?
> >
> > Yes,
> > >
> > >
> > > > + * @param[out] error
> > > > + *   Perform verbose error reporting if not NULL.
> > > > + *   PMDs initialize this structure in case of error only.
> > > > + *
> > > > + * @return
> > > > + *   - (0) if success.
> > > > + *   - (-ENODEV) if *port_id* invalid.
> > > > + *   - (-ENOTSUP) if underlying device does not support this functionality.
> > > > + *   - (-EINVAL) if *pattern* doesn't hold enough information to calculate the entropy
> > > > + *               or the dest is not supported.
> > > > + */
> > > > +__rte_experimental
> > > > +int
> > > > +rte_flow_calc_encap_entropy(uint16_t port_id, const struct rte_flow_item pattern[],
> > > > +			    enum rte_flow_entropy_dest dest_field, uint8_t *entropy,
> > > > +			    struct rte_flow_error *error);
> > > > +
> > > >  #ifdef __cplusplus
> > > >  }
> > > >  #endif
> 
> Thanks,
> Cristian

