From: Ferruh Yigit <ferruh.yigit@amd.com>
To: Ori Kam <orika@nvidia.com>,
Dariusz Sosnowski <dsosnowski@nvidia.com>,
"cristian.dumitrescu@intel.com" <cristian.dumitrescu@intel.com>,
"andrew.rybchenko@oktetlabs.ru" <andrew.rybchenko@oktetlabs.ru>,
"stephen@networkplumber.org" <stephen@networkplumber.org>,
"NBU-Contact-Thomas Monjalon (EXTERNAL)" <thomas@monjalon.net>
Cc: "dev@dpdk.org" <dev@dpdk.org>, Raslan Darawsheh <rasland@nvidia.com>
Subject: Re: [PATCH v2 1/4] ethdev: introduce encap hash calculation
Date: Mon, 12 Feb 2024 20:09:47 +0000
Message-ID: <19b7d3db-f142-49fa-976d-a180f03d7a0b@amd.com>
In-Reply-To: <MW2PR12MB46664661B2732AA5AF31D14BD6482@MW2PR12MB4666.namprd12.prod.outlook.com>
On 2/12/2024 6:44 PM, Ori Kam wrote:
> Hi Ferruh
>
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Monday, February 12, 2024 7:05 PM
>>
>> On 2/11/2024 7:29 AM, Ori Kam wrote:
>>> Hi Ferruh,
>>>
>>>> -----Original Message-----
>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>>>> Sent: Thursday, February 8, 2024 7:13 PM
>>>> To: Ori Kam <orika@nvidia.com>; Dariusz Sosnowski
>>>>
>>>> On 2/8/2024 9:09 AM, Ori Kam wrote:
>>>>> During encapsulation of a packet, it is possible to change some
>>>>> outer headers to improve flow distribution.
>>>>> For example, from VXLAN RFC:
>>>>> "It is recommended that the UDP source port number
>>>>> be calculated using a hash of fields from the inner packet --
>>>>> one example being a hash of the inner Ethernet frame's headers.
>>>>> This is to enable a level of entropy for the ECMP/load-balancing"
>>>>>
>>>>> The tunnel protocol defines which outer field should hold this hash,
>>>>> but it doesn't define the hash calculation algorithm.
>>>>>
>>>>> An application that uses flow offloads gets the first few packets
>>>>> (exception path) and then decides to offload the flow.
>>>>> As a result, there are two
>>>>> different paths that a packet from a given flow may take.
>>>>> SW for the first few packets or HW for the rest.
>>>>> When the packet goes through the SW, the SW encapsulates the packet
>>>>> and must use the same hash calculation as the HW will do for
>>>>> the rest of the packets in this flow.
>>>>>
>>>>> The new function rte_flow_calc_encap_hash can query the hash value
>>>>> from the driver for a given packet as if the packet was passed
>>>>> through the HW.
>>>>>
>>>>> Signed-off-by: Ori Kam <orika@nvidia.com>
>>>>> Acked-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
>>>>>
>>>>
>>>> <...>
>>>>
>>>>> +int
>>>>> +rte_flow_calc_encap_hash(uint16_t port_id, const struct rte_flow_item
>>>> pattern[],
>>>>> + enum rte_flow_encap_hash_field dest_field, uint8_t
>>>> hash_len,
>>>>> + uint8_t *hash, struct rte_flow_error *error)
>>>>> +{
>>>>> + int ret;
>>>>> + struct rte_eth_dev *dev;
>>>>> + const struct rte_flow_ops *ops;
>>>>> +
>>>>> + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>>>>> + ops = rte_flow_ops_get(port_id, error);
>>>>> + if (!ops || !ops->flow_calc_encap_hash)
>>>>> + return rte_flow_error_set(error, ENOTSUP,
>>>>> +
>>>> RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
>>>>> + "calc encap hash is not supported");
>>>>> + if ((dest_field == RTE_FLOW_ENCAP_HASH_FIELD_SRC_PORT &&
>>>> hash_len != 2) ||
>>>>> + (dest_field == RTE_FLOW_ENCAP_HASH_FIELD_NVGRE_FLOW_ID
>>>> && hash_len != 1))
>>>>>
>>>>
>>>> If there is a fixed mapping with the dest_field and the size, instead of
>>>> putting this information into check code, what do you think to put it
>>>> into the data structure?
>>>>
>>>> I mean instead of using an enum for dest_field, it can be a struct that
>>>> holds the enum and its expected size; this clarifies what the expected
>>>> size is for that field.
>>>>
>>>
>>> From my original email I think we only need the type, we don't need the
>> size.
>>> On the RFC thread there was an objection. So I added the size,
>>> If you think it is not needed lets remove it.
>>>
>>
>> I am not saying the length is not needed, but the
>> API gets 'dest_field' & 'hash_len', and according to the checks in the API,
>> each 'dest_field' has an exact 'hash_len' requirement. This
>> requirement impacts the user, but the information is embedded
>> in the API; my suggestion is to make it more visible to the user.
>>
>> My initial suggestion was to put this into an object, like:
>> ```
>> struct x {
>> 	enum rte_flow_encap_hash_field dest_field;
>> 	size_t expected_size;
>> } y[] = {
>> 	{ RTE_FLOW_ENCAP_HASH_FIELD_SRC_PORT, 2 },
>> 	{ RTE_FLOW_ENCAP_HASH_FIELD_NVGRE_FLOW_ID, 1 }
>> };
>> ```
>>
>> But as you mentioned this is a limited set, perhaps it is sufficient to
>> document size requirement in the "enum rte_flow_encap_hash_field" API
>> doxygen comment.
>
> Will add it to the doxygen.
>
>>
>>
>>
>>>>> + return rte_flow_error_set(error, EINVAL,
>>>>> +
>>>> RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
>>>>> + "hash len doesn't match the
>>>> requested field len");
>>>>> + dev = &rte_eth_devices[port_id];
>>>>> + ret = ops->flow_calc_encap_hash(dev, pattern, dest_field, hash,
>>>> error);
>>>>>
>>>>
>>>> 'hash_len' is received by the API, but it is not passed to the dev_ops. Does this
>>>> mean this information is hardcoded in the driver as well? If so, why
>>>> duplicate this information in the driver instead of passing hash_len to the driver?
>>>
>>> Not sure I understand, like I wrote above this is pure verification from my
>> point of view.
>>> The driver knows the size based on the dest.
>>>
>>
>> My intention was similar to the comment above: dest_field type
>> RTE_FLOW_ENCAP_HASH_FIELD_SRC_PORT implies that the required size should
>> be 2 bytes, and it seems the driver already knows about this requirement.
>
> That is correct, that is why I don't think we need the size, and added it
> only for validation due to community request.
>
>>
>> Instead, it can be possible to verify 'hash_len' in the API level, pass
>> this information to the driver and driver use 'hash_len' directly for
>> its size parameter, so driver will rely on API provided 'hash_len' value
>> instead of storing this information within driver.
>>
>> Let's assume 10 drivers are implementing this feature; should all of them
>> define a MLX5DR_CRC_ENCAP_ENTROPY_HASH_SIZE_16 equivalent enum/define
>> within the driver?
>
> No, the driver implements hard-coded logic, which means that it just needs to know
> the dest field, in order to know what hash to calculate
> It is possible that for each field the HW will calculate the hash using different algorithm.
>
OK, if the HW already needs to know the size in advance, let's go with the
enum doxygen update only.
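
As a rough illustration of the doxygen-only option, the enum could carry the size requirement in its member comments. This is only a sketch, not the final text of the patch: the enum is redeclared locally so the snippet builds without DPDK headers, the comment wording is an assumption, and `encap_hash_expected_len()` is a hypothetical helper that merely restates the field-to-size mapping from the check quoted earlier in this thread.

```c
#include <stddef.h>

/* Local copy of the enum from the patch, so the sketch builds without DPDK. */
enum rte_flow_encap_hash_field {
	/** Calculate hash placed in UDP source port field; hash_len must be 2. */
	RTE_FLOW_ENCAP_HASH_FIELD_SRC_PORT,
	/** Calculate hash placed in NVGRE flow ID field; hash_len must be 1. */
	RTE_FLOW_ENCAP_HASH_FIELD_NVGRE_FLOW_ID,
};

/*
 * Hypothetical helper (not part of the patch): return the hash length in
 * bytes implied by the destination field, or 0 for an unknown field.
 */
static inline size_t
encap_hash_expected_len(enum rte_flow_encap_hash_field field)
{
	switch (field) {
	case RTE_FLOW_ENCAP_HASH_FIELD_SRC_PORT:
		return 2; /* UDP source port is 16 bits wide */
	case RTE_FLOW_ENCAP_HASH_FIELD_NVGRE_FLOW_ID:
		return 1; /* NVGRE flow ID is 8 bits wide */
	}
	return 0;
}
```

With comments like these, the size requirement is visible in the generated API documentation without adding a new structure to the API.
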
> Also it is possible that the HW doesn't support writing to the expected field, in which case we
> want the driver call to fail.
>
> Field implies size.
> Size doesn't imply field.
>
>>
>>>>
>>>>
>>>>> + return flow_err(port_id, ret, error);
>>>>> +}
>>>>> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
>>>>> index 1267c146e5..2bdf3a4a17 100644
>>>>> --- a/lib/ethdev/rte_flow.h
>>>>> +++ b/lib/ethdev/rte_flow.h
>>>>> @@ -6783,6 +6783,57 @@ rte_flow_calc_table_hash(uint16_t port_id,
>>>> const struct rte_flow_template_table
>>>>> const struct rte_flow_item pattern[], uint8_t
>>>> pattern_template_index,
>>>>> uint32_t *hash, struct rte_flow_error *error);
>>>>>
>>>>> +/**
>>>>> + * @warning
>>>>> + * @b EXPERIMENTAL: this API may change without prior notice.
>>>>> + *
>>>>> + * Destination field type for the hash calculation, when encap action is
>>>> used.
>>>>> + *
>>>>> + * @see function rte_flow_calc_encap_hash
>>>>> + */
>>>>> +enum rte_flow_encap_hash_field {
>>>>> + /* Calculate hash placed in UDP source port field. */
>>>>>
>>
>> Just noticed that the comments are not doxygen comments.
>
> Thanks,
> Will fix.
>>
>>>>> + RTE_FLOW_ENCAP_HASH_FIELD_SRC_PORT,
>>>>> + /* Calculate hash placed in NVGRE flow ID field. */
>>>>> + RTE_FLOW_ENCAP_HASH_FIELD_NVGRE_FLOW_ID,
>>>>> +};
>>>>>
>>>>
>>>> Indeed above enum represents a field in a network protocol, right?
>>>> Instead of having a 'RTE_FLOW_ENCAP_HASH_' specific one, can re-using
>>>> 'enum rte_flow_field_id' work?
>>>
>>> Since the options are really limited and defined by standard, I prefer to have
>> dedicated options.
>>>
>>
>> OK, my intention is to reduce the duplication. Just to brainstorm, what
>> is the benefit of having 'RTE_FLOW_ENCAP_HASH_' specific enums, if we
>> can present them as generic protocol fields, like
>> 'RTE_FLOW_ENCAP_HASH_FIELD_SRC_PORT' vs
>> 'RTE_FLOW_FIELD_UDP_PORT_SRC'?
>
> I guess you want to go with 'RTE_FLOW_FIELD_UDP_PORT_SRC',
> right?
>
I just want to discuss whether the redundancy can be eliminated.
> The main issue is that the options are really limited and used for a very dedicated function.
> When app developers / DPDK developers look at it, the use of this enum will be very unclear.
> We already have an enum for fields. As you suggested, we could have used it,
> but it would show many more options than are really available.
>
OK, let's use dedicated enums to make clear to users which specific fields
are available for this set of APIs.
Btw, is a boundary check like the following required for the APIs:
```
if (dest_field > RTE_FLOW_ENCAP_HASH_FIELD_NVGRE_FLOW_ID)
	return -EINVAL;
```
In case the user passes an invalid value as 'dest_field'.
(Note: I intentionally did not use a MAX enum, something like
'RTE_FLOW_ENCAP_HASH_FIELD_MAX', to avoid having to deal with ABI issues in
the future.)
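
To make the question concrete, here is a minimal sketch of how the boundary check could sit next to the existing length check. It is an illustration under assumptions, not proposed patch content: the enum is redeclared locally so the snippet builds without DPDK headers, and `encap_hash_validate_args()` is a hypothetical name. The upper bound is the last real enumerator, so no MAX value is exposed.

```c
#include <errno.h>
#include <stdint.h>

/* Local copy of the enum from the patch, so the sketch builds without DPDK. */
enum rte_flow_encap_hash_field {
	RTE_FLOW_ENCAP_HASH_FIELD_SRC_PORT,
	RTE_FLOW_ENCAP_HASH_FIELD_NVGRE_FLOW_ID,
};

/*
 * Hypothetical helper (not part of the patch): validate the destination
 * field and the hash length before calling into the driver.
 * Returns 0 when the arguments are consistent, -EINVAL otherwise.
 */
static int
encap_hash_validate_args(enum rte_flow_encap_hash_field dest_field,
			 uint8_t hash_len)
{
	/* The unsigned cast also rejects negative values passed by mistake. */
	if ((unsigned int)dest_field > RTE_FLOW_ENCAP_HASH_FIELD_NVGRE_FLOW_ID)
		return -EINVAL;

	/* Same field/length pairing as the check in rte_flow_calc_encap_hash. */
	if ((dest_field == RTE_FLOW_ENCAP_HASH_FIELD_SRC_PORT && hash_len != 2) ||
	    (dest_field == RTE_FLOW_ENCAP_HASH_FIELD_NVGRE_FLOW_ID && hash_len != 1))
		return -EINVAL;

	return 0;
}
```

If a new field is added later, only the upper bound in the first check needs to move to the new last enumerator.
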