From: Ferruh Yigit <ferruh.yigit@amd.com>
To: "Kusztal, ArkadiuszX" <arkadiuszx.kusztal@intel.com>,
"Marchand, David" <david.marchand@redhat.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Ji, Kai" <kai.ji@intel.com>,
"Dooley, Brian" <brian.dooley@intel.com>
Subject: Re: [PATCH v2 1/3] net: add thread-safe crc api
Date: Wed, 9 Oct 2024 10:11:32 +0100 [thread overview]
Message-ID: <683c3a3d-0e02-46a7-9490-9efe759fa712@amd.com> (raw)
In-Reply-To: <PH0PR11MB5013F251BE81CE4A56D91C2C9F7F2@PH0PR11MB5013.namprd11.prod.outlook.com>
On 10/9/2024 8:48 AM, Kusztal, ArkadiuszX wrote:
>
>
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Wednesday, October 9, 2024 3:03 AM
>> To: Kusztal, ArkadiuszX <arkadiuszx.kusztal@intel.com>; Marchand, David
>> <david.marchand@redhat.com>
>> Cc: dev@dpdk.org; Ji, Kai <kai.ji@intel.com>; Dooley, Brian
>> <brian.dooley@intel.com>
>> Subject: Re: [PATCH v2 1/3] net: add thread-safe crc api
>>
>> On 10/8/2024 9:51 PM, Kusztal, ArkadiuszX wrote:
>>> Hi Ferruh,
>>> Thanks for the review, comments inline,
>>>
>>>> -----Original Message-----
>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>>>> Sent: Tuesday, October 8, 2024 5:43 AM
>>>> To: Kusztal, ArkadiuszX <arkadiuszx.kusztal@intel.com>; Marchand,
>>>> David <david.marchand@redhat.com>
>>>> Cc: dev@dpdk.org; Ji, Kai <kai.ji@intel.com>; Dooley, Brian
>>>> <brian.dooley@intel.com>
>>>> Subject: Re: [PATCH v2 1/3] net: add thread-safe crc api
>>>>
>>>> On 10/1/2024 7:11 PM, Arkadiusz Kusztal wrote:
>>>>> The current net CRC API is not thread-safe, this patch solves this
>>>>> by adding another, thread-safe API functions.
>>>>> This API is also safe to use across multiple processes, yet with
>>>>> limitations on max-simd-bitwidth, which will be checked only by the
>>>>> process that created the CRC context; all other processes will use
>>>>> the same CRC function when used with the same CRC context.
>>>>> It is an undefined behavior when process binaries are compiled with
>>>>> different SIMD capabilities when the same CRC context is used.
>>>>>
>>>>> Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
>>>>
>>>> <...>
>>>>
>>>>> +static struct
>>>>> +{
>>>>> + uint32_t (*f[RTE_NET_CRC_REQS])
>>>>> + (const uint8_t *data, uint32_t data_len);
>>>>>
>>>>
>>>> It increases readability to typedef function pointers.
>>>
>>> Agree, though this typedef would be used here only, that’s why I left it out.
>>> But I can add it then.
>>>
>>>>
>>>>
>>>>> +} handlers[RTE_NET_CRC_AVX512 + 1];
>>>>>
>>>>> -/**
>>>>> - * Reflect the bits about the middle
>>>>> - *
>>>>> - * @param val
>>>>> - * value to be reflected
>>>>> - *
>>>>> - * @return
>>>>> - * reflected value
>>>>> - */
>>>>> -static uint32_t
>>>>> +static inline uint32_t
>>>>>
>>>>
>>>> Does changing to 'inline' required, as function is static compiler
>>>> can do the same.
>>>
>>> True, though it may be more readable sometimes.
>>> Of course there is no way that in O3 these functions would not be inlined by
>> the compiler, regardless if the inline hint is present or not.
>>>
>>>>
>>>>> reflect_32bits(uint32_t val)
>>>>> {
>>>>> uint32_t i, res = 0;
>>>>> @@ -99,26 +43,7 @@ reflect_32bits(uint32_t val)
>>>>> return res;
>>>>> }
>>>>>
>>>>> -static void
>>>>> -crc32_eth_init_lut(uint32_t poly,
>>>>> - uint32_t *lut)
>>>>> -{
>>>>> - uint32_t i, j;
>>>>> -
>>>>> - for (i = 0; i < CRC_LUT_SIZE; i++) {
>>>>> - uint32_t crc = reflect_32bits(i);
>>>>> -
>>>>> - for (j = 0; j < 8; j++) {
>>>>> - if (crc & 0x80000000L)
>>>>> - crc = (crc << 1) ^ poly;
>>>>> - else
>>>>> - crc <<= 1;
>>>>> - }
>>>>> - lut[i] = reflect_32bits(crc);
>>>>> - }
>>>>> -}
>>>>> -
>>>>> -static __rte_always_inline uint32_t
>>>>> +static inline uint32_t
>>>>>
>>>>
>>>> Why not forcing inline anymore?
>>>> Are these inline changes related to the thread-safety?
>>>
>>> O3 will inline it anyway, and with always_inline it will be inline even in debug
>> mode. I just see no reason forcing it upon the compiler.
>>>
>>>>
>>>>> crc32_eth_calc_lut(const uint8_t *data,
>>>>> uint32_t data_len,
>>>>> uint32_t crc,
>>>>> @@ -130,20 +55,9 @@ crc32_eth_calc_lut(const uint8_t *data,
>>>>> return crc;
>>>>> }
>>>>>
>>>>> -static void
>>>>> -rte_net_crc_scalar_init(void)
>>>>> -{
>>>>> - /* 32-bit crc init */
>>>>> - crc32_eth_init_lut(CRC32_ETH_POLYNOMIAL, crc32_eth_lut);
>>>>> -
>>>>> - /* 16-bit CRC init */
>>>>> - crc32_eth_init_lut(CRC16_CCITT_POLYNOMIAL << 16, crc16_ccitt_lut);
>>>>> -}
>>>>> -
>>>>> static inline uint32_t
>>>>> -rte_crc16_ccitt_handler(const uint8_t *data, uint32_t data_len)
>>>>> +crc16_ccitt(const uint8_t *data, uint32_t data_len)
>>>>> {
>>>>> - /* return 16-bit CRC value */
>>>>>
>>>>
>>>> Why not keep comments? Are they wrong?
>>>
>>> Functions names are very self-explanatory, that’s why I dropped comments. I
>> can add comments if needed.
>>>
>>
>> I am for restricting changes to the target of the patch which is making CRC
>> calculation thread safe, unless code changes invalidates the comments, lets
>> keep them. Same goes with inline related modifications.
>>
>>>>
>>>> <...>
>>>>
>>>>> +static void
>>>>> +crc_scalar_init(void)
>>>>> +{
>>>>> + crc32_eth_init_lut(CRC32_ETH_POLYNOMIAL, crc32_eth_lut);
>>>>> + crc32_eth_init_lut(CRC16_CCITT_POLYNOMIAL << 16, crc16_ccitt_lut);
>>>>> +
>>>>> + handlers[RTE_NET_CRC_SCALAR].f[RTE_NET_CRC16_CCITT] =
>>>> crc16_ccitt;
>>>>> + handlers[RTE_NET_CRC_SCALAR].f[RTE_NET_CRC32_ETH] = crc32_eth;
>>>>>
>>>>
>>>> +1 to remove global handlers pointer and add context,
>>>>
>>>> But current handlers array content is static, it can be set when
>>>> defined, instead of functions.
>>>
>>> Can do it for scalars, but for SIMD there is this runtime check like this:
>>> if (AVX512_VPCLMULQDQ_CPU_SUPPORTED) { So compiled binary on
>> AVX512
>>> machine could filter it out on the machine which does not support it.
>>> There is no NULL check in crc function so it would not change much when
>> called -> Invalid address vs Invalid instruction.
>>> There are not many checks there, as this is CRC after all, it should be a small as
>> possible, yet probably NULL check could be advisable in crc function then.
>>>
>>
>> There is already AVX512_VPCLMULQDQ_CPU_SUPPORTED etc checks in
>> 'rte_net_crc_set()'.
>> So does it work to update 'handlers' statically, without condition, but have
>> conditions when use them.
>>
>>>
>>>>
>>>> <...>
>>>>
>>>>> -static uint32_t
>>>>> -rte_crc32_eth_default_handler(const uint8_t *data, uint32_t
>>>>> data_len)
>>>>> +struct rte_net_crc rte_net_crc_set(enum rte_net_crc_alg alg,
>>>>> + enum rte_net_crc_type type)
>>>>> {
>>>>> - handlers = NULL;
>>>>> - if (max_simd_bitwidth == 0)
>>>>> - max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
>>>>> -
>>>>> - handlers = avx512_vpclmulqdq_get_handlers();
>>>>> - if (handlers != NULL)
>>>>> - return handlers[RTE_NET_CRC32_ETH](data, data_len);
>>>>> - handlers = sse42_pclmulqdq_get_handlers();
>>>>> - if (handlers != NULL)
>>>>> - return handlers[RTE_NET_CRC32_ETH](data, data_len);
>>>>> - handlers = neon_pmull_get_handlers();
>>>>> - if (handlers != NULL)
>>>>> - return handlers[RTE_NET_CRC32_ETH](data, data_len);
>>>>> - handlers = handlers_scalar;
>>>>> - return handlers[RTE_NET_CRC32_ETH](data, data_len);
>>>>> -}
>>>>> + uint16_t max_simd_bitwidth;
>>>>>
>>>>> -/* Public API */
>>>>> -
>>>>> -void
>>>>> -rte_net_crc_set_alg(enum rte_net_crc_alg alg) -{
>>>>> - handlers = NULL;
>>>>> - if (max_simd_bitwidth == 0)
>>>>> - max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
>>>>> + max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
>>>>>
>>>>> switch (alg) {
>>>>> case RTE_NET_CRC_AVX512:
>>>>> - handlers = avx512_vpclmulqdq_get_handlers();
>>>>> - if (handlers != NULL)
>>>>> - break;
>>>>> +#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
>>>>> + if (AVX512_VPCLMULQDQ_CPU_SUPPORTED &&
>>>>> + max_simd_bitwidth >= RTE_VECT_SIMD_512) {
>>>>> + return (struct rte_net_crc){ RTE_NET_CRC_AVX512,
>>>> type };
>>>>> + }
>>>>> +#endif
>>>>> /* fall-through */
>>>>> case RTE_NET_CRC_SSE42:
>>>>> - handlers = sse42_pclmulqdq_get_handlers();
>>>>> - break; /* for x86, always break here */
>>>>> +#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
>>>>> + if (SSE42_PCLMULQDQ_CPU_SUPPORTED &&
>>>>> + max_simd_bitwidth >= RTE_VECT_SIMD_128) {
>>>>> + return (struct rte_net_crc){ RTE_NET_CRC_SSE42, type
>>>> };
>>>>> + }
>>>>> +#endif
>>>>> + break;
>>>>> case RTE_NET_CRC_NEON:
>>>>> - handlers = neon_pmull_get_handlers();
>>>>> - /* fall-through */
>>>>> - case RTE_NET_CRC_SCALAR:
>>>>> - /* fall-through */
>>>>> +#ifdef CC_ARM64_NEON_PMULL_SUPPORT
>>>>> + if (NEON_PMULL_CPU_SUPPORTED &&
>>>>> + max_simd_bitwidth >= RTE_VECT_SIMD_128) {
>>>>> + return (struct rte_net_crc){ RTE_NET_CRC_NEON, type
>>>> };
>>>>> + }
>>>>> +#endif
>>>>>
>>>>
>>>> Is it more readable as following, up to you:
>>>
>>> Agree, I will change it.
>>>
>>>>
>>>> ```
>>>> rte_net_crc_set(alg, type) {
>>>> enum rte_net_crc_alg new_alg = RTE_NET_CRC_SCALAR;
>>>> switch (alg) {
>>>> case AVX512:
>>>> new_alg = ..
>>>> case NEON:
>>>> new_alg = ..
>>>> }
>>>> return struct rte_net_crc){ new_alg, type }; ```
>>>>
>>>>
>>>>
>>>>
>>>>> + break;
>>>>> default:
>>>>> break;
>>>>> }
>>>>> -
>>>>> - if (handlers == NULL)
>>>>> - handlers = handlers_scalar;
>>>>> + return (struct rte_net_crc){ RTE_NET_CRC_SCALAR, type };
>>>>> }
>>>>>
>>>>> -uint32_t
>>>>> -rte_net_crc_calc(const void *data,
>>>>> - uint32_t data_len,
>>>>> - enum rte_net_crc_type type)
>>>>> +uint32_t rte_net_crc(const struct rte_net_crc *ctx,
>>>>> + const void *data, const uint32_t data_len)
>>>>> {
>>>>> - uint32_t ret;
>>>>> - rte_net_crc_handler f_handle;
>>>>> -
>>>>> - f_handle = handlers[type];
>>>>> - ret = f_handle(data, data_len);
>>>>> -
>>>>> - return ret;
>>>>> + return handlers[ctx->alg].f[ctx->type](data, data_len);
>>>>>
>>>>
>>>> 'rte_net_crc()' gets input from user and "struct rte_net_crc" is not
>>>> opaque, so user can provide invalid input, ctx->alg & ctx->type.
>>>> To protect against it input values should be checked before using.
>>>>
>>>> Or I think user not need to know the details of the "struct
>>>> rte_net_crc", so it can be an opaque variable for user.
>>>
>>> I would love it to be opaque, but then I would have to return a pointer, which
>> then would involve some allocations/deallocations and I wanted to keep is as
>> simple as possible.
>>> So probably the checks would be a way to go.
>>>
>>
>> True, +1 for simplicity.
>>
>>>>
>>>> <...>
>>>>
>>>>> -/**
>>>>> - * CRC compute API
>>>>> - *
>>>>> - * @param data
>>>>> - * Pointer to the packet data for CRC computation
>>>>> - * @param data_len
>>>>> - * Data length for CRC computation
>>>>> - * @param type
>>>>> - * CRC type (enum rte_net_crc_type)
>>>>> - *
>>>>> - * @return
>>>>> - * CRC value
>>>>> - */
>>>>> -uint32_t
>>>>> -rte_net_crc_calc(const void *data,
>>>>> - uint32_t data_len,
>>>>> +struct rte_net_crc rte_net_crc_set(enum rte_net_crc_alg alg,
>>>>> enum rte_net_crc_type type);
>>>>>
>>>>> +uint32_t rte_net_crc(const struct rte_net_crc *ctx,
>>>>> + const void *data, const uint32_t data_len);
>>>>> +
>>>>>
>>>>
>>>> As these are APIs, can you please add doxygen comments to them?
>>> +1
>>> I think this change could be deferred to 25.03.
>>> Adding this API without removing the old one should be possible without any
>> unwanted consequences?
>>>
>>
>> This is not a new functionality but replacement of existing one, so it will be
>> confusing to have two set of APIs for same functionality with similar names:
>> rte_net_crc_calc() and rte_net_crc()
>> rte_net_crc_set_alg() and rte_net_crc_set()
>>
>> Also there are some internal functions used by these APIs and supporting both
>> new and old may force to have two version of these internal functions and it will
>> create complexity/noise.
>>
>> As this release is ABI break release, it is easier to update APIs (although
>> deprecation notice is missing for this change).
>>
>>
>> As an alternative option, do you think applying ABI versioning in 25.03 works for
>> these APIs?
>> If so, old version can be cleaned in v25.11.
>
> I would go with versioning. I can preserve old function names as it's basically the same functionality.
>
ack
next prev parent reply other threads:[~2024-10-09 9:11 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-01 18:11 [PATCH v2 0/3] " Arkadiusz Kusztal
2024-10-01 18:11 ` [PATCH v2 1/3] " Arkadiusz Kusztal
2024-10-01 21:44 ` Stephen Hemminger
2024-10-02 8:28 ` Kusztal, ArkadiuszX
2024-10-02 7:42 ` David Marchand
2024-10-02 8:41 ` Kusztal, ArkadiuszX
2024-10-02 9:01 ` David Marchand
2024-10-02 9:16 ` Kusztal, ArkadiuszX
2024-10-08 3:42 ` Ferruh Yigit
2024-10-08 20:51 ` Kusztal, ArkadiuszX
2024-10-09 1:03 ` Ferruh Yigit
2024-10-09 7:48 ` Kusztal, ArkadiuszX
2024-10-09 9:11 ` Ferruh Yigit [this message]
2024-10-01 18:11 ` [PATCH v2 2/3] crypto/qat: use process safe " Arkadiusz Kusztal
2024-10-01 18:11 ` [PATCH v2 3/3] test/crc: replace thread-unsafe api functions Arkadiusz Kusztal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=683c3a3d-0e02-46a7-9490-9efe759fa712@amd.com \
--to=ferruh.yigit@amd.com \
--cc=arkadiuszx.kusztal@intel.com \
--cc=brian.dooley@intel.com \
--cc=david.marchand@redhat.com \
--cc=dev@dpdk.org \
--cc=kai.ji@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).