DPDK patches and discussions
 help / color / mirror / Atom feed
From: Ferruh Yigit <ferruh.yigit@amd.com>
To: "Kusztal, ArkadiuszX" <arkadiuszx.kusztal@intel.com>,
	"Marchand, David" <david.marchand@redhat.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Ji, Kai" <kai.ji@intel.com>,
	"Dooley, Brian" <brian.dooley@intel.com>
Subject: Re: [PATCH v2 1/3] net: add thread-safe crc api
Date: Wed, 9 Oct 2024 10:11:32 +0100	[thread overview]
Message-ID: <683c3a3d-0e02-46a7-9490-9efe759fa712@amd.com> (raw)
In-Reply-To: <PH0PR11MB5013F251BE81CE4A56D91C2C9F7F2@PH0PR11MB5013.namprd11.prod.outlook.com>

On 10/9/2024 8:48 AM, Kusztal, ArkadiuszX wrote:
> 
> 
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Wednesday, October 9, 2024 3:03 AM
>> To: Kusztal, ArkadiuszX <arkadiuszx.kusztal@intel.com>; Marchand, David
>> <david.marchand@redhat.com>
>> Cc: dev@dpdk.org; Ji, Kai <kai.ji@intel.com>; Dooley, Brian
>> <brian.dooley@intel.com>
>> Subject: Re: [PATCH v2 1/3] net: add thread-safe crc api
>>
>> On 10/8/2024 9:51 PM, Kusztal, ArkadiuszX wrote:
>>> Hi Ferruh,
>>> Thanks for the review, comments inline,
>>>
>>>> -----Original Message-----
>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>>>> Sent: Tuesday, October 8, 2024 5:43 AM
>>>> To: Kusztal, ArkadiuszX <arkadiuszx.kusztal@intel.com>; Marchand,
>>>> David <david.marchand@redhat.com>
>>>> Cc: dev@dpdk.org; Ji, Kai <kai.ji@intel.com>; Dooley, Brian
>>>> <brian.dooley@intel.com>
>>>> Subject: Re: [PATCH v2 1/3] net: add thread-safe crc api
>>>>
>>>> On 10/1/2024 7:11 PM, Arkadiusz Kusztal wrote:
>>>>> The current net CRC API is not thread-safe, this patch solves this
>>>>> by adding another, thread-safe API functions.
>>>>> This API is also safe to use across multiple processes, yet with
>>>>> limitations on max-simd-bitwidth, which will be checked only by the
>>>>> process that created the CRC context; all other processes will use
>>>>> the same CRC function when used with the same CRC context.
>>>>> It is an undefined behavior when process binaries are compiled with
>>>>> different SIMD capabilities when the same CRC context is used.
>>>>>
>>>>> Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
>>>>
>>>> <...>
>>>>
>>>>> +static struct
>>>>> +{
>>>>> +	uint32_t (*f[RTE_NET_CRC_REQS])
>>>>> +		(const uint8_t *data, uint32_t data_len);
>>>>>
>>>>
>>>> It increases readability to typedef function pointers.
>>>
>>> Agree, though this typedef would be used here only, that’s why I left it out.
>>> But I can add it then.
>>>
>>>>
>>>>
>>>>> +} handlers[RTE_NET_CRC_AVX512 + 1];
>>>>>
>>>>> -/**
>>>>> - * Reflect the bits about the middle
>>>>> - *
>>>>> - * @param val
>>>>> - *   value to be reflected
>>>>> - *
>>>>> - * @return
>>>>> - *   reflected value
>>>>> - */
>>>>> -static uint32_t
>>>>> +static inline uint32_t
>>>>>
>>>>
>>>> Does changing to 'inline' required, as function is static compiler
>>>> can do the same.
>>>
>>> True, though it may be more readable sometimes.
>>> Of course there is no way that in O3 these functions would not be inlined by
>> the compiler, regardless if the inline hint is present or not.
>>>
>>>>
>>>>>  reflect_32bits(uint32_t val)
>>>>>  {
>>>>>  	uint32_t i, res = 0;
>>>>> @@ -99,26 +43,7 @@ reflect_32bits(uint32_t val)
>>>>>  	return res;
>>>>>  }
>>>>>
>>>>> -static void
>>>>> -crc32_eth_init_lut(uint32_t poly,
>>>>> -	uint32_t *lut)
>>>>> -{
>>>>> -	uint32_t i, j;
>>>>> -
>>>>> -	for (i = 0; i < CRC_LUT_SIZE; i++) {
>>>>> -		uint32_t crc = reflect_32bits(i);
>>>>> -
>>>>> -		for (j = 0; j < 8; j++) {
>>>>> -			if (crc & 0x80000000L)
>>>>> -				crc = (crc << 1) ^ poly;
>>>>> -			else
>>>>> -				crc <<= 1;
>>>>> -		}
>>>>> -		lut[i] = reflect_32bits(crc);
>>>>> -	}
>>>>> -}
>>>>> -
>>>>> -static __rte_always_inline uint32_t
>>>>> +static inline uint32_t
>>>>>
>>>>
>>>> Why not forcing inline anymore?
>>>> Are these inline changes related to the thread-safety?
>>>
>>> O3 will inline it anyway, and with always_inline it will be inline even in debug
>> mode. I just see no reason forcing it upon the compiler.
>>>
>>>>
>>>>>  crc32_eth_calc_lut(const uint8_t *data,
>>>>>  	uint32_t data_len,
>>>>>  	uint32_t crc,
>>>>> @@ -130,20 +55,9 @@ crc32_eth_calc_lut(const uint8_t *data,
>>>>>  	return crc;
>>>>>  }
>>>>>
>>>>> -static void
>>>>> -rte_net_crc_scalar_init(void)
>>>>> -{
>>>>> -	/* 32-bit crc init */
>>>>> -	crc32_eth_init_lut(CRC32_ETH_POLYNOMIAL, crc32_eth_lut);
>>>>> -
>>>>> -	/* 16-bit CRC init */
>>>>> -	crc32_eth_init_lut(CRC16_CCITT_POLYNOMIAL << 16, crc16_ccitt_lut);
>>>>> -}
>>>>> -
>>>>>  static inline uint32_t
>>>>> -rte_crc16_ccitt_handler(const uint8_t *data, uint32_t data_len)
>>>>> +crc16_ccitt(const uint8_t *data, uint32_t data_len)
>>>>>  {
>>>>> -	/* return 16-bit CRC value */
>>>>>
>>>>
>>>> Why not keep comments? Are they wrong?
>>>
>>> Functions names are very self-explanatory, that’s why I dropped comments. I
>> can add comments if needed.
>>>
>>
>> I am for restricting changes to the target of the patch which is making CRC
>> calculation thread safe, unless code changes invalidates the comments, lets
>> keep them. Same goes with inline related modifications.
>>
>>>>
>>>> <...>
>>>>
>>>>> +static void
>>>>> +crc_scalar_init(void)
>>>>> +{
>>>>> +	crc32_eth_init_lut(CRC32_ETH_POLYNOMIAL, crc32_eth_lut);
>>>>> +	crc32_eth_init_lut(CRC16_CCITT_POLYNOMIAL << 16, crc16_ccitt_lut);
>>>>> +
>>>>> +	handlers[RTE_NET_CRC_SCALAR].f[RTE_NET_CRC16_CCITT] =
>>>> crc16_ccitt;
>>>>> +	handlers[RTE_NET_CRC_SCALAR].f[RTE_NET_CRC32_ETH] = crc32_eth;
>>>>>
>>>>
>>>> +1 to remove global handlers pointer and add context,
>>>>
>>>> But current handlers array content is static, it can be set when
>>>> defined, instead of functions.
>>>
>>> Can do it for scalars, but for SIMD there is this runtime check like this:
>>> 	if (AVX512_VPCLMULQDQ_CPU_SUPPORTED) { So compiled binary on
>> AVX512
>>> machine could filter it out on the machine which does not support it.
>>> There is no NULL check in crc function so it would not change much when
>> called -> Invalid address vs Invalid instruction.
>>> There are not many checks there, as this is CRC after all, it should be a small as
>> possible, yet probably NULL check could be advisable in crc function then.
>>>
>>
>> There is already AVX512_VPCLMULQDQ_CPU_SUPPORTED etc checks in
>> 'rte_net_crc_set()'.
>> So does it work to update 'handlers' statically, without condition, but have
>> conditions when use them.
>>
>>>
>>>>
>>>> <...>
>>>>
>>>>> -static uint32_t
>>>>> -rte_crc32_eth_default_handler(const uint8_t *data, uint32_t
>>>>> data_len)
>>>>> +struct rte_net_crc rte_net_crc_set(enum rte_net_crc_alg alg,
>>>>> +	enum rte_net_crc_type type)
>>>>>  {
>>>>> -	handlers = NULL;
>>>>> -	if (max_simd_bitwidth == 0)
>>>>> -		max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
>>>>> -
>>>>> -	handlers = avx512_vpclmulqdq_get_handlers();
>>>>> -	if (handlers != NULL)
>>>>> -		return handlers[RTE_NET_CRC32_ETH](data, data_len);
>>>>> -	handlers = sse42_pclmulqdq_get_handlers();
>>>>> -	if (handlers != NULL)
>>>>> -		return handlers[RTE_NET_CRC32_ETH](data, data_len);
>>>>> -	handlers = neon_pmull_get_handlers();
>>>>> -	if (handlers != NULL)
>>>>> -		return handlers[RTE_NET_CRC32_ETH](data, data_len);
>>>>> -	handlers = handlers_scalar;
>>>>> -	return handlers[RTE_NET_CRC32_ETH](data, data_len);
>>>>> -}
>>>>> +	uint16_t max_simd_bitwidth;
>>>>>
>>>>> -/* Public API */
>>>>> -
>>>>> -void
>>>>> -rte_net_crc_set_alg(enum rte_net_crc_alg alg) -{
>>>>> -	handlers = NULL;
>>>>> -	if (max_simd_bitwidth == 0)
>>>>> -		max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
>>>>> +	max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
>>>>>
>>>>>  	switch (alg) {
>>>>>  	case RTE_NET_CRC_AVX512:
>>>>> -		handlers = avx512_vpclmulqdq_get_handlers();
>>>>> -		if (handlers != NULL)
>>>>> -			break;
>>>>> +#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
>>>>> +		if (AVX512_VPCLMULQDQ_CPU_SUPPORTED &&
>>>>> +				max_simd_bitwidth >= RTE_VECT_SIMD_512) {
>>>>> +			return (struct rte_net_crc){ RTE_NET_CRC_AVX512,
>>>> type };
>>>>> +		}
>>>>> +#endif
>>>>>  		/* fall-through */
>>>>>  	case RTE_NET_CRC_SSE42:
>>>>> -		handlers = sse42_pclmulqdq_get_handlers();
>>>>> -		break; /* for x86, always break here */
>>>>> +#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
>>>>> +		if (SSE42_PCLMULQDQ_CPU_SUPPORTED &&
>>>>> +				max_simd_bitwidth >= RTE_VECT_SIMD_128) {
>>>>> +			return (struct rte_net_crc){ RTE_NET_CRC_SSE42, type
>>>> };
>>>>> +		}
>>>>> +#endif
>>>>> +		break;
>>>>>  	case RTE_NET_CRC_NEON:
>>>>> -		handlers = neon_pmull_get_handlers();
>>>>> -		/* fall-through */
>>>>> -	case RTE_NET_CRC_SCALAR:
>>>>> -		/* fall-through */
>>>>> +#ifdef CC_ARM64_NEON_PMULL_SUPPORT
>>>>> +		if (NEON_PMULL_CPU_SUPPORTED &&
>>>>> +				max_simd_bitwidth >= RTE_VECT_SIMD_128) {
>>>>> +			return (struct rte_net_crc){ RTE_NET_CRC_NEON, type
>>>> };
>>>>> +		}
>>>>> +#endif
>>>>>
>>>>
>>>> Is it more readable as following, up to you:
>>>
>>> Agree, I will change it.
>>>
>>>>
>>>> ```
>>>> rte_net_crc_set(alg, type) {
>>>>   enum rte_net_crc_alg new_alg = RTE_NET_CRC_SCALAR;
>>>>   switch (alg) {
>>>>   case AVX512:
>>>>     new_alg = ..
>>>>   case NEON:
>>>>     new_alg = ..
>>>>   }
>>>>   return struct rte_net_crc){ new_alg, type }; ```
>>>>
>>>>
>>>>
>>>>
>>>>> +		break;
>>>>>  	default:
>>>>>  		break;
>>>>>  	}
>>>>> -
>>>>> -	if (handlers == NULL)
>>>>> -		handlers = handlers_scalar;
>>>>> +	return (struct rte_net_crc){ RTE_NET_CRC_SCALAR, type };
>>>>>  }
>>>>>
>>>>> -uint32_t
>>>>> -rte_net_crc_calc(const void *data,
>>>>> -	uint32_t data_len,
>>>>> -	enum rte_net_crc_type type)
>>>>> +uint32_t rte_net_crc(const struct rte_net_crc *ctx,
>>>>> +	const void *data, const uint32_t data_len)
>>>>>  {
>>>>> -	uint32_t ret;
>>>>> -	rte_net_crc_handler f_handle;
>>>>> -
>>>>> -	f_handle = handlers[type];
>>>>> -	ret = f_handle(data, data_len);
>>>>> -
>>>>> -	return ret;
>>>>> +	return handlers[ctx->alg].f[ctx->type](data, data_len);
>>>>>
>>>>
>>>> 'rte_net_crc()' gets input from user and "struct rte_net_crc" is not
>>>> opaque, so user can provide invalid input, ctx->alg & ctx->type.
>>>> To protect against it input values should be checked before using.
>>>>
>>>> Or I think user not need to know the details of the "struct
>>>> rte_net_crc", so it can be an opaque variable for user.
>>>
>>> I would love it to be opaque, but then I would have to return a pointer, which
>> then would involve some allocations/deallocations and I wanted to keep is as
>> simple as possible.
>>> So probably the checks would be a way to go.
>>>
>>
>> True, +1 for simplicity.
>>
>>>>
>>>> <...>
>>>>
>>>>> -/**
>>>>> - * CRC compute API
>>>>> - *
>>>>> - * @param data
>>>>> - *   Pointer to the packet data for CRC computation
>>>>> - * @param data_len
>>>>> - *   Data length for CRC computation
>>>>> - * @param type
>>>>> - *   CRC type (enum rte_net_crc_type)
>>>>> - *
>>>>> - * @return
>>>>> - *   CRC value
>>>>> - */
>>>>> -uint32_t
>>>>> -rte_net_crc_calc(const void *data,
>>>>> -	uint32_t data_len,
>>>>> +struct rte_net_crc rte_net_crc_set(enum rte_net_crc_alg alg,
>>>>>  	enum rte_net_crc_type type);
>>>>>
>>>>> +uint32_t rte_net_crc(const struct rte_net_crc *ctx,
>>>>> +	const void *data, const uint32_t data_len);
>>>>> +
>>>>>
>>>>
>>>> As these are APIs, can you please add doxygen comments to them?
>>> +1
>>> I think this change could be deferred to 25.03.
>>> Adding this API without removing the old one should be possible without any
>> unwanted consequences?
>>>
>>
>> This is not a new functionality but replacement of existing one, so it will be
>> confusing to have two set of APIs for same functionality with similar names:
>> rte_net_crc_calc()    and   rte_net_crc()
>> rte_net_crc_set_alg() and   rte_net_crc_set()
>>
>> Also there are some internal functions used by these APIs and supporting both
>> new and old may force to have two version of these internal functions and it will
>> create complexity/noise.
>>
>> As this release is ABI break release, it is easier to update APIs (although
>> deprecation notice is missing for this change).
>>
>>
>> As an alternative option, do you think applying ABI versioning in 25.03 works for
>> these APIs?
>> If so, old version can be cleaned in v25.11.
> 
> I would go with versioning. I can preserve old function names as it's basically the same functionality. 
> 

ack

  reply	other threads:[~2024-10-09  9:11 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-01 18:11 [PATCH v2 0/3] " Arkadiusz Kusztal
2024-10-01 18:11 ` [PATCH v2 1/3] " Arkadiusz Kusztal
2024-10-01 21:44   ` Stephen Hemminger
2024-10-02  8:28     ` Kusztal, ArkadiuszX
2024-10-02  7:42   ` David Marchand
2024-10-02  8:41     ` Kusztal, ArkadiuszX
2024-10-02  9:01       ` David Marchand
2024-10-02  9:16         ` Kusztal, ArkadiuszX
2024-10-08  3:42   ` Ferruh Yigit
2024-10-08 20:51     ` Kusztal, ArkadiuszX
2024-10-09  1:03       ` Ferruh Yigit
2024-10-09  7:48         ` Kusztal, ArkadiuszX
2024-10-09  9:11           ` Ferruh Yigit [this message]
2024-10-01 18:11 ` [PATCH v2 2/3] crypto/qat: use process safe " Arkadiusz Kusztal
2024-10-01 18:11 ` [PATCH v2 3/3] test/crc: replace thread-unsafe api functions Arkadiusz Kusztal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=683c3a3d-0e02-46a7-9490-9efe759fa712@amd.com \
    --to=ferruh.yigit@amd.com \
    --cc=arkadiuszx.kusztal@intel.com \
    --cc=brian.dooley@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=kai.ji@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).