From: "Kusztal, ArkadiuszX" <arkadiuszx.kusztal@intel.com>
To: Ferruh Yigit <ferruh.yigit@amd.com>,
"Marchand, David" <david.marchand@redhat.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Ji, Kai" <kai.ji@intel.com>,
"Dooley, Brian" <brian.dooley@intel.com>
Subject: RE: [PATCH v2 1/3] net: add thread-safe crc api
Date: Wed, 9 Oct 2024 07:48:20 +0000 [thread overview]
Message-ID: <PH0PR11MB5013F251BE81CE4A56D91C2C9F7F2@PH0PR11MB5013.namprd11.prod.outlook.com> (raw)
In-Reply-To: <b6b839de-1580-48a5-a705-cf06cbae928e@amd.com>
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Wednesday, October 9, 2024 3:03 AM
> To: Kusztal, ArkadiuszX <arkadiuszx.kusztal@intel.com>; Marchand, David
> <david.marchand@redhat.com>
> Cc: dev@dpdk.org; Ji, Kai <kai.ji@intel.com>; Dooley, Brian
> <brian.dooley@intel.com>
> Subject: Re: [PATCH v2 1/3] net: add thread-safe crc api
>
> On 10/8/2024 9:51 PM, Kusztal, ArkadiuszX wrote:
> > Hi Ferruh,
> > Thanks for the review, comments inline,
> >
> >> -----Original Message-----
> >> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >> Sent: Tuesday, October 8, 2024 5:43 AM
> >> To: Kusztal, ArkadiuszX <arkadiuszx.kusztal@intel.com>; Marchand,
> >> David <david.marchand@redhat.com>
> >> Cc: dev@dpdk.org; Ji, Kai <kai.ji@intel.com>; Dooley, Brian
> >> <brian.dooley@intel.com>
> >> Subject: Re: [PATCH v2 1/3] net: add thread-safe crc api
> >>
> >> On 10/1/2024 7:11 PM, Arkadiusz Kusztal wrote:
> >>> The current net CRC API is not thread-safe, this patch solves this
> >>> by adding another, thread-safe API functions.
> >>> This API is also safe to use across multiple processes, yet with
> >>> limitations on max-simd-bitwidth, which will be checked only by the
> >>> process that created the CRC context; all other processes will use
> >>> the same CRC function when used with the same CRC context.
> >>> It is an undefined behavior when process binaries are compiled with
> >>> different SIMD capabilities when the same CRC context is used.
> >>>
> >>> Signed-off-by: Arkadiusz Kusztal <arkadiuszx.kusztal@intel.com>
> >>
> >> <...>
> >>
> >>> +static struct
> >>> +{
> >>> + uint32_t (*f[RTE_NET_CRC_REQS])
> >>> + (const uint8_t *data, uint32_t data_len);
> >>>
> >>
> >> It increases readability to typedef function pointers.
> >
> > Agree, though this typedef would be used here only, that’s why I left it out.
> > But I can add it then.
> >
> >>
> >>
> >>> +} handlers[RTE_NET_CRC_AVX512 + 1];
> >>>
> >>> -/**
> >>> - * Reflect the bits about the middle
> >>> - *
> >>> - * @param val
> >>> - * value to be reflected
> >>> - *
> >>> - * @return
> >>> - * reflected value
> >>> - */
> >>> -static uint32_t
> >>> +static inline uint32_t
> >>>
> >>
> >> Does changing to 'inline' required, as function is static compiler
> >> can do the same.
> >
> > True, though it may be more readable sometimes.
> > Of course there is no way that in O3 these functions would not be inlined by
> the compiler, regardless if the inline hint is present or not.
> >
> >>
> >>> reflect_32bits(uint32_t val)
> >>> {
> >>> uint32_t i, res = 0;
> >>> @@ -99,26 +43,7 @@ reflect_32bits(uint32_t val)
> >>> return res;
> >>> }
> >>>
> >>> -static void
> >>> -crc32_eth_init_lut(uint32_t poly,
> >>> - uint32_t *lut)
> >>> -{
> >>> - uint32_t i, j;
> >>> -
> >>> - for (i = 0; i < CRC_LUT_SIZE; i++) {
> >>> - uint32_t crc = reflect_32bits(i);
> >>> -
> >>> - for (j = 0; j < 8; j++) {
> >>> - if (crc & 0x80000000L)
> >>> - crc = (crc << 1) ^ poly;
> >>> - else
> >>> - crc <<= 1;
> >>> - }
> >>> - lut[i] = reflect_32bits(crc);
> >>> - }
> >>> -}
> >>> -
> >>> -static __rte_always_inline uint32_t
> >>> +static inline uint32_t
> >>>
> >>
> >> Why not forcing inline anymore?
> >> Are these inline changes related to the thread-safety?
> >
> > O3 will inline it anyway, and with always_inline it will be inline even in debug
> mode. I just see no reason forcing it upon the compiler.
> >
> >>
> >>> crc32_eth_calc_lut(const uint8_t *data,
> >>> uint32_t data_len,
> >>> uint32_t crc,
> >>> @@ -130,20 +55,9 @@ crc32_eth_calc_lut(const uint8_t *data,
> >>> return crc;
> >>> }
> >>>
> >>> -static void
> >>> -rte_net_crc_scalar_init(void)
> >>> -{
> >>> - /* 32-bit crc init */
> >>> - crc32_eth_init_lut(CRC32_ETH_POLYNOMIAL, crc32_eth_lut);
> >>> -
> >>> - /* 16-bit CRC init */
> >>> - crc32_eth_init_lut(CRC16_CCITT_POLYNOMIAL << 16, crc16_ccitt_lut);
> >>> -}
> >>> -
> >>> static inline uint32_t
> >>> -rte_crc16_ccitt_handler(const uint8_t *data, uint32_t data_len)
> >>> +crc16_ccitt(const uint8_t *data, uint32_t data_len)
> >>> {
> >>> - /* return 16-bit CRC value */
> >>>
> >>
> >> Why not keep comments? Are they wrong?
> >
> > Functions names are very self-explanatory, that’s why I dropped comments. I
> can add comments if needed.
> >
>
> I am for restricting changes to the target of the patch which is making CRC
> calculation thread safe, unless code changes invalidates the comments, lets
> keep them. Same goes with inline related modifications.
>
> >>
> >> <...>
> >>
> >>> +static void
> >>> +crc_scalar_init(void)
> >>> +{
> >>> + crc32_eth_init_lut(CRC32_ETH_POLYNOMIAL, crc32_eth_lut);
> >>> + crc32_eth_init_lut(CRC16_CCITT_POLYNOMIAL << 16, crc16_ccitt_lut);
> >>> +
> >>> + handlers[RTE_NET_CRC_SCALAR].f[RTE_NET_CRC16_CCITT] =
> >> crc16_ccitt;
> >>> + handlers[RTE_NET_CRC_SCALAR].f[RTE_NET_CRC32_ETH] = crc32_eth;
> >>>
> >>
> >> +1 to remove global handlers pointer and add context,
> >>
> >> But current handlers array content is static, it can be set when
> >> defined, instead of functions.
> >
> > Can do it for scalars, but for SIMD there is this runtime check like this:
> > if (AVX512_VPCLMULQDQ_CPU_SUPPORTED) { So compiled binary on
> AVX512
> > machine could filter it out on the machine which does not support it.
> > There is no NULL check in crc function so it would not change much when
> called -> Invalid address vs Invalid instruction.
> > There are not many checks there, as this is CRC after all, it should be a small as
> possible, yet probably NULL check could be advisable in crc function then.
> >
>
> There is already AVX512_VPCLMULQDQ_CPU_SUPPORTED etc checks in
> 'rte_net_crc_set()'.
> So does it work to update 'handlers' statically, without condition, but have
> conditions when use them.
>
> >
> >>
> >> <...>
> >>
> >>> -static uint32_t
> >>> -rte_crc32_eth_default_handler(const uint8_t *data, uint32_t
> >>> data_len)
> >>> +struct rte_net_crc rte_net_crc_set(enum rte_net_crc_alg alg,
> >>> + enum rte_net_crc_type type)
> >>> {
> >>> - handlers = NULL;
> >>> - if (max_simd_bitwidth == 0)
> >>> - max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
> >>> -
> >>> - handlers = avx512_vpclmulqdq_get_handlers();
> >>> - if (handlers != NULL)
> >>> - return handlers[RTE_NET_CRC32_ETH](data, data_len);
> >>> - handlers = sse42_pclmulqdq_get_handlers();
> >>> - if (handlers != NULL)
> >>> - return handlers[RTE_NET_CRC32_ETH](data, data_len);
> >>> - handlers = neon_pmull_get_handlers();
> >>> - if (handlers != NULL)
> >>> - return handlers[RTE_NET_CRC32_ETH](data, data_len);
> >>> - handlers = handlers_scalar;
> >>> - return handlers[RTE_NET_CRC32_ETH](data, data_len);
> >>> -}
> >>> + uint16_t max_simd_bitwidth;
> >>>
> >>> -/* Public API */
> >>> -
> >>> -void
> >>> -rte_net_crc_set_alg(enum rte_net_crc_alg alg) -{
> >>> - handlers = NULL;
> >>> - if (max_simd_bitwidth == 0)
> >>> - max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
> >>> + max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
> >>>
> >>> switch (alg) {
> >>> case RTE_NET_CRC_AVX512:
> >>> - handlers = avx512_vpclmulqdq_get_handlers();
> >>> - if (handlers != NULL)
> >>> - break;
> >>> +#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
> >>> + if (AVX512_VPCLMULQDQ_CPU_SUPPORTED &&
> >>> + max_simd_bitwidth >= RTE_VECT_SIMD_512) {
> >>> + return (struct rte_net_crc){ RTE_NET_CRC_AVX512,
> >> type };
> >>> + }
> >>> +#endif
> >>> /* fall-through */
> >>> case RTE_NET_CRC_SSE42:
> >>> - handlers = sse42_pclmulqdq_get_handlers();
> >>> - break; /* for x86, always break here */
> >>> +#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
> >>> + if (SSE42_PCLMULQDQ_CPU_SUPPORTED &&
> >>> + max_simd_bitwidth >= RTE_VECT_SIMD_128) {
> >>> + return (struct rte_net_crc){ RTE_NET_CRC_SSE42, type
> >> };
> >>> + }
> >>> +#endif
> >>> + break;
> >>> case RTE_NET_CRC_NEON:
> >>> - handlers = neon_pmull_get_handlers();
> >>> - /* fall-through */
> >>> - case RTE_NET_CRC_SCALAR:
> >>> - /* fall-through */
> >>> +#ifdef CC_ARM64_NEON_PMULL_SUPPORT
> >>> + if (NEON_PMULL_CPU_SUPPORTED &&
> >>> + max_simd_bitwidth >= RTE_VECT_SIMD_128) {
> >>> + return (struct rte_net_crc){ RTE_NET_CRC_NEON, type
> >> };
> >>> + }
> >>> +#endif
> >>>
> >>
> >> Is it more readable as following, up to you:
> >
> > Agree, I will change it.
> >
> >>
> >> ```
> >> rte_net_crc_set(alg, type) {
> >> enum rte_net_crc_alg new_alg = RTE_NET_CRC_SCALAR;
> >> switch (alg) {
> >> case AVX512:
> >> new_alg = ..
> >> case NEON:
> >> new_alg = ..
> >> }
> >> return struct rte_net_crc){ new_alg, type }; ```
> >>
> >>
> >>
> >>
> >>> + break;
> >>> default:
> >>> break;
> >>> }
> >>> -
> >>> - if (handlers == NULL)
> >>> - handlers = handlers_scalar;
> >>> + return (struct rte_net_crc){ RTE_NET_CRC_SCALAR, type };
> >>> }
> >>>
> >>> -uint32_t
> >>> -rte_net_crc_calc(const void *data,
> >>> - uint32_t data_len,
> >>> - enum rte_net_crc_type type)
> >>> +uint32_t rte_net_crc(const struct rte_net_crc *ctx,
> >>> + const void *data, const uint32_t data_len)
> >>> {
> >>> - uint32_t ret;
> >>> - rte_net_crc_handler f_handle;
> >>> -
> >>> - f_handle = handlers[type];
> >>> - ret = f_handle(data, data_len);
> >>> -
> >>> - return ret;
> >>> + return handlers[ctx->alg].f[ctx->type](data, data_len);
> >>>
> >>
> >> 'rte_net_crc()' gets input from user and "struct rte_net_crc" is not
> >> opaque, so user can provide invalid input, ctx->alg & ctx->type.
> >> To protect against it input values should be checked before using.
> >>
> >> Or I think user not need to know the details of the "struct
> >> rte_net_crc", so it can be an opaque variable for user.
> >
> > I would love it to be opaque, but then I would have to return a pointer, which
> then would involve some allocations/deallocations and I wanted to keep is as
> simple as possible.
> > So probably the checks would be a way to go.
> >
>
> True, +1 for simplicity.
>
> >>
> >> <...>
> >>
> >>> -/**
> >>> - * CRC compute API
> >>> - *
> >>> - * @param data
> >>> - * Pointer to the packet data for CRC computation
> >>> - * @param data_len
> >>> - * Data length for CRC computation
> >>> - * @param type
> >>> - * CRC type (enum rte_net_crc_type)
> >>> - *
> >>> - * @return
> >>> - * CRC value
> >>> - */
> >>> -uint32_t
> >>> -rte_net_crc_calc(const void *data,
> >>> - uint32_t data_len,
> >>> +struct rte_net_crc rte_net_crc_set(enum rte_net_crc_alg alg,
> >>> enum rte_net_crc_type type);
> >>>
> >>> +uint32_t rte_net_crc(const struct rte_net_crc *ctx,
> >>> + const void *data, const uint32_t data_len);
> >>> +
> >>>
> >>
> >> As these are APIs, can you please add doxygen comments to them?
> > +1
> > I think this change could be deferred to 25.03.
> > Adding this API without removing the old one should be possible without any
> unwanted consequences?
> >
>
> This is not a new functionality but replacement of existing one, so it will be
> confusing to have two set of APIs for same functionality with similar names:
> rte_net_crc_calc() and rte_net_crc()
> rte_net_crc_set_alg() and rte_net_crc_set()
>
> Also there are some internal functions used by these APIs and supporting both
> new and old may force to have two version of these internal functions and it will
> create complexity/noise.
>
> As this release is ABI break release, it is easier to update APIs (although
> deprecation notice is missing for this change).
>
>
> As an alternative option, do you think applying ABI versioning in 25.03 works for
> these APIs?
> If so, old version can be cleaned in v25.11.
I would go with versioning. I can preserve old function names as it's basically the same functionality.
>
>
> > I still have some second thoughts about this max-simd-width. DPDK does
> > not impose any restrictions on this parameter in the multi-process usage, there
> may be some room to alter some things there.
> >
> >>
> >>> #ifdef __cplusplus
> >>> }
> >>> #endif
> >>> diff --git a/lib/net/version.map b/lib/net/version.map index
> >>> bec4ce23ea..47daf1464a 100644
> >>> --- a/lib/net/version.map
> >>> +++ b/lib/net/version.map
> >>> @@ -4,11 +4,25 @@ DPDK_25 {
> >>> rte_eth_random_addr;
> >>> rte_ether_format_addr;
> >>> rte_ether_unformat_addr;
> >>> - rte_net_crc_calc;
> >>> - rte_net_crc_set_alg;
> >>> rte_net_get_ptype;
> >>> rte_net_make_rarp_packet;
> >>> rte_net_skip_ip6_ext;
> >>> + rte_net_crc;
> >>> + rte_net_crc_set;
> >>>
> >>> local: *;
> >>> };
> >>> +
> >>> +INTERNAL {
> >>> + global:
> >>> +
> >>> + rte_net_crc_sse42_init;
> >>> + rte_crc16_ccitt_sse42_handler;
> >>> + rte_crc32_eth_sse42_handler;
> >>> + rte_net_crc_avx512_init;
> >>> + rte_crc16_ccitt_avx512_handler;
> >>> + rte_crc32_eth_avx512_handler;
> >>> + rte_net_crc_neon_init;
> >>> + rte_crc16_ccitt_neon_handler;
> >>> + rte_crc32_eth_neon_handler;
> >>> +};
> >>>
> >>
> >> +1 to David's comment, these are used only within component, no need
> >> +to
> >> export.
> >>
> >
next prev parent reply other threads:[~2024-10-09 7:48 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-01 18:11 [PATCH v2 0/3] " Arkadiusz Kusztal
2024-10-01 18:11 ` [PATCH v2 1/3] " Arkadiusz Kusztal
2024-10-01 21:44 ` Stephen Hemminger
2024-10-02 8:28 ` Kusztal, ArkadiuszX
2024-10-02 7:42 ` David Marchand
2024-10-02 8:41 ` Kusztal, ArkadiuszX
2024-10-02 9:01 ` David Marchand
2024-10-02 9:16 ` Kusztal, ArkadiuszX
2024-10-08 3:42 ` Ferruh Yigit
2024-10-08 20:51 ` Kusztal, ArkadiuszX
2024-10-09 1:03 ` Ferruh Yigit
2024-10-09 7:48 ` Kusztal, ArkadiuszX [this message]
2024-10-09 9:11 ` Ferruh Yigit
2024-10-01 18:11 ` [PATCH v2 2/3] crypto/qat: use process safe " Arkadiusz Kusztal
2024-10-01 18:11 ` [PATCH v2 3/3] test/crc: replace thread-unsafe api functions Arkadiusz Kusztal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=PH0PR11MB5013F251BE81CE4A56D91C2C9F7F2@PH0PR11MB5013.namprd11.prod.outlook.com \
--to=arkadiuszx.kusztal@intel.com \
--cc=brian.dooley@intel.com \
--cc=david.marchand@redhat.com \
--cc=dev@dpdk.org \
--cc=ferruh.yigit@amd.com \
--cc=kai.ji@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).