DPDK patches and discussions
* [PATCH v12 0/3] net: optimize raw checksum computation
From: scott.k.mitch1 @ 2026-01-10  1:56 UTC
  To: dev; +Cc: mb, stephen, Scott Mitchell

From: Scott Mitchell <scott.k.mitch1@gmail.com>

This series optimizes __rte_raw_cksum() by replacing memcpy-based access
with unaligned_uint16_t pointer access, enabling vectorization in both
GCC and Clang. The series is split into three patches to clearly separate
the core optimization from compiler-specific workarounds.
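
For readers unfamiliar with the two access styles, the following is a
minimal sketch and not the code from the patches. The names
sketch_unaligned_u16, sketch_sum_memcpy, sketch_sum_ptr and sketch_reduce
are hypothetical, and the typedef is only a GCC/Clang stand-in for DPDK's
unaligned_uint16_t. It contrasts a memcpy-per-word loop with a loop that
reads through a pointer to an alignment-1 type, which is the kind of loop
shape the series relies on the compiler to vectorize.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for DPDK's unaligned_uint16_t (GCC/Clang only):
 * a 16-bit type with alignment 1 that may alias other types. */
typedef uint16_t sketch_unaligned_u16 __attribute__((aligned(1), may_alias));

/* memcpy-based variant: every 16-bit word is copied into a local first. */
static uint32_t
sketch_sum_memcpy(const void *buf, size_t len, uint32_t sum)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		uint16_t v;

		memcpy(&v, p + i, sizeof(v));
		sum += v;
	}
	if (len % 2) {
		/* Trailing odd byte, padded with a zero byte. */
		uint16_t v = 0;

		memcpy(&v, p + len - 1, 1);
		sum += v;
	}
	return sum;
}

/* Pointer-based variant: a plain indexed loop over unaligned words. */
static uint32_t
sketch_sum_ptr(const void *buf, size_t len, uint32_t sum)
{
	const sketch_unaligned_u16 *w = buf;
	size_t n = len / 2;
	size_t i;

	for (i = 0; i < n; i++)
		sum += w[i];
	if (len % 2) {
		uint16_t v = 0;

		memcpy(&v, (const uint8_t *)buf + len - 1, 1);
		sum += v;
	}
	return sum;
}

/* Fold the 32-bit accumulator into 16 bits (one's-complement sum). */
static uint16_t
sketch_reduce(uint32_t sum)
{
	sum = (sum >> 16) + (sum & 0xffff);
	sum = (sum >> 16) + (sum & 0xffff);
	return (uint16_t)sum;
}
```

Both variants return the same accumulator for the same input. The 32-bit
accumulator in this sketch can wrap on very large buffers, so it is only
meant to illustrate the access pattern.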

Performance improvement from cksum_perf_autotest on Intel Xeon
(Cascade Lake, AVX-512) with Clang 18.1 (TSC cycles/byte):

  Block size    Before    After    Improvement
         100      0.40     0.24        ~40%
        1500      0.50     0.06        ~8x
        9000      0.49     0.06        ~8x

Changes in v12:
- Split into 3-patch series per reviewer feedback
- Patch 1/3: Core optimization and test additions
- Patch 2/3: UBSAN alignment workaround (separate from GCC bug)
- Patch 3/3: GCC optimization bug workaround
- Reverted len & 1 to len % 2 and restored unlikely() per feedback
- Renamed RTE_SUPPRESS_UNINITIALIZED_WARNING to RTE_FORCE_INIT_BARRIER
- Applied minimal changes (no refactoring) to existing code
- Deferred hinic driver refactoring to future series

Note: Patch 1/3 will trigger compiler warnings/failures on GCC versions
with the optimization bug (GCC 11.5.0 and others seen on DPDK CI). These
are resolved by patches 2/3 and 3/3.

Scott Mitchell (3):
  net: optimize __rte_raw_cksum and add tests
  eal: add workaround for UBSAN alignment false positive
  eal/net: add workaround for GCC optimization bug

 app/test/meson.build             |   1 +
 app/test/test_cksum_fuzz.c       | 240 +++++++++++++++++++++++++++++++
 app/test/test_cksum_perf.c       |   2 +-
 drivers/net/hinic/hinic_pmd_tx.c |   2 +
 drivers/net/mlx5/mlx5_flow_dv.c  |   2 +
 lib/eal/include/rte_common.h     |  23 +++
 lib/net/rte_cksum.h              |  15 +-
 lib/net/rte_ip4.h                |   1 +
 lib/net/rte_ip6.h                |   1 +
 9 files changed, 277 insertions(+), 10 deletions(-)
 create mode 100644 app/test/test_cksum_fuzz.c

--
2.39.5 (Apple Git-154)


Thread overview (10+ messages):
2026-01-10  1:56 [PATCH v12 0/3] net: optimize raw checksum computation scott.k.mitch1
2026-01-10  1:56 ` [PATCH v12 1/3] net: optimize __rte_raw_cksum and add tests scott.k.mitch1
2026-01-10  2:28   ` Scott Mitchell
2026-01-10 14:47   ` Morten Brørup
2026-01-10  1:56 ` [PATCH v12 2/3] eal: add workaround for UBSAN alignment false positive scott.k.mitch1
2026-01-10 15:02   ` Morten Brørup
2026-01-10  1:56 ` [PATCH v12 3/3] eal/net: add workaround for GCC optimization bug scott.k.mitch1
2026-01-10 15:29   ` Morten Brørup
2026-01-11  6:21     ` Scott Mitchell
2026-01-10 16:59 ` [PATCH v12 0/3] net: optimize raw checksum computation Stephen Hemminger
