* [PATCH v4 0/2] eal: RTE_PTR_ADD/SUB char* for compiler optimizations
@ 2026-01-11 16:06 scott.k.mitch1
2026-01-11 16:06 ` [PATCH v4 1/2] " scott.k.mitch1
2026-01-11 16:06 ` [PATCH v4 2/2] mailmap: add Scott Mitchell scott.k.mitch1
0 siblings, 2 replies; 3+ messages in thread
From: scott.k.mitch1 @ 2026-01-11 16:06 UTC (permalink / raw)
To: dev; +Cc: mb, stephen, Scott Mitchell
From: Scott Mitchell <scott.k.mitch1@gmail.com>
This series optimizes RTE_PTR_ADD and RTE_PTR_SUB by using char* pointer
arithmetic instead of uintptr_t casts when operating on pointer types.
This enables better compiler optimization, particularly for Clang which
can now recognize simple pointer patterns and apply vectorization, loop
unrolling, and improved assembly.
The implementation uses C11 _Generic to dispatch based on input type,
maintaining full API compatibility while enabling significant performance
improvements in checksum operations (40% for small buffers, up to 8x for
larger buffers).
The first patch contains the core optimization changes and unit tests.
The second patch adds a .mailmap entry for consistent git attribution.
Note on checkpatch warnings: The patches generate checkpatch warnings due to
C11 _Generic syntax not being fully recognized by checkpatch.pl (designed for
kernel C). The warnings are false positives - the code is valid C11 and
follows the same pattern as existing DPDK code (e.g., __rte_constant macro,
rte_bitops.h). The COMPLEX_MACRO warning is explicitly acknowledged by
checkpatch itself as expected for statement expression macros.
Changes in v4:
- Fixed duplicate "v3 v3" in subject line from v3
Changes in v3:
- Fixed CI build failures on GCC
- Simplified const pointer handling: GCC type-checks all _Generic branches
including unselected ones, causing errors with the nested _Generic/ternary
approach used in v2. Now uses simple char* cast for all pointer types,
which compiles on both GCC and Clang while still achieving the optimization
goal (char* arithmetic enables compiler optimizations vs uintptr_t).
- Removed Clang-specific pragma for -Wconditional-type-mismatch (no longer
needed without the ternary operator)
Changes in v2:
- Fixed copyright in test file to Apple Inc. (was Intel Corporation)
- Added checkpatch warning explanation to cover letter
Scott Mitchell (2):
eal: RTE_PTR_ADD/SUB char* for compiler optimizations
mailmap: add Scott Mitchell
.mailmap | 1 +
app/test/meson.build | 1 +
app/test/test_ptr_add_sub.c | 190 +++++++++++++++++++++++++++++++++++
lib/eal/include/rte_common.h | 60 +++++++++++
4 files changed, 252 insertions(+)
create mode 100644 app/test/test_ptr_add_sub.c
--
2.39.5 (Apple Git-154)
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH v4 1/2] eal: RTE_PTR_ADD/SUB char* for compiler optimizations
2026-01-11 16:06 [PATCH v4 0/2] eal: RTE_PTR_ADD/SUB char* for compiler optimizations scott.k.mitch1
@ 2026-01-11 16:06 ` scott.k.mitch1
2026-01-11 16:06 ` [PATCH v4 2/2] mailmap: add Scott Mitchell scott.k.mitch1
1 sibling, 0 replies; 3+ messages in thread
From: scott.k.mitch1 @ 2026-01-11 16:06 UTC (permalink / raw)
To: dev; +Cc: mb, stephen, Scott Mitchell
From: Scott Mitchell <scott.k.mitch1@gmail.com>
Modify RTE_PTR_ADD and RTE_PTR_SUB to use char* pointer arithmetic
instead of uintptr_t casts when operating on pointer types. This
enables better compiler optimization, particularly for
Clang which can now recognize simple pointer patterns and apply
vectorization, loop unrolling, and improved assembly.
The implementation uses C11 _Generic to dispatch based on input type:
- Integer types (int, uint64_t, etc.) continue using uintptr_t
arithmetic for API compatibility
- Pointer types use char* arithmetic, with const-qualified pointers
using const char* internally to preserve optimization hints
Example use case which benefits is __rte_raw_cksum. Performance
results from cksum_perf_autotest on Intel Xeon (Cascade Lake,
AVX-512) built with Clang 18.1 (TSC cycles/byte):
Block size Before After Improvement
100 0.40 0.24 ~40%
1500 0.50 0.06 ~8x
9000 0.49 0.06 ~8x
Signed-off-by: Scott Mitchell <scott.k.mitch1@gmail.com>
---
app/test/meson.build | 1 +
app/test/test_ptr_add_sub.c | 190 +++++++++++++++++++++++++++++++++++
lib/eal/include/rte_common.h | 60 +++++++++++
3 files changed, 251 insertions(+)
create mode 100644 app/test/test_ptr_add_sub.c
diff --git a/app/test/meson.build b/app/test/meson.build
index efec42a6bf..80a19d65ba 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -152,6 +152,7 @@ source_file_deps = {
'test_power_intel_uncore.c': ['power', 'power_intel_uncore'],
'test_power_kvm_vm.c': ['power', 'power_kvm_vm'],
'test_prefetch.c': [],
+ 'test_ptr_add_sub.c': [],
'test_ptr_compress.c': ['ptr_compress'],
'test_rand_perf.c': [],
'test_rawdev.c': ['rawdev', 'bus_vdev', 'raw_skeleton'],
diff --git a/app/test/test_ptr_add_sub.c b/app/test/test_ptr_add_sub.c
new file mode 100644
index 0000000000..ef449ca9af
--- /dev/null
+++ b/app/test/test_ptr_add_sub.c
@@ -0,0 +1,190 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2026 Apple Inc.
+ */
+
+#include "test.h"
+#include <stdint.h>
+#include <stdbool.h>
+
+#include <rte_common.h>
+
+/* Test all C11 standard integer types */
+static int
+test_ptr_add_sub_integer_types(void)
+{
+ unsigned long long ull = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(ull, 100), 0x1064,
+ "RTE_PTR_ADD failed for unsigned long long");
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_SUB(ull, 100), 0x1000 - 100,
+ "RTE_PTR_SUB failed for unsigned long long");
+
+ long long ll = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(ll, 100), 0x1064,
+ "RTE_PTR_ADD failed for long long");
+
+ unsigned long ul = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(ul, 100), 0x1064,
+ "RTE_PTR_ADD failed for unsigned long");
+
+ long l = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(l, 100), 0x1064,
+ "RTE_PTR_ADD failed for long");
+
+ unsigned int ui = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(ui, 100), 0x1064,
+ "RTE_PTR_ADD failed for unsigned int");
+
+ int i = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(i, 100), 0x1064,
+ "RTE_PTR_ADD failed for int");
+
+ unsigned short us = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(us, 100), 0x1064,
+ "RTE_PTR_ADD failed for unsigned short");
+
+ short s = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(s, 100), 0x1064,
+ "RTE_PTR_ADD failed for short");
+
+ unsigned char uc = 100;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(uc, 50), 150,
+ "RTE_PTR_ADD failed for unsigned char");
+
+ signed char sc = 100;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(sc, 50), 150,
+ "RTE_PTR_ADD failed for signed char");
+
+ char c = 100;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(c, 50), 150,
+ "RTE_PTR_ADD failed for char");
+
+ _Bool b = 1;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(b, 99), 100,
+ "RTE_PTR_ADD failed for _Bool");
+
+ bool b2 = true;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(b2, 99), 100,
+ "RTE_PTR_ADD failed for bool");
+
+ return 0;
+}
+
+/* Test pointer types including const correctness */
+static int
+test_ptr_add_sub_pointer_types(void)
+{
+ char buffer[256];
+ void *result;
+
+ /* Test void* and const void* */
+ void *vp = buffer;
+ result = RTE_PTR_ADD(vp, 100);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer + 100),
+ "RTE_PTR_ADD failed for void*");
+ result = RTE_PTR_SUB(vp, 50);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer - 50),
+ "RTE_PTR_SUB failed for void*");
+
+ const void *cvp = buffer;
+ result = RTE_PTR_ADD(cvp, 100);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer + 100),
+ "RTE_PTR_ADD failed for const void*");
+ result = RTE_PTR_SUB(cvp, 50);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer - 50),
+ "RTE_PTR_SUB failed for const void*");
+
+ /* Test char* and const char* */
+ char *cp = buffer;
+ result = RTE_PTR_ADD(cp, 100);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer + 100),
+ "RTE_PTR_ADD failed for char*");
+ result = RTE_PTR_SUB(cp, 50);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer - 50),
+ "RTE_PTR_SUB failed for char*");
+
+ const char *ccp = buffer;
+ result = RTE_PTR_ADD(ccp, 100);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer + 100),
+ "RTE_PTR_ADD failed for const char*");
+ result = RTE_PTR_SUB(ccp, 50);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer - 50),
+ "RTE_PTR_SUB failed for const char*");
+
+ /* Test uint32_t* and const uint32_t* */
+ uint32_t *u32p = (uint32_t *)buffer;
+ result = RTE_PTR_ADD(u32p, 100);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer + 100),
+ "RTE_PTR_ADD failed for uint32_t*");
+ result = RTE_PTR_SUB(u32p, 50);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer - 50),
+ "RTE_PTR_SUB failed for uint32_t*");
+
+ const uint32_t *cu32p = (const uint32_t *)buffer;
+ result = RTE_PTR_ADD(cu32p, 100);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer + 100),
+ "RTE_PTR_ADD failed for const uint32_t*");
+ result = RTE_PTR_SUB(cu32p, 50);
+ TEST_ASSERT_EQUAL(result, (void *)(buffer - 50),
+ "RTE_PTR_SUB failed for const uint32_t*");
+
+ /* Verify assigning to const pointer works (adding const is safe) */
+ const void *result_const = RTE_PTR_ADD(cvp, 100);
+ TEST_ASSERT_EQUAL(result_const, (const void *)(buffer + 100),
+ "RTE_PTR_ADD failed when assigning to const void*");
+
+ return 0;
+}
+
+/* Test that typedefs resolve to native types correctly */
+static int
+test_ptr_add_sub_typedefs(void)
+{
+ uint64_t u64 = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(u64, 100), 0x1064,
+ "RTE_PTR_ADD failed for uint64_t");
+
+ uint32_t u32 = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(u32, 100), 0x1064,
+ "RTE_PTR_ADD failed for uint32_t");
+
+ uint16_t u16 = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(u16, 100), 0x1064,
+ "RTE_PTR_ADD failed for uint16_t");
+
+ uint8_t u8 = 100;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(u8, 50), 150,
+ "RTE_PTR_ADD failed for uint8_t");
+
+ uintptr_t uptr = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(uptr, 100), 0x1064,
+ "RTE_PTR_ADD failed for uintptr_t");
+
+ size_t sz = 0x1000;
+ TEST_ASSERT_EQUAL((uintptr_t)RTE_PTR_ADD(sz, 100), 0x1064,
+ "RTE_PTR_ADD failed for size_t");
+
+ return 0;
+}
+
+/* Main test function that runs all subtests */
+static int
+test_ptr_add_sub(void)
+{
+ int ret;
+
+ ret = test_ptr_add_sub_integer_types();
+ if (ret != 0)
+ return ret;
+
+ ret = test_ptr_add_sub_pointer_types();
+ if (ret != 0)
+ return ret;
+
+ ret = test_ptr_add_sub_typedefs();
+ if (ret != 0)
+ return ret;
+
+ return 0;
+}
+
+REGISTER_FAST_TEST(ptr_add_sub_autotest, true, true, test_ptr_add_sub);
diff --git a/lib/eal/include/rte_common.h b/lib/eal/include/rte_common.h
index 9e7d84f929..1ae239b43a 100644
--- a/lib/eal/include/rte_common.h
+++ b/lib/eal/include/rte_common.h
@@ -551,12 +551,72 @@ static void __attribute__((destructor(RTE_PRIO(prio)), used)) func(void)
/**
* add a byte-value offset to a pointer
*/
+#ifndef RTE_TOOLCHAIN_MSVC
+#define RTE_PTR_ADD(ptr, x) \
+ (__extension__ ({ \
+ /* Diagnostics suppressed for internal macro operations only. \
+ * Compiler type-checks all _Generic branches even when unselected, \
+ * triggering warnings with no external impact. */ \
+ __rte_diagnostic_push \
+ __rte_diagnostic_ignored_wcast_qual \
+ /* Uses uintptr_t arithmetic for integer types (API compatibility), \
+ * and char* arithmetic for pointer types (enables optimization). */ \
+ __auto_type _ptr_result = _Generic((ptr), \
+ unsigned long long: ((void *)((uintptr_t)(ptr) + (x))), \
+ long long: ((void *)((uintptr_t)(ptr) + (x))), \
+ unsigned long: ((void *)((uintptr_t)(ptr) + (x))), \
+ long: ((void *)((uintptr_t)(ptr) + (x))), \
+ unsigned int: ((void *)((uintptr_t)(ptr) + (x))), \
+ int: ((void *)((uintptr_t)(ptr) + (x))), \
+ unsigned short: ((void *)((uintptr_t)(ptr) + (x))), \
+ short: ((void *)((uintptr_t)(ptr) + (x))), \
+ unsigned char: ((void *)((uintptr_t)(ptr) + (x))), \
+ signed char: ((void *)((uintptr_t)(ptr) + (x))), \
+ char: ((void *)((uintptr_t)(ptr) + (x))), \
+ _Bool: ((void *)((uintptr_t)(ptr) + (x))), \
+ default: ((void *)((char *)(ptr) + (x))) \
+ ); \
+ __rte_diagnostic_pop \
+ _ptr_result; \
+ }))
+#else
#define RTE_PTR_ADD(ptr, x) ((void*)((uintptr_t)(ptr) + (x)))
+#endif
/**
* subtract a byte-value offset from a pointer
*/
+#ifndef RTE_TOOLCHAIN_MSVC
+#define RTE_PTR_SUB(ptr, x) \
+ (__extension__ ({ \
+ /* Diagnostics suppressed for internal macro operations only. \
+ * Compiler type-checks all _Generic branches even when unselected, \
+ * triggering warnings with no external impact. */ \
+ __rte_diagnostic_push \
+ __rte_diagnostic_ignored_wcast_qual \
+ /* Uses uintptr_t arithmetic for integer types (API compatibility), \
+ * and char* arithmetic for pointer types (enables optimization). */ \
+ __auto_type _ptr_result = _Generic((ptr), \
+ unsigned long long: ((void *)((uintptr_t)(ptr) - (x))), \
+ long long: ((void *)((uintptr_t)(ptr) - (x))), \
+ unsigned long: ((void *)((uintptr_t)(ptr) - (x))), \
+ long: ((void *)((uintptr_t)(ptr) - (x))), \
+ unsigned int: ((void *)((uintptr_t)(ptr) - (x))), \
+ int: ((void *)((uintptr_t)(ptr) - (x))), \
+ unsigned short: ((void *)((uintptr_t)(ptr) - (x))), \
+ short: ((void *)((uintptr_t)(ptr) - (x))), \
+ unsigned char: ((void *)((uintptr_t)(ptr) - (x))), \
+ signed char: ((void *)((uintptr_t)(ptr) - (x))), \
+ char: ((void *)((uintptr_t)(ptr) - (x))), \
+ _Bool: ((void *)((uintptr_t)(ptr) - (x))), \
+ default: ((void *)((char *)(ptr) - (x))) \
+ ); \
+ __rte_diagnostic_pop \
+ _ptr_result; \
+ }))
+#else
#define RTE_PTR_SUB(ptr, x) ((void *)((uintptr_t)(ptr) - (x)))
+#endif
/**
* get the difference between two pointer values, i.e. how far apart
--
2.39.5 (Apple Git-154)
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH v4 2/2] mailmap: add Scott Mitchell
2026-01-11 16:06 [PATCH v4 0/2] eal: RTE_PTR_ADD/SUB char* for compiler optimizations scott.k.mitch1
2026-01-11 16:06 ` [PATCH v4 1/2] " scott.k.mitch1
@ 2026-01-11 16:06 ` scott.k.mitch1
1 sibling, 0 replies; 3+ messages in thread
From: scott.k.mitch1 @ 2026-01-11 16:06 UTC (permalink / raw)
To: dev; +Cc: mb, stephen, Scott Mitchell
From: Scott Mitchell <scott.k.mitch1@gmail.com>
Add email mapping for consistent git attribution.
Signed-off-by: Scott Mitchell <scott.k.mitch1@gmail.com>
---
.mailmap | 1 +
1 file changed, 1 insertion(+)
diff --git a/.mailmap b/.mailmap
index 2f089326ff..a0987fc3c7 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1437,6 +1437,7 @@ Saurabh Singhal <saurabhs@arista.com>
Savinay Dharmappa <savinay.dharmappa@intel.com>
Scott Branden <scott.branden@broadcom.com>
Scott Daniels <daniels@research.att.com>
+Scott Mitchell <scott.k.mitch1@gmail.com> <scott_mitchell@apple.com>
Scott W Taylor <scott.w.taylor@intel.com>
Scott Wasson <scott_wasson@affirmednetworks.com>
Seán Harte <seanbh@gmail.com>
--
2.39.5 (Apple Git-154)
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-01-11 16:06 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-01-11 16:06 [PATCH v4 0/2] eal: RTE_PTR_ADD/SUB char* for compiler optimizations scott.k.mitch1
2026-01-11 16:06 ` [PATCH v4 1/2] " scott.k.mitch1
2026-01-11 16:06 ` [PATCH v4 2/2] mailmap: add Scott Mitchell scott.k.mitch1
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).