I've implemented optimizations to rte_memcpy targeting RISC-V architectures,
achieving an average 10%~15% reduction in execution time for data sizes from
129 to 1024 bytes (sizes from 1025 to 1600 bytes gain little).
These enhancements draw inspiration from the x86 implementations,
specifically focusing on:
1) Alignment handling for unaligned scenarios
2) Vector configuration tuning
3) Strategic prefetching
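
As a rough illustration of how the three points above fit together, here is a
minimal portable C sketch. It is not the code in this series: the name
sketch_copy and the constants SKETCH_ALIGN and SKETCH_BLOCK are hypothetical,
plain memcpy() stands in for the RISC-V vector loads/stores, and
__builtin_prefetch() assumes a GCC- or Clang-compatible compiler. The block
size is only a placeholder for whatever vector configuration a real
implementation would select.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ALIGN 16u  /* hypothetical: 16 bytes, matching VLEN=128 bits */
#define SKETCH_BLOCK 64u  /* hypothetical bulk granule, stands in for vector config */

/* Illustrative only; the buffers must not overlap. */
static void *sketch_copy(void *dst, const void *src, size_t n)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    assert(dst != NULL && src != NULL);

    /* 1) Alignment handling: copy a short head so that d becomes aligned. */
    size_t head = (SKETCH_ALIGN - ((uintptr_t)d & (SKETCH_ALIGN - 1)))
                  & (SKETCH_ALIGN - 1);
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;

    /* 2) + 3) Bulk loop: fixed-size blocks, prefetching the next source block. */
    while (n >= SKETCH_BLOCK) {
        __builtin_prefetch(s + SKETCH_BLOCK, 0, 0); /* read, low temporal locality */
        memcpy(d, s, SKETCH_BLOCK); /* a vector build would use vector load/store here */
        d += SKETCH_BLOCK;
        s += SKETCH_BLOCK;
        n -= SKETCH_BLOCK;
    }

    /* Tail: the remaining bytes, fewer than one block. */
    memcpy(d, s, n);
    return dst;
}
```

Aligning the destination first means every store in the bulk loop is aligned
even when the source is not, which is the same general idea the x86
rte_memcpy uses for its large-copy path.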
- Patch 1: Cover letter
- Patch 2: Base implementation
- Patch 3: Benchmark report
Tested on SG2044 (VLEN=128)
Qiguo Chen (2):
riscv support rte_memcpy in vector
benchmark report for rte_memcpy
.mailmap | 1 +
benchmark_report.txt | 149 ++++++++++++++
config/riscv/meson.build | 14 ++
lib/eal/riscv/include/rte_memcpy.h | 310 ++++++++++++++++++++++++++++-
4 files changed, 472 insertions(+), 2 deletions(-)
create mode 100644 benchmark_report.txt
--
2.21.0.windows.1