I've implemented optimizations to rte_memcpy targeting RISC-V
architectures, achieving an average 10% to 15% reduction in execution
time for data sizes from 129 to 1024 bytes (sizes from 1025 to 1600
bytes gain little). These enhancements draw inspiration from the x86
implementations, specifically focusing on:

1) Alignment handling for unaligned scenarios
2) Vector configuration tuning
3) Strategic prefetching

- Patch 1: Cover letter
- Patch 2: Base implementation
- Patch 3: Benchmark report

Tested on SG2044 (VLEN=128)

Qiguo Chen (2):
  riscv support rte_memcpy in vector
  benchmark report for rte_memcpy

 .mailmap                           |   1 +
 benchmark_report.txt               | 149 ++++++++++++++
 config/riscv/meson.build           |  14 ++
 lib/eal/riscv/include/rte_memcpy.h | 310 ++++++++++++++++++++++++++++-
 4 files changed, 472 insertions(+), 2 deletions(-)
 create mode 100644 benchmark_report.txt

--
2.21.0.windows.1