DPDK patches and discussions
From: "Mattias Rönnblom" <hofors@lysator.liu.se>
To: "David Marchand" <david.marchand@redhat.com>,
	"Mattias Rönnblom" <mattias.ronnblom@ericsson.com>
Cc: dev@dpdk.org, "Morten Brørup" <mb@smartsharesystems.com>,
	"Stephen Hemminger" <stephen@networkplumber.org>,
	"Pavan Nikhilesh" <pbhagavatula@marvell.com>,
	"Bruce Richardson" <bruce.richardson@intel.com>
Subject: Re: [PATCH v6 5/7] eal: provide option to use compiler memcpy instead of RTE
Date: Fri, 4 Oct 2024 11:27:23 +0200
Message-ID: <7d165396-a5dd-4aab-bc24-a6bfc610c291@lysator.liu.se>
In-Reply-To: <CAJFAV8ywLMPsF+_nju3Qsz0vTMNuPqQ5kJo9LebnyuhMfbYhwg@mail.gmail.com>

On 2024-10-04 09:52, David Marchand wrote:
> On Fri, Sep 20, 2024 at 12:36 PM Mattias Rönnblom
> <mattias.ronnblom@ericsson.com> wrote:
>>
>> Provide build option to have functions in <rte_memcpy.h> delegate to
>> the standard compiler/libc memcpy(), instead of using the various
>> custom DPDK, handcrafted, per-architecture rte_memcpy()
>> implementations.
>>
>> A new meson build option 'use_cc_memcpy' is added. By default, the
>> traditional, custom DPDK rte_memcpy() implementation is used.
>>
>> The performance benefits of the custom DPDK rte_memcpy()
>> implementations have been diminishing with every compiler release, and
>> with current toolchains the use of a custom memcpy() implementation
>> may even be a liability.
>>
>> An additional benefit of this change is that compilers and static
>> analysis tools have an easier time detecting incorrect usage of
>> rte_memcpy() (e.g., buffer overruns, or overlapping source and
>> destination buffers).
>>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> 
> I like this patch and the direction we are taking: stop reinventing
> memcpy and rely on the compiler to optimize it.
> 
> I have some comments on the implementation.
> 
> - When I split the headers in the early days of DPDK, the intention
> with arch-specific headers in EAL was to have them include the generic
> one, in all cases.
> It seems that, over time, x86 rte_memcpy.h (at least) deviated from
> this and stopped including generic/rte_memcpy.h...
> 
> So in this current patch, I expect every arch-specific header to first
> include generic/rte_memcpy.h, regardless of any arch-specific define
> coming from the configuration.
> 
> An additional note on this, ARM32 and ARM64 have their own
> implementation in rte_memcpy_32.h resp. rte_memcpy_64.h, and I would
> check RTE_USE_CC_MEMCPY in each of them rather than in the top as
> ARM32 and ARM64 are like two different arches.
> 
> 
> - Now, looking at what was available for arches so far in DPDK:
> * ARM was relying by default on compiler implementation, with specific
> implementations for ARM32 and ARM64 available (see below for more
> details) => possible values (default first) RTE_USE_CC_MEMCPY = true /
> false
> * loongarch was relying on compiler implementation, with no specific
> implementations, => RTE_USE_CC_MEMCPY = true
> * ppc was relying on arch specific implementation, => RTE_USE_CC_MEMCPY = false
> * riscv was relying on compiler implementation, with no specific
> implementations, => RTE_USE_CC_MEMCPY = true
> * x86 was relying on arch specific implementation, => RTE_USE_CC_MEMCPY = false
> 
> We can't get a unified default value for a meson option and keep
> compat for all arches (except maybe by introducing an "auto" value).
> 

What if you just renamed RTE_USE_CC_MEMCPY to
RTE_ALWAYS_USE_CC_MEMCPY or RTE_FORCE_CC_MEMCPY?

Then the naming would better match a scenario where cc memcpy may be the 
only option.
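
A minimal sketch of how such a define could read in a header, only to
illustrate the naming. The demo_* functions are hypothetical stand-ins
(not the actual rte_memcpy() code), and the define is set locally here
on the assumption that a real build would get it from the build
configuration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Assumption: a real build would take this from the build
 * configuration. It is defined here only to keep the sketch
 * self-contained.
 */
#ifndef RTE_FORCE_CC_MEMCPY
#define RTE_FORCE_CC_MEMCPY 1
#endif

/* Hypothetical stand-in for a handcrafted per-architecture copy. */
static inline void *
demo_custom_memcpy(void *dst, const void *src, size_t n)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	while (n-- > 0)
		*d++ = *s++;

	return dst;
}

/* Hypothetical wrapper, not the actual rte_memcpy() definition. */
static inline void *
demo_memcpy(void *dst, const void *src, size_t n)
{
#ifdef RTE_FORCE_CC_MEMCPY
	/* Forced: always delegate to the compiler/libc memcpy(). */
	return memcpy(dst, src, n);
#else
	/* Not forced: the architecture may use its own implementation. */
	return demo_custom_memcpy(dst, src, n);
#endif
}

/* Usage: copy a small buffer and verify the result. */
static inline void
demo_memcpy_selftest(void)
{
	const char src[] = "dpdk";
	char dst[sizeof(src)];

	demo_memcpy(dst, src, sizeof(src));
	assert(memcmp(dst, src, sizeof(src)) == 0);
}
```

With "force" in the name, leaving the define unset on an architecture
without a custom implementation would simply mean "nothing is forced",
and cc memcpy could still be what is used there.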

> Plus, disabling RTE_USE_CC_MEMCPY on loongarch and riscv makes no
> sense, as there was never a specific implementation.
> 
> My suggestion is to drop the meson option and instead just set
> RTE_USE_CC_MEMCPY in config/$arch/meson.build.
> Testers / interested users may edit config/$arch/meson.build on their own.
> 
> 
> - Additionally, ARM people have introduced arch-specific
> implementation config options for memcpy in ARM32 resp. ARM64:
> RTE_ARCH_ARM_NEON_MEMCPY resp. RTE_ARCH_ARM64_MEMCPY.
> RTE_USE_CC_MEMCPY can replace those two options (we may keep some
> compat in case someone relied on those defines for arm).
> That removes the need for a RTE_CC_MEMCPY define.
> 
> More comments below:
> 
> [snip]
> 
>> diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
>> index 0ff70d9057..8be000294d 100644
>> --- a/doc/guides/rel_notes/release_24_11.rst
>> +++ b/doc/guides/rel_notes/release_24_11.rst
>> @@ -55,6 +55,26 @@ New Features
>>        Also, make sure to start the actual text at the margin.
>>        =======================================================
>>
>> +* **Compiler memcpy replaces custom DPDK implementation.**
>> +
>> +  The memory copy functions of ``<rte_memcpy.h>`` now optionally
>> +  delegate to the standard memcpy() function, implemented by the
>> +  compiler and the C runtime (e.g., libc).
>> +
>> +  In this release of DPDK, the handcrafted, per-architecture memory
>> +  copy implementations are still the default. Compiler memcpy is
>> +  enabled by setting the new ``use_cc_memcpy`` build option to true.
>> +
>> +  The performance benefits of the custom DPDK rte_memcpy()
>> +  implementations have been diminishing with every new compiler
>> +  release, and with current toolchains the use of a custom memcpy()
>> +  implementation may even result in worse performance than the
>> +  standard memcpy().
>> +
>> +  An additional benefit of using compiler memcpy is that compilers and
>> +  static analysis tools have an easier time detecting incorrect usage
>> +  of rte_memcpy() (e.g., buffer overruns, or overlapping source and
>> +  destination buffers).
> 
> As explained in the RN comments, an entry should use the form:
> 
>     * **Add a title in the past tense with a full stop.**
> 
>       Add a short 1-2 sentence description in the past tense.
>       The description should be enough to allow someone scanning
>       the release notes to understand the new feature.
> 
> It seems this note is a copy/paste of the commit log, please adjust
> the title and make the description shorter.
> 
>>
>>   Removed Items
>>   -------------
> 
> [snip]
> 
>> diff --git a/lib/eal/include/generic/rte_memcpy.h b/lib/eal/include/generic/rte_memcpy.h
>> index e7f0f8eaa9..cfb0175bd2 100644
>> --- a/lib/eal/include/generic/rte_memcpy.h
>> +++ b/lib/eal/include/generic/rte_memcpy.h
>> @@ -5,12 +5,19 @@
>>   #ifndef _RTE_MEMCPY_H_
>>   #define _RTE_MEMCPY_H_
>>
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>>   /**
>>    * @file
>>    *
>>    * Functions for vectorised implementation of memcpy().
>>    */
>>
>> +#include <stdint.h>
>> +#include <string.h>
> 
> I don't think those includes should go in an extern "C" { block.
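
A rough sketch of the layout this comment seems to ask for, with the
standard includes placed before the linkage block is opened. The header
name and the demo_mov16() helper are hypothetical and only illustrate
the ordering; this is not the actual content of generic/rte_memcpy.h:

```c
/* ---- demo_copy.h: hypothetical header, not the real rte_memcpy.h ---- */
#ifndef DEMO_COPY_H_
#define DEMO_COPY_H_

/* Standard headers come first, outside any extern "C" block. */
#include <stdint.h>
#include <string.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical 16-byte copy helper delegating to memcpy(). */
static inline void
demo_mov16(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
}

#ifdef __cplusplus
}
#endif

#endif /* DEMO_COPY_H_ */

/* ---- usage ---- */
#include <assert.h>

static inline void
demo_mov16_selftest(void)
{
	uint8_t src[16];
	uint8_t dst[16] = {0};
	unsigned int i;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (uint8_t)i;

	demo_mov16(dst, src);
	assert(memcmp(dst, src, sizeof(src)) == 0);
}
```

The usual reasoning is that system headers handle their own linkage
for C++ builds, so wrapping them in extern "C" is at best redundant.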
> 
>> +
>>   /**
>>    * Copy 16 bytes from one location to another using optimised
>>    * instructions. The locations should not overlap.
>> @@ -35,8 +42,6 @@ rte_mov16(uint8_t *dst, const uint8_t *src);
>>   static inline void
>>   rte_mov32(uint8_t *dst, const uint8_t *src);
>>
>> -#ifdef __DOXYGEN__
>> -
> 
> This strange check was added as not all architectures provide
> rte_mov48 (/me slaps Adrien and Thomas).
> I think the CI reported no issue because of a problem in the next
> patch, where the only combination tested is RTE_USE_CC_MEMCPY = true.
> 
> Still, the overall goal of this work is to drop the whole rte_memcpy
> thing in the future, so I think we can live with this #ifdef
> __DOXYGEN__ nonsense hiding the absence of rte_mov48 in x86...
> 
> 
>>   /**
>>    * Copy 48 bytes from one location to another using optimised
>>    * instructions. The locations should not overlap.
>> @@ -49,8 +54,6 @@ rte_mov32(uint8_t *dst, const uint8_t *src);
>>   static inline void
>>   rte_mov48(uint8_t *dst, const uint8_t *src);
>>
>> -#endif /* __DOXYGEN__ */
>> -
>>   /**
>>    * Copy 64 bytes from one location to another using optimised
>>    * instructions. The locations should not overlap.
>> @@ -87,8 +90,6 @@ rte_mov128(uint8_t *dst, const uint8_t *src);
>>   static inline void
>>   rte_mov256(uint8_t *dst, const uint8_t *src);
>>
>> -#ifdef __DOXYGEN__
>> -
>>   /**
>>    * Copy bytes from one location to another. The locations must not overlap.
>>    *
>> @@ -111,6 +112,52 @@ rte_mov256(uint8_t *dst, const uint8_t *src);
>>   static void *
>>   rte_memcpy(void *dst, const void *src, size_t n);
>>
>> -#endif /* __DOXYGEN__ */
> 
> Removing this DOXYGEN here should be ok.
> CI will tell us.
> 
> 
>> diff --git a/lib/eal/x86/include/meson.build b/lib/eal/x86/include/meson.build
>> index 52d2f8e969..09c2fe2485 100644
>> --- a/lib/eal/x86/include/meson.build
>> +++ b/lib/eal/x86/include/meson.build
>> @@ -16,6 +16,7 @@ arch_headers = files(
>>           'rte_spinlock.h',
>>           'rte_vect.h',
>>   )
>> +
> 
> Unrelated change.
> 
> 
>>   arch_indirect_headers = files(
>>           'rte_atomic_32.h',
>>           'rte_atomic_64.h',
> 
> 

