Date: Tue, 25 Jun 2024 21:27:06 +0200
From: Mattias Rönnblom
To: Maxime Coquelin
Cc: Mattias Rönnblom, dev@dpdk.org, Morten Brørup, Stephen Hemminger,
 Abdullah Sevincer, Pavan Nikhilesh, David Hunt, Vladimir Medvedkin,
 Bruce Richardson
Subject: Re: [PATCH v4 00/13] Optionally have rte_memcpy delegate to compiler memcpy
References: <20240620115027.420304-2-mattias.ronnblom@ericsson.com>
 <20240620175731.420639-1-mattias.ronnblom@ericsson.com>
List-Id: DPDK patches and discussions

On Tue, Jun 25, 2024 at 05:29:35PM
+0200, Maxime Coquelin wrote:
> Hi Mattias,
>
> On 6/20/24 19:57, Mattias Rönnblom wrote:
> > This patch set makes DPDK library, driver, and application code use the
> > compiler/libc memcpy() by default when functions in are
> > invoked.
> >
> > The various custom DPDK rte_memcpy() implementations may be retained
> > by means of a build-time option.
> >
> > This patch set only makes a difference on x86, PPC and ARM. Loongarch
> > and RISCV already used compiler/libc memcpy().
>
> It indeed makes a difference on x86!
>
> Just tested latest main with and without your series on
> Intel(R) Xeon(R) Gold 6438N.
>
> The test is a simple IO loop between a Vhost PMD and a Virtio-user PMD:
> # dpdk-testpmd -l 4-6 --file-prefix=virtio1 --no-pci --vdev 'net_virtio_user0,mac=00:01:02:03:04:05,path=./vhost-net,server=1,mrg_rxbuf=1,in_order=1'
> --single-file-segments -- -i
> testpmd> start
>
> # dpdk-testpmd -l 8-10 --file-prefix=vhost1 --no-pci --vdev
> 'net_vhost0,iface=vhost-net,client=1' --single-file-segments -- -i
> testpmd> start tx_first 32
>
> Latest main: 14.5Mpps
> Latest main + this series: 10Mpps
>

I ran the above benchmark on my Raptor Lake desktop (locked to 3.2
GHz), built with GCC 12.3.0.

Core  use_cc_memcpy  Mpps
E     false           9.5
E     true            9.7
P     false          16.4
P     true           13.5

On the P-cores, there's a significant performance regression (roughly
18%), although not as bad as the one you see on your Sapphire Rapids
Xeon. On the E-cores, there's actually a slight performance gain
(roughly 2%).

The virtio PMD does not directly invoke rte_memcpy() or anything else
from , but rather uses memcpy(), so I'm not sure I
understand what's going on here. Does the virtio driver delegate some
performance-critical task to some module that in turn uses
rte_memcpy()?

> So for me, it should be disabled by default.
>
> Regards,
> Maxime
>
> > This patch set includes a number of fixes in drivers and libraries
> > which erroneously relied on including header files
> > (i.e., ) required by its implementation.
> >
> > Mattias Rönnblom (13):
> >   net/i40e: add missing vector API header include
> >   net/iavf: add missing vector API header include
> >   net/ice: add missing vector API header include
> >   net/ixgbe: add missing vector API header include
> >   net/ngbe: add missing vector API header include
> >   net/txgbe: add missing vector API header include
> >   net/virtio: add missing vector API header include
> >   net/fm10k: add missing vector API header include
> >   event/dlb2: include headers for vector and memory copy APIs
> >   net/octeon_ep: add missing vector API header include
> >   distributor: add missing vector API header include
> >   fib: add missing vector API header include
> >   eal: provide option to use compiler memcpy instead of RTE
> >
> >  config/meson.build                          |  1 +
> >  doc/guides/rel_notes/release_24_07.rst      | 21 +++++++
> >  drivers/event/dlb2/dlb2.c                   |  2 +
> >  drivers/net/fm10k/fm10k_rxtx_vec.c          |  3 +-
> >  drivers/net/i40e/i40e_rxtx_vec_sse.c        |  3 +-
> >  drivers/net/iavf/iavf_rxtx_vec_sse.c        |  3 +-
> >  drivers/net/ice/ice_rxtx_vec_sse.c          |  2 +-
> >  drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c      |  3 +-
> >  drivers/net/ngbe/ngbe_rxtx_vec_sse.c        |  3 +-
> >  drivers/net/octeon_ep/otx_ep_ethdev.c       |  2 +
> >  drivers/net/txgbe/txgbe_rxtx_vec_sse.c      |  3 +-
> >  drivers/net/virtio/virtio_rxtx_simple_sse.c |  3 +-
> >  lib/distributor/rte_distributor.c           |  1 +
> >  lib/eal/arm/include/rte_memcpy.h            | 10 ++++
> >  lib/eal/include/generic/rte_memcpy.h        | 61 ++++++++++++++++++---
> >  lib/eal/loongarch/include/rte_memcpy.h      | 53 ++----------------
> >  lib/eal/ppc/include/rte_memcpy.h            | 10 ++++
> >  lib/eal/riscv/include/rte_memcpy.h          | 53 ++----------------
> >  lib/eal/x86/include/meson.build             |  1 +
> >  lib/eal/x86/include/rte_memcpy.h            | 11 +++-
> >  lib/fib/trie.c                              |  1 +
> >  meson_options.txt                           |  2 +
> >  22 files changed, 131 insertions(+), 121 deletions(-)
> >
>
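
For readers who have not looked at the final patch in the list above,
the kind of build-time delegation it describes can be sketched roughly
as follows. This is only an illustration under stated assumptions:
EXAMPLE_USE_CC_MEMCPY, example_memcpy() and example_custom_memcpy() are
hypothetical names, not the macro or functions the series adds, and the
byte loop merely stands in for an architecture-specific copy routine.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical build-time switch (an assumption for this sketch, not
 * the macro defined by the patch set). 1 selects the compiler/libc
 * memcpy(), 0 selects the custom routine below.
 */
#ifndef EXAMPLE_USE_CC_MEMCPY
#define EXAMPLE_USE_CC_MEMCPY 1
#endif

/*
 * Hypothetical stand-in for a hand-tuned, architecture-specific copy
 * routine. A plain byte loop keeps the sketch portable.
 */
static inline void *
example_custom_memcpy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;

	return dst;
}

/*
 * Single entry point used by callers. Which implementation it resolves
 * to is decided at build time, so no run-time branch is added.
 */
static inline void *
example_memcpy(void *dst, const void *src, size_t n)
{
#if EXAMPLE_USE_CC_MEMCPY
	/* Delegate to the compiler/libc memcpy(). */
	return memcpy(dst, src, n);
#else
	/* Retain the custom implementation. */
	return example_custom_memcpy(dst, src, n);
#endif
}
```

Building with -DEXAMPLE_USE_CC_MEMCPY=0 would select the custom routine
instead. Callers are unchanged either way, which is what makes an A/B
comparison like the one in the table above straightforward to run.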