From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <3eebd7f7-9ba2-424c-80d1-6efa8945641d@redhat.com>
Date: Wed, 26 Jun 2024 10:37:31 +0200
Subject: Re: [PATCH v4 00/13] Optionally have rte_memcpy delegate to compiler memcpy
To: Mattias Rönnblom
Cc: Mattias Rönnblom, dev@dpdk.org, Morten Brørup, Stephen Hemminger,
 Abdullah Sevincer, Pavan Nikhilesh, David Hunt, Vladimir Medvedkin,
 Bruce Richardson
References: <20240620115027.420304-2-mattias.ronnblom@ericsson.com>
 <20240620175731.420639-1-mattias.ronnblom@ericsson.com>
From: Maxime Coquelin <maxime.coquelin@redhat.com>
List-Id: DPDK patches and discussions

On 6/25/24 21:27, Mattias Rönnblom wrote:
> On Tue, Jun 25, 2024 at 05:29:35PM +0200, Maxime Coquelin wrote:
>> Hi Mattias,
>>
>> On 6/20/24 19:57, Mattias Rönnblom wrote:
>>> This patch set makes DPDK library, driver, and application code use the
>>> compiler/libc memcpy() by default when functions in <rte_memcpy.h> are
>>> invoked.
>>>
>>> The various custom DPDK rte_memcpy() implementations may be retained
>>> by means of a build-time option.
>>>
>>> This patch set only makes a difference on x86, PPC and ARM. Loongarch
>>> and RISCV already used compiler/libc memcpy().
>>
>> It indeed makes a difference on x86!
>>
>> Just tested latest main with and without your series on
>> Intel(R) Xeon(R) Gold 6438N.
>>
>> The test is a simple IO loop between a Vhost PMD and a Virtio-user PMD:
>> # dpdk-testpmd -l 4-6 --file-prefix=virtio1 --no-pci --vdev \
>>   'net_virtio_user0,mac=00:01:02:03:04:05,path=./vhost-net,server=1,mrg_rxbuf=1,in_order=1' \
>>   --single-file-segments -- -i
>> testpmd> start
>>
>> # dpdk-testpmd -l 8-10 --file-prefix=vhost1 --no-pci --vdev \
>>   'net_vhost0,iface=vhost-net,client=1' --single-file-segments -- -i
>> testpmd> start tx_first 32
>>
>> Latest main: 14.5Mpps
>> Latest main + this series: 10Mpps
>>
>
> I ran the above benchmark on my Raptor Lake desktop (locked to 3.2
> GHz). GCC 12.3.0.
>
> Core  use_cc_memcpy  Mpps
> E     false           9.5
> E     true            9.7
> P     false          16.4
> P     true           13.5
>
> On the P-cores, there's a significant performance regression, although
> not as bad as the one you see on your Sapphire Rapids Xeon. On the
> E-cores, there's actually a slight performance gain.
>
> The virtio PMD does not directly invoke rte_memcpy() or anything else
> from <rte_memcpy.h>, but rather uses memcpy(), so I'm not sure I
> understand what's going on here. Does the virtio driver delegate some
> performance-critical task to some module that in turn uses
> rte_memcpy()?

This is because Vhost is the bottleneck here, not the Virtio driver.
Indeed, the virtqueue memory belongs to the Virtio driver and the
descriptor buffers are Virtio's own mbufs, so few memcpy() calls are
made on that side.

Vhost, however, is a heavy memcpy user, as all descriptor buffers are
copied to/from its mbufs.

>> So for me, it should be disabled by default.
>>
>> Regards,
>> Maxime
>>
>>> This patch set includes a number of fixes in drivers and libraries
>>> which erroneously relied on <rte_memcpy.h> including header files
>>> (i.e., <rte_vect.h>) required by its implementation.
>>>
>>> Mattias Rönnblom (13):
>>>   net/i40e: add missing vector API header include
>>>   net/iavf: add missing vector API header include
>>>   net/ice: add missing vector API header include
>>>   net/ixgbe: add missing vector API header include
>>>   net/ngbe: add missing vector API header include
>>>   net/txgbe: add missing vector API header include
>>>   net/virtio: add missing vector API header include
>>>   net/fm10k: add missing vector API header include
>>>   event/dlb2: include headers for vector and memory copy APIs
>>>   net/octeon_ep: add missing vector API header include
>>>   distributor: add missing vector API header include
>>>   fib: add missing vector API header include
>>>   eal: provide option to use compiler memcpy instead of RTE
>>>
>>>  config/meson.build                          |  1 +
>>>  doc/guides/rel_notes/release_24_07.rst      | 21 +++++++
>>>  drivers/event/dlb2/dlb2.c                   |  2 +
>>>  drivers/net/fm10k/fm10k_rxtx_vec.c          |  3 +-
>>>  drivers/net/i40e/i40e_rxtx_vec_sse.c        |  3 +-
>>>  drivers/net/iavf/iavf_rxtx_vec_sse.c        |  3 +-
>>>  drivers/net/ice/ice_rxtx_vec_sse.c          |  2 +-
>>>  drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c      |  3 +-
>>>  drivers/net/ngbe/ngbe_rxtx_vec_sse.c        |  3 +-
>>>  drivers/net/octeon_ep/otx_ep_ethdev.c       |  2 +
>>>  drivers/net/txgbe/txgbe_rxtx_vec_sse.c      |  3 +-
>>>  drivers/net/virtio/virtio_rxtx_simple_sse.c |  3 +-
>>>  lib/distributor/rte_distributor.c           |  1 +
>>>  lib/eal/arm/include/rte_memcpy.h            | 10 ++++
>>>  lib/eal/include/generic/rte_memcpy.h        | 61 ++++++++++++++++++---
>>>  lib/eal/loongarch/include/rte_memcpy.h      | 53 ++----------------
>>>  lib/eal/ppc/include/rte_memcpy.h            | 10 ++++
>>>  lib/eal/riscv/include/rte_memcpy.h          | 53 ++----------------
>>>  lib/eal/x86/include/meson.build             |  1 +
>>>  lib/eal/x86/include/rte_memcpy.h            | 11 +++-
>>>  lib/fib/trie.c                              |  1 +
>>>  meson_options.txt                           |  2 +
>>>  22 files changed, 131 insertions(+), 121 deletions(-)
>>>
>>
>
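To illustrate why the copy routine matters so much on the Vhost side, here is
a rough sketch of a per-packet copy loop sitting behind a build-time switch.
It is not the Vhost library code and not the code from this series: every
name in it (SKETCH_USE_CC_MEMCPY, sketch_memcpy, struct sketch_buf,
sketch_copy_burst) is hypothetical, and the byte loop is only a placeholder
for a hand-tuned routine such as rte_memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical macro standing in for a build-time option. */
#ifdef SKETCH_USE_CC_MEMCPY
/* Delegate to the compiler/libc memcpy(). */
#define sketch_memcpy(dst, src, n) memcpy((dst), (src), (n))
#else
/* Placeholder for a custom copy routine; NOT DPDK's rte_memcpy(). */
static inline void *
sketch_memcpy(void *dst, const void *src, size_t n)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	while (n-- > 0)
		*d++ = *s++;
	return dst;
}
#endif

/* Hypothetical, heavily simplified stand-in for a packet buffer. */
struct sketch_buf {
	uint8_t data[2048];
	size_t len;
};

/*
 * Vhost-like path: each packet of the burst is copied from a
 * descriptor buffer into a backend-owned buffer, so the copy routine
 * runs once per packet. Returns the total number of bytes copied.
 */
static size_t
sketch_copy_burst(struct sketch_buf *dst, const struct sketch_buf *src,
		  size_t nb_pkts)
{
	size_t total = 0;

	for (size_t i = 0; i < nb_pkts; i++) {
		assert(src[i].len <= sizeof(dst[i].data));
		sketch_memcpy(dst[i].data, src[i].data, src[i].len);
		dst[i].len = src[i].len;
		total += src[i].len;
	}
	return total;
}
```

Building the same file with and without -DSKETCH_USE_CC_MEMCPY swaps the
routine used in the per-packet loop without touching the callers, which is
the kind of switch being discussed here. On the Virtio-user side the buffers
are already the driver's own mbufs, so a loop like this barely runs there.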