From: Thomas Monjalon
To: Aman Kumar
Cc: dpdk-dev, Slava Ovsiienko, Anatoly Burakov, "Song, Keesang", Jerin Jacob, konstantin.ananyev@intel.com, bruce.richardson@intel.com
Date: Wed, 27 Oct 2021 09:59:59 +0200
Message-ID: <2270210.CKhdPXMPtV@thomas>
References: <20211019104724.19416-1-aman.kumar@vvdntech.in> <2148097.ar7J4MBmm8@thomas>
Subject: Re: [dpdk-dev] [PATCH v3 3/3] lib/eal: add temporal store memcpy support on AMD platform

27/10/2021 08:34, Aman Kumar:
> On Tue, Oct 26, 2021 at 9:44 PM Thomas Monjalon wrote:
>
> > 26/10/2021 17:56, Aman Kumar:
> > > This patch provides a rte_memcpy* call with temporal stores.
> > > Use -Dcpu_instruction_set=znverX with build to enable this API.
> > >
> > > Signed-off-by: Aman Kumar
> > > ---
> > >  config/x86/meson.build           |   2 +
> > >  lib/eal/x86/include/rte_memcpy.h | 114 +++++++++++++++++++++++++++++++
> >
> > It looks better as C code.
> > Do you achieve the same performance as the asm version?
> >
>
> In a few corner cases assembly performed better, but overall we have very
> similar perf observations.
>
> > > +#if defined RTE_MEMCPY_AMDEPYC
> > [...]
> > > +static __rte_always_inline void *
> > > +rte_memcpy_aligned_tstore16_generic(void *dst, void *src, int len)
> >
> > So to be clear, an application will benefit from this optimization if
> > 1/ DPDK is specifically compiled for AMD
> > 2/ the application is compiled with above DPDK build (because of
> > inlining)
> >
> > I guess there is no good way to benefit from the optimization
> > without specific compilation, because of inlining constraint.
> > Another design, with less constraint but less performance,
> > would be to have a function pointer assigned at runtime based on the CPU.
> >
>
> You're right. We need to build DPDK and apps with this flag enabled to get
> the benefit.

So the x86 packages, as in Linux distributions, won't have this optimization.

> In future versions, we will try to adapt in a more dynamic way.
> Thanks.

No, I was trying to say that unfortunately there is probably no solution.
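
For reference, the alternative design mentioned above (a function pointer
assigned at runtime based on the CPU) could look roughly like the sketch
below. It is only an illustration, not code from the patch: copy_fn_t,
copy_generic, copy_tuned, cpu_has_tuned_path, copy_select and app_copy are
hypothetical names, the AVX2 check merely stands in for whatever CPU test
would really be needed, and copy_tuned is a placeholder that calls memcpy.
It assumes GCC or clang for the __builtin_cpu_* helpers and falls back to
the generic path elsewhere.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature shared by all copy implementations. */
typedef void *(*copy_fn_t)(void *dst, const void *src, size_t len);

/* Portable fallback that works on any CPU. */
static void *copy_generic(void *dst, const void *src, size_t len)
{
	return memcpy(dst, src, len);
}

/*
 * Placeholder for a CPU-specific routine. A real one would contain the
 * tuned load/store sequence; here it only forwards to memcpy.
 */
static void *copy_tuned(void *dst, const void *src, size_t len)
{
	return memcpy(dst, src, len);
}

/* Assumption: AVX2 support is used as a stand-in for the real CPU test. */
static int cpu_has_tuned_path(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && \
	(defined(__GNUC__) || defined(__clang__))
	__builtin_cpu_init();
	return __builtin_cpu_supports("avx2");
#else
	return 0;
#endif
}

/* Pointer used by every caller; defaults to the portable version. */
static copy_fn_t copy_impl = copy_generic;

/* Call once at startup to pick the implementation for this CPU. */
static void copy_select(void)
{
	copy_impl = cpu_has_tuned_path() ? copy_tuned : copy_generic;
}

/* The indirect call cannot be inlined, hence the performance cost. */
static inline void *app_copy(void *dst, const void *src, size_t len)
{
	return copy_impl(dst, src, len);
}
```

A generic binary would call copy_select() once during initialization and
then use app_copy() everywhere. The price is the indirect call on every
copy, which is exactly what the always-inline version in the patch avoids.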