From: Aman Kumar
Date: Wed, 27 Oct 2021 12:04:54 +0530
To: Thomas Monjalon
Cc: dpdk-dev, Slava Ovsiienko, Anatoly Burakov, "Song, Keesang", Jerin Jacob, konstantin.ananyev@intel.com, bruce.richardson@intel.com
Subject: Re: [dpdk-dev] [PATCH v3 3/3] lib/eal: add temporal store memcpy support on AMD platform
References: <20211019104724.19416-1-aman.kumar@vvdntech.in> <20211026155645.246783-1-aman.kumar@vvdntech.in> <20211026155645.246783-3-aman.kumar@vvdntech.in> <2148097.ar7J4MBmm8@thomas>
In-Reply-To: <2148097.ar7J4MBmm8@thomas>
List-Id: DPDK patches and discussions

On Tue, Oct 26, 2021 at 9:44 PM Thomas Monjalon wrote:

> 26/10/2021 17:56, Aman Kumar:
> > This patch provides a rte_memcpy* call with temporal stores.
> > Use -Dcpu_instruction_set=znverX with build to enable this API.
> >
> > Signed-off-by: Aman Kumar
> > ---
> > config/x86/meson.build | 2 +
> > lib/eal/x86/include/rte_memcpy.h | 114 +++++++++++++++++++++++++++++++
>
> It looks better as C code.
> Do you achieve the same performance as the asm version?
>

In a few corner cases assembly performed better, but overall we have very
similar perf observations.

> > +#if defined RTE_MEMCPY_AMDEPYC
> [...]
> > +static __rte_always_inline void *
> > +rte_memcpy_aligned_tstore16_generic(void *dst, void *src, int len)
>
> So to be clear, an application will benefit from this optimization if
> 1/ DPDK is specifically compiled for AMD
> 2/ the application is compiled with above DPDK build (because of
> inlining)
>
> I guess there is no good way to benefit from the optimization
> without specific compilation, because of inlining constraint.
> Another design, with less constraint but less performance,
> would be to have a function pointer assigned at runtime based on the CPU.
>

You're right. We need to build DPDK and apps with this flag enabled to get
the benefit. In future versions, we will try to adapt in a more dynamic way.

Thanks.
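
For reference, the runtime alternative Thomas describes (a function pointer
assigned once based on the CPU) could look roughly like the sketch below. This
is only an illustration, not code from the patch: copy_fn_t, copy_generic,
copy_tuned, select_copy and dispatch_copy are hypothetical names, copy_tuned
is a plain memcpy stand-in for a CPU-specific routine, and the AVX2 check via
the GCC/Clang builtin is an assumed placeholder for whatever CPU test a real
implementation would use.

```c
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; none of them are DPDK APIs. */
typedef void *(*copy_fn_t)(void *dst, const void *src, size_t len);

/* Portable fallback used on any CPU. */
static void *copy_generic(void *dst, const void *src, size_t len)
{
    return memcpy(dst, src, len);
}

/*
 * Stand-in for a CPU-specific variant. A real one would contain the
 * specialised store sequence; here it only forwards to memcpy so the
 * sketch stays runnable everywhere.
 */
static void *copy_tuned(void *dst, const void *src, size_t len)
{
    return memcpy(dst, src, len);
}

/* Pick an implementation once, based on what the running CPU reports. */
static copy_fn_t select_copy(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    /* Assumption: AVX2 is used as a placeholder feature test. */
    if (__builtin_cpu_supports("avx2"))
        return copy_tuned;
#endif
    return copy_generic;
}

/* Resolved on first use; NULL until then. */
static copy_fn_t copy_impl;

/*
 * Every call goes through the pointer, so the copy cannot be inlined
 * into the caller. The lazy initialisation is not thread-safe; a real
 * library would resolve the pointer in an init routine instead.
 */
static void *dispatch_copy(void *dst, const void *src, size_t len)
{
    if (copy_impl == NULL)
        copy_impl = select_copy();
    return copy_impl(dst, src, len);
}
```

The indirect call is where the "less constraint but less performance" trade-off
comes from: one binary runs on any CPU, but the compiler can no longer inline
the copy the way it does with an __rte_always_inline function selected at
build time.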