DPDK patches and discussions
 help / color / mirror / Atom feed
From: Thomas Monjalon <thomas@monjalon.net>
To: Aman Kumar <aman.kumar@vvdntech.in>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Van Haaren, Harry" <harry.van.haaren@intel.com>
Cc: "mattias. ronnblom" <mattias.ronnblom@ericsson.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"viacheslavo@nvidia.com" <viacheslavo@nvidia.com>,
	"Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"Song, Keesang" <Keesang.Song@amd.com>,
	"jerinjacobk@gmail.com" <jerinjacobk@gmail.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"honnappa.nagarahalli@arm.com" <honnappa.nagarahalli@arm.com>,
	Ruifeng Wang <ruifeng.wang@arm.com>,
	David Christensen <drc@linux.vnet.ibm.com>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>,
	"stephen@networkplumber.org" <stephen@networkplumber.org>
Subject: Re: [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy support for AMD platform
Date: Wed, 27 Oct 2021 16:31:25 +0200	[thread overview]
Message-ID: <2811952.0RjXIbInLB@thomas> (raw)
In-Reply-To: <BN0PR11MB5712D149C3399BB80CFB2F83D7859@BN0PR11MB5712.namprd11.prod.outlook.com>

27/10/2021 16:10, Van Haaren, Harry:
> From: Aman Kumar <aman.kumar@vvdntech.in> 
> On Wed, Oct 27, 2021 at 5:53 PM Ananyev, Konstantin <mailto:konstantin.ananyev@intel.com> wrote
> > 
> > Hi Mattias,
> > 
> > > > 6) What is the use-case for this? When would a user *want* to use this instead
> > > of rte_memcpy()?
> > > > If the data being loaded is relevant to datapath/packets, presumably other
> > > packets might require the
> > > > loaded data, so temporal (normal) loads should be used to cache the source
> > > data?
> > >
> > >
> > > I'm not sure if your first question is rhetorical or not, but a memcpy()
> > > in a NT variant is certainly useful. One use case for a memcpy() with
> > > temporal loads and non-temporal stores is if you need to archive packet
> > > payload for (distant, potential) future use, and want to avoid causing
> > > unnecessary LLC evictions while doing so.
> > 
> > Yes I agree that there are certainly benefits in using cache-locality hints.
> > There is an open question around if the src or dst or both are non-temporal.
> > 
> > In the implementation of this patch, the NT/T type of store is reversed from your use-case:
> > 1) Loads are NT (so loaded data is not cached for future packets)
> > 2) Stores are T (so copied/dst data is now resident in L1/L2)
> > 
> > In theory there might even be valid uses for this type of memcpy where loaded
> > data is not needed again soon and stored data is referenced again soon,
> > although I cannot think of any here while typing this mail..
> > 
> > I think some use-case examples, and clear documentation on when/how to choose
> > between rte_memcpy() or any (potential future) rte_memcpy_nt() variants is required
> > to progress this patch.
> > 
> > Assuming a strong use-case exists, and it can be clearly indicators to users of DPDK APIs which
> > rte_memcpy() to use, we can look at technical details around enabling the implementation.
> > 
> 
> [Konstantin wrote]:
> +1 here.
> Function behaviour and restrictions (src parameter needs to be 16/32 B aligned, etc.),
> along with expected usage scenarios have to be documented properly.
> Again, as Harry pointed out, I don't see any AMD specific instructions in this function,
> so presumably such function can go into __AVX2__ code block and no new defines will
> be required. 
> 
> 
> [Aman wrote]:
> Agreed that APIs are generic but we've kept under an AMD flag for a simple reason that it is NOT tested on any other platform.
> A use-case on how to use this was planned earlier for mlx5 pmd but dropped in this version of patch as the data path of mlx5 is going to be refactored soon and may not be useful for future versions of mlx5 (>22.02). 
> Ref link: https://patchwork.dpdk.org/project/dpdk/patch/20211019104724.19416-2-aman.kumar@vvdntech.in/(we've plan to adapt this into future version)
> The patch in the link basically enhances mlx5 mprq implementation for our specific use-case and with 128B packet size, we achieve ~60% better perf. We understand the use of this copy function should be documented which we shall plan along with few other platform specific optimizations in future versions of DPDK. As this does not conflict with other platforms, can we still keep under AMD flag for now as suggested by Thomas?

I said I could merge if there is no objection.
I've overlooked that it's adding completely new functions in the API.
And the comments go in the direction of what I asked in previous version:
what is specific to AMD here?
Now seeing the valid objections, I agree it should be reworked.
We must provide API to applications which is generic, stable and well documented.


> [HvH wrote]:
> As an open-source community, any contributions should aim to improve the whole.
> In the past, numerous improvements have been merged to DPDK that improve performance.
> Sometimes these are architecture specific (x86/arm/ppc) sometimes the are ISA specific (SSE, AVX512, NEON).
> 
> I am not familiar with any cases in DPDK, where there is a #ifdef based on a *specific platform*.
> A quick "grep" through the "dpdk/lib" directory does not show any place where PMD or generic code
> has been explicitly optimized for a *specific platform*.
> 
> Obviously, in cases where ISA either exists or does not exist, yes there is an optimization to enable it.
> But this is not exposed as a top-level compile-time option, it uses runtime CPU ISA detection.
> 
> Please take a step back from the code, and look at what this patch asks of DPDK:
> "Please accept & maintain these changes upstream, which benefit only platform X, even though these ISA features are also available on other platforms".
> 
> Other patches that enhance performance of DPDK ask this:
> "Please accept & maintain these changes upstream, which benefit all platforms which have ISA capability X".
> 
> 
> === Question "As this does not conflict with other platforms, can we still keep under AMD flag for now"?
> I feel the contribution is too specific to a platform. Make it generic by enabling it at an ISA capability level.
> 
> Please yes, contribute to the DPDK community by improving performance of a PMD by enabling/leveraging ISA.
> But do so in a way that does not benefit only a specific platform - do so in a way that enhances all of DPDK, as
> other patches have done for the DPDK that this patch is built on.
> 
> If you have concerns that the PMD maintainers will not accept the changes due to potential regressions on
> other platforms, then discuss those, make a plan on how to performance validate, and work to a solution.
> 
> 
> === Regarding specifically the request for "can we still keep under AMD flag for now"?
> I do not believe we should introduce APIs for specific platforms. DPDK's EAL is an abstraction layer.
> The value of EAL is to provide a common abstraction. This platform-specific flag breaks the abstraction,
> and results in packaging issues, as well as API/ABI instability based on -Dcpu_instruction_set choice.
> So, no, we should not introduce APIs based on any compile-time flag.

I agree



  reply	other threads:[~2021-10-27 14:31 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-23  8:44 [dpdk-dev] [PATCH 1/2] lib/eal: add amd epyc2 memcpy routine to eal Aman Kumar
2021-08-23  8:44 ` [dpdk-dev] [PATCH 2/2] net/mlx5: optimize mprq memcpy for AMD EPYC2 platforms Aman Kumar
2021-10-13 16:53   ` Thomas Monjalon
2021-10-19 10:52     ` Aman Kumar
2021-08-23 15:21 ` [dpdk-dev] [PATCH 1/2] lib/eal: add amd epyc2 memcpy routine to eal Jerin Jacob
2021-08-30  9:39   ` Aman Kumar
2021-10-19 10:47 ` [dpdk-dev] [PATCH v2 " Aman Kumar
2021-10-19 10:47   ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: optimize mprq memcpy for AMD EPYC2 plaform Aman Kumar
2021-10-19 12:31   ` [dpdk-dev] [PATCH v2 1/2] lib/eal: add amd epyc2 memcpy routine to eal Thomas Monjalon
2021-10-19 15:35     ` Stephen Hemminger
2021-10-21 17:10     ` Song, Keesang
2021-10-21 17:40       ` Ananyev, Konstantin
2021-10-21 18:12         ` Song, Keesang
2021-10-21 18:41           ` Thomas Monjalon
2021-10-21 19:03             ` Song, Keesang
2021-10-21 19:50               ` Thomas Monjalon
2021-10-21 20:14   ` Thomas Monjalon
2021-10-22  8:45     ` Bruce Richardson
2021-10-26 15:56   ` [dpdk-dev] [PATCH v3 1/3] config/x86: add support for AMD platform Aman Kumar
2021-10-26 15:56     ` [dpdk-dev] [PATCH v3 2/3] doc/guides: add dpdk build instruction for AMD platforms Aman Kumar
2021-10-26 16:07       ` Thomas Monjalon
2021-10-27  6:30         ` Aman Kumar
2021-10-26 15:56     ` [dpdk-dev] [PATCH v3 3/3] lib/eal: add temporal store memcpy support on AMD platform Aman Kumar
2021-10-26 16:14       ` Thomas Monjalon
2021-10-27  6:34         ` Aman Kumar
2021-10-27  7:59           ` Thomas Monjalon
2021-10-26 21:10       ` Stephen Hemminger
2021-10-27  6:43         ` Aman Kumar
2021-10-26 16:01     ` [dpdk-dev] [PATCH v3 1/3] config/x86: add support for " Thomas Monjalon
2021-10-27  6:26       ` Aman Kumar
2021-10-27  7:28     ` [dpdk-dev] [PATCH v4 1/2] " Aman Kumar
2021-10-27  7:28       ` [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy " Aman Kumar
2021-10-27  8:13         ` Thomas Monjalon
2021-10-27 11:03           ` Van Haaren, Harry
2021-10-27 11:41             ` Mattias Rönnblom
2021-10-27 12:15               ` Van Haaren, Harry
2021-10-27 12:22                 ` Ananyev, Konstantin
2021-10-27 13:34                   ` Aman Kumar
2021-10-27 14:10                     ` Van Haaren, Harry
2021-10-27 14:31                       ` Thomas Monjalon [this message]
2021-10-29 16:01                         ` Song, Keesang
2021-10-27 14:26                     ` Ananyev, Konstantin
2021-10-27 11:33         ` Mattias Rönnblom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2811952.0RjXIbInLB@thomas \
    --to=thomas@monjalon.net \
    --cc=Keesang.Song@amd.com \
    --cc=aman.kumar@vvdntech.in \
    --cc=anatoly.burakov@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=drc@linux.vnet.ibm.com \
    --cc=harry.van.haaren@intel.com \
    --cc=honnappa.nagarahalli@arm.com \
    --cc=jerinjacobk@gmail.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=mattias.ronnblom@ericsson.com \
    --cc=ruifeng.wang@arm.com \
    --cc=stephen@networkplumber.org \
    --cc=viacheslavo@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).