DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
To: Thomas Monjalon <thomas@monjalon.net>,
	Aman Kumar <aman.kumar@vvdntech.in>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"viacheslavo@nvidia.com" <viacheslavo@nvidia.com>,
	"Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"keesang.song@amd.com" <keesang.song@amd.com>,
	"aman.kumar@vvdntech.in" <aman.kumar@vvdntech.in>,
	"jerinjacobk@gmail.com" <jerinjacobk@gmail.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"honnappa.nagarahalli@arm.com" <honnappa.nagarahalli@arm.com>,
	Ruifeng Wang <ruifeng.wang@arm.com>,
	"David Christensen" <drc@linux.vnet.ibm.com>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>,
	"stephen@networkplumber.org" <stephen@networkplumber.org>
Subject: Re: [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy support for AMD platform
Date: Wed, 27 Oct 2021 11:03:09 +0000	[thread overview]
Message-ID: <BN0PR11MB5712FE65371D05AE347CCD24D7859@BN0PR11MB5712.namprd11.prod.outlook.com> (raw)
In-Reply-To: <1932804.9rrtejxFVQ@thomas>

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Thomas Monjalon
> Sent: Wednesday, October 27, 2021 9:13 AM
> To: Aman Kumar <aman.kumar@vvdntech.in>
> Cc: dev@dpdk.org; viacheslavo@nvidia.com; Burakov, Anatoly
> <anatoly.burakov@intel.com>; keesang.song@amd.com;
> aman.kumar@vvdntech.in; jerinjacobk@gmail.com; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; honnappa.nagarahalli@arm.com; Ruifeng Wang
> <ruifeng.wang@arm.com>; David Christensen <drc@linux.vnet.ibm.com>;
> david.marchand@redhat.com; stephen@networkplumber.org
> Subject: Re: [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy
> support for AMD platform
> 
> 27/10/2021 09:28, Aman Kumar:
> > This patch provides a rte_memcpy* call with temporal stores.
> > Use -Dcpu_instruction_set=znverX with build to enable this API.
> >
> > Signed-off-by: Aman Kumar <aman.kumar@vvdntech.in>
> 
> For the series, Acked-by: Thomas Monjalon <thomas@monjalon.net>
> With the hope that such optimization will go in libc in a near future.
> 
> If there is no objection, I will merge this AMD-specific series in 21.11-rc2.
> It should not affect other platforms.

Hi Folks,

This patchset was brought to my attention, and I have a few concerns.
I'll add short snippets of context from the patch here so I can refer to it below;

+/**
+ * Copy 16 bytes from one location to another,
+ * with temporal stores
+ */
+static __rte_always_inline void
+rte_copy16_ts(uint8_t *dst, uint8_t *src)
+{
+	__m128i var128;
+
+	var128 = _mm_stream_load_si128((__m128i *)src);
+	_mm_storeu_si128((__m128i *)dst, var128);
+}

1) What is fundamentally specific to the znverX CPU? Is there any reason this can not just be enabled for x86-64 generic with SSE4.1 ISA requirements?
_mm_stream_load_si128() is part of SSE4.1
_mm_storeu_si128() is SSE2. 
Using the intrinsics guide for lookup of intrinsics to ISA level: https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html?wapkw=intrinsics%20guide#text=_mm_stream_load&ig_expand=6884

2) Are -D options allowed to change/break API/ABI?
By allowing -Dcpu_instruction_set= to change available functions, any application using it is no longer source-code (API) compatible with "DPDK" proper.
This patch essentially splits a "DPDK" app to depend on "DPDK + CPU version -D flag", in an incompatible way (no fallback?).

3) The stream load instruction used here *requires* 16-byte alignment for its operand.
This is not documented, and worse, a uint8_t* is accepted, which is cast to (__m128i *).
This cast hides the compiler warning for expanding type-alignments.
And the code itself is broken - passing a "src" parameter that is not 16-byte aligned will segfault.

4) Temporal and Non-temporal are not logically presented here.
Temporal loads/stores are normal loads/stores. They use the L1/L2 caches.
Non-temporal loads/stores indicate that the data will *not* be used again in a short space of time.
Non-temporal means "having no relation to time" according to my internet search.

5) The *store* here uses a normal store (temporal, targets cache). The *load* however is a streaming (non-temporal, no cache) load.
It is not clearly documented that A) stream load will be used.
The inverse is documented "copy with ts" aka, copy with temporal store.
Is documenting the store as temporal meant to imply that the load is non-temporal?

6) What is the use-case for this? When would a user *want* to use this instead of rte_memcpy()?
If the data being loaded is relevant to datapath/packets, presumably other packets might require the
loaded data, so temporal (normal) loads should be used to cache the source data?

7) Why is streaming (non-temporal) loads & stores not used? I guess maybe this is regarding the use-case,
but its not clear to me right now why loads are NT, and stores are T.

All in all, I do not think merging this patch is a good idea. I would like to understand the motivation for adding
this type of function, and then see it being done in a way that is clearly documented regarding temporal loads/stores,
and not changing/adding APIs for specific CPUs.

So apologies for late feedback, but this is not of high enough quality to be merged to DPDK right now, NACK.


  reply	other threads:[~2021-10-27 11:03 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-23  8:44 [dpdk-dev] [PATCH 1/2] lib/eal: add amd epyc2 memcpy routine to eal Aman Kumar
2021-08-23  8:44 ` [dpdk-dev] [PATCH 2/2] net/mlx5: optimize mprq memcpy for AMD EPYC2 platforms Aman Kumar
2021-10-13 16:53   ` Thomas Monjalon
2021-10-19 10:52     ` Aman Kumar
2021-08-23 15:21 ` [dpdk-dev] [PATCH 1/2] lib/eal: add amd epyc2 memcpy routine to eal Jerin Jacob
2021-08-30  9:39   ` Aman Kumar
2021-10-19 10:47 ` [dpdk-dev] [PATCH v2 " Aman Kumar
2021-10-19 10:47   ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: optimize mprq memcpy for AMD EPYC2 plaform Aman Kumar
2021-10-19 12:31   ` [dpdk-dev] [PATCH v2 1/2] lib/eal: add amd epyc2 memcpy routine to eal Thomas Monjalon
2021-10-19 15:35     ` Stephen Hemminger
2021-10-21 17:10     ` Song, Keesang
2021-10-21 17:40       ` Ananyev, Konstantin
2021-10-21 18:12         ` Song, Keesang
2021-10-21 18:41           ` Thomas Monjalon
2021-10-21 19:03             ` Song, Keesang
2021-10-21 19:50               ` Thomas Monjalon
2021-10-21 20:14   ` Thomas Monjalon
2021-10-22  8:45     ` Bruce Richardson
2021-10-26 15:56   ` [dpdk-dev] [PATCH v3 1/3] config/x86: add support for AMD platform Aman Kumar
2021-10-26 15:56     ` [dpdk-dev] [PATCH v3 2/3] doc/guides: add dpdk build instruction for AMD platforms Aman Kumar
2021-10-26 16:07       ` Thomas Monjalon
2021-10-27  6:30         ` Aman Kumar
2021-10-26 15:56     ` [dpdk-dev] [PATCH v3 3/3] lib/eal: add temporal store memcpy support on AMD platform Aman Kumar
2021-10-26 16:14       ` Thomas Monjalon
2021-10-27  6:34         ` Aman Kumar
2021-10-27  7:59           ` Thomas Monjalon
2021-10-26 21:10       ` Stephen Hemminger
2021-10-27  6:43         ` Aman Kumar
2021-10-26 16:01     ` [dpdk-dev] [PATCH v3 1/3] config/x86: add support for " Thomas Monjalon
2021-10-27  6:26       ` Aman Kumar
2021-10-27  7:28     ` [dpdk-dev] [PATCH v4 1/2] " Aman Kumar
2021-10-27  7:28       ` [dpdk-dev] [PATCH v4 2/2] lib/eal: add temporal store memcpy " Aman Kumar
2021-10-27  8:13         ` Thomas Monjalon
2021-10-27 11:03           ` Van Haaren, Harry [this message]
2021-10-27 11:41             ` Mattias Rönnblom
2021-10-27 12:15               ` Van Haaren, Harry
2021-10-27 12:22                 ` Ananyev, Konstantin
2021-10-27 13:34                   ` Aman Kumar
2021-10-27 14:10                     ` Van Haaren, Harry
2021-10-27 14:31                       ` Thomas Monjalon
2021-10-29 16:01                         ` Song, Keesang
2021-10-27 14:26                     ` Ananyev, Konstantin
2021-10-27 11:33         ` Mattias Rönnblom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BN0PR11MB5712FE65371D05AE347CCD24D7859@BN0PR11MB5712.namprd11.prod.outlook.com \
    --to=harry.van.haaren@intel.com \
    --cc=aman.kumar@vvdntech.in \
    --cc=anatoly.burakov@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=drc@linux.vnet.ibm.com \
    --cc=honnappa.nagarahalli@arm.com \
    --cc=jerinjacobk@gmail.com \
    --cc=keesang.song@amd.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=ruifeng.wang@arm.com \
    --cc=stephen@networkplumber.org \
    --cc=thomas@monjalon.net \
    --cc=viacheslavo@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).