From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id E9EBBA04B5;
	Thu, 10 Sep 2020 10:55:26 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 747691BEB3;
	Thu, 10 Sep 2020 10:55:26 +0200 (CEST)
Received: from mga06.intel.com (mga06.intel.com [134.134.136.31])
 by dpdk.org (Postfix) with ESMTP id 71B992BA8
 for <dev@dpdk.org>; Thu, 10 Sep 2020 10:55:24 +0200 (CEST)
IronPort-SDR: NCBd20WIdc7+9TpKe1KXGZM5oRtdosMzXpYR2OgmSjtx31MYRpVfL5nWMymp1EwZ/79w7GcF8U
 Qrs6b4umLXUw==
X-IronPort-AV: E=McAfee;i="6000,8403,9739"; a="220053081"
X-IronPort-AV: E=Sophos;i="5.76,412,1592895600"; d="scan'208";a="220053081"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga005.jf.intel.com ([10.7.209.41])
 by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 10 Sep 2020 01:55:23 -0700
IronPort-SDR: 5+sfoI0xLQ6zjX1mAWFBNZNJn6GW2PF/eZfe86LpTc8ifjxm2G8i+41+KyUogGtN4W0yUC8GOO
 bPob+wjL2NqA==
X-IronPort-AV: E=Sophos;i="5.76,412,1592895600"; d="scan'208";a="480810483"
Received: from bricha3-mobl.ger.corp.intel.com ([10.252.5.251])
 by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-SHA;
 10 Sep 2020 01:55:22 -0700
Date: Thu, 10 Sep 2020 09:55:19 +0100
From: Bruce Richardson <bruce.richardson@intel.com>
To: Omkar Maslekar <omkar.maslekar@intel.com>
Cc: dev@dpdk.org, ciara.loftus@intel.com
Message-ID: <20200910085519.GB1789@bricha3-MOBL.ger.corp.intel.com>
References: <1599700614-22809-1-git-send-email-omkar.maslekar@intel.com>
 <1599700614-22809-2-git-send-email-omkar.maslekar@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1599700614-22809-2-git-send-email-omkar.maslekar@intel.com>
Subject: Re: [dpdk-dev] [PATCH] EAL: An addition of cache line demote
 (CLDEMOTE) in rte_prefetch.h
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

On Wed, Sep 09, 2020 at 06:16:54PM -0700, Omkar Maslekar wrote:
> rte_cldemote is similar to a prefetch hint - in reverse. cldemote(addr)
> enables software to hint to hardware that line is likely to be shared.
> Useful in core-to-core communications where cache-line is likely to be
> shared. ARM and PPC implementation is provided with NOP and can be added
> if any equivalent instructions could be used for implementation on those
> architectures.
> 
> Signed-off-by: Omkar Maslekar <omkar.maslekar@intel.com>
> ---

Hi Omkar,

please see some review comments inline below.

Regards,
/Bruce

>  doc/guides/rel_notes/release_20_11.rst        | 26 ++++----------------------
>  lib/librte_eal/arm/include/rte_prefetch_32.h  |  5 +++++
>  lib/librte_eal/arm/include/rte_prefetch_64.h  |  5 +++++
>  lib/librte_eal/include/generic/rte_prefetch.h |  7 +++++++
>  lib/librte_eal/ppc/include/rte_prefetch.h     |  5 +++++
>  lib/librte_eal/x86/include/rte_prefetch.h     |  9 +++++++++
>  6 files changed, 35 insertions(+), 22 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
> index df227a1..c4a4362 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -27,29 +27,11 @@ New Features
>  .. This section should contain new features added in this release.
>     Sample format:
>  
> -   * **Add a title in the past tense with a full stop.**
> +Added new instruction CLDEMOTE in rte_prefetch.h.

You need to prefix this with the library it is in, in this case EAL. Also,
since this is C code, you are adding a function, not an instruction.

>  
> -     Add a short 1-2 sentence description in the past tense.
> -     The description should be enough to allow someone scanning
> -     the release notes to understand the new feature.
> -
> -     If the feature adds a lot of sub-features you can use a bullet list
> -     like this:
> -
> -     * Added feature foo to do something.
> -     * Enhanced feature bar to do something else.
> -
> -     Refer to the previous release notes for examples.
> -
> -     Suggested order in release notes items:
> -     * Core libs (EAL, mempool, ring, mbuf, buses)
> -     * Device abstraction libs and PMDs
> -       - ethdev (lib, PMDs)
> -       - cryptodev (lib, PMDs)
> -       - eventdev (lib, PMDs)
> -       - etc
> -     * Other libs
> -     * Apps, Examples, Tools (if significant)

Don't remove these lines, they are all also part of the same comment as
below where it says "Do not overwrite or remove it" :-)

> +     Added a hardware hint CLDEMOTE which is similar to prefetch in reverse.
> +     CLDEMOTES moves the cache line to the last shared cache, where it expects
> +     sharing to be efficient.
>  

Reading the instruction description in the Intel instruction set reference,
it says about moving the cache line to a more remote cache-line, rather
than guaranteeing that it goes to the last level cache. Therefore, to
ensure compatiblity with the current spec and make it more flexible to meet
any other hardware implementations, I suggest changing the "last shared
cache ..." to "more remote cache where sharing may be more efficient".

>       This section is a comment. Do not overwrite or remove it.
>       Also, make sure to start the actual text at the margin.
> diff --git a/lib/librte_eal/arm/include/rte_prefetch_32.h b/lib/librte_eal/arm/include/rte_prefetch_32.h
> index e53420a..ad91edd 100644
> --- a/lib/librte_eal/arm/include/rte_prefetch_32.h
> +++ b/lib/librte_eal/arm/include/rte_prefetch_32.h
> @@ -33,6 +33,11 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
>  	rte_prefetch0(p);
>  }
>  
> +static inline void rte_cldemote(const volatile void *p)
> +{
> +	RTE_SET_USED(p);
> +}
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/librte_eal/arm/include/rte_prefetch_64.h b/lib/librte_eal/arm/include/rte_prefetch_64.h
> index fc2b391..35d278a 100644
> --- a/lib/librte_eal/arm/include/rte_prefetch_64.h
> +++ b/lib/librte_eal/arm/include/rte_prefetch_64.h
> @@ -32,6 +32,11 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
>  	asm volatile ("PRFM PLDL1STRM, [%0]" : : "r" (p));
>  }
>  
> +static inline void rte_cldemote(const volatile void *p)
> +{
> +	RTE_SET_USED(p);
> +}
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/librte_eal/include/generic/rte_prefetch.h b/lib/librte_eal/include/generic/rte_prefetch.h
> index 6e47bdf..89ec69c 100644
> --- a/lib/librte_eal/include/generic/rte_prefetch.h
> +++ b/lib/librte_eal/include/generic/rte_prefetch.h
> @@ -51,4 +51,11 @@
>   */
>  static inline void rte_prefetch_non_temporal(const volatile void *p);
>  
> +/**
> + * Demote a cache line into the last shared cache level.

Same comment as above. Since this will make it into the official API
doxygen documentation, I think a bit fuller of a description would be good
also.