From: David Marchand
Date: Thu, 15 Oct 2020 10:01:03 +0200
To: Omkar Maslekar
Cc: dev, Bruce Richardson, Ciara Loftus, David Christensen, Jerin Jacob Kollanukkaran, "Ruifeng Wang (Arm Technology China)", Honnappa Nagarahalli
Subject: Re: [dpdk-dev] [PATCH v7] eal: add cache-line demote support
In-Reply-To: <1602582191-23807-2-git-send-email-omkar.maslekar@intel.com>

Repeating my questions:
- would there be a point in hinting at where the "demoted" line goes?
- is this instruction available on all x86 CPUs?

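On the second question, support for the instruction is advertised through CPUID. The following is a minimal sketch, assuming GCC or Clang on x86, of how the CLDEMOTE feature flag (CPUID leaf 7, subleaf 0, ECX bit 25) could be read. The helper name cpu_has_cldemote is hypothetical and is not part of DPDK or of this patch.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang CPUID helpers */
#endif

/* Hypothetical helper, not a DPDK API: report whether the CPU
 * advertises CLDEMOTE. Always false on non-x86 builds. */
static bool cpu_has_cldemote(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

	/* __get_cpuid_count() returns 0 if leaf 7 is not available. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return false;

	/* CPUID.(EAX=07H, ECX=0):ECX bit 25 is the CLDEMOTE flag. */
	return ((ecx >> 25) & 1u) != 0;
#else
	return false;
#endif
}
```

As far as I understand Intel's instruction reference, the encoding sits in the hint-NOP space and is treated as a NOP on processors without the feature, so a check like this would be informational rather than a safety requirement. That is worth confirming against the SDM before relying on it.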
See comments:

On Tue, Oct 13, 2020 at 6:47 PM Omkar Maslekar wrote:
> diff --git a/app/test/test_prefetch.c b/app/test/test_prefetch.c
> index 41f219a..5c58d0c 100644
> --- a/app/test/test_prefetch.c
> +++ b/app/test/test_prefetch.c
> @@ -26,7 +26,11 @@
>         rte_prefetch1(&a);
>         rte_prefetch2(&a);
>
> +/* test for marking a line as shared to test cldemote functionality */

Non indented comment that gives no more info than the call itself.
Please remove.

> +       rte_cldemote(&a);
> +
>         return 0;
>  }
>
> +

Please remove this empty line.

>  REGISTER_TEST_COMMAND(prefetch_autotest, test_prefetch);
> diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
> index b7881f2..8a1ed01 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -171,6 +171,13 @@ New Features
>    * Extern objects and functions can be plugged into the pipeline.
>    * Transaction-oriented table updates.
>
> +* **Added new function rte_cldemote in rte_prefetch.h.**
> +
> +  Added a hardware hint CLDEMOTE, which is similar to prefetch in reverse.

This should come at the top of the features list (but after "write
combining store" entry that got in first).
Please add a mention that it only concerns x86.

> +  CLDEMOTE moves the cache line to the more remote cache, where it expects
> +  sharing to be efficient. Moving the cache line to a level more distant from
> +  the processor helps to accelerate core-to-core communication.
> +
>
>  Removed Items
>  -------------
> diff --git a/lib/librte_eal/arm/include/rte_prefetch_32.h b/lib/librte_eal/arm/include/rte_prefetch_32.h
> index e53420a..28b3d48 100644
> --- a/lib/librte_eal/arm/include/rte_prefetch_32.h
> +++ b/lib/librte_eal/arm/include/rte_prefetch_32.h
> @@ -10,6 +10,7 @@
>  #endif
>
>  #include
> +#include <rte_compat.h>

Move rte_compat.h inclusion from the arch headers to the
generic/rte_prefetch.h header only.

>  #include "generic/rte_prefetch.h"
>
>  static inline void rte_prefetch0(const volatile void *p)
> @@ -33,6 +34,12 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
>         rte_prefetch0(p);
>  }
>
> +__rte_experimental
> +static inline void rte_cldemote(const volatile void *p)
> +{
> +       RTE_SET_USED(p);
> +}
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/librte_eal/arm/include/rte_prefetch_64.h b/lib/librte_eal/arm/include/rte_prefetch_64.h
> index fc2b391..1c722eb 100644
> --- a/lib/librte_eal/arm/include/rte_prefetch_64.h
> +++ b/lib/librte_eal/arm/include/rte_prefetch_64.h
> @@ -10,6 +10,7 @@
>  #endif
>
>  #include
> +#include <rte_compat.h>
>  #include "generic/rte_prefetch.h"
>
>  static inline void rte_prefetch0(const volatile void *p)
> @@ -32,6 +33,12 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
>         asm volatile ("PRFM PLDL1STRM, [%0]" : : "r" (p));
>  }
>
> +__rte_experimental
> +static inline void rte_cldemote(const volatile void *p)
> +{
> +       RTE_SET_USED(p);
> +}
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/librte_eal/include/generic/rte_prefetch.h b/lib/librte_eal/include/generic/rte_prefetch.h
> index 6e47bdf..ad9844c 100644
> --- a/lib/librte_eal/include/generic/rte_prefetch.h
> +++ b/lib/librte_eal/include/generic/rte_prefetch.h
> @@ -51,4 +51,19 @@
>   */
>  static inline void rte_prefetch_non_temporal(const volatile void *p);
>
> +/**
> + * Demote a cache line to a more distant level of cache from the processor.
> + *
> + * CLDEMOTE hints to hardware to move (demote) a cache line from the closest to
> + * the processor to a level more distant from the processor. It is a hint and
> + * not guarantee. rte_cldemote is intended to move the cache line to the more

guaranteed*

> + * remote cache, where it expects sharing to be efficient and to indicate that a
> + * line may be accessed by a different core in the future.
> + *
> + * @param p
> + *  Address to demote
> + */
> +__rte_experimental
> +static inline void rte_cldemote(const volatile void *p);
> +
>  #endif /* _RTE_PREFETCH_H_ */
> diff --git a/lib/librte_eal/ppc/include/rte_prefetch.h b/lib/librte_eal/ppc/include/rte_prefetch.h
> index 9ba07c8..b55cac4 100644
> --- a/lib/librte_eal/ppc/include/rte_prefetch.h
> +++ b/lib/librte_eal/ppc/include/rte_prefetch.h
> @@ -11,6 +11,7 @@
>  #endif
>
>  #include
> +#include <rte_compat.h>
>  #include "generic/rte_prefetch.h"
>
>  static inline void rte_prefetch0(const volatile void *p)
> @@ -34,6 +35,12 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
>         rte_prefetch0(p);
>  }
>
> +__rte_experimental
> +static inline void rte_cldemote(const volatile void *p)
> +{
> +       RTE_SET_USED(p);
> +}
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/librte_eal/x86/include/rte_prefetch.h b/lib/librte_eal/x86/include/rte_prefetch.h
> index 384c6b3..92ba05a 100644
> --- a/lib/librte_eal/x86/include/rte_prefetch.h
> +++ b/lib/librte_eal/x86/include/rte_prefetch.h
> @@ -10,6 +10,7 @@
>  #endif
>
>  #include
> +#include <rte_compat.h>
>  #include "generic/rte_prefetch.h"
>
>  static inline void rte_prefetch0(const volatile void *p)
> @@ -32,6 +33,16 @@ static inline void rte_prefetch_non_temporal(const volatile void *p)
>         asm volatile ("prefetchnta %[p]" : : [p] "m" (*(const volatile char *)p));
>  }
>
> +/*
> + * we're using raw byte codes for now as only the newest compiler

We use

> + * versions support this instruction natively.
> + */
> +__rte_experimental
> +static inline void rte_cldemote(const volatile void *p)
> +{
> +       asm volatile(".byte 0x0f, 0x1c, 0x06" :: "S" (p));
> +}
> +
>  #ifdef __cplusplus
>  }
>  #endif
> --
> 1.8.3.1
>

-- 
David Marchand
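
For reference, the raw-byte approach in the x86 hunk can be tried outside DPDK. The following is a minimal sketch, assuming GCC or Clang. The names demo_cldemote and publish_value are hypothetical stand-ins, not DPDK APIs: the first emits the same byte sequence as the patch on x86 and compiles to a no-op elsewhere (mirroring the per-arch stubs), the second shows the intended producer-side use.

```c
#include <stdint.h>

/* Hypothetical stand-in for rte_cldemote(); not the DPDK implementation. */
static inline void demo_cldemote(const volatile void *p)
{
#if defined(__x86_64__) || defined(__i386__)
	/*
	 * 0x0f 0x1c /0 encodes CLDEMOTE. ModRM byte 0x06 selects the
	 * memory operand [rsi] (or [esi] on 32-bit), which is why the
	 * "S" constraint is used to place the address in that register.
	 */
	__asm__ volatile(".byte 0x0f, 0x1c, 0x06" :: "S" (p));
#else
	(void)p; /* assumption: no equivalent hint on other architectures */
#endif
}

/* Hypothetical producer: write a slot, then hint that another core
 * is likely to read it next. The hint never changes the stored data. */
static inline void publish_value(volatile uint64_t *slot, uint64_t value)
{
	*slot = value;
	demo_cldemote(slot);
}
```

Since the instruction is only a hint, the sketch adds no ordering guarantees. Any synchronization between producer and consumer still has to come from the usual barriers or atomics.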