Date: Thu, 29 Jul 2021 09:58:14 +0200
From: Olivier Matz
To: Joyce Kong
Cc: "thomas@monjalon.net", "david.marchand@redhat.com",
 "roretzla@linux.microsoft.com", "stephen@networkplumber.org",
 "andrew.rybchenko@oktetlabs.ru", "harry.van.haaren@intel.com",
 Honnappa Nagarahalli, Ruifeng Wang, "dev@dpdk.org", nd
Subject: Re: [dpdk-dev] [PATCH v3 4/8] test/mcslock: use compiler atomics for lcores sync

On Thu, Jul 29, 2021 at 07:19:13AM +0000, Joyce Kong wrote:
> Hi Olivier,
>
> > -----Original Message-----
> > From: Olivier Matz
> > Sent: Wednesday, July 28, 2021 5:57 PM
> > To: Joyce Kong
> > Cc: thomas@monjalon.net; david.marchand@redhat.com;
> > roretzla@linux.microsoft.com; stephen@networkplumber.org;
> > andrew.rybchenko@oktetlabs.ru; harry.van.haaren@intel.com;
> > Honnappa Nagarahalli; Ruifeng Wang; dev@dpdk.org; nd
> > Subject: Re: [PATCH v3 4/8] test/mcslock: use compiler atomics for
> > lcores sync
> >
> > Hi Joyce,
> >
> > On Mon, Jul 19, 2021 at 10:51:21PM -0500, Joyce Kong wrote:
> > > Convert rte_atomic usages to compiler atomic built-ins for lcores sync
> > > in mcslock testcases.
> > >
> > > Signed-off-by: Joyce Kong
> > > Reviewed-by: Ruifeng Wang
> > > Acked-by: Stephen Hemminger
> > > ---
> > >  app/test/test_mcslock.c | 14 ++++++--------
> > >  1 file changed, 6 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/app/test/test_mcslock.c b/app/test/test_mcslock.c
> > > index 80eaecc90a..52e45e7e2a 100644
> > > --- a/app/test/test_mcslock.c
> > > +++ b/app/test/test_mcslock.c
> > > @@ -17,7 +17,6 @@
> > >  #include
> > >  #include
> > >  #include
> > > -#include
> > >
> > >  #include "test.h"
> > >
> > > @@ -43,7 +42,7 @@ rte_mcslock_t *p_ml_perf;
> > >
> > >  static unsigned int count;
> > >
> > > -static rte_atomic32_t synchro;
> > > +static uint32_t synchro;
> > >
> > >  static int
> > >  test_mcslock_per_core(__rte_unused void *arg)
> > > @@ -76,8 +75,7 @@ load_loop_fn(void *func_param)
> > >  	rte_mcslock_t ml_perf_me;
> > >
> > >  	/* wait synchro */
> > > -	while (rte_atomic32_read(&synchro) == 0)
> > > -		;
> > > +	rte_wait_until_equal_32(&synchro, 1, __ATOMIC_RELAXED);
> > >
> > >  	begin = rte_get_timer_cycles();
> > >  	while (lcount < MAX_LOOP) {
> > > @@ -102,15 +100,15 @@ test_mcslock_perf(void)
> > >  	const unsigned int lcore = rte_lcore_id();
> > >
> > >  	printf("\nTest with no lock on single core...\n");
> > > -	rte_atomic32_set(&synchro, 1);
> > > +	__atomic_store_n(&synchro, 1, __ATOMIC_RELAXED);
> > >  	load_loop_fn(&lock);
> > >  	printf("Core [%u] Cost Time = %"PRIu64" us\n",
> > >  			lcore, time_count[lcore]);
> > >  	memset(time_count, 0, sizeof(time_count));
> > >
> > >  	printf("\nTest with lock on single core...\n");
> > > +	__atomic_store_n(&synchro, 1, __ATOMIC_RELAXED);
> > >  	lock = 1;
> > > -	rte_atomic32_set(&synchro, 1);
> >
> > nit: is there a reason for moving this line?
>
> I meant to use __atomic_store_n() instead of rte_atomic32_set() to set synchro,
> but put the operation on the line above 'lock = 1' by mistake; will change it.
>
> > >  	load_loop_fn(&lock);
> > >  	printf("Core [%u] Cost Time = %"PRIu64" us\n",
> > >  			lcore, time_count[lcore]);
> > > @@ -118,11 +116,11 @@ test_mcslock_perf(void)
> > >
> > >  	printf("\nTest with lock on %u cores...\n", (rte_lcore_count()));
> > >
> > > -	rte_atomic32_set(&synchro, 0);
> > > +	__atomic_store_n(&synchro, 0, __ATOMIC_RELAXED);
> > >  	rte_eal_mp_remote_launch(load_loop_fn, &lock, SKIP_MAIN);
> > >
> > >  	/* start synchro and launch test on main */
> > > -	rte_atomic32_set(&synchro, 1);
> > > +	__atomic_store_n(&synchro, 1, __ATOMIC_RELAXED);
> > >  	load_loop_fn(&lock);
> >
> > I have a more general question. Please forgive my ignorance about the
> > C++11 atomic builtins and memory model. Both the gcc manual and the C11
> > standard are not that easy to understand :)
> >
> > In all the patches of this patchset, __ATOMIC_RELAXED is used. My
> > understanding is that it does not add any inter-thread ordering
> > constraint. I suppose that in this particular case, we rely on the call
> > to rte_eal_mp_remote_launch() being a compiler barrier, and the function
> > itself to be a memory barrier.
> > This ensures that worker threads see synchro=0 until it is set to 1 by
> > the master. Is it correct?
>
> Yes, you are right. __ATOMIC_RELAXED would introduce no barrier, and the
> worker threads would sync with the master thread through 'synchro'.
>
> > What is the reason for using the atomic API here? Wouldn't a standard
> > assignment work too? (I mean "synchro = 1;")
>
> Here, __atomic_store_n(__ATOMIC_RELAXED) is used to ensure worker threads
> see 'synchro=1' after it is changed by the master. A standard assignment
> cannot guarantee that worker threads get the new value.

So, if I understand correctly, using __atomic_store() acts as if the
variable were volatile, and this is indeed needed to ensure visibility
from the other worker threads. I did some tests to convince myself:
https://godbolt.org/z/3qWYeneGf

Thank you for the clarification.

> > >  	rte_eal_mp_wait_lcore();
> > > --
> > > 2.17.1
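
For anyone following along, here is a minimal standalone sketch of the
visibility point discussed above. It is a hypothetical example built on the
GCC/clang __atomic built-ins; the function names are illustrative and not
part of the DPDK test code.

/* Build with e.g. gcc -O2 and compare the generated code. */
#include <stdint.h>

static uint32_t synchro;

/*
 * Plain load: the compiler may assume 'synchro' is not modified by another
 * thread and hoist the load out of the loop, so this can keep spinning even
 * after another thread stores 1.
 */
void poll_plain(void)
{
	while (synchro == 0)
		;
}

/*
 * Relaxed atomic load: every iteration performs a real load from memory, so
 * the store below eventually becomes visible, even though __ATOMIC_RELAXED
 * imposes no ordering on surrounding accesses.
 */
void poll_relaxed(void)
{
	while (__atomic_load_n(&synchro, __ATOMIC_RELAXED) == 0)
		;
}

/* Writer side, mirroring how the test's main lcore releases the workers. */
void release_workers(void)
{
	__atomic_store_n(&synchro, 1, __ATOMIC_RELAXED);
}

In short, the relaxed built-ins add no ordering, but they do prevent the
compiler from caching the value, which is the visibility guarantee a plain
assignment cannot provide.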