From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <konstantin.ananyev@intel.com>
Received: from mga05.intel.com (mga05.intel.com [192.55.52.43])
 by dpdk.org (Postfix) with ESMTP id 2DA511B6B0
 for <dev@dpdk.org>; Fri, 10 Nov 2017 10:59:17 +0100 (CET)
Received: from fmsmga003.fm.intel.com ([10.253.24.29])
 by fmsmga105.fm.intel.com with ESMTP; 10 Nov 2017 01:59:16 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.44,373,1505804400"; 
   d="scan'208";a="511742"
Received: from irsmsx110.ger.corp.intel.com ([163.33.3.25])
 by FMSMGA003.fm.intel.com with ESMTP; 10 Nov 2017 01:59:14 -0800
Received: from irsmsx105.ger.corp.intel.com ([169.254.7.67]) by
 irsmsx110.ger.corp.intel.com ([163.33.3.25]) with mapi id 14.03.0319.002;
 Fri, 10 Nov 2017 09:59:14 +0000
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: Jia He <hejianet@gmail.com>, "jerin.jacob@caviumnetworks.com"
 <jerin.jacob@caviumnetworks.com>, "dev@dpdk.org" <dev@dpdk.org>,
 "olivier.matz@6wind.com" <olivier.matz@6wind.com>
CC: "Richardson, Bruce" <bruce.richardson@intel.com>, "jianbo.liu@arm.com"
 <jianbo.liu@arm.com>, "hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
 "Jia He" <jia.he@hxt-semitech.com>, "jie2.liu@hxt-semitech.com"
 <jie2.liu@hxt-semitech.com>, "bing.zhao@hxt-semitech.com"
 <bing.zhao@hxt-semitech.com>
Thread-Topic: [PATCH v5 1/1] ring: guarantee load/load order in enqueue and
 dequeue
Thread-Index: AQHTWcZ9E7Y6hWirFUycD+fP1zSvWqMNYVtw
Date: Fri, 10 Nov 2017 09:59:13 +0000
Message-ID: <2601191342CEEE43887BDE71AB9772585FABBC02@irsmsx105.ger.corp.intel.com>
References: <1510118764-29697-1-git-send-email-hejianet@gmail.com>
 <1510278669-8489-1-git-send-email-hejianet@gmail.com>
 <1510278669-8489-2-git-send-email-hejianet@gmail.com>
In-Reply-To: <1510278669-8489-2-git-send-email-hejianet@gmail.com>
Accept-Language: en-IE, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-ctpclassification: CTP_IC
dlp-product: dlpe-windows
dlp-version: 11.0.0.116
dlp-reaction: no-action
x-originating-ip: [163.33.239.182]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH v5 1/1] ring: guarantee load/load order in
 enqueue and dequeue
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Nov 2017 09:59:18 -0000



> -----Original Message-----
> From: Jia He [mailto:hejianet@gmail.com]
> Sent: Friday, November 10, 2017 1:51 AM
> To: jerin.jacob@caviumnetworks.com; dev@dpdk.org; olivier.matz@6wind.com
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>; jianbo.liu@arm.com;
> hemant.agrawal@nxp.com; Jia He <hejianet@gmail.com>; Jia He <jia.he@hxt-semitech.com>; jie2.liu@hxt-semitech.com; bing.zhao@hxt-semitech.com
> Subject: [PATCH v5 1/1] ring: guarantee load/load order in enqueue and dequeue
>
> We observed an rte panic in mbuf_autotest on our Qualcomm arm64 server.
> In __rte_ring_move_cons_head():
> ...
>         do {
>                 /* Restore n as it may change every loop */
>                 n = max;
>
>                 *old_head = r->cons.head;                //1st load
>                 const uint32_t prod_tail = r->prod.tail; //2nd load
>
> cpu1(producer)          cpu2(consumer)          cpu3(consumer)
>                         load r->prod.tail
> in enqueue:
> load r->cons.tail
> load r->prod.head
>
> store r->prod.tail
>
>                                                 load r->cons.head
>                                                 load r->prod.tail
>                                                 ...
>                                                 store r->cons.{head,tail}
>                         load r->cons.head
>
> On weakly ordered architectures (powerpc, arm), the 2nd load can be
> reordered before the 1st load. With the interleaving shown above,
> r->cons.head ends up larger than prod_tail, so *entries becomes a huge
> wrapped value and the consumer incorrectly moves forward. This nasty
> reordering breaks enqueue/dequeue.
>
> After this patch, even with the above context switches, the stale
> cons.head is recalculated after rte_atomic32_cmpset() fails, so no race
> condition remains.
>
> There is no such issue on x86, because x86 has a strong memory ordering
> model. But rte_smp_rmb() has no runtime performance impact on x86, so we
> keep the code identical rather than adding architecture-specific cases.
>
> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
> Signed-off-by: jie2.liu@hxt-semitech.com
> Signed-off-by: bing.zhao@hxt-semitech.com
> ---
>  lib/librte_ring/rte_ring.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index 5e9b3b7..3e8085a 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -409,6 +409,11 @@ __rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
>  		n = max;
>
>  		*old_head = r->prod.head;
> +
> +		/* add rmb barrier to avoid load/load reordering on weakly
> +		 * ordered architectures; it is a no-op on x86 */
> +		rte_smp_rmb();
> +
>  		const uint32_t cons_tail = r->cons.tail;
>  		/*
>  		 *  The subtraction is done between two unsigned 32bits value
> @@ -517,6 +522,11 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
>  		n = max;
>
>  		*old_head = r->cons.head;
> +
> +		/* add rmb barrier to avoid load/load reordering on weakly
> +		 * ordered architectures; it is a no-op on x86 */
> +		rte_smp_rmb();
> +
>  		const uint32_t prod_tail = r->prod.tail;
>  		/* The subtraction is done between two unsigned 32bits value
>  		 * (the result is always modulo 32 bits even if we have
> --

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

> 2.7.4