From: "Carrillo, Erik G"
To: Phil Yang, "rsanford@akamai.com", "dev@dpdk.org"
CC: "david.marchand@redhat.com", "Burakov, Anatoly", "thomas@monjalon.net", "jerinj@marvell.com", "hemant.agrawal@nxp.com", "Honnappa.Nagarahalli@arm.com", "gavin.hu@arm.com", "nd@arm.com"
Date: Wed, 8 Apr 2020 21:10:22 +0000
Subject: Re: [dpdk-dev] [PATCH 2/2] lib/timer: relax barrier for status update
References: <1582526539-14360-1-git-send-email-phil.yang@arm.com> <1582526539-14360-2-git-send-email-phil.yang@arm.com>
In-Reply-To: <1582526539-14360-2-git-send-email-phil.yang@arm.com>
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Phil Yang
> Sent: Monday, February 24, 2020 12:42 AM
> To: rsanford@akamai.com; Carrillo, Erik G;
> dev@dpdk.org
> Cc: david.marchand@redhat.com; Burakov, Anatoly;
> thomas@monjalon.net; jerinj@marvell.com;
> hemant.agrawal@nxp.com; Honnappa.Nagarahalli@arm.com;
> gavin.hu@arm.com; phil.yang@arm.com; nd@arm.com
> Subject: [PATCH 2/2] lib/timer: relax barrier for status update
>
> Volatile has no ordering semantics. The rte_timer structure defines timer
> status as a volatile variable and uses the rte_r/wmb barrier to guarantee
> inter-thread visibility.
>
> This patch optimized the volatile operation with c11 atomic operations and
> one-way barrier to save the performance penalty. According to the
> timer_perf_autotest benchmarking results, this patch can uplift 10%~16%
> timer appending performance, 3%~20% timer resetting performance and
> 45% timer callbacks scheduling performance on aarch64 and no loss in
> performance for x86.
>
> Suggested-by: Honnappa Nagarahalli
> Signed-off-by: Phil Yang
> Reviewed-by: Gavin Hu

Hi Phil,

It seems like the consensus is to generally avoid replacing rte_atomic_*
interfaces with the GCC builtins directly. In other areas of DPDK that are
being patched, are the C11 APIs going to be investigated? It seems like
that decision will apply here as well.

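For illustration only (this is not code from the patch): a minimal sketch of
how the two orderings used in the quoted diff, a release store to publish the
status and an acquire compare-and-swap to claim it, might look if written
against the C11 <stdatomic.h> interfaces rather than the GCC __atomic
builtins. All sketch_* names and the single-word status are hypothetical; the
real rte_timer status also carries an owner field.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical states, loosely mirroring RTE_TIMER_STOP/PENDING/RUNNING. */
enum { SKETCH_STOP = 0, SKETCH_PENDING = 1, SKETCH_RUNNING = 2 };

/* Hypothetical, heavily simplified timer: one payload field plus status. */
struct sketch_timer {
	uint64_t expire;          /* plain data, published through status */
	_Atomic uint32_t status;  /* C11 atomic instead of a volatile field */
};

/* Writer side: fill in the payload, then publish it with a release store.
 * This plays the role of "rte_wmb(); tim->status.u32 = ..." in the old code.
 */
static void sketch_publish(struct sketch_timer *t, uint64_t expire)
{
	assert(t != NULL);
	t->expire = expire;
	atomic_store_explicit(&t->status, SKETCH_PENDING,
			      memory_order_release);
}

/* Claim side: move PENDING -> RUNNING with an acquire compare-and-swap.
 * Returns 0 and copies the payload on success, -1 if the timer is not
 * pending. On CAS failure, prev is refreshed with the current value, so
 * the loop re-checks the state without an explicit reload.
 */
static int sketch_claim(struct sketch_timer *t, uint64_t *expire_out)
{
	uint32_t prev;

	assert(t != NULL && expire_out != NULL);
	prev = atomic_load_explicit(&t->status, memory_order_relaxed);
	do {
		if (prev != SKETCH_PENDING)
			return -1;
	} while (!atomic_compare_exchange_strong_explicit(&t->status, &prev,
			SKETCH_RUNNING,
			memory_order_acquire, memory_order_relaxed));

	/* The acquire pairs with the release store in sketch_publish(). */
	*expire_out = t->expire;
	return 0;
}
```

In a single thread, sketch_claim() fails on a stopped timer, succeeds once
sketch_publish() has run, and then hands back the published expire value.
Whether DPDK code should use these interfaces, the builtins, or a wrapper is
the open question raised above.
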
Thanks,
Erik

> ---
>  lib/librte_timer/rte_timer.c | 90 +++++++++++++++++++++++++++++++-------------
>  lib/librte_timer/rte_timer.h |  2 +-
>  2 files changed, 65 insertions(+), 27 deletions(-)
>
> diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
> index 269e921..be0262d 100644
> --- a/lib/librte_timer/rte_timer.c
> +++ b/lib/librte_timer/rte_timer.c
> @@ -10,7 +10,6 @@
>  #include
>  #include
>
> -#include
>  #include
>  #include
>  #include
> @@ -218,7 +217,7 @@ rte_timer_init(struct rte_timer *tim)
>
>  	status.state = RTE_TIMER_STOP;
>  	status.owner = RTE_TIMER_NO_OWNER;
> -	tim->status.u32 = status.u32;
> +	__atomic_store_n(&tim->status.u32, status.u32, __ATOMIC_RELAXED);
>  }
>
>  /*
> @@ -239,9 +238,9 @@ timer_set_config_state(struct rte_timer *tim,
>
>  	/* wait that the timer is in correct status before update,
>  	 * and mark it as being configured */
> -	while (success == 0) {
> -		prev_status.u32 = tim->status.u32;
> +	prev_status.u32 = __atomic_load_n(&tim->status.u32, __ATOMIC_RELAXED);
>
> +	while (success == 0) {
>  		/* timer is running on another core
>  		 * or ready to run on local core, exit
>  		 */
> @@ -258,9 +257,20 @@ timer_set_config_state(struct rte_timer *tim,
>  		 * mark it atomically as being configured */
>  		status.state = RTE_TIMER_CONFIG;
>  		status.owner = (int16_t)lcore_id;
> -		success = rte_atomic32_cmpset(&tim->status.u32,
> -					prev_status.u32,
> -					status.u32);
> +		/* If status is observed as RTE_TIMER_CONFIG earlier,
> +		 * that's not going to cause any issues because the
> +		 * pattern is read for status then read the other members.
> +		 * In one of the callers to timer_set_config_state
> +		 * (the __rte_timer_reset) we set other members to the
> +		 * structure (period, expire, f, arg) we want these
> +		 * changes to be observed after our change to status.
> +		 * So we need __ATOMIC_ACQUIRE here.
> +		 */
> +		success = __atomic_compare_exchange_n(&tim->status.u32,
> +					&prev_status.u32,
> +					status.u32, 0,
> +					__ATOMIC_ACQUIRE,
> +					__ATOMIC_RELAXED);
>  	}
>
>  	ret_prev_status->u32 = prev_status.u32;
> @@ -279,20 +289,27 @@ timer_set_running_state(struct rte_timer *tim)
>
>  	/* wait that the timer is in correct status before update,
>  	 * and mark it as running */
> -	while (success == 0) {
> -		prev_status.u32 = tim->status.u32;
> +	prev_status.u32 = __atomic_load_n(&tim->status.u32, __ATOMIC_RELAXED);
>
> +	while (success == 0) {
>  		/* timer is not pending anymore */
>  		if (prev_status.state != RTE_TIMER_PENDING)
>  			return -1;
>
>  		/* here, we know that timer is stopped or pending,
> -		 * mark it atomically as being configured */
> +		 * mark it atomically as being running
> +		 */
>  		status.state = RTE_TIMER_RUNNING;
>  		status.owner = (int16_t)lcore_id;
> -		success = rte_atomic32_cmpset(&tim->status.u32,
> -					prev_status.u32,
> -					status.u32);
> +		/* RUNNING states are acting as locked states. If the
> +		 * timer is in RUNNING state, the state cannot be changed
> +		 * by other threads. So, we should use ACQUIRE here.
> +		 */
> +		success = __atomic_compare_exchange_n(&tim->status.u32,
> +					&prev_status.u32,
> +					status.u32, 0,
> +					__ATOMIC_ACQUIRE,
> +					__ATOMIC_RELAXED);
>  	}
>
>  	return 0;
> @@ -520,10 +537,12 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
>
>  	/* update state: as we are in CONFIG state, only us can modify
>  	 * the state so we don't need to use cmpset() here */
> -	rte_wmb();
>  	status.state = RTE_TIMER_PENDING;
>  	status.owner = (int16_t)tim_lcore;
> -	tim->status.u32 = status.u32;
> +	/* The "RELEASE" ordering guarantees the memory operations above
> +	 * the status update are observed before the update by all threads
> +	 */
> +	__atomic_store_n(&tim->status.u32, status.u32, __ATOMIC_RELEASE);
>
>  	if (tim_lcore != lcore_id || !local_is_locked)
>  		rte_spinlock_unlock(&priv_timer[tim_lcore].list_lock);
> @@ -600,10 +619,12 @@ __rte_timer_stop(struct rte_timer *tim, int local_is_locked,
>  	}
>
>  	/* mark timer as stopped */
> -	rte_wmb();
>  	status.state = RTE_TIMER_STOP;
>  	status.owner = RTE_TIMER_NO_OWNER;
> -	tim->status.u32 = status.u32;
> +	/* The "RELEASE" ordering guarantees the memory operations above
> +	 * the status update are observed before the update by all threads
> +	 */
> +	__atomic_store_n(&tim->status.u32, status.u32, __ATOMIC_RELEASE);
>
>  	return 0;
>  }
> @@ -637,7 +658,8 @@ rte_timer_stop_sync(struct rte_timer *tim)
>  int
>  rte_timer_pending(struct rte_timer *tim)
>  {
> -	return tim->status.state == RTE_TIMER_PENDING;
> +	return __atomic_load_n(&tim->status.state,
> +				__ATOMIC_RELAXED) == RTE_TIMER_PENDING;
>  }
>
>  /* must be called periodically, run all timer that expired */
> @@ -739,8 +761,12 @@ __rte_timer_manage(struct rte_timer_data *timer_data)
>  			/* remove from done list and mark timer as stopped */
>  			status.state = RTE_TIMER_STOP;
>  			status.owner = RTE_TIMER_NO_OWNER;
> -			rte_wmb();
> -			tim->status.u32 = status.u32;
> +			/* The "RELEASE" ordering guarantees the memory
> +			 * operations above the status update are observed
> +			 * before the update by all threads
> +			 */
> +			__atomic_store_n(&tim->status.u32, status.u32,
> +				__ATOMIC_RELEASE);
>  		}
>  		else {
>  			/* keep it in list and mark timer as pending */
> @@ -748,8 +774,12 @@ __rte_timer_manage(struct rte_timer_data *timer_data)
>  			status.state = RTE_TIMER_PENDING;
>  			__TIMER_STAT_ADD(priv_timer, pending, 1);
>  			status.owner = (int16_t)lcore_id;
> -			rte_wmb();
> -			tim->status.u32 = status.u32;
> +			/* The "RELEASE" ordering guarantees the memory
> +			 * operations above the status update are observed
> +			 * before the update by all threads
> +			 */
> +			__atomic_store_n(&tim->status.u32, status.u32,
> +				__ATOMIC_RELEASE);
>  			__rte_timer_reset(tim, tim->expire + tim->period,
>  				tim->period, lcore_id, tim->f, tim->arg, 1,
>  				timer_data);
> @@ -919,8 +949,12 @@ rte_timer_alt_manage(uint32_t timer_data_id,
>  			/* remove from done list and mark timer as stopped */
>  			status.state = RTE_TIMER_STOP;
>  			status.owner = RTE_TIMER_NO_OWNER;
> -			rte_wmb();
> -			tim->status.u32 = status.u32;
> +			/* The "RELEASE" ordering guarantees the memory
> +			 * operations above the status update are observed
> +			 * before the update by all threads
> +			 */
> +			__atomic_store_n(&tim->status.u32, status.u32,
> +				__ATOMIC_RELEASE);
>  		} else {
>  			/* keep it in list and mark timer as pending */
>  			rte_spinlock_lock(
> @@ -928,8 +962,12 @@ rte_timer_alt_manage(uint32_t timer_data_id,
>  			status.state = RTE_TIMER_PENDING;
>  			__TIMER_STAT_ADD(data->priv_timer, pending, 1);
>  			status.owner = (int16_t)this_lcore;
> -			rte_wmb();
> -			tim->status.u32 = status.u32;
> +			/* The "RELEASE" ordering guarantees the memory
> +			 * operations above the status update are observed
> +			 * before the update by all threads
> +			 */
> +			__atomic_store_n(&tim->status.u32, status.u32,
> +				__ATOMIC_RELEASE);
>  			__rte_timer_reset(tim, tim->expire + tim->period,
>  				tim->period, this_lcore, tim->f, tim->arg, 1,
>  				data);
> diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
> index c6b3d45..df533fa 100644
> --- a/lib/librte_timer/rte_timer.h
> +++ b/lib/librte_timer/rte_timer.h
> @@ -101,7 +101,7 @@ struct rte_timer
>  {
>  	uint64_t expire;       /**< Time when timer expire. */
>  	struct rte_timer *sl_next[MAX_SKIPLIST_DEPTH];
> -	volatile union rte_timer_status status; /**< Status of timer. */
> +	union rte_timer_status status; /**< Status of timer. */
>  	uint64_t period;       /**< Period of timer (0 if not periodic). */
>  	rte_timer_cb_t f;      /**< Callback function. */
>  	void *arg;             /**< Argument to callback function. */
> --
> 2.7.4
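
The status handling in the quoted diff can be sketched in isolation as
follows. This is a simplified, hypothetical model rather than the library
code: the demo_* names, the state values and the 16-bit state field are
assumptions (the diff only shows an int16_t owner and the u32 view), and the
ownership check in demo_set_config_state() is reduced to "not RUNNING and not
CONFIG". It uses the same GCC __atomic builtins as the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the RTE_TIMER_* states and "no owner". */
enum { DEMO_STOP = 0, DEMO_PENDING = 1, DEMO_RUNNING = 2, DEMO_CONFIG = 3 };
#define DEMO_NO_OWNER (-2)

/* Assumed layout: state and owner share one 32-bit word, so both can be
 * replaced by a single compare-and-exchange on u32.
 */
union demo_status {
	struct {
		uint16_t state;
		int16_t owner;
	};
	uint32_t u32;
};

struct demo_timer {
	uint64_t expire;
	union demo_status status;  /* no longer volatile, as in the patch */
};

/* Initial state: nothing to publish yet, so a relaxed store is enough. */
static void demo_init(struct demo_timer *tim)
{
	union demo_status s;

	s.state = DEMO_STOP;
	s.owner = DEMO_NO_OWNER;
	__atomic_store_n(&tim->status.u32, s.u32, __ATOMIC_RELAXED);
}

/* STOP/PENDING -> CONFIG, following the loop shape of the quoted diff:
 * one relaxed load up front, then an acquire CAS. On failure the builtin
 * writes the current value into prev.u32, so no reload is needed.
 */
static int demo_set_config_state(struct demo_timer *tim, int16_t lcore_id)
{
	union demo_status prev, next;

	assert(tim != NULL);
	prev.u32 = __atomic_load_n(&tim->status.u32, __ATOMIC_RELAXED);
	for (;;) {
		if (prev.state == DEMO_RUNNING || prev.state == DEMO_CONFIG)
			return -1;
		next.state = DEMO_CONFIG;
		next.owner = lcore_id;
		if (__atomic_compare_exchange_n(&tim->status.u32, &prev.u32,
				next.u32, 0,
				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
			return 0;
	}
}

/* CONFIG -> PENDING: write the payload, then publish with a release
 * store, replacing the old "rte_wmb(); plain store" sequence.
 */
static void demo_publish_pending(struct demo_timer *tim, uint64_t expire,
				 int16_t lcore_id)
{
	union demo_status next;

	tim->expire = expire;
	next.state = DEMO_PENDING;
	next.owner = lcore_id;
	__atomic_store_n(&tim->status.u32, next.u32, __ATOMIC_RELEASE);
}
```

Called in sequence on one thread, demo_init(), demo_set_config_state() and
demo_publish_pending() walk a timer through STOP, CONFIG and PENDING; a second
demo_set_config_state() while the timer is in CONFIG returns -1. Packing state
and owner into one word is what lets a single compare-and-exchange change both
at once.
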