From: Wathsala Wathawana Vithanage
To: Honnappa Nagarahalli, "konstantin.v.ananyev@yandex.ru", Feifei Wang
CC: "dev@dpdk.org", nd, nd
Subject: RE: [RFC] ring: improve ring performance with C11 atomics
Date: Thu, 8 Jun 2023 13:21:22 +0000
References: <20230421191642.217011-1-wathsala.vithanage@arm.com>
In-Reply-To: <20230421191642.217011-1-wathsala.vithanage@arm.com>
List-Id: DPDK patches and discussions

The solution presented in this RFC is not C11 compliant.
C11 __atomic_compare_exchange_n updates "expected" only when the CAS instruction fails.
Therefore, the assumption that there is an address dependency from the CAS instructions in both the producer and consumer head updates to the ring element accesses does not hold under the semantics of the C11 memory model.
Thus, this solution may corrupt ring elements when executed on CPUs with weak memory models.
This RFC will therefore be retracted.

> -----Original Message-----
> From: Wathsala Vithanage
> Sent: Friday, April 21, 2023 3:17 PM
> To: Honnappa Nagarahalli; konstantin.v.ananyev@yandex.ru; Feifei Wang
> Cc: dev@dpdk.org; nd; Wathsala Wathawana Vithanage
> Subject: [RFC] ring: improve ring performance with C11 atomics
>
> Tail load in __rte_ring_move_cons_head and __rte_ring_move_prod_head can
> be changed to __ATOMIC_RELAXED from __ATOMIC_ACQUIRE.
> Because to calculate the addresses of the dequeue elements
> __rte_ring_dequeue_elems uses the old_head updated by the
> __atomic_compare_exchange_n intrinsic used in __rte_ring_move_prod_head.
> This results in an address dependency between the two operations. Therefore
> __rte_ring_dequeue_elems cannot happen before
> __rte_ring_move_prod_head.
> Similarly __rte_ring_enqueue_elems and __rte_ring_move_cons_head won't be
> reordered either.
>
> Performance on Arm N1
> Gain relative to generic implementation
> +-------------------------------------------------------------------+
> | Bulk enq/dequeue count on size 8 (Arm N1)                         |
> +-------------------------------------------------------------------+
> | Generic             | C11 atomics         | C11 atomics improved  |
> +-------------------------------------------------------------------+
> | Total count: 766730 | Total count: 651686 | Total count: 812125   |
> |                     | Gain: -15%          | Gain: 6%              |
> +-------------------------------------------------------------------+
> +-------------------------------------------------------------------+
> | Bulk enq/dequeue count on size 32 (Arm N1)                        |
> +-------------------------------------------------------------------+
> | Generic             | C11 atomics         | C11 atomics improved  |
> +-------------------------------------------------------------------+
> | Total count: 816745 | Total count: 646385 | Total count: 830935   |
> |                     | Gain: -21%          | Gain: 2%              |
> +-------------------------------------------------------------------+
>
> Performance on x86-64 Cascade Lake
> Gain relative to generic implementation
> +-------------------------------------------------------------------+
> | Bulk enq/dequeue count on size 8                                  |
> +-------------------------------------------------------------------+
> | Generic             | C11 atomics         | C11 atomics improved  |
> +-------------------------------------------------------------------+
> | Total count: 181640 | Total count: 181995 | Total count: 182791   |
> |                     | Gain: 0.2%          | Gain: 0.6%            |
> +-------------------------------------------------------------------+
> +-------------------------------------------------------------------+
> | Bulk enq/dequeue count on size 32                                 |
> +-------------------------------------------------------------------+
> | Generic             | C11 atomics         | C11 atomics improved  |
> +-------------------------------------------------------------------+
> | Total count: 167495 | Total count: 161536 | Total count: 163190   |
> |                     | Gain: -3.5%         | Gain: -2.6%           |
> +-------------------------------------------------------------------+
>
> Signed-off-by: Wathsala Vithanage
> Reviewed-by: Honnappa Nagarahalli
> Reviewed-by: Feifei Wang
> ---
>  .mailmap                    |  1 +
>  lib/ring/rte_ring_c11_pvt.h | 18 +++++++++---------
>  2 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/.mailmap b/.mailmap
> index 4018f0fc47..367115d134 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -1430,6 +1430,7 @@ Walter Heymans
>  Wang Sheng-Hui
>  Wangyu (Eric)
>  Waterman Cao
> +Wathsala Vithanage
>  Weichun Chen
>  Wei Dai
>  Weifeng Li
> diff --git a/lib/ring/rte_ring_c11_pvt.h b/lib/ring/rte_ring_c11_pvt.h
> index f895950df4..1895f2bb0e 100644
> --- a/lib/ring/rte_ring_c11_pvt.h
> +++ b/lib/ring/rte_ring_c11_pvt.h
> @@ -24,6 +24,13 @@ __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
>  	if (!single)
>  		rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
>
> +	/*
> +	 * Updating of ht->tail cannot happen before elements are added to or
> +	 * removed from the ring, as it could result in data races between
> +	 * producer and consumer threads. Therefore ht->tail should be updated
> +	 * with release semantics to prevent ring data copy phase from sinking
> +	 * below it.
> +	 */
>  	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
>  }
>
> @@ -69,11 +76,8 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
>  		/* Ensure the head is read before tail */
>  		__atomic_thread_fence(__ATOMIC_ACQUIRE);
>
> -		/* load-acquire synchronize with store-release of ht->tail
> -		 * in update_tail.
> -		 */
>  		cons_tail = __atomic_load_n(&r->cons.tail,
> -					__ATOMIC_ACQUIRE);
> +					__ATOMIC_RELAXED);
>
>  		/* The subtraction is done between two unsigned 32bits value
>  		 * (the result is always modulo 32 bits even if we have
> @@ -145,12 +149,8 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
>  		/* Ensure the head is read before tail */
>  		__atomic_thread_fence(__ATOMIC_ACQUIRE);
>
> -		/* this load-acquire synchronize with store-release of ht->tail
> -		 * in update_tail.
> -		 */
>  		prod_tail = __atomic_load_n(&r->prod.tail,
> -					__ATOMIC_ACQUIRE);
> -
> +					__ATOMIC_RELAXED);
>  		/* The subtraction is done between two unsigned 32bits value
>  		 * (the result is always modulo 32 bits even if we have
>  		 * cons_head > prod_tail). So 'entries' is always between 0
> --
> 2.25.1
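
For readers who want to check the behaviour of "expected" that the retraction at the top of this mail relies on, here is a minimal standalone sketch. It assumes a GCC- or Clang-compatible compiler that provides the __atomic_compare_exchange_n builtin. The names try_move_head and cas_expected_demo are hypothetical and exist only for this illustration; they are not DPDK functions, and the code is single-threaded on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of DPDK): one strong CAS attempt on a
 * head index. Returns true if the CAS succeeded. *expected is in/out,
 * exactly as with the underlying builtin.
 */
static bool
try_move_head(uint32_t *head, uint32_t *expected, uint32_t new_head)
{
	return __atomic_compare_exchange_n(head, expected, new_head,
			false /* strong CAS */,
			__ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/* Single-threaded check of what the CAS does to "expected". */
static void
cas_expected_demo(void)
{
	uint32_t head = 5;
	uint32_t old_head = 5;
	bool ok;

	/* Success path: head is updated, old_head is left untouched. */
	ok = try_move_head(&head, &old_head, 9);
	assert(ok);
	assert(head == 9);
	assert(old_head == 5);

	/* Failure path: head is unchanged, old_head receives the current head. */
	old_head = 5;
	ok = try_move_head(&head, &old_head, 12);
	assert(!ok);
	assert(head == 9);
	assert(old_head == 9);
}
```

Calling cas_expected_demo() from a test program should pass all assertions. On the success path, which is the one that goes on to touch ring elements, old_head still holds the value from an earlier load and is not written by the CAS. That is the reason given above for why the C11 memory model offers no dependency from the CAS to the element accesses.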