From mboxrd@z Thu Jan 1 00:00:00 1970
From: Phil Yang
To: Honnappa Nagarahalli, Alexander Kozyrev, Matan Azrad, Shahaf Shuler, Slava Ovsiienko
CC: "drc@linux.vnet.ibm.com", nd, "dev@dpdk.org"
Date: Thu, 23 Jul 2020 06:11:23 +0000
References: <20200410164127.54229-7-gavin.hu@arm.com> <1592900807-13289-1-git-send-email-phil.yang@arm.com>
Subject: Re: [dpdk-dev] [PATCH v3] net/mlx5: relaxed ordering
 for multi-packet RQ buffer refcnt
List-Id: DPDK patches and discussions

>
> > > Subject: Re: [dpdk-dev] [PATCH v3] net/mlx5: relaxed ordering for
> > > multi-packet RQ buffer refcnt
> > >
> > > Hi,
> > >
> > > We are also doing C11 atomics converting for other components.
> > > Your insight would be much appreciated.
> > >
> > > Thanks,
> > > Phil Yang
> > >
> > > > -----Original Message-----
> > > > From: dev On Behalf Of Phil Yang
> > > > Sent: Tuesday, June 23, 2020 4:27 PM
> > > > To: dev@dpdk.org
> > > > Cc: matan@mellanox.com; shahafs@mellanox.com;
> > > > viacheslavo@mellanox.com; Honnappa Nagarahalli;
> > > > drc@linux.vnet.ibm.com; nd
> > > > Subject: [dpdk-dev] [PATCH v3] net/mlx5: relaxed ordering for
> > > > multi-packet RQ buffer refcnt
> > > >
> > > > Use c11 atomics with explicit ordering instead of the rte_atomic ops
> > > > which enforce unnecessary barriers on aarch64.
> > > >
> > > > Signed-off-by: Phil Yang
> > > > ---
<...>
> > > >
> > > >  drivers/net/mlx5/mlx5_rxq.c  |  2 +-
> > > >  drivers/net/mlx5/mlx5_rxtx.c | 16 +++++++++-------
> > > >  drivers/net/mlx5/mlx5_rxtx.h |  2 +-
> > > >  3 files changed, 11 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> > > > index dda0073..7f487f1 100644
> > > > --- a/drivers/net/mlx5/mlx5_rxq.c
> > > > +++ b/drivers/net/mlx5/mlx5_rxq.c
> > > > @@ -1545,7 +1545,7 @@ mlx5_mprq_buf_init(struct rte_mempool *mp, void *opaque_arg,
> > > >
> > > >  	memset(_m, 0, sizeof(*buf));
> > > >  	buf->mp = mp;
> > > > -	rte_atomic16_set(&buf->refcnt, 1);
> > > > +	__atomic_store_n(&buf->refcnt, 1, __ATOMIC_RELAXED);
> > > >  	for (j = 0; j != strd_n; ++j) {
> > > >  		shinfo = &buf->shinfos[j];
> > > >  		shinfo->free_cb = mlx5_mprq_buf_free_cb;
> > > > diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> > > > index e4106bf..f0eda88 100644
> > > > --- a/drivers/net/mlx5/mlx5_rxtx.c
> > > > +++ b/drivers/net/mlx5/mlx5_rxtx.c
> > > > @@ -1595,10 +1595,11 @@ mlx5_mprq_buf_free_cb(void *addr __rte_unused, void *opaque)
> > > >  {
> > > >  	struct mlx5_mprq_buf *buf = opaque;
> > > >
> > > > -	if (rte_atomic16_read(&buf->refcnt) == 1) {
> > > > +	if (__atomic_load_n(&buf->refcnt, __ATOMIC_RELAXED) == 1) {
> > > >  		rte_mempool_put(buf->mp, buf);
> > > > -	} else if (rte_atomic16_add_return(&buf->refcnt, -1) == 0) {
> > > > -		rte_atomic16_set(&buf->refcnt, 1);
> > > > +	} else if (unlikely(__atomic_sub_fetch(&buf->refcnt, 1,
> > > > +					       __ATOMIC_RELAXED) == 0)) {
> > > > +		__atomic_store_n(&buf->refcnt, 1, __ATOMIC_RELAXED);
> > > >  		rte_mempool_put(buf->mp, buf);
> > > >  	}
> > > >  }
> > > > @@ -1678,7 +1679,8 @@ mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
> > > >
> > > >  	if (consumed_strd == strd_n) {
> > > >  		/* Replace WQE only if the buffer is still in use. */
> > > > -		if (rte_atomic16_read(&buf->refcnt) > 1) {
> > > > +		if (__atomic_load_n(&buf->refcnt,
> > > > +				    __ATOMIC_RELAXED) > 1) {
> > > >  			mprq_buf_replace(rxq, rq_ci & wq_mask, strd_n);
> > > >  			/* Release the old buffer. */
> > > >  			mlx5_mprq_buf_free(buf);
> > > > @@ -1790,9 +1792,9 @@ mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
> > > >  	void *buf_addr;
> > > >
> > > >  	/* Increment the refcnt of the whole chunk. */
> > > > -	rte_atomic16_add_return(&buf->refcnt, 1);
> rte_atomic16_add_return includes a full barrier along with atomic operation.
> But is full barrier required here? For ex: __atomic_add_fetch(&buf->refcnt, 1,
> __ATOMIC_RELAXED) will offer atomicity, but no barrier. Would that be
> enough?
>
> > > > -	MLX5_ASSERT((uint16_t)rte_atomic16_read(&buf->refcnt) <=
> > > > -		    strd_n + 1);
> > > > +	__atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_ACQUIRE);

The atomic load in MLX5_ASSERT() accesses the same memory location as the
preceding __atomic_add_fetch(). When MLX5_PMD_DEBUG is enabled, the two
accesses to that location happen in program order, so the ACQUIRE barrier in
__atomic_add_fetch() is unnecessary.

With the ordering changed to RELAXED, this patch shows a 7.6% performance
improvement on N1 (built to generate A72-like instructions).

Could you please also try it on your testbed, Alex?

>
> Can you replace just the above line with the following lines and test it?
>
> __atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_RELAXED);
> __atomic_thread_fence(__ATOMIC_ACQ_REL);
>
> This should make the generated code same as before this patch. Let me
> know if you would prefer us to re-spin the patch instead (for testing).
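For anyone who wants to compare the two variants outside the driver, here is a
minimal standalone sketch. It assumes GCC or Clang (for the __atomic builtins).
struct demo_buf and the demo_* functions are hypothetical names used only for
illustration; they are not part of the mlx5 PMD. The sketch only shows the
shape of the two code paths and says nothing about performance or about the
instructions generated on a particular core.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mlx5_mprq_buf; only refcnt matters here. */
struct demo_buf {
	uint16_t refcnt; /* Atomically accessed refcnt. */
};

/* Variant A: atomic increment with no ordering constraint. */
static inline uint16_t
demo_ref_relaxed(struct demo_buf *buf)
{
	return __atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_RELAXED);
}

/*
 * Variant B: relaxed increment followed by an explicit fence, i.e. the
 * two-line replacement suggested in the quoted text for an A/B test.
 */
static inline uint16_t
demo_ref_fenced(struct demo_buf *buf)
{
	uint16_t v = __atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_RELAXED);

	__atomic_thread_fence(__ATOMIC_ACQ_REL);
	return v;
}

/*
 * Single-threaded sanity check: both variants produce the same counter
 * value. They differ only in the ordering they impose on surrounding
 * memory accesses, which a single thread cannot observe.
 */
int
demo_variants_agree(void)
{
	struct demo_buf a = { .refcnt = 1 };
	struct demo_buf b = { .refcnt = 1 };

	return demo_ref_relaxed(&a) == 2 && demo_ref_fenced(&b) == 2 &&
	       __atomic_load_n(&a.refcnt, __ATOMIC_RELAXED) ==
	       __atomic_load_n(&b.refcnt, __ATOMIC_RELAXED);
}
```

Calling demo_variants_agree() from a test program should return non-zero.
Swapping one variant for the other in a benchmark loop is one way to check
whether the fence accounts for a measured difference.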
>
> > > > +	MLX5_ASSERT(__atomic_load_n(&buf->refcnt,
> > > > +		    __ATOMIC_RELAXED) <= strd_n + 1);
> > > >  	buf_addr = RTE_PTR_SUB(addr,
> > > >  			       RTE_PKTMBUF_HEADROOM);
> > > >  	/*
> > > >  	 * MLX5 device doesn't use iova but it is necessary in a
> > > > diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> > > > index 26621ff..0fc15f3 100644
> > > > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > > > +++ b/drivers/net/mlx5/mlx5_rxtx.h
> > > > @@ -78,7 +78,7 @@ struct rxq_zip {
> > > >  /* Multi-Packet RQ buffer header. */
> > > >  struct mlx5_mprq_buf {
> > > >  	struct rte_mempool *mp;
> > > > -	rte_atomic16_t refcnt; /* Atomically accessed refcnt. */
> > > > +	uint16_t refcnt; /* Atomically accessed refcnt. */
> > > >  	uint8_t pad[RTE_PKTMBUF_HEADROOM]; /* Headroom for the first packet. */
> > > >  	struct rte_mbuf_ext_shared_info shinfos[];
> > > >  	/*
> > > > --
> > > > 2.7.4
>
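
To make the refcnt life cycle in the quoted diff easier to follow, here is a
small self-contained sketch of the same init/free logic on a plain uint16_t.
It assumes GCC or Clang. struct demo_mprq_buf, demo_pool_put() and the other
demo_* names are hypothetical; demo_pool_put() merely counts calls in place of
rte_mempool_put(). The check is single-threaded, so it illustrates the counter
arithmetic only, not the memory-ordering question discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the MPRQ buffer header. */
struct demo_mprq_buf {
	uint16_t refcnt;      /* Atomically accessed refcnt. */
	int returned_to_pool; /* Counts simulated mempool puts. */
};

/* Simulated replacement for rte_mempool_put(): just count the call. */
static void
demo_pool_put(struct demo_mprq_buf *buf)
{
	buf->returned_to_pool++;
}

/* Mirrors the init hunk: the buffer starts with one reference. */
static void
demo_buf_init(struct demo_mprq_buf *buf)
{
	buf->returned_to_pool = 0;
	__atomic_store_n(&buf->refcnt, 1, __ATOMIC_RELAXED);
}

/* Mirrors the free-callback hunk with relaxed ordering throughout. */
static void
demo_buf_free(struct demo_mprq_buf *buf)
{
	if (__atomic_load_n(&buf->refcnt, __ATOMIC_RELAXED) == 1) {
		demo_pool_put(buf);
	} else if (__atomic_sub_fetch(&buf->refcnt, 1,
				      __ATOMIC_RELAXED) == 0) {
		__atomic_store_n(&buf->refcnt, 1, __ATOMIC_RELAXED);
		demo_pool_put(buf);
	}
}

/*
 * Take one extra reference, then free twice. The first free only drops
 * the extra reference (2 -> 1); the second one sees refcnt == 1 and
 * returns the buffer. Returns the number of simulated puts, or -1 if
 * the buffer was returned too early.
 */
int
demo_lifecycle(void)
{
	struct demo_mprq_buf buf;

	demo_buf_init(&buf);
	__atomic_add_fetch(&buf.refcnt, 1, __ATOMIC_RELAXED);
	demo_buf_free(&buf);
	if (buf.returned_to_pool != 0)
		return -1;
	demo_buf_free(&buf);
	return buf.returned_to_pool;
}
```

In this sketch demo_lifecycle() returns 1: the buffer goes back to the
simulated pool exactly once, after the last reference is dropped.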