From: Phil Yang
To: Alexander Kozyrev, Honnappa Nagarahalli, Matan Azrad, Shahaf Shuler, Slava Ovsiienko
CC: "drc@linux.vnet.ibm.com", nd, "dev@dpdk.org", nd
Date: Mon, 27 Jul 2020 14:52:38 +0000
Subject: Re: [dpdk-dev] [PATCH v3] net/mlx5: relaxed ordering for multi-packet RQ buffer refcnt
References: <20200410164127.54229-7-gavin.hu@arm.com> <1592900807-13289-1-git-send-email-phil.yang@arm.com>
List-Id: DPDK patches and discussions

Alexander Kozyrev writes:

> > > > > > @@ -1790,9 +1792,9 @@ mlx5_rx_burst_mprq(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
> > > > > > void *buf_addr;
> > > > > >
> > > > > > /* Increment the refcnt of the whole chunk. */
> > > > > > -rte_atomic16_add_return(&buf->refcnt, 1);
> > > rte_atomic16_add_return includes a full barrier along with atomic
> > > operation.
> > > But is full barrier required here? For ex:
> > > __atomic_add_fetch(&buf->refcnt, 1,
> > > __ATOMIC_RELAXED) will offer atomicity, but no barrier. Would that be
> > > enough?
> > >
> > > > > > -MLX5_ASSERT((uint16_t)rte_atomic16_read(&buf->refcnt) <=
> > > > > > - strd_n + 1);
> > > > > > +__atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_ACQUIRE);
> >
> > The atomic load in MLX5_ASSERT() accesses the same memory space as the
> > previous __atomic_add_fetch() does.
> > They will access this memory space in the program order when we enabled
> > MLX5_PMD_DEBUG. So the ACQUIRE barrier in __atomic_add_fetch()
> > becomes unnecessary.
> >
> > By changing it to RELAXED ordering, this patch got 7.6% performance
> > improvement on N1 (making it generate A72 alike instructions).
> >
> > Could you please also try it on your testbed, Alex?
>
> Situation got better with this modification, here are the results:
> - no patch: 3.0 Mpps CPU cycles/packet=51.52
> - original patch: 2.1 Mpps CPU cycles/packet=71.05
> - modified patch: 2.9 Mpps CPU cycles/packet=52.79
> Also, I found that the degradation is there only in case I enable bursts stats.

Great! So this patch will not hurt the normal datapath performance.

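As a minimal sketch of the ordering argument above (not the mlx5 driver
code): within one thread, a relaxed atomic add followed by a relaxed load
of the same object is still observed in program order, so the debug check
does not need ACQUIRE on the increment. The struct `chunk` and the helper
`chunk_ref()` are hypothetical stand-ins for the MPRQ buffer and its
refcnt; `strd_n` is passed in only to mirror the assertion in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the MPRQ buffer; only the refcnt matters here. */
struct chunk {
	uint16_t refcnt;
};

/*
 * Take one more reference with RELAXED ordering and return the new count.
 * The add is atomic but implies no barrier. The debug-style check below
 * reads the same object from the same thread, so it cannot observe a
 * value older than the add that precedes it in program order.
 */
static inline uint16_t
chunk_ref(struct chunk *buf, uint16_t strd_n)
{
	uint16_t now = __atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_RELAXED);

	assert(__atomic_load_n(&buf->refcnt, __ATOMIC_RELAXED) <= strd_n + 1);
	return now;
}
```

Called on a chunk whose refcnt starts at 1, `chunk_ref(&buf, 4)` returns 2
on the first call and 3 on the second.
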
> Could you please turn on the following config options and see if you can
> reproduce this as well?
> CONFIG_RTE_TEST_PMD_RECORD_CORE_CYCLES=y
> CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=y

Thanks, Alex. Some updates.

A slight (about 1%) throughput degradation was detected after we enabled
these two config options on the N1 SoC.

If we look inside the perf stats results, with this patch both
mlx5_rx_burst and mlx5_tx_burst consume fewer CPU cycles than the original
code. However, __memcpy_generic takes more cycles. I think that might be
the reason for the increase in CPU cycles per packet after applying this
patch.

Original code:
98.07%--pkt_burst_io_forward
        |
        |--44.53%--__memcpy_generic
        |
        |--35.85%--mlx5_rx_burst_mprq
        |
        |--15.94%--mlx5_tx_burst_none_empw
        |          |
        |          |--7.32%--mlx5_tx_handle_completion.isra.0
        |          |
        |           --0.50%--__memcpy_generic
        |
         --1.14%--memcpy@plt

Use C11 with RELAXED ordering:
99.36%--pkt_burst_io_forward
        |
        |--47.40%--__memcpy_generic
        |
        |--34.62%--mlx5_rx_burst_mprq
        |
        |--15.55%--mlx5_tx_burst_none_empw
        |          |
        |           --7.08%--mlx5_tx_handle_completion.isra.0
        |
         --1.17%--memcpy@plt

BTW, none of the atomic operations in this patch is a hotspot.

>
> > >
> > > Can you replace just the above line with the following lines and test it?
> > >
> > > __atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_RELAXED);
> > > __atomic_thread_fence(__ATOMIC_ACQ_REL);
> > >
> > > This should make the generated code same as before this patch. Let me
> > > know if you would prefer us to re-spin the patch instead (for testing).
> > >
> > > > > > +MLX5_ASSERT(__atomic_load_n(&buf->refcnt,
> > > > > > + __ATOMIC_RELAXED) <= strd_n + 1);
> > > > > > buf_addr = RTE_PTR_SUB(addr, RTE_PKTMBUF_HEADROOM);
> > > > > > /*
> > > > > > * MLX5 device doesn't use iova but it is necessary in a
> > > > > > diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> > > > > > index 26621ff..0fc15f3 100644
> > > > > > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > > > > > +++ b/drivers/net/mlx5/mlx5_rxtx.h
> > >
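
For reference, the two increment variants compared in this thread can be
put side by side as below. This is only a sketch with hypothetical helper
names (`ref_relaxed`, `ref_relaxed_fenced`), not driver code. Both return
the same counter value; they differ only in the ordering imposed on the
surrounding memory accesses. That the fenced form generates the same code
as rte_atomic16_add_return() is the expectation stated in the quoted
suggestion, and is not verified here.

```c
#include <stdint.h>

/* Atomic increment only: no ordering imposed on other memory accesses. */
static inline uint16_t
ref_relaxed(uint16_t *refcnt)
{
	return __atomic_add_fetch(refcnt, 1, __ATOMIC_RELAXED);
}

/*
 * Relaxed increment followed by an explicit ACQ_REL fence, as in the
 * quoted suggestion. The fence, not the add, carries the ordering.
 */
static inline uint16_t
ref_relaxed_fenced(uint16_t *refcnt)
{
	uint16_t now = __atomic_add_fetch(refcnt, 1, __ATOMIC_RELAXED);

	__atomic_thread_fence(__ATOMIC_ACQ_REL);
	return now;
}
```

Comparing the disassembly of these two helpers on the target (for example
on N1) is one way to see what the fence adds to the instruction stream.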