From: Honnappa Nagarahalli
To: Alexander Kozyrev, Phil Yang, Matan Azrad, Shahaf Shuler, Slava Ovsiienko
CC: "drc@linux.vnet.ibm.com", nd, "dev@dpdk.org", Honnappa Nagarahalli, nd
Date: Tue, 11 Aug 2020 05:20:52 +0000
References: <20200410164127.54229-7-gavin.hu@arm.com> <1592900807-13289-1-git-send-email-phil.yang@arm.com>
Subject: Re: [dpdk-dev] [PATCH v3] net/mlx5: relaxed ordering for multi-packet RQ buffer refcnt
List-Id: DPDK patches and discussions

> >
> > > > > > > > @@ -1790,9 +1792,9 @@ mlx5_rx_burst_mprq(void *dpdk_rxq,
> > > > struct
> > > > > > > > rte_mbuf **pkts, uint16_t pkts_n) void *buf_addr;
> > > > > > > >
> > > > > > > > /* Increment the refcnt of the whole chunk. */
> > > > > > > > -rte_atomic16_add_return(&buf->refcnt, 1);
> > > > > rte_atomic16_add_return includes a full barrier along with
> > > > > atomic
> > > > operation.
> > > > > But is full barrier required here? For ex:
> > > > > __atomic_add_fetch(&buf->refcnt, 1,
> > > > > __ATOMIC_RELAXED) will offer atomicity, but no barrier. Would
> > > > > that be enough?
> > > > >
> > > > > > > > -MLX5_ASSERT((uint16_t)rte_atomic16_read(&buf-
> > > > > > > > >refcnt) <=
> > > > > > > > - strd_n + 1);
> > > > > > > > +__atomic_add_fetch(&buf->refcnt, 1,
> > > > > > > > __ATOMIC_ACQUIRE);
> > > >
> > > > The atomic load in MLX5_ASSERT() accesses the same memory space as
> > > > the previous __atomic_add_fetch() does.
> > > > They will access this memory space in the program order when we
> > > > enabled MLX5_PMD_DEBUG. So the ACQUIRE barrier in
> > > > __atomic_add_fetch() becomes unnecessary.
> > > >
> > > > By changing it to RELAXED ordering, this patch got 7.6%
> > > > performance improvement on N1 (making it generate A72 alike
> instructions).
> > > >
> > > > Could you please also try it on your testbed, Alex?
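
For reference, a minimal sketch of the orderings being compared in this thread. It is not taken from the mlx5 PMD: `struct demo_buf` and the `demo_*` helpers are hypothetical names used only for illustration, and the code assumes the GCC/Clang `__atomic` builtins. The first helper is the relaxed increment, the second adds the standalone fence suggested elsewhere in this thread as a test variant, and the third mirrors the relaxed load used inside the assert.

```c
#include <stdint.h>

/* Hypothetical stand-in for the MPRQ buffer; only the refcnt matters here. */
struct demo_buf {
	uint16_t refcnt;
};

/*
 * Relaxed increment: the read-modify-write is atomic, but it imposes no
 * ordering on surrounding accesses to other memory locations.
 */
static inline uint16_t
demo_ref_relaxed(struct demo_buf *buf)
{
	return __atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_RELAXED);
}

/*
 * Relaxed increment followed by a standalone acquire-release fence.
 * Assumption: this is only meant as an experiment to bring back ordering
 * after the increment, not a recommended final form.
 */
static inline uint16_t
demo_ref_fenced(struct demo_buf *buf)
{
	uint16_t v = __atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_RELAXED);

	__atomic_thread_fence(__ATOMIC_ACQ_REL);
	return v;
}

/*
 * Relaxed load of the same variable. Accesses to a single atomic object
 * made by one thread are seen in program order by that thread, which is
 * the argument for not needing ACQUIRE just to feed the assert.
 */
static inline int
demo_ref_in_bounds(struct demo_buf *buf, uint16_t strd_n)
{
	return __atomic_load_n(&buf->refcnt, __ATOMIC_RELAXED) <= strd_n + 1;
}
```

Comparing the disassembly of the two increment helpers on the target core (for example with objdump) is probably the quickest way to see whether the fence changes the generated instructions.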
> > >
> > > Situation got better with this modification, here are the results:
> > > - no patch: 3.0 Mpps CPU cycles/packet=51.52
> > > - original patch: 2.1 Mpps CPU cycles/packet=71.05
> > > - modified patch: 2.9 Mpps CPU cycles/packet=52.79
> > > Also, I found
> > > that the degradation is there only in case I enable bursts stats.
> > > >
> > Great! So this patch will not hurt the normal datapath performance.
> > > >
> > > Could you please turn on the following config options and see if you
> > > can reproduce this as well?
> > > CONFIG_RTE_TEST_PMD_RECORD_CORE_CYCLES=y
> > > CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=y
> > >
> > Thanks, Alex. Some updates.
> >
> > Slightly (about 1%) throughput degradation was detected after we
> > enabled these two config options on N1 SoC.
> >
> > If we look insight the perf stats results, with this patch, both
> > mlx5_rx_burst and mlx5_tx_burst consume fewer CPU cycles than the
> original code.
> > However, __memcpy_generic takes more cycles. I think that might be the
> > reason for CPU cycles per packet increment after applying this patch.
> >
> > Original code:
> > 98.07%--pkt_burst_io_forward
> >           |
> >           |--44.53%--__memcpy_generic
> >           |
> >           |--35.85%--mlx5_rx_burst_mprq
> >           |
> >           |--15.94%--mlx5_tx_burst_none_empw
> >           |          |
> >           |          |--7.32%--mlx5_tx_handle_completion.isra.0
> >           |          |
> >           |           --0.50%--__memcpy_generic
> >           |
> >            --1.14%--memcpy@plt
> >
> > Use C11 with RELAXED ordering:
> > 99.36%--pkt_burst_io_forward
> >           |
> >           |--47.40%--__memcpy_generic
> >           |
> >           |--34.62%--mlx5_rx_burst_mprq
> >           |
> >           |--15.55%--mlx5_tx_burst_none_empw
> >           |          |
> >           |           --7.08%--mlx5_tx_handle_completion.isra.0
> >           |
> >            --1.17%--memcpy@plt
> >
> > BTW, all the atomic operations in this patch are not the hotspot.
>
> Phil, we are seeing much worse degradation on our ARM platform
> unfortunately.
> I don't think that discrepancy in memcpy can explain this behavior.
> Your patch is not touching this area of code. Let me collect some perf stat on
> our side.

Are you testing the patch as is or have you made the changes that were discussed in the thread?

>
> > > > > > > > > >
> > > > > Can you replace just the above line with the following lines and test it?
> > > > >
> > > > > __atomic_add_fetch(&buf->refcnt, 1, __ATOMIC_RELAXED);
> > > > > __atomic_thread_fence(__ATOMIC_ACQ_REL);
> > > > >
> > > > > This should make the generated code same as before this patch.
> > > > > Let me know if you would prefer us to re-spin the patch instead
> > > > > (for
> > testing).
> > > > >
> > > > > > > > +MLX5_ASSERT(__atomic_load_n(&buf->refcnt,
> > > > > > > > + __ATOMIC_RELAXED) <= strd_n + 1);
> > > > > > > > buf_addr = RTE_PTR_SUB(addr, RTE_PKTMBUF_HEADROOM);
> > > > > > > > /*
> > > > > > > > * MLX5 device doesn't use iova but it is necessary in a
> > > > > > > diff
> > > > > > > > --git a/drivers/net/mlx5/mlx5_rxtx.h
> > > > > > > > b/drivers/net/mlx5/mlx5_rxtx.h index 26621ff..0fc15f3
> > > > > > > > 100644
> > > > > > > > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > > > > > > > +++ b/drivers/net/mlx5/mlx5_rxtx.h
> > > > > > >