From: Feifei Wang
To: Ferruh Yigit
CC: "dev@dpdk.org", nd, nd
Subject: RE: [PATCH v11 0/4] Recycle mbufs from Tx queue into Rx queue
Date: Tue, 22 Aug 2023 07:33:22 +0000
References: <20220420081650.2043183-1-feifei.wang2@arm.com> <20230822072710.1945027-1-feifei.wang2@arm.com>
In-Reply-To: <20230822072710.1945027-1-feifei.wang2@arm.com>
List-Id: DPDK patches and discussions

Hi, Ferruh

Would you please give some comments on these patches?
If there are no comments, could mbufs recycle mode be merged into the dpdk-next branch?

Thanks very much.

Best Regards
Feifei

> -----Original Message-----
> From: Feifei Wang
> Sent: Tuesday, August 22, 2023 3:27 PM
> Cc: dev@dpdk.org; nd; Feifei Wang
> Subject: [PATCH v11 0/4] Recycle mbufs from Tx queue into Rx queue
>
> Currently, the transmit side frees the buffers into the lcore cache and the
> receive side allocates buffers from the lcore cache. The transmit side typically
> frees 32 buffers, resulting in 32*8=256B of stores to the lcore cache. The receive
> side allocates 32 buffers and stores them in the receive side software ring,
> resulting in 32*8=256B of stores and 256B of loads from the lcore cache.
>
> This patch proposes a mechanism to avoid freeing to/allocating from the lcore
> cache, i.e. the receive side will free the buffers from the transmit side directly into
> its software ring. This avoids the 256B of loads and stores introduced by
> the lcore cache. It also frees up the cache lines used by the lcore cache. We
> call this mode mbufs recycle mode.
>
> In the latest version, mbufs recycle mode is packaged as a separate API.
> This allows users to change the rxq/txq pairing in real time in the data plane,
> according to the application's analysis of the packet flow, for example:
> -----------------------------------------------------------------------
> Step 1: upper application analyses the flow direction
> Step 2: recycle_rxq_info = rte_eth_recycle_rx_queue_info_get(rx_portid, rx_queueid)
> Step 3: rte_eth_recycle_mbufs(rx_portid, rx_queueid, tx_portid, tx_queueid, recycle_rxq_info);
> Step 4: rte_eth_rx_burst(rx_portid,rx_queueid);
> Step 5: rte_eth_tx_burst(tx_portid,tx_queueid);
> -----------------------------------------------------------------------
> The above lets the user change the rxq/txq pairing at run-time, and the user does
> not need to know the direction of flow in advance. This effectively expands
> the use scenarios of mbufs recycle mode.
>
> Furthermore, mbufs recycle mode is no longer limited to a single PMD: it can
> support moving mbufs between PMDs from different vendors, and can even put the
> mbufs anywhere into your Rx mbuf ring as long as the address of the mbuf
> ring can be provided.
> In the latest version, we enable mbufs recycle mode in the i40e PMD and the ixgbe
> PMD, and also try using the i40e driver for Rx and the ixgbe driver for Tx, achieving
> a 7-9% performance improvement with mbufs recycle mode.
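
[Editorial sketch] The core of the mechanism quoted above, moving completed Tx
buffers straight into empty Rx software-ring slots, can be illustrated with a toy
model. This is not DPDK code: the ring layout, the struct fields and the name
toy_recycle_mbufs are all hypothetical, chosen only to show one pointer copy per
buffer with no mempool cache in between. A real PMD would also have to check
reference counts and the source mempool, as the v7 notes in this cover letter mention.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE 64u /* hypothetical ring depth */
#define TOY_BURST     32u /* the 32-buffer burst used in the text */

struct toy_mbuf { int id; };

/* Hypothetical stand-ins for a PMD's Tx and Rx software rings. */
struct toy_txq {
    struct toy_mbuf *sw_ring[TOY_RING_SIZE];
    unsigned int next_done; /* oldest buffer the NIC has finished sending */
    unsigned int nb_done;   /* completed buffers not yet freed */
};

struct toy_rxq {
    struct toy_mbuf *sw_ring[TOY_RING_SIZE];
    unsigned int refill_head; /* first empty slot waiting for a buffer */
    unsigned int nb_empty;    /* number of empty slots */
};

/*
 * Move up to TOY_BURST completed Tx buffers directly into empty Rx slots:
 * one pointer copy per buffer, tx_sw_ring -> rx_sw_ring.
 * Returns the number of buffers recycled.
 */
static unsigned int
toy_recycle_mbufs(struct toy_rxq *rxq, struct toy_txq *txq)
{
    unsigned int n = TOY_BURST;
    unsigned int i;

    assert(rxq != NULL && txq != NULL);

    /* Never take more than Tx has completed or Rx has room for. */
    if (n > txq->nb_done)
        n = txq->nb_done;
    if (n > rxq->nb_empty)
        n = rxq->nb_empty;

    for (i = 0; i < n; i++) {
        /* Both indices wrap around their ring. */
        unsigned int t = (txq->next_done + i) % TOY_RING_SIZE;
        unsigned int r = (rxq->refill_head + i) % TOY_RING_SIZE;

        rxq->sw_ring[r] = txq->sw_ring[t];
        txq->sw_ring[t] = NULL;
    }

    txq->next_done = (txq->next_done + n) % TOY_RING_SIZE;
    txq->nb_done -= n;
    rxq->refill_head = (rxq->refill_head + n) % TOY_RING_SIZE;
    rxq->nb_empty -= n;
    return n;
}
```

In this model the caller would invoke toy_recycle_mbufs() before its Rx burst, in
the same position as Step 3 in the quoted sequence. Because the function only
needs the two ring structures, the Rx and Tx queues could belong to different
drivers, which is the property the cover letter describes.
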
>
> Difference between mbufs recycle, the ZC API used in mempool and the general path:
> For general path:
>     Rx: 32 pkts memcpy from mempool cache to rx_sw_ring
>     Tx: 32 pkts memcpy from tx_sw_ring to temporary variable + 32 pkts
>         memcpy from temporary variable to mempool cache
> For ZC API used in mempool:
>     Rx: 32 pkts memcpy from mempool cache to rx_sw_ring
>     Tx: 32 pkts memcpy from tx_sw_ring to zero-copy mempool cache
>     Refer link:
>     http://patches.dpdk.org/project/dpdk/patch/20230221055205.22984-2-kamalakshitha.aligeri@arm.com/
> For mbufs recycle:
>     Rx/Tx: 32 pkts memcpy from tx_sw_ring to rx_sw_ring
> Thus, in one loop, mbufs recycle mode saves 32+32=64 pkts memcpy compared to
> the general path, and 32 pkts memcpy compared to the ZC API used in mempool.
> So, mbufs recycle has its own benefits.
>
> Testing status:
> (1) dpdk l3fwd test with multiple drivers:
>     port 0: 82599 NIC   port 1: XL710 NIC
> -------------------------------------------------------------
>                 Without fast free       With fast free
> Thunderx2:      +7.53%                  +13.54%
> -------------------------------------------------------------
>
> (2) dpdk l3fwd test with same driver:
>     port 0 && 1: XL710 NIC
> -------------------------------------------------------------
>                 Without fast free       With fast free
> Ampere altra:   +12.61%                 +11.42%
> n1sdp:          +8.30%                  +3.85%
> x86-sse:        +8.43%                  +3.72%
> -------------------------------------------------------------
>
> (3) Performance comparison with ZC_mempool used
>     port 0 && 1: XL710 NIC
>     with fast free
> -------------------------------------------------------------
>                 With recycle buffer     With zc_mempool
> Ampere altra:   11.42%                  3.54%
> -------------------------------------------------------------
>
> Furthermore, we add a recycle_mbufs engine to testpmd. Because the XL710 NIC has an
> I/O bottleneck in testpmd on Ampere Altra, we cannot see a throughput change
> compared with the I/O fwd engine.
> However, using the record command in testpmd:
> '$set record-burst-stats on'
> we can see that the ratio of 'Rx/Tx burst size of 32' is reduced. This indicates that mbufs
> recycle can save CPU cycles.
>
> v2:
> 1. Use data-plane API to enable direct-rearm (Konstantin, Honnappa)
> 2. Add 'txq_data_get' API to get txq info for Rx (Konstantin)
> 3. Use input parameter to enable direct rearm in l3fwd (Konstantin)
> 4. Add condition detection for direct rearm API (Morten, Andrew Rybchenko)
>
> v3:
> 1. Separate Rx and Tx operation with two APIs in direct-rearm (Konstantin)
> 2. Delete L3fwd change for direct rearm (Jerin)
> 3. enable direct rearm in ixgbe driver in Arm
>
> v4:
> 1. Rename direct-rearm as buffer recycle. Based on this, function names and
>    variable names are changed to make this mode more general for all drivers.
>    (Konstantin, Morten)
> 2. Add ring wrapping check (Konstantin)
>
> v5:
> 1. some change for ethdev API (Morten)
> 2. add support for avx2, sse, altivec path
>
> v6:
> 1. fix ixgbe build issue in ppc
> 2. remove 'recycle_tx_mbufs_reuse' and 'recycle_rx_descriptors_refill'
>    API wrapper (Tech Board meeting)
> 3. add recycle_mbufs engine in testpmd (Tech Board meeting)
> 4. add namespace in the functions related to mbufs recycle (Ferruh)
>
> v7:
> 1. move 'rxq/txq data' pointers to the beginning of eth_dev structure, in order
>    to keep them in the same cache line as rx/tx_burst function pointers (Morten)
> 2. add the extra description for 'rte_eth_recycle_mbufs' to show it can support
>    feeding 1 Rx queue from 2 Tx queues in the same thread (Konstantin)
> 3. For i40e/ixgbe driver, mark the previously copied buffers as invalid if there are
>    Tx buffers with refcnt > 1 or from an unexpected mempool (Konstantin)
> 4. add check for the return value of 'rte_eth_recycle_rx_queue_info_get'
>    in testpmd fwd engine (Morten)
>
> v8:
> 1. add arm/x86 build option to fix ixgbe build issue in ppc
>
> v9:
> 1. delete duplicate file name for ixgbe
>
> v10:
> 1. fix compile issue on windows
>
> v11:
> 1. fix doc warning
>
> Feifei Wang (4):
>   ethdev: add API for mbufs recycle mode
>   net/i40e: implement mbufs recycle mode
>   net/ixgbe: implement mbufs recycle mode
>   app/testpmd: add recycle mbufs engine
>
>  app/test-pmd/meson.build                      |   1 +
>  app/test-pmd/recycle_mbufs.c                  |  58 ++++++
>  app/test-pmd/testpmd.c                        |   1 +
>  app/test-pmd/testpmd.h                        |   3 +
>  doc/guides/rel_notes/release_23_11.rst        |  15 ++
>  doc/guides/testpmd_app_ug/run_app.rst         |   1 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst   |   5 +-
>  drivers/net/i40e/i40e_ethdev.c                |   1 +
>  drivers/net/i40e/i40e_ethdev.h                |   2 +
>  .../net/i40e/i40e_recycle_mbufs_vec_common.c  | 147 ++++++++++++++
>  drivers/net/i40e/i40e_rxtx.c                  |  32 ++++
>  drivers/net/i40e/i40e_rxtx.h                  |   4 +
>  drivers/net/i40e/meson.build                  |   1 +
>  drivers/net/ixgbe/ixgbe_ethdev.c              |   1 +
>  drivers/net/ixgbe/ixgbe_ethdev.h              |   3 +
>  .../ixgbe/ixgbe_recycle_mbufs_vec_common.c    | 143 ++++++++++++++
>  drivers/net/ixgbe/ixgbe_rxtx.c                |  37 +++-
>  drivers/net/ixgbe/ixgbe_rxtx.h                |   4 +
>  drivers/net/ixgbe/meson.build                 |   2 +
>  lib/ethdev/ethdev_driver.h                    |  10 +
>  lib/ethdev/ethdev_private.c                   |   2 +
>  lib/ethdev/rte_ethdev.c                       |  31 +++
>  lib/ethdev/rte_ethdev.h                       | 181 ++++++++++++++++++
>  lib/ethdev/rte_ethdev_core.h                  |  23 ++-
>  lib/ethdev/version.map                        |   3 +
>  25 files changed, 702 insertions(+), 9 deletions(-)
>  create mode 100644 app/test-pmd/recycle_mbufs.c
>  create mode 100644 drivers/net/i40e/i40e_recycle_mbufs_vec_common.c
>  create mode 100644 drivers/net/ixgbe/ixgbe_recycle_mbufs_vec_common.c
>
> --
> 2.25.1
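
[Editorial sketch] As a quick check of the copy counts in the quoted comparison of
the general path, the mempool ZC API and mbufs recycle, the arithmetic can be
tallied in a few lines of C. The names below (toy_path, toy_copies_per_loop,
toy_bytes) are made up for illustration and are not part of DPDK, and the 8-byte
pointer size is an assumption taken from the "32*8=256B" figure in the cover letter.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PTR_BYTES 8u /* assumes 64-bit mbuf pointers, as in 32*8=256B */

enum toy_path { TOY_GENERAL, TOY_ZC_MEMPOOL, TOY_RECYCLE };

/* Pointer copies per loop for one burst, following the quoted breakdown. */
static unsigned int
toy_copies_per_loop(enum toy_path path, unsigned int burst)
{
    switch (path) {
    case TOY_GENERAL:
        /* Rx: cache -> rx_sw_ring; Tx: tx_sw_ring -> temp, temp -> cache */
        return burst + 2 * burst;
    case TOY_ZC_MEMPOOL:
        /* Rx: cache -> rx_sw_ring; Tx: tx_sw_ring -> zero-copy cache */
        return burst + burst;
    case TOY_RECYCLE:
        /* Rx/Tx: tx_sw_ring -> rx_sw_ring */
        return burst;
    }
    assert(0 && "unknown path");
    return 0;
}

/* Bytes moved for a given number of pointer copies. */
static size_t
toy_bytes(unsigned int copies)
{
    return (size_t)copies * TOY_PTR_BYTES;
}
```

With a burst of 32, the three paths come to 96, 64 and 32 copies, so recycle mode
saves 64 copies against the general path and 32 against the ZC API, matching the
figures quoted in the cover letter.
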