From: Ferruh Yigit
Date: Wed, 19 Apr 2023 15:56:01 +0100
To: Feifei Wang, Qi Z Zhang, "Mcnamara, John"
Cc: dev@dpdk.org, konstantin.v.ananyev@yandex.ru, mb@smartsharesystems.com, nd@arm.com
Subject: Re: [PATCH v5 0/3] Recycle buffers from Tx to Rx
Message-ID: <8d0ec447-1182-119d-5a9e-21f95aecc917@amd.com>
In-Reply-To: <20230330062939.1206267-1-feifei.wang2@arm.com>
References: <20211224164613.32569-1-feifei.wang2@arm.com> <20230330062939.1206267-1-feifei.wang2@arm.com>
List-Id: DPDK patches and discussions

On 3/30/2023 7:29 AM, Feifei Wang wrote:
> Currently, the transmit side frees the buffers into the lcore cache and
> the receive side allocates buffers from the lcore cache. The transmit
> side typically frees 32 buffers resulting in 32*8=256B of stores to
> lcore cache. The receive side allocates 32 buffers and stores them in
> the receive side software ring, resulting in 32*8=256B of stores and
> 256B of loads from the lcore cache.
>
> This patch proposes a mechanism to avoid freeing to/allocating from
> the lcore cache, i.e. the receive side will free the buffers from the
> transmit side directly into its software ring. This will avoid the 256B
> of loads and stores introduced by the lcore cache. It also frees up the
> cache lines used by the lcore cache. We call this mode buffer
> recycle mode.
>
> In the latest version, buffer recycle mode is packaged as a separate API.
> This allows users to change rxq/txq pairing in real time in the data plane,
> according to the application's analysis of the packet flow, for example:
> -----------------------------------------------------------------------
> Step 1: upper application analyses the flow direction
> Step 2: rxq_buf_recycle_info = rte_eth_rx_buf_recycle_info_get(rx_portid, rx_queueid)
> Step 3: rte_eth_dev_buf_recycle(rx_portid, rx_queueid, tx_portid, tx_queueid, rxq_buf_recycle_info);
> Step 4: rte_eth_rx_burst(rx_portid,rx_queueid);
> Step 5: rte_eth_tx_burst(tx_portid,tx_queueid);
> -----------------------------------------------------------------------
> The above lets the user change rxq/txq pairing at runtime, and the user does not need to
> know the direction of flow in advance. This can effectively expand buffer recycle mode's
> use scenarios.
>
> Furthermore, buffer recycle mode is no longer limited to the same pmd;
> it can support moving buffers between different vendor pmds, and can even put the buffer
> anywhere into your Rx buffer ring as long as the address of the buffer ring can be provided.
> In the latest version, we enable direct-rearm in i40e pmd and ixgbe pmd, and also try
> using the i40e driver in Rx and the ixgbe driver in Tx, achieving a 7-9% performance improvement
> with buffer recycle mode.
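As a rough illustration of the recycle step in the sequence quoted above, here is a minimal, self-contained C model. It does not use DPDK: `struct toy_tx_ring`, `struct toy_rx_ring` and `toy_buf_recycle()` are hypothetical names, buffers are plain pointers, and the ring layout is an assumption chosen only to show completed Tx buffers being moved straight into empty Rx software-ring slots with no lcore cache in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 64 /* power of two, so indices wrap with a mask */
#define BURST     32 /* burst size used in the cover letter's figures */

/* Hypothetical software rings: arrays of buffer pointers, not DPDK types. */
struct toy_tx_ring {
    void    *bufs[RING_SIZE];
    uint16_t next_free; /* oldest transmitted buffer awaiting release */
    uint16_t nb_done;   /* buffers the (imagined) NIC has finished sending */
};

struct toy_rx_ring {
    void    *bufs[RING_SIZE];
    uint16_t refill_head; /* next empty slot to re-arm */
    uint16_t nb_empty;    /* slots waiting for a fresh buffer */
};

/*
 * Move up to BURST completed Tx buffers directly into empty Rx slots.
 * Each moved buffer costs one pointer load and one pointer store; no
 * intermediate cache is read or written. Returns the number moved.
 */
static uint16_t toy_buf_recycle(struct toy_rx_ring *rx, struct toy_tx_ring *tx)
{
    assert(rx != NULL && tx != NULL);

    uint16_t n = BURST;
    if (n > tx->nb_done)
        n = tx->nb_done;
    if (n > rx->nb_empty)
        n = rx->nb_empty;

    for (uint16_t i = 0; i < n; i++) {
        uint16_t t = (uint16_t)((tx->next_free + i) & (RING_SIZE - 1));
        uint16_t r = (uint16_t)((rx->refill_head + i) & (RING_SIZE - 1));

        rx->bufs[r] = tx->bufs[t]; /* Tx ring entry becomes an Rx ring entry */
        tx->bufs[t] = NULL;        /* the Tx slot no longer owns the buffer */
    }

    tx->next_free = (uint16_t)((tx->next_free + n) & (RING_SIZE - 1));
    tx->nb_done = (uint16_t)(tx->nb_done - n);
    rx->refill_head = (uint16_t)((rx->refill_head + n) & (RING_SIZE - 1));
    rx->nb_empty = (uint16_t)(rx->nb_empty - n);
    return n;
}
```

In this model the caller would invoke `toy_buf_recycle()` before its Rx and Tx burst calls, mirroring the order of Steps 3 to 5; the index masking stands in for the ring wrapping check mentioned in the v4 changelog.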
>
> Difference between buffer recycle, ZC API used in mempool and general path
> For general path:
>                 Rx: 32 pkts memcpy from mempool cache to rx_sw_ring
>                 Tx: 32 pkts memcpy from tx_sw_ring to temporary variable + 32 pkts memcpy from temporary variable to mempool cache
> For ZC API used in mempool:
>                 Rx: 32 pkts memcpy from mempool cache to rx_sw_ring
>                 Tx: 32 pkts memcpy from tx_sw_ring to zero-copy mempool cache
>                 Refer link: http://patches.dpdk.org/project/dpdk/patch/20230221055205.22984-2-kamalakshitha.aligeri@arm.com/
> For buffer recycle:
>                 Rx/Tx: 32 pkts memcpy from tx_sw_ring to rx_sw_ring
> Thus, in one loop, buffer recycle saves 32+32=64 pkts memcpy compared to the general path;
> compared to the ZC API used in mempool, buffer recycle saves 32 pkts memcpy in each loop.
> So, buffer recycle has its own benefits.
>
> Testing status:
> (1) dpdk l3fwd test with multiple drivers:
>     port 0: 82599 NIC   port 1: XL710 NIC
>     -------------------------------------------------------------
>                     Without fast free       With fast free
>     Thunderx2:          +7.53%                  +13.54%
>     -------------------------------------------------------------
>
> (2) dpdk l3fwd test with same driver:
>     port 0 && 1: XL710 NIC
>     -------------------------------------------------------------
>                     Without fast free       With fast free
>     Ampere altra:       +12.61%                 +11.42%
>     n1sdp:              +8.30%                  +3.85%
>     x86-sse:            +8.43%                  +3.72%
>     -------------------------------------------------------------
>
> (3) Performance comparison with ZC_mempool used
>     port 0 && 1: XL710 NIC
>     with fast free
>     -------------------------------------------------------------
>                     With recycle buffer     With zc_mempool
>     Ampere altra:       11.42%                  3.54%
>     -------------------------------------------------------------
>

Thanks for the perf test reports.

Since the tests were run on Intel NICs, it would be great to get some testing
and performance numbers from the Intel side too, if possible.

> V2:
> 1. Use data-plane API to enable direct-rearm (Konstantin, Honnappa)
> 2. Add 'txq_data_get' API to get txq info for Rx (Konstantin)
> 3. Use input parameter to enable direct rearm in l3fwd (Konstantin)
> 4. Add condition detection for direct rearm API (Morten, Andrew Rybchenko)
>
> V3:
> 1. Separate Rx and Tx operation with two APIs in direct-rearm (Konstantin)
> 2. Delete L3fwd change for direct rearm (Jerin)
> 3. Enable direct rearm in ixgbe driver in Arm
>
> v4:
> 1. Rename direct-rearm as buffer recycle. Based on this, function and
> variable names are changed to make this mode more general for all
> drivers. (Konstantin, Morten)
> 2. Add ring wrapping check (Konstantin)
>
> v5:
> 1. Some changes for ethdev API (Morten)
> 2. Add support for avx2, sse, altivec path
>
> Feifei Wang (3):
>   ethdev: add API for buffer recycle mode
>   net/i40e: implement recycle buffer mode
>   net/ixgbe: implement recycle buffer mode
>
>  drivers/net/i40e/i40e_ethdev.c   |   1 +
>  drivers/net/i40e/i40e_ethdev.h   |   2 +
>  drivers/net/i40e/i40e_rxtx.c     | 159 +++++++++++++++++++++
>  drivers/net/i40e/i40e_rxtx.h     |   4 +
>  drivers/net/ixgbe/ixgbe_ethdev.c |   1 +
>  drivers/net/ixgbe/ixgbe_ethdev.h |   3 +
>  drivers/net/ixgbe/ixgbe_rxtx.c   | 153 ++++++++++++++++++++
>  drivers/net/ixgbe/ixgbe_rxtx.h   |   4 +
>  lib/ethdev/ethdev_driver.h       |  10 ++
>  lib/ethdev/ethdev_private.c      |   2 +
>  lib/ethdev/rte_ethdev.c          |  33 +++++
>  lib/ethdev/rte_ethdev.h          | 230 +++++++++++++++++++++++++++++++
>  lib/ethdev/rte_ethdev_core.h     |  15 +-
>  lib/ethdev/version.map           |   6 +
>  14 files changed, 621 insertions(+), 2 deletions(-)
>

Is a usage sample of these new APIs planned? Could it be a new forwarding
mode in testpmd?
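
For reference, the per-loop copy counts quoted in the cover letter's path comparison can be tallied with a small stand-alone C helper. This is only arithmetic on the figures given there (32 buffers per burst); the enum and function names are hypothetical and nothing here calls DPDK.

```c
#include <assert.h>

/* The three paths compared in the cover letter (names are illustrative). */
enum toy_path { TOY_GENERAL, TOY_ZC_MEMPOOL, TOY_BUF_RECYCLE };

/* Number of per-packet pointer copies in one Rx+Tx loop for a given burst. */
static unsigned toy_copies_per_loop(enum toy_path path, unsigned burst)
{
    switch (path) {
    case TOY_GENERAL:
        /* Rx: mempool cache -> rx_sw_ring;
         * Tx: tx_sw_ring -> temporary, then temporary -> mempool cache */
        return burst + 2 * burst;
    case TOY_ZC_MEMPOOL:
        /* Rx: mempool cache -> rx_sw_ring;
         * Tx: tx_sw_ring -> zero-copy mempool cache */
        return burst + burst;
    case TOY_BUF_RECYCLE:
        /* Rx/Tx: tx_sw_ring -> rx_sw_ring */
        return burst;
    }
    assert(0 && "unknown path");
    return 0;
}
```

With a burst of 32 this gives 96, 64 and 32 copies respectively, which matches the stated savings of 64 copies against the general path and 32 against the ZC mempool path.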