From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 399104613C; Mon, 27 Jan 2025 10:46:14 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id BC54C40275; Mon, 27 Jan 2025 10:46:13 +0100 (CET) Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.12]) by mails.dpdk.org (Postfix) with ESMTP id 1176940265 for ; Mon, 27 Jan 2025 10:46:11 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1737971173; x=1769507173; h=from:to:cc:subject:date:message-id:references: in-reply-to:content-transfer-encoding:mime-version; bh=wtBpKHQVRrCOuvFCxA8i2xQp3hHAIKVtkSvkXrNmZos=; b=b8Dwjzu+bx1kRECDnvhbu2RYx8PRiYsqqqaGKOV8kp6wH6cIqb0JJ3RT jtno6E3pQxxhrNMqrFvoiG8XJbxaiXmWgmuGIk3gq3m2bacj5r4WW8+V6 N3cywTMi5TYVEPfQkNwfB53caBiLm+zRIi+uL9Cyc1vvVRgYFty+jJotr alPNdKua1WEz1PB74SZp/iCW4ai4EjwPA3amKGBNb2KqlVc+CnAVlk4Ro YiWLzYO0w93HnaAkW+vniSPBmXiSCpD34AA5OoZEAzEhpaO1cJik3uKM6 EDAY72wSCBMx2WoHZwhF9hGDA62MGWpfMoT2hs9puWv8oj8scGnOwMq/4 A==; X-CSE-ConnectionGUID: H4OUgsUKS7GaFgloMWdpPw== X-CSE-MsgGUID: iHPPnQS7SIia0sVF5paV2A== X-IronPort-AV: E=McAfee;i="6700,10204,11327"; a="49815504" X-IronPort-AV: E=Sophos;i="6.13,238,1732608000"; d="scan'208";a="49815504" Received: from orviesa010.jf.intel.com ([10.64.159.150]) by orvoesa104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Jan 2025 01:46:11 -0800 X-CSE-ConnectionGUID: M0WLaEMRRI6uV7TOxpBQPA== X-CSE-MsgGUID: hcFtSyizRIiPijYzg1FUfw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="108252036" Received: from orsmsx601.amr.corp.intel.com ([10.22.229.14]) by orviesa010.jf.intel.com with ESMTP/TLS/AES256-GCM-SHA384; 27 Jan 2025 01:46:11 -0800 Received: from orsmsx601.amr.corp.intel.com (10.22.229.14) by ORSMSX601.amr.corp.intel.com (10.22.229.14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.44; Mon, 27 Jan 2025 01:46:09 -0800 Received: from ORSEDG601.ED.cps.intel.com (10.7.248.6) by orsmsx601.amr.corp.intel.com (10.22.229.14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.44 via Frontend Transport; Mon, 27 Jan 2025 01:46:09 -0800 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (104.47.59.174) by edgegateway.intel.com (134.134.137.102) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.44; Mon, 27 Jan 2025 01:46:09 -0800 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=spAVck7wBXLChkPrzuJ90GotfJMnEl2J57uIsJ3aZzYdFe5qCxR4C/zt2gD9U+GR6BZS/lxW7HkKlniHKTQQfaKWbhjcMVrO5YtkSXvRp1cN2c+q1En5EUY9j/qLXIDEp3ZeZYAHnpMROgLL/SknenZdrvjFa6Tv0bqBUlrVuz5YQbSyz4RnTmBIm3ARh6wflhk+HwUXHaxFChzvq3VRE9pXvqUmbYgaaXhoVBgXdg+g4tvNR5vuA0IjrH/a4AZuQ5f2pZn8IX4fMYkaDBtighOr8qNu5UiRx/4KiBP2iF6E+0sDX1+mUXe/ngswiEQxvgoiyHNH4l+96aMZtOYsDg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=uSv6atIcNdPPVetu0jd3zQLEqb7vbsYiLJVGhzyHeAQ=; b=tyJtLwWq1GSDwxZyYeGDPkIMfryzmrSO48QUU/tO4ljmcZcg6GQe649FMUfVYNOwWXGdo2GOQmqponKDhF0GYzb/CszCdkTLp9H935BnCaZOGnCrpvdsUTrBMS6ddQE8kmOy48RqbVt1hd+TIjJgFf5GZ4GNW3zYQI7D2TnDzVotwXFuaICUUnCjpbeLwT/ZeICGXWJAbFCaZmRKoYi2gbZ6Nf3M0h0wx+iSRZa5gDn5K90MiVONcmj19X4WiR+A57DTPDW/E6NJZMC6aZQw/zB0qf0iq65/s6gwBNVXgMEV0IyFBtjw0fcn+k6VrqpPx5WqbwlwIFH0Af9lzUOkLQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none Received: from SJ0PR11MB5918.namprd11.prod.outlook.com (2603:10b6:a03:42c::22) by SA0PR11MB4525.namprd11.prod.outlook.com (2603:10b6:806:9d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8377.23; Mon, 27 Jan 2025 09:46:07 +0000 Received: from SJ0PR11MB5918.namprd11.prod.outlook.com ([fe80::891b:9bb3:428a:c72a]) by SJ0PR11MB5918.namprd11.prod.outlook.com ([fe80::891b:9bb3:428a:c72a%6]) with mapi id 15.20.8377.021; Mon, 27 Jan 2025 09:46:07 +0000 From: "Wani, Shaiq" To: "Richardson, Bruce" CC: "dev@dpdk.org" , "Singh, Aman Deep" Subject: RE: [PATCH 2/2] common/idpf: enable AVX2 for single queue Tx Thread-Topic: [PATCH 2/2] common/idpf: enable AVX2 for single queue Tx Thread-Index: AQHbYccz/ftkjWcXQESDGqrBb5/NxbMfyiqAgAqbZQA= Date: Mon, 27 Jan 2025 09:46:07 +0000 Message-ID: References: <20250108121757.170494-1-shaiq.wani@intel.com> <20250108121757.170494-3-shaiq.wani@intel.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=intel.com; x-ms-publictraffictype: Email x-ms-traffictypediagnostic: SJ0PR11MB5918:EE_|SA0PR11MB4525:EE_ x-ms-office365-filtering-correlation-id: f1567d17-5f4c-44cf-4738-08dd3eb76c81 x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; ARA:13230040|376014|366016|1800799024|38070700018|7053199007; x-microsoft-antispam-message-info: =?us-ascii?Q?UY5SxueeZkbSbN3+z7Lfer8pfl/UrP05uTs5n7KTOf6zNEehKSPm0/yNKm4N?= =?us-ascii?Q?P+P8wDRI1o9DJGENbHZNyQCqPlOum5bfR9yYczXOCz2jXuwfTIHWqQqnPFly?= =?us-ascii?Q?I2B+PVY0/KvvAa9j8VEgNSxj0iQVbJIgAgR/mqwLRP6wcJ/nST2T9qRHFHNG?= =?us-ascii?Q?95qzHjVyD3J1XWI5+eMUsTr8+6vTic4OzVxUx/+UW3DmeAXJq2l/O+FdV0N9?= =?us-ascii?Q?phqTYiQ1J5x8yD29ohMUUKYlCFt0YagQ0TL+RP0d4XKCGWwW8IHNcUBZaIQx?= =?us-ascii?Q?u4WnlHsD8lySihIas4BsKZ+9c5oOojp9vtK9iYAE+efGhEp3qnrPIy5Slxam?= =?us-ascii?Q?ISFqzhC2ZaXOfZl9IPudBASAb8bC+PxPivz9zDDuWtwre7R0cAUIX4i8VDNz?= =?us-ascii?Q?bNB2+s1WYSjR05mmnfceiU+3IeyZKkPsHwC5VjuycFymtyVvhD/1dOB1v2NU?= =?us-ascii?Q?bBc/kxUWiJekHQQpD/dkLHjgzqzmHsAAeT2AsUCO/qbl7a1VoZqRviLt4age?= =?us-ascii?Q?wKLnMeYC8AbXcCxTUenkmBxEQDq7ZI+T0aBf7Wh9Nsi57r97i++eEgCSDGgJ?= =?us-ascii?Q?gPGuSAcqC7lkhtCnaknMaWKWdcTZomx2DX1MVr2XvMUbrSZqYKH5mtRSi9Y2?= =?us-ascii?Q?3/wppmZkffzOx2I8pWJAfv4GSYqtwB6FpbKmHMxkagivnkukIi+RnQ4cAyDi?= =?us-ascii?Q?Xf0Ac5Ew4Wd0ge7DWx78BJSkytNMFaOOAf82gdTO6JOghJEl5qt0Mbw3hefj?= =?us-ascii?Q?BqN+k774ZSq2bQkYAKrEnOhyD/3z6D311RVODvd9leZEXNK1vYzFqToIFugG?= =?us-ascii?Q?U4qYchRfV3PlN6B/L0yvDAYCveTY2pqSsfT4zmFFbRPpVqSRlrHk8N/adL7R?= =?us-ascii?Q?bwdKENxNqGw/vpAG2zXKIE4CyMvH/k9V6uvxtMt1LTkOBfcvdqX3k+O5Ie4C?= =?us-ascii?Q?pXGAGzUsbw8YAkr+uHEoOe6fJfZFUruQdA68Z82ICtJl0xVNNpY1+R2odAIt?= =?us-ascii?Q?OT6UcpDS5VavRCTQbHX4VObCh/XJBl5AKEMjkkJ6pDbgvdIhugcIpGM8lTdL?= =?us-ascii?Q?azh1Szt09Pk3d423BvqlVXpHRLFPf499bIANOfzr8uibiDh/jmM0JMIKJjPB?= =?us-ascii?Q?Ma2BP7XLsdz8S1A1GlbUk8U6pALlkcaXpIB/mFTrppsMwSjCm4XKM17etlzv?= =?us-ascii?Q?gp71gRzJNyqO/ppQu+LY5tbkUJ6TOgNRYRosDmyKucZOOr2fJ0T9u4XMwbb6?= =?us-ascii?Q?4Aax7LvBIU94xZ147vlWPd58V/z4yAeF4xQX1xhvgms18pxn4/uObnTHJa42?= =?us-ascii?Q?AVdvivJMNUKAfdtYLumxKZsO1BurJx70ZIIRQi8a44NNBpdGset+wJug8sS/?= =?us-ascii?Q?b3g47FMCntMSCDGnHv+8aYDS+HIL?= x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:SJ0PR11MB5918.namprd11.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(366016)(1800799024)(38070700018)(7053199007); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: =?us-ascii?Q?W5v2R7vetPr1Vwz5jxf7AY7B39ttyC05WgjW68XsXNeF3p3gSzfxpaiNAdyE?= =?us-ascii?Q?QiCA2b5kY/wGSMR7WOou7hKzFWvwQH/K6l6HE4Bv/Sb1K3ZE0ENV2JMBBFMS?= =?us-ascii?Q?snOpp+TxOwG6L0Ya2v8vqA1xyXdazLIuBGN5PhhC3J8ZMNj9fYEIEU/G6RHx?= =?us-ascii?Q?vBusD8npwVQKQlGGFFcS5UN/k+9+Zuoa7xzlRP3+4OlcWEN/Y2J+i5A/DVOb?= =?us-ascii?Q?J1ZbaXcSTYjhqTnBkQQOZwCULWLMmd1zI/c2xIdPgKgbtOQXQ64qtYLrl1dT?= =?us-ascii?Q?vRJZBgUF2Wf8BSB6W9rvzK1iZWGzwOIIOtU3CQeR2UXQFfssi6bOszclRf7V?= =?us-ascii?Q?21nJvvFd3auEm9UVpD5JzhzkFN4lt6nyZck1TqPQdi+tiAurN9I8vZ9H3Gpn?= =?us-ascii?Q?lGP+/3jbMQnhtpUIb/291xQUIq5qb7u6tHHa0snVytA9qxR1O2kTrdbQPtPU?= =?us-ascii?Q?AP1DiJMvJVBC8uixTlrthlYgqtHnxoZaD+PUD3kFi39F3lB0K3/+mkEG5fGP?= =?us-ascii?Q?APykECmcF2G8B1qhpJnyDdl0oexDEfjRxCcJieC8i4PkHjU8Yo6Ie6Y9auSx?= =?us-ascii?Q?tM5RK+5iccC/yP5sqF525CaD8J07md1dDo7/t+rBv6oc2bH3wwzbQtTZmpmZ?= =?us-ascii?Q?Y8/uwmmSoCfUiMByzJyA8o/P3cbYl/ZspO9wBUx3trl74yswSCRQbyPn/M2W?= =?us-ascii?Q?m28U6X20L4dULrEiCCZLShtgPoNeMpdtLSKQ4mLPHqgjxM+2DfbaVZQzs2Dc?= =?us-ascii?Q?1+Lc1+ITzILLm+/2PQltwiW4S503sOjeQMNTf3+NTv4FsNAaY1bQGZtkBAcs?= =?us-ascii?Q?Y54unRfGMzQT+SnVsJIdlPIflXn7vAJdd0KPICZD41sKgPZ4nO1pQVrUI2Nw?= =?us-ascii?Q?qygSQgWSdkvdxHM4tkbmvOWt6fYy52g9uzq38UkH7+cuysvf6ywLkvnWSNzV?= =?us-ascii?Q?M/PoXd0HRSJxW9voVvKYa/j+0EBhAjolfNJkoXtsVWO+HR2UuhC2ockmO/FX?= =?us-ascii?Q?BDxe3yFIMjFV7v8hS2HBlCS95WaH/5M+4qQ0Ry9Md8JPthvm9wI6c4sxZKDF?= =?us-ascii?Q?HkhEyhUN+hp4CbUC+WDrojQIdWauYOJNqwfuL6tNu550bdxygB/hlGKYh7s/?= =?us-ascii?Q?QhtnMq+S2uWUMunlg9A+tX6ojc0O4+F5EhIFaCAylGVPz0/cVXmPTKsNFOxK?= =?us-ascii?Q?lWS1gOCTty1a8TkcgczyaEJnhByRYjJWt7ps+EyOvn4kAdoav+w6yI0furjU?= =?us-ascii?Q?vo3yTSEtxiemAG6sqnh+zKX8wHag2lniE97LYrsZ9EIRkAFbE8VfYe4qDgFG?= =?us-ascii?Q?8EdyDaCIeGa392T9SUToHOVNfQY2xJn/8zAzgBqwkig8gCZFwl4rM5utgoY8?= =?us-ascii?Q?D2jS1MTzpPiMFMFKwnXxRblqRsAnZ2F00Jq8Pc+jm0WVb8DKoYiGJDho8cQ6?= =?us-ascii?Q?mq0uoc1LgAfS0KzegoqupSndEUbaLAUrFjskhIJqsKJ34X1abyWDFUmp3NVw?= =?us-ascii?Q?smiA4neSp158Kj0sDnORf/43F4lpp1OQCGR15e+qqpfv3WGexBpluYiqyS2m?= =?us-ascii?Q?6FrmygT8XhZe8GwqfEAGmyg+FMZuafNHSW0on+NO?= Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: SJ0PR11MB5918.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: f1567d17-5f4c-44cf-4738-08dd3eb76c81 X-MS-Exchange-CrossTenant-originalarrivaltime: 27 Jan 2025 09:46:07.6282 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: Xlb1eDQ3pYDo6nEN3Qdrm/1RwIvrMUd6zjARzk8Cz2u6lALUZH+8sQbr6p8PKvq5n/lMZ3zTYup9Lu0r+KtUqQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA0PR11MB4525 X-OriginatorOrg: intel.com X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hi, Thanks for your review and feedback. Below I have addressed your comments inline. -----Original Message----- From: Richardson, Bruce =20 Sent: Monday, January 20, 2025 7:53 PM To: Wani, Shaiq Cc: dev@dpdk.org; Singh, Aman Deep Subject: Re: [PATCH 2/2] common/idpf: enable AVX2 for single queue Tx On Wed, Jan 08, 2025 at 05:47:57PM +0530, Shaiq Wani wrote: > In case some CPUs don't support AVX512. Enable AVX2 for them to get=20 > better per-core performance. >=20 > Signed-off-by: Shaiq Wani > --- Hi, some review comments inline below. /Bruce > doc/guides/rel_notes/release_25_03.rst | 3 + > drivers/common/idpf/idpf_common_device.h | 1 + > drivers/common/idpf/idpf_common_rxtx.h | 4 + > drivers/common/idpf/idpf_common_rxtx_avx2.c | 225 ++++++++++++++++++++ > drivers/common/idpf/version.map | 1 + > drivers/net/idpf/idpf_rxtx.c | 14 ++ > 6 files changed, 248 insertions(+) >=20 > diff --git a/doc/guides/rel_notes/release_25_03.rst=20 > b/doc/guides/rel_notes/release_25_03.rst > index 426dfcd982..7ded85dac4 100644 > --- a/doc/guides/rel_notes/release_25_03.rst > +++ b/doc/guides/rel_notes/release_25_03.rst > @@ -55,6 +55,9 @@ New Features > Also, make sure to start the actual text at the margin. > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D > =20 > + * **Added support of vector instructions on IDPF.** > + > + Added support of AVX2 instructions in IDPF single queue RX and TX p= ath. > =20 Driver already had vector instructions so title is a little misleading. Clarify the title to be AVX2-specific. For the body, please clarify singleq= vs splitq and what the differences are and when one might get the benefit = of the AVX2 code path. [SHAIQ]- Will address the change in v2 of the patch. > Removed Items > ------------- > diff --git a/drivers/common/idpf/idpf_common_device.h=20 > b/drivers/common/idpf/idpf_common_device.h > index 734be1c88a..5f3e4a4fcf 100644 > --- a/drivers/common/idpf/idpf_common_device.h > +++ b/drivers/common/idpf/idpf_common_device.h > @@ -124,6 +124,7 @@ struct idpf_vport { > bool rx_vec_allowed; > bool tx_vec_allowed; Do we have vector paths other than the 2 AVX ones below. If not, why do we = need this flag? [SHAIQ]- Some processors, e.g., Denverton, support SSE but not AVX. > bool rx_use_avx2; > + bool tx_use_avx2; > bool rx_use_avx512; > bool tx_use_avx512; > =20 > diff --git a/drivers/common/idpf/idpf_common_rxtx.h=20 > b/drivers/common/idpf/idpf_common_rxtx.h > index f50cf5ef46..e19e1878f3 100644 > --- a/drivers/common/idpf/idpf_common_rxtx.h > +++ b/drivers/common/idpf/idpf_common_rxtx.h > @@ -306,5 +306,9 @@ __rte_internal > uint16_t idpf_dp_singleq_recv_pkts_avx2(void *rx_queue, > struct rte_mbuf **rx_pkts, > uint16_t nb_pkts); > +__rte_internal > +uint16_t idpf_dp_singleq_xmit_pkts_avx2(void *tx_queue, > + struct rte_mbuf **tx_pkts, > + uint16_t nb_pkts); > =20 > #endif /* _IDPF_COMMON_RXTX_H_ */ > diff --git a/drivers/common/idpf/idpf_common_rxtx_avx2.c=20 > b/drivers/common/idpf/idpf_common_rxtx_avx2.c > index a05b26c68a..a4bc8e2bef 100644 > --- a/drivers/common/idpf/idpf_common_rxtx_avx2.c > +++ b/drivers/common/idpf/idpf_common_rxtx_avx2.c > @@ -588,3 +588,228 @@ idpf_dp_singleq_recv_pkts_avx2(void *rx_queue,=20 > struct rte_mbuf **rx_pkts, { > return _idpf_singleq_recv_raw_pkts_vec_avx2(rx_queue, rx_pkts,=20 > nb_pkts, NULL); } > + > +static __rte_always_inline void > +idpf_tx_backlog_entry(struct idpf_tx_entry *txep, > + struct rte_mbuf **tx_pkts, uint16_t nb_pkts) { > + int i; > + > + for (i =3D 0; i < (int)nb_pkts; ++i) > + txep[i].mbuf =3D tx_pkts[i]; > +} > + > +static __rte_always_inline int > +idpf_singleq_tx_free_bufs_vec(struct idpf_tx_queue *txq) { > + struct idpf_tx_entry *txep; > + uint32_t n; > + uint32_t i; > + int nb_free =3D 0; > + struct rte_mbuf *m, *free[txq->rs_thresh]; > + > + /* check DD bits on threshold descriptor */ > + if ((txq->tx_ring[txq->next_dd].qw1 & > + rte_cpu_to_le_64(IDPF_TXD_QW1_DTYPE_M)) !=3D > + rte_cpu_to_le_64(IDPF_TX_DESC_DTYPE_DESC_DONE)) > + return 0; > + > + n =3D txq->rs_thresh; > + > + /* first buffer to free from S/W ring is at index > + * next_dd - (rs_thresh-1) > + */ > + txep =3D &txq->sw_ring[txq->next_dd - (n - 1)]; > + m =3D rte_pktmbuf_prefree_seg(txep[0].mbuf); > + if (likely(m)) { > + free[0] =3D m; > + nb_free =3D 1; > + for (i =3D 1; i < n; i++) { > + m =3D rte_pktmbuf_prefree_seg(txep[i].mbuf); > + if (likely(m)) { > + if (likely(m->pool =3D=3D free[0]->pool)) { > + free[nb_free++] =3D m; > + } else { > + rte_mempool_put_bulk(free[0]->pool, > + (void *)free, > + nb_free); > + free[0] =3D m; > + nb_free =3D 1; > + } > + } > + } > + rte_mempool_put_bulk(free[0]->pool, (void **)free, nb_free); > + } else { > + for (i =3D 1; i < n; i++) { > + m =3D rte_pktmbuf_prefree_seg(txep[i].mbuf); > + if (m) > + rte_mempool_put(m->pool, m); > + } > + } > + > + /* buffers were freed, update counters */ > + txq->nb_free =3D (uint16_t)(txq->nb_free + txq->rs_thresh); > + txq->next_dd =3D (uint16_t)(txq->next_dd + txq->rs_thresh); > + if (txq->next_dd >=3D txq->nb_tx_desc) > + txq->next_dd =3D (uint16_t)(txq->rs_thresh - 1); > + > + return txq->rs_thresh; > +} > + If/when patchset [1] is merged, this code should be reworked to use the com= mon functions. [1] https://patches.dpdk.org/project/dpdk/list/?series=3D34398 [SHAIQ]-When the patchset is merged, we will address the changes accordingl= y. > +static inline void > +idpf_singleq_vtx1(volatile struct idpf_base_tx_desc *txdp, > + struct rte_mbuf *pkt, uint64_t flags) { > + uint64_t high_qw =3D > + (IDPF_TX_DESC_DTYPE_DATA | > + ((uint64_t)flags << IDPF_TXD_QW1_CMD_S) | > + ((uint64_t)pkt->data_len << IDPF_TXD_QW1_TX_BUF_SZ_S)); > + > + __m128i descriptor =3D _mm_set_epi64x(high_qw, > + pkt->buf_iova + pkt->data_off); > + _mm_store_si128((__m128i *)txdp, descriptor); } > + > +static inline void > +idpf_singleq_vtx(volatile struct idpf_base_tx_desc *txdp, > + struct rte_mbuf **pkt, uint16_t nb_pkts, uint64_t flags) { > + const uint64_t hi_qw_tmpl =3D (IDPF_TX_DESC_DTYPE_DATA | > + ((uint64_t)flags << IDPF_TXD_QW1_CMD_S)); > + > + /* if unaligned on 32-bit boundary, do one to align */ > + if (((uintptr_t)txdp & 0x1F) !=3D 0 && nb_pkts !=3D 0) { > + idpf_singleq_vtx1(txdp, *pkt, flags); > + nb_pkts--, txdp++, pkt++; > + } > + > + /* do two at a time while possible, in bursts */ > + for (; nb_pkts > 3; txdp +=3D 4, pkt +=3D 4, nb_pkts -=3D 4) { > + uint64_t hi_qw3 =3D > + hi_qw_tmpl | > + ((uint64_t)pkt[3]->data_len << > + IDPF_TXD_QW1_TX_BUF_SZ_S); > + uint64_t hi_qw2 =3D > + hi_qw_tmpl | > + ((uint64_t)pkt[2]->data_len << > + IDPF_TXD_QW1_TX_BUF_SZ_S); > + uint64_t hi_qw1 =3D > + hi_qw_tmpl | > + ((uint64_t)pkt[1]->data_len << > + IDPF_TXD_QW1_TX_BUF_SZ_S); > + uint64_t hi_qw0 =3D > + hi_qw_tmpl | > + ((uint64_t)pkt[0]->data_len << > + IDPF_TXD_QW1_TX_BUF_SZ_S); > + > + __m256i desc2_3 =3D > + _mm256_set_epi64x > + (hi_qw3, > + pkt[3]->buf_iova + pkt[3]->data_off, > + hi_qw2, > + pkt[2]->buf_iova + pkt[2]->data_off); > + __m256i desc0_1 =3D > + _mm256_set_epi64x > + (hi_qw1, > + pkt[1]->buf_iova + pkt[1]->data_off, > + hi_qw0, > + pkt[0]->buf_iova + pkt[0]->data_off); > + _mm256_store_si256((void *)(txdp + 2), desc2_3); > + _mm256_store_si256((void *)txdp, desc0_1); > + } > + > + /* do any last ones */ > + while (nb_pkts) { > + idpf_singleq_vtx1(txdp, *pkt, flags); > + txdp++, pkt++, nb_pkts--; > + } > +} > + > +static inline uint16_t > +idpf_singleq_xmit_fixed_burst_vec_avx2(void *tx_queue, struct rte_mbuf *= *tx_pkts, > + uint16_t nb_pkts) > +{ > + struct idpf_tx_queue *txq =3D (struct idpf_tx_queue *)tx_queue; > + volatile struct idpf_base_tx_desc *txdp; > + struct idpf_tx_entry *txep; > + uint16_t n, nb_commit, tx_id; > + uint64_t flags =3D IDPF_TX_DESC_CMD_EOP; > + uint64_t rs =3D IDPF_TX_DESC_CMD_RS | flags; > + > + /* cross rx_thresh boundary is not allowed */ > + nb_pkts =3D RTE_MIN(nb_pkts, txq->rs_thresh); > + > + if (txq->nb_free < txq->free_thresh) > + idpf_singleq_tx_free_bufs_vec(txq); > + > + nb_commit =3D nb_pkts =3D (uint16_t)RTE_MIN(txq->nb_free, nb_pkts); > + if (unlikely(nb_pkts =3D=3D 0)) > + return 0; > + > + tx_id =3D txq->tx_tail; > + txdp =3D &txq->tx_ring[tx_id]; > + txep =3D &txq->sw_ring[tx_id]; > + > + txq->nb_free =3D (uint16_t)(txq->nb_free - nb_pkts); > + > + n =3D (uint16_t)(txq->nb_tx_desc - tx_id); > + if (nb_commit >=3D n) { > + idpf_tx_backlog_entry(txep, tx_pkts, n); > + > + idpf_singleq_vtx(txdp, tx_pkts, n - 1, flags); > + tx_pkts +=3D (n - 1); > + txdp +=3D (n - 1); > + > + idpf_singleq_vtx1(txdp, *tx_pkts++, rs); > + > + nb_commit =3D (uint16_t)(nb_commit - n); > + > + tx_id =3D 0; > + txq->next_rs =3D (uint16_t)(txq->rs_thresh - 1); > + > + /* avoid reach the end of ring */ > + txdp =3D &txq->tx_ring[tx_id]; > + txep =3D &txq->sw_ring[tx_id]; > + } > + > + idpf_tx_backlog_entry(txep, tx_pkts, nb_commit); > + > + idpf_singleq_vtx(txdp, tx_pkts, nb_commit, flags); > + > + tx_id =3D (uint16_t)(tx_id + nb_commit); > + if (tx_id > txq->next_rs) { > + txq->tx_ring[txq->next_rs].qw1 |=3D > + rte_cpu_to_le_64(((uint64_t)IDPF_TX_DESC_CMD_RS) << > + IDPF_TXD_QW1_CMD_S); > + txq->next_rs =3D > + (uint16_t)(txq->next_rs + txq->rs_thresh); > + } > + > + txq->tx_tail =3D tx_id; > + > + IDPF_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail); > + > + return nb_pkts; > +} > + > +uint16_t > +idpf_dp_singleq_xmit_pkts_avx2(void *tx_queue, struct rte_mbuf **tx_pkts= , > + uint16_t nb_pkts) > +{ > + uint16_t nb_tx =3D 0; > + struct idpf_tx_queue *txq =3D (struct idpf_tx_queue *)tx_queue; > + > + while (nb_pkts) { > + uint16_t ret, num; > + > + num =3D (uint16_t)RTE_MIN(nb_pkts, txq->rs_thresh); > + ret =3D idpf_singleq_xmit_fixed_burst_vec_avx2(tx_queue, &tx_pkts[nb_t= x], > + num); > + nb_tx +=3D ret; > + nb_pkts -=3D ret; > + if (ret < num) > + break; > + } > + > + return nb_tx; > +} > diff --git a/drivers/common/idpf/version.map=20 > b/drivers/common/idpf/version.map index 4510aae6b3..eadcb9a2cf 100644 > --- a/drivers/common/idpf/version.map > +++ b/drivers/common/idpf/version.map > @@ -15,6 +15,7 @@ INTERNAL { > idpf_dp_splitq_xmit_pkts; > idpf_dp_splitq_xmit_pkts_avx512; > idpf_dp_singleq_recv_pkts_avx2; > + idpf_dp_singleq_xmit_pkts_avx2; > =20 > idpf_qc_rx_thresh_check; > idpf_qc_rx_queue_release; > diff --git a/drivers/net/idpf/idpf_rxtx.c=20 > b/drivers/net/idpf/idpf_rxtx.c index 80c6c325e8..579293b2e8 100644 > --- a/drivers/net/idpf/idpf_rxtx.c > +++ b/drivers/net/idpf/idpf_rxtx.c > @@ -888,6 +888,12 @@ idpf_set_tx_function(struct rte_eth_dev *dev) > if (idpf_tx_vec_dev_check_default(dev) =3D=3D IDPF_VECTOR_PATH && > rte_vect_get_max_simd_bitwidth() >=3D RTE_VECT_SIMD_128) { > vport->tx_vec_allowed =3D true; > + > + if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) =3D=3D 1 || > + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) =3D=3D 1) && As with the Rx path, only check the AVX2 flag here. [SHAIQ]- we will address this change in v2 of the patchset. > + rte_vect_get_max_simd_bitwidth() >=3D RTE_VECT_SIMD_256) > + vport->tx_use_avx2 =3D true; > + > if (rte_vect_get_max_simd_bitwidth() >=3D RTE_VECT_SIMD_512) #ifdef=20 > CC_AVX512_SUPPORT > { > @@ -947,6 +953,14 @@ idpf_set_tx_function(struct rte_eth_dev *dev) > return; > } > #endif /* CC_AVX512_SUPPORT */ > + if (vport->tx_use_avx2) { > + PMD_DRV_LOG(NOTICE, > + "Using Single AVX2 Vector Tx (port %d).", > + dev->data->port_id); > + dev->tx_pkt_burst =3D idpf_dp_singleq_xmit_pkts_avx2; > + dev->tx_pkt_prepare =3D idpf_dp_prep_pkts; > + return; > + } > } > PMD_DRV_LOG(NOTICE, > "Using Single Scalar Tx (port %d).", > -- > 2.34.1 >=20