From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id B0584455DF; Tue, 9 Jul 2024 10:24:56 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id CC76F432B2; Tue, 9 Jul 2024 10:24:32 +0200 (CEST) Received: from NAM12-BN8-obe.outbound.protection.outlook.com (mail-bn8nam12on2090.outbound.protection.outlook.com [40.107.237.90]) by mails.dpdk.org (Postfix) with ESMTP id A8F9443295 for ; Tue, 9 Jul 2024 10:24:30 +0200 (CEST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=FT7d4KYGMRqon8qPlqr1bZv5Sn3G+h15aoZqG5m92WXZmPuI07nMwk9oyF1tP1oDR8zYZw249KwX+BY66H+Ro2LErOpvfvB74ZaT4p6ChJHhUopR4Dzm7C1NEFsvF+VF5PH55QbrhsGfEOMyCzyZhkQBZ1qTUnts63sTL5X5Vt6tNZmRmHfGeK4On4C4SNoVPYqufRYvzJHf2xfhIu8aG6Y5zxzkRBOX8QynWxWw+3bzKMDmP+IvTjv00WLBHyHu5hzvjAydig9V+vPiZrRYLLDExHDCOMFPeuiUkCuT7FtKN4ZGriMRHtcSN55CdBbDj2h3jy2NOtaC9oLNLMDxHw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=N6JCFJ1UfYjxx3p4MczPGXGg1TietJWwueWtE9jfUuI=; b=BSuMC7PZEt/e7L7zGQdDU9L2ToJFdVMXzUqM+ydgPWIruDzMqOtxR8NHBp8xhibBJ6/PPxzMUD58X7o+Lcpw44qYxrn3EdFw2AIKvKkNHzn6T8hKU6iYxAxnoO0yDHNVSSPXijCi8XDn1jUzL/ep5GVhTedu6UWwGTfeutmz5AA3vhUzM9aBlRhOcCswalT+J6JoGaKjL20aFTy4LeVeRw7DgJR4jkKZXSWGQUjef4pvmPgLsBPdz521vOKbX0llVSV4zOY/MJInZE3bJ+ydTAQ4LrgTqk2SZvMAo+THqawsjt+5MVq8jQ72RDCiqt1vv11xbTzFgEWIfegLBx8ClQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=corigine.com; dmarc=pass action=none header.from=corigine.com; dkim=pass header.d=corigine.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=corigine.onmicrosoft.com; s=selector2-corigine-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=N6JCFJ1UfYjxx3p4MczPGXGg1TietJWwueWtE9jfUuI=; b=j8sNmz27btvlXgF5DZ8ZDzWcjq/Uuhe+TUT5GRFGPvhga4ls0lx9ejLlcOs/IcEzsw4r3+kX7DOEXckqycNv17n1desF9SQ+H9tJ4D6EGdrCmiEJLip9DROPZpL3WjEpFXBwvms0b8v6qQC9RZD5aUewgNJgGynydVcUKUYhqJ4= Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=corigine.com; Received: from SJ0PR13MB5545.namprd13.prod.outlook.com (2603:10b6:a03:424::5) by PH7PR13MB6194.namprd13.prod.outlook.com (2603:10b6:510:245::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7741.36; Tue, 9 Jul 2024 08:24:29 +0000 Received: from SJ0PR13MB5545.namprd13.prod.outlook.com ([fe80::b900:5f05:766f:833]) by SJ0PR13MB5545.namprd13.prod.outlook.com ([fe80::b900:5f05:766f:833%4]) with mapi id 15.20.7741.033; Tue, 9 Jul 2024 08:24:29 +0000 From: Chaoyong He To: dev@dpdk.org Cc: oss-drivers@corigine.com, Long Wu , Peng Zhang , Chaoyong He Subject: [PATCH v4 4/5] net/nfp: support AVX2 Rx function Date: Tue, 9 Jul 2024 16:24:04 +0800 Message-Id: <20240709082405.248641-5-chaoyong.he@corigine.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20240709082405.248641-1-chaoyong.he@corigine.com> References: <20240709072921.246520-1-chaoyong.he@corigine.com> <20240709082405.248641-1-chaoyong.he@corigine.com> Content-Transfer-Encoding: 8bit Content-Type: text/plain X-ClientProxiedBy: BYAPR05CA0064.namprd05.prod.outlook.com (2603:10b6:a03:74::41) To SJ0PR13MB5545.namprd13.prod.outlook.com (2603:10b6:a03:424::5) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ0PR13MB5545:EE_|PH7PR13MB6194:EE_ X-MS-Office365-Filtering-Correlation-Id: 8f0014dc-0415-407e-35c6-08dc9ff08d1c X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; ARA:13230040|376014|52116014|1800799024|366016|38350700014; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?HR6li5t/l+skKIBBoq3Q/2d81p5Bu35BG28CFa86PFvuXifArQoHjvjSYPzE?= =?us-ascii?Q?Dj+X8EhVJOUdz6XBhvSLZdrL0Dhq04xRJYUufNPqoTUyFaP9K1uYxODKeWhX?= =?us-ascii?Q?gHpiuWJoSNH37hRlCNgSf9f+y9kh3FpS5A/GQRJiNcpGfOffOULaEafFohCo?= =?us-ascii?Q?JfPGo0p9ZJl/AYzqnrHTeM8SNNK98eITjKUf7ZIGjobw7QI/q+Z25v+OBXsb?= =?us-ascii?Q?91toIDFTUoqJwpQOQMPQZ+eyULiI+2RRamtdWE7WDlBXwhjTy2apdxgrKlHm?= =?us-ascii?Q?nV4toVIp22Mq7Tec7ouHrVowlPOrc5pDqviUxvItQKPESxhAWGaWh954vyrH?= =?us-ascii?Q?7yVhs9nXLpbD682CShsdYqzIWxT/LqiqJdg70wonOPanJSK5kCD63IHojD3M?= =?us-ascii?Q?OV/PaV5uinijlSlNYqtGpyQDJbLNUW/ZrhNuo86AEmHBk+GxfMessnYYDTwg?= =?us-ascii?Q?cQ0IkgfBKjxJ9J7OJkvomh/hi5reVMcIOd1yeDKxsf3KAo9KMr2UbDH0yJC7?= =?us-ascii?Q?RlVDErrGiGn0Uqgy6kuvEOFpf+ASG7kJaR8+F5XU+XCGGz2IQUpeuh0NY2Ge?= =?us-ascii?Q?YmqmBTGCbxQNVzHOHlRFOpS7VIGZ+JuEEjgiajYFIsD3hGVKZLNrb2ZbtqKu?= =?us-ascii?Q?1ZqoQSgI4g1ehZ84MTQR2Os61QameFUE1tiPU6oBClQC18b1zvqvMPrBQ74v?= =?us-ascii?Q?OGZZlp/itdJNMVAJIKVfZcVCoaq78T1ViJ7P17i7oJXZkITU6U3pl1vQsJJa?= =?us-ascii?Q?4NRpslwG3cSE6S/146AOhTtXKuN7uASCrWj9ZAbRJ//3Z/7+T0NP22JBfmdn?= =?us-ascii?Q?0f9i6bxJZ7IydnoOe/Zg6LkX2gRZdqvWHBn5y5faz41lV1MRpehIjPwkF0DD?= =?us-ascii?Q?vO5ozhUMPCtulaXmpSRcCV2VFDSgz7u5OQA01OlCd9BsUY51w8uJJBxFM3Z6?= =?us-ascii?Q?FibQSR8dpuKyB6p3AaeI7xBnR0gyg4YQS/hhMftRV1+44gHFEVHXRoN0zh0R?= =?us-ascii?Q?zL7LlJxM1q7gs14GKtcyd4ukcGBX2lFWRNIh9lhPK4/Ph13h4s/fOQc1JT97?= =?us-ascii?Q?nXWar82T/4bEX8dDCIgyEEINWhw+mCfd4wsRirxKMliFRmF+pYElQJwrXtru?= =?us-ascii?Q?0ZPZpbu2unc//Ylhc3Fhv/c7QdPHHaG8Mp/K3MhAhmar2JgayRfa6A7yPRwu?= =?us-ascii?Q?bc+LDK/uONKGAXXBO8zSVAHaGURqfExbcYO7Cs8DU1YkH6Kh10DW7mBOSC7R?= =?us-ascii?Q?AcuUeGnPwd1C6idWhtCEh16EdpXtIsOK51usjq/P2Tdgr4J4Iv2RxsMoNrqp?= =?us-ascii?Q?oFqWi7NgXx4VZ3gg0Er7FYvf8TI18JhsojXUxL4ciMlujHGltR2RELS3ROwc?= =?us-ascii?Q?X/svMPlvN8UFFGQZ0Gy3iJg51z3zmxDk58wgswRnfm3dj12C+A=3D=3D?= X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:SJ0PR13MB5545.namprd13.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(52116014)(1800799024)(366016)(38350700014); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?d4YxjocNChx5KFJc23VoXW5Bndnyf5Q9y8/8RomPMOAT69QEpNMjCHQRFIA4?= =?us-ascii?Q?HTRUjGSzx0VozkJnZNu/UhstKDTFJ3Zps3jQH+f4UJ/97gCyOZdLtycUs97c?= =?us-ascii?Q?sH33Z08d6ea8NCmWKFBIZZAQcoeilG+qZtXb9ltQlW7yJ+SEwPIj+5fquXZA?= =?us-ascii?Q?OHzYDsn7JA/grM+hbbzt1Fjpfw7EVkdGrd3UBB5UWo9qC0xyAt4IKfx+TMo2?= =?us-ascii?Q?jqR8YBMZyE+dDJNpMYP0bS8u+0MhXWLXfJljJPRvKIp0m/rAmpct2Wf6JFpm?= =?us-ascii?Q?CmtC8veFuKXiXvfglULuIB0UT6RJbpUpDht145wNRNmLOkszIZ6bpV8IbY+R?= =?us-ascii?Q?vQjd0U31E8JMm+RHGU6rPEgZbR6Gia4foRnbbjjOqO6u5nvzgldKzxSbR7Rq?= =?us-ascii?Q?LnSdWbdVC+tY/sYaVomo/Wv/px2MuOZFEXmO9jJXxE0shsjs/ByaZqj8YCbw?= =?us-ascii?Q?lcTUPmyhWp9LiadT+7ut/ycJK/sdDIQmVh2ZbPWq86liDZyKWIYZUROVmRui?= =?us-ascii?Q?UthLaV9/BopKoaPkDt73it+NV/0MlcHH3PDSj0iuu5wU4t5hsVyAB66tVi5d?= =?us-ascii?Q?Rb1FEMddjUvMNfQgBODvPnHlLlaSEJVrrCIJONSF5Z34K1Uaw/i5vQvLGAwk?= =?us-ascii?Q?rIOUfdiHWlg3vf/jDNx/Gp0N3eNIKTdMqiUFMiL/a2USyJs5Pj/ky04Wxwd5?= =?us-ascii?Q?tmn4TgGtxv09O5cPjE17pLqi2CmmDew+mxEglZNH26tlCxA5CGHuPBEbWMwE?= =?us-ascii?Q?wMlcG6csLdIHadP+VuSJgeraYjY5kUNHoin9Sj0ZvbVFJhow9JOPh8xp99tB?= =?us-ascii?Q?9QdUEL+IUJ303S69GXdSOON4iIfgzk7VkeEKqnnDTTVuT9j69VFAFZMtfxTi?= =?us-ascii?Q?l/ZvASSeoDN+BiBMjDyzka4A/dKgU+dZXC8zBIAll5ZXoPeH1S52YZetLb2u?= =?us-ascii?Q?UsANJAxwjlqD/Ctj1P/9Y0NwE7gs/i354fXUBt6Msq8QPfd8AqP6fev8mAtz?= =?us-ascii?Q?fXPFX5MQ1DCi0L1dHsi894rxdlgu5jMiIPC8STRKkbOcJrBXfYU1X/cL50jI?= =?us-ascii?Q?K3AsilsA1x9u45/939j5dFvaTheAWAca/EK+7e/ncgZFRMznsDKFiMaVMUb7?= =?us-ascii?Q?EfitXAkdtBnFzHdZPZmNU8sDKB77ZEL88FiF4YHZQms0P5fNrWKHzjxsQZX4?= =?us-ascii?Q?R/2+hzw3WdWdOSoP+267VOZEcG+OW4koKoravYBoZDm4rUFDKb+t8v6KYzHu?= =?us-ascii?Q?vui3rKUJPTDOsu76FwVpwPWMQtDVyE5vFTVBCvyhIFxD730lTgE3QLwhAWBX?= =?us-ascii?Q?pZY/seXyBQedAGBSGLdGVmIN1wRhotJyd5K8TNeIrIPgkqN4xH24QjYmL2Xp?= =?us-ascii?Q?jPgYGy/Pc0AXYALl8B6C009zNOE51r/LMKZo3QAis+4u1VPHO8UfuN0yo78n?= =?us-ascii?Q?etLE4YdkzLN892003AboIaohpEQ9GVQG64iRD0BG/x9W0iVhjhdIWy4ik9Xz?= =?us-ascii?Q?H9r0/dnKhOfohuqy4umqA0Tbh6ttfAWTotvF8Ahizkrr8oQIZlLLIADdHnX9?= =?us-ascii?Q?2IwckeKbrIyns2LsM62EvaBee/PniKrB9YUe2MKVZzu966+Vr56Ax5aHklfU?= =?us-ascii?Q?4w=3D=3D?= X-OriginatorOrg: corigine.com X-MS-Exchange-CrossTenant-Network-Message-Id: 8f0014dc-0415-407e-35c6-08dc9ff08d1c X-MS-Exchange-CrossTenant-AuthSource: SJ0PR13MB5545.namprd13.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 09 Jul 2024 08:24:29.1234 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: fe128f2c-073b-4c20-818e-7246a585940c X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: juSJhAKwX1fscuE3gocpHwBJKzNDvxJPC6i6t67f4FJTxeQCQJsG8NbaSaGyAnBnZdxLm47QRHkXlAMBzSVGnLp96Wr6jZLVIQrf1b+8Lec= X-MS-Exchange-Transport-CrossTenantHeadersStamped: PH7PR13MB6194 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org From: Long Wu Use AVX2 instructions to accelerate Rx performance. The acceleration only works on X86 machine. Signed-off-by: Peng Zhang Signed-off-by: Long Wu Reviewed-by: Chaoyong He --- drivers/net/nfp/nfp_ethdev.c | 2 +- drivers/net/nfp/nfp_ethdev_vf.c | 2 +- drivers/net/nfp/nfp_net_meta.c | 1 + drivers/net/nfp/nfp_rxtx.c | 10 ++ drivers/net/nfp/nfp_rxtx.h | 1 + drivers/net/nfp/nfp_rxtx_vec.h | 4 + drivers/net/nfp/nfp_rxtx_vec_avx2.c | 252 ++++++++++++++++++++++++++++ drivers/net/nfp/nfp_rxtx_vec_stub.c | 9 + 8 files changed, 279 insertions(+), 2 deletions(-) diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c index a7b40af712..bd35df2dc9 100644 --- a/drivers/net/nfp/nfp_ethdev.c +++ b/drivers/net/nfp/nfp_ethdev.c @@ -969,7 +969,7 @@ nfp_net_ethdev_ops_mount(struct nfp_net_hw *hw, eth_dev->dev_ops = &nfp_net_eth_dev_ops; eth_dev->rx_queue_count = nfp_net_rx_queue_count; - eth_dev->rx_pkt_burst = &nfp_net_recv_pkts; + nfp_net_recv_pkts_set(eth_dev); } static int diff --git a/drivers/net/nfp/nfp_ethdev_vf.c b/drivers/net/nfp/nfp_ethdev_vf.c index b955624ed6..cdf5da3af7 100644 --- a/drivers/net/nfp/nfp_ethdev_vf.c +++ b/drivers/net/nfp/nfp_ethdev_vf.c @@ -245,7 +245,7 @@ nfp_netvf_ethdev_ops_mount(struct nfp_net_hw *hw, eth_dev->dev_ops = &nfp_netvf_eth_dev_ops; eth_dev->rx_queue_count = nfp_net_rx_queue_count; - eth_dev->rx_pkt_burst = &nfp_net_recv_pkts; + nfp_net_recv_pkts_set(eth_dev); } static int diff --git a/drivers/net/nfp/nfp_net_meta.c b/drivers/net/nfp/nfp_net_meta.c index b31ef56f17..07c6758d33 100644 --- a/drivers/net/nfp/nfp_net_meta.c +++ b/drivers/net/nfp/nfp_net_meta.c @@ -80,6 +80,7 @@ nfp_net_meta_parse_single(uint8_t *meta_base, rte_be32_t meta_header, struct nfp_net_meta_parsed *meta) { + meta->flags = 0; meta->flags |= (1 << NFP_NET_META_HASH); meta->hash_type = rte_be_to_cpu_32(meta_header); meta->hash = rte_be_to_cpu_32(*(rte_be32_t *)(meta_base + 4)); diff --git a/drivers/net/nfp/nfp_rxtx.c b/drivers/net/nfp/nfp_rxtx.c index 1db79ad1cd..4fc3374987 100644 --- a/drivers/net/nfp/nfp_rxtx.c +++ b/drivers/net/nfp/nfp_rxtx.c @@ -17,6 +17,7 @@ #include "nfp_ipsec.h" #include "nfp_logs.h" #include "nfp_net_meta.h" +#include "nfp_rxtx_vec.h" /* * The bit format and map of nfp packet type for rxd.offload_info in Rx descriptor. @@ -867,3 +868,12 @@ nfp_net_tx_queue_info_get(struct rte_eth_dev *dev, info->conf.offloads = dev_info.tx_offload_capa & dev->data->dev_conf.txmode.offloads; } + +void +nfp_net_recv_pkts_set(struct rte_eth_dev *eth_dev) +{ + if (nfp_net_get_avx2_supported()) + eth_dev->rx_pkt_burst = nfp_net_vec_avx2_recv_pkts; + else + eth_dev->rx_pkt_burst = nfp_net_recv_pkts; +} diff --git a/drivers/net/nfp/nfp_rxtx.h b/drivers/net/nfp/nfp_rxtx.h index 3ddf717da0..fff8371991 100644 --- a/drivers/net/nfp/nfp_rxtx.h +++ b/drivers/net/nfp/nfp_rxtx.h @@ -244,5 +244,6 @@ void nfp_net_rx_queue_info_get(struct rte_eth_dev *dev, void nfp_net_tx_queue_info_get(struct rte_eth_dev *dev, uint16_t queue_id, struct rte_eth_txq_info *qinfo); +void nfp_net_recv_pkts_set(struct rte_eth_dev *eth_dev); #endif /* __NFP_RXTX_H__ */ diff --git a/drivers/net/nfp/nfp_rxtx_vec.h b/drivers/net/nfp/nfp_rxtx_vec.h index c92660f963..8720662744 100644 --- a/drivers/net/nfp/nfp_rxtx_vec.h +++ b/drivers/net/nfp/nfp_rxtx_vec.h @@ -10,4 +10,8 @@ bool nfp_net_get_avx2_supported(void); +uint16_t nfp_net_vec_avx2_recv_pkts(void *rx_queue, + struct rte_mbuf **rx_pkts, + uint16_t nb_pkts); + #endif /* __NFP_RXTX_VEC_AVX2_H__ */ diff --git a/drivers/net/nfp/nfp_rxtx_vec_avx2.c b/drivers/net/nfp/nfp_rxtx_vec_avx2.c index 50638e74ab..7c18213624 100644 --- a/drivers/net/nfp/nfp_rxtx_vec_avx2.c +++ b/drivers/net/nfp/nfp_rxtx_vec_avx2.c @@ -5,9 +5,14 @@ #include +#include +#include #include #include +#include "nfp_logs.h" +#include "nfp_net_common.h" +#include "nfp_net_meta.h" #include "nfp_rxtx_vec.h" bool @@ -19,3 +24,250 @@ nfp_net_get_avx2_supported(void) return false; } + +static inline void +nfp_vec_avx2_recv_set_des1(struct nfp_net_rxq *rxq, + struct nfp_net_rx_desc *rxds, + struct rte_mbuf *rxb) +{ + __m128i dma; + __m128i dma_hi; + __m128i vaddr0; + __m128i hdr_room = _mm_set_epi64x(0, RTE_PKTMBUF_HEADROOM); + + dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb->buf_addr), hdr_room); + dma_hi = _mm_srli_epi64(dma, 32); + vaddr0 = _mm_unpacklo_epi32(dma_hi, dma); + + _mm_storel_epi64((void *)rxds, vaddr0); + + rxq->rd_p = (rxq->rd_p + 1) & (rxq->rx_count - 1); +} + +static inline void +nfp_vec_avx2_recv_set_des4(struct nfp_net_rxq *rxq, + struct nfp_net_rx_desc *rxds, + struct rte_mbuf **rxb) +{ + __m128i dma; + __m128i dma_hi; + __m128i vaddr0; + __m128i vaddr1; + __m128i vaddr2; + __m128i vaddr3; + __m128i vaddr0_1; + __m128i vaddr2_3; + __m256i vaddr0_3; + __m128i hdr_room = _mm_set_epi64x(0, RTE_PKTMBUF_HEADROOM); + + dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[0]->buf_addr), hdr_room); + dma_hi = _mm_srli_epi64(dma, 32); + vaddr0 = _mm_unpacklo_epi32(dma_hi, dma); + + dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[1]->buf_addr), hdr_room); + dma_hi = _mm_srli_epi64(dma, 32); + vaddr1 = _mm_unpacklo_epi32(dma_hi, dma); + + dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[2]->buf_addr), hdr_room); + dma_hi = _mm_srli_epi64(dma, 32); + vaddr2 = _mm_unpacklo_epi32(dma_hi, dma); + + dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[3]->buf_addr), hdr_room); + dma_hi = _mm_srli_epi64(dma, 32); + vaddr3 = _mm_unpacklo_epi32(dma_hi, dma); + + vaddr0_1 = _mm_unpacklo_epi64(vaddr0, vaddr1); + vaddr2_3 = _mm_unpacklo_epi64(vaddr2, vaddr3); + + vaddr0_3 = _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0_1), + vaddr2_3, 1); + + _mm256_store_si256((void *)rxds, vaddr0_3); + + rxq->rd_p = (rxq->rd_p + 4) & (rxq->rx_count - 1); +} + +static inline void +nfp_vec_avx2_recv_set_rxpkt1(struct nfp_net_rxq *rxq, + struct nfp_net_rx_desc *rxds, + struct rte_mbuf *rx_pkt) +{ + struct nfp_net_hw *hw = rxq->hw; + struct nfp_net_meta_parsed meta; + + rx_pkt->data_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds); + /* Size of the whole packet. We just support 1 segment */ + rx_pkt->pkt_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds); + + /* Filling the received mbuf with packet info */ + if (hw->rx_offset) + rx_pkt->data_off = RTE_PKTMBUF_HEADROOM + hw->rx_offset; + else + rx_pkt->data_off = RTE_PKTMBUF_HEADROOM + NFP_DESC_META_LEN(rxds); + + rx_pkt->port = rxq->port_id; + rx_pkt->nb_segs = 1; + rx_pkt->next = NULL; + + nfp_net_meta_parse(rxds, rxq, hw, rx_pkt, &meta); + + /* Checking the checksum flag */ + nfp_net_rx_cksum(rxq, rxds, rx_pkt); +} + +static inline void +nfp_vec_avx2_recv1(struct nfp_net_rxq *rxq, + struct nfp_net_rx_desc *rxds, + struct rte_mbuf *rxb, + struct rte_mbuf *rx_pkt) +{ + nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds, rx_pkt); + + nfp_vec_avx2_recv_set_des1(rxq, rxds, rxb); +} + +static inline void +nfp_vec_avx2_recv4(struct nfp_net_rxq *rxq, + struct nfp_net_rx_desc *rxds, + struct rte_mbuf **rxb, + struct rte_mbuf **rx_pkts) +{ + nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds, rx_pkts[0]); + nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds + 1, rx_pkts[1]); + nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds + 2, rx_pkts[2]); + nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds + 3, rx_pkts[3]); + + nfp_vec_avx2_recv_set_des4(rxq, rxds, rxb); +} + +static inline bool +nfp_vec_avx2_recv_check_packets4(struct nfp_net_rx_desc *rxds) +{ + __m256i data = _mm256_loadu_si256((void *)rxds); + + if ((_mm256_extract_epi8(data, 3) & PCIE_DESC_RX_DD) == 0 || + (_mm256_extract_epi8(data, 11) & PCIE_DESC_RX_DD) == 0 || + (_mm256_extract_epi8(data, 19) & PCIE_DESC_RX_DD) == 0 || + (_mm256_extract_epi8(data, 27) & PCIE_DESC_RX_DD) == 0) + return false; + + return true; +} + +uint16_t +nfp_net_vec_avx2_recv_pkts(void *rx_queue, + struct rte_mbuf **rx_pkts, + uint16_t nb_pkts) +{ + uint16_t avail; + uint16_t nb_hold; + bool burst_receive; + struct rte_mbuf **rxb; + struct nfp_net_rx_desc *rxds; + struct nfp_net_rxq *rxq = rx_queue; + + if (unlikely(rxq == NULL)) { + PMD_RX_LOG(ERR, "RX Bad queue"); + return 0; + } + + avail = 0; + nb_hold = 0; + burst_receive = true; + while (avail < nb_pkts) { + rxds = &rxq->rxds[rxq->rd_p]; + rxb = &rxq->rxbufs[rxq->rd_p].mbuf; + + if ((_mm_extract_epi8(_mm_loadu_si128((void *)(rxds)), 3) + & PCIE_DESC_RX_DD) == 0) + goto recv_end; + + rte_prefetch0(rxq->rxbufs[rxq->rd_p].mbuf); + + if ((rxq->rd_p & 0x3) == 0) { + rte_prefetch0(&rxq->rxds[rxq->rd_p]); + rte_prefetch0(&rxq->rxbufs[rxq->rd_p]); + rte_prefetch0(rxq->rxbufs[rxq->rd_p + 1].mbuf); + rte_prefetch0(rxq->rxbufs[rxq->rd_p + 2].mbuf); + rte_prefetch0(rxq->rxbufs[rxq->rd_p + 3].mbuf); + } + + if ((rxq->rd_p & 0x7) == 0) { + rte_prefetch0(rxq->rxbufs[rxq->rd_p + 4].mbuf); + rte_prefetch0(rxq->rxbufs[rxq->rd_p + 5].mbuf); + rte_prefetch0(rxq->rxbufs[rxq->rd_p + 6].mbuf); + rte_prefetch0(rxq->rxbufs[rxq->rd_p + 7].mbuf); + } + + /* + * If can not receive burst, just receive one. + * 1. Rx ring will coming to the tail. + * 2. Do not need to receive 4 packets. + * 3. If pointer address unaligned on 32-bit boundary. + * 4. Rx ring does not have 4 packets or alloc 4 mbufs failed. + */ + if ((rxq->rx_count - rxq->rd_p) < 4 || + (nb_pkts - avail) < 4 || + ((uintptr_t)rxds & 0x1F) != 0 || + !burst_receive) { + _mm_storel_epi64((void *)&rx_pkts[avail], + _mm_loadu_si128((void *)rxb)); + + /* Allocate a new mbuf into the software ring. */ + if (rte_pktmbuf_alloc_bulk(rxq->mem_pool, rxb, 1) < 0) { + PMD_RX_LOG(DEBUG, "RX mbuf alloc failed port_id=%u queue_id=%hu", + rxq->port_id, rxq->qidx); + nfp_net_mbuf_alloc_failed(rxq); + goto recv_end; + } + + nfp_vec_avx2_recv1(rxq, rxds, *rxb, rx_pkts[avail]); + + avail++; + nb_hold++; + continue; + } + + burst_receive = nfp_vec_avx2_recv_check_packets4(rxds); + if (!burst_receive) + continue; + + _mm256_storeu_si256((void *)&rx_pkts[avail], + _mm256_loadu_si256((void *)rxb)); + + /* Allocate 4 new mbufs into the software ring. */ + if (rte_pktmbuf_alloc_bulk(rxq->mem_pool, rxb, 4) < 0) { + burst_receive = false; + continue; + } + + nfp_vec_avx2_recv4(rxq, rxds, rxb, &rx_pkts[avail]); + + avail += 4; + nb_hold += 4; + } + +recv_end: + if (nb_hold == 0) + return nb_hold; + + PMD_RX_LOG(DEBUG, "RX port_id=%u queue_id=%u, %d packets received", + rxq->port_id, (unsigned int)rxq->qidx, nb_hold); + + nb_hold += rxq->nb_rx_hold; + + /* + * FL descriptors needs to be written before incrementing the + * FL queue WR pointer + */ + rte_wmb(); + if (nb_hold > rxq->rx_free_thresh) { + PMD_RX_LOG(DEBUG, "port=%hu queue=%hu nb_hold=%hu avail=%hu", + rxq->port_id, rxq->qidx, nb_hold, avail); + nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, nb_hold); + nb_hold = 0; + } + rxq->nb_rx_hold = nb_hold; + + return avail; +} diff --git a/drivers/net/nfp/nfp_rxtx_vec_stub.c b/drivers/net/nfp/nfp_rxtx_vec_stub.c index 1bc55b67e0..c480f61ef0 100644 --- a/drivers/net/nfp/nfp_rxtx_vec_stub.c +++ b/drivers/net/nfp/nfp_rxtx_vec_stub.c @@ -6,6 +6,7 @@ #include #include +#include #include "nfp_rxtx_vec.h" @@ -14,3 +15,11 @@ nfp_net_get_avx2_supported(void) { return false; } + +uint16_t __rte_weak +nfp_net_vec_avx2_recv_pkts(__rte_unused void *rx_queue, + __rte_unused struct rte_mbuf **rx_pkts, + __rte_unused uint16_t nb_pkts) +{ + return 0; +} -- 2.39.1