From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chaoyong He
To: dev@dpdk.org
Cc: oss-drivers@corigine.com, Long Wu, Peng Zhang, Chaoyong He
Subject: [PATCH v2 3/4] net/nfp: support AVX2 Rx function
Date: Mon, 8 Jul 2024 13:58:53 +0800
Message-Id: <20240708055854.107739-4-chaoyong.he@corigine.com>
In-Reply-To: <20240708055854.107739-1-chaoyong.he@corigine.com>
References: <20240619025914.3216054-1-chaoyong.he@corigine.com> <20240708055854.107739-1-chaoyong.he@corigine.com>
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 8bit
List-Id: DPDK patches and discussions

From: Long Wu

Use AVX2 instructions to accelerate Rx performance.
The acceleration only works on X86 machine.

Signed-off-by: Peng Zhang
Signed-off-by: Long Wu
Reviewed-by: Chaoyong He
---
 drivers/net/nfp/nfp_ethdev.c        |   2 +-
 drivers/net/nfp/nfp_ethdev_vf.c     |   2 +-
 drivers/net/nfp/nfp_net_meta.c      |   1 +
 drivers/net/nfp/nfp_rxtx.c          |  10 ++
 drivers/net/nfp/nfp_rxtx.h          |   1 +
 drivers/net/nfp/nfp_rxtx_vec.h      |   4 +
 drivers/net/nfp/nfp_rxtx_vec_avx2.c | 252 ++++++++++++++++++++++++++++
 drivers/net/nfp/nfp_rxtx_vec_stub.c |   9 +
 8 files changed, 279 insertions(+), 2 deletions(-)

diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index a7b40af712..bd35df2dc9 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -969,7 +969,7 @@ nfp_net_ethdev_ops_mount(struct nfp_net_hw *hw,
 
 	eth_dev->dev_ops = &nfp_net_eth_dev_ops;
 	eth_dev->rx_queue_count = nfp_net_rx_queue_count;
-	eth_dev->rx_pkt_burst = &nfp_net_recv_pkts;
+	nfp_net_recv_pkts_set(eth_dev);
 }
 
 static int
diff --git a/drivers/net/nfp/nfp_ethdev_vf.c b/drivers/net/nfp/nfp_ethdev_vf.c
index b955624ed6..cdf5da3af7 100644
--- a/drivers/net/nfp/nfp_ethdev_vf.c
+++ b/drivers/net/nfp/nfp_ethdev_vf.c
@@ -245,7 +245,7 @@ nfp_netvf_ethdev_ops_mount(struct nfp_net_hw *hw,
 
 	eth_dev->dev_ops = &nfp_netvf_eth_dev_ops;
 	eth_dev->rx_queue_count = nfp_net_rx_queue_count;
-	eth_dev->rx_pkt_burst = &nfp_net_recv_pkts;
+	nfp_net_recv_pkts_set(eth_dev);
 }
 
 static int
diff --git a/drivers/net/nfp/nfp_net_meta.c b/drivers/net/nfp/nfp_net_meta.c
index b31ef56f17..07c6758d33 100644
--- a/drivers/net/nfp/nfp_net_meta.c
+++ b/drivers/net/nfp/nfp_net_meta.c
@@ -80,6 +80,7 @@ nfp_net_meta_parse_single(uint8_t *meta_base,
 		rte_be32_t meta_header,
 		struct nfp_net_meta_parsed *meta)
 {
+	meta->flags = 0;
 	meta->flags |= (1 << NFP_NET_META_HASH);
 	meta->hash_type = rte_be_to_cpu_32(meta_header);
 	meta->hash = rte_be_to_cpu_32(*(rte_be32_t *)(meta_base + 4));
diff --git a/drivers/net/nfp/nfp_rxtx.c b/drivers/net/nfp/nfp_rxtx.c
index 1db79ad1cd..4fc3374987 100644
--- a/drivers/net/nfp/nfp_rxtx.c
+++ b/drivers/net/nfp/nfp_rxtx.c
@@ -17,6 +17,7 @@
 #include "nfp_ipsec.h"
 #include "nfp_logs.h"
 #include "nfp_net_meta.h"
+#include "nfp_rxtx_vec.h"
 
 /*
  * The bit format and map of nfp packet type for rxd.offload_info in Rx descriptor.
@@ -867,3 +868,12 @@ nfp_net_tx_queue_info_get(struct rte_eth_dev *dev,
 	info->conf.offloads = dev_info.tx_offload_capa &
 			dev->data->dev_conf.txmode.offloads;
 }
+
+void
+nfp_net_recv_pkts_set(struct rte_eth_dev *eth_dev)
+{
+	if (nfp_net_get_avx2_supported())
+		eth_dev->rx_pkt_burst = nfp_net_vec_avx2_recv_pkts;
+	else
+		eth_dev->rx_pkt_burst = nfp_net_recv_pkts;
+}
diff --git a/drivers/net/nfp/nfp_rxtx.h b/drivers/net/nfp/nfp_rxtx.h
index 3ddf717da0..fff8371991 100644
--- a/drivers/net/nfp/nfp_rxtx.h
+++ b/drivers/net/nfp/nfp_rxtx.h
@@ -244,5 +244,6 @@ void nfp_net_rx_queue_info_get(struct rte_eth_dev *dev,
 void nfp_net_tx_queue_info_get(struct rte_eth_dev *dev,
 		uint16_t queue_id,
 		struct rte_eth_txq_info *qinfo);
+void nfp_net_recv_pkts_set(struct rte_eth_dev *eth_dev);
 
 #endif /* __NFP_RXTX_H__ */
diff --git a/drivers/net/nfp/nfp_rxtx_vec.h b/drivers/net/nfp/nfp_rxtx_vec.h
index c92660f963..8720662744 100644
--- a/drivers/net/nfp/nfp_rxtx_vec.h
+++ b/drivers/net/nfp/nfp_rxtx_vec.h
@@ -10,4 +10,8 @@
 
 bool nfp_net_get_avx2_supported(void);
 
+uint16_t nfp_net_vec_avx2_recv_pkts(void *rx_queue,
+		struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts);
+
 #endif /* __NFP_RXTX_VEC_AVX2_H__ */
diff --git a/drivers/net/nfp/nfp_rxtx_vec_avx2.c b/drivers/net/nfp/nfp_rxtx_vec_avx2.c
index 50638e74ab..7c18213624 100644
--- a/drivers/net/nfp/nfp_rxtx_vec_avx2.c
+++ b/drivers/net/nfp/nfp_rxtx_vec_avx2.c
@@ -5,9 +5,14 @@
 
 #include
 
+#include
+#include
 #include
 #include
 
+#include "nfp_logs.h"
+#include "nfp_net_common.h"
+#include "nfp_net_meta.h"
 #include "nfp_rxtx_vec.h"
 
 bool
@@ -19,3 +24,250 @@ nfp_net_get_avx2_supported(void)
 
 	return false;
 }
+
+static inline void
+nfp_vec_avx2_recv_set_des1(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxds,
+		struct rte_mbuf *rxb)
+{
+	__m128i dma;
+	__m128i dma_hi;
+	__m128i vaddr0;
+	__m128i hdr_room = _mm_set_epi64x(0, RTE_PKTMBUF_HEADROOM);
+
+	dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb->buf_addr), hdr_room);
+	dma_hi = _mm_srli_epi64(dma, 32);
+	vaddr0 = _mm_unpacklo_epi32(dma_hi, dma);
+
+	_mm_storel_epi64((void *)rxds, vaddr0);
+
+	rxq->rd_p = (rxq->rd_p + 1) & (rxq->rx_count - 1);
+}
+
+static inline void
+nfp_vec_avx2_recv_set_des4(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxds,
+		struct rte_mbuf **rxb)
+{
+	__m128i dma;
+	__m128i dma_hi;
+	__m128i vaddr0;
+	__m128i vaddr1;
+	__m128i vaddr2;
+	__m128i vaddr3;
+	__m128i vaddr0_1;
+	__m128i vaddr2_3;
+	__m256i vaddr0_3;
+	__m128i hdr_room = _mm_set_epi64x(0, RTE_PKTMBUF_HEADROOM);
+
+	dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[0]->buf_addr), hdr_room);
+	dma_hi = _mm_srli_epi64(dma, 32);
+	vaddr0 = _mm_unpacklo_epi32(dma_hi, dma);
+
+	dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[1]->buf_addr), hdr_room);
+	dma_hi = _mm_srli_epi64(dma, 32);
+	vaddr1 = _mm_unpacklo_epi32(dma_hi, dma);
+
+	dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[2]->buf_addr), hdr_room);
+	dma_hi = _mm_srli_epi64(dma, 32);
+	vaddr2 = _mm_unpacklo_epi32(dma_hi, dma);
+
+	dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[3]->buf_addr), hdr_room);
+	dma_hi = _mm_srli_epi64(dma, 32);
+	vaddr3 = _mm_unpacklo_epi32(dma_hi, dma);
+
+	vaddr0_1 = _mm_unpacklo_epi64(vaddr0, vaddr1);
+	vaddr2_3 = _mm_unpacklo_epi64(vaddr2, vaddr3);
+
+	vaddr0_3 = _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0_1),
+			vaddr2_3, 1);
+
+	_mm256_store_si256((void *)rxds, vaddr0_3);
+
+	rxq->rd_p = (rxq->rd_p + 4) & (rxq->rx_count - 1);
+}
+
+static inline void
+nfp_vec_avx2_recv_set_rxpkt1(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxds,
+		struct rte_mbuf *rx_pkt)
+{
+	struct nfp_net_hw *hw = rxq->hw;
+	struct nfp_net_meta_parsed meta;
+
+	rx_pkt->data_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+	/* Size of the whole packet. We just support 1 segment */
+	rx_pkt->pkt_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+
+	/* Filling the received mbuf with packet info */
+	if (hw->rx_offset)
+		rx_pkt->data_off = RTE_PKTMBUF_HEADROOM + hw->rx_offset;
+	else
+		rx_pkt->data_off = RTE_PKTMBUF_HEADROOM + NFP_DESC_META_LEN(rxds);
+
+	rx_pkt->port = rxq->port_id;
+	rx_pkt->nb_segs = 1;
+	rx_pkt->next = NULL;
+
+	nfp_net_meta_parse(rxds, rxq, hw, rx_pkt, &meta);
+
+	/* Checking the checksum flag */
+	nfp_net_rx_cksum(rxq, rxds, rx_pkt);
+}
+
+static inline void
+nfp_vec_avx2_recv1(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxds,
+		struct rte_mbuf *rxb,
+		struct rte_mbuf *rx_pkt)
+{
+	nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds, rx_pkt);
+
+	nfp_vec_avx2_recv_set_des1(rxq, rxds, rxb);
+}
+
+static inline void
+nfp_vec_avx2_recv4(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxds,
+		struct rte_mbuf **rxb,
+		struct rte_mbuf **rx_pkts)
+{
+	nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds, rx_pkts[0]);
+	nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds + 1, rx_pkts[1]);
+	nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds + 2, rx_pkts[2]);
+	nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds + 3, rx_pkts[3]);
+
+	nfp_vec_avx2_recv_set_des4(rxq, rxds, rxb);
+}
+
+static inline bool
+nfp_vec_avx2_recv_check_packets4(struct nfp_net_rx_desc *rxds)
+{
+	__m256i data = _mm256_loadu_si256((void *)rxds);
+
+	if ((_mm256_extract_epi8(data, 3) & PCIE_DESC_RX_DD) == 0 ||
+			(_mm256_extract_epi8(data, 11) & PCIE_DESC_RX_DD) == 0 ||
+			(_mm256_extract_epi8(data, 19) & PCIE_DESC_RX_DD) == 0 ||
+			(_mm256_extract_epi8(data, 27) & PCIE_DESC_RX_DD) == 0)
+		return false;
+
+	return true;
+}
+
+uint16_t
+nfp_net_vec_avx2_recv_pkts(void *rx_queue,
+		struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts)
+{
+	uint16_t avail;
+	uint16_t nb_hold;
+	bool burst_receive;
+	struct rte_mbuf **rxb;
+	struct nfp_net_rx_desc *rxds;
+	struct nfp_net_rxq *rxq = rx_queue;
+
+	if (unlikely(rxq == NULL)) {
+		PMD_RX_LOG(ERR, "RX Bad queue");
+		return 0;
+	}
+
+	avail = 0;
+	nb_hold = 0;
+	burst_receive = true;
+	while (avail < nb_pkts) {
+		rxds = &rxq->rxds[rxq->rd_p];
+		rxb = &rxq->rxbufs[rxq->rd_p].mbuf;
+
+		if ((_mm_extract_epi8(_mm_loadu_si128((void *)(rxds)), 3)
+				& PCIE_DESC_RX_DD) == 0)
+			goto recv_end;
+
+		rte_prefetch0(rxq->rxbufs[rxq->rd_p].mbuf);
+
+		if ((rxq->rd_p & 0x3) == 0) {
+			rte_prefetch0(&rxq->rxds[rxq->rd_p]);
+			rte_prefetch0(&rxq->rxbufs[rxq->rd_p]);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 1].mbuf);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 2].mbuf);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 3].mbuf);
+		}
+
+		if ((rxq->rd_p & 0x7) == 0) {
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 4].mbuf);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 5].mbuf);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 6].mbuf);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 7].mbuf);
+		}
+
+		/*
+		 * If can not receive burst, just receive one.
+		 * 1. Rx ring will coming to the tail.
+		 * 2. Do not need to receive 4 packets.
+		 * 3. If pointer address unaligned on 32-bit boundary.
+		 * 4. Rx ring does not have 4 packets or alloc 4 mbufs failed.
+		 */
+		if ((rxq->rx_count - rxq->rd_p) < 4 ||
+				(nb_pkts - avail) < 4 ||
+				((uintptr_t)rxds & 0x1F) != 0 ||
+				!burst_receive) {
+			_mm_storel_epi64((void *)&rx_pkts[avail],
+					_mm_loadu_si128((void *)rxb));
+
+			/* Allocate a new mbuf into the software ring. */
+			if (rte_pktmbuf_alloc_bulk(rxq->mem_pool, rxb, 1) < 0) {
+				PMD_RX_LOG(DEBUG, "RX mbuf alloc failed port_id=%u queue_id=%hu",
+						rxq->port_id, rxq->qidx);
+				nfp_net_mbuf_alloc_failed(rxq);
+				goto recv_end;
+			}
+
+			nfp_vec_avx2_recv1(rxq, rxds, *rxb, rx_pkts[avail]);
+
+			avail++;
+			nb_hold++;
+			continue;
+		}
+
+		burst_receive = nfp_vec_avx2_recv_check_packets4(rxds);
+		if (!burst_receive)
+			continue;
+
+		_mm256_storeu_si256((void *)&rx_pkts[avail],
+				_mm256_loadu_si256((void *)rxb));
+
+		/* Allocate 4 new mbufs into the software ring. */
+		if (rte_pktmbuf_alloc_bulk(rxq->mem_pool, rxb, 4) < 0) {
+			burst_receive = false;
+			continue;
+		}
+
+		nfp_vec_avx2_recv4(rxq, rxds, rxb, &rx_pkts[avail]);
+
+		avail += 4;
+		nb_hold += 4;
+	}
+
+recv_end:
+	if (nb_hold == 0)
+		return nb_hold;
+
+	PMD_RX_LOG(DEBUG, "RX port_id=%u queue_id=%u, %d packets received",
+			rxq->port_id, (unsigned int)rxq->qidx, nb_hold);
+
+	nb_hold += rxq->nb_rx_hold;
+
+	/*
+	 * FL descriptors needs to be written before incrementing the
+	 * FL queue WR pointer
+	 */
+	rte_wmb();
+	if (nb_hold > rxq->rx_free_thresh) {
+		PMD_RX_LOG(DEBUG, "port=%hu queue=%hu nb_hold=%hu avail=%hu",
+				rxq->port_id, rxq->qidx, nb_hold, avail);
+		nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, nb_hold);
+		nb_hold = 0;
+	}
+	rxq->nb_rx_hold = nb_hold;
+
+	return avail;
+}
diff --git a/drivers/net/nfp/nfp_rxtx_vec_stub.c b/drivers/net/nfp/nfp_rxtx_vec_stub.c
index 1bc55b67e0..c480f61ef0 100644
--- a/drivers/net/nfp/nfp_rxtx_vec_stub.c
+++ b/drivers/net/nfp/nfp_rxtx_vec_stub.c
@@ -6,6 +6,7 @@
 #include
 
 #include
+#include
 
 #include "nfp_rxtx_vec.h"
 
@@ -14,3 +15,11 @@ nfp_net_get_avx2_supported(void)
 {
 	return false;
 }
+
+uint16_t __rte_weak
+nfp_net_vec_avx2_recv_pkts(__rte_unused void *rx_queue,
+		__rte_unused struct rte_mbuf **rx_pkts,
+		__rte_unused uint16_t nb_pkts)
+{
+	return 0;
+}
-- 
2.39.1
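
Note for readers of the refill path: the shuffle in nfp_vec_avx2_recv_set_des1() and the burst/single decision in the receive loop can be hard to follow from the intrinsics alone. Below is a minimal scalar sketch of what they appear to compute, not code from the patch. The sketch_* names and SKETCH_HEADROOM are hypothetical stand-ins (the latter for RTE_PKTMBUF_HEADROOM), and the ring size is assumed to be a power of two, as the `& (rx_count - 1)` mask in the patch implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for RTE_PKTMBUF_HEADROOM (a DPDK build-time option). */
#define SKETCH_HEADROOM 128u

/*
 * Scalar view of the add / shift / unpack sequence in
 * nfp_vec_avx2_recv_set_des1(): the 64-bit value stored to the descriptor
 * carries the upper 32 address bits in its low word and the lower 32
 * address bits in its high word.
 */
static inline uint64_t
sketch_pack_fl_addr(uint64_t buf_addr)
{
	uint64_t dma = buf_addr + SKETCH_HEADROOM;

	return (dma << 32) | (dma >> 32);
}

/* Ring pointer update as in both refill helpers; assumes a power-of-two ring. */
static inline uint32_t
sketch_ring_advance(uint32_t rd_p, uint32_t n, uint32_t rx_count)
{
	assert(rx_count != 0 && (rx_count & (rx_count - 1)) == 0);

	return (rd_p + n) & (rx_count - 1);
}

/* Mirrors the four fallback conditions listed in the receive loop comment. */
static inline bool
sketch_can_burst4(uint32_t rx_count, uint32_t rd_p, uint16_t remaining,
		uintptr_t desc_addr, bool last_burst_ok)
{
	return (rx_count - rd_p) >= 4 &&
			remaining >= 4 &&
			(desc_addr & 0x1F) == 0 &&
			last_burst_ok;
}
```

For example, sketch_pack_fl_addr(0x123456780) with the assumed 128-byte headroom gives 0x2345680000000001. The 0x1F mask checks 32-byte alignment, which _mm256_store_si256() requires for the four-descriptor write in nfp_vec_avx2_recv_set_des4().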