From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chaoyong He
To: dev@dpdk.org
Cc: oss-drivers@corigine.com, Long Wu, Peng Zhang, Chaoyong He
Subject: [PATCH 3/4] net/nfp: support AVX2 Rx function
Date: Wed, 19 Jun 2024 10:59:13 +0800
Message-Id: <20240619025914.3216054-4-chaoyong.he@corigine.com>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20240619025914.3216054-1-chaoyong.he@corigine.com>
References: <20240619025914.3216054-1-chaoyong.he@corigine.com>
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 8bit
List-Id: DPDK patches and discussions

From: Long Wu

Use AVX2 instructions to accelerate Rx performance.
The acceleration only works on x86 machines.

Signed-off-by: Peng Zhang
Signed-off-by: Long Wu
Reviewed-by: Chaoyong He
---
 drivers/net/nfp/nfp_ethdev.c        |   2 +-
 drivers/net/nfp/nfp_ethdev_vf.c     |   2 +-
 drivers/net/nfp/nfp_net_meta.c      |   1 +
 drivers/net/nfp/nfp_rxtx.c          |  10 ++
 drivers/net/nfp/nfp_rxtx.h          |   1 +
 drivers/net/nfp/nfp_rxtx_vec.h      |   4 +
 drivers/net/nfp/nfp_rxtx_vec_avx2.c | 252 ++++++++++++++++++++++++++++
 drivers/net/nfp/nfp_rxtx_vec_stub.c |   9 +
 8 files changed, 279 insertions(+), 2 deletions(-)

diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index acf9a73690..71c4f35c56 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -892,7 +892,7 @@ nfp_net_ethdev_ops_mount(struct nfp_net_hw *hw,
 
 	eth_dev->dev_ops = &nfp_net_eth_dev_ops;
 	eth_dev->rx_queue_count = nfp_net_rx_queue_count;
-	eth_dev->rx_pkt_burst = &nfp_net_recv_pkts;
+	nfp_net_recv_pkts_set(eth_dev);
 }
 
 static int
diff --git a/drivers/net/nfp/nfp_ethdev_vf.c b/drivers/net/nfp/nfp_ethdev_vf.c
index 63ea0a5d17..a5c600c87b 100644
--- a/drivers/net/nfp/nfp_ethdev_vf.c
+++ b/drivers/net/nfp/nfp_ethdev_vf.c
@@ -245,7 +245,7 @@ nfp_netvf_ethdev_ops_mount(struct nfp_net_hw *hw,
 
 	eth_dev->dev_ops = &nfp_netvf_eth_dev_ops;
 	eth_dev->rx_queue_count = nfp_net_rx_queue_count;
-	eth_dev->rx_pkt_burst = &nfp_net_recv_pkts;
+	nfp_net_recv_pkts_set(eth_dev);
 }
 
 static int
diff --git a/drivers/net/nfp/nfp_net_meta.c b/drivers/net/nfp/nfp_net_meta.c
index b31ef56f17..07c6758d33 100644
--- a/drivers/net/nfp/nfp_net_meta.c
+++ b/drivers/net/nfp/nfp_net_meta.c
@@ -80,6 +80,7 @@ nfp_net_meta_parse_single(uint8_t *meta_base,
 		rte_be32_t meta_header,
 		struct nfp_net_meta_parsed *meta)
 {
+	meta->flags = 0;
 	meta->flags |= (1 << NFP_NET_META_HASH);
 	meta->hash_type = rte_be_to_cpu_32(meta_header);
 	meta->hash = rte_be_to_cpu_32(*(rte_be32_t *)(meta_base + 4));
diff --git a/drivers/net/nfp/nfp_rxtx.c b/drivers/net/nfp/nfp_rxtx.c
index 1db79ad1cd..4fc3374987 100644
--- a/drivers/net/nfp/nfp_rxtx.c
+++ b/drivers/net/nfp/nfp_rxtx.c
@@ -17,6 +17,7 @@
 #include "nfp_ipsec.h"
 #include "nfp_logs.h"
 #include "nfp_net_meta.h"
+#include "nfp_rxtx_vec.h"
 
 /*
  * The bit format and map of nfp packet type for rxd.offload_info in Rx descriptor.
@@ -867,3 +868,12 @@ nfp_net_tx_queue_info_get(struct rte_eth_dev *dev,
 	info->conf.offloads = dev_info.tx_offload_capa &
 			dev->data->dev_conf.txmode.offloads;
 }
+
+void
+nfp_net_recv_pkts_set(struct rte_eth_dev *eth_dev)
+{
+	if (nfp_net_get_avx2_supported())
+		eth_dev->rx_pkt_burst = nfp_net_vec_avx2_recv_pkts;
+	else
+		eth_dev->rx_pkt_burst = nfp_net_recv_pkts;
+}
diff --git a/drivers/net/nfp/nfp_rxtx.h b/drivers/net/nfp/nfp_rxtx.h
index 3ddf717da0..fff8371991 100644
--- a/drivers/net/nfp/nfp_rxtx.h
+++ b/drivers/net/nfp/nfp_rxtx.h
@@ -244,5 +244,6 @@ void nfp_net_rx_queue_info_get(struct rte_eth_dev *dev,
 void nfp_net_tx_queue_info_get(struct rte_eth_dev *dev,
 		uint16_t queue_id,
 		struct rte_eth_txq_info *qinfo);
+void nfp_net_recv_pkts_set(struct rte_eth_dev *eth_dev);
 
 #endif /* __NFP_RXTX_H__ */
diff --git a/drivers/net/nfp/nfp_rxtx_vec.h b/drivers/net/nfp/nfp_rxtx_vec.h
index c92660f963..8720662744 100644
--- a/drivers/net/nfp/nfp_rxtx_vec.h
+++ b/drivers/net/nfp/nfp_rxtx_vec.h
@@ -10,4 +10,8 @@
 
 bool nfp_net_get_avx2_supported(void);
 
+uint16_t nfp_net_vec_avx2_recv_pkts(void *rx_queue,
+		struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts);
+
 #endif /* __NFP_RXTX_VEC_AVX2_H__ */
diff --git a/drivers/net/nfp/nfp_rxtx_vec_avx2.c b/drivers/net/nfp/nfp_rxtx_vec_avx2.c
index 50638e74ab..7c18213624 100644
--- a/drivers/net/nfp/nfp_rxtx_vec_avx2.c
+++ b/drivers/net/nfp/nfp_rxtx_vec_avx2.c
@@ -5,9 +5,14 @@
 
 #include
 
+#include
+#include
 #include
 #include
 
+#include "nfp_logs.h"
+#include "nfp_net_common.h"
+#include "nfp_net_meta.h"
 #include "nfp_rxtx_vec.h"
 
 bool
@@ -19,3 +24,250 @@ nfp_net_get_avx2_supported(void)
 
 	return false;
 }
+
+static inline void
+nfp_vec_avx2_recv_set_des1(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxds,
+		struct rte_mbuf *rxb)
+{
+	__m128i dma;
+	__m128i dma_hi;
+	__m128i vaddr0;
+	__m128i hdr_room = _mm_set_epi64x(0, RTE_PKTMBUF_HEADROOM);
+
+	dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb->buf_addr), hdr_room);
+	dma_hi = _mm_srli_epi64(dma, 32);
+	vaddr0 = _mm_unpacklo_epi32(dma_hi, dma);
+
+	_mm_storel_epi64((void *)rxds, vaddr0);
+
+	rxq->rd_p = (rxq->rd_p + 1) & (rxq->rx_count - 1);
+}
+
+static inline void
+nfp_vec_avx2_recv_set_des4(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxds,
+		struct rte_mbuf **rxb)
+{
+	__m128i dma;
+	__m128i dma_hi;
+	__m128i vaddr0;
+	__m128i vaddr1;
+	__m128i vaddr2;
+	__m128i vaddr3;
+	__m128i vaddr0_1;
+	__m128i vaddr2_3;
+	__m256i vaddr0_3;
+	__m128i hdr_room = _mm_set_epi64x(0, RTE_PKTMBUF_HEADROOM);
+
+	dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[0]->buf_addr), hdr_room);
+	dma_hi = _mm_srli_epi64(dma, 32);
+	vaddr0 = _mm_unpacklo_epi32(dma_hi, dma);
+
+	dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[1]->buf_addr), hdr_room);
+	dma_hi = _mm_srli_epi64(dma, 32);
+	vaddr1 = _mm_unpacklo_epi32(dma_hi, dma);
+
+	dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[2]->buf_addr), hdr_room);
+	dma_hi = _mm_srli_epi64(dma, 32);
+	vaddr2 = _mm_unpacklo_epi32(dma_hi, dma);
+
+	dma = _mm_add_epi64(_mm_loadu_si128((__m128i *)&rxb[3]->buf_addr), hdr_room);
+	dma_hi = _mm_srli_epi64(dma, 32);
+	vaddr3 = _mm_unpacklo_epi32(dma_hi, dma);
+
+	vaddr0_1 = _mm_unpacklo_epi64(vaddr0, vaddr1);
+	vaddr2_3 = _mm_unpacklo_epi64(vaddr2, vaddr3);
+
+	vaddr0_3 = _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0_1),
+			vaddr2_3, 1);
+
+	_mm256_store_si256((void *)rxds, vaddr0_3);
+
+	rxq->rd_p = (rxq->rd_p + 4) & (rxq->rx_count - 1);
+}
+
+static inline void
+nfp_vec_avx2_recv_set_rxpkt1(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxds,
+		struct rte_mbuf *rx_pkt)
+{
+	struct nfp_net_hw *hw = rxq->hw;
+	struct nfp_net_meta_parsed meta;
+
+	rx_pkt->data_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+	/* Size of the whole packet. Only 1 segment is supported. */
+	rx_pkt->pkt_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+
+	/* Filling the received mbuf with packet info */
+	if (hw->rx_offset)
+		rx_pkt->data_off = RTE_PKTMBUF_HEADROOM + hw->rx_offset;
+	else
+		rx_pkt->data_off = RTE_PKTMBUF_HEADROOM + NFP_DESC_META_LEN(rxds);
+
+	rx_pkt->port = rxq->port_id;
+	rx_pkt->nb_segs = 1;
+	rx_pkt->next = NULL;
+
+	nfp_net_meta_parse(rxds, rxq, hw, rx_pkt, &meta);
+
+	/* Checking the checksum flag */
+	nfp_net_rx_cksum(rxq, rxds, rx_pkt);
+}
+
+static inline void
+nfp_vec_avx2_recv1(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxds,
+		struct rte_mbuf *rxb,
+		struct rte_mbuf *rx_pkt)
+{
+	nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds, rx_pkt);
+
+	nfp_vec_avx2_recv_set_des1(rxq, rxds, rxb);
+}
+
+static inline void
+nfp_vec_avx2_recv4(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxds,
+		struct rte_mbuf **rxb,
+		struct rte_mbuf **rx_pkts)
+{
+	nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds, rx_pkts[0]);
+	nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds + 1, rx_pkts[1]);
+	nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds + 2, rx_pkts[2]);
+	nfp_vec_avx2_recv_set_rxpkt1(rxq, rxds + 3, rx_pkts[3]);
+
+	nfp_vec_avx2_recv_set_des4(rxq, rxds, rxb);
+}
+
+static inline bool
+nfp_vec_avx2_recv_check_packets4(struct nfp_net_rx_desc *rxds)
+{
+	__m256i data = _mm256_loadu_si256((void *)rxds);
+
+	if ((_mm256_extract_epi8(data, 3) & PCIE_DESC_RX_DD) == 0 ||
+			(_mm256_extract_epi8(data, 11) & PCIE_DESC_RX_DD) == 0 ||
+			(_mm256_extract_epi8(data, 19) & PCIE_DESC_RX_DD) == 0 ||
+			(_mm256_extract_epi8(data, 27) & PCIE_DESC_RX_DD) == 0)
+		return false;
+
+	return true;
+}
+
+uint16_t
+nfp_net_vec_avx2_recv_pkts(void *rx_queue,
+		struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts)
+{
+	uint16_t avail;
+	uint16_t nb_hold;
+	bool burst_receive;
+	struct rte_mbuf **rxb;
+	struct nfp_net_rx_desc *rxds;
+	struct nfp_net_rxq *rxq = rx_queue;
+
+	if (unlikely(rxq == NULL)) {
+		PMD_RX_LOG(ERR, "RX Bad queue");
+		return 0;
+	}
+
+	avail = 0;
+	nb_hold = 0;
+	burst_receive = true;
+	while (avail < nb_pkts) {
+		rxds = &rxq->rxds[rxq->rd_p];
+		rxb = &rxq->rxbufs[rxq->rd_p].mbuf;
+
+		if ((_mm_extract_epi8(_mm_loadu_si128((void *)(rxds)), 3)
+				& PCIE_DESC_RX_DD) == 0)
+			goto recv_end;
+
+		rte_prefetch0(rxq->rxbufs[rxq->rd_p].mbuf);
+
+		if ((rxq->rd_p & 0x3) == 0) {
+			rte_prefetch0(&rxq->rxds[rxq->rd_p]);
+			rte_prefetch0(&rxq->rxbufs[rxq->rd_p]);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 1].mbuf);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 2].mbuf);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 3].mbuf);
+		}
+
+		if ((rxq->rd_p & 0x7) == 0) {
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 4].mbuf);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 5].mbuf);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 6].mbuf);
+			rte_prefetch0(rxq->rxbufs[rxq->rd_p + 7].mbuf);
+		}
+
+		/*
+		 * If a burst of 4 cannot be received, receive just one packet:
+		 * 1. Fewer than 4 descriptors remain before the end of the Rx ring.
+		 * 2. Fewer than 4 packets are still requested.
+		 * 3. The descriptor address is not aligned on a 32-byte boundary.
+		 * 4. The Rx ring does not hold 4 packets, or allocating 4 mbufs failed.
+		 */
+		if ((rxq->rx_count - rxq->rd_p) < 4 ||
+				(nb_pkts - avail) < 4 ||
+				((uintptr_t)rxds & 0x1F) != 0 ||
+				!burst_receive) {
+			_mm_storel_epi64((void *)&rx_pkts[avail],
+					_mm_loadu_si128((void *)rxb));
+
+			/* Allocate a new mbuf into the software ring. */
+			if (rte_pktmbuf_alloc_bulk(rxq->mem_pool, rxb, 1) < 0) {
+				PMD_RX_LOG(DEBUG, "RX mbuf alloc failed port_id=%u queue_id=%hu",
+						rxq->port_id, rxq->qidx);
+				nfp_net_mbuf_alloc_failed(rxq);
+				goto recv_end;
+			}
+
+			nfp_vec_avx2_recv1(rxq, rxds, *rxb, rx_pkts[avail]);
+
+			avail++;
+			nb_hold++;
+			continue;
+		}
+
+		burst_receive = nfp_vec_avx2_recv_check_packets4(rxds);
+		if (!burst_receive)
+			continue;
+
+		_mm256_storeu_si256((void *)&rx_pkts[avail],
+				_mm256_loadu_si256((void *)rxb));
+
+		/* Allocate 4 new mbufs into the software ring. */
+		if (rte_pktmbuf_alloc_bulk(rxq->mem_pool, rxb, 4) < 0) {
+			burst_receive = false;
+			continue;
+		}
+
+		nfp_vec_avx2_recv4(rxq, rxds, rxb, &rx_pkts[avail]);
+
+		avail += 4;
+		nb_hold += 4;
+	}
+
+recv_end:
+	if (nb_hold == 0)
+		return nb_hold;
+
+	PMD_RX_LOG(DEBUG, "RX port_id=%u queue_id=%u, %d packets received",
+			rxq->port_id, (unsigned int)rxq->qidx, nb_hold);
+
+	nb_hold += rxq->nb_rx_hold;
+
+	/*
+	 * FL descriptors need to be written before incrementing the
+	 * FL queue WR pointer.
+	 */
+	rte_wmb();
+	if (nb_hold > rxq->rx_free_thresh) {
+		PMD_RX_LOG(DEBUG, "port=%hu queue=%hu nb_hold=%hu avail=%hu",
+				rxq->port_id, rxq->qidx, nb_hold, avail);
+		nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, nb_hold);
+		nb_hold = 0;
+	}
+	rxq->nb_rx_hold = nb_hold;
+
+	return avail;
+}
diff --git a/drivers/net/nfp/nfp_rxtx_vec_stub.c b/drivers/net/nfp/nfp_rxtx_vec_stub.c
index 1bc55b67e0..c480f61ef0 100644
--- a/drivers/net/nfp/nfp_rxtx_vec_stub.c
+++ b/drivers/net/nfp/nfp_rxtx_vec_stub.c
@@ -6,6 +6,7 @@
 #include
 
 #include
+#include
 
 #include "nfp_rxtx_vec.h"
 
@@ -14,3 +15,11 @@ nfp_net_get_avx2_supported(void)
 {
 	return false;
 }
+
+uint16_t __rte_weak
+nfp_net_vec_avx2_recv_pkts(__rte_unused void *rx_queue,
+		__rte_unused struct rte_mbuf **rx_pkts,
+		__rte_unused uint16_t nb_pkts)
+{
+	return 0;
+}
-- 
2.39.1
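
For readers who do not read intrinsics fluently, the sketch below restates
in scalar C what the vector helpers in nfp_rxtx_vec_avx2.c appear to
compute. It is an illustration under stated assumptions, not driver code:
every sketch_* name is hypothetical, SKETCH_DD is a stand-in for
PCIE_DESC_RX_DD with an assumed value, and the ring size is assumed to be
a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for PCIE_DESC_RX_DD; the real value is in the driver headers. */
#define SKETCH_DD 0x80u

/*
 * Scalar view of _mm_srli_epi64(dma, 32) followed by
 * _mm_unpacklo_epi32(dma_hi, dma): the first 32-bit word written to the
 * descriptor is the upper half of the address, the second is the lower half.
 * The result is shown as the 64-bit value stored by _mm_storel_epi64().
 */
static uint64_t
sketch_pack_addr(uint64_t addr)
{
	uint32_t hi = (uint32_t)(addr >> 32);
	uint32_t lo = (uint32_t)addr;

	return ((uint64_t)lo << 32) | hi;
}

/*
 * Scalar view of nfp_vec_avx2_recv_check_packets4(): four 8-byte
 * descriptors are read as one 32-byte block, and byte 3 of each
 * (offsets 3, 11, 19 and 27) must have the DD bit set.
 */
static bool
sketch_dd_ready4(const uint8_t block[32])
{
	for (int i = 0; i < 4; i++) {
		if ((block[i * 8 + 3] & SKETCH_DD) == 0)
			return false;
	}

	return true;
}

/*
 * Scalar view of the burst/single decision in nfp_net_vec_avx2_recv_pkts().
 * burst_ok mirrors the burst_receive flag carried across loop iterations.
 */
static bool
sketch_can_burst4(uint16_t rx_count, uint16_t rd_p, uint16_t nb_pkts,
		uint16_t avail, uintptr_t desc_addr, bool burst_ok)
{
	if ((rx_count - rd_p) < 4)
		return false;	/* fewer than 4 descriptors before the ring end */
	if ((nb_pkts - avail) < 4)
		return false;	/* caller wants fewer than 4 more packets */
	if ((desc_addr & 0x1F) != 0)
		return false;	/* descriptor block is not 32-byte aligned */

	return burst_ok;
}

/* Read pointer advance with wrap-around; valid only for power-of-two rings. */
static uint16_t
sketch_ring_advance(uint16_t rd_p, uint16_t n, uint16_t rx_count)
{
	return (uint16_t)((rd_p + n) & (rx_count - 1));
}
```

The 32-byte alignment test corresponds to the aligned
_mm256_store_si256() used in nfp_vec_avx2_recv_set_des4(); the single
packet path uses only unaligned loads and an 8-byte store, so it has no
such requirement.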