From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1BB28A034F for ; Wed, 10 Nov 2021 07:51:07 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1490840142; Wed, 10 Nov 2021 07:51:07 +0100 (CET) Received: from NAM02-SN1-obe.outbound.protection.outlook.com (mail-sn1anam02on2059.outbound.protection.outlook.com [40.107.96.59]) by mails.dpdk.org (Postfix) with ESMTP id BFAB340142 for ; Wed, 10 Nov 2021 07:51:05 +0100 (CET) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=NntCx9cyoftotzxvhuKn1KqYm46Yiwq2zKJiOEHEup6rv4UAxHxzIhQMvkryj7+Wkxu8IYwnIFERUOt4UD16RlVwWJE15Pf6cHZLhKYhWvUaL2AA6LjsdBc3YXv6eBOy0QzqnFrQgssjSxntLiHzwPno6gcuk1ZzhL5huNWfUm2fnPOVY4r6LhJfHd9+/IEeYY2YOgJIr5Q7ncjtYG6Oi7z6JiguI2WgkM8kdZDHLThvAufwnCWwkWPujMJSUhTSIgSe2cOubBA45JTcGI3MkpVbHRzbtCoUD12KfaTESiUQv46ilXoIC9lfOVfcBqrtqG0xx1QoFmWaMjC9K/0ydw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=V3NgkG5miONMhDHvPTqORTD/Cp+iBuIao2CC9kjfJZA=; b=QF2msb74okrJ/gV+kamHVSJ8TuVpEqwGL925lQi/oKzwsBK5Jxp5R4zGlFkM9iExt77ilj4foYBkUL+ZDGtSqKeQualtLopVXnyMbla6jUCxkGKkAT6XZQRjnSy1ZXKeeYOvZtYJ0F9xOTbUdpLK9ymGggNd9Vug8BsbM93bhZO14NKcGzkW9L7e+wU5FsY/wnigFFaGwLx2t0JJgUUfD03D23hkbHB1tRHk6MyUrsz9rnwuIPZ9cuUW9UB3TiJ3A4wGrjx5K/z6siRVz4WVbVaJ3HK7kS1qt/n6PxMeJk6z2+lo+nzWYEL4Uvv/bkBpZFGskppKudCSa1ap4tRzxQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.112.34) smtp.rcpttodomain=intel.com smtp.mailfrom=nvidia.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=V3NgkG5miONMhDHvPTqORTD/Cp+iBuIao2CC9kjfJZA=; b=AeGLisYd/yVx0GQ8mZeg7bAcJILxoaIfzXuBNH0v7Y25X43gl/V9RVmQmHBfuiOI9mLqsPk4SG0qTLpT7+AzaWuTB8KcoVd95LivVHDHOrprWvHlvb5Nd7UEQeseYkD0F2kwXCGXIm1KJXkTg0UGKtVDpbCsvs/jYBA02bWohQNoJS8rzB4jVVlaFPSlRZuw3ks1Tjv01CnR7AqPk8BvNalSaKckc/LNgsrVJ+E7cnZ0LbxP3+brLsPH4vzKnWS1XnKM6TElcV9jjAibYDeH0lqNMb36r6Nm73uC/x0pKh9sJrjhnoqKS548lOzQpp4M07UoVihSok6Ah556PNFE5w== Received: from BN8PR07CA0004.namprd07.prod.outlook.com (2603:10b6:408:ac::17) by CY4PR1201MB2551.namprd12.prod.outlook.com (2603:10b6:903:d9::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.13; Wed, 10 Nov 2021 06:51:03 +0000 Received: from BN8NAM11FT025.eop-nam11.prod.protection.outlook.com (2603:10b6:408:ac:cafe::39) by BN8PR07CA0004.outlook.office365.com (2603:10b6:408:ac::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4669.13 via Frontend Transport; Wed, 10 Nov 2021 06:51:03 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.112.34) smtp.mailfrom=nvidia.com; intel.com; dkim=none (message not signed) header.d=none;intel.com; dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.112.34 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.112.34; helo=mail.nvidia.com; Received: from mail.nvidia.com (216.228.112.34) by BN8NAM11FT025.mail.protection.outlook.com (10.13.177.136) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.4669.10 via Frontend Transport; Wed, 10 Nov 2021 06:51:02 +0000 Received: from nvidia.com (172.20.187.5) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Wed, 10 Nov 2021 06:50:59 +0000 From: Xueming Li To: Leyi Rong CC: Luca Boccassi , Bruce Richardson , Ferruh Yigit , dpdk stable Date: Wed, 10 Nov 2021 14:30:49 +0800 Message-ID: <20211110063216.2744012-166-xuemingl@nvidia.com> X-Mailer: git-send-email 2.33.0 In-Reply-To: <20211110063216.2744012-1-xuemingl@nvidia.com> References: <20211110063216.2744012-1-xuemingl@nvidia.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain X-Originating-IP: [172.20.187.5] X-ClientProxiedBy: HQMAIL101.nvidia.com (172.20.187.10) To HQMAIL107.nvidia.com (172.20.187.13) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 20993680-518e-4492-4861-08d9a41675fa X-MS-TrafficTypeDiagnostic: CY4PR1201MB2551: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:9508; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: JNi+udqmi6M/1RmErj9qvRksd31h7KatRoZrYf9virJw5drjHNyTIYUq/Dj4UBS8thE5gYwsjU/okXU6Y83T0jItvkxwtf714+epVKKkFPNBQSMJ1AS5gQFkrj01Mzjf2oo9ewrouV8Md/GN0ILIfojyh1hzaxSDccehW6/nlVDLIHPpYuDX9DHM43riKyCNoutzVru+GZjAVuGhHa4lB/wQ5qyhiCvN+zzuztdPHYPXqFXnU9i8WwDTIzoKYHMmd1OUFunxHTWPe9T10EImoCfpl9ThRgpVUsk/afDoakTQ9cdrqV2ERXbYMhKtQGK87LaJVeCYhbHXNVCS7mA22nO2+BP7vvDhCEHUWxaf0G/TJ88wcK6f/fN1Mlims5I/3cpMhmhOyIhjl1yw/YkIVuuBpsj6J8fe26k4CfgGG7sHJhg8EGd4/Bd4tHbuFkpf+NM8JEYKRJmGnnnpZZ05Q2/zaGvBnaATGnJtMynTbDaXQj2ohjsrQ7rGlRuOgmklwNy4GJ09t++WiTHXr8KU6sLKzOagPbGJwSLfqiK/04dYbd+Ww2eAO7+c7qfOLliRRcwCjPSKT/HLatMEPMIT3CYUgwt/bsjDLMuEbl/u83M6Xz4yyM9BFfWMc/XRFzjuJGh1iyfFae5mY2RCKLbt7RwxsgWMvPrlMiLYYEVGGHrhcqI1Mkk+qvSfNgvEa02MAOwlc9XqeZy1JSWuBm+xAzI0OgIoTjShTk/jXo1So/3HtnD+y4wE5JdiQvfSOR+eYWsa7S+mxoDpmmjbyy9Y46+k7erqDnV3ZQf9istRsNb69oc29zWSRURWuQX10SIlczVrMGJ3mPqy31rpsVJFX3nF95ROTY5JDGmVURTkczo= X-Forefront-Antispam-Report: CIP:216.228.112.34; CTRY:US; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:mail.nvidia.com; PTR:schybrid03.nvidia.com; CAT:NONE; SFS:(4636009)(46966006)(36840700001)(70586007)(336012)(70206006)(508600001)(7696005)(2906002)(6916009)(55016002)(966005)(426003)(16526019)(82310400003)(7636003)(26005)(1076003)(186003)(36756003)(316002)(83380400001)(19627235002)(86362001)(36860700001)(53546011)(6286002)(5660300002)(356005)(4326008)(36906005)(8936002)(30864003)(4001150100001)(6666004)(2616005)(8676002)(54906003)(47076005)(41533002); DIR:OUT; SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 10 Nov 2021 06:51:02.3499 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 20993680-518e-4492-4861-08d9a41675fa X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a; Ip=[216.228.112.34]; Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: BN8NAM11FT025.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CY4PR1201MB2551 Subject: [dpdk-stable] patch 'net/ice: fix generic build on FreeBSD' has been queued to stable release 20.11.4 X-BeenThere: stable@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: patches for DPDK stable branches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: stable-bounces@dpdk.org Sender: "stable" Hi, FYI, your patch has been queued to stable release 20.11.4 Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet. It will be pushed if I get no objections before 11/12/21. So please shout if anyone has objections. Also note that after the patch there's a diff of the upstream commit vs the patch applied to the branch. This will indicate if there was any rebasing needed to apply to the stable branch. If there were code changes for rebasing (ie: not only metadata diffs), please double check that the rebase was correctly done. Queued patches are on a temporary branch at: https://github.com/steevenlee/dpdk This queued commit can be viewed at: https://github.com/steevenlee/dpdk/commit/84a1e5c1369a24a3728ff3ff5b8f2a13a2c96e5a Thanks. Xueming Li --- >From 84a1e5c1369a24a3728ff3ff5b8f2a13a2c96e5a Mon Sep 17 00:00:00 2001 From: Leyi Rong Date: Tue, 19 Oct 2021 11:02:08 +0800 Subject: [PATCH] net/ice: fix generic build on FreeBSD Cc: Xueming Li [ upstream commit 0d989ff9ca515f80fbab2aad18cede501b759ff9 ] The common header file for vectorization is included in multiple files, and so must use macros for the current compilation unit, rather than the compiler-capability flag set for the whole driver. With the current, incorrect, macro, the AVX512 or AVX2 flags may be set when compiling up SSE code, leading to compilation errors. Changing from "CC_AVX*_SUPPORT" to the compiler-defined "__AVX*__" macros fixes this issue. In addition, splitting AVX-specific code into the new ice_rxtx_common_avx.h header file to avoid such bugs. Bugzilla ID: 788 Fixes: a4e480de268e ("net/ice: optimize Tx by using AVX512") Fixes: 20daa1c978b7 ("net/ice: fix crash in AVX512") Signed-off-by: Leyi Rong Signed-off-by: Bruce Richardson Reviewed-by: Ferruh Yigit --- drivers/net/ice/ice_rxtx_common_avx.h | 213 ++++++++++++++++++++++++++ drivers/net/ice/ice_rxtx_vec_avx2.c | 1 + drivers/net/ice/ice_rxtx_vec_avx512.c | 1 + drivers/net/ice/ice_rxtx_vec_common.h | 201 +----------------------- 4 files changed, 216 insertions(+), 200 deletions(-) create mode 100644 drivers/net/ice/ice_rxtx_common_avx.h diff --git a/drivers/net/ice/ice_rxtx_common_avx.h b/drivers/net/ice/ice_rxtx_common_avx.h new file mode 100644 index 0000000000..81e0db5dd3 --- /dev/null +++ b/drivers/net/ice/ice_rxtx_common_avx.h @@ -0,0 +1,213 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2019 Intel Corporation + */ + +#ifndef _ICE_RXTX_COMMON_AVX_H_ +#define _ICE_RXTX_COMMON_AVX_H_ + +#include "ice_rxtx.h" + +#ifndef __INTEL_COMPILER +#pragma GCC diagnostic ignored "-Wcast-qual" +#endif + +#ifdef __AVX2__ +static __rte_always_inline void +ice_rxq_rearm_common(struct ice_rx_queue *rxq, __rte_unused bool avx512) +{ + int i; + uint16_t rx_id; + volatile union ice_rx_flex_desc *rxdp; + struct ice_rx_entry *rxep = &rxq->sw_ring[rxq->rxrearm_start]; + + rxdp = rxq->rx_ring + rxq->rxrearm_start; + + /* Pull 'n' more MBUFs into the software ring */ + if (rte_mempool_get_bulk(rxq->mp, + (void *)rxep, + ICE_RXQ_REARM_THRESH) < 0) { + if (rxq->rxrearm_nb + ICE_RXQ_REARM_THRESH >= + rxq->nb_rx_desc) { + __m128i dma_addr0; + + dma_addr0 = _mm_setzero_si128(); + for (i = 0; i < ICE_DESCS_PER_LOOP; i++) { + rxep[i].mbuf = &rxq->fake_mbuf; + _mm_store_si128((__m128i *)&rxdp[i].read, + dma_addr0); + } + } + rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed += + ICE_RXQ_REARM_THRESH; + return; + } + +#ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC + struct rte_mbuf *mb0, *mb1; + __m128i dma_addr0, dma_addr1; + __m128i hdr_room = _mm_set_epi64x(RTE_PKTMBUF_HEADROOM, + RTE_PKTMBUF_HEADROOM); + /* Initialize the mbufs in vector, process 2 mbufs in one loop */ + for (i = 0; i < ICE_RXQ_REARM_THRESH; i += 2, rxep += 2) { + __m128i vaddr0, vaddr1; + + mb0 = rxep[0].mbuf; + mb1 = rxep[1].mbuf; + + /* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */ + RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) != + offsetof(struct rte_mbuf, buf_addr) + 8); + vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr); + vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr); + + /* convert pa to dma_addr hdr/data */ + dma_addr0 = _mm_unpackhi_epi64(vaddr0, vaddr0); + dma_addr1 = _mm_unpackhi_epi64(vaddr1, vaddr1); + + /* add headroom to pa values */ + dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room); + dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room); + + /* flush desc with pa dma_addr */ + _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0); + _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1); + } +#else +#ifdef __AVX512VL__ + if (avx512) { + struct rte_mbuf *mb0, *mb1, *mb2, *mb3; + struct rte_mbuf *mb4, *mb5, *mb6, *mb7; + __m512i dma_addr0_3, dma_addr4_7; + __m512i hdr_room = _mm512_set1_epi64(RTE_PKTMBUF_HEADROOM); + /* Initialize the mbufs in vector, process 8 mbufs in one loop */ + for (i = 0; i < ICE_RXQ_REARM_THRESH; + i += 8, rxep += 8, rxdp += 8) { + __m128i vaddr0, vaddr1, vaddr2, vaddr3; + __m128i vaddr4, vaddr5, vaddr6, vaddr7; + __m256i vaddr0_1, vaddr2_3; + __m256i vaddr4_5, vaddr6_7; + __m512i vaddr0_3, vaddr4_7; + + mb0 = rxep[0].mbuf; + mb1 = rxep[1].mbuf; + mb2 = rxep[2].mbuf; + mb3 = rxep[3].mbuf; + mb4 = rxep[4].mbuf; + mb5 = rxep[5].mbuf; + mb6 = rxep[6].mbuf; + mb7 = rxep[7].mbuf; + + /* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */ + RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) != + offsetof(struct rte_mbuf, buf_addr) + 8); + vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr); + vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr); + vaddr2 = _mm_loadu_si128((__m128i *)&mb2->buf_addr); + vaddr3 = _mm_loadu_si128((__m128i *)&mb3->buf_addr); + vaddr4 = _mm_loadu_si128((__m128i *)&mb4->buf_addr); + vaddr5 = _mm_loadu_si128((__m128i *)&mb5->buf_addr); + vaddr6 = _mm_loadu_si128((__m128i *)&mb6->buf_addr); + vaddr7 = _mm_loadu_si128((__m128i *)&mb7->buf_addr); + + /** + * merge 0 & 1, by casting 0 to 256-bit and inserting 1 + * into the high lanes. Similarly for 2 & 3, and so on. + */ + vaddr0_1 = + _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0), + vaddr1, 1); + vaddr2_3 = + _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr2), + vaddr3, 1); + vaddr4_5 = + _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr4), + vaddr5, 1); + vaddr6_7 = + _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr6), + vaddr7, 1); + vaddr0_3 = + _mm512_inserti64x4(_mm512_castsi256_si512(vaddr0_1), + vaddr2_3, 1); + vaddr4_7 = + _mm512_inserti64x4(_mm512_castsi256_si512(vaddr4_5), + vaddr6_7, 1); + + /* convert pa to dma_addr hdr/data */ + dma_addr0_3 = _mm512_unpackhi_epi64(vaddr0_3, vaddr0_3); + dma_addr4_7 = _mm512_unpackhi_epi64(vaddr4_7, vaddr4_7); + + /* add headroom to pa values */ + dma_addr0_3 = _mm512_add_epi64(dma_addr0_3, hdr_room); + dma_addr4_7 = _mm512_add_epi64(dma_addr4_7, hdr_room); + + /* flush desc with pa dma_addr */ + _mm512_store_si512((__m512i *)&rxdp->read, dma_addr0_3); + _mm512_store_si512((__m512i *)&(rxdp + 4)->read, dma_addr4_7); + } + } else +#endif /* __AVX512VL__ */ + { + struct rte_mbuf *mb0, *mb1, *mb2, *mb3; + __m256i dma_addr0_1, dma_addr2_3; + __m256i hdr_room = _mm256_set1_epi64x(RTE_PKTMBUF_HEADROOM); + /* Initialize the mbufs in vector, process 4 mbufs in one loop */ + for (i = 0; i < ICE_RXQ_REARM_THRESH; + i += 4, rxep += 4, rxdp += 4) { + __m128i vaddr0, vaddr1, vaddr2, vaddr3; + __m256i vaddr0_1, vaddr2_3; + + mb0 = rxep[0].mbuf; + mb1 = rxep[1].mbuf; + mb2 = rxep[2].mbuf; + mb3 = rxep[3].mbuf; + + /* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */ + RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) != + offsetof(struct rte_mbuf, buf_addr) + 8); + vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr); + vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr); + vaddr2 = _mm_loadu_si128((__m128i *)&mb2->buf_addr); + vaddr3 = _mm_loadu_si128((__m128i *)&mb3->buf_addr); + + /** + * merge 0 & 1, by casting 0 to 256-bit and inserting 1 + * into the high lanes. Similarly for 2 & 3 + */ + vaddr0_1 = + _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0), + vaddr1, 1); + vaddr2_3 = + _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr2), + vaddr3, 1); + + /* convert pa to dma_addr hdr/data */ + dma_addr0_1 = _mm256_unpackhi_epi64(vaddr0_1, vaddr0_1); + dma_addr2_3 = _mm256_unpackhi_epi64(vaddr2_3, vaddr2_3); + + /* add headroom to pa values */ + dma_addr0_1 = _mm256_add_epi64(dma_addr0_1, hdr_room); + dma_addr2_3 = _mm256_add_epi64(dma_addr2_3, hdr_room); + + /* flush desc with pa dma_addr */ + _mm256_store_si256((__m256i *)&rxdp->read, dma_addr0_1); + _mm256_store_si256((__m256i *)&(rxdp + 2)->read, dma_addr2_3); + } + } + +#endif + + rxq->rxrearm_start += ICE_RXQ_REARM_THRESH; + if (rxq->rxrearm_start >= rxq->nb_rx_desc) + rxq->rxrearm_start = 0; + + rxq->rxrearm_nb -= ICE_RXQ_REARM_THRESH; + + rx_id = (uint16_t)((rxq->rxrearm_start == 0) ? + (rxq->nb_rx_desc - 1) : (rxq->rxrearm_start - 1)); + + /* Update the tail pointer on the NIC */ + ICE_PCI_REG_WC_WRITE(rxq->qrx_tail, rx_id); +} +#endif /* __AVX2__ */ + +#endif /* _ICE_RXTX_COMMON_AVX_H_ */ diff --git a/drivers/net/ice/ice_rxtx_vec_avx2.c b/drivers/net/ice/ice_rxtx_vec_avx2.c index ac2719fa15..581d3d348f 100644 --- a/drivers/net/ice/ice_rxtx_vec_avx2.c +++ b/drivers/net/ice/ice_rxtx_vec_avx2.c @@ -3,6 +3,7 @@ */ #include "ice_rxtx_vec_common.h" +#include "ice_rxtx_common_avx.h" #include diff --git a/drivers/net/ice/ice_rxtx_vec_avx512.c b/drivers/net/ice/ice_rxtx_vec_avx512.c index 719b7b8b38..2c0881a01e 100644 --- a/drivers/net/ice/ice_rxtx_vec_avx512.c +++ b/drivers/net/ice/ice_rxtx_vec_avx512.c @@ -3,6 +3,7 @@ */ #include "ice_rxtx_vec_common.h" +#include "ice_rxtx_common_avx.h" #include diff --git a/drivers/net/ice/ice_rxtx_vec_common.h b/drivers/net/ice/ice_rxtx_vec_common.h index 1d138aa899..bd2450ad5f 100644 --- a/drivers/net/ice/ice_rxtx_vec_common.h +++ b/drivers/net/ice/ice_rxtx_vec_common.h @@ -194,7 +194,7 @@ _ice_tx_queue_release_mbufs_vec(struct ice_tx_queue *txq) */ i = txq->tx_next_dd - txq->tx_rs_thresh + 1; -#ifdef CC_AVX512_SUPPORT +#ifdef __AVX512VL__ struct rte_eth_dev *dev = &rte_eth_devices[txq->vsi->adapter->pf.dev_data->port_id]; if (dev->tx_pkt_burst == ice_xmit_pkts_vec_avx512) { @@ -322,203 +322,4 @@ ice_tx_vec_dev_check_default(struct rte_eth_dev *dev) return 0; } -#ifdef CC_AVX2_SUPPORT -static __rte_always_inline void -ice_rxq_rearm_common(struct ice_rx_queue *rxq, __rte_unused bool avx512) -{ - int i; - uint16_t rx_id; - volatile union ice_rx_flex_desc *rxdp; - struct ice_rx_entry *rxep = &rxq->sw_ring[rxq->rxrearm_start]; - - rxdp = rxq->rx_ring + rxq->rxrearm_start; - - /* Pull 'n' more MBUFs into the software ring */ - if (rte_mempool_get_bulk(rxq->mp, - (void *)rxep, - ICE_RXQ_REARM_THRESH) < 0) { - if (rxq->rxrearm_nb + ICE_RXQ_REARM_THRESH >= - rxq->nb_rx_desc) { - __m128i dma_addr0; - - dma_addr0 = _mm_setzero_si128(); - for (i = 0; i < ICE_DESCS_PER_LOOP; i++) { - rxep[i].mbuf = &rxq->fake_mbuf; - _mm_store_si128((__m128i *)&rxdp[i].read, - dma_addr0); - } - } - rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed += - ICE_RXQ_REARM_THRESH; - return; - } - -#ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC - struct rte_mbuf *mb0, *mb1; - __m128i dma_addr0, dma_addr1; - __m128i hdr_room = _mm_set_epi64x(RTE_PKTMBUF_HEADROOM, - RTE_PKTMBUF_HEADROOM); - /* Initialize the mbufs in vector, process 2 mbufs in one loop */ - for (i = 0; i < ICE_RXQ_REARM_THRESH; i += 2, rxep += 2) { - __m128i vaddr0, vaddr1; - - mb0 = rxep[0].mbuf; - mb1 = rxep[1].mbuf; - - /* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */ - RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) != - offsetof(struct rte_mbuf, buf_addr) + 8); - vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr); - vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr); - - /* convert pa to dma_addr hdr/data */ - dma_addr0 = _mm_unpackhi_epi64(vaddr0, vaddr0); - dma_addr1 = _mm_unpackhi_epi64(vaddr1, vaddr1); - - /* add headroom to pa values */ - dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room); - dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room); - - /* flush desc with pa dma_addr */ - _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0); - _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1); - } -#else -#ifdef CC_AVX512_SUPPORT - if (avx512) { - struct rte_mbuf *mb0, *mb1, *mb2, *mb3; - struct rte_mbuf *mb4, *mb5, *mb6, *mb7; - __m512i dma_addr0_3, dma_addr4_7; - __m512i hdr_room = _mm512_set1_epi64(RTE_PKTMBUF_HEADROOM); - /* Initialize the mbufs in vector, process 8 mbufs in one loop */ - for (i = 0; i < ICE_RXQ_REARM_THRESH; - i += 8, rxep += 8, rxdp += 8) { - __m128i vaddr0, vaddr1, vaddr2, vaddr3; - __m128i vaddr4, vaddr5, vaddr6, vaddr7; - __m256i vaddr0_1, vaddr2_3; - __m256i vaddr4_5, vaddr6_7; - __m512i vaddr0_3, vaddr4_7; - - mb0 = rxep[0].mbuf; - mb1 = rxep[1].mbuf; - mb2 = rxep[2].mbuf; - mb3 = rxep[3].mbuf; - mb4 = rxep[4].mbuf; - mb5 = rxep[5].mbuf; - mb6 = rxep[6].mbuf; - mb7 = rxep[7].mbuf; - - /* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */ - RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) != - offsetof(struct rte_mbuf, buf_addr) + 8); - vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr); - vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr); - vaddr2 = _mm_loadu_si128((__m128i *)&mb2->buf_addr); - vaddr3 = _mm_loadu_si128((__m128i *)&mb3->buf_addr); - vaddr4 = _mm_loadu_si128((__m128i *)&mb4->buf_addr); - vaddr5 = _mm_loadu_si128((__m128i *)&mb5->buf_addr); - vaddr6 = _mm_loadu_si128((__m128i *)&mb6->buf_addr); - vaddr7 = _mm_loadu_si128((__m128i *)&mb7->buf_addr); - - /** - * merge 0 & 1, by casting 0 to 256-bit and inserting 1 - * into the high lanes. Similarly for 2 & 3, and so on. - */ - vaddr0_1 = - _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0), - vaddr1, 1); - vaddr2_3 = - _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr2), - vaddr3, 1); - vaddr4_5 = - _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr4), - vaddr5, 1); - vaddr6_7 = - _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr6), - vaddr7, 1); - vaddr0_3 = - _mm512_inserti64x4(_mm512_castsi256_si512(vaddr0_1), - vaddr2_3, 1); - vaddr4_7 = - _mm512_inserti64x4(_mm512_castsi256_si512(vaddr4_5), - vaddr6_7, 1); - - /* convert pa to dma_addr hdr/data */ - dma_addr0_3 = _mm512_unpackhi_epi64(vaddr0_3, vaddr0_3); - dma_addr4_7 = _mm512_unpackhi_epi64(vaddr4_7, vaddr4_7); - - /* add headroom to pa values */ - dma_addr0_3 = _mm512_add_epi64(dma_addr0_3, hdr_room); - dma_addr4_7 = _mm512_add_epi64(dma_addr4_7, hdr_room); - - /* flush desc with pa dma_addr */ - _mm512_store_si512((__m512i *)&rxdp->read, dma_addr0_3); - _mm512_store_si512((__m512i *)&(rxdp + 4)->read, dma_addr4_7); - } - } else -#endif - { - struct rte_mbuf *mb0, *mb1, *mb2, *mb3; - __m256i dma_addr0_1, dma_addr2_3; - __m256i hdr_room = _mm256_set1_epi64x(RTE_PKTMBUF_HEADROOM); - /* Initialize the mbufs in vector, process 4 mbufs in one loop */ - for (i = 0; i < ICE_RXQ_REARM_THRESH; - i += 4, rxep += 4, rxdp += 4) { - __m128i vaddr0, vaddr1, vaddr2, vaddr3; - __m256i vaddr0_1, vaddr2_3; - - mb0 = rxep[0].mbuf; - mb1 = rxep[1].mbuf; - mb2 = rxep[2].mbuf; - mb3 = rxep[3].mbuf; - - /* load buf_addr(lo 64bit) and buf_iova(hi 64bit) */ - RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) != - offsetof(struct rte_mbuf, buf_addr) + 8); - vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr); - vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr); - vaddr2 = _mm_loadu_si128((__m128i *)&mb2->buf_addr); - vaddr3 = _mm_loadu_si128((__m128i *)&mb3->buf_addr); - - /** - * merge 0 & 1, by casting 0 to 256-bit and inserting 1 - * into the high lanes. Similarly for 2 & 3 - */ - vaddr0_1 = - _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0), - vaddr1, 1); - vaddr2_3 = - _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr2), - vaddr3, 1); - - /* convert pa to dma_addr hdr/data */ - dma_addr0_1 = _mm256_unpackhi_epi64(vaddr0_1, vaddr0_1); - dma_addr2_3 = _mm256_unpackhi_epi64(vaddr2_3, vaddr2_3); - - /* add headroom to pa values */ - dma_addr0_1 = _mm256_add_epi64(dma_addr0_1, hdr_room); - dma_addr2_3 = _mm256_add_epi64(dma_addr2_3, hdr_room); - - /* flush desc with pa dma_addr */ - _mm256_store_si256((__m256i *)&rxdp->read, dma_addr0_1); - _mm256_store_si256((__m256i *)&(rxdp + 2)->read, dma_addr2_3); - } - } - -#endif - - rxq->rxrearm_start += ICE_RXQ_REARM_THRESH; - if (rxq->rxrearm_start >= rxq->nb_rx_desc) - rxq->rxrearm_start = 0; - - rxq->rxrearm_nb -= ICE_RXQ_REARM_THRESH; - - rx_id = (uint16_t)((rxq->rxrearm_start == 0) ? - (rxq->nb_rx_desc - 1) : (rxq->rxrearm_start - 1)); - - /* Update the tail pointer on the NIC */ - ICE_PCI_REG_WC_WRITE(rxq->qrx_tail, rx_id); -} -#endif - #endif -- 2.33.0 --- Diff of the applied patch vs upstream commit (please double-check if non-empty: --- --- - 2021-11-10 14:17:09.271137982 +0800 +++ 0165-net-ice-fix-generic-build-on-FreeBSD.patch 2021-11-10 14:17:01.977411885 +0800 @@ -1 +1 @@ -From 0d989ff9ca515f80fbab2aad18cede501b759ff9 Mon Sep 17 00:00:00 2001 +From 84a1e5c1369a24a3728ff3ff5b8f2a13a2c96e5a Mon Sep 17 00:00:00 2001 @@ -4,0 +5,3 @@ +Cc: Xueming Li + +[ upstream commit 0d989ff9ca515f80fbab2aad18cede501b759ff9 ] @@ -18 +20,0 @@ -Cc: stable@dpdk.org @@ -251 +253 @@ -index 9725ac0180..490693bff2 100644 +index ac2719fa15..581d3d348f 100644 @@ -260 +262 @@ - #include + #include @@ -263 +265 @@ -index 5bba9887d2..7efe7b50a2 100644 +index 719b7b8b38..2c0881a01e 100644 @@ -272 +274 @@ - #include + #include @@ -275 +277 @@ -index 5b5250565e..f0f9926585 100644 +index 1d138aa899..bd2450ad5f 100644 @@ -286,3 +288,3 @@ - if (dev->tx_pkt_burst == ice_xmit_pkts_vec_avx512 || -@@ -355,205 +355,6 @@ ice_tx_vec_dev_check_default(struct rte_eth_dev *dev) - return result; + if (dev->tx_pkt_burst == ice_xmit_pkts_vec_avx512) { +@@ -322,203 +322,4 @@ ice_tx_vec_dev_check_default(struct rte_eth_dev *dev) + return 0; @@ -490,3 +492 @@ - static inline void - ice_txd_enable_offload(struct rte_mbuf *tx_pkt, - uint64_t *txd_hi) + #endif