From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 563CDA0597; Wed, 8 Apr 2020 08:33:03 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id DB4AA1C0BC; Wed, 8 Apr 2020 08:31:35 +0200 (CEST) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by dpdk.org (Postfix) with ESMTP id 0869C1BF59 for ; Wed, 8 Apr 2020 08:31:33 +0200 (CEST) IronPort-SDR: 8PfM/C120XDSOwa7s662BmTGErmk4I5Dg3XO+/meGHUAzdB4rffu7Yktf0itcMFNBf7+KF4H+X WVFTyigF5mIw== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Apr 2020 23:31:33 -0700 IronPort-SDR: 5ZWyjUVMjEfHrbsqcWpy4lYSTrF2E3REFUlgUjFj7We4YTORQCQrnlIlN1UFHYwmH8bq6tgf3E G+98fVRVfeXw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.72,357,1580803200"; d="scan'208";a="251474444" Received: from dpdk-lrong-srv-04.sh.intel.com ([10.67.119.221]) by orsmga003.jf.intel.com with ESMTP; 07 Apr 2020 23:31:31 -0700 From: Leyi Rong To: jingjing.wu@intel.com, qi.z.zhang@intel.com, beilei.xing@intel.com, xiaolong.ye@intel.com Cc: dev@dpdk.org, Leyi Rong Date: Wed, 8 Apr 2020 14:22:09 +0800 Message-Id: <20200408062210.34003-11-leyi.rong@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200408062210.34003-1-leyi.rong@intel.com> References: <20200316074603.10998-1-leyi.rong@intel.com> <20200408062210.34003-1-leyi.rong@intel.com> Subject: [dpdk-dev] [PATCH v3 10/11] net/iavf: add RSS hash parsing in AVX path X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Support RSS hash parsing from Flex Rx descriptor in AVX data path. Signed-off-by: Leyi Rong --- drivers/net/iavf/iavf_rxtx_vec_avx2.c | 92 ++++++++++++++++++++++++++- 1 file changed, 90 insertions(+), 2 deletions(-) diff --git a/drivers/net/iavf/iavf_rxtx_vec_avx2.c b/drivers/net/iavf/iavf_rxtx_vec_avx2.c index 3bf5833fa..22f1b7887 100644 --- a/drivers/net/iavf/iavf_rxtx_vec_avx2.c +++ b/drivers/net/iavf/iavf_rxtx_vec_avx2.c @@ -698,7 +698,7 @@ _iavf_recv_raw_pkts_vec_avx2_flex_rxd(struct iavf_rx_queue *rxq, _mm256_set_epi8 (/* first descriptor */ 0xFF, 0xFF, - 0xFF, 0xFF, /* rss not supported */ + 0xFF, 0xFF, /* rss hash parsed separately */ 11, 10, /* octet 10~11, 16 bits vlan_macip */ 5, 4, /* octet 4~5, 16 bits data_len */ 0xFF, 0xFF, /* skip hi 16 bits pkt_len, zero out */ @@ -707,7 +707,7 @@ _iavf_recv_raw_pkts_vec_avx2_flex_rxd(struct iavf_rx_queue *rxq, 0xFF, 0xFF, /*pkt_type set as unknown */ /* second descriptor */ 0xFF, 0xFF, - 0xFF, 0xFF, /* rss not supported */ + 0xFF, 0xFF, /* rss hash parsed separately */ 11, 10, /* octet 10~11, 16 bits vlan_macip */ 5, 4, /* octet 4~5, 16 bits data_len */ 0xFF, 0xFF, /* skip hi 16 bits pkt_len, zero out */ @@ -994,6 +994,94 @@ _iavf_recv_raw_pkts_vec_avx2_flex_rxd(struct iavf_rx_queue *rxq, _mm256_extract_epi32(fdir_id0_7, 4); } /* if() on fdir_enabled */ + /** + * needs to load 2nd 16B of each desc for RSS hash parsing, + * will cause performance drop to get into this context. + */ + if (rxq->vsi->adapter->eth_dev->data->dev_conf.rxmode.offloads & + DEV_RX_OFFLOAD_RSS_HASH) { + /* load bottom half of every 32B desc */ + const __m128i raw_desc_bh7 = + _mm_load_si128 + ((void *)(&rxdp[7].wb.status_error1)); + rte_compiler_barrier(); + const __m128i raw_desc_bh6 = + _mm_load_si128 + ((void *)(&rxdp[6].wb.status_error1)); + rte_compiler_barrier(); + const __m128i raw_desc_bh5 = + _mm_load_si128 + ((void *)(&rxdp[5].wb.status_error1)); + rte_compiler_barrier(); + const __m128i raw_desc_bh4 = + _mm_load_si128 + ((void *)(&rxdp[4].wb.status_error1)); + rte_compiler_barrier(); + const __m128i raw_desc_bh3 = + _mm_load_si128 + ((void *)(&rxdp[3].wb.status_error1)); + rte_compiler_barrier(); + const __m128i raw_desc_bh2 = + _mm_load_si128 + ((void *)(&rxdp[2].wb.status_error1)); + rte_compiler_barrier(); + const __m128i raw_desc_bh1 = + _mm_load_si128 + ((void *)(&rxdp[1].wb.status_error1)); + rte_compiler_barrier(); + const __m128i raw_desc_bh0 = + _mm_load_si128 + ((void *)(&rxdp[0].wb.status_error1)); + + __m256i raw_desc_bh6_7 = + _mm256_inserti128_si256 + (_mm256_castsi128_si256(raw_desc_bh6), + raw_desc_bh7, 1); + __m256i raw_desc_bh4_5 = + _mm256_inserti128_si256 + (_mm256_castsi128_si256(raw_desc_bh4), + raw_desc_bh5, 1); + __m256i raw_desc_bh2_3 = + _mm256_inserti128_si256 + (_mm256_castsi128_si256(raw_desc_bh2), + raw_desc_bh3, 1); + __m256i raw_desc_bh0_1 = + _mm256_inserti128_si256 + (_mm256_castsi128_si256(raw_desc_bh0), + raw_desc_bh1, 1); + + /** + * to shift the 32b RSS hash value to the + * highest 32b of each 128b before mask + */ + __m256i rss_hash6_7 = + _mm256_slli_epi64(raw_desc_bh6_7, 32); + __m256i rss_hash4_5 = + _mm256_slli_epi64(raw_desc_bh4_5, 32); + __m256i rss_hash2_3 = + _mm256_slli_epi64(raw_desc_bh2_3, 32); + __m256i rss_hash0_1 = + _mm256_slli_epi64(raw_desc_bh0_1, 32); + + __m256i rss_hash_msk = + _mm256_set_epi32(0xFFFFFFFF, 0, 0, 0, + 0xFFFFFFFF, 0, 0, 0); + + rss_hash6_7 = _mm256_and_si256 + (rss_hash6_7, rss_hash_msk); + rss_hash4_5 = _mm256_and_si256 + (rss_hash4_5, rss_hash_msk); + rss_hash2_3 = _mm256_and_si256 + (rss_hash2_3, rss_hash_msk); + rss_hash0_1 = _mm256_and_si256 + (rss_hash0_1, rss_hash_msk); + + mb6_7 = _mm256_or_si256(mb6_7, rss_hash6_7); + mb4_5 = _mm256_or_si256(mb4_5, rss_hash4_5); + mb2_3 = _mm256_or_si256(mb2_3, rss_hash2_3); + mb0_1 = _mm256_or_si256(mb0_1, rss_hash0_1); + } /* if() on RSS hash parsing */ + /** * At this point, we have the 8 sets of flags in the low 16-bits * of each 32-bit value in vlan0. -- 2.17.1