From: "Ye, MingjinX" <mingjinx.ye@intel.com>
To: "thomas@monjalon.net" <thomas@monjalon.net>,
"Zhang, Qi Z" <qi.z.zhang@intel.com>,
"Yang, Qiming" <qiming.yang@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
"stable@dpdk.org" <stable@dpdk.org>,
"Zhou, YidingX" <yidingx.zhou@intel.com>,
"Richardson, Bruce" <bruce.richardson@intel.com>,
Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>,
"Lu, Wenzhuo" <wenzhuo.lu@intel.com>,
"Junyu Jiang" <junyux.jiang@intel.com>,
"Rong, Leyi" <leyi.rong@intel.com>,
"Ajit Khaparde" <ajit.khaparde@broadcom.com>,
Jerin Jacob <jerinj@marvell.com>,
"Xu, Rosen" <rosen.xu@intel.com>,
Hemant Agrawal <hemant.agrawal@nxp.com>,
"Wisam Jaddo" <wisamm@nvidia.com>
Subject: RE: [PATCH v5 1/2] net/ice: fix vlan offload
Date: Fri, 11 Nov 2022 03:34:49 +0000 [thread overview]
Message-ID: <CY8PR11MB71366BBF0F8988B8CE573EB8E5009@CY8PR11MB7136.namprd11.prod.outlook.com> (raw)
In-Reply-To: <20221108132804.714764-1-mingjinx.ye@intel.com>
Hi all,
Could you please review this patch and share any suggestions you may have?
Thanks,
Mingjin
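
For reviewers less familiar with the flex descriptor layout, here is a rough scalar sketch of the logic the patch vectorizes. The struct layout and flag values below are simplified stand-ins for illustration only, not the real DPDK definitions: when the L2TAG2P bit (bit 11 of status_error1 in the bottom half of the 32B descriptor) is set, the driver should set the VLAN flags in ol_flags and copy the stripped TCI from the l2tag2_2nd field into the mbuf.

```c
#include <stdint.h>

/* Illustrative stand-ins for RTE_MBUF_F_RX_VLAN and
 * RTE_MBUF_F_RX_VLAN_STRIPPED; the real values live in DPDK headers. */
#define F_RX_VLAN          0x1u
#define F_RX_VLAN_STRIPPED 0x2u

/* L2TAG2P: "second L2 tag present" flag in status_error1. */
#define L2TAG2P_BIT 11

/* Simplified bottom half of a 32B flex Rx descriptor (illustrative). */
struct flex_desc_bh {
	uint16_t status_error1;
	uint16_t l2tag2_1st;
	uint16_t l2tag2_2nd;	/* stripped VLAN TCI is written here */
};

/* Stand-in for the mbuf fields the patch fills in. */
struct pkt_meta {
	uint32_t ol_flags;
	uint16_t vlan_tci;
};

/* Scalar equivalent of the per-descriptor work the SSE/AVX2/AVX512
 * paths do with shuffles and masks: test the L2TAG2P bit, then set
 * the VLAN flags and copy the stripped TCI. */
static void desc_to_vlan(const struct flex_desc_bh *d, struct pkt_meta *m)
{
	if (d->status_error1 & (1u << L2TAG2P_BIT)) {
		m->ol_flags |= F_RX_VLAN | F_RX_VLAN_STRIPPED;
		m->vlan_tci = d->l2tag2_2nd;
	}
}
```

The vector paths below do this for four (SSE) or eight (AVX2/AVX512) descriptors at a time, which is why the patch gates the bottom-half descriptor load on RTE_ETH_RX_OFFLOAD_VLAN as well as RTE_ETH_RX_OFFLOAD_RSS_HASH.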
> -----Original Message-----
> From: Ye, MingjinX <mingjinx.ye@intel.com>
> Sent: Tuesday, November 8, 2022 21:28
> To: dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; stable@dpdk.org; Zhou, YidingX
> <yidingx.zhou@intel.com>; Ye, MingjinX <mingjinx.ye@intel.com>;
> Richardson, Bruce <bruce.richardson@intel.com>; Konstantin Ananyev
> <konstantin.v.ananyev@yandex.ru>; Zhang, Qi Z <qi.z.zhang@intel.com>; Lu,
> Wenzhuo <wenzhuo.lu@intel.com>; Junyu Jiang <junyux.jiang@intel.com>;
> Rong, Leyi <leyi.rong@intel.com>; Ajit Khaparde
> <ajit.khaparde@broadcom.com>; Jerin Jacob <jerinj@marvell.com>; Xu,
> Rosen <rosen.xu@intel.com>; Hemant Agrawal
> <hemant.agrawal@nxp.com>; Wisam Jaddo <wisamm@nvidia.com>
> Subject: [PATCH v5 1/2] net/ice: fix vlan offload
>
> The VLAN tag and flag in the Rx descriptor are not processed on the
> vector paths, so the upper application cannot fetch the VLAN TCI from
> the mbuf.
>
> This patch adds handling of the VLAN Rx offload on the vector paths.
>
> Fixes: c68a52b8b38c ("net/ice: support vector SSE in Rx")
> Fixes: ece1f8a8f1c8 ("net/ice: switch to flexible descriptor in SSE path")
> Fixes: 12443386a0b0 ("net/ice: support flex Rx descriptor RxDID22")
> Fixes: 214f452f7d5f ("net/ice: add AVX2 offload Rx")
> Fixes: 7f85d5ebcfe1 ("net/ice: add AVX512 vector path")
> Fixes: 295968d17407 ("ethdev: add namespace")
> Fixes: 808a17b3c1e6 ("net/ice: add Rx AVX512 offload path")
> Cc: stable@dpdk.org
>
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
>
> v3:
> * Fix macros in ice_rxtx_vec_sse.c source file.
> v4:
> * Fix the ice_rx_desc_to_olflags_v definition in the ice_rxtx_vec_sse.c
>   source file.
> ---
> drivers/net/ice/ice_rxtx_vec_avx2.c | 135 +++++++++++++++++-----
> drivers/net/ice/ice_rxtx_vec_avx512.c | 154 +++++++++++++++++++++-----
> drivers/net/ice/ice_rxtx_vec_sse.c | 132 ++++++++++++++++------
> 3 files changed, 332 insertions(+), 89 deletions(-)
>
> diff --git a/drivers/net/ice/ice_rxtx_vec_avx2.c b/drivers/net/ice/ice_rxtx_vec_avx2.c
> index 31d6af42fd..bddfd6cf65 100644
> --- a/drivers/net/ice/ice_rxtx_vec_avx2.c
> +++ b/drivers/net/ice/ice_rxtx_vec_avx2.c
> @@ -474,7 +474,7 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> 			 * will cause performance drop to get into this context.
> 			 */
> 			if (rxq->vsi->adapter->pf.dev_data->dev_conf.rxmode.offloads &
> -					RTE_ETH_RX_OFFLOAD_RSS_HASH) {
> +					(RTE_ETH_RX_OFFLOAD_RSS_HASH | RTE_ETH_RX_OFFLOAD_VLAN)) {
> 				/* load bottom half of every 32B desc */
> 				const __m128i raw_desc_bh7 =
> 					_mm_load_si128
> @@ -529,33 +529,112 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> 				 * to shift the 32b RSS hash value to the
> 				 * highest 32b of each 128b before mask
> 				 */
> -				__m256i rss_hash6_7 =
> -					_mm256_slli_epi64(raw_desc_bh6_7, 32);
> -				__m256i rss_hash4_5 =
> -					_mm256_slli_epi64(raw_desc_bh4_5, 32);
> -				__m256i rss_hash2_3 =
> -					_mm256_slli_epi64(raw_desc_bh2_3, 32);
> -				__m256i rss_hash0_1 =
> -					_mm256_slli_epi64(raw_desc_bh0_1, 32);
> -
> -				__m256i rss_hash_msk =
> -					_mm256_set_epi32(0xFFFFFFFF, 0, 0, 0,
> -							 0xFFFFFFFF, 0, 0, 0);
> -
> -				rss_hash6_7 = _mm256_and_si256
> -						(rss_hash6_7, rss_hash_msk);
> -				rss_hash4_5 = _mm256_and_si256
> -						(rss_hash4_5, rss_hash_msk);
> -				rss_hash2_3 = _mm256_and_si256
> -						(rss_hash2_3, rss_hash_msk);
> -				rss_hash0_1 = _mm256_and_si256
> -						(rss_hash0_1, rss_hash_msk);
> -
> -				mb6_7 = _mm256_or_si256(mb6_7, rss_hash6_7);
> -				mb4_5 = _mm256_or_si256(mb4_5, rss_hash4_5);
> -				mb2_3 = _mm256_or_si256(mb2_3, rss_hash2_3);
> -				mb0_1 = _mm256_or_si256(mb0_1, rss_hash0_1);
> -			} /* if() on RSS hash parsing */
> +				if (rxq->vsi->adapter->pf.dev_data->dev_conf.rxmode.offloads &
> +						RTE_ETH_RX_OFFLOAD_RSS_HASH) {
> +					__m256i rss_hash6_7 =
> +						_mm256_slli_epi64(raw_desc_bh6_7, 32);
> +					__m256i rss_hash4_5 =
> +						_mm256_slli_epi64(raw_desc_bh4_5, 32);
> +					__m256i rss_hash2_3 =
> +						_mm256_slli_epi64(raw_desc_bh2_3, 32);
> +					__m256i rss_hash0_1 =
> +						_mm256_slli_epi64(raw_desc_bh0_1, 32);
> +
> +					__m256i rss_hash_msk =
> +						_mm256_set_epi32(0xFFFFFFFF, 0, 0, 0,
> +								 0xFFFFFFFF, 0, 0, 0);
> +
> +					rss_hash6_7 = _mm256_and_si256
> +							(rss_hash6_7, rss_hash_msk);
> +					rss_hash4_5 = _mm256_and_si256
> +							(rss_hash4_5, rss_hash_msk);
> +					rss_hash2_3 = _mm256_and_si256
> +							(rss_hash2_3, rss_hash_msk);
> +					rss_hash0_1 = _mm256_and_si256
> +							(rss_hash0_1, rss_hash_msk);
> +
> +					mb6_7 = _mm256_or_si256(mb6_7, rss_hash6_7);
> +					mb4_5 = _mm256_or_si256(mb4_5, rss_hash4_5);
> +					mb2_3 = _mm256_or_si256(mb2_3, rss_hash2_3);
> +					mb0_1 = _mm256_or_si256(mb0_1, rss_hash0_1);
> +				} /* if() on RSS hash parsing */
> +
> +				if (rxq->vsi->adapter->pf.dev_data->dev_conf.rxmode.offloads &
> +						RTE_ETH_RX_OFFLOAD_VLAN) {
> +					/* merge the status/error-1 bits into one register */
> +					const __m256i status1_4_7 =
> +						_mm256_unpacklo_epi32(raw_desc_bh6_7,
> +							raw_desc_bh4_5);
> +					const __m256i status1_0_3 =
> +						_mm256_unpacklo_epi32(raw_desc_bh2_3,
> +							raw_desc_bh0_1);
> +
> +					const __m256i status1_0_7 =
> +						_mm256_unpacklo_epi64(status1_4_7,
> +							status1_0_3);
> +
> +					const __m256i l2tag2p_flag_mask =
> +						_mm256_set1_epi32(1 << 11);
> +
> +					__m256i l2tag2p_flag_bits =
> +						_mm256_and_si256
> +						(status1_0_7, l2tag2p_flag_mask);
> +
> +					l2tag2p_flag_bits =
> +						_mm256_srli_epi32(l2tag2p_flag_bits,
> +							11);
> +
> +					__m256i vlan_flags = _mm256_setzero_si256();
> +					const __m256i l2tag2_flags_shuf =
> +						_mm256_set_epi8(0, 0, 0, 0,
> +								0, 0, 0, 0,
> +								0, 0, 0, 0,
> +								0, 0, 0, 0,
> +								/* end up 128-bits */
> +								0, 0, 0, 0,
> +								0, 0, 0, 0,
> +								0, 0, 0, 0,
> +								0, 0,
> +								RTE_MBUF_F_RX_VLAN |
> +								RTE_MBUF_F_RX_VLAN_STRIPPED,
> +								0);
> +					vlan_flags =
> +						_mm256_shuffle_epi8(l2tag2_flags_shuf,
> +							l2tag2p_flag_bits);
> +
> +					/* merge with vlan_flags */
> +					mbuf_flags = _mm256_or_si256
> +							(mbuf_flags, vlan_flags);
> +
> +					/* L2TAG2_2 */
> +					__m256i vlan_tci6_7 =
> +						_mm256_slli_si256(raw_desc_bh6_7, 4);
> +					__m256i vlan_tci4_5 =
> +						_mm256_slli_si256(raw_desc_bh4_5, 4);
> +					__m256i vlan_tci2_3 =
> +						_mm256_slli_si256(raw_desc_bh2_3, 4);
> +					__m256i vlan_tci0_1 =
> +						_mm256_slli_si256(raw_desc_bh0_1, 4);
> +
> +					const __m256i vlan_tci_msk =
> +						_mm256_set_epi32(0, 0xFFFF0000, 0, 0,
> +								 0, 0xFFFF0000, 0, 0);
> +
> +					vlan_tci6_7 = _mm256_and_si256
> +							(vlan_tci6_7, vlan_tci_msk);
> +					vlan_tci4_5 = _mm256_and_si256
> +							(vlan_tci4_5, vlan_tci_msk);
> +					vlan_tci2_3 = _mm256_and_si256
> +							(vlan_tci2_3, vlan_tci_msk);
> +					vlan_tci0_1 = _mm256_and_si256
> +							(vlan_tci0_1, vlan_tci_msk);
> +
> +					mb6_7 = _mm256_or_si256(mb6_7, vlan_tci6_7);
> +					mb4_5 = _mm256_or_si256(mb4_5, vlan_tci4_5);
> +					mb2_3 = _mm256_or_si256(mb2_3, vlan_tci2_3);
> +					mb0_1 = _mm256_or_si256(mb0_1, vlan_tci0_1);
> +				}
> +			}
> #endif
> 		}
>
> diff --git a/drivers/net/ice/ice_rxtx_vec_avx512.c b/drivers/net/ice/ice_rxtx_vec_avx512.c
> index 5bfd5152df..5d5e4bf3cd 100644
> --- a/drivers/net/ice/ice_rxtx_vec_avx512.c
> +++ b/drivers/net/ice/ice_rxtx_vec_avx512.c
> @@ -585,7 +585,7 @@ _ice_recv_raw_pkts_vec_avx512(struct ice_rx_queue *rxq,
> 			 * will cause performance drop to get into this context.
> 			 */
> 			if (rxq->vsi->adapter->pf.dev_data->dev_conf.rxmode.offloads &
> -					RTE_ETH_RX_OFFLOAD_RSS_HASH) {
> +					(RTE_ETH_RX_OFFLOAD_RSS_HASH | RTE_ETH_RX_OFFLOAD_VLAN)) {
> 				/* load bottom half of every 32B desc */
> 				const __m128i raw_desc_bh7 =
> 					_mm_load_si128
> @@ -640,33 +640,131 @@ _ice_recv_raw_pkts_vec_avx512(struct ice_rx_queue *rxq,
> 				 * to shift the 32b RSS hash value to the
> 				 * highest 32b of each 128b before mask
> 				 */
> -				__m256i rss_hash6_7 =
> -					_mm256_slli_epi64(raw_desc_bh6_7, 32);
> -				__m256i rss_hash4_5 =
> -					_mm256_slli_epi64(raw_desc_bh4_5, 32);
> -				__m256i rss_hash2_3 =
> -					_mm256_slli_epi64(raw_desc_bh2_3, 32);
> -				__m256i rss_hash0_1 =
> -					_mm256_slli_epi64(raw_desc_bh0_1, 32);
> -
> -				__m256i rss_hash_msk =
> -					_mm256_set_epi32(0xFFFFFFFF, 0, 0, 0,
> -							 0xFFFFFFFF, 0, 0, 0);
> -
> -				rss_hash6_7 = _mm256_and_si256
> -						(rss_hash6_7, rss_hash_msk);
> -				rss_hash4_5 = _mm256_and_si256
> -						(rss_hash4_5, rss_hash_msk);
> -				rss_hash2_3 = _mm256_and_si256
> -						(rss_hash2_3, rss_hash_msk);
> -				rss_hash0_1 = _mm256_and_si256
> -						(rss_hash0_1, rss_hash_msk);
> -
> -				mb6_7 = _mm256_or_si256(mb6_7, rss_hash6_7);
> -				mb4_5 = _mm256_or_si256(mb4_5, rss_hash4_5);
> -				mb2_3 = _mm256_or_si256(mb2_3, rss_hash2_3);
> -				mb0_1 = _mm256_or_si256(mb0_1, rss_hash0_1);
> -			} /* if() on RSS hash parsing */
> +				if (rxq->vsi->adapter->pf.dev_data->dev_conf.rxmode.offloads &
> +						RTE_ETH_RX_OFFLOAD_RSS_HASH) {
> +					__m256i rss_hash6_7 =
> +						_mm256_slli_epi64(raw_desc_bh6_7, 32);
> +					__m256i rss_hash4_5 =
> +						_mm256_slli_epi64(raw_desc_bh4_5, 32);
> +					__m256i rss_hash2_3 =
> +						_mm256_slli_epi64(raw_desc_bh2_3, 32);
> +					__m256i rss_hash0_1 =
> +						_mm256_slli_epi64(raw_desc_bh0_1, 32);
> +
> +					__m256i rss_hash_msk =
> +						_mm256_set_epi32(0xFFFFFFFF, 0, 0, 0,
> +								 0xFFFFFFFF, 0, 0, 0);
> +
> +					rss_hash6_7 = _mm256_and_si256
> +							(rss_hash6_7, rss_hash_msk);
> +					rss_hash4_5 = _mm256_and_si256
> +							(rss_hash4_5, rss_hash_msk);
> +					rss_hash2_3 = _mm256_and_si256
> +							(rss_hash2_3, rss_hash_msk);
> +					rss_hash0_1 = _mm256_and_si256
> +							(rss_hash0_1, rss_hash_msk);
> +
> +					mb6_7 = _mm256_or_si256(mb6_7, rss_hash6_7);
> +					mb4_5 = _mm256_or_si256(mb4_5, rss_hash4_5);
> +					mb2_3 = _mm256_or_si256(mb2_3, rss_hash2_3);
> +					mb0_1 = _mm256_or_si256(mb0_1, rss_hash0_1);
> +				} /* if() on RSS hash parsing */
> +
> +				if (rxq->vsi->adapter->pf.dev_data->dev_conf.rxmode.offloads &
> +						RTE_ETH_RX_OFFLOAD_VLAN) {
> +					/* merge the status/error-1 bits into one register */
> +					const __m256i status1_4_7 =
> +						_mm256_unpacklo_epi32
> +						(raw_desc_bh6_7,
> +						 raw_desc_bh4_5);
> +					const __m256i status1_0_3 =
> +						_mm256_unpacklo_epi32
> +						(raw_desc_bh2_3,
> +						 raw_desc_bh0_1);
> +
> +					const __m256i status1_0_7 =
> +						_mm256_unpacklo_epi64
> +						(status1_4_7, status1_0_3);
> +
> +					const __m256i l2tag2p_flag_mask =
> +						_mm256_set1_epi32
> +						(1 << 11);
> +
> +					__m256i l2tag2p_flag_bits =
> +						_mm256_and_si256
> +						(status1_0_7,
> +						 l2tag2p_flag_mask);
> +
> +					l2tag2p_flag_bits =
> +						_mm256_srli_epi32
> +						(l2tag2p_flag_bits,
> +						 11);
> +					const __m256i l2tag2_flags_shuf =
> +						_mm256_set_epi8
> +						(0, 0, 0, 0,
> +						 0, 0, 0, 0,
> +						 0, 0, 0, 0,
> +						 0, 0, 0, 0,
> +						 /* end up 128-bits */
> +						 0, 0, 0, 0,
> +						 0, 0, 0, 0,
> +						 0, 0, 0, 0,
> +						 0, 0,
> +						 RTE_MBUF_F_RX_VLAN |
> +						 RTE_MBUF_F_RX_VLAN_STRIPPED,
> +						 0);
> +					__m256i vlan_flags =
> +						_mm256_shuffle_epi8
> +						(l2tag2_flags_shuf,
> +						 l2tag2p_flag_bits);
> +
> +					/* merge with vlan_flags */
> +					mbuf_flags = _mm256_or_si256
> +							(mbuf_flags,
> +							 vlan_flags);
> +
> +					/* L2TAG2_2 */
> +					__m256i vlan_tci6_7 =
> +						_mm256_slli_si256
> +						(raw_desc_bh6_7, 4);
> +					__m256i vlan_tci4_5 =
> +						_mm256_slli_si256
> +						(raw_desc_bh4_5, 4);
> +					__m256i vlan_tci2_3 =
> +						_mm256_slli_si256
> +						(raw_desc_bh2_3, 4);
> +					__m256i vlan_tci0_1 =
> +						_mm256_slli_si256
> +						(raw_desc_bh0_1, 4);
> +
> +					const __m256i vlan_tci_msk =
> +						_mm256_set_epi32
> +						(0, 0xFFFF0000, 0, 0,
> +						 0, 0xFFFF0000, 0, 0);
> +
> +					vlan_tci6_7 = _mm256_and_si256
> +							(vlan_tci6_7,
> +							 vlan_tci_msk);
> +					vlan_tci4_5 = _mm256_and_si256
> +							(vlan_tci4_5,
> +							 vlan_tci_msk);
> +					vlan_tci2_3 = _mm256_and_si256
> +							(vlan_tci2_3,
> +							 vlan_tci_msk);
> +					vlan_tci0_1 = _mm256_and_si256
> +							(vlan_tci0_1,
> +							 vlan_tci_msk);
> +
> +					mb6_7 = _mm256_or_si256
> +						(mb6_7, vlan_tci6_7);
> +					mb4_5 = _mm256_or_si256
> +						(mb4_5, vlan_tci4_5);
> +					mb2_3 = _mm256_or_si256
> +						(mb2_3, vlan_tci2_3);
> +					mb0_1 = _mm256_or_si256
> +						(mb0_1, vlan_tci0_1);
> +				}
> +			}
> #endif
> 		}
>
> diff --git a/drivers/net/ice/ice_rxtx_vec_sse.c b/drivers/net/ice/ice_rxtx_vec_sse.c
> index fd94cedde3..cc5b8510dc 100644
> --- a/drivers/net/ice/ice_rxtx_vec_sse.c
> +++ b/drivers/net/ice/ice_rxtx_vec_sse.c
> @@ -100,9 +100,15 @@ ice_rxq_rearm(struct ice_rx_queue *rxq)
> 	ICE_PCI_REG_WC_WRITE(rxq->qrx_tail, rx_id);
> }
> 
> +#ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
> +static inline void
> +ice_rx_desc_to_olflags_v(struct ice_rx_queue *rxq, __m128i descs[4], __m128i descs_bh[4],
> +			 struct rte_mbuf **rx_pkts)
> +#else
> static inline void
> ice_rx_desc_to_olflags_v(struct ice_rx_queue *rxq, __m128i descs[4],
> 			 struct rte_mbuf **rx_pkts)
> +#endif
> {
> 	const __m128i mbuf_init = _mm_set_epi64x(0, rxq->mbuf_initializer);
> 	__m128i rearm0, rearm1, rearm2, rearm3;
> @@ -214,6 +220,38 @@ ice_rx_desc_to_olflags_v(struct ice_rx_queue *rxq, __m128i descs[4],
> 	/* merge the flags */
> 	flags = _mm_or_si128(flags, rss_vlan);
> 
> +#ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
> +	if (rxq->vsi->adapter->pf.dev_data->dev_conf.rxmode.offloads &
> +			RTE_ETH_RX_OFFLOAD_VLAN) {
> +		const __m128i l2tag2_mask =
> +			_mm_set1_epi32(1 << 11);
> +		const __m128i vlan_tci0_1 =
> +			_mm_unpacklo_epi32(descs_bh[0], descs_bh[1]);
> +		const __m128i vlan_tci2_3 =
> +			_mm_unpacklo_epi32(descs_bh[2], descs_bh[3]);
> +		const __m128i vlan_tci0_3 =
> +			_mm_unpacklo_epi64(vlan_tci0_1, vlan_tci2_3);
> +
> +		__m128i vlan_bits = _mm_and_si128(vlan_tci0_3, l2tag2_mask);
> +
> +		vlan_bits = _mm_srli_epi32(vlan_bits, 11);
> +
> +		const __m128i vlan_flags_shuf =
> +			_mm_set_epi8(0, 0, 0, 0,
> +				     0, 0, 0, 0,
> +				     0, 0, 0, 0,
> +				     0, 0,
> +				     RTE_MBUF_F_RX_VLAN |
> +				     RTE_MBUF_F_RX_VLAN_STRIPPED,
> +				     0);
> +
> +		const __m128i vlan_flags = _mm_shuffle_epi8(vlan_flags_shuf,
> +							    vlan_bits);
> +
> +		/* merge with vlan_flags */
> +		flags = _mm_or_si128(flags, vlan_flags);
> +	}
> +#endif
> +
> 	if (rxq->fdir_enabled) {
> 		const __m128i fdir_id0_1 =
> 			_mm_unpackhi_epi32(descs[0], descs[1]);
> @@ -405,6 +443,9 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> 	     pos += ICE_DESCS_PER_LOOP,
> 	     rxdp += ICE_DESCS_PER_LOOP) {
> 		__m128i descs[ICE_DESCS_PER_LOOP];
> +#ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
> +		__m128i descs_bh[ICE_DESCS_PER_LOOP];
> +#endif
> 		__m128i pkt_mb0, pkt_mb1, pkt_mb2, pkt_mb3;
> 		__m128i staterr, sterr_tmp1, sterr_tmp2;
> 		/* 2 64 bit or 4 32 bit mbuf pointers in one XMM reg. */
> @@ -463,8 +504,6 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> 		/* C.1 4=>2 filter staterr info only */
> 		sterr_tmp1 = _mm_unpackhi_epi32(descs[1], descs[0]);
> 
> -		ice_rx_desc_to_olflags_v(rxq, descs, &rx_pkts[pos]);
> -
> 		/* D.2 pkt 3,4 set in_port/nb_seg and remove crc */
> 		pkt_mb3 = _mm_add_epi16(pkt_mb3, crc_adjust);
> 		pkt_mb2 = _mm_add_epi16(pkt_mb2, crc_adjust);
> @@ -479,21 +518,21 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> 		 * will cause performance drop to get into this context.
> 		 */
> 		if (rxq->vsi->adapter->pf.dev_data->dev_conf.rxmode.offloads &
> -				RTE_ETH_RX_OFFLOAD_RSS_HASH) {
> +				(RTE_ETH_RX_OFFLOAD_RSS_HASH | RTE_ETH_RX_OFFLOAD_VLAN)) {
> 			/* load bottom half of every 32B desc */
> -			const __m128i raw_desc_bh3 =
> +			descs_bh[3] =
> 				_mm_load_si128
> 					((void *)(&rxdp[3].wb.status_error1));
> 			rte_compiler_barrier();
> -			const __m128i raw_desc_bh2 =
> +			descs_bh[2] =
> 				_mm_load_si128
> 					((void *)(&rxdp[2].wb.status_error1));
> 			rte_compiler_barrier();
> -			const __m128i raw_desc_bh1 =
> +			descs_bh[1] =
> 				_mm_load_si128
> 					((void *)(&rxdp[1].wb.status_error1));
> 			rte_compiler_barrier();
> -			const __m128i raw_desc_bh0 =
> +			descs_bh[0] =
> 				_mm_load_si128
> 					((void *)(&rxdp[0].wb.status_error1));
> 
> @@ -501,32 +540,59 @@ _ice_recv_raw_pkts_vec(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> 			 * to shift the 32b RSS hash value to the
> 			 * highest 32b of each 128b before mask
> 			 */
> -			__m128i rss_hash3 =
> -				_mm_slli_epi64(raw_desc_bh3, 32);
> -			__m128i rss_hash2 =
> -				_mm_slli_epi64(raw_desc_bh2, 32);
> -			__m128i rss_hash1 =
> -				_mm_slli_epi64(raw_desc_bh1, 32);
> -			__m128i rss_hash0 =
> -				_mm_slli_epi64(raw_desc_bh0, 32);
> -
> -			__m128i rss_hash_msk =
> -				_mm_set_epi32(0xFFFFFFFF, 0, 0, 0);
> -
> -			rss_hash3 = _mm_and_si128
> -					(rss_hash3, rss_hash_msk);
> -			rss_hash2 = _mm_and_si128
> -					(rss_hash2, rss_hash_msk);
> -			rss_hash1 = _mm_and_si128
> -					(rss_hash1, rss_hash_msk);
> -			rss_hash0 = _mm_and_si128
> -					(rss_hash0, rss_hash_msk);
> -
> -			pkt_mb3 = _mm_or_si128(pkt_mb3, rss_hash3);
> -			pkt_mb2 = _mm_or_si128(pkt_mb2, rss_hash2);
> -			pkt_mb1 = _mm_or_si128(pkt_mb1, rss_hash1);
> -			pkt_mb0 = _mm_or_si128(pkt_mb0, rss_hash0);
> -		} /* if() on RSS hash parsing */
> +			if (rxq->vsi->adapter->pf.dev_data->dev_conf.rxmode.offloads &
> +					RTE_ETH_RX_OFFLOAD_RSS_HASH) {
> +				__m128i rss_hash3 =
> +					_mm_slli_epi64(descs_bh[3], 32);
> +				__m128i rss_hash2 =
> +					_mm_slli_epi64(descs_bh[2], 32);
> +				__m128i rss_hash1 =
> +					_mm_slli_epi64(descs_bh[1], 32);
> +				__m128i rss_hash0 =
> +					_mm_slli_epi64(descs_bh[0], 32);
> +
> +				__m128i rss_hash_msk =
> +					_mm_set_epi32(0xFFFFFFFF, 0, 0, 0);
> +
> +				rss_hash3 = _mm_and_si128
> +						(rss_hash3, rss_hash_msk);
> +				rss_hash2 = _mm_and_si128
> +						(rss_hash2, rss_hash_msk);
> +				rss_hash1 = _mm_and_si128
> +						(rss_hash1, rss_hash_msk);
> +				rss_hash0 = _mm_and_si128
> +						(rss_hash0, rss_hash_msk);
> +
> +				pkt_mb3 = _mm_or_si128(pkt_mb3, rss_hash3);
> +				pkt_mb2 = _mm_or_si128(pkt_mb2, rss_hash2);
> +				pkt_mb1 = _mm_or_si128(pkt_mb1, rss_hash1);
> +				pkt_mb0 = _mm_or_si128(pkt_mb0, rss_hash0);
> +			} /* if() on RSS hash parsing */
> +
> +			if (rxq->vsi->adapter->pf.dev_data->dev_conf.rxmode.offloads &
> +					RTE_ETH_RX_OFFLOAD_VLAN) {
> +				/* L2TAG2_2 */
> +				__m128i vlan_tci3 = _mm_slli_si128(descs_bh[3], 4);
> +				__m128i vlan_tci2 = _mm_slli_si128(descs_bh[2], 4);
> +				__m128i vlan_tci1 = _mm_slli_si128(descs_bh[1], 4);
> +				__m128i vlan_tci0 = _mm_slli_si128(descs_bh[0], 4);
> +
> +				const __m128i vlan_tci_msk = _mm_set_epi32(0, 0xFFFF0000, 0, 0);
> +
> +				vlan_tci3 = _mm_and_si128(vlan_tci3, vlan_tci_msk);
> +				vlan_tci2 = _mm_and_si128(vlan_tci2, vlan_tci_msk);
> +				vlan_tci1 = _mm_and_si128(vlan_tci1, vlan_tci_msk);
> +				vlan_tci0 = _mm_and_si128(vlan_tci0, vlan_tci_msk);
> +
> +				pkt_mb3 = _mm_or_si128(pkt_mb3, vlan_tci3);
> +				pkt_mb2 = _mm_or_si128(pkt_mb2, vlan_tci2);
> +				pkt_mb1 = _mm_or_si128(pkt_mb1, vlan_tci1);
> +				pkt_mb0 = _mm_or_si128(pkt_mb0, vlan_tci0);
> +			}
> +			ice_rx_desc_to_olflags_v(rxq, descs, descs_bh, &rx_pkts[pos]);
> +		}
> +#else
> +		ice_rx_desc_to_olflags_v(rxq, descs, &rx_pkts[pos]);
> #endif
> 
> 		/* C.2 get 4 pkts staterr value */
> --
> 2.34.1
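
One non-obvious trick in the quoted code, for anyone reading the vector paths for the first time: the l2tag2_flags_shuf / vlan_flags_shuf constants turn _mm256_shuffle_epi8 / _mm_shuffle_epi8 into a tiny per-lane lookup table. After the mask and shift, each lane's low byte holds either 0 or 1, and the shuffle maps index 1 to the combined VLAN flag byte and index 0 to zero. A portable scalar sketch of that idea (the flag values are illustrative stand-ins, not the real DPDK ones):

```c
#include <stdint.h>

/* Illustrative stand-ins for the RTE_MBUF_F_RX_VLAN* flag bytes. */
#define F_RX_VLAN          0x1u
#define F_RX_VLAN_STRIPPED 0x2u

/* Scalar analogue of the shuffle-based flag lookup: isolate the
 * L2TAG2P bit (bit 11 of the status word), shift it down to 0/1,
 * and use it to index a two-entry table, just as the shuffle mask
 * maps byte index 1 to the VLAN flag byte and index 0 to zero. */
static uint8_t vlan_flag_lookup(uint32_t status_error1)
{
	static const uint8_t table[2] = {
		0,				/* L2TAG2P clear: no flags */
		F_RX_VLAN | F_RX_VLAN_STRIPPED,	/* L2TAG2P set */
	};
	return table[(status_error1 >> 11) & 1u];
}
```

The same pattern scales to wider tables (similar shuffle lookups already appear elsewhere in these files for other flag bits), which is why the vector paths prefer a shuffle over a per-packet branch.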
Thread overview: 34+ messages
2022-10-26 17:10 [PATCH v4 1/2] app/testpmd: fix vlan offload of rxq Mingjin Ye
2022-10-26 9:52 ` lihuisong (C)
2022-10-27 11:02 ` Ye, MingjinX
2022-10-28 2:09 ` lihuisong (C)
2022-11-03 1:28 ` Ye, MingjinX
2022-11-03 7:01 ` lihuisong (C)
2022-11-04 8:21 ` Ye, MingjinX
2022-11-04 10:17 ` lihuisong (C)
2022-11-04 11:33 ` Ye, MingjinX
2022-11-06 10:32 ` Andrew Rybchenko
2022-11-07 7:18 ` Ye, MingjinX
2022-10-26 17:10 ` [PATCH v4 2/2] net/ice: fix vlan offload Mingjin Ye
2022-10-27 8:36 ` Huang, ZhiminX
2022-10-27 8:36 ` [PATCH v4 1/2] app/testpmd: fix vlan offload of rxq Huang, ZhiminX
2022-10-27 13:16 ` Singh, Aman Deep
2022-11-08 13:28 ` [PATCH v5 1/2] net/ice: fix vlan offload Mingjin Ye
2022-11-08 13:28 ` [PATCH v5 2/2] net/ice: fix vlan offload of rxq Mingjin Ye
2022-11-09 1:52 ` Huang, ZhiminX
2022-11-21 2:54 ` [PATCH v6] doc: add PMD known issue Mingjin Ye
2022-11-25 1:55 ` Ye, MingjinX
2022-12-09 10:20 ` Ye, MingjinX
2022-12-13 1:41 ` Zhang, Qi Z
2022-12-13 4:25 ` Ye, MingjinX
2022-12-23 7:32 ` [PATCH v7] " Mingjin Ye
2022-12-26 2:52 ` Mingjin Ye
2022-12-27 9:00 ` Mingjin Ye
2022-12-27 16:40 ` Stephen Hemminger
2023-01-28 6:01 ` [PATCH v8] " Mingjin Ye
2023-01-28 17:17 ` Stephen Hemminger
2023-02-02 2:30 ` Ye, MingjinX
2022-11-09 1:51 ` [PATCH v5 1/2] net/ice: fix vlan offload Huang, ZhiminX
2022-11-11 3:34 ` Ye, MingjinX [this message]
2022-11-08 10:12 Mingjin Ye
2022-11-08 12:14 Mingjin Ye