From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 000C18E76 for ; Thu, 29 Oct 2015 10:16:34 +0100 (CET) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga101.jf.intel.com with ESMTP; 29 Oct 2015 02:16:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.20,213,1444719600"; d="scan'208";a="838264386" Received: from shvmail01.sh.intel.com ([10.239.29.42]) by fmsmga002.fm.intel.com with ESMTP; 29 Oct 2015 02:16:31 -0700 Received: from shecgisg004.sh.intel.com (shecgisg004.sh.intel.com [10.239.29.89]) by shvmail01.sh.intel.com with ESMTP id t9T9GTVT029358; Thu, 29 Oct 2015 17:16:29 +0800 Received: from shecgisg004.sh.intel.com (localhost [127.0.0.1]) by shecgisg004.sh.intel.com (8.13.6/8.13.6/SuSE Linux 0.8) with ESMTP id t9T9GRLd013452; Thu, 29 Oct 2015 17:16:29 +0800 Received: (from jingche2@localhost) by shecgisg004.sh.intel.com (8.13.6/8.13.6/Submit) id t9T9GRej013400; Thu, 29 Oct 2015 17:16:27 +0800 From: "Chen Jing D(Mark)" To: dev@dpdk.org Date: Thu, 29 Oct 2015 17:16:02 +0800 Message-Id: <1446110173-13330-6-git-send-email-jing.d.chen@intel.com> X-Mailer: git-send-email 1.7.12.2 In-Reply-To: <1446110173-13330-1-git-send-email-jing.d.chen@intel.com> References: <1445939209-12783-2-git-send-email-jing.d.chen@intel.com> <1446110173-13330-1-git-send-email-jing.d.chen@intel.com> Subject: [dpdk-dev] [PATCH v4 05/16] fm10k: add 2 functions to parse pkt_type and offload flag X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Oct 2015 09:16:35 -0000 From: "Chen Jing D(Mark)" Add 2 functions, in which using SSE instructions to parse RX desc to get pkt_type and ol_flags in mbuf. Signed-off-by: Chen Jing D(Mark) --- drivers/net/fm10k/fm10k_rxtx_vec.c | 127 ++++++++++++++++++++++++++++++++++++ 1 files changed, 127 insertions(+), 0 deletions(-) diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c b/drivers/net/fm10k/fm10k_rxtx_vec.c index 6c21f15..88c9536 100644 --- a/drivers/net/fm10k/fm10k_rxtx_vec.c +++ b/drivers/net/fm10k/fm10k_rxtx_vec.c @@ -44,6 +44,133 @@ #pragma GCC diagnostic ignored "-Wcast-qual" #endif +/* Handling the offload flags (olflags) field takes computation + * time when receiving packets. Therefore we provide a flag to disable + * the processing of the olflags field when they are not needed. This + * gives improved performance, at the cost of losing the offload info + * in the received packet + */ +#ifdef RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE + +/* Vlan present flag shift */ +#define VP_SHIFT (2) +/* L3 type shift */ +#define L3TYPE_SHIFT (4) +/* L4 type shift */ +#define L4TYPE_SHIFT (7) + +static inline void +fm10k_desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts) +{ + __m128i ptype0, ptype1, vtag0, vtag1; + union { + uint16_t e[4]; + uint64_t dword; + } vol; + + const __m128i pkttype_msk = _mm_set_epi16( + 0x0000, 0x0000, 0x0000, 0x0000, + PKT_RX_VLAN_PKT, PKT_RX_VLAN_PKT, + PKT_RX_VLAN_PKT, PKT_RX_VLAN_PKT); + + /* mask everything except rss type */ + const __m128i rsstype_msk = _mm_set_epi16( + 0x0000, 0x0000, 0x0000, 0x0000, + 0x000F, 0x000F, 0x000F, 0x000F); + + /* map rss type to rss hash flag */ + const __m128i rss_flags = _mm_set_epi8(0, 0, 0, 0, + 0, 0, 0, PKT_RX_RSS_HASH, + PKT_RX_RSS_HASH, 0, PKT_RX_RSS_HASH, 0, + PKT_RX_RSS_HASH, PKT_RX_RSS_HASH, PKT_RX_RSS_HASH, 0); + + ptype0 = _mm_unpacklo_epi16(descs[0], descs[1]); + ptype1 = _mm_unpacklo_epi16(descs[2], descs[3]); + vtag0 = _mm_unpackhi_epi16(descs[0], descs[1]); + vtag1 = _mm_unpackhi_epi16(descs[2], descs[3]); + + ptype0 = _mm_unpacklo_epi32(ptype0, ptype1); + ptype0 = _mm_and_si128(ptype0, rsstype_msk); + ptype0 = _mm_shuffle_epi8(rss_flags, ptype0); + + vtag1 = _mm_unpacklo_epi32(vtag0, vtag1); + vtag1 = _mm_srli_epi16(vtag1, VP_SHIFT); + vtag1 = _mm_and_si128(vtag1, pkttype_msk); + + vtag1 = _mm_or_si128(ptype0, vtag1); + vol.dword = _mm_cvtsi128_si64(vtag1); + + rx_pkts[0]->ol_flags = vol.e[0]; + rx_pkts[1]->ol_flags = vol.e[1]; + rx_pkts[2]->ol_flags = vol.e[2]; + rx_pkts[3]->ol_flags = vol.e[3]; +} + +static inline void +fm10k_desc_to_pktype_v(__m128i descs[4], struct rte_mbuf **rx_pkts) +{ + __m128i l3l4type0, l3l4type1, l3type, l4type; + union { + uint16_t e[4]; + uint64_t dword; + } vol; + + /* L3 pkt type mask Bit4 to Bit6 */ + const __m128i l3type_msk = _mm_set_epi16( + 0x0000, 0x0000, 0x0000, 0x0000, + 0x0070, 0x0070, 0x0070, 0x0070); + + /* L4 pkt type mask Bit7 to Bit9 */ + const __m128i l4type_msk = _mm_set_epi16( + 0x0000, 0x0000, 0x0000, 0x0000, + 0x0380, 0x0380, 0x0380, 0x0380); + + /* convert RRC l3 type to mbuf format */ + const __m128i l3type_flags = _mm_set_epi8(0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, RTE_PTYPE_L3_IPV6_EXT, + RTE_PTYPE_L3_IPV6, RTE_PTYPE_L3_IPV4_EXT, + RTE_PTYPE_L3_IPV4, 0); + + /* Convert RRC l4 type to mbuf format l4type_flags shift-left 8 bits + * to fill into8 bits length. + */ + const __m128i l4type_flags = _mm_set_epi8(0, 0, 0, 0, 0, 0, 0, 0, 0, + RTE_PTYPE_TUNNEL_GENEVE >> 8, + RTE_PTYPE_TUNNEL_NVGRE >> 8, + RTE_PTYPE_TUNNEL_VXLAN >> 8, + RTE_PTYPE_TUNNEL_GRE >> 8, + RTE_PTYPE_L4_UDP >> 8, + RTE_PTYPE_L4_TCP >> 8, + 0); + + l3l4type0 = _mm_unpacklo_epi16(descs[0], descs[1]); + l3l4type1 = _mm_unpacklo_epi16(descs[2], descs[3]); + l3l4type0 = _mm_unpacklo_epi32(l3l4type0, l3l4type1); + + l3type = _mm_and_si128(l3l4type0, l3type_msk); + l4type = _mm_and_si128(l3l4type0, l4type_msk); + + l3type = _mm_srli_epi16(l3type, L3TYPE_SHIFT); + l4type = _mm_srli_epi16(l4type, L4TYPE_SHIFT); + + l3type = _mm_shuffle_epi8(l3type_flags, l3type); + /* l4type_flags shift-left for 8 bits, need shift-right back */ + l4type = _mm_shuffle_epi8(l4type_flags, l4type); + + l4type = _mm_slli_epi16(l4type, 8); + l3l4type0 = _mm_or_si128(l3type, l4type); + vol.dword = _mm_cvtsi128_si64(l3l4type0); + + rx_pkts[0]->packet_type = vol.e[0]; + rx_pkts[1]->packet_type = vol.e[1]; + rx_pkts[2]->packet_type = vol.e[2]; + rx_pkts[3]->packet_type = vol.e[3]; +} +#else +#define fm10k_desc_to_olflags_v(desc, rx_pkts) do {} while (0) +#define fm10k_desc_to_pktype_v(desc, rx_pkts) do {} while (0) +#endif + int __attribute__((cold)) fm10k_rxq_vec_setup(struct fm10k_rx_queue *rxq) { -- 1.7.7.6