From: Bruce Richardson <bruce.richardson@intel.com>
To: dev@dpdk.org
Cc: Helin Zhang <helin.zhang@intel.com>,
	Jingjing Wu <jingjing.wu@intel.com>,
	 Bruce Richardson <bruce.richardson@intel.com>
Subject: [dpdk-dev] [PATCH v2 3/3] i40e: simplify SSE packet length extraction code
Date: Thu, 14 Apr 2016 17:02:37 +0100
Message-ID: <1460649757-11862-4-git-send-email-bruce.richardson@intel.com>
In-Reply-To: <1460649757-11862-1-git-send-email-bruce.richardson@intel.com>

In Table 8-16 of the "Intel® Ethernet Controller XL710 Datasheet" it is
stated that when the whole packet is written to a single buffer, the
header length field in the descriptor will be 0. This means that when
extracting the packet/data_len field from the descriptor in the driver
we do not need to mask out the extra header-length bits.
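
As a rough scalar illustration of the point above (not driver code), the
sketch below compares the old shift-and-mask extraction with a shift-only
one on the upper 32 bits of a descriptor. The bit positions are inferred
from the old PKTLEN_SHIFT (6) and PKTLEN_MASK (0x3FFF) values; the function
names are hypothetical.

```c
#include <stdint.h>

/* Assumption: in the upper 32-bit lane of the descriptor, the packet
 * length sits in bits 6..19 and the header-length bits start at bit 20.
 */

/* Old approach: shift right, then mask off everything above 14 bits. */
static uint16_t
pktlen_shift_mask(uint32_t hi)
{
	return (uint16_t)((hi >> 6) & 0x3FFF);
}

/* New approach: shift left by 10 so the length lands in the upper
 * 16 bits of the lane, then read that 16-bit word with no mask.
 */
static uint16_t
pktlen_shift_only(uint32_t hi)
{
	return (uint16_t)((uint32_t)(hi << 10) >> 16);
}
```

The two functions agree whenever the header-length bits are zero, which is
the single-buffer case described in the datasheet table. If a header-length
bit were set, the shift-only result would carry it in its top bits.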

Inside the vector driver, this removes the need to pull all four pktlen
fields into a single register to work on. Instead of a shift and a mask,
we now need only a shift. Therefore, we can work on each descriptor
independently, processing each using one shift intrinsic and a blend.
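
The per-descriptor shift and blend can be modelled in plain C as below.
This is only a portable sketch of what _mm_slli_epi32() followed by
_mm_blend_epi16(..., 0x80) does to one descriptor, treating the 128-bit
register as four 32-bit lanes; the helper name align_pktlen() is
hypothetical and does not exist in the driver.

```c
#include <stdint.h>

#define PKTLEN_SHIFT 10

/* Scalar model of one descriptor held as four 32-bit lanes, lane 3 being
 * the most significant. Only 16-bit word 7 (the upper half of lane 3) is
 * replaced, which is what a blend mask of 0x80 selects.
 */
static void
align_pktlen(uint32_t lanes[4])
{
	/* every lane is shifted in the SSE version; only lane 3 is used */
	const uint32_t shifted = lanes[3] << PKTLEN_SHIFT;

	/* keep the low 16 bits of lane 3, take the high 16 bits from shifted */
	lanes[3] = (lanes[3] & 0x0000FFFFu) | (shifted & 0xFFFF0000u);
}
```

After the call, the upper 16 bits of lane 3 hold the packet length on a
16-bit boundary, and the other fifteen bytes of the descriptor are
unchanged, so each descriptor can be handled without reference to the
other three.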

This change makes the code shorter and easier to read, so we can pull it
into the main descriptor processing loop instead of needing its own
function. This in turn makes the descriptor processing in the loop as a
whole slightly easier to read as it's more linear.

In testing, this change has little effect on performance, with
single-core perf tests showing a very slight improvement.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 drivers/net/i40e/i40e_rxtx_vec.c | 51 ++++++++++++++--------------------------
 1 file changed, 17 insertions(+), 34 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec.c b/drivers/net/i40e/i40e_rxtx_vec.c
index 9f67f9d..f7a62a8 100644
--- a/drivers/net/i40e/i40e_rxtx_vec.c
+++ b/drivers/net/i40e/i40e_rxtx_vec.c
@@ -184,37 +184,7 @@ desc_to_olflags_v(__m128i descs[4], struct rte_mbuf **rx_pkts)
 #define desc_to_olflags_v(desc, rx_pkts) do {} while (0)
 #endif
 
-#define PKTLEN_SHIFT     (6)
-#define PKTLEN_MASK      (0x3FFF)
-/* Handling the pkt len field is not aligned with 1byte, so shift is
- * needed to let it align
- */
-static inline void
-desc_pktlen_align(__m128i descs[4])
-{
-	__m128i pktlen0, pktlen1;
-
-	/* mask everything except pktlen field*/
-	const __m128i pktlen_msk = _mm_set_epi32(PKTLEN_MASK, PKTLEN_MASK,
-						PKTLEN_MASK, PKTLEN_MASK);
-
-	pktlen0 = _mm_unpackhi_epi32(descs[0], descs[2]);
-	pktlen1 = _mm_unpackhi_epi32(descs[1], descs[3]);
-	pktlen0 = _mm_unpackhi_epi32(pktlen0, pktlen1);
-
-	pktlen0 = _mm_srli_epi32(pktlen0, PKTLEN_SHIFT);
-	pktlen0 = _mm_and_si128(pktlen0, pktlen_msk);
-
-	pktlen0 = _mm_packs_epi32(pktlen0, pktlen0);
-
-	descs[3] = _mm_blend_epi16(descs[3], pktlen0, 0x80);
-	pktlen0 = _mm_slli_epi64(pktlen0, 16);
-	descs[2] = _mm_blend_epi16(descs[2], pktlen0, 0x80);
-	pktlen0 = _mm_slli_epi64(pktlen0, 16);
-	descs[1] = _mm_blend_epi16(descs[1], pktlen0, 0x80);
-	pktlen0 = _mm_slli_epi64(pktlen0, 16);
-	descs[0] = _mm_blend_epi16(descs[0], pktlen0, 0x80);
-}
+#define PKTLEN_SHIFT     10
 
  /*
  * Notice:
@@ -333,12 +303,17 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 			rte_prefetch0(&rx_pkts[pos + 3]->cacheline1);
 		}
 
-		/*shift the pktlen field*/
-		desc_pktlen_align(descs);
-
 		/* avoid compiler reorder optimization */
 		rte_compiler_barrier();
 
+		/* pkt 3,4 shift the pktlen field to be 16-bit aligned*/
+		const __m128i len3 = _mm_slli_epi32(descs[3], PKTLEN_SHIFT);
+		const __m128i len2 = _mm_slli_epi32(descs[2], PKTLEN_SHIFT);
+
+		/* merge the now-aligned packet length fields back in */
+		descs[3] = _mm_blend_epi16(descs[3], len3, 0x80);
+		descs[2] = _mm_blend_epi16(descs[2], len2, 0x80);
+
 		/* D.1 pkt 3,4 convert format from desc to pktmbuf */
 		pkt_mb4 = _mm_shuffle_epi8(descs[3], shuf_msk);
 		pkt_mb3 = _mm_shuffle_epi8(descs[2], shuf_msk);
@@ -354,6 +329,14 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 		pkt_mb4 = _mm_add_epi16(pkt_mb4, crc_adjust);
 		pkt_mb3 = _mm_add_epi16(pkt_mb3, crc_adjust);
 
+		/* pkt 1,2 shift the pktlen field to be 16-bit aligned*/
+		const __m128i len1 = _mm_slli_epi32(descs[1], PKTLEN_SHIFT);
+		const __m128i len0 = _mm_slli_epi32(descs[0], PKTLEN_SHIFT);
+
+		/* merge the now-aligned packet length fields back in */
+		descs[1] = _mm_blend_epi16(descs[1], len1, 0x80);
+		descs[0] = _mm_blend_epi16(descs[0], len0, 0x80);
+
 		/* D.1 pkt 1,2 convert format from desc to pktmbuf */
 		pkt_mb2 = _mm_shuffle_epi8(descs[1], shuf_msk);
 		pkt_mb1 = _mm_shuffle_epi8(descs[0], shuf_msk);
-- 
2.5.5
