patches for DPDK stable branches
* [PATCH] hash: fix SSE comparison
@ 2023-09-06  2:31 Jieqiang Wang
  2023-09-29 15:32 ` David Marchand
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Jieqiang Wang @ 2023-09-06  2:31 UTC
  To: Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin,
	Honnappa Nagarahalli, Dharmik Thakkar
  Cc: dev, nd, Jieqiang Wang, stable, Feifei Wang, Ruifeng Wang

_mm_cmpeq_epi16 returns 0xFFFF in each 16-bit element where the two
operands are equal. The original SSE2 implementation of
compare_signatures passes that result straight to _mm_movemask_epi8,
which builds the mask from the MSB of every 8-bit element, whereas only
the MSB of the lower 8 bits of each 16-bit element should be used.
For example, if all signatures match, the SSE2 path returns 0xFFFF
while the NEON and default scalar paths return 0x5555.
This bug has no negative effect today, because the caller only examines
the trailing zeros of each match mask, but we recommend this fix to
keep the SSE2 path consistent with the NEON and default scalar code.

Fixes: c7d93df552c2 ("hash: use partial-key hashing")
Cc: yipeng1.wang@intel.com
Cc: stable@dpdk.org

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Signed-off-by: Jieqiang Wang <jieqiang.wang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
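Note for readers: the mask difference described above can be
reproduced without SSE2 hardware. The following is a minimal scalar
sketch, not code from this patch. It models the byte layout of
_mm_cmpeq_epi16 followed by _mm_movemask_epi8 under the assumption of
eight 16-bit signatures per bucket. The names SLOTS, mask_old,
mask_new and first_slot are hypothetical, and first_slot assumes the
caller derives the slot index as the trailing-zero count divided by
two.

```c
#include <stdint.h>

#define SLOTS 8 /* assumed: eight 16-bit signatures per 128-bit load */

/* Model of movemask_epi8(cmpeq_epi16(...)): a matching lane is 0xFFFF,
 * so the MSBs of both of its bytes end up in the mask (bits 2i, 2i+1). */
static uint32_t mask_old(const uint16_t sigs[SLOTS], uint16_t sig)
{
	uint32_t mask = 0;

	for (int i = 0; i < SLOTS; i++)
		if (sigs[i] == sig)
			mask |= 3u << (2 * i);
	return mask;
}

/* Model with the 0x0080 AND applied first: only the MSB of the low
 * byte of a matching lane survives, so only bit 2i is set. */
static uint32_t mask_new(const uint16_t sigs[SLOTS], uint16_t sig)
{
	uint32_t mask = 0;

	for (int i = 0; i < SLOTS; i++)
		if (sigs[i] == sig)
			mask |= 1u << (2 * i);
	return mask;
}

/* Hypothetical caller-side helper: slot of the first match, taken as
 * the trailing-zero count divided by two. The mask must be non-zero. */
static unsigned int first_slot(uint32_t mask)
{
	unsigned int tz = 0;

	while (!(mask & 1u)) {
		mask >>= 1;
		tz++;
	}
	return tz >> 1;
}
```

With every signature equal to sig, mask_old() yields 0xFFFF and
mask_new() yields 0x5555, while first_slot() returns the same slot for
both masks, which is why the old behavior went unnoticed.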
 lib/hash/rte_cuckoo_hash.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c
index d92a903bb3..acaa8b74bd 100644
--- a/lib/hash/rte_cuckoo_hash.c
+++ b/lib/hash/rte_cuckoo_hash.c
@@ -1862,17 +1862,19 @@ compare_signatures(uint32_t *prim_hash_matches, uint32_t *sec_hash_matches,
 	/* For match mask the first bit of every two bits indicates the match */
 	switch (sig_cmp_fn) {
 #if defined(__SSE2__)
-	case RTE_HASH_COMPARE_SSE:
+	case RTE_HASH_COMPARE_SSE: {
 		/* Compare all signatures in the bucket */
-		*prim_hash_matches = _mm_movemask_epi8(_mm_cmpeq_epi16(
-				_mm_load_si128(
+		__m128i shift_mask = _mm_set1_epi16(0x0080);
+		__m128i prim_cmp = _mm_cmpeq_epi16(_mm_load_si128(
 					(__m128i const *)prim_bkt->sig_current),
-				_mm_set1_epi16(sig)));
+					_mm_set1_epi16(sig));
+		*prim_hash_matches = _mm_movemask_epi8(_mm_and_si128(prim_cmp, shift_mask));
 		/* Compare all signatures in the bucket */
-		*sec_hash_matches = _mm_movemask_epi8(_mm_cmpeq_epi16(
-				_mm_load_si128(
+		__m128i sec_cmp = _mm_cmpeq_epi16(_mm_load_si128(
 					(__m128i const *)sec_bkt->sig_current),
-				_mm_set1_epi16(sig)));
+					_mm_set1_epi16(sig));
+		*sec_hash_matches = _mm_movemask_epi8(_mm_and_si128(sec_cmp, shift_mask));
+		}
 		break;
 #elif defined(__ARM_NEON)
 	case RTE_HASH_COMPARE_NEON: {
-- 
2.25.1



end of thread, other threads:[~2023-10-10  9:50 UTC | newest]

Thread overview: 8+ messages
2023-09-06  2:31 [PATCH] hash: fix SSE comparison Jieqiang Wang
2023-09-29 15:32 ` David Marchand
2023-10-02 10:39 ` Bruce Richardson
2023-10-07  6:41   ` Re: " Jieqiang Wang
2023-10-07  7:15 ` [PATCH v2] " Jieqiang Wang
2023-10-07  7:36 ` [PATCH v3] " Jieqiang Wang
2023-10-09 14:33   ` Bruce Richardson
2023-10-10  9:50     ` David Marchand
