From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 89719A0032; Tue, 16 Nov 2021 14:54:08 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 5C00340141; Tue, 16 Nov 2021 14:54:08 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by mails.dpdk.org (Postfix) with ESMTP id 0F0D240040 for ; Tue, 16 Nov 2021 14:54:05 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1637070845; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=qJbmW13ezffsFNf3OpjtWST278sIZoHIUh6o3jJKJls=; b=Pjxy8cuBvb9fTUF5s2ZsGMEwEPhU/FZ+kH6vmlXjD+wow1UAqViP3q8QUWoeMP7nrDZsPJ 5nATbJUabrdUW7wS/tVebr3tVJ4/YJaiBlEc2+4qmRIzqY7Ym7QkGmL/+HUOY4K5MjAdzI M/0GnQd9aPNNePf39mFqPEzwX7wTkVA= Received: from mail-lj1-f197.google.com (mail-lj1-f197.google.com [209.85.208.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-93-e3GHBuh2NA6VAciXFkNi6w-1; Tue, 16 Nov 2021 08:54:02 -0500 X-MC-Unique: e3GHBuh2NA6VAciXFkNi6w-1 Received: by mail-lj1-f197.google.com with SMTP id e13-20020a2e9e0d000000b00216ace8e8e5so6268107ljk.10 for ; Tue, 16 Nov 2021 05:54:01 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=qJbmW13ezffsFNf3OpjtWST278sIZoHIUh6o3jJKJls=; b=JU8/M5nJjPnLGrSoE3K4ZIJqpY1pfpMrR89AKG0CoCJT3Ba148Lq0HIQAEe69g4kNS pAo5zHfSRbcXRiSzEZyXQQk6Ih6Z/v4ECwmNyVLr3jx3lAzulHS4YpJ7DC0AFUG7S//j ID/nh55KKlpVQ7hAYhle8ujUomwMT+63X9q4/mHjqes13SBpX/lewrDvRWVxVYhz6PE5 GF7hp9ZZSE996VKnRN0XupFDYqqqnBdQSRJUCFfnxED1jxDZvyPJ5CI2/8j/8etxS534 5DyWKihGQ9u9od2pOAEjXpINefGGdJy1TZ88jWnvf8RuIiFCS/d/Xhj5d764aI9KcShm UoKQ== X-Gm-Message-State: AOAM531K7T75a4Wm8d0//AP8wbkDW7bhUBOr8qi0yuErvmNmnM6Akbv2 pwywYgvhR0zbvojLOpeOVg7XHXdiYUPIA8w7XCVzWNgxHMSE1ggT2/AtAz7t2beLNIs0QbZsVzU iztYLs0coVY4z5HVplyU= X-Received: by 2002:a2e:8189:: with SMTP id e9mr7146777ljg.333.1637070840606; Tue, 16 Nov 2021 05:54:00 -0800 (PST) X-Google-Smtp-Source: ABdhPJxHE0PrKN4KuyoRIMbguJvwqPilwVrEo3OAj62XKoSLfb67Z7oNqDkmyf2D3pcIi7Q/OGN/z6xgpSYGblyNhJY= X-Received: by 2002:a2e:8189:: with SMTP id e9mr7146752ljg.333.1637070840380; Tue, 16 Nov 2021 05:54:00 -0800 (PST) MIME-Version: 1.0 References: <20211109172456.147140-1-vladimir.medvedkin@intel.com> <20211112141719.232932-1-vladimir.medvedkin@intel.com> In-Reply-To: <20211112141719.232932-1-vladimir.medvedkin@intel.com> From: David Marchand Date: Tue, 16 Nov 2021 14:53:49 +0100 Message-ID: Subject: Re: [PATCH v2] hash: fix thash gfni implementation To: Vladimir Medvedkin Cc: dev , Thomas Monjalon , "Ananyev, Konstantin" , Lance Richardson , Ji@dpdk.org, Kai , Yipeng Wang , Sameh Gobriel , Bruce Richardson Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=dmarchan@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Fri, Nov 12, 2021 at 3:17 PM Vladimir Medvedkin wrote: > > 1. This patch replaces _mm512_set_epi8 with _mm512_set_epi32 > due to the lack of support by some compilers. Ok, it was the initial report from Lance. > 2. This patch checks if AVX512F is supported along with GFNI. > This is done if the code is built on a platform that supports GFNI, > but does not support AVX512. Ok. > 3. Also this patch fixes compilation problems on 32bit arch due to > lack of support for _mm_extract_epi64() by implementing XOR folding > with _mm_extract_epi32() on 32-bit arch. This code is under a #if defined(__GFNI__) && defined(__AVX512F__). Does such a 32 bits processor exist, that supports AVX512 and GFNI? > > Fixes: 4fd8c4cb0de1 ("hash: add new Toeplitz hash implementation") > Cc: vladimir.medvedkin@intel.com > > Signed-off-by: Vladimir Medvedkin > Acked-by: Lance Richardson > Acked-by: Ji, Kai > --- > lib/hash/rte_thash_x86_gfni.h | 44 ++++++++++++++++++++--------------- > 1 file changed, 25 insertions(+), 19 deletions(-) > > diff --git a/lib/hash/rte_thash_x86_gfni.h b/lib/hash/rte_thash_x86_gfni.h > index c2889c3734..987dec4988 100644 > --- a/lib/hash/rte_thash_x86_gfni.h > +++ b/lib/hash/rte_thash_x86_gfni.h > @@ -18,7 +18,7 @@ > extern "C" { > #endif > > -#ifdef __GFNI__ > +#if defined(__GFNI__) && defined(__AVX512F__) Please update #endif comments accordingly, or remove invalid/obsolete comment about _GFNI_. > #define RTE_THASH_GFNI_DEFINED > > #define RTE_THASH_FIRST_ITER_MSK 0x0f0f0f0f0f0e0c08 > @@ -33,7 +33,6 @@ __rte_thash_xor_reduce(__m512i xor_acc, uint32_t *val_1, uint32_t *val_2) > { > __m256i tmp_256_1, tmp_256_2; > __m128i tmp128_1, tmp128_2; > - uint64_t tmp_1, tmp_2; > > tmp_256_1 = _mm512_castsi512_si256(xor_acc); > tmp_256_2 = _mm512_extracti32x8_epi32(xor_acc, 1); > @@ -43,12 +42,24 @@ __rte_thash_xor_reduce(__m512i xor_acc, uint32_t *val_1, uint32_t *val_2) > tmp128_2 = _mm256_extracti32x4_epi32(tmp_256_1, 1); > tmp128_1 = _mm_xor_si128(tmp128_1, tmp128_2); > > +#ifdef RTE_ARCH_X86_64 > + uint64_t tmp_1, tmp_2; > tmp_1 = _mm_extract_epi64(tmp128_1, 0); > tmp_2 = _mm_extract_epi64(tmp128_1, 1); > tmp_1 ^= tmp_2; > > *val_1 = (uint32_t)tmp_1; > *val_2 = (uint32_t)(tmp_1 >> 32); > +#else > + uint32_t tmp_1, tmp_2; > + tmp_1 = _mm_extract_epi32(tmp128_1, 0); > + tmp_2 = _mm_extract_epi32(tmp128_1, 1); > + tmp_1 ^= _mm_extract_epi32(tmp128_1, 2); > + tmp_2 ^= _mm_extract_epi32(tmp128_1, 3); > + > + *val_1 = tmp_1; > + *val_2 = tmp_2; > +#endif > } > > __rte_internal > @@ -56,23 +67,18 @@ static inline __m512i > __rte_thash_gfni(const uint64_t *mtrx, const uint8_t *tuple, > const uint8_t *secondary_tuple, int len) > { > - __m512i permute_idx = _mm512_set_epi8(7, 6, 5, 4, 7, 6, 5, 4, > - 6, 5, 4, 3, 6, 5, 4, 3, > - 5, 4, 3, 2, 5, 4, 3, 2, > - 4, 3, 2, 1, 4, 3, 2, 1, > - 3, 2, 1, 0, 3, 2, 1, 0, > - 2, 1, 0, -1, 2, 1, 0, -1, > - 1, 0, -1, -2, 1, 0, -1, -2, > - 0, -1, -2, -3, 0, -1, -2, -3); > - > - const __m512i rewind_idx = _mm512_set_epi8(0, 0, 0, 0, 0, 0, 0, 0, > - 0, 0, 0, 0, 0, 0, 0, 0, > - 0, 0, 0, 0, 0, 0, 0, 0, > - 0, 0, 0, 0, 0, 0, 0, 0, > - 0, 0, 0, 0, 0, 0, 0, 0, > - 0, 0, 0, 59, 0, 0, 0, 59, > - 0, 0, 59, 58, 0, 0, 59, 58, > - 0, 59, 58, 57, 0, 59, 58, 57); > + __m512i permute_idx = _mm512_set_epi32(0x7060504, 0x7060504, Nit: it is easier to read fully expanded 32 bits values, like 0x07060504 instead of 0x7060504 Etc... > + 0x6050403, 0x6050403, > + 0x5040302, 0x5040302, > + 0x4030201, 0x4030201, > + 0x3020100, 0x3020100, > + 0x20100FF, 0x20100FF, > + 0x100FFFE, 0x100FFFE, > + 0xFFFEFD, 0xFFFEFD); > + const __m512i rewind_idx = _mm512_set_epi32(0, 0, 0, 0, 0, 0, 0, 0, > + 0, 0, 0x3B, 0x3B, > + 0x3B3A, 0x3B3A, > + 0x3B3A39, 0x3B3A39); > const __mmask64 rewind_mask = RTE_THASH_REWIND_MSK; > const __m512i shift_8 = _mm512_set1_epi8(8); > __m512i xor_acc = _mm512_setzero_si512(); > -- > 2.25.1 > -- David Marchand