From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0C8FEA0C43; Thu, 21 Oct 2021 19:17:14 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id E96404003F; Thu, 21 Oct 2021 19:17:13 +0200 (CEST) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by mails.dpdk.org (Postfix) with ESMTP id 2A3634003E for ; Thu, 21 Oct 2021 19:17:11 +0200 (CEST) X-IronPort-AV: E=McAfee;i="6200,9189,10144"; a="216262174" X-IronPort-AV: E=Sophos;i="5.87,170,1631602800"; d="scan'208";a="216262174" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Oct 2021 10:17:10 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.87,170,1631602800"; d="scan'208";a="495245201" Received: from fmsmsx602.amr.corp.intel.com ([10.18.126.82]) by orsmga008.jf.intel.com with ESMTP; 21 Oct 2021 10:17:10 -0700 Received: from fmsmsx601.amr.corp.intel.com (10.18.126.81) by fmsmsx602.amr.corp.intel.com (10.18.126.82) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2242.12; Thu, 21 Oct 2021 10:17:10 -0700 Received: from fmsedg602.ED.cps.intel.com (10.1.192.136) by fmsmsx601.amr.corp.intel.com (10.18.126.81) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2242.12 via Frontend Transport; Thu, 21 Oct 2021 10:17:10 -0700 Received: from NAM11-CO1-obe.outbound.protection.outlook.com (104.47.56.175) by edgegateway.intel.com (192.55.55.71) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2242.12; Thu, 21 Oct 2021 10:17:09 -0700 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=gV9F36Ivd/mAl8KLBzCiZRWQwD9RxVmLS4wkqTpoBe3smZQBY/ixCiSeXJOigFI56ISdVZ12afO1oADf7gtAPluFLDGMdVoRhBOpVmDoC+QF1ftqT9xaKvmpEXxN/fadz5eHIRl9qQaAz8HCjcJFARikvLy70yLYzuSwTmePNypy3DXkt4CulghpqJyet91gFWYCa9A9WkYocwSACQ8UT7L36cMvZ3LyniexcqO1dB3OwOeljBwofVVEqCZj0TSOosUdocRRqaGvviNtqlrKd0l+mXZHh7L2zaWwlw5BV0N+joiWsPLr62qgF7h1DjcnJb1/OdtUbfk0iFflVCc7zA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=gkGxIL2dmHs4EbZg0QSV0sGTMg8nI6ZbSWZgbJgKcC0=; b=eIr8sXI9Uy7hHveew3DlGzFRj8jfWZhcsS7Yig3LR2WVm1u+vVqj8En7geGQXZzUqUn1aqI+MW4zXIWouB4J4Z2JHkMg1O9okXh8SH6zjBjMapyPc7rwepxkbF/PmzeuqU0qEWelpgKebaHKJqf/fITrcjN584BrYYJPRXbyAocLZWEdMZxiYZeX1+3y/fZzJHpNg06ukA1cw++58AL1dtwC9UQkDwoshfVrYvAMlRu5MH3tM7TRReJE23AzzvywUmwRf6Eic4qEMA49rybjGKc3/jgR4+rL0Nx3s4x7tsmcxK6CXwpOsC0m8ENNqXqpFT7zXz8/x25PiYvo7y0AQg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel.onmicrosoft.com; s=selector2-intel-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=gkGxIL2dmHs4EbZg0QSV0sGTMg8nI6ZbSWZgbJgKcC0=; b=xkj7OFD30zXfwfZIxoKLpOKHcBK60hngrv8uDBtBvqZ1f5aqGVihIgGsxgShVbvhp+DOIUK17tDz1JBe9LmJE0Q66H6U+5lNnCvPZ3wcEzjxd2oSBxVoLHXHR9nn270mh86OijDL5QfhmTOKjU4g5+c7tigkRkjFs8WkTFpFGpI= Authentication-Results: networkplumber.org; dkim=none (message not signed) header.d=none;networkplumber.org; dmarc=none action=none header.from=intel.com; Received: from CO1PR11MB5012.namprd11.prod.outlook.com (2603:10b6:303:90::18) by MW5PR11MB5858.namprd11.prod.outlook.com (2603:10b6:303:193::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4608.17; Thu, 21 Oct 2021 17:17:08 +0000 Received: from CO1PR11MB5012.namprd11.prod.outlook.com ([fe80::442b:2192:c62b:c6c3]) by CO1PR11MB5012.namprd11.prod.outlook.com ([fe80::442b:2192:c62b:c6c3%7]) with mapi id 15.20.4628.018; Thu, 21 Oct 2021 17:17:08 +0000 To: "Ananyev, Konstantin" , "dev@dpdk.org" CC: "Wang, Yipeng1" , "Gobriel, Sameh" , "Richardson, Bruce" , "stephen@networkplumber.org" References: <1634290206-251913-1-git-send-email-vladimir.medvedkin@intel.com> <1634754016-367978-2-git-send-email-vladimir.medvedkin@intel.com> From: "Medvedkin, Vladimir" Message-ID: <9c3ba761-d91d-f137-f2a9-6f2979f36b5c@intel.com> Date: Thu, 21 Oct 2021 19:17:01 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0 Thunderbird/78.14.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-ClientProxiedBy: DB6PR0202CA0047.eurprd02.prod.outlook.com (2603:10a6:4:a5::33) To CO1PR11MB5012.namprd11.prod.outlook.com (2603:10b6:303:90::18) MIME-Version: 1.0 Received: from [192.198.151.54] (192.198.151.54) by DB6PR0202CA0047.eurprd02.prod.outlook.com (2603:10a6:4:a5::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4628.16 via Frontend Transport; Thu, 21 Oct 2021 17:17:06 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 42169875-a555-4ce1-b06a-08d994b69c49 X-MS-TrafficTypeDiagnostic: MW5PR11MB5858: X-LD-Processed: 46c98d88-e344-4ed4-8496-4ed7712e255d,ExtAddr X-MS-Exchange-Transport-Forked: True X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:85; X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: xl1BV96Z2oqUxd1QUy+4KDmbm8jkon8Qk6I7GTM2MxWHmK+PtJWjZzmRfs1hHuC9LEqYhN2vZWqvHilef6rRVpQshCU2VR4kylDBP1rliz61nW3w96pfVt/zcgI2zfl6T524nCP/sGdoY0AiHzas7GaSwJIxDPfJHqzLMX1BppKQ5sRMFGiUBWBZsnWKjYTbIqNme1obsAKWLn1SSWxKmLysDitozQGuGXsO7lfA0CKEl3laQexXfyMf1l4yGK+xw25ygRVuZ4JCpzh1PWOzBkCXWSPlNOx7j3FF9/9H7oB7MNz35Cdnw3khhfweruMQwYi3tCg8RdzQ7mAzOAUb5knkyLNSlBf97ovkTuK94vhk4DDZHawmgYi4TqesB3Vmz62seoq4vdpYRn/bllb4nxb7cbLLYK4sgSJGlQlB20jJnI5dKDo1CHeB/d2PvPwybCOqKR2xOI4zMJOdsLR7ZV+tYPUIs7hpaSWch1uwBAo+mtnT3Xz4qh5JYVRbUKPkwTsIzWe6LY44md0E7yxfDyPjRu53NzRlpF3wW7C6NsVEFVYUKPGMagLscgQhkW/LHml3dsOOTvUGYUzheMR+ByE3NdgNpob6uFRA5LdGx5ARtICQHYhykSAAfazaRUR59mO9cENzZVoWZ440akPPZMGXLT0gQSsMxQCMd87G9aJuVJDFvUKNoaHB35NwLtxf02g0nsymnDa4IXl1GAeB/Yu5V16ERVFvJbJMM4MR3wL8HqPBMydS1pD7+paXl1399GPfg11q/9BUKz4DsChW+A== X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CO1PR11MB5012.namprd11.prod.outlook.com; PTR:; CAT:NONE; SFS:(366004)(38100700002)(31686004)(83380400001)(8936002)(956004)(26005)(30864003)(6666004)(53546011)(186003)(2616005)(508600001)(82960400001)(2906002)(5660300002)(316002)(36756003)(4326008)(110136005)(86362001)(8676002)(54906003)(16576012)(66946007)(66556008)(66476007)(6486002)(31696002)(6706004)(3940600001)(45980500001)(43740500002); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?B?ZCtQVCtyNlpHajZZb2lZWDlLOVdtRitVVTNHN3VZLzMyUHZaQjVzdFNlaEk0?= =?utf-8?B?dDJWbytOSDRUMjE2bUpsUllaR01OSzBhcjlneFF2cU85cjNvelRqRlliMkFy?= =?utf-8?B?aGx5aW0rZHhKNVZtem1BU1g0TlNoQ1dyQ0MrTGF0VWl0ZTNGQXB4SGRmTUFS?= =?utf-8?B?TVZsR2dDVjQ3ODdxZXBTT0doSDRmdzZ2SU96aG9TN3dRY2FHZnk0d1BXZXBt?= =?utf-8?B?NWR5Z2RwRmpQbkROdHB2SzNlbmpIc2RvM2Jld1pBKzZhWE9lYnppT0xRTS9W?= =?utf-8?B?TjlwR3dYWkxlZCs0U01CUTZsbnh4a3VzODdTbnZsNGRORUw4M3RyYkJoNHR6?= =?utf-8?B?dy9GenBocjRkKytVZjZrTXY0WmoyWGM1V0VscS8vSjJraXVWR1JERWFzUkxr?= =?utf-8?B?L1ljdmtjZzRQVCtYZUhFQmUyY3JMS3dhQThpdDlxMGpsd3NZMW82dVV1WEJ1?= =?utf-8?B?VUVoalhjL1czS1VGcGk5ZjhaeWRVcGhmVW8xRGo2QW5DUnByRmxDYlZuUXFn?= =?utf-8?B?dHB5WktaRjBUMG5KZUlNQmVFemlsT3hreDRxVVhDV2xJQWVoNWV1MGV0RVd5?= =?utf-8?B?VGNWU2UvV1k4NTh3bWdvWnBzTHFpZ2ZaRUpGVWEyU3FoeFlhalpzSkpINm9j?= =?utf-8?B?NVBlbnV5eDJSZ01NZ1BGb2hBRUc1VThUUmJ2QXdFWTdOaXVTVkZqZGMzVjBD?= =?utf-8?B?VE5kNnNvSXFYQWdZRGhVU0tZSmRMRThpMDlodGthbUJicHpxdXRndVNWcW8r?= =?utf-8?B?YVVRWVhkcHRzZmpZdEtpTXFGc3l6bHZydFQ1NGhDM2xQNXltaWRodnQ1WWRX?= =?utf-8?B?MnA3cUFKcjYxd3QxQzdRWDVJQmtSYTdjaGJIdEczNTZKMWdDT2RZcWlQdEhU?= =?utf-8?B?eGFlRHJPYi9Hd2hycmtuMzREZjE4cTNUUWNxMkpmNG9BVDVpK1R5b0lkdGRG?= =?utf-8?B?cml3RElFbWdKaHpXeE15NmY5NkQ3QTIxQ2llcmt2eG9zRmRHUVdGc29mQmV3?= =?utf-8?B?N05NY1orZ2tDTTNBYjRKQ2tBWEhoWW1uUnB4QnNKSTZFSmdiMWRRUEtrNy96?= =?utf-8?B?ZitKZDJSa2JHRUJqYkN5UEUxZXRzbmxRcld1dnhPWGI4K1hvcVRkR0p6VElE?= =?utf-8?B?ZUdzVWxKU2NlOWxCWWxYNXVTQUJDMDhhK2R4T0w1TEJSeE55Ujh3N0xYbHdx?= =?utf-8?B?ODB2a2wrWkREcDgwTndjMUNLNEhtZkVwRCt1b3Bxc1FVRitQT1FJR1I0TVhU?= =?utf-8?B?TXZHQit0SHVCeC8yQmM5Zkg2dFVaMGtXRExGKzMwYi96YnhrcjB0R1REQjZa?= =?utf-8?B?NXM2eFVxUGlVbkZ6cFZYb1A3RGUrc1ZhTXBIa2JRejB6UnhCNFJrZFJmSmd5?= =?utf-8?B?M2diV2tKVm5wdFAwbnBMM25ybEVkb2pwSGpCakVFN3pYbmhHTVNHcEI4MVk0?= =?utf-8?B?aEwyVW9LK0IwMmRuSkFXYnM0VXdWOStvNU1CMlRBdmx6US9VWk1raUo4cU5D?= =?utf-8?B?ZTFkZDBRVVRKNVFrK3lHdStmZnF2VCt4LzlyaGJJdTlTNm5sTkNhTSsyUlBz?= =?utf-8?B?QlBldTUzR20reVhFTW92V1F4WDQ3WWRJNkp0OEFJRHFWZ2IwS095cTB4UG9x?= =?utf-8?B?S203VVI5aUliMlBqQjVUNWgyTGUxQ0JBV3czVSt6Y3RTbHJNTTlpWi9DY2ZP?= =?utf-8?B?VEExOG9BNVFBS0pPa3laUnNoRk1GdGpmODlHWERWMXNWVkJSVW5uc090Rk5C?= =?utf-8?Q?XlWlXZS4ony3dcylJhN4SHZze4Z9GuFkIHl3U0n?= X-MS-Exchange-CrossTenant-Network-Message-Id: 42169875-a555-4ce1-b06a-08d994b69c49 X-MS-Exchange-CrossTenant-AuthSource: CO1PR11MB5012.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 21 Oct 2021 17:17:07.8640 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: vladimir.medvedkin@intel.com X-MS-Exchange-Transport-CrossTenantHeadersStamped: MW5PR11MB5858 X-OriginatorOrg: intel.com Subject: Re: [dpdk-dev] [PATCH v3 1/5] hash: add new toeplitz hash implementation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Konstantin, On 21/10/2021 11:42, Ananyev, Konstantin wrote: > >> This patch add a new Toeplitz hash implementation using >> Galios Fields New Instructions (GFNI). >> >> Signed-off-by: Vladimir Medvedkin >> --- >> doc/api/doxy-api-index.md | 1 + >> lib/hash/meson.build | 1 + >> lib/hash/rte_thash.c | 29 ++++++ >> lib/hash/rte_thash.h | 35 +++++++ >> lib/hash/rte_thash_gfni.h | 85 ++++++++++++++++ >> lib/hash/rte_thash_x86_gfni.h | 221 ++++++++++++++++++++++++++++++++++++++++++ >> lib/hash/version.map | 2 + >> 7 files changed, 374 insertions(+) >> create mode 100644 lib/hash/rte_thash_gfni.h >> create mode 100644 lib/hash/rte_thash_x86_gfni.h >> >> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md >> index 1992107..7549477 100644 >> --- a/doc/api/doxy-api-index.md >> +++ b/doc/api/doxy-api-index.md >> @@ -139,6 +139,7 @@ The public API headers are grouped by topics: >> [hash] (@ref rte_hash.h), >> [jhash] (@ref rte_jhash.h), >> [thash] (@ref rte_thash.h), >> + [thash_gfni] (@ref rte_thash_gfni.h), >> [FBK hash] (@ref rte_fbk_hash.h), >> [CRC hash] (@ref rte_hash_crc.h) >> >> diff --git a/lib/hash/meson.build b/lib/hash/meson.build >> index 9bc5ef9..40444ac 100644 >> --- a/lib/hash/meson.build >> +++ b/lib/hash/meson.build >> @@ -7,6 +7,7 @@ headers = files( >> 'rte_hash.h', >> 'rte_jhash.h', >> 'rte_thash.h', >> + 'rte_thash_gfni.h', >> ) >> indirect_headers += files('rte_crc_arm64.h') >> >> diff --git a/lib/hash/rte_thash.c b/lib/hash/rte_thash.c >> index 696a112..e605a6f 100644 >> --- a/lib/hash/rte_thash.c >> +++ b/lib/hash/rte_thash.c >> @@ -90,6 +90,35 @@ struct rte_thash_ctx { >> uint8_t hash_key[0]; >> }; >> >> +int >> +rte_thash_gfni_supported(void) >> +{ >> +#ifdef RTE_THASH_GFNI_DEFINED >> + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_GFNI) && >> + (rte_vect_get_max_simd_bitwidth() >= >> + RTE_VECT_SIMD_512)) >> + return 1; >> +#endif >> + >> + return 0; >> +}; >> + >> +void >> +rte_thash_complete_matrix(uint64_t *matrixes, const uint8_t *rss_key, int size) >> +{ >> + int i, j; >> + uint8_t *m = (uint8_t *)matrixes; >> + uint8_t left_part, right_part; >> + >> + for (i = 0; i < size; i++) { >> + for (j = 0; j < 8; j++) { >> + left_part = rss_key[i] << j; >> + right_part = (uint16_t)(rss_key[i + 1]) >> (8 - j); >> + m[i * 8 + j] = left_part|right_part; >> + } >> + } >> +} >> + >> static inline uint32_t >> get_bit_lfsr(struct thash_lfsr *lfsr) >> { >> diff --git a/lib/hash/rte_thash.h b/lib/hash/rte_thash.h >> index 76109fc..a406be0 100644 >> --- a/lib/hash/rte_thash.h >> +++ b/lib/hash/rte_thash.h >> @@ -28,6 +28,7 @@ extern "C" { >> #include >> #include >> #include >> +#include >> >> #if defined(RTE_ARCH_X86) || defined(__ARM_NEON) >> #include >> @@ -223,6 +224,40 @@ rte_softrss_be(uint32_t *input_tuple, uint32_t input_len, >> return ret; >> } >> >> +/** >> + * Indicates if GFNI implementations of the Toeplitz hash are supported. >> + * >> + * @warning >> + * @b EXPERIMENTAL: this API may change without prior notice. >> + * >> + * @return >> + * 1 if GFNI is supported >> + * 0 otherwise >> + */ >> +__rte_experimental >> +int >> +rte_thash_gfni_supported(void); >> + >> +/** >> + * Converts Toeplitz hash key (RSS key) into matrixes required >> + * for GFNI implementation >> + * >> + * @warning >> + * @b EXPERIMENTAL: this API may change without prior notice. >> + * >> + * @param matrixes >> + * pointer to the memory where matrices will be written. >> + * Note: the size of this memory must be equal to size * 8 >> + * @param rss_key >> + * pointer to the Toeplitz hash key >> + * @param size >> + * Size of the rss_key in bytes. >> + */ >> +__rte_experimental >> +void >> +rte_thash_complete_matrix(uint64_t *matrixes, const uint8_t *rss_key, >> + int size); >> + >> /** @internal Logarithm of minimum size of the RSS ReTa */ >> #define RTE_THASH_RETA_SZ_MIN 2U >> /** @internal Logarithm of maximum size of the RSS ReTa */ >> diff --git a/lib/hash/rte_thash_gfni.h b/lib/hash/rte_thash_gfni.h >> new file mode 100644 >> index 0000000..f59587f >> --- /dev/null >> +++ b/lib/hash/rte_thash_gfni.h >> @@ -0,0 +1,85 @@ >> +/* SPDX-License-Identifier: BSD-3-Clause >> + * Copyright(c) 2021 Intel Corporation >> + */ >> + >> +#ifndef _RTE_THASH_GFNI_H_ >> +#define _RTE_THASH_GFNI_H_ >> + >> +#ifdef __cplusplus >> +extern "C" { >> +#endif >> + >> +#ifdef RTE_ARCH_X86 >> + >> +#include >> + >> +#endif >> + >> +#ifndef RTE_THASH_GFNI_DEFINED >> + >> +/** >> + * Calculate Toeplitz hash. >> + * Dummy implementation. >> + * >> + * @warning >> + * @b EXPERIMENTAL: this API may change without prior notice. >> + * >> + * @param m >> + * Pointer to the matrices generated from the corresponding >> + * RSS hash key using rte_thash_complete_matrix(). >> + * @param tuple >> + * Pointer to the data to be hashed. Data must be in network byte order. >> + * @param len >> + * Length of the data to be hashed. >> + * @return >> + * Calculated Toeplitz hash value. >> + */ >> +__rte_experimental >> +static inline uint32_t >> +rte_thash_gfni(const uint64_t *mtrx __rte_unused, >> + const uint8_t *key __rte_unused, int len __rte_unused) >> +{ >> + RTE_LOG(ERR, HASH, "%s is undefined under given arch\n", __func__); > > One nit: as I can see from test report some compilation fails. > Probably we need to add #include to that file. > Apart from that, LGTM. > Acked-by: Konstantin Ananyev > Thanks, I'll send v4 > >> + return 0; >> +} >> + >> +/** >> + * Bulk implementation for Toeplitz hash. >> + * Dummy implementation. >> + * >> + * @warning >> + * @b EXPERIMENTAL: this API may change without prior notice. >> + * >> + * @param m >> + * Pointer to the matrices generated from the corresponding >> + * RSS hash key using rte_thash_complete_matrix(). >> + * @param tuple >> + * Array of the pointers on data to be hashed. >> + * Data must be in network byte order. >> + * @param len >> + * Length of the largest data buffer to be hashed. >> + * @param val >> + * Array of uint32_t where to put calculated Toeplitz hash values >> + * @param num >> + * Number of tuples to hash. >> + */ >> +__rte_experimental >> +static inline void >> +rte_thash_gfni_bulk(const uint64_t *mtrx __rte_unused, >> + int len __rte_unused, uint8_t *tuple[] __rte_unused, >> + uint32_t val[], uint32_t num) >> +{ >> + unsigned int i; >> + >> + RTE_LOG(ERR, HASH, "%s is undefined under given arch\n", __func__); >> + for (i = 0; i < num; i++) >> + val[i] = 0; >> +} >> + >> +#endif /* RTE_THASH_GFNI_DEFINED */ >> + >> +#ifdef __cplusplus >> +} >> +#endif >> + >> +#endif /* _RTE_THASH_GFNI_H_ */ >> diff --git a/lib/hash/rte_thash_x86_gfni.h b/lib/hash/rte_thash_x86_gfni.h >> new file mode 100644 >> index 0000000..faa340a >> --- /dev/null >> +++ b/lib/hash/rte_thash_x86_gfni.h >> @@ -0,0 +1,221 @@ >> +/* SPDX-License-Identifier: BSD-3-Clause >> + * Copyright(c) 2021 Intel Corporation >> + */ >> + >> +#ifndef _RTE_THASH_X86_GFNI_H_ >> +#define _RTE_THASH_X86_GFNI_H_ >> + >> +/** >> + * @file >> + * >> + * Optimized Toeplitz hash functions implementation >> + * using Galois Fields New Instructions. >> + */ >> + >> +#include >> + >> +#ifdef __cplusplus >> +extern "C" { >> +#endif >> + >> +#ifdef __GFNI__ >> +#define RTE_THASH_GFNI_DEFINED >> + >> +#define RTE_THASH_FIRST_ITER_MSK 0x0f0f0f0f0f0e0c08 >> +#define RTE_THASH_PERM_MSK 0x0f0f0f0f0f0f0f0f >> +#define RTE_THASH_FIRST_ITER_MSK_2 0xf0f0f0f0f0e0c080 >> +#define RTE_THASH_PERM_MSK_2 0xf0f0f0f0f0f0f0f0 >> +#define RTE_THASH_REWIND_MSK 0x0000000000113377 >> + >> +__rte_internal >> +static inline void >> +__rte_thash_xor_reduce(__m512i xor_acc, uint32_t *val_1, uint32_t *val_2) >> +{ >> + __m256i tmp_256_1, tmp_256_2; >> + __m128i tmp128_1, tmp128_2; >> + uint64_t tmp_1, tmp_2; >> + >> + tmp_256_1 = _mm512_castsi512_si256(xor_acc); >> + tmp_256_2 = _mm512_extracti32x8_epi32(xor_acc, 1); >> + tmp_256_1 = _mm256_xor_si256(tmp_256_1, tmp_256_2); >> + >> + tmp128_1 = _mm256_castsi256_si128(tmp_256_1); >> + tmp128_2 = _mm256_extracti32x4_epi32(tmp_256_1, 1); >> + tmp128_1 = _mm_xor_si128(tmp128_1, tmp128_2); >> + >> + tmp_1 = _mm_extract_epi64(tmp128_1, 0); >> + tmp_2 = _mm_extract_epi64(tmp128_1, 1); >> + tmp_1 ^= tmp_2; >> + >> + *val_1 = (uint32_t)tmp_1; >> + *val_2 = (uint32_t)(tmp_1 >> 32); >> +} >> + >> +__rte_internal >> +static inline __m512i >> +__rte_thash_gfni(const uint64_t *mtrx, const uint8_t *tuple, >> + const uint8_t *secondary_tuple, int len) >> +{ >> + __m512i permute_idx = _mm512_set_epi8(7, 6, 5, 4, 7, 6, 5, 4, >> + 6, 5, 4, 3, 6, 5, 4, 3, >> + 5, 4, 3, 2, 5, 4, 3, 2, >> + 4, 3, 2, 1, 4, 3, 2, 1, >> + 3, 2, 1, 0, 3, 2, 1, 0, >> + 2, 1, 0, -1, 2, 1, 0, -1, >> + 1, 0, -1, -2, 1, 0, -1, -2, >> + 0, -1, -2, -3, 0, -1, -2, -3); >> + >> + const __m512i rewind_idx = _mm512_set_epi8(0, 0, 0, 0, 0, 0, 0, 0, >> + 0, 0, 0, 0, 0, 0, 0, 0, >> + 0, 0, 0, 0, 0, 0, 0, 0, >> + 0, 0, 0, 0, 0, 0, 0, 0, >> + 0, 0, 0, 0, 0, 0, 0, 0, >> + 0, 0, 0, 59, 0, 0, 0, 59, >> + 0, 0, 59, 58, 0, 0, 59, 58, >> + 0, 59, 58, 57, 0, 59, 58, 57); >> + const __mmask64 rewind_mask = RTE_THASH_REWIND_MSK; >> + const __m512i shift_8 = _mm512_set1_epi8(8); >> + __m512i xor_acc = _mm512_setzero_si512(); >> + __m512i perm_bytes = _mm512_setzero_si512(); >> + __m512i vals, matrixes, tuple_bytes, tuple_bytes_2; >> + __mmask64 load_mask, permute_mask, permute_mask_2; >> + int chunk_len = 0, i = 0; >> + uint8_t mtrx_msk; >> + const int prepend = 3; >> + >> + for (; len > 0; len -= 64, tuple += 64) { >> + if (i == 8) >> + perm_bytes = _mm512_maskz_permutexvar_epi8(rewind_mask, >> + rewind_idx, perm_bytes); >> + >> + permute_mask = RTE_THASH_FIRST_ITER_MSK; >> + load_mask = (len >= 64) ? UINT64_MAX : ((1ULL << len) - 1); >> + tuple_bytes = _mm512_maskz_loadu_epi8(load_mask, tuple); >> + if (secondary_tuple) { >> + permute_mask_2 = RTE_THASH_FIRST_ITER_MSK_2; >> + tuple_bytes_2 = _mm512_maskz_loadu_epi8(load_mask, >> + secondary_tuple); >> + } >> + >> + chunk_len = __builtin_popcountll(load_mask); >> + for (i = 0; i < ((chunk_len + prepend) / 8); i++, mtrx += 8) { >> + perm_bytes = _mm512_mask_permutexvar_epi8(perm_bytes, >> + permute_mask, permute_idx, tuple_bytes); >> + >> + if (secondary_tuple) >> + perm_bytes = >> + _mm512_mask_permutexvar_epi8(perm_bytes, >> + permute_mask_2, permute_idx, >> + tuple_bytes_2); >> + >> + matrixes = _mm512_maskz_loadu_epi64(UINT8_MAX, mtrx); >> + vals = _mm512_gf2p8affine_epi64_epi8(perm_bytes, >> + matrixes, 0); >> + >> + xor_acc = _mm512_xor_si512(xor_acc, vals); >> + permute_idx = _mm512_add_epi8(permute_idx, shift_8); >> + permute_mask = RTE_THASH_PERM_MSK; >> + if (secondary_tuple) >> + permute_mask_2 = RTE_THASH_PERM_MSK_2; >> + } >> + } >> + >> + int rest_len = (chunk_len + prepend) % 8; >> + if (rest_len != 0) { >> + mtrx_msk = (1 << (rest_len % 8)) - 1; >> + matrixes = _mm512_maskz_loadu_epi64(mtrx_msk, mtrx); >> + if (i == 8) { >> + perm_bytes = _mm512_maskz_permutexvar_epi8(rewind_mask, >> + rewind_idx, perm_bytes); >> + } else { >> + perm_bytes = _mm512_mask_permutexvar_epi8(perm_bytes, >> + permute_mask, permute_idx, tuple_bytes); >> + >> + if (secondary_tuple) >> + perm_bytes = >> + _mm512_mask_permutexvar_epi8( >> + perm_bytes, permute_mask_2, >> + permute_idx, tuple_bytes_2); >> + } >> + >> + vals = _mm512_gf2p8affine_epi64_epi8(perm_bytes, matrixes, 0); >> + xor_acc = _mm512_xor_si512(xor_acc, vals); >> + } >> + >> + return xor_acc; >> +} >> + >> +/** >> + * Calculate Toeplitz hash. >> + * >> + * @warning >> + * @b EXPERIMENTAL: this API may change without prior notice. >> + * >> + * @param m >> + * Pointer to the matrices generated from the corresponding >> + * RSS hash key using rte_thash_complete_matrix(). >> + * @param tuple >> + * Pointer to the data to be hashed. Data must be in network byte order. >> + * @param len >> + * Length of the data to be hashed. >> + * @return >> + * Calculated Toeplitz hash value. >> + */ >> +__rte_experimental >> +static inline uint32_t >> +rte_thash_gfni(const uint64_t *m, const uint8_t *tuple, int len) >> +{ >> + uint32_t val, val_zero; >> + >> + __m512i xor_acc = __rte_thash_gfni(m, tuple, NULL, len); >> + __rte_thash_xor_reduce(xor_acc, &val, &val_zero); >> + >> + return val; >> +} >> + >> +/** >> + * Bulk implementation for Toeplitz hash. >> + * >> + * @warning >> + * @b EXPERIMENTAL: this API may change without prior notice. >> + * >> + * @param m >> + * Pointer to the matrices generated from the corresponding >> + * RSS hash key using rte_thash_complete_matrix(). >> + * @param tuple >> + * Array of the pointers on data to be hashed. >> + * Data must be in network byte order. >> + * @param len >> + * Length of the largest data buffer to be hashed. >> + * @param val >> + * Array of uint32_t where to put calculated Toeplitz hash values >> + * @param num >> + * Number of tuples to hash. >> + */ >> +__rte_experimental >> +static inline void >> +rte_thash_gfni_bulk(const uint64_t *mtrx, int len, uint8_t *tuple[], >> + uint32_t val[], uint32_t num) >> +{ >> + uint32_t i; >> + uint32_t val_zero; >> + __m512i xor_acc; >> + >> + for (i = 0; i != (num & ~1); i += 2) { >> + xor_acc = __rte_thash_gfni(mtrx, tuple[i], tuple[i + 1], len); >> + __rte_thash_xor_reduce(xor_acc, val + i, val + i + 1); >> + } >> + >> + if (num & 1) { >> + xor_acc = __rte_thash_gfni(mtrx, tuple[i], NULL, len); >> + __rte_thash_xor_reduce(xor_acc, val + i, &val_zero); >> + } >> +} >> + >> +#endif /* _GFNI_ */ >> + >> +#ifdef __cplusplus >> +} >> +#endif >> + >> +#endif /* _RTE_THASH_X86_GFNI_H_ */ >> diff --git a/lib/hash/version.map b/lib/hash/version.map >> index ce4309a..cecf922 100644 >> --- a/lib/hash/version.map >> +++ b/lib/hash/version.map >> @@ -39,10 +39,12 @@ EXPERIMENTAL { >> rte_hash_rcu_qsbr_add; >> rte_thash_add_helper; >> rte_thash_adjust_tuple; >> + rte_thash_complete_matrix; >> rte_thash_find_existing; >> rte_thash_free_ctx; >> rte_thash_get_complement; >> rte_thash_get_helper; >> rte_thash_get_key; >> + rte_thash_gfni_supported; >> rte_thash_init_ctx; >> }; >> -- >> 2.7.4 > -- Regards, Vladimir