From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 10 Jun 2022 17:15:55 +0100
From: Bruce Richardson
To: Timothy McDaniel
Cc: jerinj@marvell.com, dev@dpdk.org, Kent Wires
Subject: Re: [PATCH v7] event/dlb2: add support for single 512B write of 4 QEs
Message-ID:
References: <20220409151849.1007602-1-timothy.mcdaniel@intel.com>
 <20220610154125.2712367-1-timothy.mcdaniel@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20220610154125.2712367-1-timothy.mcdaniel@intel.com>
List-Id: DPDK patches and discussions

On Fri, Jun 10, 2022 at 10:41:25AM -0500, Timothy McDaniel wrote:
> On Xeon, 512b accesses are available, so movdir64 instruction is able to
> perform 512b read and write to DLB producer port. In order for movdir64
> to be able to pull its data from store buffers (store-buffer-forwarding)
> (before actual write), data should be in single 512b write format.
> This commit add change when code is built for Xeon with 512b AVX support
> to make single 512b write of all 4 QEs instead of 4x64b writes.
>
> Signed-off-by: Timothy McDaniel
> Acked-by: Kent Wires
> ===
>
> Changes since V6:
> 1) Check for AVX512VL only, removing checks for other
> AVX512 flags in meson.build
> 2) rename dlb2_sve.c to dlb2_sse.c
>
> Changes since V5:
> No code changes - just added --in-reply-to and copied Bruce
>
> Changes since V4:
> 1) Add build-time control for avx512 support to meson.buildi, based
> on implementation found in lib/acl/meson.build
> 2) Add rte_vect_get_max_simd_bitwidth runtime check before using
> avx512 instructions
>
> Changes since V3:
> 1) Renamed dlb2_noavx512.c to dlb2_sve.c, and fixed up meson.build
> for new file name.
>
> Changes since V1:
> 1) Split out dlb2_event_build_hcws into two implementations, one
> that uses AVX512 instructions, and one that does not. Each implementation
> is in its own source file in order to avoid build errors if the compiler
> does not support the newer AVX512 instructions.
> 2) Update meson.build to and pull in appropriate source file based on
> whether the compiler supports AVX512VL
> 3) Check if target supports AVX512VL, and use appropriate implementation
> based on this runtime check.
> ---
> drivers/event/dlb2/dlb2.c        | 208 +-----------------------
> drivers/event/dlb2/dlb2_avx512.c | 267 +++++++++++++++++++++++++++++++
> drivers/event/dlb2/dlb2_priv.h   |  10 ++
> drivers/event/dlb2/dlb2_sse.c    | 219 +++++++++++++++++++++++++
> drivers/event/dlb2/dlb2_sve.c    | 219 +++++++++++++++++++++++++
> drivers/event/dlb2/meson.build   |  47 ++++++
> 6 files changed, 769 insertions(+), 201 deletions(-)
> create mode 100644 drivers/event/dlb2/dlb2_avx512.c
> create mode 100644 drivers/event/dlb2/dlb2_sse.c
> create mode 100644 drivers/event/dlb2/dlb2_sve.c
>
> diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build
> index f963589fd3..51ea5ec546 100644
> --- a/drivers/event/dlb2/meson.build
> +++ b/drivers/event/dlb2/meson.build
> @@ -19,6 +19,53 @@ sources = files(
> 'dlb2_selftest.c',
> )
>
> +# compile AVX512 version if:
> +# we are building 64-bit binary (checked above) AND binutils
> +# can generate proper code
> +
> +if binutils_ok
> +
> + # compile AVX512 version if either:
> + # a. we have AVX512VL supported in minimum instruction set
> + # baseline
> + # b. it's not minimum instruction set, but supported by
> + # compiler
> + #
> + # in former case, just add avx512 C file to files list
> + # in latter case, compile c file to static lib, using correct
> + # compiler flags, and then have the .o file from static lib
> + # linked into main lib.
> +
> + # check if all required flags already enabled (variant a).
> + dlb2_avx512_on = false
> + if cc.get_define(f, args: machine_args) == '__AVX512VL__'
> + dlb2_avx512_on = true
> + endif
> +
> + if dlb2_avx512_on == true
> +
> + sources += files('dlb2_avx512.c')
> + cflags += '-DCC_AVX512_SUPPORT'
> +
> + elif cc.has_multi_arguments('-mavx512f', '-mavx512vl',
> + '-mavx512cd', '-mavx512bw')
> +
> + cflags += '-DCC_AVX512_SUPPORT'
> + avx512_tmplib = static_library('avx512_tmp',
> + 'dlb2_avx512.c',
> + dependencies: [static_rte_eal,
> + static_rte_eventdev],

Minor nit - incorrect whitespace

> + c_args: cflags +
> + ['-mavx512f', '-mavx512vl',
> + '-mavx512cd', '-mavx512bw'])
> + objs += avx512_tmplib.extract_objects('dlb2_avx512.c')
> + else
> + sources += files('dlb2_sse.c')
> + endif
> +else
> + sources += files('dlb2_sse.c')
> +endif
> +
> headers = files('rte_pmd_dlb2.h')
>
> deps += ['mbuf', 'mempool', 'ring', 'pci', 'bus_pci']

These meson.build file changes look ok to me now.

/Bruce
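
A rough sketch of how the build-time gate and the run-time check described in
the changelog could fit together on the C side. This is a standalone
illustration under assumptions, not the driver code: build_hcws_sse,
build_hcws_avx512 and select_build_hcws are hypothetical names, and the two
integer parameters stand in for a CPU-flag query and for the value returned by
rte_vect_get_max_simd_bitwidth. Only the CC_AVX512_SUPPORT macro comes from
the meson.build hunk above.

```c
/* Hypothetical stand-ins for the two dlb2_event_build_hcws variants. */
static const char *build_hcws_sse(void)
{
	return "sse";
}

#ifdef CC_AVX512_SUPPORT
/* Only compiled when the build system decided AVX512 code can be built. */
static const char *build_hcws_avx512(void)
{
	return "avx512";
}
#endif

typedef const char *(*build_hcws_fn)(void);

/*
 * Pick an implementation once, at init time.
 * cpu_has_avx512vl: assumed result of a run-time CPU flag query.
 * max_simd_bitwidth: assumed run-time SIMD width limit, in bits.
 */
static build_hcws_fn select_build_hcws(int cpu_has_avx512vl,
				       int max_simd_bitwidth)
{
#ifdef CC_AVX512_SUPPORT
	if (cpu_has_avx512vl && max_simd_bitwidth >= 512)
		return build_hcws_avx512;
#else
	/* AVX512 variant was not built; the run-time inputs are unused. */
	(void)cpu_has_avx512vl;
	(void)max_simd_bitwidth;
#endif
	return build_hcws_sse;
}
```

With this shape, select_build_hcws(1, 512) returns the AVX512 variant only
when the file is compiled with -DCC_AVX512_SUPPORT; in every other case the
SSE variant is returned, so the non-AVX512 build never references the AVX512
symbol.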