Date: Mon, 23 May 2022 17:55:07 +0100
From: Bruce Richardson
To: "McDaniel, Timothy"
Cc: "jerinj@marvell.com", "dev@dpdk.org", "Wires, Kent"
Subject: Re: [PATCH v4] event/dlb2: add support for single 512B write of 4 QEs
References: <20220409151849.1007602-1-timothy.mcdaniel@intel.com>
 <20220523160955.3890850-1-timothy.mcdaniel@intel.com>
List-Id: DPDK patches and discussions

On Mon, May 23, 2022 at 05:52:06PM +0100, McDaniel, Timothy wrote:
>
>
> > -----Original Message-----
> > From: Richardson, Bruce
> > Sent: Monday, May 23, 2022 11:34 AM
> > To: McDaniel, Timothy
> > Cc: jerinj@marvell.com; dev@dpdk.org; Wires, Kent
> > Subject: Re: [PATCH v4] event/dlb2: add support for single 512B write of 4 QEs
> >
> > On Mon, May 23, 2022 at 11:09:55AM -0500, Timothy McDaniel wrote:
> > > On Xeon, as 512b accesses are available, movdir64 instruction is able to
> > > perform 512b read and write to DLB producer port. In order for movdir64
> > > to be able to pull its data from store buffers (store-buffer-forwarding)
> > > (before actual write), data should be in single 512b write format.
> > > This commit add change when code is built for Xeon with 512b AVX support
> > > to make single 512b write of all 4 QEs instead of 4x64b writes.
> > >
> > > Signed-off-by: Timothy McDaniel
> > > Acked-by: Kent Wires
> > > ===
> > >
> > > Changes since V3:
> > > 1) Renamed dlb2_noavx512.c to dlb2_sve.c, and fixed up meson.build
> > > for new file name.
> > >
> > > Changes since V1:
> > > 1) Split out dlb2_event_build_hcws into two implementations, one
> > > that uses AVX512 instructions, and one that does not. Each implementation
> > > is in its own source file in order to avoid build errors if the compiler
> > > does not support the newer AVX512 instructions.
> > > 2) Update meson.build to and pull in appropriate source file based on
> > > whether the compiler supports AVX512VL
> > > 3) Check if target supports AVX512VL, and use appropriate implementation
> > > based on this runtime check.
> > > ---
> > >  drivers/event/dlb2/dlb2.c        | 206 +----------------------
> > >  drivers/event/dlb2/dlb2_avx512.c | 267 +++++++++++++++++++++++++++++++
> > >  drivers/event/dlb2/dlb2_priv.h   |   8 +
> > >  drivers/event/dlb2/dlb2_sve.c    | 219 +++++++++++++++++++++++++
> > >  drivers/event/dlb2/meson.build   |  14 ++
> > >  5 files changed, 513 insertions(+), 201 deletions(-)
> > >  create mode 100644 drivers/event/dlb2/dlb2_avx512.c
> > >  create mode 100644 drivers/event/dlb2/dlb2_sve.c
> > >
> >
> > > diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build
> > > index f963589fd3..0ad4d31785 100644
> > > --- a/drivers/event/dlb2/meson.build
> > > +++ b/drivers/event/dlb2/meson.build
> > > @@ -19,6 +19,20 @@ sources = files(
> > >      'dlb2_selftest.c',
> > >  )
> > >
> > > +dlb2_avx512_support = false
> > > +
> > > +if dpdk_conf.has('RTE_ARCH_X86_64')
> > > +    dlb2_avx512_support = (
> > > +        cc.get_define('__AVX512VL__', args: machine_args) != ''
> > > +    )
> > > +endif
> > > +
> > > +if dlb2_avx512_support == true
> > > +    sources += files('dlb2_avx512.c')
> > > +else
> > > +    sources += files('dlb2_sve.c')
> > > +endif
> > > +
> > >  headers = files('rte_pmd_dlb2.h')
> > >
> > >  deps += ['mbuf', 'mempool', 'ring', 'pci', 'bus_pci']
> >
> > I believe this can be improved upon further, since it still does not allow
> > a generic build to opportunistically use the AVX-512 code path.
>
> What does this mean - " generic build to opportunistically use the AVX-512 code path"
>
> > It also
> > makes the runtime check largely pointless as the whole build will have been
> > done with global AVX-512 support, meaning that the binary likely will fail
> > to run if AVX-512 is not available.
>
> If built for avx512, then that build supports using either avx512, or not.
>

No, if built for AVX-512, then the compiler can use AVX-512 instructions
anywhere in the binary, so that build can only run on AVX-512 supporting
systems.

> >
> > Instead, I'd recommend doing as other places in DPDK - such as in ACL
> > library, or i40e or ice net drivers - where we not only check the current
> > build support, but also check the compiler support. That way, even if we
> > are building for e.g. a target of AVX2, we can still build the AVX-512
> > parts using the appropriate compiler flags, and choose them
> > opportunistically at runtime.
>
> I do not understand what you are getting at here.
>

Check out net/i40e/meson.build and hopefully things may become clearer.

/Bruce
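
The runtime-selection idea discussed in this thread could be sketched roughly
as below. This is only an illustration under stated assumptions, not code from
the patch or from i40e: the names write_qes_generic, write_qes_wide,
cpu_has_avx512vl and select_write_qes are hypothetical, the buffer sizes are
illustrative, and the "wide" routine is a plain memcpy stand-in for code that
would normally live in its own source file built with AVX-512 compiler flags.
It uses the GCC/clang builtin __builtin_cpu_supports() for the CPU check;
inside DPDK one would more likely use rte_cpu_get_flag_enabled().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QE_SIZE 16 /* illustrative entry size in bytes (assumption) */
#define NUM_QES 4  /* entries written together */

typedef void (*write_qes_fn)(unsigned char *dst, const unsigned char *src);

/* Portable path: one store per entry. */
static void write_qes_generic(unsigned char *dst, const unsigned char *src)
{
	for (size_t i = 0; i < NUM_QES; i++)
		memcpy(dst + i * QE_SIZE, src + i * QE_SIZE, QE_SIZE);
}

/*
 * Stand-in for the wide path. In a real build this function would be
 * compiled separately with AVX-512 flags, so that the rest of the
 * binary stays runnable on CPUs without AVX-512.
 */
static void write_qes_wide(unsigned char *dst, const unsigned char *src)
{
	memcpy(dst, src, QE_SIZE * NUM_QES);
}

/* Runtime check of the CPU the binary is actually running on. */
static int cpu_has_avx512vl(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	__builtin_cpu_init();
	return __builtin_cpu_supports("avx512vl") != 0;
#else
	return 0; /* non-x86 or unknown compiler: use the portable path */
#endif
}

/* Pick an implementation once, e.g. at port setup time. */
static write_qes_fn select_write_qes(void)
{
	write_qes_fn fn = cpu_has_avx512vl() ? write_qes_wide
					     : write_qes_generic;
	assert(fn != NULL);
	return fn;
}
```

A caller would invoke select_write_qes() once and keep the returned pointer.
Both routines produce the same bytes, so the choice only affects how the
stores are issued, not the result.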