Date: Mon, 23 May 2022 17:37:19 +0100
From: Bruce Richardson
To: Timothy McDaniel
Cc: jerinj@marvell.com, dev@dpdk.org, Kent Wires
Subject: Re: [PATCH v4] event/dlb2: add support for single 512B write of 4 QEs
References: <20220409151849.1007602-1-timothy.mcdaniel@intel.com> <20220523160955.3890850-1-timothy.mcdaniel@intel.com>
In-Reply-To: <20220523160955.3890850-1-timothy.mcdaniel@intel.com>

On Mon, May 23, 2022 at 11:09:55AM -0500, Timothy McDaniel wrote:
> On Xeon, as 512b accesses are available, movdir64 instruction is able to
> perform 512b read and write to DLB producer port. In order for movdir64
> to be able to pull its data from store buffers (store-buffer-forwarding)
> (before actual write), data should be in single 512b write format.
> This commit add change when code is built for Xeon with 512b AVX support
> to make single 512b write of all 4 QEs instead of 4x64b writes.
>
> Signed-off-by: Timothy McDaniel
> Acked-by: Kent Wires
> ===
>
> Changes since V3:
> 1) Renamed dlb2_noavx512.c to dlb2_sve.c, and fixed up meson.build
> for new file name.
>
> Changes since V1:
> 1) Split out dlb2_event_build_hcws into two implementations, one
> that uses AVX512 instructions, and one that does not. Each implementation
> is in its own source file in order to avoid build errors if the compiler
> does not support the newer AVX512 instructions.
> 2) Update meson.build to and pull in appropriate source file based on
> whether the compiler supports AVX512VL
> 3) Check if target supports AVX512VL, and use appropriate implementation
> based on this runtime check.
> ---
>  drivers/event/dlb2/dlb2.c        | 206 +-----------------------
>  drivers/event/dlb2/dlb2_avx512.c | 267 +++++++++++++++++++++++++++++++
>  drivers/event/dlb2/dlb2_priv.h   |   8 +
>  drivers/event/dlb2/dlb2_sve.c    | 219 +++++++++++++++++++++++++
>  drivers/event/dlb2/meson.build   |  14 ++
>  5 files changed, 513 insertions(+), 201 deletions(-)
>  create mode 100644 drivers/event/dlb2/dlb2_avx512.c
>  create mode 100644 drivers/event/dlb2/dlb2_sve.c
>
> diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
> index 36f07d0061..ac7572a28d 100644
> --- a/drivers/event/dlb2/dlb2.c
> +++ b/drivers/event/dlb2/dlb2.c
> @@ -1834,6 +1834,11 @@ dlb2_eventdev_port_setup(struct rte_eventdev *dev,
>
>  	dev->data->ports[ev_port_id] = &dlb2->ev_ports[ev_port_id];
>
> +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
> +		ev_port->qm_port.use_avx512 = true;
> +	else
> +		ev_port->qm_port.use_avx512 = false;
> +
>  	return 0;
>  }
>

One additional comment on this runtime check: you should also check
max_simd_bitwidth in DPDK, i.e. the value specified with the
--force-max-simd-bitwidth EAL argument or set programmatically by the
app. This gives the user runtime control over when the various
instruction sets get used, and it is also very useful for testing and
debugging the different code paths.

/Bruce
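
To illustrate the suggestion, here is a minimal sketch of the combined check. It is not the driver code. To keep it buildable without DPDK, the CPU flag and the EAL limit are modelled by a hypothetical struct cpu_env, and struct qm_port is a reduced stand-in for the driver's port structure. In the driver itself the two inputs would presumably come from rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL) and rte_vect_get_max_simd_bitwidth(), compared against RTE_VECT_SIMD_512.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the value of RTE_VECT_SIMD_512 from rte_vect.h. */
#define SIMD_512 512

/* Hypothetical model of the two runtime inputs. In DPDK these would be
 * rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL) and
 * rte_vect_get_max_simd_bitwidth(). */
struct cpu_env {
	bool has_avx512vl;          /* CPU reports AVX512VL */
	uint16_t max_simd_bitwidth; /* limit set via EAL argument or by the app */
};

/* Reduced stand-in for the driver's per-port state. */
struct qm_port {
	bool use_avx512;
};

/* Use the AVX512 path only if the CPU supports it AND the user has not
 * capped the SIMD bitwidth below 512. */
bool
select_avx512(const struct cpu_env *env)
{
	return env->has_avx512vl && env->max_simd_bitwidth >= SIMD_512;
}

/* Records the decision once at port setup, as the patch does. */
void
port_setup(struct qm_port *port, const struct cpu_env *env)
{
	assert(port != NULL && env != NULL);
	port->use_avx512 = select_avx512(env);
}
```

With a check of this shape, running with --force-max-simd-bitwidth=256 would select the non-AVX512 implementation even on an AVX512VL-capable CPU, which makes both code paths testable on the same machine.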