From: Thomas Monjalon
To: Morten Brørup
Cc: Bruce Richardson <bruce.richardson@intel.com>, dev@dpdk.org, olivier.matz@6wind.com, andrew.rybchenko@oktetlabs.ru, honnappa.nagarahalli@arm.com, konstantin.v.ananyev@yandex.ru, mattias.ronnblom@ericsson.com
Subject: Re: [RFC] cache guard
Date: Fri, 01 Sep 2023 14:26:19 +0200
Message-ID: <2507011.Sgy9Pd6rRy@thomas>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D87B47@smartserver.smartshare.dk>
References: <98CBD80474FA8B44BF855DF32C47DC35D87B39@smartserver.smartshare.dk> <98CBD80474FA8B44BF855DF32C47DC35D87B47@smartserver.smartshare.dk>
List-Id: DPDK patches and discussions
27/08/2023 10:34, Morten Brørup:
> +CC Honnappa and Konstantin, Ring lib maintainers
> +CC Mattias, PRNG lib maintainer
> 
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Friday, 25 August 2023 11.24
> > 
> > On Fri, Aug 25, 2023 at 11:06:01AM +0200, Morten Brørup wrote:
> > > +CC mempool maintainers
> > > 
> > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > Sent: Friday, 25 August 2023 10.23
> > > > 
> > > > On Fri, Aug 25, 2023 at 08:45:12AM +0200, Morten Brørup wrote:
> > > > > Bruce,
> > > > > 
> > > > > With this patch [1], it is noted that the ring producer and consumer data should not be on adjacent cache lines, for performance reasons.
> > > > > 
> > > > > [1]: https://git.dpdk.org/dpdk/commit/lib/librte_ring/rte_ring.h?id=d9f0d3a1ffd4b66e75485cc8b63b9aedfbdfe8b0
> > > > > 
> > > > > (It's obvious that they cannot share the same cache line, because they are accessed by two different threads.)
> > > > > 
> > > > > Intuitively, I would think that having them on different cache lines would suffice. Why does having an empty cache line between them make a difference?
> > > > > 
> > > > > And does it need to be an empty cache line? Or does it suffice having the second structure start two cache lines after the start of the first structure (e.g. if the size of the first structure is two cache lines)?
> > > > > 
> > > > > I'm asking because the same principle might apply to other code too.
> > > > > 
> > > > Hi Morten,
> > > > 
> > > > this was something we discovered when working on the distributor library. If we have cachelines per core where there is heavy access, having some cachelines as a gap between the content cachelines can help performance. We believe this helps due to avoiding issues with the HW prefetchers (e.g. the adjacent cacheline prefetcher) bringing in the second cacheline speculatively when an operation is done on the first line.
> > > 
> > > I guessed that it had something to do with speculative prefetching, but wasn't sure. Good to get confirmation, and that it has a measurable effect somewhere. Very interesting!
> > > 
> > > NB: More comments in the ring lib about stuff like this would be nice.
> > > 
> > > So, for the mempool lib, what do you think about applying the same technique to the rte_mempool_debug_stats structure (which is an array indexed per lcore)? Two adjacent lcores heavily accessing their local mempool caches seems likely to me. But how heavy does the access need to be for this technique to be relevant?
> > > 
> > No idea how heavy the accesses need to be for this to have a noticeable effect. For things like debug stats, I wonder how worthwhile making such a change would be, but then again, any change would have very low impact too in that case.
> 
> I just tried adding padding to some of the hot structures in our own application, and observed a significant performance improvement for those.
> 
> So I think this technique should have higher visibility in DPDK by adding a new cache macro to rte_common.h:

+1 to making this more visible in the docs and adding a macro, good idea!
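[Editor's note: for illustration, a guard macro along the lines proposed above might look like the sketch below. This is a minimal sketch, not the actual rte_common.h API: CACHE_GUARD, CACHE_GUARD_LINES and struct ring_like are illustrative names, and the macro eventually merged may differ in name, arguments and default guard size. It assumes only RTE_CACHE_LINE_SIZE and __rte_cache_aligned as DPDK already provides them.]

#include <stdint.h>
#include <rte_common.h> /* RTE_CACHE_LINE_SIZE, __rte_cache_aligned */

/*
 * Number of empty cache lines used as a guard between hot, per-thread
 * data. One line keeps an adjacent-cacheline prefetcher off a
 * neighbour's data; CPUs that prefetch further ahead may need more.
 * The value here is illustrative.
 */
#define CACHE_GUARD_LINES 1

/*
 * Insert empty, cache-aligned padding lines into a structure. The _name
 * argument only makes the padding field's name unique within the
 * enclosing struct; the field itself is never accessed.
 */
#define CACHE_GUARD(_name) \
	char cache_guard_ ## _name[RTE_CACHE_LINE_SIZE * CACHE_GUARD_LINES] \
		__rte_cache_aligned

/*
 * Example layout in the spirit of the rte_ring commit referenced above:
 * producer and consumer metadata sit not only on separate cache lines,
 * but with a guard line between them, so a speculative prefetch
 * triggered by one thread does not pull in the line owned by the other.
 */
struct ring_like {
	uint32_t size;
	uint32_t mask;

	CACHE_GUARD(before_prod);
	struct { volatile uint32_t head, tail; } prod __rte_cache_aligned;

	CACHE_GUARD(before_cons);
	struct { volatile uint32_t head, tail; } cons __rte_cache_aligned;

	CACHE_GUARD(after_cons);
};

[The same pattern would apply to per-lcore arrays such as the mempool debug stats discussed above: padding each element out to a whole number of cache lines plus a guard line keeps adjacent lcores' hot entries out of each other's prefetch reach.]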