From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 5644943FEC; Fri, 10 May 2024 06:56:07 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 257CC402D1; Fri, 10 May 2024 06:56:07 +0200 (CEST) Received: from mail-pl1-f176.google.com (mail-pl1-f176.google.com [209.85.214.176]) by mails.dpdk.org (Postfix) with ESMTP id CB02B4025F for ; Fri, 10 May 2024 06:56:05 +0200 (CEST) Received: by mail-pl1-f176.google.com with SMTP id d9443c01a7336-1eecc71311eso13475975ad.3 for ; Thu, 09 May 2024 21:56:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20230601.gappssmtp.com; s=20230601; t=1715316965; x=1715921765; darn=dpdk.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=HWcstqFG4LzfryFitKoq0L0ulJW5AWPjQicnrtpytI0=; b=LGiHF0JAowROA5ZoyNIVluf6+uVbRNuhwXh1JSlQqCzvmMZkaWrq6vpswin+GzN20Y NWrDrPZuac0E4gDNV76f50/s9ZjkwOvsoo8j/5c5RtpBtFEAdjwhOF8O+gxtfN+oI7DG pcfmhRps5mAtGYl/DvDglKHZtbFYXkhC4mnm+7zv6wWa/eMwPPfNe4RgY2/uxJc2dqHC ysl+wU5uj4g5kAjXTZgAxCO2OADcRISz7XIcr0SMDnL0OMXlrFoeEyAYuHeBxdt+8Rne D2ioDd+cWfdaLx95krhJ2L2yruTaMN42xVmrR9x/QIi/Oq5PcP8dr0jAaxYe5LV/MqlO dxww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1715316965; x=1715921765; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=HWcstqFG4LzfryFitKoq0L0ulJW5AWPjQicnrtpytI0=; b=j0R/ym5ohWwjugvgZjUarJ7dRT/dMX2UID8JxFFKSrkWBQRlZ9ks6O4yea+WLw+7oY KQIIBVPVnXn055LqFcZVbf87lxENFrwOrER0WpgjzZn6QH+Sktqki7eKrAy8eYIJu+jp aKlKCZtNuzfCwQso1sIHvsPBfHVgzRVCcgiY3PNLYYqWT6VNPGxZw7y+dQ4GUFumjdB4 Ej7kp46etzY1R39hKrSxRf3/gsY7S7mCFsz3ObJKpAAwi7+y4/lhqgrY0SUYbUEbgswH ARobvOIjhZq2Szv8udDMgrh0fo6y/oYOPclbQMWnStZbyYooicH3nVLAugAVpfRwcwuu tumA== X-Forwarded-Encrypted: i=1; AJvYcCXxNrGcJ3EwPYICMX/sEBp14DUyVzGIQ41L3qY23au0V1R3XSwz0hsYUJLtHop1Kip4gLm4mXsKhe/ohLA= X-Gm-Message-State: AOJu0Yw21BJIXsr1iWjS/zvFcCmgkIvVzz2me9riFOgOASBE8/SNntR+ 7GfJslwGvA5OjQDfeAuAboO3JBtoRrkWTc4/rHTqEPVJaWXcgha74m51yxhmfqM= X-Google-Smtp-Source: AGHT+IEFrxy7yDRb3kaZ/5iKQjp0baDW08WnfWJcNUhtLLYPeOZRNPNlrIkSC7BREjdQbeuD/gxIQg== X-Received: by 2002:a17:902:b70a:b0:1e2:8841:8d67 with SMTP id d9443c01a7336-1ef43d2da8cmr17929845ad.32.1715316964843; Thu, 09 May 2024 21:56:04 -0700 (PDT) Received: from hermes.local (204-195-96-226.wavecable.com. [204.195.96.226]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1ef0c139302sm22963935ad.270.2024.05.09.21.56.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 09 May 2024 21:56:04 -0700 (PDT) Date: Thu, 9 May 2024 21:56:02 -0700 From: Stephen Hemminger To: Morten =?UTF-8?B?QnLDuHJ1cA==?= Cc: "Bruce Richardson" , "Ferruh Yigit" , Mattias =?UTF-8?B?UsO2bm5ibG9t?= , "John W. Linville" , "Thomas Monjalon" , , Mattias =?UTF-8?B?UsO2bm5ibG9t?= Subject: Re: [RFC v3] net/af_packet: make stats reset reliable Message-ID: <20240509215602.7e03f34d@hermes.local> In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9F433@smartserver.smartshare.dk> References: <20240425174617.2126159-1-ferruh.yigit@amd.com> <20240503154547.392069-1-ferruh.yigit@amd.com> <20240503150011.55681b97@hermes.local> <432fb280-96e3-4079-b987-990d509b8c79@lysator.liu.se> <20240508082339.6bf74202@hermes.local> <834102e5-8f4b-4b67-9f62-8e8f61403b39@amd.com> <20240508135418.55505105@hermes.local> <98CBD80474FA8B44BF855DF32C47DC35E9F431@smartserver.smartshare.dk> <98CBD80474FA8B44BF855DF32C47DC35E9F432@smartserver.smartshare.dk> <98CBD80474FA8B44BF855DF32C47DC35E9F433@smartserver.smartshare.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Thu, 9 May 2024 16:19:08 +0200 Morten Br=C3=B8rup wrote: > > From: Morten Br=C3=B8rup [mailto:mb@smartsharesystems.com] > > Sent: Thursday, 9 May 2024 13.37 > > =20 > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com] > > > Sent: Thursday, 9 May 2024 11.30 > > > > > > On Thu, May 09, 2024 at 09:43:16AM +0200, Morten Br=C3=B8rup wrote: = =20 > > > > > From: Stephen Hemminger [mailto:stephen@networkplumber.org] > > > > > Sent: Wednesday, 8 May 2024 22.54 > > > > > > > > > > On Wed, 8 May 2024 20:48:06 +0100 > > > > > Ferruh Yigit wrote: > > > > > =20 > > > > > > > > > > > > > > The idea of load tearing is crazy talk of integral types. It = =20 > > > would =20 > > > > > break so many things. =20 > > > > > > > It is the kind of stupid compiler thing that would send Linus= =20 > > on =20 > > > a =20 > > > > > rant and get =20 > > > > > > > the GCC compiler writers in trouble. > > > > > > > > > > > > > > The DPDK has always favored performance over strict safety =20 > > guard =20 > > > > > rails everywhere. =20 > > > > > > > Switching to making every statistic an atomic operation is no= t =20 > > > in =20 > > > > > the spirit of =20 > > > > > > > what is required. There is no strict guarantee necessary here. > > > > > > > =20 > > > > > > > > > > > > I kind of agree with Stephen. > > > > > > > > > > > > Thanks Mattias, Morten & Stephen, it was informative discussion= . =20 > > > But =20 > > > > > for =20 > > > > > > *SW drivers* stats update and reset is not core functionality = =20 > > and =20 > > > I =20 > > > > > > think we can be OK to get hit on corner cases, instead of > > > > > > over-engineering or making code more complex. =20 > > > > > > > > > > > > > > > I forgot the case of 64 bit values on 32 bit platforms! > > > > > Mostly because haven't cared about 32 bit for years... > > > > > > > > > > The Linux kernel uses some wrappers to handle this. > > > > > On 64 bit platforms they become noop. > > > > > On 32 bit platform, they are protected by a seqlock and updates = =20 > > are =20 > > > > > wrapped by the sequence count. > > > > > > > > > > If we go this way, then doing similar Noop on 64 bit and atomic or > > > > > seqlock > > > > > on 32 bit should be done, but in common helper. > > > > > > > > > > Looking inside FreeBSD, it looks like that has changed over the = =20 > > > years as =20 > > > > > well. > > > > > > > > > > if_inc_counter > > > > > counter_u64_add > > > > > atomic_add_64 > > > > > But the counters are always per-cpu in this case. So although it = =20 > > > does =20 > > > > > use > > > > > locked operation, will always be uncontended. > > > > > > > > > > > > > > > PS: Does DPDK still actually support 32 bit on x86? Can it be =20 > > > dropped =20 > > > > > this cycle? =20 > > > > > > > > We cannot drop 32 bit architecture support altogether. > > > > > > > > But, unlike the Linux kernel, DPDK doesn't need to support ancient = =20 > > 32 =20 > > > bit architectures. =20 > > > > If the few 32 bit architectures supported by DPDK provide non- =20 > > tearing =20 > > > 64 bit loads/stores, we don't need locks (in the fast path) for 64 bit > > > counters. =20 > > > > > > > > In addition to 32 bit x86, DPDK supports ARMv7-A (a 32 bit =20 > > > architecture) and 32 bit ARMv8. =20 > > > > I don't think DPDK support any other 32 bit architectures. > > > > > > > > > > > > As Mattias mentioned, 32 bit x86 can use xmm registers to provide 6= 4 =20 > > > bit non-tearing load/store. =20 > > > > =20 > > > > > > Testing this a little in godbolt, I see gcc using xmm registers on 32- > > > bit > > > when updating 64-bit counters, but clang doesn't seem to do so, but > > > instead > > > does 2 stores when writing back the 64 value. (I tried with both > > > volatile > > > and non-volatile 64-bit values, just to see if volatile would =20 > > encourage =20 > > > clang to do a single store). > > > > > > GCC: https://godbolt.org/z/9eqKfT3hz > > > Clang: https://godbolt.org/z/PT5EqKn4c =20 > >=20 > > Interesting. > > I guess this can be fixed by manually implementing what GCC does. > >=20 > > I'm more concerned about finding a high-performance (in the fast path) > > 64 bit counter solution for 32 bit ARM. =20 >=20 > Reading up on the topic, and continuing Bruce's experiment on Godbolt, it= is possible on 32 bit ARMv7-A too, using LDRD/STRD (Load/Store Register Du= al) instructions, which load/store 64 bit from memory into two registers at= once. >=20 > Clang is emits more efficient code without volatile. > GCC requires volatile to use STRD. >=20 > Clang: https://godbolt.org/z/WjdTq6EKh > GCC: https://godbolt.org/z/qq9j7d4Ea >=20 > Summing it up, it is possible to implement non-tearing 64 bit high-perfor= mance (lockless, barrier-free) counters on the 32 bit architectures support= ed by DPDK. >=20 > But the implementation is both architecture and compiler specific. > So it seems a "64 bit counters" library would be handy. (Or a "non-tearin= g 64 bit integers" library, for support of the signed variant too; but I do= n't think we need that.) > We can use uint64_t as the underlying type and type cast in the library (= where needed by the specific compiler/architecture), or introduce a new rte= _ctr64_t type to ensure that accessor functions are always used and the dev= eloper doesn't have to worry about tearing on 32 bit architectures. >=20 > The most simple variant of such a library only provides load and store fu= nctions. The API would look like this: >=20 > uint64_t inline > rte_ctr64_get(const rte_ctr64_t *const ctr); >=20 > void inline > rte_ctr64_set(rte_ctr64_t *const ctr, const uint64_t value); >=20 > And if some CPU offers special instructions for increment or addition, fa= ster (regarding performance) and/or more compact (regarding instruction mem= ory) than a sequence of load-add-store instructions: >=20 > void inline > rte_ctr64_inc(rte_ctr64_t *const ctr); >=20 > void inline > rte_ctr64_add(rte_ctr64_t *const ctr, const uint64_t value); >=20 > > And perhaps atomic variants of all these functions, with explicit and/or = relaxed memory ordering, for counters shared by multiple threads. > >=20 This kind of what I am experimenting with but... Intend to keep the details of the counters read and update in one file and = not as inlines.