From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 595B944098; Wed, 22 May 2024 17:33:16 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id CD9E8402C2; Wed, 22 May 2024 17:33:15 +0200 (CEST) Received: from mail-pj1-f45.google.com (mail-pj1-f45.google.com [209.85.216.45]) by mails.dpdk.org (Postfix) with ESMTP id DEFD4400D6 for ; Wed, 22 May 2024 17:33:13 +0200 (CEST) Received: by mail-pj1-f45.google.com with SMTP id 98e67ed59e1d1-2ac9b225a91so2164552a91.2 for ; Wed, 22 May 2024 08:33:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=networkplumber-org.20230601.gappssmtp.com; s=20230601; t=1716391993; x=1716996793; darn=dpdk.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=48tqufZA39r3nU8kiPpdiGI2h/P/FPIofVrNOtE+mYY=; b=mbly2rU7LDYd9ZNyqjsU0E1n4Af5qnPFedFPs8Y9ZjL7rK++QxNjsL1WGfLqNDA/OK 6h4SU+xdRcbgoGutmMO2Vy/ZLD6v5HE8zzpt6YB5Mqntkx8wnOajxK2o7bIroQJXNFl7 WWvJvBjm7tH9vu98vC51WA4YkELJE1JLNT9Jwlv5B4JmTIAnGZk3kMTzk1ALmuphqWOF HWUWo1I0yHdQO6vWVoPMU3ctk7j0K3AYL7oF1GjZhTrXKQS4RJQ/M9/WR+KSx26/VMJa TEIyp5KdhIPItpp3gd+av8wMQYG43t6HKXlFSG9SFlNiXrkKCqWxSSfFF3TWEMyl1Cuk ow5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716391993; x=1716996793; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=48tqufZA39r3nU8kiPpdiGI2h/P/FPIofVrNOtE+mYY=; b=UtOdSZJKevhg3UCNQQMms6MNRoHXj8k0YlDLMkaPEuW9MLW5gMAsb458cIouZjVTOu XabchjZ1FTX4I7AUXsLDs3pVl0g7CU1B2T9HqdraGAkUbTmaYviupQu7CuCN+Q7f1goK N3tnNadlcttvjoRIPLnTDfzoe0ICCLfaaJES9blrHuRDsd7JoAurjKFkFMVNzWypBoQB cUF3reD8JTg2NUH7M8UeuexZnVfhlvlzkASCIzr5MP88XIkHIspXHZfS5InSHNYzsVn/ WkS/jxG7Rr6cj4IbrhHqNOnABSLeX9wJBDGOcTs0i3EAvIP4zouPgw2yPgKEVn/e1n/h 2WQA== X-Gm-Message-State: AOJu0Yzsh9DxK69Yh3GHjBO3caxFuhUTCQDcHCCwcjP6QXqGN2pNNEzl DAhKBFLNhGpQUF4jiLSArqoxFhoVECEB+Lb0yxER6IYS6U4LoQrPJAsdaAwG1WQ= X-Google-Smtp-Source: AGHT+IEhSQBHhrrPFYbVGfu4gWGziJLW2n4FcINaY22JP4Bc1gTnTM+TvCkObJwzq8F1wj6KlSECMA== X-Received: by 2002:a17:90a:7f04:b0:2b6:5156:5b9b with SMTP id 98e67ed59e1d1-2bd9f59cf53mr2663144a91.35.1716391992825; Wed, 22 May 2024 08:33:12 -0700 (PDT) Received: from hermes.local (204-195-96-226.wavecable.com. [204.195.96.226]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2b628ea6a83sm25665592a91.56.2024.05.22.08.33.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 22 May 2024 08:33:12 -0700 (PDT) Date: Wed, 22 May 2024 08:33:09 -0700 From: Stephen Hemminger To: Morten =?UTF-8?B?QnLDuHJ1cA==?= Cc: , "Tyler Retzlaff" Subject: Re: [PATCH v9 1/8] eal: generic 64 bit counter Message-ID: <20240522083309.3faa305f@hermes.local> In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9F488@smartserver.smartshare.dk> References: <20240510050507.14381-1-stephen@networkplumber.org> <20240521201801.126886-1-stephen@networkplumber.org> <20240521201801.126886-2-stephen@networkplumber.org> <98CBD80474FA8B44BF855DF32C47DC35E9F488@smartserver.smartshare.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Wed, 22 May 2024 10:31:39 +0200 Morten Br=C3=B8rup wrote: > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change without prior notice. > > + * > > + * On native 64 bit platform, counter is implemented as basic > > + * 64 bit unsigned integer that only increases. > > + */ > > +typedef struct { > > + uint64_t current; > > + uint64_t offset; > > +} rte_counter64_t; =20 >=20 > As discussed in the other thread [1], I strongly prefer having "current" = and "offset" separate, for performance reasons. > Keeping each offset close together with its counter will require more cac= he lines than necessary, because the offsets take up space in the hot part = of a fast path data structure. E.g. the size_bins[] counters could fit into= one cache line instead of two. >=20 > [1]: https://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35E9F422@sm= artserver.smartshare.dk/ There are no size_bins in the current version of the patch. And the number of counters in ethdev part are small so it is less of a conc= ern. The code is easier to maintain if the counter object is self contained. > > + * @internal > > + * Macro to implement write once (compiler barrier) using stdatomic. > > + * This is compiler barrier only. > > + */ > > +#define __rte_write_once(var, val) \ > > + rte_atomic_store_explicit((__rte_atomic typeof(&(var)))&(var), val, \ > > + rte_memory_order_release) =20 >=20 > These macros certainly make the code using them shorter. > But IMHO, they don't improve the readability of the code using them; quit= e the opposite. Reviewing code using atomics is hard, so I prefer having th= e memory order directly shown in the code, not hidden behind a macro. Agree, was going to drop them in next version. > > +__rte_experimental > > +static inline uint64_t > > +rte_counter64_read(const rte_counter64_t *counter) > > +{ > > + return __rte_read_once(counter->current) - __rte_read_once(counter- = =20 > > >offset); =20 >=20 > I'm not sure that "current" needs to be read using rte_memory_order_consu= me here; I think rte_memory_order_consume for "offset" and rte_memory_order= _relaxed (or perhaps just volatile) for "counter" suffices. But let's settl= e on the high level design before we start micro-optimizing. :-) memory order consume is compiler barrier only. Was trying to choose what wa= s best here. > > + > > +/* On 32 bit platform, need to use atomic to avoid load/store tearing = */ > > +typedef RTE_ATOMIC(uint64_t) rte_counter64_t; =20 >=20 > As shown by Godbolt experiments discussed in a previous thread [2], non-t= earing 64 bit counters can be implemented without using atomic instructions= on all 32 bit architectures supported by DPDK. So we should use the counte= r/offset design pattern for RTE_ARCH_32 too. >=20 > [2]: https://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35E9F433@sm= artserver.smartshare.dk/ Bruce found some 32 bit versions of x86 have the problem. Godbolt doesn't seem to list 32 bit x86 compiler for Gcc. If you try MSVC in 32 bit mode it will split the loads. I see no problem on ARM which is the only other 32 bit we care about.