From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 22 May 2024 12:51:53 -0700
From: Stephen Hemminger
To: Tyler Retzlaff
Cc: Morten Brørup, dev@dpdk.org
Subject: Re: [PATCH v9 1/8] eal: generic 64 bit counter
Message-ID: <20240522125153.08e612f1@hermes.local>
In-Reply-To: <20240522190112.GA19947@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
References: <20240510050507.14381-1-stephen@networkplumber.org>
 <20240521201801.126886-1-stephen@networkplumber.org>
 <20240521201801.126886-2-stephen@networkplumber.org>
 <98CBD80474FA8B44BF855DF32C47DC35E9F488@smartserver.smartshare.dk>
 <20240522083741.64078d7e@hermes.local>
 <98CBD80474FA8B44BF855DF32C47DC35E9F48C@smartserver.smartshare.dk>
 <20240522190112.GA19947@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
List-Id: DPDK patches and discussions

On Wed, 22 May 2024
12:01:12 -0700 Tyler Retzlaff wrote:

> On Wed, May 22, 2024 at 07:57:01PM +0200, Morten Brørup wrote:
> > > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > > Sent: Wednesday, 22 May 2024 17.38
> > >
> > > On Wed, 22 May 2024 10:31:39 +0200
> > > Morten Brørup wrote:
> > >
> > > > > +/* On 32 bit platform, need to use atomic to avoid load/store tearing */
> > > > > +typedef RTE_ATOMIC(uint64_t) rte_counter64_t;
> > > >
> > > > As shown by Godbolt experiments discussed in a previous thread [2],
> > > > non-tearing 64 bit counters can be implemented without using atomic
> > > > instructions on all 32 bit architectures supported by DPDK. So we should
> > > > use the counter/offset design pattern for RTE_ARCH_32 too.
> > > >
> > > > [2]: https://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35E9F433@smartserver.smartshare.dk/
> > >
> > >
> > > This code built with -O3 and -m32 on godbolt shows split problem.
> > >
> > > #include <stdint.h>
> > >
> > > typedef uint64_t rte_counter64_t;
> > >
> > > void
> > > rte_counter64_add(rte_counter64_t *counter, uint32_t val)
> > > {
> > > 	*counter += val;
> > > }
> > > …
> > > 	*counter = val;
> > > }
> > >
> > > rte_counter64_add:
> > >         push    ebx
> > >         mov     eax, DWORD PTR [esp+8]
> > >         xor     ebx, ebx
> > >         mov     ecx, DWORD PTR [esp+12]
> > >         add     DWORD PTR [eax], ecx
> > >         adc     DWORD PTR [eax+4], ebx
> > >         pop     ebx
> > >         ret
> > >
> > > rte_counter64_read:
> > >         mov     eax, DWORD PTR [esp+4]
> > >         mov     edx, DWORD PTR [eax+4]
> > >         mov     eax, DWORD PTR [eax]
> > >         ret
> > > rte_counter64_set:
> > >         movq    xmm0, QWORD PTR [esp+8]
> > >         mov     eax, DWORD PTR [esp+4]
> > >         movq    QWORD PTR [eax], xmm0
> > >         ret
> >
> > Sure, atomic might be required on some 32 bit architectures and/or with some compilers.
>
> in theory i think you should be able to use generic atomics and
> depending on the target you get codegen that works. it might be
> something more expensive on 32-bit and nothing on 64-bit etc..
>
> what's the damage if we just use atomic generic and relaxed ordering? is
> the codegen not optimal?

If we use an atomic with relaxed memory order, the compiler for x86 still
generates a locked increment in the fast path. That costs about 100 extra
cycles due to cache and prefetch stalls. This whole endeavor is an attempt
to avoid that.

PS: on 32 bit, the locked increment code involves a locked compare-exchange
and a potential retry. We probably don't care about performance on that
platform anymore.
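
To make the trade-off above concrete, here is a minimal C11 sketch of the two
ways to bump a 64 bit counter. The names (counter64_t, counter64_add_rmw,
counter64_add_single_writer, counter64_read) are hypothetical and are not the
DPDK API or the code in this patch. The second variant assumes each counter
has exactly one writer thread, which is an assumption of this sketch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical type for illustration only; not the DPDK rte_counter64_t. */
typedef _Atomic uint64_t counter64_t;

/*
 * Atomic read-modify-write. Even with relaxed ordering, x86 compilers
 * emit a lock-prefixed instruction here, which is the fast-path cost
 * discussed above.
 */
static inline void
counter64_add_rmw(counter64_t *c, uint32_t val)
{
	atomic_fetch_add_explicit(c, val, memory_order_relaxed);
}

/*
 * Separate relaxed load and relaxed store. On 64 bit x86 this typically
 * compiles to plain load/add/store with no lock prefix. It is only
 * correct under the assumption of a single writer per counter, because
 * concurrent writers could lose updates.
 */
static inline void
counter64_add_single_writer(counter64_t *c, uint32_t val)
{
	uint64_t cur = atomic_load_explicit(c, memory_order_relaxed);

	atomic_store_explicit(c, cur + val, memory_order_relaxed);
}

/* Relaxed load: readers on other threads see a whole (non-torn) value. */
static inline uint64_t
counter64_read(counter64_t *c)
{
	return atomic_load_explicit(c, memory_order_relaxed);
}
```

Building both add functions with -O2 on godbolt for x86-64 and comparing the
output should show whether the lock prefix appears in each case. Results for
-m32 will differ, as noted in the PS.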