From: Thomas Monjalon
To: David Marchand, Yoan Picchi
Cc: Yipeng Wang, Sameh Gobriel, Bruce Richardson, Vladimir Medvedkin, dev@dpdk.org, nd@arm.com, Ruifeng Wang, Nathan Brown
Subject: Re: [PATCH v10 1/4] hash: pack the hitmask for hash in bulk lookup
Date: Sun, 07 Jul 2024 14:08:51 +0200
Message-ID: <2375413.iqlQq8Z1Fh@thomas>
In-Reply-To: <38b2f96b-f8fc-4dc4-a3e4-f5a79dc4f4b4@arm.com>
References: <20231020165159.1649282-1-yoan.picchi@arm.com> <38b2f96b-f8fc-4dc4-a3e4-f5a79dc4f4b4@arm.com>

05/07/2024 19:43, Yoan Picchi:
> On 7/4/24 21:31, David Marchand wrote:
> > On Wed, Jul 3, 2024 at 7:13 PM Yoan Picchi wrote:
> >> --- /dev/null
> >> +++ b/lib/hash/compare_signatures_arm_pvt.h
> >
> > I guess pvt stands for private.
> > No need for such suffix, this header won't be exported in any case.
>
> pvt does stand for private, yes. I had a look at the other libs and what
> they use to mark a header as private. Several (rcu, ring and stack)
> use _pvt, so it looks like that might be the standard? If not, then how
> am I supposed to differentiate a public and a private header?

Public headers are prefixed with rte_
We should not use _pvt

> >> +#if RTE_HASH_BUCKET_ENTRIES <= 8
> >> +	case RTE_HASH_COMPARE_NEON: {
> >> +		uint16x8_t vmat, vsig, x;
> >> +		int16x8_t shift = {0, 1, 2, 3, 4, 5, 6, 7};
> >> +		uint16_t low, high;
> >> +
> >> +		vsig = vld1q_dup_u16((uint16_t const *)&sig);
> >> +		/* Compare all signatures in the primary bucket */
> >> +		vmat = vceqq_u16(vsig, vld1q_u16((uint16_t const *)prim_bucket_sigs));
> >> +		x = vshlq_u16(vandq_u16(vmat, vdupq_n_u16(0x0001)), shift);
> >> +		low = (uint16_t)(vaddvq_u16(x));
> >> +		/* Compare all signatures in the secondary bucket */
> >> +		vmat = vceqq_u16(vsig, vld1q_u16((uint16_t const *)sec_bucket_sigs));
> >> +		x = vshlq_u16(vandq_u16(vmat, vdupq_n_u16(0x0001)), shift);
> >> +		high = (uint16_t)(vaddvq_u16(x));
> >> +		*hitmask_buffer = low | high << RTE_HASH_BUCKET_ENTRIES;
> >> +
> >> +		}
> >> +		break;
> >> +#endif
> >> +	default:
> >> +		for (unsigned int i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
> >> +			*hitmask_buffer |= (sig == prim_bucket_sigs[i]) << i;
> >> +			*hitmask_buffer |=
> >> +				((sig == sec_bucket_sigs[i]) << i) << RTE_HASH_BUCKET_ENTRIES;
> >> +		}
> >> +	}
> >> +}
> >
> > IIRC, this code is copied in all three headers.
> > It is a common scalar version, so the ARM code could simply call the
> > "generic" implementation rather than copy/paste.
>
> Out of the three files, only two versions are the same: generic and arm.
> Intel's version does have some padding added (given it's sparse).
> I prefer to keep a scalar version in the arm implementation because
> that's what matches the legacy implementation. We used to be able to
> choose (at runtime) to use the scalar path even if we had neon. In
> practice the choice ends up being made from #defines, but as far as this
> function goes, it is a runtime decision.

I have no strong opinion.
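
For readers following the thread, here is a minimal standalone sketch of the scalar "packed hitmask" layout that the quoted default branch computes: primary-bucket matches go in the low bits, secondary-bucket matches in the bits just above them. The names BUCKET_ENTRIES and packed_hitmask_scalar are hypothetical stand-ins for this illustration and are not part of the patch or of the DPDK API. The value 8 is assumed to mirror the RTE_HASH_BUCKET_ENTRIES <= 8 case in the quoted code. Unlike the quoted code, which ORs into a caller-provided buffer, this sketch returns the mask.

```c
#include <stdint.h>

/* Hypothetical stand-in for RTE_HASH_BUCKET_ENTRIES (assumed to be 8). */
#define BUCKET_ENTRIES 8

/*
 * Illustrative helper, not a DPDK function.
 * Bit i is set when sig matches primary entry i;
 * bit (i + BUCKET_ENTRIES) is set when sig matches secondary entry i.
 */
static uint32_t
packed_hitmask_scalar(const uint16_t *prim_sigs, const uint16_t *sec_sigs,
		uint16_t sig)
{
	uint32_t hitmask = 0;

	for (unsigned int i = 0; i < BUCKET_ENTRIES; i++) {
		/* Primary bucket hits occupy the low BUCKET_ENTRIES bits. */
		hitmask |= (uint32_t)(sig == prim_sigs[i]) << i;
		/* Secondary bucket hits are packed directly above them. */
		hitmask |= (uint32_t)(sig == sec_sigs[i]) << (i + BUCKET_ENTRIES);
	}
	return hitmask;
}
```

With signature 42 stored at primary entry 3 and at secondary entries 1 and 7, the function returns 0x8208: bit 3 for the primary hit, bits 9 and 15 for the secondary hits.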