From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by dpdk.org (Postfix) with ESMTP id 2C7E9C410 for ; Mon, 11 May 2015 11:51:31 +0200 (CEST) Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga101.jf.intel.com with ESMTP; 11 May 2015 02:51:30 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.13,405,1427785200"; d="scan'208";a="727125761" Received: from irsmsx110.ger.corp.intel.com ([163.33.3.25]) by orsmga002.jf.intel.com with ESMTP; 11 May 2015 02:51:29 -0700 Received: from irsmsx105.ger.corp.intel.com ([169.254.7.73]) by irsmsx110.ger.corp.intel.com ([163.33.3.25]) with mapi id 14.03.0224.002; Mon, 11 May 2015 10:51:28 +0100 From: "Ananyev, Konstantin" To: Ravi Kerur , Matt Laswell Thread-Topic: [dpdk-dev] [PATCH v2] Implement memcmp using AVX/SSE instructions. Thread-Index: AQHQid6FaTAxHIKWMUaetbIyJsocAZ1yn20AgAPq8KA= Date: Mon, 11 May 2015 09:51:28 +0000 Message-ID: <2601191342CEEE43887BDE71AB9772582142E106@irsmsx105.ger.corp.intel.com> References: <1431119946-32078-1-git-send-email-rkerur@gmail.com> <1431119989-32124-1-git-send-email-rkerur@gmail.com> In-Reply-To: Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [163.33.239.182] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: "dev@dpdk.org" Subject: Re: [dpdk-dev] [PATCH v2] Implement memcmp using AVX/SSE instructions. X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 May 2015 09:51:31 -0000 Hi Ravi, > -----Original Message----- > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ravi Kerur > Sent: Friday, May 08, 2015 11:55 PM > To: Matt Laswell > Cc: dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2] Implement memcmp using AVX/SSE instruc= tions. >=20 > On Fri, May 8, 2015 at 3:29 PM, Matt Laswell wro= te: >=20 > > > > > > On Fri, May 8, 2015 at 4:19 PM, Ravi Kerur wrote: > > > >> This patch replaces memcmp in librte_hash with rte_memcmp which is > >> implemented with AVX/SSE instructions. > >> > >> +static inline int > >> +rte_memcmp(const void *_src_1, const void *_src_2, size_t n) > >> +{ > >> + const uint8_t *src_1 =3D (const uint8_t *)_src_1; > >> + const uint8_t *src_2 =3D (const uint8_t *)_src_2; > >> + int ret =3D 0; > >> + > >> + if (n & 0x80) > >> + return rte_cmp128(src_1, src_2); > >> + > >> + if (n & 0x40) > >> + return rte_cmp64(src_1, src_2); > >> + > >> + if (n & 0x20) { > >> + ret =3D rte_cmp32(src_1, src_2); > >> + n -=3D 0x20; > >> + src_1 +=3D 0x20; > >> + src_2 +=3D 0x20; > >> + } > >> > >> > > Pardon me for butting in, but this seems incorrect for the first two ca= ses > > listed above, as the function as written will only compare the first 12= 8 or > > 64 bytes of each source and return the result. The pattern expressed i= n > > the 32 byte case appears more correct, as it compares the first 32 byte= s > > and then lets later pieces of the function handle the smaller remaining > > bits of the sources. Also, if this function is to handle arbitrarily la= rge > > source data, the 128 byte case needs to be in a loop. > > > > What am I missing? > > >=20 > Current max hash key length supported is 64 bytes, hence no comparison is > done after 64 bytes. 128 bytes comparison is added to measure performance > only and there is no use-case as of now. With the current use-cases its n= ot > required but if there is a need to handle large arbitrary data upto 128 > bytes it can be modified. So on x86 let say rte_memcmp(k1, k2, 65) might produce invalid results, rig= ht? While on PPC will work as expected (as it calls memcpu underneath)?=20 That looks really weird to me. If you plan to use rte_memcmp only for hash comparisons, then probably you should put it somewhere into librte_hash and name it accordingly: rte_h= ash_key_cmp() or something.=20 And put a big comment around it, that it only works with particular lengths= . If you want it to be a generic function inside EAL, then it probably need t= o handle different lengths properly on all supported architectures.=20 Konstantin >=20 > > > > -- > > Matt Laswell > > infinite io, inc. > > laswell@infiniteio.com > > > >