Date: Thu, 16 May 2019 08:32:12 -0700
From: Stephen Hemminger
To: Mattias Rönnblom
Cc: dev@dpdk.org
Message-ID: <20190516083212.4aaff4c9@hermes.lan>
In-Reply-To: <95e9a56f-5d33-2a53-033d-d8963193cbea@ericsson.com>
References: <20190515221952.21959-1-stephen@networkplumber.org>
 <20190515221952.21959-5-stephen@networkplumber.org>
 <95e9a56f-5d33-2a53-033d-d8963193cbea@ericsson.com>
Subject: Re: [dpdk-dev] [RFC 4/4] net/ether: use bitops to speedup comparison
List-Id: DPDK patches and discussions

On Thu, 16 May 2019 11:03:10 +0200
Mattias Rönnblom wrote:

> On 2019-05-16 00:19, Stephen Hemminger wrote:
> > Using bit operations like or and xor is faster than a loop
> > on all architectures. Really just explicit unrolling.
> >
> > Similar cast to uint16 unaligned is already done in
> > other functions here.
> >
> > Signed-off-by: Stephen Hemminger
> > ---
> >  lib/librte_net/rte_ether.h | 17 +++++++----------
> >  1 file changed, 7 insertions(+), 10 deletions(-)
> >
> > diff --git a/lib/librte_net/rte_ether.h b/lib/librte_net/rte_ether.h
> > index b94e64b2195e..5d9242cda230 100644
> > --- a/lib/librte_net/rte_ether.h
> > +++ b/lib/librte_net/rte_ether.h
> > @@ -78,11 +78,10 @@ struct ether_addr {
> >  static inline int is_same_ether_addr(const struct ether_addr *ea1,
> >  				     const struct ether_addr *ea2)
> >  {
> > -	int i;
> > -	for (i = 0; i < ETHER_ADDR_LEN; i++)
> > -		if (ea1->addr_bytes[i] != ea2->addr_bytes[i])
> > -			return 0;
> > -	return 1;
> > +	const unaligned_uint16_t *w1 = (const uint16_t *)ea1;
> > +	const unaligned_uint16_t *w2 = (const uint16_t *)ea2;
> > +
> > +	return ((w1[0] ^ w2[0]) | (w1[1] ^ w2[1]) | (w1[2] ^ w2[2])) == 0;
> >  }
> >
>
> If you want to shave off a couple of instructions, you can switch the
> three 16-bit loads to one 32-bit and one 16-bit load.
>
> Something like:
>
>     const uint8_t *ea1_b = (const uint8_t *)ea1;
>     const uint8_t *ea2_b = (const uint8_t *)ea2;
>     uint32_t ea1_h;
>     uint32_t ea2_h;
>     uint16_t ea1_l;
>     uint16_t ea2_l;
>
>     memcpy(&ea1_h, &ea1_b[0], sizeof(ea1_h));
>     memcpy(&ea1_l, &ea1_b[sizeof(ea1_h)], sizeof(ea1_l));
>
>     memcpy(&ea2_h, &ea2_b[0], sizeof(ea2_h));
>     memcpy(&ea2_l, &ea2_b[sizeof(ea2_h)], sizeof(ea2_l));
>
>     return ((ea1_l ^ ea2_l) | (ea1_h ^ ea2_h)) == 0;
>
> Code is not as clean as your solution though.
>
> >  /**
> > @@ -97,11 +96,9 @@ static inline int is_same_ether_addr(const struct ether_addr *ea1,
> >   */
> >  static inline int is_zero_ether_addr(const struct ether_addr *ea)
> >  {
> > -	int i;
> > -	for (i = 0; i < ETHER_ADDR_LEN; i++)
> > -		if (ea->addr_bytes[i] != 0x00)
> > -			return 0;
> > -	return 1;
> > +	const unaligned_uint16_t *w = (const uint16_t *)ea;
> > +
> > +	return (w[0] | w[1] | w[2]) == 0;
> >  }
> >
> >  /**
> >

You can even do a 64-bit load and then mask, which is what the Linux kernel
does. But I'm not sure it matters. The loop is the expensive part.
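
For illustration only, here is a rough sketch of what the 64-bit load-and-mask
variant could look like. It is not part of the patch. It assumes each pointer
has at least 8 readable bytes (the 6-byte address followed by 2 bytes of
padding), which struct ether_addr on its own does not guarantee. The function
names are made up for this example and do not exist in rte_ether.h.

```c
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN 6 /* same value as in rte_ether.h, redefined here */

/*
 * Build a mask covering the first 6 bytes in memory order.
 * Using memset keeps this independent of host endianness.
 */
static inline uint64_t ether_addr_mask64_sketch(void)
{
	uint64_t mask = 0;

	memset(&mask, 0xff, ETHER_ADDR_LEN);
	return mask;
}

/*
 * Hypothetical helper: compare two addresses with one 64-bit load each.
 * ASSUMPTION: a and b each point to at least 8 readable bytes.
 */
static inline int ether_addr_equal_64_sketch(const void *a, const void *b)
{
	uint64_t x, y;

	/* memcpy avoids alignment and aliasing problems */
	memcpy(&x, a, sizeof(x));
	memcpy(&y, b, sizeof(y));

	/* the two padding bytes are masked out before the test */
	return ((x ^ y) & ether_addr_mask64_sketch()) == 0;
}

/*
 * Hypothetical helper: all-zero check with one 64-bit load.
 * ASSUMPTION: a points to at least 8 readable bytes.
 */
static inline int ether_addr_is_zero_64_sketch(const void *a)
{
	uint64_t x;

	memcpy(&x, a, sizeof(x));
	return (x & ether_addr_mask64_sketch()) == 0;
}
```

Called on two 8-byte buffers whose first six bytes match and whose last two
differ, ether_addr_equal_64_sketch() returns 1, because the padding is masked
away before the comparison.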