From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id EE7F3A04BC;
	Tue, 29 Sep 2020 17:47:36 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id A49C21C1DC;
	Tue, 29 Sep 2020 17:47:35 +0200 (CEST)
Received: from mga14.intel.com (mga14.intel.com [192.55.52.115])
 by dpdk.org (Postfix) with ESMTP id 8DFA61C198
 for <dev@dpdk.org>; Tue, 29 Sep 2020 17:47:30 +0200 (CEST)
IronPort-SDR: q2ApWzC24e8aqP3VllxZx2jzOAJ0YZnEKNROJ860Cl6pZE8Dk7LZrtXHXsvgg9VeSX6U/rVI22
 0bHbVv6u3Aog==
X-IronPort-AV: E=McAfee;i="6000,8403,9759"; a="161445442"
X-IronPort-AV: E=Sophos;i="5.77,319,1596524400"; d="scan'208";a="161445442"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga008.jf.intel.com ([10.7.209.65])
 by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 29 Sep 2020 08:47:28 -0700
IronPort-SDR: F/qmOb2sV85ZG7bUqe9AVXG95f6dPDuk06f0E8FRaao4+N0RiGY/Y+ZS9cIGfVU7GXcNMX6/CG
 uDR2GN89hEng==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.77,319,1596524400"; d="scan'208";a="340887788"
Received: from orsmsx602.amr.corp.intel.com ([10.22.229.15])
 by orsmga008.jf.intel.com with ESMTP; 29 Sep 2020 08:47:27 -0700
Received: from orsmsx611.amr.corp.intel.com (10.22.229.24) by
 ORSMSX602.amr.corp.intel.com (10.22.229.15) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1713.5; Tue, 29 Sep 2020 08:47:27 -0700
Received: from orsmsx601.amr.corp.intel.com (10.22.229.14) by
 ORSMSX611.amr.corp.intel.com (10.22.229.24) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1713.5; Tue, 29 Sep 2020 08:47:27 -0700
Received: from ORSEDG601.ED.cps.intel.com (10.7.248.6) by
 orsmsx601.amr.corp.intel.com (10.22.229.14) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5
 via Frontend Transport; Tue, 29 Sep 2020 08:47:27 -0700
Received: from NAM11-CO1-obe.outbound.protection.outlook.com (104.47.56.177)
 by edgegateway.intel.com (134.134.137.102) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.1.1713.5; Tue, 29 Sep 2020 08:47:26 -0700
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
 b=nUJFJ3s7HQa6jGmZgDCDRClz7spGGPi5Huqmfc1yRalkIliskhi/h+9lxHId+0jb5MOw2zgF2m9/hJqaISTnobElWbkCgbhpSYrJ97yjtkdx2OV6PQC7IlKoKmtNtlGlLFz24Nn3yuEVq7ahkMntKh+fNISq97ro3ui0URmIwlqkHK6HAfAPk+tx8rirJa6TISGmUHXrLSM6A5haYISPGVwIpr4ZAL3d25PSE5tc2uaDWQFp/oLDsUpFelaOmOfo2Z5UbvVL0Q5y7wNYnoYTo4maSPofAQHEs66NtKRfhRXhCb7peYkmVsTSrdDJopZ4KnDVdOb2NxNDcRUiI0bjDw==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; 
 s=arcselector9901;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=ghqfKPK0X3nSmUti/0BIIPIwQ0KKfP3DY5ZjKyU0W2Y=;
 b=LHeshtaYbgQXr60BLGIVDXluuYrXYW19369Dgq/XsSCeT90NJmedjftFNyaZDxnoc7yc1XjgbB12Je1R/On+FyAiYRTXqJ1X3axB5MPXxR+gQKJi3DW5YIa4vb92VAXNeSN/NmRr1MiyKSkIb9O4J5+OZHdV3n/PalXXe7eMpYVt7KzNHx9UzN/vURtfA4gmDBNfaEa8WUcTywx0tLXDaEhFEDThqqf+aN3I6SHWZYWG+dzPRWBCSfwPMwYSlsohANCRWCIHZ82Rogrgd4mURK7XcsCesbLJuIoWy2meyrHaj52x9ecjO7LytNTw4I2GCWq5/i408KT404t27FWHZw==
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass
 smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com;
 dkim=pass header.d=intel.com; arc=none
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel.onmicrosoft.com; 
 s=selector2-intel-onmicrosoft-com;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=ghqfKPK0X3nSmUti/0BIIPIwQ0KKfP3DY5ZjKyU0W2Y=;
 b=V+rOsZ48bunUSO8iuLuzwShFmj4T84t1iBVZaii4Vmg3ZGFivSMmBZFnjn/sVwZq8kS8RH8lPV1n1/K5IRAGw3Fsl5r8c8/2rES71C23zJwFBtm9/IbzAus6mdzkr4WW2haYiPl2c0wS9yT1a6RxfZN3LX+hrKpBBNGuX2Jfvlw=
Received: from MN2PR11MB3725.namprd11.prod.outlook.com (2603:10b6:208:f9::23)
 by MN2PR11MB4693.namprd11.prod.outlook.com (2603:10b6:208:261::18)
 with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.22; Tue, 29 Sep
 2020 15:47:25 +0000
Received: from MN2PR11MB3725.namprd11.prod.outlook.com
 ([fe80::20e3:1612:5449:76e8]) by MN2PR11MB3725.namprd11.prod.outlook.com
 ([fe80::20e3:1612:5449:76e8%5]) with mapi id 15.20.3412.029; Tue, 29 Sep 2020
 15:47:25 +0000
From: "O'loingsigh, Mairtin" <mairtin.oloingsigh@intel.com>
To: "De Lara Guarch, Pablo" <pablo.de.lara.guarch@intel.com>, "Singh,
 Jasvinder" <jasvinder.singh@intel.com>
CC: "dev@dpdk.org" <dev@dpdk.org>, "Ryan, Brendan" <brendan.ryan@intel.com>,
 "Coyle, David" <david.coyle@intel.com>
Thread-Topic: [PATCH] net: add support for AVX512 when generating CRC
Thread-Index: AQHWh2ot3PVCYcPcP0qhNCI3WD6U96ljNTiAgByrEBA=
Date: Tue, 29 Sep 2020 15:47:25 +0000
Message-ID: <MN2PR11MB37258FD0D12FECE129FB429F9C320@MN2PR11MB3725.namprd11.prod.outlook.com>
References: <1599739271-16605-1-git-send-email-mairtin.oloingsigh@intel.com>
 <SN6PR11MB3101F9F890FF914EAE0E92AA84240@SN6PR11MB3101.namprd11.prod.outlook.com>
In-Reply-To: <SN6PR11MB3101F9F890FF914EAE0E92AA84240@SN6PR11MB3101.namprd11.prod.outlook.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
dlp-product: dlpe-windows
dlp-reaction: no-action
dlp-version: 11.5.1.3
authentication-results: intel.com; dkim=none (message not signed)
 header.d=none;intel.com; dmarc=none action=none header.from=intel.com;
x-originating-ip: [86.44.213.168]
x-ms-publictraffictype: Email
x-ms-office365-filtering-correlation-id: 463711cb-1fc5-490d-2bd5-08d8648ef6ae
x-ms-traffictypediagnostic: MN2PR11MB4693:
x-ms-exchange-transport-forked: True
x-microsoft-antispam-prvs: <MN2PR11MB4693F13A6AA2383886A89AEE9C320@MN2PR11MB4693.namprd11.prod.outlook.com>
x-ms-oob-tlc-oobclassifiers: OLM:7691;
x-ms-exchange-senderadcheck: 1
x-microsoft-antispam: BCL:0;
x-microsoft-antispam-message-info: 4LEh4bmhkzIA1XMoTFV8nwzKYl10ziA72xA0/nAdF1TqSNptoxNU24AUCvPb2Pu3iWvmWh5OeEdOH5SDVmRV4AxxS2ZBtG8AgGIWkc3ur69bY2g7dGHAdZAzY7GK9cghg8B2myYmHlJdNd5DRby3c6baDjgae96QUcwvsh82Nsa3vaG48dvXvivJMwr8mmN407HZYvG5m/Thtk8eVWP/VgxRZN/iMx2i3vafQJ9q8gUa1mcMmZ14QcOmgkpfRrwY5b91eEtui8zJnISXDW6OHiWKzS+/k4Yd5AqUiUENO8A10EaE8mwETT34IW6lKVgcz+QvDcZzhz8011QtbbN8OvJUJQ7zpNRlgirm68gM3PrXSDsw8Ey4S9gp3UAfLYi6
x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:;
 IPV:NLI; SFV:NSPM; H:MN2PR11MB3725.namprd11.prod.outlook.com; PTR:; CAT:NONE;
 SFS:(4636009)(39860400002)(396003)(376002)(366004)(346002)(136003)(33656002)(8676002)(2906002)(107886003)(6636002)(4326008)(55016002)(9686003)(66556008)(64756008)(66476007)(66446008)(83380400001)(71200400001)(86362001)(8936002)(54906003)(186003)(316002)(110136005)(66946007)(26005)(5660300002)(76116006)(6506007)(52536014)(478600001)(53546011)(7696005);
 DIR:OUT; SFP:1102; 
x-ms-exchange-antispam-messagedata: b46vT8PggEzGeFopk1DNQu8piJeutOur6Wav+1OjtXR7DRtWVA0EHDxBXgZLianbT9lrGT52YEd53d4XO15dyZDyG/1jS7dtSofJbaCqhhbaDfBoCjydFIEgyeFuDuzncu3Hb5EcRaAOAJm5GHIMsPKj4+9bTZ8Nbg+fJ5rCZ+xJfF2L356msUISuIZZuXjUGGO0XTmbnr472/DVSIqlgynMitJwMeyIZpkSWMVV/omhI773QCCqIGvzYIaegkcAKHlqrfKV7o4bAy9//OVuO35agUiCqNb2YCV3admkYLotnx7t8s+gmymQVrEaOPYK0GJGppKVMIPratCyhMVMHyAbCGX4DXtUMXceIiSUP46OMrPpvlIsubzkQWYJ32J3BXZ0ocxmp9O2vTErD+0e5n0yRfa40uwkL+lWyKwaiADkzOBUCeB+SiZNYTBuaEKwlhpDST1kL9owDozDeS3ygi/+fBQZV1AMEgWKv0g/Y36wGbg0VHeeMFhkQPd8iuYM3r/GHGPzQp1C5RpFx/ZLDM1HMck0HgAQHeTaQ5AUGqj6FR5xagLl4dOEZaxBN+lQChtJe2EyjGRcF5yOgnEgzZNoy9tk821pSM9Y32AIGf655a0EJKgN4wcljXGna40pWADJRcrkV5rWV57l2mJTKQ==
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-MS-Exchange-CrossTenant-AuthAs: Internal
X-MS-Exchange-CrossTenant-AuthSource: MN2PR11MB3725.namprd11.prod.outlook.com
X-MS-Exchange-CrossTenant-Network-Message-Id: 463711cb-1fc5-490d-2bd5-08d8648ef6ae
X-MS-Exchange-CrossTenant-originalarrivaltime: 29 Sep 2020 15:47:25.8449 (UTC)
X-MS-Exchange-CrossTenant-fromentityheader: Hosted
X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d
X-MS-Exchange-CrossTenant-mailboxtype: HOSTED
X-MS-Exchange-CrossTenant-userprincipalname: SQUpa+a1QD/4Oyg8gNWbRgXhkMT08HH4JCBOfDHgimksoHRmofzphwBwnyBtIEIQ3v6QAGe2iv/e4nOCM1hnDCLtbbXoZWELcxl7TO6Eib4=
X-MS-Exchange-Transport-CrossTenantHeadersStamped: MN2PR11MB4693
X-OriginatorOrg: intel.com
Subject: Re: [dpdk-dev] [PATCH] net: add support for AVX512 when generating
	CRC
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

Hi,

> -----Original Message-----
> From: De Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>
> Sent: Friday, September 11, 2020 10:58 AM
> To: O'loingsigh, Mairtin <mairtin.oloingsigh@intel.com>; Singh, Jasvinder
> <jasvinder.singh@intel.com>
> Cc: dev@dpdk.org; Ryan, Brendan <brendan.ryan@intel.com>; Coyle, David
> <david.coyle@intel.com>
> Subject: RE: [PATCH] net: add support for AVX512 when generating CRC
>
> Hi Mairtin,
>
> > -----Original Message-----
> > From: O'loingsigh, Mairtin <mairtin.oloingsigh@intel.com>
> > Sent: Thursday, September 10, 2020 1:01 PM
> > To: Singh, Jasvinder <jasvinder.singh@intel.com>
> > Cc: dev@dpdk.org; Ryan, Brendan <brendan.ryan@intel.com>; Coyle, David
> > <david.coyle@intel.com>; De Lara Guarch, Pablo
> > <pablo.de.lara.guarch@intel.com>; O'loingsigh, Mairtin
> > <mairtin.oloingsigh@intel.com>
> > Subject: [PATCH] net: add support for AVX512 when generating CRC
> >
> > This patch enables the generation of CRC using AVX512 instruction set
> > when available on the host platform.
> >
> > Signed-off-by: Mairtin o Loingsigh <mairtin.oloingsigh@intel.com>
> > ---
> >
> > v1:
> > * Initial version, with AVX512 support for CRC32 Ethernet only
> >   (requires further updates)
> >   * AVX512 support for CRC16-CCITT and final implementation of
> >     CRC32 Ethernet will be added in v2
> > ---
> >  doc/guides/rel_notes/release_20_11.rst |    4 +
> >  lib/librte_net/net_crc_avx.h           |  331 ++++++++++++++++++++++++++++++++
> >  lib/librte_net/rte_net_crc.c           |   23 ++-
> >  lib/librte_net/rte_net_crc.h           |    1 +
> >  4 files changed, 358 insertions(+), 1 deletions(-)
> >  create mode 100644 lib/librte_net/net_crc_avx.h
> >
> > diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
> > index df227a1..d6a84ca 100644
> > --- a/doc/guides/rel_notes/release_20_11.rst
> > +++ b/doc/guides/rel_notes/release_20_11.rst
> > @@ -55,6 +55,10 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =======================================================
> >
> > +* **Added support for AVX512 in rte_net CRC calculations.**
> > +
> > +  Added new CRC32 calculation code using AVX512 instruction set
> > + Added new CRC16-CCITT calculation code using AVX512 instruction set
> >
> >  Removed Items
> >  -------------
> > diff --git a/lib/librte_net/net_crc_avx.h b/lib/librte_net/net_crc_avx.h
> > new file mode 100644
> > index 0000000..d9481d5
> > --- /dev/null
> > +++ b/lib/librte_net/net_crc_avx.h
>
> ...
>
> > +static __rte_always_inline uint32_t
> > +crc32_eth_calc_pclmulqdq(
> > +	const uint8_t *data,
> > +	uint32_t data_len,
> > +	uint32_t crc,
> > +	const struct crc_pclmulqdq512_ctx *params) {
> > +	__m256i b;
> > +	__m512i temp, k;
> > +	__m512i qw0 = _mm512_set1_epi64(0);
> > +	__m512i fold0;
> > +	uint32_t n;
>
> This is loading 64 bytes of data, but it seems like only 16 are available,
> right? Should we use _mm_loadu_si128?
>
> > +			fold0 = _mm512_xor_si512(fold0, temp);
> > +			goto reduction_128_64;
> > +		}
> > +
> > +		if (unlikely(data_len < 16)) {
> > +			/* 0 to 15 bytes */
> > +			uint8_t buffer[16] __rte_aligned(16);
> > +
> > +			memset(buffer, 0, sizeof(buffer));
> > +			memcpy(buffer, data, data_len);
>
> I would use _mm_maskz_loadu_epi8, passing a mask register with ((1 <<
> data_len) - 1).
>
> > +
> > +			fold0 = _mm512_load_si512((const __m128i *)buffer);
> > +			fold0 = _mm512_xor_si512(fold0, temp);
> > +			if (unlikely(data_len < 4)) {
> > +				fold0 = xmm_shift_left(fold0, 8 - data_len);
> > +				goto barret_reduction;
> > +			}
> > +			fold0 = xmm_shift_left(fold0, 16 - data_len);
> > +			goto reduction_128_64;
> > +		}
> > +		/* 17 to 31 bytes */
> > +		fold0 = _mm512_loadu_si512((const __m512i *)data);
>
> Same here. Looks like you are loading too much data?
>
> > +		fold0 = _mm512_xor_si512(fold0, temp);
> > +		n = 16;
> > +		k = params->rk1_rk2;
> > +		goto partial_bytes;
> > +	}
>
> ...
>
> > +
> > +		fold0 = _mm512_xor_si512(fold0, temp);
> > +		fold0 = _mm512_xor_si512(fold0, b);
>
> You could use _mm512_ternarylogic_epi64 with 0x96 to do both XORs in one
> instruction.
>
> > +	}
> > +
> > +	/** Reduction 128 -> 32 Assumes: fold holds 128bit folded data */
> > +reduction_128_64:
> > +	k = params->rk5_rk6;
> > +
> > +barret_reduction:
> > +	k = params->rk7_rk8;
> > +	n = crcr32_reduce_64_to_32(fold0, k);
> > +
> > +	return n;
> > +}
> > +
> > +

The latest version of this patch (v3) reworks a lot of this code and
addresses the issues noted above.
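
For anyone following along, the two intrinsic suggestions from the review
can be illustrated roughly as below. This is a sketch for illustration only,
not the code in v3: tail_mask(), maskz_load16(), ternlog64(), load_tail() and
xor3() are hypothetical helper names. The scalar functions only model what
_mm_maskz_loadu_epi8 and _mm512_ternarylogic_epi64 do, and the intrinsic
wrappers are compiled only when the compiler has AVX512F/BW/VL enabled.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low data_len bits set: one bit per byte to be loaded. */
static inline uint16_t
tail_mask(uint32_t data_len)
{
	assert(data_len < 16);
	return (uint16_t)((1u << data_len) - 1);
}

/*
 * Scalar model of a zero-masked 16-byte load: byte i is read from src
 * only if mask bit i is set, otherwise the destination byte is zero.
 */
static inline void
maskz_load16(uint8_t dst[16], uint16_t mask, const uint8_t *src)
{
	for (int i = 0; i < 16; i++)
		dst[i] = ((mask >> i) & 1) ? src[i] : 0;
}

/*
 * Scalar model of ternary logic: at each bit position the three input
 * bits form an index (a << 2 | b << 1 | c) into the 8-bit truth table.
 */
static inline uint64_t
ternlog64(uint64_t a, uint64_t b, uint64_t c, uint8_t imm8)
{
	uint64_t r = 0;

	for (int i = 0; i < 64; i++) {
		unsigned int idx = (unsigned int)((((a >> i) & 1) << 2) |
				(((b >> i) & 1) << 1) | ((c >> i) & 1));
		r |= (uint64_t)((imm8 >> idx) & 1) << i;
	}
	return r;
}

#if defined(__AVX512F__) && defined(__AVX512BW__) && defined(__AVX512VL__)
#include <immintrin.h>

/* Load only the first data_len (< 16) bytes; the rest of the lane is zero. */
static inline __m128i
load_tail(const uint8_t *data, uint32_t data_len)
{
	return _mm_maskz_loadu_epi8((__mmask16)tail_mask(data_len), data);
}

/* a ^ b ^ c in one instruction; 0x96 is the three-input XOR truth table. */
static inline __m512i
xor3(__m512i a, __m512i b, __m512i c)
{
	return _mm512_ternarylogic_epi64(a, b, c, 0x96);
}
#endif
```

In this model, ternlog64(a, b, c, 0x96) equals a ^ b ^ c for any inputs, and
tail_mask(5) selects only the first 5 bytes, so no memset/memcpy bounce
buffer is needed for the short-length case.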

Mairtin