From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1EB2FA04B5; Fri, 11 Sep 2020 11:58:02 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 0877D1C0CC; Fri, 11 Sep 2020 11:58:02 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by dpdk.org (Postfix) with ESMTP id E15391C0CA for ; Fri, 11 Sep 2020 11:58:00 +0200 (CEST) IronPort-SDR: K4LQJwUVexu0ALOpkGZw3oTmz4oR+iyngIaY/qr2C6Pqp5RXMcvFATNhYT8rWwtwi4So9MTOqZ kUmE7q7mfDww== X-IronPort-AV: E=McAfee;i="6000,8403,9740"; a="138745981" X-IronPort-AV: E=Sophos;i="5.76,414,1592895600"; d="scan'208";a="138745981" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Sep 2020 02:58:00 -0700 IronPort-SDR: ajutrNl6dhWY9q3En1qSbrDPx0VB2sMmhJzv3DDxMtsnJmfOkMyf+49TtSmKXz4deMd+ImbKzj CYqybcLVO1KQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.76,414,1592895600"; d="scan'208";a="449935964" Received: from orsmsx601.amr.corp.intel.com ([10.22.229.14]) by orsmga004.jf.intel.com with ESMTP; 11 Sep 2020 02:57:59 -0700 Received: from orsmsx601.amr.corp.intel.com (10.22.229.14) by ORSMSX601.amr.corp.intel.com (10.22.229.14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5; Fri, 11 Sep 2020 02:57:59 -0700 Received: from ORSEDG602.ED.cps.intel.com (10.7.248.7) by orsmsx601.amr.corp.intel.com (10.22.229.14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5 via Frontend Transport; Fri, 11 Sep 2020 02:57:59 -0700 Received: from NAM12-BN8-obe.outbound.protection.outlook.com (104.47.55.174) by edgegateway.intel.com (134.134.137.103) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.1713.5; Fri, 11 Sep 2020 02:57:58 -0700 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=lmuY59sx3vFEUBo8/Qj/9VYlHWUSpoWmCsg1tcQIO0UKvj9Jn/sHhz99el6VWDeEiWMoNuwgZAleCCewF787wgpTRob4p4R3qhVSdSHV8YoxSK42SasE8zhgDkqXSACWuXa+kdvbm+rbyvB9EXYND6ozako61SMzro7L/e67871yj5LSTnEVrh2x7VAuAiZbb+fpP6qX6z0XgAuEfweXVKsF9xBIHUJGYEnqJNRujG14veek1bTr5kbS6S9UoimCStHtttwq5y6QKJKlD5w1iFFTHrutWrRkqHQdN301y/UUVxdw3/hr+JOUjRnDoNKd9yGKyvsr/DbW7hXmOtxM4g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=P7mJg+vlqCLw+HOQbtRmEw+fGY5Z+9WRkFtq0iuiHdE=; b=e/C0YvNOp/2yiwWDyzJAWK6hEgv5a9nwN1Q5cEvS1egRcg+VW1x6xWVkcLuZ3y0OSKfRlFnmwQO/OuMLV304nfLY6tACtJzwsdTM7j6XvppfhImc6+f6CHTtxRiw/NGIAFws8amB7Dj4TO+u0ku75DZ2NxXYKHOTJXWI7Uqtgb84u7XLm4xnKLp75vGJIRv9cwK3jtJkgBq2tt0LcY7FUhu0aP3Gidh1WikBJ6iZ2yfJ9k4OXD/20bmi21+TvMKjimd6TAhVPp3QjLsfKyKbR3hiTgzR/8HORHcld1FcOjrncC7LQbhmKMD9UYSmaqNQaZ4PTymZRL4kPebm1Sc8Ng== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com; dkim=pass header.d=intel.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel.onmicrosoft.com; s=selector2-intel-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=P7mJg+vlqCLw+HOQbtRmEw+fGY5Z+9WRkFtq0iuiHdE=; b=SdAG5AfDB5uX63+PT5zAd0/+qDu5wHrfrmdJ4BEwDBoIP7BmdWf+irOOqGkl8WTVbUgldjz2g5A4GqE3tCyc7gg1aHPz9wn3gDdhi23r/SHnL4qQEuIogvvWdWskPIeabuJbjB28XtutA0pcDyIl4Y/UYADtbql22xOPLwLdGiw= Received: from SN6PR11MB3101.namprd11.prod.outlook.com (2603:10b6:805:d8::23) by SA0PR11MB4638.namprd11.prod.outlook.com (2603:10b6:806:73::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3370.16; Fri, 11 Sep 2020 09:57:57 +0000 Received: from SN6PR11MB3101.namprd11.prod.outlook.com ([fe80::65f0:fb5b:639d:a0be]) by SN6PR11MB3101.namprd11.prod.outlook.com ([fe80::65f0:fb5b:639d:a0be%6]) with mapi id 15.20.3370.016; Fri, 11 Sep 2020 09:57:57 +0000 From: "De Lara Guarch, Pablo" To: "O'loingsigh, Mairtin" , "Singh, Jasvinder" CC: "dev@dpdk.org" , "Ryan, Brendan" , "Coyle, David" Thread-Topic: [PATCH] net: add support for AVX512 when generating CRC Thread-Index: AQHWh2otWiMyUlm2IkarifLjV5g5GqliaMlg Date: Fri, 11 Sep 2020 09:57:57 +0000 Message-ID: References: <1599739271-16605-1-git-send-email-mairtin.oloingsigh@intel.com> In-Reply-To: <1599739271-16605-1-git-send-email-mairtin.oloingsigh@intel.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-reaction: no-action dlp-version: 11.5.1.3 authentication-results: intel.com; dkim=none (message not signed) header.d=none;intel.com; dmarc=none action=none header.from=intel.com; x-originating-ip: [109.255.188.24] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 38cd712d-b09c-48cd-7b02-08d8563928ef x-ms-traffictypediagnostic: SA0PR11MB4638: x-ms-exchange-transport-forked: True x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:9508; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: jlJBlUY+66qHYKnXPQtMcGBXhf709MzKGwOu1vymqVooT7svUc+YMOta5R+iBsE/H7ReAFlyQYtQJQ8lA4Nd5PgAO+8zjXxjas6jc6y/OBdRfAPViz64hHI6sBcZg27YVevfr+gRKdNN/8rOEng98A7LToPOPwiKaJr8IR5N781nLsxG7u2iq7sKHGjLhhif/K98AAJ+J2iyVuRcC1FTt9alm1ar577niqhgs7KEP9K41Q8V+xI3tKMk1NYmxEc9vE3qhZJan9jgIAgo7Iknp+EcRXcDrnNN4DkDsJWGaWcDMiSqfJIzU9AGTobBHerl x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:SN6PR11MB3101.namprd11.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(396003)(136003)(39860400002)(346002)(376002)(366004)(8676002)(71200400001)(7696005)(26005)(6506007)(53546011)(9686003)(83380400001)(107886003)(4326008)(316002)(55236004)(33656002)(66556008)(66446008)(66476007)(66946007)(5660300002)(55016002)(64756008)(54906003)(86362001)(8936002)(2906002)(478600001)(110136005)(186003)(76116006)(6636002)(52536014); DIR:OUT; SFP:1102; x-ms-exchange-antispam-messagedata: WXYNgC5NDYYZUn/c8q3QYy0Hipj9tmU4kVoZH7YwJqGAIoeqgUoeV1Noz1RlzjwRo7YhiGXBemg0xI4TmUGBdPnHBUOzHTqW3PoInVyfXGiMADmvOcfthl8da6zfdxSvaPeXFpoccvLtcrz7vSbvCw2WOpQpvQFDEYOLuBoxcFcjTyyYSz4t9nXsxIa4+oFSeUMImXFWNgMvs8kp2KVwqUQQRSYXXLe3uJwoxnUU1sCrdLUZaSN8M35nkPd/5OffV7qKWMLaYBtpG88EzUeJl1sBT45UME0tAaqIGsshzVGbo8jIgVHURrVl7+dxsHxN3BZ9y4t7Sn4XH5q2ctpGHdRIzh0edQD8Jax0KV3uuPUlh+smjat+glf5gup0vQYlXrpOs7OLM+3VRM8ME3nYfcSCOGctOic+G/XyeDQg/oXlViLby01oZMmzcn9FLMKqjXgBf2/gXu5pHUdKTC2DPQpKPfHPSFgzEnSCS1TgrSGmzXqxX/jiswCiz/+SRGEgD55fh4/HXBuizDE86+O4yTYUJcGbTcKYiGTEz3eMgAtaWJXoRVdjozBCVcfeCvTUWX9Anvrc4Eagf653GKA/bqyRwgHIJ5MpWIzjcwpsxabH1bPUa54/TI1YnAAQpnsUlrU0QqYDAA7NDxclJEswKw== Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: SN6PR11MB3101.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 38cd712d-b09c-48cd-7b02-08d8563928ef X-MS-Exchange-CrossTenant-originalarrivaltime: 11 Sep 2020 09:57:57.0914 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: gy94IqOSqszYFB1xcCTfIjpiHYe2wfXwmOhIGwebM+gXi2uBygaqbQcM4ugd5rKxsyuDvQ64VKigfb0ckskhblP0rFpqSD3sdcWQXs63QSw= X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA0PR11MB4638 X-OriginatorOrg: intel.com Subject: Re: [dpdk-dev] [PATCH] net: add support for AVX512 when generating CRC X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Mairtin, > -----Original Message----- > From: O'loingsigh, Mairtin > Sent: Thursday, September 10, 2020 1:01 PM > To: Singh, Jasvinder > Cc: dev@dpdk.org; Ryan, Brendan ; Coyle, David > ; De Lara Guarch, Pablo > ; O'loingsigh, Mairtin > > Subject: [PATCH] net: add support for AVX512 when generating CRC >=20 > This patch enables the generation of CRC using AVX512 instruction set whe= n > available on the host platform. >=20 > Signed-off-by: Mairtin o Loingsigh > --- >=20 > v1: > * Initial version, with AVX512 support for CRC32 Ethernet only (requires= further > updates) > * AVX512 support for CRC16-CCITT and final implementation of > CRC32 Ethernet will be added in v2 > --- > doc/guides/rel_notes/release_20_11.rst | 4 + > lib/librte_net/net_crc_avx.h | 331 ++++++++++++++++++++++++++= ++++++ > lib/librte_net/rte_net_crc.c | 23 ++- > lib/librte_net/rte_net_crc.h | 1 + > 4 files changed, 358 insertions(+), 1 deletions(-) create mode 100644 > lib/librte_net/net_crc_avx.h >=20 > diff --git a/doc/guides/rel_notes/release_20_11.rst > b/doc/guides/rel_notes/release_20_11.rst > index df227a1..d6a84ca 100644 > --- a/doc/guides/rel_notes/release_20_11.rst > +++ b/doc/guides/rel_notes/release_20_11.rst > @@ -55,6 +55,10 @@ New Features > Also, make sure to start the actual text at the margin. > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D >=20 > +* **Added support for AVX512 in rte_net CRC calculations.** > + > + Added new CRC32 calculation code using AVX512 instruction set Added > + new CRC16-CCITT calculation code using AVX512 instruction set >=20 > Removed Items > ------------- > diff --git a/lib/librte_net/net_crc_avx.h b/lib/librte_net/net_crc_avx.h = new file > mode 100644 index 0000000..d9481d5 > --- /dev/null > +++ b/lib/librte_net/net_crc_avx.h ... > +static __rte_always_inline uint32_t > +crc32_eth_calc_pclmulqdq( > + const uint8_t *data, > + uint32_t data_len, > + uint32_t crc, > + const struct crc_pclmulqdq512_ctx *params) { > + __m256i b; > + __m512i temp, k; > + __m512i qw0 =3D _mm512_set1_epi64(0); > + __m512i fold0; > + uint32_t n; This is loading 64 bytes of data, but if seems like only 16 are available, = right? Should we use _mm_loadu_si128? > + fold0 =3D _mm512_xor_si512(fold0, temp); > + goto reduction_128_64; > + } > + > + if (unlikely(data_len < 16)) { > + /* 0 to 15 bytes */ > + uint8_t buffer[16] __rte_aligned(16); > + > + memset(buffer, 0, sizeof(buffer)); > + memcpy(buffer, data, data_len); I would use _mm_maskz_loadu_epi8, passing a mask register with ((1 << data_= len) - 1). > + > + fold0 =3D _mm512_load_si512((const __m128i *)buffer); > + fold0 =3D _mm512_xor_si512(fold0, temp); > + if (unlikely(data_len < 4)) { > + fold0 =3D xmm_shift_left(fold0, 8 - data_len); > + goto barret_reduction; > + } > + fold0 =3D xmm_shift_left(fold0, 16 - data_len); > + goto reduction_128_64; > + } > + /* 17 to 31 bytes */ > + fold0 =3D _mm512_loadu_si512((const __m512i *)data); Same here. Looks like you are loading too much data? > + fold0 =3D _mm512_xor_si512(fold0, temp); > + n =3D 16; > + k =3D params->rk1_rk2; > + goto partial_bytes; > + } ... > + > + fold0 =3D _mm512_xor_si512(fold0, temp); > + fold0 =3D _mm512_xor_si512(fold0, b); You could use _mm512_ternarylogic_epi64 with 0x96 as to do 2x XORs in one i= nstruction. > + } > + > + /** Reduction 128 -> 32 Assumes: fold holds 128bit folded data */ > +reduction_128_64: > + k =3D params->rk5_rk6; > + > +barret_reduction: > + k =3D params->rk7_rk8; > + n =3D crcr32_reduce_64_to_32(fold0, k); > + > + return n; > +} > + > +