From: "Ananyev, Konstantin"
To: "Richardson, Bruce"
CC: "Power, Ciara", "dev@dpdk.org", "Burakov, Anatoly", "Mcnamara, John", "Kovacevic, Marko"
Date: Mon, 7 Sep 2020 12:01:41 +0000
Subject: Re: [dpdk-dev] [PATCH v2 03/17] doc: add detail on using max SIMD bitwidth
References: <20200807155859.63888-1-ciara.power@intel.com> <20200827161304.32300-1-ciara.power@intel.com> <20200827161304.32300-4-ciara.power@intel.com> <20200907084428.GB312@bricha3-MOBL.ger.corp.intel.com>
In-Reply-To: <20200907084428.GB312@bricha3-MOBL.ger.corp.intel.com>

> On Sun, Sep 06, 2020 at 10:20:30PM +0000, Ananyev, Konstantin wrote:
> > > This patch adds documentation on the usage of the max SIMD bitwidth EAL
> > > setting, and how to use it to enable AVX-512 at runtime.
> > >
> > > Cc: Anatoly Burakov
> > > Cc: John McNamara
> > > Cc: Marko Kovacevic
> > >
> > > Signed-off-by: Ciara Power
> > > ---
> > >  doc/guides/howto/avx512.rst                   | 36 +++++++++++++++++++
> > >  doc/guides/linux_gsg/eal_args.include.rst     | 12 +++++++
> > >  .../prog_guide/env_abstraction_layer.rst      | 31 ++++++++++++++++
> > >  3 files changed, 79 insertions(+)
> > >  create mode 100644 doc/guides/howto/avx512.rst
> > >
> > > diff --git a/doc/guides/howto/avx512.rst b/doc/guides/howto/avx512.rst
> > > new file mode 100644
> > > index 0000000000..ebae0f2b4f
> > > --- /dev/null
> > > +++ b/doc/guides/howto/avx512.rst
> > > @@ -0,0 +1,36 @@
> > > +..  SPDX-License-Identifier: BSD-3-Clause
> > > +    Copyright(c) 2020 Intel Corporation.
> > > +
> > > +
> > > +Using AVX-512 with DPDK
> > > +=======================
> > > +
> > > +AVX-512 is not used by default in DPDK, but it can be selected at runtime by apps through the use of EAL API,
> > > +and by the user with a commandline argument. DPDK has a setting for max SIMD bitwidth,
> > > +which can be modified and will then limit the vector path taken by the code.
> >
> > It's a good idea to have such ability,
> > though just one global variable for all DPDK lib/drivers
> > seems a bit coarse to me.
> > Let's say we have 2 libs: libA and libB.
> > Both do have RTE_MAX_512_SIMD specific code-path,
> > though libA would cause frequency level change, while libB wouldn't.
> > So user (to avoid frequency level change) would have to block
> > 512_SIMD for both libs.
> > I think it would be much better to follow the strategy we use for log-level:
> > there is a global simd_width, but each DPDK entity (lib/driver) also has
> > its own simd_width that overrules a global one (more fine-grained control).
>
> That for me is a nightmare scenario. How is the user meant to know what
> libs could cause him a frequency or not, or is he meant to determine that
> empirically by trial and error on each platform?

I suppose yes.
Let's say the user can try to run the app with global --force-max-simd-bitwidth=256
and --force-max-simd-bitwidth=512 and check the difference.
If he is happy with the performance he gets, he can stick with one of the
global values (256/512).
If not, he can try further by choosing a different max-simd-width for
different components.

> This scenario is
> completely unlike logging in that it's non-obvious to the user, and so
> needs to be kept as consumable as possible to the app-developer and the
> user.

This feature is totally optional: if the user feels he doesn't need to care
about it, he can simply ignore it and use the default values.
Though for those who do care, one global value seems too restrictive.

> Unless we find a concrete scenario where having a single switch is
> causing real user problems, I'd much rather keep things simple.

As an example, I ran several perf tests with the acl avx512 code path
and so far didn't see any switches to CORE_POWER.LVL2_TURBO_LICENSE
(heavy AVX512 instructions).
I presume there might be other light-weight avx512 codepaths (lpm, etc.).
Though for crypto cpu PMDs (aesni-mb, etc.) I think it would cause a switch to LVL2.

> See also answer below, where I point out that the main target of this is developers,
> who can use this flag to indicate what vector bitwidth their app uses,
> and then allow DPDK to match that.

But in the majority of cases the developer doesn't know for sure on what
platform his app will run
(apart from the quite rare situation when the app is developed for one
particular platform).
Again, for complex/multi-purpose applications (like VPP, DPDK-OVS) the developer
can't even always predict which modules will be used and which won't.
Again, the app can be configured in a way that different modules run on
different cores
(let's say a module that does ACL lookup on core X, and a module that does
the actual crypto on core Y).
All that depends on particular deployment scenarios.
So in many cases only the end-user has all the information to decide what
max-simd width will be optimal.

>
> >
> > > +
> > > +
> > > +Using the API in apps
> > > +---------------------
> > > +
> > > +Apps can request DPDK uses AVX-512 at runtime, if it provides improved application performance.
> > > +This can be done by modifying the EAL setting for max SIMD bitwidth to 512, as by default it is 256,
> > > +which does not allow for AVX-512.
> > > +
> > > +.. code-block:: c
> > > +
> > > +    rte_set_max_simd_bitwidth(RTE_MAX_512_SIMD);
> > > +
> > > +This API should only be called once at initialization, before EAL init.
> >
> > If the only possible usage scenario for that function is init time before EAL init,
> > then do we really need it at all?
> > As we have cmd-line flag anyway?
> > User can achieve similar goal, by just: rte_eal_init(,..."--force-max-simd-bitwidth=..."...);
>
> Ideally, the user should never know or care about the cmdline flag, it's
> only for testing. The main criteria for allowing DPDK to use longer
> instruction sets is whether the application itself will similarly use them,
> and that's something for the programmer to do.

Unfortunately, I don't think the programmer has all the information to make
such decisions either.
A lot depends on deployment scenarios, see above.

> Having the programmer muck
> about with cmdline arguments is less than ideal, so a proper API is
> warranted here.

Agree, a function call is more convenient for the developer.

> The reason for the note about EAL init, is that we don't
> want libraries to have to check the max bitwidth each time an API is
> called, so we want to have a way to prevent people changing things at
> runtime. This therefore seemed simplest.

I understand that, but for that purpose just the cmd-line flag is enough,
that's why I asked whether we need an API call at all.
It seems a bit strange to me to introduce an API that is supposed to be called
only *before* eal_init(), but on the other hand I don't see much harm from it either.
So if you and the other guys still prefer to keep it - ok by me.
Konstantin
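
For illustration only, the log-level-style scheme discussed above (a global
SIMD width that an individual lib/driver setting can overrule) could be
sketched in plain C roughly as follows. This is a minimal sketch under stated
assumptions, not code from the patch: every name in it (simd_width,
simd_override, simd_config, simd_effective_width, and the component strings)
is hypothetical and is not a DPDK API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical width values; these are NOT the patch's RTE_MAX_*_SIMD enum. */
enum simd_width {
	SIMD_128 = 128,
	SIMD_256 = 256,
	SIMD_512 = 512
};

/* One per-component override, e.g. { "lib.acl", SIMD_512 } (hypothetical names). */
struct simd_override {
	const char *component;
	enum simd_width width;
};

/* Global setting plus an optional table of per-component overrides. */
struct simd_config {
	enum simd_width global_width;
	const struct simd_override *overrides;
	size_t nb_overrides;
};

/*
 * Return the max SIMD width a component may use: its own override if one
 * is registered, otherwise the global value (same idea as log levels).
 */
enum simd_width
simd_effective_width(const struct simd_config *cfg, const char *component)
{
	for (size_t i = 0; i < cfg->nb_overrides; i++) {
		if (strcmp(cfg->overrides[i].component, component) == 0)
			return cfg->overrides[i].width;
	}
	return cfg->global_width;
}
```

With a global width of 256 and a single override of 512 for "lib.acl", a
lookup for "lib.acl" yields 512 while a lookup for "pmd.aesni_mb" falls back
to 256, which corresponds to the ACL/crypto example above.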