From: Yongseok Koh
To: Ferruh Yigit
CC: Thomas Monjalon, "Wiles, Keith", dev, "Richardson, Bruce", Shahaf Shuler, "konstantin.ananyev@intel.com", "anatoly.burakov@intel.com", "stable@dpdk.org", "justin.parus@microsoft.com", "christian.ehrhardt@canonical.com", "david.coronel@canonical.com", "josh.powers@canonical.com", "jay.vosburgh@canonical.com", "dan.streetman@canonical.com"
Date: Thu, 8 Nov 2018 23:01:03 +0000
References: <20181023212318.43082-1-yskoh@mellanox.com> <432F92CE-5714-45DC-B72F-CD8771DAFC89@intel.com> <1612642.At0RDolh7h@xps> <9d3f48fc-5a47-c813-1da8-7e1cab6bdd9e@intel.com>
In-Reply-To: <9d3f48fc-5a47-c813-1da8-7e1cab6bdd9e@intel.com>
Subject: Re: [dpdk-dev] AVX512 bug on SkyLake
List-Id: DPDK patches and discussions

> On Nov 8, 2018, at 9:21 AM, Ferruh Yigit wrote:
>
> On 11/8/2018 3:59 PM, Thomas Monjalon wrote:
>> Hi,
>>
>> We need to gather more information about this bug.
>> More below.
>>
>> 07/11/2018 10:04, Wiles, Keith:
>>>> On Nov 6, 2018, at 9:30 PM, Yongseok Koh wrote:
>>>>> On Nov 5, 2018, at 6:06 AM, Wiles, Keith wrote:
>>>>>> On Nov 2, 2018, at 9:04 PM, Yongseok Koh wrote:
>>>>>>
>>>>>> This is a workaround to prevent a crash, which might be caused by
>>>>>> optimization of newer gcc (7.3.0) on Intel Skylake.
>>>>>
>>>>> Should the code below not also test for the gcc version and
>>>>> the Sky Lake processor, maybe I am wrong but it seems it is
>>>>> turning AVX512 for all GCC builds
>>>>
>>>> I didn't want to check gcc version as 7.3.0 is very new. Only gcc 8 is newly up since then (gcc 8.2).
>>>> Also, I wasn't able to test every gcc versions and I wanted to be a bit conservative for this crash.
>>>> Performance drop (if any) by disabling a new (experimental) feature would be less risky than unaccountable crash.
>>>> And, it does disable the feature only if CONFIG_RTE_ENABLE_AVX512=n. Please refer to v3.
>>>
>>> Are you not turning off all of the GCC versions for AVX512.
>>> And you can test for range or greater then GCC version and
>>> it just seems like we are turning off every gcc version, is that true?
>>
>> Do we know exactly which GCC versions are affected?
>>
>>>>> Also bug 97 seems a bit obscure reference, maybe you know
>>>>> the bug report, but more details would be good?
>>>>
>>>> I sent out the report to dev list two month ago.
>>>> And I created the Bug 97 in order to reference it
>>>> in the commit message.
>>>> I didn't want to repeat same message here and there,
>>>> but it would've been better to have some sort of summary
>>>> of the Bug, although v3 has a few more words.
>>>> However, v3 has been merged.
>>>
>>> Still this is too obscure if nothing else give a link to
>>> a specific bug not just 97.
>>
>> The URL is
>> https://bugs.dpdk.org/show_bug.cgi?id=97
>> The bug is also pointing to an email:
>> https://mails.dpdk.org/archives/dev/2018-September/111522.html
>>
>> Summary:
>> - CPU: Intel Skylake
>> - Linux environment: Ubuntu 18.04
>> - Compiler: gcc-7.3 (Ubuntu 7.3.0-16ubuntu3)
>
> Is it possible to test a few other gcc versions to check if the issue is
> specific to this compiler version?

Nothing's impossible, but even a quick search on gcc.gnu.org shows that the
documentation for each of the following releases mentions -mavx512f support:

GCC 4.9.0  April 22, 2014 (changes, documentation)
GCC 5.1    April 22, 2015 (changes, documentation)
GCC 6.4    July 4, 2017 (changes, documentation)
GCC 7.1    May 2, 2017 (changes, documentation)
GCC 8.1    May 2, 2018 (changes, documentation)

Altogether, we would have to put in quite a lot of resources to verify all of
these versions.

I assumed versions older than gcc 7 would have the same issue. I know that was
speculation but, like I mentioned, I wanted to be more conservative. I didn't
mean this to be a permanent fix.

For two months we didn't have any tangible solution (actually nobody cared,
including myself), so I submitted the patch to temporarily disable -mavx512f.

I'm still not sure what the best option is...

Thanks,
Yongseok

>
>> - Scenario: testpmd crashes when it starts forwarding
>> - Behaviour: AVX2 version of rte_memcpy() optimized with 512b instructions
>> - Fix: disable AVX512 optimization with -mno-avx512f
>>
>> It seems to have been reproduced only when using mlx5 PMD so far.
>> Any other experience?
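
For illustration only, here is a minimal C sketch of the kind of compiler
version-range check raised in the quoted discussion, together with a way to
see whether AVX512F code generation is on for a given build. The macros
__GNUC__, __GNUC_MINOR__ and __AVX512F__ are predefined by GCC. The helper
names (GCC_VER, gcc_version_in_range, avx512f_codegen_enabled,
built_with_suspect_gcc) and the "7.x" range are assumptions made for this
example and are not taken from the merged patch. The thread has not
established which GCC versions are actually affected.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>

/* Hypothetical helper: fold major.minor into one comparable integer. */
#define GCC_VER(major, minor) ((major) * 100 + (minor))

/* Compile-time sanity check that the encoding orders versions correctly. */
static_assert(GCC_VER(7, 3) > GCC_VER(7, 0) && GCC_VER(8, 0) > GCC_VER(7, 99),
              "GCC_VER must order versions correctly");

/* True if major.minor lies in the inclusive range [lo, hi],
 * where lo and hi are GCC_VER() values. */
static bool gcc_version_in_range(int major, int minor, int lo, int hi)
{
    int v = GCC_VER(major, minor);
    return v >= lo && v <= hi;
}

/* True if this file is compiled with AVX512F code generation enabled
 * (-mavx512f, or implied by -march); false under -mno-avx512f. */
static bool avx512f_codegen_enabled(void)
{
#ifdef __AVX512F__
    return true;
#else
    return false;
#endif
}

/* True if the compiler building this file is GCC inside the ASSUMED
 * suspect range (7.x here, chosen only for illustration). */
static bool built_with_suspect_gcc(void)
{
#if defined(__GNUC__) && !defined(__clang__)
    return gcc_version_in_range(__GNUC__, __GNUC_MINOR__,
                                GCC_VER(7, 0), GCC_VER(7, 99));
#else
    return false;
#endif
}
```

Building the same file once with -mavx512f and once with -mno-avx512f should
flip the result of avx512f_codegen_enabled(), which is a cheap way to confirm
that the workaround flag actually reached a given compilation unit.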