From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <yskoh@mellanox.com>
Received: from EUR03-VE1-obe.outbound.protection.outlook.com
 (mail-eopbgr50061.outbound.protection.outlook.com [40.107.5.61])
 by dpdk.org (Postfix) with ESMTP id F085C2BE5;
 Fri,  9 Nov 2018 00:01:05 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com;
 s=selector1;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=H5/VsfPNkziIwHNc++NkFBy7uagyQ67Po/mUNarkR14=;
 b=YjPFe+ojCLZoHVDaIFAISaRC6/VczMHCvCBcU21tL3C2dNKMFq8JaOvyCetWh6FZogh8rxEkzetIIeXObAk9G8f+PnjzinOBRaqNjbk/XGj9VrPbPfJu3jAyo03uw388qG5ZLD9XiWVbCpDF47rJPNOu2MLVCvKEn1tHduR3QjA=
Received: from DB3PR0502MB3980.eurprd05.prod.outlook.com (52.134.72.27) by
 DB3PR0502MB3964.eurprd05.prod.outlook.com (52.134.65.161) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.1294.27; Thu, 8 Nov 2018 23:01:04 +0000
Received: from DB3PR0502MB3980.eurprd05.prod.outlook.com
 ([fe80::58e7:97d8:f9c1:4323]) by DB3PR0502MB3980.eurprd05.prod.outlook.com
 ([fe80::58e7:97d8:f9c1:4323%3]) with mapi id 15.20.1294.034; Thu, 8 Nov 2018
 23:01:04 +0000
From: Yongseok Koh <yskoh@mellanox.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>
CC: Thomas Monjalon <thomas@monjalon.net>, "Wiles, Keith"
 <keith.wiles@intel.com>, dev <dev@dpdk.org>, "Richardson, Bruce"
 <bruce.richardson@intel.com>, Shahaf Shuler <shahafs@mellanox.com>,
 "konstantin.ananyev@intel.com" <konstantin.ananyev@intel.com>,
 "anatoly.burakov@intel.com" <anatoly.burakov@intel.com>, "stable@dpdk.org"
 <stable@dpdk.org>, "justin.parus@microsoft.com" <justin.parus@microsoft.com>, 
 "christian.ehrhardt@canonical.com" <christian.ehrhardt@canonical.com>,
 "david.coronel@canonical.com" <david.coronel@canonical.com>,
 "josh.powers@canonical.com" <josh.powers@canonical.com>,
 "jay.vosburgh@canonical.com" <jay.vosburgh@canonical.com>,
 "dan.streetman@canonical.com" <dan.streetman@canonical.com>
Thread-Topic: AVX512 bug on SkyLake
Thread-Index: AQHUd3wI58YS6Uwok0qazdXgEOqcyKVGIB+AgABevQA=
Date: Thu, 8 Nov 2018 23:01:03 +0000
Message-ID: <CCB20D12-954E-46D3-98BC-D1E832F07DEA@mellanox.com>
References: <20181023212318.43082-1-yskoh@mellanox.com>
 <C58698CB-44E4-4126-908E-B8C898BF983D@mellanox.com>
 <432F92CE-5714-45DC-B72F-CD8771DAFC89@intel.com> <1612642.At0RDolh7h@xps>
 <9d3f48fc-5a47-c813-1da8-7e1cab6bdd9e@intel.com>
In-Reply-To: <9d3f48fc-5a47-c813-1da8-7e1cab6bdd9e@intel.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
authentication-results: spf=none (sender IP is )
 smtp.mailfrom=yskoh@mellanox.com; 
x-originating-ip: [69.181.245.183]
x-ms-publictraffictype: Email
received-spf: None (protection.outlook.com: mellanox.com does not designate
 permitted sender hosts)
Content-Type: text/plain; charset="us-ascii"
Content-ID: <87B9C8FA2ECA744BB4D8B5D2D00032E4@eurprd05.prod.outlook.com>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-OriginatorOrg: Mellanox.com
X-MS-Exchange-CrossTenant-Network-Message-Id: 3bb0ed3c-e181-47b9-7725-08d645ce0f41
X-MS-Exchange-CrossTenant-originalarrivaltime: 08 Nov 2018 23:01:03.9781 (UTC)
X-MS-Exchange-CrossTenant-fromentityheader: Hosted
X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b
X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB3PR0502MB3964
Subject: Re: [dpdk-stable] AVX512 bug on SkyLake
X-BeenThere: stable@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches for DPDK stable branches <stable.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/stable>,
 <mailto:stable-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/stable/>
List-Post: <mailto:stable@dpdk.org>
List-Help: <mailto:stable-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/stable>,
 <mailto:stable-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 08 Nov 2018 23:01:06 -0000


> On Nov 8, 2018, at 9:21 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>
> On 11/8/2018 3:59 PM, Thomas Monjalon wrote:
>> Hi,
>>
>> We need to gather more information about this bug.
>> More below.
>>
>> 07/11/2018 10:04, Wiles, Keith:
>>>> On Nov 6, 2018, at 9:30 PM, Yongseok Koh <yskoh@mellanox.com> wrote:
>>>>> On Nov 5, 2018, at 6:06 AM, Wiles, Keith <keith.wiles@intel.com> wrote:
>>>>>> On Nov 2, 2018, at 9:04 PM, Yongseok Koh <yskoh@mellanox.com> wrote:
>>>>>>
>>>>>> This is a workaround to prevent a crash, which might be caused by
>>>>>> optimization of newer gcc (7.3.0) on Intel Skylake.
>>>>>
>>>>> Should the code below not also test for the gcc version and
>>>>> the Sky Lake processor, maybe I am wrong but it seems it is
>>>>> turning AVX512 for all GCC builds
>>>>
>>>> I didn't want to check gcc version as 7.3.0 is very new. Only gcc 8 is newly up since then (gcc 8.2).
>>>> Also, I wasn't able to test every gcc versions and I wanted to be a bit conservative for this crash.
>>>> Performance drop (if any) by disabling a new (experimental) feature would be less risky than unaccountable crash.
>>>> And, it does disable the feature only if CONFIG_RTE_ENABLE_AVX512=n. Please refer to v3.
>>>
>>> Are you not turning off all of the GCC versions for AVX512.
>>> And you can test for range or greater then GCC version and
>>> it just seems like we are turning off every gcc version, is that true?
>>
>> Do we know exactly which GCC versions are affected?
>>
>>>>> Also bug 97 seems a bit obscure reference, maybe you know
>>>>> the bug report, but more details would be good?
>>>>
>>>> I sent out the report to dev list two month ago.
>>>> And I created the Bug 97 in order to reference it
>>>> in the commit message.
>>>> I didn't want to repeat same message here and there,
>>>> but it would've been better to have some sort of summary
>>>> of the Bug, although v3 has a few more words.
>>>> However, v3 has been merged.
>>>
>>> Still this is too obscure if nothing else give a link to
>>> a specific bug not just 97.
>>
>> The URL is
>> 	https://bugs.dpdk.org/show_bug.cgi?id=97
>> The bug is also pointing to an email:
>> 	https://mails.dpdk.org/archives/dev/2018-September/111522.html
>>
>> Summary:
>> 	- CPU: Intel Skylake
>> 	- Linux environment: Ubuntu 18.04
>> 	- Compiler: gcc-7.3 (Ubuntu 7.3.0-16ubuntu3)
>
> Is it possible to test a few other gcc versions to check if the issue is
> specific to this compiler version?

Nothing's impossible, but even a quick search on gcc.gnu.org shows that
the documentation for each of the following releases mentions -mavx512f support:

GCC 4.9.0, April 22, 2014
GCC 5.1, April 22, 2015
GCC 6.4, July 4, 2017
GCC 7.1, May 2, 2017
GCC 8.1, May 2, 2018

Verifying all of these versions would take considerable resources from all of us.

I assumed that versions older than gcc 7 would have the same issue. I know that
was speculation, but as I mentioned, I wanted to be conservative. I didn't mean
this to be a permanent fix.
For two months we had no tangible solution (actually, nobody cared, myself included),
so I submitted the patch to temporarily disable -mavx512f.
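
For what it's worth, the version check itself would be easy to write; what we
lack is the data to pick the range. Below is a minimal sketch in C. It is not
the merged patch. GCC_VER, gcc_ver_in_range, this_gcc_ver and avx512f_enabled
are hypothetical helper names, not DPDK APIs, and any range passed in is an
assumption because only gcc 7.3.0 has been tested so far. __GNUC__,
__GNUC_MINOR__, __GNUC_PATCHLEVEL__ and __AVX512F__ are the compiler's own
predefined macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Pack a gcc version triple into one comparable integer (hypothetical helper). */
#define GCC_VER(maj, min, patch) ((maj) * 10000 + (min) * 100 + (patch))

/* Return true if ver lies in the inclusive range [lo, hi]. */
static bool gcc_ver_in_range(int ver, int lo, int hi)
{
	assert(lo <= hi);
	return ver >= lo && ver <= hi;
}

/* Version of the compiler building this file, or 0 if it is not gcc. */
static int this_gcc_ver(void)
{
#if defined(__GNUC__) && !defined(__clang__)
	return GCC_VER(__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__);
#else
	return 0;
#endif
}

/* 1 if this file is compiled with AVX512F code generation enabled, else 0. */
static int avx512f_enabled(void)
{
#ifdef __AVX512F__
	return 1;
#else
	return 0;
#endif
}
```

For example, gcc_ver_in_range(this_gcc_ver(), GCC_VER(7, 3, 0), GCC_VER(7, 3, 0))
is true only when building with gcc 7.3.0, and avx512f_enabled() returns 0
whenever -mno-avx512f is in effect for the file.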

I'm still not sure what the best option is...

Thanks,
Yongseok

>
>> 	- Scenario: testpmd crashes when it starts forwarding
>> 	- Behaviour: AVX2 version of rte_memcpy() optimized with 512b instructions
>> 	- Fix: disable AVX512 optimization with -mno-avx512f
>>
>> It seems to have been reproduced only when using mlx5 PMD so far.
>> Any other experience?