From: "Ananyev, Konstantin"
To: Thomas Monjalon
CC: "Yigit, Ferruh", "Richardson, Bruce", "stable@dpdk.org", "Wiles, Keith", Yongseok Koh, "dev@dpdk.org", Shahaf Shuler, "Burakov, Anatoly", "justin.parus@microsoft.com", "christian.ehrhardt@canonical.com", "david.coronel@canonical.com", "josh.powers@canonical.com", "jay.vosburgh@canonical.com", "dan.streetman@canonical.com"
Date: Mon, 12 Nov 2018 09:26:36 +0000
Message-ID: <2601191342CEEE43887BDE71AB977258010CE492F2@IRSMSX106.ger.corp.intel.com>
In-Reply-To: <14171327.rkP5k0YJWv@xps>
Subject: Re: [dpdk-dev] [dpdk-stable] AVX512 bug on SkyLake

> 11/11/2018 15:15, Ananyev, Konstantin:
> > Hi Thomas,
> >
> > > Below is my conclusion for this bug.
> > > An expert of x86 is required to follow-up.
> > >
> > > Summary:
> > >   - CPU: Intel Skylake
> > >   - Linux environment: Ubuntu 18.04
> > >   - Compiler: GCC 7 or 8
> > >   - Scenario: testpmd crashes when it starts forwarding
> > >   - Behaviour: AVX2 version of rte_memcpy() fails if optimized for AVX512
> > >   - Context: inline rte_memcpy() is called from
> > >       inline rte_mempool_put_bulk(), called from
> > >       mlx5_tx_complete() (inline or not)
> > >   - Analysis: AVX512 optimization changes vmovdqu to vmovdqu8
> > >
> > > Latest status can be found in Bugzilla:
> > >   https://bugs.dpdk.org/show_bug.cgi?id=97#c35
> > >
> >
> > Looking at the disassembled output from the bug report, it seems that the
> > problem is not in the vmovdqu8 instruction itself, but in the wrong offsets
> > generated by the compiler:
> >
> > vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x2]
> > vinserti128 ymm0,ymm0,XMMWORD PTR [rax*8+0x30],0x1
> > vmovups XMMWORD PTR [rsi+0x20],xmm0
> > vextracti128 XMMWORD PTR [rsi+0x30],ymm0,0x1
> > vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x4]
> > vinserti128 ymm0,ymm0,XMMWORD PTR [rax*8+0x50],0x1
> > vmovups XMMWORD PTR [rsi+0x40],xmm0
> > vextracti128 XMMWORD PTR [rsi+0x50],ymm0,0x1
> > vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x6]
> >
> > Should be:
> > vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x20]
> > I think.
> >
> > Same for the next two offsets: 0x4 and 0x6 should be 0x40 and 0x60 respectively.
>
> Yes, you're right, I missed it, thank you!
>
> The full diff is below:
>
> --- bad-avx512-enabled
> +++ good-avx512-disabled
> -       vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x0]
> +       vmovdqu xmm0,XMMWORD PTR [rax*8+0x0]
>         vinserti128 ymm0,ymm0,XMMWORD PTR [rax*8+0x10],0x1
>         vmovups XMMWORD PTR [rsi],xmm0
>         vextracti128 XMMWORD PTR [rsi+0x10],ymm0,0x1
> -       vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x2]
> +       vmovdqu xmm0,XMMWORD PTR [rax*8+0x20]
>         vinserti128 ymm0,ymm0,XMMWORD PTR [rax*8+0x30],0x1
>         vmovups XMMWORD PTR [rsi+0x20],xmm0
>         vextracti128 XMMWORD PTR [rsi+0x30],ymm0,0x1
> -       vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x4]
> +       vmovdqu xmm0,XMMWORD PTR [rax*8+0x40]
>         vinserti128 ymm0,ymm0,XMMWORD PTR [rax*8+0x50],0x1
>         vmovups XMMWORD PTR [rsi+0x40],xmm0
>         vextracti128 XMMWORD PTR [rsi+0x50],ymm0,0x1
> -       vmovdqu8 xmm0,XMMWORD PTR [rax*8+0x6]
> +       vmovdqu xmm0,XMMWORD PTR [rax*8+0x60]
>         vinserti128 ymm0,ymm0,XMMWORD PTR [rax*8+0x70],0x1
>         vmovups XMMWORD PTR [rsi+0x60],xmm0
>         vextracti128 XMMWORD PTR [rsi+0x70],ymm0,0x1
>
> > Not sure what is causing the compiler to behave that way.
> > BTW, looking through the testpmd objdump output, it seems that only the mlx5 driver
> > exhibits this problem (I didn't actually enable mlx4; probably the same problem there).
> > Which looks a bit weird to me.
>
> Yes it's weird. I don't see how the mlx5 code could influence
> the compiler to generate this bad code in AVX512 mode.

Same here: I looked through the mlx5_rxtx code, and it is unclear to me what triggers the issue.
So far it looks like a gcc bug to me.
Konstantin
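
A note on the offsets in the diff above, offered as a minimal sketch rather than a diagnosis: each wrong displacement in the AVX512 build equals the correct one divided by 16, which is the size in bytes of an XMMWORD operand. The Python below only checks that arithmetic against the values copied from the objdump output. The names BAD_DISP, GOOD_DISP, scaled_down and matches_scaling are made up for illustration, and nothing here shows which part of the toolchain produced the wrong values.

```python
# Displacements of the four 128-bit loads, copied from the objdump diff above.
BAD_DISP = [0x0, 0x2, 0x4, 0x6]       # vmovdqu8 loads, AVX512 enabled
GOOD_DISP = [0x0, 0x20, 0x40, 0x60]   # vmovdqu loads, AVX512 disabled

XMMWORD_BYTES = 16  # size of one XMMWORD (128-bit) memory operand


def scaled_down(disp, operand_bytes=XMMWORD_BYTES):
    """Return disp divided by the operand size, or None if it is not a multiple."""
    if disp % operand_bytes:
        return None
    return disp // operand_bytes


def matches_scaling(bad, good):
    """True if every bad displacement equals the good one divided by 16."""
    return all(scaled_down(g) == b for b, g in zip(bad, good))


if __name__ == "__main__":
    for b, g in zip(BAD_DISP, GOOD_DISP):
        print(f"good {g:#04x} -> bad {b:#04x} (good / 16 = {scaled_down(g):#x})")
    print("pattern holds:", matches_scaling(BAD_DISP, GOOD_DISP))
```

Running the script prints each pair of displacements and whether the divide-by-16 pattern holds for all four loads.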