From: fengchengwen
To: Honnappa Nagarahalli, Jerin Jacob, "Richardson, Bruce", "thomas@monjalon.net", David Marchand, Stephen Hemminger, "Ananyev, Konstantin"
CC: "dev@dpdk.org", "jerinj@marvell.com", Ruifeng Wang, "humin29@huawei.com", nd
Date: Wed, 12 May 2021 16:47:04 +0800
Subject: Re: [dpdk-dev] How to disable SVE auto vectorization while using GCC
Message-ID: <358a16d0-8489-4180-7c2b-2118544f78e5@huawei.com>
References: <319916e1-3380-6ed5-afd3-38e1295c4733@huawei.com>

On 2021/5/11 22:10, Honnappa Nagarahalli wrote:
>
>>>
>>>>
>>>> Thanks for your suggestions, we found that the
>>>> -fno-tree-vectorize option works.
>>>> PS: This option was not added successfully in the earliest test.
>>>>
>>>> Solution:
>>>> 1. Use the -fno-tree-vectorize option to prevent the compiler from
>>>>    generating auto-vectorized code, so that the slow-path will work fine.
>>>> 2. Add a '-march=armv8-a+sve+crc' line to implementer_generic in
>>>>    arm/meson.build:
>>>>      'part_number_config': {
>>>>          'generic': {'machine_args': ['-march=armv8-a+crc',
>>>>                                       '-march=armv8-a+sve+crc',
>>>>                                       '-moutline-atomics']}
>>>>      }
>>>>    If the compiler doesn't support '-march=armv8-a+sve+crc', it will
>>>>    fall back to '-march=armv8-a+crc'.
>>>>    If the compiler supports '-march=armv8-a+sve+crc', it will compile
>>>>    the SVE-related code, so the IO-path could support SVE.
>>>>
>>>> Based on the above we could achieve the initial target.
>>> The 'generic' target is for generating a binary that would work on all ArmV8
>>> machines. If you are building with '-march=armv8-a+sve+crc', the IO-Path
>>> would not work on non-SVE machines.
>>>
>>
>> The 'generic' target is only used in local CI (note: the two platforms are
>> both ARMv8 machines).
>>
>> In the IO-path, we support NEON and SVE Rx/Tx. The code was written with
>> ACLE, so it is not affected by the -fno-tree-vectorize option.
>>
>> If the compiler supports '-march=armv8-a+sve+crc', then it will compile both
>> NEON and SVE related code.
> Using '-march=armv8-a+sve+crc' and '-fno-tree-vectorize' does not provide an absolute guarantee that the compiler will not use SVE elsewhere.
>
> The safest way to ensure that only specific functions use SVE is to compile without +sve (e.g. using -march=armv8-a) and use pragmas around the functions that are allowed to use SVE. Ex:
>
> #pragma GCC push_options
> #pragma GCC target ("+sve")
> void f(int *x) {
>     for (int i = 0; i < 100; ++i) x[i] = i;
> }
> #pragma GCC pop_options
> void g(int *x) {
>     for (int i = 0; i < 100; ++i) x[i] = i;
> }
>
> compiles f() using SVE and g() with standard options.
>
> You can also follow the function multiversioning discussed in the other thread.
>

Thanks for your suggestions.

Because the SVE code is organized by file, we use the following scheme in the
hns3 meson.build:

    if arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64')
        sources += files('hns3_rxtx_vec.c')

        # compile SVE when:
        # a. SVE is part of the minimum instruction set baseline, or
        # b. SVE is not in the baseline, but the compiler supports it
        if cc.get_define('__ARM_FEATURE_SVE', args: machine_args) != ''
            cflags += ['-DCC_SVE_SUPPORT']
            sources += files('hns3_rxtx_vec_sve.c')
        elif cc.has_argument('-march=armv8.2-a+sve')
            cflags += ['-DCC_SVE_SUPPORT']
            hns3_sve_lib = static_library('hns3_sve_lib',
                    'hns3_rxtx_vec_sve.c',
                    dependencies: [static_rte_ethdev],
                    include_directories: includes,
                    c_args: [cflags, '-march=armv8.2-a+sve'])
            objs += hns3_sve_lib.extract_objects('hns3_rxtx_vec_sve.c')
        endif
    endif

Ref: https://patchwork.dpdk.org/project/dpdk/patch/1620808126-18876-3-git-send-email-fengchengwen@huawei.com/

Best regards.

>> At runtime, the driver detects whether the platform supports SVE; if
>> not, it selects NEON.
>>
>> Best regards.
>>
>>>>
>>>>
>>>> On 2021/5/1 4:54, Honnappa Nagarahalli wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> On Fri, Apr 30, 2021 at 5:27 PM fengchengwen wrote:
>>>>>>>
>>>>>>> Hi, ALL
>>>>>>> We have a question for your help:
>>>>>>> 1. We have two platforms, both of which are ARM64. One supports
>>>>>>>    both NEON and SVE, the other supports only NEON.
>>>>>>> 2. We want to run on both platforms with a single binary file, and
>>>>>>>    use the highest vector capability of the corresponding platform
>>>>>>>    whenever possible.
>>>>>>
>>>>>> I see VPP has a similar feature. IMO, it is not present in DPDK.
>>>>>> Basically, in order to do this:
>>>>>> - Compile slow-path code (90% of DPDK) with minimal CPU instruction
>>>>>>   set support
>>>>>> - Have fastpath functions compiled with different CPU instruction set
>>>>>>   levels
>>>>>> - In slowpath, attach the fastpath function pointer based on
>>>>>>   CPU instruction-level support.
>>>>> Agree.
>>>>>
>>>>>>
>>>>>>
>>>>>>> 3. So we build the DPDK program with -march=armv8-a+sve+crc (GCC 10.2).
>>>>> This defines the minimum capabilities of the target machine.
>>>>>
>>>>>>>    However, it is found that invalid instructions occur when the
>>>>>>>    program runs on a machine that does not support SVE (pls see below).
>>>>>>> 4. The problem is caused by the introduction of SVE in GCC automatic
>>>>>>>    vector optimization.
>>>>>>>
>>>>>>> So is there a way to disable GCC automatic vector optimization or use
>>>>>>> only NEON to perform automatic vector optimization?
>>>>> I do not think this is safe. Once SVE is enabled, the compiler is allowed
>>>>> to use the SVE instructions wherever it sees fit.
>>>>>
>>>>>>>
>>>>>>> BTW: we already tested -fno-tree-vectorize (see the link below) but
>>>>>>> found no effect.
>>>>>>>
>>>>>>> https://stackoverflow.com/questions/7778174/how-can-i-disable-vectorization-while-using-gcc
>>>>>>>
>>>>>>>
>>>>>>> The GDB output:
>>>>>>> EAL: Detected 128 lcore(s)
>>>>>>> EAL: Detected 4 NUMA nodes
>>>>>>> Option -w, --pci-whitelist is deprecated, use -a, --allow option instead
>>>>>>>
>>>>>>> Program received signal SIGILL, Illegal instruction.
>>>>>>> 0x0000000000671b88 in eal_adjust_config ()
>>>>>>> (gdb)
>>>>>>> (gdb) where
>>>>>>> #0 0x0000000000671b88 in eal_adjust_config ()
>>>>>>> #1 0x0000000000682840 in rte_eal_init ()
>>>>>>> #2 0x000000000051c870 in main ()
>>>>>>> (gdb)
>>>>>>>
>>>>>>> The disassembly output of eal_adjust_config:
>>>>>>> 671b7c: f8237a81 str     x1, [x20, x3, lsl #3]
>>>>>>> 671b80: f110001f cmp     x0, #0x400
>>>>>>> 671b84: 54ffff21 b.ne    671b68 // b.any
>>>>>>> 671b88: 043357f5 addvl   x21, x19, #-1
>>>>>>> 671b8c: 043457e1 addvl   x1, x20, #-1
>>>>>>> 671b90: 910562b5 add     x21, x21, #0x158
>>>>>>> 671b94: 04e0e3e0 cntd    x0
>>>>>>> 671b98: 914012b5 add     x21, x21, #0x4, lsl #12
>>>>>>> 671b9c: 52800218 mov     w24, #0x10 // #16
>>>>>>> 671ba0: 25d8e3e1 ptrue   p1.d
>>>>>>> 671ba4: 25f80fe0 whilelo p0.d, wzr, w24
>>>>>>> 671ba8: a5e04020 ld1d    {z0.d}, p0/z, [x1, x0, lsl #3]
>>>>>>>
>>>>>>>
>>>>>>> Best regards.
>>>>>>>
>>>
>
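
The approach discussed in this thread (build the baseline without +sve, allow SVE only in selected functions, and pick the implementation once at startup) can be sketched roughly as below. This is only an illustration under stated assumptions, not the hns3 code: it assumes Linux on AArch64 with a GCC that accepts the "+sve" target pragma, it reads the hardware capability bits through getauxval(), and the names fill_baseline, fill_sve, cpu_has_sve and select_fill are hypothetical. On any other platform the sketch simply falls back to the baseline function.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>              /* getauxval(), AT_HWCAP */
#ifndef HWCAP_SVE
#define HWCAP_SVE (1UL << 22)      /* bit value from the Linux arm64 hwcap ABI */
#endif
#endif

/* Hypothetical fast-path signature used only for this sketch. */
typedef void (*fill_fn)(int *x, size_t n);

/* Baseline version: built with the file's default -march (no +sve). */
static void fill_baseline(int *x, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        x[i] = (int)i;
}

#if defined(__aarch64__) && defined(__GNUC__) && !defined(__clang__)
/* Only this function is compiled with SVE enabled (assumes GCC with SVE support). */
#pragma GCC push_options
#pragma GCC target ("+sve")
static __attribute__((noinline)) void fill_sve(int *x, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        x[i] = (int)i;
}
#pragma GCC pop_options
#else
/* No SVE build of the function is available here: reuse the baseline. */
#define fill_sve fill_baseline
#endif

/* Returns non-zero if the running CPU reports SVE (Linux/AArch64 only). */
static int cpu_has_sve(void)
{
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return 0;
#endif
}

/* Slow-path setup: choose the fast-path function once, at startup. */
fill_fn select_fill(void)
{
    fill_fn fn = cpu_has_sve() ? fill_sve : fill_baseline;
    assert(fn != NULL);
    return fn;
}
```

A caller would do something like `fill_fn fn = select_fill(); fn(buf, 100);` during initialization and keep the pointer for the fast path. Inside DPDK the capability check would more likely go through rte_cpu_get_flag_enabled() than a direct getauxval() call.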