From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Ananyev, Konstantin"
To: "Li, Xiaoyun", "Richardson, Bruce"
CC: "Lu, Wenzhuo", "Zhang, Helin", "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH v7 0/3] run-time Linking support
Date: Thu, 5 Oct 2017 13:24:34 +0000
Message-ID: <2601191342CEEE43887BDE71AB9772585FAA4B83@IRSMSX103.ger.corp.intel.com>
References: <1507157911-8702-1-git-send-email-xiaoyun.li@intel.com>
 <1507206794-79941-1-git-send-email-xiaoyun.li@intel.com>
In-Reply-To: <1507206794-79941-1-git-send-email-xiaoyun.li@intel.com>
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Li, Xiaoyun
> Sent: Thursday, October 5, 2017 1:33 PM
> To: Ananyev, Konstantin; Richardson, Bruce
> Cc: Lu, Wenzhuo; Zhang, Helin; dev@dpdk.org; Li, Xiaoyun
> Subject: [PATCH v7 0/3] run-time Linking support
>
> This patchset dynamically selects functions at run-time based on the CPU flags
> that the current machine supports. This patchset modifies memcpy, the memcpy perf
> test and x86 EFD, using function pointers and binding them at constructor time.
> Then in the cloud environment, users can compile once for the minimum target
> such as 'haswell' (not 'native'), run on different platforms (equal to or above
> haswell), and get ISA optimization based on the running CPU.
>
> Xiaoyun Li (3):
>   eal/x86: run-time dispatch over memcpy
>   app/test: run-time dispatch over memcpy perf test
>   efd: run-time dispatch over x86 EFD functions
>
> ---
> v2
> * Use gcc function multi-versioning to avoid compilation issues.
> * Add macros for AVX512 and AVX2. Only if users enable AVX512 and the compiler
> supports it, the AVX512 codes would be compiled. Only if the compiler supports
> AVX2, the AVX2 codes would be compiled.
>
> v3
> * Reduce function calls by keeping only rte_memcpy_xxx.
> * Add conditions so that when the copy size is small, the inline code path is used.
> Otherwise, the dynamic code path is used.
> * To support attribute target, clang version must be greater than 3.7.
> Otherwise, would choose SSE/AVX code path, the same as before.
> * Move two macro functions to the top of the code since they would be used in
> inline SSE/AVX and dynamic SSE/AVX codes.
>
> v4
> * Modify rte_memcpy.h to several .c files and modify makefiles to compile
> AVX2 and AVX512 files.
>
> v5
> * Delete redundant repeated codes of rte_memcpy_xxx.
> * Modify makefiles to enable reuse of existing rte_memcpy.
> * Delete redundant codes of rte_efd_x86.h in v4. Move it into .c file and enable
> compilation -mavx2 for it in makefile since it is already chosen at run-time.
>
> v6
> * Fix shared target build failure.
> * Safely remove redundant efd x86 avx2 codes since the file is compiled
> with -mavx2.
>
> v7
> * Modify the added version map code in v6 to be more reasonable.
> * Safely remove redundant efd x86 avx2 codes since the file is compiled
> with -mavx2.
>
>  lib/librte_eal/bsdapp/eal/Makefile                 |  19 +
>  lib/librte_eal/bsdapp/eal/rte_eal_version.map      |   7 +
>  .../common/include/arch/x86/rte_memcpy.c           |  59 ++
>  .../common/include/arch/x86/rte_memcpy.h           | 861 +------------------
>  .../common/include/arch/x86/rte_memcpy_avx2.c      |  44 +
>  .../common/include/arch/x86/rte_memcpy_avx512f.c   |  44 +
>  .../common/include/arch/x86/rte_memcpy_internal.h  | 909 +++++++++++++++++++++
>  .../common/include/arch/x86/rte_memcpy_sse.c       |  40 +
>  lib/librte_eal/linuxapp/eal/Makefile               |  19 +
>  lib/librte_eal/linuxapp/eal/rte_eal_version.map    |   7 +
>  lib/librte_efd/Makefile                            |   6 +
>  lib/librte_efd/rte_efd_x86.c                       |  77 ++
>  lib/librte_efd/rte_efd_x86.h                       |  48 +-
>  mk/rte.cpuflags.mk                                 |  14 +
>  test/test/test_memcpy_perf.c                       |  40 +-
>  15 files changed, 1289 insertions(+), 905 deletions(-)
>  create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy.c
>  create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_avx2.c
>  create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_avx512f.c
>  create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_internal.h
>  create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_sse.c
>  create mode 100644 lib/librte_efd/rte_efd_x86.c
>
> --

Acked-by: Konstantin Ananyev

> 2.7.4
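
For readers following the thread, the scheme the cover letter describes (a function pointer bound at constructor time from the CPU flags, with an inline path kept for small copies) can be sketched roughly as below. This is only an illustration and not the code from the patches: every name (copy_fn, copy_sse, copy_avx2, copy_select, demo_memcpy) and the 128-byte threshold are assumptions, and the per-ISA functions are plain memcpy stand-ins where the series builds separate .c files with the matching -m flags. It relies on the GCC/clang constructor attribute and the __builtin_cpu_supports() builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical, not symbols from the patchset. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Stand-ins for the per-ISA implementations. In a real build each one
 * would sit in its own .c file compiled with the matching -m flag. */
static void *copy_sse(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

static void *copy_avx2(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Start at the baseline so the pointer is valid even if a caller
 * runs before the constructor below. */
static copy_fn copy_impl = copy_sse;

/* Runs before main(): bind the pointer to the best implementation
 * that the running CPU reports support for. */
__attribute__((constructor))
static void copy_select(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx2"))
		copy_impl = copy_avx2;
#endif
}

/* Assumed cut-over size; the real value would need measuring. */
#define SMALL_COPY_MAX 128

static inline void *demo_memcpy(void *dst, const void *src, size_t n)
{
	if (n <= SMALL_COPY_MAX) {
		/* Small copies stay inline and skip the indirect call. */
		unsigned char *d = dst;
		const unsigned char *s = src;
		size_t i;

		for (i = 0; i < n; i++)
			d[i] = s[i];
		return dst;
	}

	/* Larger copies go through the pointer chosen at load time. */
	assert(copy_impl != NULL);
	return copy_impl(dst, src, n);
}
```

The point of the split is that a binary built for a minimum target still picks the wider code path on a newer CPU, while small copies avoid the cost of an indirect call.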