From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id B44B21B32B for ; Tue, 3 Oct 2017 13:27:55 +0200 (CEST) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 03 Oct 2017 04:27:54 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.42,474,1500966000"; d="scan'208";a="906202262" Received: from fmsmsx106.amr.corp.intel.com ([10.18.124.204]) by FMSMGA003.fm.intel.com with ESMTP; 03 Oct 2017 04:27:54 -0700 Received: from shsmsx103.ccr.corp.intel.com (10.239.4.69) by FMSMSX106.amr.corp.intel.com (10.18.124.204) with Microsoft SMTP Server (TLS) id 14.3.319.2; Tue, 3 Oct 2017 04:27:54 -0700 Received: from shsmsx101.ccr.corp.intel.com ([169.254.1.159]) by SHSMSX103.ccr.corp.intel.com ([169.254.4.213]) with mapi id 14.03.0319.002; Tue, 3 Oct 2017 19:27:52 +0800 From: "Li, Xiaoyun" To: "Ananyev, Konstantin" , "Richardson, Bruce" CC: "Lu, Wenzhuo" , "Zhang, Helin" , "dev@dpdk.org" Thread-Topic: [PATCH v4 3/3] efd: run-time dispatch over x86 EFD functions Thread-Index: AQHTO5m4ISIyTQkF/0OclfJg++lshqLQQFWAgAGG0RD//69/AIAAh0ug Date: Tue, 3 Oct 2017 11:27:51 +0000 Message-ID: References: <1506411689-94690-1-git-send-email-xiaoyun.li@intel.com> <1506960796-71620-1-git-send-email-xiaoyun.li@intel.com> <1506960796-71620-4-git-send-email-xiaoyun.li@intel.com> <2601191342CEEE43887BDE71AB9772585FAA2FCD@IRSMSX103.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772585FAA35CF@IRSMSX103.ger.corp.intel.com> In-Reply-To: <2601191342CEEE43887BDE71AB9772585FAA35CF@IRSMSX103.ger.corp.intel.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-version: 11.0.0.116 dlp-reaction: no-action x-ctpclassification: CTP_IC x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiZDQzMTk5MGItZmMzZC00ZWE2LWFhM2ItZDczNzg0ODJkNTA0IiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE2LjUuOS4zIiwiVHJ1c3RlZExhYmVsSGFzaCI6InRUa21kMkdJNjlRZGhuazRWYzg1amR5SUd6amVnOUl5ckpPc1R6aEN5M0E9In0= x-originating-ip: [10.239.127.40] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v4 3/3] efd: run-time dispatch over x86 EFD functions X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Oct 2017 11:27:56 -0000 OK. > -----Original Message----- > From: Ananyev, Konstantin > Sent: Tuesday, October 3, 2017 19:23 > To: Li, Xiaoyun ; Richardson, Bruce > > Cc: Lu, Wenzhuo ; Zhang, Helin > ; dev@dpdk.org > Subject: RE: [PATCH v4 3/3] efd: run-time dispatch over x86 EFD functions >=20 >=20 >=20 > > -----Original Message----- > > From: Li, Xiaoyun > > Sent: Tuesday, October 3, 2017 9:15 AM > > To: Ananyev, Konstantin ; Richardson, > > Bruce > > Cc: Lu, Wenzhuo ; Zhang, Helin > > ; dev@dpdk.org > > Subject: RE: [PATCH v4 3/3] efd: run-time dispatch over x86 EFD > > functions > > > > Hi > > > > > -----Original Message----- > > > From: Ananyev, Konstantin > > > Sent: Tuesday, October 3, 2017 00:52 > > > To: Li, Xiaoyun ; Richardson, Bruce > > > > > > Cc: Lu, Wenzhuo ; Zhang, Helin > > > ; dev@dpdk.org > > > Subject: RE: [PATCH v4 3/3] efd: run-time dispatch over x86 EFD > > > functions > > > > > > > > > > > > > -----Original Message----- > > > > From: Li, Xiaoyun > > > > Sent: Monday, October 2, 2017 5:13 PM > > > > To: Ananyev, Konstantin ; > > > > Richardson, Bruce > > > > Cc: Lu, Wenzhuo ; Zhang, Helin > > > > ; dev@dpdk.org; Li, Xiaoyun > > > > > > > > Subject: [PATCH v4 3/3] efd: run-time dispatch over x86 EFD > > > > functions > > > > > > > > This patch dynamically selects x86 EFD functions at run-time. > > > > This patch uses function pointer and binds it to the relative > > > > function based on CPU flags at constructor time. > > > > > > > > Signed-off-by: Xiaoyun Li > > > > --- > > > > lib/librte_efd/rte_efd_x86.h | 41 > > > > ++++++++++++++++++++++++++++++++++++++--- > > > > 1 file changed, 38 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/lib/librte_efd/rte_efd_x86.h > > > > b/lib/librte_efd/rte_efd_x86.h index 34f37d7..93b6743 100644 > > > > --- a/lib/librte_efd/rte_efd_x86.h > > > > +++ b/lib/librte_efd/rte_efd_x86.h > > > > @@ -43,12 +43,29 @@ > > > > #define EFD_LOAD_SI128(val) _mm_lddqu_si128(val) #endif > > > > > > > > +typedef efd_value_t > > > > +(*efd_lookup_internal_avx2_t)(const efd_hashfunc_t > *group_hash_idx, > > > > + const efd_lookuptbl_t *group_lookup_table, > > > > + const uint32_t hash_val_a, const uint32_t hash_val_b); > > > > + > > > > +static efd_lookup_internal_avx2_t efd_lookup_internal_avx2_ptr; > > > > + > > > > static inline efd_value_t > > > > efd_lookup_internal_avx2(const efd_hashfunc_t *group_hash_idx, > > > > const efd_lookuptbl_t *group_lookup_table, > > > > const uint32_t hash_val_a, const uint32_t hash_val_b) { - > > > #ifdef > > > > RTE_MACHINE_CPUFLAG_AVX2 > > > > + return (*efd_lookup_internal_avx2_ptr)(group_hash_idx, > > > > + group_lookup_table, > > > > + hash_val_a, hash_val_b); > > > > > > I don't think you need all that. > > > All you need - build proper avx2 function even if current HW doesn't > > > support it. > > > The existing runtime selection here seems ok already. > > > Konstantin > > > > > > > Sorry, not quite understand here. So don't need to change codes of efd > here? > > I didn't care about the HW. CC_SUPPORT_AVX2 only means the compiler > supports AVX2 since would runtime selection. > > The existing codes RTE_MACHINE_CPUFLAG_AVX2 means both the > compiler and HW supports AVX2. >=20 > What I am saying - you don't need all these dances with extra function > pointer. > All you need - move efd_lookup_internal_avx2() into a .c file and make su= re > it get compiled with -mavx2 flag. > Then at rte_efd_create() select AVX2 only when both HW and compiler > supports AVX2: >=20 > ... > #ifdef CC_SUPPORT_AVX2 > if (RTE_EFD_VALUE_NUM_BITS > 3 && > rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2)) > table->lookup_fn =3D EFD_LOOKUP_AVX2; > else > #endif > ... >=20 > Konstantin >=20 > > > > > > Best Regards, > > Xiaoyun Li > > > > > > > > +} > > > > + > > > > +#ifdef CC_SUPPORT_AVX2 > > > > +static inline efd_value_t > > > > +efd_lookup_internal_avx2_AVX2(const efd_hashfunc_t > *group_hash_idx, > > > > + const efd_lookuptbl_t *group_lookup_table, > > > > + const uint32_t hash_val_a, const uint32_t hash_val_b) { > > > > efd_value_t value =3D 0; > > > > uint32_t i =3D 0; > > > > __m256i vhash_val_a =3D _mm256_set1_epi32(hash_val_a); @@ - > > > 74,13 > > > > +91,31 @@ efd_lookup_internal_avx2(const efd_hashfunc_t > > > *group_hash_idx, > > > > } > > > > > > > > return value; > > > > -#else > > > > +} > > > > +#endif > > > > + > > > > +static inline efd_value_t > > > > +efd_lookup_internal_avx2_DEFAULT(const efd_hashfunc_t > > > *group_hash_idx, > > > > + const efd_lookuptbl_t *group_lookup_table, > > > > + const uint32_t hash_val_a, const uint32_t hash_val_b) { > > > > RTE_SET_USED(group_hash_idx); > > > > RTE_SET_USED(group_lookup_table); > > > > RTE_SET_USED(hash_val_a); > > > > RTE_SET_USED(hash_val_b); > > > > /* Return dummy value, only to avoid compilation breakage */ > > > > return 0; > > > > -#endif > > > > +} > > > > > > > > +static void __attribute__((constructor)) > > > > +rte_efd_x86_init(void) > > > > +{ > > > > +#ifdef CC_SUPPORT_AVX2 > > > > + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2)) > > > > + efd_lookup_internal_avx2_ptr =3D > > > efd_lookup_internal_avx2_AVX2; > > > > + else > > > > + efd_lookup_internal_avx2_ptr =3D > > > efd_lookup_internal_avx2_DEFAULT; > > > > +#else > > > > + efd_lookup_internal_avx2_ptr =3D efd_lookup_internal_avx2_DEFAULT= ; > > > > +#endif > > > > } > > > > -- > > > > 2.7.4