From: "Yang, Zhiyong"
To: "Yang, Zhiyong", "Ananyev, Konstantin", Thomas Monjalon
CC: "dev@dpdk.org", "yuanhan.liu@linux.intel.com", "Richardson, Bruce", "De Lara Guarch, Pablo"
Date: Thu, 15 Dec 2016 06:51:08 +0000
Subject: Re: [dpdk-dev] [PATCH 1/4] eal/common: introduce rte_memset on IA platform
References: <1480926387-63838-1-git-send-email-zhiyong.yang@intel.com>
 <1480926387-63838-2-git-send-email-zhiyong.yang@intel.com>
 <7223515.9TZuZb6buy@xps13>
 <2601191342CEEE43887BDE71AB9772583F0E55B0@irsmsx105.ger.corp.intel.com>
 <2601191342CEEE43887BDE71AB9772583F0E568B@irsmsx105.ger.corp.intel.com>

Hi, Thomas, Konstantin:

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Yang, Zhiyong
> Sent: Sunday, December 11, 2016 8:33 PM
> To: Ananyev, Konstantin; Thomas Monjalon
> Cc: dev@dpdk.org; yuanhan.liu@linux.intel.com; Richardson, Bruce;
> De Lara Guarch, Pablo
> Subject: Re: [dpdk-dev] [PATCH 1/4] eal/common: introduce rte_memset on
> IA platform
>
> Hi, Konstantin, Bruce:
>
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Thursday, December 8, 2016 6:31 PM
> > To: Yang, Zhiyong; Thomas Monjalon
> > Cc: dev@dpdk.org; yuanhan.liu@linux.intel.com; Richardson, Bruce;
> > De Lara Guarch, Pablo
> > Subject: RE: [dpdk-dev] [PATCH 1/4] eal/common: introduce rte_memset
> > on IA platform
> >
> > > -----Original Message-----
> > > From: Yang, Zhiyong
> > > Sent: Thursday, December 8, 2016 9:53 AM
> > > To: Ananyev, Konstantin; Thomas Monjalon
> > > Cc: dev@dpdk.org; yuanhan.liu@linux.intel.com; Richardson, Bruce;
> > > De Lara Guarch, Pablo
> > > Subject: RE: [dpdk-dev] [PATCH 1/4] eal/common: introduce rte_memset
> > > on IA platform
> > >
> >
> > extern void *(*__rte_memset_vector)(void *s, int c, size_t n);
> >
> > static inline void *
> > rte_memset_huge(void *s, int c, size_t n)
> > {
> > 	return __rte_memset_vector(s, c, n);
> > }
> >
> > static inline void *
> > rte_memset(void *s, int c, size_t n)
> > {
> > 	if (n < XXX)
> > 		return rte_memset_scalar(s, c, n);
> > 	else
> > 		return rte_memset_huge(s, c, n);
> > }
> >
> > XXX could be either a define, or could also be a variable, so it can
> > be set up at startup, depending on the architecture.
> >
> > Would that work?
> > Konstantin
> >

I have implemented the code for choosing the functions at run time.
rte_memcpy is used more frequently, so I tested run-time selection with it.

typedef void *(*rte_memcpy_vector_t)(void *dst, const void *src, size_t n);
extern rte_memcpy_vector_t rte_memcpy_vector;

static inline void *
rte_memcpy(void *dst, const void *src, size_t n)
{
	return rte_memcpy_vector(dst, src, n);
}

In order to reduce the overhead at run time, I assign the function address
to the variable rte_memcpy_vector in a constructor, so it is initialized
before main() starts.

static void __attribute__((constructor))
rte_memcpy_init(void)
{
	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2)) {
		rte_memcpy_vector = rte_memcpy_avx2;
	} else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1)) {
		rte_memcpy_vector = rte_memcpy_sse;
	} else {
		rte_memcpy_vector = memcpy;
	}
}

I ran the same virtio/vhost loopback tests without a NIC.
I can see the throughput drop when choosing functions at run time,
compared to the original code, as follows on the same platform (my machine
is Haswell):

	Packet size	perf drop
	64		-4%
	256		-5.4%
	1024		-5%
	1500		-2.5%

Another thing: I ran memcpy_perf_autotest. When N =< 128, the rte_memcpy
perf gains almost disappear when choosing functions at run time. For other
values of N, the perf gains become narrower.

Thanks
Zhiyong
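
For readers who want to try the dispatch pattern discussed above outside of DPDK, here is a minimal, self-contained sketch. It is not the code that was benchmarked. All names in it (copy_fn_t, copy_impl, copy_wide, copy_generic, copy_dispatch, copy_hybrid, SMALL_COPY_THRESHOLD) are hypothetical, the "wide" routine is only a stand-in that forwards to memcpy, and the GCC/Clang builtin __builtin_cpu_supports() is used in place of rte_cpu_get_flag_enabled() so that the example builds without DPDK headers. The threshold of 128 is an arbitrary placeholder for the XXX in Konstantin's proposal, not a measured value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical function-pointer type, mirroring rte_memcpy_vector_t. */
typedef void *(*copy_fn_t)(void *dst, const void *src, size_t n);

/* Stand-in for a vectorized routine; a real one would use AVX2/SSE code. */
static void *copy_wide(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Portable fallback. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Safe default, so the pointer is valid even if the constructor is a no-op. */
static copy_fn_t copy_impl = copy_generic;

/* Runs before main(): pick an implementation once, based on CPU features. */
static void __attribute__((constructor))
copy_init(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx2"))
		copy_impl = copy_wide;
#endif
}

/* Every call goes through the pointer (the variant measured in this mail). */
static inline void *
copy_dispatch(void *dst, const void *src, size_t n)
{
	return copy_impl(dst, src, n);
}

/* Assumed placeholder for XXX; would need tuning per architecture. */
#define SMALL_COPY_THRESHOLD 128

/* Konstantin's variant: small sizes stay inline, large sizes use the pointer. */
static inline void *
copy_hybrid(void *dst, const void *src, size_t n)
{
	if (n < SMALL_COPY_THRESHOLD) {
		unsigned char *d = dst;
		const unsigned char *s = src;

		for (size_t i = 0; i < n; i++)
			d[i] = s[i];
		return dst;
	}
	return copy_impl(dst, src, n);
}
```

In copy_dispatch() the compiler cannot inline through the pointer, so even a fixed-size small copy pays for an indirect call. That may be part of why the gains shrink for small N in the numbers above, though this sketch does not measure it. copy_hybrid() shows how a size threshold would keep the small case inline.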