From: "Li, Xiaoyun"
To: Thomas Monjalon, "Ananyev, Konstantin"
CC: "dev@dpdk.org", "Richardson, Bruce", "Lu, Wenzhuo", "Zhang, Helin"
Date: Fri, 13 Oct 2017 07:41:14 +0000
References: <1507157911-8702-1-git-send-email-xiaoyun.li@intel.com> <1709550.5v5ZG7JxHL@xps> <2601191342CEEE43887BDE71AB9772585FAA89A1@IRSMSX103.ger.corp.intel.com> <5529102.6cLtonmahJ@xps>
In-Reply-To: <5529102.6cLtonmahJ@xps>
Subject: Re: [dpdk-dev] [PATCH v7 1/3] eal/x86: run-time dispatch over
 memcpy
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Friday, October 13, 2017 15:36
> To: Ananyev, Konstantin; Li, Xiaoyun
> Cc: dev@dpdk.org; Richardson, Bruce; Lu, Wenzhuo; Zhang, Helin
> Subject: Re: [dpdk-dev] [PATCH v7 1/3] eal/x86: run-time dispatch over
> memcpy
>
> 13/10/2017 09:31, Ananyev, Konstantin:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > 13/10/2017 03:06, Li, Xiaoyun:
> > > > Hi
> > > > Sorry for the late reply. I was on annual leave for the last 3 days.
> > > >
> > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > 05/10/2017 14:33, Xiaoyun Li:
> > > > > > +/**
> > > > > > + * Macro for copying unaligned block from one location to another with constant load offset,
> > > > > > + * 47 bytes leftover maximum,
> > > > > > + * locations should not overlap.
> > > > > > + * Requirements:
> > > > > > + * - Store is aligned
> > > > > > + * - Load offset is <offset>, which must be immediate value within [1, 15]
> > > > > > + * - For <src>, make sure <offset> bit backwards & <16 - offset> bit forwards are available for loading
> > > > > > + * - <dst>, <src>, <len> must be variables
> > > > > > + * - __m128i <xmm0> ~ <xmm8> must be pre-defined
> > > > > > + */
> > > > > > +#define MOVEUNALIGNED_LEFT47_IMM(dst, src, len,
> > > > >
> > > > > Naive question:
> > > > > Is there a real benefit of using a macro compared to a static
> > > > > inline function optimized by a modern compiler?
> > > > >
> > > > The macro is in the existing DPDK code. I didn't touch it. I just changed the file name and the function name to rte_memcpy_internal.
> > > > So I am not clear about whether there is a real benefit.
> > > > In my opinion, it is the same as a static inline function.
> > > >
> > > > Do I need to change them to inline functions?
> > >
> > > In this patch, it appears as a new macro.
> >
> > Ah no, it has definitely been there before.
> > All we did here - git mv rte_memcpy.h rte_memcpy_internal.h and then
> > in rte_memcpy_internal.h renamed rte_memcpy() to rte_memcpy_internal().
> >
> > > If you can, inline function is cleaner for the new one.
> >
> > I don't think it will be straightforward - one of the parameters is a constant value.
> > My preference would be to keep the original rte_memcpy() code intact as
> > much as we can here (except probably cosmetic changes - indentation, line length fixing etc.).
> > After all, that patch is for adding architecture function selection at runtime only.
> > If we would like to improve our rte_memcpy() any further - NP with that, but
> > let it be a separate patch.
>
> OK
>

Then I will just fix the indentation and line length and keep the original macro.

> I am waiting for this patch to close RC1 today.

I will do it ASAP.

Best Regards
Xiaoyun Li
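
For readers following the thread, here is a minimal sketch of what "architecture function selection at runtime" can look like in C. It is not the code from this patch: the names copy_generic, copy_avx2, select_copy and dispatch_copy are hypothetical, both copy routines simply wrap memcpy() as stand-ins for ISA-specific implementations, and the CPU check assumes the GCC/Clang builtins __builtin_cpu_init() and __builtin_cpu_supports() on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Common signature shared by every copy implementation. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Hypothetical stand-ins: a real build would compile each variant
 * with different ISA flags. Here both simply wrap memcpy(). */
static void *copy_generic(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

static void *copy_avx2(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Function pointer filled in on first use (NULL until then). */
static copy_fn copy_impl;

/* Pick an implementation from the features of the running CPU.
 * Assumption: GCC/Clang x86 builtins; other targets fall back
 * to the generic routine. */
static copy_fn select_copy(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx2"))
		return copy_avx2;
#endif
	return copy_generic;
}

/* Public entry point: callers never see which variant runs. */
void *dispatch_copy(void *dst, const void *src, size_t n)
{
	if (copy_impl == NULL)
		copy_impl = select_copy();
	assert(copy_impl != NULL);
	return copy_impl(dst, src, n);
}
```

A caller would use dispatch_copy(dst, src, len) exactly like memcpy(). The selection cost is paid once, after which each call goes through one indirect function call. This sketch does the selection lazily and is not thread-safe on first use; doing it once at startup would avoid that.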