From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from bert.emutex.com (bert.emutex.com [91.103.1.109])
	by dpdk.org (Postfix) with ESMTP id 3A7CB7CC3
	for ; Fri, 1 Sep 2017 11:36:19 +0200 (CEST)
Received: from [92.51.199.138] (helo=statler.emutex.com)
	by bert.emutex.com with esmtp (Exim 4.84) (envelope-from )
	id 1dniNK-00047s-04
	for dev@dpdk.org; Fri, 01 Sep 2017 10:36:30 +0100
Received: from [10.10.68.120]
	by statler.emutex.com with esmtpsa
	(TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84) (envelope-from )
	id 1dniN6-0001U6-ML
	for dev@dpdk.org; Fri, 01 Sep 2017 10:36:18 +0100
To: dev@dpdk.org
References:
From: Pierre
Message-ID:
Date: Fri, 1 Sep 2017 10:36:17 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
	Thunderbird/52.2.1
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Language: en-US
X-Spam-Score: -1.0 (-)
X-Spam-Report: Spam detection software, running on the system
	"statler.emutex.com", has NOT identified this incoming email as spam.
Subject: Re: [dpdk-dev] dev Digest, Vol 159, Issue 119
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 01 Sep 2017 09:36:19 -0000

Hi,

This might not be a good idea.

With these modifications, the functions are no longer inlined
(attribute inline), nor are they optimized at link time (-flto).
As per the ABI, live caller-saved registers must be saved on the stack
before invoking a function. This is not noticeable in isolated test/perf
code, where there is not much context to save and restore at each
function call, but it destroys performance in real, heavy applications,
which expect rte_memcpy() to be an inlined leaf function so that all of
the code can be inlined and optimized at compile time.

The DPDK design logic has always been to provide the most efficient
implementation for a designated target platform. Otherwise there would
be no advantage in providing rte_memcpy() over the standard generic
memcpy() function.

This type of code is slowly starting to creep into the DPDK codebase.
Another example is the support for dynamic callbacks in
rte_eth_tx_burst().

If multiple platforms MUST be supported at run time, the right trade-off
would be to make sure this type of code can be compiled out, e.g. by
adding something like RTE_ENABLE_RUN_TIME_DISPATCH to the config file.
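A minimal sketch of how such a compile-time switch could look. The
macro name is taken from the suggestion above and is not an existing
DPDK option; my_memcpy() and copy_impl are hypothetical names, and
plain memcpy() stands in for the ISA-specific copy code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical config switch (not an existing DPDK option). Define it,
 * e.g. with -DRTE_ENABLE_RUN_TIME_DISPATCH, to select the indirect path.
 */
#ifdef RTE_ENABLE_RUN_TIME_DISPATCH

typedef void *(*copy_fn_t)(void *dst, const void *src, size_t n);

/* Would be rebound at start-up according to the CPU flags. */
static copy_fn_t copy_impl = memcpy;

static inline void *
my_memcpy(void *dst, const void *src, size_t n)
{
	/* Indirect call: the compiler cannot inline through the pointer. */
	return copy_impl(dst, src, n);
}

#else /* !RTE_ENABLE_RUN_TIME_DISPATCH */

static inline void *
my_memcpy(void *dst, const void *src, size_t n)
{
	/* Direct call: stand-in for an inlined, target-specific body. */
	return memcpy(dst, src, n);
}

#endif /* RTE_ENABLE_RUN_TIME_DISPATCH */
```

With the switch left undefined, callers only ever see the direct
variant, so the compiler stays free to inline it and optimize across
the call site; the function pointer exists only in builds that ask
for it.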
Regards,
Pierre

On 01/09/17 09:58, dev-request@dpdk.org wrote:
> Send dev mailing list submissions to
> 	dev@dpdk.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> 	http://dpdk.org/ml/listinfo/dev
> or, via email, send a message with subject or body 'help' to
> 	dev-request@dpdk.org
>
> You can reach the person managing the list at
> 	dev-owner@dpdk.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of dev digest..."
>
>
> Today's Topics:
>
>    1. [PATCH v2 0/3] dynamic linking support (Xiaoyun Li)
>    2. [PATCH v2 1/3] eal/x86: run-time dispatch over memcpy (Xiaoyun Li)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 1 Sep 2017 16:56:59 +0800
> From: Xiaoyun Li
> To: bruce.richardson@intel.com
> Cc: dev@dpdk.org, zhihong.wang@intel.com, qi.z.zhang@intel.com,
> 	wenzhuo.lu@intel.com, Xiaoyun Li
> Subject: [dpdk-dev] [PATCH v2 0/3] dynamic linking support
> Message-ID: <1504256222-32969-1-git-send-email-xiaoyun.li@intel.com>
>
> This patchset dynamically selects functions at run-time based on the CPU
> flags that the current machine supports. This patchset modifies memcpy,
> the memcpy perf test and x86 EFD, using function pointers and binding
> them at constructor time. Then in the cloud environment, users can
> compile once for the minimum target such as 'haswell' (not 'native'),
> run on different platforms (equal to or above haswell) and get ISA
> optimization based on the running CPU.
>
> Xiaoyun Li (3):
>   eal/x86: run-time dispatch over memcpy
>   app/test: run-time dispatch over memcpy perf test
>   efd: run-time dispatch over x86 EFD functions
>
>  .../common/include/arch/x86/rte_memcpy.h | 343 +++++++++++++--------
>  lib/librte_efd/rte_efd_x86.h             |  41 ++-
>  mk/rte.cpuflags.mk                       |  14 +
>  test/test/test_memcpy_perf.c             |  40 ++-
>  4 files changed, 296 insertions(+), 142 deletions(-)
>
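
For reference, the constructor-time binding described in the quoted
cover letter can be sketched roughly as below. This is not the code
from the patch: it assumes GCC or Clang, uses the compiler builtins
__builtin_cpu_init() and __builtin_cpu_supports() in place of DPDK's
own CPU flag helpers, and all function names are hypothetical. Both
copy routines simply call memcpy() as placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn_t)(void *dst, const void *src, size_t n);

/* Hypothetical stand-ins for the ISA-specific copy routines. */
static void *
copy_generic(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

static void *
copy_avx2(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Safe default, used until (and unless) the constructor rebinds it. */
static copy_fn_t copy_ptr = copy_generic;

/* Runs before main() and picks an implementation for the running CPU. */
static void __attribute__((constructor))
copy_select(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx2"))
		copy_ptr = copy_avx2;
#endif
}

/* Every copy now goes through an indirect call that cannot be inlined. */
void *
dispatch_memcpy(void *dst, const void *src, size_t n)
{
	assert(copy_ptr != NULL);
	return copy_ptr(dst, src, n);
}
```

The binary is built once and adapts at load time, but each call to
dispatch_memcpy() pays for an indirect branch and the surrounding
register save/restore, which is the cost discussed at the top of this
mail.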