From: "Fu, JingguoX"
To: "Wang, Zhihong", "dev@dpdk.org"
Date: Thu, 29 Jan 2015 06:16:31 +0000
Message-ID: <6BD6202160B55B409D423293115822625C2735@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <1422499127-11689-1-git-send-email-zhihong.wang@intel.com>
Subject: Re: [dpdk-dev] [PATCH v2 0/4] DPDK memcpy optimization

Basic Information

    Patch name                              DPDK memcpy optimization v2
    Brief description about test purpose    Verify memory copy and memory copy performance cases on a variety of OSes
    Test Flag                               Tested-by
    Tester name                             jingguox.fu at intel.com
    Test Tool Chain information             N/A
    Commit ID                               88fa98a60b34812bfed92e5b2706fcf7e1cbcbc8
    Test Result Summary                     Total 6 cases, 6 passed, 0 failed

Test environment

    - Environment 1:
      OS: Ubuntu12.04 3.2.0-23-generic X86_64
      GCC: gcc version 4.6.3
      CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
      NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ [8086:10fb] (rev 01)

    - Environment 2:
      OS: Ubuntu14.04 3.13.0-24-generic
      GCC: gcc version 4.8.2
      CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
      NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ [8086:10fb] (rev 01)

    - Environment 3:
      OS: Fedora18 3.6.10-4.fc18.x86_64
      GCC: gcc version 4.7.2 20121109
      CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
      NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ [8086:10fb] (rev 01)

Detailed Testing information

    Test Case - name
        test_memcpy
    Test Case - Description
        Create two buffers, and initialise one with random values. These are
        copied to the second buffer and then compared to see if the copy was
        successful. The bytes outside the copied area are also checked to make
        sure they were not changed.
    Test Case - test sample/application
        test application in app/test
    Test Case - command / instruction
        # ./app/test/test -n 1 -c ffff
        #RTE>> memcpy_autotest
    Test Case - expected
        #RTE>> Test OK
    Test Result - PASSED

    Test Case - name
        test_memcpy_perf
    Test Case - Description
        A number of different sizes and cached/uncached permutations
    Test Case - test sample/application
        test application in app/test
    Test Case - command / instruction
        # ./app/test/test -n 1 -c ffff
        #RTE>> memcpy_perf_autotest
    Test Case - expected
        #RTE>> Test OK
    Test Result - PASSED

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Zhihong Wang
Sent: Thursday, January 29, 2015 10:39
To: dev@dpdk.org
Subject: [dpdk-dev] [PATCH v2 0/4] DPDK memcpy optimization

This patch set optimizes memcpy for DPDK for both SSE and AVX platforms.
It also extends memcpy test coverage with unaligned cases and more test points.

Optimization techniques are summarized below:

1. Utilize full cache bandwidth

2. Enforce aligned stores

3. Apply load address alignment based on architecture features

4. Make load/store address available as early as possible

5. General optimization techniques like inlining, branch reducing, prefetch pattern access

--------------
Changes in v2:

1. Reduced constant test cases in app/test/test_memcpy_perf.c for fast build

2. Modified macro definition for better code readability & safety

Zhihong Wang (4):
  app/test: Disabled VTA for memcpy test in app/test/Makefile
  app/test: Removed unnecessary test cases in app/test/test_memcpy.c
  app/test: Extended test coverage in app/test/test_memcpy_perf.c
  lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE
    and AVX platforms

 app/test/Makefile                                  |   6 +
 app/test/test_memcpy.c                             |  52 +-
 app/test/test_memcpy_perf.c                        | 220 ++++---
 .../common/include/arch/x86/rte_memcpy.h           | 680 +++++++++++++++------
 4 files changed, 654 insertions(+), 304 deletions(-)

-- 
1.9.3
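
For readers who have not looked at the unit test, the check that the test_memcpy description in this report refers to (copy random data, compare, then confirm the bytes outside the copied area are unchanged) can be sketched roughly as below. This is only an illustration and not the code in app/test/test_memcpy.c: the names check_copy, overrun_copy, GUARD and FILL are hypothetical, and the function under test is passed in as a pointer so that standard memcpy can stand in for rte_memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD 64    /* hypothetical: guard bytes on each side of the copy */
#define FILL  0xA5  /* hypothetical: pattern for bytes that must stay untouched */

/* Same signature as memcpy, so any copy routine can be plugged in. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/*
 * Copy `len` random bytes to offset `dst_off` inside a guarded buffer.
 * Returns 0 if the copied area matches the source and every byte outside
 * it still holds FILL, -1 otherwise (including allocation failure).
 */
static int check_copy(copy_fn copy, size_t len, size_t dst_off, unsigned seed)
{
    size_t start = GUARD + dst_off;      /* first byte of the copied area */
    size_t total = start + len + GUARD;  /* whole destination buffer */
    uint8_t *src = malloc(len + 1);      /* +1 avoids a zero-size request */
    uint8_t *dst = malloc(total);
    size_t i;
    int rc = 0;

    assert(copy != NULL);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1;
    }

    /* Initialise the source with random values, the destination with FILL. */
    srand(seed);
    for (i = 0; i < len; i++)
        src[i] = (uint8_t)rand();
    memset(dst, FILL, total);

    copy(dst + start, src, len);

    /* The copied area must equal the source. */
    if (memcmp(dst + start, src, len) != 0)
        rc = -1;
    /* Bytes before and after the copied area must be unchanged. */
    for (i = 0; i < start; i++)
        if (dst[i] != FILL)
            rc = -1;
    for (i = start + len; i < total; i++)
        if (dst[i] != FILL)
            rc = -1;

    free(src);
    free(dst);
    return rc;
}

/* Deliberately buggy copy that writes one byte past the end (lands in the guard). */
static void *overrun_copy(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    ((uint8_t *)dst)[n] = 0x00;
    return dst;
}
```

Calling check_copy(memcpy, 100, 3, 1u) should return 0, while check_copy(overrun_copy, 100, 3, 1u) should return -1 because the trailing guard byte no longer holds FILL. Varying dst_off is one simple way to exercise unaligned destinations.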