Date: Mon, 19 Jan 2015 08:02:21 -0500
From: Neil Horman
To: zhihong.wang@intel.com
Message-ID: <20150119130221.GB21790@hmsreliant.think-freely.org>
References: <1421632414-10027-1-git-send-email-zhihong.wang@intel.com>
In-Reply-To: <1421632414-10027-1-git-send-email-zhihong.wang@intel.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/4] DPDK memcpy optimization
List-Id: patches and discussions about DPDK

On Mon, Jan 19, 2015 at 09:53:30AM +0800, zhihong.wang@intel.com wrote:
> This patch set optimizes memcpy for DPDK for both SSE and AVX platforms.
> It also extends memcpy test coverage with unaligned cases and more test points.
>
> Optimization techniques are summarized below:
>
> 1. Utilize full cache bandwidth
>
> 2. Enforce aligned stores
>
> 3. Apply load address alignment based on architecture features
>
> 4. Make load/store address available as early as possible
>
> 5. General optimization techniques like inlining, branch reducing, prefetch pattern access
>
> Zhihong Wang (4):
>   Disabled VTA for memcpy test in app/test/Makefile
>   Removed unnecessary test cases in test_memcpy.c
>   Extended test coverage in test_memcpy_perf.c
>   Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX
>     platforms
>
>  app/test/Makefile                                  |   6 +
>  app/test/test_memcpy.c                             |  52 +-
>  app/test/test_memcpy_perf.c                        | 238 +++++---
>  .../common/include/arch/x86/rte_memcpy.h           | 664 +++++++++++++++------
>  4 files changed, 656 insertions(+), 304 deletions(-)
>
> --
> 1.9.3
>
>
Are you able to compile this with gcc 4.9.2?  The compilation of
test_memcpy_perf is taking forever for me.  It appears hung.

Neil