From: Ravi Kerur
To: Zhihong Wang
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] [dpdk-dev,v2] Clean up rte_memcpy.h file
Date: Sat, 27 Feb 2016 06:06:24 -0800
In-Reply-To: <1453954715-31723-1-git-send-email-zhihong.wang@intel.com>
References: <1429562009-11817-1-git-send-email-rkerur@gmail.com> <1453954715-31723-1-git-send-email-zhihong.wang@intel.com>
List-Id: patches and discussions about DPDK

On Wed, Jan 27, 2016 at 8:18 PM, Zhihong Wang wrote:

> > Remove unnecessary type casting in functions.
> >
> > Tested on Ubuntu (14.04 x86_64) with "make test".
> > "make test" results match the results with baseline.
> > "Memcpy perf" results match the results with baseline.
> >
> > Signed-off-by: Ravi Kerur
> > Acked-by: Stephen Hemminger
> >
> > ---
> > .../common/include/arch/x86/rte_memcpy.h | 340 +++++++++++----------
> > 1 file changed, 175 insertions(+), 165 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
> > index 6a57426..839d4ec 100644
> > --- a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
> > +++ b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h
> > [...]
>
> > /**
> > @@ -150,13 +150,16 @@ rte_mov64blocks(uint8_t *dst, const uint8_t *src, size_t n)
> > __m256i ymm0, ymm1;
> >
> > while (n >= 64) {
> > - ymm0 = _mm256_loadu_si256((const __m256i *)((const uint8_t *)src + 0 * 32));
> > +
> > + ymm0 = _mm256_loadu_si256((const __m256i *)(src + 0 * 32));
> > + ymm1 = _mm256_loadu_si256((const __m256i *)(src + 1 * 32));
> > +
> > + _mm256_storeu_si256((__m256i *)(dst + 0 * 32), ymm0);
> > + _mm256_storeu_si256((__m256i *)(dst + 1 * 32), ymm1);
> > +
>
> Any particular reason to change the order of the statements here? :)
> Overall this patch looks good.
>

I checked the code changes. The original code had the src/dst address
updates and the counter decrement scattered between the load and store
instructions. I changed it to do the loads first, then the stores, and
then handle the address increments and the counter decrement, without
changing functionality.

>
> > n -= 64;
> > - ymm1 = _mm256_loadu_si256((const __m256i *)((const uint8_t *)src + 1 * 32));
> > - src = (const uint8_t *)src + 64;
> > - _mm256_storeu_si256((__m256i *)((uint8_t *)dst + 0 * 32), ymm0);
> > - _mm256_storeu_si256((__m256i *)((uint8_t *)dst + 1 * 32), ymm1);
> > - dst = (uint8_t *)dst + 64;
> > + src = src + 64;
> > + dst = dst + 64;
> > }
> > }
> >
> >
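
For reference, the ordering described in the reply (loads, then stores,
then pointer and counter updates) can be sketched in portable C. This is
only an illustration, not the patch code: block32, load32(), store32()
and mov64blocks_sketch() are hypothetical names, and the memcpy-based
helpers merely stand in for _mm256_loadu_si256()/_mm256_storeu_si256()
so that the sketch builds without AVX support. The return value and the
assert are also additions for the example; the function in the patch
returns void.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 256-bit register (__m256i in the patch). */
typedef struct { uint8_t bytes[32]; } block32;

/* Portable stand-in for an unaligned 32-byte load. */
static inline block32 load32(const uint8_t *p)
{
	block32 b;
	memcpy(b.bytes, p, sizeof(b.bytes));
	return b;
}

/* Portable stand-in for an unaligned 32-byte store. */
static inline void store32(uint8_t *p, block32 b)
{
	memcpy(p, b.bytes, sizeof(b.bytes));
}

/*
 * Copy as many whole 64-byte blocks as fit in n bytes.
 * Returns the number of bytes copied (illustration only).
 */
static size_t mov64blocks_sketch(uint8_t *dst, const uint8_t *src, size_t n)
{
	size_t copied = 0;
	block32 r0, r1;

	assert(dst != NULL && src != NULL);

	while (n >= 64) {
		/* 1. All loads for this iteration. */
		r0 = load32(src + 0 * 32);
		r1 = load32(src + 1 * 32);

		/* 2. All stores for this iteration. */
		store32(dst + 0 * 32, r0);
		store32(dst + 1 * 32, r1);

		/* 3. Counter and pointer updates, grouped at the end. */
		n -= 64;
		src += 64;
		dst += 64;
		copied += 64;
	}
	return copied;
}
```

Because dst and src are already uint8_t pointers here, the pointer
arithmetic needs no casts, which is the same simplification the patch
makes.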