From: Jim Thompson
Date: Sun, 25 Jan 2015 14:02:51 -0600
To: Stephen Hemminger
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH 4/4] lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX platforms
Message-Id: <238FA3A4-9892-4243-8F19-44CC61F01F3D@netgate.com>
In-Reply-To: <20150120091538.4c3a1363@urahara>
References: <1421632414-10027-1-git-send-email-zhihong.wang@intel.com> <1421632414-10027-5-git-send-email-zhihong.wang@intel.com> <20150120091538.4c3a1363@urahara>

> On Jan 20, 2015, at 11:15 AM, Stephen Hemminger wrote:
>
> On Mon, 19 Jan 2015 09:53:34 +0800
> zhihong.wang@intel.com wrote:
>
>> Main code changes:
>>
>> 1. Differentiate architectural features based on CPU flags
>>
>>    a. Implement separated move functions for SSE/AVX/AVX2 to make full utilization of cache bandwidth
>>
>>    b. Implement separated copy flow specifically optimized for target architecture
>>
>> 2. Rewrite the memcpy function "rte_memcpy"
>>
>>    a. Add store aligning
>>
>>    b. Add load aligning based on architectural features
>>
>>    c. Put block copy loop into inline move functions for better control of instruction order
>>
>>    d. Eliminate unnecessary MOVs
>>
>> 3. Rewrite the inline move functions
>>
>>    a. Add move functions for unaligned load cases
>>
>>    b. Change instruction order in copy loops for better pipeline utilization
>>
>>    c. Use intrinsics instead of assembly code
>>
>> 4. Remove slow glibc call for constant copies
>>
>> Signed-off-by: Zhihong Wang
>
> Dumb question: why not fix glibc memcpy instead?
> What is special about rte_memcpy?

In addition to the other points, FreeBSD doesn't use glibc on the target platform (though it is used on, say, MIPS), and FreeBSD is a supported DPDK platform.

So fixing glibc isn't a solution.

Jim
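
As an illustration of the portability point: a copy routine built only from compiler intrinsics and feature macros depends on the compiler, not on which libc the platform ships. The following is a minimal sketch, not the code from the patch. The name sketch_memcpy, the 32-byte threshold, and the 16-byte block size are assumptions made for this example. It shows a compile-time check on __SSE2__, a store-aligning step, and a fallback to the platform's own memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper for illustration only; NOT the rte_memcpy from the patch. */
static void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

#if defined(__SSE2__)
    if (n >= 32) {
        /* Store aligning: copy the first 16 bytes with an unaligned store,
         * then advance by 1..16 bytes so that d sits on a 16-byte boundary.
         * Bytes copied twice because of the overlap are harmless. */
        size_t head = 16 - ((uintptr_t)d & 15);
        _mm_storeu_si128((__m128i *)d,
                         _mm_loadu_si128((const __m128i *)s));
        d += head;
        s += head;
        n -= head;

        /* Block loop: unaligned loads, aligned stores. */
        while (n >= 16) {
            _mm_store_si128((__m128i *)d,
                            _mm_loadu_si128((const __m128i *)s));
            d += 16;
            s += 16;
            n -= 16;
        }
    }
#endif
    /* Remaining tail, or the whole copy on builds without SSE2. */
    memcpy(d, s, n);
    return dst;
}
```

On a build without SSE2 the function reduces to a plain memcpy call, so the same source compiles on Linux and FreeBSD regardless of the C library underneath.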