From: Jianbo Liu <jianbo.liu@linaro.org>
Date: Mon, 26 Sep 2016 13:38:58 +0800
To: "Wang, Zhihong"
Cc: Thomas Monjalon, dev@dpdk.org, Yuanhan Liu, Maxime Coquelin
Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue

On 26 September 2016 at 13:25, Wang, Zhihong wrote:
>
>> -----Original Message-----
>> From: Jianbo Liu [mailto:jianbo.liu@linaro.org]
>> Sent: Monday, September 26, 2016 1:13 PM
>> To: Wang, Zhihong
>> Cc: Thomas Monjalon; dev@dpdk.org; Yuanhan Liu; Maxime Coquelin
>> Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue
>>
>> On 25 September 2016 at 13:41, Wang, Zhihong wrote:
>> >
>> >> -----Original Message-----
>> >> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
>> >> Sent: Friday, September 23, 2016 9:41 PM
>> >> To: Jianbo Liu
>> >> Cc: dev@dpdk.org; Wang, Zhihong; Yuanhan Liu; Maxime Coquelin
>> ....
>> > This patch does help on ARM for small packets like 64B-sized ones,
>> > which actually proves the similarity between x86 and ARM in terms
>> > of the caching optimization in this patch.
>> >
>> > My estimation is based on:
>> >
>> > 1. The last patch is for mrg_rxbuf=on, and since you said it helps
>> >    perf, we can ignore it for now while we discuss mrg_rxbuf=off.
>> >
>> > 2. Vhost enqueue perf =
>> >    ring overhead + virtio header overhead + data memcpy overhead.
>> >
>> > 3. This patch helps small-packet traffic, which means it helps the
>> >    ring + virtio header operations.
>> >
>> > 4. So when you say perf drops once the packet size grows beyond 512B,
>> >    that is most likely caused by the memcpy on ARM not working well
>> >    with this patch.
>> >
>> > I'm not saying glibc's memcpy is not good enough; it's just that
>> > this is a rather special use case. And since we see that a specialized
>> > memcpy + this patch gives significantly better performance than other
>> > combinations on x86, we suggest hand-crafting a specialized memcpy
>> > for it.
>> >
>> > Of course on ARM this is still just my speculation, and we need to
>> > either prove it or find the actual root cause.
>> >
>> > It would be **REALLY HELPFUL** if you could test this patch on ARM
>> > for the mrg_rxbuf=on cases, to see whether this patch helps ARM at
>> > all, since mrg_rxbuf=on is the more widely used case.
>> >
>> Actually it's worse than mrg_rxbuf=off.
>
> I mean: compare the perf of original vs. original + patch with
> mrg_rxbuf turned on. Is there any perf improvement?
>
Yes, orig + patch + on is better than orig + on, but orig + patch + on
is worse than orig + patch + off.
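
For illustration, below is a minimal sketch (plain C, hypothetical names,
not the actual patch code) of the kind of specialized copy being discussed.
The idea is that copying in fixed-size chunks lets the compiler expand each
memcpy into wide, unrolled loads/stores instead of going through the generic
glibc dispatch, which is what dominates for small packets such as 64B:

#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch of a copy routine specialised for the vhost
 * enqueue path, where most packets are small and the copy length is
 * already known to fit the descriptor buffer.  The real patch uses
 * DPDK's rte_memcpy()/vector code; this only illustrates the idea of
 * fixed-size chunked copies.
 */
static inline void
vhost_copy_pkt(void *dst, const void *src, size_t len)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	/* Bulk copy in 64-byte chunks (one cache line on most CPUs);
	 * the constant size lets the compiler inline each copy. */
	while (len >= 64) {
		memcpy(d, s, 64);
		d += 64;
		s += 64;
		len -= 64;
	}
	/* Copy any remaining tail. */
	if (len)
		memcpy(d, s, len);
}

Whether such a routine beats the platform memcpy on ARM is exactly the open
question above; on x86 the equivalent approach measured better for this
workload.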
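
For context on why the last patch only shows up with mrg_rxbuf=on: vhost
negotiates the VIRTIO_NET_F_MRG_RXBUF feature and dispatches to a different
enqueue path for each mode. A simplified, hypothetical sketch (not the exact
DPDK code) of that dispatch:

#include <stdint.h>

#define VIRTIO_NET_F_MRG_RXBUF 15   /* virtio feature bit, per the virtio spec */

struct vhost_dev {
	uint64_t features;   /* negotiated virtio feature bits */
	/* ... */
};

/* Stubs standing in for the two real enqueue routines. */
static uint16_t enqueue_mergeable(struct vhost_dev *dev, uint16_t count)
{ (void)dev; return count; }  /* real code fills mergeable rx buffers */
static uint16_t enqueue_single(struct vhost_dev *dev, uint16_t count)
{ (void)dev; return count; }  /* real code fills one buffer per packet */

/*
 * Simplified dispatch: the mergeable and non-mergeable modes take
 * different enqueue paths, so a change that touches only the mergeable
 * path is visible only when mrg_rxbuf=on is negotiated.
 */
static uint16_t
vhost_enqueue(struct vhost_dev *dev, uint16_t count)
{
	if (dev->features & (1ULL << VIRTIO_NET_F_MRG_RXBUF))
		return enqueue_mergeable(dev, count);   /* mrg_rxbuf=on  */
	return enqueue_single(dev, count);              /* mrg_rxbuf=off */
}

That split is why the comparisons above have to be made per mode:
orig vs. orig + patch with mrg_rxbuf=on, and separately with mrg_rxbuf=off.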