From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Wang, Zhihong"
To: "Wang, Zhihong", Yuanhan Liu, Jianbo Liu
CC: Maxime Coquelin, "dev@dpdk.org"
Date: Sun, 9 Oct 2016 12:09:07 +0000
Message-ID: <8F6C2BD409508844A0EFC19955BE09414E7BBE7D@SHSMSX103.ccr.corp.intel.com>
References: <1471319402-112998-1-git-send-email-zhihong.wang@intel.com>
 <1471585430-125925-1-git-send-email-zhihong.wang@intel.com>
 <8F6C2BD409508844A0EFC19955BE09414E7B5581@SHSMSX103.ccr.corp.intel.com>
 <20160922022903.GJ23158@yliu-dev.sh.intel.com>
 <8F6C2BD409508844A0EFC19955BE09414E7B5DAE@SHSMSX103.ccr.corp.intel.com>
 <20160927102123.GL25823@yliu-dev.sh.intel.com>
 <8F6C2BD409508844A0EFC19955BE09414E7B7C0B@SHSMSX103.ccr.corp.intel.com>
In-Reply-To: <8F6C2BD409508844A0EFC19955BE09414E7B7C0B@SHSMSX103.ccr.corp.intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Wang, Zhihong
> Sent: Wednesday, September 28, 2016 12:45 AM
> To: Yuanhan Liu; Jianbo Liu
> Cc: Maxime Coquelin; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue
>
> > -----Original Message-----
> > From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> > Sent: Tuesday, September 27, 2016 6:21 PM
> > To: Jianbo Liu
> > Cc: Wang, Zhihong; Maxime Coquelin; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue
> >
> > On Thu, Sep 22, 2016 at 05:01:41PM +0800, Jianbo Liu wrote:
> > > On 22 September 2016 at 14:58, Wang, Zhihong wrote:
> > > >
> > > >> -----Original Message-----
> > > >> From: Jianbo Liu [mailto:jianbo.liu@linaro.org]
> > > >> Sent: Thursday, September 22, 2016 1:48 PM
> > > >> To: Yuanhan Liu
> > > >> Cc: Wang, Zhihong; Maxime Coquelin; dev@dpdk.org
> > > >> Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue
> > > >>
> > > >> On 22 September 2016 at 10:29, Yuanhan Liu wrote:
> > > >> > On Wed, Sep 21, 2016 at 08:54:11PM +0800, Jianbo Liu wrote:
> > > >> >> >> > My setup consists of one host running a guest.
> > > >> >> >> > The guest generates as many 64-byte packets as possible using
> > > >> >> >>
> > > >> >> >> Have you tested with other packet sizes?
> > > >> >> >> My testing shows that performance is dropping when the packet size
> > > >> >> >> is more than 256.
> > > >> >> >
> > > >> >> > Hi Jianbo,
> > > >> >> >
> > > >> >> > Thanks for reporting this.
> > > >> >> >
> > > >> >> > 1. Are you running the vector frontend with mrg_rxbuf=off?
> > > >> >> >
> > > >> Yes, my testing is with mrg_rxbuf=off, but not the vector frontend PMD.
> > > >>
> > > >> >> > 2. Could you please specify what CPU you're running? Is it Haswell
> > > >> >> >    or Ivy Bridge?
> > > >> >> >
> > > >> It's an ARM server.
> > > >>
> > > >> >> > 3. What percentage of drop are you seeing?
> > > >> The testing result:
> > > >> size (bytes)   improvement (%)
> > > >> 64              3.92
> > > >> 128            11.51
> > > >> 256            24.16
> > > >> 512           -13.79
> > > >> 1024          -22.51
> > > >> 1500          -12.22
> > > >> A correction is that performance is dropping if the byte size is larger than 512.
> > > >
> > > > Jianbo,
> > > >
> > > > Could you please verify whether this patch really causes the enqueue perf to drop?
> > > >
> > > > You can test the enqueue path only by setting the guest to do rxonly, and compare
> > > > the mpps from "show port stats all" in the guest.
> > > >
> > > Tested with testpmd, host: txonly, guest: rxonly
> > > size (bytes)   improvement (%)
> > > 64              4.12
> > > 128             6
> > > 256             2.65
> > > 512            -1.12
> > > 1024           -7.02
> >
> > There is a difference between Zhihong's code and the old code that I spotted
> > the first time: Zhihong removed the avail_idx prefetch. I understand
> > the prefetch becomes a bit tricky when the mrg-rx code path is considered;
> > thus, I didn't comment on that.
> >
> > That's one of the differences that, IMO, could cause a regression. I then
> > finally got a chance to add it back.
> >
> > A rough test shows it improves the performance of the 1400B packet size greatly
> > in the "txonly in host and rxonly in guest" case: +33% is the number I get
> > with my test server (Ivy Bridge).
>
> Thanks Yuanhan! I'll validate this on x86.

Hi Yuanhan,

It seems your code doesn't perform correctly. I wrote a new version
of the avail idx prefetch but didn't see any perf benefit.

To be honest, I doubt the benefit of this idea. The previous mrg_rxbuf=off
code used this method but didn't give any benefit.

Even if it were useful, the benefit should be more significant for
small packets; it's unlikely this simple idx prefetch could bring
over a 30% perf gain for large packets like 1400B ones.

But if you really do make it work like that, I'll be very glad to see it.


Thanks
Zhihong

>
> >
> > I guess this might/would help your case as well. Mind running a test
> > and telling me the results?
> >
> > BTW, I made it in a rush; I haven't tested the mrg-rx code path yet.
> >
> > Thanks.
> >
> > 	--yliu
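
For readers following the prefetch discussion above, here is a minimal sketch of
what an avail_idx prefetch in the enqueue path might look like. It is only an
illustration of the general technique, not the patch Yuanhan tested: the struct
definitions below are simplified stand-ins for the rte_vhost virtqueue internals,
and only rte_prefetch0() (DPDK's L1 prefetch helper) is taken from the real API.

#include <stdint.h>
#include <rte_prefetch.h>

/* Simplified stand-ins for the vhost virtqueue internals (assumed layout). */
struct avail_ring_stub {
	volatile uint16_t flags;
	volatile uint16_t idx;      /* written by the guest */
	uint16_t ring[];
};

struct virtqueue_stub {
	struct avail_ring_stub *avail;
	uint16_t last_avail_idx;
};

static inline uint16_t
read_avail_idx_with_prefetch(struct virtqueue_stub *vq)
{
	/*
	 * Touch the guest-shared avail index before doing unrelated
	 * per-burst setup, so the cache miss overlaps with that work
	 * instead of stalling the load below.
	 */
	rte_prefetch0(&vq->avail->idx);

	/* ... descriptor bookkeeping for the current burst would go here ... */

	return vq->avail->idx;
}

Whether hiding this single load matters depends heavily on packet size and on
whether the mrg-rx path is used, which is exactly the point being debated in
this thread.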