From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Xie, Huawei"
To: "Polehn, Mike A", "Tan, Jianfeng", "dev@dpdk.org"
Cc: "ann.zhuangyanying@huawei.com"
Subject: Re: [dpdk-dev] [PATCH] vhost: remove lockless enqueue to the virtio ring
Date: Wed, 20 Jan 2016 03:39:18 +0000
References: <1451918787-85887-1-git-send-email-huawei.xie@intel.com> <569E6372.5030200@intel.com> <745DB4B8861F8E4B9849C970520ABBF1498488E5@ORSMSX102.amr.corp.intel.com>
List-Id: patches and discussions about DPDK

On 1/20/2016 2:33 AM, Polehn, Mike A wrote:
> SMP operations can be very expensive; they can sometimes cost hundreds
> to thousands of clock cycles, depending on the circumstances of the
> synchronization. It is how you arrange the SMP operations within the
> tasks at hand across the SMP cores that gives methods for top
> performance. Using traditional general purpose SMP methods will result
> in traditional general purpose performance. Migrating to general
> libraries (understood by most general purpose programmers) from expert
> abilities (understood by a much smaller group of expert programmers
> focused on performance) will greatly reduce the value of DPDK, since
> the end result will be lower performance and/or less predictable
> operation where rate performance, predictability, and low latency are
> the primary goals.
>
> The best method to date for having multiple outputs to a single port is
> to use a DPDK queue with multiple producers and a single consumer: the
> SMP operation lets multiple sources feed a single non-SMP task that
> outputs to the port (that is why the ports are not SMP protected).
> Also, when considerable contention from multiple sources occurs often
> (data feeding at the same time), having a DPDK queue with input and
> output variables in separate cache lines can give a notable throughput
> improvement.
>
> Mike

Mike:
Thanks for the detailed explanation.
Do you have any comments on this patch?

>
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Xie, Huawei
> Sent: Tuesday, January 19, 2016 8:44 AM
> To: Tan, Jianfeng; dev@dpdk.org
> Cc: ann.zhuangyanying@huawei.com
> Subject: Re: [dpdk-dev] [PATCH] vhost: remove lockless enqueue to the virtio ring
>
> On 1/20/2016 12:25 AM, Tan, Jianfeng wrote:
>> Hi Huawei,
>>
>> On 1/4/2016 10:46 PM, Huawei Xie wrote:
>>> This patch removes the internal lockless enqueue implementation.
>>> DPDK doesn't support receiving/transmitting packets from/to the same
>>> queue. Vhost PMD wraps vhost device as normal DPDK port. DPDK
>>> applications normally have their own lock implementation when
>>> enqueuing packets to the same queue of a port.
>>>
>>> The atomic cmpset is a costly operation. This patch should help
>>> performance a bit.
>>>
>>> Signed-off-by: Huawei Xie
>>> ---
>>> lib/librte_vhost/vhost_rxtx.c | 86 +++++++++++++------------------------------
>>> 1 file changed, 25 insertions(+), 61 deletions(-)
>>>
>>> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
>>> index bbf3fac..26a1b9c 100644
>>> --- a/lib/librte_vhost/vhost_rxtx.c
>>> +++ b/lib/librte_vhost/vhost_rxtx.c
>> I think the vhost example will not work well with this patch when
>> vm2vm=software.
>>
>> Test case:
>> Two virtio ports handled by two pmd threads. Thread 0 polls pkts from
>> the physical NIC and sends them to virtio0, while thread 1 receives
>> pkts from virtio1 and routes them to virtio0.
> The vhost port will be wrapped as a port by the vhost PMD. A DPDK APP
> treats all physical and virtual ports equally, as ports.
> When two DPDK threads try to enqueue to the same port, the APP needs
> to consider the contention. None of the physical PMDs supports
> concurrent enqueuing/dequeuing. The vhost PMD should expose the same
> behavior; we should expose differences between PMDs only when
> absolutely necessary.
>
>>> -
>>> *(volatile uint16_t *)&vq->used->idx += entry_success;
>> Another unrelated question: did we ever try to move this assignment
>> out of the loop to save cost, as it is a point of data contention?
> This operation itself is not that costly, but it has a side effect on
> the cache transfer.
> It is outside of the loop for the non-mergeable case. For the mergeable
> case, it is inside the loop.
> Actually there are pros and cons to doing this per burst versus in
> smaller steps. I prefer to move it outside of the loop. Let us address
> this later.
>
>> Thanks,
>> Jianfeng
>>
>>
>
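
For illustration, the application-side locking described above could look
roughly like the sketch below. It is not code from the patch or from the
vhost example. The names toy_txq, toy_tx_burst and locked_tx_burst are
hypothetical, and a C11 atomic_flag stands in for whatever lock the
application already uses (in a DPDK application this would more likely be
an rte_spinlock around rte_eth_tx_burst).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 1024u /* hypothetical ring size, power of two */

/* Hypothetical stand-in for one TX queue of a port; not a DPDK type. */
struct toy_txq {
    atomic_flag lock;         /* application-owned spinlock */
    uint32_t ring[RING_SIZE]; /* toy ring, no consumer: old slots are overwritten */
    uint32_t head;            /* next free slot; plain variable, not thread safe */
    uint64_t enqueued;        /* total entries accepted so far */
};

/*
 * Single-producer enqueue, in the spirit of a PMD tx burst:
 * it does no synchronization, so callers must serialize access.
 */
static uint16_t
toy_tx_burst(struct toy_txq *q, const uint32_t *pkts, uint16_t n)
{
    for (uint16_t i = 0; i < n; i++)
        q->ring[(q->head + i) & (RING_SIZE - 1)] = pkts[i];
    q->head += n;
    q->enqueued += n;
    return n;
}

/*
 * Application-level wrapper: take the spinlock once per burst, so the
 * cost of the atomic operation is amortized over n packets.
 */
static uint16_t
locked_tx_burst(struct toy_txq *q, const uint32_t *pkts, uint16_t n)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ; /* spin until the other thread releases the queue */
    uint16_t sent = toy_tx_burst(q, pkts, n);
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
    return sent;
}
```

In the vm2vm test case quoted above, both threads would call
locked_tx_burst() on the queue that feeds virtio0, while a queue that only
one thread ever touches can keep calling the unlocked burst function and
pay nothing for synchronization.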