From: Luke Gorrie
To: Linhaifeng
Cc: "dev@dpdk.org", "Michael S. Tsirkin"
Date: Tue, 9 Jun 2015 09:45:11 +0200
Subject: Re: [dpdk-dev] [PATCH] vhost: flush used->idx update before reading avail->flags
In-Reply-To: <55768FE2.5060505@huawei.com>
References: <1429720392-25345-1-git-send-email-huawei.xie@intel.com> <553995DB.4000801@huawei.com> <55768FE2.5060505@huawei.com>

On 9 June 2015 at 09:04, Linhaifeng wrote:

> On 2015/4/24 15:27, Luke Gorrie wrote:
> > You should be able to test it like this:
> >
> > 1. Boot two Linux kernel (e.g. 3.13) guests.
> > 2. Connect them via vhost switch.
> > 3. Run continuous traffic between them (e.g. iperf).
> >
> > I would expect that within a reasonable timeframe (< 1 hour) one of the
> > guests' network interfaces will hang indefinitely due to a missed
> > interrupt.
> >
> > You won't be able to reproduce this using DPDK guests because they are
> > not using the same interrupt suppression method.
>
> I think this patch can't resolve this problem. On the other hand we still
> would miss interrupt.

For what it is worth, we were able to reproduce the problem as described
above with older Snabb Switch releases, and we were also able to verify
that inserting a memory barrier fixes it.

This is the relevant commit in the snabbswitch repo for reference:
https://github.com/SnabbCo/snabbswitch/commit/c33cdd8704246887e11d7c353f773f7b488a47f2

In a nutshell, we added an MFENCE instruction after writing used->idx and
before checking VRING_F_NO_INTERRUPT.

I have not tested this case under DPDK myself, so I am not certain which
memory barrier operations are sufficient or insufficient in that context.

I hope that our experience is relevant and helpful, though, and I am happy
to explain more if I have missed any important details.

Cheers,
-Luke
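
For illustration, the ordering described above (write used->idx, issue a
full barrier, then check the guest's interrupt-suppression flag) could be
sketched in C roughly as follows. This is a minimal sketch under stated
assumptions, not the Snabb Switch or DPDK code: the struct layouts and the
function name publish_used_and_check_interrupt are hypothetical, and a C11
atomic_thread_fence(memory_order_seq_cst) stands in for the explicit MFENCE.
The flag is spelled VRING_AVAIL_F_NO_INTERRUPT, as in the Linux virtio ring
header, on the assumption that it is the flag called VRING_F_NO_INTERRUPT
above.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Name and value as in the Linux virtio ring header. */
#define VRING_AVAIL_F_NO_INTERRUPT 1

/* Hypothetical, simplified stand-ins for the shared ring headers.
 * Real vrings also carry the ring arrays; only the fields that
 * matter for the ordering are shown here. */
struct vring_avail_hdr {
    _Atomic uint16_t flags;   /* written by the guest */
    _Atomic uint16_t idx;
};

struct vring_used_hdr {
    _Atomic uint16_t flags;
    _Atomic uint16_t idx;     /* written by the host (vhost side) */
};

/* Publish a new used index, then decide whether to interrupt the guest.
 * Returns true if the guest has NOT suppressed interrupts, i.e. the
 * caller should signal it. */
static bool publish_used_and_check_interrupt(struct vring_used_hdr *used,
                                             struct vring_avail_hdr *avail,
                                             uint16_t new_used_idx)
{
    /* Step 1: make the new used index visible to the guest. */
    atomic_store_explicit(&used->idx, new_used_idx, memory_order_relaxed);

    /* Step 2: full barrier, so the store above is ordered before the
     * load below. Without it the load of avail->flags may be satisfied
     * before the store to used->idx becomes visible. */
    atomic_thread_fence(memory_order_seq_cst);

    /* Step 3: only now read the guest's interrupt-suppression flag. */
    uint16_t flags = atomic_load_explicit(&avail->flags, memory_order_relaxed);
    return (flags & VRING_AVAIL_F_NO_INTERRUPT) == 0;
}
```

The reason a full fence is used in this sketch, rather than a store-only
barrier, is that x86 permits a store followed by a load from a different
address to be reordered. A barrier that only orders stores against other
stores would therefore not prevent the stale read of avail->flags that
leads to the missed interrupt.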