From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 by dpdk.org (Postfix) with ESMTP id 8ED9F379E
 for ; Tue, 7 Apr 2015 17:31:05 +0200 (CEST)
Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22])
 by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t37FV3ev019904
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL);
 Tue, 7 Apr 2015 11:31:04 -0400
Received: from redhat.com (ovpn-116-20.ams2.redhat.com [10.36.116.20])
 by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with SMTP id t37FV0Yl025829;
 Tue, 7 Apr 2015 11:31:01 -0400
Date: Tue, 7 Apr 2015 17:30:59 +0200
From: "Michael S. Tsirkin"
To: Luke Gorrie
Message-ID: <20150407172849-mutt-send-email-mst@redhat.com>
References: <20150127160126.GA10651@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22
Cc: "dev@dpdk.org" , "snabb-devel@googlegroups.com" ,
 VirtualOpenSystems Technical Team ,
 virtualization@lists.linux-foundation.org
Subject: Re: [dpdk-dev] [snabb-devel] Re: memory barriers in virtq.lua?
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 07 Apr 2015 15:31:06 -0000

On Tue, Apr 07, 2015 at 04:22:42PM +0200, Luke Gorrie wrote:
> Hi Michael,
>
> I'm writing to follow up the previous discussion about memory barriers in
> virtio-net device implementations, and Cc'ing the DPDK list because I believe
> this is relevant to them too.
>
> First, thanks again for getting in touch and reviewing our code.
>
> I have now found a missed case where we *do* require a hardware memory barrier
> on x86 in our vhost/virtio-net device. That is when checking the interrupt
> suppression flag after updating used->idx.
> This is needed because x86 can
> reorder the write to used->idx after the read from avail->flags, and that
> causes the guest to see a stale value of used->idx after it toggles interrupt
> suppression.
>
> If I may spell out my mental model, for the sake of being corrected and/or as
> an example of how third party developers are reading and interpreting the
> Virtio-net spec:
>
> Relating this to Virtio 1.0, the most relevant section is 3.2.1 (Supplying
> Buffers to the Device) which calls for two "suitable memory barriers". The spec
> talks about these from the driver perspective, but they are both relevant to
> the device side too.
>
> The first barrier (write to descriptor table before write to used->idx) is
> implicit on x86 because writes by the same core are not reordered. This means
> that no explicit hardware barrier is needed. (A compiler barrier may be needed,
> however.)
>
> The second memory barrier (write to used->idx before reading avail->flags) is
> not implicit on x86 because stores are reordered after loads. So an explicit
> hardware memory barrier is needed.
>
> I hope that is a correct assessment of the situation. (Forgive my
> x86centricity, I am sure that seems very foreign to kernel hackers.)
>
> If this assessment is correct then the DPDK developers might also want to
> review librte_vhost/vhost_rxtx.c and consider adding a hardware memory barrier
> between writing used->idx and reading avail->flags.
>
> Cheers,
> -Luke

I agree, this looks like a bug in dpdk.

> P.S. I notice that the Linux virtio-net driver does not seem to tolerate
> spurious interrupts, even though the Virtio 1.0 spec requires this ("must"). On
> 3.13.11-ckt15 I see them trigger an "irq nobody cared" kernel log message and
> then the irq is disabled. If that sounds suspicious I can supply more
> information.
>
>

More information might be useful, yes.
Just guessing from the available info:

I think you refer to this:

    The driver MUST handle spurious interrupts from the device.

The intent is to be able to handle some spurious interrupts once in a
while. AFAIK Linux triggers the message if it gets a huge number of
spurious interrupts for an extended period of time.
For example, this will trigger if the device does not clear the interrupt
line after the interrupt register read.
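
To make the store/load ordering from the first part of this thread concrete,
here is a minimal sketch in C11. It is not the DPDK or Snabb code. The struct
names and the helper publish_used_idx() are hypothetical, and the ring fields
are declared _Atomic only so that the sketch is well-defined C11 (in a real
vring they are plain 16-bit fields in memory shared with the guest). The flag
value matches VRING_AVAIL_F_NO_INTERRUPT from the virtio ring header.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as in the virtio ring header. */
#define VRING_AVAIL_F_NO_INTERRUPT 1

/* Hypothetical, cut-down views of the ring headers (ring arrays omitted). */
struct sketch_avail {
    _Atomic uint16_t flags;   /* written by the driver (guest) */
    _Atomic uint16_t idx;
};

struct sketch_used {
    _Atomic uint16_t flags;
    _Atomic uint16_t idx;     /* written by the device (host) */
};

/*
 * Device side: publish a new used index, then decide whether the guest
 * should be interrupted. Returns true if an interrupt should be sent.
 */
static bool publish_used_idx(struct sketch_used *used,
                             struct sketch_avail *avail,
                             uint16_t new_idx)
{
    /*
     * First barrier: earlier writes to the used ring entries must be
     * visible before the index. A release store is enough; on x86 it is
     * an ordinary store plus a compiler barrier.
     */
    atomic_store_explicit(&used->idx, new_idx, memory_order_release);

    /*
     * Second barrier: the store to used->idx must not be reordered after
     * the load of avail->flags. x86 allows exactly this store/load
     * reordering, so a full fence is required (mfence or a locked
     * instruction on x86).
     */
    atomic_thread_fence(memory_order_seq_cst);

    uint16_t flags = atomic_load_explicit(&avail->flags,
                                          memory_order_relaxed);
    return (flags & VRING_AVAIL_F_NO_INTERRUPT) == 0;
}
```

Without the full fence, the device could read a stale "no interrupt" flag
while its used->idx store is still sitting in the store buffer, and the guest,
having just re-enabled interrupts, could read the old used->idx: neither side
notices the new buffer.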