From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 28DDD1B636 for ; Mon, 16 Oct 2017 12:47:50 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id F2181356D9; Mon, 16 Oct 2017 10:47:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com F2181356D9 Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=maxime.coquelin@redhat.com Received: from [10.36.112.40] (ovpn-112-40.ams2.redhat.com [10.36.112.40]) by smtp.corp.redhat.com (Postfix) with ESMTPS id E3C7A5D965; Mon, 16 Oct 2017 10:47:44 +0000 (UTC) From: Maxime Coquelin To: "Yao, Lei A" , "'dev@dpdk.org'" , "Horton, Remy" , "Bie, Tiwei" , "'yliu@fridaylinux.org'" Cc: "'mst@redhat.com'" , "'jfreiman@redhat.com'" , "'vkaplans@redhat.com'" , "'jasowang@redhat.com'" References: <20171005083627.27828-1-maxime.coquelin@redhat.com> <20171005083627.27828-16-maxime.coquelin@redhat.com> <2DBBFF226F7CF64BAFCA79B681719D953A2A66FC@SHSMSX151.ccr.corp.intel.com> <1105ba3f-1f6f-d8f2-0299-2d26158790b8@redhat.com> <2DBBFF226F7CF64BAFCA79B681719D953A2ADA3F@SHSMSX151.ccr.corp.intel.com> <2DBBFF226F7CF64BAFCA79B681719D953A2B92CC@shsmsx102.ccr.corp.intel.com> <2DBBFF226F7CF64BAFCA79B681719D953A2B92F7@shsmsx102.ccr.corp.intel.com> Message-ID: <3e1f7960-711a-39fd-2a2f-ddea47df36d0@redhat.com> Date: Mon, 16 Oct 2017 12:47:42 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Mon, 16 Oct 2017 10:47:50 +0000 (UTC) Subject: Re: [dpdk-dev] [PATCH v3 15/19] vhost: postpone rings addresses translation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 16 Oct 2017 10:47:51 -0000 On 10/16/2017 11:47 AM, Maxime Coquelin wrote: > Hi Yao, > > On 10/16/2017 08:23 AM, Yao, Lei A wrote: >> Hi, Maxime >> >> Add one comment: >> This issue with virtio-net only occur when I use CPU on socket 1. > > Thanks for the report. > I fail to reproduce for now. > > What is your qemu command line? > Is it reproducible systematically when there is a NUMA reallocation > (DPDK on socket 0, QEMU on socket 1)? Nevermind, I just reproduced the (an?) issue. The issue I reproduce is not linked to NUMA reallocation, but to messages sequencing differences between QEMU versions. So, I'm not 100% this is the same issue, as you mention it works fine when using CPU socket 0. The patch "vhost: postpone rings addresses translation" moves rings addresses translation at either vring kick or enable time, depending on whether protocol features are enabled or not. This has been done because we must not interpret ring information as long as the vring is not fully initialized. While my patch works fine with recent QEMU version, it breaks with older ones, like QEMU v2.5. The reason is that on these older versions, VHOST_USER_SET_VRING_ENABLE is called once and before VHOST_USER_SET_VRING_ADDR. At that time, the ring adresses aren't available so the translation is not done. On recent QEMU versions, we receive VHOST_USER_SET_VRING_ENABLE also after having received the rings addresses, so it works fine. The below fix consists in performing the rings addresses translation also when handling VHOST_USER_SET_VRING_ADDR if ring has already been enabled. I'll post a formal patch later today or tomorrow morning after having tested it more conscientiously. Let me know if it fixes your issue. Thanks, Maxime diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 76c4eeca5..1f6cba4b9 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -372,33 +372,6 @@ ring_addr_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq, return qva_to_vva(dev, ra); } -/* - * The virtio device sends us the desc, used and avail ring addresses. - * This function then converts these to our address space. - */ -static int -vhost_user_set_vring_addr(struct virtio_net *dev, VhostUserMsg *msg) -{ - struct vhost_virtqueue *vq; - struct vhost_vring_addr *addr = &msg->payload.addr; - - if (dev->mem == NULL) - return -1; - - /* addr->index refers to the queue index. The txq 1, rxq is 0. */ - vq = dev->virtqueue[msg->payload.addr.index]; - - /* - * Rings addresses should not be interpreted as long as the ring is not - * started and enabled - */ - memcpy(&vq->ring_addrs, addr, sizeof(*addr)); - - vring_invalidate(dev, vq); - - return 0; -} - static struct virtio_net * translate_ring_addresses(struct virtio_net *dev, int vq_index) { @@ -464,6 +437,43 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index) } /* + * The virtio device sends us the desc, used and avail ring addresses. + * This function then converts these to our address space. + */ +static int +vhost_user_set_vring_addr(struct virtio_net **pdev, VhostUserMsg *msg) +{ + struct vhost_virtqueue *vq; + struct vhost_vring_addr *addr = &msg->payload.addr; + struct virtio_net *dev = *pdev; + + if (dev->mem == NULL) + return -1; + + /* addr->index refers to the queue index. The txq 1, rxq is 0. */ + vq = dev->virtqueue[msg->payload.addr.index]; + + /* + * Rings addresses should not be interpreted as long as the ring is not + * started and enabled + */ + memcpy(&vq->ring_addrs, addr, sizeof(*addr)); + + vring_invalidate(dev, vq); + + if (vq->enabled && (dev->features & + (1ULL << VHOST_USER_F_PROTOCOL_FEATURES))) { + dev = translate_ring_addresses(dev, msg->payload.state.index); + if (!dev) + return -1; + + *pdev = dev; + } + + return 0; +} + +/* * The virtio device sends us the available ring last used index. */ static int @@ -1273,7 +1283,7 @@ vhost_user_msg_handler(int vid, int fd) vhost_user_set_vring_num(dev, &msg); break; case VHOST_USER_SET_VRING_ADDR: - vhost_user_set_vring_addr(dev, &msg); + vhost_user_set_vring_addr(&dev, &msg); break; case VHOST_USER_SET_VRING_BASE: vhost_user_set_vring_base(dev, &msg);