From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Tan, Jianfeng"
To: Yuanhan Liu
Cc: dev@dpdk.org, zhihong.wang@intel.com, lining18@jd.com
Subject: Re: [dpdk-dev] [PATCH 2/3] net/virtio_user: fix wrong sequence of messages
Date: Fri, 9 Sep 2016 11:59:18 +0800
Message-ID: <11291eac-cbbe-7080-3d35-bdacb5341d89@intel.com>
In-Reply-To: <20160908121818.GO23158@yliu-dev.sh.intel.com>
References: <1470397003-5782-1-git-send-email-jianfeng.tan@intel.com>
 <1470397003-5782-3-git-send-email-jianfeng.tan@intel.com>
 <20160906064246.GQ30752@yliu-dev.sh.intel.com>
 <20160906082011.GA23158@yliu-dev.sh.intel.com>
 <53e1d78c-e3b0-e16c-992d-e45ed1b31a4f@intel.com>
 <20160908121818.GO23158@yliu-dev.sh.intel.com>

On 9/8/2016 8:18 PM, Yuanhan Liu wrote:
> On Thu, Sep 08, 2016 at 04:53:22PM +0800, Tan, Jianfeng wrote:
>>
>> On 9/6/2016 4:20 PM, Yuanhan Liu wrote:
>>> On Tue, Sep 06, 2016 at 03:54:30PM +0800, Tan, Jianfeng wrote:
>>>> Hi Yuanhan,
>>>>
>>>>
>>>> On 9/6/2016 2:42 PM, Yuanhan Liu wrote:
>>>>> On Fri, Aug 05, 2016 at 11:36:42AM +0000, Jianfeng Tan wrote:
>>>>>> When virtio_user is used with VPP's native vhost-user, it cannot
>>>>>> send/receive any packets.
>>>>>>
>>>>>> The root cause is that vpp-vhost-user treats the message
>>>>>> VHOST_USER_SET_FEATURES as putting the device into its init state,
>>>>>> i.e., it zeroes all related structures. However, the previous code
>>>>>> sends this message last in the whole initialization process, so
>>>>>> all previously transferred information gets zeroed.
>>>>>>
>>>>>> To fix this issue, we rearrange the sequence of those messages:
>>>>>> - step 0, send VHOST_USER_SET_VRING_CALL so that vhost allocates
>>>>>>   virtqueue structures;
>>>>> Yes, it is. However, it's not quite right to do that (you see there
>>>>> is a FIXME in vhost_user_set_vring_call()).
>>>> I suppose you are referring to vhost_set_vring_call().
>>> Oh, I was talking about the new code: I have renamed it to
>>> vhost_user_set_vring_call :)
>>>
>>>>> That means it needs to be fixed: we should not take the fact that
>>>>> it's the first per-vring message we get in the current QEMU
>>>>> implementation as the truth.
>>>>>
>>>>> That also means naming a function like virtio_user_create_queue()
>>>>> based on the above behaviour is wrong.
>>>> It's actually a good catch. After a quick think, I believe that in
>>>> DPDK vhost we may need to create those virtqueues once the unix
>>>> socket gets connected, just like in vhost-net, where virtqueues are
>>>> created on char file open. Right?
>>> There is a difference: for vhost-net and tap mode, IIRC, it knows how
>>> many queues before doing setup.
>> No, from linux/drivers/vhost/net.c:vhost_net_open(), we can see that
>> virtqueues are allocated according to VHOST_NET_VQ_MAX.
> Well, if you take a closer look, you will find that VHOST_NET_VQ_MAX
> is defined as 2. That means it allocates one queue pair only.
>
> And FYI, the MQ support for vhost-net is actually done in the tap
> driver, not in the vhost-net driver. As a result, the MQ
> implementation differs a bit between vhost-net and vhost-user.
>
> Put simply, in vhost-net, each queue pair has a backend fd associated
> with it. Vhost requests for different queue pairs are sent over
> different fds. That also means vhost-net doesn't even need to be
> aware of the MQ stuff.
>
> However, in the vhost-user implementation, all queue pairs share one
> socket fd. All requests are also sent over that single socket fd,
> thus the backend (DPDK vhost) has to figure out how many queue pairs
> are actually enabled: we detect it by reading the vring index of the
> SET_VRING_CALL message; it's not quite right though.

Aha, right, nice analysis.

>
>> How about reconsidering the previous suggestion to allocate the vqs
>> once the connection is established?
> That's too much, because DPDK claims to support up to 0x8000
> queue pairs. Not to mention that the vhost_virtqueue struct was way
> too big before: it even held an array of buf_vec with size 256.

Another slip of my memory: I wrongly remembered that only 8 VQs are
supported. One unrelated note: given that VHOST_MAX_QUEUE_PAIRS equals
0x8000, struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2]
costs 4MB per virtio device, which could be refined.

>
>> Never mind, the above fix on the vhost side will not take effect on
>> existing vpp-vhost implementations.
> Actually, I was talking about the DPDK vhost implementation :)

This patch is about vpp's native vhost implementation, not dpdk-vhost,
and not the way vpp uses dpdk-vhost.

>
>> We still need to fix it on the virtio side.
> Yes, we could fix it on our side, even though VPP is broken.

OK, let's get back to this patch.

>
>>> While for vhost-user, it doesn't. That means we have to allocate and
>>> set up virtqueues reactively: just like what we have done in
>>> vhost_set_vring_call(). What doesn't look perfect is that it assumes
>>> SET_VRING_CALL is the first per-vring message we will get.
>> Yes, depending on the assumption that SET_VRING_CALL is the first
>> per-vring message looks like a bad implementation. As Stephen has
>> suggested, it's more like a bug in vpp. But if we treat it that way,
>> we will fix nothing here.
>>
>>
>>>>>> - step 1, send VHOST_USER_SET_FEATURES to confirm the features;
>>>>>> - step 2, send VHOST_USER_SET_MEM_TABLE to share mem regions;
>>>>>> - step 3, send VHOST_USER_SET_VRING_NUM, VHOST_USER_SET_VRING_BASE,
>>>>>>   VHOST_USER_SET_VRING_ADDR, VHOST_USER_SET_VRING_KICK for each
>>>>>>   queue;
>>>>>> - ...
>>>>>>
>>>>>> Fixes: 37a7eb2ae816 ("net/virtio-user: add device emulation layer")
>>>>>>
>>>>>> Reported-by: Zhihong Wang
>>>>>> Signed-off-by: Jianfeng Tan
>>>>>> ---
>>>>>>  drivers/net/virtio/virtio_user/virtio_user_dev.c | 120 ++++++++++++++---------
>>>>>>  1 file changed, 72 insertions(+), 48 deletions(-)
>>>>> That's too much code for a bug fix. I'm wondering: how about just
>>>>> moving VHOST_USER_GET_PROTOCOL_FEATURES ahead, to the beginning of
>>>>> virtio_user_start_device()? It should fix this issue.
>>>> Why does VHOST_USER_GET_PROTOCOL_FEATURES matter here? Do you mean
>>>> shifting VHOST_USER_SET_FEATURES earlier?
>>> Oops, right, I meant SET_FEATURES. Sorry about the confusion
>>> introduced by the silly auto-completion.
>> Still not working. VPP needs SET_VRING_CALL to create the vq first.
> I didn't get it. In the proposal, SET_FEATURES is sent before every
> other message, thus it should not cause the issue you described in
> this patch.

OK, let me try to explain. We take three vhost implementations into
consideration: dpdk-2.2-vhost, dpdk-master-vhost, and vpp-native-vhost.

If set_features is sent before set_vring_call, dpdk-2.2-vhost fails:
inside its set_features handler, it assigns the header length to VQs
that are only created in the set_vring_call handler. So we need to keep
set_vring_call first. Then set_features needs to be sent before any
other messages; this is what vpp-native-vhost requires. All in all, the
sequence is: 1. set_vring_call, 2. set_features, 3. other messages (see
the sketch appended after this mail).

> Besides, haven't we already sent SET_VRING_CALL before other messages
> (well, except the SET_FEATURES and SET_MEM_TABLE messages)?

Yes, set_vring_call already comes first, but we need to slot
set_features in between set_vring_call and the other messages.
Previously, set_vring_call and the other messages were sent together.

Thanks,
Jianfeng

> That's still not enough for vpp's native vhost-user implementation?
>
> 	--yliu
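
A minimal C sketch of the reordered flow summarized above: every
SET_VRING_CALL first, then SET_FEATURES, then SET_MEM_TABLE and the
per-vring setup. The vhost_user_send() helper and the request constants
are hypothetical placeholders for illustration only, not DPDK's actual
virtio_user_dev.c implementation.

/*
 * Sketch only: illustrates the message ordering discussed in the thread.
 * The request constants and vhost_user_send() are made-up placeholders;
 * real request values are defined by the vhost-user protocol spec.
 */
#include <stdint.h>
#include <stdio.h>

enum vu_req {
	VU_SET_FEATURES,
	VU_SET_MEM_TABLE,
	VU_SET_VRING_NUM,
	VU_SET_VRING_BASE,
	VU_SET_VRING_ADDR,
	VU_SET_VRING_KICK,
	VU_SET_VRING_CALL,
};

/* hypothetical transport helper: pretend every request succeeds */
static int
vhost_user_send(int sock, enum vu_req req, uint32_t vring_idx)
{
	printf("fd=%d req=%d vring=%u\n", sock, (int)req, (unsigned int)vring_idx);
	return 0;
}

static int
virtio_user_start_sketch(int sock, uint32_t nr_vrings)
{
	uint32_t i;

	/* Step 0: SET_VRING_CALL first, so backends that allocate their
	 * virtqueue structures in this handler (e.g. dpdk-2.2 vhost)
	 * have them ready before anything else arrives. */
	for (i = 0; i < nr_vrings; i++)
		if (vhost_user_send(sock, VU_SET_VRING_CALL, i) < 0)
			return -1;

	/* Step 1: SET_FEATURES before all remaining messages, because
	 * vpp's native vhost-user resets device state in this handler. */
	if (vhost_user_send(sock, VU_SET_FEATURES, 0) < 0)
		return -1;

	/* Step 2: share the guest memory regions. */
	if (vhost_user_send(sock, VU_SET_MEM_TABLE, 0) < 0)
		return -1;

	/* Step 3: per-vring setup, finished by the kick fd. */
	for (i = 0; i < nr_vrings; i++)
		if (vhost_user_send(sock, VU_SET_VRING_NUM, i) < 0 ||
		    vhost_user_send(sock, VU_SET_VRING_BASE, i) < 0 ||
		    vhost_user_send(sock, VU_SET_VRING_ADDR, i) < 0 ||
		    vhost_user_send(sock, VU_SET_VRING_KICK, i) < 0)
			return -1;

	return 0;
}

int
main(void)
{
	return virtio_user_start_sketch(0, 2);	/* one queue pair = 2 vrings */
}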
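
A second hedged sketch, for the multiqueue point raised earlier in the
thread: with vhost-user, all queue pairs share one socket fd, so the
backend can only infer how many queue pairs are in use from the vring
index carried by messages such as SET_VRING_CALL. The function below is
illustrative and not DPDK's code.

#include <stdint.h>

/*
 * Sketch: vrings 0/1 belong to queue pair 0, vrings 2/3 to queue pair 1,
 * and so on; seeing vring index N implies at least N/2 + 1 queue pairs.
 */
uint32_t
update_nr_queue_pairs(uint32_t cur_nr_qp, uint32_t vring_idx)
{
	uint32_t qp = vring_idx / 2 + 1;	/* queue pair this vring maps to */

	return qp > cur_nr_qp ? qp : cur_nr_qp;
}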