From: Li Feng
Date: Fri, 5 Nov 2021 12:38:59 +0800
To: "Liu, Changpeng"
Cc: "Xia, Chenbo", dev, Maxime Coquelin
Subject: Re: [dpdk-dev] [PATCH v2] vhost: call destroy_device always in vhost_destroy_device_notify
References: <20210903075637.2201185-1-fengli@smartx.com> <20210903080203.2495559-1-fengli@smartx.com> <34b02edd-b44c-0f75-a1ff-5c78803783cf@redhat.com>
List-Id: DPDK patches and discussions

On Fri, Nov 5, 2021 at 12:21 PM Liu, Changpeng wrote:
>
> DPDK already provides pre_msg_handle and post_msg_handle to let the vhost backend handle
> the differences between virtio device types. Also, it's not a good idea to start all the virtio queues in SeaBIOS;
> we tried that a long time ago. I suggest you submit an SPDK issue so that we can take a look too.
>

Thanks, I think just changing SPDK may not solve this problem.
OK, I have submitted an issue here:
https://github.com/spdk/spdk/issues/2228

> > -----Original Message-----
> > From: Li Feng
> > Sent: Friday, November 5, 2021 11:18 AM
> > To: Liu, Changpeng
> > Cc: Xia, Chenbo; dev; Maxime Coquelin
> > Subject: Re: [PATCH v2] vhost: call destroy_device always in
> > vhost_destroy_device_notify
> >
> > Add Liu Changpeng.
> >
> > On Thu, Oct 14, 2021 at 8:49 PM Li Feng wrote:
> > >
> > > Thanks,
> > > Feng Li
> > >
> > > On Thu, Oct 14, 2021 at 8:28 PM Maxime Coquelin wrote:
> > > >
> > > > On 10/14/21 14:12, Li Feng wrote:
> > > > > On Thu, Oct 14, 2021 at 8:01 PM Maxime Coquelin wrote:
> > > > >>
> > > > >> Hi Li,
> > > > >>
> > > > >> The commit message is not compliant with the contributors guidelines:
> > > > >> https://doc.dpdk.org/guides/contributing/patches.html#commit-messages-subject-line
> > > > > OK, I got it.
> > > > >>
> > > > >> On 9/3/21 10:02, Li Feng wrote:
> > > > >>> The vhost-user client must send the mem table, kick fd, and call fd
> > > > >>> for all virtqueues before the device becomes VIRTIO_DEV_RUNNING.
> > > > >>>
> > > > >>> If the vhost-user communication is only partly initialized, e.g.
> > > > >>> - the vhost-user backend is restarted while vhost-user is still
> > > > >>>   being initialized;
> > > > >>> - SeaBIOS only initialized the vhost-scsi req vq,
> > > > >>> then the device does not have the VIRTIO_DEV_RUNNING flag set.
> > > > >>>
> > > > >>> Root Cause:
> > > > >>> The vhost session has been created and the scsi/blk requestq poller
> > > > >>> has been added to the reactor, but when the device is destroyed, the
> > > > >>> requestq is not unregistered.
> > > > >>>
> > > > >>> Reproduce the crash on the SPDK vhost-user backend:
> > > > >>> 1. Create a VM;
> > > > >>> 2. Mount an ISO to the VM, start the VM, don't install the OS;
> > > > >>> 3. Restart the spdk_tgt;
> > > > >>>
> > > > >>> A related discussion is on the SeaBIOS list:
> > > > >>> https://patchew.org/Seabios/20210831122339.2591585-1-fengli@smartx.com/
> > > > >>
> > > > >> This is a fix, so you need to add the Fixes tag and cc stable.
> > > > > Acked.
> > > > >
> > > > >>
> > > > >>> Signed-off-by: Li Feng
> > > > >>> ---
> > > > >>> v2:
> > > > >>> Fix the commit msg typo: vas -> virtqueues.
> > > > >>> --
> > > > >>>  lib/vhost/vhost.c | 2 +-
> > > > >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >>>
> > > > >>> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> > > > >>> index 355ff37651..191ba82c41 100644
> > > > >>> --- a/lib/vhost/vhost.c
> > > > >>> +++ b/lib/vhost/vhost.c
> > > > >>> @@ -710,8 +710,8 @@ vhost_destroy_device_notify(struct virtio_net *dev)
> > > > >>>  		if (vdpa_dev)
> > > > >>>  			vdpa_dev->ops->dev_close(dev->vid);
> > > > >>>  		dev->flags &= ~VIRTIO_DEV_RUNNING;
> > > > >>> -		dev->notify_ops->destroy_device(dev->vid);
> > > > >>>  	}
> > > > >>> +	dev->notify_ops->destroy_device(dev->vid);
> > > > >>
> > > > >> .destroy_device() is the counterpart of .new_device().
> > > > >> VIRTIO_DEV_RUNNING is set only when .new_device() has been called with
> > > > >> success, and cleared when .destroy_device() is called.
> > > > >>
> > > > >> So I disagree with the fix, we want to keep the correlation between
> > > > >> VIRTIO_DEV_RUNNING and .new_device()/.destroy_device(). Doing otherwise
> > > > >> could lead to regressions with other applications than yours.
> > > > >>
> > > > >> What is not clear from the commit message or the discussion you link is
> > > > >> where exactly it crashes. Is it in SPDK, in DPDK?
> > > > >
> > > > > The crash is in SPDK: the poller is still running in the reactor,
> > > > > but the device has already been freed.
> > > > >
> > > > > I really don't have a good method to handle these partly initialized virtqueues.
> > > > > This is another patch I prepared to fix this issue:
> > > > >
> > > > > From 63142ec60088d08b27b9657640b82e837557b5d4 Mon Sep 17 00:00:00 2001
> > > > > From: Li Feng
> > > > > Date: Wed, 1 Sep 2021 16:51:44 +0800
> > > > > Subject: [PATCH] vhost: fix vhost session crash
> > > > >
> > > > > If any vq is initialized, treat the device as RUNNING.
> > > > >
> > > > > Root Cause:
> > > > > The session has been created and the requestq poller has been added
> > > > > to the reactor, but when the device is destroyed, the requestq is not
> > > > > unregistered.
> > > > > SeaBIOS only initializes the req vq (idx = 2) and ignores the controlq
> > > > > and eventq vqs.
> > > > >
> > > > > Reproduce:
> > > > > 1. Create a VM;
> > > > > 2. Mount an ISO to the VM, start the VM, don't install the OS;
> > > > > 3. Restart the zbs-chunkd;
> > > > >
> > > > > Signed-off-by: Li Feng
> > > > > Change-Id: I21292e58b0b08237b5d105359095ec6a31907752
> > > > > ---
> > > > >  lib/librte_vhost/vhost_user.c | 6 ++++--
> > > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> > > > > index f211ec8..a80e9f4 100644
> > > > > --- a/lib/librte_vhost/vhost_user.c
> > > > > +++ b/lib/librte_vhost/vhost_user.c
> > > > > @@ -1394,9 +1394,11 @@ virtio_is_ready(struct virtio_net *dev)
> > > > >  				"kickfd: %d callfd: %d enabled: %d\n",
> > > > >  				dev->ifname, vq, i, vq->desc, vq->avail,
> > > > >  				vq->used, vq->kickfd, vq->callfd, vq->enabled);
> > > > > -		if (!vq_is_ready(dev, vq))
> > > > > -			return 0;
> > > > > +		if (vq_is_ready(dev, vq))
> > > > > +			break;
> > > > >  	}
> > > > > +	if (i == nr_vring)
> > > > > +		return 0;
> > > > >
> > > > >  	/* If supported, ensure the frontend is really done with config */
> > > > >  	if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STATUS))
> > > > >
> > > >
> > > > The above patch will also cause a regression, as networking backends
> > > > work with queue pairs.
> > > >
> > > > So your issue is that SPDK is processing vrings while DPDK considers the
> > > > device as not running. Instead of working around that issue, maybe what
> > > > you should do is to introduce a new API and mechanism to help DPDK
> > > > determine whether it should consider the device ready based on the
> > > > backend type.
> > > > IIUC, in your Vhost-scsi case, it should be as soon as VQ 2 is
> > > > ready?
> > >
> > > Yes, once VQ 2 is ready, the vhost-scsi device should be considered ready.
> > > I had this idea months ago; if it's acceptable, I will try to do it.
> > >
> > > Thanks.
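
To make the suggestion of a backend-specific readiness check concrete, here is a minimal sketch in C. It is not DPDK code and not a proposed patch: every identifier with a sketch_ or SKETCH_ prefix is hypothetical, the structures are simplified stand-ins for struct virtio_net and its virtqueues, and the only rule taken from this thread is that a vhost-scsi device could count as ready once VQ 2 (the request queue) is ready, while the default still requires every vring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins. These are NOT the DPDK structures. */
#define SKETCH_MAX_VRING   8
#define SKETCH_SCSI_REQ_VQ 2	/* request queue index mentioned in the thread */

struct sketch_vq {
	bool rings_set;		/* desc/avail/used addresses received */
	bool kickfd_set;	/* kick fd received */
	bool callfd_set;	/* call fd received */
};

struct sketch_dev;
typedef bool (*sketch_ready_fn)(const struct sketch_dev *dev);

struct sketch_dev {
	struct sketch_vq vq[SKETCH_MAX_VRING];
	size_t nr_vring;
	sketch_ready_fn backend_ready;	/* NULL selects the default policy */
};

/* A single vring is ready once rings, kick fd and call fd are all set. */
static bool
sketch_vq_is_ready(const struct sketch_vq *vq)
{
	return vq->rings_set && vq->kickfd_set && vq->callfd_set;
}

/* Default policy: every vring must be ready (what queue pairs need). */
static bool
sketch_all_vqs_ready(const struct sketch_dev *dev)
{
	size_t i;

	if (dev->nr_vring == 0)
		return false;
	for (i = 0; i < dev->nr_vring; i++) {
		if (!sketch_vq_is_ready(&dev->vq[i]))
			return false;
	}
	return true;
}

/* Assumed vhost-scsi policy: only the request queue has to be ready. */
static bool
sketch_scsi_ready(const struct sketch_dev *dev)
{
	return dev->nr_vring > SKETCH_SCSI_REQ_VQ &&
	       sketch_vq_is_ready(&dev->vq[SKETCH_SCSI_REQ_VQ]);
}

/* Readiness check that defers to the backend when it registered a policy. */
static bool
sketch_dev_is_ready(const struct sketch_dev *dev)
{
	assert(dev != NULL);
	assert(dev->nr_vring <= SKETCH_MAX_VRING);

	if (dev->backend_ready != NULL)
		return dev->backend_ready(dev);
	return sketch_all_vqs_ready(dev);
}
```

With backend_ready left NULL, a device whose only configured ring is VQ 2 is reported as not ready, which matches the SeaBIOS case described in this thread. With backend_ready set to sketch_scsi_ready, the same device is reported as ready, and network backends that register nothing keep the all-vrings behaviour.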