> > > Since backend and frontend message are synchronous in the same thread, > > > there will be a probability of message deadlock. > > > Consider each driver to determine whether to wait for response. > > > > > > Fixes: d90cf7d111ac ("vhost: support host notifier") > > > Cc: maxime.coquelin@redhat.com > > > Signed-off-by: Rma Ma > > > --- > > > v2 - fix format error in commit message > > > v3 - add --in-reply-to > > > --- > > > > Hi Maxime, > > > > This patch helps to fix vhost-user message deadlock, could you help > > review it? > > The patch introduces a new device op, but it is used nowhere in vDPA > drivers. > > What vDPA driver is it going to be used with? > > Regards, > Maxime Hi, Our company's jmnd vdpa driver, which requires the rte_vhost_host_notifier_ctrl interface, replicates the problem with the following scenario: QEMU start vhost-user with modern net and blk, backend use dpdk-vdpa process, after live migration, dest QEMU deadlock with dpdk-vdpa. - QEMU sends VHOST_USER_SET_VRING_KICK to dpdk-vdpa net - QEMU does not need to wait for a response to this message - QEMU then sends VHOST_USER_SET_MEM_TABLE to dpdk-vdpa blk - QEMU needs to wait reply in this message - when dpdk-vdpa recv VHOST_USER_SET_VRING_KICK, - it will send VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG to QEMU - dpdk-vdpa needs to wait for a response to this message - QEMU will deadlock with dpdk-vdpa I tried to add a patch to the qemu community that uses a new thread to loop backend channel, But there will be some multi-threaded synchronization issues I think this is a public issue, and other backend messages take this into account, so I think this message also needs a flag to fix it. and jmnd vdpa driver will subsequently be open-sourced to the community. Best wishes, Rma ________________________________ 发件人: Maxime Coquelin 发送时间: 2023年7月11日 21:25 收件人: Rma Ma ; dpdk-dev 抄送: Chenbo Xia 主题: Re: 回复: [PATCH v3] vhost: add notify reply ops to fix message deadlock Hi, On 7/11/23 11:25, Rma Ma wrote: > > Since backend and frontend message are synchronous in the same thread, > > there will be a probability of message deadlock. > > Consider each driver to determine whether to wait for response. > > > > Fixes: d90cf7d111ac ("vhost: support host notifier") > > Cc: maxime.coquelin@redhat.com > > Signed-off-by: Rma Ma > > --- > > v2 - fix format error in commit message > > v3 - add --in-reply-to > > --- > > Hi Maxime, > > This patch helps to fix vhost-user message deadlock, could you help > review it? The patch introduces a new device op, but it is used nowhere in vDPA drivers. What vDPA driver is it going to be used with? Regards, Maxime > Thanks. > > Best wishes, > > Rma > > ------------------------------------------------------------------------ > *发件人:* Rma Ma > *发送时间:* 2023年7月4日 10:52 > *收件人:* dpdk-dev > *抄送:* Maxime Coquelin ; Chenbo Xia > ; Rma Ma > *主题:* [PATCH v3] vhost: add notify reply ops to fix message deadlock > Since backend and frontend message are synchronous in the same thread, > there will be a probability of message deadlock. > Consider each driver to determine whether to wait for response. > > Fixes: d90cf7d111ac ("vhost: support host notifier") > Cc: maxime.coquelin@redhat.com > Signed-off-by: Rma Ma > --- > v2 - fix format error in commit message > v3 - add --in-reply-to > --- > lib/vhost/vdpa_driver.h | 3 +++ > lib/vhost/vhost_user.c | 23 ++++++++++++++++++----- > 2 files changed, 21 insertions(+), 5 deletions(-) > > diff --git a/lib/vhost/vdpa_driver.h b/lib/vhost/vdpa_driver.h > index 8db4ab9f4d..3d2ea3c90e 100644 > --- a/lib/vhost/vdpa_driver.h > +++ b/lib/vhost/vdpa_driver.h > @@ -81,6 +81,9 @@ struct rte_vdpa_dev_ops { > > /** get device type: net device, blk device... */ > int (*get_dev_type)(struct rte_vdpa_device *dev, uint32_t *type); > + > + /** Get the notify reply flag */ > + int (*get_notify_reply_flag)(int vid, bool *need_reply); > }; > > /** > diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c > index 901a80bbaa..aa61992939 100644 > --- a/lib/vhost/vhost_user.c > +++ b/lib/vhost/vhost_user.c > @@ -3365,13 +3365,14 @@ rte_vhost_backend_config_change(int vid, bool > need_reply) > static int vhost_user_backend_set_vring_host_notifier(struct > virtio_net *dev, > int index, int fd, > uint64_t offset, > - uint64_t size) > + uint64_t size, > + bool need_reply) > { > int ret; > struct vhu_msg_context ctx = { > .msg = { > .request.backend = > VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG, > - .flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY, > + .flags = VHOST_USER_VERSION, > .size = sizeof(ctx.msg.payload.area), > .payload.area = { > .u64 = index & VHOST_USER_VRING_IDX_MASK, > @@ -3388,7 +3389,13 @@ static int > vhost_user_backend_set_vring_host_notifier(struct virtio_net *dev, > ctx.fd_num = 1; > } > > - ret = send_vhost_backend_message_process_reply(dev, &ctx); > + if (!need_reply) > + ret = send_vhost_backend_message(dev, &ctx); > + else { > + ctx.msg.flags |= VHOST_USER_NEED_REPLY; > + ret = send_vhost_backend_message_process_reply(dev, &ctx); > + } > + > if (ret < 0) > VHOST_LOG_CONFIG(dev->ifname, ERR, "failed to set host > notifier (%d)\n", ret); > > @@ -3402,6 +3409,7 @@ int rte_vhost_host_notifier_ctrl(int vid, uint16_t > qid, bool enable) > int vfio_device_fd, ret = 0; > uint64_t offset, size; > unsigned int i, q_start, q_last; > + bool need_reply; > > dev = get_device(vid); > if (!dev) > @@ -3440,6 +3448,11 @@ int rte_vhost_host_notifier_ctrl(int vid, > uint16_t qid, bool enable) > if (vfio_device_fd < 0) > return -ENOTSUP; > > + if (vdpa_dev->ops->get_notify_reply_flag == NULL) > + need_reply = true; > + else > + vdpa_dev->ops->get_notify_reply_flag(vid, &need_reply); > + > if (enable) { > for (i = q_start; i <= q_last; i++) { > if (vdpa_dev->ops->get_notify_area(vid, i, > &offset, > @@ -3449,7 +3462,7 @@ int rte_vhost_host_notifier_ctrl(int vid, uint16_t > qid, bool enable) > } > > if > (vhost_user_backend_set_vring_host_notifier(dev, i, > - vfio_device_fd, offset, size) < 0) { > + vfio_device_fd, offset, size, > need_reply) < 0) { > ret = -EFAULT; > goto disable; > } > @@ -3458,7 +3471,7 @@ int rte_vhost_host_notifier_ctrl(int vid, uint16_t > qid, bool enable) > disable: > for (i = q_start; i <= q_last; i++) { > > vhost_user_backend_set_vring_host_notifier(dev, i, -1, > - 0, 0); > + 0, 0, need_reply); > } > } > > -- > 2.17.1 >