DPDK patches and discussions
 help / color / mirror / Atom feed
From: Matan Azrad <matan@mellanox.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>,
	Xiao Wang <xiao.w.wang@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v1 3/4] vhost: improve device ready definition
Date: Mon, 22 Jun 2020 08:41:20 +0000	[thread overview]
Message-ID: <AM0PR0502MB40198F7066F49FDA30592596D2970@AM0PR0502MB4019.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <74f35cd8-3543-ba1c-0dc7-1502599e6cf7@redhat.com>



From: Maxime Coquelin
> Hi,
> 
> On 6/21/20 8:20 AM, Matan Azrad wrote:
> > Hi Maxime
> >
> > From: Maxime Coquelin:
> >> Hi Matan,
> >>
> >> On 6/19/20 3:11 PM, Matan Azrad wrote:
> >>> Hi Maxime
> >>>
> >>> Thanks for the fast review.
> >>> This is first version, let's review it carefully to be sure it is correct.
> >>> @Xiao Wang, it will be good to hear your idea too.
> >>> We also need to understand the effect on IFC driver/device...
> >>> Just to update that I checked this code with the mlx5 adjustments
> >>> and I
> >> sent in this series.
> >>> It works well with the vDPA example application.
> >>
> >> OK.
> >>
> >>> From: Maxime Coquelin:
> >>>> On 6/18/20 6:28 PM, Matan Azrad wrote:
> >>>>> Some guest drivers may not configure disabled virtio queues.
> >>>>>
> >>>>> In this case, the vhost management never triggers the vDPA device
> >>>>> configuration because it waits to the device to be ready.
> >>>>
> >>>> This is not vDPA-only, even with SW datapath the application's
> >>>> new_device callback never gets called.
> >>>>
> >>> Yes, I wrote it below, I can be more specific here too in the next version.
> >>>
> >>>>> The current ready state means that all the virtio queues should be
> >>>>> configured regardless the enablement status.
> >>>>>
> >>>>> In order to support this case, this patch changes the ready state:
> >>>>> The device is ready when at least 1 queue pair is configured and
> >>>>> enabled.
> >>>>>
> >>>>> So, now, the vDPA driver will be configured when the first queue
> >>>>> pair is configured and enabled.
> >>>>>
> >>>>> Also the queue state operation is change to the next rules:
> >>>>> 	1. queue becomes ready (enabled and fully configured) -
> >>>>> 		set_vring_state(enabled).
> >>>>> 	2. queue becomes not ready - set_vring_state(disabled).
> >>>>> 	3. queue stay ready and VHOST_USER_SET_VRING_ENABLE massage
> >>>> was
> >>>>> 		handled - set_vring_state(enabled).
> >>>>>
> >>>>> The parallel operations for the application are adjusted too.
> >>>>>
> >>>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
> >>>>> ---
> >>>>>  lib/librte_vhost/vhost_user.c | 51
> >>>>> ++++++++++++++++++++++++++++---------------
> >>>>>  1 file changed, 33 insertions(+), 18 deletions(-)
> >>>>>
> >>>>> diff --git a/lib/librte_vhost/vhost_user.c
> >>>>> b/lib/librte_vhost/vhost_user.c index b0849b9..cfd5f27 100644
> >>>>> --- a/lib/librte_vhost/vhost_user.c
> >>>>> +++ b/lib/librte_vhost/vhost_user.c
> >>>>> @@ -1295,7 +1295,7 @@
> >>>>>  {
> >>>>>  	bool rings_ok;
> >>>>>
> >>>>> -	if (!vq)
> >>>>> +	if (!vq || !vq->enabled)
> >>>>>  		return false;
> >>>>>
> >>>>>  	if (vq_is_packed(dev))
> >>>>> @@ -1309,24 +1309,27 @@
> >>>>>  	       vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;  }
> >>>>>
> >>>>> +#define VIRTIO_DEV_NUM_VQS_TO_BE_READY 2u
> >>>>> +
> >>>>>  static int
> >>>>>  virtio_is_ready(struct virtio_net *dev)  {
> >>>>>  	struct vhost_virtqueue *vq;
> >>>>>  	uint32_t i;
> >>>>>
> >>>>> -	if (dev->nr_vring == 0)
> >>>>> +	if (dev->nr_vring < VIRTIO_DEV_NUM_VQS_TO_BE_READY)
> >>>>>  		return 0;
> >>>>>
> >>>>> -	for (i = 0; i < dev->nr_vring; i++) {
> >>>>> +	for (i = 0; i < VIRTIO_DEV_NUM_VQS_TO_BE_READY; i++) {
> >>>>>  		vq = dev->virtqueue[i];
> >>>>>
> >>>>>  		if (!vq_is_ready(dev, vq))
> >>>>>  			return 0;
> >>>>>  	}
> >>>>>
> >>>>> -	VHOST_LOG_CONFIG(INFO,
> >>>>> -		"virtio is now ready for processing.\n");
> >>>>> +	if (!(dev->flags & VIRTIO_DEV_READY))
> >>>>> +		VHOST_LOG_CONFIG(INFO,
> >>>>> +			"virtio is now ready for processing.\n");
> >>>>>  	return 1;
> >>>>>  }
> >>>>>
> >>>>> @@ -1970,8 +1973,6 @@ static int vhost_user_set_vring_err(struct
> >>>> virtio_net **pdev __rte_unused,
> >>>>>  	struct virtio_net *dev = *pdev;
> >>>>>  	int enable = (int)msg->payload.state.num;
> >>>>>  	int index = (int)msg->payload.state.index;
> >>>>> -	struct rte_vdpa_device *vdpa_dev;
> >>>>> -	int did = -1;
> >>>>>
> >>>>>  	if (validate_msg_fds(msg, 0) != 0)
> >>>>>  		return RTE_VHOST_MSG_RESULT_ERR; @@ -1980,15 +1981,6
> @@ static
> >>>>> int vhost_user_set_vring_err(struct
> >>>> virtio_net **pdev __rte_unused,
> >>>>>  		"set queue enable: %d to qp idx: %d\n",
> >>>>>  		enable, index);
> >>>>>
> >>>>> -	did = dev->vdpa_dev_id;
> >>>>> -	vdpa_dev = rte_vdpa_get_device(did);
> >>>>> -	if (vdpa_dev && vdpa_dev->ops->set_vring_state)
> >>>>> -		vdpa_dev->ops->set_vring_state(dev->vid, index,
> enable);
> >>>>> -
> >>>>> -	if (dev->notify_ops->vring_state_changed)
> >>>>> -		dev->notify_ops->vring_state_changed(dev->vid,
> >>>>> -				index, enable);
> >>>>> -
> >>>>>  	/* On disable, rings have to be stopped being processed. */
> >>>>>  	if (!enable && dev->dequeue_zero_copy)
> >>>>>  		drain_zmbuf_list(dev->virtqueue[index]);
> >>>>> @@ -2622,11 +2614,13 @@ typedef int
> >>>> (*vhost_message_handler_t)(struct virtio_net **pdev,
> >>>>>  	struct virtio_net *dev;
> >>>>>  	struct VhostUserMsg msg;
> >>>>>  	struct rte_vdpa_device *vdpa_dev;
> >>>>> +	bool ready[VHOST_MAX_VRING];
> >>>>>  	int did = -1;
> >>>>>  	int ret;
> >>>>>  	int unlock_required = 0;
> >>>>>  	bool handled;
> >>>>>  	int request;
> >>>>> +	uint32_t i;
> >>>>>
> >>>>>  	dev = get_device(vid);
> >>>>>  	if (dev == NULL)
> >>>>> @@ -2668,6 +2662,10 @@ typedef int
> >> (*vhost_message_handler_t)(struct
> >>>> virtio_net **pdev,
> >>>>>  		VHOST_LOG_CONFIG(DEBUG, "External request %d\n",
> >>>> request);
> >>>>>  	}
> >>>>>
> >>>>> +	/* Save ready status for all the VQs before message handle.
> */
> >>>>> +	for (i = 0; i < VHOST_MAX_VRING; i++)
> >>>>> +		ready[i] = vq_is_ready(dev, dev->virtqueue[i]);
> >>>>> +
> >>>>
> >>>> This big array can be avoided if you save the ready status in the
> >>>> virtqueue once message have been handled.
> >>>
> >>> You mean you prefer to save it in virtqueue structure? Desn't it
> >>> same
> >> memory ?
> >>> In any case I don't think 0x100 is so big 😊
> >>
> >> I mean in the stack.
> >
> > Do you think that 256B is too much for stack?
> >
> >> And one advantage of saving it in the vq structure is for example you
> >> have memory hotplug. The vq is in ready state in the beginning and in
> >> the end, but during the handling the ring host virtual addresses get
> >> changed because of the munmap/mmap and we need to notify the driver
> otherwise it will miss it.
> >
> > Do you mean VHOST_USER_SET_MEM_TABLE call after first configuration?
> >
> > I don't understand what is the issue of saving it in stack here....
> 
> The issue is if you only check ready state only before and after the message
> affecting the ring is handled, it can be ready at both stages, while the rings
> have changed and state change callback should have been called.

But in this version I checked twice, before message handler and after message handler, so it should catch any update.

In any case, as I said, I will move the ready memory to the virtiqueue structure in order to save the check before the message handler.
 
> Please check the example patch I sent on Friday, it takes care of invalidating
> the ring state and call the state change callback.
> 
> > But one advantage of saving it in virtqueue structure is that the message
> handler should not check the ready state before each message.
> >
> > I will change it in next version.
> >
> >>>
> >>>>>  	ret = vhost_user_check_and_alloc_queue_pair(dev, &msg);
> >>>>>  	if (ret < 0) {
> >>>>>  		VHOST_LOG_CONFIG(ERR,
> >>>>> @@ -2802,6 +2800,25 @@ typedef int
> >> (*vhost_message_handler_t)(struct
> >>>> virtio_net **pdev,
> >>>>>  		return -1;
> >>>>>  	}
> >>>>>
> >>>>> +	did = dev->vdpa_dev_id;
> >>>>> +	vdpa_dev = rte_vdpa_get_device(did);
> >>>>> +	/* Update ready status. */
> >>>>> +	for (i = 0; i < VHOST_MAX_VRING; i++) {
> >>>>> +		bool cur_ready = vq_is_ready(dev, dev-
> >virtqueue[i]);
> >>>>> +
> >>>>> +		if ((cur_ready && request ==
> >>>> VHOST_USER_SET_VRING_ENABLE &&
> >>>>> +				i == msg.payload.state.index) ||
> >>>>
> >>>> Couldn't we remove above condition? Aren't the callbacks already
> >>>> called in the set_vring_enable handler?
> >>>
> >>> As we agreed in the design discussion:
> >>>
> >>> " 3. Same handling of the requests, except that we won't notify the
> >>> vdpa driver and the application of vring state changes in the
> >>> VHOST_USER_SET_VRING_ENABLE handler."
> >>>
> >>> So, I removed it from the set_vring_enable handler.
> >>
> >> My bad, the patch context where it is removed made to think it was in
> >> vhost_user_set_vring_err(), so I missed it.
> >>
> >> Thinking at it again since last time we discussed it, we have to send
> >> the notification from the handler in the case
> >>
> >>> Now, the ready state doesn't depend only in
> >> VHOST_USER_SET_VRING_ENABLE massage.
> >>>
> >>>>> +				cur_ready != ready[i]) {
> >>>>> +			if (vdpa_dev && vdpa_dev->ops-
> >set_vring_state)
> >>>>> +				vdpa_dev->ops-
> >set_vring_state(dev->vid, i,
> >>>>> +
> >>>> 	(int)cur_ready);
> >>>>> +
> >>>>> +			if (dev->notify_ops->vring_state_changed)
> >>>>> +				dev->notify_ops-
> >vring_state_changed(dev-
> >>>>> vid,
> >>>>> +							i,
> (int)cur_ready);
> >>>>> +		}
> >>>>> +	}
> >>>>
> >>>> I think we should move this into a dedicated function, which we
> >>>> would call in every message handler that can modify the ready state.
> >>>>
> >>>> Doing so, we would not have to assume the master sent us disable
> >>>> request for the queue before, ans also would have proper
> >>>> synchronization if the request uses reply-ack feature as it could
> >>>> assume the backend is no more processing the ring once reply-ack is
> >> received.
> >>>
> >>> Makes sense to do it before reply-ack and to create dedicated
> >>> function to
> >> it.
> >>>
> >>> Doen't the vDPA conf should be called before reply-ack too to be
> >>> sure
> >> queues are ready before reply?
> >>
> >> I don't think so, because the backend can start processing the ring after.
> >> What we don't want is that the backend continues to process the rings
> >> when the guest asked to stop doing it.
> >
> > But "doing configuration after reply" may cause that the a guest kicks a
> queue while app \ vDPA driver is being configured.
> > It may lead to some order dependencies in configuration....
> I get your point, we can try to move the configuration before the reply.
> 
> But looking at qemu source code, neither SET_VRING_KICK nor
> SET_VRING_CALL nor SET_VRING_ENABLE request for reply-ack, so it won't
> have any effect.
> 
> > In addition, now, the device ready state becomes on only in the same
> > time that a queue becomes on, so we can do the device ready check (for
> new_device \ dev_conf calls) only when a queue becomes ready in the same
> function.
> 
> If you want, we can do try that too.
> 
> >>> If so, we should move also the device ready code below (maybe also
> >>> vdpa
> >> conf) to this function too.
> >>
> >> So I don't think it is needed.
> >>> But maybe call it directly from this function and not from the
> >>> specific
> >> massage handlers is better, something like the
> >> vhost_user_check_and_alloc_queue_pair function style.
> >>>
> >>> What do you think?
> >
> > Any answer here?
> 
> To move the .new_device and .dev_conf callbacks in the same fonction that
> sends the vring change notifications? Yes, we can do that I think.
> 
> >>>
> >>>>>  	if (!(dev->flags & VIRTIO_DEV_RUNNING) && virtio_is_ready(dev)) {
> >>>>>  		dev->flags |= VIRTIO_DEV_READY;
> >>>>>
> >>>>> @@ -2816,8 +2833,6 @@ typedef int
> >> (*vhost_message_handler_t)(struct
> >>>> virtio_net **pdev,
> >>>>>  		}
> >>>>>  	}
> >>>>>
> >>>>> -	did = dev->vdpa_dev_id;
> >>>>> -	vdpa_dev = rte_vdpa_get_device(did);
> >>>>>  	if (vdpa_dev && virtio_is_ready(dev) &&
> >>>>>  			!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)
> >>>> &&
> >>>>>  			msg.request.master ==
> >>>> VHOST_USER_SET_VRING_CALL) {
> >>>>
> >>>> Shouldn't check on SET_VRING_CALL above be removed?
> >>>
> >>> Isn't it is a workaround for something?
> >>>
> >>
> >> Normally, we should no more need it, as state change notification
> >> will be sent if callfd came to change.
> >
> > Ok, will remove it.
> >


  reply	other threads:[~2020-06-22  8:41 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-18 16:28 [dpdk-dev] [PATCH v1 0/4] vhost: improve ready state Matan Azrad
2020-06-18 16:28 ` [dpdk-dev] [PATCH v1 1/4] vhost: support host notifier queue configuration Matan Azrad
2020-06-19  6:44   ` Maxime Coquelin
2020-06-19 13:28     ` Matan Azrad
2020-06-19 14:01       ` Maxime Coquelin
2020-06-21  6:26         ` Matan Azrad
2020-06-22  8:06           ` Maxime Coquelin
2020-06-18 16:28 ` [dpdk-dev] [PATCH v1 2/4] vhost: skip access lock when vDPA is configured Matan Azrad
2020-06-19  6:49   ` Maxime Coquelin
2020-06-18 16:28 ` [dpdk-dev] [PATCH v1 3/4] vhost: improve device ready definition Matan Azrad
2020-06-19  7:41   ` Maxime Coquelin
2020-06-19 12:04     ` Maxime Coquelin
2020-06-19 13:11     ` Matan Azrad
2020-06-19 13:54       ` Maxime Coquelin
2020-06-21  6:20         ` Matan Azrad
2020-06-22  8:04           ` Maxime Coquelin
2020-06-22  8:41             ` Matan Azrad [this message]
2020-06-22  8:56               ` Maxime Coquelin
2020-06-22 10:06                 ` Matan Azrad
2020-06-22 12:32                   ` Maxime Coquelin
2020-06-22 13:43                     ` Matan Azrad
2020-06-22 14:55                       ` Maxime Coquelin
2020-06-22 15:51                         ` Matan Azrad
2020-06-22 16:47                           ` Maxime Coquelin
2020-06-23  9:02                             ` Matan Azrad
2020-06-23  9:19                               ` Maxime Coquelin
2020-06-23 11:53                                 ` Matan Azrad
2020-06-23 13:55                                   ` Maxime Coquelin
2020-06-23 14:33                                     ` Maxime Coquelin
2020-06-23 14:52                                     ` Matan Azrad
2020-06-23 15:18                                       ` Maxime Coquelin
2020-06-24  5:54                                         ` Matan Azrad
2020-06-24  7:22                                           ` Maxime Coquelin
2020-06-24  8:38                                             ` Matan Azrad
2020-06-24  9:12                                               ` Maxime Coquelin
2020-06-18 16:28 ` [dpdk-dev] [PATCH v1 4/4] vdpa/mlx5: support queue update Matan Azrad
2020-06-25 13:38 ` [dpdk-dev] [PATCH v2 0/5] vhost: improve ready state Matan Azrad
2020-06-25 13:38   ` [dpdk-dev] [PATCH v2 1/5] vhost: skip access lock when vDPA is configured Matan Azrad
2020-06-28  3:06     ` Xia, Chenbo
2020-06-25 13:38   ` [dpdk-dev] [PATCH v2 2/5] vhost: improve device readiness notifications Matan Azrad
2020-06-26 12:10     ` Maxime Coquelin
2020-06-28  3:08     ` Xia, Chenbo
2020-06-25 13:38   ` [dpdk-dev] [PATCH v2 3/5] vhost: handle memory hotplug with vDPA devices Matan Azrad
2020-06-26 12:15     ` Maxime Coquelin
2020-06-28  3:18     ` Xia, Chenbo
2020-06-25 13:38   ` [dpdk-dev] [PATCH v2 4/5] vhost: notify virtq file descriptor update Matan Azrad
2020-06-26 12:19     ` Maxime Coquelin
2020-06-28  3:19     ` Xia, Chenbo
2020-06-25 13:38   ` [dpdk-dev] [PATCH v2 5/5] vdpa/mlx5: support queue update Matan Azrad
2020-06-26 12:29     ` Maxime Coquelin
2020-06-29 14:08   ` [dpdk-dev] [PATCH v3 0/6] vhost: improve ready state Matan Azrad
2020-06-29 14:08     ` [dpdk-dev] [PATCH v3 1/6] vhost: support host notifier queue configuration Matan Azrad
2020-06-29 14:08     ` [dpdk-dev] [PATCH v3 2/6] vhost: skip access lock when vDPA is configured Matan Azrad
2020-06-29 14:08     ` [dpdk-dev] [PATCH v3 3/6] vhost: improve device readiness notifications Matan Azrad
2020-06-29 14:08     ` [dpdk-dev] [PATCH v3 4/6] vhost: handle memory hotplug with vDPA devices Matan Azrad
2020-06-29 14:08     ` [dpdk-dev] [PATCH v3 5/6] vhost: notify virtq file descriptor update Matan Azrad
2020-06-29 14:08     ` [dpdk-dev] [PATCH v3 6/6] vdpa/mlx5: support queue update Matan Azrad
2020-06-29 17:24     ` [dpdk-dev] [PATCH v3 0/6] vhost: improve ready state Maxime Coquelin
2020-07-17  1:41       ` Wang, Yinan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AM0PR0502MB40198F7066F49FDA30592596D2970@AM0PR0502MB4019.eurprd05.prod.outlook.com \
    --to=matan@mellanox.com \
    --cc=dev@dpdk.org \
    --cc=maxime.coquelin@redhat.com \
    --cc=xiao.w.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).