DPDK patches and discussions
From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: Matan Azrad <matan@mellanox.com>,
	"xiaolong.ye@intel.com" <xiaolong.ye@intel.com>,
	Shahaf Shuler <shahafs@mellanox.com>,
	"amorenoz@redhat.com" <amorenoz@redhat.com>,
	"xiao.w.wang@intel.com" <xiao.w.wang@intel.com>,
	Slava Ovsiienko <viacheslavo@mellanox.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: "jasowang@redhat.com" <jasowang@redhat.com>,
	"lulu@redhat.com" <lulu@redhat.com>
Subject: Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
Date: Tue, 9 Jun 2020 19:23:01 +0200
Message-ID: <64c59e91-79ef-bc55-bc64-1995cfb03c84@redhat.com>
In-Reply-To: <AM0PR0502MB40196214D01755C3AD643D8BD2820@AM0PR0502MB4019.eurprd05.prod.outlook.com>



On 6/9/20 1:09 PM, Matan Azrad wrote:
> Hi Maxime
> 
> From: Maxime Coquelin
>> Hi Matan,
>>
>> On 6/8/20 11:19 AM, Matan Azrad wrote:
>>> Hi Maxime
>>>
>>> From: Maxime Coquelin:
>>>> Hi Matan,
>>>>
>>>> On 6/7/20 12:38 PM, Matan Azrad wrote:
>>>>> Hi Maxime
>>>>>
>>>>> Thanks for the huge work.
>>>>> Please see a suggestion inline.
>>>>>
>>>>> From: Maxime Coquelin:
>>>>>> Sent: Thursday, May 14, 2020 11:02 AM
>>>>>> To: xiaolong.ye@intel.com; Shahaf Shuler <shahafs@mellanox.com>;
>>>>>> Matan Azrad <matan@mellanox.com>; amorenoz@redhat.com;
>>>>>> xiao.w.wang@intel.com; Slava Ovsiienko <viacheslavo@mellanox.com>;
>>>>>> dev@dpdk.org
>>>>>> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
>>>>>> <maxime.coquelin@redhat.com>
>>>>>> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if
>>>>>> needed
>>>>>>
>>>>>> Now that we have Virtio device status support, let's only use the
>>>>>> vDPA workaround if it is not supported.
>>>>>>
>>>>>> This patch also documents why Virtio device status protocol feature
>>>>>> support is strongly advised.
>>>>>>
>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>> ---
>>>>>>  lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
>>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>>>>> index e5a44be58d..67e96a872a 100644
>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>  	if (!vdpa_dev)
>>>>>>  		goto out;
>>>>>>
>>>>>> -	if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
>>>>>> -			request == VHOST_USER_SET_VRING_CALL) {
>>>>>> +	if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
>>>>>> +		/*
>>>>>> +		 * Workaround when Virtio device status protocol
>>>>>> +		 * feature is not supported, wait for SET_VRING_CALL
>>>>>> +		 * request. This is not ideal as some frontends like
>>>>>> +		 * Virtio-user may not send this request, so vDPA device
>>>>>> +		 * may never be configured. Virtio device status support
>>>>>> +		 * on frontend side is strongly advised.
>>>>>> +		 */
>>>>>> +		if (!(dev->protocol_features &
>>>>>> +				(1ULL << VHOST_USER_PROTOCOL_F_STATUS)) &&
>>>>>> +				(request != VHOST_USER_SET_VRING_CALL))
>>>>>> +			goto out;
>>>>>> +
>>>>>
>>>>> When the status protocol feature is not supported, in the current code,
>>>>> the vDPA configuration triggering depends on:
>>>>> 1. Device is ready - all the queues are configured (datapath
>>>>> addresses, callfd and kickfd).
>>>>> 2. The last command is callfd.
>>>>>
>>>>>
>>>>> The code doesn't take into account that some queues may stay disabled.
>>>>> Maybe the correct timing is:
>>>>> 1. Device is ready - all the enabled queues are configured and the
>>>>> MEM table is configured.
>>>>
>>>> I think the current virtio_is_ready() already assumes the mem table is
>>>> configured, otherwise we would not have vq->desc, vq->used and
>>>> vq->avail set, as they need to be translated using the mem table.
>>>>
>>> Yes, but if you don't expect to check them for disabled queues, you need
>>> to check the mem table to be sure it was set.
>>
>> Even disabled queues should be allocated/configured by the guest driver.
> Is that mandated by the spec?

Sorry, that was a misunderstanding from my side.
The queues set by the driver using the MQ_VQ_PAIRS_SET control message
have to be allocated and configured by the driver:
http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-1940001
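
For reference, the control command involved looks roughly like this (the
VIRTIO_NET_CTRL_MQ_* values are the ones defined by the virtio-net spec;
the device type and send_ctrl_cmd() helper are just placeholders for
illustration, not real driver code):

/* Sketch only: constants from the virtio-net spec, helpers hypothetical. */
#include <stdint.h>

#define VIRTIO_NET_CTRL_MQ              4
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET 0
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN 1
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX 0x8000

struct virtio_net_ctrl_mq {
	uint16_t virtqueue_pairs; /* little-endian on the wire */
};

/*
 * A driver sending this command must have allocated and configured the
 * 'pairs' RX/TX queue pairs it requests; queue pairs beyond that number
 * may stay unconfigured.
 */
static int set_queue_pairs(struct hypothetical_vdev *vdev, uint16_t pairs)
{
	struct virtio_net_ctrl_mq mq = { .virtqueue_pairs = pairs };

	return send_ctrl_cmd(vdev, VIRTIO_NET_CTRL_MQ,
			     VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET,
			     &mq, sizeof(mq)); /* hypothetical helper */
}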

> We saw that the Windows virtio guest driver doesn't configure disabled queues.
> Is it a bug in the Windows guest?
> You can probably take a look here:
> https://github.com/virtio-win/kvm-guest-drivers-windows
> 

Indeed, it limits the number of queue pairs to the number of CPUs.
This is done here:
https://github.com/virtio-win/kvm-guest-drivers-windows/blob/edda3f50a17015aab1450ca09e3263c7409e4001/NetKVM/Common/ParaNdis_Common.cpp#L956

Linux does the same by the way:
https://elixir.bootlin.com/linux/latest/source/drivers/net/virtio_net.c#L3092

We rarely face this issue because the management layers usually set the
number of queue pairs to the number of vCPUs when multiqueue is enabled.
But the problem is real.

In my opinion, the problem is more on the Vhost-user spec side and/or in
the Vhost-user backend.

The DPDK backend allocates a queue pair every time it receives a
Vhost-user message setting up a new queue (callfd, kickfd, enable... see
vhost_user_check_and_alloc_queue_pair()). Then, virtio_is_ready() waits
for all the allocated queue pairs to be initialized.
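
To make that behaviour concrete, here is a very rough sketch of the logic
(simplified, not the actual vhost_user.c code; field and helper names are
only approximations):

/*
 * Simplified sketch of the behaviour described above, not the real
 * implementation: queue pairs are allocated lazily whenever a vhost-user
 * message references a new vring index, and the device is only considered
 * ready once *every* allocated ring is fully set up, even rings the guest
 * driver will never enable.
 */
static int
check_and_alloc_queue_pair(struct virtio_net *dev, uint32_t vring_idx)
{
	/* Allocate every vring up to and including the referenced one. */
	while (dev->nr_vring <= vring_idx) {
		/* alloc_one_vring() is a hypothetical helper here. */
		if (alloc_one_vring(dev, dev->nr_vring) < 0)
			return -1;
		dev->nr_vring++;
	}

	return 0;
}

static bool
virtio_is_ready(struct virtio_net *dev)
{
	uint32_t i;

	for (i = 0; i < dev->nr_vring; i++) {
		struct vhost_virtqueue *vq = dev->virtqueue[i];

		/* Ring addresses, callfd and kickfd must all be set. */
		if (vq->desc == NULL || vq->callfd < 0 || vq->kickfd < 0)
			return false;
	}

	return dev->nr_vring != 0;
}

In the Windows case, some of the rings allocated this way never get their
addresses set, so in that scenario the check above can never succeed.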

The problem is that QEMU sends some of these messages even for queues
that aren't (or won't be) initialized, as you can see in the log below,
where I reproduced the issue with Windows 10:
https://pastebin.com/YYCfW9Y3

I don't see how the backend could know the guest driver is done with the
information currently received from QEMU, since from its point of view
some queues are only partially initialized (callfd is set).

With VHOST_USER_SET_STATUS, we will be able to handle this properly, as
the backend can be sure the guest won't initialize more queues once the
DRIVER_OK Virtio status bit is set. In my v2, I can add a patch to handle
this case, by "destroying" queue metadata as soon as DRIVER_OK is
received.
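
Something along these lines, just to sketch the idea (this is not the
actual v2 patch; the helpers are hypothetical and the status bit value is
the one from the Virtio spec):

/*
 * Idea sketch only: once the guest sets DRIVER_OK it may not initialize
 * any further queues, so rings that were allocated but never fully
 * configured can be dropped before evaluating readiness.
 */
#define VIRTIO_DEVICE_STATUS_DRIVER_OK 0x04 /* value from the Virtio spec */

static void
handle_set_status(struct virtio_net *dev, uint8_t status)
{
	dev->status = status;

	if (!(status & VIRTIO_DEVICE_STATUS_DRIVER_OK))
		return;

	/* Drop trailing rings the guest driver left untouched;
	 * vq_is_configured() and free_one_vring() are hypothetical helpers. */
	while (dev->nr_vring > 0 &&
	       !vq_is_configured(dev->virtqueue[dev->nr_vring - 1])) {
		free_one_vring(dev, dev->nr_vring - 1);
		dev->virtqueue[dev->nr_vring - 1] = NULL;
		dev->nr_vring--;
	}
}

That way, readiness only has to consider the queues the guest driver
actually configured before setting DRIVER_OK.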

Note that this was the exact reason why I first tried to add support for
VHOST_USER_SET_STATUS more than two years ago...:
https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04560.html

What do you think?

Regards,
Maxime



Thread overview: 35+ messages
2020-05-14  8:02 [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init Maxime Coquelin
2020-05-14  8:02 ` [dpdk-dev] [PATCH 1/9] vhost: fix virtio ready flag check Maxime Coquelin
2020-05-14  8:02 ` [dpdk-dev] [PATCH 2/9] vhost: refactor Virtio ready check Maxime Coquelin
2020-05-14  8:02 ` [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable Maxime Coquelin
2020-05-15  8:45   ` Ye Xiaolong
2020-05-15  9:09   ` Jason Wang
2020-05-15  9:42     ` Wang, Xiao W
2020-05-15 10:06       ` Jason Wang
2020-05-15 10:08       ` Jason Wang
2020-05-18  3:09         ` Wang, Xiao W
2020-05-18  3:17           ` Jason Wang
2020-05-14  8:02 ` [dpdk-dev] [PATCH 4/9] vhost: make some vDPA callbacks mandatory Maxime Coquelin
2020-05-14  8:02 ` [dpdk-dev] [PATCH 5/9] vhost: check vDPA configuration succeed Maxime Coquelin
2020-05-14  8:02 ` [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status Maxime Coquelin
2020-06-11  2:45   ` Xia, Chenbo
2020-06-16  4:29   ` Xia, Chenbo
2020-06-22 10:18     ` Adrian Moreno
2020-06-22 11:00       ` Xia, Chenbo
2020-05-14  8:02 ` [dpdk-dev] [PATCH 7/9] vdpa/ifc: enable status protocol feature Maxime Coquelin
2020-05-14  8:02 ` [dpdk-dev] [PATCH 8/9] vdpa/mlx5: " Maxime Coquelin
2020-05-14  8:02 ` [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed Maxime Coquelin
2020-06-07 10:38   ` Matan Azrad
2020-06-08  8:34     ` Maxime Coquelin
2020-06-08  9:19       ` Matan Azrad
2020-06-09  9:04         ` Maxime Coquelin
2020-06-09 11:09           ` Matan Azrad
2020-06-09 11:26             ` Maxime Coquelin
2020-06-09 17:23             ` Maxime Coquelin [this message]
2020-06-14  6:08               ` Matan Azrad
2020-06-17  9:39                 ` Maxime Coquelin
2020-06-17 11:04                   ` Matan Azrad
2020-06-17 12:29                     ` Maxime Coquelin
2020-06-18  6:39                       ` Matan Azrad
2020-06-18  7:30                         ` Maxime Coquelin
2020-06-23 10:42                           ` Wang, Xiao W
