To: Matan Azrad <matan@mellanox.com>,
 "xiaolong.ye@intel.com" <xiaolong.ye@intel.com>,
 Shahaf Shuler <shahafs@mellanox.com>,
 "amorenoz@redhat.com" <amorenoz@redhat.com>,
 "xiao.w.wang@intel.com" <xiao.w.wang@intel.com>,
 Slava Ovsiienko <viacheslavo@mellanox.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "jasowang@redhat.com" <jasowang@redhat.com>,
 "lulu@redhat.com" <lulu@redhat.com>
References: <20200514080218.1435344-1-maxime.coquelin@redhat.com>
 <20200514080218.1435344-10-maxime.coquelin@redhat.com>
 <AM0PR0502MB4019EE6ECDB66AD2D0D22925D2840@AM0PR0502MB4019.eurprd05.prod.outlook.com>
 <0216165f-aedd-06c7-5a90-2cd0d238b143@redhat.com>
 <AM0PR0502MB401927EDA2422820366C4CF9D2850@AM0PR0502MB4019.eurprd05.prod.outlook.com>
 <97528794-aa14-37da-a5bb-1e1d46e9127a@redhat.com>
 <AM0PR0502MB40196214D01755C3AD643D8BD2820@AM0PR0502MB4019.eurprd05.prod.outlook.com>
From: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-ID: <64c59e91-79ef-bc55-bc64-1995cfb03c84@redhat.com>
Date: Tue, 9 Jun 2020 19:23:01 +0200
In-Reply-To: <AM0PR0502MB40196214D01755C3AD643D8BD2820@AM0PR0502MB4019.eurprd05.prod.outlook.com>
Subject: Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed

On 6/9/20 1:09 PM, Matan Azrad wrote:
> Hi Maxime
> 
> From: Maxime Coquelin
>> Hi Matan,
>>
>> On 6/8/20 11:19 AM, Matan Azrad wrote:
>>> Hi Maxime
>>>
>>> From: Maxime Coquelin:
>>>> Hi Matan,
>>>>
>>>> On 6/7/20 12:38 PM, Matan Azrad wrote:
>>>>> Hi Maxime
>>>>>
>>>>> Thanks for the huge work.
>>>>> Please see a suggestion inline.
>>>>>
>>>>> From: Maxime Coquelin:
>>>>>> Sent: Thursday, May 14, 2020 11:02 AM
>>>>>> To: xiaolong.ye@intel.com; Shahaf Shuler <shahafs@mellanox.com>;
>>>>>> Matan Azrad <matan@mellanox.com>; amorenoz@redhat.com;
>>>>>> xiao.w.wang@intel.com; Slava Ovsiienko <viacheslavo@mellanox.com>;
>>>>>> dev@dpdk.org
>>>>>> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
>>>>>> <maxime.coquelin@redhat.com>
>>>>>> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if
>>>>>> needed
>>>>>>
>>>>>> Now that we have Virtio device status support, let's only use the
>>>>>> vDPA workaround if it is not supported.
>>>>>>
>>>>>> This patch also documents why Virtio device status protocol feature
>>>>>> support is strongly advised.
>>>>>>
>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>> ---
>>>>>>  lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
>>>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>>>>> index e5a44be58d..67e96a872a 100644
>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>  	if (!vdpa_dev)
>>>>>>  		goto out;
>>>>>>
>>>>>> -	if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
>>>>>> -			request == VHOST_USER_SET_VRING_CALL) {
>>>>>> +	if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
>>>>>> +		/*
>>>>>> +		 * Workaround when Virtio device status protocol
>>>>>> +		 * feature is not supported, wait for SET_VRING_CALL
>>>>>> +		 * request. This is not ideal as some frontends like
>>>>>> +		 * Virtio-user may not send this request, so vDPA device
>>>>>> +		 * may never be configured. Virtio device status support
>>>>>> +		 * on frontend side is strongly advised.
>>>>>> +		 */
>>>>>> +		if (!(dev->protocol_features &
>>>>>> +				(1ULL << VHOST_USER_PROTOCOL_F_STATUS)) &&
>>>>>> +				(request != VHOST_USER_SET_VRING_CALL))
>>>>>> +			goto out;
>>>>>> +
>>>>>
>>>>> When status protocol feature is not supported, in the current code,
>>>>> the vDPA configuration triggering depends on:
>>>>> 1. Device is ready - all the queues are configured (datapath
>>>>> addresses, callfd and kickfd).
>>>>> 2. Last command is callfd.
>>>>>
>>>>> The code doesn't take into account that some queues may stay disabled.
>>>>> Maybe the correct timing is:
>>>>> 1. Device is ready - all the enabled queues are configured and MEM
>>>>> table is configured.
>>>>
>>>> I think current virtio_is_ready() already assumes the mem table is
>>>> configured, otherwise we would not have vq->desc, vq->used and
>>>> vq->avail being set, as they need to be translated using the mem table.
>>>>
>>> Yes, but if you don't expect to check them for disabled queues, you need
>>> to check the mem table to be sure it was set.
>>
>> Even disabled queues should be allocated/configured by the guest driver.
> Is that mandated by the spec?

Sorry, that was a misunderstanding from my side.
All queue pairs up to the number set by the driver with the
MQ_VQ_PAIRS_SET control message have to be allocated and configured by
the driver:
http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-1940001
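For reference, the control command involved looks like this (a sketch
following the spec layout, with the naming used in the Linux uapi
header, not DPDK definitions):

struct virtio_net_ctrl_mq {
	/* Number of queue pairs the driver will actually use */
	__virtio16 virtqueue_pairs;
};

#define VIRTIO_NET_CTRL_MQ              4
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET 0
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN 1
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX 0x8000

Per the spec, the driver has to configure the virtqueues before
enabling them with this command.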

> We saw that the Windows virtio guest driver doesn't configure disabled queues.
> Is it a bug in the Windows guest?
> You probably can take a look here:
> https://github.com/virtio-win/kvm-guest-drivers-windows
> 

Indeed, it limits the number of queue pairs to the number of CPUs.
This is done here:
https://github.com/virtio-win/kvm-guest-drivers-windows/blob/edda3f50a17015aab1450ca09e3263c7409e4001/NetKVM/Common/ParaNdis_Common.cpp#L956

Linux does the same by the way:
https://elixir.bootlin.com/linux/latest/source/drivers/net/virtio_net.c#L3092
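Both effectively end up doing something like this (paraphrased, not the
actual code of either driver):

	/* Use at most as many queue pairs as there are online CPUs,
	 * even if the device advertises more. */
	num_queue_pairs = min(max_queue_pairs, num_online_cpus());

and the queue pairs above that limit are never configured by the guest.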

We rarely face this issue because, by default, the management layers
usually set the number of queue pairs to the number of vCPUs when
multiqueue is enabled. But the problem is real.

In my opinion, the problem is more on the Vhost-user spec side and/or
in the Vhost-user backend.

The DPDK backend allocates a queue pair every time it receives a
Vhost-user message referencing a new queue (callfd, kickfd, enable, ...;
see vhost_user_check_and_alloc_queue_pair()). Then virtio_is_ready()
waits for all the allocated queue pairs to be initialized.
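To illustrate, the readiness check is conceptually equivalent to this
(simplified sketch, not the exact code):

static int
virtio_is_ready(struct virtio_net *dev)
{
	uint32_t i;

	/* No queue pair allocated yet: not ready. */
	if (dev->nr_vring == 0)
		return 0;

	for (i = 0; i < dev->nr_vring; i++) {
		struct vhost_virtqueue *vq = dev->virtqueue[i];

		/* Every allocated queue must be fully set up: ring
		 * addresses translated, callfd and kickfd received. */
		if (!vq_is_ready(dev, vq))
			return 0;
	}

	return 1;
}

So a single SET_VRING_CALL on a queue pair the driver will never enable
is enough to allocate it and keep the device "not ready" forever.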

The problem is that QEMU sends some of these messages even for queues
that aren't (or won't be) initialized, as you can see in the log below,
where I reproduced the issue with Windows 10:
https://pastebin.com/YYCfW9Y3

I don't see how the backend could know, from the information received
from QEMU so far, that the guest driver is done, since some queues look
to it as if they were only partially initialized (callfd is set).

With VHOST_USER_SET_STATUS, we will be able to handle this properly, as
the backend can be sure the guest won't initialize more queues once the
DRIVER_OK Virtio status bit is set. In my v2, I can add one patch to
handle this case, destroying the metadata of unconfigured queues as soon
as DRIVER_OK is received.
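Roughly along these lines (just a sketch of the idea for the v2; the
handler and the free_unconfigured_vrings() helper are illustrative, not
final code):

static int
vhost_user_set_status(struct virtio_net *dev, uint8_t status)
{
	dev->status = status;

	if (dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK) {
		/*
		 * The driver must not initialize new queues after
		 * DRIVER_OK, so metadata of queue pairs that were
		 * allocated but never fully configured can be dropped.
		 */
		free_unconfigured_vrings(dev);
	}

	return RTE_VHOST_MSG_RESULT_OK;
}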

Note that this was the exact reason why I first tried to add support
for VHOST_USER_SET_STATUS more than two years ago:
https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04560.html

What do you think?

Regards,
Maxime