From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: Matan Azrad, Xiao Wang
Cc: dev@dpdk.org
References: <1592497686-433697-1-git-send-email-matan@mellanox.com>
 <1592497686-433697-4-git-send-email-matan@mellanox.com>
 <631a3cf3-f21a-b292-b475-93552d8f73e8@redhat.com>
 <96191de9-63bd-0937-5bdd-d81e9db14e9f@redhat.com>
Subject: Re: [dpdk-dev] [PATCH v1 3/4] vhost: improve device ready definition
Date: Mon, 22 Jun 2020 10:04:11 +0200
Message-ID: <74f35cd8-3543-ba1c-0dc7-1502599e6cf7@redhat.com>

Hi,

On 6/21/20 8:20 AM, Matan Azrad wrote:
> Hi Maxime
>
> From: Maxime Coquelin:
>> Hi Matan,
>>
>> On 6/19/20 3:11 PM, Matan Azrad wrote:
>>> Hi Maxime
>>>
>>> Thanks for the fast review.
>>> This is the first version, let's review it carefully to be sure it is correct.
>>> @Xiao Wang, it will be good to hear your idea too.
>>> We also need to understand the effect on the IFC driver/device...
>>> Just to update that I checked this code with the mlx5 adjustments I
>>> sent in this series.
>>> It works well with the vDPA example application.
>>
>> OK.
>>
>>> From: Maxime Coquelin:
>>>> On 6/18/20 6:28 PM, Matan Azrad wrote:
>>>>> Some guest drivers may not configure disabled virtio queues.
>>>>>
>>>>> In this case, the vhost management never triggers the vDPA device
>>>>> configuration because it waits for the device to be ready.
>>>>
>>>> This is not vDPA-only, even with the SW datapath the application's
>>>> new_device callback never gets called.
>>>>
>>> Yes, I wrote it below, I can be more specific here too in the next version.
>>>
>>>>> The current ready state means that all the virtio queues should be
>>>>> configured, regardless of the enablement status.
>>>>>
>>>>> In order to support this case, this patch changes the ready state:
>>>>> The device is ready when at least 1 queue pair is configured and
>>>>> enabled.
>>>>>
>>>>> So, now, the vDPA driver will be configured when the first queue
>>>>> pair is configured and enabled.
>>>>>
>>>>> Also, the queue state operation is changed to the following rules:
>>>>> 	1. queue becomes ready (enabled and fully configured) -
>>>>> 	   set_vring_state(enabled).
>>>>> 	2. queue becomes not ready - set_vring_state(disabled).
>>>>> 	3. queue stays ready and the VHOST_USER_SET_VRING_ENABLE message
>>>>> 	   was handled - set_vring_state(enabled).
>>>>>
>>>>> The parallel operations for the application are adjusted too.
>>>>>
>>>>> Signed-off-by: Matan Azrad
>>>>> ---
>>>>>  lib/librte_vhost/vhost_user.c | 51 ++++++++++++++++++++++++++++---------------
>>>>>  1 file changed, 33 insertions(+), 18 deletions(-)
>>>>>
>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>>>>> index b0849b9..cfd5f27 100644
>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>> @@ -1295,7 +1295,7 @@
>>>>>  {
>>>>>  	bool rings_ok;
>>>>>
>>>>> -	if (!vq)
>>>>> +	if (!vq || !vq->enabled)
>>>>>  		return false;
>>>>>
>>>>>  	if (vq_is_packed(dev))
>>>>> @@ -1309,24 +1309,27 @@
>>>>>  		vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
>>>>>  }
>>>>>
>>>>> +#define VIRTIO_DEV_NUM_VQS_TO_BE_READY 2u
>>>>> +
>>>>>  static int
>>>>>  virtio_is_ready(struct virtio_net *dev)
>>>>>  {
>>>>>  	struct vhost_virtqueue *vq;
>>>>>  	uint32_t i;
>>>>>
>>>>> -	if (dev->nr_vring == 0)
>>>>> +	if (dev->nr_vring < VIRTIO_DEV_NUM_VQS_TO_BE_READY)
>>>>>  		return 0;
>>>>>
>>>>> -	for (i = 0; i < dev->nr_vring; i++) {
>>>>> +	for (i = 0; i < VIRTIO_DEV_NUM_VQS_TO_BE_READY; i++) {
>>>>>  		vq = dev->virtqueue[i];
>>>>>
>>>>>  		if (!vq_is_ready(dev, vq))
>>>>>  			return 0;
>>>>>  	}
>>>>>
>>>>> -	VHOST_LOG_CONFIG(INFO,
>>>>> -		"virtio is now ready for processing.\n");
>>>>> +	if (!(dev->flags & VIRTIO_DEV_READY))
>>>>> +		VHOST_LOG_CONFIG(INFO,
>>>>> +			"virtio is now ready for processing.\n");
>>>>>  	return 1;
>>>>>  }
>>>>>
>>>>> @@ -1970,8 +1973,6 @@ static int vhost_user_set_vring_err(struct virtio_net **pdev __rte_unused,
>>>>>  	struct virtio_net *dev = *pdev;
>>>>>  	int enable = (int)msg->payload.state.num;
>>>>>  	int index = (int)msg->payload.state.index;
>>>>> -	struct rte_vdpa_device *vdpa_dev;
>>>>> -	int did = -1;
>>>>>
>>>>>  	if (validate_msg_fds(msg, 0) != 0)
>>>>>  		return RTE_VHOST_MSG_RESULT_ERR;
>>>>>
>>>>> @@ -1980,15 +1981,6 @@ static int vhost_user_set_vring_err(struct virtio_net **pdev __rte_unused,
>>>>>  		"set queue enable: %d to qp idx: %d\n",
>>>>>  		enable, index);
>>>>>
>>>>> -	did = dev->vdpa_dev_id;
>>>>> -	vdpa_dev = rte_vdpa_get_device(did);
>>>>> -	if (vdpa_dev && vdpa_dev->ops->set_vring_state)
>>>>> -		vdpa_dev->ops->set_vring_state(dev->vid, index, enable);
>>>>> -
>>>>> -	if (dev->notify_ops->vring_state_changed)
>>>>> -		dev->notify_ops->vring_state_changed(dev->vid,
>>>>> -				index, enable);
>>>>> -
>>>>>  	/* On disable, rings have to be stopped being processed. */
>>>>>  	if (!enable && dev->dequeue_zero_copy)
>>>>>  		drain_zmbuf_list(dev->virtqueue[index]);
>>>>>
>>>>> @@ -2622,11 +2614,13 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
>>>>>  	struct virtio_net *dev;
>>>>>  	struct VhostUserMsg msg;
>>>>>  	struct rte_vdpa_device *vdpa_dev;
>>>>> +	bool ready[VHOST_MAX_VRING];
>>>>>  	int did = -1;
>>>>>  	int ret;
>>>>>  	int unlock_required = 0;
>>>>>  	bool handled;
>>>>>  	int request;
>>>>> +	uint32_t i;
>>>>>
>>>>>  	dev = get_device(vid);
>>>>>  	if (dev == NULL)
>>>>> @@ -2668,6 +2662,10 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
>>>>>  		VHOST_LOG_CONFIG(DEBUG, "External request %d\n", request);
>>>>>  	}
>>>>>
>>>>> +	/* Save ready status for all the VQs before message handle. */
>>>>> +	for (i = 0; i < VHOST_MAX_VRING; i++)
>>>>> +		ready[i] = vq_is_ready(dev, dev->virtqueue[i]);
>>>>> +
>>>>
>>>> This big array can be avoided if you save the ready status in the
>>>> virtqueue once the message has been handled.
>>>
>>> You mean you prefer to save it in the virtqueue structure? Doesn't it use the
>>> same memory?
>>> In any case I don't think 0x100 is so big 😊
>>
>> I mean in the stack.
>
> Do you think that 256B is too much for the stack?
>
>> And one advantage of saving it in the vq structure is, for example, when you have
>> memory hotplug. The vq is in ready state at the beginning and at the end, but
>> during the handling the ring host virtual addresses get changed because of
>> the munmap/mmap, and we need to notify the driver otherwise it will miss it.
>
> Do you mean a VHOST_USER_SET_MEM_TABLE call after the first configuration?
>
> I don't understand what the issue of saving it in the stack is here....

The issue is that if you only check the ready state before and after the
message affecting the ring is handled, it can be ready at both stages, while
the rings have changed and the state change callback should have been called.

Please check the example patch I sent on Friday, it takes care of invalidating
the ring state and calling the state change callback.
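
To illustrate the idea (this is only an untested sketch, not the exact patch I
sent on Friday; the per-virtqueue 'ready' flag and the helper name are made up
for the example), each handler that can affect a ring would call something like:

static void
vhost_user_update_vring_state(struct virtio_net *dev, uint32_t index)
{
        struct vhost_virtqueue *vq = dev->virtqueue[index];
        struct rte_vdpa_device *vdpa_dev;
        bool cur_ready;

        if (vq == NULL)
                return;

        /* Compare the new state with the one saved in the virtqueue,
         * so we only notify on actual transitions. */
        cur_ready = vq_is_ready(dev, vq);
        if (cur_ready == vq->ready)
                return;
        vq->ready = cur_ready;

        vdpa_dev = rte_vdpa_get_device(dev->vdpa_dev_id);
        if (vdpa_dev && vdpa_dev->ops->set_vring_state)
                vdpa_dev->ops->set_vring_state(dev->vid, index, (int)cur_ready);

        if (dev->notify_ops->vring_state_changed)
                dev->notify_ops->vring_state_changed(dev->vid,
                                index, (int)cur_ready);
}

A handler like VHOST_USER_SET_MEM_TABLE would first clear vq->ready (notifying
"disabled") before the munmap/mmap, so that calling the helper after remapping
re-sends "enabled" with the new addresses, even though the ring looks ready
both before and after the message.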
> But one advantage of saving it in the virtqueue structure is that the message handler does not have to check the ready state before each message.
>
> I will change it in the next version.
>
>>>
>>>>>  	ret = vhost_user_check_and_alloc_queue_pair(dev, &msg);
>>>>>  	if (ret < 0) {
>>>>>  		VHOST_LOG_CONFIG(ERR,
>>>>> @@ -2802,6 +2800,25 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
>>>>>  		return -1;
>>>>>  	}
>>>>>
>>>>> +	did = dev->vdpa_dev_id;
>>>>> +	vdpa_dev = rte_vdpa_get_device(did);
>>>>> +	/* Update ready status. */
>>>>> +	for (i = 0; i < VHOST_MAX_VRING; i++) {
>>>>> +		bool cur_ready = vq_is_ready(dev, dev->virtqueue[i]);
>>>>> +
>>>>> +		if ((cur_ready && request == VHOST_USER_SET_VRING_ENABLE &&
>>>>> +				i == msg.payload.state.index) ||
>>>>
>>>> Couldn't we remove the above condition? Aren't the callbacks already
>>>> called in the set_vring_enable handler?
>>>
>>> As we agreed in the design discussion:
>>>
>>> " 3. Same handling of the requests, except that we won't notify the
>>> vdpa driver and the application of vring state changes in the
>>> VHOST_USER_SET_VRING_ENABLE handler."
>>>
>>> So, I removed it from the set_vring_enable handler.
>>
>> My bad, the patch context where it is removed made me think it was in
>> vhost_user_set_vring_err(), so I missed it.
>>
>> Thinking about it again since the last time we discussed it, we have to send
>> the notification from the handler in the case
>>
>>> Now, the ready state doesn't depend only on the
>>> VHOST_USER_SET_VRING_ENABLE message.
>>>
>>>>> +				cur_ready != ready[i]) {
>>>>> +			if (vdpa_dev && vdpa_dev->ops->set_vring_state)
>>>>> +				vdpa_dev->ops->set_vring_state(dev->vid, i,
>>>>> +						(int)cur_ready);
>>>>> +
>>>>> +			if (dev->notify_ops->vring_state_changed)
>>>>> +				dev->notify_ops->vring_state_changed(dev->vid,
>>>>> +						i, (int)cur_ready);
>>>>> +		}
>>>>> +	}
>>>>
>>>> I think we should move this into a dedicated function, which we would
>>>> call in every message handler that can modify the ready state.
>>>>
>>>> Doing so, we would not have to assume the master sent us a disable
>>>> request for the queue before, and also would have proper
>>>> synchronization if the request uses the reply-ack feature, as it could
>>>> assume the backend is no longer processing the ring once the reply-ack
>>>> is received.
>>>
>>> Makes sense to do it before the reply-ack and to create a dedicated
>>> function for it.
>>>
>>> Shouldn't the vDPA conf also be called before the reply-ack, to be sure
>>> the queues are ready before the reply?
>>
>> I don't think so, because the backend can start processing the ring after.
>> What we don't want is that the backend continues to process the rings when
>> the guest asked to stop doing it.
>
> But "doing configuration after reply" may cause a guest to kick a queue while the app / vDPA driver is being configured.
> It may lead to some order dependencies in configuration....

I get your point, we can try to move the configuration before the reply.
But looking at the qemu source code, neither SET_VRING_KICK nor SET_VRING_CALL
nor SET_VRING_ENABLE requests reply-ack, so it won't have any effect.

> In addition, now, the device ready state turns on only at the same time that a queue turns on,
> so we can do the device ready check (for the new_device / dev_conf calls) only when a queue becomes ready, in the same function.

If you want, we can try that too.

>>> If so, we should also move the device ready code below (maybe also the
>>> vdpa conf) to this function.
>>
>> So I don't think it is needed.
>>
>>> But maybe calling it directly from this function and not from the specific
>>> message handlers is better, something like the
>>> vhost_user_check_and_alloc_queue_pair function style.
>>>
>>> What do you think?
>
> Any answer here?

To move the .new_device and .dev_conf callbacks into the same function
that sends the vring change notifications?

Yes, we can do that I think.

>>>
>>>>>  	if (!(dev->flags & VIRTIO_DEV_RUNNING) && virtio_is_ready(dev)) {
>>>>>  		dev->flags |= VIRTIO_DEV_READY;
>>>>>
>>>>> @@ -2816,8 +2833,6 @@ typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
>>>>>  		}
>>>>>  	}
>>>>>
>>>>> -	did = dev->vdpa_dev_id;
>>>>> -	vdpa_dev = rte_vdpa_get_device(did);
>>>>>  	if (vdpa_dev && virtio_is_ready(dev) &&
>>>>>  			!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
>>>>>  			msg.request.master == VHOST_USER_SET_VRING_CALL) {
>>>>
>>>> Shouldn't the check on SET_VRING_CALL above be removed?
>>>
>>> Isn't it a workaround for something?
>>>
>>
>> Normally, we should no longer need it, as a state change notification will
>> be sent if the callfd changes.
>
> Ok, will remove it.
>
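
To make the "dedicated function" idea a bit more concrete, it could end up
looking roughly like the untested sketch below. vhost_user_update_vring_state()
is the per-ring helper sketched earlier in this mail (both names are only
illustrative), and the flag handling just mirrors what is done today at the end
of the message handler:

static void
vhost_user_update_device_state(struct virtio_net *dev)
{
        struct rte_vdpa_device *vdpa_dev;

        if (!virtio_is_ready(dev))
                return;

        if (!(dev->flags & VIRTIO_DEV_RUNNING)) {
                dev->flags |= VIRTIO_DEV_READY;

                /* Application callback, as done today at the end of
                 * vhost_user_msg_handler(). */
                if (dev->notify_ops->new_device(dev->vid) == 0)
                        dev->flags |= VIRTIO_DEV_RUNNING;
        }

        /* vDPA device configuration, no longer tied to SET_VRING_CALL. */
        vdpa_dev = rte_vdpa_get_device(dev->vdpa_dev_id);
        if (vdpa_dev && vdpa_dev->ops->dev_conf &&
                        !(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
                vdpa_dev->ops->dev_conf(dev->vid);
                dev->flags |= VIRTIO_DEV_VDPA_CONFIGURED;
        }
}

It would be called right after the vring notification in each relevant handler,
so .new_device / dev_conf happen as soon as the first queue pair becomes ready.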