* [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init
@ 2020-05-14 8:02 Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 1/9] vhost: fix virtio ready flag check Maxime Coquelin
` (8 more replies)
0 siblings, 9 replies; 35+ messages in thread
From: Maxime Coquelin @ 2020-05-14 8:02 UTC (permalink / raw)
To: xiaolong.ye, shahafs, matan, amorenoz, xiao.w.wang, viacheslavo, dev
Cc: jasowang, lulu, Maxime Coquelin
The goal of this series is to make the Vhost/vDPA
device init more robust by adding support for a new
protocol feature.
VHOST_USER_SET_STATUS is received by the backend
when the driver updates the Virtio device status
register.
For now, this series only deals with the DRIVER_OK
bit, which indicates that the driver is done with
the device initialization.
For example, such information helps the Vhost
backend to know when it can call the dev_conf vDPA
callback. It avoids a hazardous workaround
which only works with QEMU and not with the
Virtio-user PMD.
Before adding support for this new request, some
clean-ups and refactoring are done as preliminary
steps to make the code more easily readable.
Some vDPA APIs are also made mandatory, and the
IFC driver is modified to add support for the
.set_vring_state() callback. I think this one
should be mandatory, as when the frontend requests
that a queue be disabled, the vDPA device should
no longer be processing it.
Note that the VHOST_USER_PROTOCOL_F_STATUS
protocol feature requires frontend support.
For QEMU, it has been posted for review and
can be found here:
https://patchwork.ozlabs.org/project/qemu-devel/patch/20200514073332.1434576-1-maxime.coquelin@redhat.com/
For the Virtio-user PMD, it is being developed
and will be posted soon.
Maxime Coquelin (9):
vhost: fix virtio ready flag check
vhost: refactor Virtio ready check
vdpa/ifc: add support to vDPA queue enable
vhost: make some vDPA callbacks mandatory
vhost: check vDPA configuration succeed
vhost: add support for virtio status
vdpa/ifc: enable status protocol feature
vdpa/mlx5: enable status protocol feature
vhost: only use vDPA config workaround if needed
drivers/vdpa/ifc/base/ifcvf.c | 9 +++
drivers/vdpa/ifc/base/ifcvf.h | 4 ++
drivers/vdpa/ifc/ifcvf_vdpa.c | 26 ++++++++-
drivers/vdpa/mlx5/mlx5_vdpa.c | 4 +-
lib/librte_vhost/rte_vdpa.h | 14 +++--
lib/librte_vhost/rte_vhost.h | 4 ++
lib/librte_vhost/socket.c | 6 +-
lib/librte_vhost/vdpa.c | 10 ++++
lib/librte_vhost/vhost.c | 3 +-
lib/librte_vhost/vhost.h | 9 +++
lib/librte_vhost/vhost_user.c | 103 +++++++++++++++++++++++++++-------
lib/librte_vhost/vhost_user.h | 6 +-
12 files changed, 162 insertions(+), 36 deletions(-)
--
2.25.4
^ permalink raw reply [flat|nested] 35+ messages in thread
* [dpdk-dev] [PATCH 1/9] vhost: fix virtio ready flag check
2020-05-14 8:02 [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init Maxime Coquelin
@ 2020-05-14 8:02 ` Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 2/9] vhost: refactor Virtio ready check Maxime Coquelin
` (7 subsequent siblings)
8 siblings, 0 replies; 35+ messages in thread
From: Maxime Coquelin @ 2020-05-14 8:02 UTC (permalink / raw)
To: xiaolong.ye, shahafs, matan, amorenoz, xiao.w.wang, viacheslavo, dev
Cc: jasowang, lulu, Maxime Coquelin, stable
Before checking whether the device is ready, a
check is done on whether the RUNNING flag is set.
Then the READY flag is set if virtio_is_ready()
returns true.
While this does not seem to cause any issue, it
makes more sense to check whether the READY flag
is set, not the RUNNING one.
Fixes: c0674b1bc898 ("vhost: move the device ready check at proper place")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost_user.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 9c891d4c01..106b6d7609 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -2784,7 +2784,7 @@ vhost_user_msg_handler(int vid, int fd)
return -1;
}
- if (!(dev->flags & VIRTIO_DEV_RUNNING) && virtio_is_ready(dev)) {
+ if (!(dev->flags & VIRTIO_DEV_READY) && virtio_is_ready(dev)) {
dev->flags |= VIRTIO_DEV_READY;
if (!(dev->flags & VIRTIO_DEV_RUNNING)) {
--
2.25.4
* [dpdk-dev] [PATCH 2/9] vhost: refactor Virtio ready check
2020-05-14 8:02 [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 1/9] vhost: fix virtio ready flag check Maxime Coquelin
@ 2020-05-14 8:02 ` Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable Maxime Coquelin
` (6 subsequent siblings)
8 siblings, 0 replies; 35+ messages in thread
From: Maxime Coquelin @ 2020-05-14 8:02 UTC (permalink / raw)
To: xiaolong.ye, shahafs, matan, amorenoz, xiao.w.wang, viacheslavo, dev
Cc: jasowang, lulu, Maxime Coquelin
This patch is a small refactoring, as preliminary
work for adding Virtio status support.
No functional change here.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost.c | 1 +
lib/librte_vhost/vhost_user.c | 40 +++++++++++++++++++++--------------
2 files changed, 25 insertions(+), 16 deletions(-)
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 0266318440..fd8ba1be2d 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -716,6 +716,7 @@ vhost_enable_dequeue_zero_copy(int vid)
return;
dev->dequeue_zero_copy = 1;
+ VHOST_LOG_CONFIG(INFO, "dequeue zero copy is enabled\n");
}
void
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 106b6d7609..f5800bd9c1 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -1315,6 +1315,9 @@ virtio_is_ready(struct virtio_net *dev)
struct vhost_virtqueue *vq;
uint32_t i;
+ if (dev->flags & VIRTIO_DEV_READY)
+ return 1;
+
if (dev->nr_vring == 0)
return 0;
@@ -1325,8 +1328,11 @@ virtio_is_ready(struct virtio_net *dev)
return 0;
}
+ dev->flags |= VIRTIO_DEV_READY;
+
VHOST_LOG_CONFIG(INFO,
"virtio is now ready for processing.\n");
+
return 1;
}
@@ -2606,7 +2612,6 @@ vhost_user_msg_handler(int vid, int fd)
struct virtio_net *dev;
struct VhostUserMsg msg;
struct rte_vdpa_device *vdpa_dev;
- int did = -1;
int ret;
int unlock_required = 0;
bool handled;
@@ -2784,30 +2789,33 @@ vhost_user_msg_handler(int vid, int fd)
return -1;
}
- if (!(dev->flags & VIRTIO_DEV_READY) && virtio_is_ready(dev)) {
- dev->flags |= VIRTIO_DEV_READY;
+ if (!virtio_is_ready(dev))
+ goto out;
- if (!(dev->flags & VIRTIO_DEV_RUNNING)) {
- if (dev->dequeue_zero_copy) {
- VHOST_LOG_CONFIG(INFO,
- "dequeue zero copy is enabled\n");
- }
+ /*
+ * Virtio is now ready. If not done already, it is time
+ * to notify the application it can process the rings and
+ * configure the vDPA device if present.
+ */
- if (dev->notify_ops->new_device(dev->vid) == 0)
- dev->flags |= VIRTIO_DEV_RUNNING;
- }
+ if (!(dev->flags & VIRTIO_DEV_RUNNING)) {
+ if (dev->notify_ops->new_device(dev->vid) == 0)
+ dev->flags |= VIRTIO_DEV_RUNNING;
}
- did = dev->vdpa_dev_id;
- vdpa_dev = rte_vdpa_get_device(did);
- if (vdpa_dev && virtio_is_ready(dev) &&
- !(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
- msg.request.master == VHOST_USER_SET_VRING_CALL) {
+ vdpa_dev = rte_vdpa_get_device(dev->vdpa_dev_id);
+ if (!vdpa_dev)
+ goto out;
+
+ if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
+ request == VHOST_USER_SET_VRING_CALL) {
if (vdpa_dev->ops->dev_conf)
vdpa_dev->ops->dev_conf(dev->vid);
+
dev->flags |= VIRTIO_DEV_VDPA_CONFIGURED;
}
+out:
return 0;
}
--
2.25.4
* [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
2020-05-14 8:02 [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 1/9] vhost: fix virtio ready flag check Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 2/9] vhost: refactor Virtio ready check Maxime Coquelin
@ 2020-05-14 8:02 ` Maxime Coquelin
2020-05-15 8:45 ` Ye Xiaolong
2020-05-15 9:09 ` Jason Wang
2020-05-14 8:02 ` [dpdk-dev] [PATCH 4/9] vhost: make some vDPA callbacks mandatory Maxime Coquelin
` (5 subsequent siblings)
8 siblings, 2 replies; 35+ messages in thread
From: Maxime Coquelin @ 2020-05-14 8:02 UTC (permalink / raw)
To: xiaolong.ye, shahafs, matan, amorenoz, xiao.w.wang, viacheslavo, dev
Cc: jasowang, lulu, Maxime Coquelin
This patch adds support for enabling and disabling
vrings at a per-vring granularity.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
drivers/vdpa/ifc/base/ifcvf.c | 9 +++++++++
drivers/vdpa/ifc/base/ifcvf.h | 4 ++++
drivers/vdpa/ifc/ifcvf_vdpa.c | 23 ++++++++++++++++++++++-
3 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
index 3c0b2dff66..dd4e7468ae 100644
--- a/drivers/vdpa/ifc/base/ifcvf.c
+++ b/drivers/vdpa/ifc/base/ifcvf.c
@@ -327,3 +327,12 @@ ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
return (u8 *)hw->notify_addr[qid] -
(u8 *)hw->mem_resource[hw->notify_region].addr;
}
+
+void
+ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable)
+{
+ struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
+
+ IFCVF_WRITE_REG16(qid, &cfg->queue_select);
+ IFCVF_WRITE_REG16(enable, &cfg->queue_enable);
+}
diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
index eb04a94067..bd85010eff 100644
--- a/drivers/vdpa/ifc/base/ifcvf.h
+++ b/drivers/vdpa/ifc/base/ifcvf.h
@@ -159,4 +159,8 @@ ifcvf_get_notify_region(struct ifcvf_hw *hw);
u64
ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
+void
+ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable);
+
+
#endif /* _IFCVF_H_ */
diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
index ec97178dcb..55ce0cf13d 100644
--- a/drivers/vdpa/ifc/ifcvf_vdpa.c
+++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
@@ -937,6 +937,27 @@ ifcvf_dev_close(int vid)
return 0;
}
+static int
+ifcvf_set_vring_state(int vid, int vring, int state)
+{
+ int did;
+ struct internal_list *list;
+ struct ifcvf_internal *internal;
+
+ did = rte_vhost_get_vdpa_device_id(vid);
+ list = find_internal_resource_by_did(did);
+ if (list == NULL) {
+ DRV_LOG(ERR, "Invalid device id: %d", did);
+ return -1;
+ }
+
+ internal = list->internal;
+
+ ifcvf_queue_enable(&internal->hw, (uint16_t)vring, (uint16_t) state);
+
+ return 0;
+}
+
static int
ifcvf_set_features(int vid)
{
@@ -1086,7 +1107,7 @@ static struct rte_vdpa_dev_ops ifcvf_ops = {
.get_protocol_features = ifcvf_get_protocol_features,
.dev_conf = ifcvf_dev_config,
.dev_close = ifcvf_dev_close,
- .set_vring_state = NULL,
+ .set_vring_state = ifcvf_set_vring_state,
.set_features = ifcvf_set_features,
.migration_done = NULL,
.get_vfio_group_fd = ifcvf_get_vfio_group_fd,
--
2.25.4
* [dpdk-dev] [PATCH 4/9] vhost: make some vDPA callbacks mandatory
2020-05-14 8:02 [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init Maxime Coquelin
` (2 preceding siblings ...)
2020-05-14 8:02 ` [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable Maxime Coquelin
@ 2020-05-14 8:02 ` Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 5/9] vhost: check vDPA configuration succeed Maxime Coquelin
` (4 subsequent siblings)
8 siblings, 0 replies; 35+ messages in thread
From: Maxime Coquelin @ 2020-05-14 8:02 UTC (permalink / raw)
To: xiaolong.ye, shahafs, matan, amorenoz, xiao.w.wang, viacheslavo, dev
Cc: jasowang, lulu, Maxime Coquelin
Some of the vDPA callbacks have to be implemented
for vDPA to work properly.
This patch marks them as mandatory in the API doc
and simplifies the code calling these ops by
removing unnecessary checks that are now done at
registration time.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/rte_vdpa.h | 14 ++++++++------
lib/librte_vhost/socket.c | 6 +++---
lib/librte_vhost/vdpa.c | 10 ++++++++++
lib/librte_vhost/vhost.c | 2 +-
lib/librte_vhost/vhost_user.c | 7 +++----
5 files changed, 25 insertions(+), 14 deletions(-)
diff --git a/lib/librte_vhost/rte_vdpa.h b/lib/librte_vhost/rte_vdpa.h
index 3c400ee79b..a4f74211db 100644
--- a/lib/librte_vhost/rte_vdpa.h
+++ b/lib/librte_vhost/rte_vdpa.h
@@ -41,23 +41,25 @@ struct rte_vdpa_dev_addr {
* vdpa device operations
*/
struct rte_vdpa_dev_ops {
- /** Get capabilities of this device */
+ /** Get capabilities of this device (Mandatory) */
int (*get_queue_num)(int did, uint32_t *queue_num);
- /** Get supported features of this device */
+ /** Get supported features of this device (Mandatory) */
int (*get_features)(int did, uint64_t *features);
- /** Get supported protocol features of this device */
+ /** Get supported protocol features of this device (Mandatory) */
int (*get_protocol_features)(int did, uint64_t *protocol_features);
- /** Driver configure/close the device */
+ /** Driver configure the device (Mandatory) */
int (*dev_conf)(int vid);
+
+ /** Driver close the device (Mandatory) */
int (*dev_close)(int vid);
- /** Enable/disable this vring */
+ /** Enable/disable this vring (Mandatory) */
int (*set_vring_state)(int vid, int vring, int state);
- /** Set features when changed */
+ /** Set features when changed (Mandatory) */
int (*set_features)(int vid);
/** Destination operations when migration done */
diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index bb8d0d7801..23f3448d2c 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -707,7 +707,7 @@ rte_vhost_driver_get_features(const char *path, uint64_t *features)
did = vsocket->vdpa_dev_id;
vdpa_dev = rte_vdpa_get_device(did);
- if (!vdpa_dev || !vdpa_dev->ops->get_features) {
+ if (!vdpa_dev) {
*features = vsocket->features;
goto unlock_exit;
}
@@ -762,7 +762,7 @@ rte_vhost_driver_get_protocol_features(const char *path,
did = vsocket->vdpa_dev_id;
vdpa_dev = rte_vdpa_get_device(did);
- if (!vdpa_dev || !vdpa_dev->ops->get_protocol_features) {
+ if (!vdpa_dev) {
*protocol_features = vsocket->protocol_features;
goto unlock_exit;
}
@@ -804,7 +804,7 @@ rte_vhost_driver_get_queue_num(const char *path, uint32_t *queue_num)
did = vsocket->vdpa_dev_id;
vdpa_dev = rte_vdpa_get_device(did);
- if (!vdpa_dev || !vdpa_dev->ops->get_queue_num) {
+ if (!vdpa_dev) {
*queue_num = VHOST_MAX_QUEUE_PAIRS;
goto unlock_exit;
}
diff --git a/lib/librte_vhost/vdpa.c b/lib/librte_vhost/vdpa.c
index b2b2a105f1..e5e949f6d3 100644
--- a/lib/librte_vhost/vdpa.c
+++ b/lib/librte_vhost/vdpa.c
@@ -66,6 +66,16 @@ rte_vdpa_register_device(struct rte_vdpa_dev_addr *addr,
if (i == MAX_VHOST_DEVICE)
return -1;
+ /* Check mandatory ops are implemented */
+ if (!ops->get_queue_num || !ops->get_features ||
+ !ops->get_protocol_features || !ops->dev_conf ||
+ !ops->dev_close || !ops->set_vring_state ||
+ !ops->set_features) {
+ VHOST_LOG_CONFIG(ERR,
+ "Some mandatory vDPA ops aren't implemented\n");
+ return -1;
+ }
+
snprintf(device_name, sizeof(device_name), "vdpa-dev-%d", i);
dev = rte_zmalloc(device_name, sizeof(struct rte_vdpa_device),
RTE_CACHE_LINE_SIZE);
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index fd8ba1be2d..7b5c9b3537 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -649,7 +649,7 @@ vhost_destroy_device_notify(struct virtio_net *dev)
if (dev->flags & VIRTIO_DEV_RUNNING) {
did = dev->vdpa_dev_id;
vdpa_dev = rte_vdpa_get_device(did);
- if (vdpa_dev && vdpa_dev->ops->dev_close)
+ if (vdpa_dev)
vdpa_dev->ops->dev_close(dev->vid);
dev->flags &= ~VIRTIO_DEV_RUNNING;
dev->notify_ops->destroy_device(dev->vid);
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index f5800bd9c1..7a43c2fae7 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -386,7 +386,7 @@ vhost_user_set_features(struct virtio_net **pdev, struct VhostUserMsg *msg,
did = dev->vdpa_dev_id;
vdpa_dev = rte_vdpa_get_device(did);
- if (vdpa_dev && vdpa_dev->ops->set_features)
+ if (vdpa_dev)
vdpa_dev->ops->set_features(dev->vid);
return RTE_VHOST_MSG_RESULT_OK;
@@ -1972,7 +1972,7 @@ vhost_user_set_vring_enable(struct virtio_net **pdev,
did = dev->vdpa_dev_id;
vdpa_dev = rte_vdpa_get_device(did);
- if (vdpa_dev && vdpa_dev->ops->set_vring_state)
+ if (vdpa_dev)
vdpa_dev->ops->set_vring_state(dev->vid, index, enable);
if (dev->notify_ops->vring_state_changed)
@@ -2809,8 +2809,7 @@ vhost_user_msg_handler(int vid, int fd)
if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
request == VHOST_USER_SET_VRING_CALL) {
- if (vdpa_dev->ops->dev_conf)
- vdpa_dev->ops->dev_conf(dev->vid);
+ vdpa_dev->ops->dev_conf(dev->vid);
dev->flags |= VIRTIO_DEV_VDPA_CONFIGURED;
}
--
2.25.4
* [dpdk-dev] [PATCH 5/9] vhost: check vDPA configuration succeed
2020-05-14 8:02 [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init Maxime Coquelin
` (3 preceding siblings ...)
2020-05-14 8:02 ` [dpdk-dev] [PATCH 4/9] vhost: make some vDPA callbacks mandatory Maxime Coquelin
@ 2020-05-14 8:02 ` Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status Maxime Coquelin
` (3 subsequent siblings)
8 siblings, 0 replies; 35+ messages in thread
From: Maxime Coquelin @ 2020-05-14 8:02 UTC (permalink / raw)
To: xiaolong.ye, shahafs, matan, amorenoz, xiao.w.wang, viacheslavo, dev
Cc: jasowang, lulu, Maxime Coquelin
This patch checks whether the vDPA device
configuration succeeded, and does not set the
CONFIGURED flag if it didn't.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost_user.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 7a43c2fae7..4a847f368c 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -2809,9 +2809,11 @@ vhost_user_msg_handler(int vid, int fd)
if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
request == VHOST_USER_SET_VRING_CALL) {
- vdpa_dev->ops->dev_conf(dev->vid);
-
- dev->flags |= VIRTIO_DEV_VDPA_CONFIGURED;
+ if (vdpa_dev->ops->dev_conf(dev->vid))
+ VHOST_LOG_CONFIG(ERR,
+ "Failed to configure vDPA device\n");
+ else
+ dev->flags |= VIRTIO_DEV_VDPA_CONFIGURED;
}
out:
--
2.25.4
* [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status
2020-05-14 8:02 [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init Maxime Coquelin
` (4 preceding siblings ...)
2020-05-14 8:02 ` [dpdk-dev] [PATCH 5/9] vhost: check vDPA configuration succeed Maxime Coquelin
@ 2020-05-14 8:02 ` Maxime Coquelin
2020-06-11 2:45 ` Xia, Chenbo
2020-06-16 4:29 ` Xia, Chenbo
2020-05-14 8:02 ` [dpdk-dev] [PATCH 7/9] vdpa/ifc: enable status protocol feature Maxime Coquelin
` (2 subsequent siblings)
8 siblings, 2 replies; 35+ messages in thread
From: Maxime Coquelin @ 2020-05-14 8:02 UTC (permalink / raw)
To: xiaolong.ye, shahafs, matan, amorenoz, xiao.w.wang, viacheslavo, dev
Cc: jasowang, lulu, Maxime Coquelin
This patch adds support for the new Virtio device
status Vhost-user protocol feature.
Getting this information in the backend helps to
know when the driver is done with the device
configuration, and so makes the initialization
phase more robust.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/rte_vhost.h | 4 ++++
lib/librte_vhost/vhost.h | 9 ++++++++
lib/librte_vhost/vhost_user.c | 40 +++++++++++++++++++++++++++++++++++
lib/librte_vhost/vhost_user.h | 6 ++++--
4 files changed, 57 insertions(+), 2 deletions(-)
diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index 5c72fba797..b4b7aa1928 100644
--- a/lib/librte_vhost/rte_vhost.h
+++ b/lib/librte_vhost/rte_vhost.h
@@ -85,6 +85,10 @@ extern "C" {
#define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12
#endif
+#ifndef VHOST_USER_PROTOCOL_F_STATUS
+#define VHOST_USER_PROTOCOL_F_STATUS 15
+#endif
+
/** Indicate whether protocol features negotiation is supported. */
#ifndef VHOST_USER_F_PROTOCOL_FEATURES
#define VHOST_USER_F_PROTOCOL_FEATURES 30
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index df98d15de6..9a9c0a98f5 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -202,6 +202,14 @@ struct vhost_virtqueue {
TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list;
} __rte_cache_aligned;
+/* Virtio device status as per Virtio specification */
+#define VIRTIO_DEVICE_STATUS_ACK 0x01
+#define VIRTIO_DEVICE_STATUS_DRIVER 0x02
+#define VIRTIO_DEVICE_STATUS_DRIVER_OK 0x04
+#define VIRTIO_DEVICE_STATUS_FEATURES_OK 0x08
+#define VIRTIO_DEVICE_STATUS_DEV_NEED_RESET 0x40
+#define VIRTIO_DEVICE_STATUS_FAILED 0x80
+
/* Old kernels have no such macros defined */
#ifndef VIRTIO_NET_F_GUEST_ANNOUNCE
#define VIRTIO_NET_F_GUEST_ANNOUNCE 21
@@ -364,6 +372,7 @@ struct virtio_net {
uint64_t log_addr;
struct rte_ether_addr mac;
uint16_t mtu;
+ uint8_t status;
struct vhost_device_ops const *notify_ops;
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 4a847f368c..e5a44be58d 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -87,6 +87,7 @@ static const char *vhost_message_str[VHOST_USER_MAX] = {
[VHOST_USER_POSTCOPY_END] = "VHOST_USER_POSTCOPY_END",
[VHOST_USER_GET_INFLIGHT_FD] = "VHOST_USER_GET_INFLIGHT_FD",
[VHOST_USER_SET_INFLIGHT_FD] = "VHOST_USER_SET_INFLIGHT_FD",
+ [VHOST_USER_SET_STATUS] = "VHOST_USER_SET_STATUS",
};
static int send_vhost_reply(int sockfd, struct VhostUserMsg *msg);
@@ -1328,6 +1329,11 @@ virtio_is_ready(struct virtio_net *dev)
return 0;
}
+ /* If supported, ensure the frontend is really done with config */
+ if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STATUS))
+ if (!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK))
+ return 0;
+
dev->flags |= VIRTIO_DEV_READY;
VHOST_LOG_CONFIG(INFO,
@@ -2425,6 +2431,39 @@ vhost_user_postcopy_end(struct virtio_net **pdev, struct VhostUserMsg *msg,
return RTE_VHOST_MSG_RESULT_REPLY;
}
+static int
+vhost_user_set_status(struct virtio_net **pdev, struct VhostUserMsg *msg,
+ int main_fd __rte_unused)
+{
+ struct virtio_net *dev = *pdev;
+
+ /* As per Virtio specification, the device status is 8bits long */
+ if (msg->payload.u64 > UINT8_MAX) {
+ VHOST_LOG_CONFIG(ERR, "Invalid VHOST_USER_SET_STATUS payload 0x%" PRIx64 "\n",
+ msg->payload.u64);
+ return RTE_VHOST_MSG_RESULT_ERR;
+ }
+
+ dev->status = msg->payload.u64;
+
+ VHOST_LOG_CONFIG(INFO, "New device status(0x%08x):\n"
+ "\t-ACKNOWLEDGE: %u\n"
+ "\t-DRIVER: %u\n"
+ "\t-FEATURES_OK: %u\n"
+ "\t-DRIVER_OK: %u\n"
+ "\t-DEVICE_NEED_RESET: %u\n"
+ "\t-FAILED: %u\n",
+ dev->status,
+ !!(dev->status & VIRTIO_DEVICE_STATUS_ACK),
+ !!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER),
+ !!(dev->status & VIRTIO_DEVICE_STATUS_FEATURES_OK),
+ !!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK),
+ !!(dev->status & VIRTIO_DEVICE_STATUS_DEV_NEED_RESET),
+ !!(dev->status & VIRTIO_DEVICE_STATUS_FAILED));
+
+ return RTE_VHOST_MSG_RESULT_OK;
+}
+
typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
struct VhostUserMsg *msg,
int main_fd);
@@ -2457,6 +2496,7 @@ static vhost_message_handler_t vhost_message_handlers[VHOST_USER_MAX] = {
[VHOST_USER_POSTCOPY_END] = vhost_user_postcopy_end,
[VHOST_USER_GET_INFLIGHT_FD] = vhost_user_get_inflight_fd,
[VHOST_USER_SET_INFLIGHT_FD] = vhost_user_set_inflight_fd,
+ [VHOST_USER_SET_STATUS] = vhost_user_set_status,
};
/* return bytes# of read on success or negative val on failure. */
diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 1f65efa4a9..74fd361e3a 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -23,7 +23,8 @@
(1ULL << VHOST_USER_PROTOCOL_F_CRYPTO_SESSION) | \
(1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD) | \
(1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) | \
- (1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT))
+ (1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT) | \
+ (1ULL << VHOST_USER_PROTOCOL_F_STATUS))
typedef enum VhostUserRequest {
VHOST_USER_NONE = 0,
@@ -56,7 +57,8 @@ typedef enum VhostUserRequest {
VHOST_USER_POSTCOPY_END = 30,
VHOST_USER_GET_INFLIGHT_FD = 31,
VHOST_USER_SET_INFLIGHT_FD = 32,
- VHOST_USER_MAX = 33
+ VHOST_USER_SET_STATUS = 36,
+ VHOST_USER_MAX = 37
} VhostUserRequest;
typedef enum VhostUserSlaveRequest {
--
2.25.4
* [dpdk-dev] [PATCH 7/9] vdpa/ifc: enable status protocol feature
2020-05-14 8:02 [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init Maxime Coquelin
` (5 preceding siblings ...)
2020-05-14 8:02 ` [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status Maxime Coquelin
@ 2020-05-14 8:02 ` Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 8/9] vdpa/mlx5: " Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed Maxime Coquelin
8 siblings, 0 replies; 35+ messages in thread
From: Maxime Coquelin @ 2020-05-14 8:02 UTC (permalink / raw)
To: xiaolong.ye, shahafs, matan, amorenoz, xiao.w.wang, viacheslavo, dev
Cc: jasowang, lulu, Maxime Coquelin
This patch advertises VHOST_USER_PROTOCOL_F_STATUS
support in the IFC driver so that the protocol
feature is negotiated.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
drivers/vdpa/ifc/ifcvf_vdpa.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
index 55ce0cf13d..d398766c83 100644
--- a/drivers/vdpa/ifc/ifcvf_vdpa.c
+++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
@@ -1093,7 +1093,8 @@ ifcvf_get_vdpa_features(int did, uint64_t *features)
1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ | \
1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | \
1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | \
- 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+ 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD | \
+ 1ULL << VHOST_USER_PROTOCOL_F_STATUS)
static int
ifcvf_get_protocol_features(int did __rte_unused, uint64_t *features)
{
--
2.25.4
* [dpdk-dev] [PATCH 8/9] vdpa/mlx5: enable status protocol feature
2020-05-14 8:02 [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init Maxime Coquelin
` (6 preceding siblings ...)
2020-05-14 8:02 ` [dpdk-dev] [PATCH 7/9] vdpa/ifc: enable status protocol feature Maxime Coquelin
@ 2020-05-14 8:02 ` Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed Maxime Coquelin
8 siblings, 0 replies; 35+ messages in thread
From: Maxime Coquelin @ 2020-05-14 8:02 UTC (permalink / raw)
To: xiaolong.ye, shahafs, matan, amorenoz, xiao.w.wang, viacheslavo, dev
Cc: jasowang, lulu, Maxime Coquelin
This patch advertises VHOST_USER_PROTOCOL_F_STATUS
support in the MLX5 driver so that the protocol
feature is negotiated.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
drivers/vdpa/mlx5/mlx5_vdpa.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
index 1113d6cef0..7021e476cf 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -31,7 +31,9 @@
(1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD) | \
(1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) | \
(1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) | \
- (1ULL << VHOST_USER_PROTOCOL_F_MQ))
+ (1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
+ (1ULL << VHOST_USER_PROTOCOL_F_STATUS))
+
TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list =
TAILQ_HEAD_INITIALIZER(priv_list);
--
2.25.4
* [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-05-14 8:02 [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init Maxime Coquelin
` (7 preceding siblings ...)
2020-05-14 8:02 ` [dpdk-dev] [PATCH 8/9] vdpa/mlx5: " Maxime Coquelin
@ 2020-05-14 8:02 ` Maxime Coquelin
2020-06-07 10:38 ` Matan Azrad
8 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2020-05-14 8:02 UTC (permalink / raw)
To: xiaolong.ye, shahafs, matan, amorenoz, xiao.w.wang, viacheslavo, dev
Cc: jasowang, lulu, Maxime Coquelin
Now that we have Virtio device status support, let's
only use the vDPA workaround if it is not supported.
This patch also documents why Virtio device status
protocol feature support is strongly advised.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index e5a44be58d..67e96a872a 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
if (!vdpa_dev)
goto out;
- if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
- request == VHOST_USER_SET_VRING_CALL) {
+ if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
+ /*
+ * Workaround when Virtio device status protocol
+ * feature is not supported, wait for SET_VRING_CALL
+ * request. This is not ideal as some frontends like
+ * Virtio-user may not send this request, so vDPA device
+ * may never be configured. Virtio device status support
+ * on frontend side is strongly advised.
+ */
+ if (!(dev->protocol_features &
+ (1ULL << VHOST_USER_PROTOCOL_F_STATUS)) &&
+ (request != VHOST_USER_SET_VRING_CALL))
+ goto out;
+
if (vdpa_dev->ops->dev_conf(dev->vid))
VHOST_LOG_CONFIG(ERR,
"Failed to configure vDPA device\n");
--
2.25.4
* Re: [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
2020-05-14 8:02 ` [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable Maxime Coquelin
@ 2020-05-15 8:45 ` Ye Xiaolong
2020-05-15 9:09 ` Jason Wang
1 sibling, 0 replies; 35+ messages in thread
From: Ye Xiaolong @ 2020-05-15 8:45 UTC (permalink / raw)
To: Maxime Coquelin
Cc: shahafs, matan, amorenoz, xiao.w.wang, viacheslavo, dev, jasowang, lulu
On 05/14, Maxime Coquelin wrote:
>This patch adds support for enabling and disabling
>vrings at a per-vring granularity.
>
>Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>---
> drivers/vdpa/ifc/base/ifcvf.c | 9 +++++++++
> drivers/vdpa/ifc/base/ifcvf.h | 4 ++++
> drivers/vdpa/ifc/ifcvf_vdpa.c | 23 ++++++++++++++++++++++-
> 3 files changed, 35 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
>index 3c0b2dff66..dd4e7468ae 100644
>--- a/drivers/vdpa/ifc/base/ifcvf.c
>+++ b/drivers/vdpa/ifc/base/ifcvf.c
>@@ -327,3 +327,12 @@ ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
> return (u8 *)hw->notify_addr[qid] -
> (u8 *)hw->mem_resource[hw->notify_region].addr;
> }
>+
>+void
>+ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable)
>+{
>+ struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
>+
>+ IFCVF_WRITE_REG16(qid, &cfg->queue_select);
>+ IFCVF_WRITE_REG16(enable, &cfg->queue_enable);
>+}
>diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
>index eb04a94067..bd85010eff 100644
>--- a/drivers/vdpa/ifc/base/ifcvf.h
>+++ b/drivers/vdpa/ifc/base/ifcvf.h
>@@ -159,4 +159,8 @@ ifcvf_get_notify_region(struct ifcvf_hw *hw);
> u64
> ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
>
>+void
>+ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable);
>+
>+
> #endif /* _IFCVF_H_ */
>diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
>index ec97178dcb..55ce0cf13d 100644
>--- a/drivers/vdpa/ifc/ifcvf_vdpa.c
>+++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
>@@ -937,6 +937,27 @@ ifcvf_dev_close(int vid)
> return 0;
> }
>
>+static int
>+ifcvf_set_vring_state(int vid, int vring, int state)
>+{
>+ int did;
>+ struct internal_list *list;
>+ struct ifcvf_internal *internal;
>+
>+ did = rte_vhost_get_vdpa_device_id(vid);
>+ list = find_internal_resource_by_did(did);
>+ if (list == NULL) {
>+ DRV_LOG(ERR, "Invalid device id: %d", did);
>+ return -1;
>+ }
Do we need the sanity check for the vring as well?
Thanks,
Xiaolong
>+
>+ internal = list->internal;
>+
>+ ifcvf_queue_enable(&internal->hw, (uint16_t)vring, (uint16_t) state);
>+
>+ return 0;
>+}
>+
> static int
> ifcvf_set_features(int vid)
> {
>@@ -1086,7 +1107,7 @@ static struct rte_vdpa_dev_ops ifcvf_ops = {
> .get_protocol_features = ifcvf_get_protocol_features,
> .dev_conf = ifcvf_dev_config,
> .dev_close = ifcvf_dev_close,
>- .set_vring_state = NULL,
>+ .set_vring_state = ifcvf_set_vring_state,
> .set_features = ifcvf_set_features,
> .migration_done = NULL,
> .get_vfio_group_fd = ifcvf_get_vfio_group_fd,
>--
>2.25.4
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
2020-05-14 8:02 ` [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable Maxime Coquelin
2020-05-15 8:45 ` Ye Xiaolong
@ 2020-05-15 9:09 ` Jason Wang
2020-05-15 9:42 ` Wang, Xiao W
1 sibling, 1 reply; 35+ messages in thread
From: Jason Wang @ 2020-05-15 9:09 UTC (permalink / raw)
To: Maxime Coquelin, xiaolong.ye, shahafs, matan, amorenoz,
xiao.w.wang, viacheslavo, dev
Cc: lulu
On 2020/5/14 下午4:02, Maxime Coquelin wrote:
> This patch adds support to enabling and disabling
> vrings on a per-vring granularity.
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
A question here: I see that in qemu, peer_attach() may try to generate
VHOST_USER_SET_VRING_ENABLE, but just from the name, I think it should
behave as queue_enable defined in the virtio specification, which is
explicitly under the control of the guest?
(Note, in Cindy's vDPA series, we must invent new vhost_ops to differ
from this one.)
Thanks
> ---
> drivers/vdpa/ifc/base/ifcvf.c | 9 +++++++++
> drivers/vdpa/ifc/base/ifcvf.h | 4 ++++
> drivers/vdpa/ifc/ifcvf_vdpa.c | 23 ++++++++++++++++++++++-
> 3 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> index 3c0b2dff66..dd4e7468ae 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.c
> +++ b/drivers/vdpa/ifc/base/ifcvf.c
> @@ -327,3 +327,12 @@ ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
> return (u8 *)hw->notify_addr[qid] -
> (u8 *)hw->mem_resource[hw->notify_region].addr;
> }
> +
> +void
> +ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable)
> +{
> + struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
> +
> + IFCVF_WRITE_REG16(qid, &cfg->queue_select);
> + IFCVF_WRITE_REG16(enable, &cfg->queue_enable);
> +}
> diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
> index eb04a94067..bd85010eff 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.h
> +++ b/drivers/vdpa/ifc/base/ifcvf.h
> @@ -159,4 +159,8 @@ ifcvf_get_notify_region(struct ifcvf_hw *hw);
> u64
> ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
>
> +void
> +ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable);
> +
> +
> #endif /* _IFCVF_H_ */
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index ec97178dcb..55ce0cf13d 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -937,6 +937,27 @@ ifcvf_dev_close(int vid)
> return 0;
> }
>
> +static int
> +ifcvf_set_vring_state(int vid, int vring, int state)
> +{
> + int did;
> + struct internal_list *list;
> + struct ifcvf_internal *internal;
> +
> + did = rte_vhost_get_vdpa_device_id(vid);
> + list = find_internal_resource_by_did(did);
> + if (list == NULL) {
> + DRV_LOG(ERR, "Invalid device id: %d", did);
> + return -1;
> + }
> +
> + internal = list->internal;
> +
> + ifcvf_queue_enable(&internal->hw, (uint16_t)vring, (uint16_t) state);
> +
> + return 0;
> +}
> +
> static int
> ifcvf_set_features(int vid)
> {
> @@ -1086,7 +1107,7 @@ static struct rte_vdpa_dev_ops ifcvf_ops = {
> .get_protocol_features = ifcvf_get_protocol_features,
> .dev_conf = ifcvf_dev_config,
> .dev_close = ifcvf_dev_close,
> - .set_vring_state = NULL,
> + .set_vring_state = ifcvf_set_vring_state,
> .set_features = ifcvf_set_features,
> .migration_done = NULL,
> .get_vfio_group_fd = ifcvf_get_vfio_group_fd,
* Re: [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
2020-05-15 9:09 ` Jason Wang
@ 2020-05-15 9:42 ` Wang, Xiao W
2020-05-15 10:06 ` Jason Wang
2020-05-15 10:08 ` Jason Wang
0 siblings, 2 replies; 35+ messages in thread
From: Wang, Xiao W @ 2020-05-15 9:42 UTC (permalink / raw)
To: Jason Wang, Maxime Coquelin, Ye, Xiaolong, shahafs, matan,
amorenoz, viacheslavo, dev
Cc: lulu, Xu, Rosen
Hi,
Best Regards,
Xiao
> -----Original Message-----
> From: Jason Wang <jasowang@redhat.com>
> Sent: Friday, May 15, 2020 5:09 PM
> To: Maxime Coquelin <maxime.coquelin@redhat.com>; Ye, Xiaolong
> <xiaolong.ye@intel.com>; shahafs@mellanox.com; matan@mellanox.com;
> amorenoz@redhat.com; Wang, Xiao W <xiao.w.wang@intel.com>;
> viacheslavo@mellanox.com; dev@dpdk.org
> Cc: lulu@redhat.com
> Subject: Re: [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
>
>
> On 2020/5/14 下午4:02, Maxime Coquelin wrote:
> > This patch adds support to enabling and disabling
> > vrings on a per-vring granularity.
> >
> > Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>
>
> A question here, I see in qemu peer_attach() may try to generate
> VHOST_USER_SET_VRING_ENABLE, but just from the name I think it should
> behave as queue_enable defined in virtio specification which is
> explicitly under the control of guest?
>
> (Note, in Cindy's vDPA series, we must invent new vhost_ops to differ
> from this one).
From my view, the common_cfg.queue_enable reg is used for registering a queue to the hypervisor and vhost, not for enabling it.
The control queue message VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET is what enables/disables queue pairs.
Think about when virtio-net probes: all queues are selected and "enabled" by init_vqs(), but MQ is not enabled until virtnet_set_channels() is invoked by user configuration with "ethtool".
Based on this, the register write below is not enough to enable MQ. IFC HW supports the registers below for the VF pass-through case.
Actually, we have a specific reg designed to enable MQ in the vDPA case.
> > + IFCVF_WRITE_REG16(qid, &cfg->queue_select);
> > + IFCVF_WRITE_REG16(enable, &cfg->queue_enable);
BRs,
Xiao
>
> Thanks
>
>
> > ---
> > drivers/vdpa/ifc/base/ifcvf.c | 9 +++++++++
> > drivers/vdpa/ifc/base/ifcvf.h | 4 ++++
> > drivers/vdpa/ifc/ifcvf_vdpa.c | 23 ++++++++++++++++++++++-
> > 3 files changed, 35 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> > index 3c0b2dff66..dd4e7468ae 100644
> > --- a/drivers/vdpa/ifc/base/ifcvf.c
> > +++ b/drivers/vdpa/ifc/base/ifcvf.c
> > @@ -327,3 +327,12 @@ ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int
> qid)
> > return (u8 *)hw->notify_addr[qid] -
> > (u8 *)hw->mem_resource[hw->notify_region].addr;
> > }
> > +
> > +void
> > +ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable)
> > +{
> > + struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
> > +
> > + IFCVF_WRITE_REG16(qid, &cfg->queue_select);
> > + IFCVF_WRITE_REG16(enable, &cfg->queue_enable);
> > +}
> > diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
> > index eb04a94067..bd85010eff 100644
> > --- a/drivers/vdpa/ifc/base/ifcvf.h
> > +++ b/drivers/vdpa/ifc/base/ifcvf.h
> > @@ -159,4 +159,8 @@ ifcvf_get_notify_region(struct ifcvf_hw *hw);
> > u64
> > ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
> >
> > +void
> > +ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable);
> > +
> > +
> > #endif /* _IFCVF_H_ */
> > diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> > index ec97178dcb..55ce0cf13d 100644
> > --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> > +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> > @@ -937,6 +937,27 @@ ifcvf_dev_close(int vid)
> > return 0;
> > }
> >
> > +static int
> > +ifcvf_set_vring_state(int vid, int vring, int state)
> > +{
> > + int did;
> > + struct internal_list *list;
> > + struct ifcvf_internal *internal;
> > +
> > + did = rte_vhost_get_vdpa_device_id(vid);
> > + list = find_internal_resource_by_did(did);
> > + if (list == NULL) {
> > + DRV_LOG(ERR, "Invalid device id: %d", did);
> > + return -1;
> > + }
> > +
> > + internal = list->internal;
> > +
> > + ifcvf_queue_enable(&internal->hw, (uint16_t)vring, (uint16_t) state);
> > +
> > + return 0;
> > +}
> > +
> > static int
> > ifcvf_set_features(int vid)
> > {
> > @@ -1086,7 +1107,7 @@ static struct rte_vdpa_dev_ops ifcvf_ops = {
> > .get_protocol_features = ifcvf_get_protocol_features,
> > .dev_conf = ifcvf_dev_config,
> > .dev_close = ifcvf_dev_close,
> > - .set_vring_state = NULL,
> > + .set_vring_state = ifcvf_set_vring_state,
> > .set_features = ifcvf_set_features,
> > .migration_done = NULL,
> > .get_vfio_group_fd = ifcvf_get_vfio_group_fd,
* Re: [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
2020-05-15 9:42 ` Wang, Xiao W
@ 2020-05-15 10:06 ` Jason Wang
2020-05-15 10:08 ` Jason Wang
1 sibling, 0 replies; 35+ messages in thread
From: Jason Wang @ 2020-05-15 10:06 UTC (permalink / raw)
To: Wang, Xiao W, Maxime Coquelin, Ye, Xiaolong, shahafs, matan,
amorenoz, viacheslavo, dev
Cc: lulu, Xu, Rosen
On 2020/5/15 下午5:42, Wang, Xiao W wrote:
> Hi,
>
> Best Regards,
> Xiao
>
> > -----Original Message-----
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Friday, May 15, 2020 5:09 PM
> > To: Maxime Coquelin <maxime.coquelin@redhat.com>; Ye, Xiaolong
> > <xiaolong.ye@intel.com>; shahafs@mellanox.com; matan@mellanox.com;
> > amorenoz@redhat.com; Wang, Xiao W <xiao.w.wang@intel.com>;
> > viacheslavo@mellanox.com; dev@dpdk.org
> > Cc: lulu@redhat.com
> > Subject: Re: [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
> >
> > On 2020/5/14 下午4:02, Maxime Coquelin wrote:
> > > This patch adds support to enabling and disabling
> > > vrings on a per-vring granularity.
> > >
> > > Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >
> > A question here, I see in qemu peer_attach() may try to generate
> > VHOST_USER_SET_VRING_ENABLE, but just from the name I think it should
> > behave as queue_enable defined in virtio specification which is
> > explicitly under the control of guest?
> >
> > (Note, in Cindy's vDPA series, we must invent new vhost_ops to differ
> > from this one).
>
> From my view, common_cfg.enable reg is used for registering a queue to
> hypervisor&vhost, but not ENABLE.
>
Well, what's your definition of "enable" in this context?
The spec says:
queue_enable
The driver uses this to selectively prevent the device from
executing requests from this virtqueue. 1 - enabled; 0 - disabled.
This means that if queue_enable is not set to 1, the device cannot
execute requests for this specific virtqueue.
> The control queue message VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET is for
> enable/disable queue pairs.
>
But in qemu this is hooked to VHOST_USER_SET_VRING_ENABLE, see
peer_attach(). And this patch hooks VHOST_USER_SET_VRING_ENABLE to
queue_enable.
Does this mean IFCVF uses queue_enable instead of the control vq or another
register for setting up multiqueue? My understanding is that IFCVF
has a dedicated register to do this.
Note that setting MQ is different from queue_enable: changing the number of
queues should let the underlying NIC properly configure its
steering/switching/filtering logic to make sure traffic is only sent
to the queues set by the driver.
So hooking VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET to queue_enable looks wrong.
> Think about when virtio net probes, all queues are selected and
> "enabled" by init_vqs(),
>
I think we're talking about aligning the implementation with the spec, not
just making it work for some specific drivers. A driver may choose not to
enable a virtqueue by not setting queue_enable to 1.
Thanks
> but MQ is not enabled until virtnet_set_channels() by user config with
> "ethtool".
>
> Based on this, below reg writing is not OK to enable MQ. IFC HW
> supports below registers for VF pass-thru case.
>
> Actually, we have specific reg designed to enable MQ in VDPA case.
>
> > > + IFCVF_WRITE_REG16(qid, &cfg->queue_select);
> > > + IFCVF_WRITE_REG16(enable, &cfg->queue_enable);
>
> BRs,
> Xiao
>
> >
> > Thanks
> >
> > > ---
> > > drivers/vdpa/ifc/base/ifcvf.c | 9 +++++++++
> > > drivers/vdpa/ifc/base/ifcvf.h | 4 ++++
> > > drivers/vdpa/ifc/ifcvf_vdpa.c | 23 ++++++++++++++++++++++-
> > > 3 files changed, 35 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> > > index 3c0b2dff66..dd4e7468ae 100644
> > > --- a/drivers/vdpa/ifc/base/ifcvf.c
> > > +++ b/drivers/vdpa/ifc/base/ifcvf.c
> > > @@ -327,3 +327,12 @@ ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
> > > return (u8 *)hw->notify_addr[qid] -
> > > (u8 *)hw->mem_resource[hw->notify_region].addr;
> > > }
> > > +
> > > +void
> > > +ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable)
> > > +{
> > > + struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
> > > +
> > > + IFCVF_WRITE_REG16(qid, &cfg->queue_select);
> > > + IFCVF_WRITE_REG16(enable, &cfg->queue_enable);
> > > +}
> > > diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
> > > index eb04a94067..bd85010eff 100644
> > > --- a/drivers/vdpa/ifc/base/ifcvf.h
> > > +++ b/drivers/vdpa/ifc/base/ifcvf.h
> > > @@ -159,4 +159,8 @@ ifcvf_get_notify_region(struct ifcvf_hw *hw);
> > > u64
> > > ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
> > >
> > > +void
> > > +ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable);
> > > +
> > > +
> > > #endif /* _IFCVF_H_ */
> > > diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> > > index ec97178dcb..55ce0cf13d 100644
> > > --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> > > +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> > > @@ -937,6 +937,27 @@ ifcvf_dev_close(int vid)
> > > return 0;
> > > }
> > >
> > > +static int
> > > +ifcvf_set_vring_state(int vid, int vring, int state)
> > > +{
> > > + int did;
> > > + struct internal_list *list;
> > > + struct ifcvf_internal *internal;
> > > +
> > > + did = rte_vhost_get_vdpa_device_id(vid);
> > > + list = find_internal_resource_by_did(did);
> > > + if (list == NULL) {
> > > + DRV_LOG(ERR, "Invalid device id: %d", did);
> > > + return -1;
> > > + }
> > > +
> > > + internal = list->internal;
> > > +
> > > + ifcvf_queue_enable(&internal->hw, (uint16_t)vring, (uint16_t) state);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > static int
> > > ifcvf_set_features(int vid)
> > > {
> > > @@ -1086,7 +1107,7 @@ static struct rte_vdpa_dev_ops ifcvf_ops = {
> > > .get_protocol_features = ifcvf_get_protocol_features,
> > > .dev_conf = ifcvf_dev_config,
> > > .dev_close = ifcvf_dev_close,
> > > - .set_vring_state = NULL,
> > > + .set_vring_state = ifcvf_set_vring_state,
> > > .set_features = ifcvf_set_features,
> > > .migration_done = NULL,
> > > .get_vfio_group_fd = ifcvf_get_vfio_group_fd,
* Re: [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
2020-05-15 9:42 ` Wang, Xiao W
2020-05-15 10:06 ` Jason Wang
@ 2020-05-15 10:08 ` Jason Wang
2020-05-18 3:09 ` Wang, Xiao W
1 sibling, 1 reply; 35+ messages in thread
From: Jason Wang @ 2020-05-15 10:08 UTC (permalink / raw)
To: Wang, Xiao W, Maxime Coquelin, Ye, Xiaolong, shahafs, matan,
amorenoz, viacheslavo, dev
Cc: lulu, Xu, Rosen
On 2020/5/15 下午5:42, Wang, Xiao W wrote:
> Hi,
>
> Best Regards,
> Xiao
>
> > -----Original Message-----
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Friday, May 15, 2020 5:09 PM
> > To: Maxime Coquelin <maxime.coquelin@redhat.com>; Ye, Xiaolong
> > <xiaolong.ye@intel.com>; shahafs@mellanox.com; matan@mellanox.com;
> > amorenoz@redhat.com; Wang, Xiao W <xiao.w.wang@intel.com>;
> > viacheslavo@mellanox.com; dev@dpdk.org
> > Cc: lulu@redhat.com
> > Subject: Re: [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
> >
> > On 2020/5/14 下午4:02, Maxime Coquelin wrote:
> > > This patch adds support to enabling and disabling
> > > vrings on a per-vring granularity.
> > >
> > > Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >
> > A question here, I see in qemu peer_attach() may try to generate
> > VHOST_USER_SET_VRING_ENABLE, but just from the name I think it should
> > behave as queue_enable defined in virtio specification which is
> > explicitly under the control of guest?
> >
> > (Note, in Cindy's vDPA series, we must invent new vhost_ops to differ
> > from this one).
>
> From my view, common_cfg.enable reg is used for registering a queue to
> hypervisor&vhost, but not ENABLE.
>
Well, what's your definition of "enable" in this context?
The spec says:
queue_enable
The driver uses this to selectively prevent the device from
executing requests from this virtqueue. 1 - enabled; 0 - disabled.
This means that if queue_enable is not set to 1, the device cannot
execute requests for this specific virtqueue.
> The control queue message VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET is for
> enable/disable queue pairs.
>
But in qemu this is hooked to VHOST_USER_SET_VRING_ENABLE, see
peer_attach(). And this patch hooks VHOST_USER_SET_VRING_ENABLE to
queue_enable.
Does this mean IFCVF uses queue_enable instead of the control vq or another
register for setting up multiqueue? My understanding is that IFCVF
has a dedicated register to do this.
Note that setting MQ is different from queue_enable: changing the number of
queues should let the underlying NIC properly configure its
steering/switching/filtering logic to make sure traffic is only sent
to the queues set by the driver.
So hooking VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET to queue_enable looks wrong.
> Think about when virtio net probes, all queues are selected and
> "enabled" by init_vqs(),
>
I think we're talking about aligning the implementation with the spec, not
just making it work for some specific drivers. A driver may choose not to
enable a virtqueue by not setting queue_enable to 1.
Thanks
> but MQ is not enabled until virtnet_set_channels() by user config with
> "ethtool".
>
> Based on this, below reg writing is not OK to enable MQ. IFC HW
> supports below registers for VF pass-thru case.
>
> Actually, we have specific reg designed to enable MQ in VDPA case.
>
> > > + IFCVF_WRITE_REG16(qid, &cfg->queue_select);
> > > + IFCVF_WRITE_REG16(enable, &cfg->queue_enable);
>
> BRs,
> Xiao
>
> >
> > Thanks
> >
> > > ---
> > > drivers/vdpa/ifc/base/ifcvf.c | 9 +++++++++
> > > drivers/vdpa/ifc/base/ifcvf.h | 4 ++++
> > > drivers/vdpa/ifc/ifcvf_vdpa.c | 23 ++++++++++++++++++++++-
> > > 3 files changed, 35 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> > > index 3c0b2dff66..dd4e7468ae 100644
> > > --- a/drivers/vdpa/ifc/base/ifcvf.c
> > > +++ b/drivers/vdpa/ifc/base/ifcvf.c
> > > @@ -327,3 +327,12 @@ ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
> > > return (u8 *)hw->notify_addr[qid] -
> > > (u8 *)hw->mem_resource[hw->notify_region].addr;
> > > }
> > > +
> > > +void
> > > +ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable)
> > > +{
> > > + struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
> > > +
> > > + IFCVF_WRITE_REG16(qid, &cfg->queue_select);
> > > + IFCVF_WRITE_REG16(enable, &cfg->queue_enable);
> > > +}
> > > diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
> > > index eb04a94067..bd85010eff 100644
> > > --- a/drivers/vdpa/ifc/base/ifcvf.h
> > > +++ b/drivers/vdpa/ifc/base/ifcvf.h
> > > @@ -159,4 +159,8 @@ ifcvf_get_notify_region(struct ifcvf_hw *hw);
> > > u64
> > > ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid);
> > >
> > > +void
> > > +ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable);
> > > +
> > > +
> > > #endif /* _IFCVF_H_ */
> > > diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> > > index ec97178dcb..55ce0cf13d 100644
> > > --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> > > +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> > > @@ -937,6 +937,27 @@ ifcvf_dev_close(int vid)
> > > return 0;
> > > }
> > >
> > > +static int
> > > +ifcvf_set_vring_state(int vid, int vring, int state)
> > > +{
> > > + int did;
> > > + struct internal_list *list;
> > > + struct ifcvf_internal *internal;
> > > +
> > > + did = rte_vhost_get_vdpa_device_id(vid);
> > > + list = find_internal_resource_by_did(did);
> > > + if (list == NULL) {
> > > + DRV_LOG(ERR, "Invalid device id: %d", did);
> > > + return -1;
> > > + }
> > > +
> > > + internal = list->internal;
> > > +
> > > + ifcvf_queue_enable(&internal->hw, (uint16_t)vring, (uint16_t) state);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > static int
> > > ifcvf_set_features(int vid)
> > > {
> > > @@ -1086,7 +1107,7 @@ static struct rte_vdpa_dev_ops ifcvf_ops = {
> > > .get_protocol_features = ifcvf_get_protocol_features,
> > > .dev_conf = ifcvf_dev_config,
> > > .dev_close = ifcvf_dev_close,
> > > - .set_vring_state = NULL,
> > > + .set_vring_state = ifcvf_set_vring_state,
> > > .set_features = ifcvf_set_features,
> > > .migration_done = NULL,
> > > .get_vfio_group_fd = ifcvf_get_vfio_group_fd,
* Re: [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
2020-05-15 10:08 ` Jason Wang
@ 2020-05-18 3:09 ` Wang, Xiao W
2020-05-18 3:17 ` Jason Wang
0 siblings, 1 reply; 35+ messages in thread
From: Wang, Xiao W @ 2020-05-18 3:09 UTC (permalink / raw)
To: Jason Wang, Maxime Coquelin, Ye, Xiaolong, shahafs, matan,
amorenoz, viacheslavo, dev
Cc: lulu, Xu, Rosen
Hi,
Comments inline.
Best Regards,
Xiao
> -----Original Message-----
> From: Jason Wang <jasowang@redhat.com>
> Sent: Friday, May 15, 2020 6:09 PM
> To: Wang, Xiao W <xiao.w.wang@intel.com>; Maxime Coquelin
> <maxime.coquelin@redhat.com>; Ye, Xiaolong <xiaolong.ye@intel.com>;
> shahafs@mellanox.com; matan@mellanox.com; amorenoz@redhat.com;
> viacheslavo@mellanox.com; dev@dpdk.org
> Cc: lulu@redhat.com; Xu, Rosen <rosen.xu@intel.com>
> Subject: Re: [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
>
>
> On 2020/5/15 下午5:42, Wang, Xiao W wrote:
> > Hi,
> >
> > Best Regards,
> > Xiao
> >
> > > -----Original Message-----
> > > From: Jason Wang <jasowang@redhat.com>
> > > Sent: Friday, May 15, 2020 5:09 PM
> > > To: Maxime Coquelin <maxime.coquelin@redhat.com>; Ye, Xiaolong
> > > <xiaolong.ye@intel.com>; shahafs@mellanox.com; matan@mellanox.com;
> > > amorenoz@redhat.com; Wang, Xiao W <xiao.w.wang@intel.com>;
> > > viacheslavo@mellanox.com; dev@dpdk.org
> > > Cc: lulu@redhat.com
> > > Subject: Re: [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
> > >
> > > On 2020/5/14 下午4:02, Maxime Coquelin wrote:
> > > > This patch adds support to enabling and disabling
> > > > vrings on a per-vring granularity.
> > > >
> > > > Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > >
> > > A question here, I see in qemu peer_attach() may try to generate
> > > VHOST_USER_SET_VRING_ENABLE, but just from the name I think it should
> > > behave as queue_enable defined in virtio specification which is
> > > explicitly under the control of guest?
> > >
> > > (Note, in Cindy's vDPA series, we must invent new vhost_ops to differ
> > > from this one).
> >
> > From my view, common_cfg.enable reg is used for registering a queue to
> > hypervisor&vhost, but not ENABLE.
> >
> Well, what's your definition of "enable" in this context?
"Enable a queue" means traffic can pass through this queue.
>
> Spec said:
>
> queue_enable
> The driver uses this to selectively prevent the device from
> executing requests from this virtqueue. 1 - enabled; 0 - disabled.
>
> This means, if queue_enable is not set to 1, device can not execute
> request for this specific virtqueue.
>
For queue enabling in virtio MQ case, there're 2 steps needed:
1. select a queue and write 1 to common_cfg.enable reg
2. send control vq message VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
If no step2, by default there's only 1 queue pair enabled.
>
> > The control queue message VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET is for
> > enable/disable queue pairs.
> >
>
> But in qemu this is hooked to VHOST_USER_SET_VRING_ENABLE, see
> peer_attach(). And this patch hook VHOST_USER_SET_VRING_ENABLE to
> queue_enable.
>
> This means IFCVF uses queue_enable instead of control vq or other
> register for setting multiqueue stuff? My understanding is that IFCVF
> has dedicated register to do this.
>
> Note setting mq is different from queue_enable, changing the number of
> queues should let the underlayer NIC to properly configure its
> steering/switching/filtering logic to make sure traffic were only sent
> to the queues that is set by driver.
>
> So hooking VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET to queue_enable looks
> wrong.
We are on the same page. As I said, we have a dedicated reg designed to enable MQ (ensuring traffic is only sent to the queues enabled by the user) in the vDPA case.
>
>
> > Think about when virtio net probes, all queues are selected and
> > "enabled" by init_vqs(),
> >
>
> I think we're talking about aligning the implementation with spec not
> just make it work for some specific drivers. Driver may choose to not
> enable a virtqueue by not setting 1 to queue_enable.
>
> Thanks
>
>
> > but MQ is not enabled until virtnet_set_channels() by user config with
> > "ethtool".
> >
> > Based on this, below reg writing is not OK to enable MQ. IFC HW
> > supports below registers for VF pass-thru case.
> >
> > Actually, we have specific reg designed to enable MQ in VDPA case.
> >
> > > > + IFCVF_WRITE_REG16(qid, &cfg->queue_select);
> > > > + IFCVF_WRITE_REG16(enable, &cfg->queue_enable);
> >
> > BRs,
> > Xiao
> >
> > >
> > > Thanks
> > >
> > > > ---
> > > > drivers/vdpa/ifc/base/ifcvf.c | 9 +++++++++
> > > > drivers/vdpa/ifc/base/ifcvf.h | 4 ++++
> > > > drivers/vdpa/ifc/ifcvf_vdpa.c | 23 ++++++++++++++++++++++-
> > > > 3 files changed, 35 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> > > > index 3c0b2dff66..dd4e7468ae 100644
> > > > --- a/drivers/vdpa/ifc/base/ifcvf.c
> > > > +++ b/drivers/vdpa/ifc/base/ifcvf.c
> > > > @@ -327,3 +327,12 @@ ifcvf_get_queue_notify_off(struct ifcvf_hw *hw, int qid)
> > > > return (u8 *)hw->notify_addr[qid] -
> > > > (u8 *)hw->mem_resource[hw->notify_region].addr;
> > > > }
> > > > +
> > > > +void
> > > > +ifcvf_queue_enable(struct ifcvf_hw *hw, u16 qid, u16 enable)
> > > > +{
> > > > + struct ifcvf_pci_common_cfg *cfg = hw->common_cfg;
> > > > +
> > > > + IFCVF_WRITE_REG16(qid, &cfg->queue_select);
> > > > + IFCVF_WRITE_REG16(enable, &cfg->queue_enable);
> > > > +}
> > > > diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
* Re: [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
2020-05-18 3:09 ` Wang, Xiao W
@ 2020-05-18 3:17 ` Jason Wang
0 siblings, 0 replies; 35+ messages in thread
From: Jason Wang @ 2020-05-18 3:17 UTC (permalink / raw)
To: Wang, Xiao W, Maxime Coquelin, Ye, Xiaolong, shahafs, matan,
amorenoz, viacheslavo, dev
Cc: lulu, Xu, Rosen
On 2020/5/18 上午11:09, Wang, Xiao W wrote:
> Hi,
>
> Comments inline.
>
> Best Regards,
> Xiao
>
>> -----Original Message-----
>> From: Jason Wang <jasowang@redhat.com>
>> Sent: Friday, May 15, 2020 6:09 PM
>> To: Wang, Xiao W <xiao.w.wang@intel.com>; Maxime Coquelin
>> <maxime.coquelin@redhat.com>; Ye, Xiaolong <xiaolong.ye@intel.com>;
>> shahafs@mellanox.com; matan@mellanox.com; amorenoz@redhat.com;
>> viacheslavo@mellanox.com; dev@dpdk.org
>> Cc: lulu@redhat.com; Xu, Rosen <rosen.xu@intel.com>
>> Subject: Re: [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
>>
>>
>> On 2020/5/15 下午5:42, Wang, Xiao W wrote:
>>> Hi,
>>>
>>> Best Regards,
>>>
>>> Xiao
>>>
>>>> -----Original Message-----
>>>> From: Jason Wang <jasowang@redhat.com>
>>>> Sent: Friday, May 15, 2020 5:09 PM
>>>> To: Maxime Coquelin <maxime.coquelin@redhat.com>; Ye, Xiaolong
>>>> <xiaolong.ye@intel.com>; shahafs@mellanox.com; matan@mellanox.com;
>>>> amorenoz@redhat.com; Wang, Xiao W <xiao.w.wang@intel.com>;
>>>> viacheslavo@mellanox.com; dev@dpdk.org
>>>> Cc: lulu@redhat.com
>>>> Subject: Re: [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable
>>>> On 2020/5/14 下午4:02, Maxime Coquelin wrote:
>>>>> This patch adds support to enabling and disabling
>>>>> vrings on a per-vring granularity.
>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>
>>>> A question here, I see in qemu peer_attach() may try to generate
>>>> VHOST_USER_SET_VRING_ENABLE, but just from the name I think it should
>>>> behave as queue_enable defined in virtio specification which is
>>>> explicitly under the control of guest?
>>>> (Note, in Cindy's vDPA series, we must invent new vhost_ops to differ
>>>> from this one).
>>> From my view, common_cfg.enable reg is used for registering a queue to
>>> hypervisor&vhost, but not ENABLE.
>>>
>> Well, what's your definition of "enable" in this context?
> "Enable a queue" means traffic can pass through this queue.
>
>> Spec said:
>>
>> queue_enable
>> The driver uses this to selectively prevent the device from
>> executing requests from this virtqueue. 1 - enabled; 0 - disabled.
>>
>> This means, if queue_enable is not set to 1, device can not execute
>> request for this specific virtqueue.
>>
> For queue enabling in virtio MQ case, there're 2 steps needed:
> 1. select a queue and write 1 to common_cfg.enable reg
Note that:
1) queue_enable doesn't mean you can disable a queue by writing zero to
that (which is not allowed by the spec)
2) queue_enable is not specific to MQ, you need to write 1 to all the
queues that will be used by this driver
3) it's not allowed to write 1 to queue_enable after DRIVER_OK
> 2. send control vq message VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
> If no step2, by default there's only 1 queue pair enabled.
Yes, and if you read the git history, this command was invented by me :)
Thanks
>
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-05-14 8:02 ` [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed Maxime Coquelin
@ 2020-06-07 10:38 ` Matan Azrad
2020-06-08 8:34 ` Maxime Coquelin
0 siblings, 1 reply; 35+ messages in thread
From: Matan Azrad @ 2020-06-07 10:38 UTC (permalink / raw)
To: Maxime Coquelin, xiaolong.ye, Shahaf Shuler, amorenoz,
xiao.w.wang, Slava Ovsiienko, dev
Cc: jasowang, lulu
Hi Maxime
Thanks for the huge work.
Please see a suggestion inline.
From: Maxime Coquelin:
> Sent: Thursday, May 14, 2020 11:02 AM
> To: xiaolong.ye@intel.com; Shahaf Shuler <shahafs@mellanox.com>; Matan
> Azrad <matan@mellanox.com>; amorenoz@redhat.com;
> xiao.w.wang@intel.com; Slava Ovsiienko <viacheslavo@mellanox.com>;
> dev@dpdk.org
> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
> <maxime.coquelin@redhat.com>
> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if needed
>
> Now that we have Virtio device status support, let's only use the vDPA
> workaround if it is not supported.
>
> This patch also document why Virtio device status protocol feature support is
> strongly advised.
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index e5a44be58d..67e96a872a 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
> if (!vdpa_dev)
> goto out;
>
> - if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
> - request == VHOST_USER_SET_VRING_CALL) {
> + if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
> + /*
> + * Workaround when Virtio device status protocol
> + * feature is not supported, wait for SET_VRING_CALL
> + * request. This is not ideal as some frontends like
> + * Virtio-user may not send this request, so vDPA device
> + * may never be configured. Virtio device status support
> + * on frontend side is strongly advised.
> + */
> + if (!(dev->protocol_features &
> + (1ULL <<
> VHOST_USER_PROTOCOL_F_STATUS)) &&
> + (request !=
> VHOST_USER_SET_VRING_CALL))
> + goto out;
> +
When the status protocol feature is not supported, in the current code, the vDPA configuration triggering depends on:
1. Device is ready - all the queues are configured (datapath addresses, callfd and kickfd).
2. The last command is callfd.
The code doesn't take into account that some queues may stay disabled.
Maybe the correct timing is:
1. Device is ready - all the enabled queues are configured and the MEM table is configured.
2. No need for callfd to be last.
Queues that are configured later will be configured to the HW when the virtq becomes enabled.
What do you think?
> if (vdpa_dev->ops->dev_conf(dev->vid))
> VHOST_LOG_CONFIG(ERR,
> "Failed to configure vDPA device\n");
> --
> 2.25.4
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-07 10:38 ` Matan Azrad
@ 2020-06-08 8:34 ` Maxime Coquelin
2020-06-08 9:19 ` Matan Azrad
0 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2020-06-08 8:34 UTC (permalink / raw)
To: Matan Azrad, xiaolong.ye, Shahaf Shuler, amorenoz, xiao.w.wang,
Slava Ovsiienko, dev
Cc: jasowang, lulu
Hi Matan,
On 6/7/20 12:38 PM, Matan Azrad wrote:
> Hi Maxime
>
> Thanks for the huge work.
> Please see a suggestion inline.
>
> From: Maxime Coquelin:
>> Sent: Thursday, May 14, 2020 11:02 AM
>> To: xiaolong.ye@intel.com; Shahaf Shuler <shahafs@mellanox.com>; Matan
>> Azrad <matan@mellanox.com>; amorenoz@redhat.com;
>> xiao.w.wang@intel.com; Slava Ovsiienko <viacheslavo@mellanox.com>;
>> dev@dpdk.org
>> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
>> <maxime.coquelin@redhat.com>
>> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if needed
>>
>> Now that we have Virtio device status support, let's only use the vDPA
>> workaround if it is not supported.
>>
>> This patch also document why Virtio device status protocol feature support is
>> strongly advised.
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>> lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>> index e5a44be58d..67e96a872a 100644
>> --- a/lib/librte_vhost/vhost_user.c
>> +++ b/lib/librte_vhost/vhost_user.c
>> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
>> if (!vdpa_dev)
>> goto out;
>>
>> - if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
>> - request == VHOST_USER_SET_VRING_CALL) {
>> + if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
>> + /*
>> + * Workaround when Virtio device status protocol
>> + * feature is not supported, wait for SET_VRING_CALL
>> + * request. This is not ideal as some frontends like
>> + * Virtio-user may not send this request, so vDPA device
>> + * may never be configured. Virtio device status support
>> + * on frontend side is strongly advised.
>> + */
>> + if (!(dev->protocol_features &
>> + (1ULL <<
>> VHOST_USER_PROTOCOL_F_STATUS)) &&
>> + (request !=
>> VHOST_USER_SET_VRING_CALL))
>> + goto out;
>> +
>
>
> When status protocol feature is not supported, in the current code, the vDPA configuration triggering depends in:
> 1. Device is ready - all the queues are configured (datapath addresses, callfd and kickfd) .
> 2. last command is callfd.
>
>
> The code doesn't take into account that some queues may stay disabled.
> Maybe the correct timing is:
> 1. Device is ready - all the enabled queues are configured and MEM table is configured.
I think the current virtio_is_ready() already assumes the mem table is
configured, otherwise we would not have vq->desc, vq->used and vq->avail
being set, as those need to be translated using the mem table.
> 2. no need callfd to be last.
>
> Queues that will be configured later will be configured to the HW when the virtq becoming enabled.
>
>
> What do think?
Maybe I did not understand what you meant, so please correct me if
needed.
If I understood correctly, then your suggestion is just to remove the
workaround, but it has been introduced by Intel because the callfd gets
set a second time in some cases.
Thanks,
Maxime
>
>> if (vdpa_dev->ops->dev_conf(dev->vid))
>> VHOST_LOG_CONFIG(ERR,
>> "Failed to configure vDPA device\n");
>> --
>> 2.25.4
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-08 8:34 ` Maxime Coquelin
@ 2020-06-08 9:19 ` Matan Azrad
2020-06-09 9:04 ` Maxime Coquelin
0 siblings, 1 reply; 35+ messages in thread
From: Matan Azrad @ 2020-06-08 9:19 UTC (permalink / raw)
To: Maxime Coquelin, xiaolong.ye, Shahaf Shuler, amorenoz,
xiao.w.wang, Slava Ovsiienko, dev
Cc: jasowang, lulu
Hi Maxime
From: Maxime Coquelin:
> Hi Matan,
>
> On 6/7/20 12:38 PM, Matan Azrad wrote:
> > Hi Maxime
> >
> > Thanks for the huge work.
> > Please see a suggestion inline.
> >
> > From: Maxime Coquelin:
> >> Sent: Thursday, May 14, 2020 11:02 AM
> >> To: xiaolong.ye@intel.com; Shahaf Shuler <shahafs@mellanox.com>;
> >> Matan Azrad <matan@mellanox.com>; amorenoz@redhat.com;
> >> xiao.w.wang@intel.com; Slava Ovsiienko <viacheslavo@mellanox.com>;
> >> dev@dpdk.org
> >> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
> >> <maxime.coquelin@redhat.com>
> >> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if needed
> >>
> >> Now that we have Virtio device status support, let's only use the
> >> vDPA workaround if it is not supported.
> >>
> >> This patch also document why Virtio device status protocol feature
> >> support is strongly advised.
> >>
> >> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> ---
> >> lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
> >> 1 file changed, 14 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/lib/librte_vhost/vhost_user.c
> >> b/lib/librte_vhost/vhost_user.c index e5a44be58d..67e96a872a 100644
> >> --- a/lib/librte_vhost/vhost_user.c
> >> +++ b/lib/librte_vhost/vhost_user.c
> >> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
> >> if (!vdpa_dev)
> >> goto out;
> >>
> >> - if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
> >> - request == VHOST_USER_SET_VRING_CALL) {
> >> + if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
> >> + /*
> >> + * Workaround when Virtio device status protocol
> >> + * feature is not supported, wait for SET_VRING_CALL
> >> + * request. This is not ideal as some frontends like
> >> + * Virtio-user may not send this request, so vDPA device
> >> + * may never be configured. Virtio device status support
> >> + * on frontend side is strongly advised.
> >> + */
> >> + if (!(dev->protocol_features &
> >> + (1ULL <<
> >> VHOST_USER_PROTOCOL_F_STATUS)) &&
> >> + (request !=
> >> VHOST_USER_SET_VRING_CALL))
> >> + goto out;
> >> +
> >
> >
> > When status protocol feature is not supported, in the current code, the
> vDPA configuration triggering depends in:
> > 1. Device is ready - all the queues are configured (datapath addresses,
> callfd and kickfd) .
> > 2. last command is callfd.
> >
> >
> > The code doesn't take into account that some queues may stay disabled.
> > Maybe the correct timing is:
> > 1. Device is ready - all the enabled queues are configured and MEM table is
> configured.
>
> I think current virtio_is_ready() already assumes the mem table is
> configured, otherwise we would not have vq->desc, vq->used and vq->avail
> being set as it needs to be translated using the mem table.
>
Yes, but if you don't expect to check them for disabled queues, you need to check the mem table to be sure it was set.
> > 2. no need callfd to be last.
> >
> > Queues that will be configured later will be configured to the HW when the
> virtq becoming enabled.
> >
> >
> > What do think?
>
> Maybe I did not understood what you mean, so please correct me if needed.
>
> If I understood correctly, then your suggestion is just to remove the
> workaround, but it has been introduced by Intel because the callfd gets set a
> second time in some cases.
Not to remove the WA, just to improve it 😊
I'm not sure I understand the issue here, can you add details?
>
> Thanks,
> Maxime
> >
> >> if (vdpa_dev->ops->dev_conf(dev->vid))
> >> VHOST_LOG_CONFIG(ERR,
> >> "Failed to configure vDPA device\n");
> >> --
> >> 2.25.4
> >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-08 9:19 ` Matan Azrad
@ 2020-06-09 9:04 ` Maxime Coquelin
2020-06-09 11:09 ` Matan Azrad
0 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2020-06-09 9:04 UTC (permalink / raw)
To: Matan Azrad, xiaolong.ye, Shahaf Shuler, amorenoz, xiao.w.wang,
Slava Ovsiienko, dev
Cc: jasowang, lulu
Hi Matan,
On 6/8/20 11:19 AM, Matan Azrad wrote:
> Hi Maxime
>
> From: Maxime Coquelin:
>> Hi Matan,
>>
>> On 6/7/20 12:38 PM, Matan Azrad wrote:
>>> Hi Maxime
>>>
>>> Thanks for the huge work.
>>> Please see a suggestion inline.
>>>
>>> From: Maxime Coquelin:
>>>> Sent: Thursday, May 14, 2020 11:02 AM
>>>> To: xiaolong.ye@intel.com; Shahaf Shuler <shahafs@mellanox.com>;
>>>> Matan Azrad <matan@mellanox.com>; amorenoz@redhat.com;
>>>> xiao.w.wang@intel.com; Slava Ovsiienko <viacheslavo@mellanox.com>;
>>>> dev@dpdk.org
>>>> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
>>>> <maxime.coquelin@redhat.com>
>>>> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if needed
>>>>
>>>> Now that we have Virtio device status support, let's only use the
>>>> vDPA workaround if it is not supported.
>>>>
>>>> This patch also document why Virtio device status protocol feature
>>>> support is strongly advised.
>>>>
>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> ---
>>>> lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
>>>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/lib/librte_vhost/vhost_user.c
>>>> b/lib/librte_vhost/vhost_user.c index e5a44be58d..67e96a872a 100644
>>>> --- a/lib/librte_vhost/vhost_user.c
>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
>>>> if (!vdpa_dev)
>>>> goto out;
>>>>
>>>> - if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
>>>> - request == VHOST_USER_SET_VRING_CALL) {
>>>> + if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
>>>> + /*
>>>> + * Workaround when Virtio device status protocol
>>>> + * feature is not supported, wait for SET_VRING_CALL
>>>> + * request. This is not ideal as some frontends like
>>>> + * Virtio-user may not send this request, so vDPA device
>>>> + * may never be configured. Virtio device status support
>>>> + * on frontend side is strongly advised.
>>>> + */
>>>> + if (!(dev->protocol_features &
>>>> + (1ULL <<
>>>> VHOST_USER_PROTOCOL_F_STATUS)) &&
>>>> + (request !=
>>>> VHOST_USER_SET_VRING_CALL))
>>>> + goto out;
>>>> +
>>>
>>>
>>> When status protocol feature is not supported, in the current code, the
>> vDPA configuration triggering depends in:
>>> 1. Device is ready - all the queues are configured (datapath addresses,
>> callfd and kickfd) .
>>> 2. last command is callfd.
>>>
>>>
>>> The code doesn't take into account that some queues may stay disabled.
>>> Maybe the correct timing is:
>>> 1. Device is ready - all the enabled queues are configured and MEM table is
>> configured.
>>
>> I think current virtio_is_ready() already assumes the mem table is
>> configured, otherwise we would not have vq->desc, vq->used and vq->avail
>> being set as it needs to be translated using the mem table.
>>
>
> Yes, but if you don't expect to check them for disabled queues you need to check mem table to be sure it was set.
Even disabled queues should be allocated/configured by the guest driver.
>
>>> 2. no need callfd to be last.
>>>
>>> Queues that will be configured later will be configured to the HW when the
>> virtq becoming enabled.
>>>
>>>
>>> What do think?
>>
>> Maybe I did not understood what you mean, so please correct me if needed.
>>
>> If I understood correctly, then your suggestion is just to remove the
>> workaround, but it has been introduced by Intel because the callfd gets set a
>> second time in some cases.
>
> Not to remove the WA, just to improve it😊
>
> I don't sure I understand the issue here, can you add details?
My understanding is that callfd is sent early by Qemu but is then
updated later by Qemu, and we have no way to distinguish whether the
first one is valid or not... I did a bit of archeology and found this
explanation from Xiao:
https://inbox.dpdk.org/stable/B7F2E978279D1D49A3034B7786DACF407AFAA0C6@SHSMSX106.ccr.corp.intel.com/
I haven't managed to reproduce the issue myself, so that's why I'm a bit
reluctant to try to improve it. Ideally Xiao could try to reproduce
the issue, so that if we can find something more elegant (something that
makes Virtio-user work without the SET_STATUS support) we can be
confident in merging it (and maybe even backport it).
Regards,
Maxime
>
>>
>> Thanks,
>> Maxime
>>>
>>>> if (vdpa_dev->ops->dev_conf(dev->vid))
>>>> VHOST_LOG_CONFIG(ERR,
>>>> "Failed to configure vDPA device\n");
>>>> --
>>>> 2.25.4
>>>
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-09 9:04 ` Maxime Coquelin
@ 2020-06-09 11:09 ` Matan Azrad
2020-06-09 11:26 ` Maxime Coquelin
2020-06-09 17:23 ` Maxime Coquelin
0 siblings, 2 replies; 35+ messages in thread
From: Matan Azrad @ 2020-06-09 11:09 UTC (permalink / raw)
To: Maxime Coquelin, xiaolong.ye, Shahaf Shuler, amorenoz,
xiao.w.wang, Slava Ovsiienko, dev
Cc: jasowang, lulu
Hi Maxime
From: Maxime Coquelin
> Hi Matan,
>
> On 6/8/20 11:19 AM, Matan Azrad wrote:
> > Hi Maxime
> >
> > From: Maxime Coquelin:
> >> Hi Matan,
> >>
> >> On 6/7/20 12:38 PM, Matan Azrad wrote:
> >>> Hi Maxime
> >>>
> >>> Thanks for the huge work.
> >>> Please see a suggestion inline.
> >>>
> >>> From: Maxime Coquelin:
> >>>> Sent: Thursday, May 14, 2020 11:02 AM
> >>>> To: xiaolong.ye@intel.com; Shahaf Shuler <shahafs@mellanox.com>;
> >>>> Matan Azrad <matan@mellanox.com>; amorenoz@redhat.com;
> >>>> xiao.w.wang@intel.com; Slava Ovsiienko
> <viacheslavo@mellanox.com>;
> >>>> dev@dpdk.org
> >>>> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
> >>>> <maxime.coquelin@redhat.com>
> >>>> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if
> >>>> needed
> >>>>
> >>>> Now that we have Virtio device status support, let's only use the
> >>>> vDPA workaround if it is not supported.
> >>>>
> >>>> This patch also document why Virtio device status protocol feature
> >>>> support is strongly advised.
> >>>>
> >>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>> ---
> >>>> lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
> >>>> 1 file changed, 14 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/lib/librte_vhost/vhost_user.c
> >>>> b/lib/librte_vhost/vhost_user.c index e5a44be58d..67e96a872a 100644
> >>>> --- a/lib/librte_vhost/vhost_user.c
> >>>> +++ b/lib/librte_vhost/vhost_user.c
> >>>> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
> >>>> if (!vdpa_dev)
> >>>> goto out;
> >>>>
> >>>> - if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
> >>>> - request == VHOST_USER_SET_VRING_CALL) {
> >>>> + if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
> >>>> + /*
> >>>> + * Workaround when Virtio device status protocol
> >>>> + * feature is not supported, wait for SET_VRING_CALL
> >>>> + * request. This is not ideal as some frontends like
> >>>> + * Virtio-user may not send this request, so vDPA device
> >>>> + * may never be configured. Virtio device status support
> >>>> + * on frontend side is strongly advised.
> >>>> + */
> >>>> + if (!(dev->protocol_features &
> >>>> + (1ULL <<
> >>>> VHOST_USER_PROTOCOL_F_STATUS)) &&
> >>>> + (request !=
> >>>> VHOST_USER_SET_VRING_CALL))
> >>>> + goto out;
> >>>> +
> >>>
> >>>
> >>> When status protocol feature is not supported, in the current code,
> >>> the
> >> vDPA configuration triggering depends in:
> >>> 1. Device is ready - all the queues are configured (datapath
> >>> addresses,
> >> callfd and kickfd) .
> >>> 2. last command is callfd.
> >>>
> >>>
> >>> The code doesn't take into account that some queues may stay disabled.
> >>> Maybe the correct timing is:
> >>> 1. Device is ready - all the enabled queues are configured and MEM
> >>> table is
> >> configured.
> >>
> >> I think current virtio_is_ready() already assumes the mem table is
> >> configured, otherwise we would not have vq->desc, vq->used and
> >> vq->avail being set as it needs to be translated using the mem table.
> >>
> >
> > Yes, but if you don't expect to check them for disabled queues you need to
> check mem table to be sure it was set.
>
> Even disabled queues should be allocated/configured by the guest driver.
Is that required by the spec?
We saw that the Windows virtio guest driver doesn't configure disabled queues.
Is it a bug in the Windows guest?
You can probably take a look here:
https://github.com/virtio-win/kvm-guest-drivers-windows
> >>> 2. no need callfd to be last.
> >>>
> >>> Queues that will be configured later will be configured to the HW
> >>> when the
> >> virtq becoming enabled.
> >>>
> >>>
> >>> What do think?
> >>
> >> Maybe I did not understood what you mean, so please correct me if
> needed.
> >>
> >> If I understood correctly, then your suggestion is just to remove the
> >> workaround, but it has been introduced by Intel because the callfd
> >> gets set a second time in some cases.
> >
> > Not to remove the WA, just to improve it😊
> >
> > I don't sure I understand the issue here, can you add details?
>
> My understanding is that callfd is sent early by Qemu but is then updated
> after by Qemu and we have no way to distinguish whether the first one is
> valid or not... I did a bit of archeology and found this explanation from Xiao:
> https://inbox.dpdk.org/stable/B7F2E978279D1D49A3034B7786DACF407AFAA0C6@SHSMSX106.ccr.corp.intel.com/
>
> I haven't managed to reproduce the issue myself, so that's why I'm a bit
> reluctant in trying to improve it. Ideally Xiao could try to reproduce the issue,
> so that if we can find something more elegant (and that does make Virtio-
> user to work without the SET_STATUS support) we can be confident in
> merging it (and maybe even backport it).
It looks like a WA for a very specific case, and it breaks other cases, for example:
Guest poll-mode driver: callfd is configured twice, one after the other; the first is X != -1 and the second is -1. Here, the vDPA configuration may be triggered by the first one, which makes the driver wrongly think that the queue is not in poll mode.
I will send an RFC patch with my suggestion.
> Regards,
> Maxime
>
> >
> >>
> >> Thanks,
> >> Maxime
> >>>
> >>>> if (vdpa_dev->ops->dev_conf(dev->vid))
> >>>> VHOST_LOG_CONFIG(ERR,
> >>>> "Failed to configure vDPA device\n");
> >>>> --
> >>>> 2.25.4
> >>>
> >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-09 11:09 ` Matan Azrad
@ 2020-06-09 11:26 ` Maxime Coquelin
2020-06-09 17:23 ` Maxime Coquelin
1 sibling, 0 replies; 35+ messages in thread
From: Maxime Coquelin @ 2020-06-09 11:26 UTC (permalink / raw)
To: Matan Azrad, xiaolong.ye, Shahaf Shuler, amorenoz, xiao.w.wang,
Slava Ovsiienko, dev
Cc: jasowang, lulu
On 6/9/20 1:09 PM, Matan Azrad wrote:
>
> Hi Maxime
>
> From: Maxime Coquelin
>> Hi Matan,
>>
>> On 6/8/20 11:19 AM, Matan Azrad wrote:
>>> Hi Maxime
>>>
>>> From: Maxime Coquelin:
>>>> Hi Matan,
>>>>
>>>> On 6/7/20 12:38 PM, Matan Azrad wrote:
>>>>> Hi Maxime
>>>>>
>>>>> Thanks for the huge work.
>>>>> Please see a suggestion inline.
>>>>>
>>>>> From: Maxime Coquelin:
>>>>>> Sent: Thursday, May 14, 2020 11:02 AM
>>>>>> To: xiaolong.ye@intel.com; Shahaf Shuler <shahafs@mellanox.com>;
>>>>>> Matan Azrad <matan@mellanox.com>; amorenoz@redhat.com;
>>>>>> xiao.w.wang@intel.com; Slava Ovsiienko
>> <viacheslavo@mellanox.com>;
>>>>>> dev@dpdk.org
>>>>>> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
>>>>>> <maxime.coquelin@redhat.com>
>>>>>> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if
>>>>>> needed
>>>>>>
>>>>>> Now that we have Virtio device status support, let's only use the
>>>>>> vDPA workaround if it is not supported.
>>>>>>
>>>>>> This patch also document why Virtio device status protocol feature
>>>>>> support is strongly advised.
>>>>>>
>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>> ---
>>>>>> lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
>>>>>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/lib/librte_vhost/vhost_user.c
>>>>>> b/lib/librte_vhost/vhost_user.c index e5a44be58d..67e96a872a 100644
>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>> if (!vdpa_dev)
>>>>>> goto out;
>>>>>>
>>>>>> - if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
>>>>>> - request == VHOST_USER_SET_VRING_CALL) {
>>>>>> + if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
>>>>>> + /*
>>>>>> + * Workaround when Virtio device status protocol
>>>>>> + * feature is not supported, wait for SET_VRING_CALL
>>>>>> + * request. This is not ideal as some frontends like
>>>>>> + * Virtio-user may not send this request, so vDPA device
>>>>>> + * may never be configured. Virtio device status support
>>>>>> + * on frontend side is strongly advised.
>>>>>> + */
>>>>>> + if (!(dev->protocol_features &
>>>>>> + (1ULL <<
>>>>>> VHOST_USER_PROTOCOL_F_STATUS)) &&
>>>>>> + (request !=
>>>>>> VHOST_USER_SET_VRING_CALL))
>>>>>> + goto out;
>>>>>> +
>>>>>
>>>>>
>>>>> When status protocol feature is not supported, in the current code,
>>>>> the
>>>> vDPA configuration triggering depends in:
>>>>> 1. Device is ready - all the queues are configured (datapath
>>>>> addresses,
>>>> callfd and kickfd) .
>>>>> 2. last command is callfd.
>>>>>
>>>>>
>>>>> The code doesn't take into account that some queues may stay disabled.
>>>>> Maybe the correct timing is:
>>>>> 1. Device is ready - all the enabled queues are configured and MEM
>>>>> table is
>>>> configured.
>>>>
>>>> I think current virtio_is_ready() already assumes the mem table is
>>>> configured, otherwise we would not have vq->desc, vq->used and
>>>> vq->avail being set as it needs to be translated using the mem table.
>>>>
>>>
>>> Yes, but if you don't expect to check them for disabled queues you need to
>> check mem table to be sure it was set.
>>
>> Even disabled queues should be allocated/configured by the guest driver.
>
> Is it by spec?
>
> We saw that windows virtio guest driver doesn't configure disabled queues.
> Is it bug in windows guest?
> You probably can take a look here:
> https://github.com/virtio-win/kvm-guest-drivers-windows
>
>>>>> 2. no need callfd to be last.
>>>>>
>>>>> Queues that will be configured later will be configured to the HW
>>>>> when the
>>>> virtq becoming enabled.
>>>>>
>>>>>
>>>>> What do think?
>>>>
>>>> Maybe I did not understood what you mean, so please correct me if
>> needed.
>>>>
>>>> If I understood correctly, then your suggestion is just to remove the
>>>> workaround, but it has been introduced by Intel because the callfd
>>>> gets set a second time in some cases.
>>>
>>> Not to remove the WA, just to improve it😊
>>>
>>> I don't sure I understand the issue here, can you add details?
>>
>> My understanding is that callfd is sent early by Qemu but is then updated
>> after by Qemu and we have no way to distinguish whether the first one is
>> valid or not... I did a bit of archeology and found this explanation from Xiao:
>> https://inbox.dpdk.org/stable/B7F2E978279D1D49A3034B7786DACF407AFAA0C6@SHSMSX106.ccr.corp.intel.com/
>>
>> I haven't managed to reproduce the issue myself, so that's why I'm a bit
>> reluctant in trying to improve it. Ideally Xiao could try to reproduce the issue,
>> so that if we can find something more elegant (and that does make Virtio-
>> user to work without the SET_STATUS support) we can be confident in
>> merging it (and maybe even backport it).
>
>
> It looks like a WA for a very specific case, and it breaks other cases, for example:
> Guest poll-mode driver: callfd is configured twice, one after the other; the first is X != -1 and the second is -1. Here, the vDPA configuration may be triggered by the first one, which makes the driver wrongly think that the queue is not in poll mode.
Yes, I agree that it would be better to avoid having this workaround, as
it may create regressions.
> I will send RFC patch with my suggestion.
Thanks.
Xiao, any chance you try to reproduce the initial issue? This way we can
test Matan's RFC.
Maxime
>
>> Regards,
>> Maxime
>>
>>>
>>>>
>>>> Thanks,
>>>> Maxime
>>>>>
>>>>>> if (vdpa_dev->ops->dev_conf(dev->vid))
>>>>>> VHOST_LOG_CONFIG(ERR,
>>>>>> "Failed to configure vDPA device\n");
>>>>>> --
>>>>>> 2.25.4
>>>>>
>>>
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-09 11:09 ` Matan Azrad
2020-06-09 11:26 ` Maxime Coquelin
@ 2020-06-09 17:23 ` Maxime Coquelin
2020-06-14 6:08 ` Matan Azrad
1 sibling, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2020-06-09 17:23 UTC (permalink / raw)
To: Matan Azrad, xiaolong.ye, Shahaf Shuler, amorenoz, xiao.w.wang,
Slava Ovsiienko, dev
Cc: jasowang, lulu
On 6/9/20 1:09 PM, Matan Azrad wrote:
> Hi Maxime
>
> From: Maxime Coquelin
>> Hi Matan,
>>
>> On 6/8/20 11:19 AM, Matan Azrad wrote:
>>> Hi Maxime
>>>
>>> From: Maxime Coquelin:
>>>> Hi Matan,
>>>>
>>>> On 6/7/20 12:38 PM, Matan Azrad wrote:
>>>>> Hi Maxime
>>>>>
>>>>> Thanks for the huge work.
>>>>> Please see a suggestion inline.
>>>>>
>>>>> From: Maxime Coquelin:
>>>>>> Sent: Thursday, May 14, 2020 11:02 AM
>>>>>> To: xiaolong.ye@intel.com; Shahaf Shuler <shahafs@mellanox.com>;
>>>>>> Matan Azrad <matan@mellanox.com>; amorenoz@redhat.com;
>>>>>> xiao.w.wang@intel.com; Slava Ovsiienko
>> <viacheslavo@mellanox.com>;
>>>>>> dev@dpdk.org
>>>>>> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
>>>>>> <maxime.coquelin@redhat.com>
>>>>>> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if
>>>>>> needed
>>>>>>
>>>>>> Now that we have Virtio device status support, let's only use the
>>>>>> vDPA workaround if it is not supported.
>>>>>>
>>>>>> This patch also document why Virtio device status protocol feature
>>>>>> support is strongly advised.
>>>>>>
>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>> ---
>>>>>> lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
>>>>>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/lib/librte_vhost/vhost_user.c
>>>>>> b/lib/librte_vhost/vhost_user.c index e5a44be58d..67e96a872a 100644
>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>> if (!vdpa_dev)
>>>>>> goto out;
>>>>>>
>>>>>> - if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
>>>>>> - request == VHOST_USER_SET_VRING_CALL) {
>>>>>> + if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
>>>>>> + /*
>>>>>> + * Workaround when Virtio device status protocol
>>>>>> + * feature is not supported, wait for SET_VRING_CALL
>>>>>> + * request. This is not ideal as some frontends like
>>>>>> + * Virtio-user may not send this request, so vDPA device
>>>>>> + * may never be configured. Virtio device status support
>>>>>> + * on frontend side is strongly advised.
>>>>>> + */
>>>>>> + if (!(dev->protocol_features &
>>>>>> + (1ULL <<
>>>>>> VHOST_USER_PROTOCOL_F_STATUS)) &&
>>>>>> + (request !=
>>>>>> VHOST_USER_SET_VRING_CALL))
>>>>>> + goto out;
>>>>>> +
>>>>>
>>>>> When status protocol feature is not supported, in the current code,
>>>>> the
>>>> vDPA configuration triggering depends in:
>>>>> 1. Device is ready - all the queues are configured (datapath
>>>>> addresses,
>>>> callfd and kickfd) .
>>>>> 2. last command is callfd.
>>>>>
>>>>>
>>>>> The code doesn't take into account that some queues may stay disabled.
>>>>> Maybe the correct timing is:
>>>>> 1. Device is ready - all the enabled queues are configured and MEM
>>>>> table is
>>>> configured.
>>>>
>>>> I think current virtio_is_ready() already assumes the mem table is
>>>> configured, otherwise we would not have vq->desc, vq->used and
>>>> vq->avail being set as it needs to be translated using the mem table.
>>>>
>>> Yes, but if you don't expect to check them for disabled queues you need to
>> check mem table to be sure it was set.
>>
>> Even disabled queues should be allocated/configured by the guest driver.
> Is it by spec?
Sorry, that was a misunderstanding from my side.
The number of queues set by the driver using the MQ_VQ_PAIRS_SET control
message has to be allocated and configured by the driver:
http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-1940001
> We saw that windows virtio guest driver doesn't configure disabled queues.
> Is it bug in windows guest?
> You probably can take a look here:
> https://github.com/virtio-win/kvm-guest-drivers-windows
>
Indeed it limits the number of queue pairs to the number of CPUs.
This is done here:
https://github.com/virtio-win/kvm-guest-drivers-windows/blob/edda3f50a17015aab1450ca09e3263c7409e4001/NetKVM/Common/ParaNdis_Common.cpp#L956
Linux does the same by the way:
https://elixir.bootlin.com/linux/latest/source/drivers/net/virtio_net.c#L3092
We rarely face this issue because by default, the management
layers usually set the number of queue pairs to the number of vCPUs if
multiqueue is enabled. But the problem is real.
In my opinion, the problem is more on the Vhost-user spec side and/or in
the Vhost-user backend.
The DPDK backend allocates queue pairs every time it receives a
Vhost-user message setting a new queue (callfd, kickfd, enable,... see
vhost_user_check_and_alloc_queue_pair()). And then virtio_is_ready()
waits for all the allocated queue pairs to be initialized.
The problem is that QEMU sends some of these messages even for queues
that aren't (or won't be) initialized, as you can see in the log below,
where I reproduced the issue with Windows 10:
https://pastebin.com/YYCfW9Y3
I don't see how the backend could know the guest driver is done based on
the information received from QEMU so far, since from its point of view
some queues are only partially initialized (callfd is set).
With VHOST_USER_SET_STATUS, we will be able to handle this properly, as
the backend can be sure the guest won't initialize more queues once the
DRIVER_OK Virtio status bit is set. In my v2, I can add one patch to
handle this case properly, by destroying queue metadata as soon as
DRIVER_OK is received.
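A minimal sketch of what that cleanup on DRIVER_OK could look like (hypothetical, simplified metadata; `drop_stale_queues()` and the `vq_meta`/`dev_meta` structures are illustrative stand-ins, not the actual vhost structures):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_DEVICE_STATUS_DRIVER_OK 0x04
#define MAX_VQ 16

/* Hypothetical per-queue state: a queue is fully initialized once its
 * rings, kickfd and callfd have all been received. */
struct vq_meta {
	bool allocated;
	bool has_rings;
	bool has_kickfd;
	bool has_callfd;
};

struct dev_meta {
	uint8_t status;
	struct vq_meta vq[MAX_VQ];
	size_t nr_vq;
};

static bool vq_fully_initialized(const struct vq_meta *vq)
{
	return vq->has_rings && vq->has_kickfd && vq->has_callfd;
}

/* Once DRIVER_OK is set, the guest driver won't initialize any more
 * queues, so metadata of partially initialized queues can be dropped
 * and no longer blocks readiness. Returns the number of queues
 * dropped. */
static size_t drop_stale_queues(struct dev_meta *dev)
{
	size_t dropped = 0;

	if (!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK))
		return 0;

	for (size_t i = 0; i < dev->nr_vq; i++) {
		struct vq_meta *vq = &dev->vq[i];

		if (vq->allocated && !vq_fully_initialized(vq)) {
			*vq = (struct vq_meta){ 0 };
			dropped++;
		}
	}
	return dropped;
}
```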
Note that it was the exact reason why I first tried to add support for
VHOST_USER_SET_STATUS more than two years ago...:
https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04560.html
What do you think?
Regards,
Maxime
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status
2020-05-14 8:02 ` [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status Maxime Coquelin
@ 2020-06-11 2:45 ` Xia, Chenbo
2020-06-16 4:29 ` Xia, Chenbo
1 sibling, 0 replies; 35+ messages in thread
From: Xia, Chenbo @ 2020-06-11 2:45 UTC (permalink / raw)
To: Maxime Coquelin, Ye, Xiaolong, shahafs, matan, amorenoz, Wang,
Xiao W, viacheslavo, dev
Cc: jasowang, lulu
Hi Maxime,
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Maxime Coquelin
> Sent: Thursday, May 14, 2020 4:02 PM
> To: Ye, Xiaolong <xiaolong.ye@intel.com>; shahafs@mellanox.com;
> matan@mellanox.com; amorenoz@redhat.com; Wang, Xiao W
> <xiao.w.wang@intel.com>; viacheslavo@mellanox.com; dev@dpdk.org
> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
> <maxime.coquelin@redhat.com>
> Subject: [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status
>
> This patch adds support to the new Virtio device status Vhost-user protocol
> feature.
>
> Getting such information in the backend helps to know when the driver is done
> with the device configuration and so makes the initialization phase more robust.
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> lib/librte_vhost/rte_vhost.h | 4 ++++
> lib/librte_vhost/vhost.h | 9 ++++++++
> lib/librte_vhost/vhost_user.c | 40 +++++++++++++++++++++++++++++++++++
> lib/librte_vhost/vhost_user.h | 6 ++++--
> 4 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h index
> 5c72fba797..b4b7aa1928 100644
> --- a/lib/librte_vhost/rte_vhost.h
> +++ b/lib/librte_vhost/rte_vhost.h
> @@ -85,6 +85,10 @@ extern "C" {
> #define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12 #endif
>
> +#ifndef VHOST_USER_PROTOCOL_F_STATUS
> +#define VHOST_USER_PROTOCOL_F_STATUS 15
> +#endif
> +
> /** Indicate whether protocol features negotiation is supported. */ #ifndef
> VHOST_USER_F_PROTOCOL_FEATURES
> #define VHOST_USER_F_PROTOCOL_FEATURES 30
> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index
> df98d15de6..9a9c0a98f5 100644
> --- a/lib/librte_vhost/vhost.h
> +++ b/lib/librte_vhost/vhost.h
> @@ -202,6 +202,14 @@ struct vhost_virtqueue {
> TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list; }
> __rte_cache_aligned;
>
> +/* Virtio device status as per Virtio specification */
> +#define VIRTIO_DEVICE_STATUS_ACK 0x01
> +#define VIRTIO_DEVICE_STATUS_DRIVER 0x02
> +#define VIRTIO_DEVICE_STATUS_DRIVER_OK 0x04
> +#define VIRTIO_DEVICE_STATUS_FEATURES_OK 0x08
> +#define VIRTIO_DEVICE_STATUS_DEV_NEED_RESET 0x40
> +#define VIRTIO_DEVICE_STATUS_FAILED 0x80
> +
> /* Old kernels have no such macros defined */ #ifndef
> VIRTIO_NET_F_GUEST_ANNOUNCE
> #define VIRTIO_NET_F_GUEST_ANNOUNCE 21 @@ -364,6 +372,7 @@ struct
> virtio_net {
> uint64_t log_addr;
> struct rte_ether_addr mac;
> uint16_t mtu;
> + uint8_t status;
>
> struct vhost_device_ops const *notify_ops;
>
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index
> 4a847f368c..e5a44be58d 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -87,6 +87,7 @@ static const char *vhost_message_str[VHOST_USER_MAX]
> = {
> [VHOST_USER_POSTCOPY_END] = "VHOST_USER_POSTCOPY_END",
> [VHOST_USER_GET_INFLIGHT_FD] = "VHOST_USER_GET_INFLIGHT_FD",
> [VHOST_USER_SET_INFLIGHT_FD] = "VHOST_USER_SET_INFLIGHT_FD",
> + [VHOST_USER_SET_STATUS] = "VHOST_USER_SET_STATUS",
> };
>
> static int send_vhost_reply(int sockfd, struct VhostUserMsg *msg); @@ -1328,6
> +1329,11 @@ virtio_is_ready(struct virtio_net *dev)
> return 0;
> }
>
> + /* If supported, ensure the frontend is really done with config */
> + if (dev->protocol_features & (1ULL <<
> VHOST_USER_PROTOCOL_F_STATUS))
> + if (!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK))
> + return 0;
> +
> dev->flags |= VIRTIO_DEV_READY;
>
> VHOST_LOG_CONFIG(INFO,
> @@ -2425,6 +2431,39 @@ vhost_user_postcopy_end(struct virtio_net **pdev,
> struct VhostUserMsg *msg,
> return RTE_VHOST_MSG_RESULT_REPLY;
> }
>
> +static int
> +vhost_user_set_status(struct virtio_net **pdev, struct VhostUserMsg *msg,
> + int main_fd __rte_unused)
> +{
> + struct virtio_net *dev = *pdev;
> +
Should we do 'validate_msg_fds' to avoid malicious messages as it is done in every message handler?
Thanks,
Chenbo
> + /* As per Virtio specification, the device status is 8bits long */
> + if (msg->payload.u64 > UINT8_MAX) {
> + VHOST_LOG_CONFIG(ERR, "Invalid VHOST_USER_SET_STATUS
> payload 0x%" PRIx64 "\n",
> + msg->payload.u64);
> + return RTE_VHOST_MSG_RESULT_ERR;
> + }
> +
> + dev->status = msg->payload.u64;
> +
> + VHOST_LOG_CONFIG(INFO, "New device status(0x%08x):\n"
> + "\t-ACKNOWLEDGE: %u\n"
> + "\t-DRIVER: %u\n"
> + "\t-FEATURES_OK: %u\n"
> + "\t-DRIVER_OK: %u\n"
> + "\t-DEVICE_NEED_RESET: %u\n"
> + "\t-FAILED: %u\n",
> + dev->status,
> + !!(dev->status & VIRTIO_DEVICE_STATUS_ACK),
> + !!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER),
> + !!(dev->status &
> VIRTIO_DEVICE_STATUS_FEATURES_OK),
> + !!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK),
> + !!(dev->status &
> VIRTIO_DEVICE_STATUS_DEV_NEED_RESET),
> + !!(dev->status & VIRTIO_DEVICE_STATUS_FAILED));
> +
> + return RTE_VHOST_MSG_RESULT_OK;
> +}
> +
> typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
> struct VhostUserMsg *msg,
> int main_fd);
> @@ -2457,6 +2496,7 @@ static vhost_message_handler_t
> vhost_message_handlers[VHOST_USER_MAX] = {
> [VHOST_USER_POSTCOPY_END] = vhost_user_postcopy_end,
> [VHOST_USER_GET_INFLIGHT_FD] = vhost_user_get_inflight_fd,
> [VHOST_USER_SET_INFLIGHT_FD] = vhost_user_set_inflight_fd,
> + [VHOST_USER_SET_STATUS] = vhost_user_set_status,
> };
>
> /* return bytes# of read on success or negative val on failure. */ diff --git
> a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h index
> 1f65efa4a9..74fd361e3a 100644
> --- a/lib/librte_vhost/vhost_user.h
> +++ b/lib/librte_vhost/vhost_user.h
> @@ -23,7 +23,8 @@
> (1ULL <<
> VHOST_USER_PROTOCOL_F_CRYPTO_SESSION) | \
> (1ULL <<
> VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD) | \
> (1ULL <<
> VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) | \
> - (1ULL <<
> VHOST_USER_PROTOCOL_F_PAGEFAULT))
> + (1ULL <<
> VHOST_USER_PROTOCOL_F_PAGEFAULT) | \
> + (1ULL <<
> VHOST_USER_PROTOCOL_F_STATUS))
>
> typedef enum VhostUserRequest {
> VHOST_USER_NONE = 0,
> @@ -56,7 +57,8 @@ typedef enum VhostUserRequest {
> VHOST_USER_POSTCOPY_END = 30,
> VHOST_USER_GET_INFLIGHT_FD = 31,
> VHOST_USER_SET_INFLIGHT_FD = 32,
> - VHOST_USER_MAX = 33
> + VHOST_USER_SET_STATUS = 36,
> + VHOST_USER_MAX = 37
> } VhostUserRequest;
>
> typedef enum VhostUserSlaveRequest {
> --
> 2.25.4
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-09 17:23 ` Maxime Coquelin
@ 2020-06-14 6:08 ` Matan Azrad
2020-06-17 9:39 ` Maxime Coquelin
0 siblings, 1 reply; 35+ messages in thread
From: Matan Azrad @ 2020-06-14 6:08 UTC (permalink / raw)
To: Maxime Coquelin, xiaolong.ye, Shahaf Shuler, amorenoz,
xiao.w.wang, Slava Ovsiienko, dev
Cc: jasowang, lulu
Hi Maxime
From: Maxime Coquelin:
> On 6/9/20 1:09 PM, Matan Azrad wrote:
> > Hi Maxime
> >
> > From: Maxime Coquelin
> >> Hi Matan,
> >>
> >> On 6/8/20 11:19 AM, Matan Azrad wrote:
> >>> Hi Maxime
> >>>
> >>> From: Maxime Coquelin:
> >>>> Hi Matan,
> >>>>
> >>>> On 6/7/20 12:38 PM, Matan Azrad wrote:
> >>>>> Hi Maxime
> >>>>>
> >>>>> Thanks for the huge work.
> >>>>> Please see a suggestion inline.
> >>>>>
> >>>>> From: Maxime Coquelin:
> >>>>>> Sent: Thursday, May 14, 2020 11:02 AM
> >>>>>> To: xiaolong.ye@intel.com; Shahaf Shuler
> <shahafs@mellanox.com>;
> >>>>>> Matan Azrad <matan@mellanox.com>; amorenoz@redhat.com;
> >>>>>> xiao.w.wang@intel.com; Slava Ovsiienko
> >> <viacheslavo@mellanox.com>;
> >>>>>> dev@dpdk.org
> >>>>>> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
> >>>>>> <maxime.coquelin@redhat.com>
> >>>>>> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if
> >>>>>> needed
> >>>>>>
> >>>>>> Now that we have Virtio device status support, let's only use the
> >>>>>> vDPA workaround if it is not supported.
> >>>>>>
> >>>>>> This patch also document why Virtio device status protocol
> >>>>>> feature support is strongly advised.
> >>>>>>
> >>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>>>> ---
> >>>>>> lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
> >>>>>> 1 file changed, 14 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/lib/librte_vhost/vhost_user.c
> >>>>>> b/lib/librte_vhost/vhost_user.c index e5a44be58d..67e96a872a
> >>>>>> 100644
> >>>>>> --- a/lib/librte_vhost/vhost_user.c
> >>>>>> +++ b/lib/librte_vhost/vhost_user.c
> >>>>>> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
> >>>>>> if (!vdpa_dev)
> >>>>>> goto out;
> >>>>>>
> >>>>>> - if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
> >>>>>> - request == VHOST_USER_SET_VRING_CALL)
> {
> >>>>>> + if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
> >>>>>> + /*
> >>>>>> + * Workaround when Virtio device status protocol
> >>>>>> + * feature is not supported, wait for
> SET_VRING_CALL
> >>>>>> + * request. This is not ideal as some frontends like
> >>>>>> + * Virtio-user may not send this request, so vDPA
> device
> >>>>>> + * may never be configured. Virtio device status
> support
> >>>>>> + * on frontend side is strongly advised.
> >>>>>> + */
> >>>>>> + if (!(dev->protocol_features &
> >>>>>> + (1ULL <<
> >>>>>> VHOST_USER_PROTOCOL_F_STATUS)) &&
> >>>>>> + (request !=
> >>>>>> VHOST_USER_SET_VRING_CALL))
> >>>>>> + goto out;
> >>>>>> +
> >>>>>
> >>>>> When status protocol feature is not supported, in the current
> >>>>> code, the
> >>>> vDPA configuration triggering depends on:
> >>>>> 1. Device is ready - all the queues are configured (datapath
> >>>>> addresses,
> >>>> callfd and kickfd) .
> >>>>> 2. last command is callfd.
> >>>>>
> >>>>>
> >>>>> The code doesn't take into account that some queues may stay
> disabled.
> >>>>> Maybe the correct timing is:
> >>>>> 1. Device is ready - all the enabled queues are configured and MEM
> >>>>> table is
> >>>> configured.
> >>>>
> >>>> I think current virtio_is_ready() already assumes the mem table is
> >>>> configured, otherwise we would not have vq->desc, vq->used and
> >>>> vq->avail being set as it needs to be translated using the mem table.
> >>>>
> >>> Yes, but if you don't expect to check them for disabled queues you
> >>> need to
> >> check mem table to be sure it was set.
> >>
> >> Even disabled queues should be allocated/configured by the guest driver.
> > Is it by spec?
>
> Sorry, that was a misunderstanding from my side.
> The number of queues set by the driver using MQ_VQ_PAIRS_SET control
> message have to be allocated and configured by the driver:
> http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-1940001
>
Do you mean this sentence:
"The driver MUST configure the virtqueues before enabling them with the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command." ?
Maybe I'm missing something in the English here, but this sentence doesn't seem to say whether the driver should configure queues that will never be enabled by the virtio driver (i.e. stay disabled forever).
> > We saw that the Windows virtio guest driver doesn't configure disabled
> > queues.
> > Is it a bug in the Windows guest?
> > You probably can take a look here:
> > https://github.com/virtio-win/kvm-guest-drivers-windows
> >
>
> Indeed it limits the number of queue pairs to the number of CPUs.
> This is done here:
> https://github.com/virtio-win/kvm-guest-drivers-windows/blob/edda3f50a17015aab1450ca09e3263c7409e4001/NetKVM/Common/ParaNdis_Common.cpp#L956
>
> Linux does the same by the way:
> https://elixir.bootlin.com/linux/latest/source/drivers/net/virtio_net.c#L3092
Yes, I think it makes sense.
> We rarely face this issue because by default, the management layers usually
> set the number of queue pairs to the number of vCPUs if multiqueue is
> enabled. But the problem is real.
>
> In my opinion, the problem is more on Vhost-user spec side and/or Vhost-
> user backend.
>
> The DPDK backend allocates a queue pair every time it receives a Vhost-
> user message setting up a new queue (callfd, kickfd, enable,... see
> vhost_user_check_and_alloc_queue_pair()). Then virtio_is_ready()
> waits for all the allocated queue pairs to be initialized.
>
> The problem is that QEMU sends some of these messages even for queues
> that aren't (or won't be) initialized, as you can see in the log below,
> where I reproduced the issue with Windows 10:
> https://pastebin.com/YYCfW9Y3
>
> I don't see how the backend could know the guest driver is done based on
> the information received from QEMU so far, since from its point of view
> some queues are only partially initialized (callfd is set).
Don’t you think that only enabled queues must be fully initialized when their status is changed from disabled to enabled?
So, you can assume that disabled queues can stay "not fully initialized"...
> With VHOST_USER_SET_STATUS, we will be able to handle this properly, as
> the backend can be sure the guest won't initialize more queues as soon as
> DRIVER_OK Virtio status bit is set. In my v2, I can add one patch to handle this
> case properly, by destroying queue metadata as soon as DRIVER_OK is
> received.
>
> Note that it was the exact reason why I first tried to add support for
> VHOST_USER_SET_STATUS more than two years ago...:
> https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04560.html
>
> What do you think?
Yes, I agree it may be solved by VHOST_USER_SET_STATUS (and probably a lot of other issues too),
but I think we also need to support legacy QEMU versions if we can...
Don't you think so?
> Regards,
> Maxime
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status
2020-05-14 8:02 ` [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status Maxime Coquelin
2020-06-11 2:45 ` Xia, Chenbo
@ 2020-06-16 4:29 ` Xia, Chenbo
2020-06-22 10:18 ` Adrian Moreno
1 sibling, 1 reply; 35+ messages in thread
From: Xia, Chenbo @ 2020-06-16 4:29 UTC (permalink / raw)
To: Maxime Coquelin, Ye, Xiaolong, shahafs, matan, amorenoz, Wang,
Xiao W, viacheslavo, dev
Cc: jasowang, lulu
Hi Maxime,
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Maxime Coquelin
> Sent: Thursday, May 14, 2020 4:02 PM
> To: Ye, Xiaolong <xiaolong.ye@intel.com>; shahafs@mellanox.com;
> matan@mellanox.com; amorenoz@redhat.com; Wang, Xiao W
> <xiao.w.wang@intel.com>; viacheslavo@mellanox.com; dev@dpdk.org
> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
> <maxime.coquelin@redhat.com>
> Subject: [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status
>
> This patch adds support to the new Virtio device status Vhost-user protocol
> feature.
>
> Getting such information in the backend helps to know when the driver is done
> with the device configuration and so makes the initialization phase more robust.
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> lib/librte_vhost/rte_vhost.h | 4 ++++
> lib/librte_vhost/vhost.h | 9 ++++++++
> lib/librte_vhost/vhost_user.c | 40 +++++++++++++++++++++++++++++++++++
> lib/librte_vhost/vhost_user.h | 6 ++++--
> 4 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h index
> 5c72fba797..b4b7aa1928 100644
> --- a/lib/librte_vhost/rte_vhost.h
> +++ b/lib/librte_vhost/rte_vhost.h
> @@ -85,6 +85,10 @@ extern "C" {
> #define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12 #endif
>
> +#ifndef VHOST_USER_PROTOCOL_F_STATUS
> +#define VHOST_USER_PROTOCOL_F_STATUS 15
> +#endif
> +
> /** Indicate whether protocol features negotiation is supported. */ #ifndef
> VHOST_USER_F_PROTOCOL_FEATURES
> #define VHOST_USER_F_PROTOCOL_FEATURES 30
> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index
> df98d15de6..9a9c0a98f5 100644
> --- a/lib/librte_vhost/vhost.h
> +++ b/lib/librte_vhost/vhost.h
> @@ -202,6 +202,14 @@ struct vhost_virtqueue {
> TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list; }
> __rte_cache_aligned;
>
> +/* Virtio device status as per Virtio specification */
> +#define VIRTIO_DEVICE_STATUS_ACK 0x01
> +#define VIRTIO_DEVICE_STATUS_DRIVER 0x02
> +#define VIRTIO_DEVICE_STATUS_DRIVER_OK 0x04
> +#define VIRTIO_DEVICE_STATUS_FEATURES_OK 0x08
> +#define VIRTIO_DEVICE_STATUS_DEV_NEED_RESET 0x40
> +#define VIRTIO_DEVICE_STATUS_FAILED 0x80
> +
> /* Old kernels have no such macros defined */ #ifndef
> VIRTIO_NET_F_GUEST_ANNOUNCE
> #define VIRTIO_NET_F_GUEST_ANNOUNCE 21 @@ -364,6 +372,7 @@ struct
> virtio_net {
> uint64_t log_addr;
> struct rte_ether_addr mac;
> uint16_t mtu;
> + uint8_t status;
>
> struct vhost_device_ops const *notify_ops;
>
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index
> 4a847f368c..e5a44be58d 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -87,6 +87,7 @@ static const char *vhost_message_str[VHOST_USER_MAX]
> = {
> [VHOST_USER_POSTCOPY_END] = "VHOST_USER_POSTCOPY_END",
> [VHOST_USER_GET_INFLIGHT_FD] = "VHOST_USER_GET_INFLIGHT_FD",
> [VHOST_USER_SET_INFLIGHT_FD] = "VHOST_USER_SET_INFLIGHT_FD",
> + [VHOST_USER_SET_STATUS] = "VHOST_USER_SET_STATUS",
> };
>
> static int send_vhost_reply(int sockfd, struct VhostUserMsg *msg); @@ -1328,6
> +1329,11 @@ virtio_is_ready(struct virtio_net *dev)
> return 0;
> }
>
> + /* If supported, ensure the frontend is really done with config */
> + if (dev->protocol_features & (1ULL <<
> VHOST_USER_PROTOCOL_F_STATUS))
> + if (!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK))
> + return 0;
> +
> dev->flags |= VIRTIO_DEV_READY;
>
> VHOST_LOG_CONFIG(INFO,
> @@ -2425,6 +2431,39 @@ vhost_user_postcopy_end(struct virtio_net **pdev,
> struct VhostUserMsg *msg,
> return RTE_VHOST_MSG_RESULT_REPLY;
> }
>
> +static int
> +vhost_user_set_status(struct virtio_net **pdev, struct VhostUserMsg *msg,
> + int main_fd __rte_unused)
> +{
> + struct virtio_net *dev = *pdev;
> +
> + /* As per Virtio specification, the device status is 8bits long */
> + if (msg->payload.u64 > UINT8_MAX) {
> + VHOST_LOG_CONFIG(ERR, "Invalid VHOST_USER_SET_STATUS
> payload 0x%" PRIx64 "\n",
> + msg->payload.u64);
> + return RTE_VHOST_MSG_RESULT_ERR;
> + }
> +
> + dev->status = msg->payload.u64;
> +
> + VHOST_LOG_CONFIG(INFO, "New device status(0x%08x):\n"
> + "\t-ACKNOWLEDGE: %u\n"
> + "\t-DRIVER: %u\n"
> + "\t-FEATURES_OK: %u\n"
> + "\t-DRIVER_OK: %u\n"
> + "\t-DEVICE_NEED_RESET: %u\n"
> + "\t-FAILED: %u\n",
> + dev->status,
> + !!(dev->status & VIRTIO_DEVICE_STATUS_ACK),
> + !!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER),
> + !!(dev->status &
> VIRTIO_DEVICE_STATUS_FEATURES_OK),
> + !!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK),
> + !!(dev->status &
> VIRTIO_DEVICE_STATUS_DEV_NEED_RESET),
> + !!(dev->status & VIRTIO_DEVICE_STATUS_FAILED));
> +
> + return RTE_VHOST_MSG_RESULT_OK;
I see in your patch for virtio-user SET_STATUS support (http://patchwork.dpdk.org/patch/70677/), the
VHOST_USER_SET_STATUS msg may request a reply, but this function does not handle that case. If we don't
handle it here, vhost_user_msg_handler will return RTE_VHOST_MSG_RESULT_ERR later. So should we
handle that case here?
Thanks,
Chenbo
> +}
> +
> typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
> struct VhostUserMsg *msg,
> int main_fd);
> @@ -2457,6 +2496,7 @@ static vhost_message_handler_t
> vhost_message_handlers[VHOST_USER_MAX] = {
> [VHOST_USER_POSTCOPY_END] = vhost_user_postcopy_end,
> [VHOST_USER_GET_INFLIGHT_FD] = vhost_user_get_inflight_fd,
> [VHOST_USER_SET_INFLIGHT_FD] = vhost_user_set_inflight_fd,
> + [VHOST_USER_SET_STATUS] = vhost_user_set_status,
> };
>
> /* return bytes# of read on success or negative val on failure. */ diff --git
> a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h index
> 1f65efa4a9..74fd361e3a 100644
> --- a/lib/librte_vhost/vhost_user.h
> +++ b/lib/librte_vhost/vhost_user.h
> @@ -23,7 +23,8 @@
> (1ULL <<
> VHOST_USER_PROTOCOL_F_CRYPTO_SESSION) | \
> (1ULL <<
> VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD) | \
> (1ULL <<
> VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) | \
> - (1ULL <<
> VHOST_USER_PROTOCOL_F_PAGEFAULT))
> + (1ULL <<
> VHOST_USER_PROTOCOL_F_PAGEFAULT) | \
> + (1ULL <<
> VHOST_USER_PROTOCOL_F_STATUS))
>
> typedef enum VhostUserRequest {
> VHOST_USER_NONE = 0,
> @@ -56,7 +57,8 @@ typedef enum VhostUserRequest {
> VHOST_USER_POSTCOPY_END = 30,
> VHOST_USER_GET_INFLIGHT_FD = 31,
> VHOST_USER_SET_INFLIGHT_FD = 32,
> - VHOST_USER_MAX = 33
> + VHOST_USER_SET_STATUS = 36,
> + VHOST_USER_MAX = 37
> } VhostUserRequest;
>
> typedef enum VhostUserSlaveRequest {
> --
> 2.25.4
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-14 6:08 ` Matan Azrad
@ 2020-06-17 9:39 ` Maxime Coquelin
2020-06-17 11:04 ` Matan Azrad
0 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2020-06-17 9:39 UTC (permalink / raw)
To: Matan Azrad, xiaolong.ye, Shahaf Shuler, amorenoz, xiao.w.wang,
Slava Ovsiienko, dev
Cc: jasowang, lulu
Hi Matan,
On 6/14/20 8:08 AM, Matan Azrad wrote:
> Hi Maxime
>
> From: Maxime Coquelin:
>> On 6/9/20 1:09 PM, Matan Azrad wrote:
>>> Hi Maxime
>>>
>>> From: Maxime Coquelin
>>>> Hi Matan,
>>>>
>>>> On 6/8/20 11:19 AM, Matan Azrad wrote:
>>>>> Hi Maxime
>>>>>
>>>>> From: Maxime Coquelin:
>>>>>> Hi Matan,
>>>>>>
>>>>>> On 6/7/20 12:38 PM, Matan Azrad wrote:
>>>>>>> Hi Maxime
>>>>>>>
>>>>>>> Thanks for the huge work.
>>>>>>> Please see a suggestion inline.
>>>>>>>
>>>>>>> From: Maxime Coquelin:
>>>>>>>> Sent: Thursday, May 14, 2020 11:02 AM
>>>>>>>> To: xiaolong.ye@intel.com; Shahaf Shuler
>> <shahafs@mellanox.com>;
>>>>>>>> Matan Azrad <matan@mellanox.com>; amorenoz@redhat.com;
>>>>>>>> xiao.w.wang@intel.com; Slava Ovsiienko
>>>> <viacheslavo@mellanox.com>;
>>>>>>>> dev@dpdk.org
>>>>>>>> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
>>>>>>>> <maxime.coquelin@redhat.com>
>>>>>>>> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if
>>>>>>>> needed
>>>>>>>>
>>>>>>>> Now that we have Virtio device status support, let's only use the
>>>>>>>> vDPA workaround if it is not supported.
>>>>>>>>
>>>>>>>> This patch also document why Virtio device status protocol
>>>>>>>> feature support is strongly advised.
>>>>>>>>
>>>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>>> ---
>>>>>>>> lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
>>>>>>>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/lib/librte_vhost/vhost_user.c
>>>>>>>> b/lib/librte_vhost/vhost_user.c index e5a44be58d..67e96a872a
>>>>>>>> 100644
>>>>>>>> --- a/lib/librte_vhost/vhost_user.c
>>>>>>>> +++ b/lib/librte_vhost/vhost_user.c
>>>>>>>> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int fd)
>>>>>>>> if (!vdpa_dev)
>>>>>>>> goto out;
>>>>>>>>
>>>>>>>> - if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
>>>>>>>> - request == VHOST_USER_SET_VRING_CALL)
>> {
>>>>>>>> + if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
>>>>>>>> + /*
>>>>>>>> + * Workaround when Virtio device status protocol
>>>>>>>> + * feature is not supported, wait for
>> SET_VRING_CALL
>>>>>>>> + * request. This is not ideal as some frontends like
>>>>>>>> + * Virtio-user may not send this request, so vDPA
>> device
>>>>>>>> + * may never be configured. Virtio device status
>> support
>>>>>>>> + * on frontend side is strongly advised.
>>>>>>>> + */
>>>>>>>> + if (!(dev->protocol_features &
>>>>>>>> + (1ULL <<
>>>>>>>> VHOST_USER_PROTOCOL_F_STATUS)) &&
>>>>>>>> + (request !=
>>>>>>>> VHOST_USER_SET_VRING_CALL))
>>>>>>>> + goto out;
>>>>>>>> +
>>>>>>>
>>>>>>> When status protocol feature is not supported, in the current
>>>>>>> code, the
>>>>>> vDPA configuration triggering depends on:
>>>>>>> 1. Device is ready - all the queues are configured (datapath
>>>>>>> addresses,
>>>>>> callfd and kickfd) .
>>>>>>> 2. last command is callfd.
>>>>>>>
>>>>>>>
>>>>>>> The code doesn't take into account that some queues may stay
>> disabled.
>>>>>>> Maybe the correct timing is:
>>>>>>> 1. Device is ready - all the enabled queues are configured and MEM
>>>>>>> table is
>>>>>> configured.
>>>>>>
>>>>>> I think current virtio_is_ready() already assumes the mem table is
>>>>>> configured, otherwise we would not have vq->desc, vq->used and
>>>>>> vq->avail being set as it needs to be translated using the mem table.
>>>>>>
>>>>> Yes, but if you don't expect to check them for disabled queues you
>>>>> need to
>>>> check mem table to be sure it was set.
>>>>
>>>> Even disabled queues should be allocated/configured by the guest driver.
>>> Is it by spec?
>>
>> Sorry, that was a misunderstanding from my side.
>> The number of queues set by the driver using the MQ_VQ_PAIRS_SET control
>> message has to be allocated and configured by the driver:
>> http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-1940001
>>
>
> Do you mean this sentence:
> "The driver MUST configure the virtqueues before enabling them with the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command." ?
>
> Maybe I'm missing something in the English here, but this sentence doesn't seem to say whether the driver should configure queues that will never be enabled by the virtio driver (i.e. stay disabled forever).
>
>
>>> We saw that the Windows virtio guest driver doesn't configure disabled
>>> queues.
>>> Is it a bug in the Windows guest?
>>> You probably can take a look here:
>>> https://github.com/virtio-win/kvm-guest-drivers-windows
>>>
>>
>> Indeed it limits the number of queue pairs to the number of CPUs.
>> This is done here:
>> https://github.com/virtio-win/kvm-guest-drivers-windows/blob/edda3f50a17015aab1450ca09e3263c7409e4001/NetKVM/Common/ParaNdis_Common.cpp#L956
>>
>> Linux does the same by the way:
>> https://elixir.bootlin.com/linux/latest/source/drivers/net/virtio_net.c#L3092
>
> Yes, I think it makes sense.
>
>> We rarely face this issue because by default, the management layers usually
>> set the number of queue pairs to the number of vCPUs if multiqueue is
>> enabled. But the problem is real.
>>
>> In my opinion, the problem is more on Vhost-user spec side and/or Vhost-
>> user backend.
>>
>> The DPDK backend allocates a queue pair every time it receives a Vhost-
>> user message setting up a new queue (callfd, kickfd, enable,... see
>> vhost_user_check_and_alloc_queue_pair()). Then virtio_is_ready()
>> waits for all the allocated queue pairs to be initialized.
>>
>> The problem is that QEMU sends some of these messages even for queues
>> that aren't (or won't be) initialized, as you can see in the log below,
>> where I reproduced the issue with Windows 10:
>> https://pastebin.com/YYCfW9Y3
>>
>> I don't see how the backend could know the guest driver is done based on
>> the information received from QEMU so far, since from its point of view
>> some queues are only partially initialized (callfd is set).
>
> Don’t you think that only enabled queues must be fully initialized when their status is changed from disabled to enabled?
> So, you can assume that disabled queues can stay "not fully initialized"...
That may work, but it might not follow the Virtio spec: with 1.0, we
shouldn't process the rings before DRIVER_OK is set (though we cannot be
sure we follow that anyway without SET_STATUS support).
I propose to cook a patch doing the following:
1. virtio_is_ready() will only ensure the first queue pair is ready
(i.e. enabled and configured). Meaning that app's new_device callback
and vDPA drivers dev_conf callback will be called with only the first
queue pair configured and enabled.
2. Before handling a new vhost-user request, it saves the ready status
for every queue pair.
3. Same handling of the requests, except that we won't notify the vdpa
driver and the application of vring state changes in the
VHOST_USER_SET_VRING_ENABLE handler.
4. Once the Vhost-user request is handled, it compares the new ready
status for every queue with the old one and sends queue state change
events accordingly.
It is likely to need changes in the .dev_conf and .set_vring_state
implementations by the drivers.
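To make the four steps above concrete, here is a minimal sketch of the ready-status tracking logic. All names (struct vring_state, vq_is_ready(), save_ready_status(), notify_ready_changes()) are hypothetical illustrations, not the actual librte_vhost code:

```c
#include <stdbool.h>

struct vring_state {
	bool enabled;     /* set by VHOST_USER_SET_VRING_ENABLE */
	bool configured;  /* ring addresses, callfd and kickfd all set */
	bool ready;       /* snapshot taken before handling each request */
};

/* A ring is ready only when it is both enabled and fully configured. */
static bool vq_is_ready(const struct vring_state *vq)
{
	return vq->enabled && vq->configured;
}

/* Step 2: snapshot the ready status before handling a request. */
static void save_ready_status(struct vring_state *vrings, int n)
{
	for (int i = 0; i < n; i++)
		vrings[i].ready = vq_is_ready(&vrings[i]);
}

/*
 * Step 4: after the request, report rings whose ready status changed.
 * Returns the number of state-change events that would be forwarded to
 * the vDPA driver / application via .set_vring_state().
 */
static int notify_ready_changes(struct vring_state *vrings, int n)
{
	int events = 0;

	for (int i = 0; i < n; i++) {
		bool now = vq_is_ready(&vrings[i]);

		if (now != vrings[i].ready)
			events++; /* would call .set_vring_state(dev, i, now) */
	}
	return events;
}
```

Note that in this scheme the VHOST_USER_SET_VRING_ENABLE handler itself no longer notifies anyone (step 3); all notifications flow through the post-request comparison.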
>
>> With VHOST_USER_SET_STATUS, we will be able to handle this properly, as
>> the backend can be sure the guest won't initialize more queues as soon as
>> DRIVER_OK Virtio status bit is set. In my v2, I can add one patch to handle this
>> case properly, by destroying queue metadata as soon as DRIVER_OK is
>> received.
>>
>> Note that it was the exact reason why I first tried to add support for
>> VHOST_USER_SET_STATUS more than two years ago...:
>> https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04560.html
>>
>> What do you think?
>
> Yes, I agree it may be solved by VHOST_USER_SET_STATUS (and probably a lot of other issues),
> But I think we also need to support legacy QEMU versions if we can...
I think the SET_STATUS support is important to be compliant with the
Virtio specification.
> Don't you think so?
We can try that.
I will try to cook something this week, but it will require validation
with OVS to be sure we don't break multiqueue. I will send it as RFC,
and count on you to try it with your mlx5 vDPA driver.
Does it work for you? (note I'll be on vacation from July 1st to 17th)
Thanks,
Maxime
>> Regards,
>> Maxime
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-17 9:39 ` Maxime Coquelin
@ 2020-06-17 11:04 ` Matan Azrad
2020-06-17 12:29 ` Maxime Coquelin
0 siblings, 1 reply; 35+ messages in thread
From: Matan Azrad @ 2020-06-17 11:04 UTC (permalink / raw)
To: Maxime Coquelin, xiaolong.ye, Shahaf Shuler, amorenoz,
xiao.w.wang, Slava Ovsiienko, dev
Cc: jasowang, lulu
Hi Maxime
From: Maxime Coquelin:
> Hi Matan,
>
> On 6/14/20 8:08 AM, Matan Azrad wrote:
> > Hi Maxime
> >
> > From: Maxime Coquelin:
> >> On 6/9/20 1:09 PM, Matan Azrad wrote:
> >>> Hi Maxime
> >>>
> >>> From: Maxime Coquelin
> >>>> Hi Matan,
> >>>>
> >>>> On 6/8/20 11:19 AM, Matan Azrad wrote:
> >>>>> Hi Maxime
> >>>>>
> >>>>> From: Maxime Coquelin:
> >>>>>> Hi Matan,
> >>>>>>
> >>>>>> On 6/7/20 12:38 PM, Matan Azrad wrote:
> >>>>>>> Hi Maxime
> >>>>>>>
> >>>>>>> Thanks for the huge work.
> >>>>>>> Please see a suggestion inline.
> >>>>>>>
> >>>>>>> From: Maxime Coquelin:
> >>>>>>>> Sent: Thursday, May 14, 2020 11:02 AM
> >>>>>>>> To: xiaolong.ye@intel.com; Shahaf Shuler
> >> <shahafs@mellanox.com>;
> >>>>>>>> Matan Azrad <matan@mellanox.com>; amorenoz@redhat.com;
> >>>>>>>> xiao.w.wang@intel.com; Slava Ovsiienko
> >>>> <viacheslavo@mellanox.com>;
> >>>>>>>> dev@dpdk.org
> >>>>>>>> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
> >>>>>>>> <maxime.coquelin@redhat.com>
> >>>>>>>> Subject: [PATCH 9/9] vhost: only use vDPA config workaround if
> >>>>>>>> needed
> >>>>>>>>
> >>>>>>>> Now that we have Virtio device status support, let's only use
> >>>>>>>> the vDPA workaround if it is not supported.
> >>>>>>>>
> >>>>>>>> This patch also document why Virtio device status protocol
> >>>>>>>> feature support is strongly advised.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Maxime Coquelin
> <maxime.coquelin@redhat.com>
> >>>>>>>> ---
> >>>>>>>> lib/librte_vhost/vhost_user.c | 16 ++++++++++++++--
> >>>>>>>> 1 file changed, 14 insertions(+), 2 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/lib/librte_vhost/vhost_user.c
> >>>>>>>> b/lib/librte_vhost/vhost_user.c index e5a44be58d..67e96a872a
> >>>>>>>> 100644
> >>>>>>>> --- a/lib/librte_vhost/vhost_user.c
> >>>>>>>> +++ b/lib/librte_vhost/vhost_user.c
> >>>>>>>> @@ -2847,8 +2847,20 @@ vhost_user_msg_handler(int vid, int
> fd)
> >>>>>>>> if (!vdpa_dev)
> >>>>>>>> goto out;
> >>>>>>>>
> >>>>>>>> - if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED) &&
> >>>>>>>> - request == VHOST_USER_SET_VRING_CALL)
> >> {
> >>>>>>>> + if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
> >>>>>>>> + /*
> >>>>>>>> + * Workaround when Virtio device status protocol
> >>>>>>>> + * feature is not supported, wait for
> >> SET_VRING_CALL
> >>>>>>>> + * request. This is not ideal as some frontends like
> >>>>>>>> + * Virtio-user may not send this request, so vDPA
> >> device
> >>>>>>>> + * may never be configured. Virtio device status
> >> support
> >>>>>>>> + * on frontend side is strongly advised.
> >>>>>>>> + */
> >>>>>>>> + if (!(dev->protocol_features &
> >>>>>>>> + (1ULL <<
> >>>>>>>> VHOST_USER_PROTOCOL_F_STATUS)) &&
> >>>>>>>> + (request !=
> >>>>>>>> VHOST_USER_SET_VRING_CALL))
> >>>>>>>> + goto out;
> >>>>>>>> +
> >>>>>>>
> >>>>>>> When status protocol feature is not supported, in the current
> >>>>>>> code, the
> >>>>>> vDPA configuration triggering depends in:
> >>>>>>> 1. Device is ready - all the queues are configured (datapath
> >>>>>>> addresses,
> >>>>>> callfd and kickfd) .
> >>>>>>> 2. last command is callfd.
> >>>>>>>
> >>>>>>>
> >>>>>>> The code doesn't take into account that some queues may stay
> >> disabled.
> >>>>>>> Maybe the correct timing is:
> >>>>>>> 1. Device is ready - all the enabled queues are configured and
> >>>>>>> MEM table is
> >>>>>> configured.
> >>>>>>
> >>>>>> I think current virtio_is_ready() already assumes the mem table
> >>>>>> is configured, otherwise we would not have vq->desc, vq->used and
> >>>>>> vq->avail being set as it needs to be translated using the mem table.
> >>>>>>
> >>>>> Yes, but if you don't expect to check them for disabled queues you
> >>>>> need to
> >>>> check mem table to be sure it was set.
> >>>>
> >>>> Even disabled queues should be allocated/configured by the guest
> driver.
> >>> Is it by spec?
> >>
> >> Sorry, that was a misunderstanding from my side.
> >> The number of queues set by the driver using MQ_VQ_PAIRS_SET control
> >> message have to be allocated and configured by the driver:
> >> http://docs.oasis-open.org/virtio/virtio/v1.0/cs04/virtio-v1.0-cs04.html#x1-1940001
> >>
> >
> > Do you mean to the sentence:
> > "The driver MUST configure the virtqueues before enabling them with the
> VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command." ?
> >
> > Maybe I miss English understanding here but it looks like this sentence
> doesn't say if the driver should do configuration for queues that will not be
> enabled by the virtio driver (stay disabled forever).
> >
> >
> >>> We saw that windows virtio guest driver doesn't configure disabled
> >> queues.
> >>> Is it bug in windows guest?
> >>> You probably can take a look here:
> >>> https://github.com/virtio-win/kvm-guest-drivers-windows
> >>>
> >>
> >> Indeed it limits the number of queue pairs to the number of CPUs.
> >> This is done here:
> >> https://github.com/virtio-win/kvm-guest-drivers-windows/blob/edda3f50a17015aab1450ca09e3263c7409e4001/NetKVM/Common/ParaNdis_Common.cpp#L956
> >> Linux does the same by the way:
> >> https://elixir.bootlin.com/linux/latest/source/drivers/net/virtio_net.c#L3092
> >
> > Yes, I think it makes sense.
> >
> >> We rarely face this issue because by default, the management layers
> >> usually set the number of queue pairs to the number of vCPUs if
> >> multiqueue is enabled. But the problem is real.
> >>
> >> In my opinion, the problem is more on Vhost-user spec side and/or
> >> Vhost-user backend.
> >>
> >> The DPDK backend allocates a queue pair every time it receives a
> >> Vhost-user message setting up a new queue (callfd, kickfd, enable,...
> >> see vhost_user_check_and_alloc_queue_pair()). And then
> >> virtio_is_ready() waits for all the allocated queue pairs to be initialized.
> >>
> >> The problem is that QEMU sends some of these messages even for
> queues
> >> that aren't (or won't be) initialized, as you can see in below log
> >> where I reproduced the issue with Windows 10:
> >> https://pastebin.com/YYCfW9Y3
> >>
> >> I don't see how the backend could know the guest driver is done with
> >> currently received information from QEMU, as from its point of view
> >> some queues are only partially initialized (callfd is set).
> >
> > Don’t you think that only enabled queues must be fully initialized when
> their status is changed from disabled to enabled?
> > So, you can assume that disabled queues can stay "not fully initialized"...
>
> That may work but might not be following the Virtio spec as with 1.0 we
> shouldn't process the rings before DRIVER_OK is set (but we cannot be sure
> we follow it anyway without SET_STATUS support).
>
> I propose to cook a patch doing the following:
> 1. virtio_is_ready() will only ensure the first queue pair is ready (i.e. enabled
> and configured). Meaning that app's new_device callback and vDPA drivers
> dev_conf callback will be called with only the first queue pair configured and
> enabled.
>
> 2. Before handling a new vhost-user request, it saves the ready status for
> every queue pair.
>
> 3. Same handling of the requests, except that we won't notify the vdpa
> driver and the application of vring state changes in the
> VHOST_USER_SET_VRING_ENABLE handler.
>
> 4. Once the Vhost-user request is handled, it compares the new ready status
> for every queue with the old one and sends queue state event changes
> accordingly.
Looks very nice to me.
More points:
With this method, some queues may be configured by the set_vring_state operation, so the driver is expected to make the following calls for each queue from its set_vring_state callback:
1. rte_vhost_enable_guest_notification
This one takes the datapath lock, so we need to be sure the datapath lock is not already held on this queue by the same caller thread (maybe we should not take datapath locks at all when vDPA is configured).
2. rte_vhost_host_notifier_ctrl
This API is per device rather than per queue; maybe we need to change it to be per queue (add a new one now and deprecate the old one in 20.11).
3. We need to be sure that if a ready queue's configuration is changed after dev_conf, the driver is notified (maybe by set_vring_state(disable) then set_vring_state(enable)).
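Point 2 above could be addressed by adding a per-queue variant and keeping the per-device call as a wrapper during the transition. A rough sketch only; the hyp_ names are hypothetical, not a proposed DPDK API:

```c
#include <stdbool.h>
#include <stdint.h>

#define HYP_NR_VRINGS 4

static bool notifier_enabled[HYP_NR_VRINGS];

/* Hypothetical per-queue variant of the host notifier control. */
static int hyp_host_notifier_queue_ctrl(int vid, uint16_t qid, bool enable)
{
	(void)vid;
	if (qid >= HYP_NR_VRINGS)
		return -1;
	notifier_enabled[qid] = enable;
	return 0;
}

/*
 * The existing per-device semantics can be kept as a wrapper that
 * applies the operation to every queue, easing the migration.
 */
static int hyp_host_notifier_ctrl(int vid, bool enable)
{
	for (uint16_t q = 0; q < HYP_NR_VRINGS; q++)
		if (hyp_host_notifier_queue_ctrl(vid, q, enable) < 0)
			return -1;
	return 0;
}
```

With such a split, a driver's set_vring_state callback could enable the notifier for just the queue that became ready.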
> It is likely to need changes in the .dev_conf and .set_vring_state
> implementations by the drivers.
Yes, for Mellanox it is very easy change.
Intel?
> >
> >> With VHOST_USER_SET_STATUS, we will be able to handle this properly,
> >> as the backend can be sure the guest won't initialize more queues as
> >> soon as DRIVER_OK Virtio status bit is set. In my v2, I can add one
> >> patch to handle this case properly, by destroying queue metadata
> >> as soon as DRIVER_OK is received.
> >>
> >> Note that it was the exact reason why I first tried to add support
> >> for VHOST_USER_SET_STATUS more than two years ago...:
> >> https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04560.html
> >>
> >> What do you think?
> >
> > Yes, I agree it may be solved by VHOST_USER_SET_STATUS (and probably a
> > lot of other issues), But I think we need support also legacy QEMU versions
> if we can...
>
> I think the SET_STATUS support is important to be compliant with the Virtio
> specification.
>
> > Don't you think so?
Yes, I agree.
>
> We can try that.
> I will try to cook something this week, but it will require validation with OVS
> to be sure we don't break multiqueue. I will send it as RFC, and count on you
> to try it with your mlx5 vDPA driver.
>
> Does it work for you? (note I'll be on vacation from July 1st to 17th)
Sure,
Do you have capacity to do it this week?
I can help.....
Matan
>
> Thanks,
> Maxime
>
> >> Regards,
> >> Maxime
> >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-17 11:04 ` Matan Azrad
@ 2020-06-17 12:29 ` Maxime Coquelin
2020-06-18 6:39 ` Matan Azrad
0 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2020-06-17 12:29 UTC (permalink / raw)
To: Matan Azrad, xiaolong.ye, Shahaf Shuler, amorenoz, xiao.w.wang,
Slava Ovsiienko, dev
Cc: jasowang, lulu
On 6/17/20 1:04 PM, Matan Azrad wrote:
>>> Don’t you think that only enabled queues must be fully initialized when
>> their status is changed from disabled to enabled?
>>> So, you can assume that disabled queues can stay "not fully initialized"...
>>
>> That may work but might not be following the Virtio spec as with 1.0 we
>> shouldn't process the rings before DRIVER_OK is set (but we cannot be sure
>> we follow it anyway without SET_STATUS support).
>>
>> I propose to cook a patch doing the following:
>> 1. virtio_is_ready() will only ensure the first queue pair is ready (i.e. enabled
>> and configured). Meaning that app's new_device callback and vDPA drivers
>> dev_conf callback will be called with only the first queue pair configured and
>> enabled.
>>
>> 2. Before handling a new vhost-user request, it saves the ready status for
>> every queue pair.
>>
>> 3. Same handling of the requests, except that we won't notify the vdpa
>> driver and the application of vring state changes in the
>> VHOST_USER_SET_VRING_ENABLE handler.
>>
>> 4. Once the Vhost-user request is handled, it compares the new ready status
>> for every queue with the old one and sends queue state event changes
>> accordingly.
>
> Looks very nice to me.
Cool!
> More points:
> By this method some queues may be configured by the set_vring_state operation so the next calls are expected to be called for each queue by the driver from the set_vring_state callback :
> 1. rte_vhost_enable_guest_notification
> This one takes datapath lock so we need to be sure that datapath lock is not locked on this queue from the same caller thread (maybe to not takes datapath locks when vdpa is configured at all).
Good point, I agree we shouldn't need to use the access lock when vdpa
is configured. We may want to document that the whole control path is
assumed to be single-threaded, though.
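A tiny sketch of the lock-avoidance idea, purely illustrative (the hyp_ names and structure are hypothetical, not librte_vhost internals): when a vDPA device backs the port, the callback may run from the control thread that already owns the access lock, so taking it again would self-deadlock.

```c
#include <pthread.h>
#include <stdbool.h>

struct hyp_vring {
	pthread_mutex_t access_lock;
	bool notif_enabled;
	int lock_takes; /* counts lock acquisitions, for illustration */
};

static void hyp_enable_guest_notification(struct hyp_vring *vq,
					  bool vdpa_configured, bool enable)
{
	/*
	 * Skip the datapath (access) lock when vDPA is configured,
	 * assuming the control path is single-threaded in that case.
	 */
	if (!vdpa_configured) {
		pthread_mutex_lock(&vq->access_lock);
		vq->lock_takes++;
	}

	vq->notif_enabled = enable;

	if (!vdpa_configured)
		pthread_mutex_unlock(&vq->access_lock);
}
```
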
> 2. rte_vhost_host_notifier_ctrl
> This function API is per device and not per queue, maybe we need to change this function to be per queue (add new for now and deprecate the old one in 20.11).
This one is still experimental, so no issue in reworking the API to make
it per queue without deprecation notice.
> 3. Need to be sure that if a ready queue's configuration is changed after dev_conf, we notify the driver (maybe by set_vring_state(disable) and set_vring_state(enable)).
Agree, I'm not sure yet if we should just toggle set_vring_state as you
propose, or if we should have a new callback for this.
>> It is likely to need changes in the .dev_conf and .set_vring_state
>> implementations by the drivers.
>
> Yes, for Mellanox it is very easy change.
> Intel?
>
>
>>>
>>>> With VHOST_USER_SET_STATUS, we will be able to handle this properly,
>>>> as the backend can be sure the guest won't initialize more queues as
>>>> soon as DRIVER_OK Virtio status bit is set. In my v2, I can add one
>>>> patch to handle this case properly, by destroying queue metadata
>>>> as soon as DRIVER_OK is received.
>>>>
>>>> Note that it was the exact reason why I first tried to add support
>>>> for VHOST_USER_SET_STATUS more than two years ago...:
>>>> https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04560.html
>>>>
>>>> What do you think?
>>>
>>> Yes, I agree it may be solved by VHOST_USER_SET_STATUS (and probably a
>>> lot of other issues), But I think we need support also legacy QEMU versions
>> if we can...
>>
>> I think the SET_STATUS support is important to be compliant with the Virtio
>> specification.
>>
>>> Don't you think so?
>
> Yes, I agree.
>
>>
>> We can try that.
>> I will try to cook something this week, but it will require validation with OVS
>> to be sure we don't break multiqueue. I will send it as RFC, and count on you
>> to try it with your mlx5 vDPA driver.
>>
>> Does it work for you? (note I'll be on vacation from July 1st to 17th)
>
> Sure,
> Do you have capacity to do it this week?
> I can help.....
That would be welcome, as I initially planned to spend time on reviewing
& merging patches this week.
Thanks,
Maxime
> Matan
>
>
>>
>> Thanks,
>> Maxime
>>
>>>> Regards,
>>>> Maxime
>>>
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-17 12:29 ` Maxime Coquelin
@ 2020-06-18 6:39 ` Matan Azrad
2020-06-18 7:30 ` Maxime Coquelin
0 siblings, 1 reply; 35+ messages in thread
From: Matan Azrad @ 2020-06-18 6:39 UTC (permalink / raw)
To: Maxime Coquelin, xiaolong.ye, Shahaf Shuler, amorenoz,
xiao.w.wang, Slava Ovsiienko, dev
Cc: jasowang, lulu
Hi Maxime
From: Maxime Coquelin:
> On 6/17/20 1:04 PM, Matan Azrad wrote:
>
> >>> Don’t you think that only enabled queues must be fully initialized
> >>> when
> >> their status is changed from disabled to enabled?
> >>> So, you can assume that disabled queues can stay "not fully initialized"...
> >>
> >> That may work but might not be following the Virtio spec as with 1.0
> >> we shouldn't process the rings before DRIVER_OK is set (but we cannot
> >> be sure we follow it anyway without SET_STATUS support).
> >>
> >> I propose to cook a patch doing the following:
> >> 1. virtio_is_ready() will only ensure the first queue pair is ready
> >> (i.e. enabled and configured). Meaning that app's new_device callback
> >> and vDPA drivers dev_conf callback will be called with only the first
> >> queue pair configured and enabled.
> >>
> >> 2. Before handling a new vhost-user request, it saves the ready
> >> status for every queue pair.
> >>
> >> 3. Same handling of the requests, except that we won't notify the
> >> vdpa driver and the application of vring state changes in the
> >> VHOST_USER_SET_VRING_ENABLE handler.
> >>
> >> 4. Once the Vhost-user request is handled, it compares the new ready
> >> status for every queue with the old one and sends queue state event
> >> changes accordingly.
> >
> > Looks very nice to me.
>
> Cool!
>
> > More points:
> > By this method some queues may be configured by the set_vring_state
> operation so the next calls are expected to be called for each queue by the
> driver from the set_vring_state callback :
> > 1. rte_vhost_enable_guest_notification
> > This one takes datapath lock so we need to be sure that datapath
> lock is not locked on this queue from the same caller thread (maybe to not
> takes datapath locks when vdpa is configured at all).
>
> Good point, I agree we shouldn't need to use the access lock when vdpa is
> configured. We may want to document that all the control path is assumed to
> be single thread though.
>
>
> > 2. rte_vhost_host_notifier_ctrl
> > This function API is per device and not per queue, maybe we need to
> change this function to be per queue (add new for now and deprecate the
> old one in 20.11).
>
> This one is still experimental, so no issue in reworking the API to make it per
> queue without deprecation notice.
>
> > 3. Need to be sure that if ready queue configuration is changed after
> dev_conf, we should notify it to the driver. (maybe by
> set_vring_state(disable) and set_vring_state(enable)).
>
> Agree, I'm not sure yet if we should just toggle set_vring_state as you
> propose, or if we should have a new callback for this.
Actually, when the queue configuration is changed, there is a moment when the configuration is not valid (while it is being updated).
So maybe it makes sense to toggle.
But there is one more option:
It doesn't make sense that after a configuration change QEMU would not send a VHOST_USER_SET_VRING_ENABLE message.
So maybe we need to call set_vring_state on the following events:
1. queue becomes ready (enabled and fully configured) - set_vring_state(enable).
2. queue becomes not ready - set_vring_state(disable).
3. queue stays ready and a VHOST_USER_SET_VRING_ENABLE message was handled - set_vring_state(enable).
Then we need to document that every set_vring_state call may indicate configuration changes in the queue, even if the state itself has not changed.
What do you think?
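The three rules above could be condensed into a single decision helper. A sketch for illustration only; all names are hypothetical:

```c
#include <stdbool.h>

enum vring_event { EV_NONE, EV_ENABLE, EV_DISABLE };

/*
 * was_ready/is_ready: the queue's ready status before/after handling
 * the vhost-user request.
 * enable_msg: the handled request was VHOST_USER_SET_VRING_ENABLE.
 */
static enum vring_event vring_event_for(bool was_ready, bool is_ready,
					bool enable_msg)
{
	if (!was_ready && is_ready)
		return EV_ENABLE;	/* rule 1: queue became ready */
	if (was_ready && !is_ready)
		return EV_DISABLE;	/* rule 2: queue became not ready */
	if (was_ready && is_ready && enable_msg)
		return EV_ENABLE;	/* rule 3: config may have changed */
	return EV_NONE;
}
```

Rule 3 is what forces the documentation note: a driver receiving enable for an already-enabled queue must re-read the queue configuration.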
> >> It is likely to need changes in the .dev_conf and .set_vring_state
> >> implementations by the drivers.
> >
> > Yes, for Mellanox it is very easy change.
> > Intel?
> >
> >
> >>>
> >>>> With VHOST_USER_SET_STATUS, we will be able to handle this
> >>>> properly, as the backend can be sure the guest won't initialize
> >>>> more queues as soon as DRIVER_OK Virtio status bit is set. In my
> >>>> v2, I can add one patch to handle this case properly, by
> >>>> destroying queue metadata as soon as DRIVER_OK is received.
> >>>>
> >>>> Note that it was the exact reason why I first tried to add support
> >>>> for VHOST_USER_SET_STATUS more than two years ago...:
> >>>> https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04560.html
> >>>>
> >>>> What do you think?
> >>>
> >>> Yes, I agree it may be solved by VHOST_USER_SET_STATUS (and
> probably
> >>> a lot of other issues), But I think we need support also legacy QEMU
> >>> versions
> >> if we can...
> >>
> >> I think the SET_STATUS support is important to be compliant with the
> >> Virtio specification.
> >>
> >>> Don't you think so?
> >
> > Yes, I agree.
> >
> >>
> >> We can try that.
> >> I will try to cook something this week, but it will require
> >> validation with OVS to be sure we don't break multiqueue. I will send
> >> it as RFC, and count on you to try it with your mlx5 vDPA driver.
> >>
> >> Does it work for you? (note I'll be on vacation from July 1st to
> >> 17th)
> >
> > Sure,
> > Do you have capacity to do it this week?
> > I can help.....
>
> That would be welcome, as I initially planned to spend time on reviewing &
> merging patches this week.
>
> Thanks,
> Maxime
> > Matan
> >
> >
> >>
> >> Thanks,
> >> Maxime
> >>
> >>>> Regards,
> >>>> Maxime
> >>>
> >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-18 6:39 ` Matan Azrad
@ 2020-06-18 7:30 ` Maxime Coquelin
2020-06-23 10:42 ` Wang, Xiao W
0 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2020-06-18 7:30 UTC (permalink / raw)
To: Matan Azrad, xiaolong.ye, Shahaf Shuler, amorenoz, xiao.w.wang,
Slava Ovsiienko, dev
Cc: jasowang, lulu
On 6/18/20 8:39 AM, Matan Azrad wrote:
> HI Maxime
>
> From: Maxime Coquelin:
>> On 6/17/20 1:04 PM, Matan Azrad wrote:
>>
>>>>> Don’t you think that only enabled queues must be fully initialized
>>>>> when
>>>> their status is changed from disabled to enabled?
>>>>> So, you can assume that disabled queues can stay "not fully initialized"...
>>>>
>>>> That may work but might not be following the Virtio spec as with 1.0
>>>> we shouldn't process the rings before DRIVER_OK is set (but we cannot
>>>> be sure we follow it anyway without SET_STATUS support).
>>>>
>>>> I propose to cook a patch doing the following:
>>>> 1. virtio_is_ready() will only ensure the first queue pair is ready
>>>> (i.e. enabled and configured). Meaning that app's new_device callback
>>>> and vDPA drivers dev_conf callback will be called with only the first
>>>> queue pair configured and enabled.
>>>>
>>>> 2. Before handling a new vhost-user request, it saves the ready
>>>> status for every queue pair.
>>>>
>>>> 3. Same handling of the requests, except that we won't notify the
>>>> vdpa driver and the application of vring state changes in the
>>>> VHOST_USER_SET_VRING_ENABLE handler.
>>>>
>>>> 4. Once the Vhost-user request is handled, it compares the new ready
>>>> status for every queue with the old one and sends queue state event
>>>> changes accordingly.
>>>
>>> Looks very nice to me.
>>
>> Cool!
>>
>>> More points:
>>> By this method some queues may be configured by the set_vring_state
>> operation so the next calls are expected to be called for each queue by the
>> driver from the set_vring_state callback :
>>> 1. rte_vhost_enable_guest_notification
>>> This one takes datapath lock so we need to be sure that datapath
>> lock is not locked on this queue from the same caller thread (maybe to not
>> takes datapath locks when vdpa is configured at all).
>>
>> Good point, I agree we shouldn't need to use the access lock when vdpa is
>> configured. We may want to document that all the control path is assumed to
>> be single thread though.
>>
>>
>>> 2. rte_vhost_host_notifier_ctrl
>>> This function API is per device and not per queue, maybe we need to
>> change this function to be per queue (add new for now and deprecate the
>> old one in 20.11).
>>
>> This one is still experimental, so no issue in reworking the API to make it per
>> queue without deprecation notice.
>>
>>> 3. Need to be sure that if ready queue configuration is changed after
>> dev_conf, we should notify it to the driver. (maybe by
>> set_vring_state(disable) and set_vring_state(enable)).
>>
>> Agree, I'm not sure yet if we should just toggle set_vring_state as you
>> propose, or if we should have a new callback for this.
>
> Actually, when the queue configuration is changed, there is one moment that configuration is not valid (in the write time).
> So maybe it makes sense to toggle.
>
> But there is one more option:
>
> It doesn't make sense that after a configuration change QEMU would not send a VHOST_USER_SET_VRING_ENABLE message.
Agree.
> So maybe we need to call set_vring_state in the next events:
> 1. queue becomes ready (enabled and fully configured) - set_vring_state(enable).
> 2. queue becomes not ready - set_vring_state(disable).
> 3. queue stays ready and a VHOST_USER_SET_VRING_ENABLE message was handled - set_vring_state(enable).
>
> Then we need to document that every set_vring_state call may indicate configuration changes in the queue, even if the state itself has not changed.
>
> What do you think?
I think it is worth a try.
Thanks,
Maxime
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status
2020-06-16 4:29 ` Xia, Chenbo
@ 2020-06-22 10:18 ` Adrian Moreno
2020-06-22 11:00 ` Xia, Chenbo
0 siblings, 1 reply; 35+ messages in thread
From: Adrian Moreno @ 2020-06-22 10:18 UTC (permalink / raw)
To: Xia, Chenbo, Maxime Coquelin, Ye, Xiaolong, shahafs, matan, Wang,
Xiao W, viacheslavo, dev
Cc: jasowang, lulu
Hi Chenbo
On 6/16/20 6:29 AM, Xia, Chenbo wrote:
> Hi Maxime,
>
>> -----Original Message-----
>> From: dev <dev-bounces@dpdk.org> On Behalf Of Maxime Coquelin
>> Sent: Thursday, May 14, 2020 4:02 PM
>> To: Ye, Xiaolong <xiaolong.ye@intel.com>; shahafs@mellanox.com;
>> matan@mellanox.com; amorenoz@redhat.com; Wang, Xiao W
>> <xiao.w.wang@intel.com>; viacheslavo@mellanox.com; dev@dpdk.org
>> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
>> <maxime.coquelin@redhat.com>
>> Subject: [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status
>>
>> This patch adds support to the new Virtio device status Vhost-user protocol
>> feature.
>>
>> Getting such information in the backend helps to know when the driver is done
>> with the device configuration and so makes the initialization phase more robust.
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>> lib/librte_vhost/rte_vhost.h | 4 ++++
>> lib/librte_vhost/vhost.h | 9 ++++++++
>> lib/librte_vhost/vhost_user.c | 40 +++++++++++++++++++++++++++++++++++
>> lib/librte_vhost/vhost_user.h | 6 ++++--
>> 4 files changed, 57 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h index
>> 5c72fba797..b4b7aa1928 100644
>> --- a/lib/librte_vhost/rte_vhost.h
>> +++ b/lib/librte_vhost/rte_vhost.h
>> @@ -85,6 +85,10 @@ extern "C" {
>> #define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12 #endif
>>
>> +#ifndef VHOST_USER_PROTOCOL_F_STATUS
>> +#define VHOST_USER_PROTOCOL_F_STATUS 15 #endif
>> +
>> /** Indicate whether protocol features negotiation is supported. */ #ifndef
>> VHOST_USER_F_PROTOCOL_FEATURES
>> #define VHOST_USER_F_PROTOCOL_FEATURES 30
>> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index
>> df98d15de6..9a9c0a98f5 100644
>> --- a/lib/librte_vhost/vhost.h
>> +++ b/lib/librte_vhost/vhost.h
>> @@ -202,6 +202,14 @@ struct vhost_virtqueue {
>> TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list; }
>> __rte_cache_aligned;
>>
>> +/* Virtio device status as per Virtio specification */
>> +#define VIRTIO_DEVICE_STATUS_ACK 0x01
>> +#define VIRTIO_DEVICE_STATUS_DRIVER 0x02
>> +#define VIRTIO_DEVICE_STATUS_DRIVER_OK 0x04
>> +#define VIRTIO_DEVICE_STATUS_FEATURES_OK 0x08
>> +#define VIRTIO_DEVICE_STATUS_DEV_NEED_RESET 0x40
>> +#define VIRTIO_DEVICE_STATUS_FAILED 0x80
>> +
>> /* Old kernels have no such macros defined */ #ifndef
>> VIRTIO_NET_F_GUEST_ANNOUNCE
>> #define VIRTIO_NET_F_GUEST_ANNOUNCE 21 @@ -364,6 +372,7 @@ struct
>> virtio_net {
>> uint64_t log_addr;
>> struct rte_ether_addr mac;
>> uint16_t mtu;
>> + uint8_t status;
>>
>> struct vhost_device_ops const *notify_ops;
>>
>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index
>> 4a847f368c..e5a44be58d 100644
>> --- a/lib/librte_vhost/vhost_user.c
>> +++ b/lib/librte_vhost/vhost_user.c
>> @@ -87,6 +87,7 @@ static const char *vhost_message_str[VHOST_USER_MAX]
>> = {
>> [VHOST_USER_POSTCOPY_END] = "VHOST_USER_POSTCOPY_END",
>> [VHOST_USER_GET_INFLIGHT_FD] = "VHOST_USER_GET_INFLIGHT_FD",
>> [VHOST_USER_SET_INFLIGHT_FD] = "VHOST_USER_SET_INFLIGHT_FD",
>> + [VHOST_USER_SET_STATUS] = "VHOST_USER_SET_STATUS",
>> };
>>
>> static int send_vhost_reply(int sockfd, struct VhostUserMsg *msg);
>> @@ -1328,6 +1329,11 @@ virtio_is_ready(struct virtio_net *dev)
>> return 0;
>> }
>>
>> + /* If supported, ensure the frontend is really done with config */
>> +	if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STATUS))
>> + if (!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK))
>> + return 0;
>> +
>> dev->flags |= VIRTIO_DEV_READY;
>>
>> VHOST_LOG_CONFIG(INFO,
>> @@ -2425,6 +2431,39 @@ vhost_user_postcopy_end(struct virtio_net **pdev, struct VhostUserMsg *msg,
>> return RTE_VHOST_MSG_RESULT_REPLY;
>> }
>>
>> +static int
>> +vhost_user_set_status(struct virtio_net **pdev, struct VhostUserMsg *msg,
>> + int main_fd __rte_unused)
>> +{
>> + struct virtio_net *dev = *pdev;
>> +
>> + /* As per Virtio specification, the device status is 8bits long */
>> + if (msg->payload.u64 > UINT8_MAX) {
>> +		VHOST_LOG_CONFIG(ERR, "Invalid VHOST_USER_SET_STATUS payload 0x%" PRIx64 "\n",
>> + msg->payload.u64);
>> + return RTE_VHOST_MSG_RESULT_ERR;
>> + }
>> +
>> + dev->status = msg->payload.u64;
>> +
>> + VHOST_LOG_CONFIG(INFO, "New device status(0x%08x):\n"
>> + "\t-ACKNOWLEDGE: %u\n"
>> + "\t-DRIVER: %u\n"
>> + "\t-FEATURES_OK: %u\n"
>> + "\t-DRIVER_OK: %u\n"
>> + "\t-DEVICE_NEED_RESET: %u\n"
>> + "\t-FAILED: %u\n",
>> + dev->status,
>> + !!(dev->status & VIRTIO_DEVICE_STATUS_ACK),
>> + !!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER),
>> +			!!(dev->status & VIRTIO_DEVICE_STATUS_FEATURES_OK),
>> +			!!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK),
>> +			!!(dev->status & VIRTIO_DEVICE_STATUS_DEV_NEED_RESET),
>> +			!!(dev->status & VIRTIO_DEVICE_STATUS_FAILED));
>> +
>> + return RTE_VHOST_MSG_RESULT_OK;
>
> I see in your patch for virtio-user SET_STATUS support (http://patchwork.dpdk.org/patch/70677/), the
> VHOST_USER_SET_STATUS msg may request a reply, but this func does not handle this case. If we don't
> handle here, vhost_user_msg_handler will return RTE_VHOST_MSG_RESULT_ERR later. So should we
> handle the case here?
>
Why should the reply be handled in this function?
After this function is called, vhost_user_msg_handler() should handle the
replies in:
if (msg.flags & VHOST_USER_NEED_REPLY) {
msg.payload.u64 = ret == RTE_VHOST_MSG_RESULT_ERR;
msg.size = sizeof(msg.payload.u64);
msg.fd_num = 0;
send_vhost_reply(fd, &msg);
} else if (ret == RTE_VHOST_MSG_RESULT_ERR) {
VHOST_LOG_CONFIG(ERR,
"vhost message handling failed.\n");
return -1;
}
Am I missing something?
> Thanks,
> Chenbo
>
>> +}
>> +
>> typedef int (*vhost_message_handler_t)(struct virtio_net **pdev,
>> struct VhostUserMsg *msg,
>> int main_fd);
>> @@ -2457,6 +2496,7 @@ static vhost_message_handler_t
>> vhost_message_handlers[VHOST_USER_MAX] = {
>> [VHOST_USER_POSTCOPY_END] = vhost_user_postcopy_end,
>> [VHOST_USER_GET_INFLIGHT_FD] = vhost_user_get_inflight_fd,
>> [VHOST_USER_SET_INFLIGHT_FD] = vhost_user_set_inflight_fd,
>> + [VHOST_USER_SET_STATUS] = vhost_user_set_status,
>> };
>>
>> /* return bytes# of read on success or negative val on failure. */
>> diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
>> index 1f65efa4a9..74fd361e3a 100644
>> --- a/lib/librte_vhost/vhost_user.h
>> +++ b/lib/librte_vhost/vhost_user.h
>> @@ -23,7 +23,8 @@
>> 			(1ULL << VHOST_USER_PROTOCOL_F_CRYPTO_SESSION) | \
>> 			(1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD) | \
>> 			(1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) | \
>> -			(1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT))
>> +			(1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT) | \
>> +			(1ULL << VHOST_USER_PROTOCOL_F_STATUS))
>>
>> typedef enum VhostUserRequest {
>> VHOST_USER_NONE = 0,
>> @@ -56,7 +57,8 @@ typedef enum VhostUserRequest {
>> VHOST_USER_POSTCOPY_END = 30,
>> VHOST_USER_GET_INFLIGHT_FD = 31,
>> VHOST_USER_SET_INFLIGHT_FD = 32,
>> - VHOST_USER_MAX = 33
>> + VHOST_USER_SET_STATUS = 36,
>> + VHOST_USER_MAX = 37
>> } VhostUserRequest;
>>
>> typedef enum VhostUserSlaveRequest {
>> --
>> 2.25.4
>
--
Adrián Moreno
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status
2020-06-22 10:18 ` Adrian Moreno
@ 2020-06-22 11:00 ` Xia, Chenbo
0 siblings, 0 replies; 35+ messages in thread
From: Xia, Chenbo @ 2020-06-22 11:00 UTC (permalink / raw)
To: Adrian Moreno, Maxime Coquelin, Ye, Xiaolong, shahafs, matan,
Wang, Xiao W, viacheslavo, dev
Cc: jasowang, lulu
Hi Adrian,
> -----Original Message-----
> From: Adrian Moreno <amorenoz@redhat.com>
> Sent: Monday, June 22, 2020 6:18 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>; Maxime Coquelin
> <maxime.coquelin@redhat.com>; Ye, Xiaolong <xiaolong.ye@intel.com>;
> shahafs@mellanox.com; matan@mellanox.com; Wang, Xiao W
> <xiao.w.wang@intel.com>; viacheslavo@mellanox.com; dev@dpdk.org
> Cc: jasowang@redhat.com; lulu@redhat.com
> Subject: Re: [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status
>
> Hi Chenbo
>
> On 6/16/20 6:29 AM, Xia, Chenbo wrote:
> > Hi Maxime,
> >
> >> -----Original Message-----
> >> From: dev <dev-bounces@dpdk.org> On Behalf Of Maxime Coquelin
> >> Sent: Thursday, May 14, 2020 4:02 PM
> >> To: Ye, Xiaolong <xiaolong.ye@intel.com>; shahafs@mellanox.com;
> >> matan@mellanox.com; amorenoz@redhat.com; Wang, Xiao W
> >> <xiao.w.wang@intel.com>; viacheslavo@mellanox.com; dev@dpdk.org
> >> Cc: jasowang@redhat.com; lulu@redhat.com; Maxime Coquelin
> >> <maxime.coquelin@redhat.com>
> >> Subject: [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status
> >>
> >> This patch adds support to the new Virtio device status Vhost-user
> >> protocol feature.
> >>
> >> Getting such information in the backend helps to know when the driver
> >> is done with the device configuration and so makes the initialization phase
> more robust.
> >>
> >> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> ---
> >> lib/librte_vhost/rte_vhost.h | 4 ++++
> >> lib/librte_vhost/vhost.h | 9 ++++++++
> >> lib/librte_vhost/vhost_user.c | 40
> >> +++++++++++++++++++++++++++++++++++
> >> lib/librte_vhost/vhost_user.h | 6 ++++--
> >> 4 files changed, 57 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
> >> index 5c72fba797..b4b7aa1928 100644
> >> --- a/lib/librte_vhost/rte_vhost.h
> >> +++ b/lib/librte_vhost/rte_vhost.h
> >> @@ -85,6 +85,10 @@ extern "C" {
> >> #define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12
> >> #endif
> >>
> >> +#ifndef VHOST_USER_PROTOCOL_F_STATUS
> >> +#define VHOST_USER_PROTOCOL_F_STATUS 15
> >> +#endif
> >> +
> >> /** Indicate whether protocol features negotiation is supported. */
> >> #ifndef VHOST_USER_F_PROTOCOL_FEATURES
> >> #define VHOST_USER_F_PROTOCOL_FEATURES 30
> >> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
> >> index df98d15de6..9a9c0a98f5 100644
> >> --- a/lib/librte_vhost/vhost.h
> >> +++ b/lib/librte_vhost/vhost.h
> >> @@ -202,6 +202,14 @@ struct vhost_virtqueue {
> >> 	TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list;
> >> } __rte_cache_aligned;
> >>
> >> +/* Virtio device status as per Virtio specification */
> >> +#define VIRTIO_DEVICE_STATUS_ACK 0x01
> >> +#define VIRTIO_DEVICE_STATUS_DRIVER 0x02
> >> +#define VIRTIO_DEVICE_STATUS_DRIVER_OK 0x04
> >> +#define VIRTIO_DEVICE_STATUS_FEATURES_OK 0x08
> >> +#define VIRTIO_DEVICE_STATUS_DEV_NEED_RESET 0x40
> >> +#define VIRTIO_DEVICE_STATUS_FAILED 0x80
> >> +
> >> /* Old kernels have no such macros defined */
> >> #ifndef VIRTIO_NET_F_GUEST_ANNOUNCE
> >> #define VIRTIO_NET_F_GUEST_ANNOUNCE 21
> >> @@ -364,6 +372,7 @@ struct virtio_net {
> >> uint64_t log_addr;
> >> struct rte_ether_addr mac;
> >> uint16_t mtu;
> >> + uint8_t status;
> >>
> >> struct vhost_device_ops const *notify_ops;
> >>
> >> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> >> index 4a847f368c..e5a44be58d 100644
> >> --- a/lib/librte_vhost/vhost_user.c
> >> +++ b/lib/librte_vhost/vhost_user.c
> >> @@ -87,6 +87,7 @@ static const char *vhost_message_str[VHOST_USER_MAX] = {
> >> [VHOST_USER_POSTCOPY_END] = "VHOST_USER_POSTCOPY_END",
> >> [VHOST_USER_GET_INFLIGHT_FD] = "VHOST_USER_GET_INFLIGHT_FD",
> >> [VHOST_USER_SET_INFLIGHT_FD] = "VHOST_USER_SET_INFLIGHT_FD",
> >> + [VHOST_USER_SET_STATUS] = "VHOST_USER_SET_STATUS",
> >> };
> >>
> >> static int send_vhost_reply(int sockfd, struct VhostUserMsg *msg);
> >> @@ -1328,6 +1329,11 @@ virtio_is_ready(struct virtio_net *dev)
> >> return 0;
> >> }
> >>
> >> + /* If supported, ensure the frontend is really done with config */
> >> +	if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STATUS))
> >> + if (!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK))
> >> + return 0;
> >> +
> >> dev->flags |= VIRTIO_DEV_READY;
> >>
> >> VHOST_LOG_CONFIG(INFO,
> >> @@ -2425,6 +2431,39 @@ vhost_user_postcopy_end(struct virtio_net **pdev, struct VhostUserMsg *msg,
> >> return RTE_VHOST_MSG_RESULT_REPLY;
> >> }
> >>
> >> +static int
> >> +vhost_user_set_status(struct virtio_net **pdev, struct VhostUserMsg *msg,
> >> + int main_fd __rte_unused)
> >> +{
> >> + struct virtio_net *dev = *pdev;
> >> +
> >> + /* As per Virtio specification, the device status is 8bits long */
> >> + if (msg->payload.u64 > UINT8_MAX) {
> >> +		VHOST_LOG_CONFIG(ERR, "Invalid VHOST_USER_SET_STATUS payload 0x%" PRIx64 "\n",
> >> + msg->payload.u64);
> >> + return RTE_VHOST_MSG_RESULT_ERR;
> >> + }
> >> +
> >> + dev->status = msg->payload.u64;
> >> +
> >> + VHOST_LOG_CONFIG(INFO, "New device status(0x%08x):\n"
> >> + "\t-ACKNOWLEDGE: %u\n"
> >> + "\t-DRIVER: %u\n"
> >> + "\t-FEATURES_OK: %u\n"
> >> + "\t-DRIVER_OK: %u\n"
> >> + "\t-DEVICE_NEED_RESET: %u\n"
> >> + "\t-FAILED: %u\n",
> >> + dev->status,
> >> + !!(dev->status & VIRTIO_DEVICE_STATUS_ACK),
> >> + !!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER),
> >> +			!!(dev->status & VIRTIO_DEVICE_STATUS_FEATURES_OK),
> >> +			!!(dev->status & VIRTIO_DEVICE_STATUS_DRIVER_OK),
> >> +			!!(dev->status & VIRTIO_DEVICE_STATUS_DEV_NEED_RESET),
> >> +			!!(dev->status & VIRTIO_DEVICE_STATUS_FAILED));
> >> +
> >> + return RTE_VHOST_MSG_RESULT_OK;
> >
> > I see in your patch for virtio-user SET_STATUS support
> > (http://patchwork.dpdk.org/patch/70677/), the VHOST_USER_SET_STATUS
> > msg may request a reply, but this func does not handle this case. If
> > we don't handle here, vhost_user_msg_handler will return
> RTE_VHOST_MSG_RESULT_ERR later. So should we handle the case here?
> >
> Why should the reply be handled in this function?
> After this function is called, vhost_user_msg_handler() should handle the replies
> in:
>
> if (msg.flags & VHOST_USER_NEED_REPLY) {
> msg.payload.u64 = ret == RTE_VHOST_MSG_RESULT_ERR;
> msg.size = sizeof(msg.payload.u64);
> msg.fd_num = 0;
> send_vhost_reply(fd, &msg);
> } else if (ret == RTE_VHOST_MSG_RESULT_ERR) {
> VHOST_LOG_CONFIG(ERR,
> "vhost message handling failed.\n");
> return -1;
> }
>
> Am I missing something?
You're correct, I missed this one :)
Thanks!
Chenbo
>
>
> > Thanks,
> > Chenbo
> --
> Adrián Moreno
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed
2020-06-18 7:30 ` Maxime Coquelin
@ 2020-06-23 10:42 ` Wang, Xiao W
0 siblings, 0 replies; 35+ messages in thread
From: Wang, Xiao W @ 2020-06-23 10:42 UTC (permalink / raw)
To: Maxime Coquelin, Matan Azrad, Ye, Xiaolong, Shahaf Shuler,
amorenoz, Slava Ovsiienko, dev, Xia, Chenbo, Xu, Rosen, Pei,
Andy
Cc: jasowang, lulu
Hi,
The original issue is with legacy QEMU (e.g. QEMU v2.6, with CentOS 7.2 as the guest, without set_vring_status as an indicator).
For a normal boot, the last 2 messages are set_vring_kick and set_vring_call. Inside the set_vring_kick handling,
virtio_is_ready() will return true (because of that special, very early set_vring_call message). Then
vdpa dev_conf is called, and the fake call fd is used. As a result, the virtio kernel driver in the VM will not
receive interrupts.
+1 for introducing SET_STATUS to make things clearer.
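To illustrate why SET_STATUS makes the readiness check robust against the early dummy call fd, here is a hedged sketch of the gating logic. `is_ready()` is a simplified stand-in for `virtio_is_ready()`, not the actual implementation; the two constants mirror the patch:

```c
#include <stdbool.h>
#include <stdint.h>

#define VHOST_USER_PROTOCOL_F_STATUS		15
#define VIRTIO_DEVICE_STATUS_DRIVER_OK		0x04

/* Sketch: only trust ring-based heuristics for legacy frontends; when
 * VHOST_USER_PROTOCOL_F_STATUS was negotiated, wait for DRIVER_OK so
 * dev_conf is never called while a dummy call fd is still installed. */
static bool
is_ready(uint64_t protocol_features, uint8_t status, bool rings_ok)
{
	if (!rings_ok)
		return false;
	if (protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STATUS))
		return (status & VIRTIO_DEVICE_STATUS_DRIVER_OK) != 0;
	return true; /* legacy frontend: fall back to the ring heuristic */
}
```

With the feature negotiated, rings that look configured are no longer sufficient; the driver must explicitly report DRIVER_OK.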
The IFCVF driver hasn't added support for MQ and the .set_vring_state ops, so there is no need to test that here.
Just curious about the MQ live migration case: on the destination side, will this SET_STATUS message come to vhost-user, and when?
BRs,
Xiao
> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Thursday, June 18, 2020 3:31 PM
> To: Matan Azrad <matan@mellanox.com>; Ye, Xiaolong
> <xiaolong.ye@intel.com>; Shahaf Shuler <shahafs@mellanox.com>;
> amorenoz@redhat.com; Wang, Xiao W <xiao.w.wang@intel.com>; Slava
> Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> Cc: jasowang@redhat.com; lulu@redhat.com
> Subject: Re: [PATCH 9/9] vhost: only use vDPA config workaround if needed
>
>
>
> On 6/18/20 8:39 AM, Matan Azrad wrote:
> > HI Maxime
> >
> > From: Maxime Coquelin:
> >> On 6/17/20 1:04 PM, Matan Azrad wrote:
> >>
> >>>>> Don’t you think that only enabled queues must be fully initialized
> >>>>> when
> >>>> their status is changed from disabled to enabled?
> >>>>> So, you can assume that disabled queues can stay "not fully
> initialized"...
> >>>>
> >>>> That may work but might not be following the Virtio spec as with 1.0
> >>>> we shouldn't process the rings before DRIVER_OK is set (but we cannot
> >>>> be sure we follow it anyway without SET_STATUS support).
> >>>>
> >>>> I propose to cook a patch doing the following:
> >>>> 1. virtio_is_ready() will only ensure the first queue pair is ready
> >>>> (i.e. enabled and configured). Meaning that app's new_device callback
> >>>> and vDPA drivers dev_conf callback will be called with only the first
> >>>> queue pair configured and enabled.
> >>>>
> >>>> 2. Before handling a new vhost-user request, it saves the ready
> >>>> status for every queue pair.
> >>>>
> >>>> 3. Same handling of the requests, except that we won't notify the
> >>>> vdpa driver and the application of vring state changes in the
> >>>> VHOST_USER_SET_VRING_ENABLE handler.
> >>>>
> >>>> 4. Once the Vhost-user request is handled, it compares the new ready
> >>>> status foe every queues with the old one and send queue state event
> >>>> changes accordingly.
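Steps 2 and 4 of the proposal above could look roughly like the following minimal sketch. All names (MAX_QUEUES, queue_is_ready, notify_vring_state, save_ready, sync_ready) are illustrative stand-ins, not the actual DPDK internals:

```c
#include <stdbool.h>

#define MAX_QUEUES 8

static bool cur_ready[MAX_QUEUES];      /* current per-queue ready state */
static int notified_state[MAX_QUEUES];  /* last event sent per queue */

static bool queue_is_ready(int q) { return cur_ready[q]; }

static void notify_vring_state(int q, int enable)
{
	notified_state[q] = enable; /* stand-in for the vDPA/app callback */
}

/* Step 2: snapshot each queue's ready state before handling a request */
static void save_ready(bool old_ready[MAX_QUEUES])
{
	for (int q = 0; q < MAX_QUEUES; q++)
		old_ready[q] = queue_is_ready(q);
}

/* Step 4: after the request is handled, emit state events only for
 * queues whose readiness actually changed */
static void sync_ready(const bool old_ready[MAX_QUEUES])
{
	for (int q = 0; q < MAX_QUEUES; q++) {
		bool now = queue_is_ready(q);
		if (now != old_ready[q])
			notify_vring_state(q, now ? 1 : 0);
	}
}
```

The point of the compare-after pattern is that no vring state change is announced from inside the VHOST_USER_SET_VRING_ENABLE handler itself.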
> >>>
> >>> Looks very nice to me.
> >>
> >> Cool!
> >>
> >>> More points:
> >>> With this method some queues may be configured by the set_vring_state
> >>> operation, so the following calls are expected to be made for each queue
> >>> by the driver from the set_vring_state callback:
> >>> 1. rte_vhost_enable_guest_notification
> >>> This one takes the datapath lock, so we need to be sure that the datapath
> >>> lock is not held on this queue by the same caller thread (maybe do not
> >>> take datapath locks at all when vDPA is configured).
> >>
> >> Good point, I agree we shouldn't need to use the access lock when vdpa is
> >> configured. We may want to document that all the control path is
> assumed to
> >> be single thread though.
> >>
> >>
> >>> 2. rte_vhost_host_notifier_ctrl
> >>> This function API is per device and not per queue, maybe we need to
> >> change this function to be per queue (add new for now and deprecate the
> >> old one in 20.11).
> >>
> >> This one is still experimental, so no issue in reworking the API to make it
> per
> >> queue without deprecation notice.
> >>
> >>> 3. Need to be sure that if a ready queue's configuration is changed after
> >>> dev_conf, we notify the driver about it (maybe by
> >>> set_vring_state(disable) and set_vring_state(enable)).
> >>
> >> Agree, I'm not sure yet if we should just toggle set_vring_state as you
> >> proposes, or if we should have a new callback for this.
> >
> > Actually, when the queue configuration is changed, there is a moment when
> > the configuration is not valid (while it is being written).
> > So maybe it makes sense to toggle.
> >
> > But there is one more option:
> >
> > It doesn't make sense that after a configuration change QEMU would not
> > send a VHOST_USER_SET_VRING_ENABLE message.
>
> Agree.
> > So maybe we need to call set_vring_state on the following events:
> > 1. queue becomes ready (enabled and fully configured) - set_vring_state(enable).
> > 2. queue becomes not ready - set_vring_state(disable).
> > 3. queue stays ready and a VHOST_USER_SET_VRING_ENABLE message was handled - set_vring_state(enable).
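The three event rules proposed above could be sketched as a small decision helper. The name, parameters, and return convention (1 = enable, 0 = disable, -1 = no call) are illustrative only, not DPDK API:

```c
#include <stdbool.h>

/* Decide which set_vring_state() event, if any, a queue should get
 * after a vhost-user request was handled. */
static int
vring_state_event(bool was_ready, bool is_ready, bool enable_msg_handled)
{
	if (!was_ready && is_ready)
		return 1;  /* rule 1: queue became ready */
	if (was_ready && !is_ready)
		return 0;  /* rule 2: queue became not ready */
	if (was_ready && is_ready && enable_msg_handled)
		return 1;  /* rule 3: re-announce a possible config change */
	return -1;         /* no state change to report */
}
```

Under these rules the driver must tolerate set_vring_state(enable) on an already-enabled queue, which is exactly the documentation point raised below about calls that signal configuration changes rather than state changes.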
> >
> > Then we need to document that every set_vring_state call may indicate
> > configuration changes in the queue, even if the state itself did not change.
> >
> > What do you think?
>
> I think it is worth a try.
>
> Thanks,
> Maxime
^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2020-06-23 10:42 UTC | newest]
Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-14 8:02 [dpdk-dev] [PATCH (v20.08) 0/9] vhost: improve Vhost/vDPA device init Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 1/9] vhost: fix virtio ready flag check Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 2/9] vhost: refactor Virtio ready check Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 3/9] vdpa/ifc: add support to vDPA queue enable Maxime Coquelin
2020-05-15 8:45 ` Ye Xiaolong
2020-05-15 9:09 ` Jason Wang
2020-05-15 9:42 ` Wang, Xiao W
2020-05-15 10:06 ` Jason Wang
2020-05-15 10:08 ` Jason Wang
2020-05-18 3:09 ` Wang, Xiao W
2020-05-18 3:17 ` Jason Wang
2020-05-14 8:02 ` [dpdk-dev] [PATCH 4/9] vhost: make some vDPA callbacks mandatory Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 5/9] vhost: check vDPA configuration succeed Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 6/9] vhost: add support for virtio status Maxime Coquelin
2020-06-11 2:45 ` Xia, Chenbo
2020-06-16 4:29 ` Xia, Chenbo
2020-06-22 10:18 ` Adrian Moreno
2020-06-22 11:00 ` Xia, Chenbo
2020-05-14 8:02 ` [dpdk-dev] [PATCH 7/9] vdpa/ifc: enable status protocol feature Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 8/9] vdpa/mlx5: " Maxime Coquelin
2020-05-14 8:02 ` [dpdk-dev] [PATCH 9/9] vhost: only use vDPA config workaround if needed Maxime Coquelin
2020-06-07 10:38 ` Matan Azrad
2020-06-08 8:34 ` Maxime Coquelin
2020-06-08 9:19 ` Matan Azrad
2020-06-09 9:04 ` Maxime Coquelin
2020-06-09 11:09 ` Matan Azrad
2020-06-09 11:26 ` Maxime Coquelin
2020-06-09 17:23 ` Maxime Coquelin
2020-06-14 6:08 ` Matan Azrad
2020-06-17 9:39 ` Maxime Coquelin
2020-06-17 11:04 ` Matan Azrad
2020-06-17 12:29 ` Maxime Coquelin
2020-06-18 6:39 ` Matan Azrad
2020-06-18 7:30 ` Maxime Coquelin
2020-06-23 10:42 ` Wang, Xiao W