[dpdk-dev] [PATCH 0/3] Extend vhost to support VFIO based accelerator


* [dpdk-dev] [PATCH 0/3] Extend vhost to support VFIO based accelerator
@ 2018-03-06 10:43 Tiwei Bie
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails Tiwei Bie
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Tiwei Bie @ 2018-03-06 10:43 UTC (permalink / raw)
  To: dev
  Cc: maxime.coquelin, jianfeng.tan, yliu, zhihong.wang, xiao.w.wang,
	cunming.liang, dan.daly, tiwei.bie

This patch set introduces VFIO based accelerator support
for vhost. It adds a new vhost-user protocol feature to better
support vDPA devices at the vhost-user backend, allowing
interrupts and notifications to be delivered directly between
the driver in the guest and the device in the host.

Dependencies:

1. This patch set depends on the below patch set for QEMU:

http://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg06028.html

Some of the enum definitions in this patch set have been
updated for the latest QEMU. A new patch set for QEMU will
be sent out later.

2. This patch set depends on Zhihong's "selective datapath"
   patch set:

http://dpdk.org/ml/archives/dev/2018-March/091858.html

This patch set is generated on the latest master branch of
dpdk-next-virtio with Zhihong's patches applied.

Best regards,
Tiwei Bie

Tiwei Bie (3):
  vhost: do not generate signal when sendmsg fails
  vhost: support sending fds via send_vhost_message()
  vhost: support VFIO based accelerator

 lib/librte_vhost/rte_vhost.h           |  28 ++++++
 lib/librte_vhost/rte_vhost_version.map |   1 +
 lib/librte_vhost/socket.c              |   2 +-
 lib/librte_vhost/vhost_user.c          | 174 ++++++++++++++++++++++++++++++++-
 lib/librte_vhost/vhost_user.h          |   9 ++
 5 files changed, 209 insertions(+), 5 deletions(-)

-- 
2.11.0


* [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails
  2018-03-06 10:43 [dpdk-dev] [PATCH 0/3] Extend vhost to support VFIO based accelerator Tiwei Bie
@ 2018-03-06 10:43 ` Tiwei Bie
  2018-03-29 12:19   ` Maxime Coquelin
  2018-03-29 13:46   ` Maxime Coquelin
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 2/3] vhost: support sending fds via send_vhost_message() Tiwei Bie
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Tiwei Bie @ 2018-03-06 10:43 UTC (permalink / raw)
  To: dev
  Cc: maxime.coquelin, jianfeng.tan, yliu, zhihong.wang, xiao.w.wang,
	cunming.liang, dan.daly, tiwei.bie

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
 lib/librte_vhost/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 0354740fa..d703d2114 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -181,7 +181,7 @@ send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num)
 	}
 
 	do {
-		ret = sendmsg(sockfd, &msgh, 0);
+		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
 	} while (ret < 0 && errno == EINTR);
 
 	if (ret < 0) {
-- 
2.11.0


* [dpdk-dev] [PATCH 2/3] vhost: support sending fds via send_vhost_message()
  2018-03-06 10:43 [dpdk-dev] [PATCH 0/3] Extend vhost to support VFIO based accelerator Tiwei Bie
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails Tiwei Bie
@ 2018-03-06 10:43 ` Tiwei Bie
  2018-03-29 12:23   ` Maxime Coquelin
  2018-03-29 12:27   ` Maxime Coquelin
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 3/3] vhost: support VFIO based accelerator Tiwei Bie
  2018-04-18  5:49 ` [dpdk-dev] [PATCH v2] " Tiwei Bie
  3 siblings, 2 replies; 14+ messages in thread
From: Tiwei Bie @ 2018-03-06 10:43 UTC (permalink / raw)
  To: dev
  Cc: maxime.coquelin, jianfeng.tan, yliu, zhihong.wang, xiao.w.wang,
	cunming.liang, dan.daly, tiwei.bie

This function will be used to send fds to QEMU via the slave channel.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
 lib/librte_vhost/vhost_user.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 8b07b6c43..e3a1dfbfb 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -1308,13 +1308,13 @@ read_vhost_message(int sockfd, struct VhostUserMsg *msg)
 }
 
 static int
-send_vhost_message(int sockfd, struct VhostUserMsg *msg)
+send_vhost_message(int sockfd, struct VhostUserMsg *msg, int *fds, int fd_num)
 {
 	if (!msg)
 		return 0;
 
 	return send_fd_message(sockfd, (char *)msg,
-		VHOST_USER_HDR_SIZE + msg->size, NULL, 0);
+		VHOST_USER_HDR_SIZE + msg->size, fds, fd_num);
 }
 
 static int
@@ -1328,7 +1328,7 @@ send_vhost_reply(int sockfd, struct VhostUserMsg *msg)
 	msg->flags |= VHOST_USER_VERSION;
 	msg->flags |= VHOST_USER_REPLY_MASK;
 
-	return send_vhost_message(sockfd, msg);
+	return send_vhost_message(sockfd, msg, NULL, 0);
 }
 
 /*
@@ -1643,7 +1643,7 @@ vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm)
 		},
 	};
 
-	ret = send_vhost_message(dev->slave_req_fd, &msg);
+	ret = send_vhost_message(dev->slave_req_fd, &msg, NULL, 0);
 	if (ret < 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 				"Failed to send IOTLB miss message (%d)\n",
-- 
2.11.0
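
For readers unfamiliar with how file descriptors travel over a Unix socket, the following is a minimal sketch of the mechanism that a helper such as send_fd_message() relies on: the descriptors ride along as SCM_RIGHTS ancillary data next to the normal payload. The names send_with_fds(), recv_with_fd() and MAX_FDS are hypothetical and exist only for this illustration; this is not the librte_vhost implementation.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_FDS 8	/* illustrative limit, not a librte_vhost constant */

/* Send buf plus up to MAX_FDS descriptors as SCM_RIGHTS ancillary data. */
static int
send_with_fds(int sockfd, const void *buf, size_t len,
	      const int *fds, int fd_num)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	union {
		char buf[CMSG_SPACE(MAX_FDS * sizeof(int))];
		struct cmsghdr align;	/* forces correct alignment */
	} u;
	struct msghdr msgh;
	struct cmsghdr *cmsg;
	ssize_t ret;

	if (fd_num < 0 || fd_num > MAX_FDS)
		return -EINVAL;

	memset(&msgh, 0, sizeof(msgh));
	memset(&u, 0, sizeof(u));
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;

	if (fds != NULL && fd_num > 0) {
		size_t fdsize = (size_t)fd_num * sizeof(int);

		msgh.msg_control = u.buf;
		msgh.msg_controllen = CMSG_SPACE(fdsize);
		cmsg = CMSG_FIRSTHDR(&msgh);
		cmsg->cmsg_level = SOL_SOCKET;
		cmsg->cmsg_type = SCM_RIGHTS;
		cmsg->cmsg_len = CMSG_LEN(fdsize);
		memcpy(CMSG_DATA(cmsg), fds, fdsize);
	}

	do {
		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
	} while (ret < 0 && errno == EINTR);

	return ret < 0 ? -errno : (int)ret;
}

/* Receive buf and at most one descriptor; *fd is -1 if none arrived. */
static int
recv_with_fd(int sockfd, void *buf, size_t len, int *fd)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msgh;
	struct cmsghdr *cmsg;
	ssize_t ret;

	assert(fd != NULL);
	memset(&msgh, 0, sizeof(msgh));
	memset(&u, 0, sizeof(u));
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;
	msgh.msg_control = u.buf;
	msgh.msg_controllen = sizeof(u.buf);

	*fd = -1;
	ret = recvmsg(sockfd, &msgh, 0);
	if (ret < 0)
		return -errno;

	for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg != NULL;
	     cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
		if (cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SCM_RIGHTS) {
			memcpy(fd, CMSG_DATA(cmsg), sizeof(int));
			break;
		}
	}

	return (int)ret;
}
```

Passing fds == NULL and fd_num == 0 sends the payload alone, which mirrors how the existing callers in this patch pass NULL, 0 to send_vhost_message().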


* [dpdk-dev] [PATCH 3/3] vhost: support VFIO based accelerator
  2018-03-06 10:43 [dpdk-dev] [PATCH 0/3] Extend vhost to support VFIO based accelerator Tiwei Bie
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails Tiwei Bie
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 2/3] vhost: support sending fds via send_vhost_message() Tiwei Bie
@ 2018-03-06 10:43 ` Tiwei Bie
  2018-03-06 14:24   ` Maxime Coquelin
  2018-04-18  5:49 ` [dpdk-dev] [PATCH v2] " Tiwei Bie
  3 siblings, 1 reply; 14+ messages in thread
From: Tiwei Bie @ 2018-03-06 10:43 UTC (permalink / raw)
  To: dev
  Cc: maxime.coquelin, jianfeng.tan, yliu, zhihong.wang, xiao.w.wang,
	cunming.liang, dan.daly, tiwei.bie

This commit adds VFIO based accelerator support to
vhost. A new API is provided to ask QEMU to do further
setup so that notifications and interrupts can be
delivered directly between the driver in the guest
and the vDPA device in the host.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
 lib/librte_vhost/rte_vhost.h           |  28 ++++++
 lib/librte_vhost/rte_vhost_version.map |   1 +
 lib/librte_vhost/vhost_user.c          | 166 +++++++++++++++++++++++++++++++++
 lib/librte_vhost/vhost_user.h          |   9 ++
 4 files changed, 204 insertions(+)

diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index d5589c543..68842e908 100644
--- a/lib/librte_vhost/rte_vhost.h
+++ b/lib/librte_vhost/rte_vhost.h
@@ -35,6 +35,7 @@ extern "C" {
 #define RTE_VHOST_USER_PROTOCOL_F_REPLY_ACK	3
 #define RTE_VHOST_USER_PROTOCOL_F_NET_MTU	4
 #define RTE_VHOST_USER_PROTOCOL_F_SLAVE_REQ	5
+#define RTE_VHOST_USER_PROTOCOL_F_VFIO		8
 #define RTE_VHOST_USER_F_PROTOCOL_FEATURES	30
 
 /**
@@ -591,6 +592,33 @@ rte_vhost_get_vdpa_eid(int vid);
 int __rte_experimental
 rte_vhost_get_vdpa_did(int vid);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Enable or disable the VFIO based accelerator for vhost-user.
+ *
+ * This function asks QEMU to do further setup to better
+ * support the vDPA device at the vhost-user backend. With this
+ * setup, notifications and interrupts are delivered directly
+ * between the driver in the guest and the vDPA device in the
+ * host, if the platform supports e.g. EPT and posted interrupts.
+ * It is nice to have, but not mandatory.
+ *
+ * @param vid
+ *  vhost device ID
+ * @param enable
+ *  Enable or disable
+ *
+ * @return
+ *   0: success
+ *   -ENODEV: no such vhost device
+ *   -ENOTSUP: device does not support VFIO based accelerator feature
+ *   -EINVAL: there is no accelerator assigned to this vhost device
+ *   -EFAULT: failed to talk with QEMU
+ */
+int rte_vhost_vfio_accelerator_ctrl(int vid, int enable);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map
index 36257e51b..ca970170f 100644
--- a/lib/librte_vhost/rte_vhost_version.map
+++ b/lib/librte_vhost/rte_vhost_version.map
@@ -72,6 +72,7 @@ EXPERIMENTAL {
 	rte_vhost_set_vring_base;
 	rte_vhost_get_vdpa_eid;
 	rte_vhost_get_vdpa_did;
+	rte_vhost_vfio_accelerator_ctrl;
 	rte_vdpa_register_engine;
 	rte_vdpa_unregister_engine;
 	rte_vdpa_find_engine_id;
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index e3a1dfbfb..a65598d80 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -35,6 +35,7 @@
 #include <rte_common.h>
 #include <rte_malloc.h>
 #include <rte_log.h>
+#include <rte_vhost.h>
 
 #include "iotlb.h"
 #include "vhost.h"
@@ -1628,6 +1629,27 @@ vhost_user_msg_handler(int vid, int fd)
 	return 0;
 }
 
+static int process_slave_message_reply(struct virtio_net *dev,
+				       const VhostUserMsg *msg)
+{
+	VhostUserMsg msg_reply;
+
+	if ((msg->flags & VHOST_USER_NEED_REPLY) == 0)
+		return 0;
+
+	if (read_vhost_message(dev->slave_req_fd, &msg_reply) < 0)
+		return -1;
+
+	if (msg_reply.request.slave != msg->request.slave) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"received unexpected msg type (%u), expected %u\n",
+			msg_reply.request.slave, msg->request.slave);
+		return -1;
+	}
+
+	return msg_reply.payload.u64 ? -1 : 0;
+}
+
 int
 vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm)
 {
@@ -1653,3 +1675,147 @@ vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm)
 
 	return 0;
 }
+
+static int vhost_user_slave_set_vring_file(struct virtio_net *dev,
+					   uint32_t request,
+					   struct vhost_vring_file *file)
+{
+	int *fdp = NULL;
+	size_t fd_num = 0;
+	int ret;
+	struct VhostUserMsg msg = {
+		.request.slave = request,
+		.flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY,
+		.payload.u64 = file->index & VHOST_USER_VRING_IDX_MASK,
+		.size = sizeof(msg.payload.u64),
+	};
+
+	if (file->fd < 0)
+		msg.payload.u64 |= VHOST_USER_VRING_NOFD_MASK;
+	else {
+		fdp = &file->fd;
+		fd_num = 1;
+	}
+
+	ret = send_vhost_message(dev->slave_req_fd, &msg, fdp, fd_num);
+	if (ret < 0) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to send slave message %u (%d)\n",
+			request, ret);
+		return ret;
+	}
+
+	return process_slave_message_reply(dev, &msg);
+}
+
+static int vhost_user_slave_set_vring_notify_area(struct virtio_net *dev,
+						  int index, int fd,
+						  uint64_t offset,
+						  uint64_t size)
+{
+	int *fdp = NULL;
+	size_t fd_num = 0;
+	int ret;
+	struct VhostUserMsg msg = {
+		.request.slave = VHOST_USER_SLAVE_VRING_NOTIFY_AREA_MSG,
+		.flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY,
+		.payload.area = {
+			.u64 = index & VHOST_USER_VRING_IDX_MASK,
+			.size = size,
+			.offset = offset,
+		},
+		.size = sizeof(msg.payload.area),
+	};
+
+	if (fd < 0)
+		msg.payload.area.u64 |= VHOST_USER_VRING_NOFD_MASK;
+	else {
+		fdp = &fd;
+		fd_num = 1;
+	}
+
+	ret = send_vhost_message(dev->slave_req_fd, &msg, fdp, fd_num);
+	if (ret < 0) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to set vring notify area (%d)\n", ret);
+		return ret;
+	}
+
+	return process_slave_message_reply(dev, &msg);
+}
+
+int __rte_experimental
+rte_vhost_vfio_accelerator_ctrl(int vid, int enable)
+{
+	struct virtio_net *dev = get_device(vid);
+	int groupfd, devicefd, eid, ret = 0;
+	struct rte_vdpa_eng_driver *drv;
+	struct vhost_vring_file file;
+	uint64_t offset, size;
+	unsigned int i;
+
+	if (!dev)
+		return -ENODEV;
+
+	eid = dev->eid;
+	if (eid < 0)
+		return -EINVAL;
+
+	if (!(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ||
+	    !(dev->features & (1ULL << RTE_VHOST_USER_F_PROTOCOL_FEATURES)) ||
+	    !(dev->protocol_features &
+			(1ULL << RTE_VHOST_USER_PROTOCOL_F_VFIO)))
+		return -ENOTSUP;
+
+	drv = vdpa_engines[eid]->eng_drv;
+
+	RTE_FUNC_PTR_OR_ERR_RET(drv->dev_ops.get_vfio_device_fd, -ENOTSUP);
+	RTE_FUNC_PTR_OR_ERR_RET(drv->dev_ops.get_vfio_group_fd, -ENOTSUP);
+	RTE_FUNC_PTR_OR_ERR_RET(drv->dev_ops.get_notify_area, -ENOTSUP);
+
+	devicefd = drv->dev_ops.get_vfio_device_fd(vid);
+	if (devicefd < 0)
+		return -ENOTSUP;
+
+	groupfd = drv->dev_ops.get_vfio_group_fd(vid);
+	if (groupfd < 0)
+		return -ENOTSUP;
+
+	if (enable) {
+		for (i = 0; i < dev->nr_vring * 2; i++) {
+			file.index = i;
+			file.fd = groupfd;
+
+			if (drv->dev_ops.get_notify_area(vid, i, &offset,
+					&size) < 0) {
+				ret = -ENOTSUP;
+				goto disable;
+			}
+
+			if (vhost_user_slave_set_vring_file(dev,
+					VHOST_USER_SLAVE_VRING_VFIO_GROUP_MSG,
+					&file) < 0) {
+				ret = -EFAULT;
+				goto disable;
+			}
+			if (vhost_user_slave_set_vring_notify_area(dev, i,
+					devicefd, offset, size) < 0) {
+				ret = -EFAULT;
+				goto disable;
+			}
+		}
+	} else {
+disable:
+		for (i = 0; i < dev->nr_vring * 2; i++) {
+			file.index = i;
+			file.fd = -1;
+			vhost_user_slave_set_vring_file(dev,
+					VHOST_USER_SLAVE_VRING_VFIO_GROUP_MSG,
+					&file);
+			vhost_user_slave_set_vring_notify_area(dev, i, -1,
+					0, 0);
+		}
+	}
+
+	return ret;
+}
diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 066e772dd..c74d288d4 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -52,6 +52,8 @@ typedef enum VhostUserRequest {
 typedef enum VhostUserSlaveRequest {
 	VHOST_USER_SLAVE_NONE = 0,
 	VHOST_USER_SLAVE_IOTLB_MSG = 1,
+	VHOST_USER_SLAVE_VRING_VFIO_GROUP_MSG = 3,
+	VHOST_USER_SLAVE_VRING_NOTIFY_AREA_MSG = 4,
 	VHOST_USER_SLAVE_MAX
 } VhostUserSlaveRequest;
 
@@ -73,6 +75,12 @@ typedef struct VhostUserLog {
 	uint64_t mmap_offset;
 } VhostUserLog;
 
+typedef struct VhostUserVringArea {
+	uint64_t u64;
+	uint64_t size;
+	uint64_t offset;
+} VhostUserVringArea;
+
 typedef struct VhostUserMsg {
 	union {
 		uint32_t master; /* a VhostUserRequest value */
@@ -93,6 +101,7 @@ typedef struct VhostUserMsg {
 		VhostUserMemory memory;
 		VhostUserLog    log;
 		struct vhost_iotlb_msg iotlb;
+		VhostUserVringArea area;
 	} payload;
 	int fds[VHOST_MEMORY_MAX_NREGIONS];
 } __attribute((packed)) VhostUserMsg;
-- 
2.11.0
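
As a reading aid for the two slave requests added in this patch, here is a small sketch of how the u64 payload field carries both the vring index and the "no fd attached" flag. The two mask values are assumptions taken to match lib/librte_vhost/vhost_user.h, since they are not shown in this diff, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match lib/librte_vhost/vhost_user.h; not part of this diff. */
#define VHOST_USER_VRING_IDX_MASK	0xff
#define VHOST_USER_VRING_NOFD_MASK	(0x1 << 8)

/* Build the u64 payload: vring index in bits 0-7, NOFD flag in bit 8. */
static uint64_t
encode_vring_u64(unsigned int index, int fd)
{
	uint64_t u64 = index & VHOST_USER_VRING_IDX_MASK;

	if (fd < 0)
		u64 |= VHOST_USER_VRING_NOFD_MASK;

	/* Only bits 0-8 can ever be set by the logic above. */
	assert((u64 & ~0x1ffULL) == 0);
	return u64;
}

/* Recover the vring index on the receiving side. */
static unsigned int
decode_vring_index(uint64_t u64)
{
	return (unsigned int)(u64 & VHOST_USER_VRING_IDX_MASK);
}

/* Non-zero when the sender attached a file descriptor. */
static int
vring_has_fd(uint64_t u64)
{
	return (u64 & VHOST_USER_VRING_NOFD_MASK) == 0;
}
```

With these assumed masks, the disable path in rte_vhost_vfio_accelerator_ctrl(), which passes fd = -1 for every ring, sends the ring index with bit 8 set and no ancillary descriptor.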


* Re: [dpdk-dev] [PATCH 3/3] vhost: support VFIO based accelerator
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 3/3] vhost: support VFIO based accelerator Tiwei Bie
@ 2018-03-06 14:24   ` Maxime Coquelin
  2018-03-07  8:59     ` Tiwei Bie
  0 siblings, 1 reply; 14+ messages in thread
From: Maxime Coquelin @ 2018-03-06 14:24 UTC (permalink / raw)
  To: Tiwei Bie, dev
  Cc: jianfeng.tan, yliu, zhihong.wang, xiao.w.wang, cunming.liang, dan.daly



On 03/06/2018 11:43 AM, Tiwei Bie wrote:
> This commit adds the VFIO based accelerator support to
> vhost. A new API is provided to support asking QEMU to
> do further setup to allow notifications and interrupts
> being delivered directly between the driver in guest
> and the vDPA device in host.
> 
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> ---
>   lib/librte_vhost/rte_vhost.h           |  28 ++++++
>   lib/librte_vhost/rte_vhost_version.map |   1 +
>   lib/librte_vhost/vhost_user.c          | 166 +++++++++++++++++++++++++++++++++
>   lib/librte_vhost/vhost_user.h          |   9 ++
>   4 files changed, 204 insertions(+)
> 
> diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
> index d5589c543..68842e908 100644
> --- a/lib/librte_vhost/rte_vhost.h
> +++ b/lib/librte_vhost/rte_vhost.h
> @@ -35,6 +35,7 @@ extern "C" {
>   #define RTE_VHOST_USER_PROTOCOL_F_REPLY_ACK	3
>   #define RTE_VHOST_USER_PROTOCOL_F_NET_MTU	4
>   #define RTE_VHOST_USER_PROTOCOL_F_SLAVE_REQ	5
> +#define RTE_VHOST_USER_PROTOCOL_F_VFIO		8
>   #define RTE_VHOST_USER_F_PROTOCOL_FEATURES	30
>   
>   /**
> @@ -591,6 +592,33 @@ rte_vhost_get_vdpa_eid(int vid);
>   int __rte_experimental
>   rte_vhost_get_vdpa_did(int vid);
>   
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + *
> + * Enable or disable the VFIO based accelerator for vhost-user.
> + *
> + * This function is to ask QEMU to do further setup to better
> + * support the vDPA device at vhost user backend. With this
> + * setup, the notifications and interrupts will be delivered
> + * directly between the driver in guest and the vDPA device
> + * in host if platform supports e.g. EPT and Posted interrupt.
> + * It's nice to have, and not mandatory.
> + *
> + * @param vid
> + *  vhost device ID
> + * @param int
> + *  Enable or disable
> + *
> + * @return
> + *   0: success
> + *   -ENODEV: no such vhost device
> + *   -ENOTSUP: device does not support VFIO based accelerator feature
> + *   -EINVAL: there is no accelerator assigned to this vhost device
> + *   -EFAULT: failed to talk with QEMU
> + */
> +int rte_vhost_vfio_accelerator_ctrl(int vid, int enable);
> +
>   #ifdef __cplusplus
>   }
>   #endif
> diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map
> index 36257e51b..ca970170f 100644
> --- a/lib/librte_vhost/rte_vhost_version.map
> +++ b/lib/librte_vhost/rte_vhost_version.map
> @@ -72,6 +72,7 @@ EXPERIMENTAL {
>   	rte_vhost_set_vring_base;
>   	rte_vhost_get_vdpa_eid;
>   	rte_vhost_get_vdpa_did;
> +	rte_vhost_vfio_accelerator_ctrl;
>   	rte_vdpa_register_engine;
>   	rte_vdpa_unregister_engine;
>   	rte_vdpa_find_engine_id;
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index e3a1dfbfb..a65598d80 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -35,6 +35,7 @@
>   #include <rte_common.h>
>   #include <rte_malloc.h>
>   #include <rte_log.h>
> +#include <rte_vhost.h>
>   
>   #include "iotlb.h"
>   #include "vhost.h"
> @@ -1628,6 +1629,27 @@ vhost_user_msg_handler(int vid, int fd)
>   	return 0;
>   }
>   
> +static int process_slave_message_reply(struct virtio_net *dev,
> +				       const VhostUserMsg *msg)
> +{
> +	VhostUserMsg msg_reply;
> +
> +	if ((msg->flags & VHOST_USER_NEED_REPLY) == 0)
> +		return 0;
> +
> +	if (read_vhost_message(dev->slave_req_fd, &msg_reply) < 0)
> +		return -1;
> +
> +	if (msg_reply.request.slave != msg->request.slave) {
> +		RTE_LOG(ERR, VHOST_CONFIG,
> +			"received unexpected msg type (%u), expected %u\n",
> +			msg_reply.request.slave, msg->request.slave);
> +		return -1;
> +	}
> +
> +	return msg_reply.payload.u64;
> +}
> +
>   int
>   vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm)
>   {
> @@ -1653,3 +1675,147 @@ vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm)
>   
>   	return 0;
>   }
> +
> +static int vhost_user_slave_set_vring_file(struct virtio_net *dev,
> +					   uint32_t request,
> +					   struct vhost_vring_file *file)
Why pass the request as an argument?
It seems to be called only with the same request ID.

> +{
> +	int *fdp = NULL;
> +	size_t fd_num = 0;
> +	int ret;
> +	struct VhostUserMsg msg = {
> +		.request.slave = request,
> +		.flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY,
> +		.payload.u64 = file->index & VHOST_USER_VRING_IDX_MASK,
> +		.size = sizeof(msg.payload.u64),
> +	};
> +
> +	if (file->fd < 0)
> +		msg.payload.u64 |= VHOST_USER_VRING_NOFD_MASK;
> +	else {
> +		fdp = &file->fd;
> +		fd_num = 1;
> +	}
> +
> +	ret = send_vhost_message(dev->slave_req_fd, &msg, fdp, fd_num);
> +	if (ret < 0) {
> +		RTE_LOG(ERR, VHOST_CONFIG,
> +			"Failed to send slave message %u (%d)\n",
> +			request, ret);
> +		return ret;
> +	}
> +
> +	return process_slave_message_reply(dev, &msg);

Maybe not needed right now, but we'll need a lock to prevent concurrent
threads from sending requests and waiting for replies at the same time.

Thanks,
Maxime


* Re: [dpdk-dev] [PATCH 3/3] vhost: support VFIO based accelerator
  2018-03-06 14:24   ` Maxime Coquelin
@ 2018-03-07  8:59     ` Tiwei Bie
  0 siblings, 0 replies; 14+ messages in thread
From: Tiwei Bie @ 2018-03-07  8:59 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: dev, jianfeng.tan, yliu, zhihong.wang, xiao.w.wang,
	cunming.liang, dan.daly

On Tue, Mar 06, 2018 at 03:24:27PM +0100, Maxime Coquelin wrote:
> On 03/06/2018 11:43 AM, Tiwei Bie wrote:
[...]
> > +
> > +static int vhost_user_slave_set_vring_file(struct virtio_net *dev,
> > +					   uint32_t request,
> > +					   struct vhost_vring_file *file)
> Why passing the request as an argument?
> It seems to be called only with the same request ID.

I thought there might be other requests that also need to
send a file descriptor for a ring in the future, so I
made this a common routine. Maybe it's not really helpful.
I won't pass the request as an argument in the next version.

> 
> > +{
> > +	int *fdp = NULL;
> > +	size_t fd_num = 0;
> > +	int ret;
> > +	struct VhostUserMsg msg = {
> > +		.request.slave = request,
> > +		.flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY,
> > +		.payload.u64 = file->index & VHOST_USER_VRING_IDX_MASK,
> > +		.size = sizeof(msg.payload.u64),
> > +	};
> > +
> > +	if (file->fd < 0)
> > +		msg.payload.u64 |= VHOST_USER_VRING_NOFD_MASK;
> > +	else {
> > +		fdp = &file->fd;
> > +		fd_num = 1;
> > +	}
> > +
> > +	ret = send_vhost_message(dev->slave_req_fd, &msg, fdp, fd_num);
> > +	if (ret < 0) {
> > +		RTE_LOG(ERR, VHOST_CONFIG,
> > +			"Failed to send slave message %u (%d)\n",
> > +			request, ret);
> > +		return ret;
> > +	}
> > +
> > +	return process_slave_message_reply(dev, &msg);
> 
> Maybe not needed right now, but we'll need a lock to avoid concurrent
> requests sending and waiting for reply.

Yeah, we probably need a lock for each slave channel. I haven't
checked the Linux code, but it may cause problems when two
threads send, for example, the messages below at the same time:

thread A:
 IOTLB miss message

thread B:
 VFIO group message which has a file descriptor
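
The locking idea discussed above can be sketched as follows: one mutex per slave channel, held across both the send and the wait for the reply, so that a second thread cannot interleave its own request or consume the first thread's reply. This is only an illustration under simplifying assumptions (fixed-size 64-bit request and reply, a pthread mutex); struct slave_channel and slave_channel_request() are hypothetical names, not librte_vhost code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-device slave channel state. */
struct slave_channel {
	int fd;
	pthread_mutex_t lock;	/* serializes request/reply pairs */
};

/*
 * Send one request and wait for its reply while holding the channel
 * lock, so the pair is atomic with respect to other threads.
 */
static int
slave_channel_request(struct slave_channel *ch, uint64_t req,
		      uint64_t *reply)
{
	ssize_t n;
	int ret = 0;

	assert(ch != NULL && reply != NULL);
	pthread_mutex_lock(&ch->lock);

	n = send(ch->fd, &req, sizeof(req), MSG_NOSIGNAL);
	if (n != (ssize_t)sizeof(req)) {
		ret = n < 0 ? -errno : -EIO;
		goto out;
	}

	n = recv(ch->fd, reply, sizeof(*reply), MSG_WAITALL);
	if (n != (ssize_t)sizeof(*reply))
		ret = n < 0 ? -errno : -EIO;
out:
	pthread_mutex_unlock(&ch->lock);
	return ret;
}
```

A message that does not request a reply would still need to take the same lock around its send, otherwise it could land between another thread's request and reply.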

Thanks for the comments! :)

Best regards,
Tiwei Bie


* Re: [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails Tiwei Bie
@ 2018-03-29 12:19   ` Maxime Coquelin
  2018-03-29 13:25     ` Tiwei Bie
  2018-03-29 13:46   ` Maxime Coquelin
  1 sibling, 1 reply; 14+ messages in thread
From: Maxime Coquelin @ 2018-03-29 12:19 UTC (permalink / raw)
  To: Tiwei Bie, dev
  Cc: jianfeng.tan, yliu, zhihong.wang, xiao.w.wang, cunming.liang, dan.daly

Hi Tiwei,

On 03/06/2018 11:43 AM, Tiwei Bie wrote:
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>

Could you please elaborate a bit more on why this is needed?
Is it fixing a real issue or just an improvement?

Thanks!
Maxime
> ---
>   lib/librte_vhost/socket.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
> index 0354740fa..d703d2114 100644
> --- a/lib/librte_vhost/socket.c
> +++ b/lib/librte_vhost/socket.c
> @@ -181,7 +181,7 @@ send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num)
>   	}
>   
>   	do {
> -		ret = sendmsg(sockfd, &msgh, 0);
> +		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
>   	} while (ret < 0 && errno == EINTR);
>   
>   	if (ret < 0) {
> 


* Re: [dpdk-dev] [PATCH 2/3] vhost: support sending fds via send_vhost_message()
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 2/3] vhost: support sending fds via send_vhost_message() Tiwei Bie
@ 2018-03-29 12:23   ` Maxime Coquelin
  2018-03-29 12:27   ` Maxime Coquelin
  1 sibling, 0 replies; 14+ messages in thread
From: Maxime Coquelin @ 2018-03-29 12:23 UTC (permalink / raw)
  To: Tiwei Bie, dev
  Cc: jianfeng.tan, yliu, zhihong.wang, xiao.w.wang, cunming.liang, dan.daly


On 03/06/2018 11:43 AM, Tiwei Bie wrote:
> This function will be used to send fds to QEMU via slave channel.
> 
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime

> ---
>   lib/librte_vhost/vhost_user.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index 8b07b6c43..e3a1dfbfb 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -1308,13 +1308,13 @@ read_vhost_message(int sockfd, struct VhostUserMsg *msg)
>   }
>   
>   static int
> -send_vhost_message(int sockfd, struct VhostUserMsg *msg)
> +send_vhost_message(int sockfd, struct VhostUserMsg *msg, int *fds, int fd_num)
>   {
>   	if (!msg)
>   		return 0;
>   
>   	return send_fd_message(sockfd, (char *)msg,
> -		VHOST_USER_HDR_SIZE + msg->size, NULL, 0);
> +		VHOST_USER_HDR_SIZE + msg->size, fds, fd_num);
>   }
>   
>   static int
> @@ -1328,7 +1328,7 @@ send_vhost_reply(int sockfd, struct VhostUserMsg *msg)
>   	msg->flags |= VHOST_USER_VERSION;
>   	msg->flags |= VHOST_USER_REPLY_MASK;
>   
> -	return send_vhost_message(sockfd, msg);
> +	return send_vhost_message(sockfd, msg, NULL, 0);
>   }
>   
>   /*
> @@ -1643,7 +1643,7 @@ vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm)
>   		},
>   	};
>   
> -	ret = send_vhost_message(dev->slave_req_fd, &msg);
> +	ret = send_vhost_message(dev->slave_req_fd, &msg, NULL, 0);
>   	if (ret < 0) {
>   		RTE_LOG(ERR, VHOST_CONFIG,
>   				"Failed to send IOTLB miss message (%d)\n",
> 


* Re: [dpdk-dev] [PATCH 2/3] vhost: support sending fds via send_vhost_message()
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 2/3] vhost: support sending fds via send_vhost_message() Tiwei Bie
  2018-03-29 12:23   ` Maxime Coquelin
@ 2018-03-29 12:27   ` Maxime Coquelin
  1 sibling, 0 replies; 14+ messages in thread
From: Maxime Coquelin @ 2018-03-29 12:27 UTC (permalink / raw)
  To: Tiwei Bie, dev
  Cc: jianfeng.tan, yliu, zhihong.wang, xiao.w.wang, cunming.liang, dan.daly



On 03/06/2018 11:43 AM, Tiwei Bie wrote:
> This function will be used to send fds to QEMU via slave channel.
> 
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> ---
>   lib/librte_vhost/vhost_user.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)

Applied to dpdk-next-virtio/master, so this is one less patch
to carry once the QEMU part is accepted.

I'll apply patch 1 once you have provided more info on the reason for
the change. No need to resubmit patch 1; I'll reword the commit message
with your info.

Thanks,
Maxime


* Re: [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails
  2018-03-29 12:19   ` Maxime Coquelin
@ 2018-03-29 13:25     ` Tiwei Bie
  2018-03-29 13:41       ` Maxime Coquelin
  0 siblings, 1 reply; 14+ messages in thread
From: Tiwei Bie @ 2018-03-29 13:25 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: dev, jianfeng.tan, yliu, zhihong.wang, xiao.w.wang,
	cunming.liang, dan.daly

On Thu, Mar 29, 2018 at 02:19:35PM +0200, Maxime Coquelin wrote:
> Hi Tiwei,
> 
> On 03/06/2018 11:43 AM, Tiwei Bie wrote:
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> 
> Could you please elaborate a bit more why this is needed?
> Is it fixing a real issue or just an improvement?

My bad, I really should have written a more useful commit log.

I saw your comments on this mail:

http://dpdk.org/ml/archives/dev/2018-March/094201.html

Thank you so much! :-)

It's fixing an issue I hit when adding the vDPA support.
SIGPIPE is generated when sending messages via a closed
slave fd, and it terminates the process by default. But
as a library, we shouldn't crash the process in this case;
instead we just need to return an error. I didn't hit
this issue without my vDPA related changes, so I didn't
put a Fixes line on it. That is to say, I'm treating it
as an improvement.

Below is the commit log for your reference:

------ START HERE ------

vhost: do not generate signal when sendmsg fails

More precisely, do not generate a SIGPIPE signal if the peer
has closed the connection. Otherwise, it will terminate the
process by default. As a library, we should avoid terminating
the application process when an error happens and should just
return an error.

------ END HERE ------
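
To illustrate the behaviour described in this commit log, here is a minimal sketch assuming a Linux AF_UNIX stream socket pair. With MSG_NOSIGNAL, sending to a peer that has already closed its end fails with EPIPE instead of raising SIGPIPE. The helper names send_nosignal() and send_to_closed_peer() are hypothetical and are not part of librte_vhost.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: send one buffer and report a closed peer as a
 * negative errno value instead of letting SIGPIPE kill the process.
 */
static int
send_nosignal(int sockfd, const void *buf, size_t len)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	struct msghdr msgh;
	ssize_t ret;

	assert(buf != NULL);
	memset(&msgh, 0, sizeof(msgh));
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;

	do {
		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
	} while (ret < 0 && errno == EINTR);

	return ret < 0 ? -errno : (int)ret;
}

/* Create a socket pair, close one end, and try to send on the other. */
static int
send_to_closed_peer(void)
{
	int sv[2];
	int ret;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -errno;

	close(sv[1]);		/* simulate the peer closing its end */
	ret = send_nosignal(sv[0], "x", 1);
	close(sv[0]);

	return ret;
}
```

On Linux, send_to_closed_peer() is expected to return -EPIPE and the calling process keeps running. With flags set to 0 and the default SIGPIPE disposition, the same send would terminate the process.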

Thanks again! :-)

Best regards,
Tiwei Bie

> 
> Thanks!
> Maxime
> > ---
> >   lib/librte_vhost/socket.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
> > index 0354740fa..d703d2114 100644
> > --- a/lib/librte_vhost/socket.c
> > +++ b/lib/librte_vhost/socket.c
> > @@ -181,7 +181,7 @@ send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num)
> >   	}
> >   	do {
> > -		ret = sendmsg(sockfd, &msgh, 0);
> > +		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
> >   	} while (ret < 0 && errno == EINTR);
> >   	if (ret < 0) {
> > 


* Re: [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails
  2018-03-29 13:25     ` Tiwei Bie
@ 2018-03-29 13:41       ` Maxime Coquelin
  2018-03-29 13:46         ` Tiwei Bie
  0 siblings, 1 reply; 14+ messages in thread
From: Maxime Coquelin @ 2018-03-29 13:41 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: dev, jianfeng.tan, yliu, zhihong.wang, xiao.w.wang,
	cunming.liang, dan.daly



On 03/29/2018 03:25 PM, Tiwei Bie wrote:
> On Thu, Mar 29, 2018 at 02:19:35PM +0200, Maxime Coquelin wrote:
>> Hi Tiwei,
>>
>> On 03/06/2018 11:43 AM, Tiwei Bie wrote:
>>> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>>
>> Could you please elaborate a bit more why this is needed?
>> Is it fixing a real issue or just an improvement?
> 
> My bad, I really should write a more useful commit log..
> 
> I saw your comments on this mail:
> 
> http://dpdk.org/ml/archives/dev/2018-March/094201.html
> 
> Thank you so much! :-)
> 
> It's fixing an issue I met when adding the vDPA support.
> SIGPIPE would be generated when sending messages via a
> closed slave fd, and it will terminate the process by
> default. But as a library, we shouldn't crash the process
> in this case, instead we just need to return with an error.
> I didn't meet this issue without my vDPA related changes,
> so I didn't put a fixline on it. That is to say, I'm
> treating it as an improvement.

Great, thanks for the details!
I'll apply the patch with the commit message below.

Maxime
> 
> Below is the commit log for your reference:
> 
> ------ START HERE ------
> 
> vhost: do not generate signal when sendmsg fails
> 
> More precisely, do not generate a SIGPIPE signal if the peer
> has closed the connection. Otherwise, it will terminate the
> process by default. As a library, we should avoid terminating
> the application process when an error happens and should just
> return an error.
> 
> ------ END HERE ------
> 
> Thanks again! :-)
> 
> Best regards,
> Tiwei Bie
> 
>>
>> Thanks!
>> Maxime
>>> ---
>>>    lib/librte_vhost/socket.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
>>> index 0354740fa..d703d2114 100644
>>> --- a/lib/librte_vhost/socket.c
>>> +++ b/lib/librte_vhost/socket.c
>>> @@ -181,7 +181,7 @@ send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num)
>>>    	}
>>>    	do {
>>> -		ret = sendmsg(sockfd, &msgh, 0);
>>> +		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
>>>    	} while (ret < 0 && errno == EINTR);
>>>    	if (ret < 0) {
>>>


* Re: [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails Tiwei Bie
  2018-03-29 12:19   ` Maxime Coquelin
@ 2018-03-29 13:46   ` Maxime Coquelin
  1 sibling, 0 replies; 14+ messages in thread
From: Maxime Coquelin @ 2018-03-29 13:46 UTC (permalink / raw)
  To: Tiwei Bie, dev
  Cc: jianfeng.tan, yliu, zhihong.wang, xiao.w.wang, cunming.liang, dan.daly



On 03/06/2018 11:43 AM, Tiwei Bie wrote:
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> ---
>   lib/librte_vhost/socket.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
> index 0354740fa..d703d2114 100644
> --- a/lib/librte_vhost/socket.c
> +++ b/lib/librte_vhost/socket.c
> @@ -181,7 +181,7 @@ send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num)
>   	}
>   
>   	do {
> -		ret = sendmsg(sockfd, &msgh, 0);
> +		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
>   	} while (ret < 0 && errno == EINTR);
>   
>   	if (ret < 0) {
> 

Applied to dpdk-next-virtio/master with the commit message below:

------------------------------------------------------------
More precisely, do not generate a SIGPIPE signal if the peer
has closed the connection. Otherwise, it will terminate the
process by default. As a library, we should avoid terminating
the application process when an error happens and should just
return an error.
------------------------------------------------------------

Thanks,
Maxime

^ permalink raw reply	[flat|nested] 14+ messages in thread
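
The effect described in the commit message above can be reproduced
outside DPDK. The following is a minimal sketch, assuming Linux and an
AF_UNIX stream socket pair; the helper name send_to_closed_peer() is
hypothetical and is not part of librte_vhost. With MSG_NOSIGNAL, a
sendmsg() to a closed peer fails with EPIPE instead of raising SIGPIPE.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: send one byte to a socket whose peer has
 * already been closed. Returns the errno of the failed sendmsg(),
 * 0 if the send unexpectedly succeeds, or -1 if the socket pair
 * cannot be created.
 */
static int
send_to_closed_peer(void)
{
	int sv[2];
	char byte = 'x';
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr msgh;
	ssize_t ret;
	int err = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* Simulate the peer closing its end of the connection. */
	close(sv[1]);

	memset(&msgh, 0, sizeof(msgh));
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;

	/* Same retry loop as in the patch; no SIGPIPE is generated. */
	do {
		ret = sendmsg(sv[0], &msgh, MSG_NOSIGNAL);
	} while (ret < 0 && errno == EINTR);

	/* Save errno before close() has a chance to overwrite it. */
	if (ret < 0)
		err = errno;

	close(sv[0]);
	return err;
}
```

Passing 0 instead of MSG_NOSIGNAL in the same sketch would deliver
SIGPIPE, which terminates the process unless the application has
changed the signal's disposition.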

* Re: [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails
  2018-03-29 13:41       ` Maxime Coquelin
@ 2018-03-29 13:46         ` Tiwei Bie
  0 siblings, 0 replies; 14+ messages in thread
From: Tiwei Bie @ 2018-03-29 13:46 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: dev, jianfeng.tan, yliu, zhihong.wang, xiao.w.wang,
	cunming.liang, dan.daly

On Thu, Mar 29, 2018 at 03:41:23PM +0200, Maxime Coquelin wrote:
> On 03/29/2018 03:25 PM, Tiwei Bie wrote:
> > On Thu, Mar 29, 2018 at 02:19:35PM +0200, Maxime Coquelin wrote:
> > > Hi Tiwei,
> > > 
> > > On 03/06/2018 11:43 AM, Tiwei Bie wrote:
> > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > 
> > > Could you please elaborate a bit more why this is needed?
> > > Is it fixing a real issue or just an improvement?
> > 
> > My bad, I really should write a more useful commit log..
> > 
> > I saw your comments on this mail:
> > 
> > http://dpdk.org/ml/archives/dev/2018-March/094201.html
> > 
> > Thank you so much! :-)
> > 
> > It's fixing an issue I met when adding the vDPA support.
> > SIGPIPE would be generated when sending messages via a
> > closed slave fd, and it will terminate the process by
> > default. But as a library, we shouldn't crash the process
> > in this case, instead we just need to return with an error.
> > I didn't meet this issue without my vDPA related changes,
> > so I didn't put a fixline on it. That is to say, I'm
> > treating it as an improvement.
> 
> Great, thanks for the details!
> I'll apply the patch with below commit message.

Thank you very much! :-)

Best regards,
Tiwei Bie

> 
> Maxime
> > 
> > Below is the commit log for your reference:
> > 
> > ------ START HERE ------
> > 
> > vhost: do not generate signal when sendmsg fails
> > 
> > More precisely, do not generate a SIGPIPE signal if the peer
> > has closed the connection. Otherwise, it will terminate the
> > process by default. As a library, we should avoid terminating
> > the application process when error happens and just need to
> > return with an error.
> > 
> > ------ END HERE ------
> > 
> > Thanks again! :-)
> > 
> > Best regards,
> > Tiwei Bie
> > 
> > > 
> > > Thanks!
> > > Maxime
> > > > ---
> > > >    lib/librte_vhost/socket.c | 2 +-
> > > >    1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
> > > > index 0354740fa..d703d2114 100644
> > > > --- a/lib/librte_vhost/socket.c
> > > > +++ b/lib/librte_vhost/socket.c
> > > > @@ -181,7 +181,7 @@ send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num)
> > > >    	}
> > > >    	do {
> > > > -		ret = sendmsg(sockfd, &msgh, 0);
> > > > +		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
> > > >    	} while (ret < 0 && errno == EINTR);
> > > >    	if (ret < 0) {
> > > > 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [dpdk-dev] [PATCH v2] vhost: support VFIO based accelerator
  2018-03-06 10:43 [dpdk-dev] [PATCH 0/3] Extend vhost to support VFIO based accelerator Tiwei Bie
                   ` (2 preceding siblings ...)
  2018-03-06 10:43 ` [dpdk-dev] [PATCH 3/3] vhost: support VFIO based accelerator Tiwei Bie
@ 2018-04-18  5:49 ` Tiwei Bie
  3 siblings, 0 replies; 14+ messages in thread
From: Tiwei Bie @ 2018-04-18  5:49 UTC (permalink / raw)
  To: maxime.coquelin, jianfeng.tan, dev
  Cc: dan.daly, cunming.liang, zhihong.wang, xiao.w.wang, tiwei.bie

Enable the VFIO based accelerator support in vhost-user.
When a vDPA device is attached, vhost user will try to
ask QEMU to do further setup to allow the notifications
to be delivered between the driver in the guest and the
vDPA device in the host directly.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
This patch depends on the patch set below for QEMU:
http://lists.nongnu.org/archive/html/qemu-devel/2018-04/msg01779.html

v2:
- Address the changes in vDPA
- Address the changes in QEMU
- Add a lock protection (Maxime)
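
The lock protection can be pictured as a hand-off: a sender that
expects a reply keeps the lock after a successful send, and the
function that reads the reply releases it, so two request/reply
exchanges cannot interleave on the same fd. The following is a
minimal standalone sketch of that pattern, not the patch's code. It
assumes a C11 atomic_flag in place of rte_spinlock_t, and all names
(struct chan, struct req, NEED_REPLY, chan_*) are hypothetical.
Treating a non-zero reply payload as failure is also an assumption.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

#define NEED_REPLY 0x1u /* hypothetical flag value */

struct req {
	uint32_t type;
	uint32_t flags;
	uint64_t payload;
};

struct chan {
	int fd;
	atomic_flag lock; /* stands in for a spinlock */
};

static void
chan_init(struct chan *c, int fd)
{
	c->fd = fd;
	atomic_flag_clear(&c->lock);
}

/* Send a request; on success with NEED_REPLY the lock stays held. */
static int
chan_send(struct chan *c, const struct req *r)
{
	int need_reply = (r->flags & NEED_REPLY) != 0;

	if (need_reply) {
		while (atomic_flag_test_and_set_explicit(&c->lock,
				memory_order_acquire))
			; /* spin until the previous exchange finishes */
	}

	if (send(c->fd, r, sizeof(*r), MSG_NOSIGNAL) !=
			(ssize_t)sizeof(*r)) {
		/* No reply will arrive, so release the lock here. */
		if (need_reply)
			atomic_flag_clear_explicit(&c->lock,
				memory_order_release);
		return -1;
	}

	return 0;
}

/* Read the reply for r and release the lock taken by chan_send(). */
static int
chan_wait_reply(struct chan *c, const struct req *r)
{
	struct req reply;
	int ret = -1;

	if ((r->flags & NEED_REPLY) == 0)
		return 0;

	if (recv(c->fd, &reply, sizeof(reply), 0) ==
			(ssize_t)sizeof(reply) && reply.type == r->type)
		ret = reply.payload ? -1 : 0;

	atomic_flag_clear_explicit(&c->lock, memory_order_release);
	return ret;
}
```

A caller pairs the two functions: chan_send() followed by
chan_wait_reply() on success. Requests without NEED_REPLY never
touch the lock.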

 lib/librte_vhost/rte_vhost.h  |   4 ++
 lib/librte_vhost/vhost.c      |   4 ++
 lib/librte_vhost/vhost.h      |   1 +
 lib/librte_vhost/vhost_user.c | 146 +++++++++++++++++++++++++++++++++++++++++-
 lib/librte_vhost/vhost_user.h |  12 +++-
 5 files changed, 165 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index e4e8824c9..c6d58cc16 100644
--- a/lib/librte_vhost/rte_vhost.h
+++ b/lib/librte_vhost/rte_vhost.h
@@ -58,6 +58,10 @@ extern "C" {
 #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
 #endif
 
+#ifndef VHOST_USER_PROTOCOL_F_HOST_NOTIFIER
+#define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER 10
+#endif
+
 /** Indicate whether protocol features negotiation is supported. */
 #ifndef VHOST_USER_F_PROTOCOL_FEATURES
 #define VHOST_USER_F_PROTOCOL_FEATURES	30
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 5ddf55ed9..0f7326af0 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -285,6 +285,8 @@ vhost_new_device(void)
 	dev->slave_req_fd = -1;
 	dev->vdpa_dev_id = -1;
 
+	rte_spinlock_init(&dev->slave_req_lock);
+
 	return i;
 }
 
@@ -339,6 +341,8 @@ vhost_detach_vdpa_device(int vid)
 	if (dev == NULL)
 		return;
 
+	vhost_user_vfio_accelerator_ctrl(vid, false);
+
 	dev->vdpa_dev_id = -1;
 }
 
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index c9b64461d..d814aed9c 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -288,6 +288,7 @@ struct virtio_net {
 	struct guest_page       *guest_pages;
 
 	int			slave_req_fd;
+	rte_spinlock_t		slave_req_lock;
 
 	/*
 	 * Device id to identify a specific backend device.
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index a3dccf67b..81b2000d1 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -1350,6 +1350,22 @@ send_vhost_reply(int sockfd, struct VhostUserMsg *msg)
 	return send_vhost_message(sockfd, msg, NULL, 0);
 }
 
+static int
+send_vhost_slave_message(struct virtio_net *dev, struct VhostUserMsg *msg,
+			 int *fds, int fd_num)
+{
+	int ret;
+
+	if (msg->flags & VHOST_USER_NEED_REPLY)
+		rte_spinlock_lock(&dev->slave_req_lock);
+
+	ret = send_vhost_message(dev->slave_req_fd, msg, fds, fd_num);
+	if (ret < 0 && (msg->flags & VHOST_USER_NEED_REPLY))
+		rte_spinlock_unlock(&dev->slave_req_lock);
+
+	return ret;
+}
+
 /*
  * Allocate a queue pair if it hasn't been allocated yet
  */
@@ -1671,11 +1687,45 @@ vhost_user_msg_handler(int vid, int fd)
 		if (vdpa_dev->ops->dev_conf)
 			vdpa_dev->ops->dev_conf(dev->vid);
 		dev->flags |= VIRTIO_DEV_VDPA_CONFIGURED;
+		if (vhost_user_vfio_accelerator_ctrl(dev->vid, true) != 0) {
+			RTE_LOG(INFO, VHOST_CONFIG,
+				"(%d) software relay is used for vDPA, performance may be low.\n",
+				dev->vid);
+		}
 	}
 
 	return 0;
 }
 
+static int process_slave_message_reply(struct virtio_net *dev,
+				       const VhostUserMsg *msg)
+{
+	VhostUserMsg msg_reply;
+	int ret;
+
+	if ((msg->flags & VHOST_USER_NEED_REPLY) == 0)
+		return 0;
+
+	if (read_vhost_message(dev->slave_req_fd, &msg_reply) < 0) {
+		ret = -1;
+		goto out;
+	}
+
+	if (msg_reply.request.slave != msg->request.slave) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"received unexpected msg type (%u), expected %u\n",
+			msg_reply.request.slave, msg->request.slave);
+		ret = -1;
+		goto out;
+	}
+
+	ret = msg_reply.payload.u64 ? -1 : 0;
+
+out:
+	rte_spinlock_unlock(&dev->slave_req_lock);
+	return ret;
+}
+
 int
 vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm)
 {
@@ -1691,7 +1741,7 @@ vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm)
 		},
 	};
 
-	ret = send_vhost_message(dev->slave_req_fd, &msg, NULL, 0);
+	ret = send_vhost_slave_message(dev, &msg, NULL, 0);
 	if (ret < 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 				"Failed to send IOTLB miss message (%d)\n",
@@ -1701,3 +1751,97 @@ vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm)
 
 	return 0;
 }
+
+static int vhost_user_slave_set_vring_host_notifier(struct virtio_net *dev,
+						    int index, int fd,
+						    uint64_t offset,
+						    uint64_t size)
+{
+	int *fdp = NULL;
+	size_t fd_num = 0;
+	int ret;
+	struct VhostUserMsg msg = {
+		.request.slave = VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG,
+		.flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY,
+		.size = sizeof(msg.payload.area),
+		.payload.area = {
+			.u64 = index & VHOST_USER_VRING_IDX_MASK,
+			.size = size,
+			.offset = offset,
+		},
+	};
+
+	if (fd < 0)
+		msg.payload.area.u64 |= VHOST_USER_VRING_NOFD_MASK;
+	else {
+		fdp = &fd;
+		fd_num = 1;
+	}
+
+	ret = send_vhost_slave_message(dev, &msg, fdp, fd_num);
+	if (ret < 0) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to set host notifier (%d)\n", ret);
+		return ret;
+	}
+
+	return process_slave_message_reply(dev, &msg);
+}
+
+int vhost_user_vfio_accelerator_ctrl(int vid, int enable)
+{
+	struct virtio_net *dev;
+	struct rte_vdpa_device *vdpa_dev;
+	int vfio_device_fd, did, ret = 0;
+	uint64_t offset, size;
+	unsigned int i;
+
+	dev = get_device(vid);
+	if (!dev)
+		return -ENODEV;
+
+	did = dev->vdpa_dev_id;
+	if (did < 0)
+		return -EINVAL;
+
+	if (!(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ||
+	    !(dev->features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES)) ||
+	    !(dev->protocol_features &
+			(1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ)) ||
+	    !(dev->protocol_features &
+			(1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER)))
+		return -ENOTSUP;
+
+	vdpa_dev = rte_vdpa_get_device(did);
+
+	RTE_FUNC_PTR_OR_ERR_RET(vdpa_dev->ops->get_vfio_device_fd, -ENOTSUP);
+	RTE_FUNC_PTR_OR_ERR_RET(vdpa_dev->ops->get_notify_area, -ENOTSUP);
+
+	vfio_device_fd = vdpa_dev->ops->get_vfio_device_fd(vid);
+	if (vfio_device_fd < 0)
+		return -ENOTSUP;
+
+	if (enable) {
+		for (i = 0; i < dev->nr_vring * 2; i++) {
+			if (vdpa_dev->ops->get_notify_area(vid, i, &offset,
+					&size) < 0) {
+				ret = -ENOTSUP;
+				goto disable;
+			}
+
+			if (vhost_user_slave_set_vring_host_notifier(dev, i,
+					vfio_device_fd, offset, size) < 0) {
+				ret = -EFAULT;
+				goto disable;
+			}
+		}
+	} else {
+disable:
+		for (i = 0; i < dev->nr_vring * 2; i++) {
+			vhost_user_slave_set_vring_host_notifier(dev, i, -1,
+					0, 0);
+		}
+	}
+
+	return ret;
+}
diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 1ad5cf467..7e285aca6 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -20,7 +20,8 @@
 					 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ) | \
-					 (1ULL << VHOST_USER_PROTOCOL_F_CRYPTO_SESSION))
+					 (1ULL << VHOST_USER_PROTOCOL_F_CRYPTO_SESSION) | \
+					 (1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER))
 
 typedef enum VhostUserRequest {
 	VHOST_USER_NONE = 0,
@@ -54,6 +55,7 @@ typedef enum VhostUserRequest {
 typedef enum VhostUserSlaveRequest {
 	VHOST_USER_SLAVE_NONE = 0,
 	VHOST_USER_SLAVE_IOTLB_MSG = 1,
+	VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG = 3,
 	VHOST_USER_SLAVE_MAX
 } VhostUserSlaveRequest;
 
@@ -99,6 +101,12 @@ typedef struct VhostUserCryptoSessionParam {
 	uint8_t auth_key_buf[VHOST_USER_CRYPTO_MAX_HMAC_KEY_LENGTH];
 } VhostUserCryptoSessionParam;
 
+typedef struct VhostUserVringArea {
+	uint64_t u64;
+	uint64_t size;
+	uint64_t offset;
+} VhostUserVringArea;
+
 typedef struct VhostUserMsg {
 	union {
 		uint32_t master; /* a VhostUserRequest value */
@@ -120,6 +128,7 @@ typedef struct VhostUserMsg {
 		VhostUserLog    log;
 		struct vhost_iotlb_msg iotlb;
 		VhostUserCryptoSessionParam crypto_session;
+		VhostUserVringArea area;
 	} payload;
 	int fds[VHOST_MEMORY_MAX_NREGIONS];
 } __attribute((packed)) VhostUserMsg;
@@ -133,6 +142,7 @@ typedef struct VhostUserMsg {
 /* vhost_user.c */
 int vhost_user_msg_handler(int vid, int fd);
 int vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm);
+int vhost_user_vfio_accelerator_ctrl(int vid, int enable);
 
 /* socket.c */
 int read_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 14+ messages in thread
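
For readers following the message layout in the patch above, the way
the u64 field of VhostUserVringArea is filled can be sketched on its
own. This is a minimal sketch rather than the library's code: the two
mask values are assumptions, since the patch uses
VHOST_USER_VRING_IDX_MASK and VHOST_USER_VRING_NOFD_MASK without
showing their definitions, and the names vring_area and
make_vring_area() are hypothetical.

```c
#include <stdint.h>

/* Assumed values; the patch does not show these definitions. */
#define VRING_IDX_MASK	0xffu
#define VRING_NOFD_MASK	(0x1u << 8)

/* Same three fields as VhostUserVringArea in the patch. */
struct vring_area {
	uint64_t u64;    /* vring index, plus the "no fd" flag bit */
	uint64_t size;   /* size of the notify area */
	uint64_t offset; /* offset of the notify area within the fd */
};

/*
 * Build the payload for one vring. A negative fd means no file
 * descriptor accompanies the message, which the disable path in
 * the patch uses to tear the host notifier down.
 */
static struct vring_area
make_vring_area(int index, int fd, uint64_t offset, uint64_t size)
{
	struct vring_area area = {
		.u64 = (uint64_t)index & VRING_IDX_MASK,
		.size = size,
		.offset = offset,
	};

	if (fd < 0)
		area.u64 |= VRING_NOFD_MASK;

	return area;
}
```

Under the assumed mask values, vring 3 with no fd is encoded as
0x103, while vring 1 with a valid fd is encoded as 0x1.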

end of thread, other threads:[~2018-04-18  5:52 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-06 10:43 [dpdk-dev] [PATCH 0/3] Extend vhost to support VFIO based accelerator Tiwei Bie
2018-03-06 10:43 ` [dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails Tiwei Bie
2018-03-29 12:19   ` Maxime Coquelin
2018-03-29 13:25     ` Tiwei Bie
2018-03-29 13:41       ` Maxime Coquelin
2018-03-29 13:46         ` Tiwei Bie
2018-03-29 13:46   ` Maxime Coquelin
2018-03-06 10:43 ` [dpdk-dev] [PATCH 2/3] vhost: support sending fds via send_vhost_message() Tiwei Bie
2018-03-29 12:23   ` Maxime Coquelin
2018-03-29 12:27   ` Maxime Coquelin
2018-03-06 10:43 ` [dpdk-dev] [PATCH 3/3] vhost: support VFIO based accelerator Tiwei Bie
2018-03-06 14:24   ` Maxime Coquelin
2018-03-07  8:59     ` Tiwei Bie
2018-04-18  5:49 ` [dpdk-dev] [PATCH v2] " Tiwei Bie
