DPDK patches and discussions
* [dpdk-dev] [PATCH 0/9] Introduce vfio-user library
@ 2020-12-18  7:38 Chenbo Xia
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 1/9] lib: introduce " Chenbo Xia
                   ` (10 more replies)
  0 siblings, 11 replies; 43+ messages in thread
From: Chenbo Xia @ 2020-12-18  7:38 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu

This series enables DPDK to serve as an alternative I/O device emulation
library for building virtualized devices in separate processes outside QEMU.
It introduces a new library for device emulation (librte_vfio_user).

The *librte_vfio_user* library is an implementation of VFIO-over-socket[1]
(also known as vfio-user), a protocol that allows a device to be virtualized
in a separate process outside of QEMU.

Background & Motivation 
-----------------------
Disaggregated/multi-process QEMU uses VFIO-over-socket/vfio-user as the main
transport mechanism to disaggregate I/O services from QEMU[2]. Vfio-user
essentially implements the VFIO device model presented to the user process
as a set of messages over a Unix domain socket. The main difference between
an application using vfio-user and one using the vfio kernel module is that
device manipulation is based on socket messages for vfio-user, but on system
calls for the vfio kernel module. A vfio-user device consists of a generic
VFIO device type living in QEMU, called the client[3], and the core device
implementation (the emulated device) living outside of QEMU, called the
server. Once vfio-user moves emulated devices out of QEMU, they need other
places to live. This series introduces vfio-user support in DPDK so that
DPDK can be one of those places.
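
To make the contrast with syscall-based VFIO concrete, here is a minimal
sketch of how a request can be framed as a socket message: a fixed header
followed by an optional payload. It is not code from this series. The header
fields mirror those used in patch 1/9, while the sketch_* names, the
simplified struct and the socketpair round trip are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Simplified header; field names follow patch 1/9, layout is illustrative. */
struct sketch_hdr {
	uint16_t msg_id;
	uint16_t cmd;
	uint32_t size;	/* header plus payload, in bytes */
	uint32_t flags;
	uint32_t err;
};

/* Read exactly len bytes, retrying on short reads. Returns 0 or -1. */
static int read_full(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = read(fd, p, len);

		if (n <= 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Send one framed message: header first, then the payload (if any). */
int sketch_send(int fd, uint16_t cmd, const void *payload, uint32_t plen)
{
	struct sketch_hdr h = { .cmd = cmd,
				.size = (uint32_t)sizeof(h) + plen };

	assert(fd >= 0);
	if (write(fd, &h, sizeof(h)) != (ssize_t)sizeof(h))
		return -1;
	if (plen && write(fd, payload, plen) != (ssize_t)plen)
		return -1;
	return 0;
}

/* Receive one framed message. Returns the payload length, or -1 on error. */
int sketch_recv(int fd, struct sketch_hdr *h, void *buf, uint32_t cap)
{
	uint32_t plen;

	if (read_full(fd, h, sizeof(*h)) < 0)
		return -1;
	/* Reject sizes smaller than the header or larger than the buffer. */
	if (h->size < sizeof(*h) || h->size - sizeof(*h) > cap)
		return -1;
	plen = h->size - (uint32_t)sizeof(*h);
	if (plen && read_full(fd, buf, plen) < 0)
		return -1;
	return (int)plen;
}

/* Round-trip one message over a socketpair. Returns 0 on success. */
int sketch_roundtrip(void)
{
	int sv[2], ok;
	struct sketch_hdr h;
	char buf[16];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	ok = sketch_send(sv[0], 1, "ping", 5) == 0 &&
	     sketch_recv(sv[1], &h, buf, sizeof(buf)) == 5 &&
	     h.cmd == 1 && strcmp(buf, "ping") == 0;
	close(sv[0]);
	close(sv[1]);
	return ok ? 0 : -1;
}
```

The receiver validates the size field before reading the payload, so a
malformed header cannot make it read past its buffer.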

This series introduces the server and client implementations of the vfio-user
protocol. The server plays the role of the emulated device and the client is
the device consumer. With this implementation, DPDK can be both device
provider and device consumer.

Design overview
---------------

+--------------+     +--------------+     
| +----------+ |     | +----------+ |
| | Generic  | |     | | Emulated | |
| | vfio-dev | |     | | device   | |
| +----------+ |     | +----|-----+ |
| +----------+ |     | +----|-----+ |
| | vfio-user| |     | | vfio-user| |
| | client   | |<--->| | server   | |
| +----------+ |     | +----------+ |
| QEMU/DPDK    |     | DPDK         |
+--------------+     +--------------+

- Generic vfio-dev.
  It is the generic vfio framework in vfio applications like QEMU or DPDK.
  Applications can keep most of their vfio device management and plug in a
  vfio-user device type. Note that the current implementation does not yet
  integrate the vfio-user client with the kernel vfio support in DPDK, but
  doing so is viable and worthwhile.

- vfio-user client.
  For DPDK, it is the part of librte_vfio_user that provides ways to
  manipulate vfio-user based emulated devices. This manipulation is very
  similar to that of kernel vfio (i.e., syscalls like ioctl, mmap and
  pread/pwrite). It is the base for a vfio-user device consumer.

- vfio-user server.
  It is the server part of librte_vfio_user. It provides ways to emulate your
  own device. A device provider only needs to care about the device layout
  that VFIO defines, not about how it communicates with the vfio-user client.

- Emulated device.
  It is an emulated device of any type (e.g., network, crypto, etc.).
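
The client/server split in the diagram above can be illustrated with plain
Unix domain sockets. The following is only a sketch of the connection model,
not the library's API: the sketch_* helpers, the socket path under /tmp and
the "REQ"/"ACK" strings are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill a sockaddr_un from a path. Returns -1 if the path does not fit. */
static int make_addr(struct sockaddr_un *un, const char *path)
{
	assert(path != NULL);
	if (strlen(path) >= sizeof(un->sun_path))
		return -1;
	memset(un, 0, sizeof(*un));
	un->sun_family = AF_UNIX;
	strcpy(un->sun_path, path);
	return 0;
}

/* Server side: create, bind and listen. Returns the listening fd or -1. */
int sketch_listen(const char *path)
{
	struct sockaddr_un un;
	int fd;

	if (make_addr(&un, path) < 0)
		return -1;
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	unlink(path);	/* remove a stale socket file, if any */
	if (bind(fd, (struct sockaddr *)&un, sizeof(un)) < 0 ||
	    listen(fd, 4) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Client side: connect to the server socket. Returns the fd or -1. */
int sketch_connect(const char *path)
{
	struct sockaddr_un un;
	int fd;

	if (make_addr(&un, path) < 0)
		return -1;
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* One request/reply exchange inside a single process. Returns 0 if OK. */
int sketch_exchange(void)
{
	char path[64], buf[8] = { 0 };
	int lfd, cfd, sfd = -1, ok = 0;

	snprintf(path, sizeof(path), "/tmp/vfio_user_sketch_%d.sock",
		 (int)getpid());
	lfd = sketch_listen(path);
	if (lfd < 0)
		return -1;
	cfd = sketch_connect(path);	/* waits in the listen backlog */
	if (cfd >= 0)
		sfd = accept(lfd, NULL, NULL);
	if (sfd >= 0)
		ok = write(cfd, "REQ", 4) == 4 &&
		     read(sfd, buf, sizeof(buf)) == 4 &&
		     strcmp(buf, "REQ") == 0 &&
		     write(sfd, "ACK", 4) == 4 &&
		     read(cfd, buf, sizeof(buf)) == 4 &&
		     strcmp(buf, "ACK") == 0;
	if (sfd >= 0)
		close(sfd);
	if (cfd >= 0)
		close(cfd);
	close(lfd);
	unlink(path);
	return ok ? 0 : -1;
}
```

In a real deployment the two sides run in different processes (for example
QEMU as client and a DPDK application as server); a single process is used
here only to keep the sketch self-contained.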

References
----------
[1]: https://patchew.org/QEMU/20201130161229.23164-1-thanos.makatos@nutanix.com/
[2]: https://wiki.qemu.org/Features/MultiProcessQEMU
[3]: https://github.com/oracle/qemu/tree/vfio-user-v0.2

Chenbo Xia (9):
  lib: introduce vfio-user library
  vfio_user: implement lifecycle related APIs
  vfio_user: implement device and region related APIs
  vfio_user: implement DMA table and socket address API
  vfio_user: implement interrupt related APIs
  vfio_user: add client APIs of device attach/detach
  vfio_user: add client APIs of DMA/IRQ/region
  test/vfio_user: introduce functional test
  doc: add vfio-user library guide

 MAINTAINERS                             |    4 +
 app/test/meson.build                    |    4 +
 app/test/test_vfio_user.c               |  646 ++++++++++
 doc/guides/prog_guide/index.rst         |    1 +
 doc/guides/prog_guide/vfio_user_lib.rst |  215 ++++
 doc/guides/rel_notes/release_21_02.rst  |   11 +
 lib/librte_vfio_user/meson.build        |   11 +
 lib/librte_vfio_user/rte_vfio_user.h    |  426 +++++++
 lib/librte_vfio_user/version.map        |   26 +
 lib/librte_vfio_user/vfio_user_base.c   |  217 ++++
 lib/librte_vfio_user/vfio_user_base.h   |  109 ++
 lib/librte_vfio_user/vfio_user_client.c |  691 ++++++++++
 lib/librte_vfio_user/vfio_user_client.h |   25 +
 lib/librte_vfio_user/vfio_user_server.c | 1553 +++++++++++++++++++++++
 lib/librte_vfio_user/vfio_user_server.h |   66 +
 lib/meson.build                         |    1 +
 16 files changed, 4006 insertions(+)
 create mode 100644 app/test/test_vfio_user.c
 create mode 100644 doc/guides/prog_guide/vfio_user_lib.rst
 create mode 100644 lib/librte_vfio_user/meson.build
 create mode 100644 lib/librte_vfio_user/rte_vfio_user.h
 create mode 100644 lib/librte_vfio_user/version.map
 create mode 100644 lib/librte_vfio_user/vfio_user_base.c
 create mode 100644 lib/librte_vfio_user/vfio_user_base.h
 create mode 100644 lib/librte_vfio_user/vfio_user_client.c
 create mode 100644 lib/librte_vfio_user/vfio_user_client.h
 create mode 100644 lib/librte_vfio_user/vfio_user_server.c
 create mode 100644 lib/librte_vfio_user/vfio_user_server.h

-- 
2.17.1



* [dpdk-dev] [PATCH 1/9] lib: introduce vfio-user library
  2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
@ 2020-12-18  7:38 ` Chenbo Xia
  2020-12-18 17:13   ` Stephen Hemminger
  2020-12-18 17:17   ` Stephen Hemminger
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 2/9] vfio_user: implement lifecycle related APIs Chenbo Xia
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 43+ messages in thread
From: Chenbo Xia @ 2020-12-18  7:38 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu

This patch introduces the vfio-user library, which follows vfio-user
protocol v1.0. As vfio-user has server and client implementations,
this patch introduces the basic structures and internal functions that
will be used by both server and client.
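
The internal send/receive helpers rely on passing file descriptors as
SCM_RIGHTS ancillary data. As a rough, standalone illustration of that
mechanism (not the code in this patch), the sketch below passes the write
end of a pipe across a socketpair; all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union sketch_ctrl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send one data byte plus one fd as ancillary data. Returns 0 or -1. */
int sketch_send_fd(int sock, int fd)
{
	char data = 'F';
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union sketch_ctrl ctrl;
	struct msghdr msgh;
	struct cmsghdr *cmsg;

	assert(fd >= 0);
	memset(&msgh, 0, sizeof(msgh));
	memset(&ctrl, 0, sizeof(ctrl));
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;
	msgh.msg_control = ctrl.buf;
	msgh.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msgh);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msgh, 0) == 1 ? 0 : -1;
}

/* Receive one data byte and one fd. Returns the new fd or -1. */
int sketch_recv_fd(int sock)
{
	char data;
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union sketch_ctrl ctrl;
	struct msghdr msgh;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msgh, 0, sizeof(msgh));
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;
	msgh.msg_control = ctrl.buf;
	msgh.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msgh, 0) != 1)
		return -1;
	if (msgh.msg_flags & (MSG_TRUNC | MSG_CTRUNC))
		return -1;

	for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg != NULL;
	     cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
		if (cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SCM_RIGHTS &&
		    cmsg->cmsg_len == CMSG_LEN(sizeof(int))) {
			memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
			return fd;
		}
	}
	return -1;
}

/* Pass a pipe's write end over a socketpair and use the received copy. */
int sketch_fd_roundtrip(void)
{
	int sv[2], pfd[2], rfd, ok = 0;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	if (pipe(pfd) < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (sketch_send_fd(sv[0], pfd[1]) == 0) {
		rfd = sketch_recv_fd(sv[1]);
		if (rfd >= 0) {
			/* The received fd refers to the same pipe. */
			ok = write(rfd, "x", 1) == 1 &&
			     read(pfd[0], &c, 1) == 1 && c == 'x';
			close(rfd);
		}
	}
	close(pfd[0]);
	close(pfd[1]);
	close(sv[0]);
	close(sv[1]);
	return ok ? 0 : -1;
}
```

The received descriptor is a new number in the receiving process that refers
to the same open file, which is why the receiver must close it on error paths.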

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
 MAINTAINERS                           |   4 +
 lib/librte_vfio_user/meson.build      |   9 ++
 lib/librte_vfio_user/version.map      |   3 +
 lib/librte_vfio_user/vfio_user_base.c | 205 ++++++++++++++++++++++++++
 lib/librte_vfio_user/vfio_user_base.h |  65 ++++++++
 lib/meson.build                       |   1 +
 6 files changed, 287 insertions(+)
 create mode 100644 lib/librte_vfio_user/meson.build
 create mode 100644 lib/librte_vfio_user/version.map
 create mode 100644 lib/librte_vfio_user/vfio_user_base.c
 create mode 100644 lib/librte_vfio_user/vfio_user_base.h

diff --git a/MAINTAINERS b/MAINTAINERS
index eafe9f8c46..5fb4880758 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1540,6 +1540,10 @@ M: Nithin Dabilpuram <ndabilpuram@marvell.com>
 M: Pavan Nikhilesh <pbhagavatula@marvell.com>
 F: lib/librte_node/
 
+Vfio-user - EXPERIMENTAL
+M: Chenbo Xia <chenbo.xia@intel.com>
+M: Xiuchun Lu <xiuchun.lu@intel.com>
+F: lib/librte_vfio_user/
 
 Test Applications
 -----------------
diff --git a/lib/librte_vfio_user/meson.build b/lib/librte_vfio_user/meson.build
new file mode 100644
index 0000000000..0f6407b80f
--- /dev/null
+++ b/lib/librte_vfio_user/meson.build
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+if not is_linux
+	build = false
+	reason = 'only supported on Linux'
+endif
+
+sources = files('vfio_user_base.c')
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
new file mode 100644
index 0000000000..33c1b976f1
--- /dev/null
+++ b/lib/librte_vfio_user/version.map
@@ -0,0 +1,3 @@
+EXPERIMENTAL {
+	local: *;
+};
diff --git a/lib/librte_vfio_user/vfio_user_base.c b/lib/librte_vfio_user/vfio_user_base.c
new file mode 100644
index 0000000000..bbad553e0a
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_base.c
@@ -0,0 +1,205 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <sys/socket.h>
+#include <string.h>
+
+#include "vfio_user_base.h"
+
+int vfio_user_log_level;
+
+const char *vfio_user_msg_str[VFIO_USER_MAX] = {
+	[VFIO_USER_NONE] = "VFIO_USER_NONE",
+	[VFIO_USER_VERSION] = "VFIO_USER_VERSION",
+};
+
+inline void vfio_user_close_msg_fds(VFIO_USER_MSG *msg)
+{
+	int i;
+
+	for (i = 0; i < msg->fd_num; i++)
+		close(msg->fds[i]);
+}
+
+int vfio_user_check_msg_fdnum(VFIO_USER_MSG *msg, int expected_fds)
+{
+	if (msg->fd_num == expected_fds)
+		return 0;
+
+	VFIO_USER_LOG(ERR, "Expect %d FDs for request %s, received %d\n",
+		expected_fds, vfio_user_msg_str[msg->cmd], msg->fd_num);
+
+	vfio_user_close_msg_fds(msg);
+
+	return -1;
+}
+
+static int vfio_user_recv_fd_msg(int sockfd, char *buf, int buflen, int *fds,
+	int max_fds, int *fd_num)
+{
+	struct iovec iov;
+	struct msghdr msgh;
+	char control[CMSG_SPACE(max_fds * sizeof(int))];
+	struct cmsghdr *cmsg;
+	int fd_sz, got_fds = 0;
+	int ret, i;
+
+	*fd_num = 0;
+
+	memset(&msgh, 0, sizeof(msgh));
+	iov.iov_base = buf;
+	iov.iov_len  = buflen;
+
+	msgh.msg_iov = &iov;
+	msgh.msg_iovlen = 1;
+	msgh.msg_control = control;
+	msgh.msg_controllen = sizeof(control);
+
+	ret = recvmsg(sockfd, &msgh, 0);
+	if (ret <= 0) {
+		if (ret)
+			VFIO_USER_LOG(DEBUG, "recvmsg failed\n");
+		return ret;
+	}
+
+	if (msgh.msg_flags & (MSG_TRUNC | MSG_CTRUNC)) {
+		VFIO_USER_LOG(ERR, "Message is truncated\n");
+		return -1;
+	}
+
+	for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg != NULL;
+		cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
+		if ((cmsg->cmsg_level == SOL_SOCKET) &&
+			(cmsg->cmsg_type == SCM_RIGHTS)) {
+			fd_sz = cmsg->cmsg_len - CMSG_LEN(0);
+			got_fds = fd_sz / sizeof(int);
+			if (got_fds >= max_fds) {
+				/* Invalid message, close fds */
+				int *close_fd = (int *)CMSG_DATA(cmsg);
+				for (i = 0; i < got_fds; i++) {
+					/* close the i-th received fd */
+					close(close_fd[i]);
+				}
+				VFIO_USER_LOG(ERR, "fd num exceeds max "
+					"in vfio-user msg\n");
+				return -1;
+			}
+			*fd_num = got_fds;
+			memcpy(fds, CMSG_DATA(cmsg), got_fds * sizeof(int));
+			break;
+		}
+	}
+
+	/* Make unused file descriptors invalid */
+	while (got_fds < max_fds)
+		fds[got_fds++] = -1;
+
+	return ret;
+}
+
+int vfio_user_recv_msg(int sockfd, VFIO_USER_MSG *msg)
+{
+	int ret;
+
+	ret = vfio_user_recv_fd_msg(sockfd, (char *)msg, VFIO_USER_MSG_HDR_SIZE,
+		msg->fds, VFIO_USER_MAX_FD, &msg->fd_num);
+	if (ret <= 0) {
+		return ret;
+	} else if (ret != VFIO_USER_MSG_HDR_SIZE) {
+		VFIO_USER_LOG(ERR, "Read unexpected header size\n");
+		ret = -1;
+		goto err;
+	}
+
+	if (msg->size > VFIO_USER_MSG_HDR_SIZE) {
+		if (msg->size > (sizeof(msg->payload) +
+			VFIO_USER_MSG_HDR_SIZE)) {
+			VFIO_USER_LOG(ERR, "Read invalid msg size: %d\n",
+				msg->size);
+			ret = -1;
+			goto err;
+		}
+
+		ret = read(sockfd, &msg->payload,
+			msg->size - VFIO_USER_MSG_HDR_SIZE);
+		if (ret <= 0)
+			goto err;
+		if (ret != (int)(msg->size - VFIO_USER_MSG_HDR_SIZE)) {
+			VFIO_USER_LOG(ERR, "Read payload failed\n");
+			ret = -1;
+			goto err;
+		}
+	}
+
+	return ret;
+err:
+	vfio_user_close_msg_fds(msg);
+	return ret;
+}
+
+static int
+vfio_user_send_fd_msg(int sockfd, char *buf, int buflen, int *fds, int fd_num)
+{
+
+	struct iovec iov;
+	struct msghdr msgh;
+	size_t fdsize = fd_num * sizeof(int);
+	char control[CMSG_SPACE(fdsize)];
+	struct cmsghdr *cmsg;
+	int ret;
+
+	memset(&msgh, 0, sizeof(msgh));
+	iov.iov_base = buf;
+	iov.iov_len = buflen;
+
+	msgh.msg_iov = &iov;
+	msgh.msg_iovlen = 1;
+
+	if (fds && fd_num > 0) {
+		msgh.msg_control = control;
+		msgh.msg_controllen = sizeof(control);
+		cmsg = CMSG_FIRSTHDR(&msgh);
+		cmsg->cmsg_len = CMSG_LEN(fdsize);
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		memcpy(CMSG_DATA(cmsg), fds, fdsize);
+	} else {
+		msgh.msg_control = NULL;
+		msgh.msg_controllen = 0;
+	}
+
+	do {
+		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
+	} while (ret < 0 && errno == EINTR);
+
+	if (ret < 0) {
+		VFIO_USER_LOG(ERR, "sendmsg error\n");
+		return ret;
+	}
+
+	return ret;
+}
+
+int vfio_user_send_msg(int sockfd, VFIO_USER_MSG *msg)
+{
+	if (!msg)
+		return 0;
+
+	return vfio_user_send_fd_msg(sockfd, (char *)msg,
+		msg->size, msg->fds, msg->fd_num);
+}
+
+int vfio_user_reply_msg(int sockfd, VFIO_USER_MSG *msg)
+{
+	if (!msg)
+		return 0;
+
+	msg->flags |= VFIO_USER_NEED_NO_RP;
+	msg->flags |= VFIO_USER_TYPE_REPLY;
+
+	return vfio_user_send_msg(sockfd, msg);
+}
+
+RTE_LOG_REGISTER(vfio_user_log_level, lib.vfio, INFO);
diff --git a/lib/librte_vfio_user/vfio_user_base.h b/lib/librte_vfio_user/vfio_user_base.h
new file mode 100644
index 0000000000..6db45b1819
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_base.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _VFIO_USER_BASE_H
+#define _VFIO_USER_BASE_H
+
+#include <rte_log.h>
+
+#define VFIO_USER_MAX_FD 1024
+#define VFIO_USER_MAX_VERSION_DATA 512
+
+extern int vfio_user_log_level;
+extern const char *vfio_user_msg_str[];
+
+#define VFIO_USER_LOG(level, fmt, args...)			\
+	rte_log(RTE_LOG_ ## level, vfio_user_log_level,		\
+	"VFIO_USER: " fmt, ## args)
+
+struct vfio_user_socket {
+	char *sock_addr;
+	int sock_fd;
+	int dev_id;
+};
+
+typedef enum VFIO_USER_CMD_TYPE {
+	VFIO_USER_NONE = 0,
+	VFIO_USER_VERSION = 1,
+	VFIO_USER_MAX = 2,
+} VFIO_USER_CMD_TYPE;
+
+struct vfio_user_version {
+	uint16_t major;
+	uint16_t minor;
+	/* Version data (JSON), for now not supported */
+	uint8_t ver_data[VFIO_USER_MAX_VERSION_DATA];
+};
+
+typedef struct vfio_user_msg {
+	uint16_t msg_id;
+	uint16_t cmd;
+	uint32_t size;
+#define VFIO_USER_TYPE_CMD	(0x0)		/* Message type is COMMAND */
+#define VFIO_USER_TYPE_REPLY	(0x1 << 0)	/* Message type is REPLY */
+#define VFIO_USER_NEED_NO_RP	(0x1 << 4)	/* Message needs no reply */
+#define VFIO_USER_ERROR		(0x1 << 5)	/* Reply message has error */
+	uint32_t flags;
+	uint32_t err;				/* Valid in reply, optional */
+	union {
+		struct vfio_user_version ver;
+	} payload;
+	int fds[VFIO_USER_MAX_FD];
+	int fd_num;
+} __attribute((packed)) VFIO_USER_MSG;
+
+#define VFIO_USER_MSG_HDR_SIZE offsetof(VFIO_USER_MSG, payload.ver)
+
+void vfio_user_close_msg_fds(VFIO_USER_MSG *msg);
+int vfio_user_check_msg_fdnum(VFIO_USER_MSG *msg, int expected_fds);
+void vfio_user_close_msg_fds(VFIO_USER_MSG *msg);
+int vfio_user_recv_msg(int sockfd, VFIO_USER_MSG *msg);
+int vfio_user_send_msg(int sockfd, VFIO_USER_MSG *msg);
+int vfio_user_reply_msg(int sockfd, VFIO_USER_MSG *msg);
+
+#endif
diff --git a/lib/meson.build b/lib/meson.build
index ed00f89146..b7fbfcc95b 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -28,6 +28,7 @@ libraries = [
 	'rib', 'reorder', 'sched', 'security', 'stack', 'vhost',
 	# ipsec lib depends on net, crypto and security
 	'ipsec',
+	'vfio_user',
 	#fib lib depends on rib
 	'fib',
 	# add pkt framework libs which use other libs from above
-- 
2.17.1



* [dpdk-dev] [PATCH 2/9] vfio_user: implement lifecycle related APIs
  2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 1/9] lib: introduce " Chenbo Xia
@ 2020-12-18  7:38 ` Chenbo Xia
  2021-01-05  8:34   ` Xing, Beilei
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 3/9] vfio_user: implement device and region " Chenbo Xia
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 43+ messages in thread
From: Chenbo Xia @ 2020-12-18  7:38 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu

This patch implements three lifecycle-related APIs for the vfio-user server:
rte_vfio_user_register(), rte_vfio_user_unregister() and
rte_vfio_user_start(). Socket and device management is implemented
along with the API introduction.
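
Socket management in the server is built around epoll, with each watched fd
carrying a handler callback. The sketch below shows that general pattern in
isolation. It is not the code in this patch: the sketch_* names, the single
dispatch pass and the pipe used as an event source are assumptions chosen to
keep the example small.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

typedef int (*sketch_handler)(int fd, void *data);

/* Per-fd bookkeeping; epoll hands a pointer to this back on each event. */
struct sketch_fdinfo {
	int fd;
	sketch_handler handler;
	void *data;
};

/* Register info->fd for read events. Returns 0 or -1. */
int sketch_watch(int epfd, struct sketch_fdinfo *info)
{
	struct epoll_event evt = { .events = EPOLLIN };

	assert(info != NULL);
	evt.data.ptr = info;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, info->fd, &evt);
}

/* Wait up to timeout_ms and call the handler of every ready fd.
 * Returns the number of ready fds, or -1 on error.
 */
int sketch_dispatch_once(int epfd, int timeout_ms)
{
	struct epoll_event events[8];
	int i, n;

	n = epoll_wait(epfd, events, 8, timeout_ms);
	for (i = 0; i < n; i++) {
		struct sketch_fdinfo *info = events[i].data.ptr;

		if (info->handler)
			info->handler(info->fd, info->data);
	}
	return n;
}

/* Example handler: consume one byte and count the invocation. */
static int count_reads(int fd, void *data)
{
	int *calls = data;
	char c;

	if (read(fd, &c, 1) != 1)
		return -1;
	(*calls)++;
	return 0;
}

/* Make a pipe readable and check that the handler runs exactly once. */
int sketch_epoll_demo(void)
{
	int pfd[2], epfd, calls = 0, n = -1;
	struct sketch_fdinfo info;

	if (pipe(pfd) < 0)
		return -1;
	epfd = epoll_create1(0);
	if (epfd >= 0) {
		info.fd = pfd[0];
		info.handler = count_reads;
		info.data = &calls;
		if (sketch_watch(epfd, &info) == 0 &&
		    write(pfd[1], "x", 1) == 1)
			n = sketch_dispatch_once(epfd, 1000);
		close(epfd);
	}
	close(pfd[0]);
	close(pfd[1]);
	return (n == 1 && calls == 1) ? 0 : -1;
}
```

Storing a pointer to the bookkeeping entry in the event lets the loop find
the handler without a lookup, but the entry must stay valid for as long as
the fd is registered.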

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
 lib/librte_vfio_user/meson.build        |   3 +-
 lib/librte_vfio_user/rte_vfio_user.h    |  51 ++
 lib/librte_vfio_user/version.map        |   6 +
 lib/librte_vfio_user/vfio_user_base.h   |   4 +
 lib/librte_vfio_user/vfio_user_server.c | 690 ++++++++++++++++++++++++
 lib/librte_vfio_user/vfio_user_server.h |  55 ++
 6 files changed, 808 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_vfio_user/rte_vfio_user.h
 create mode 100644 lib/librte_vfio_user/vfio_user_server.c
 create mode 100644 lib/librte_vfio_user/vfio_user_server.h

diff --git a/lib/librte_vfio_user/meson.build b/lib/librte_vfio_user/meson.build
index 0f6407b80f..b7363f61c6 100644
--- a/lib/librte_vfio_user/meson.build
+++ b/lib/librte_vfio_user/meson.build
@@ -6,4 +6,5 @@ if not is_linux
 	reason = 'only supported on Linux'
 endif
 
-sources = files('vfio_user_base.c')
+sources = files('vfio_user_base.c', 'vfio_user_server.c')
+headers = files('rte_vfio_user.h')
diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
new file mode 100644
index 0000000000..0d4f6c1be2
--- /dev/null
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_VFIO_USER_H
+#define _RTE_VFIO_USER_H
+
+#include <rte_compat.h>
+
+/**
+ *  Below APIs are for vfio-user server (device provider) to use:
+ *	*rte_vfio_user_register
+ *	*rte_vfio_user_unregister
+ *	*rte_vfio_user_start
+ */
+
+/**
+ * Register a vfio-user device.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int rte_vfio_user_register(const char *sock_addr);
+
+/**
+ * Unregister a vfio-user device.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int rte_vfio_user_unregister(const char *sock_addr);
+
+/**
+ * Start vfio-user handling for the device.
+ *
+ * This function triggers vfio-user message handling.
+ * @param sock_addr
+ *   Unix domain socket address
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int rte_vfio_user_start(const char *sock_addr);
+
+#endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index 33c1b976f1..e53095eda8 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -1,3 +1,9 @@
 EXPERIMENTAL {
+	global:
+
+	rte_vfio_user_register;
+	rte_vfio_user_unregister;
+	rte_vfio_user_start;
+
 	local: *;
 };
diff --git a/lib/librte_vfio_user/vfio_user_base.h b/lib/librte_vfio_user/vfio_user_base.h
index 6db45b1819..926cecfa7a 100644
--- a/lib/librte_vfio_user/vfio_user_base.h
+++ b/lib/librte_vfio_user/vfio_user_base.h
@@ -7,6 +7,10 @@
 
 #include <rte_log.h>
 
+#include "rte_vfio_user.h"
+
+#define VFIO_USER_VERSION_MAJOR 1
+#define VFIO_USER_VERSION_MINOR 0
 #define VFIO_USER_MAX_FD 1024
 #define VFIO_USER_MAX_VERSION_DATA 512
 
diff --git a/lib/librte_vfio_user/vfio_user_server.c b/lib/librte_vfio_user/vfio_user_server.c
new file mode 100644
index 0000000000..545c779fb0
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_server.c
@@ -0,0 +1,690 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <fcntl.h>
+#include <pthread.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+
+#include "vfio_user_server.h"
+
+#define MAX_VFIO_USER_DEVICE 1024
+
+static struct vfio_user_server *vfio_user_devices[MAX_VFIO_USER_DEVICE];
+static pthread_mutex_t vfio_dev_mutex = PTHREAD_MUTEX_INITIALIZER;
+
+static struct vfio_user_ep_sock vfio_ep_sock = {
+	.ep = {
+		.fd_mutex = PTHREAD_MUTEX_INITIALIZER,
+		.fd_num = 0
+	},
+	.sock_num = 0,
+	.mutex = PTHREAD_MUTEX_INITIALIZER,
+};
+
+static int vfio_user_negotiate_version(struct vfio_user_server *dev,
+	VFIO_USER_MSG *msg)
+{
+	struct vfio_user_version *ver = &msg->payload.ver;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	if (ver->major == dev->ver.major && ver->minor <= dev->ver.minor)
+		return 0;
+	else
+		return -ENOTSUP;
+}
+
+static vfio_user_msg_handler_t vfio_user_msg_handlers[VFIO_USER_MAX] = {
+	[VFIO_USER_NONE] = NULL,
+	[VFIO_USER_VERSION] = vfio_user_negotiate_version,
+};
+
+static struct vfio_user_server_socket *
+find_vfio_user_socket(const char *sock_addr)
+{
+	uint32_t i;
+
+	if (sock_addr == NULL)
+		return NULL;
+
+	for (i = 0; i < vfio_ep_sock.sock_num; i++) {
+		struct vfio_user_server_socket *s = vfio_ep_sock.sock[i];
+
+		if (!strcmp(s->sock.sock_addr, sock_addr))
+			return s;
+	}
+
+	return NULL;
+}
+
+static struct vfio_user_server_socket *
+vfio_user_create_sock(const char *sock_addr)
+{
+	struct vfio_user_server_socket *sk;
+	struct vfio_user_socket *sock;
+	int fd;
+	struct sockaddr_un *un;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+	if (vfio_ep_sock.sock_num == VFIO_USER_MAX_FD) {
+		VFIO_USER_LOG(ERR, "Failed to create socket:"
+			" socket num reaches max\n");
+		goto err;
+	}
+
+	sk = find_vfio_user_socket(sock_addr);
+	if (sk) {
+		VFIO_USER_LOG(ERR, "Failed to create socket:"
+			" socket addr exists\n");
+		goto err;
+	}
+
+	sk = malloc(sizeof(*sk));
+	if (!sk) {
+		VFIO_USER_LOG(ERR, "Failed to alloc server socket\n");
+		goto err;
+	}
+
+	sock = &sk->sock;
+	sock->sock_addr = strdup(sock_addr);
+	if (!sock->sock_addr) {
+		VFIO_USER_LOG(ERR, "Failed to copy sock_addr\n");
+		goto err_dup;
+	}
+
+	fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	if (fd < 0) {
+		VFIO_USER_LOG(ERR, "Failed to create socket\n");
+		goto err_sock;
+	}
+
+	if (fcntl(fd, F_SETFL, O_NONBLOCK)) {
+		VFIO_USER_LOG(ERR, "can't set nonblocking mode for socket, "
+			"fd: %d (%s)\n", fd, strerror(errno));
+		goto err_fcntl;
+	}
+
+	un = &sk->un;
+	memset(un, 0, sizeof(*un));
+	un->sun_family = AF_UNIX;
+	strncpy(un->sun_path, sock->sock_addr, sizeof(un->sun_path));
+	un->sun_path[sizeof(un->sun_path) - 1] = '\0';
+	sock->sock_fd = fd;
+	sk->conn_fd = -1;
+
+	vfio_ep_sock.sock[vfio_ep_sock.sock_num++] = sk;
+
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+
+	return sk;
+
+err_fcntl:
+	close(fd);
+err_sock:
+	free(sock->sock_addr);
+err_dup:
+	free(sk);
+err:
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+	return NULL;
+}
+
+static void vfio_user_delete_sock(struct vfio_user_server_socket *sk)
+{
+	uint32_t i, end;
+	struct vfio_user_socket *sock;
+
+	if (!sk)
+		return;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+
+	for (i = 0; i < vfio_ep_sock.sock_num; i++) {
+		if (vfio_ep_sock.sock[i] == sk)
+			break;
+	}
+
+	sock = &sk->sock;
+	end = --vfio_ep_sock.sock_num;
+	vfio_ep_sock.sock[i] = vfio_ep_sock.sock[end];
+	vfio_ep_sock.sock[end] = NULL;
+
+	close(sock->sock_fd);
+	if (sk->conn_fd != -1)
+		close(sk->conn_fd);
+	unlink(sock->sock_addr);
+	free(sock->sock_addr);
+	free(sk);
+
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+}
+
+static inline int vfio_user_init_epoll(struct vfio_user_epoll *ep)
+{
+	int epfd = epoll_create(1);
+	if (epfd < 0) {
+		VFIO_USER_LOG(ERR, "Failed to create epoll fd\n");
+		return -1;
+	}
+
+	ep->epfd = epfd;
+	return 0;
+}
+
+static inline void vfio_user_destroy_epoll(struct vfio_user_epoll *ep)
+{
+	close(ep->epfd);
+	ep->epfd = -1;
+}
+
+static int vfio_user_add_listen_fd(struct vfio_user_epoll *ep,
+	int sock_fd, event_handler evh, void *data)
+{
+	struct epoll_event evt;
+	int ret = 0;
+	uint32_t event = EPOLLIN | EPOLLPRI;
+
+	pthread_mutex_lock(&ep->fd_mutex);
+
+	evt.events = event;
+	evt.data.ptr = &ep->fdinfo[ep->fd_num];
+
+	if (ep->fd_num >= VFIO_USER_MAX_FD) {
+		VFIO_USER_LOG(ERR, "Error add listen fd, "
+			"exceed max num\n");
+		ret = -1;
+		goto err;
+	}
+
+	ep->fdinfo[ep->fd_num].fd = sock_fd;
+	ep->fdinfo[ep->fd_num].event = event;
+	ep->fdinfo[ep->fd_num].ev_handle = evh;
+	ep->fdinfo[ep->fd_num].data = data;
+
+	if (epoll_ctl(ep->epfd, EPOLL_CTL_ADD, sock_fd, &evt) < 0) {
+		VFIO_USER_LOG(ERR, "Error add listen fd, "
+			"epoll_ctl failed\n");
+		ret = -1;
+		goto err;
+	}
+
+	ep->fd_num++;
+err:
+	pthread_mutex_unlock(&ep->fd_mutex);
+	return ret;
+}
+
+static int vfio_user_del_listen_fd(struct vfio_user_epoll *ep,
+	int sock_fd)
+{
+	struct epoll_event evt;
+	uint32_t event = EPOLLIN | EPOLLPRI;
+	uint32_t i;
+	int ret = 0;
+
+	pthread_mutex_lock(&ep->fd_mutex);
+
+	for (i = 0; i < ep->fd_num; i++) {
+		if (ep->fdinfo[i].fd == sock_fd) {
+			ep->fdinfo[i].fd = -1;
+			break;
+		}
+	}
+
+	evt.events = event;
+	evt.data.ptr = &ep->fdinfo[i];
+
+	if (epoll_ctl(ep->epfd, EPOLL_CTL_DEL, sock_fd, &evt) < 0) {
+		VFIO_USER_LOG(ERR, "Error del listen fd, "
+			"epoll_ctl failed\n");
+		ret = -1;
+	}
+
+	pthread_mutex_unlock(&ep->fd_mutex);
+	return ret;
+}
+
+static inline int next_mv_src_idx(FD_INFO *info, int end)
+{
+	int i;
+
+	for (i = end; i >= 0 && info[i].fd == -1; i--)
+		;
+
+	return i;
+}
+
+static void vfio_user_fd_cleanup(struct vfio_user_epoll *ep)
+{
+	int mv_src_idx, mv_dst_idx;
+	if (ep->fd_num != 0) {
+		pthread_mutex_lock(&ep->fd_mutex);
+
+		mv_src_idx = next_mv_src_idx(ep->fdinfo, ep->fd_num - 1);
+		for (mv_dst_idx = 0; mv_dst_idx < mv_src_idx; mv_dst_idx++) {
+			if (ep->fdinfo[mv_dst_idx].fd != -1)
+				continue;
+			ep->fdinfo[mv_dst_idx] = ep->fdinfo[mv_src_idx];
+			mv_src_idx = next_mv_src_idx(ep->fdinfo,
+				mv_src_idx - 1);
+		}
+		ep->fd_num = mv_src_idx + 1;
+
+		pthread_mutex_unlock(&ep->fd_mutex);
+	}
+}
+
+static void *vfio_user_fd_event_handler(void *arg)
+{
+	struct vfio_user_epoll *ep = arg;
+	struct epoll_event *events;
+	int num_fd, i, ret, cleanup;
+	event_handler evh;
+	FD_INFO *info;
+
+	while (1) {
+		events = ep->events;
+		num_fd = epoll_wait(ep->epfd, events,
+			VFIO_USER_MAX_FD, 1000);
+		if (num_fd <= 0)
+			continue;
+		cleanup = 0;
+
+		for (i = 0; i < num_fd; i++) {
+			info = (FD_INFO *)events[i].data.ptr;
+			evh = info->ev_handle;
+
+			if (evh) {
+				ret = evh(info->fd, info->data);
+				if (ret < 0) {
+					info->fd = -1;
+					cleanup = 1;
+				}
+			}
+		}
+
+		if (cleanup)
+			vfio_user_fd_cleanup(ep);
+	}
+	return NULL;
+}
+
+static inline int vfio_user_add_device(void)
+{
+	struct vfio_user_server *dev;
+	int i;
+
+	pthread_mutex_lock(&vfio_dev_mutex);
+	for (i = 0; i < MAX_VFIO_USER_DEVICE; i++) {
+		if (vfio_user_devices[i] == NULL)
+			break;
+	}
+
+	if (i == MAX_VFIO_USER_DEVICE) {
+		VFIO_USER_LOG(ERR, "vfio user device num reaches max!\n");
+		i = -1;
+		goto exit;
+	}
+
+	dev = malloc(sizeof(struct vfio_user_server));
+	if (dev == NULL) {
+		VFIO_USER_LOG(ERR, "Failed to alloc new vfio-user dev.\n");
+		i = -1;
+		goto exit;
+	}
+
+	memset(dev, 0, sizeof(struct vfio_user_server));
+	vfio_user_devices[i] = dev;
+	dev->dev_id = i;
+	dev->conn_fd = -1;
+
+exit:
+	pthread_mutex_unlock(&vfio_dev_mutex);
+	return i;
+}
+
+static inline void vfio_user_del_device(struct vfio_user_server *dev)
+{
+	if (dev == NULL)
+		return;
+
+	pthread_mutex_lock(&vfio_dev_mutex);
+	vfio_user_devices[dev->dev_id] = NULL;
+	free(dev);
+	pthread_mutex_unlock(&vfio_dev_mutex);
+}
+
+static inline struct vfio_user_server *vfio_user_get_device(int dev_id)
+{
+	struct vfio_user_server *dev;
+
+	pthread_mutex_lock(&vfio_dev_mutex);
+	dev = vfio_user_devices[dev_id];
+	if (!dev)
+		VFIO_USER_LOG(ERR, "Device %d not found.\n", dev_id);
+	pthread_mutex_unlock(&vfio_dev_mutex);
+
+	return dev;
+}
+
+static int vfio_user_message_handler(int dev_id, int fd)
+{
+	struct vfio_user_server *dev;
+	VFIO_USER_MSG msg;
+	uint32_t cmd;
+	int ret = 0;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev)
+		return -1;
+
+	ret = vfio_user_recv_msg(fd, &msg);
+	if (ret <= 0) {
+		if (ret < 0)
+			VFIO_USER_LOG(ERR, "Read message failed\n");
+		else
+			VFIO_USER_LOG(ERR, "Peer closed\n");
+		return -1;
+	}
+
+	if (msg.msg_id != dev->msg_id)
+		return -1;
+	ret = 0;
+	cmd = msg.cmd;
+	dev->msg_id++;
+	if (cmd > VFIO_USER_NONE && cmd < VFIO_USER_MAX &&
+			vfio_user_msg_str[cmd]) {
+		VFIO_USER_LOG(INFO, "Read message %s\n",
+			vfio_user_msg_str[cmd]);
+	} else {
+		VFIO_USER_LOG(ERR, "Read unknown message\n");
+		return -1;
+	}
+
+	if (vfio_user_msg_handlers[cmd])
+		ret = vfio_user_msg_handlers[cmd](dev, &msg);
+	else {
+		VFIO_USER_LOG(ERR, "Handler not defined for %s\n",
+			vfio_user_msg_str[cmd]);
+		ret = -1;
+		goto handle_end;
+	}
+
+	if (!(msg.flags & VFIO_USER_NEED_NO_RP)) {
+		if (ret < 0) {
+			msg.flags |= VFIO_USER_ERROR;
+			msg.err = -ret;
+			/* If an error occurs, the reply message must
+			 * only include the reply header.
+			 */
+			msg.size = VFIO_USER_MSG_HDR_SIZE;
+			VFIO_USER_LOG(ERR, "Handle status error(%d) for %s\n",
+				ret, vfio_user_msg_str[cmd]);
+		}
+
+		ret = vfio_user_reply_msg(fd, &msg);
+		if (ret < 0) {
+			VFIO_USER_LOG(ERR, "Reply error for %s\n",
+				vfio_user_msg_str[cmd]);
+		} else {
+			VFIO_USER_LOG(INFO, "Reply %s succeeds\n",
+				vfio_user_msg_str[cmd]);
+			ret = 0;
+		}
+	}
+
+handle_end:
+	return ret;
+}
+
+static int vfio_user_sock_read(int fd, void *data)
+{
+	struct vfio_user_server_socket *sk = data;
+	int ret, dev_id = sk->sock.dev_id;
+
+	ret = vfio_user_message_handler(dev_id, fd);
+	if (ret < 0) {
+		struct vfio_user_server *dev;
+
+		vfio_user_del_listen_fd(&vfio_ep_sock.ep, sk->conn_fd);
+		close(fd);
+		sk->conn_fd = -1;
+		dev = vfio_user_get_device(dev_id);
+		if (dev)
+			dev->msg_id = 0;
+	}
+
+	return ret;
+}
+
+static void
+vfio_user_set_ifname(int dev_id, const char *sock_addr, unsigned int size)
+{
+	struct vfio_user_server *dev;
+	unsigned int len;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev)
+		return;
+
+	len = size >= sizeof(dev->sock_addr) ?
+		sizeof(dev->sock_addr) - 1 : size;
+	strncpy(dev->sock_addr, sock_addr, len);
+	dev->sock_addr[len] = '\0';
+}
+
+static int
+vfio_user_add_new_connection(int fd, void *data)
+{
+	struct vfio_user_server *dev;
+	int dev_id;
+	size_t size;
+	struct vfio_user_server_socket *sk = data;
+	struct vfio_user_socket *sock = &sk->sock;
+	int conn_fd;
+	int ret;
+
+	if (sk->conn_fd != -1)
+		return 0;
+
+	conn_fd = accept(fd, NULL, NULL);
+	if (conn_fd < 0)
+		return -1;
+
+	VFIO_USER_LOG(INFO, "New vfio-user client(%s) connected\n",
+		sock->sock_addr);
+
+	if (sock == NULL)
+		return -1;
+
+	dev_id = sock->dev_id;
+	sk->conn_fd = conn_fd;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev)
+		return -1;
+
+	dev->conn_fd = conn_fd;
+
+	size = strnlen(sock->sock_addr, PATH_MAX);
+	vfio_user_set_ifname(dev_id, sock->sock_addr, size);
+
+	ret = vfio_user_add_listen_fd(&vfio_ep_sock.ep,
+		conn_fd, vfio_user_sock_read, sk);
+	if (ret < 0) {
+		VFIO_USER_LOG(ERR, "Failed to add fd %d into vfio server fdset\n",
+			conn_fd);
+		goto err_cleanup;
+	}
+
+	return 0;
+
+err_cleanup:
+	close(conn_fd);
+	return -1;
+}
+
+static int
+vfio_user_start_server(struct vfio_user_server_socket *sk)
+{
+	struct vfio_user_server *dev;
+	int ret;
+	struct vfio_user_socket *sock = &sk->sock;
+	int fd = sock->sock_fd;
+	const char *path = sock->sock_addr;
+
+	dev = vfio_user_get_device(sock->dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to start, "
+			"device not found\n");
+		return -1;
+	}
+
+	if (dev->started) {
+		VFIO_USER_LOG(INFO, "device already started\n");
+		return 0;
+	}
+
+	unlink(path);
+	ret = bind(fd, (struct sockaddr *)&sk->un, sizeof(sk->un));
+	if (ret < 0) {
+		VFIO_USER_LOG(ERR, "failed to bind to %s: %s;"
+			" remove it and try again\n",
+			path, strerror(errno));
+		goto err;
+	}
+
+	ret = listen(fd, 128);
+	if (ret < 0)
+		goto err;
+
+	ret = vfio_user_add_listen_fd(&vfio_ep_sock.ep,
+		fd, vfio_user_add_new_connection, (void *)sk);
+	if (ret < 0) {
+		VFIO_USER_LOG(ERR, "failed to add listen fd %d to "
+			"vfio-user server fdset\n", fd);
+		goto err;
+	}
+
+	dev->started = 1;
+
+	return 0;
+
+err:
+	close(fd);
+	return -1;
+}
+
+int rte_vfio_user_register(const char *sock_addr)
+{
+	struct vfio_user_server_socket *sk;
+	struct vfio_user_server *dev;
+	int dev_id;
+
+	if (!sock_addr)
+		return -1;
+
+	sk = vfio_user_create_sock(sock_addr);
+	if (!sk) {
+		VFIO_USER_LOG(ERR, "Create socket failed\n");
+		goto exit;
+	}
+
+	dev_id = vfio_user_add_device();
+	if (dev_id == -1) {
+		VFIO_USER_LOG(ERR, "Failed to add new vfio device\n");
+		goto err_add_dev;
+	}
+	sk->sock.dev_id = dev_id;
+
+	dev = vfio_user_get_device(dev_id);
+
+	dev->ver.major = VFIO_USER_VERSION_MAJOR;
+	dev->ver.minor = VFIO_USER_VERSION_MINOR;
+
+	return 0;
+
+err_add_dev:
+	vfio_user_delete_sock(sk);
+exit:
+	return -1;
+}
+
+int rte_vfio_user_unregister(const char *sock_addr)
+{
+	struct vfio_user_server_socket *sk;
+	struct vfio_user_server *dev;
+	int dev_id;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+	sk = find_vfio_user_socket(sock_addr);
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+
+	if (!sk) {
+		VFIO_USER_LOG(ERR, "Failed to unregister: "
+			"socket addr not registered.\n");
+		return -1;
+	}
+
+	dev_id = sk->sock.dev_id;
+	/* Client may have already disconnected before unregistration */
+	if (sk->conn_fd != -1)
+		vfio_user_del_listen_fd(&vfio_ep_sock.ep, sk->conn_fd);
+	vfio_user_del_listen_fd(&vfio_ep_sock.ep, sk->sock.sock_fd);
+	vfio_user_fd_cleanup(&vfio_ep_sock.ep);
+	vfio_user_delete_sock(sk);
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to unregister: "
+			"device not found.\n");
+		return -1;
+	}
+
+	vfio_user_del_device(dev);
+
+	return 0;
+}
+
+int rte_vfio_user_start(const char *sock_addr)
+{
+	static pthread_t pid;
+	struct vfio_user_server_socket *sock;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+
+	sock = find_vfio_user_socket(sock_addr);
+	if (!sock) {
+		VFIO_USER_LOG(ERR, "sock_addr not registered to vfio_user "
+			"before start\n");
+		goto exit;
+	}
+
+	if (pid == 0) {
+		struct vfio_user_epoll *ep = &vfio_ep_sock.ep;
+
+		if (vfio_user_init_epoll(ep)) {
+			VFIO_USER_LOG(ERR, "Init vfio-user epoll failed\n");
+			goto exit;
+		}
+
+		if (pthread_create(&pid, NULL,
+			vfio_user_fd_event_handler, ep)) {
+			vfio_user_destroy_epoll(ep);
+			VFIO_USER_LOG(ERR, "Event handler thread create failed\n");
+			goto exit;
+		}
+	}
+
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+
+	return vfio_user_start_server(sock);
+
+exit:
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+	return -1;
+}
diff --git a/lib/librte_vfio_user/vfio_user_server.h b/lib/librte_vfio_user/vfio_user_server.h
new file mode 100644
index 0000000000..00e3f6353d
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_server.h
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _VFIO_USER_SERVER_H
+#define _VFIO_USER_SERVER_H
+
+#include <sys/epoll.h>
+
+#include "vfio_user_base.h"
+
+struct vfio_user_server {
+	int dev_id;
+	int started;
+	int conn_fd;
+	uint32_t msg_id;
+	char sock_addr[PATH_MAX];
+	struct vfio_user_version ver;
+};
+
+typedef int (*event_handler)(int fd, void *data);
+
+typedef struct listen_fd_info {
+	int fd;
+	uint32_t event;
+	event_handler ev_handle;
+	void *data;
+} FD_INFO;
+
+struct vfio_user_epoll {
+	int epfd;
+	FD_INFO fdinfo[VFIO_USER_MAX_FD];
+	uint32_t fd_num;	/* Current num of listen_fd */
+	struct epoll_event events[VFIO_USER_MAX_FD];
+	pthread_mutex_t fd_mutex;
+};
+
+struct vfio_user_server_socket {
+	struct vfio_user_socket sock;
+	struct sockaddr_un un;
+	/* For vfio-user protocol v0.1, a server only supports one client */
+	int conn_fd;
+};
+
+struct vfio_user_ep_sock {
+	struct vfio_user_epoll ep;
+	struct vfio_user_server_socket *sock[VFIO_USER_MAX_FD];
+	uint32_t sock_num;
+	pthread_mutex_t mutex;
+};
+
+typedef int (*vfio_user_msg_handler_t)(struct vfio_user_server *dev,
+					VFIO_USER_MSG *msg);
+
+#endif
-- 
2.17.1



* [dpdk-dev] [PATCH 3/9] vfio_user: implement device and region related APIs
  2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 1/9] lib: introduce " Chenbo Xia
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 2/9] vfio_user: implement lifecycle related APIs Chenbo Xia
@ 2020-12-18  7:38 ` Chenbo Xia
  2021-01-06  5:51   ` Xing, Beilei
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 4/9] vfio_user: implement DMA table and socket address API Chenbo Xia
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 43+ messages in thread
From: Chenbo Xia @ 2020-12-18  7:38 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu

This patch introduces the device and region related APIs,
rte_vfio_user_set_dev_info() and rte_vfio_user_set_reg_info().
It also adds handling for the corresponding vfio-user commands,
along with the definitions of all vfio-user command identifiers.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
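A minimal sketch, not part of this patch, of a bounds check that both
region handlers could share. The read handler validates the offset and
size against the region size, while the write handler as posted checks
only the region index. The names region_access_ok() and region_write()
are hypothetical, and the sketch assumes a region backed by a plain
buffer rather than an rw callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): true when the byte range
 * [offset, offset + size) lies inside a region of region_size bytes.
 * It never computes offset + size, so the check cannot wrap around.
 */
static bool
region_access_ok(uint64_t offset, uint32_t size, uint64_t region_size)
{
	if (offset >= region_size)
		return false;
	return size <= region_size - offset;
}

/*
 * Hypothetical direct-copy write path for a region backed by 'base'.
 * Returns 0 on success, -1 when the request falls outside the region.
 */
static int
region_write(uint8_t *base, uint64_t region_size, uint64_t offset,
	const void *data, uint32_t size)
{
	assert(base != NULL && data != NULL);

	if (!region_access_ok(offset, size, region_size))
		return -1;

	memcpy(base + offset, data, size);
	return 0;
}
```

Under these assumptions, a 4-byte write at offset 12 of a 16-byte
region is accepted, while the same write at offset 13 is rejected
before any memory is touched.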
 lib/librte_vfio_user/rte_vfio_user.h    |  60 ++++++
 lib/librte_vfio_user/version.map        |   2 +
 lib/librte_vfio_user/vfio_user_base.c   |  12 ++
 lib/librte_vfio_user/vfio_user_base.h   |  32 +++-
 lib/librte_vfio_user/vfio_user_server.c | 232 ++++++++++++++++++++++++
 lib/librte_vfio_user/vfio_user_server.h |   2 +
 6 files changed, 339 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
index 0d4f6c1be2..8a999c7aa0 100644
--- a/lib/librte_vfio_user/rte_vfio_user.h
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -5,13 +5,35 @@
 #ifndef _RTE_VFIO_USER_H
 #define _RTE_VFIO_USER_H
 
+#include <linux/vfio.h>
+
 #include <rte_compat.h>
 
+struct rte_vfio_user_reg_info;
+
+typedef ssize_t (*rte_vfio_user_reg_acc_t)(struct rte_vfio_user_reg_info *reg,
+		char *buf, size_t count, loff_t pos, bool iswrite);
+
+struct rte_vfio_user_reg_info {
+	rte_vfio_user_reg_acc_t rw;
+	void *base;
+	int fd;
+	struct vfio_region_info *info;
+	void *priv;
+};
+
+struct rte_vfio_user_regions {
+	uint32_t reg_num;
+	struct rte_vfio_user_reg_info reg_info[];
+};
+
 /**
  *  Below APIs are for vfio-user server (device provider) to use:
  *	*rte_vfio_user_register
  *	*rte_vfio_user_unregister
  *	*rte_vfio_user_start
+ *	*rte_vfio_user_set_dev_info
+ *	*rte_vfio_user_set_reg_info
  */
 
 /**
@@ -48,4 +70,42 @@ int rte_vfio_user_unregister(const char *sock_addr);
 __rte_experimental
 int rte_vfio_user_start(const char *sock_addr);
 
+/**
+ * Set the device information for a vfio-user device.
+ *
+ * This information must be set before calling rte_vfio_user_start() and
+ * should not be updated after start. To change it after start, unregister
+ * and re-register the device, so that the vfio-user client can detect the
+ * device-level change.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @param dev_info
+ *   Device information for the vfio-user device
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int rte_vfio_user_set_dev_info(const char *sock_addr,
+	struct vfio_device_info *dev_info);
+
+/**
+ * Set the region information for a vfio-user device.
+ *
+ * This information must be set before calling rte_vfio_user_start() and
+ * should not be updated after start. To change it after start, unregister
+ * and re-register the device, so that the vfio-user client can detect the
+ * device-level change.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @param reg
+ *   Region information for the vfio-user device
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int rte_vfio_user_set_reg_info(const char *sock_addr,
+	struct rte_vfio_user_regions *reg);
+
 #endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index e53095eda8..0f4f5acba5 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -4,6 +4,8 @@ EXPERIMENTAL {
 	rte_vfio_user_register;
 	rte_vfio_user_unregister;
 	rte_vfio_user_start;
+	rte_vfio_user_set_dev_info;
+	rte_vfio_user_set_reg_info;
 
 	local: *;
 };
diff --git a/lib/librte_vfio_user/vfio_user_base.c b/lib/librte_vfio_user/vfio_user_base.c
index bbad553e0a..4960589519 100644
--- a/lib/librte_vfio_user/vfio_user_base.c
+++ b/lib/librte_vfio_user/vfio_user_base.c
@@ -13,6 +13,18 @@ int vfio_user_log_level;
 const char *vfio_user_msg_str[VFIO_USER_MAX] = {
 	[VFIO_USER_NONE] = "VFIO_USER_NONE",
 	[VFIO_USER_VERSION] = "VFIO_USER_VERSION",
+	[VFIO_USER_DMA_MAP] = "VFIO_USER_DMA_MAP",
+	[VFIO_USER_DMA_UNMAP] = "VFIO_USER_DMA_UNMAP",
+	[VFIO_USER_DEVICE_GET_INFO] = "VFIO_USER_DEVICE_GET_INFO",
+	[VFIO_USER_DEVICE_GET_REGION_INFO] = "VFIO_USER_DEVICE_GET_REGION_INFO",
+	[VFIO_USER_DEVICE_GET_IRQ_INFO] = "VFIO_USER_DEVICE_GET_IRQ_INFO",
+	[VFIO_USER_DEVICE_SET_IRQS] = "VFIO_USER_DEVICE_SET_IRQS",
+	[VFIO_USER_REGION_READ] = "VFIO_USER_REGION_READ",
+	[VFIO_USER_REGION_WRITE] = "VFIO_USER_REGION_WRITE",
+	[VFIO_USER_DMA_READ] = "VFIO_USER_DMA_READ",
+	[VFIO_USER_DMA_WRITE] = "VFIO_USER_DMA_WRITE",
+	[VFIO_USER_VM_INTERRUPT] = "VFIO_USER_VM_INTERRUPT",
+	[VFIO_USER_DEVICE_RESET] = "VFIO_USER_DEVICE_RESET",
 };
 
 inline void vfio_user_close_msg_fds(VFIO_USER_MSG *msg)
diff --git a/lib/librte_vfio_user/vfio_user_base.h b/lib/librte_vfio_user/vfio_user_base.h
index 926cecfa7a..0d8abde816 100644
--- a/lib/librte_vfio_user/vfio_user_base.h
+++ b/lib/librte_vfio_user/vfio_user_base.h
@@ -11,6 +11,8 @@
 
 #define VFIO_USER_VERSION_MAJOR 1
 #define VFIO_USER_VERSION_MINOR 0
+#define VFIO_USER_MAX_RSVD 512
+#define VFIO_USER_MAX_RW_DATA 512
 #define VFIO_USER_MAX_FD 1024
 #define VFIO_USER_MAX_VERSION_DATA 512
 
@@ -30,7 +32,19 @@ struct vfio_user_socket {
 typedef enum VFIO_USER_CMD_TYPE {
 	VFIO_USER_NONE = 0,
 	VFIO_USER_VERSION = 1,
-	VFIO_USER_MAX = 2,
+	VFIO_USER_DMA_MAP = 2,
+	VFIO_USER_DMA_UNMAP = 3,
+	VFIO_USER_DEVICE_GET_INFO = 4,
+	VFIO_USER_DEVICE_GET_REGION_INFO = 5,
+	VFIO_USER_DEVICE_GET_IRQ_INFO = 6,
+	VFIO_USER_DEVICE_SET_IRQS = 7,
+	VFIO_USER_REGION_READ = 8,
+	VFIO_USER_REGION_WRITE = 9,
+	VFIO_USER_DMA_READ = 10,
+	VFIO_USER_DMA_WRITE = 11,
+	VFIO_USER_VM_INTERRUPT = 12,
+	VFIO_USER_DEVICE_RESET = 13,
+	VFIO_USER_MAX = 14,
 } VFIO_USER_CMD_TYPE;
 
 struct vfio_user_version {
@@ -40,6 +54,19 @@ struct vfio_user_version {
 	uint8_t ver_data[VFIO_USER_MAX_VERSION_DATA];
 };
 
+struct vfio_user_reg {
+	struct vfio_region_info reg_info;
+	/* Reserved for region capability list */
+	uint8_t rsvd[VFIO_USER_MAX_RSVD];
+};
+
+struct vfio_user_reg_rw {
+	uint64_t reg_offset;
+	uint32_t reg_idx;
+	uint32_t size;
+	char data[VFIO_USER_MAX_RW_DATA];
+};
+
 typedef struct vfio_user_msg {
 	uint16_t msg_id;
 	uint16_t cmd;
@@ -52,6 +79,9 @@ typedef struct vfio_user_msg {
 	uint32_t err;				/* Valid in reply, optional */
 	union {
 		struct vfio_user_version ver;
+		struct vfio_device_info dev_info;
+		struct vfio_user_reg reg_info;
+		struct vfio_user_reg_rw reg_rw;
 	} payload;
 	int fds[VFIO_USER_MAX_FD];
 	int fd_num;
diff --git a/lib/librte_vfio_user/vfio_user_server.c b/lib/librte_vfio_user/vfio_user_server.c
index 545c779fb0..d882f2ccbe 100644
--- a/lib/librte_vfio_user/vfio_user_server.c
+++ b/lib/librte_vfio_user/vfio_user_server.c
@@ -5,6 +5,7 @@
 #include <unistd.h>
 #include <fcntl.h>
 #include <pthread.h>
+#include <inttypes.h>
 #include <sys/socket.h>
 #include <sys/un.h>
 
@@ -38,9 +39,155 @@ static int vfio_user_negotiate_version(struct vfio_user_server *dev,
 		return -ENOTSUP;
 }
 
+static int vfio_user_device_get_info(struct vfio_user_server *dev,
+	VFIO_USER_MSG *msg)
+{
+	struct vfio_device_info *dev_info = &msg->payload.dev_info;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	if (msg->size != sizeof(*dev_info) + VFIO_USER_MSG_HDR_SIZE) {
+		VFIO_USER_LOG(ERR, "Invalid message for get dev info\n");
+		return -EINVAL;
+	}
+
+	memcpy(dev_info, dev->dev_info, sizeof(*dev_info));
+
+	VFIO_USER_LOG(DEBUG, "Device info: argsz(0x%x), flags(0x%x), "
+		"regions(%u), irqs(%u)\n", dev_info->argsz, dev_info->flags,
+		dev_info->num_regions, dev_info->num_irqs);
+
+	return 0;
+}
+
+static int vfio_user_device_get_reg_info(struct vfio_user_server *dev,
+	VFIO_USER_MSG *msg)
+{
+	struct vfio_user_reg *reg = &msg->payload.reg_info;
+	struct rte_vfio_user_reg_info *reg_info;
+	struct vfio_region_info *vinfo;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	if (msg->size > sizeof(*reg) + VFIO_USER_MSG_HDR_SIZE ||
+		dev->reg->reg_num <= reg->reg_info.index) {
+		VFIO_USER_LOG(ERR, "Invalid message for get region info\n");
+		return -EINVAL;
+	}
+
+	reg_info = &dev->reg->reg_info[reg->reg_info.index];
+	vinfo = reg_info->info;
+	memcpy(reg, vinfo, vinfo->argsz);
+
+	if (reg_info->fd != -1) {
+		msg->fd_num = 1;
+		msg->fds[0] = reg_info->fd;
+	}
+
+	VFIO_USER_LOG(DEBUG, "Region(%u) info: addr(0x%" PRIx64 "), fd(%d), "
+		"sz(0x%llx), argsz(0x%x), c_off(0x%x), flags(0x%x) "
+		"off(0x%llx)\n", vinfo->index, (uint64_t)reg_info->base,
+		reg_info->fd, vinfo->size, vinfo->argsz, vinfo->cap_offset,
+		vinfo->flags, vinfo->offset);
+
+	return 0;
+}
+
+static int vfio_user_region_read(struct vfio_user_server *dev,
+	VFIO_USER_MSG *msg)
+{
+	struct vfio_user_reg_rw *rw = &msg->payload.reg_rw;
+	struct rte_vfio_user_regions *reg = dev->reg;
+	struct rte_vfio_user_reg_info *reg_info;
+	size_t count;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	reg_info = &reg->reg_info[rw->reg_idx];
+
+	if (rw->reg_idx >= reg->reg_num ||
+		rw->size > VFIO_USER_MAX_RW_DATA ||
+		rw->reg_offset >= reg_info->info->size ||
+		rw->reg_offset + rw->size > reg_info->info->size) {
+		VFIO_USER_LOG(ERR, "Invalid read region request\n");
+		rw->size = 0;
+		return 0;
+	}
+
+	VFIO_USER_LOG(DEBUG, "Read Region(%u): offset(0x%" PRIx64 "),"
+		"size(0x%x)\n", rw->reg_idx, rw->reg_offset, rw->size);
+
+	if (reg_info->rw) {
+		count = reg_info->rw(reg_info, msg->payload.reg_rw.data,
+				rw->size, rw->reg_offset, 0);
+		rw->size = count;
+		msg->size += count;
+		return 0;
+	}
+
+	memcpy(&msg->payload.reg_rw.data,
+		(uint8_t *)reg_info->base + rw->reg_offset, rw->size);
+	msg->size += rw->size;
+	return 0;
+}
+
+static int vfio_user_region_write(struct vfio_user_server *dev,
+	VFIO_USER_MSG *msg)
+{
+	struct vfio_user_reg_rw *rw = &msg->payload.reg_rw;
+	struct rte_vfio_user_regions *reg = dev->reg;
+	struct rte_vfio_user_reg_info *reg_info;
+	size_t count;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	if (rw->reg_idx >= reg->reg_num) {
+		VFIO_USER_LOG(ERR, "Write to a non-existent region\n");
+		return -EINVAL;
+	}
+
+	reg_info = &reg->reg_info[rw->reg_idx];
+
+	VFIO_USER_LOG(DEBUG, "Write Region(%u): offset(0x%" PRIx64 "),"
+		"size(0x%x)\n", rw->reg_idx, rw->reg_offset, rw->size);
+
+	if (reg_info->rw) {
+		count = reg_info->rw(reg_info, msg->payload.reg_rw.data,
+				rw->size, rw->reg_offset, 1);
+		if (count < rw->size) {
+			VFIO_USER_LOG(ERR, "Write region %d failed\n",
+				rw->reg_idx);
+			return -EIO;
+		}
+		rw->size = 0;
+		return 0;
+	}
+
+	memcpy((uint8_t *)reg_info->base + rw->reg_offset,
+		&msg->payload.reg_rw.data, rw->size);
+	rw->size = 0;
+	return 0;
+}
+
 static vfio_user_msg_handler_t vfio_user_msg_handlers[VFIO_USER_MAX] = {
 	[VFIO_USER_NONE] = NULL,
 	[VFIO_USER_VERSION] = vfio_user_negotiate_version,
+	[VFIO_USER_DMA_MAP] = NULL,
+	[VFIO_USER_DMA_UNMAP] = NULL,
+	[VFIO_USER_DEVICE_GET_INFO] = vfio_user_device_get_info,
+	[VFIO_USER_DEVICE_GET_REGION_INFO] = vfio_user_device_get_reg_info,
+	[VFIO_USER_DEVICE_GET_IRQ_INFO] = NULL,
+	[VFIO_USER_DEVICE_SET_IRQS] = NULL,
+	[VFIO_USER_REGION_READ] = vfio_user_region_read,
+	[VFIO_USER_REGION_WRITE] = vfio_user_region_write,
+	[VFIO_USER_DMA_READ] = NULL,
+	[VFIO_USER_DMA_WRITE] = NULL,
+	[VFIO_USER_VM_INTERRUPT] = NULL,
+	[VFIO_USER_DEVICE_RESET] = NULL,
 };
 
 static struct vfio_user_server_socket *
@@ -549,6 +696,13 @@ vfio_user_start_server(struct vfio_user_server_socket *sk)
 		return 0;
 	}
 
+	/* All the info must be set before start */
+	if (!dev->dev_info || !dev->reg) {
+		VFIO_USER_LOG(ERR, "Failed to start, "
+			"dev/reg info must be set before start\n");
+		return -1;
+	}
+
 	unlink(path);
 	ret = bind(fd, (struct sockaddr *)&sk->un, sizeof(sk->un));
 	if (ret < 0) {
@@ -688,3 +842,81 @@ int rte_vfio_user_start(const char *sock_addr)
 	pthread_mutex_unlock(&vfio_ep_sock.mutex);
 	return -1;
 }
+
+int rte_vfio_user_set_dev_info(const char *sock_addr,
+	struct vfio_device_info *dev_info)
+{
+	struct vfio_user_server *dev;
+	struct vfio_user_server_socket *sk;
+	int dev_id;
+
+	if (!dev_info)
+		return -1;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+	sk = find_vfio_user_socket(sock_addr);
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+
+	if (!sk) {
+		VFIO_USER_LOG(ERR, "Failed to set device info with sock_addr "
+			"%s: addr not registered.\n", sock_addr);
+		return -1;
+	}
+
+	dev_id = sk->sock.dev_id;
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to set device info: "
+			"device %d not found.\n", dev_id);
+		return -1;
+	}
+
+	if (dev->started) {
+		VFIO_USER_LOG(ERR, "Failed to set device info for device %d"
+			 ", device already started\n", dev_id);
+		return -1;
+	}
+
+	dev->dev_info = dev_info;
+
+	return 0;
+}
+
+int rte_vfio_user_set_reg_info(const char *sock_addr,
+	struct rte_vfio_user_regions *reg)
+{
+	struct vfio_user_server *dev;
+	struct vfio_user_server_socket *sk;
+	int dev_id;
+
+	if (!reg)
+		return -1;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+	sk = find_vfio_user_socket(sock_addr);
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+
+	if (!sk) {
+		VFIO_USER_LOG(ERR, "Failed to set region info with sock_addr "
+			"%s: addr not registered.\n", sock_addr);
+		return -1;
+	}
+
+	dev_id = sk->sock.dev_id;
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to set region info: "
+			"device %d not found.\n", dev_id);
+		return -1;
+	}
+
+	if (dev->started) {
+		VFIO_USER_LOG(ERR, "Failed to set region info for device %d"
+			 ", device already started\n", dev_id);
+		return -1;
+	}
+
+	dev->reg = reg;
+
+	return 0;
+}
diff --git a/lib/librte_vfio_user/vfio_user_server.h b/lib/librte_vfio_user/vfio_user_server.h
index 00e3f6353d..e8fb61cb3e 100644
--- a/lib/librte_vfio_user/vfio_user_server.h
+++ b/lib/librte_vfio_user/vfio_user_server.h
@@ -16,6 +16,8 @@ struct vfio_user_server {
 	uint32_t msg_id;
 	char sock_addr[PATH_MAX];
 	struct vfio_user_version ver;
+	struct vfio_device_info *dev_info;
+	struct rte_vfio_user_regions *reg;
 };
 
 typedef int (*event_handler)(int fd, void *data);
-- 
2.17.1



* [dpdk-dev] [PATCH 4/9] vfio_user: implement DMA table and socket address API
  2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
                   ` (2 preceding siblings ...)
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 3/9] vfio_user: implement device and region " Chenbo Xia
@ 2020-12-18  7:38 ` Chenbo Xia
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 5/9] vfio_user: implement interrupt related APIs Chenbo Xia
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 43+ messages in thread
From: Chenbo Xia @ 2020-12-18  7:38 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu

This patch introduces an API, rte_vfio_user_get_mem_table(), that
lets emulated devices acquire the DMA memory table from the vfio-user
library.

Notify operations are also introduced to notify emulated devices of
several events. In addition, a socket address API is introduced to
translate a device ID to its socket address in notify callbacks.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
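A minimal sketch, not taken from the patch, of how a memory table kept
sorted by guest physical address can reject overlapping ranges on
insert and translate a GPA to a host address on lookup. struct
mem_entry, struct mem_table, table_add() and table_gpa_to_hva() are
hypothetical simplifications of rte_vfio_user_mtb_entry and
rte_vfio_user_mem. The sketch omits mmap(), fds, and the separate
-EEXIST case for exact duplicates, and assumes gpa + size does not
wrap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES 8	/* illustrative; the patch uses RTE_VUSER_MAX_DMA */

struct mem_entry {	/* simplified stand-in for rte_vfio_user_mtb_entry */
	uint64_t gpa;
	uint64_t size;
	uint64_t host_user_addr;
};

struct mem_table {
	uint32_t entry_num;
	struct mem_entry entry[MAX_ENTRIES];
};

/* Insert keeping entries sorted by gpa; reject a full table or overlap. */
static int
table_add(struct mem_table *t, const struct mem_entry *e)
{
	uint32_t i;

	assert(t != NULL && e != NULL);

	if (t->entry_num == MAX_ENTRIES)
		return -1;

	for (i = 0; i < t->entry_num; i++) {
		const struct mem_entry *cur = &t->entry[i];

		if (e->gpa >= cur->gpa + cur->size)
			continue;	/* new range lies after cur */
		if (e->gpa + e->size <= cur->gpa)
			break;		/* new range fits before cur */
		return -1;		/* ranges overlap */
	}

	/* Shift the tail up by one slot, then place the new entry. */
	memmove(&t->entry[i + 1], &t->entry[i],
		(t->entry_num - i) * sizeof(*e));
	t->entry[i] = *e;
	t->entry_num++;
	return 0;
}

/* Translate a guest physical address; returns 0 when it is unmapped. */
static uint64_t
table_gpa_to_hva(const struct mem_table *t, uint64_t gpa)
{
	uint32_t i;

	for (i = 0; i < t->entry_num; i++) {
		const struct mem_entry *cur = &t->entry[i];

		if (gpa >= cur->gpa && gpa - cur->gpa < cur->size)
			return cur->host_user_addr + (gpa - cur->gpa);
	}
	return 0;
}
```

An emulated device would presumably apply a lookup of this kind to the
table returned by rte_vfio_user_get_mem_table(dev_id) when it needs to
access guest memory from its data path.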
 lib/librte_vfio_user/rte_vfio_user.h    |  75 ++++-
 lib/librte_vfio_user/version.map        |   2 +
 lib/librte_vfio_user/vfio_user_base.h   |   2 +
 lib/librte_vfio_user/vfio_user_server.c | 363 +++++++++++++++++++++++-
 lib/librte_vfio_user/vfio_user_server.h |   3 +
 5 files changed, 437 insertions(+), 8 deletions(-)

diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
index 8a999c7aa0..044c43e7dc 100644
--- a/lib/librte_vfio_user/rte_vfio_user.h
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -5,10 +5,52 @@
 #ifndef _RTE_VFIO_USER_H
 #define _RTE_VFIO_USER_H
 
+#include <stdint.h>
+#include <stddef.h>
+#include <stdbool.h>
 #include <linux/vfio.h>
+#include <sys/types.h>
 
 #include <rte_compat.h>
 
+#define RTE_VUSER_MAX_DMA 256
+
+struct rte_vfio_user_notify_ops {
+	/* Add device */
+	int (*new_device)(int dev_id);
+	/* Remove device */
+	void (*destroy_device)(int dev_id);
+	/* Update device status */
+	int (*update_status)(int dev_id);
+	/* Lock or unlock data path */
+	int (*lock_dp)(int dev_id, int lock);
+	/* Reset device */
+	int (*reset_device)(int dev_id);
+};
+
+struct rte_vfio_user_mem_reg {
+	uint64_t gpa;
+	uint64_t size;
+	uint64_t fd_offset;
+	uint32_t protection;	/* attributes in <sys/mman.h> */
+#define RTE_VUSER_MEM_MAPPABLE	(0x1 << 0)
+	uint32_t flags;
+};
+
+struct rte_vfio_user_mtb_entry {
+	uint64_t gpa;
+	uint64_t size;
+	uint64_t host_user_addr;
+	void	 *mmap_addr;
+	uint64_t mmap_size;
+	int fd;
+};
+
+struct rte_vfio_user_mem {
+	uint32_t entry_num;
+	struct rte_vfio_user_mtb_entry entry[RTE_VUSER_MAX_DMA];
+};
+
 struct rte_vfio_user_reg_info;
 
 typedef ssize_t (*rte_vfio_user_reg_acc_t)(struct rte_vfio_user_reg_info *reg,
@@ -32,6 +74,8 @@ struct rte_vfio_user_regions {
  *	*rte_vfio_user_register
  *	*rte_vfio_user_unregister
  *	*rte_vfio_user_start
+ *	*rte_vfio_get_sock_addr
+ *	*rte_vfio_user_get_mem_table
  *	*rte_vfio_user_set_dev_info
  *	*rte_vfio_user_set_reg_info
  */
@@ -41,11 +85,14 @@ struct rte_vfio_user_regions {
  *
  * @param sock_addr
  *   Unix domain socket address
+ * @param ops
+ *   Notify ops for the device
  * @return
  *   0 on success, -1 on failure
  */
 __rte_experimental
-int rte_vfio_user_register(const char *sock_addr);
+int rte_vfio_user_register(const char *sock_addr,
+	const struct rte_vfio_user_notify_ops *ops);
 
 /**
  * Unregister a vfio-user device.
@@ -70,6 +117,17 @@ int rte_vfio_user_unregister(const char *sock_addr);
 __rte_experimental
 int rte_vfio_user_start(const char *sock_addr);
 
+/**
+ * Get the memory table of a vfio-user device.
+ *
+ * @param dev_id
+ *   Vfio-user device ID
+ * @return
+ *   Pointer to memory table on success, NULL on failure
+ */
+__rte_experimental
+const struct rte_vfio_user_mem *rte_vfio_user_get_mem_table(int dev_id);
+
 /**
  * Set the device information for a vfio-user device.
  *
@@ -108,4 +166,19 @@ __rte_experimental
 int rte_vfio_user_set_reg_info(const char *sock_addr,
 	struct rte_vfio_user_regions *reg);
 
+/**
+ * Get the socket address for a vfio-user device.
+ *
+ * @param dev_id
+ *   Vfio-user device ID
+ * @param[out] buf
+ *   Buffer to store socket address
+ * @param len
+ *   The length of the buffer
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int rte_vfio_get_sock_addr(int dev_id, char *buf, size_t len);
+
 #endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index 0f4f5acba5..3a50b5ef0e 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -4,6 +4,8 @@ EXPERIMENTAL {
 	rte_vfio_user_register;
 	rte_vfio_user_unregister;
 	rte_vfio_user_start;
+	rte_vfio_get_sock_addr;
+	rte_vfio_user_get_mem_table;
 	rte_vfio_user_set_dev_info;
 	rte_vfio_user_set_reg_info;
 
diff --git a/lib/librte_vfio_user/vfio_user_base.h b/lib/librte_vfio_user/vfio_user_base.h
index 0d8abde816..5f5e651e87 100644
--- a/lib/librte_vfio_user/vfio_user_base.h
+++ b/lib/librte_vfio_user/vfio_user_base.h
@@ -9,6 +9,7 @@
 
 #include "rte_vfio_user.h"
 
+#define VFIO_USER_MSG_MAX_NREG 8
 #define VFIO_USER_VERSION_MAJOR 1
 #define VFIO_USER_VERSION_MINOR 0
 #define VFIO_USER_MAX_RSVD 512
@@ -79,6 +80,7 @@ typedef struct vfio_user_msg {
 	uint32_t err;				/* Valid in reply, optional */
 	union {
 		struct vfio_user_version ver;
+		struct rte_vfio_user_mem_reg memory[VFIO_USER_MSG_MAX_NREG];
 		struct vfio_device_info dev_info;
 		struct vfio_user_reg reg_info;
 		struct vfio_user_reg_rw reg_rw;
diff --git a/lib/librte_vfio_user/vfio_user_server.c b/lib/librte_vfio_user/vfio_user_server.c
index d882f2ccbe..1162e463b7 100644
--- a/lib/librte_vfio_user/vfio_user_server.c
+++ b/lib/librte_vfio_user/vfio_user_server.c
@@ -7,6 +7,7 @@
 #include <pthread.h>
 #include <inttypes.h>
 #include <sys/socket.h>
+#include <sys/mman.h>
 #include <sys/un.h>
 
 #include "vfio_user_server.h"
@@ -39,6 +40,211 @@ static int vfio_user_negotiate_version(struct vfio_user_server *dev,
 		return -ENOTSUP;
 }
 
+static int mmap_one_region(struct rte_vfio_user_mtb_entry *entry,
+	struct rte_vfio_user_mem_reg *memory, int fd)
+{
+	if (fd != -1) {
+		if (memory->fd_offset >= -memory->size) {
+			VFIO_USER_LOG(ERR, "memory fd_offset and size overflow\n");
+			return -EINVAL;
+		}
+		entry->mmap_size = memory->fd_offset + memory->size;
+		entry->mmap_addr = mmap(NULL,
+			entry->mmap_size,
+			memory->protection, MAP_SHARED,
+			fd, 0);
+		if (entry->mmap_addr == MAP_FAILED) {
+			VFIO_USER_LOG(ERR, "Failed to mmap dma region\n");
+			return -EINVAL;
+		}
+
+		entry->host_user_addr =
+			(uint64_t)entry->mmap_addr + memory->fd_offset;
+		entry->fd = fd;
+	} else {
+		entry->mmap_size = 0;
+		entry->mmap_addr = NULL;
+		entry->host_user_addr = 0;
+		entry->fd = -1;
+	}
+
+	entry->gpa = memory->gpa;
+	entry->size = memory->size;
+
+	return 0;
+}
+
+static int add_one_region(struct rte_vfio_user_mem *mem,
+	struct rte_vfio_user_mem_reg *memory, int fd)
+{
+	struct rte_vfio_user_mtb_entry *entry = &mem->entry[0];
+	uint32_t num = mem->entry_num, i, j;
+	uint32_t sz = sizeof(struct rte_vfio_user_mtb_entry);
+	struct rte_vfio_user_mtb_entry ent;
+	int err = 0;
+
+	if (mem->entry_num == RTE_VUSER_MAX_DMA) {
+		VFIO_USER_LOG(ERR, "Add mem region failed, reached max!\n");
+		return -EBUSY;
+	}
+
+	for (i = 0; i < num; i++) {
+		entry = &mem->entry[i];
+
+		if (memory->gpa == entry->gpa &&
+			memory->size == entry->size)
+			return -EEXIST;
+
+		if (memory->gpa > entry->gpa &&
+			memory->gpa >= entry->gpa + entry->size)
+			continue;
+
+		if (memory->gpa < entry->gpa &&
+			memory->gpa + memory->size <= entry->gpa)
+			break;
+
+		return -EINVAL;
+	}
+
+	err = mmap_one_region(&ent, memory, fd);
+	if (err)
+		return err;
+
+	for (j = num; j > i; j--)
+		memcpy(&mem->entry[j], &mem->entry[j - 1], sz);
+	memcpy(&mem->entry[i], &ent, sz);
+	mem->entry_num++;
+
+	VFIO_USER_LOG(DEBUG, "DMA MAP(gpa: 0x%" PRIx64 ", sz: 0x%" PRIx64
+			", hva: 0x%" PRIx64 ", ma: 0x%" PRIx64
+			", msz: 0x%" PRIx64 ", fd: %d)\n", ent.gpa,
+			ent.size, ent.host_user_addr, (uint64_t)ent.mmap_addr,
+			ent.mmap_size, ent.fd);
+	return 0;
+}
+
+static void del_one_region(struct rte_vfio_user_mem *mem,
+	struct rte_vfio_user_mem_reg *memory)
+{
+	struct rte_vfio_user_mtb_entry *entry;
+	uint32_t num = mem->entry_num, i, j;
+	uint32_t sz = sizeof(struct rte_vfio_user_mtb_entry);
+
+	if (mem->entry_num == 0) {
+		VFIO_USER_LOG(ERR, "Delete mem region failed (No region exists)!\n");
+		return;
+	}
+
+	for (i = 0; i < num; i++) {
+		entry = &mem->entry[i];
+
+		if (memory->gpa == entry->gpa &&
+			memory->size == entry->size) {
+			if (entry->mmap_addr != NULL) {
+				munmap(entry->mmap_addr, entry->mmap_size);
+				mem->entry[i].mmap_size = 0;
+				mem->entry[i].mmap_addr = NULL;
+				mem->entry[i].host_user_addr = 0;
+				mem->entry[i].fd = -1;
+			}
+
+			mem->entry[i].gpa = 0;
+			mem->entry[i].size = 0;
+
+			for (j = i; j < num - 1; j++) {
+				memcpy(&mem->entry[j], &mem->entry[j + 1],
+					sz);
+			}
+			mem->entry_num--;
+
+			VFIO_USER_LOG(DEBUG, "DMA UNMAP(gpa: 0x%" PRIx64
+				", sz: 0x%" PRIx64 ", hva: 0x%" PRIx64
+				", ma: 0x%" PRIx64", msz: 0x%" PRIx64
+				", fd: %d)\n", entry->gpa, entry->size,
+				entry->host_user_addr,
+				(uint64_t)entry->mmap_addr, entry->mmap_size,
+				entry->fd);
+
+			return;
+		}
+	}
+
+	VFIO_USER_LOG(ERR, "Failed to find the region for dma unmap!\n");
+}
+
+static int vfio_user_dma_map(struct vfio_user_server *dev, VFIO_USER_MSG *msg)
+{
+	struct rte_vfio_user_mem_reg *memory = msg->payload.memory;
+	uint32_t region_num, expected_fd = 0;
+	uint32_t i, j, fd, fd_idx = 0;
+	int ret = 0;
+
+	if ((msg->size - VFIO_USER_MSG_HDR_SIZE) % sizeof(*memory) != 0) {
+		VFIO_USER_LOG(ERR, "Invalid msg size for dma map\n");
+		vfio_user_close_msg_fds(msg);
+		ret = -EINVAL;
+		goto err;
+	}
+
+	region_num = (msg->size - VFIO_USER_MSG_HDR_SIZE)
+		/ sizeof(struct rte_vfio_user_mem_reg);
+
+	for (i = 0; i < region_num; i++) {
+		if (memory[i].flags & RTE_VUSER_MEM_MAPPABLE)
+			expected_fd++;
+	}
+
+	if (vfio_user_check_msg_fdnum(msg, expected_fd) != 0) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	for (i = 0; i < region_num; i++) {
+		fd = (memory[i].flags & RTE_VUSER_MEM_MAPPABLE) ?
+			msg->fds[fd_idx++] : -1;
+
+		ret = add_one_region(dev->mem, memory + i, fd);
+		if (ret < 0) {
+			VFIO_USER_LOG(ERR, "Failed to add dma map\n");
+			break;
+		}
+	}
+
+	if (i != region_num) {
+		/* Clear all mmapped regions and fds */
+		for (j = 0; j < region_num; j++) {
+			if (j < i)
+				del_one_region(dev->mem, memory + j);
+			else
+				close(msg->fds[j]);
+		}
+	}
+err:
+	/* Do not reply fds back */
+	msg->fd_num = 0;
+	return ret;
+}
+
+static int vfio_user_dma_unmap(struct vfio_user_server *dev, VFIO_USER_MSG *msg)
+{
+	struct rte_vfio_user_mem_reg *memory = msg->payload.memory;
+	uint32_t region_num = (msg->size - VFIO_USER_MSG_HDR_SIZE)
+		/ sizeof(struct rte_vfio_user_mem_reg);
+	uint32_t i;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	if ((msg->size - VFIO_USER_MSG_HDR_SIZE) % sizeof(*memory) != 0) {
+		VFIO_USER_LOG(ERR, "Invalid msg size for dma unmap\n");
+		return -EINVAL;
+	}
+
+	for (i = 0; i < region_num; i++)
+		del_one_region(dev->mem, memory + i);
+
+	return 0;
+}
 static int vfio_user_device_get_info(struct vfio_user_server *dev,
 	VFIO_USER_MSG *msg)
 {
@@ -173,11 +379,62 @@ static int vfio_user_region_write(struct vfio_user_server *dev,
 	return 0;
 }
 
+static inline void vfio_user_destroy_mem_entries(struct rte_vfio_user_mem *mem)
+{
+	struct rte_vfio_user_mtb_entry *ent;
+	uint32_t i;
+
+	for (i = 0; i < mem->entry_num; i++) {
+		ent = &mem->entry[i];
+		if (ent->host_user_addr) {
+			munmap(ent->mmap_addr, ent->mmap_size);
+			close(ent->fd);
+		}
+	}
+
+	memset(mem, 0, sizeof(*mem));
+}
+
+static inline void vfio_user_destroy_mem(struct vfio_user_server *dev)
+{
+	struct rte_vfio_user_mem *mem = dev->mem;
+
+	if (!mem)
+		return;
+
+	vfio_user_destroy_mem_entries(mem);
+
+	free(mem);
+	dev->mem = NULL;
+}
+
+static int vfio_user_device_reset(struct vfio_user_server *dev,
+	VFIO_USER_MSG *msg)
+{
+	struct vfio_device_info *dev_info;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	dev_info = dev->dev_info;
+
+	if (!(dev_info->flags & VFIO_DEVICE_FLAGS_RESET))
+		return -ENOTSUP;
+
+	vfio_user_destroy_mem_entries(dev->mem);
+	dev->is_ready = 0;
+
+	if (dev->ops->reset_device)
+		dev->ops->reset_device(dev->dev_id);
+
+	return 0;
+}
+
 static vfio_user_msg_handler_t vfio_user_msg_handlers[VFIO_USER_MAX] = {
 	[VFIO_USER_NONE] = NULL,
 	[VFIO_USER_VERSION] = vfio_user_negotiate_version,
-	[VFIO_USER_DMA_MAP] = NULL,
-	[VFIO_USER_DMA_UNMAP] = NULL,
+	[VFIO_USER_DMA_MAP] = vfio_user_dma_map,
+	[VFIO_USER_DMA_UNMAP] = vfio_user_dma_unmap,
 	[VFIO_USER_DEVICE_GET_INFO] = vfio_user_device_get_info,
 	[VFIO_USER_DEVICE_GET_REGION_INFO] = vfio_user_device_get_reg_info,
 	[VFIO_USER_DEVICE_GET_IRQ_INFO] = NULL,
@@ -187,7 +444,7 @@ static vfio_user_msg_handler_t vfio_user_msg_handlers[VFIO_USER_MAX] = {
 	[VFIO_USER_DMA_READ] = NULL,
 	[VFIO_USER_DMA_WRITE] = NULL,
 	[VFIO_USER_VM_INTERRUPT] = NULL,
-	[VFIO_USER_DEVICE_RESET] = NULL,
+	[VFIO_USER_DEVICE_RESET] = vfio_user_device_reset,
 };
 
 static struct vfio_user_server_socket *
@@ -518,12 +775,27 @@ static inline struct vfio_user_server *vfio_user_get_device(int dev_id)
 	return dev;
 }
 
+static inline int vfio_user_is_ready(struct vfio_user_server *dev)
+{
+	/* vfio-user currently has no definition of when the device is ready.
+	 * For now, we define it as when the device has at least one dma
+	 * memory table entry.
+	 */
+	if (dev->mem->entry_num > 0) {
+		dev->is_ready = 1;
+		return 1;
+	}
+
+	return 0;
+}
+
 static int vfio_user_message_handler(int dev_id, int fd)
 {
 	struct vfio_user_server *dev;
 	VFIO_USER_MSG msg;
 	uint32_t cmd;
 	int ret = 0;
+	int dev_locked = 0;
 
 	dev = vfio_user_get_device(dev_id);
 	if (!dev)
@@ -552,6 +824,17 @@ static int vfio_user_message_handler(int dev_id, int fd)
 		return -1;
 	}
 
+	/*
+	 * Below messages should lock the data path upon receiving
+	 * to avoid errors in data path handling
+	 */
+	if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP ||
+		cmd == VFIO_USER_DEVICE_RESET)
+		&& dev->ops->lock_dp) {
+		dev->ops->lock_dp(dev_id, 1);
+		dev_locked = 1;
+	}
+
 	if (vfio_user_msg_handlers[cmd])
 		ret = vfio_user_msg_handlers[cmd](dev, &msg);
 	else {
@@ -584,7 +867,18 @@ static int vfio_user_message_handler(int dev_id, int fd)
 		}
 	}
 
+	if (!dev->is_ready) {
+		if (vfio_user_is_ready(dev) && dev->ops->new_device)
+			dev->ops->new_device(dev_id);
+	} else {
+		if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP)
+			&& dev->ops->update_status)
+			dev->ops->update_status(dev_id);
+	}
+
 handle_end:
+	if (dev_locked)
+		dev->ops->lock_dp(dev_id, 0);
 	return ret;
 }
 
@@ -601,8 +895,12 @@ static int vfio_user_sock_read(int fd, void *data)
 		close(fd);
 		sk->conn_fd = -1;
 		dev = vfio_user_get_device(dev_id);
-		if (dev)
+		if (dev) {
+			dev->ops->destroy_device(dev_id);
+			vfio_user_destroy_mem_entries(dev->mem);
+			dev->is_ready = 0;
 			dev->msg_id = 0;
+		}
 	}
 
 	return ret;
@@ -733,13 +1031,14 @@ vfio_user_start_server(struct vfio_user_server_socket *sk)
 	return -1;
 }
 
-int rte_vfio_user_register(const char *sock_addr)
+int rte_vfio_user_register(const char *sock_addr,
+	const struct rte_vfio_user_notify_ops *ops)
 {
 	struct vfio_user_server_socket *sk;
 	struct vfio_user_server *dev;
 	int dev_id;
 
-	if (!sock_addr)
+	if (!sock_addr || !ops)
 		return -1;
 
 	sk = vfio_user_create_sock(sock_addr);
@@ -757,11 +1056,22 @@ int rte_vfio_user_register(const char *sock_addr)
 
 	dev = vfio_user_get_device(dev_id);
 
+	dev->mem = malloc(sizeof(struct rte_vfio_user_mem));
+	if (!dev->mem) {
+		VFIO_USER_LOG(ERR, "Failed to alloc vfio_user_mem\n");
+		goto err_mem;
+	}
+	memset(dev->mem, 0, sizeof(struct rte_vfio_user_mem));
+
 	dev->ver.major = VFIO_USER_VERSION_MAJOR;
 	dev->ver.minor = VFIO_USER_VERSION_MINOR;
+	dev->ops = ops;
+	dev->is_ready = 0;
 
 	return 0;
 
+err_mem:
+	vfio_user_del_device(dev);
 err_add_dev:
 	vfio_user_delete_sock(sk);
 exit:
@@ -798,7 +1108,7 @@ int rte_vfio_user_unregister(const char *sock_addr)
 			"device not found.\n");
 		return -1;
 	}
-
+	vfio_user_destroy_mem(dev);
 	vfio_user_del_device(dev);
 
 	return 0;
@@ -920,3 +1230,42 @@ int rte_vfio_user_set_reg_info(const char *sock_addr,
 
 	return 0;
 }
+
+int rte_vfio_get_sock_addr(int dev_id, char *buf, size_t len)
+{
+	struct vfio_user_server *dev;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get sock address: "
+			"device %d not found.\n", dev_id);
+		return -1;
+	}
+
+	len = len > sizeof(dev->sock_addr) ?
+		sizeof(dev->sock_addr) : len;
+	strncpy(buf, dev->sock_addr, len);
+	buf[len - 1] = '\0';
+
+	return 0;
+}
+
+const struct rte_vfio_user_mem *rte_vfio_user_get_mem_table(int dev_id)
+{
+	struct vfio_user_server *dev;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get memory table: "
+			"device %d not found.\n", dev_id);
+		return NULL;
+	}
+
+	if (!dev->mem) {
+		VFIO_USER_LOG(ERR, "Failed to get memory table for device %d: "
+			"memory table not allocated.\n", dev_id);
+		return NULL;
+	}
+
+	return dev->mem;
+}
diff --git a/lib/librte_vfio_user/vfio_user_server.h b/lib/librte_vfio_user/vfio_user_server.h
index e8fb61cb3e..737940de62 100644
--- a/lib/librte_vfio_user/vfio_user_server.h
+++ b/lib/librte_vfio_user/vfio_user_server.h
@@ -11,11 +11,14 @@
 
 struct vfio_user_server {
 	int dev_id;
+	int is_ready;
 	int started;
 	int conn_fd;
 	uint32_t msg_id;
 	char sock_addr[PATH_MAX];
+	const struct rte_vfio_user_notify_ops *ops;
 	struct vfio_user_version ver;
+	struct rte_vfio_user_mem *mem;
 	struct vfio_device_info *dev_info;
 	struct rte_vfio_user_regions *reg;
 };
-- 
2.17.1


* [dpdk-dev] [PATCH 5/9] vfio_user: implement interrupt related APIs
  2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
                   ` (3 preceding siblings ...)
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 4/9] vfio_user: implement DMA table and socket address API Chenbo Xia
@ 2020-12-18  7:38 ` Chenbo Xia
  2020-12-30  1:04   ` Wu, Jingjing
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 6/9] vfio_user: add client APIs of device attach/detach Chenbo Xia
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 43+ messages in thread
From: Chenbo Xia @ 2020-12-18  7:38 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu

This patch implements two interrupt-related APIs:
rte_vfio_user_get_irq() and rte_vfio_user_set_irq_info().
The former lets a device retrieve its interrupt configuration
(e.g., irqfds). The latter sets the interrupt information and
must be called before the vfio-user device is started.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
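For illustration only, the irqfd bookkeeping behind
rte_vfio_user_set_irq_info() and rte_vfio_user_get_irq() can be sketched as a
standalone fragment. This is a sketch under stated assumptions, not the library
code: struct irq_table and the irq_table_* helpers are hypothetical names, and
only standard Linux eventfd calls are used. It shows one fd row per IRQ index,
every slot initialised to -1, and an interrupt raised by writing to the stored
eventfd.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Hypothetical stand-in for the per-device irqfd table. */
struct irq_table {
	uint32_t irq_num;	/* number of IRQ indexes */
	uint32_t *count;	/* vectors per IRQ index */
	int **fds;		/* fds[index][vector], -1 when unset */
};

/* Close every stored fd and release the table; safe on a partial table. */
static void irq_table_free(struct irq_table *t)
{
	uint32_t i, j;

	assert(t != NULL);
	for (i = 0; t->fds && t->count && i < t->irq_num; i++) {
		for (j = 0; t->fds[i] && j < t->count[i]; j++) {
			if (t->fds[i][j] != -1)
				close(t->fds[i][j]);
		}
		free(t->fds[i]);
	}
	free(t->fds);
	free(t->count);
	memset(t, 0, sizeof(*t));
}

/* Allocate one fd row per IRQ index and mark every slot unset (-1). */
static int irq_table_alloc(struct irq_table *t, uint32_t irq_num,
	const uint32_t *count)
{
	uint32_t i;

	assert(t != NULL && count != NULL);
	memset(t, 0, sizeof(*t));
	if (irq_num == 0)
		return -1;

	t->irq_num = irq_num;
	t->count = calloc(irq_num, sizeof(*t->count));
	t->fds = calloc(irq_num, sizeof(*t->fds));
	if (!t->count || !t->fds)
		goto err;

	for (i = 0; i < irq_num; i++) {
		if (count[i] == 0)
			goto err;
		t->fds[i] = malloc(count[i] * sizeof(int));
		if (!t->fds[i])
			goto err;
		t->count[i] = count[i];
		/* Every byte 0xFF makes each int read back as -1. */
		memset(t->fds[i], 0xFF, count[i] * sizeof(int));
	}
	return 0;
err:
	irq_table_free(t);
	return -1;
}

/* Install fd for one vector, closing the fd it replaces; -1 clears it. */
static int irq_table_set(struct irq_table *t, uint32_t index, uint32_t vec,
	int fd)
{
	if (index >= t->irq_num || vec >= t->count[index])
		return -1;
	if (t->fds[index][vec] != -1)
		close(t->fds[index][vec]);
	t->fds[index][vec] = fd >= 0 ? fd : -1;
	return 0;
}

/* Raise one vector by adding 1 to its eventfd counter. */
static int irq_table_trigger(const struct irq_table *t, uint32_t index,
	uint32_t vec)
{
	if (index >= t->irq_num || vec >= t->count[index])
		return -1;
	if (t->fds[index][vec] < 0)
		return -1;
	return eventfd_write(t->fds[index][vec], 1);
}
```

The table owns every fd stored in it, so irq_table_free() closes them. Zero
vector counts are rejected only to keep the sketch short.
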
 lib/librte_vfio_user/rte_vfio_user.h    |  44 ++++
 lib/librte_vfio_user/version.map        |   2 +
 lib/librte_vfio_user/vfio_user_base.h   |   8 +
 lib/librte_vfio_user/vfio_user_server.c | 292 +++++++++++++++++++++++-
 lib/librte_vfio_user/vfio_user_server.h |   6 +
 5 files changed, 347 insertions(+), 5 deletions(-)

diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
index 044c43e7dc..6c12b0b6ed 100644
--- a/lib/librte_vfio_user/rte_vfio_user.h
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -69,6 +69,11 @@ struct rte_vfio_user_regions {
 	struct rte_vfio_user_reg_info reg_info[];
 };
 
+struct rte_vfio_user_irq_info {
+	uint32_t irq_num;
+	struct vfio_irq_info irq_info[];
+};
+
 /**
  *  Below APIs are for vfio-user server (device provider) to use:
  *	*rte_vfio_user_register
@@ -76,8 +81,10 @@ struct rte_vfio_user_regions {
  *	*rte_vfio_user_start
  *	*rte_vfio_get_sock_addr
  *	*rte_vfio_user_get_mem_table
+ *	*rte_vfio_user_get_irq
  *	*rte_vfio_user_set_dev_info
  *	*rte_vfio_user_set_reg_info
+ *	*rte_vfio_user_set_irq_info
  */
 
 /**
@@ -181,4 +188,41 @@ int rte_vfio_user_set_reg_info(const char *sock_addr,
 __rte_experimental
 int rte_vfio_get_sock_addr(int dev_id, char *buf, size_t len);
 
+/**
+ * Get the irqfds of a vfio-user device.
+ *
+ * @param dev_id
+ *   Vfio-user device ID
+ * @param index
+ *   irq index
+ * @param count
+ *   irq count
+ * @param[out] fds
+ *   Pointer to the irqfds
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int rte_vfio_user_get_irq(int dev_id, uint32_t index, uint32_t count,
+	int *fds);
+
+/**
+ * Set the irq information for a vfio-user device.
+ *
+ * This information must be set before calling rte_vfio_user_start, and should
+ * not be updated after start. To change it after start, unregister and
+ * re-register the device; the vfio-user client can then detect the
+ * device-level change.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @param irq
+ *   IRQ information for the vfio-user device
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int rte_vfio_user_set_irq_info(const char *sock_addr,
+	struct rte_vfio_user_irq_info *irq);
+
 #endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index 3a50b5ef0e..621a51a9fc 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -6,8 +6,10 @@ EXPERIMENTAL {
 	rte_vfio_user_start;
 	rte_vfio_get_sock_addr;
 	rte_vfio_user_get_mem_table;
+	rte_vfio_user_get_irq;
 	rte_vfio_user_set_dev_info;
 	rte_vfio_user_set_reg_info;
+	rte_vfio_user_set_irq_info;
 
 	local: *;
 };
diff --git a/lib/librte_vfio_user/vfio_user_base.h b/lib/librte_vfio_user/vfio_user_base.h
index 5f5e651e87..7fed52a44e 100644
--- a/lib/librte_vfio_user/vfio_user_base.h
+++ b/lib/librte_vfio_user/vfio_user_base.h
@@ -61,6 +61,12 @@ struct vfio_user_reg {
 	uint8_t rsvd[VFIO_USER_MAX_RSVD];
 };
 
+struct vfio_user_irq_set {
+	struct vfio_irq_set set;
+	/* Reserved for data of irq set */
+	uint8_t rsvd[VFIO_USER_MAX_RSVD];
+};
+
 struct vfio_user_reg_rw {
 	uint64_t reg_offset;
 	uint32_t reg_idx;
@@ -83,6 +89,8 @@ typedef struct vfio_user_msg {
 		struct rte_vfio_user_mem_reg memory[VFIO_USER_MSG_MAX_NREG];
 		struct vfio_device_info dev_info;
 		struct vfio_user_reg reg_info;
+		struct vfio_irq_info irq_info;
+		struct vfio_user_irq_set irq_set;
 		struct vfio_user_reg_rw reg_rw;
 	} payload;
 	int fds[VFIO_USER_MAX_FD];
diff --git a/lib/librte_vfio_user/vfio_user_server.c b/lib/librte_vfio_user/vfio_user_server.c
index 1162e463b7..cbaf3b5ed5 100644
--- a/lib/librte_vfio_user/vfio_user_server.c
+++ b/lib/librte_vfio_user/vfio_user_server.c
@@ -9,6 +9,7 @@
 #include <sys/socket.h>
 #include <sys/mman.h>
 #include <sys/un.h>
+#include <sys/eventfd.h>
 
 #include "vfio_user_server.h"
 
@@ -301,6 +302,146 @@ static int vfio_user_device_get_reg_info(struct vfio_user_server *dev,
 	return 0;
 }
 
+static int vfio_user_device_get_irq_info(struct vfio_user_server *dev,
+	VFIO_USER_MSG *msg)
+{
+	struct vfio_irq_info *irq_info = &msg->payload.irq_info;
+	struct rte_vfio_user_irq_info *info = dev->irqs.info;
+	uint32_t i;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	for (i = 0; i < info->irq_num; i++) {
+		if (irq_info->index == info->irq_info[i].index) {
+			irq_info->count = info->irq_info[i].count;
+			irq_info->flags |= info->irq_info[i].flags;
+			break;
+		}
+	}
+	if (i == info->irq_num)
+		return -EINVAL;
+
+	VFIO_USER_LOG(DEBUG, "IRQ info: argsz(0x%x), flags(0x%x), index(0x%x),"
+		" count(0x%x)\n", irq_info->argsz, irq_info->flags,
+		irq_info->index, irq_info->count);
+
+	return 0;
+}
+
+static inline int irq_set_trigger(struct vfio_user_irqs *irqs,
+	struct vfio_irq_set *irq_set, VFIO_USER_MSG *msg)
+{
+	uint32_t i = irq_set->start;
+	int eventfd;
+
+	switch (irq_set->flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
+	case VFIO_IRQ_SET_DATA_NONE:
+		if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+			return -EINVAL;
+
+		for (; i < irq_set->start + irq_set->count; i++) {
+			eventfd = irqs->fds[irq_set->index][i];
+			if (eventfd >= 0) {
+				if (eventfd_write(eventfd, (eventfd_t)1))
+					return -errno;
+			}
+		}
+		break;
+	case VFIO_IRQ_SET_DATA_BOOL:
+		if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+			return -EINVAL;
+
+		uint8_t *idx = irq_set->data;
+		for (; i < irq_set->start + irq_set->count; i++, idx++) {
+			eventfd = irqs->fds[irq_set->index][i];
+			if (eventfd >= 0 && *idx == 1) {
+				if (eventfd_write(eventfd, (eventfd_t)1))
+					return -errno;
+			}
+		}
+		break;
+	case VFIO_IRQ_SET_DATA_EVENTFD:
+		if (vfio_user_check_msg_fdnum(msg, irq_set->count) != 0)
+			return -EINVAL;
+
+		int32_t *fds = msg->fds;
+		for (; i < irq_set->start + irq_set->count; i++, fds++) {
+			eventfd = irqs->fds[irq_set->index][i];
+			if (eventfd >= 0)
+				close(eventfd); /* Clear original irqfd */
+			/* Never keep a closed fd: store the new fd or -1 */
+			irqs->fds[irq_set->index][i] = *fds >= 0 ? *fds : -1;
+		}
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void vfio_user_disable_irqs(struct vfio_user_irqs *irqs)
+{
+	struct rte_vfio_user_irq_info *info = irqs->info;
+	uint32_t i, j;
+
+	for (i = 0; i < info->irq_num; i++) {
+		for (j = 0; j < info->irq_info[i].count; j++) {
+			if (irqs->fds[i][j] != -1) {
+				close(irqs->fds[i][j]);
+				irqs->fds[i][j] = -1;
+			}
+		}
+	}
+}
+
+static int vfio_user_device_set_irqs(struct vfio_user_server *dev,
+	VFIO_USER_MSG *msg)
+{
+	struct vfio_user_irq_set *irq = &msg->payload.irq_set;
+	struct vfio_irq_set *irq_set = &irq->set;
+	struct rte_vfio_user_irq_info *info = dev->irqs.info;
+	int ret = 0;
+
+	if (info->irq_num <= irq_set->index
+		|| info->irq_info[irq_set->index].count <
+		irq_set->start + irq_set->count) {
+		vfio_user_close_msg_fds(msg);
+		return -EINVAL;
+	}
+
+	if (irq_set->count == 0) {
+		if (irq_set->flags & VFIO_IRQ_SET_DATA_NONE) {
+			vfio_user_disable_irqs(&dev->irqs);
+			return 0;
+		}
+		vfio_user_close_msg_fds(msg);
+		return -EINVAL;
+	}
+
+	switch (irq_set->flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
+	/* Mask/Unmask not supported for now */
+	case VFIO_IRQ_SET_ACTION_MASK:
+		/* FALLTHROUGH */
+	case VFIO_IRQ_SET_ACTION_UNMASK:
+		return 0;
+	case VFIO_IRQ_SET_ACTION_TRIGGER:
+		ret = irq_set_trigger(&dev->irqs, irq_set, msg);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	VFIO_USER_LOG(DEBUG, "Set IRQ: argsz(0x%x), flags(0x%x), index(0x%x), "
+		"start(0x%x), count(0x%x)\n", irq_set->argsz, irq_set->flags,
+		irq_set->index, irq_set->start, irq_set->count);
+
+	/* Do not reply fds back */
+	msg->fd_num = 0;
+	return ret;
+}
+
 static int vfio_user_region_read(struct vfio_user_server *dev,
 	VFIO_USER_MSG *msg)
 {
@@ -408,6 +549,48 @@ static inline void vfio_user_destroy_mem(struct vfio_user_server *dev)
 	dev->mem = NULL;
 }
 
+static inline void vfio_user_destroy_irq(struct vfio_user_server *dev)
+{
+	struct vfio_user_irqs *irq = &dev->irqs;
+	int *fd;
+	uint32_t i, j;
+
+	if (!irq->info)
+		return;
+
+	for (i = 0; i < irq->info->irq_num; i++) {
+		fd = irq->fds[i];
+
+		for (j = 0; j < irq->info->irq_info[i].count; j++) {
+			if (fd[j] != -1)
+				close(fd[j]);
+		}
+
+		free(fd);
+	}
+
+	free(irq->fds);
+}
+
+static inline void vfio_user_clean_irqfd(struct vfio_user_server *dev)
+{
+	struct vfio_user_irqs *irq = &dev->irqs;
+	int *fd;
+	uint32_t i, j;
+
+	if (!irq->info)
+		return;
+
+	for (i = 0; i < irq->info->irq_num; i++) {
+		fd = irq->fds[i];
+
+		for (j = 0; j < irq->info->irq_info[i].count; j++) {
+			close(fd[j]);
+			fd[j] = -1;
+		}
+	}
+}
+
 static int vfio_user_device_reset(struct vfio_user_server *dev,
 	VFIO_USER_MSG *msg)
 {
@@ -422,6 +605,7 @@ static int vfio_user_device_reset(struct vfio_user_server *dev,
 		return -ENOTSUP;
 
 	vfio_user_destroy_mem_entries(dev->mem);
+	vfio_user_clean_irqfd(dev);
 	dev->is_ready = 0;
 
 	if (dev->ops->reset_device)
@@ -437,8 +621,8 @@ static vfio_user_msg_handler_t vfio_user_msg_handlers[VFIO_USER_MAX] = {
 	[VFIO_USER_DMA_UNMAP] = vfio_user_dma_unmap,
 	[VFIO_USER_DEVICE_GET_INFO] = vfio_user_device_get_info,
 	[VFIO_USER_DEVICE_GET_REGION_INFO] = vfio_user_device_get_reg_info,
-	[VFIO_USER_DEVICE_GET_IRQ_INFO] = NULL,
-	[VFIO_USER_DEVICE_SET_IRQS] = NULL,
+	[VFIO_USER_DEVICE_GET_IRQ_INFO] = vfio_user_device_get_irq_info,
+	[VFIO_USER_DEVICE_SET_IRQS] = vfio_user_device_set_irqs,
 	[VFIO_USER_REGION_READ] = vfio_user_region_read,
 	[VFIO_USER_REGION_WRITE] = vfio_user_region_write,
 	[VFIO_USER_DMA_READ] = NULL,
@@ -829,6 +1013,7 @@ static int vfio_user_message_handler(int dev_id, int fd)
 	 * to avoid errors in data path handling
 	 */
 	if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP ||
+		cmd == VFIO_USER_DEVICE_SET_IRQS ||
 		cmd == VFIO_USER_DEVICE_RESET)
 		&& dev->ops->lock_dp) {
 		dev->ops->lock_dp(dev_id, 1);
@@ -871,7 +1056,8 @@ static int vfio_user_message_handler(int dev_id, int fd)
 		if (vfio_user_is_ready(dev) && dev->ops->new_device)
 			dev->ops->new_device(dev_id);
 	} else {
-		if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP)
+		if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP
+			|| cmd == VFIO_USER_DEVICE_SET_IRQS)
 			&& dev->ops->update_status)
 			dev->ops->update_status(dev_id);
 	}
@@ -898,6 +1084,7 @@ static int vfio_user_sock_read(int fd, void *data)
 		if (dev) {
 			dev->ops->destroy_device(dev_id);
 			vfio_user_destroy_mem_entries(dev->mem);
+			vfio_user_clean_irqfd(dev);
 			dev->is_ready = 0;
 			dev->msg_id = 0;
 		}
@@ -995,9 +1182,9 @@ vfio_user_start_server(struct vfio_user_server_socket *sk)
 	}
 
 	/* All the info must be set before start */
-	if (!dev->dev_info || !dev->reg) {
+	if (!dev->dev_info || !dev->reg || !dev->irqs.info) {
 		VFIO_USER_LOG(ERR, "Failed to start, "
-			"dev/reg info must be set before start\n");
+			"dev/reg/irq info must be set before start\n");
 		return -1;
 	}
 
@@ -1109,6 +1296,7 @@ int rte_vfio_user_unregister(const char *sock_addr)
 		return -1;
 	}
 	vfio_user_destroy_mem(dev);
+	vfio_user_destroy_irq(dev);
 	vfio_user_del_device(dev);
 
 	return 0;
@@ -1269,3 +1457,97 @@ const struct rte_vfio_user_mem *rte_vfio_user_get_mem_table(int dev_id)
 
 	return dev->mem;
 }
+
+int rte_vfio_user_get_irq(int dev_id, uint32_t index, uint32_t count, int *fds)
+{
+	struct vfio_user_server *dev;
+	struct vfio_user_irqs *irqs;
+	uint32_t irq_max;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get irq info: "
+			"device %d not found.\n", dev_id);
+		return -1;
+	}
+
+	if (!fds)
+		return -1;
+
+	irqs = &dev->irqs;
+	if (!irqs->info || index >= irqs->info->irq_num)
+		return -1;
+
+	irq_max = irqs->info->irq_info[index].count;
+	if (count > irq_max)
+		return -1;
+
+	memcpy(fds, dev->irqs.fds[index], count * sizeof(int));
+	return 0;
+}
+
+int rte_vfio_user_set_irq_info(const char *sock_addr,
+	struct rte_vfio_user_irq_info *irq)
+{
+	struct vfio_user_server *dev;
+	struct vfio_user_server_socket *sk;
+	uint32_t i;
+	int dev_id, ret;
+
+	if (!irq)
+		return -1;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+	sk = find_vfio_user_socket(sock_addr);
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+
+	if (!sk) {
+		VFIO_USER_LOG(ERR, "Failed to set irq info with sock_addr:"
+			"%s: addr not registered.\n", sock_addr);
+		return -1;
+	}
+
+	dev_id = sk->sock.dev_id;
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to set irq info: "
+			"device %d not found.\n", dev_id);
+		return -1;
+	}
+
+	if (dev->started) {
+		VFIO_USER_LOG(ERR, "Failed to set irq info for device %d"
+			 ", device already started\n", dev_id);
+		return -1;
+	}
+
+	if (dev->irqs.info)
+		vfio_user_destroy_irq(dev);
+	dev->irqs.info = NULL;
+
+	dev->irqs.fds = malloc(irq->irq_num * sizeof(int *));
+	if (!dev->irqs.fds)
+		return -1;
+	dev->irqs.info = irq;
+
+	for (i = 0; i < irq->irq_num; i++) {
+		uint32_t sz = irq->irq_info[i].count * sizeof(int);
+		dev->irqs.fds[i] = malloc(sz);
+		if (!dev->irqs.fds[i]) {
+			ret = -1;
+			goto exit;
+		}
+
+		memset(dev->irqs.fds[i], 0xFF, sz);
+	}
+
+	return 0;
+exit:
+	/* Free only the rows allocated before the failure (none if i == 0) */
+	while (i-- > 0)
+		free(dev->irqs.fds[i]);
+	free(dev->irqs.fds);
+	dev->irqs.fds = NULL;
+	dev->irqs.info = NULL;
+	return ret;
+}
diff --git a/lib/librte_vfio_user/vfio_user_server.h b/lib/librte_vfio_user/vfio_user_server.h
index 737940de62..f357c59705 100644
--- a/lib/librte_vfio_user/vfio_user_server.h
+++ b/lib/librte_vfio_user/vfio_user_server.h
@@ -9,6 +9,11 @@
 
 #include "vfio_user_base.h"
 
+struct vfio_user_irqs {
+	struct rte_vfio_user_irq_info *info;
+	int **fds;
+};
+
 struct vfio_user_server {
 	int dev_id;
 	int is_ready;
@@ -21,6 +26,7 @@ struct vfio_user_server {
 	struct rte_vfio_user_mem *mem;
 	struct vfio_device_info *dev_info;
 	struct rte_vfio_user_regions *reg;
+	struct vfio_user_irqs irqs;
 };
 
 typedef int (*event_handler)(int fd, void *data);
-- 
2.17.1


* [dpdk-dev] [PATCH 6/9] vfio_user: add client APIs of device attach/detach
  2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
                   ` (4 preceding siblings ...)
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 5/9] vfio_user: implement interrupt related APIs Chenbo Xia
@ 2020-12-18  7:38 ` Chenbo Xia
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 7/9] vfio_user: add client APIs of DMA/IRQ/region Chenbo Xia
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 43+ messages in thread
From: Chenbo Xia @ 2020-12-18  7:38 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu

This patch implements two APIs, rte_vfio_user_attach_dev() and
rte_vfio_user_detach_dev(), for a vfio-user client to connect to
or disconnect from a vfio-user device on the server side.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
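For illustration only, the slot bookkeeping behind attach and detach can be
sketched as follows. This is a minimal sketch under assumptions, not the code
in this patch: struct client_table, MAX_CLIENT and the client_* helpers are
hypothetical names, and the socket connection itself is left out. It shows a
device ID being the lowest free slot, a second attach to the same socket
address being rejected, and detach validating the ID before freeing the slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CLIENT 4	/* hypothetical; the patch uses MAX_VFIO_USER_CLIENT */

/* Hypothetical client table: slot i holds the attached socket path or NULL. */
struct client_table {
	const char *addr[MAX_CLIENT];
};

/*
 * Return the lowest free slot for sock_addr, or -1 if the address is
 * already attached or no slot is free.
 */
static int client_slot_alloc(const struct client_table *t,
	const char *sock_addr)
{
	int i, free_idx = -1;

	assert(t != NULL);
	if (sock_addr == NULL)
		return -1;

	for (i = 0; i < MAX_CLIENT; i++) {
		if (t->addr[i] == NULL) {
			if (free_idx == -1)
				free_idx = i;
			continue;
		}
		if (strcmp(t->addr[i], sock_addr) == 0)
			return -1;	/* duplicate attach */
	}

	return free_idx;
}

/* Attach: the returned slot index doubles as the device ID. */
static int client_attach(struct client_table *t, const char *sock_addr)
{
	int idx = client_slot_alloc(t, sock_addr);

	/* The sketch stores the caller's pointer; real code would copy it. */
	if (idx >= 0)
		t->addr[idx] = sock_addr;
	return idx;
}

/* Detach: reject out-of-range or unused IDs before freeing the slot. */
static int client_detach(struct client_table *t, int dev_id)
{
	assert(t != NULL);
	if (dev_id < 0 || dev_id >= MAX_CLIENT || t->addr[dev_id] == NULL)
		return -1;

	t->addr[dev_id] = NULL;
	return 0;
}
```

Scanning the whole table keeps the duplicate check complete even when the
occupied slots are not contiguous, at the cost of a linear walk per attach.
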
 lib/librte_vfio_user/meson.build        |   3 +-
 lib/librte_vfio_user/rte_vfio_user.h    |  30 +++
 lib/librte_vfio_user/version.map        |   2 +
 lib/librte_vfio_user/vfio_user_client.c | 279 ++++++++++++++++++++++++
 lib/librte_vfio_user/vfio_user_client.h |  25 +++
 5 files changed, 338 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_vfio_user/vfio_user_client.c
 create mode 100644 lib/librte_vfio_user/vfio_user_client.h

diff --git a/lib/librte_vfio_user/meson.build b/lib/librte_vfio_user/meson.build
index b7363f61c6..5761f0edd1 100644
--- a/lib/librte_vfio_user/meson.build
+++ b/lib/librte_vfio_user/meson.build
@@ -6,5 +6,6 @@ if not is_linux
 	reason = 'only supported on Linux'
 endif
 
-sources = files('vfio_user_base.c', 'vfio_user_server.c')
+sources = files('vfio_user_base.c', 'vfio_user_server.c',
+	'vfio_user_client.c')
 headers = files('rte_vfio_user.h')
diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
index 6c12b0b6ed..b09d83e224 100644
--- a/lib/librte_vfio_user/rte_vfio_user.h
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -225,4 +225,34 @@ __rte_experimental
 int rte_vfio_user_set_irq_info(const char *sock_addr,
 	struct rte_vfio_user_irq_info *irq);
 
+/**
+ *  Below APIs are for vfio-user client (device consumer) to use:
+ *	*rte_vfio_user_attach_dev
+ *	*rte_vfio_user_detach_dev
+ */
+
+/**
+ * Attach to a vfio-user device.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @return
+ *   - >=0: Success, device attached. Returned value is the device ID.
+ *   - <0: Failure on device attach
+ */
+__rte_experimental
+int rte_vfio_user_attach_dev(const char *sock_addr);
+
+/**
+ * Detach from a vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @return
+ *   - 0: Success, device detached
+ *   - <0: Failure on device detach
+ */
+__rte_experimental
+int rte_vfio_user_detach_dev(int dev_id);
+
 #endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index 621a51a9fc..a0cda2b49c 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -10,6 +10,8 @@ EXPERIMENTAL {
 	rte_vfio_user_set_dev_info;
 	rte_vfio_user_set_reg_info;
 	rte_vfio_user_set_irq_info;
+	rte_vfio_user_attach_dev;
+	rte_vfio_user_detach_dev;
 
 	local: *;
 };
diff --git a/lib/librte_vfio_user/vfio_user_client.c b/lib/librte_vfio_user/vfio_user_client.c
new file mode 100644
index 0000000000..85b2e8cb46
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_client.c
@@ -0,0 +1,279 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <string.h>
+#include <pthread.h>
+#include <fcntl.h>
+#include <sys/un.h>
+#include <sys/socket.h>
+
+#include "vfio_user_client.h"
+#include "rte_vfio_user.h"
+
+#define REPLY_USEC 1000
+#define RECV_MAX_TRY 50
+
+static struct vfio_user_client_devs vfio_client_devs = {
+	.cl_num = 0,
+	.mutex = PTHREAD_MUTEX_INITIALIZER,
+};
+
+/* Check if the sock_addr exists. If not, alloc and return index */
+static int vfio_user_client_allocate(const char *sock_addr)
+{
+	uint32_t i, count = 0;
+	int index = -1;
+
+	if (sock_addr == NULL)
+		return -1;
+
+	if (vfio_client_devs.cl_num == 0)
+		return 0;
+
+	for (i = 0; i < MAX_VFIO_USER_CLIENT; i++) {
+		struct vfio_user_client *cl = vfio_client_devs.cl[i];
+
+		if (!cl) {
+			if (index == -1)
+				index = i;
+			continue;
+		}
+
+		if (!strcmp(cl->sock.sock_addr, sock_addr))
+			return -1;
+
+		/* All clients checked: stop once a free slot is known */
+		if (++count == vfio_client_devs.cl_num && index != -1)
+			break;
+	}
+
+	return index;
+}
+
+static struct vfio_user_client *
+vfio_user_client_create_dev(const char *sock_addr)
+{
+	struct vfio_user_client *cl;
+	struct vfio_user_socket *sock;
+	int fd, idx;
+	struct sockaddr_un un;
+
+	pthread_mutex_lock(&vfio_client_devs.mutex);
+	if (vfio_client_devs.cl_num == MAX_VFIO_USER_CLIENT) {
+		VFIO_USER_LOG(ERR, "Failed to create client:"
+			" client num reaches max\n");
+		goto err;
+	}
+
+	idx = vfio_user_client_allocate(sock_addr);
+	if (idx < 0) {
+		VFIO_USER_LOG(ERR, "Failed to create client:"
+			" socket addr exists\n");
+		goto err;
+	}
+
+	cl = malloc(sizeof(*cl));
+	if (!cl) {
+		VFIO_USER_LOG(ERR, "Failed to alloc client\n");
+		goto err;
+	}
+
+	sock = &cl->sock;
+	sock->sock_addr = strdup(sock_addr);
+	if (!sock->sock_addr) {
+		VFIO_USER_LOG(ERR, "Failed to copy sock_addr\n");
+		goto err_dup;
+	}
+
+	fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	if (fd < 0) {
+		VFIO_USER_LOG(ERR, "Client failed to create socket: %s\n",
+			strerror(errno));
+		goto err_sock;
+	}
+
+	if (fcntl(fd, F_SETFL, O_NONBLOCK)) {
+		VFIO_USER_LOG(ERR, "Failed to set nonblocking mode for client "
+			"socket, fd: %d (%s)\n", fd, strerror(errno));
+		goto err_ctl;
+	}
+
+	memset(&un, 0, sizeof(un));
+	un.sun_family = AF_UNIX;
+	strncpy(un.sun_path, sock->sock_addr, sizeof(un.sun_path));
+	un.sun_path[sizeof(un.sun_path) - 1] = '\0';
+
+	if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
+		VFIO_USER_LOG(ERR, "Client connect error, %s\n",
+			strerror(errno));
+		goto err_ctl;
+	}
+
+	sock->sock_fd = fd;
+	sock->dev_id = idx;
+
+	vfio_client_devs.cl[idx] = cl;
+	vfio_client_devs.cl_num++;
+
+	pthread_mutex_unlock(&vfio_client_devs.mutex);
+
+	return cl;
+
+err_ctl:
+	close(fd);
+err_sock:
+	free(sock->sock_addr);
+err_dup:
+	free(cl);
+err:
+	pthread_mutex_unlock(&vfio_client_devs.mutex);
+	return NULL;
+}
+
+static int vfio_user_client_destroy_dev(int dev_id)
+{
+	struct vfio_user_client *cl;
+	struct vfio_user_socket *sock;
+	int ret = 0;
+
+	pthread_mutex_lock(&vfio_client_devs.mutex);
+	if (vfio_client_devs.cl_num == 0) {
+		VFIO_USER_LOG(ERR, "Failed to destroy client:"
+			" no client exists\n");
+		ret = -EINVAL;
+		goto err;
+	}
+
+	cl = vfio_client_devs.cl[dev_id];
+	if (!cl) {
+		VFIO_USER_LOG(ERR, "Failed to destroy client:"
+			" wrong device ID(%d)\n", dev_id);
+		ret = -EINVAL;
+		goto err;
+	}
+
+	sock = &cl->sock;
+	free(sock->sock_addr);
+	close(sock->sock_fd);
+
+	free(cl);
+	vfio_client_devs.cl[dev_id] = NULL;
+	vfio_client_devs.cl_num--;
+
+err:
+	pthread_mutex_unlock(&vfio_client_devs.mutex);
+	return ret;
+}
+
+static inline void vfio_user_client_fill_hdr(VFIO_USER_MSG *msg, uint16_t cmd,
+	uint32_t sz)
+{
+	static uint16_t	msg_id;
+
+	msg->msg_id = msg_id++;
+	msg->cmd = cmd;
+	msg->size = sz;
+	msg->flags = VFIO_USER_TYPE_CMD;
+	msg->err = 0;
+}
+
+static int vfio_user_client_send_recv(int sock_fd, VFIO_USER_MSG *msg)
+{
+	uint16_t cmd = msg->cmd;
+	uint16_t id = msg->msg_id;
+	uint8_t try_recv = 0;
+	int ret;
+
+	ret = vfio_user_send_msg(sock_fd, msg);
+	if (ret < 0) {
+		VFIO_USER_LOG(ERR, "Send error for %s\n",
+			vfio_user_msg_str[cmd]);
+		return -1;
+	}
+
+	VFIO_USER_LOG(INFO, "Send request %s\n", vfio_user_msg_str[cmd]);
+
+	memset(msg, 0, sizeof(*msg));
+
+	while (try_recv < RECV_MAX_TRY) {
+		ret = vfio_user_recv_msg(sock_fd, msg);
+		if (!ret) {
+			VFIO_USER_LOG(ERR, "Peer closed\n");
+			return -1;
+		} else if (ret > 0) {
+			if (id != msg->msg_id)
+				continue;
+			else
+				break;
+		}
+		usleep(REPLY_USEC);
+		try_recv++;
+	}
+
+	if (cmd != msg->cmd) {
+		VFIO_USER_LOG(ERR, "Request and reply mismatch\n");
+		ret = -1;
+	} else
+		ret = 0;
+
+	VFIO_USER_LOG(INFO, "Recv reply %s\n", vfio_user_msg_str[cmd]);
+
+	return ret;
+}
+
+int rte_vfio_user_attach_dev(const char *sock_addr)
+{
+	struct vfio_user_client *dev;
+	VFIO_USER_MSG msg;
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct vfio_user_version)
+		- VFIO_USER_MAX_VERSION_DATA;
+	struct vfio_user_version *ver = &msg.payload.ver;
+	int ret;
+
+	if (!sock_addr)
+		return -EINVAL;
+
+	dev = vfio_user_client_create_dev(sock_addr);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to attach the device "
+			"with sock_addr %s\n", sock_addr);
+		return -1;
+	}
+
+	memset(&msg, 0, sizeof(VFIO_USER_MSG));
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_VERSION, sz);
+	ver->major = VFIO_USER_VERSION_MAJOR;
+	ver->minor = VFIO_USER_VERSION_MINOR;
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to negotiate version: %s\n",
+				msg.err ? strerror(msg.err) : "Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	return dev->sock.dev_id;
+}
+
+int rte_vfio_user_detach_dev(int dev_id)
+{
+	int ret;
+
+	if (dev_id < 0 || dev_id >= MAX_VFIO_USER_CLIENT)
+		return -EINVAL;
+
+	ret = vfio_user_client_destroy_dev(dev_id);
+	if (ret)
+		VFIO_USER_LOG(ERR, "Failed to detach the device (ID:%d)\n",
+			dev_id);
+
+	return ret;
+}
diff --git a/lib/librte_vfio_user/vfio_user_client.h b/lib/librte_vfio_user/vfio_user_client.h
new file mode 100644
index 0000000000..d489e62037
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_client.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _VFIO_USER_CLIENT_H
+#define _VFIO_USER_CLIENT_H
+
+#include <stdint.h>
+
+#include "vfio_user_base.h"
+
+#define MAX_VFIO_USER_CLIENT 1024
+
+struct vfio_user_client {
+	struct vfio_user_socket sock;
+	uint8_t rsvd[16];	/* Reserved for future use */
+};
+
+struct vfio_user_client_devs {
+	struct vfio_user_client *cl[MAX_VFIO_USER_CLIENT];
+	uint32_t cl_num;
+	pthread_mutex_t mutex;
+};
+
+#endif
-- 
2.17.1


* [dpdk-dev] [PATCH 7/9] vfio_user: add client APIs of DMA/IRQ/region
  2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
                   ` (5 preceding siblings ...)
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 6/9] vfio_user: add client APIs of device attach/detach Chenbo Xia
@ 2020-12-18  7:38 ` Chenbo Xia
  2021-01-07  2:41   ` Xing, Beilei
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 8/9] test/vfio_user: introduce functional test Chenbo Xia
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 43+ messages in thread
From: Chenbo Xia @ 2020-12-18  7:38 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu

This patch introduces nine client APIs:
- Device related:
  rte_vfio_user_get_dev_info and rte_vfio_user_reset
- DMA related:
  rte_vfio_user_dma_map/unmap
- Region related:
  rte_vfio_user_get_reg_info and rte_vfio_user_region_read/write
- IRQ related:
  rte_vfio_user_get_irq_info and rte_vfio_user_set_irqs

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
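For illustration only, the way a DMA map/unmap request carries its regions can
be sketched as follows. This is a sketch under assumptions, not the code in
this series: the sketch_* types, field names, command number and macros are
hypothetical and simplified, while the real layouts are in vfio_user_base.h.
It shows the client setting the message size to header plus n regions, and the
receiver recovering n from that size while rejecting sizes that are too short
or not a whole number of regions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layouts; the real ones are in vfio_user_base.h. */
struct sketch_mem_reg {
	uint64_t gpa;
	uint64_t size;
	uint64_t fd_offset;
	uint32_t protection;
	uint32_t flags;
};

#define SKETCH_MAX_NREG 8
#define SKETCH_CMD_DMA_MAP 2	/* hypothetical command number */

struct sketch_msg {
	uint16_t msg_id;
	uint16_t cmd;
	uint32_t size;		/* header plus payload, in bytes */
	uint32_t flags;
	uint32_t err;
	struct sketch_mem_reg memory[SKETCH_MAX_NREG];
};

#define SKETCH_HDR_SIZE offsetof(struct sketch_msg, memory)

/* Client side: build a DMA map/unmap request carrying n regions. */
static int sketch_fill_dma_req(struct sketch_msg *msg, uint16_t id,
	uint16_t cmd, const struct sketch_mem_reg *regs, uint32_t n)
{
	assert(msg != NULL && regs != NULL);
	if (n == 0 || n > SKETCH_MAX_NREG)
		return -1;

	memset(msg, 0, sizeof(*msg));
	msg->msg_id = id;
	msg->cmd = cmd;
	msg->size = (uint32_t)(SKETCH_HDR_SIZE + n * sizeof(*regs));
	memcpy(msg->memory, regs, n * sizeof(*regs));
	return 0;
}

/* Server side: recover the region count, rejecting malformed sizes. */
static int sketch_region_count(const struct sketch_msg *msg, uint32_t *n)
{
	size_t payload;

	assert(msg != NULL && n != NULL);
	if (msg->size < SKETCH_HDR_SIZE)
		return -1;	/* the subtraction below would underflow */

	payload = msg->size - SKETCH_HDR_SIZE;
	if (payload == 0 || payload % sizeof(struct sketch_mem_reg) != 0)
		return -1;
	if (payload / sizeof(struct sketch_mem_reg) > SKETCH_MAX_NREG)
		return -1;

	*n = (uint32_t)(payload / sizeof(struct sketch_mem_reg));
	return 0;
}
```

Checking the size against the header length before subtracting avoids an
unsigned underflow when a peer sends a truncated message.
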
 lib/librte_vfio_user/rte_vfio_user.h    | 168 ++++++++++
 lib/librte_vfio_user/version.map        |   9 +
 lib/librte_vfio_user/vfio_user_client.c | 412 ++++++++++++++++++++++++
 3 files changed, 589 insertions(+)
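
For the region read/write APIs, the message size depends on the direction: a
read request carries only the header and the fixed fields, while a write
request also carries the data. A minimal sketch of that computation follows,
assuming a hypothetical header size, data limit and field layout (MSG_HDR_SIZE,
MAX_RW_DATA and struct reg_rw are illustrative, not the definitions in
vfio_user_base.h):

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MSG_HDR_SIZE 16u	/* hypothetical header size */
#define MAX_RW_DATA 512u	/* hypothetical max data bytes per message */

/* Hypothetical payload layout: fixed fields followed by inline data. */
struct reg_rw {
	uint64_t reg_offset;
	uint32_t reg_idx;
	uint32_t size;
	char data[MAX_RW_DATA];
};

/*
 * Compute the message size for a region access.
 * The limit is applied to the caller's data size, not to the message size.
 */
static int
region_msg_size(uint32_t size, int iswrite, uint32_t *msg_sz)
{
	uint32_t fixed = MSG_HDR_SIZE +
		(uint32_t)offsetof(struct reg_rw, data);

	if (msg_sz == NULL || size > MAX_RW_DATA)
		return -EINVAL;

	*msg_sz = iswrite ? fixed + size : fixed;
	return 0;
}
```

With these assumed constants, a 4-byte write yields a message 4 bytes longer
than the corresponding read request, and any access above MAX_RW_DATA is
rejected before a message is built.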

diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
index b09d83e224..fe27d05992 100644
--- a/lib/librte_vfio_user/rte_vfio_user.h
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -229,6 +229,15 @@ int rte_vfio_user_set_irq_info(const char *sock_addr,
  *  Below APIs are for vfio-user client (device consumer) to use:
  *	*rte_vfio_user_attach_dev
  *	*rte_vfio_user_detach_dev
+ *	*rte_vfio_user_get_dev_info
+ *	*rte_vfio_user_get_reg_info
+ *	*rte_vfio_user_get_irq_info
+ *	*rte_vfio_user_dma_map
+ *	*rte_vfio_user_dma_unmap
+ *	*rte_vfio_user_set_irqs
+ *	*rte_vfio_user_region_read
+ *	*rte_vfio_user_region_write
+ *	*rte_vfio_user_reset
  */
 
 /**
@@ -255,4 +264,163 @@ int rte_vfio_user_attach_dev(const char *sock_addr);
 __rte_experimental
 int rte_vfio_user_detach_dev(int dev_id);
 
+/**
+ * Get device information of a vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param[out] info
+ *   A pointer to a structure of type *vfio_device_info* to be filled with the
+ *   information of the device.
+ * @return
+ *   - 0: Success, device information updated
+ *   - <0: Failure to get device information
+ */
+__rte_experimental
+int rte_vfio_user_get_dev_info(int dev_id, struct vfio_device_info *info);
+
+/**
+ * Get region information of a vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param[out] info
+ *   A pointer to a structure of type *vfio_region_info* to be filled with the
+ *   information of the device region.
+ * @param[out] fd
+ *   A pointer to the file descriptor of the region
+ * @return
+ *   - 0: Success, region information and file descriptor updated. If the region
+ *        cannot be mmapped, the file descriptor is set to -1.
+ *   - <0: Failure to get region information
+ */
+__rte_experimental
+int rte_vfio_user_get_reg_info(int dev_id, struct vfio_region_info *info,
+	int *fd);
+
+/**
+ * Get IRQ information of a vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param[out] info
+ *   A pointer to a structure of type *vfio_irq_info* to be filled with the
+ *   information of the IRQ.
+ * @return
+ *   - 0: Success, IRQ information updated
+ *   - <0: Failure to get IRQ information
+ */
+__rte_experimental
+int rte_vfio_user_get_irq_info(int dev_id, struct vfio_irq_info *info);
+
+/**
+ * Map DMA regions for the vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param mem
+ *   A pointer to an array of *rte_vfio_user_mem_reg* structures, each of
+ *   which identifies one DMA region.
+ * @param fds
+ *   A pointer to a list of file descriptors. One file descriptor maps to
+ *   one DMA region.
+ * @param num
+ *   Number of DMA regions (or file descriptors)
+ * @return
+ *   - 0: Success, all DMA regions are mapped.
+ *   - <0: Failure on DMA map. It should be assumed that none of the DMA
+ *         regions are mapped.
+ */
+__rte_experimental
+int rte_vfio_user_dma_map(int dev_id, struct rte_vfio_user_mem_reg *mem,
+	int *fds, uint32_t num);
+
+/**
+ * Unmap DMA regions for the vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param mem
+ *   A pointer to an array of *rte_vfio_user_mem_reg* structures, each of
+ *   which identifies one DMA region.
+ * @param num
+ *   Number of DMA regions
+ * @return
+ *   - 0: Success, all DMA regions are unmapped.
+ *   - <0: Failure on DMA unmap. It should be assumed that none of the DMA
+ *         regions are unmapped.
+ */
+__rte_experimental
+int rte_vfio_user_dma_unmap(int dev_id, struct rte_vfio_user_mem_reg *mem,
+	uint32_t num);
+
+/**
+ * Set interrupt signaling, masking, and unmasking for the vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param set
+ *   A pointer to a structure of type *vfio_irq_set* that specifies the set
+ *   data and action
+ * @return
+ *   - 0: Success, IRQs are set successfully.
+ *   - <0: Failure on IRQ set.
+ */
+__rte_experimental
+int rte_vfio_user_set_irqs(int dev_id, struct vfio_irq_set *set);
+
+/**
+ * Read region of the vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param idx
+ *   The region index
+ * @param offset
+ *   The region offset
+ * @param size
+ *   Size of the read data
+ * @param[out] data
+ *   Pointer to the buffer to be filled with the data read from the region
+ * @return
+ *   - 0: Success on region read
+ *   - <0: Failure on region read
+ */
+__rte_experimental
+int rte_vfio_user_region_read(int dev_id, uint32_t idx, uint64_t offset,
+	uint32_t size, void *data);
+
+/**
+ * Write region of the vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param idx
+ *   The region index
+ * @param offset
+ *   The region offset
+ * @param size
+ *   Size of the data to write
+ * @param data
+ *   The pointer to data that will be written to the region
+ * @return
+ *   - 0: Success on region write
+ *   - <0: Failure on region write
+ */
+__rte_experimental
+int rte_vfio_user_region_write(int dev_id, uint32_t idx, uint64_t offset,
+	uint32_t size, const void *data);
+
+/**
+ * Reset the vfio-user device
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @return
+ *   - 0: Success on device reset
+ *   - <0: Failure on device reset
+ */
+__rte_experimental
+int rte_vfio_user_reset(int dev_id);
+
 #endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index a0cda2b49c..2e362d00bc 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -12,6 +12,15 @@ EXPERIMENTAL {
 	rte_vfio_user_set_irq_info;
 	rte_vfio_user_attach_dev;
 	rte_vfio_user_detach_dev;
+	rte_vfio_user_get_dev_info;
+	rte_vfio_user_get_reg_info;
+	rte_vfio_user_get_irq_info;
+	rte_vfio_user_dma_map;
+	rte_vfio_user_dma_unmap;
+	rte_vfio_user_set_irqs;
+	rte_vfio_user_region_read;
+	rte_vfio_user_region_write;
+	rte_vfio_user_reset;
 
 	local: *;
 };
diff --git a/lib/librte_vfio_user/vfio_user_client.c b/lib/librte_vfio_user/vfio_user_client.c
index 85b2e8cb46..24969563d5 100644
--- a/lib/librte_vfio_user/vfio_user_client.c
+++ b/lib/librte_vfio_user/vfio_user_client.c
@@ -277,3 +277,415 @@ int rte_vfio_user_detach_dev(int dev_id)
 
 	return ret;
 }
+
+int rte_vfio_user_get_dev_info(int dev_id, struct vfio_device_info *info)
+{
+	VFIO_USER_MSG msg;
+	struct vfio_user_client *dev;
+	int ret;
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct vfio_device_info);
+
+	if (!info)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get device info: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	memset(&msg, 0, sizeof(VFIO_USER_MSG));
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_GET_INFO, sz);
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to get device info: %s\n",
+				msg.err ? strerror(msg.err) : "Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	memcpy(info, &msg.payload.dev_info, sizeof(*info));
+	return ret;
+}
+
+int rte_vfio_user_get_reg_info(int dev_id, struct vfio_region_info *info,
+	int *fd)
+{
+	VFIO_USER_MSG msg;
+	int ret, fd_num = 0;
+	struct vfio_user_client *dev;
+	uint32_t sz;
+	struct vfio_user_reg *reg = &msg.payload.reg_info;
+
+	if (!info || !fd)
+		return -EINVAL;
+	sz = VFIO_USER_MSG_HDR_SIZE + info->argsz;
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get region info: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	memset(&msg, 0, sizeof(VFIO_USER_MSG));
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_GET_REGION_INFO, sz);
+	reg->reg_info.index = info->index;
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to get region(%u) info: %s\n",
+				info->index, msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (reg->reg_info.flags & VFIO_REGION_INFO_FLAG_MMAP)
+		fd_num = 1;
+
+	if (vfio_user_check_msg_fdnum(&msg, fd_num) != 0)
+		return -1;
+
+	if (reg->reg_info.index != info->index ||
+		msg.size - VFIO_USER_MSG_HDR_SIZE > sizeof(*reg)) {
+		VFIO_USER_LOG(ERR,
+			"Incorrect reply message for region info\n");
+		return -1;
+	}
+
+	memcpy(info, &msg.payload.reg_info, info->argsz);
+	*fd = fd_num == 1 ? msg.fds[0] : -1;
+
+	return 0;
+}
+
+int rte_vfio_user_get_irq_info(int dev_id, struct vfio_irq_info *info)
+{
+	VFIO_USER_MSG msg;
+	int ret;
+	struct vfio_user_client *dev;
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct vfio_irq_info);
+	struct vfio_irq_info *irq_info = &msg.payload.irq_info;
+
+	if (!info)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get irq info: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	memset(&msg, 0, sizeof(VFIO_USER_MSG));
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_GET_IRQ_INFO, sz);
+	irq_info->index = info->index;
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to get irq(%u) info: %s\n",
+				info->index, msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	if (irq_info->index != info->index ||
+		msg.size - VFIO_USER_MSG_HDR_SIZE != sizeof(*irq_info)) {
+		VFIO_USER_LOG(ERR,
+			"Incorrect reply message for IRQ info\n");
+		return -1;
+	}
+
+	memcpy(info, irq_info, sizeof(*info));
+	return 0;
+}
+
+static int vfio_user_client_dma_map_unmap(struct vfio_user_client *dev,
+	struct rte_vfio_user_mem_reg *mem, int *fds, uint32_t num, bool ismap)
+{
+	VFIO_USER_MSG msg;
+	int ret;
+	uint32_t i, mem_sz, map;
+	uint16_t cmd = VFIO_USER_DMA_UNMAP;
+
+	memset(&msg, 0, sizeof(VFIO_USER_MSG));
+
+	if (num > VFIO_USER_MSG_MAX_NREG) {
+		VFIO_USER_LOG(ERR,
+			"Too many memory regions to %s (MAX: %u)\n",
+			ismap ? "map" : "unmap", VFIO_USER_MSG_MAX_NREG);
+		return -EINVAL;
+	}
+
+	if (ismap) {
+		cmd = VFIO_USER_DMA_MAP;
+
+		for (i = 0; i < num; i++) {
+			map = mem[i].flags & RTE_VUSER_MEM_MAPPABLE;
+			if ((map && (fds[i] == -1)) ||
+				(!map && (fds[i] != -1))) {
+				VFIO_USER_LOG(ERR, "%spable memory region "
+					"should%s have valid fd\n",
+					map ? "Map" : "Unmap",
+					map ? "" : " not");
+				return -EINVAL;
+			}
+
+			if (fds[i] != -1)
+				msg.fds[msg.fd_num++] = fds[i];
+		}
+	}
+
+	mem_sz = sizeof(struct rte_vfio_user_mem_reg) * num;
+	memcpy(&msg.payload, mem, mem_sz);
+
+	vfio_user_client_fill_hdr(&msg, cmd, mem_sz + VFIO_USER_MSG_HDR_SIZE);
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to %smap memory regions: "
+				"%s\n", ismap ? "" : "un",
+				msg.err ? strerror(msg.err) : "Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	return 0;
+}
+
+int rte_vfio_user_dma_map(int dev_id, struct rte_vfio_user_mem_reg *mem,
+	int *fds, uint32_t num)
+{
+	struct vfio_user_client *dev;
+
+	if (!mem || !fds)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to dma map: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	return vfio_user_client_dma_map_unmap(dev, mem, fds, num, true);
+}
+
+int rte_vfio_user_dma_unmap(int dev_id, struct rte_vfio_user_mem_reg *mem,
+	uint32_t num)
+{
+	struct vfio_user_client *dev;
+
+	if (!mem)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to dma unmap: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	return vfio_user_client_dma_map_unmap(dev, mem, NULL, num, false);
+}
+
+int rte_vfio_user_set_irqs(int dev_id, struct vfio_irq_set *set)
+{
+	VFIO_USER_MSG msg;
+	int ret;
+	struct vfio_user_client *dev;
+	uint32_t set_sz;
+	struct vfio_user_irq_set *irq_set = &msg.payload.irq_set;
+
+	if (!set)
+		return -EINVAL;
+	set_sz = set->argsz;
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to set irqs: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	memset(&msg, 0, sizeof(VFIO_USER_MSG));
+
+	if (set->flags & VFIO_IRQ_SET_DATA_EVENTFD) {
+		msg.fd_num = set->count;
+		memcpy(msg.fds, set->data, sizeof(int) * set->count);
+		set_sz -= sizeof(int) * set->count;
+	}
+	memcpy(irq_set, set, set_sz);
+	irq_set->set.argsz = set_sz;
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_SET_IRQS,
+		VFIO_USER_MSG_HDR_SIZE + set_sz);
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to set irq(%u): %s\n",
+				set->index, msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	return 0;
+}
+
+int rte_vfio_user_region_read(int dev_id, uint32_t idx, uint64_t offset,
+	uint32_t size, void *data)
+{
+	VFIO_USER_MSG msg;
+	int ret;
+	struct vfio_user_client *dev;
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct vfio_user_reg_rw)
+		- VFIO_USER_MAX_RW_DATA;
+	struct vfio_user_reg_rw *rw = &msg.payload.reg_rw;
+
+	if (!data)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to read region: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	if (size > VFIO_USER_MAX_RW_DATA) {
+		VFIO_USER_LOG(ERR, "Region read size exceeds max\n");
+		return -EINVAL;
+	}
+
+	memset(&msg, 0, sizeof(VFIO_USER_MSG));
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_REGION_READ, sz);
+
+	rw->reg_idx = idx;
+	rw->reg_offset = offset;
+	rw->size = size;
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to read region(%u): %s\n",
+				idx, msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	memcpy(data, rw->data, size);
+	return 0;
+}
+
+int rte_vfio_user_region_write(int dev_id, uint32_t idx, uint64_t offset,
+	uint32_t size, const void *data)
+{
+	VFIO_USER_MSG msg;
+	int ret;
+	struct vfio_user_client *dev;
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct vfio_user_reg_rw)
+		- VFIO_USER_MAX_RW_DATA + size;
+	struct vfio_user_reg_rw *rw = &msg.payload.reg_rw;
+
+	if (!data)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to write region: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	if (size > VFIO_USER_MAX_RW_DATA) {
+		VFIO_USER_LOG(ERR, "Region write size exceeds max\n");
+		return -EINVAL;
+	}
+
+	memset(&msg, 0, sizeof(VFIO_USER_MSG));
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_REGION_WRITE, sz);
+
+	rw->reg_idx = idx;
+	rw->reg_offset = offset;
+	rw->size = size;
+	memcpy(rw->data, data, size);
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to write region(%u): %s\n",
+				idx, msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	return 0;
+}
+
+int rte_vfio_user_reset(int dev_id)
+{
+	VFIO_USER_MSG msg;
+	int ret;
+	struct vfio_user_client *dev;
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to reset device: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	memset(&msg, 0, sizeof(VFIO_USER_MSG));
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_RESET, sz);
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to reset device: %s\n",
+				msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	return ret;
+}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [dpdk-dev] [PATCH 8/9] test/vfio_user: introduce functional test
  2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
                   ` (6 preceding siblings ...)
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 7/9] vfio_user: add client APIs of DMA/IRQ/region Chenbo Xia
@ 2020-12-18  7:38 ` Chenbo Xia
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 9/9] doc: add vfio-user library guide Chenbo Xia
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 43+ messages in thread
From: Chenbo Xia @ 2020-12-18  7:38 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu

This patch introduces a functional test for the vfio_user client and
server. Note that the test requires both the server and the client to
be running, and the server must be started first.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
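The config-space callback in this test copies data between the caller's
buffer and the region backing store. A bounds-checked version of that kind of
callback can be sketched as below. region_rw and its parameters are
hypothetical and simplified (the real callback receives a
struct rte_vfio_user_reg_info); the sketch only shows one way to apply the
same range check to reads and writes without integer overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/*
 * Copy `count` bytes between `buf` and the region at offset `pos`.
 * Returns the number of bytes copied, or -1 if the access is out of range.
 */
static ssize_t
region_rw(void *base, uint64_t region_size, char *buf, size_t count,
	uint64_t pos, bool iswrite)
{
	char *loc;

	if (base == NULL || buf == NULL)
		return -1;

	/* Overflow-safe form of "pos + count > region_size". */
	if (pos > region_size || count > region_size - pos)
		return -1;

	loc = (char *)base + pos;
	if (iswrite)
		memcpy(loc, buf, count);
	else
		memcpy(buf, loc, count);

	return (ssize_t)count;
}
```

Checking the range once, before branching on the direction, keeps a write
past the end of the region from reaching memcpy.
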
 app/test/meson.build      |   4 +
 app/test/test_vfio_user.c | 646 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 650 insertions(+)
 create mode 100644 app/test/test_vfio_user.c

diff --git a/app/test/meson.build b/app/test/meson.build
index 94fd39fecb..f5b15ac44c 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -138,6 +138,7 @@ test_sources = files('commands.c',
 	'test_trace.c',
 	'test_trace_register.c',
 	'test_trace_perf.c',
+	'test_vfio_user.c',
 	'test_version.c',
 	'virtual_pmd.c'
 )
@@ -173,6 +174,7 @@ test_deps = ['acl',
 	'ring',
 	'security',
 	'stack',
+	'vfio_user',
 	'telemetry',
 	'timer'
 ]
@@ -266,6 +268,8 @@ fast_tests = [
         ['service_autotest', true],
         ['thash_autotest', true],
         ['trace_autotest', true],
+        ['vfio_user_autotest_client', false],
+        ['vfio_user_autotest_server', false],
 ]
 
 perf_test_names = [
diff --git a/app/test/test_vfio_user.c b/app/test/test_vfio_user.c
new file mode 100644
index 0000000000..ee245e437d
--- /dev/null
+++ b/app/test/test_vfio_user.c
@@ -0,0 +1,646 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <limits.h>
+#include <stdatomic.h>
+#include <sys/eventfd.h>
+#include <sys/mman.h>
+
+#include <rte_vfio_user.h>
+#include <rte_malloc.h>
+#include <rte_hexdump.h>
+#include <rte_pause.h>
+#include <rte_log.h>
+
+#include "test.h"
+
+#define REGION_SIZE 0x100
+
+struct server_mem_tb {
+	uint32_t entry_num;
+	struct rte_vfio_user_mtb_entry entry[];
+};
+
+static const char test_sock[] = "/tmp/dpdk_vfio_test";
+struct server_mem_tb *server_mem;
+int server_irqfd;
+atomic_uint test_failed;
+atomic_uint server_destroyed;
+
+static int test_set_dev_info(const char *sock,
+	struct vfio_device_info *info)
+{
+	int ret;
+
+	info->argsz = sizeof(*info);
+	info->flags = VFIO_DEVICE_FLAGS_RESET | VFIO_DEVICE_FLAGS_PCI;
+	info->num_irqs = VFIO_PCI_NUM_IRQS;
+	info->num_regions = VFIO_PCI_NUM_REGIONS;
+	ret = rte_vfio_user_set_dev_info(sock, info);
+	if (ret) {
+		printf("Failed to set device info\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static ssize_t test_dev_cfg_rw(struct rte_vfio_user_reg_info *reg, char *buf,
+	size_t count, loff_t pos, bool iswrite)
+{
+	char *loc = (char *)reg->base + pos;
+
+	if (pos + count > reg->info->size)
+		return -1;
+
+	if (iswrite)
+		memcpy(loc, buf, count);
+	else
+		memcpy(buf, loc, count);
+
+	return count;
+}
+
+static int test_set_reg_info(const char *sock_addr,
+	struct rte_vfio_user_regions *reg)
+{
+	struct rte_vfio_user_reg_info *reg_info;
+	void *cfg_base = NULL;
+	uint32_t i, j, sz = 0, reg_sz = REGION_SIZE;
+	int ret;
+
+	reg->reg_num = VFIO_PCI_NUM_REGIONS;
+	sz = sizeof(struct vfio_region_info);
+
+	for (i = 0; i < reg->reg_num; i++) {
+		reg_info = &reg->reg_info[i];
+
+		reg_info->info = rte_zmalloc(NULL, sz, 0);
+		if (!reg_info->info) {
+			printf("Failed to alloc vfio region info\n");
+			goto err;
+		}
+
+		reg_info->priv = NULL;
+		reg_info->fd = -1;
+		reg_info->info->argsz = sz;
+		reg_info->info->cap_offset = sz;
+		reg_info->info->index = i;
+		reg_info->info->offset = 0;
+		reg_info->info->flags = VFIO_REGION_INFO_FLAG_READ |
+			VFIO_REGION_INFO_FLAG_WRITE;
+
+		if (i == VFIO_PCI_CONFIG_REGION_INDEX) {
+			cfg_base = rte_zmalloc(NULL, reg_sz, 0);
+			if (!cfg_base) {
+				printf("Failed to alloc cfg space\n");
+				goto err;
+			}
+			reg_info->base = cfg_base;
+			reg_info->rw = test_dev_cfg_rw;
+			reg_info->info->size = reg_sz;
+		} else {
+			reg_info->base = NULL;
+			reg_info->rw = NULL;
+			reg_info->info->size = 0;
+		}
+	}
+
+	ret = rte_vfio_user_set_reg_info(sock_addr, reg);
+	if (ret) {
+		printf("Failed to set region info\n");
+		return -1;
+	}
+
+	return 0;
+err:
+	for (j = 0; j < i; j++)
+		rte_free(reg->reg_info[j].info);
+	rte_free(cfg_base);
+	return -1;
+}
+
+static void cleanup_reg(struct rte_vfio_user_regions *reg)
+{
+	struct rte_vfio_user_reg_info *reg_info;
+	uint32_t i;
+
+	for (i = 0; i < reg->reg_num; i++) {
+		reg_info = &reg->reg_info[i];
+
+		rte_free(reg_info->info);
+
+		if (i == VFIO_PCI_CONFIG_REGION_INDEX)
+			rte_free(reg_info->base);
+	}
+}
+
+static int test_set_irq_info(const char *sock,
+	struct rte_vfio_user_irq_info *info)
+{
+	struct vfio_irq_info *irq_info;
+	int ret;
+	uint32_t i;
+
+	info->irq_num = VFIO_PCI_NUM_IRQS;
+	for (i = 0; i < info->irq_num; i++) {
+		irq_info = &info->irq_info[i];
+		irq_info->argsz = sizeof(*irq_info);
+		irq_info->index = i;
+
+		if (i == VFIO_PCI_MSIX_IRQ_INDEX) {
+			irq_info->flags = VFIO_IRQ_INFO_EVENTFD |
+				VFIO_IRQ_INFO_NORESIZE;
+			irq_info->count = 1;
+		} else {
+			irq_info->flags = 0;
+			irq_info->count = 0;
+		}
+	}
+
+	ret = rte_vfio_user_set_irq_info(sock, info);
+	if (ret) {
+		printf("Failed to set irq info\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int test_get_mem(int dev_id)
+{
+	const struct rte_vfio_user_mem *mem;
+	uint32_t entry_sz;
+
+	mem = rte_vfio_user_get_mem_table(dev_id);
+	if (!mem) {
+		printf("Failed to get memory table\n");
+		return -1;
+	}
+
+	entry_sz = sizeof(struct rte_vfio_user_mtb_entry) * mem->entry_num;
+	server_mem = rte_zmalloc(NULL, sizeof(*server_mem) + entry_sz, 0);
+
+	memcpy(server_mem->entry, mem->entry, entry_sz);
+	server_mem->entry_num = mem->entry_num;
+
+	return 0;
+}
+
+static int test_get_irq(int dev_id)
+{
+	int ret;
+
+	server_irqfd = -1;
+	ret = rte_vfio_user_get_irq(dev_id, VFIO_PCI_MSIX_IRQ_INDEX, 1,
+		&server_irqfd);
+	if (ret) {
+		printf("Failed to get IRQ\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int test_create_device(int dev_id)
+{
+	char sock[PATH_MAX];
+
+	RTE_LOG(DEBUG, USER1, "Device created\n");
+
+	if (rte_vfio_get_sock_addr(dev_id, sock, sizeof(sock))) {
+		printf("Failed to get socket addr\n");
+		goto err;
+	}
+
+	if (strcmp(sock, test_sock)) {
+		printf("Wrong socket addr\n");
+		goto err;
+	}
+
+	printf("Get socket address: TEST OK\n");
+
+	return 0;
+err:
+	atomic_store(&test_failed, 1);
+	return -1;
+}
+
+static void test_destroy_device(int dev_id)
+{
+	int ret;
+
+	RTE_LOG(DEBUG, USER1, "Device destroyed\n");
+
+	ret = test_get_mem(dev_id);
+	if (ret)
+		goto err;
+
+	printf("Get memory table: TEST OK\n");
+
+	ret = test_get_irq(dev_id);
+	if (ret)
+		goto err;
+
+	printf("Get IRQ: TEST OK\n");
+
+	atomic_store(&server_destroyed, 1);
+	return;
+err:
+	atomic_store(&test_failed, 1);
+}
+
+static int test_update_device(int dev_id __rte_unused)
+{
+	RTE_LOG(DEBUG, USER1, "Device updated\n");
+
+	return 0;
+}
+
+static int test_lock_dp(int dev_id __rte_unused, int lock)
+{
+	RTE_LOG(DEBUG, USER1, "Device data path %slocked\n", lock ? "" : "un");
+	return 0;
+}
+
+static int test_reset_device(int dev_id __rte_unused)
+{
+	RTE_LOG(DEBUG, USER1, "Device reset\n");
+	return 0;
+}
+
+const struct rte_vfio_user_notify_ops test_vfio_ops = {
+	.new_device = test_create_device,
+	.destroy_device = test_destroy_device,
+	.update_status = test_update_device,
+	.lock_dp = test_lock_dp,
+	.reset_device = test_reset_device,
+};
+
+static int
+test_vfio_user_server(void)
+{
+	struct vfio_device_info dev_info;
+	struct rte_vfio_user_regions *reg;
+	struct rte_vfio_user_reg_info *reg_info;
+	struct vfio_region_info *info;
+	struct rte_vfio_user_irq_info *irq_info;
+	struct rte_vfio_user_mtb_entry *ent;
+	int ret, err;
+	uint32_t i;
+
+	atomic_init(&test_failed, 0);
+	atomic_init(&server_destroyed, 0);
+
+	ret = rte_vfio_user_register(test_sock, &test_vfio_ops);
+	if (ret) {
+		printf("Failed to register\n");
+		ret = TEST_FAILED;
+		goto err_regis;
+	}
+
+	printf("Register device: TEST OK\n");
+
+	reg = rte_zmalloc(NULL, sizeof(*reg) + VFIO_PCI_NUM_REGIONS *
+		sizeof(struct rte_vfio_user_reg_info), 0);
+	if (!reg) {
+		printf("Failed to alloc regions\n");
+		ret = TEST_FAILED;
+		goto err_reg;
+	}
+
+	irq_info = rte_zmalloc(NULL, sizeof(*irq_info) + VFIO_PCI_NUM_IRQS *
+		sizeof(struct vfio_irq_info), 0);
+	if (!irq_info) {
+		printf("Failed to alloc irq info\n");
+		ret = TEST_FAILED;
+		goto err_irq;
+	}
+
+	if (test_set_dev_info(test_sock, &dev_info)) {
+		ret = TEST_FAILED;
+		goto err_set;
+	}
+
+	printf("Set device info: TEST OK\n");
+
+	if (test_set_reg_info(test_sock, reg)) {
+		ret = TEST_FAILED;
+		goto err_set;
+	}
+
+	printf("Set region info: TEST OK\n");
+
+	if (test_set_irq_info(test_sock, irq_info)) {
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("Set irq info: TEST OK\n");
+
+	ret = rte_vfio_user_start(test_sock);
+	if (ret) {
+		printf("Failed to start\n");
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("Start device: TEST OK\n");
+
+	while (atomic_load(&test_failed) == 0 &&
+		atomic_load(&server_destroyed) == 0)
+		rte_pause();
+
+	if (atomic_load(&test_failed) == 1) {
+		printf("Test failed during device running\n");
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("=================================\n");
+	printf("Device layout:\n");
+	printf("=================================\n");
+	printf("%u regions, %u IRQs\n", dev_info.num_regions,
+		dev_info.num_irqs);
+	printf("=================================\n");
+
+	reg_info = &reg->reg_info[VFIO_PCI_CONFIG_REGION_INDEX];
+	info = reg_info->info;
+	printf("Configuration Space:\nsize : 0x%llx, prot: %s%s\n",
+		info->size,
+		(info->flags & VFIO_REGION_INFO_FLAG_READ) ? "read/" : "",
+		(info->flags & VFIO_REGION_INFO_FLAG_WRITE) ? "write" : "");
+	rte_hexdump(stdout, "Content", (const void *)reg_info->base,
+		info->size);
+
+	printf("=================================\n");
+	printf("DMA memory table (Entry num: %u):\n", server_mem->entry_num);
+
+	for (i = 0; i < server_mem->entry_num; i++) {
+		ent = &server_mem->entry[i];
+		printf("(Entry %u) gpa: 0x%" PRIx64
+			", size: 0x%" PRIx64 ", hva: 0x%" PRIx64 "\n"
+			", mmap_addr: 0x%" PRIx64 ", mmap_size: 0x%" PRIx64
+			", fd: %d\n", i, ent->gpa, ent->size,
+			ent->host_user_addr, (uint64_t)ent->mmap_addr,
+			ent->mmap_size, ent->fd);
+	}
+
+	printf("=================================\n");
+	printf("MSI-X Interrupt:\nNumber: %u, irqfd: %s\n",
+		irq_info->irq_info[VFIO_PCI_MSIX_IRQ_INDEX].count,
+		server_irqfd == -1 ? "Invalid" : "Valid");
+
+	ret = TEST_SUCCESS;
+
+err:
+	cleanup_reg(reg);
+err_set:
+	rte_free(irq_info);
+err_irq:
+	rte_free(reg);
+err_reg:
+	err = rte_vfio_user_unregister(test_sock);
+	if (err)
+		ret = TEST_FAILED;
+	else
+		printf("Unregister device: TEST OK\n");
+err_regis:
+	return ret;
+}
+
+static int test_get_dev_info(int dev_id, struct vfio_device_info *info)
+{
+	int ret;
+
+	ret = rte_vfio_user_get_dev_info(dev_id, info);
+	if (ret) {
+		printf("Failed to get device info\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int test_get_reg_info(int dev_id, struct vfio_region_info *info)
+{
+	int ret, fd = -1;
+
+	info->index = VFIO_PCI_CONFIG_REGION_INDEX;
+	info->argsz = sizeof(*info);
+	ret = rte_vfio_user_get_reg_info(dev_id, info, &fd);
+	if (ret) {
+		printf("Failed to get region info\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int test_get_irq_info(int dev_id, struct vfio_irq_info *info)
+{
+	int ret;
+
+	info->index = VFIO_PCI_MSIX_IRQ_INDEX;
+	ret = rte_vfio_user_get_irq_info(dev_id, info);
+	if (ret) {
+		printf("Failed to get irq info\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int test_set_irqs(int dev_id, struct vfio_irq_set *set, int *fd)
+{
+	int ret;
+
+	*fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (*fd < 0) {
+		printf("Failed to create eventfd\n");
+		return -1;
+	}
+
+	set->argsz = sizeof(*set) + sizeof(int);
+	set->count = 1;
+	set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
+	set->index = VFIO_PCI_MSIX_IRQ_INDEX;
+	set->start = 0;
+	memcpy(set->data, fd, sizeof(*fd));
+
+	ret = rte_vfio_user_set_irqs(dev_id, set);
+	if (ret) {
+		printf("Failed to set irqs\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int test_dma_map_unmap(int dev_id, struct rte_vfio_user_mem_reg *mem)
+{
+	int ret, fd = -1;
+
+	mem->fd_offset = 0;
+	mem->flags = 0;
+	mem->gpa = 0x12345678;
+	mem->protection = PROT_READ | PROT_WRITE;
+	mem->size = 0x10000;
+
+	/* Map -> Unmap -> Map */
+	ret = rte_vfio_user_dma_map(dev_id, mem, &fd, 1);
+	if (ret) {
+		printf("Failed to dma map\n");
+		return -1;
+	}
+
+	ret = rte_vfio_user_dma_unmap(dev_id, mem, 1);
+	if (ret) {
+		printf("Failed to dma unmap\n");
+		return -1;
+	}
+
+	ret = rte_vfio_user_dma_map(dev_id, mem, &fd, 1);
+	if (ret) {
+		printf("Failed to dma re-map\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int test_region_read_write(int dev_id, void *read_data, uint64_t sz)
+{
+	int ret;
+	uint32_t data = 0x1A2B3C4D, idx = VFIO_PCI_CONFIG_REGION_INDEX;
+
+	ret = rte_vfio_user_region_write(dev_id, idx, 0, 4, (void *)&data);
+	if (ret) {
+		printf("Failed to write region\n");
+		return -1;
+	}
+
+	ret = rte_vfio_user_region_read(dev_id, idx, 0, sz, read_data);
+	if (ret) {
+		printf("Failed to read region\n");
+		return -1;
+	}
+
+	return 0;
+}
+static int
+test_vfio_user_client(void)
+{
+	int ret = 0, dev_id, fd = -1;
+	struct vfio_device_info dev_info;
+	struct vfio_irq_info irq_info;
+	struct rte_vfio_user_mem_reg mem;
+	struct vfio_irq_set *set;
+	struct vfio_region_info reg_info;
+	void *data;
+
+	ret = rte_vfio_user_attach_dev(test_sock);
+	if (ret < 0) {
+		printf("Failed to attach device\n");
+		return TEST_FAILED;
+	}
+
+	printf("Attach device: TEST OK\n");
+
+	dev_id = ret;
+	ret = rte_vfio_user_reset(dev_id);
+	if (ret) {
+		printf("Failed to reset device\n");
+		return TEST_FAILED;
+	}
+
+	printf("Reset device: TEST OK\n");
+
+	if (test_get_dev_info(dev_id, &dev_info))
+		return TEST_FAILED;
+
+	printf("Get device info: TEST OK\n");
+
+	if (test_get_reg_info(dev_id, &reg_info))
+		return TEST_FAILED;
+
+	printf("Get region info: TEST OK\n");
+
+	if (test_get_irq_info(dev_id, &irq_info))
+		return TEST_FAILED;
+
+	printf("Get irq info: TEST OK\n");
+
+	set = rte_zmalloc(NULL, sizeof(*set) + sizeof(int), 0);
+	if (!set) {
+		printf("Failed to allocate irq set\n");
+		return TEST_FAILED;
+	}
+
+	data = rte_zmalloc(NULL, reg_info.size, 0);
+	if (!data) {
+		printf("Failed to allocate data\n");
+		ret = TEST_FAILED;
+		goto err_data;
+	}
+
+	if (test_set_irqs(dev_id, set, &fd)) {
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("Set irqs: TEST OK\n");
+
+	if (test_dma_map_unmap(dev_id, &mem)) {
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("DMA map/unmap: TEST OK\n");
+
+	if (test_region_read_write(dev_id, data, reg_info.size)) {
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("Region read/write: TEST OK\n");
+
+	printf("=================================\n");
+	printf("Device layout:\n");
+	printf("=================================\n");
+	printf("%u regions, %u IRQs\n", dev_info.num_regions,
+		dev_info.num_irqs);
+	printf("=================================\n");
+	printf("Configuration Space:\nsize : 0x%llx, prot: %s%s\n",
+		reg_info.size,
+		(reg_info.flags & VFIO_REGION_INFO_FLAG_READ) ? "read/" : "",
+		(reg_info.flags & VFIO_REGION_INFO_FLAG_WRITE) ? "write" : "");
+	rte_hexdump(stdout, "Content", (const void *)data, reg_info.size);
+
+	printf("=================================\n");
+	printf("DMA memory table (Entry num: 1):\ngpa: 0x%" PRIx64
+		", size: 0x%" PRIx64 ", fd: -1, fd_offset:0x%" PRIx64 "\n",
+		mem.gpa, mem.size, mem.fd_offset);
+	printf("=================================\n");
+	printf("MSI-X Interrupt:\nNumber: %u, irqfd: %s\n", irq_info.count,
+		fd == -1 ? "Invalid" : "Valid");
+
+	ret = rte_vfio_user_detach_dev(dev_id);
+	if (ret) {
+		printf("Failed to detach device\n");
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("Device detach: TEST OK\n");
+err:
+	rte_free(data);
+err_data:
+	rte_free(set);
+	return ret;
+}
+
+REGISTER_TEST_COMMAND(vfio_user_autotest_client, test_vfio_user_client);
+REGISTER_TEST_COMMAND(vfio_user_autotest_server, test_vfio_user_server);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [dpdk-dev] [PATCH 9/9] doc: add vfio-user library guide
  2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
                   ` (7 preceding siblings ...)
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 8/9] test/vfio_user: introduce functional test Chenbo Xia
@ 2020-12-18  7:38 ` Chenbo Xia
  2021-01-06  5:07   ` Xing, Beilei
  2020-12-18  9:37 ` [dpdk-dev] [PATCH 0/9] Introduce vfio-user library David Marchand
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
  10 siblings, 1 reply; 43+ messages in thread
From: Chenbo Xia @ 2020-12-18  7:38 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu

Add vfio-user library guide and update release notes.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
 doc/guides/prog_guide/index.rst         |   1 +
 doc/guides/prog_guide/vfio_user_lib.rst | 215 ++++++++++++++++++++++++
 doc/guides/rel_notes/release_21_02.rst  |  11 ++
 3 files changed, 227 insertions(+)
 create mode 100644 doc/guides/prog_guide/vfio_user_lib.rst

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 45c7dec88d..f9847b1058 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -70,3 +70,4 @@ Programmer's Guide
     lto
     profile_app
     glossary
+    vfio_user_lib
diff --git a/doc/guides/prog_guide/vfio_user_lib.rst b/doc/guides/prog_guide/vfio_user_lib.rst
new file mode 100644
index 0000000000..6daec4d8e5
--- /dev/null
+++ b/doc/guides/prog_guide/vfio_user_lib.rst
@@ -0,0 +1,215 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation.
+
+Vfio User Library
+=================
+
+The vfio-user library implements the vfio-user protocol, which allows an I/O
+device to be emulated in a separate process outside of a Virtual Machine
+Monitor (VMM). The protocol has a client/server model, in which the server
+emulates the device and the client (e.g., VMM) consumes the device. The
+vfio-user library uses the device model of Linux kernel VFIO and the core
+concepts defined in its API. The main difference between kernel VFIO and
+vfio-user is that, in vfio-user, the device consumer uses messages over a UNIX
+domain socket instead of system calls.
+
+The vfio-user library is used to construct and consume emulated devices. The
+server side implementation is mainly for construction of devices and the client
+side implementation is mainly for consumption and manipulation of devices. You
+use the server APIs mainly for two things: providing the device information
+(e.g., region/IRQ information) to the vfio-user library and acquiring the
+configuration (e.g., DMA table) from the client. To construct a device, you
+only need to focus on the device abstraction that the vfio-user library
+defines rather than on how the server side communicates with the client. You
+use the client APIs mainly for acquiring the device information and
+configuring the device. The client API usage is almost the same as the kernel
+VFIO ioctl interface.
+
+
+Vfio User Server API Overview
+-----------------------------
+
+The following is an overview of the key Vfio User Server API functions and of
+how to use them to build an emulated device.
+
+Building a device with the Vfio User API takes four main steps:
+
+1. Register
+
+This step includes one API in Vfio User.
+
+* ``rte_vfio_user_register(sock_addr, notify_ops)``
+
+  This function registers a session to communicate with vfio-user client. A
+  session maps to one device so that a device instance will be created upon
+  registration.
+
+  ``sock_addr`` specifies the Unix domain socket file path and is the identity
+  of the session.
+
+  ``notify_ops`` is a set of callbacks that the vfio-user library uses to
+  notify the emulated device. Currently, there are five callbacks:
+
+  - ``new_device``
+    This callback is invoked when the device is configured and becomes ready.
+    The dev_id is used by the vfio-user library to identify a unique device.
+
+  - ``destroy_device``
+    This callback is invoked when the device is destroyed. In most cases, it
+    means the client is disconnected from the server.
+
+  - ``update_status``
+    This callback is invoked when the device configuration is updated (e.g.,
+    DMA table/IRQ update).
+
+  - ``lock_dp``
+    This callback is invoked when data path needs to be locked or unlocked.
+
+  - ``reset_device``
+    This callback is invoked when the emulated device needs to be reset.
+
+2. Set device information
+
+This step includes three APIs in Vfio User.
+
+* ``rte_vfio_user_set_dev_info(sock_addr, dev_info)``
+
+  This function sets the device information to vfio-user library. The device
+  information is defined in Linux VFIO which mainly consists of device type
+  and the number of vfio regions and IRQs.
+
+* ``rte_vfio_user_set_reg_info(sock_addr, reg)``
+
+  This function sets the vfio region information to vfio-user library. Regions
+  should be created before using this API. The information mainly includes the
+  process virtual address, size, file descriptor, attributes and capabilities of
+  regions.
+
+* ``rte_vfio_user_set_irq_info(sock_addr, irq)``
+
+  This function sets the IRQ information to vfio-user library. The information
+  includes how many IRQ types the device supports (e.g., MSI/MSI-X) and the IRQ
+  count of each type.
+
+3. Start
+
+This step includes one API in Vfio User.
+
+* ``rte_vfio_user_start(sock_addr)``
+
+  This function starts the registered session with vfio-user client. This means
+  a control thread will start to listen and handle messages sent from the client.
+  Note that only one thread is created for all vfio-user based devices.
+
+  ``sock_addr`` specifies the Unix domain socket file path and is the identity
+  of the session.
+
+4. Get device configuration
+
+This step includes two APIs in Vfio User. Both APIs should be called once the
+device is ready, and the configuration they return could be updated at any
+time. A simple usage of both APIs is calling them in the new_device and
+update_status callbacks.
+
+* ``rte_vfio_user_get_mem_table(dev_id)``
+
+  This function gets the DMA memory table from vfio-user library. The memory
+  table entry has the information of guest physical address, process virtual
+  address, size and file descriptor. Emulated devices could use the memory
+  table to perform DMA read/write on guest memory.
+
+  ``dev_id`` specifies the device ID.
+
+* ``rte_vfio_user_get_irq(dev_id, index, count, fds)``
+
+  This function gets the IRQ eventfds from vfio-user library. In vfio-user
+  library, an efficient way to send interrupts is using eventfds. The eventfds
+  are expected to be sent by the client. To trigger an interrupt, an emulated
+  device only needs to call eventfd_write on the corresponding eventfd.
+
+  ``dev_id`` specifies the device ID.
+
+  ``index`` specifies the interrupt type. The mapping of interrupt index and
+  type is defined by emulated device.
+
+  ``count`` specifies the interrupt count.
+
+  ``fds`` is for saving the file descriptors.
+
+
+Vfio User Client API Overview
+-----------------------------
+
+The following is an overview of the key Vfio User Client API functions and of
+how to use them to consume an emulated device.
+
+Consuming a device with the Vfio User Client API takes three main steps:
+
+1. Attach
+
+This step includes one API in Vfio User.
+
+* ``rte_vfio_user_attach_dev(sock_addr)``
+
+  This function attaches to an emulated device. After the function call
+  succeeds, it is possible to acquire device information and configure the
+  device.
+
+  ``sock_addr`` specifies the Unix domain socket file path and is the identity
+  of the session/device.
+
+2. Get device information
+
+This step includes three APIs in Vfio User.
+
+* ``rte_vfio_user_get_dev_info(dev_id, info)``
+
+  This function gets the device information of the emulated device on the other
+  side of socket. The device information is defined in Linux VFIO which mainly
+  consists of device type and the number of vfio regions and IRQs.
+
+  ``dev_id`` specifies the identity of the device.
+
+  ``info`` specifies the information of the device.
+
+* ``rte_vfio_user_get_reg_info(dev_id, info, fd)``
+
+  This function gets the region information of the emulated device on the other
+  side of socket. The region information is defined in Linux VFIO which mainly
+  consists of region size, index and capabilities.
+
+  ``info`` specifies the information of the region.
+
+  ``fd`` specifies the file descriptor of the region.
+
+* ``rte_vfio_user_get_irq_info(dev_id, irq)``
+
+  This function gets the IRQ information of the emulated device on the other
+  side of socket. The IRQ information includes IRQ count and index.
+
+  ``irq`` specifies the information of the IRQ.
+
+3. Configure the device
+
+This step includes three APIs in Vfio User.
+
+* ``rte_vfio_user_dma_map(dev_id, mem, fds, num)``
+
+  This function maps DMA memory regions for the emulated device.
+
+  ``mem`` specifies the information of DMA memory regions.
+
+  ``fds`` specifies the file descriptors of the DMA memory regions.
+
+  ``num`` specifies the number of the DMA memory regions.
+
+* ``rte_vfio_user_dma_unmap(dev_id, mem, num)``
+
+  This function unmaps DMA memory regions for the emulated device.
+
+* ``rte_vfio_user_set_irqs(dev_id, set)``
+
+  This function configures the interrupts for the emulated device.
+
+  ``set`` specifies the configuration of interrupts.
+
+After the above three steps are done, users can easily use the emulated device
+(e.g., do I/O operations).
\ No newline at end of file
diff --git a/doc/guides/rel_notes/release_21_02.rst b/doc/guides/rel_notes/release_21_02.rst
index 638f98168b..6fbb6e8c39 100644
--- a/doc/guides/rel_notes/release_21_02.rst
+++ b/doc/guides/rel_notes/release_21_02.rst
@@ -55,6 +55,17 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added vfio-user Library.**
+
+  Added an experimental library ``librte_vfio_user`` to provide device
+  emulation support.
+
+  The library is an implementation of vfio-user protocol. It includes two main
+  parts: client and server. Server implementation is for device provider to
+  abstract its device. Client implementation is for device consumer to
+  manipulate the emulated device.
+
+  See :doc:`../prog_guide/vfio_user_lib` for more information.
 
 Removed Items
 -------------
-- 
2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 0/9] Introduce vfio-user library
  2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
                   ` (8 preceding siblings ...)
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 9/9] doc: add vfio-user library guide Chenbo Xia
@ 2020-12-18  9:37 ` David Marchand
  2020-12-18 14:07   ` Thanos Makatos
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
  10 siblings, 1 reply; 43+ messages in thread
From: David Marchand @ 2020-12-18  9:37 UTC (permalink / raw)
  To: Chenbo Xia
  Cc: dev, Thomas Monjalon, Stephen Hemminger, Liang, Cunming,
	xiuchun.lu, miao.li, Jingjing Wu, john.g.johnson, thanos.makatos,
	Stefan Hajnoczi, Marc-Andre Lureau, Maxime Coquelin

Hello,

On Fri, Dec 18, 2020 at 8:54 AM Chenbo Xia <chenbo.xia@intel.com> wrote:
> *librte_vfio_user* library is an implementation of VFIO-over-socket[1] (also
> known as vfio-user) which is a protocol that allows a device to be virtualized
> in a separate process outside of QEMU.

What is the status of the specification on QEMU side?
Integrating an implementation in DPDK is premature until we have an
agreed specification.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 0/9] Introduce vfio-user library
  2020-12-18  9:37 ` [dpdk-dev] [PATCH 0/9] Introduce vfio-user library David Marchand
@ 2020-12-18 14:07   ` Thanos Makatos
  0 siblings, 0 replies; 43+ messages in thread
From: Thanos Makatos @ 2020-12-18 14:07 UTC (permalink / raw)
  To: David Marchand, Chenbo Xia
  Cc: dev, Thomas Monjalon, Stephen Hemminger, Liang, Cunming,
	xiuchun.lu, miao.li, Jingjing Wu, john.g.johnson,
	Stefan Hajnoczi, Marc-Andre Lureau, Maxime Coquelin,
	Swapnil Ingle, John Levon, Liu, Changpeng, Jagannathan Raman,
	Elena Ufimtseva

> Hello,
> 
> On Fri, Dec 18, 2020 at 8:54 AM Chenbo Xia <chenbo.xia@intel.com> wrote:
> > *librte_vfio_user* library is an implementation of VFIO-over-socket[1]
> (also
> > known as vfio-user) which is a protocol that allows a device to be virtualized
> > in a separate process outside of QEMU.
> 
> What is the status of the specification on QEMU side?
> Integrating an implementation in DPDK is premature until we have an
> agreed specification.

We're in the process of reviewing the specification; the latest version (v7) is here: https://www.mail-archive.com/qemu-devel@nongnu.org/msg763207.html. We haven't had any reviews yet for that revision, but IMO we're getting close.

FYI John Johnson is implementing the relevant changes in multiprocess QEMU. John Levon, Swapnil, and I are implementing the server part in libvfio-user (formerly known as MUSER). We also have a mailing list now: https://lists.nongnu.org/mailman/listinfo/libvfio-user-devel. We've been working on integrating the two parts.

Finally, Changpeng is implementing an NVMe controller in SPDK. We're trying to make it work with multiprocess QEMU and libvfio-user, and we're very close.



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 1/9] lib: introduce vfio-user library
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 1/9] lib: introduce " Chenbo Xia
@ 2020-12-18 17:13   ` Stephen Hemminger
  2020-12-19  6:12     ` Xia, Chenbo
  2020-12-18 17:17   ` Stephen Hemminger
  1 sibling, 1 reply; 43+ messages in thread
From: Stephen Hemminger @ 2020-12-18 17:13 UTC (permalink / raw)
  To: Chenbo Xia
  Cc: dev, thomas, david.marchand, cunming.liang, xiuchun.lu, miao.li,
	jingjing.wu

On Fri, 18 Dec 2020 15:38:43 +0800
Chenbo Xia <chenbo.xia@intel.com> wrote:

> +inline void vfio_user_close_msg_fds(VFIO_USER_MSG *msg)
> +{
> +	int i;
> +
> +	for (i = 0; i < msg->fd_num; i++)
> +		close(msg->fds[i]);
> +}
> +

Please don't use non-static inlines.
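
For illustration only, a minimal sketch of the static inline form (this is
not the code from the patch). The example_msg struct and EXAMPLE_MAX_FD are
simplified, hypothetical stand-ins for the library's internal message type.
With static inline, every file that includes the helper gets its own
definition, so no external symbol is needed if the compiler decides not to
inline a call:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define EXAMPLE_MAX_FD 8	/* hypothetical limit, for illustration only */

/* Simplified, hypothetical stand-in for the internal message type. */
struct example_msg {
	int fds[EXAMPLE_MAX_FD];
	int fd_num;
};

/*
 * static inline: safe to place in a header. A plain (non-static) C99
 * inline definition does not provide an external definition, so the
 * link can fail whenever the compiler chooses not to inline the call.
 */
static inline void
example_close_msg_fds(struct example_msg *msg)
{
	int i;

	assert(msg != NULL);
	for (i = 0; i < msg->fd_num; i++) {
		close(msg->fds[i]);
		msg->fds[i] = -1;	/* guard against a double close */
	}
	msg->fd_num = 0;
}
```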

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 1/9] lib: introduce vfio-user library
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 1/9] lib: introduce " Chenbo Xia
  2020-12-18 17:13   ` Stephen Hemminger
@ 2020-12-18 17:17   ` Stephen Hemminger
  2020-12-19  6:25     ` Xia, Chenbo
  1 sibling, 1 reply; 43+ messages in thread
From: Stephen Hemminger @ 2020-12-18 17:17 UTC (permalink / raw)
  To: Chenbo Xia
  Cc: dev, thomas, david.marchand, cunming.liang, xiuchun.lu, miao.li,
	jingjing.wu

On Fri, 18 Dec 2020 15:38:43 +0800
Chenbo Xia <chenbo.xia@intel.com> wrote:

> +typedef struct vfio_user_msg {
> +	uint16_t msg_id;
> +	uint16_t cmd;
> +	uint32_t size;
> +#define VFIO_USER_TYPE_CMD	(0x0)		/* Message type is COMMAND */
> +#define VFIO_USER_TYPE_REPLY	(0x1 << 0)	/* Message type is REPLY */
> +#define VFIO_USER_NEED_NO_RP	(0x1 << 4)	/* Message needs no reply */
> +#define VFIO_USER_ERROR		(0x1 << 5)	/* Reply message has error */
> +	uint32_t flags;
> +	uint32_t err;				/* Valid in reply, optional */
> +	union {
> +		struct vfio_user_version ver;
> +	} payload;
> +	int fds[VFIO_USER_MAX_FD];
> +	int fd_num;
> +} __attribute((packed)) VFIO_USER_MSG;

Please don't introduce all capitals typedefs.
Don't use packed,  it generates slower code. Better to use tools
like pahole and make the layout of the structure naturally packed.
Also, if you put fds[] at the end you could use dynamic sized array
and not be constrained by VFIO_USER_MAX_FD.


Since this is DPDK library the symbols should all start with rte_
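
As a rough sketch of the layout advice above (no packed attribute, fds[]
moved to the end as a flexible array member), under stated assumptions: the
struct name and helper are hypothetical, the payload union is left out to
keep the example short, and int is assumed to be 4 bytes. The static
assertions check at compile time what pahole would report, namely that the
natural layout has no holes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical, simplified message header. Fields are ordered so that
 * each one is naturally aligned, so the compiler inserts no padding
 * and no packed attribute is needed.
 */
struct example_msg {
	uint16_t msg_id;
	uint16_t cmd;
	uint32_t size;
	uint32_t flags;
	uint32_t err;
	uint32_t fd_num;	/* number of entries in fds[] */
	int fds[];		/* flexible array member, must be last */
};

/* Compile-time layout checks (assumes a 4-byte int). */
static_assert(offsetof(struct example_msg, size) == 4, "hole before size");
static_assert(offsetof(struct example_msg, fds) == 20, "hole before fds");
static_assert(sizeof(struct example_msg) == 20, "unexpected tail padding");

/* Allocate a zeroed message with room for fd_num descriptors. */
static struct example_msg *
example_msg_alloc(uint32_t fd_num)
{
	struct example_msg *msg;

	msg = calloc(1, sizeof(*msg) + fd_num * sizeof(msg->fds[0]));
	if (msg == NULL)
		return NULL;
	msg->fd_num = fd_num;
	return msg;
}
```

The caller releases the message with free(); the number of descriptors is
then bounded by the allocation rather than by a fixed maximum.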

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 1/9] lib: introduce vfio-user library
  2020-12-18 17:13   ` Stephen Hemminger
@ 2020-12-19  6:12     ` Xia, Chenbo
  0 siblings, 0 replies; 43+ messages in thread
From: Xia, Chenbo @ 2020-12-19  6:12 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, thomas, david.marchand, Liang, Cunming, Lu, Xiuchun, Li,
	Miao, Wu, Jingjing

Hi Stephen,

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Saturday, December 19, 2020 1:14 AM
> To: Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; thomas@monjalon.net; david.marchand@redhat.com; Liang,
> Cunming <cunming.liang@intel.com>; Lu, Xiuchun <xiuchun.lu@intel.com>; Li,
> Miao <miao.li@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Subject: Re: [PATCH 1/9] lib: introduce vfio-user library
> 
> On Fri, 18 Dec 2020 15:38:43 +0800
> Chenbo Xia <chenbo.xia@intel.com> wrote:
> 
> > +inline void vfio_user_close_msg_fds(VFIO_USER_MSG *msg)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < msg->fd_num; i++)
> > +		close(msg->fds[i]);
> > +}
> > +
> 
> Please don't use non-static inlines.

Got it. Will fix in v2.

Thanks!
Chenbo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 1/9] lib: introduce vfio-user library
  2020-12-18 17:17   ` Stephen Hemminger
@ 2020-12-19  6:25     ` Xia, Chenbo
  0 siblings, 0 replies; 43+ messages in thread
From: Xia, Chenbo @ 2020-12-19  6:25 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, thomas, david.marchand, Liang, Cunming, Lu, Xiuchun, Li,
	Miao, Wu, Jingjing

Hi Stephen,

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Saturday, December 19, 2020 1:18 AM
> To: Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; thomas@monjalon.net; david.marchand@redhat.com; Liang,
> Cunming <cunming.liang@intel.com>; Lu, Xiuchun <xiuchun.lu@intel.com>; Li,
> Miao <miao.li@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Subject: Re: [PATCH 1/9] lib: introduce vfio-user library
> 
> On Fri, 18 Dec 2020 15:38:43 +0800
> Chenbo Xia <chenbo.xia@intel.com> wrote:
> 
> > +typedef struct vfio_user_msg {
> > +	uint16_t msg_id;
> > +	uint16_t cmd;
> > +	uint32_t size;
> > +#define VFIO_USER_TYPE_CMD	(0x0)		/* Message type is COMMAND */
> > +#define VFIO_USER_TYPE_REPLY	(0x1 << 0)	/* Message type is REPLY
> */
> > +#define VFIO_USER_NEED_NO_RP	(0x1 << 4)	/* Message needs no reply
> */
> > +#define VFIO_USER_ERROR		(0x1 << 5)	/* Reply message has error
> */
> > +	uint32_t flags;
> > +	uint32_t err;				/* Valid in reply, optional */
> > +	union {
> > +		struct vfio_user_version ver;
> > +	} payload;
> > +	int fds[VFIO_USER_MAX_FD];
> > +	int fd_num;
> > +} __attribute((packed)) VFIO_USER_MSG;
> 
> Please don't introduce all capitals typedefs.

OK. Will fix in v2.

> Don't use packed,  it generates slower code. Better to use tools
> like pahole and make the layout of the structure naturally packed.

Thanks for the heads up 😊. Will check pahole then.

> Also, if you put fds[] at the end you could use dynamic sized array
> and not be constrained by VFIO_USER_MAX_FD.

I put this constraint just because one vfio-user message in theory will not
send so many fds. And since I am just planning to optimize the message structure,
I will consider this to make it more flexible and reasonable.

> 
> 
> Since this is DPDK library the symbols should all start with rte_

I think structs in this file are not exposed. Do we add rte_ in this case?
If yes, I will fix it in v2 then.

Thanks for all the good catches!
Chenbo


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 5/9] vfio_user: implement interrupt related APIs
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 5/9] vfio_user: implement interrupt related APIs Chenbo Xia
@ 2020-12-30  1:04   ` Wu, Jingjing
  2020-12-30  2:31     ` Xia, Chenbo
  0 siblings, 1 reply; 43+ messages in thread
From: Wu, Jingjing @ 2020-12-30  1:04 UTC (permalink / raw)
  To: Xia, Chenbo, dev, thomas, david.marchand
  Cc: stephen, Liang, Cunming, Lu, Xiuchun, Li, Miao

>  	if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP
> ||
> +		cmd == VFIO_USER_DEVICE_SET_IRQS ||
>  		cmd == VFIO_USER_DEVICE_RESET)
>  		&& dev->ops->lock_dp) {
>  		dev->ops->lock_dp(dev_id, 1);

About cmd "VFIO_USER_REGION_WRITE", irq setting would cause update_status to iavfbe device.
Where will the lock be?

> @@ -871,7 +1056,8 @@ static int vfio_user_message_handler(int dev_id, int fd)
>  		if (vfio_user_is_ready(dev) && dev->ops->new_device)
>  			dev->ops->new_device(dev_id);
>  	} else {
> -		if ((cmd == VFIO_USER_DMA_MAP || cmd ==
> VFIO_USER_DMA_UNMAP)
> +		if ((cmd == VFIO_USER_DMA_MAP || cmd ==
> VFIO_USER_DMA_UNMAP
> +			|| cmd == VFIO_USER_DEVICE_SET_IRQS)
>  			&& dev->ops->update_status)
>  			dev->ops->update_status(dev_id);
>  	}
> @@ -898,6 +1084,7 @@ static int vfio_user_sock_read(int fd, void *data)
>  		if (dev) {
>  			dev->ops->destroy_device(dev_id);
>  			vfio_user_destroy_mem_entries(dev->mem);
> +			vfio_user_clean_irqfd(dev);
>  			dev->is_ready = 0;
>  			dev->msg_id = 0;
>  		}
> @@ -995,9 +1182,9 @@ vfio_user_start_server(struct vfio_user_server_socket
> *sk)
>  	}
> 
>  	/* All the info must be set before start */
> -	if (!dev->dev_info || !dev->reg) {
> +	if (!dev->dev_info || !dev->reg || !dev->irqs.info) {
>  		VFIO_USER_LOG(ERR, "Failed to start, "
> -			"dev/reg info must be set before start\n");
> +			"dev/reg/irq info must be set before start\n");
>  		return -1;
>  	}
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 5/9] vfio_user: implement interrupt related APIs
  2020-12-30  1:04   ` Wu, Jingjing
@ 2020-12-30  2:31     ` Xia, Chenbo
  0 siblings, 0 replies; 43+ messages in thread
From: Xia, Chenbo @ 2020-12-30  2:31 UTC (permalink / raw)
  To: Wu, Jingjing, dev, thomas, david.marchand
  Cc: stephen, Liang, Cunming, Lu, Xiuchun, Li, Miao

Hi Jingjing,

> -----Original Message-----
> From: Wu, Jingjing <jingjing.wu@intel.com>
> Sent: Wednesday, December 30, 2020 9:05 AM
> To: Xia, Chenbo <chenbo.xia@intel.com>; dev@dpdk.org; thomas@monjalon.net;
> david.marchand@redhat.com
> Cc: stephen@networkplumber.org; Liang, Cunming <cunming.liang@intel.com>; Lu,
> Xiuchun <xiuchun.lu@intel.com>; Li, Miao <miao.li@intel.com>
> Subject: RE: [PATCH 5/9] vfio_user: implement interrupt related APIs
> 
> >  	if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP
> > ||
> > +		cmd == VFIO_USER_DEVICE_SET_IRQS ||
> >  		cmd == VFIO_USER_DEVICE_RESET)
> >  		&& dev->ops->lock_dp) {
> >  		dev->ops->lock_dp(dev_id, 1);
> 
> About cmd "VFIO_USER_REGION_WRITE", irq setting would cause update_status to
> iavfbe device.
> Where will the lock be?

If emulated device needs some lock in slow-path region rw, IMHO, it should be implemented
in rte_vfio_user_reg_acc_t, which is the region rw callback. As region rw has different
behavior for different registers, it's better to make it device-specific 😊
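
A minimal sketch of that idea, with everything in it an assumption: the
callback signature, the register layout and the example_ names are made up
for illustration and are not the rte_vfio_user_reg_acc_t prototype from the
patch. The device takes its own lock only for writes that touch a register
shared with the data path, and leaves all other accesses lock-free:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define EXAMPLE_REG_SIZE 64	/* hypothetical register file size */
#define EXAMPLE_REG_CTRL 0x10	/* hypothetical 4-byte control register */

/* Hypothetical emulated device state. */
struct example_dev {
	atomic_flag dp_lock;	/* simple spinlock shared with the data path */
	uint8_t regs[EXAMPLE_REG_SIZE];
};

static void
example_dev_init(struct example_dev *dev)
{
	assert(dev != NULL);
	atomic_flag_clear(&dev->dp_lock);
	memset(dev->regs, 0, sizeof(dev->regs));
}

static void
example_lock(struct example_dev *dev)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&dev->dp_lock,
						 memory_order_acquire))
		;
}

static void
example_unlock(struct example_dev *dev)
{
	atomic_flag_clear_explicit(&dev->dp_lock, memory_order_release);
}

/* Device-specific region access handler (hypothetical signature). */
static ssize_t
example_region_rw(struct example_dev *dev, char *buf, size_t count,
		  size_t pos, bool iswrite)
{
	bool need_lock;

	if (pos > EXAMPLE_REG_SIZE || count > EXAMPLE_REG_SIZE - pos)
		return -1;

	/* Lock only when a write overlaps the control register. */
	need_lock = iswrite && pos < EXAMPLE_REG_CTRL + 4 &&
		    pos + count > EXAMPLE_REG_CTRL;

	if (need_lock)
		example_lock(dev);
	if (iswrite)
		memcpy(&dev->regs[pos], buf, count);
	else
		memcpy(buf, &dev->regs[pos], count);
	if (need_lock)
		example_unlock(dev);

	return (ssize_t)count;
}
```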

Thanks!
Chenbo

> 
> > @@ -871,7 +1056,8 @@ static int vfio_user_message_handler(int dev_id, int fd)
> >  		if (vfio_user_is_ready(dev) && dev->ops->new_device)
> >  			dev->ops->new_device(dev_id);
> >  	} else {
> > -		if ((cmd == VFIO_USER_DMA_MAP || cmd ==
> > VFIO_USER_DMA_UNMAP)
> > +		if ((cmd == VFIO_USER_DMA_MAP || cmd ==
> > VFIO_USER_DMA_UNMAP
> > +			|| cmd == VFIO_USER_DEVICE_SET_IRQS)
> >  			&& dev->ops->update_status)
> >  			dev->ops->update_status(dev_id);
> >  	}
> > @@ -898,6 +1084,7 @@ static int vfio_user_sock_read(int fd, void *data)
> >  		if (dev) {
> >  			dev->ops->destroy_device(dev_id);
> >  			vfio_user_destroy_mem_entries(dev->mem);
> > +			vfio_user_clean_irqfd(dev);
> >  			dev->is_ready = 0;
> >  			dev->msg_id = 0;
> >  		}
> > @@ -995,9 +1182,9 @@ vfio_user_start_server(struct vfio_user_server_socket
> > *sk)
> >  	}
> >
> >  	/* All the info must be set before start */
> > -	if (!dev->dev_info || !dev->reg) {
> > +	if (!dev->dev_info || !dev->reg || !dev->irqs.info) {
> >  		VFIO_USER_LOG(ERR, "Failed to start, "
> > -			"dev/reg info must be set before start\n");
> > +			"dev/reg/irq info must be set before start\n");
> >  		return -1;
> >  	}
> >


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 2/9] vfio_user: implement lifecycle related APIs
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 2/9] vfio_user: implement lifecycle related APIs Chenbo Xia
@ 2021-01-05  8:34   ` Xing, Beilei
  2021-01-05  9:58     ` Xia, Chenbo
  0 siblings, 1 reply; 43+ messages in thread
From: Xing, Beilei @ 2021-01-05  8:34 UTC (permalink / raw)
  To: Xia, Chenbo, dev, thomas, david.marchand
  Cc: stephen, Liang, Cunming, Lu, Xiuchun, Li, Miao, Wu, Jingjing



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Chenbo Xia
> Sent: Friday, December 18, 2020 3:39 PM
> To: dev@dpdk.org; thomas@monjalon.net; david.marchand@redhat.com
> Cc: stephen@networkplumber.org; Liang, Cunming
> <cunming.liang@intel.com>; Lu, Xiuchun <xiuchun.lu@intel.com>; Li, Miao
> <miao.li@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Subject: [dpdk-dev] [PATCH 2/9] vfio_user: implement lifecycle related APIs
> 
> This patch implements three lifecycle related APIs for vfio-user server, which
> are rte_vfio_user_register(), rte_vfio_user_unregister() and
> rte_vfio_user_start(). Socket and device management is implemented along
> with the API introduction.
> 
> Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
> Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
> ---
>  lib/librte_vfio_user/meson.build        |   3 +-
>  lib/librte_vfio_user/rte_vfio_user.h    |  51 ++
>  lib/librte_vfio_user/version.map        |   6 +
>  lib/librte_vfio_user/vfio_user_base.h   |   4 +
>  lib/librte_vfio_user/vfio_user_server.c | 690 ++++++++++++++++++++++++
>  lib/librte_vfio_user/vfio_user_server.h |  55 ++
>  6 files changed, 808 insertions(+), 1 deletion(-)
>  create mode 100644 lib/librte_vfio_user/rte_vfio_user.h
>  create mode 100644 lib/librte_vfio_user/vfio_user_server.c
>  create mode 100644 lib/librte_vfio_user/vfio_user_server.h
> 

<...>

> +static struct vfio_user_server_socket * find_vfio_user_socket(const

1. How about vfio_user_find_socket which is consistent with other function name?
2. According to the coding style, I think it's better to use such format:
static struct vfio_user_server_socket *
vfio_user_find_socket() {
}
And please also check all other functions. 


> +char *sock_addr) {
> +	uint32_t i;
> +
> +	if (sock_addr == NULL)
> +		return NULL;
> +
> +	for (i = 0; i < vfio_ep_sock.sock_num; i++) {
> +		struct vfio_user_server_socket *s = vfio_ep_sock.sock[i];
> +
> +		if (!strcmp(s->sock.sock_addr, sock_addr))
> +			return s;
> +	}
> +
> +	return NULL;
> +}
> +

<...>


> --
> 2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 2/9] vfio_user: implement lifecycle related APIs
  2021-01-05  8:34   ` Xing, Beilei
@ 2021-01-05  9:58     ` Xia, Chenbo
  0 siblings, 0 replies; 43+ messages in thread
From: Xia, Chenbo @ 2021-01-05  9:58 UTC (permalink / raw)
  To: Xing, Beilei, dev, thomas, david.marchand
  Cc: stephen, Liang, Cunming, Lu, Xiuchun, Li, Miao, Wu, Jingjing

Hi Beilei,

> -----Original Message-----
> From: Xing, Beilei <beilei.xing@intel.com>
> Sent: Tuesday, January 5, 2021 4:35 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>; dev@dpdk.org; thomas@monjalon.net;
> david.marchand@redhat.com
> Cc: stephen@networkplumber.org; Liang, Cunming <cunming.liang@intel.com>; Lu,
> Xiuchun <xiuchun.lu@intel.com>; Li, Miao <miao.li@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>
> Subject: RE: [dpdk-dev] [PATCH 2/9] vfio_user: implement lifecycle related
> APIs
> 
> 
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Chenbo Xia
> > Sent: Friday, December 18, 2020 3:39 PM
> > To: dev@dpdk.org; thomas@monjalon.net; david.marchand@redhat.com
> > Cc: stephen@networkplumber.org; Liang, Cunming
> > <cunming.liang@intel.com>; Lu, Xiuchun <xiuchun.lu@intel.com>; Li, Miao
> > <miao.li@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > Subject: [dpdk-dev] [PATCH 2/9] vfio_user: implement lifecycle related APIs
> >
> > This patch implements three lifecycle related APIs for vfio-user server,
> which
> > are rte_vfio_user_register(), rte_vfio_user_unregister() and
> > rte_vfio_user_start(). Socket and device management is implemented along
> > with the API introduction.
> >
> > Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
> > Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
> > ---
> >  lib/librte_vfio_user/meson.build        |   3 +-
> >  lib/librte_vfio_user/rte_vfio_user.h    |  51 ++
> >  lib/librte_vfio_user/version.map        |   6 +
> >  lib/librte_vfio_user/vfio_user_base.h   |   4 +
> >  lib/librte_vfio_user/vfio_user_server.c | 690 ++++++++++++++++++++++++
> >  lib/librte_vfio_user/vfio_user_server.h |  55 ++
> >  6 files changed, 808 insertions(+), 1 deletion(-)
> >  create mode 100644 lib/librte_vfio_user/rte_vfio_user.h
> >  create mode 100644 lib/librte_vfio_user/vfio_user_server.c
> >  create mode 100644 lib/librte_vfio_user/vfio_user_server.h
> >
> 
> <...>
> 
> > +static struct vfio_user_server_socket * find_vfio_user_socket(const
> 
> 1. How about vfio_user_find_socket which is consistent with other function
> name?

Good! Will fix in v2.

> 2. According to the coding style, I think it's better to use such format:
> static struct vfio_user_server_socket *
> vfio_user_find_socket() {
> }
> And please also check all other functions.

OK. Will fix the format and check.

Thanks!
Chenbo

> 
> 
> > +char *sock_addr) {
> > +	uint32_t i;
> > +
> > +	if (sock_addr == NULL)
> > +		return NULL;
> > +
> > +	for (i = 0; i < vfio_ep_sock.sock_num; i++) {
> > +		struct vfio_user_server_socket *s = vfio_ep_sock.sock[i];
> > +
> > +		if (!strcmp(s->sock.sock_addr, sock_addr))
> > +			return s;
> > +	}
> > +
> > +	return NULL;
> > +}
> > +
> 
> <...>
> 
> 
> > --
> > 2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 9/9] doc: add vfio-user library guide
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 9/9] doc: add vfio-user library guide Chenbo Xia
@ 2021-01-06  5:07   ` Xing, Beilei
  2021-01-06  7:43     ` Xia, Chenbo
  0 siblings, 1 reply; 43+ messages in thread
From: Xing, Beilei @ 2021-01-06  5:07 UTC (permalink / raw)
  To: Xia, Chenbo, dev, thomas, david.marchand
  Cc: stephen, Liang, Cunming, Lu, Xiuchun, Li, Miao, Wu, Jingjing



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Chenbo Xia
> Sent: Friday, December 18, 2020 3:39 PM
> To: dev@dpdk.org; thomas@monjalon.net; david.marchand@redhat.com
> Cc: stephen@networkplumber.org; Liang, Cunming
> <cunming.liang@intel.com>; Lu, Xiuchun <xiuchun.lu@intel.com>; Li, Miao
> <miao.li@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Subject: [dpdk-dev] [PATCH 9/9] doc: add vfio-user library guide
> 
> Add vfio-user library guide and update release notes.
> 
> Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
> Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
> ---
>  doc/guides/prog_guide/index.rst         |   1 +
>  doc/guides/prog_guide/vfio_user_lib.rst | 215 ++++++++++++++++++++++++
>  doc/guides/rel_notes/release_21_02.rst  |  11 ++
>  3 files changed, 227 insertions(+)
>  create mode 100644 doc/guides/prog_guide/vfio_user_lib.rst
> 
> diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
> index 45c7dec88d..f9847b1058 100644
> --- a/doc/guides/prog_guide/index.rst
> +++ b/doc/guides/prog_guide/index.rst
> @@ -70,3 +70,4 @@ Programmer's Guide
>      lto
>      profile_app
>      glossary
> +    vfio_user_lib
> diff --git a/doc/guides/prog_guide/vfio_user_lib.rst
> b/doc/guides/prog_guide/vfio_user_lib.rst
> new file mode 100644
> index 0000000000..6daec4d8e5
> --- /dev/null
> +++ b/doc/guides/prog_guide/vfio_user_lib.rst
> @@ -0,0 +1,215 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2020 Intel Corporation.
> +

<snip>

> +
> +3. Configure the device
> +
> +This step includes three APIs in Vfio User.
> +
> +* ``rte_vfio_user_dma_map(dev_id, mem, fds, num)``
> +
> +  This function maps DMA memory regions for the emulated device.
> +
> +  ``mem`` specifies the information of DMA memory regions.
> +
> +  ``fds`` specifies the file descriptors of the DMA memory regions.
> +
> +  ``num`` specifies the number of the DMA memory regions.
> +
> +* ``rte_vfio_user_dma_map(dev_id, mem, num)``

Should be rte_vfio_user_dma_unmap here.

> +
> +  This function unmaps DMA memory regions for the emulated device.
> +
> +* ``rte_vfio_user_set_irqs(dev_id, set)``
> +
> +  This function configure the interrupts for the emulated device.
> +
> +  ``set`` specifies the configuration of interrupts.
> +
> +After the above three steps are done, users can easily use the emulated
> +device (e.g., do I/O operations).
> \ No newline at end of file

<snip>

> --
> 2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 3/9] vfio_user: implement device and region related APIs
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 3/9] vfio_user: implement device and region " Chenbo Xia
@ 2021-01-06  5:51   ` Xing, Beilei
  2021-01-06  7:50     ` Xia, Chenbo
  0 siblings, 1 reply; 43+ messages in thread
From: Xing, Beilei @ 2021-01-06  5:51 UTC (permalink / raw)
  To: Xia, Chenbo, dev, thomas, david.marchand
  Cc: stephen, Liang, Cunming, Lu, Xiuchun, Li, Miao, Wu, Jingjing



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Chenbo Xia
> Sent: Friday, December 18, 2020 3:39 PM
> To: dev@dpdk.org; thomas@monjalon.net; david.marchand@redhat.com
> Cc: stephen@networkplumber.org; Liang, Cunming
> <cunming.liang@intel.com>; Lu, Xiuchun <xiuchun.lu@intel.com>; Li, Miao
> <miao.li@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Subject: [dpdk-dev] [PATCH 3/9] vfio_user: implement device and region
> related APIs
> 
> This patch introduces device and region related APIs, which are
> rte_vfio_user_set_dev_info() and rte_vfio_user_set_reg_info().
> The corresponding vfio-user command handling is also added with the
> definition of all vfio-user command identity.
> 
> Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
> Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
> ---
>  lib/librte_vfio_user/rte_vfio_user.h    |  60 ++++++
>  lib/librte_vfio_user/version.map        |   2 +
>  lib/librte_vfio_user/vfio_user_base.c   |  12 ++
>  lib/librte_vfio_user/vfio_user_base.h   |  32 +++-
>  lib/librte_vfio_user/vfio_user_server.c | 232 ++++++++++++++++++++++++
>  lib/librte_vfio_user/vfio_user_server.h |   2 +
>  6 files changed, 339 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vfio_user/rte_vfio_user.h
> b/lib/librte_vfio_user/rte_vfio_user.h
> index 0d4f6c1be2..8a999c7aa0 100644
> --- a/lib/librte_vfio_user/rte_vfio_user.h
> +++ b/lib/librte_vfio_user/rte_vfio_user.h
> @@ -5,13 +5,35 @@
>  #ifndef _RTE_VFIO_USER_H
>  #define _RTE_VFIO_USER_H
> 

<snip>

> +static int vfio_user_device_get_reg_info(struct vfio_user_server *dev,
> +	VFIO_USER_MSG *msg)
> +{
> +	struct vfio_user_reg *reg = &msg->payload.reg_info;
> +	struct rte_vfio_user_reg_info *reg_info;
> +	struct vfio_region_info *vinfo;
> +
> +	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
> +		return -EINVAL;
> +
> +	if (msg->size > sizeof(*reg) + VFIO_USER_MSG_HDR_SIZE ||
> +		dev->reg->reg_num <= reg->reg_info.index) {
> +		VFIO_USER_LOG(ERR, "Invalid message for get dev info\n");

Invalid message for get region info?

> +		return -EINVAL;
> +	}
> +
> +	reg_info = &dev->reg->reg_info[reg->reg_info.index];
> +	vinfo = reg_info->info;
> +	memcpy(reg, vinfo, vinfo->argsz);
> +
> +	if (reg_info->fd != -1) {
> +		msg->fd_num = 1;
> +		msg->fds[0] = reg_info->fd;
> +	}
> +
> +	VFIO_USER_LOG(DEBUG, "Region(%u) info: addr(0x%" PRIx64 "),
> fd(%d), "
> +		"sz(0x%llx), argsz(0x%x), c_off(0x%x), flags(0x%x) "
> +		"off(0x%llx)\n", vinfo->index, (uint64_t)reg_info->base,
> +		reg_info->fd, vinfo->size, vinfo->argsz, vinfo->cap_offset,
> +		vinfo->flags, vinfo->offset);
> +
> +	return 0;
> +}
> +

<snip>

> +int rte_vfio_user_set_dev_info(const char *sock_addr,
> +	struct vfio_device_info *dev_info)
> +{
> +	struct vfio_user_server *dev;
> +	struct vfio_user_server_socket *sk;
> +	int dev_id;
> +
> +	if (!dev_info)
> +		return -1;
> +
> +	pthread_mutex_lock(&vfio_ep_sock.mutex);
> +	sk = find_vfio_user_socket(sock_addr);
> +	pthread_mutex_unlock(&vfio_ep_sock.mutex);
> +
> +	if (!sk) {
> +		VFIO_USER_LOG(ERR, "Failed to set device info with
> sock_addr "
> +			"%s: addr not registered.\n", sock_addr);
> +		return -1;
> +	}
> +
> +	dev_id = sk->sock.dev_id;
> +	dev = vfio_user_get_device(dev_id);
> +	if (!dev) {
> +		VFIO_USER_LOG(ERR, "Failed to set device info:"
> +			"device %d not found.\n", dev_id);
> +		return -1;
> +	}
> +
> +	if (dev->started) {
> +		VFIO_USER_LOG(ERR, "Failed to set device info for
> device %d\n"
> +			 ", device already started\n", dev_id);
> +		return -1;
> +	}
> +
> +	dev->dev_info = dev_info;
> +
> +	return 0;
> +}
> +
> +int rte_vfio_user_set_reg_info(const char *sock_addr,
> +	struct rte_vfio_user_regions *reg)
> +{
> +	struct vfio_user_server *dev;
> +	struct vfio_user_server_socket *sk;
> +	int dev_id;
> +
> +	if (!reg)
> +		return -1;
> +
> +	pthread_mutex_lock(&vfio_ep_sock.mutex);
> +	sk = find_vfio_user_socket(sock_addr);
> +	pthread_mutex_unlock(&vfio_ep_sock.mutex);
> +
> +	if (!sk) {
> +		VFIO_USER_LOG(ERR, "Failed to set region info with
> sock_addr:"
> +			"%s: addr not registered.\n", sock_addr);
> +		return -1;
> +	}
> +
> +	dev_id = sk->sock.dev_id;
> +	dev = vfio_user_get_device(dev_id);
> +	if (!dev) {
> +		VFIO_USER_LOG(ERR, "Failed to set region info:"
> +			"device %d not found.\n", dev_id);
> +		return -1;
> +	}
> +
> +	if (dev->started) {
> +		VFIO_USER_LOG(ERR, "Failed to set region info for
> device %d\n"
> +			 ", device already started\n", dev_id);
> +		return -1;
> +	}
> +
> +	dev->reg = reg;
> +
> +	return 0;
> +}

Could we define a static function to cover the duplicated checks in rte_vfio_user_set_dev_info and rte_vfio_user_set_reg_info?
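
To make the suggestion concrete, here is a minimal sketch of one way the shared checks could be factored out. Everything prefixed with demo_ is hypothetical: the device structure, the lookup (which stands in for the locked socket search plus vfio_user_get_device()) and the helper name are assumptions for illustration, not code from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the library's device structure (assumption). */
struct demo_server {
	int started;
	const void *dev_info;
	const void *reg;
};

static struct demo_server demo_dev;	/* one fake registered device */

/* Stand-in for the locked socket lookup plus vfio_user_get_device(). */
static struct demo_server *
demo_lookup(const char *sock_addr)
{
	if (sock_addr == NULL || strcmp(sock_addr, "/tmp/demo.sock") != 0)
		return NULL;
	return &demo_dev;
}

/* Hypothetical shared helper holding the checks both setters repeat. */
static struct demo_server *
demo_get_stopped_dev(const char *sock_addr, const char *what)
{
	struct demo_server *dev = demo_lookup(sock_addr);

	if (dev == NULL) {
		fprintf(stderr, "Failed to set %s: addr not registered\n", what);
		return NULL;
	}
	if (dev->started) {
		fprintf(stderr, "Failed to set %s: device already started\n", what);
		return NULL;
	}
	return dev;
}

static int
demo_set_dev_info(const char *sock_addr, const void *dev_info)
{
	struct demo_server *dev;

	if (dev_info == NULL)
		return -1;
	dev = demo_get_stopped_dev(sock_addr, "device info");
	if (dev == NULL)
		return -1;
	dev->dev_info = dev_info;
	return 0;
}

static int
demo_set_reg_info(const char *sock_addr, const void *reg)
{
	struct demo_server *dev;

	if (reg == NULL)
		return -1;
	dev = demo_get_stopped_dev(sock_addr, "region info");
	if (dev == NULL)
		return -1;
	dev->reg = reg;
	return 0;
}

/* Small self-check of the sketch. */
static void
demo_set_example(void)
{
	static int info = 1, regs = 2;

	demo_dev.started = 0;
	assert(demo_set_dev_info("/tmp/demo.sock", &info) == 0);
	assert(demo_set_reg_info("/tmp/other.sock", &regs) == -1);
	demo_dev.started = 1;
	assert(demo_set_reg_info("/tmp/demo.sock", &regs) == -1);
}
```

Each setter then keeps only its own NULL check and the final assignment.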

> diff --git a/lib/librte_vfio_user/vfio_user_server.h
> b/lib/librte_vfio_user/vfio_user_server.h
> index 00e3f6353d..e8fb61cb3e 100644
> --- a/lib/librte_vfio_user/vfio_user_server.h
> +++ b/lib/librte_vfio_user/vfio_user_server.h
> @@ -16,6 +16,8 @@ struct vfio_user_server {
>  	uint32_t msg_id;
>  	char sock_addr[PATH_MAX];
>  	struct vfio_user_version ver;
> +	struct vfio_device_info *dev_info;
> +	struct rte_vfio_user_regions *reg;
>  };
> 
>  typedef int (*event_handler)(int fd, void *data);
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 9/9] doc: add vfio-user library guide
  2021-01-06  5:07   ` Xing, Beilei
@ 2021-01-06  7:43     ` Xia, Chenbo
  0 siblings, 0 replies; 43+ messages in thread
From: Xia, Chenbo @ 2021-01-06  7:43 UTC (permalink / raw)
  To: Xing, Beilei, dev, thomas, david.marchand
  Cc: stephen, Liang, Cunming, Lu, Xiuchun, Li, Miao, Wu, Jingjing

Hi Beilei,

> -----Original Message-----
> From: Xing, Beilei <beilei.xing@intel.com>
> Sent: Wednesday, January 6, 2021 1:08 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>; dev@dpdk.org; thomas@monjalon.net;
> david.marchand@redhat.com
> Cc: stephen@networkplumber.org; Liang, Cunming <cunming.liang@intel.com>; Lu,
> Xiuchun <xiuchun.lu@intel.com>; Li, Miao <miao.li@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>
> Subject: RE: [dpdk-dev] [PATCH 9/9] doc: add vfio-user library guide
> 
> 
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Chenbo Xia
> > Sent: Friday, December 18, 2020 3:39 PM
> > To: dev@dpdk.org; thomas@monjalon.net; david.marchand@redhat.com
> > Cc: stephen@networkplumber.org; Liang, Cunming
> > <cunming.liang@intel.com>; Lu, Xiuchun <xiuchun.lu@intel.com>; Li, Miao
> > <miao.li@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > Subject: [dpdk-dev] [PATCH 9/9] doc: add vfio-user library guide
> >
> > Add vfio-user library guide and update release notes.
> >
> > Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
> > Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
> > ---
> >  doc/guides/prog_guide/index.rst         |   1 +
> >  doc/guides/prog_guide/vfio_user_lib.rst | 215 ++++++++++++++++++++++++
> >  doc/guides/rel_notes/release_21_02.rst  |  11 ++
> >  3 files changed, 227 insertions(+)
> >  create mode 100644 doc/guides/prog_guide/vfio_user_lib.rst
> >
> > diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
> > index 45c7dec88d..f9847b1058 100644
> > --- a/doc/guides/prog_guide/index.rst
> > +++ b/doc/guides/prog_guide/index.rst
> > @@ -70,3 +70,4 @@ Programmer's Guide
> >      lto
> >      profile_app
> >      glossary
> > +    vfio_user_lib
> > diff --git a/doc/guides/prog_guide/vfio_user_lib.rst
> > b/doc/guides/prog_guide/vfio_user_lib.rst
> > new file mode 100644
> > index 0000000000..6daec4d8e5
> > --- /dev/null
> > +++ b/doc/guides/prog_guide/vfio_user_lib.rst
> > @@ -0,0 +1,215 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +    Copyright(c) 2020 Intel Corporation.
> > +
> 
> <snip>
> 
> > +
> > +3. Configure the device
> > +
> > +This step includes three APIs in Vfio User.
> > +
> > +* ``rte_vfio_user_dma_map(dev_id, mem, fds, num)``
> > +
> > +  This function maps DMA memory regions for the emulated device.
> > +
> > +  ``mem`` specifies the information of DMA memory regions.
> > +
> > +  ``fds`` specifies the file descriptors of the DMA memory regions.
> > +
> > +  ``num`` specifies the number of the DMA memory regions.
> > +
> > +* ``rte_vfio_user_dma_map(dev_id, mem, num)``
> 
> Should be rte_vfio_user_dma_unmap here.

Oops.. yes, you are correct! Will fix it then.

Thanks!
Chenbo

> 
> > +
> > +  This function unmaps DMA memory regions for the emulated device.
> > +
> > +* ``rte_vfio_user_set_irqs(dev_id, set)``
> > +
> > +  This function configure the interrupts for the emulated device.
> > +
> > +  ``set`` specifies the configuration of interrupts.
> > +
> > +After the above three steps are done, users can easily use the emulated
> > +device (e.g., do I/O operations).
> > \ No newline at end of file
> 
> <snip>
> 
> > --
> > 2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 3/9] vfio_user: implement device and region related APIs
  2021-01-06  5:51   ` Xing, Beilei
@ 2021-01-06  7:50     ` Xia, Chenbo
  0 siblings, 0 replies; 43+ messages in thread
From: Xia, Chenbo @ 2021-01-06  7:50 UTC (permalink / raw)
  To: Xing, Beilei, dev, thomas, david.marchand
  Cc: stephen, Liang, Cunming, Lu, Xiuchun, Li, Miao, Wu, Jingjing

Hi Beilei,

> -----Original Message-----
> From: Xing, Beilei <beilei.xing@intel.com>
> Sent: Wednesday, January 6, 2021 1:52 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>; dev@dpdk.org; thomas@monjalon.net;
> david.marchand@redhat.com
> Cc: stephen@networkplumber.org; Liang, Cunming <cunming.liang@intel.com>; Lu,
> Xiuchun <xiuchun.lu@intel.com>; Li, Miao <miao.li@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>
> Subject: RE: [dpdk-dev] [PATCH 3/9] vfio_user: implement device and region
> related APIs
> 
> 
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Chenbo Xia
> > Sent: Friday, December 18, 2020 3:39 PM
> > To: dev@dpdk.org; thomas@monjalon.net; david.marchand@redhat.com
> > Cc: stephen@networkplumber.org; Liang, Cunming
> > <cunming.liang@intel.com>; Lu, Xiuchun <xiuchun.lu@intel.com>; Li, Miao
> > <miao.li@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > Subject: [dpdk-dev] [PATCH 3/9] vfio_user: implement device and region
> > related APIs
> >
> > This patch introduces device and region related APIs, which are
> > rte_vfio_user_set_dev_info() and rte_vfio_user_set_reg_info().
> > The corresponding vfio-user command handling is also added with the
> > definition of all vfio-user command identity.
> >
> > Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
> > Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
> > ---
> >  lib/librte_vfio_user/rte_vfio_user.h    |  60 ++++++
> >  lib/librte_vfio_user/version.map        |   2 +
> >  lib/librte_vfio_user/vfio_user_base.c   |  12 ++
> >  lib/librte_vfio_user/vfio_user_base.h   |  32 +++-
> >  lib/librte_vfio_user/vfio_user_server.c | 232 ++++++++++++++++++++++++
> >  lib/librte_vfio_user/vfio_user_server.h |   2 +
> >  6 files changed, 339 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_vfio_user/rte_vfio_user.h
> > b/lib/librte_vfio_user/rte_vfio_user.h
> > index 0d4f6c1be2..8a999c7aa0 100644
> > --- a/lib/librte_vfio_user/rte_vfio_user.h
> > +++ b/lib/librte_vfio_user/rte_vfio_user.h
> > @@ -5,13 +5,35 @@
> >  #ifndef _RTE_VFIO_USER_H
> >  #define _RTE_VFIO_USER_H
> >
> 
> <snip>
> 
> > +static int vfio_user_device_get_reg_info(struct vfio_user_server *dev,
> > +	VFIO_USER_MSG *msg)
> > +{
> > +	struct vfio_user_reg *reg = &msg->payload.reg_info;
> > +	struct rte_vfio_user_reg_info *reg_info;
> > +	struct vfio_region_info *vinfo;
> > +
> > +	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
> > +		return -EINVAL;
> > +
> > +	if (msg->size > sizeof(*reg) + VFIO_USER_MSG_HDR_SIZE ||
> > +		dev->reg->reg_num <= reg->reg_info.index) {
> > +		VFIO_USER_LOG(ERR, "Invalid message for get dev info\n");
> 
> Invalid message for get region info?

Yes.. Will fix it in v2.

> 
> > +		return -EINVAL;
> > +	}
> > +
> > +	reg_info = &dev->reg->reg_info[reg->reg_info.index];
> > +	vinfo = reg_info->info;
> > +	memcpy(reg, vinfo, vinfo->argsz);
> > +
> > +	if (reg_info->fd != -1) {
> > +		msg->fd_num = 1;
> > +		msg->fds[0] = reg_info->fd;
> > +	}
> > +
> > +	VFIO_USER_LOG(DEBUG, "Region(%u) info: addr(0x%" PRIx64 "),
> > fd(%d), "
> > +		"sz(0x%llx), argsz(0x%x), c_off(0x%x), flags(0x%x) "
> > +		"off(0x%llx)\n", vinfo->index, (uint64_t)reg_info->base,
> > +		reg_info->fd, vinfo->size, vinfo->argsz, vinfo->cap_offset,
> > +		vinfo->flags, vinfo->offset);
> > +
> > +	return 0;
> > +}
> > +
> 
> <snip>
> 
> > +int rte_vfio_user_set_dev_info(const char *sock_addr,
> > +	struct vfio_device_info *dev_info)
> > +{
> > +	struct vfio_user_server *dev;
> > +	struct vfio_user_server_socket *sk;
> > +	int dev_id;
> > +
> > +	if (!dev_info)
> > +		return -1;
> > +
> > +	pthread_mutex_lock(&vfio_ep_sock.mutex);
> > +	sk = find_vfio_user_socket(sock_addr);
> > +	pthread_mutex_unlock(&vfio_ep_sock.mutex);
> > +
> > +	if (!sk) {
> > +		VFIO_USER_LOG(ERR, "Failed to set device info with
> > sock_addr "
> > +			"%s: addr not registered.\n", sock_addr);
> > +		return -1;
> > +	}
> > +
> > +	dev_id = sk->sock.dev_id;
> > +	dev = vfio_user_get_device(dev_id);
> > +	if (!dev) {
> > +		VFIO_USER_LOG(ERR, "Failed to set device info:"
> > +			"device %d not found.\n", dev_id);
> > +		return -1;
> > +	}
> > +
> > +	if (dev->started) {
> > +		VFIO_USER_LOG(ERR, "Failed to set device info for
> > device %d\n"
> > +			 ", device already started\n", dev_id);
> > +		return -1;
> > +	}
> > +
> > +	dev->dev_info = dev_info;
> > +
> > +	return 0;
> > +}
> > +
> > +int rte_vfio_user_set_reg_info(const char *sock_addr,
> > +	struct rte_vfio_user_regions *reg)
> > +{
> > +	struct vfio_user_server *dev;
> > +	struct vfio_user_server_socket *sk;
> > +	int dev_id;
> > +
> > +	if (!reg)
> > +		return -1;
> > +
> > +	pthread_mutex_lock(&vfio_ep_sock.mutex);
> > +	sk = find_vfio_user_socket(sock_addr);
> > +	pthread_mutex_unlock(&vfio_ep_sock.mutex);
> > +
> > +	if (!sk) {
> > +		VFIO_USER_LOG(ERR, "Failed to set region info with
> > sock_addr:"
> > +			"%s: addr not registered.\n", sock_addr);
> > +		return -1;
> > +	}
> > +
> > +	dev_id = sk->sock.dev_id;
> > +	dev = vfio_user_get_device(dev_id);
> > +	if (!dev) {
> > +		VFIO_USER_LOG(ERR, "Failed to set region info:"
> > +			"device %d not found.\n", dev_id);
> > +		return -1;
> > +	}
> > +
> > +	if (dev->started) {
> > +		VFIO_USER_LOG(ERR, "Failed to set region info for
> > device %d\n"
> > +			 ", device already started\n", dev_id);
> > +		return -1;
> > +	}
> > +
> > +	dev->reg = reg;
> > +
> > +	return 0;
> > +}
> 
> Could we define a static function to cover the duplicated checks in
> rte_vfio_user_set_dev_info and rte_vfio_user_set_reg_info?

Good catch! Yes, there is duplicated code in several of the set APIs. I will
refactor them in v2.

Thanks!
Chenbo

> 
> > diff --git a/lib/librte_vfio_user/vfio_user_server.h
> > b/lib/librte_vfio_user/vfio_user_server.h
> > index 00e3f6353d..e8fb61cb3e 100644
> > --- a/lib/librte_vfio_user/vfio_user_server.h
> > +++ b/lib/librte_vfio_user/vfio_user_server.h
> > @@ -16,6 +16,8 @@ struct vfio_user_server {
> >  	uint32_t msg_id;
> >  	char sock_addr[PATH_MAX];
> >  	struct vfio_user_version ver;
> > +	struct vfio_device_info *dev_info;
> > +	struct rte_vfio_user_regions *reg;
> >  };
> >
> >  typedef int (*event_handler)(int fd, void *data);
> > --
> > 2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 7/9] vfio_user: add client APIs of DMA/IRQ/region
  2020-12-18  7:38 ` [dpdk-dev] [PATCH 7/9] vfio_user: add client APIs of DMA/IRQ/region Chenbo Xia
@ 2021-01-07  2:41   ` Xing, Beilei
  2021-01-07  7:26     ` Xia, Chenbo
  0 siblings, 1 reply; 43+ messages in thread
From: Xing, Beilei @ 2021-01-07  2:41 UTC (permalink / raw)
  To: Xia, Chenbo, dev, thomas, david.marchand
  Cc: stephen, Liang, Cunming, Lu, Xiuchun, Li, Miao, Wu, Jingjing



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Chenbo Xia
> Sent: Friday, December 18, 2020 3:39 PM
> To: dev@dpdk.org; thomas@monjalon.net; david.marchand@redhat.com
> Cc: stephen@networkplumber.org; Liang, Cunming
> <cunming.liang@intel.com>; Lu, Xiuchun <xiuchun.lu@intel.com>; Li, Miao
> <miao.li@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Subject: [dpdk-dev] [PATCH 7/9] vfio_user: add client APIs of DMA/IRQ/region
> 
> This patch introduces nine APIs
> - Device related:
>   rte_vfio_user_get_dev_info and rte_vfio_user_reset
> - DMA related:
>   rte_vfio_user_dma_map/unmap
> - Region related:
>   rte_vfio_user_get_reg_info and rte_vfio_user_region_read/write
> - IRQ related:
>   rte_vfio_user_get_irq_info and rte_vfio_user_set_irqs
> 
> Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
> Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
> ---
>  lib/librte_vfio_user/rte_vfio_user.h    | 168 ++++++++++
>  lib/librte_vfio_user/version.map        |   9 +
>  lib/librte_vfio_user/vfio_user_client.c | 412 ++++++++++++++++++++++++
>  3 files changed, 589 insertions(+)
> 
> diff --git a/lib/librte_vfio_user/rte_vfio_user.h
> b/lib/librte_vfio_user/rte_vfio_user.h
> index b09d83e224..fe27d05992 100644
> --- a/lib/librte_vfio_user/rte_vfio_user.h
> +++ b/lib/librte_vfio_user/rte_vfio_user.h
> @@ -229,6 +229,15 @@ int rte_vfio_user_set_irq_info(const char *sock_addr,


> +int rte_vfio_user_get_dev_info(int dev_id, struct vfio_device_info
> +*info) {
> +	VFIO_USER_MSG msg;
> +	struct vfio_user_client *dev;
> +	int ret;
> +	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct
> +vfio_device_info);
> +
> +	if (!info)
> +		return -EINVAL;
> +
> +	dev = vfio_client_devs.cl[dev_id];
> +	if (!dev) {
> +		VFIO_USER_LOG(ERR, "Failed to get device info: "
> +			"wrong device ID\n");
> +		return -EINVAL;
> +	}
> +
> +	memset(&msg, 0, sizeof(VFIO_USER_MSG));
> +	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_GET_INFO, sz);
> +
> +	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
> +	if (ret)
> +		return ret;
> +
> +	if (msg.flags & VFIO_USER_ERROR) {
> +		VFIO_USER_LOG(ERR, "Failed to get device info: %s\n",
> +				msg.err ? strerror(msg.err) : "Unknown error");
> +		return msg.err ? -(int)msg.err : -1;
> +	}
> +
> +	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
> +		return -1;
> +
> +	memcpy(info, &msg.payload.dev_info, sizeof(*info));
> +	return ret;
> +}
> +

There seems to be duplicate code across the get_xxx_info functions; please double-check and refine.
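
As an illustration only, the common part (fill the header, send and receive, check the error flag, copy the payload out) could live in one helper that each get_xxx_info wrapper calls. The sketch below uses a made-up message layout and a fake in-process transport; all demo_ names, the flag value and the command numbers are assumptions, not the library's definitions:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FLAG_ERROR 0x1u

enum demo_cmd { DEMO_GET_DEV_INFO = 1, DEMO_GET_IRQ_INFO = 2 };

/* Made-up message: a 16-byte header followed by the payload. */
struct demo_msg {
	uint32_t cmd;
	uint32_t size;
	uint32_t flags;
	uint32_t err;
	uint8_t payload[64];
};

/* Fake transport: answers DEV_INFO, rejects every other command. */
static int
demo_send_recv(struct demo_msg *msg)
{
	if (msg->cmd == DEMO_GET_DEV_INFO) {
		uint32_t num_regions = 9;

		memcpy(msg->payload, &num_regions, sizeof(num_regions));
	} else {
		msg->flags |= DEMO_FLAG_ERROR;
		msg->err = ENOTSUP;
	}
	return 0;
}

/* Hypothetical shared helper for all "get info" style requests. */
static int
demo_client_get_info(uint32_t cmd, void *out, size_t out_sz)
{
	struct demo_msg msg;
	int ret;

	if (out == NULL || out_sz > sizeof(msg.payload))
		return -EINVAL;

	memset(&msg, 0, sizeof(msg));
	msg.cmd = cmd;
	msg.size = (uint32_t)(sizeof(msg) - sizeof(msg.payload) + out_sz);

	ret = demo_send_recv(&msg);
	if (ret != 0)
		return ret;

	/* Same error mapping as the quoted function: -err, or -1 if unset. */
	if (msg.flags & DEMO_FLAG_ERROR)
		return msg.err ? -(int)msg.err : -1;

	memcpy(out, msg.payload, out_sz);
	return 0;
}

/* Small self-check of the sketch. */
static void
demo_client_get_info_example(void)
{
	uint32_t num_regions = 0;
	uint32_t irq = 0;

	assert(demo_client_get_info(DEMO_GET_DEV_INFO, &num_regions,
				    sizeof(num_regions)) == 0);
	assert(num_regions == 9);
	assert(demo_client_get_info(DEMO_GET_IRQ_INFO, &irq,
				    sizeof(irq)) == -ENOTSUP);
	assert(demo_client_get_info(DEMO_GET_DEV_INFO, NULL, 4) == -EINVAL);
}
```

Each public getter would then reduce to argument validation plus one call to the helper with its own command and payload size.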

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH 7/9] vfio_user: add client APIs of DMA/IRQ/region
  2021-01-07  2:41   ` Xing, Beilei
@ 2021-01-07  7:26     ` Xia, Chenbo
  0 siblings, 0 replies; 43+ messages in thread
From: Xia, Chenbo @ 2021-01-07  7:26 UTC (permalink / raw)
  To: Xing, Beilei, dev, thomas, david.marchand
  Cc: stephen, Liang, Cunming, Lu, Xiuchun, Li, Miao, Wu, Jingjing

Hi Beilei,

> -----Original Message-----
> From: Xing, Beilei <beilei.xing@intel.com>
> Sent: Thursday, January 7, 2021 10:42 AM
> To: Xia, Chenbo <chenbo.xia@intel.com>; dev@dpdk.org; thomas@monjalon.net;
> david.marchand@redhat.com
> Cc: stephen@networkplumber.org; Liang, Cunming <cunming.liang@intel.com>; Lu,
> Xiuchun <xiuchun.lu@intel.com>; Li, Miao <miao.li@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>
> Subject: RE: [dpdk-dev] [PATCH 7/9] vfio_user: add client APIs of
> DMA/IRQ/region
> 
> 
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Chenbo Xia
> > Sent: Friday, December 18, 2020 3:39 PM
> > To: dev@dpdk.org; thomas@monjalon.net; david.marchand@redhat.com
> > Cc: stephen@networkplumber.org; Liang, Cunming
> > <cunming.liang@intel.com>; Lu, Xiuchun <xiuchun.lu@intel.com>; Li, Miao
> > <miao.li@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > Subject: [dpdk-dev] [PATCH 7/9] vfio_user: add client APIs of DMA/IRQ/region
> >
> > This patch introduces nine APIs
> > - Device related:
> >   rte_vfio_user_get_dev_info and rte_vfio_user_reset
> > - DMA related:
> >   rte_vfio_user_dma_map/unmap
> > - Region related:
> >   rte_vfio_user_get_reg_info and rte_vfio_user_region_read/write
> > - IRQ related:
> >   rte_vfio_user_get_irq_info and rte_vfio_user_set_irqs
> >
> > Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
> > Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
> > ---
> >  lib/librte_vfio_user/rte_vfio_user.h    | 168 ++++++++++
> >  lib/librte_vfio_user/version.map        |   9 +
> >  lib/librte_vfio_user/vfio_user_client.c | 412 ++++++++++++++++++++++++
> >  3 files changed, 589 insertions(+)
> >
> > diff --git a/lib/librte_vfio_user/rte_vfio_user.h
> > b/lib/librte_vfio_user/rte_vfio_user.h
> > index b09d83e224..fe27d05992 100644
> > --- a/lib/librte_vfio_user/rte_vfio_user.h
> > +++ b/lib/librte_vfio_user/rte_vfio_user.h
> > @@ -229,6 +229,15 @@ int rte_vfio_user_set_irq_info(const char *sock_addr,
> 
> 
> > +int rte_vfio_user_get_dev_info(int dev_id, struct vfio_device_info
> > +*info) {
> > +	VFIO_USER_MSG msg;
> > +	struct vfio_user_client *dev;
> > +	int ret;
> > +	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct
> > +vfio_device_info);
> > +
> > +	if (!info)
> > +		return -EINVAL;
> > +
> > +	dev = vfio_client_devs.cl[dev_id];
> > +	if (!dev) {
> > +		VFIO_USER_LOG(ERR, "Failed to get device info: "
> > +			"wrong device ID\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	memset(&msg, 0, sizeof(VFIO_USER_MSG));
> > +	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_GET_INFO, sz);
> > +
> > +	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (msg.flags & VFIO_USER_ERROR) {
> > +		VFIO_USER_LOG(ERR, "Failed to get device info: %s\n",
> > +				msg.err ? strerror(msg.err) : "Unknown error");
> > +		return msg.err ? -(int)msg.err : -1;
> > +	}
> > +
> > +	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
> > +		return -1;
> > +
> > +	memcpy(info, &msg.payload.dev_info, sizeof(*info));
> > +	return ret;
> > +}
> > +
> 
> There seems to be duplicate code across the get_xxx_info functions; please
> double-check and refine.

OK. Will refine this.

Thanks,
Chenbo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [dpdk-dev] [PATCH v2 0/9] Introduce vfio-user library
  2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
                   ` (9 preceding siblings ...)
  2020-12-18  9:37 ` [dpdk-dev] [PATCH 0/9] Introduce vfio-user library David Marchand
@ 2021-01-14  6:14 ` Chenbo Xia
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 1/9] lib: introduce " Chenbo Xia
                     ` (9 more replies)
  10 siblings, 10 replies; 43+ messages in thread
From: Chenbo Xia @ 2021-01-14  6:14 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu, beilei.xing

This series enables DPDK to serve as an alternative I/O device emulation
library for building virtualized devices in separate processes outside QEMU.
It introduces a new library for device emulation (librte_vfio_user).

The *librte_vfio_user* library is an implementation of VFIO-over-socket[1]
(also known as vfio-user), a protocol that allows a device to be virtualized
in a separate process outside of QEMU.

Background & Motivation 
-----------------------
The disaggregated/multi-process QEMU uses VFIO-over-socket/vfio-user
as the main transport mechanism to disaggregate I/O services from QEMU[2].
Vfio-user essentially implements the VFIO device model presented to the
user process as a set of messages over a unix-domain socket. The main
difference between an application using vfio-user and one using the vfio
kernel module is that device manipulation is based on socket messages for
vfio-user but on system calls for the vfio kernel module. A vfio-user device
consists of a generic VFIO device type, living in QEMU, which is called the
client[3], and the core device implementation (emulated device), living
outside of QEMU, which is called the server. Once vfio-user allows emulated
devices to be moved out of QEMU, other processes are needed to host the
virtualized/emulated devices. This series introduces vfio-user support in
DPDK so that DPDK can be one of the places, besides QEMU, that host
emulated devices.

This series introduces the server and client implementations of the vfio-user
protocol. The server plays the role of the emulated device and the client is
the device consumer. With this implementation, DPDK can act as both device
provider and consumer.

Design overview
---------------

+--------------+     +--------------+     
| +----------+ |     | +----------+ |
| | Generic  | |     | | Emulated | |
| | vfio-dev | |     | | device   | |
| +----------+ |     | +----|-----+ |
| +----------+ |     | +----|-----+ |
| | vfio-user| |     | | vfio-user| |
| | client   | |<--->| | server   | |
| +----------+ |     | +----------+ |
| QEMU/DPDK    |     | DPDK         |
+--------------+     +--------------+

- Generic vfio-dev.
  It is the generic vfio framework in vfio applications like QEMU or DPDK.
  Applications can keep most of their vfio device management and plug in a
  vfio-user device type. Note that in the current implementation, we have not
  yet integrated the vfio-user client into the kernel vfio support in DPDK,
  but doing so is viable and worthwhile.

- vfio-user client.
  For DPDK, it is the part of librte_vfio_user that provides ways to
  manipulate vfio-user based emulated devices. This manipulation is very
  similar to kernel vfio (i.e., syscalls like ioctl, mmap and pread/pwrite).
  It is the base for a vfio-user device consumer.

- vfio-user server.
  It is the server part of librte_vfio_user. It provides ways to emulate your
  own device. A device provider only needs to care about the device layout
  that VFIO defines and does not need to know how it communicates with the
  vfio-user client.

- Emulated device.
  It is an emulated device of any type (e.g., network, crypto, etc.).
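
As a rough illustration of "socket messages instead of system calls", the sketch below sends one request header from a client end to a server end and reads the reply. The header layout (struct demo_hdr), the command number and the reply flag are made up for the example and are not the vfio-user wire format; a socketpair stands in for the unix-domain socket between client and server, and both ends run in one process for simplicity:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical message header, not the vfio-user wire format. */
struct demo_hdr {
	uint16_t msg_id;
	uint16_t cmd;
	uint32_t size;
	uint32_t flags;
};

#define DEMO_CMD_GET_INFO 1
#define DEMO_FLAG_REPLY   0x1u

/* Server side: read one request and send it back marked as a reply. */
static int
demo_server_handle_one(int fd)
{
	struct demo_hdr hdr;

	if (read(fd, &hdr, sizeof(hdr)) != (ssize_t)sizeof(hdr))
		return -1;
	hdr.flags |= DEMO_FLAG_REPLY;
	if (write(fd, &hdr, sizeof(hdr)) != (ssize_t)sizeof(hdr))
		return -1;
	return 0;
}

/* One request/reply round trip; returns 0 on success, -1 on failure. */
static int
demo_round_trip(void)
{
	int sv[2];
	struct demo_hdr req, reply;
	int ret = -1;

	/* The four fields pack to 12 bytes with no padding. */
	assert(sizeof(struct demo_hdr) == 12);

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		return -1;

	memset(&req, 0, sizeof(req));
	memset(&reply, 0, sizeof(reply));
	req.msg_id = 7;
	req.cmd = DEMO_CMD_GET_INFO;
	req.size = sizeof(req);

	/* Client sends, server handles, client reads the reply. */
	if (write(sv[0], &req, sizeof(req)) == (ssize_t)sizeof(req) &&
	    demo_server_handle_one(sv[1]) == 0 &&
	    read(sv[0], &reply, sizeof(reply)) == (ssize_t)sizeof(reply) &&
	    reply.msg_id == req.msg_id &&
	    (reply.flags & DEMO_FLAG_REPLY) != 0)
		ret = 0;

	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

With kernel vfio the same query would be an ioctl on a device file descriptor; here the request and its answer travel as messages on the socket.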

References
----------
[1]: https://patchew.org/QEMU/20201130161229.23164-1-thanos.makatos@nutanix.com/
[2]: https://wiki.qemu.org/Features/MultiProcessQEMU
[3]: https://github.com/oracle/qemu/tree/vfio-user-v0.2

----------------------------------
v2:
 - Clean up non-static inline function (Stephen)
 - Naturally pack vfio-user message payload and header (Stephen)
 - Make function definition align with coding style (Beilei)
 - Clean up duplicate code in vfio-user server APIs (Beilei)
 - Fix some typos

Chenbo Xia (9):
  lib: introduce vfio-user library
  vfio_user: implement lifecycle related APIs
  vfio_user: implement device and region related APIs
  vfio_user: implement DMA table and socket address API
  vfio_user: implement interrupt related APIs
  vfio_user: add client APIs of device attach/detach
  vfio_user: add client APIs of DMA/IRQ/region
  test/vfio_user: introduce functional test
  doc: add vfio-user library guide

 MAINTAINERS                             |    4 +
 app/test/meson.build                    |    4 +
 app/test/test_vfio_user.c               |  665 ++++++++++
 doc/guides/prog_guide/index.rst         |    1 +
 doc/guides/prog_guide/vfio_user_lib.rst |  215 +++
 doc/guides/rel_notes/release_21_02.rst  |   11 +
 lib/librte_vfio_user/meson.build        |   11 +
 lib/librte_vfio_user/rte_vfio_user.h    |  446 +++++++
 lib/librte_vfio_user/version.map        |   26 +
 lib/librte_vfio_user/vfio_user_base.c   |  223 ++++
 lib/librte_vfio_user/vfio_user_base.h   |  109 ++
 lib/librte_vfio_user/vfio_user_client.c |  700 ++++++++++
 lib/librte_vfio_user/vfio_user_client.h |   26 +
 lib/librte_vfio_user/vfio_user_server.c | 1593 +++++++++++++++++++++++
 lib/librte_vfio_user/vfio_user_server.h |   66 +
 lib/meson.build                         |    1 +
 16 files changed, 4101 insertions(+)
 create mode 100644 app/test/test_vfio_user.c
 create mode 100644 doc/guides/prog_guide/vfio_user_lib.rst
 create mode 100644 lib/librte_vfio_user/meson.build
 create mode 100644 lib/librte_vfio_user/rte_vfio_user.h
 create mode 100644 lib/librte_vfio_user/version.map
 create mode 100644 lib/librte_vfio_user/vfio_user_base.c
 create mode 100644 lib/librte_vfio_user/vfio_user_base.h
 create mode 100644 lib/librte_vfio_user/vfio_user_client.c
 create mode 100644 lib/librte_vfio_user/vfio_user_client.h
 create mode 100644 lib/librte_vfio_user/vfio_user_server.c
 create mode 100644 lib/librte_vfio_user/vfio_user_server.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [dpdk-dev] [PATCH v2 1/9] lib: introduce vfio-user library
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
@ 2021-01-14  6:14   ` Chenbo Xia
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 2/9] vfio_user: implement lifecycle related APIs Chenbo Xia
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: Chenbo Xia @ 2021-01-14  6:14 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu, beilei.xing

This patch introduces the vfio-user library, which follows the vfio-user
protocol v1.0. As vfio-user has server and client implementations,
this patch introduces the basic structures and internal functions that
will be used by both server and client.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
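Note (not part of the commit message): vfio_user_send_fd_msg() and
vfio_user_recv_fd_msg() are built on SCM_RIGHTS ancillary data over a
Unix domain socket. Below is a minimal standalone sketch of that
technique, not the library code. The names send_with_fd() and
recv_with_fd() are hypothetical, and the sketch passes a single fd,
whereas the library accepts up to VFIO_USER_MAX_FD per message.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for one fd; the union keeps it cmsghdr-aligned. */
union fd_ctrl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper: send len bytes of buf plus one fd (SCM_RIGHTS). */
static int
send_with_fd(int sock, const void *buf, size_t len, int fd)
{
	union fd_ctrl control;
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	struct msghdr msgh;
	struct cmsghdr *cmsg;

	memset(&msgh, 0, sizeof(msgh));
	memset(&control, 0, sizeof(control));
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;
	msgh.msg_control = control.buf;
	msgh.msg_controllen = sizeof(control.buf);

	cmsg = CMSG_FIRSTHDR(&msgh);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return (int)sendmsg(sock, &msgh, MSG_NOSIGNAL);
}

/* Hypothetical helper: receive data and at most one fd (-1 if none). */
static int
recv_with_fd(int sock, void *buf, size_t len, int *fd)
{
	union fd_ctrl control;
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msgh;
	struct cmsghdr *cmsg;
	int ret;

	*fd = -1;
	memset(&msgh, 0, sizeof(msgh));
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;
	msgh.msg_control = control.buf;
	msgh.msg_controllen = sizeof(control.buf);

	ret = (int)recvmsg(sock, &msgh, 0);
	if (ret <= 0)
		return ret;
	/* Treat truncated data or control as an invalid message. */
	if (msgh.msg_flags & (MSG_TRUNC | MSG_CTRUNC))
		return -1;

	for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg != NULL;
	     cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
		if (cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SCM_RIGHTS) {
			memcpy(fd, CMSG_DATA(cmsg), sizeof(int));
			break;
		}
	}

	return ret;
}
```

The receiver gets a new descriptor number that refers to the same open
file as the sender's fd, so the receiver owns it and must close it. This
is why the library closes the fds it received on every error path.
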
 MAINTAINERS                           |   4 +
 lib/librte_vfio_user/meson.build      |   9 ++
 lib/librte_vfio_user/version.map      |   3 +
 lib/librte_vfio_user/vfio_user_base.c | 211 ++++++++++++++++++++++++++
 lib/librte_vfio_user/vfio_user_base.h |  65 ++++++++
 lib/meson.build                       |   1 +
 6 files changed, 293 insertions(+)
 create mode 100644 lib/librte_vfio_user/meson.build
 create mode 100644 lib/librte_vfio_user/version.map
 create mode 100644 lib/librte_vfio_user/vfio_user_base.c
 create mode 100644 lib/librte_vfio_user/vfio_user_base.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 6787b15dcc..91b8b2ccc1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1541,6 +1541,10 @@ M: Nithin Dabilpuram <ndabilpuram@marvell.com>
 M: Pavan Nikhilesh <pbhagavatula@marvell.com>
 F: lib/librte_node/
 
+Vfio-user - EXPERIMENTAL
+M: Chenbo Xia <chenbo.xia@intel.com>
+M: Xiuchun Lu <xiuchun.lu@intel.com>
+F: lib/librte_vfio_user/
 
 Test Applications
 -----------------
diff --git a/lib/librte_vfio_user/meson.build b/lib/librte_vfio_user/meson.build
new file mode 100644
index 0000000000..0f6407b80f
--- /dev/null
+++ b/lib/librte_vfio_user/meson.build
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+if not is_linux
+	build = false
+	reason = 'only supported on Linux'
+endif
+
+sources = files('vfio_user_base.c')
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
new file mode 100644
index 0000000000..33c1b976f1
--- /dev/null
+++ b/lib/librte_vfio_user/version.map
@@ -0,0 +1,3 @@
+EXPERIMENTAL {
+	local: *;
+};
diff --git a/lib/librte_vfio_user/vfio_user_base.c b/lib/librte_vfio_user/vfio_user_base.c
new file mode 100644
index 0000000000..b9fdff5b02
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_base.c
@@ -0,0 +1,211 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <sys/socket.h>
+#include <string.h>
+
+#include "vfio_user_base.h"
+
+int vfio_user_log_level;
+
+const char *vfio_user_msg_str[VFIO_USER_MAX] = {
+	[VFIO_USER_NONE] = "VFIO_USER_NONE",
+	[VFIO_USER_VERSION] = "VFIO_USER_VERSION",
+};
+
+void
+vfio_user_close_msg_fds(struct vfio_user_msg *msg)
+{
+	int i;
+
+	for (i = 0; i < msg->fd_num; i++)
+		close(msg->fds[i]);
+}
+
+int
+vfio_user_check_msg_fdnum(struct vfio_user_msg *msg, int expected_fds)
+{
+	if (msg->fd_num == expected_fds)
+		return 0;
+
+	VFIO_USER_LOG(ERR, "Expect %d FDs for request %s, received %d\n",
+		expected_fds, vfio_user_msg_str[msg->cmd], msg->fd_num);
+
+	vfio_user_close_msg_fds(msg);
+
+	return -1;
+}
+
+static int
+vfio_user_recv_fd_msg(int sockfd, char *buf, int buflen, int *fds,
+	int max_fds, int *fd_num)
+{
+	struct iovec iov;
+	struct msghdr msgh;
+	char control[CMSG_SPACE(max_fds * sizeof(int))];
+	struct cmsghdr *cmsg;
+	int fd_sz, got_fds = 0;
+	int ret, i;
+
+	*fd_num = 0;
+
+	memset(&msgh, 0, sizeof(msgh));
+	iov.iov_base = buf;
+	iov.iov_len  = buflen;
+
+	msgh.msg_iov = &iov;
+	msgh.msg_iovlen = 1;
+	msgh.msg_control = control;
+	msgh.msg_controllen = sizeof(control);
+
+	ret = recvmsg(sockfd, &msgh, 0);
+	if (ret <= 0) {
+		if (ret)
+			VFIO_USER_LOG(DEBUG, "recvmsg failed\n");
+		return ret;
+	}
+
+	if (msgh.msg_flags & (MSG_TRUNC | MSG_CTRUNC)) {
+		VFIO_USER_LOG(ERR, "Message is truncated\n");
+		return -1;
+	}
+
+	for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg != NULL;
+		cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
+		if ((cmsg->cmsg_level == SOL_SOCKET) &&
+			(cmsg->cmsg_type == SCM_RIGHTS)) {
+			fd_sz = cmsg->cmsg_len - CMSG_LEN(0);
+			got_fds = fd_sz / sizeof(int);
+			if (got_fds >= max_fds) {
+				/* Invalid message, close fds */
+				int *close_fd = (int *)CMSG_DATA(cmsg);
+				for (i = 0; i < got_fds; i++) {
+					/* Close every fd that was received */
+					close(close_fd[i]);
+				}
+				VFIO_USER_LOG(ERR, "fd num exceeds max "
+					"in vfio-user msg\n");
+				return -1;
+			}
+			*fd_num = got_fds;
+			memcpy(fds, CMSG_DATA(cmsg), got_fds * sizeof(int));
+			break;
+		}
+	}
+
+	/* Make unused file descriptors invalid */
+	while (got_fds < max_fds)
+		fds[got_fds++] = -1;
+
+	return ret;
+}
+
+int
+vfio_user_recv_msg(int sockfd, struct vfio_user_msg *msg)
+{
+	int ret;
+
+	ret = vfio_user_recv_fd_msg(sockfd, (char *)msg, VFIO_USER_MSG_HDR_SIZE,
+		msg->fds, VFIO_USER_MAX_FD, &msg->fd_num);
+	if (ret <= 0) {
+		return ret;
+	} else if (ret != VFIO_USER_MSG_HDR_SIZE) {
+		VFIO_USER_LOG(ERR, "Read unexpected header size\n");
+		ret = -1;
+		goto err;
+	}
+
+	if (msg->size > VFIO_USER_MSG_HDR_SIZE) {
+		if (msg->size > (sizeof(msg->payload) +
+			VFIO_USER_MSG_HDR_SIZE)) {
+			VFIO_USER_LOG(ERR, "Read invalid msg size: %d\n",
+				msg->size);
+			ret = -1;
+			goto err;
+		}
+
+		ret = read(sockfd, &msg->payload,
+			msg->size - VFIO_USER_MSG_HDR_SIZE);
+		if (ret <= 0)
+			goto err;
+		if (ret != (int)(msg->size - VFIO_USER_MSG_HDR_SIZE)) {
+			VFIO_USER_LOG(ERR, "Read payload failed\n");
+			ret = -1;
+			goto err;
+		}
+	}
+
+	return ret;
+err:
+	vfio_user_close_msg_fds(msg);
+	return ret;
+}
+
+static int
+vfio_user_send_fd_msg(int sockfd, char *buf, int buflen, int *fds, int fd_num)
+{
+
+	struct iovec iov;
+	struct msghdr msgh;
+	size_t fdsize = fd_num * sizeof(int);
+	char control[CMSG_SPACE(fdsize)];
+	struct cmsghdr *cmsg;
+	int ret;
+
+	memset(&msgh, 0, sizeof(msgh));
+	iov.iov_base = buf;
+	iov.iov_len = buflen;
+
+	msgh.msg_iov = &iov;
+	msgh.msg_iovlen = 1;
+
+	if (fds && fd_num > 0) {
+		msgh.msg_control = control;
+		msgh.msg_controllen = sizeof(control);
+		cmsg = CMSG_FIRSTHDR(&msgh);
+		cmsg->cmsg_len = CMSG_LEN(fdsize);
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		memcpy(CMSG_DATA(cmsg), fds, fdsize);
+	} else {
+		msgh.msg_control = NULL;
+		msgh.msg_controllen = 0;
+	}
+
+	do {
+		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
+	} while (ret < 0 && errno == EINTR);
+
+	if (ret < 0) {
+		VFIO_USER_LOG(ERR, "sendmsg error\n");
+		return ret;
+	}
+
+	return ret;
+}
+
+int
+vfio_user_send_msg(int sockfd, struct vfio_user_msg *msg)
+{
+	if (!msg)
+		return 0;
+
+	return vfio_user_send_fd_msg(sockfd, (char *)msg,
+		msg->size, msg->fds, msg->fd_num);
+}
+
+int
+vfio_user_reply_msg(int sockfd, struct vfio_user_msg *msg)
+{
+	if (!msg)
+		return 0;
+
+	msg->flags |= VFIO_USER_NEED_NO_RP;
+	msg->flags |= VFIO_USER_TYPE_REPLY;
+
+	return vfio_user_send_msg(sockfd, msg);
+}
+
+RTE_LOG_REGISTER(vfio_user_log_level, lib.vfio, INFO);
diff --git a/lib/librte_vfio_user/vfio_user_base.h b/lib/librte_vfio_user/vfio_user_base.h
new file mode 100644
index 0000000000..34106cc606
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_base.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _VFIO_USER_BASE_H
+#define _VFIO_USER_BASE_H
+
+#include <rte_log.h>
+
+#define VFIO_USER_MAX_FD 1024
+#define VFIO_USER_MAX_VERSION_DATA 512
+
+extern int vfio_user_log_level;
+extern const char *vfio_user_msg_str[];
+
+#define VFIO_USER_LOG(level, fmt, args...)			\
+	rte_log(RTE_LOG_ ## level, vfio_user_log_level,		\
+	"VFIO_USER: " fmt, ## args)
+
+struct vfio_user_socket {
+	char *sock_addr;
+	int sock_fd;
+	int dev_id;
+};
+
+typedef enum VFIO_USER_CMD_TYPE {
+	VFIO_USER_NONE = 0,
+	VFIO_USER_VERSION = 1,
+	VFIO_USER_MAX = 2,
+} VFIO_USER_CMD_TYPE;
+
+struct vfio_user_version {
+	uint16_t major;
+	uint16_t minor;
+	/* Version data (JSON), for now not supported */
+	uint8_t ver_data[VFIO_USER_MAX_VERSION_DATA];
+};
+
+struct vfio_user_msg {
+	uint16_t msg_id;
+	uint16_t cmd;
+	uint32_t size;
+#define VFIO_USER_TYPE_CMD	(0x0)		/* Message type is COMMAND */
+#define VFIO_USER_TYPE_REPLY	(0x1 << 0)	/* Message type is REPLY */
+#define VFIO_USER_NEED_NO_RP	(0x1 << 4)	/* Message needs no reply */
+#define VFIO_USER_ERROR		(0x1 << 5)	/* Reply message has error */
+	uint32_t flags;
+	uint32_t err;				/* Valid in reply, optional */
+	union {
+		struct vfio_user_version ver;
+	} payload;
+	int fds[VFIO_USER_MAX_FD];
+	int fd_num;
+};
+
+#define VFIO_USER_MSG_HDR_SIZE offsetof(struct vfio_user_msg, payload.ver)
+
+void vfio_user_close_msg_fds(struct vfio_user_msg *msg);
+int vfio_user_check_msg_fdnum(struct vfio_user_msg *msg, int expected_fds);
+void vfio_user_close_msg_fds(struct vfio_user_msg *msg);
+int vfio_user_recv_msg(int sockfd, struct vfio_user_msg *msg);
+int vfio_user_send_msg(int sockfd, struct vfio_user_msg *msg);
+int vfio_user_reply_msg(int sockfd, struct vfio_user_msg *msg);
+
+#endif
diff --git a/lib/meson.build b/lib/meson.build
index ed00f89146..b7fbfcc95b 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -28,6 +28,7 @@ libraries = [
 	'rib', 'reorder', 'sched', 'security', 'stack', 'vhost',
 	# ipsec lib depends on net, crypto and security
 	'ipsec',
+	'vfio_user',
 	#fib lib depends on rib
 	'fib',
 	# add pkt framework libs which use other libs from above
-- 
2.17.1



* [dpdk-dev] [PATCH v2 2/9] vfio_user: implement lifecycle related APIs
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 1/9] lib: introduce " Chenbo Xia
@ 2021-01-14  6:14   ` Chenbo Xia
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 3/9] vfio_user: implement device and region " Chenbo Xia
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: Chenbo Xia @ 2021-01-14  6:14 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu, beilei.xing

This patch implements three lifecycle-related APIs for the vfio-user
server: rte_vfio_user_register(), rte_vfio_user_unregister() and
rte_vfio_user_start(). Socket and device management is implemented
along with the APIs.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
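Note (not part of the commit message): the version check in
vfio_user_negotiate_version() and the error-reply rule in
vfio_user_message_handler() can be summarized by the standalone sketch
below. It is an illustration under assumptions, not the library code:
struct ver and struct hdr are trimmed stand-ins for struct
vfio_user_version and the header part of struct vfio_user_msg, and
check_version() and make_reply() are hypothetical names. The flag values
are copied from vfio_user_base.h.

```c
#include <errno.h>
#include <stdint.h>

#define VFIO_USER_TYPE_REPLY	(0x1 << 0)	/* Message type is REPLY */
#define VFIO_USER_NEED_NO_RP	(0x1 << 4)	/* Message needs no reply */
#define VFIO_USER_ERROR		(0x1 << 5)	/* Reply message has error */

/* Stand-in for struct vfio_user_version, without the JSON data. */
struct ver {
	uint16_t major;
	uint16_t minor;
};

/* Stand-in for the 16-byte header that precedes the payload. */
struct hdr {
	uint16_t msg_id;
	uint16_t cmd;
	uint32_t size;
	uint32_t flags;
	uint32_t err;
};

/* Accept a client with the same major and a minor we already support. */
static int
check_version(const struct ver *server, const struct ver *client)
{
	if (client->major == server->major &&
	    client->minor <= server->minor)
		return 0;

	return -ENOTSUP;
}

/* Turn a request header into a reply; errors are header-only. */
static void
make_reply(struct hdr *h, int ret)
{
	if (ret < 0) {
		h->flags |= VFIO_USER_ERROR;
		h->err = (uint32_t)-ret;
		h->size = sizeof(*h);
	}
	h->flags |= VFIO_USER_NEED_NO_RP | VFIO_USER_TYPE_REPLY;
}
```

With a server at 1.0, a client at 1.0 is accepted, while clients at 1.1
or 2.0 get a header-only reply carrying VFIO_USER_ERROR and ENOTSUP.
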
 lib/librte_vfio_user/meson.build        |   3 +-
 lib/librte_vfio_user/rte_vfio_user.h    |  54 ++
 lib/librte_vfio_user/version.map        |   6 +
 lib/librte_vfio_user/vfio_user_base.h   |   4 +
 lib/librte_vfio_user/vfio_user_server.c | 707 ++++++++++++++++++++++++
 lib/librte_vfio_user/vfio_user_server.h |  55 ++
 6 files changed, 828 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_vfio_user/rte_vfio_user.h
 create mode 100644 lib/librte_vfio_user/vfio_user_server.c
 create mode 100644 lib/librte_vfio_user/vfio_user_server.h

diff --git a/lib/librte_vfio_user/meson.build b/lib/librte_vfio_user/meson.build
index 0f6407b80f..b7363f61c6 100644
--- a/lib/librte_vfio_user/meson.build
+++ b/lib/librte_vfio_user/meson.build
@@ -6,4 +6,5 @@ if not is_linux
 	reason = 'only supported on Linux'
 endif
 
-sources = files('vfio_user_base.c')
+sources = files('vfio_user_base.c', 'vfio_user_server.c')
+headers = files('rte_vfio_user.h')
diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
new file mode 100644
index 0000000000..705a2f6632
--- /dev/null
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_VFIO_USER_H
+#define _RTE_VFIO_USER_H
+
+#include <rte_compat.h>
+
+/**
+ *  Below APIs are for vfio-user server (device provider) to use:
+ *	*rte_vfio_user_register
+ *	*rte_vfio_user_unregister
+ *	*rte_vfio_user_start
+ */
+
+/**
+ * Register a vfio-user device.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int
+rte_vfio_user_register(const char *sock_addr);
+
+/**
+ * Unregister a vfio-user device.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int
+rte_vfio_user_unregister(const char *sock_addr);
+
+/**
+ * Start vfio-user handling for the device.
+ *
+ * This function triggers vfio-user message handling.
+ * @param sock_addr
+ *   Unix domain socket address
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int
+rte_vfio_user_start(const char *sock_addr);
+
+#endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index 33c1b976f1..e53095eda8 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -1,3 +1,9 @@
 EXPERIMENTAL {
+	global:
+
+	rte_vfio_user_register;
+	rte_vfio_user_unregister;
+	rte_vfio_user_start;
+
 	local: *;
 };
diff --git a/lib/librte_vfio_user/vfio_user_base.h b/lib/librte_vfio_user/vfio_user_base.h
index 34106cc606..f9b0b94665 100644
--- a/lib/librte_vfio_user/vfio_user_base.h
+++ b/lib/librte_vfio_user/vfio_user_base.h
@@ -7,6 +7,10 @@
 
 #include <rte_log.h>
 
+#include "rte_vfio_user.h"
+
+#define VFIO_USER_VERSION_MAJOR 1
+#define VFIO_USER_VERSION_MINOR 0
 #define VFIO_USER_MAX_FD 1024
 #define VFIO_USER_MAX_VERSION_DATA 512
 
diff --git a/lib/librte_vfio_user/vfio_user_server.c b/lib/librte_vfio_user/vfio_user_server.c
new file mode 100644
index 0000000000..35544c819a
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_server.c
@@ -0,0 +1,707 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <fcntl.h>
+#include <pthread.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+
+#include "vfio_user_server.h"
+
+#define MAX_VFIO_USER_DEVICE 1024
+
+static struct vfio_user_server *vfio_user_devices[MAX_VFIO_USER_DEVICE];
+static pthread_mutex_t vfio_dev_mutex = PTHREAD_MUTEX_INITIALIZER;
+
+static struct vfio_user_ep_sock vfio_ep_sock = {
+	.ep = {
+		.fd_mutex = PTHREAD_MUTEX_INITIALIZER,
+		.fd_num = 0
+	},
+	.sock_num = 0,
+	.mutex = PTHREAD_MUTEX_INITIALIZER,
+};
+
+static int
+vfio_user_negotiate_version(struct vfio_user_server *dev,
+	struct vfio_user_msg *msg)
+{
+	struct vfio_user_version *ver = &msg->payload.ver;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	if (ver->major == dev->ver.major && ver->minor <= dev->ver.minor)
+		return 0;
+	else
+		return -ENOTSUP;
+}
+
+static vfio_user_msg_handler_t vfio_user_msg_handlers[VFIO_USER_MAX] = {
+	[VFIO_USER_NONE] = NULL,
+	[VFIO_USER_VERSION] = vfio_user_negotiate_version,
+};
+
+static struct vfio_user_server_socket *
+vfio_user_find_socket(const char *sock_addr)
+{
+	uint32_t i;
+
+	if (sock_addr == NULL)
+		return NULL;
+
+	for (i = 0; i < vfio_ep_sock.sock_num; i++) {
+		struct vfio_user_server_socket *s = vfio_ep_sock.sock[i];
+
+		if (!strcmp(s->sock.sock_addr, sock_addr))
+			return s;
+	}
+
+	return NULL;
+}
+
+static struct vfio_user_server_socket *
+vfio_user_create_sock(const char *sock_addr)
+{
+	struct vfio_user_server_socket *sk;
+	struct vfio_user_socket *sock;
+	int fd;
+	struct sockaddr_un *un;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+	if (vfio_ep_sock.sock_num == VFIO_USER_MAX_FD) {
+		VFIO_USER_LOG(ERR, "Failed to create socket:"
+			" socket num reaches max\n");
+		goto err;
+	}
+
+	sk = vfio_user_find_socket(sock_addr);
+	if (sk) {
+		VFIO_USER_LOG(ERR, "Failed to create socket:"
+			" socket addr exists\n");
+		goto err;
+	}
+
+	sk = malloc(sizeof(*sk));
+	if (!sk) {
+		VFIO_USER_LOG(ERR, "Failed to alloc server socket\n");
+		goto err;
+	}
+
+	sock = &sk->sock;
+	sock->sock_addr = strdup(sock_addr);
+	if (!sock->sock_addr) {
+		VFIO_USER_LOG(ERR, "Failed to copy sock_addr\n");
+		goto err_dup;
+	}
+
+	fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	if (fd < 0) {
+		VFIO_USER_LOG(ERR, "Failed to create socket\n");
+		goto err_sock;
+	}
+
+	if (fcntl(fd, F_SETFL, O_NONBLOCK)) {
+		VFIO_USER_LOG(ERR, "can't set nonblocking mode for socket, "
+			"fd: %d (%s)\n", fd, strerror(errno));
+		goto err_fcntl;
+	}
+
+	un = &sk->un;
+	memset(un, 0, sizeof(*un));
+	un->sun_family = AF_UNIX;
+	strncpy(un->sun_path, sock->sock_addr, sizeof(un->sun_path));
+	un->sun_path[sizeof(un->sun_path) - 1] = '\0';
+	sock->sock_fd = fd;
+	sk->conn_fd = -1;
+
+	vfio_ep_sock.sock[vfio_ep_sock.sock_num++] = sk;
+
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+
+	return sk;
+
+err_fcntl:
+	close(fd);
+err_sock:
+	free(sock->sock_addr);
+err_dup:
+	free(sk);
+err:
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+	return NULL;
+}
+
+static void
+vfio_user_delete_sock(struct vfio_user_server_socket *sk)
+{
+	uint32_t i, end;
+	struct vfio_user_socket *sock;
+
+	if (!sk)
+		return;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+
+	for (i = 0; i < vfio_ep_sock.sock_num; i++) {
+		if (vfio_ep_sock.sock[i] == sk)
+			break;
+	}
+
+	sock = &sk->sock;
+	end = --vfio_ep_sock.sock_num;
+	vfio_ep_sock.sock[i] = vfio_ep_sock.sock[end];
+	vfio_ep_sock.sock[end] = NULL;
+
+	unlink(sock->sock_addr);
+	close(sock->sock_fd);
+	if (sk->conn_fd != -1)
+		close(sk->conn_fd);
+	free(sock->sock_addr);
+	free(sk);
+
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+}
+
+static inline int
+vfio_user_init_epoll(struct vfio_user_epoll *ep)
+{
+	int epfd = epoll_create(1);
+	if (epfd < 0) {
+		VFIO_USER_LOG(ERR, "Failed to create epoll fd\n");
+		return -1;
+	}
+
+	ep->epfd = epfd;
+	return 0;
+}
+
+static inline void
+vfio_user_destroy_epoll(struct vfio_user_epoll *ep)
+{
+	close(ep->epfd);
+	ep->epfd = -1;
+}
+
+static int
+vfio_user_add_listen_fd(struct vfio_user_epoll *ep,
+	int sock_fd, event_handler evh, void *data)
+{
+	struct epoll_event evt;
+	int ret = 0;
+	uint32_t event = EPOLLIN | EPOLLPRI;
+
+	pthread_mutex_lock(&ep->fd_mutex);
+
+	evt.events = event;
+	evt.data.ptr = &ep->fdinfo[ep->fd_num];
+
+	if (ep->fd_num >= VFIO_USER_MAX_FD) {
+		VFIO_USER_LOG(ERR, "Error add listen fd, "
+			"exceed max num\n");
+		ret = -1;
+		goto err;
+	}
+
+	ep->fdinfo[ep->fd_num].fd = sock_fd;
+	ep->fdinfo[ep->fd_num].event = event;
+	ep->fdinfo[ep->fd_num].ev_handle = evh;
+	ep->fdinfo[ep->fd_num].data = data;
+
+	if (epoll_ctl(ep->epfd, EPOLL_CTL_ADD, sock_fd, &evt) < 0) {
+		VFIO_USER_LOG(ERR, "Error add listen fd, "
+			"epoll_ctl failed\n");
+		ret = -1;
+		goto err;
+	}
+
+	ep->fd_num++;
+err:
+	pthread_mutex_unlock(&ep->fd_mutex);
+	return ret;
+}
+
+static int
+vfio_user_del_listen_fd(struct vfio_user_epoll *ep,
+	int sock_fd)
+{
+	struct epoll_event evt;
+	uint32_t event = EPOLLIN | EPOLLPRI;
+	uint32_t i;
+	int ret = 0;
+
+	pthread_mutex_lock(&ep->fd_mutex);
+
+	for (i = 0; i < ep->fd_num; i++) {
+		if (ep->fdinfo[i].fd == sock_fd) {
+			ep->fdinfo[i].fd = -1;
+			break;
+		}
+	}
+
+	evt.events = event;
+	evt.data.ptr = &ep->fdinfo[i];
+
+	if (epoll_ctl(ep->epfd, EPOLL_CTL_DEL, sock_fd, &evt) < 0) {
+		VFIO_USER_LOG(ERR, "Error del listen fd, "
+			"epoll_ctl failed\n");
+		ret = -1;
+	}
+
+	pthread_mutex_unlock(&ep->fd_mutex);
+	return ret;
+}
+
+static inline int
+next_mv_src_idx(FD_INFO *info, int end)
+{
+	int i;
+
+	for (i = end; i >= 0 && info[i].fd == -1; i--)
+		;
+
+	return i;
+}
+
+static void
+vfio_user_fd_cleanup(struct vfio_user_epoll *ep)
+{
+	int mv_src_idx, mv_dst_idx;
+	if (ep->fd_num != 0) {
+		pthread_mutex_lock(&ep->fd_mutex);
+
+		mv_src_idx = next_mv_src_idx(ep->fdinfo, ep->fd_num - 1);
+		for (mv_dst_idx = 0; mv_dst_idx < mv_src_idx; mv_dst_idx++) {
+			if (ep->fdinfo[mv_dst_idx].fd != -1)
+				continue;
+			ep->fdinfo[mv_dst_idx] = ep->fdinfo[mv_src_idx];
+			mv_src_idx = next_mv_src_idx(ep->fdinfo,
+				mv_src_idx - 1);
+		}
+		ep->fd_num = mv_src_idx + 1;
+
+		pthread_mutex_unlock(&ep->fd_mutex);
+	}
+}
+
+static void *
+vfio_user_fd_event_handler(void *arg)
+{
+	struct vfio_user_epoll *ep = arg;
+	struct epoll_event *events;
+	int num_fd, i, ret, cleanup;
+	event_handler evh;
+	FD_INFO *info;
+
+	while (1) {
+		events = ep->events;
+		num_fd = epoll_wait(ep->epfd, events,
+			VFIO_USER_MAX_FD, 1000);
+		if (num_fd <= 0)
+			continue;
+		cleanup = 0;
+
+		for (i = 0; i < num_fd; i++) {
+			info = (FD_INFO *)events[i].data.ptr;
+			evh = info->ev_handle;
+
+			if (evh) {
+				ret = evh(info->fd, info->data);
+				if (ret < 0) {
+					info->fd = -1;
+					cleanup = 1;
+				}
+			}
+		}
+
+		if (cleanup)
+			vfio_user_fd_cleanup(ep);
+	}
+	return NULL;
+}
+
+static inline int
+vfio_user_add_device(void)
+{
+	struct vfio_user_server *dev;
+	int i;
+
+	pthread_mutex_lock(&vfio_dev_mutex);
+	for (i = 0; i < MAX_VFIO_USER_DEVICE; i++) {
+		if (vfio_user_devices[i] == NULL)
+			break;
+	}
+
+	if (i == MAX_VFIO_USER_DEVICE) {
+		VFIO_USER_LOG(ERR, "vfio user device num reaches max!\n");
+		i = -1;
+		goto exit;
+	}
+
+	dev = malloc(sizeof(struct vfio_user_server));
+	if (dev == NULL) {
+		VFIO_USER_LOG(ERR, "Failed to alloc new vfio-user dev.\n");
+		i = -1;
+		goto exit;
+	}
+
+	memset(dev, 0, sizeof(struct vfio_user_server));
+	vfio_user_devices[i] = dev;
+	dev->dev_id = i;
+	dev->conn_fd = -1;
+
+exit:
+	pthread_mutex_unlock(&vfio_dev_mutex);
+	return i;
+}
+
+static inline void
+vfio_user_del_device(struct vfio_user_server *dev)
+{
+	if (dev == NULL)
+		return;
+
+	pthread_mutex_lock(&vfio_dev_mutex);
+	vfio_user_devices[dev->dev_id] = NULL;
+	free(dev);
+	pthread_mutex_unlock(&vfio_dev_mutex);
+}
+
+static inline struct vfio_user_server *
+vfio_user_get_device(int dev_id)
+{
+	struct vfio_user_server *dev;
+
+	pthread_mutex_lock(&vfio_dev_mutex);
+	dev = vfio_user_devices[dev_id];
+	if (!dev)
+		VFIO_USER_LOG(ERR, "Device %d not found.\n", dev_id);
+	pthread_mutex_unlock(&vfio_dev_mutex);
+
+	return dev;
+}
+
+static int
+vfio_user_message_handler(int dev_id, int fd)
+{
+	struct vfio_user_server *dev;
+	struct vfio_user_msg msg;
+	uint32_t cmd;
+	int ret = 0;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev)
+		return -1;
+
+	ret = vfio_user_recv_msg(fd, &msg);
+	if (ret <= 0) {
+		if (ret < 0)
+			VFIO_USER_LOG(ERR, "Read message failed\n");
+		else
+			VFIO_USER_LOG(ERR, "Peer closed\n");
+		return -1;
+	}
+
+	if (msg.msg_id != dev->msg_id)
+		return -1;
+	ret = 0;
+	cmd = msg.cmd;
+	dev->msg_id++;
+	if (cmd > VFIO_USER_NONE && cmd < VFIO_USER_MAX &&
+			vfio_user_msg_str[cmd]) {
+		VFIO_USER_LOG(INFO, "Read message %s\n",
+			vfio_user_msg_str[cmd]);
+	} else {
+		VFIO_USER_LOG(ERR, "Read unknown message\n");
+		return -1;
+	}
+
+	if (vfio_user_msg_handlers[cmd])
+		ret = vfio_user_msg_handlers[cmd](dev, &msg);
+	else {
+		VFIO_USER_LOG(ERR, "Handler not defined for %s\n",
+			vfio_user_msg_str[cmd]);
+		ret = -1;
+		goto handle_end;
+	}
+
+	if (!(msg.flags & VFIO_USER_NEED_NO_RP)) {
+		if (ret < 0) {
+			msg.flags |= VFIO_USER_ERROR;
+			msg.err = -ret;
+			/* If an error occurs, the reply message must
+			 * only include the reply header.
+			 */
+			msg.size = VFIO_USER_MSG_HDR_SIZE;
+			VFIO_USER_LOG(ERR, "Handle status error(%d) for %s\n",
+				ret, vfio_user_msg_str[cmd]);
+		}
+
+		ret = vfio_user_reply_msg(fd, &msg);
+		if (ret < 0) {
+			VFIO_USER_LOG(ERR, "Reply error for %s\n",
+				vfio_user_msg_str[cmd]);
+		} else {
+			VFIO_USER_LOG(INFO, "Reply %s succeeds\n",
+				vfio_user_msg_str[cmd]);
+			ret = 0;
+		}
+	}
+
+handle_end:
+	return ret;
+}
+
+static int
+vfio_user_sock_read(int fd, void *data)
+{
+	struct vfio_user_server_socket *sk = data;
+	int ret, dev_id = sk->sock.dev_id;
+
+	ret = vfio_user_message_handler(dev_id, fd);
+	if (ret < 0) {
+		struct vfio_user_server *dev;
+
+		vfio_user_del_listen_fd(&vfio_ep_sock.ep, sk->conn_fd);
+		close(fd);
+		sk->conn_fd = -1;
+		dev = vfio_user_get_device(dev_id);
+		if (dev)
+			dev->msg_id = 0;
+	}
+
+	return ret;
+}
+
+static void
+vfio_user_set_ifname(int dev_id, const char *sock_addr, unsigned int size)
+{
+	struct vfio_user_server *dev;
+	unsigned int len;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev)
+		return;
+
+	len = size >= sizeof(dev->sock_addr) ?
+		sizeof(dev->sock_addr) - 1 : size;
+	strncpy(dev->sock_addr, sock_addr, len);
+	dev->sock_addr[len] = '\0';
+}
+
+static int
+vfio_user_add_new_connection(int fd, void *data)
+{
+	struct vfio_user_server *dev;
+	int dev_id;
+	size_t size;
+	struct vfio_user_server_socket *sk = data;
+	struct vfio_user_socket *sock = &sk->sock;
+	int conn_fd;
+	int ret;
+
+	if (sk->conn_fd != -1)
+		return 0;
+
+	conn_fd = accept(fd, NULL, NULL);
+	if (conn_fd < 0)
+		return -1;
+
+	VFIO_USER_LOG(INFO, "New vfio-user client(%s) connected\n",
+		sock->sock_addr);
+
+	if (sock == NULL)
+		return -1;
+
+	dev_id = sock->dev_id;
+	sk->conn_fd = conn_fd;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev)
+		return -1;
+
+	dev->conn_fd = conn_fd;
+
+	size = strnlen(sock->sock_addr, PATH_MAX);
+	vfio_user_set_ifname(dev_id, sock->sock_addr, size);
+
+	ret = vfio_user_add_listen_fd(&vfio_ep_sock.ep,
+		conn_fd, vfio_user_sock_read, sk);
+	if (ret < 0) {
+		VFIO_USER_LOG(ERR, "Failed to add fd %d into vfio server fdset\n",
+			conn_fd);
+		goto err_cleanup;
+	}
+
+	return 0;
+
+err_cleanup:
+	close(fd);
+	return -1;
+}
+
+static int
+vfio_user_start_server(struct vfio_user_server_socket *sk)
+{
+	struct vfio_user_server *dev;
+	int ret;
+	struct vfio_user_socket *sock = &sk->sock;
+	int fd = sock->sock_fd;
+	const char *path = sock->sock_addr;
+
+	dev = vfio_user_get_device(sock->dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to start, "
+			"device not found\n");
+		return -1;
+	}
+
+	if (dev->started) {
+		VFIO_USER_LOG(INFO, "device already started\n");
+		return 0;
+	}
+
+	unlink(path);
+	ret = bind(fd, (struct sockaddr *)&sk->un, sizeof(sk->un));
+	if (ret < 0) {
+		VFIO_USER_LOG(ERR, "failed to bind to %s: %s;"
+			" remove it and try again\n",
+			path, strerror(errno));
+		goto err;
+	}
+
+	ret = listen(fd, 128);
+	if (ret < 0)
+		goto err;
+
+	ret = vfio_user_add_listen_fd(&vfio_ep_sock.ep,
+		fd, vfio_user_add_new_connection, (void *)sk);
+	if (ret < 0) {
+		VFIO_USER_LOG(ERR, "failed to add listen fd %d to "
+			"vfio-user server fdset\n", fd);
+		goto err;
+	}
+
+	dev->started = 1;
+
+	return 0;
+
+err:
+	close(fd);
+	return -1;
+}
+
+int
+rte_vfio_user_register(const char *sock_addr)
+{
+	struct vfio_user_server_socket *sk;
+	struct vfio_user_server *dev;
+	int dev_id;
+
+	if (!sock_addr)
+		return -1;
+
+	sk = vfio_user_create_sock(sock_addr);
+	if (!sk) {
+		VFIO_USER_LOG(ERR, "Create socket failed\n");
+		goto exit;
+	}
+
+	dev_id = vfio_user_add_device();
+	if (dev_id == -1) {
+		VFIO_USER_LOG(ERR, "Failed to add new vfio device\n");
+		goto err_add_dev;
+	}
+	sk->sock.dev_id = dev_id;
+
+	dev = vfio_user_get_device(dev_id);
+
+	dev->ver.major = VFIO_USER_VERSION_MAJOR;
+	dev->ver.minor = VFIO_USER_VERSION_MINOR;
+
+	return 0;
+
+err_add_dev:
+	vfio_user_delete_sock(sk);
+exit:
+	return -1;
+}
+
+int
+rte_vfio_user_unregister(const char *sock_addr)
+{
+	struct vfio_user_server_socket *sk;
+	struct vfio_user_server *dev;
+	int dev_id;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+	sk = vfio_user_find_socket(sock_addr);
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+
+	if (!sk) {
+		VFIO_USER_LOG(ERR, "Failed to unregister:"
+			" socket addr not registered.\n");
+		return -1;
+	}
+
+	dev_id = sk->sock.dev_id;
+	/* Client may already disconnect before unregistration */
+	if (sk->conn_fd != -1)
+		vfio_user_del_listen_fd(&vfio_ep_sock.ep, sk->conn_fd);
+	vfio_user_del_listen_fd(&vfio_ep_sock.ep, sk->sock.sock_fd);
+	vfio_user_fd_cleanup(&vfio_ep_sock.ep);
+	vfio_user_delete_sock(sk);
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to unregister:"
+			" device not found.\n");
+		return -1;
+	}
+
+	vfio_user_del_device(dev);
+
+	return 0;
+}
+
+int
+rte_vfio_user_start(const char *sock_addr)
+{
+	static pthread_t pid;
+	struct vfio_user_server_socket *sock;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+
+	sock = vfio_user_find_socket(sock_addr);
+	if (!sock) {
+		VFIO_USER_LOG(ERR, "sock_addr not registered to vfio_user "
+			"before start\n");
+		goto exit;
+	}
+
+	if (pid == 0) {
+		struct vfio_user_epoll *ep = &vfio_ep_sock.ep;
+
+		if (vfio_user_init_epoll(ep)) {
+			VFIO_USER_LOG(ERR, "Init vfio-user epoll failed\n");
+			goto exit;
+		}
+
+		if (pthread_create(&pid, NULL,
+			vfio_user_fd_event_handler, ep)) {
+			vfio_user_destroy_epoll(ep);
+			VFIO_USER_LOG(ERR, "Event handler thread create failed\n");
+			goto exit;
+		}
+	}
+
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+
+	return vfio_user_start_server(sock);
+
+exit:
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+	return -1;
+}
diff --git a/lib/librte_vfio_user/vfio_user_server.h b/lib/librte_vfio_user/vfio_user_server.h
new file mode 100644
index 0000000000..0a5b17584a
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_server.h
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _VFIO_USER_SERVER_H
+#define _VFIO_USER_SERVER_H
+
+#include <sys/epoll.h>
+
+#include "vfio_user_base.h"
+
+struct vfio_user_server {
+	int dev_id;
+	int started;
+	int conn_fd;
+	uint32_t msg_id;
+	char sock_addr[PATH_MAX];
+	struct vfio_user_version ver;
+};
+
+typedef int (*event_handler)(int fd, void *data);
+
+typedef struct listen_fd_info {
+	int fd;
+	uint32_t event;
+	event_handler ev_handle;
+	void *data;
+} FD_INFO;
+
+struct vfio_user_epoll {
+	int epfd;
+	FD_INFO fdinfo[VFIO_USER_MAX_FD];
+	uint32_t fd_num;	/* Current num of listen_fd */
+	struct epoll_event events[VFIO_USER_MAX_FD];
+	pthread_mutex_t fd_mutex;
+};
+
+struct vfio_user_server_socket {
+	struct vfio_user_socket sock;
+	struct sockaddr_un un;
+	/* For vfio-user protocol v0.1, a server only supports one client */
+	int conn_fd;
+};
+
+struct vfio_user_ep_sock {
+	struct vfio_user_epoll ep;
+	struct vfio_user_server_socket *sock[VFIO_USER_MAX_FD];
+	uint32_t sock_num;
+	pthread_mutex_t mutex;
+};
+
+typedef int (*vfio_user_msg_handler_t)(struct vfio_user_server *dev,
+					struct vfio_user_msg *msg);
+
+#endif
-- 
2.17.1



* [dpdk-dev] [PATCH v2 3/9] vfio_user: implement device and region related APIs
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 1/9] lib: introduce " Chenbo Xia
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 2/9] vfio_user: implement lifecycle related APIs Chenbo Xia
@ 2021-01-14  6:14   ` Chenbo Xia
  2021-01-14 18:48     ` David Christensen
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 4/9] vfio_user: implement DMA table and socket address API Chenbo Xia
                     ` (6 subsequent siblings)
  9 siblings, 1 reply; 43+ messages in thread
From: Chenbo Xia @ 2021-01-14  6:14 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu, beilei.xing

This patch introduces the device- and region-related APIs
rte_vfio_user_set_dev_info() and rte_vfio_user_set_reg_info().
The corresponding vfio-user command handling is also added, along
with definitions of all vfio-user command identifiers.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
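Note (not part of the commit message): struct rte_vfio_user_regions ends
in a flexible array member, so a caller has to size the allocation by
the number of regions. The sketch below shows one way to do that. It is
an illustration under assumptions: struct reg_info and struct regions
are trimmed local stand-ins for the structs in rte_vfio_user.h, and
regions_alloc() is a hypothetical helper, not a library function.

```c
#include <stdint.h>
#include <stdlib.h>

/* Trimmed stand-in for struct rte_vfio_user_reg_info. */
struct reg_info {
	void *base;
	int fd;
	void *priv;
};

/* Stand-in for struct rte_vfio_user_regions (flexible array member). */
struct regions {
	uint32_t reg_num;
	struct reg_info reg_info[];
};

/* Allocate a zeroed region table with room for n entries. */
static struct regions *
regions_alloc(uint32_t n)
{
	struct regions *r;

	/* Header plus n trailing entries in one allocation. */
	r = calloc(1, sizeof(*r) + (size_t)n * sizeof(r->reg_info[0]));
	if (r == NULL)
		return NULL;

	r->reg_num = n;
	return r;
}
```

A table built this way would then be filled in per region and passed to
rte_vfio_user_set_reg_info() before rte_vfio_user_start(). Whether the
library copies the table or keeps the caller's pointer is not stated in
the header shown here, so the caller should not free it on assumption.
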
 lib/librte_vfio_user/rte_vfio_user.h    |  62 +++++++
 lib/librte_vfio_user/version.map        |   2 +
 lib/librte_vfio_user/vfio_user_base.c   |  12 ++
 lib/librte_vfio_user/vfio_user_base.h   |  32 +++-
 lib/librte_vfio_user/vfio_user_server.c | 235 ++++++++++++++++++++++++
 lib/librte_vfio_user/vfio_user_server.h |   2 +
 6 files changed, 344 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
index 705a2f6632..117e994cc6 100644
--- a/lib/librte_vfio_user/rte_vfio_user.h
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -5,13 +5,35 @@
 #ifndef _RTE_VFIO_USER_H
 #define _RTE_VFIO_USER_H
 
+#include <linux/vfio.h>
+
 #include <rte_compat.h>
 
+struct rte_vfio_user_reg_info;
+
+typedef ssize_t (*rte_vfio_user_reg_acc_t)(struct rte_vfio_user_reg_info *reg,
+		char *buf, size_t count, loff_t pos, bool iswrite);
+
+struct rte_vfio_user_reg_info {
+	rte_vfio_user_reg_acc_t rw;
+	void *base;
+	int fd;
+	struct vfio_region_info *info;
+	void *priv;
+};
+
+struct rte_vfio_user_regions {
+	uint32_t reg_num;
+	struct rte_vfio_user_reg_info reg_info[];
+};
+
 /**
  *  Below APIs are for vfio-user server (device provider) to use:
  *	*rte_vfio_user_register
  *	*rte_vfio_user_unregister
  *	*rte_vfio_user_start
+ *	*rte_vfio_user_set_dev_info
+ *	*rte_vfio_user_set_reg_info
  */
 
 /**
@@ -51,4 +73,44 @@ __rte_experimental
 int
 rte_vfio_user_start(const char *sock_addr);
 
+/**
+ * Set the device information for a vfio-user device.
+ *
+ * This information must be set before calling rte_vfio_user_start, and should
+ * not be updated after start. Update after start can be done by unregistration
+ * and re-registration, and then the device-level change can be detected by
+ * vfio-user client.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @param dev_info
+ *   Device information for the vfio-user device
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int
+rte_vfio_user_set_dev_info(const char *sock_addr,
+	struct vfio_device_info *dev_info);
+
+/**
+ * Set the region information for a vfio-user device.
+ *
+ * This information must be set before calling rte_vfio_user_start, and should
+ * not be updated after start. Update after start can be done by unregistration
+ * and re-registration, and then the device-level change can be detected by
+ * vfio-user client.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @param reg
+ *   Region information for the vfio-user device
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int
+rte_vfio_user_set_reg_info(const char *sock_addr,
+	struct rte_vfio_user_regions *reg);
+
 #endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index e53095eda8..0f4f5acba5 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -4,6 +4,8 @@ EXPERIMENTAL {
 	rte_vfio_user_register;
 	rte_vfio_user_unregister;
 	rte_vfio_user_start;
+	rte_vfio_user_set_dev_info;
+	rte_vfio_user_set_reg_info;
 
 	local: *;
 };
diff --git a/lib/librte_vfio_user/vfio_user_base.c b/lib/librte_vfio_user/vfio_user_base.c
index b9fdff5b02..7badca23b7 100644
--- a/lib/librte_vfio_user/vfio_user_base.c
+++ b/lib/librte_vfio_user/vfio_user_base.c
@@ -13,6 +13,18 @@ int vfio_user_log_level;
 const char *vfio_user_msg_str[VFIO_USER_MAX] = {
 	[VFIO_USER_NONE] = "VFIO_USER_NONE",
 	[VFIO_USER_VERSION] = "VFIO_USER_VERSION",
+	[VFIO_USER_DMA_MAP] = "VFIO_USER_DMA_MAP",
+	[VFIO_USER_DMA_UNMAP] = "VFIO_USER_DMA_UNMAP",
+	[VFIO_USER_DEVICE_GET_INFO] = "VFIO_USER_DEVICE_GET_INFO",
+	[VFIO_USER_DEVICE_GET_REGION_INFO] = "VFIO_USER_DEVICE_GET_REGION_INFO",
+	[VFIO_USER_DEVICE_GET_IRQ_INFO] = "VFIO_USER_DEVICE_GET_IRQ_INFO",
+	[VFIO_USER_DEVICE_SET_IRQS] = "VFIO_USER_DEVICE_SET_IRQS",
+	[VFIO_USER_REGION_READ] = "VFIO_USER_REGION_READ",
+	[VFIO_USER_REGION_WRITE] = "VFIO_USER_REGION_WRITE",
+	[VFIO_USER_DMA_READ] = "VFIO_USER_DMA_READ",
+	[VFIO_USER_DMA_WRITE] = "VFIO_USER_DMA_WRITE",
+	[VFIO_USER_VM_INTERRUPT] = "VFIO_USER_VM_INTERRUPT",
+	[VFIO_USER_DEVICE_RESET] = "VFIO_USER_DEVICE_RESET",
 };
 
 void
diff --git a/lib/librte_vfio_user/vfio_user_base.h b/lib/librte_vfio_user/vfio_user_base.h
index f9b0b94665..f92886b56a 100644
--- a/lib/librte_vfio_user/vfio_user_base.h
+++ b/lib/librte_vfio_user/vfio_user_base.h
@@ -11,6 +11,8 @@
 
 #define VFIO_USER_VERSION_MAJOR 1
 #define VFIO_USER_VERSION_MINOR 0
+#define VFIO_USER_MAX_RSVD 512
+#define VFIO_USER_MAX_RW_DATA 512
 #define VFIO_USER_MAX_FD 1024
 #define VFIO_USER_MAX_VERSION_DATA 512
 
@@ -30,7 +32,19 @@ struct vfio_user_socket {
 typedef enum VFIO_USER_CMD_TYPE {
 	VFIO_USER_NONE = 0,
 	VFIO_USER_VERSION = 1,
-	VFIO_USER_MAX = 2,
+	VFIO_USER_DMA_MAP = 2,
+	VFIO_USER_DMA_UNMAP = 3,
+	VFIO_USER_DEVICE_GET_INFO = 4,
+	VFIO_USER_DEVICE_GET_REGION_INFO = 5,
+	VFIO_USER_DEVICE_GET_IRQ_INFO = 6,
+	VFIO_USER_DEVICE_SET_IRQS = 7,
+	VFIO_USER_REGION_READ = 8,
+	VFIO_USER_REGION_WRITE = 9,
+	VFIO_USER_DMA_READ = 10,
+	VFIO_USER_DMA_WRITE = 11,
+	VFIO_USER_VM_INTERRUPT = 12,
+	VFIO_USER_DEVICE_RESET = 13,
+	VFIO_USER_MAX = 14,
 } VFIO_USER_CMD_TYPE;
 
 struct vfio_user_version {
@@ -40,6 +54,19 @@ struct vfio_user_version {
 	uint8_t ver_data[VFIO_USER_MAX_VERSION_DATA];
 };
 
+struct vfio_user_reg {
+	struct vfio_region_info reg_info;
+	/* Reserved for region capability list */
+	uint8_t rsvd[VFIO_USER_MAX_RSVD];
+};
+
+struct vfio_user_reg_rw {
+	uint64_t reg_offset;
+	uint32_t reg_idx;
+	uint32_t size;
+	char data[VFIO_USER_MAX_RW_DATA];
+};
+
 struct vfio_user_msg {
 	uint16_t msg_id;
 	uint16_t cmd;
@@ -52,6 +79,9 @@ struct vfio_user_msg {
 	uint32_t err;				/* Valid in reply, optional */
 	union {
 		struct vfio_user_version ver;
+		struct vfio_device_info dev_info;
+		struct vfio_user_reg reg_info;
+		struct vfio_user_reg_rw reg_rw;
 	} payload;
 	int fds[VFIO_USER_MAX_FD];
 	int fd_num;
diff --git a/lib/librte_vfio_user/vfio_user_server.c b/lib/librte_vfio_user/vfio_user_server.c
index 35544c819a..aab923e727 100644
--- a/lib/librte_vfio_user/vfio_user_server.c
+++ b/lib/librte_vfio_user/vfio_user_server.c
@@ -5,6 +5,7 @@
 #include <unistd.h>
 #include <fcntl.h>
 #include <pthread.h>
+#include <inttypes.h>
 #include <sys/socket.h>
 #include <sys/un.h>
 
@@ -39,9 +40,159 @@ vfio_user_negotiate_version(struct vfio_user_server *dev,
 		return -ENOTSUP;
 }
 
+static int
+vfio_user_device_get_info(struct vfio_user_server *dev,
+	struct vfio_user_msg *msg)
+{
+	struct vfio_device_info *dev_info = &msg->payload.dev_info;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	if (msg->size != sizeof(*dev_info) + VFIO_USER_MSG_HDR_SIZE) {
+		VFIO_USER_LOG(ERR, "Invalid message for get dev info\n");
+		return -EINVAL;
+	}
+
+	memcpy(dev_info, dev->dev_info, sizeof(*dev_info));
+
+	VFIO_USER_LOG(DEBUG, "Device info: argsz(0x%x), flags(0x%x), "
+		"regions(%u), irqs(%u)\n", dev_info->argsz, dev_info->flags,
+		dev_info->num_regions, dev_info->num_irqs);
+
+	return 0;
+}
+
+static int
+vfio_user_device_get_reg_info(struct vfio_user_server *dev,
+	struct vfio_user_msg *msg)
+{
+	struct vfio_user_reg *reg = &msg->payload.reg_info;
+	struct rte_vfio_user_reg_info *reg_info;
+	struct vfio_region_info *vinfo;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	if (msg->size > sizeof(*reg) + VFIO_USER_MSG_HDR_SIZE ||
+		dev->reg->reg_num <= reg->reg_info.index) {
+		VFIO_USER_LOG(ERR, "Invalid message for get region info\n");
+		return -EINVAL;
+	}
+
+	reg_info = &dev->reg->reg_info[reg->reg_info.index];
+	vinfo = reg_info->info;
+	memcpy(reg, vinfo, vinfo->argsz);
+
+	if (reg_info->fd != -1) {
+		msg->fd_num = 1;
+		msg->fds[0] = reg_info->fd;
+	}
+
+	VFIO_USER_LOG(DEBUG, "Region(%u) info: addr(0x%" PRIx64 "), fd(%d), "
+		"sz(0x%llx), argsz(0x%x), c_off(0x%x), flags(0x%x) "
+		"off(0x%llx)\n", vinfo->index, (uint64_t)reg_info->base,
+		reg_info->fd, vinfo->size, vinfo->argsz, vinfo->cap_offset,
+		vinfo->flags, vinfo->offset);
+
+	return 0;
+}
+
+static int
+vfio_user_region_read(struct vfio_user_server *dev,
+	struct vfio_user_msg *msg)
+{
+	struct vfio_user_reg_rw *rw = &msg->payload.reg_rw;
+	struct rte_vfio_user_regions *reg = dev->reg;
+	struct rte_vfio_user_reg_info *reg_info;
+	size_t count;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	reg_info = &reg->reg_info[rw->reg_idx];
+
+	if (rw->reg_idx >= reg->reg_num ||
+		rw->size > VFIO_USER_MAX_RW_DATA ||
+		rw->reg_offset >= reg_info->info->size ||
+		rw->reg_offset + rw->size > reg_info->info->size) {
+		VFIO_USER_LOG(ERR, "Invalid read region request\n");
+		rw->size = 0;
+		return 0;
+	}
+
+	VFIO_USER_LOG(DEBUG, "Read Region(%u): offset(0x%" PRIx64 "),"
+		"size(0x%x)\n", rw->reg_idx, rw->reg_offset, rw->size);
+
+	if (reg_info->rw) {
+		count = reg_info->rw(reg_info, msg->payload.reg_rw.data,
+				rw->size, rw->reg_offset, 0);
+		rw->size = count;
+		msg->size += count;
+		return 0;
+	}
+
+	memcpy(&msg->payload.reg_rw.data,
+		(uint8_t *)reg_info->base + rw->reg_offset, rw->size);
+	msg->size += rw->size;
+	return 0;
+}
+
+static int
+vfio_user_region_write(struct vfio_user_server *dev,
+	struct vfio_user_msg *msg)
+{
+	struct vfio_user_reg_rw *rw = &msg->payload.reg_rw;
+	struct rte_vfio_user_regions *reg = dev->reg;
+	struct rte_vfio_user_reg_info *reg_info;
+	size_t count;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	if (rw->reg_idx >= reg->reg_num) {
+		VFIO_USER_LOG(ERR, "Write to a non-existent region\n");
+		return -EINVAL;
+	}
+
+	reg_info = &reg->reg_info[rw->reg_idx];
+
+	VFIO_USER_LOG(DEBUG, "Write Region(%u): offset(0x%" PRIx64 "),"
+		"size(0x%x)\n", rw->reg_idx, rw->reg_offset, rw->size);
+
+	if (reg_info->rw) {
+		count = reg_info->rw(reg_info, msg->payload.reg_rw.data,
+				rw->size, rw->reg_offset, 1);
+		if (count < rw->size) {
+			VFIO_USER_LOG(ERR, "Write region %d failed\n",
+				rw->reg_idx);
+			return -EIO;
+		}
+		rw->size = 0;
+		return 0;
+	}
+
+	memcpy((uint8_t *)reg_info->base + rw->reg_offset,
+		&msg->payload.reg_rw.data, rw->size);
+	rw->size = 0;
+	return 0;
+}
+
 static vfio_user_msg_handler_t vfio_user_msg_handlers[VFIO_USER_MAX] = {
 	[VFIO_USER_NONE] = NULL,
 	[VFIO_USER_VERSION] = vfio_user_negotiate_version,
+	[VFIO_USER_DMA_MAP] = NULL,
+	[VFIO_USER_DMA_UNMAP] = NULL,
+	[VFIO_USER_DEVICE_GET_INFO] = vfio_user_device_get_info,
+	[VFIO_USER_DEVICE_GET_REGION_INFO] = vfio_user_device_get_reg_info,
+	[VFIO_USER_DEVICE_GET_IRQ_INFO] = NULL,
+	[VFIO_USER_DEVICE_SET_IRQS] = NULL,
+	[VFIO_USER_REGION_READ] = vfio_user_region_read,
+	[VFIO_USER_REGION_WRITE] = vfio_user_region_write,
+	[VFIO_USER_DMA_READ] = NULL,
+	[VFIO_USER_DMA_WRITE] = NULL,
+	[VFIO_USER_VM_INTERRUPT] = NULL,
+	[VFIO_USER_DEVICE_RESET] = NULL,
 };
 
 static struct vfio_user_server_socket *
@@ -563,6 +714,13 @@ vfio_user_start_server(struct vfio_user_server_socket *sk)
 		return 0;
 	}
 
+	/* All the info must be set before start */
+	if (!dev->dev_info || !dev->reg) {
+		VFIO_USER_LOG(ERR, "Failed to start, "
+			"dev/reg info must be set before start\n");
+		return -1;
+	}
+
 	unlink(path);
 	ret = bind(fd, (struct sockaddr *)&sk->un, sizeof(sk->un));
 	if (ret < 0) {
@@ -705,3 +863,80 @@ rte_vfio_user_start(const char *sock_addr)
 	pthread_mutex_unlock(&vfio_ep_sock.mutex);
 	return -1;
 }
+
+static struct vfio_user_server *
+vfio_user_find_stopped_server(const char *sock_addr)
+{
+	struct vfio_user_server *dev;
+	struct vfio_user_server_socket *sk;
+	int dev_id;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+	sk = vfio_user_find_socket(sock_addr);
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+
+	if (!sk) {
+		VFIO_USER_LOG(ERR, "Failed to find server with sock_addr "
+			"%s: addr not registered.\n", sock_addr);
+		return NULL;
+	}
+
+	dev_id = sk->sock.dev_id;
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to find server: "
+			"device %d not found.\n", dev_id);
+		return NULL;
+	}
+
+	if (dev->started) {
+		VFIO_USER_LOG(ERR, "Failed to find stopped server: "
+			 "device %d already started\n", dev_id);
+		return NULL;
+	}
+
+	return dev;
+}
+
+int
+rte_vfio_user_set_dev_info(const char *sock_addr,
+	struct vfio_device_info *dev_info)
+{
+	struct vfio_user_server *dev;
+
+	if (!dev_info)
+		return -1;
+
+	dev = vfio_user_find_stopped_server(sock_addr);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to set device(%s) information: "
+			"cannot find stopped server\n", sock_addr);
+		return -1;
+	}
+
+	dev->dev_info = dev_info;
+
+	return 0;
+}
+
+int
+rte_vfio_user_set_reg_info(const char *sock_addr,
+	struct rte_vfio_user_regions *reg)
+{
+	struct vfio_user_server *dev;
+
+	if (!reg)
+		return -1;
+
+	dev = vfio_user_find_stopped_server(sock_addr);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to set region information for "
+			"device with sock(%s): cannot find stopped server\n",
+			sock_addr);
+		return -1;
+	}
+
+	dev->reg = reg;
+
+	return 0;
+}
diff --git a/lib/librte_vfio_user/vfio_user_server.h b/lib/librte_vfio_user/vfio_user_server.h
index 0a5b17584a..4e7337113c 100644
--- a/lib/librte_vfio_user/vfio_user_server.h
+++ b/lib/librte_vfio_user/vfio_user_server.h
@@ -16,6 +16,8 @@ struct vfio_user_server {
 	uint32_t msg_id;
 	char sock_addr[PATH_MAX];
 	struct vfio_user_version ver;
+	struct vfio_device_info *dev_info;
+	struct rte_vfio_user_regions *reg;
 };
 
 typedef int (*event_handler)(int fd, void *data);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [dpdk-dev] [PATCH v2 4/9] vfio_user: implement DMA table and socket address API
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
                     ` (2 preceding siblings ...)
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 3/9] vfio_user: implement device and region " Chenbo Xia
@ 2021-01-14  6:14   ` Chenbo Xia
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 5/9] vfio_user: implement interrupt related APIs Chenbo Xia
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: Chenbo Xia @ 2021-01-14  6:14 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu, beilei.xing

This patch introduces an API called rte_vfio_user_get_mem_table()
for emulated devices to acquire the DMA memory table from the
vfio-user library.

Notify operations are also introduced to notify the emulated
devices of several events. A socket address API,
rte_vfio_get_sock_addr(), is introduced to translate a device ID
into its socket address inside notify callbacks.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
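To show how an emulated device might consume the DMA memory table, here
is a minimal sketch of a guest-physical to host-virtual address lookup.
It is not part of the patch: the demo_* types are hypothetical copies of
the relevant fields of struct rte_vfio_user_mtb_entry and
struct rte_vfio_user_mem, and demo_gpa_to_hva() is an assumed helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_DMA 256

/* Hypothetical mirror of the fields used from rte_vfio_user_mtb_entry. */
struct demo_mtb_entry {
	uint64_t gpa;		/* guest physical address of the region */
	uint64_t size;		/* region size in bytes */
	uint64_t host_user_addr;	/* 0 when the region is not mapped */
};

/* Hypothetical mirror of struct rte_vfio_user_mem. */
struct demo_mem {
	uint32_t entry_num;
	struct demo_mtb_entry entry[DEMO_MAX_DMA];
};

/* Translate [gpa, gpa + len) to a host address.
 * Returns 0 when the range is not fully inside one mapped entry. */
static uint64_t
demo_gpa_to_hva(const struct demo_mem *mem, uint64_t gpa, uint64_t len)
{
	uint32_t i;

	assert(mem != NULL);

	for (i = 0; i < mem->entry_num; i++) {
		const struct demo_mtb_entry *e = &mem->entry[i];

		/* Skip entries that do not contain the start address. */
		if (gpa < e->gpa || gpa - e->gpa >= e->size)
			continue;
		/* The whole range must fit in this entry. */
		if (len > e->size - (gpa - e->gpa))
			return 0;
		/* Entries without an fd carry no host mapping. */
		if (e->host_user_addr == 0)
			return 0;
		return e->host_user_addr + (gpa - e->gpa);
	}

	return 0;
}
```

In a real device the table would come from rte_vfio_user_get_mem_table(),
most likely from within the new_device or update_status callbacks. A
linear scan is used here for clarity.
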
 lib/librte_vfio_user/rte_vfio_user.h    |  77 ++++-
 lib/librte_vfio_user/version.map        |   2 +
 lib/librte_vfio_user/vfio_user_base.h   |   2 +
 lib/librte_vfio_user/vfio_user_server.c | 375 +++++++++++++++++++++++-
 lib/librte_vfio_user/vfio_user_server.h |   3 +
 5 files changed, 451 insertions(+), 8 deletions(-)

diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
index 117e994cc6..f575017bdf 100644
--- a/lib/librte_vfio_user/rte_vfio_user.h
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -5,10 +5,52 @@
 #ifndef _RTE_VFIO_USER_H
 #define _RTE_VFIO_USER_H
 
+#include <stdint.h>
+#include <stddef.h>
+#include <stdbool.h>
 #include <linux/vfio.h>
+#include <sys/types.h>
 
 #include <rte_compat.h>
 
+#define RTE_VUSER_MAX_DMA 256
+
+struct rte_vfio_user_notify_ops {
+	/* Add device */
+	int (*new_device)(int dev_id);
+	/* Remove device */
+	void (*destroy_device)(int dev_id);
+	/* Update device status */
+	int (*update_status)(int dev_id);
+	/* Lock or unlock data path */
+	int (*lock_dp)(int dev_id, int lock);
+	/* Reset device */
+	int (*reset_device)(int dev_id);
+};
+
+struct rte_vfio_user_mem_reg {
+	uint64_t gpa;
+	uint64_t size;
+	uint64_t fd_offset;
+	uint32_t protection;	/* attributes in <sys/mman.h> */
+#define RTE_VUSER_MEM_MAPPABLE	(0x1 << 0)
+	uint32_t flags;
+};
+
+struct rte_vfio_user_mtb_entry {
+	uint64_t gpa;
+	uint64_t size;
+	uint64_t host_user_addr;
+	void	 *mmap_addr;
+	uint64_t mmap_size;
+	int fd;
+};
+
+struct rte_vfio_user_mem {
+	uint32_t entry_num;
+	struct rte_vfio_user_mtb_entry entry[RTE_VUSER_MAX_DMA];
+};
+
 struct rte_vfio_user_reg_info;
 
 typedef ssize_t (*rte_vfio_user_reg_acc_t)(struct rte_vfio_user_reg_info *reg,
@@ -32,6 +74,8 @@ struct rte_vfio_user_regions {
  *	*rte_vfio_user_register
  *	*rte_vfio_user_unregister
  *	*rte_vfio_user_start
+ *	*rte_vfio_get_sock_addr
+ *	*rte_vfio_user_get_mem_table
  *	*rte_vfio_user_set_dev_info
  *	*rte_vfio_user_set_reg_info
  */
@@ -41,12 +85,15 @@ struct rte_vfio_user_regions {
  *
  * @param sock_addr
  *   Unix domain socket address
+ * @param ops
+ *   Notify ops for the device
  * @return
  *   0 on success, -1 on failure
  */
 __rte_experimental
 int
-rte_vfio_user_register(const char *sock_addr);
+rte_vfio_user_register(const char *sock_addr,
+	const struct rte_vfio_user_notify_ops *ops);
 
 /**
  * Unregister a vfio-user device.
@@ -73,6 +120,18 @@ __rte_experimental
 int
 rte_vfio_user_start(const char *sock_addr);
 
+/**
+ * Get the memory table of a vfio-user device.
+ *
+ * @param dev_id
+ *   Vfio-user device ID
+ * @return
+ *   Pointer to memory table on success, NULL on failure
+ */
+__rte_experimental
+const struct rte_vfio_user_mem *
+rte_vfio_user_get_mem_table(int dev_id);
+
 /**
  * Set the device information for a vfio-user device.
  *
@@ -113,4 +172,20 @@ int
 rte_vfio_user_set_reg_info(const char *sock_addr,
 	struct rte_vfio_user_regions *reg);
 
+/**
+ * Get the socket address for a vfio-user device.
+ *
+ * @param dev_id
+ *   Vfio-user device ID
+ * @param[out] buf
+ *   Buffer to store socket address
+ * @param len
+ *   The length of the buffer
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int
+rte_vfio_get_sock_addr(int dev_id, char *buf, size_t len);
+
 #endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index 0f4f5acba5..3a50b5ef0e 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -4,6 +4,8 @@ EXPERIMENTAL {
 	rte_vfio_user_register;
 	rte_vfio_user_unregister;
 	rte_vfio_user_start;
+	rte_vfio_get_sock_addr;
+	rte_vfio_user_get_mem_table;
 	rte_vfio_user_set_dev_info;
 	rte_vfio_user_set_reg_info;
 
diff --git a/lib/librte_vfio_user/vfio_user_base.h b/lib/librte_vfio_user/vfio_user_base.h
index f92886b56a..dd13170298 100644
--- a/lib/librte_vfio_user/vfio_user_base.h
+++ b/lib/librte_vfio_user/vfio_user_base.h
@@ -9,6 +9,7 @@
 
 #include "rte_vfio_user.h"
 
+#define VFIO_USER_MSG_MAX_NREG 8
 #define VFIO_USER_VERSION_MAJOR 1
 #define VFIO_USER_VERSION_MINOR 0
 #define VFIO_USER_MAX_RSVD 512
@@ -79,6 +80,7 @@ struct vfio_user_msg {
 	uint32_t err;				/* Valid in reply, optional */
 	union {
 		struct vfio_user_version ver;
+		struct rte_vfio_user_mem_reg memory[VFIO_USER_MSG_MAX_NREG];
 		struct vfio_device_info dev_info;
 		struct vfio_user_reg reg_info;
 		struct vfio_user_reg_rw reg_rw;
diff --git a/lib/librte_vfio_user/vfio_user_server.c b/lib/librte_vfio_user/vfio_user_server.c
index aab923e727..9e98b4ec81 100644
--- a/lib/librte_vfio_user/vfio_user_server.c
+++ b/lib/librte_vfio_user/vfio_user_server.c
@@ -7,6 +7,7 @@
 #include <pthread.h>
 #include <inttypes.h>
 #include <sys/socket.h>
+#include <sys/mman.h>
 #include <sys/un.h>
 
 #include "vfio_user_server.h"
@@ -40,6 +41,217 @@ vfio_user_negotiate_version(struct vfio_user_server *dev,
 		return -ENOTSUP;
 }
 
+static int
+mmap_one_region(struct rte_vfio_user_mtb_entry *entry,
+	struct rte_vfio_user_mem_reg *memory, int fd)
+{
+	if (fd != -1) {
+		if (memory->fd_offset >= -memory->size) {
+			VFIO_USER_LOG(ERR, "memory fd_offset and size overflow\n");
+			return -EINVAL;
+		}
+		entry->mmap_size = memory->fd_offset + memory->size;
+		entry->mmap_addr = mmap(NULL,
+			entry->mmap_size,
+			memory->protection, MAP_SHARED,
+			fd, 0);
+		if (entry->mmap_addr == MAP_FAILED) {
+			VFIO_USER_LOG(ERR, "Failed to mmap dma region\n");
+			return -EINVAL;
+		}
+
+		entry->host_user_addr =
+			(uint64_t)entry->mmap_addr + memory->fd_offset;
+		entry->fd = fd;
+	} else {
+		entry->mmap_size = 0;
+		entry->mmap_addr = NULL;
+		entry->host_user_addr = 0;
+		entry->fd = -1;
+	}
+
+	entry->gpa = memory->gpa;
+	entry->size = memory->size;
+
+	return 0;
+}
+
+static uint32_t
+add_one_region(struct rte_vfio_user_mem *mem,
+	struct rte_vfio_user_mem_reg *memory, int fd)
+{
+	struct rte_vfio_user_mtb_entry *entry = &mem->entry[0];
+	uint32_t num = mem->entry_num, i, j;
+	uint32_t sz = sizeof(struct rte_vfio_user_mtb_entry);
+	struct rte_vfio_user_mtb_entry ent;
+	int err = 0;
+
+	if (mem->entry_num == RTE_VUSER_MAX_DMA) {
+		VFIO_USER_LOG(ERR, "Add mem region failed, reach max!\n");
+		return -EBUSY;
+	}
+
+	for (i = 0; i < num; i++) {
+		entry = &mem->entry[i];
+
+		if (memory->gpa == entry->gpa &&
+			memory->size == entry->size)
+			return -EEXIST;
+
+		if (memory->gpa > entry->gpa &&
+			memory->gpa >= entry->gpa + entry->size)
+			continue;
+
+		if (memory->gpa < entry->gpa &&
+			memory->gpa + memory->size <= entry->gpa)
+			break;
+
+		return -EINVAL;
+	}
+
+	err = mmap_one_region(&ent, memory, fd);
+	if (err)
+		return err;
+
+	for (j = num; j > i; j--)
+		memcpy(&mem->entry[j], &mem->entry[j - 1], sz);
+	memcpy(&mem->entry[i], &ent, sz);
+	mem->entry_num++;
+
+	VFIO_USER_LOG(DEBUG, "DMA MAP(gpa: 0x%" PRIx64 ", sz: 0x%" PRIx64
+			", hva: 0x%" PRIx64 ", ma: 0x%" PRIx64
+			", msz: 0x%" PRIx64 ", fd: %d)\n", ent.gpa,
+			ent.size, ent.host_user_addr, (uint64_t)ent.mmap_addr,
+			ent.mmap_size, ent.fd);
+	return 0;
+}
+
+static void
+del_one_region(struct rte_vfio_user_mem *mem,
+	struct rte_vfio_user_mem_reg *memory)
+{
+	struct rte_vfio_user_mtb_entry *entry;
+	uint32_t num = mem->entry_num, i, j;
+	uint32_t sz = sizeof(struct rte_vfio_user_mtb_entry);
+
+	if (mem->entry_num == 0) {
+		VFIO_USER_LOG(ERR, "Delete mem region failed (No region exists)!\n");
+		return;
+	}
+
+	for (i = 0; i < num; i++) {
+		entry = &mem->entry[i];
+
+		if (memory->gpa == entry->gpa &&
+			memory->size == entry->size) {
+			if (entry->mmap_addr != NULL) {
+				munmap(entry->mmap_addr, entry->mmap_size);
+				mem->entry[i].mmap_size = 0;
+				mem->entry[i].mmap_addr = NULL;
+				mem->entry[i].host_user_addr = 0;
+				mem->entry[i].fd = -1;
+			}
+
+			mem->entry[i].gpa = 0;
+			mem->entry[i].size = 0;
+
+			for (j = i; j < num - 1; j++) {
+				memcpy(&mem->entry[j], &mem->entry[j + 1],
+					sz);
+			}
+			mem->entry_num--;
+
+			VFIO_USER_LOG(DEBUG, "DMA UNMAP(gpa: 0x%" PRIx64
+				", sz: 0x%" PRIx64 ", hva: 0x%" PRIx64
+				", ma: 0x%" PRIx64", msz: 0x%" PRIx64
+				", fd: %d)\n", entry->gpa, entry->size,
+				entry->host_user_addr,
+				(uint64_t)entry->mmap_addr, entry->mmap_size,
+				entry->fd);
+
+			return;
+		}
+	}
+
+	VFIO_USER_LOG(ERR, "Failed to find the region for dma unmap!\n");
+}
+
+static int
+vfio_user_dma_map(struct vfio_user_server *dev, struct vfio_user_msg *msg)
+{
+	struct rte_vfio_user_mem_reg *memory = msg->payload.memory;
+	uint32_t region_num, expected_fd = 0;
+	uint32_t i, j, fd, fd_idx = 0;
+	int ret = 0;
+
+	if ((msg->size - VFIO_USER_MSG_HDR_SIZE) % sizeof(*memory) != 0) {
+		VFIO_USER_LOG(ERR, "Invalid msg size for dma map\n");
+		vfio_user_close_msg_fds(msg);
+		ret = -EINVAL;
+		goto err;
+	}
+
+	region_num = (msg->size - VFIO_USER_MSG_HDR_SIZE)
+		/ sizeof(struct rte_vfio_user_mem_reg);
+
+	for (i = 0; i < region_num; i++) {
+		if (memory[i].flags & RTE_VUSER_MEM_MAPPABLE)
+			expected_fd++;
+	}
+
+	if (vfio_user_check_msg_fdnum(msg, expected_fd) != 0) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	for (i = 0; i < region_num; i++) {
+		fd = (memory[i].flags & RTE_VUSER_MEM_MAPPABLE) ?
+			msg->fds[fd_idx++] : -1;
+
+		ret = add_one_region(dev->mem, memory + i, fd);
+		if (ret < 0) {
+			VFIO_USER_LOG(ERR, "Failed to add dma map\n");
+			break;
+		}
+	}
+
+	if (i != region_num) {
+		/* Clear all mmapped regions and fds */
+		for (j = 0; j < region_num; j++) {
+			if (j < i)
+				del_one_region(dev->mem, memory + j);
+			else
+				close(msg->fds[j]);
+		}
+	}
+err:
+	/* Do not reply fds back */
+	msg->fd_num = 0;
+	return ret;
+}
+
+static int
+vfio_user_dma_unmap(struct vfio_user_server *dev, struct vfio_user_msg *msg)
+{
+	struct rte_vfio_user_mem_reg *memory = msg->payload.memory;
+	uint32_t region_num = (msg->size - VFIO_USER_MSG_HDR_SIZE)
+		/ sizeof(struct rte_vfio_user_mem_reg);
+	uint32_t i;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	if ((msg->size - VFIO_USER_MSG_HDR_SIZE) % sizeof(*memory) != 0) {
+		VFIO_USER_LOG(ERR, "Invalid msg size for dma unmap\n");
+		return -EINVAL;
+	}
+
+	for (i = 0; i < region_num; i++)
+		del_one_region(dev->mem, memory + i);
+
+	return 0;
+}
+
 static int
 vfio_user_device_get_info(struct vfio_user_server *dev,
 	struct vfio_user_msg *msg)
@@ -178,11 +390,65 @@ vfio_user_region_write(struct vfio_user_server *dev,
 	return 0;
 }
 
+static inline void
+vfio_user_destroy_mem_entries(struct rte_vfio_user_mem *mem)
+{
+	struct rte_vfio_user_mtb_entry *ent;
+	uint32_t i;
+
+	for (i = 0; i < mem->entry_num; i++) {
+		ent = &mem->entry[i];
+		if (ent->host_user_addr) {
+			munmap(ent->mmap_addr, ent->mmap_size);
+			close(ent->fd);
+		}
+	}
+
+	memset(mem, 0, sizeof(*mem));
+}
+
+static inline void
+vfio_user_destroy_mem(struct vfio_user_server *dev)
+{
+	struct rte_vfio_user_mem *mem = dev->mem;
+
+	if (!mem)
+		return;
+
+	vfio_user_destroy_mem_entries(mem);
+
+	free(mem);
+	dev->mem = NULL;
+}
+
+static int
+vfio_user_device_reset(struct vfio_user_server *dev,
+	struct vfio_user_msg *msg)
+{
+	struct vfio_device_info *dev_info;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	dev_info = dev->dev_info;
+
+	if (!(dev_info->flags & VFIO_DEVICE_FLAGS_RESET))
+		return -ENOTSUP;
+
+	vfio_user_destroy_mem_entries(dev->mem);
+	dev->is_ready = 0;
+
+	if (dev->ops->reset_device)
+		dev->ops->reset_device(dev->dev_id);
+
+	return 0;
+}
+
 static vfio_user_msg_handler_t vfio_user_msg_handlers[VFIO_USER_MAX] = {
 	[VFIO_USER_NONE] = NULL,
 	[VFIO_USER_VERSION] = vfio_user_negotiate_version,
-	[VFIO_USER_DMA_MAP] = NULL,
-	[VFIO_USER_DMA_UNMAP] = NULL,
+	[VFIO_USER_DMA_MAP] = vfio_user_dma_map,
+	[VFIO_USER_DMA_UNMAP] = vfio_user_dma_unmap,
 	[VFIO_USER_DEVICE_GET_INFO] = vfio_user_device_get_info,
 	[VFIO_USER_DEVICE_GET_REGION_INFO] = vfio_user_device_get_reg_info,
 	[VFIO_USER_DEVICE_GET_IRQ_INFO] = NULL,
@@ -192,7 +458,7 @@ static vfio_user_msg_handler_t vfio_user_msg_handlers[VFIO_USER_MAX] = {
 	[VFIO_USER_DMA_READ] = NULL,
 	[VFIO_USER_DMA_WRITE] = NULL,
 	[VFIO_USER_VM_INTERRUPT] = NULL,
-	[VFIO_USER_DEVICE_RESET] = NULL,
+	[VFIO_USER_DEVICE_RESET] = vfio_user_device_reset,
 };
 
 static struct vfio_user_server_socket *
@@ -534,6 +800,21 @@ vfio_user_get_device(int dev_id)
 	return dev;
 }
 
+static inline int
+vfio_user_is_ready(struct vfio_user_server *dev)
+{
+	/* vfio-user currently has no definition of when the device is ready.
+	 * For now, we define it as when the device has at least one dma
+	 * memory table entry.
+	 */
+	if (dev->mem->entry_num > 0) {
+		dev->is_ready = 1;
+		return 1;
+	}
+
+	return 0;
+}
+
 static int
 vfio_user_message_handler(int dev_id, int fd)
 {
@@ -541,6 +822,7 @@ vfio_user_message_handler(int dev_id, int fd)
 	struct vfio_user_msg msg;
 	uint32_t cmd;
 	int ret = 0;
+	int dev_locked = 0;
 
 	dev = vfio_user_get_device(dev_id);
 	if (!dev)
@@ -569,6 +851,17 @@ vfio_user_message_handler(int dev_id, int fd)
 		return -1;
 	}
 
+	/*
+	 * Below messages should lock the data path upon receiving
+	 * to avoid errors in data path handling
+	 */
+	if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP ||
+		cmd == VFIO_USER_DEVICE_RESET)
+		&& dev->ops->lock_dp) {
+		dev->ops->lock_dp(dev_id, 1);
+		dev_locked = 1;
+	}
+
 	if (vfio_user_msg_handlers[cmd])
 		ret = vfio_user_msg_handlers[cmd](dev, &msg);
 	else {
@@ -601,7 +894,18 @@ vfio_user_message_handler(int dev_id, int fd)
 		}
 	}
 
+	if (!dev->is_ready) {
+		if (vfio_user_is_ready(dev) && dev->ops->new_device)
+			dev->ops->new_device(dev_id);
+	} else {
+		if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP)
+			&& dev->ops->update_status)
+			dev->ops->update_status(dev_id);
+	}
+
 handle_end:
+	if (dev_locked)
+		dev->ops->lock_dp(dev_id, 0);
 	return ret;
 }
 
@@ -619,8 +923,12 @@ vfio_user_sock_read(int fd, void *data)
 		close(fd);
 		sk->conn_fd = -1;
 		dev = vfio_user_get_device(dev_id);
-		if (dev)
+		if (dev) {
+			dev->ops->destroy_device(dev_id);
+			vfio_user_destroy_mem_entries(dev->mem);
+			dev->is_ready = 0;
 			dev->msg_id = 0;
+		}
 	}
 
 	return ret;
@@ -752,13 +1060,14 @@ vfio_user_start_server(struct vfio_user_server_socket *sk)
 }
 
 int
-rte_vfio_user_register(const char *sock_addr)
+rte_vfio_user_register(const char *sock_addr,
+	const struct rte_vfio_user_notify_ops *ops)
 {
 	struct vfio_user_server_socket *sk;
 	struct vfio_user_server *dev;
 	int dev_id;
 
-	if (!sock_addr)
+	if (!sock_addr || !ops)
 		return -1;
 
 	sk = vfio_user_create_sock(sock_addr);
@@ -776,11 +1085,22 @@ rte_vfio_user_register(const char *sock_addr)
 
 	dev = vfio_user_get_device(dev_id);
 
+	dev->mem = malloc(sizeof(struct rte_vfio_user_mem));
+	if (!dev->mem) {
+		VFIO_USER_LOG(ERR, "Failed to alloc vfio_user_mem\n");
+		goto err_mem;
+	}
+	memset(dev->mem, 0, sizeof(struct rte_vfio_user_mem));
+
 	dev->ver.major = VFIO_USER_VERSION_MAJOR;
 	dev->ver.minor = VFIO_USER_VERSION_MINOR;
+	dev->ops = ops;
+	dev->is_ready = 0;
 
 	return 0;
 
+err_mem:
+	vfio_user_del_device(dev);
 err_add_dev:
 	vfio_user_delete_sock(sk);
 exit:
@@ -818,7 +1138,7 @@ rte_vfio_user_unregister(const char *sock_addr)
 			"device not found.\n");
 		return -1;
 	}
-
+	vfio_user_destroy_mem(dev);
 	vfio_user_del_device(dev);
 
 	return 0;
@@ -940,3 +1260,44 @@ rte_vfio_user_set_reg_info(const char *sock_addr,
 
 	return 0;
 }
+
+int
+rte_vfio_get_sock_addr(int dev_id, char *buf, size_t len)
+{
+	struct vfio_user_server *dev;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get sock address: "
+			"device %d not found.\n", dev_id);
+		return -1;
+	}
+
+	len = len > sizeof(dev->sock_addr) ?
+		sizeof(dev->sock_addr) : len;
+	strncpy(buf, dev->sock_addr, len);
+	buf[len - 1] = '\0';
+
+	return 0;
+}
+
+const struct rte_vfio_user_mem *
+rte_vfio_user_get_mem_table(int dev_id)
+{
+	struct vfio_user_server *dev;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get memory table: "
+			"device %d not found.\n", dev_id);
+		return NULL;
+	}
+
+	if (!dev->mem) {
+		VFIO_USER_LOG(ERR, "Failed to get memory table for device %d: "
+			"memory table not allocated.\n", dev_id);
+		return NULL;
+	}
+
+	return dev->mem;
+}
diff --git a/lib/librte_vfio_user/vfio_user_server.h b/lib/librte_vfio_user/vfio_user_server.h
index 4e7337113c..0b20ab4e3a 100644
--- a/lib/librte_vfio_user/vfio_user_server.h
+++ b/lib/librte_vfio_user/vfio_user_server.h
@@ -11,11 +11,14 @@
 
 struct vfio_user_server {
 	int dev_id;
+	int is_ready;
 	int started;
 	int conn_fd;
 	uint32_t msg_id;
 	char sock_addr[PATH_MAX];
+	const struct rte_vfio_user_notify_ops *ops;
 	struct vfio_user_version ver;
+	struct rte_vfio_user_mem *mem;
 	struct vfio_device_info *dev_info;
 	struct rte_vfio_user_regions *reg;
 };
-- 
2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [dpdk-dev] [PATCH v2 5/9] vfio_user: implement interrupt related APIs
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
                     ` (3 preceding siblings ...)
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 4/9] vfio_user: implement DMA table and socket address API Chenbo Xia
@ 2021-01-14  6:14   ` Chenbo Xia
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 6/9] vfio_user: add client APIs of device attach/detach Chenbo Xia
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: Chenbo Xia @ 2021-01-14  6:14 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu, beilei.xing

This patch implements two interrupt-related APIs,
rte_vfio_user_get_irq() and rte_vfio_user_set_irq_info().
The former lets devices get their interrupt configuration
(e.g., irqfds). The latter sets interrupt information
before vfio-user starts.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
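Because the interrupt information is passed as a structure ending in a
flexible array member, the caller has to size the allocation itself. The
following is a minimal sketch of that step, not part of the patch: the
demo_* types are hypothetical stand-ins for struct vfio_irq_info and
struct rte_vfio_user_irq_info, and demo_irq_info_alloc() is an assumed
helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in with the same four fields as struct vfio_irq_info. */
struct demo_irq {
	uint32_t argsz;
	uint32_t flags;
	uint32_t index;
	uint32_t count;
};

static_assert(sizeof(struct demo_irq) == 4 * sizeof(uint32_t),
	"no padding expected in demo_irq");

/* Hypothetical stand-in for struct rte_vfio_user_irq_info. */
struct demo_irq_info {
	uint32_t irq_num;
	struct demo_irq irq_info[];	/* flexible array member */
};

/* Allocate a zeroed table with irq_num entries, each describing
 * vectors_per_irq vectors. Returns NULL on allocation failure; the
 * caller releases the table with free(). */
static struct demo_irq_info *
demo_irq_info_alloc(uint32_t irq_num, uint32_t vectors_per_irq)
{
	struct demo_irq_info *info;
	uint32_t i;

	info = calloc(1, sizeof(*info) +
		(size_t)irq_num * sizeof(info->irq_info[0]));
	if (info == NULL)
		return NULL;

	info->irq_num = irq_num;
	for (i = 0; i < irq_num; i++) {
		info->irq_info[i].argsz = sizeof(struct demo_irq);
		info->irq_info[i].index = i;
		info->irq_info[i].count = vectors_per_irq;
	}

	return info;
}
```

The set_*_info APIs in this series store the caller's pointer rather than
copying the data, so a table built this way would presumably need to stay
allocated for as long as the device is registered.
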
 lib/librte_vfio_user/rte_vfio_user.h    |  46 ++++
 lib/librte_vfio_user/version.map        |   2 +
 lib/librte_vfio_user/vfio_user_base.h   |   8 +
 lib/librte_vfio_user/vfio_user_server.c | 300 +++++++++++++++++++++++-
 lib/librte_vfio_user/vfio_user_server.h |   6 +
 5 files changed, 357 insertions(+), 5 deletions(-)

diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
index f575017bdf..472ca15529 100644
--- a/lib/librte_vfio_user/rte_vfio_user.h
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -69,6 +69,11 @@ struct rte_vfio_user_regions {
 	struct rte_vfio_user_reg_info reg_info[];
 };
 
+struct rte_vfio_user_irq_info {
+	uint32_t irq_num;
+	struct vfio_irq_info irq_info[];
+};
+
 /**
  *  Below APIs are for vfio-user server (device provider) to use:
  *	*rte_vfio_user_register
@@ -76,8 +81,10 @@ struct rte_vfio_user_regions {
  *	*rte_vfio_user_start
  *	*rte_vfio_get_sock_addr
  *	*rte_vfio_user_get_mem_table
+ *	*rte_vfio_user_get_irq
  *	*rte_vfio_user_set_dev_info
  *	*rte_vfio_user_set_reg_info
+ *	*rte_vfio_user_set_irq_info
  */
 
 /**
@@ -188,4 +195,43 @@ __rte_experimental
 int
 rte_vfio_get_sock_addr(int dev_id, char *buf, size_t len);
 
+/**
+ * Get the irqfds of a vfio-user device.
+ *
+ * @param dev_id
+ *   Vfio-user device ID
+ * @param index
+ *   irq index
+ * @param count
+ *   irq count
+ * @param[out] fds
+ *   Pointer to the irqfds
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int
+rte_vfio_user_get_irq(int dev_id, uint32_t index, uint32_t count,
+	int *fds);
+
+/**
+ * Set the irq information for a vfio-user device.
+ *
+ * This information must be set before calling rte_vfio_user_start and must
+ * not be updated after start. To change it after start, unregister and
+ * re-register the device; the vfio-user client can then detect the
+ * device-level change.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @param irq
+ *   IRQ information for the vfio-user device
+ * @return
+ *   0 on success, -1 on failure
+ */
+__rte_experimental
+int
+rte_vfio_user_set_irq_info(const char *sock_addr,
+	struct rte_vfio_user_irq_info *irq);
+
 #endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index 3a50b5ef0e..621a51a9fc 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -6,8 +6,10 @@ EXPERIMENTAL {
 	rte_vfio_user_start;
 	rte_vfio_get_sock_addr;
 	rte_vfio_user_get_mem_table;
+	rte_vfio_user_get_irq;
 	rte_vfio_user_set_dev_info;
 	rte_vfio_user_set_reg_info;
+	rte_vfio_user_set_irq_info;
 
 	local: *;
 };
diff --git a/lib/librte_vfio_user/vfio_user_base.h b/lib/librte_vfio_user/vfio_user_base.h
index dd13170298..1780db4322 100644
--- a/lib/librte_vfio_user/vfio_user_base.h
+++ b/lib/librte_vfio_user/vfio_user_base.h
@@ -61,6 +61,12 @@ struct vfio_user_reg {
 	uint8_t rsvd[VFIO_USER_MAX_RSVD];
 };
 
+struct vfio_user_irq_set {
+	struct vfio_irq_set set;
+	/* Reserved for data of irq set */
+	uint8_t rsvd[VFIO_USER_MAX_RSVD];
+};
+
 struct vfio_user_reg_rw {
 	uint64_t reg_offset;
 	uint32_t reg_idx;
@@ -83,6 +89,8 @@ struct vfio_user_msg {
 		struct rte_vfio_user_mem_reg memory[VFIO_USER_MSG_MAX_NREG];
 		struct vfio_device_info dev_info;
 		struct vfio_user_reg reg_info;
+		struct vfio_irq_info irq_info;
+		struct vfio_user_irq_set irq_set;
 		struct vfio_user_reg_rw reg_rw;
 	} payload;
 	int fds[VFIO_USER_MAX_FD];
diff --git a/lib/librte_vfio_user/vfio_user_server.c b/lib/librte_vfio_user/vfio_user_server.c
index 9e98b4ec81..104a0abb77 100644
--- a/lib/librte_vfio_user/vfio_user_server.c
+++ b/lib/librte_vfio_user/vfio_user_server.c
@@ -9,6 +9,7 @@
 #include <sys/socket.h>
 #include <sys/mman.h>
 #include <sys/un.h>
+#include <sys/eventfd.h>
 
 #include "vfio_user_server.h"
 
@@ -310,6 +311,150 @@ vfio_user_device_get_reg_info(struct vfio_user_server *dev,
 	return 0;
 }
 
+static int
+vfio_user_device_get_irq_info(struct vfio_user_server *dev,
+	struct vfio_user_msg *msg)
+{
+	struct vfio_irq_info *irq_info = &msg->payload.irq_info;
+	struct rte_vfio_user_irq_info *info = dev->irqs.info;
+	uint32_t i;
+
+	if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+		return -EINVAL;
+
+	for (i = 0; i < info->irq_num; i++) {
+		if (irq_info->index == info->irq_info[i].index) {
+			irq_info->count = info->irq_info[i].count;
+			irq_info->flags |= info->irq_info[i].flags;
+			break;
+		}
+	}
+	if (i == info->irq_num)
+		return -EINVAL;
+
+	VFIO_USER_LOG(DEBUG, "IRQ info: argsz(0x%x), flags(0x%x), index(0x%x),"
+		" count(0x%x)\n", irq_info->argsz, irq_info->flags,
+		irq_info->index, irq_info->count);
+
+	return 0;
+}
+
+static inline int
+irq_set_trigger(struct vfio_user_irqs *irqs,
+	struct vfio_irq_set *irq_set, struct vfio_user_msg *msg)
+{
+	uint32_t i = irq_set->start;
+	int eventfd;
+
+	switch (irq_set->flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
+	case VFIO_IRQ_SET_DATA_NONE:
+		if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+			return -EINVAL;
+
+		for (; i < irq_set->start + irq_set->count; i++) {
+			eventfd = irqs->fds[irq_set->index][i];
+			if (eventfd >= 0) {
+				if (eventfd_write(eventfd, (eventfd_t)1))
+					return -errno;
+			}
+		}
+		break;
+	case VFIO_IRQ_SET_DATA_BOOL:
+		if (vfio_user_check_msg_fdnum(msg, 0) != 0)
+			return -EINVAL;
+
+		uint8_t *idx = irq_set->data;
+		for (; i < irq_set->start + irq_set->count; i++, idx++) {
+			eventfd = irqs->fds[irq_set->index][i];
+			if (eventfd >= 0 && *idx == 1) {
+				if (eventfd_write(eventfd, (eventfd_t)1))
+					return -errno;
+			}
+		}
+		break;
+	case VFIO_IRQ_SET_DATA_EVENTFD:
+		if (vfio_user_check_msg_fdnum(msg, irq_set->count) != 0)
+			return -EINVAL;
+
+		int32_t *fds = msg->fds;
+		for (; i < irq_set->start + irq_set->count; i++, fds++) {
+			eventfd = irqs->fds[irq_set->index][i];
+			if (eventfd >= 0)
+				close(eventfd); /* Clear original irqfd*/
+			if (*fds >= 0)
+				irqs->fds[irq_set->index][i] = *fds;
+		}
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void
+vfio_user_disable_irqs(struct vfio_user_irqs *irqs)
+{
+	struct rte_vfio_user_irq_info *info = irqs->info;
+	uint32_t i, j;
+
+	for (i = 0; i < info->irq_num; i++) {
+		for (j = 0; j < info->irq_info[i].count; j++) {
+			if (irqs->fds[i][j] != -1) {
+				close(irqs->fds[i][j]);
+				irqs->fds[i][j] = -1;
+			}
+		}
+	}
+}
+
+static int
+vfio_user_device_set_irqs(struct vfio_user_server *dev,
+	struct vfio_user_msg *msg)
+{
+	struct vfio_user_irq_set *irq = &msg->payload.irq_set;
+	struct vfio_irq_set *irq_set = &irq->set;
+	struct rte_vfio_user_irq_info *info = dev->irqs.info;
+	int ret = 0;
+
+	if (info->irq_num <= irq_set->index
+		|| info->irq_info[irq_set->index].count <
+		irq_set->start + irq_set->count) {
+		vfio_user_close_msg_fds(msg);
+		return -EINVAL;
+	}
+
+	if (irq_set->count == 0) {
+		if (irq_set->flags & VFIO_IRQ_SET_DATA_NONE) {
+			vfio_user_disable_irqs(&dev->irqs);
+			return 0;
+		}
+		vfio_user_close_msg_fds(msg);
+		return -EINVAL;
+	}
+
+	switch (irq_set->flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
+	/* Mask/Unmask not supported for now */
+	case VFIO_IRQ_SET_ACTION_MASK:
+		/* FALLTHROUGH */
+	case VFIO_IRQ_SET_ACTION_UNMASK:
+		return 0;
+	case VFIO_IRQ_SET_ACTION_TRIGGER:
+		ret = irq_set_trigger(&dev->irqs, irq_set, msg);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	VFIO_USER_LOG(DEBUG, "Set IRQ: argsz(0x%x), flags(0x%x), index(0x%x), "
+		"start(0x%x), count(0x%x)\n", irq_set->argsz, irq_set->flags,
+		irq_set->index, irq_set->start, irq_set->count);
+
+	/* Do not reply fds back */
+	msg->fd_num = 0;
+	return ret;
+}
+
 static int
 vfio_user_region_read(struct vfio_user_server *dev,
 	struct vfio_user_msg *msg)
@@ -421,6 +566,50 @@ vfio_user_destroy_mem(struct vfio_user_server *dev)
 	dev->mem = NULL;
 }
 
+static inline void
+vfio_user_destroy_irq(struct vfio_user_server *dev)
+{
+	struct vfio_user_irqs *irq = &dev->irqs;
+	int *fd;
+	uint32_t i, j;
+
+	if (!irq->info)
+		return;
+
+	for (i = 0; i < irq->info->irq_num; i++) {
+		fd = irq->fds[i];
+
+		for (j = 0; j < irq->info->irq_info[i].count; j++) {
+			if (fd[j] != -1)
+				close(fd[j]);
+		}
+
+		free(fd);
+	}
+
+	free(irq->fds);
+}
+
+static inline void
+vfio_user_clean_irqfd(struct vfio_user_server *dev)
+{
+	struct vfio_user_irqs *irq = &dev->irqs;
+	int *fd;
+	uint32_t i, j;
+
+	if (!irq->info)
+		return;
+
+	for (i = 0; i < irq->info->irq_num; i++) {
+		fd = irq->fds[i];
+
+		for (j = 0; j < irq->info->irq_info[i].count; j++) {
+			close(fd[j]);
+			fd[j] = -1;
+		}
+	}
+}
+
 static int
 vfio_user_device_reset(struct vfio_user_server *dev,
 	struct vfio_user_msg *msg)
@@ -436,6 +625,7 @@ vfio_user_device_reset(struct vfio_user_server *dev,
 		return -ENOTSUP;
 
 	vfio_user_destroy_mem_entries(dev->mem);
+	vfio_user_clean_irqfd(dev);
 	dev->is_ready = 0;
 
 	if (dev->ops->reset_device)
@@ -451,8 +641,8 @@ static vfio_user_msg_handler_t vfio_user_msg_handlers[VFIO_USER_MAX] = {
 	[VFIO_USER_DMA_UNMAP] = vfio_user_dma_unmap,
 	[VFIO_USER_DEVICE_GET_INFO] = vfio_user_device_get_info,
 	[VFIO_USER_DEVICE_GET_REGION_INFO] = vfio_user_device_get_reg_info,
-	[VFIO_USER_DEVICE_GET_IRQ_INFO] = NULL,
-	[VFIO_USER_DEVICE_SET_IRQS] = NULL,
+	[VFIO_USER_DEVICE_GET_IRQ_INFO] = vfio_user_device_get_irq_info,
+	[VFIO_USER_DEVICE_SET_IRQS] = vfio_user_device_set_irqs,
 	[VFIO_USER_REGION_READ] = vfio_user_region_read,
 	[VFIO_USER_REGION_WRITE] = vfio_user_region_write,
 	[VFIO_USER_DMA_READ] = NULL,
@@ -856,6 +1046,7 @@ vfio_user_message_handler(int dev_id, int fd)
 	 * to avoid errors in data path handling
 	 */
 	if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP ||
+		cmd == VFIO_USER_DEVICE_SET_IRQS ||
 		cmd == VFIO_USER_DEVICE_RESET)
 		&& dev->ops->lock_dp) {
 		dev->ops->lock_dp(dev_id, 1);
@@ -898,7 +1089,8 @@ vfio_user_message_handler(int dev_id, int fd)
 		if (vfio_user_is_ready(dev) && dev->ops->new_device)
 			dev->ops->new_device(dev_id);
 	} else {
-		if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP)
+		if ((cmd == VFIO_USER_DMA_MAP || cmd == VFIO_USER_DMA_UNMAP
+			|| cmd == VFIO_USER_DEVICE_SET_IRQS)
 			&& dev->ops->update_status)
 			dev->ops->update_status(dev_id);
 	}
@@ -926,6 +1118,7 @@ vfio_user_sock_read(int fd, void *data)
 		if (dev) {
 			dev->ops->destroy_device(dev_id);
 			vfio_user_destroy_mem_entries(dev->mem);
+			vfio_user_clean_irqfd(dev);
 			dev->is_ready = 0;
 			dev->msg_id = 0;
 		}
@@ -1023,9 +1216,9 @@ vfio_user_start_server(struct vfio_user_server_socket *sk)
 	}
 
 	/* All the info must be set before start */
-	if (!dev->dev_info || !dev->reg) {
+	if (!dev->dev_info || !dev->reg || !dev->irqs.info) {
 		VFIO_USER_LOG(ERR, "Failed to start, "
-			"dev/reg info must be set before start\n");
+			"dev/reg/irq info must be set before start\n");
 		return -1;
 	}
 
@@ -1139,6 +1332,7 @@ rte_vfio_user_unregister(const char *sock_addr)
 		return -1;
 	}
 	vfio_user_destroy_mem(dev);
+	vfio_user_destroy_irq(dev);
 	vfio_user_del_device(dev);
 
 	return 0;
@@ -1301,3 +1495,99 @@ rte_vfio_user_get_mem_table(int dev_id)
 
 	return dev->mem;
 }
+
+int
+rte_vfio_user_get_irq(int dev_id, uint32_t index, uint32_t count, int *fds)
+{
+	struct vfio_user_server *dev;
+	struct vfio_user_irqs *irqs;
+	uint32_t irq_max;
+
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get irq info: "
+			"device %d not found.\n", dev_id);
+		return -1;
+	}
+
+	if (!fds)
+		return -1;
+
+	irqs = &dev->irqs;
+	if (index >= irqs->info->irq_num)
+		return -1;
+
+	irq_max = irqs->info->irq_info[index].count;
+	if (count > irq_max)
+		return -1;
+
+	memcpy(fds, dev->irqs.fds[index], count * sizeof(int));
+	return 0;
+}
+
+int
+rte_vfio_user_set_irq_info(const char *sock_addr,
+	struct rte_vfio_user_irq_info *irq)
+{
+	struct vfio_user_server *dev;
+	struct vfio_user_server_socket *sk;
+	uint32_t i;
+	int dev_id, ret;
+
+	if (!irq)
+		return -1;
+
+	pthread_mutex_lock(&vfio_ep_sock.mutex);
+	sk = vfio_user_find_socket(sock_addr);
+	pthread_mutex_unlock(&vfio_ep_sock.mutex);
+
+	if (!sk) {
+		VFIO_USER_LOG(ERR, "Failed to set irq info with sock_addr:"
+			"%s: addr not registered.\n", sock_addr);
+		return -1;
+	}
+
+	dev_id = sk->sock.dev_id;
+	dev = vfio_user_get_device(dev_id);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to set irq info: "
+			"device %d not found.\n", dev_id);
+		return -1;
+	}
+
+	if (dev->started) {
+		VFIO_USER_LOG(ERR, "Failed to set irq info for device %d"
+			 ", device already started\n", dev_id);
+		return -1;
+	}
+
+	if (dev->irqs.info)
+		vfio_user_destroy_irq(dev);
+
+	dev->irqs.info = irq;
+
+	dev->irqs.fds = malloc(irq->irq_num * sizeof(int *));
+	if (!dev->irqs.fds)
+		return -1;
+
+	for (i = 0; i < irq->irq_num; i++) {
+		uint32_t sz = irq->irq_info[i].count * sizeof(int);
+		dev->irqs.fds[i] = malloc(sz);
+		if (!dev->irqs.fds[i]) {
+			ret = -1;
+			goto exit;
+		}
+
+		memset(dev->irqs.fds[i], 0xFF, sz);
+	}
+
+	return 0;
+exit:
+	/* Free only the rows allocated before the failure (none if i == 0) */
+	while (i > 0) {
+		i--;
+		free(dev->irqs.fds[i]);
+	}
+	free(dev->irqs.fds);
+	return ret;
+}
diff --git a/lib/librte_vfio_user/vfio_user_server.h b/lib/librte_vfio_user/vfio_user_server.h
index 0b20ab4e3a..1b4ed4f47c 100644
--- a/lib/librte_vfio_user/vfio_user_server.h
+++ b/lib/librte_vfio_user/vfio_user_server.h
@@ -9,6 +9,11 @@
 
 #include "vfio_user_base.h"
 
+struct vfio_user_irqs {
+	struct rte_vfio_user_irq_info *info;
+	int **fds;
+};
+
 struct vfio_user_server {
 	int dev_id;
 	int is_ready;
@@ -21,6 +26,7 @@ struct vfio_user_server {
 	struct rte_vfio_user_mem *mem;
 	struct vfio_device_info *dev_info;
 	struct rte_vfio_user_regions *reg;
+	struct vfio_user_irqs irqs;
 };
 
 typedef int (*event_handler)(int fd, void *data);
-- 
2.17.1



* [dpdk-dev] [PATCH v2 6/9] vfio_user: add client APIs of device attach/detach
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
                     ` (4 preceding siblings ...)
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 5/9] vfio_user: implement interrupt related APIs Chenbo Xia
@ 2021-01-14  6:14   ` Chenbo Xia
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 7/9] vfio_user: add client APIs of DMA/IRQ/region Chenbo Xia
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: Chenbo Xia @ 2021-01-14  6:14 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu, beilei.xing

This patch implements two APIs, rte_vfio_user_attach_dev() and
rte_vfio_user_detach_dev(), which let a vfio-user client connect to
or disconnect from a vfio-user device on the server side.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
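At the transport level, attaching amounts to opening a unix-domain stream
socket and connecting it to the server's socket address before any
vfio-user message is exchanged. A minimal sketch of that step follows,
assuming a blocking socket for simplicity; demo_connect_unix() is a
hypothetical helper, not a function of this library.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: connect to the unix-domain stream socket at `path`.
 * Returns the connected fd, or a negative errno value on failure.
 */
static int
demo_connect_unix(const char *path)
{
	struct sockaddr_un un;
	int fd, err;

	/* Reject paths that would be truncated in sun_path. */
	if (!path || strlen(path) >= sizeof(un.sun_path))
		return -EINVAL;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -errno;

	memset(&un, 0, sizeof(un));
	un.sun_family = AF_UNIX;
	memcpy(un.sun_path, path, strlen(path) + 1);

	/* connect() takes a generic sockaddr pointer, hence the cast. */
	if (connect(fd, (const struct sockaddr *)&un, sizeof(un)) < 0) {
		err = errno;
		close(fd);	/* do not leak the fd on failure */
		return -err;
	}

	return fd;
}
```

Returning a negative errno keeps the failure reason available to the
caller, and closing the fd on every error path avoids leaking
descriptors when the server is not yet listening.
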
 lib/librte_vfio_user/meson.build        |   3 +-
 lib/librte_vfio_user/rte_vfio_user.h    |  32 +++
 lib/librte_vfio_user/version.map        |   2 +
 lib/librte_vfio_user/vfio_user_client.c | 281 ++++++++++++++++++++++++
 lib/librte_vfio_user/vfio_user_client.h |  26 +++
 5 files changed, 343 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_vfio_user/vfio_user_client.c
 create mode 100644 lib/librte_vfio_user/vfio_user_client.h

diff --git a/lib/librte_vfio_user/meson.build b/lib/librte_vfio_user/meson.build
index b7363f61c6..5761f0edd1 100644
--- a/lib/librte_vfio_user/meson.build
+++ b/lib/librte_vfio_user/meson.build
@@ -6,5 +6,6 @@ if not is_linux
 	reason = 'only supported on Linux'
 endif
 
-sources = files('vfio_user_base.c', 'vfio_user_server.c')
+sources = files('vfio_user_base.c', 'vfio_user_server.c',
+	'vfio_user_client.c')
 headers = files('rte_vfio_user.h')
diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
index 472ca15529..adafa552e2 100644
--- a/lib/librte_vfio_user/rte_vfio_user.h
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -234,4 +234,36 @@ int
 rte_vfio_user_set_irq_info(const char *sock_addr,
 	struct rte_vfio_user_irq_info *irq);
 
+/**
+ *  Below APIs are for vfio-user client (device consumer) to use:
+ *	*rte_vfio_user_attach_dev
+ *	*rte_vfio_user_detach_dev
+ */
+
+/**
+ * Attach to a vfio-user device.
+ *
+ * @param sock_addr
+ *   Unix domain socket address
+ * @return
+ *   - >=0: Success, device attached. Returned value is the device ID.
+ *   - <0: Failure on device attach
+ */
+__rte_experimental
+int
+rte_vfio_user_attach_dev(const char *sock_addr);
+
+/**
+ * Detach from a vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @return
+ *   - 0: Success, device detached
+ *   - <0: Failure on device detach
+ */
+__rte_experimental
+int
+rte_vfio_user_detach_dev(int dev_id);
+
 #endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index 621a51a9fc..a0cda2b49c 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -10,6 +10,8 @@ EXPERIMENTAL {
 	rte_vfio_user_set_dev_info;
 	rte_vfio_user_set_reg_info;
 	rte_vfio_user_set_irq_info;
+	rte_vfio_user_attach_dev;
+	rte_vfio_user_detach_dev;
 
 	local: *;
 };
diff --git a/lib/librte_vfio_user/vfio_user_client.c b/lib/librte_vfio_user/vfio_user_client.c
new file mode 100644
index 0000000000..f288cf70f5
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_client.c
@@ -0,0 +1,281 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+#include <string.h>
+#include <pthread.h>
+#include <fcntl.h>
+#include <sys/un.h>
+#include <sys/socket.h>
+
+#include "vfio_user_client.h"
+#include "rte_vfio_user.h"
+
+#define REPLY_USEC 1000
+#define RECV_MAX_TRY 50
+
+static struct vfio_user_client_devs vfio_client_devs = {
+	.cl_num = 0,
+	.mutex = PTHREAD_MUTEX_INITIALIZER,
+};
+
+/* Check if the sock_addr exists. If not, alloc and return index */
+static int
+vfio_user_client_allocate(const char *sock_addr)
+{
+	uint32_t i, count = 0;
+	int index = -1;
+
+	if (sock_addr == NULL)
+		return -1;
+
+	if (vfio_client_devs.cl_num == 0)
+		return 0;
+
+	for (i = 0; i < MAX_VFIO_USER_CLIENT; i++) {
+		struct vfio_user_client *cl = vfio_client_devs.cl[i];
+
+		if (!cl) {
+			if (index == -1)
+				index = i;
+			continue;
+		}
+
+		if (!strcmp(cl->sock.sock_addr, sock_addr))
+			return -1;
+
+		count++;
+		if (count == vfio_client_devs.cl_num && index != -1)
+			break;
+	}
+
+	return index;
+}
+
+static struct vfio_user_client *
+vfio_user_client_create_dev(const char *sock_addr)
+{
+	struct vfio_user_client *cl;
+	struct vfio_user_socket *sock;
+	int fd, idx;
+	struct sockaddr_un un = { 0 };
+
+	pthread_mutex_lock(&vfio_client_devs.mutex);
+	if (vfio_client_devs.cl_num == MAX_VFIO_USER_CLIENT) {
+		VFIO_USER_LOG(ERR, "Failed to create client:"
+			" client num reaches max\n");
+		goto err;
+	}
+
+	idx = vfio_user_client_allocate(sock_addr);
+	if (idx < 0) {
+		VFIO_USER_LOG(ERR, "Failed to alloc a slot for client\n");
+		goto err;
+	}
+
+	cl = malloc(sizeof(*cl));
+	if (!cl) {
+		VFIO_USER_LOG(ERR, "Failed to alloc client\n");
+		goto err;
+	}
+
+	sock = &cl->sock;
+	sock->sock_addr = strdup(sock_addr);
+	if (!sock->sock_addr) {
+		VFIO_USER_LOG(ERR, "Failed to copy sock_addr\n");
+		goto err_dup;
+	}
+
+	fd = socket(AF_UNIX, SOCK_STREAM, 0);
+	if (fd < 0) {
+		VFIO_USER_LOG(ERR, "Client failed to create socket: %s\n",
+			strerror(errno));
+		goto err_sock;
+	}
+
+	if (fcntl(fd, F_SETFL, O_NONBLOCK)) {
+		VFIO_USER_LOG(ERR, "Failed to set nonblocking mode for client "
+			"socket, fd: %d (%s)\n", fd, strerror(errno));
+		goto err_ctl;
+	}
+
+	un.sun_family = AF_UNIX;
+	strncpy(un.sun_path, sock->sock_addr, sizeof(un.sun_path));
+	un.sun_path[sizeof(un.sun_path) - 1] = '\0';
+
+	if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
+		VFIO_USER_LOG(ERR, "Client connect error, %s\n",
+			strerror(errno));
+		goto err_ctl;
+	}
+
+	sock->sock_fd = fd;
+	sock->dev_id = idx;
+	cl->msg_id = 0;
+
+	vfio_client_devs.cl[idx] = cl;
+	vfio_client_devs.cl_num++;
+
+	pthread_mutex_unlock(&vfio_client_devs.mutex);
+
+	return cl;
+
+err_ctl:
+	close(fd);
+err_sock:
+	free(sock->sock_addr);
+err_dup:
+	free(cl);
+err:
+	pthread_mutex_unlock(&vfio_client_devs.mutex);
+	return NULL;
+}
+
+static int
+vfio_user_client_destroy_dev(int dev_id)
+{
+	struct vfio_user_client *cl;
+	struct vfio_user_socket *sock;
+	int ret = 0;
+
+	pthread_mutex_lock(&vfio_client_devs.mutex);
+	if (vfio_client_devs.cl_num == 0) {
+		VFIO_USER_LOG(ERR, "Failed to destroy client:"
+			" no client exists\n");
+		ret = -EINVAL;
+		goto err;
+	}
+
+	cl = vfio_client_devs.cl[dev_id];
+	if (!cl) {
+		VFIO_USER_LOG(ERR, "Failed to destroy client:"
+			" wrong device ID(%d)\n", dev_id);
+		ret = -EINVAL;
+		goto err;
+	}
+
+	sock = &cl->sock;
+	free(sock->sock_addr);
+	close(sock->sock_fd);
+
+	free(cl);
+	vfio_client_devs.cl[dev_id] = NULL;
+	vfio_client_devs.cl_num--;
+
+err:
+	pthread_mutex_unlock(&vfio_client_devs.mutex);
+	return ret;
+}
+
+static inline void
+vfio_user_client_fill_hdr(struct vfio_user_msg *msg, uint16_t cmd,
+	uint32_t sz, uint16_t msg_id)
+{
+	msg->msg_id = msg_id;
+	msg->cmd = cmd;
+	msg->size = sz;
+	msg->flags = VFIO_USER_TYPE_CMD;
+	msg->err = 0;
+}
+
+static int
+vfio_user_client_send_recv(int sock_fd, struct vfio_user_msg *msg)
+{
+	uint16_t cmd = msg->cmd;
+	uint16_t id = msg->msg_id;
+	uint8_t try_recv = 0;
+	int ret;
+
+	ret = vfio_user_send_msg(sock_fd, msg);
+	if (ret < 0) {
+		VFIO_USER_LOG(ERR, "Send error for %s\n",
+			vfio_user_msg_str[cmd]);
+		return -1;
+	}
+
+	VFIO_USER_LOG(INFO, "Send request %s\n", vfio_user_msg_str[cmd]);
+
+	memset(msg, 0, sizeof(*msg));
+
+	while (try_recv < RECV_MAX_TRY) {
+		ret = vfio_user_recv_msg(sock_fd, msg);
+		if (!ret) {
+			VFIO_USER_LOG(ERR, "Peer closed\n");
+			return -1;
+		} else if (ret > 0) {
+			if (id != msg->msg_id)
+				continue;
+			else
+				break;
+		}
+		usleep(REPLY_USEC);
+		try_recv++;
+	}
+
+	if (cmd != msg->cmd) {
+		VFIO_USER_LOG(ERR, "Request and reply mismatch\n");
+		ret = -1;
+	} else
+		ret = 0;
+
+	VFIO_USER_LOG(INFO, "Recv reply %s\n", vfio_user_msg_str[cmd]);
+
+	return ret;
+}
+
+int
+rte_vfio_user_attach_dev(const char *sock_addr)
+{
+	struct vfio_user_client *dev;
+	struct vfio_user_msg msg = { 0 };
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct vfio_user_version)
+		- VFIO_USER_MAX_VERSION_DATA;
+	struct vfio_user_version *ver = &msg.payload.ver;
+	int ret;
+
+	if (!sock_addr)
+		return -EINVAL;
+
+	dev = vfio_user_client_create_dev(sock_addr);
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to attach the device "
+			"with sock_addr %s\n", sock_addr);
+		return -1;
+	}
+
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_VERSION, sz, dev->msg_id++);
+	ver->major = VFIO_USER_VERSION_MAJOR;
+	ver->minor = VFIO_USER_VERSION_MINOR;
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to negotiate version: %s\n",
+				msg.err ? strerror(msg.err) : "Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	return dev->sock.dev_id;
+}
+
+int
+rte_vfio_user_detach_dev(int dev_id)
+{
+	int ret;
+
+	if (dev_id < 0 || dev_id >= MAX_VFIO_USER_CLIENT)
+		return -EINVAL;
+
+	ret = vfio_user_client_destroy_dev(dev_id);
+	if (ret)
+		VFIO_USER_LOG(ERR, "Failed to detach the device (ID:%d)\n",
+			dev_id);
+
+	return ret;
+}
diff --git a/lib/librte_vfio_user/vfio_user_client.h b/lib/librte_vfio_user/vfio_user_client.h
new file mode 100644
index 0000000000..47401a43cc
--- /dev/null
+++ b/lib/librte_vfio_user/vfio_user_client.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _VFIO_USER_CLIENT_H
+#define _VFIO_USER_CLIENT_H
+
+#include <stdint.h>
+
+#include "vfio_user_base.h"
+
+#define MAX_VFIO_USER_CLIENT 1024
+
+struct vfio_user_client {
+	struct vfio_user_socket sock;
+	uint16_t msg_id;
+	uint8_t rsvd[16];	/* Reserved for future use */
+};
+
+struct vfio_user_client_devs {
+	struct vfio_user_client *cl[MAX_VFIO_USER_CLIENT];
+	uint32_t cl_num;
+	pthread_mutex_t mutex;
+};
+
+#endif
-- 
2.17.1



* [dpdk-dev] [PATCH v2 7/9] vfio_user: add client APIs of DMA/IRQ/region
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
                     ` (5 preceding siblings ...)
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 6/9] vfio_user: add client APIs of device attach/detach Chenbo Xia
@ 2021-01-14  6:14   ` Chenbo Xia
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 8/9] test/vfio_user: introduce functional test Chenbo Xia
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 43+ messages in thread
From: Chenbo Xia @ 2021-01-14  6:14 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu, beilei.xing

This patch introduces nine client APIs:
- Device related:
  rte_vfio_user_get_dev_info and rte_vfio_user_reset
- DMA related:
  rte_vfio_user_dma_map/unmap
- Region related:
  rte_vfio_user_get_reg_info and rte_vfio_user_region_read/write
- IRQ related:
  rte_vfio_user_get_irq_info and rte_vfio_user_set_irqs

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
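Each client call added here follows the same shape: fill a request
header, send it, wait for the reply, and check the reply for an error
flag. A minimal sketch of the header fill and reply check follows. The
demo_* names, the header layout, and the flag values are hypothetical
and chosen only for illustration; they are not the protocol's actual
encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_TYPE_CMD	0x0u
#define DEMO_TYPE_REPLY	0x1u
#define DEMO_ERROR	0x20u

/* Hypothetical message header. */
struct demo_hdr {
	uint16_t msg_id;	/* chosen by the sender, echoed in the reply */
	uint16_t cmd;		/* command code */
	uint32_t size;		/* header plus payload size in bytes */
	uint32_t flags;		/* DEMO_TYPE_* and DEMO_ERROR */
	uint32_t err;		/* errno-style code, valid with DEMO_ERROR */
};

/* Fill a request header for command `cmd`. */
static void
demo_fill_hdr(struct demo_hdr *h, uint16_t cmd, uint32_t size,
	uint16_t msg_id)
{
	assert(h);
	memset(h, 0, sizeof(*h));
	h->msg_id = msg_id;
	h->cmd = cmd;
	h->size = size;
	h->flags = DEMO_TYPE_CMD;
}

/*
 * Check a reply against the request it should answer.
 * Returns 0 when the reply matches and reports success, -1 on a
 * mismatch or an unspecified error, and the negated error code when
 * the reply carries one.
 */
static int
demo_check_reply(const struct demo_hdr *req, const struct demo_hdr *rep)
{
	if (!(rep->flags & DEMO_TYPE_REPLY))
		return -1;
	if (rep->msg_id != req->msg_id || rep->cmd != req->cmd)
		return -1;
	if (rep->flags & DEMO_ERROR)
		return rep->err ? -(int)rep->err : -1;
	return 0;
}
```

Keeping the request header around, rather than reusing one buffer for
both directions, lets the check compare both msg_id and cmd without
saving them separately.
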
 lib/librte_vfio_user/rte_vfio_user.h    | 177 ++++++++++
 lib/librte_vfio_user/version.map        |   9 +
 lib/librte_vfio_user/vfio_user_client.c | 419 ++++++++++++++++++++++++
 3 files changed, 605 insertions(+)

diff --git a/lib/librte_vfio_user/rte_vfio_user.h b/lib/librte_vfio_user/rte_vfio_user.h
index adafa552e2..e5fedcff90 100644
--- a/lib/librte_vfio_user/rte_vfio_user.h
+++ b/lib/librte_vfio_user/rte_vfio_user.h
@@ -238,6 +238,15 @@ rte_vfio_user_set_irq_info(const char *sock_addr,
  *  Below APIs are for vfio-user client (device consumer) to use:
  *	*rte_vfio_user_attach_dev
  *	*rte_vfio_user_detach_dev
+ *	*rte_vfio_user_get_dev_info
+ *	*rte_vfio_user_get_reg_info
+ *	*rte_vfio_user_get_irq_info
+ *	*rte_vfio_user_dma_map
+ *	*rte_vfio_user_dma_unmap
+ *	*rte_vfio_user_set_irqs
+ *	*rte_vfio_user_region_read
+ *	*rte_vfio_user_region_write
+ *	*rte_vfio_user_reset
  */
 
 /**
@@ -266,4 +275,172 @@ __rte_experimental
 int
 rte_vfio_user_detach_dev(int dev_id);
 
+/**
+ * Get device information of a vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param[out] info
+ *   A pointer to a structure of type *vfio_device_info* to be filled with the
+ *   information of the device.
+ * @return
+ *   - 0: Success, device information updated
+ *   - <0: Failure on get device information
+ */
+__rte_experimental
+int
+rte_vfio_user_get_dev_info(int dev_id, struct vfio_device_info *info);
+
+/**
+ * Get region information of a vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param[out] info
+ *   A pointer to a structure of type *vfio_region_info* to be filled with the
+ *   information of the device region.
+ * @param[out] fd
+ *   A pointer to the file descriptor of the region
+ * @return
+ *   - 0: Success, region information and file descriptor updated. If the region
+ *        cannot be mmapped, the file descriptor is set to -1.
+ *   - <0: Failure on get region information
+ */
+__rte_experimental
+int
+rte_vfio_user_get_reg_info(int dev_id, struct vfio_region_info *info,
+	int *fd);
+
+/**
+ * Get IRQ information of a vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param[out] info
+ *   A pointer to a structure of type *vfio_irq_info* to be filled with the
+ *   information of the IRQ.
+ * @return
+ *   - 0: Success, IRQ information updated
+ *   - <0: Failure on get IRQ information
+ */
+__rte_experimental
+int
+rte_vfio_user_get_irq_info(int dev_id, struct vfio_irq_info *info);
+
+/**
+ * Map DMA regions for the vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param mem
+ *   A pointer to a structure of type *rte_vfio_user_mem_reg* that identifies
+ *   one or several DMA regions.
+ * @param fds
+ *   A pointer to a list of file descriptors. One file descriptor maps to
+ *   one DMA region.
+ * @param num
+ *   Number of DMA regions (or file descriptors)
+ * @return
+ *   - 0: Success, all DMA regions are mapped.
+ *   - <0: Failure on DMA map. Assume that none of the DMA regions
+ *         is mapped.
+ */
+__rte_experimental
+int
+rte_vfio_user_dma_map(int dev_id, struct rte_vfio_user_mem_reg *mem,
+	int *fds, uint32_t num);
+
+/**
+ * Unmap DMA regions for the vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param mem
+ *   A pointer to a structure of type *rte_vfio_user_mem_reg* that identifies
+ *   one or several DMA regions.
+ * @param num
+ *   Number of DMA regions
+ * @return
+ *   - 0: Success, all DMA regions are unmapped.
+ *   - <0: Failure on DMA unmap. Assume that none of the DMA regions
+ *         is unmapped.
+ */
+__rte_experimental
+int
+rte_vfio_user_dma_unmap(int dev_id, struct rte_vfio_user_mem_reg *mem,
+	uint32_t num);
+
+/**
+ * Set interrupt signaling, masking, and unmasking for the vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param set
+ *   A pointer to a structure of type *vfio_irq_set* that specifies the set
+ *   data and action
+ * @return
+ *   - 0: Success, IRQs are set successfully.
+ *   - <0: Failure on IRQ set.
+ */
+__rte_experimental
+int
+rte_vfio_user_set_irqs(int dev_id, struct vfio_irq_set *set);
+
+/**
+ * Read region of the vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param idx
+ *   The region index
+ * @param offset
+ *   The region offset
+ * @param size
+ *   Size of the read data
+ * @param[out] data
+ *   Pointer to the buffer to be filled with the data read from the region
+ * @return
+ *   - 0: Success on region read
+ *   - <0: Failure on region read
+ */
+__rte_experimental
+int
+rte_vfio_user_region_read(int dev_id, uint32_t idx, uint64_t offset,
+	uint32_t size, void *data);
+
+/**
+ * Write region of the vfio-user device.
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @param idx
+ *   The region index
+ * @param offset
+ *   The region offset
+ * @param size
+ *   Size of the data to write
+ * @param data
+ *   The pointer to data that will be written to the region
+ * @return
+ *   - 0: Success on region write
+ *   - <0: Failure on region write
+ */
+__rte_experimental
+int
+rte_vfio_user_region_write(int dev_id, uint32_t idx, uint64_t offset,
+	uint32_t size, const void *data);
+
+/**
+ * Reset the vfio-user device
+ *
+ * @param dev_id
+ *   Device ID of the vfio-user device
+ * @return
+ *   - 0: Success on device reset
+ *   - <0: Failure on device reset
+ */
+__rte_experimental
+int
+rte_vfio_user_reset(int dev_id);
+
 #endif
diff --git a/lib/librte_vfio_user/version.map b/lib/librte_vfio_user/version.map
index a0cda2b49c..2e362d00bc 100644
--- a/lib/librte_vfio_user/version.map
+++ b/lib/librte_vfio_user/version.map
@@ -12,6 +12,15 @@ EXPERIMENTAL {
 	rte_vfio_user_set_irq_info;
 	rte_vfio_user_attach_dev;
 	rte_vfio_user_detach_dev;
+	rte_vfio_user_get_dev_info;
+	rte_vfio_user_get_reg_info;
+	rte_vfio_user_get_irq_info;
+	rte_vfio_user_dma_map;
+	rte_vfio_user_dma_unmap;
+	rte_vfio_user_set_irqs;
+	rte_vfio_user_region_read;
+	rte_vfio_user_region_write;
+	rte_vfio_user_reset;
 
 	local: *;
 };
diff --git a/lib/librte_vfio_user/vfio_user_client.c b/lib/librte_vfio_user/vfio_user_client.c
index f288cf70f5..852e7f253a 100644
--- a/lib/librte_vfio_user/vfio_user_client.c
+++ b/lib/librte_vfio_user/vfio_user_client.c
@@ -279,3 +279,422 @@ rte_vfio_user_detach_dev(int dev_id)
 
 	return ret;
 }
+
+int
+rte_vfio_user_get_dev_info(int dev_id, struct vfio_device_info *info)
+{
+	struct vfio_user_msg msg = { 0 };
+	struct vfio_user_client *dev;
+	int ret;
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct vfio_device_info);
+
+	if (!info)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get device info: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_GET_INFO,
+		sz, dev->msg_id++);
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to get device info: %s\n",
+				msg.err ? strerror(msg.err) : "Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	memcpy(info, &msg.payload.dev_info, sizeof(*info));
+	return ret;
+}
+
+int
+rte_vfio_user_get_reg_info(int dev_id, struct vfio_region_info *info,
+	int *fd)
+{
+	struct vfio_user_msg msg = { 0 };
+	int ret, fd_num = 0;
+	struct vfio_user_client *dev;
+	uint32_t sz;
+	struct vfio_user_reg *reg = &msg.payload.reg_info;
+
+	if (!info || !fd)
+		return -EINVAL;
+	sz = VFIO_USER_MSG_HDR_SIZE + info->argsz;
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get region info: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_GET_REGION_INFO,
+		sz, dev->msg_id++);
+	reg->reg_info.index = info->index;
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to get region(%u) info: %s\n",
+				info->index, msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (reg->reg_info.flags & VFIO_REGION_INFO_FLAG_MMAP)
+		fd_num = 1;
+
+	if (vfio_user_check_msg_fdnum(&msg, fd_num) != 0)
+		return -1;
+
+	if (reg->reg_info.index != info->index ||
+		msg.size - VFIO_USER_MSG_HDR_SIZE > sizeof(*reg)) {
+		VFIO_USER_LOG(ERR,
+			"Incorrect reply message for region info\n");
+		return -1;
+	}
+
+	memcpy(info, &msg.payload.reg_info, info->argsz);
+	*fd = fd_num == 1 ? msg.fds[0] : -1;
+
+	return 0;
+}
+
+int
+rte_vfio_user_get_irq_info(int dev_id, struct vfio_irq_info *info)
+{
+	struct vfio_user_msg msg = { 0 };
+	int ret;
+	struct vfio_user_client *dev;
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct vfio_irq_info);
+	struct vfio_irq_info *irq_info = &msg.payload.irq_info;
+
+	if (!info)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to get irq info: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_GET_IRQ_INFO,
+		sz, dev->msg_id++);
+	irq_info->index = info->index;
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to get irq(%u) info: %s\n",
+				info->index, msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	if (irq_info->index != info->index ||
+		msg.size - VFIO_USER_MSG_HDR_SIZE != sizeof(*irq_info)) {
+		VFIO_USER_LOG(ERR,
+			"Incorrect reply message for IRQ info\n");
+		return -1;
+	}
+
+	memcpy(info, irq_info, sizeof(*info));
+	return 0;
+}
+
+static int
+vfio_user_client_dma_map_unmap(struct vfio_user_client *dev,
+	struct rte_vfio_user_mem_reg *mem, int *fds, uint32_t num, bool ismap)
+{
+	struct vfio_user_msg msg = { 0 };
+	int ret;
+	uint32_t i, mem_sz, map;
+	uint16_t cmd = VFIO_USER_DMA_UNMAP;
+
+	if (num > VFIO_USER_MSG_MAX_NREG) {
+		VFIO_USER_LOG(ERR,
+			"Too many memory regions to %s (MAX: %u)\n",
+			ismap ? "map" : "unmap", VFIO_USER_MSG_MAX_NREG);
+		return -EINVAL;
+	}
+
+	if (ismap) {
+		cmd = VFIO_USER_DMA_MAP;
+
+		for (i = 0; i < num; i++) {
+			map = mem[i].flags & RTE_VUSER_MEM_MAPPABLE;
+			if ((map && (fds[i] == -1)) ||
+				(!map && (fds[i] != -1))) {
+				VFIO_USER_LOG(ERR, "%spable memory region "
+					"should%s have valid fd\n",
+					map ? "Map" : "Unmap",
+					map ? "" : " not");
+				return -EINVAL;
+			}
+
+			if (fds[i] != -1)
+				msg.fds[msg.fd_num++] = fds[i];
+		}
+	}
+
+	mem_sz = sizeof(struct rte_vfio_user_mem_reg) * num;
+	memcpy(&msg.payload, mem, mem_sz);
+
+	vfio_user_client_fill_hdr(&msg, cmd, mem_sz + VFIO_USER_MSG_HDR_SIZE,
+		dev->msg_id++);
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to %smap memory regions: "
+				"%s\n", ismap ? "" : "un",
+				msg.err ? strerror(msg.err) : "Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	return 0;
+}
+
+int
+rte_vfio_user_dma_map(int dev_id, struct rte_vfio_user_mem_reg *mem,
+	int *fds, uint32_t num)
+{
+	struct vfio_user_client *dev;
+
+	if (!mem || !fds)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to dma map: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	return vfio_user_client_dma_map_unmap(dev, mem, fds, num, true);
+}
+
+int
+rte_vfio_user_dma_unmap(int dev_id, struct rte_vfio_user_mem_reg *mem,
+	uint32_t num)
+{
+	struct vfio_user_client *dev;
+
+	if (!mem)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to dma unmap: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	return vfio_user_client_dma_map_unmap(dev, mem, NULL, num, false);
+}
+
+int
+rte_vfio_user_set_irqs(int dev_id, struct vfio_irq_set *set)
+{
+	struct vfio_user_msg msg = { 0 };
+	int ret;
+	struct vfio_user_client *dev;
+	uint32_t set_sz;
+	struct vfio_user_irq_set *irq_set = &msg.payload.irq_set;
+
+	if (!set)
+		return -EINVAL;
+	set_sz = set->argsz;
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to set irqs: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	if (set->flags & VFIO_IRQ_SET_DATA_EVENTFD) {
+		msg.fd_num = set->count;
+		memcpy(msg.fds, set->data, sizeof(int) * set->count);
+		set_sz -= sizeof(int) * set->count;
+	}
+	memcpy(irq_set, set, set_sz);
+	irq_set->set.argsz = set_sz;
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_SET_IRQS,
+		VFIO_USER_MSG_HDR_SIZE + set_sz, dev->msg_id++);
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to set irq(%u): %s\n",
+				set->index, msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	return 0;
+}
+
+int
+rte_vfio_user_region_read(int dev_id, uint32_t idx, uint64_t offset,
+	uint32_t size, void *data)
+{
+	struct vfio_user_msg msg = { 0 };
+	int ret;
+	struct vfio_user_client *dev;
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct vfio_user_reg_rw)
+		- VFIO_USER_MAX_RW_DATA;
+	struct vfio_user_reg_rw *rw = &msg.payload.reg_rw;
+
+	if (!data)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to read region: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	if (size > VFIO_USER_MAX_RW_DATA) {
+		VFIO_USER_LOG(ERR, "Region read size exceeds max\n");
+		return -EINVAL;
+	}
+
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_REGION_READ,
+		sz, dev->msg_id++);
+
+	rw->reg_idx = idx;
+	rw->reg_offset = offset;
+	rw->size = size;
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to read region(%u): %s\n",
+				idx, msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	memcpy(data, rw->data, size);
+	return 0;
+}
+
+int
+rte_vfio_user_region_write(int dev_id, uint32_t idx, uint64_t offset,
+	uint32_t size, const void *data)
+{
+	struct vfio_user_msg msg = { 0 };
+	int ret;
+	struct vfio_user_client *dev;
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE + sizeof(struct vfio_user_reg_rw)
+		- VFIO_USER_MAX_RW_DATA + size;
+	struct vfio_user_reg_rw *rw = &msg.payload.reg_rw;
+
+	if (!data)
+		return -EINVAL;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to write region: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	if (size > VFIO_USER_MAX_RW_DATA) {
+		VFIO_USER_LOG(ERR, "Region write size exceeds max\n");
+		return -EINVAL;
+	}
+
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_REGION_WRITE,
+		sz, dev->msg_id++);
+
+	rw->reg_idx = idx;
+	rw->reg_offset = offset;
+	rw->size = size;
+	memcpy(rw->data, data, size);
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to write region(%u): %s\n",
+				idx, msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	return 0;
+}
+
+int
+rte_vfio_user_reset(int dev_id)
+{
+	struct vfio_user_msg msg = { 0 };
+	int ret;
+	struct vfio_user_client *dev;
+	uint32_t sz = VFIO_USER_MSG_HDR_SIZE;
+
+	dev = vfio_client_devs.cl[dev_id];
+	if (!dev) {
+		VFIO_USER_LOG(ERR, "Failed to reset device: "
+			"wrong device ID\n");
+		return -EINVAL;
+	}
+
+	vfio_user_client_fill_hdr(&msg, VFIO_USER_DEVICE_RESET,
+		sz, dev->msg_id++);
+
+	ret = vfio_user_client_send_recv(dev->sock.sock_fd, &msg);
+	if (ret)
+		return ret;
+
+	if (msg.flags & VFIO_USER_ERROR) {
+		VFIO_USER_LOG(ERR, "Failed to reset device: %s\n",
+				msg.err ? strerror(msg.err) :
+				"Unknown error");
+		return msg.err ? -(int)msg.err : -1;
+	}
+
+	if (vfio_user_check_msg_fdnum(&msg, 0) != 0)
+		return -1;
+
+	return ret;
+}
-- 
2.17.1
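
Note on region read/write sizing: below is a minimal standalone sketch, not
part of the patch, of checking the caller-supplied data length before filling
a region-write request. The names sketch_reg_rw, sketch_fill_region_write and
SKETCH_MAX_RW_DATA are hypothetical stand-ins; the real layout of
struct vfio_user_reg_rw and the value of VFIO_USER_MAX_RW_DATA are defined
elsewhere in the series and are not reproduced here.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limit and layout, chosen for illustration only. */
#define SKETCH_MAX_RW_DATA 512

struct sketch_reg_rw {
	uint64_t reg_offset;
	uint32_t reg_idx;
	uint32_t size;
	uint8_t data[SKETCH_MAX_RW_DATA];
};

/*
 * Fill a region-write request. Returns the number of payload bytes to
 * send (fixed fields plus data), or -EINVAL on bad arguments.
 */
static int
sketch_fill_region_write(struct sketch_reg_rw *rw, uint32_t idx,
	uint64_t offset, uint32_t size, const void *data)
{
	if (rw == NULL || data == NULL)
		return -EINVAL;

	/* Check the caller's data length, not the total message size. */
	if (size > SKETCH_MAX_RW_DATA)
		return -EINVAL;

	rw->reg_idx = idx;
	rw->reg_offset = offset;
	rw->size = size;
	memcpy(rw->data, data, size);

	return (int)(sizeof(*rw) - SKETCH_MAX_RW_DATA + size);
}
```

A caller would add the message header size to the returned value to get the
total message size, in the same way rte_vfio_user_region_write adds
VFIO_USER_MSG_HDR_SIZE.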


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [dpdk-dev] [PATCH v2 8/9] test/vfio_user: introduce functional test
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
                     ` (6 preceding siblings ...)
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 7/9] vfio_user: add client APIs of DMA/IRQ/region Chenbo Xia
@ 2021-01-14  6:14   ` Chenbo Xia
  2021-01-14 19:03     ` David Christensen
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 9/9] doc: add vfio-user library guide Chenbo Xia
  2021-01-15  7:58   ` [dpdk-dev] [PATCH v2 0/9] Introduce vfio-user library David Marchand
  9 siblings, 1 reply; 43+ messages in thread
From: Chenbo Xia @ 2021-01-14  6:14 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu, beilei.xing

This patch introduces a functional test for the vfio_user client and
server. Note that the test requires both the server and the client to
be running, and the server must be started first.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
 app/test/meson.build      |   4 +
 app/test/test_vfio_user.c | 665 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 669 insertions(+)
 create mode 100644 app/test/test_vfio_user.c

diff --git a/app/test/meson.build b/app/test/meson.build
index 94fd39fecb..f5b15ac44c 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -138,6 +138,7 @@ test_sources = files('commands.c',
 	'test_trace.c',
 	'test_trace_register.c',
 	'test_trace_perf.c',
+	'test_vfio_user.c',
 	'test_version.c',
 	'virtual_pmd.c'
 )
@@ -173,6 +174,7 @@ test_deps = ['acl',
 	'ring',
 	'security',
 	'stack',
+	'vfio_user',
 	'telemetry',
 	'timer'
 ]
@@ -266,6 +268,8 @@ fast_tests = [
         ['service_autotest', true],
         ['thash_autotest', true],
         ['trace_autotest', true],
+        ['vfio_user_autotest_client', false],
+        ['vfio_user_autotest_server', false],
 ]
 
 perf_test_names = [
diff --git a/app/test/test_vfio_user.c b/app/test/test_vfio_user.c
new file mode 100644
index 0000000000..c5fc0e842c
--- /dev/null
+++ b/app/test/test_vfio_user.c
@@ -0,0 +1,665 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <limits.h>
+#include <stdatomic.h>
+#include <sys/eventfd.h>
+#include <sys/mman.h>
+
+#include <rte_vfio_user.h>
+#include <rte_malloc.h>
+#include <rte_hexdump.h>
+#include <rte_pause.h>
+#include <rte_log.h>
+
+#include "test.h"
+
+#define REGION_SIZE 0x100
+
+struct server_mem_tb {
+	uint32_t entry_num;
+	struct rte_vfio_user_mtb_entry entry[];
+};
+
+static const char test_sock[] = "/tmp/dpdk_vfio_test";
+struct server_mem_tb *server_mem;
+int server_irqfd;
+atomic_uint test_failed;
+atomic_uint server_destroyed;
+
+static int
+test_set_dev_info(const char *sock,
+	struct vfio_device_info *info)
+{
+	int ret;
+
+	info->argsz = sizeof(*info);
+	info->flags = VFIO_DEVICE_FLAGS_RESET | VFIO_DEVICE_FLAGS_PCI;
+	info->num_irqs = VFIO_PCI_NUM_IRQS;
+	info->num_regions = VFIO_PCI_NUM_REGIONS;
+	ret = rte_vfio_user_set_dev_info(sock, info);
+	if (ret) {
+		printf("Failed to set device info\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static ssize_t
+test_dev_cfg_rw(struct rte_vfio_user_reg_info *reg, char *buf,
+	size_t count, loff_t pos, bool iswrite)
+{
+	char *loc = (char *)reg->base + pos;
+
+	if (pos + count > reg->info->size)
+		return -1;
+	if (!iswrite) {
+		memcpy(buf, loc, count);
+		return count;
+	}
+
+	memcpy(loc, buf, count);
+	return count;
+}
+
+static int
+test_set_reg_info(const char *sock_addr,
+	struct rte_vfio_user_regions *reg)
+{
+	struct rte_vfio_user_reg_info *reg_info;
+	void *cfg_base = NULL;
+	uint32_t i, j, sz = 0, reg_sz = REGION_SIZE;
+	int ret;
+
+	reg->reg_num = VFIO_PCI_NUM_REGIONS;
+	sz = sizeof(struct vfio_region_info);
+
+	for (i = 0; i < reg->reg_num; i++) {
+		reg_info = &reg->reg_info[i];
+
+		reg_info->info = rte_zmalloc(NULL, sz, 0);
+		if (!reg_info->info) {
+			printf("Failed to alloc vfio region info\n");
+			goto err;
+		}
+
+		reg_info->priv = NULL;
+		reg_info->fd = -1;
+		reg_info->info->argsz = sz;
+		reg_info->info->cap_offset = sz;
+		reg_info->info->index = i;
+		reg_info->info->offset = 0;
+		reg_info->info->flags = VFIO_REGION_INFO_FLAG_READ |
+			VFIO_REGION_INFO_FLAG_WRITE;
+
+		if (i == VFIO_PCI_CONFIG_REGION_INDEX) {
+			cfg_base = rte_zmalloc(NULL, reg_sz, 0);
+			if (!cfg_base) {
+				printf("Failed to alloc cfg space\n");
+				goto err;
+			}
+			reg_info->base = cfg_base;
+			reg_info->rw = test_dev_cfg_rw;
+			reg_info->info->size = reg_sz;
+		} else {
+			reg_info->base = NULL;
+			reg_info->rw = NULL;
+			reg_info->info->size = 0;
+		}
+	}
+
+	ret = rte_vfio_user_set_reg_info(sock_addr, reg);
+	if (ret) {
+		printf("Failed to set region info\n");
+		return -1;
+	}
+
+	return 0;
+err:
+	for (j = 0; j <= i; j++)
+		rte_free(reg->reg_info[j].info);
+	rte_free(cfg_base);
+	return -1;
+}
+
+static void
+cleanup_reg(struct rte_vfio_user_regions *reg)
+{
+	struct rte_vfio_user_reg_info *reg_info;
+	uint32_t i;
+
+	for (i = 0; i < reg->reg_num; i++) {
+		reg_info = &reg->reg_info[i];
+
+		rte_free(reg_info->info);
+
+		if (i == VFIO_PCI_CONFIG_REGION_INDEX)
+			rte_free(reg_info->base);
+	}
+}
+
+static int
+test_set_irq_info(const char *sock,
+	struct rte_vfio_user_irq_info *info)
+{
+	struct vfio_irq_info *irq_info;
+	int ret;
+	uint32_t i;
+
+	info->irq_num = VFIO_PCI_NUM_IRQS;
+	for (i = 0; i < info->irq_num; i++) {
+		irq_info = &info->irq_info[i];
+		irq_info->argsz = sizeof(*irq_info);
+		irq_info->index = i;
+
+		if (i == VFIO_PCI_MSIX_IRQ_INDEX) {
+			irq_info->flags = VFIO_IRQ_INFO_EVENTFD |
+				VFIO_IRQ_INFO_NORESIZE;
+			irq_info->count = 1;
+		} else {
+			irq_info->flags = 0;
+			irq_info->count = 0;
+		}
+	}
+
+	ret = rte_vfio_user_set_irq_info(sock, info);
+	if (ret) {
+		printf("Failed to set irq info\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+test_get_mem(int dev_id)
+{
+	const struct rte_vfio_user_mem *mem;
+	uint32_t entry_sz;
+
+	mem = rte_vfio_user_get_mem_table(dev_id);
+	if (!mem) {
+		printf("Failed to get memory table\n");
+		return -1;
+	}
+
+	entry_sz = sizeof(struct rte_vfio_user_mtb_entry) * mem->entry_num;
+	server_mem = rte_zmalloc(NULL, sizeof(*server_mem) + entry_sz, 0);
+
+	memcpy(server_mem->entry, mem->entry, entry_sz);
+	server_mem->entry_num = mem->entry_num;
+
+	return 0;
+}
+
+static int
+test_get_irq(int dev_id)
+{
+	int ret;
+
+	server_irqfd = -1;
+	ret = rte_vfio_user_get_irq(dev_id, VFIO_PCI_MSIX_IRQ_INDEX, 1,
+		&server_irqfd);
+	if (ret) {
+		printf("Failed to get IRQ\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+test_create_device(int dev_id)
+{
+	char sock[PATH_MAX];
+
+	RTE_LOG(DEBUG, USER1, "Device created\n");
+
+	if (rte_vfio_get_sock_addr(dev_id, sock, sizeof(sock))) {
+		printf("Failed to get socket addr\n");
+		goto err;
+	}
+
+	if (strcmp(sock, test_sock)) {
+		printf("Wrong socket addr\n");
+		goto err;
+	}
+
+	printf("Get socket address: TEST OK\n");
+
+	return 0;
+err:
+	atomic_store(&test_failed, 1);
+	return -1;
+}
+
+static void
+test_destroy_device(int dev_id)
+{
+	int ret;
+
+	RTE_LOG(DEBUG, USER1, "Device destroyed\n");
+
+	ret = test_get_mem(dev_id);
+	if (ret)
+		goto err;
+
+	printf("Get memory table: TEST OK\n");
+
+	ret = test_get_irq(dev_id);
+	if (ret)
+		goto err;
+
+	printf("Get IRQ: TEST OK\n");
+
+	atomic_store(&server_destroyed, 1);
+	return;
+err:
+	atomic_store(&test_failed, 1);
+}
+
+static int
+test_update_device(int dev_id __rte_unused)
+{
+	RTE_LOG(DEBUG, USER1, "Device updated\n");
+
+	return 0;
+}
+
+static int
+test_lock_dp(int dev_id __rte_unused, int lock)
+{
+	RTE_LOG(DEBUG, USER1, "Device data path %slocked\n", lock ? "" : "un");
+	return 0;
+}
+
+static int
+test_reset_device(int dev_id __rte_unused)
+{
+	RTE_LOG(DEBUG, USER1, "Device reset\n");
+	return 0;
+}
+
+const struct rte_vfio_user_notify_ops test_vfio_ops = {
+	.new_device = test_create_device,
+	.destroy_device = test_destroy_device,
+	.update_status = test_update_device,
+	.lock_dp = test_lock_dp,
+	.reset_device = test_reset_device,
+};
+
+static int
+test_vfio_user_server(void)
+{
+	struct vfio_device_info dev_info;
+	struct rte_vfio_user_regions *reg;
+	struct rte_vfio_user_reg_info *reg_info;
+	struct vfio_region_info *info;
+	struct rte_vfio_user_irq_info *irq_info;
+	struct rte_vfio_user_mtb_entry *ent;
+	int ret, err;
+	uint32_t i;
+
+	atomic_init(&test_failed, 0);
+	atomic_init(&server_destroyed, 0);
+
+	ret = rte_vfio_user_register(test_sock, &test_vfio_ops);
+	if (ret) {
+		printf("Failed to register\n");
+		ret = TEST_FAILED;
+		goto err_regis;
+	}
+
+	printf("Register device: TEST OK\n");
+
+	reg = rte_zmalloc(NULL, sizeof(*reg) + VFIO_PCI_NUM_REGIONS *
+		sizeof(struct rte_vfio_user_reg_info), 0);
+	if (!reg) {
+		printf("Failed to alloc regions\n");
+		ret = TEST_FAILED;
+		goto err_reg;
+	}
+
+	irq_info = rte_zmalloc(NULL, sizeof(*irq_info) + VFIO_PCI_NUM_IRQS *
+		sizeof(struct vfio_irq_info), 0);
+	if (!irq_info) {
+		printf("Failed to alloc irq info\n");
+		ret = TEST_FAILED;
+		goto err_irq;
+	}
+
+	if (test_set_dev_info(test_sock, &dev_info)) {
+		ret = TEST_FAILED;
+		goto err_set;
+	}
+
+	printf("Set device info: TEST OK\n");
+
+	if (test_set_reg_info(test_sock, reg)) {
+		ret = TEST_FAILED;
+		goto err_set;
+	}
+
+	printf("Set region info: TEST OK\n");
+
+	if (test_set_irq_info(test_sock, irq_info)) {
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("Set irq info: TEST OK\n");
+
+	ret = rte_vfio_user_start(test_sock);
+	if (ret) {
+		printf("Failed to start\n");
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("Start device: TEST OK\n");
+
+	while (atomic_load(&test_failed) == 0 &&
+		atomic_load(&server_destroyed) == 0)
+		rte_pause();
+
+	if (atomic_load(&test_failed) == 1) {
+		printf("Test failed during device running\n");
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("=================================\n");
+	printf("Device layout:\n");
+	printf("=================================\n");
+	printf("%u regions, %u IRQs\n", dev_info.num_regions,
+		dev_info.num_irqs);
+	printf("=================================\n");
+
+	reg_info = &reg->reg_info[VFIO_PCI_CONFIG_REGION_INDEX];
+	info = reg_info->info;
+	printf("Configuration Space:\nsize : 0x%llx, prot: %s%s\n",
+		info->size,
+		(info->flags & VFIO_REGION_INFO_FLAG_READ) ? "read/" : "",
+		(info->flags & VFIO_REGION_INFO_FLAG_WRITE) ? "write" : "");
+	rte_hexdump(stdout, "Content", (const void *)reg_info->base,
+		info->size);
+
+	printf("=================================\n");
+	printf("DMA memory table (Entry num: %u):\n", server_mem->entry_num);
+
+	for (i = 0; i < server_mem->entry_num; i++) {
+		ent = &server_mem->entry[i];
+		printf("(Entry %u) gpa: 0x%" PRIx64
+			", size: 0x%" PRIx64 ", hva: 0x%" PRIx64 "\n"
+			", mmap_addr: 0x%" PRIx64 ", mmap_size: 0x%" PRIx64
+			", fd: %d\n", i, ent->gpa, ent->size,
+			ent->host_user_addr, (uint64_t)ent->mmap_addr,
+			ent->mmap_size, ent->fd);
+	}
+
+	printf("=================================\n");
+	printf("MSI-X Interrupt:\nNumber: %u, irqfd: %s\n",
+		irq_info->irq_info[VFIO_PCI_MSIX_IRQ_INDEX].count,
+		server_irqfd == -1 ? "Invalid" : "Valid");
+
+	ret = TEST_SUCCESS;
+
+err:
+	cleanup_reg(reg);
+err_set:
+	rte_free(irq_info);
+err_irq:
+	rte_free(reg);
+err_reg:
+	err = rte_vfio_user_unregister(test_sock);
+	if (err)
+		ret = TEST_FAILED;
+	else
+		printf("Unregister device: TEST OK\n");
+err_regis:
+	return ret;
+}
+
+static int
+test_get_dev_info(int dev_id, struct vfio_device_info *info)
+{
+	int ret;
+
+	ret = rte_vfio_user_get_dev_info(dev_id, info);
+	if (ret) {
+		printf("Failed to get device info\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+test_get_reg_info(int dev_id, struct vfio_region_info *info)
+{
+	int ret, fd = -1;
+
+	info->index = VFIO_PCI_CONFIG_REGION_INDEX;
+	info->argsz = sizeof(*info);
+	ret = rte_vfio_user_get_reg_info(dev_id, info, &fd);
+	if (ret) {
+		printf("Failed to get region info\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+test_get_irq_info(int dev_id, struct vfio_irq_info *info)
+{
+	int ret;
+
+	info->index = VFIO_PCI_MSIX_IRQ_INDEX;
+	ret = rte_vfio_user_get_irq_info(dev_id, info);
+	if (ret) {
+		printf("Failed to get irq info\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+test_set_irqs(int dev_id, struct vfio_irq_set *set, int *fd)
+{
+	int ret;
+
+	*fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
+	if (*fd < 0) {
+		printf("Failed to create eventfd\n");
+		return -1;
+	}
+
+	set->argsz = sizeof(*set) + sizeof(int);
+	set->count = 1;
+	set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
+	set->index = VFIO_PCI_MSIX_IRQ_INDEX;
+	set->start = 0;
+	memcpy(set->data, fd, sizeof(*fd));
+
+	ret = rte_vfio_user_set_irqs(dev_id, set);
+	if (ret) {
+		printf("Failed to set irqs\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+test_dma_map_unmap(int dev_id, struct rte_vfio_user_mem_reg *mem)
+{
+	int ret, fd = -1;
+
+	mem->fd_offset = 0;
+	mem->flags = 0;
+	mem->gpa = 0x12345678;
+	mem->protection = PROT_READ | PROT_WRITE;
+	mem->size = 0x10000;
+
+	/* Map -> Unmap -> Map */
+	ret = rte_vfio_user_dma_map(dev_id, mem, &fd, 1);
+	if (ret) {
+		printf("Failed to dma map\n");
+		return -1;
+	}
+
+	ret = rte_vfio_user_dma_unmap(dev_id, mem, 1);
+	if (ret) {
+		printf("Failed to dma unmap\n");
+		return -1;
+	}
+
+	ret = rte_vfio_user_dma_map(dev_id, mem, &fd, 1);
+	if (ret) {
+		printf("Failed to dma re-map\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+test_region_read_write(int dev_id, void *read_data, uint64_t sz)
+{
+	int ret;
+	uint32_t data = 0x1A2B3C4D, idx = VFIO_PCI_CONFIG_REGION_INDEX;
+
+	ret = rte_vfio_user_region_write(dev_id, idx, 0, 4, (void *)&data);
+	if (ret) {
+		printf("Failed to write region\n");
+		return -1;
+	}
+
+	ret = rte_vfio_user_region_read(dev_id, idx, 0, sz, read_data);
+	if (ret) {
+		printf("Failed to read region\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+test_vfio_user_client(void)
+{
+	int ret = 0, dev_id, fd = -1;
+	struct vfio_device_info dev_info;
+	struct vfio_irq_info irq_info;
+	struct rte_vfio_user_mem_reg mem;
+	struct vfio_irq_set *set;
+	struct vfio_region_info reg_info;
+	void *data;
+
+	ret = rte_vfio_user_attach_dev(test_sock);
+	if (ret < 0) {
+		printf("Failed to attach device\n");
+		return TEST_FAILED;
+	}
+
+	printf("Attach device: TEST OK\n");
+
+	dev_id = ret;
+	ret = rte_vfio_user_reset(dev_id);
+	if (ret) {
+		printf("Failed to reset device\n");
+		return TEST_FAILED;
+	}
+
+	printf("Reset device: TEST OK\n");
+
+	if (test_get_dev_info(dev_id, &dev_info))
+		return TEST_FAILED;
+
+	printf("Get device info: TEST OK\n");
+
+	if (test_get_reg_info(dev_id, &reg_info))
+		return TEST_FAILED;
+
+	printf("Get region info: TEST OK\n");
+
+	if (test_get_irq_info(dev_id, &irq_info))
+		return TEST_FAILED;
+
+	printf("Get irq info: TEST OK\n");
+
+	set = rte_zmalloc(NULL, sizeof(*set) + sizeof(int), 0);
+	if (!set) {
+		printf("Failed to allocate irq set\n");
+		return TEST_FAILED;
+	}
+
+	data = rte_zmalloc(NULL, reg_info.size, 0);
+	if (!data) {
+		printf("Failed to allocate data\n");
+		ret = TEST_FAILED;
+		goto err_data;
+	}
+
+	if (test_set_irqs(dev_id, set, &fd)) {
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("Set irqs: TEST OK\n");
+
+	if (test_dma_map_unmap(dev_id, &mem)) {
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("DMA map/unmap: TEST OK\n");
+
+	if (test_region_read_write(dev_id, data, reg_info.size)) {
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("Region read/write: TEST OK\n");
+
+	printf("=================================\n");
+	printf("Device layout:\n");
+	printf("=================================\n");
+	printf("%u regions, %u IRQs\n", dev_info.num_regions,
+		dev_info.num_irqs);
+	printf("=================================\n");
+	printf("Configuration Space:\nsize : 0x%llx, prot: %s%s\n",
+		reg_info.size,
+		(reg_info.flags & VFIO_REGION_INFO_FLAG_READ) ? "read/" : "",
+		(reg_info.flags & VFIO_REGION_INFO_FLAG_WRITE) ? "write" : "");
+	rte_hexdump(stdout, "Content", (const void *)data, reg_info.size);
+
+	printf("=================================\n");
+	printf("DMA memory table (Entry num: 1):\ngpa: 0x%" PRIx64
+		", size: 0x%" PRIx64 ", fd: -1, fd_offset:0x%" PRIx64 "\n",
+		mem.gpa, mem.size, mem.fd_offset);
+	printf("=================================\n");
+	printf("MSI-X Interrupt:\nNumber: %u, irqfd: %s\n", irq_info.count,
+		fd == -1 ? "Invalid" : "Valid");
+
+	ret = rte_vfio_user_detach_dev(dev_id);
+	if (ret) {
+		printf("Failed to detach device\n");
+		ret = TEST_FAILED;
+		goto err;
+	}
+
+	printf("Device detach: TEST OK\n");
+err:
+	rte_free(data);
+err_data:
+	rte_free(set);
+	return ret;
+}
+
+REGISTER_TEST_COMMAND(vfio_user_autotest_client, test_vfio_user_client);
+REGISTER_TEST_COMMAND(vfio_user_autotest_server, test_vfio_user_server);
-- 
2.17.1
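
Note on the DMA memory table: the server test above copies the table and
prints gpa, size and hva for each entry. Below is a minimal standalone sketch,
not part of the patch, of how such a table might be used to translate a guest
physical address into a host virtual address. struct sketch_mtb_entry and
sketch_gpa_to_hva are hypothetical; only the field names gpa, size and
host_user_addr are taken from the test, and returning 0 for "no mapping" is an
assumption made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layout; the field names mirror those printed by the test. */
struct sketch_mtb_entry {
	uint64_t gpa;
	uint64_t size;
	uint64_t host_user_addr;
};

/*
 * Translate the range [gpa, gpa + len) to a host virtual address.
 * Returns 0 if the range is not fully covered by a single entry.
 */
static uint64_t
sketch_gpa_to_hva(const struct sketch_mtb_entry *tb, uint32_t num,
	uint64_t gpa, uint64_t len)
{
	uint32_t i;

	for (i = 0; i < num; i++) {
		const struct sketch_mtb_entry *e = &tb[i];
		uint64_t off;

		if (gpa < e->gpa)
			continue;

		off = gpa - e->gpa;
		/* Compare against the remaining bytes to avoid overflow. */
		if (off >= e->size || len > e->size - off)
			continue;

		return e->host_user_addr + off;
	}

	return 0;
}
```

An emulated device would typically run such a lookup on every guest address
found in a descriptor before touching the memory, and refresh its copy of the
table whenever the DMA mappings change.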


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [dpdk-dev] [PATCH v2 9/9] doc: add vfio-user library guide
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
                     ` (7 preceding siblings ...)
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 8/9] test/vfio_user: introduce functional test Chenbo Xia
@ 2021-01-14  6:14   ` Chenbo Xia
  2021-01-15  7:58   ` [dpdk-dev] [PATCH v2 0/9] Introduce vfio-user library David Marchand
  9 siblings, 0 replies; 43+ messages in thread
From: Chenbo Xia @ 2021-01-14  6:14 UTC (permalink / raw)
  To: dev, thomas, david.marchand
  Cc: stephen, cunming.liang, xiuchun.lu, miao.li, jingjing.wu, beilei.xing

Add vfio-user library guide and update release notes.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Signed-off-by: Xiuchun Lu <xiuchun.lu@intel.com>
---
 doc/guides/prog_guide/index.rst         |   1 +
 doc/guides/prog_guide/vfio_user_lib.rst | 215 ++++++++++++++++++++++++
 doc/guides/rel_notes/release_21_02.rst  |  11 ++
 3 files changed, 227 insertions(+)
 create mode 100644 doc/guides/prog_guide/vfio_user_lib.rst

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 45c7dec88d..f9847b1058 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -70,3 +70,4 @@ Programmer's Guide
     lto
     profile_app
     glossary
+    vfio_user_lib
diff --git a/doc/guides/prog_guide/vfio_user_lib.rst b/doc/guides/prog_guide/vfio_user_lib.rst
new file mode 100644
index 0000000000..5ef9e2f2b0
--- /dev/null
+++ b/doc/guides/prog_guide/vfio_user_lib.rst
@@ -0,0 +1,215 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation.
+
+Vfio User Library
+=================
+
+The vfio-user library implements the vfio-user protocol, which is a protocol
+that allows an I/O device to be emulated in a separate process outside of a
+Virtual Machine Monitor (VMM). The protocol has a client/server model, in which
+the server emulates the device and the client (e.g., VMM) consumes the device.
+The vfio-user library uses the device model of Linux kernel VFIO and the core
+concepts defined in its API. The main difference between kernel VFIO and
+vfio-user is that, in vfio-user, the device consumer uses messages over a UNIX
+domain socket instead of system calls.
+
+The vfio-user library is used to construct and consume emulated devices. The
+server-side implementation is mainly for constructing devices, and the
+client-side implementation is mainly for consuming and manipulating devices.
+You use the server APIs mainly for two things: providing the device information
+(e.g., region/IRQ information) to the vfio-user library and acquiring the
+configuration (e.g., DMA table) from the client. To construct a device, you
+only need to focus on the device abstraction that the vfio-user library
+defines, rather than on how the server communicates with the client. You use
+the client APIs mainly for acquiring the device information and configuring
+the device. Client API usage is almost the same as the kernel VFIO ioctls.
+
+
+Vfio User Server API Overview
+-----------------------------
+
+The following is an overview of the key Vfio User Server API functions and of
+how they are used to build an emulated device.
+
+There are four main steps in using the Vfio User API to build your device:
+
+1. Register
+
+This step includes one API in Vfio User.
+
+* ``rte_vfio_user_register(sock_addr, notify_ops)``
+
+  This function registers a session to communicate with vfio-user client. A
+  session maps to one device so that a device instance will be created upon
+  registration.
+
+  ``sock_addr`` specifies the Unix domain socket file path and is the identity
+  of the session.
+
+  ``notify_ops`` is a set of callbacks that the vfio-user library uses to
+  notify the emulated device. Currently, there are five callbacks:
+
+  - ``new_device``
+    This callback is invoked when the device is configured and becomes ready.
+    The dev_id is used by the vfio-user library to identify a unique device.
+
+  - ``destroy_device``
+    This callback is invoked when the device is destroyed. In most cases, it
+    means the client is disconnected from the server.
+
+  - ``update_status``
+    This callback is invoked when the device configuration is updated (e.g.,
+    DMA table/IRQ update).
+
+  - ``lock_dp``
+    This callback is invoked when the data path needs to be locked or unlocked.
+
+  - ``reset_device``
+    This callback is invoked when the emulated device needs to be reset.
+
+2. Set device information
+
+This step includes three APIs in Vfio User.
+
+* ``rte_vfio_user_set_dev_info(sock_addr, dev_info)``
+
+  This function passes the device information to the vfio-user library. The
+  device information is defined by Linux VFIO and mainly consists of the device
+  type and the number of VFIO regions and IRQs.
+
+* ``rte_vfio_user_set_reg_info(sock_addr, reg)``
+
+  This function passes the VFIO region information to the vfio-user library.
+  Regions should be created before using this API. The information mainly
+  includes the process virtual address, size, file descriptor, attributes and
+  capabilities of the regions.
+
+* ``rte_vfio_user_set_irq_info(sock_addr, irq)``
+
+  This function passes the IRQ information to the vfio-user library. The
+  information includes how many IRQ types the device supports (e.g., MSI/MSI-X)
+  and the IRQ count of each type.
+
+3. Start
+
+This step includes one API in Vfio User.
+
+* ``rte_vfio_user_start(sock_addr)``
+
+  This function starts the registered session with the vfio-user client: a
+  control thread starts to listen for and handle messages sent from the client.
+  Note that only one thread is created for all vfio-user based devices.
+
+  ``sock_addr`` specifies the Unix domain socket file path and is the identity
+  of the session.
+
+4. Get device configuration
+
+This step includes two APIs in Vfio User. Both APIs should be called once the
+device is ready, and the information they return can be updated at any time.
+A simple usage is to call them in the new_device and update_status callbacks.
+
+* ``rte_vfio_user_get_mem_table(dev_id)``
+
+  This function gets the DMA memory table from vfio-user library. The memory
+  table entry has the information of guest physical address, process virtual
+  address, size and file descriptor. Emulated devices could use the memory
+  table to perform DMA read/write on guest memory.
+
+  ``dev_id`` specifies the device ID.
+
+* ``rte_vfio_user_get_irq(dev_id, index, count, fds)``
+
+  This function gets the IRQ's eventfd from vfio-user library. In vfio-user
+  library, an efficient way to send interrupts is using eventfds. The eventfds
+  should be sent by the client. Emulated devices can simply call eventfd_write
+  to trigger interrupts.
+
+  ``dev_id`` specifies the device ID.
+
+  ``index`` specifies the interrupt type. The mapping of interrupt index and
+  type is defined by emulated device.
+
+  ``count`` specifies the interrupt count.
+
+  ``fds`` is for saving the file descriptors.
+
+
+Vfio User Client API Overview
+-----------------------------
+
+The following is an overview of key Vfio User Client API functions. You will
+know how to use an emulated device with this overview.
+
+There are mainly three steps of using Vfio User Client API to consume the
+device:
+
+1. Attach
+
+This step includes one API in Vfio User.
+
+* ``rte_vfio_user_attach_dev(sock_addr)``
+
+  This function attaches to an emulated device. After the function call
+  succeeds, it is possible to acquire device information and configure the device.
+
+  ``sock_addr`` specifies the Unix domain socket file path and is the identity
+  of the session/device.
+
+2. Get device information
+
+This step includes three APIs in Vfio User.
+
+* ``rte_vfio_user_get_dev_info(dev_id, info)``
+
+  This function gets the device information of the emulated device on the other
+  side of socket. The device information is defined in Linux VFIO which mainly
+  consists of device type and the number of vfio regions and IRQs.
+
+  ``dev_id`` specifies the identity of the device.
+
+  ``info`` specifies the information of the device.
+
+* ``rte_vfio_user_get_reg_info(dev_id, info, fd)``
+
+  This function gets the region information of the emulated device on the other
+  side of socket. The region information is defined in Linux VFIO which mainly
+  consists of region size, index and capabilities.
+
+  ``info`` specifies the information of the region.
+
+  ``fd`` specifies the file descriptor of the region.
+
+* ``rte_vfio_user_get_irq_info(dev_id, irq)``
+
+  This function gets the IRQ information of the emulated device. The IRQ
+  information includes IRQ count and index.
+
+  ``info`` specifies the information of the IRQ.
+
+3. Configure the device
+
+This step includes three APIs in Vfio User.
+
+* ``rte_vfio_user_dma_map(dev_id, mem, fds, num)``
+
+  This function maps DMA memory regions for the emulated device.
+
+  ``mem`` specifies the information of DMA memory regions.
+
+  ``fds`` specifies the file descriptors of the DMA memory regions.
+
+  ``num`` specifies the number of the DMA memory regions.
+
+* ``rte_vfio_user_dma_unmap(dev_id, mem, num)``
+
+  This function unmaps DMA memory regions for the emulated device.
+
+* ``rte_vfio_user_set_irqs(dev_id, set)``
+
+  This function configures the interrupts for the emulated device.
+
+  ``set`` specifies the configuration of interrupts.
+
+After the above three steps are done, users can easily use the emulated device
+(e.g., do I/O operations).
\ No newline at end of file
diff --git a/doc/guides/rel_notes/release_21_02.rst b/doc/guides/rel_notes/release_21_02.rst
index 706cbf8f0c..eabbbfebef 100644
--- a/doc/guides/rel_notes/release_21_02.rst
+++ b/doc/guides/rel_notes/release_21_02.rst
@@ -55,6 +55,17 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added vfio-user Library.**
+
+  Added an experimental library ``librte_vfio_user`` to provide device
+  emulation support.
+
+  The library is an implementation of vfio-user protocol. It includes two main
+  parts: client and server. Server implementation is for device provider to
+  abstract its device. Client implementation is for device consumer to
+  manipulate the emulated device.
+
+  See :doc:`../prog_guide/vfio_user_lib` for more information.
 
 Removed Items
 -------------
-- 
2.17.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/9] vfio_user: implement device and region related APIs
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 3/9] vfio_user: implement device and region " Chenbo Xia
@ 2021-01-14 18:48     ` David Christensen
  2021-01-19  3:22       ` Xia, Chenbo
  0 siblings, 1 reply; 43+ messages in thread
From: David Christensen @ 2021-01-14 18:48 UTC (permalink / raw)
  To: dev



On 1/13/21 10:14 PM, Chenbo Xia wrote:
> This patch introduces device and region related APIs, which are
> rte_vfio_user_set_dev_info() and rte_vfio_user_set_reg_info().
> The corresponding vfio-user command handling is also added with
> the definition of all vfio-user command identity.

Receiving a build warning on RHEL 8.3 (gcc 8.3.1) for POWER with this patch:

In file included from ../lib/librte_vfio_user/vfio_user_server.h:10,
                  from ../lib/librte_vfio_user/vfio_user_server.c:12:
../lib/librte_vfio_user/vfio_user_server.c: In function 
‘vfio_user_device_get_reg_info’:
../lib/librte_vfio_user/vfio_user_base.h:24:2: warning: format ‘%llx’ 
expects argument of type ‘long long unsigned int’, but argument 7 has 
type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
   "VFIO_USER: " fmt, ## args)
   ^~~~~~~~~~~~~
../lib/librte_vfio_user/vfio_user_server.c:92:2: note: in expansion of 
macro ‘VFIO_USER_LOG’
   VFIO_USER_LOG(DEBUG, "Region(%u) info: addr(0x%" PRIx64 "), fd(%d), "
   ^~~~~~~~~~~~~
../lib/librte_vfio_user/vfio_user_server.c:93:12: note: format string is 
defined here
    "sz(0x%llx), argsz(0x%x), c_off(0x%x), flags(0x%x) "
          ~~~^
          %lx
In file included from ../lib/librte_vfio_user/vfio_user_server.h:10,
                  from ../lib/librte_vfio_user/vfio_user_server.c:12:
../lib/librte_vfio_user/vfio_user_base.h:24:2: warning: format ‘%llx’ 
expects argument of type ‘long long unsigned int’, but argument 11 has 
type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
   "VFIO_USER: " fmt, ## args)
   ^~~~~~~~~~~~~
../lib/librte_vfio_user/vfio_user_server.c:92:2: note: in expansion of 
macro ‘VFIO_USER_LOG’
   VFIO_USER_LOG(DEBUG, "Region(%u) info: addr(0x%" PRIx64 "), fd(%d), "
   ^~~~~~~~~~~~~
../lib/librte_vfio_user/vfio_user_server.c:94:13: note: format string is 
defined here
    "off(0x%llx)\n", vinfo->index, (uint64_t)reg_info->base,
           ~~~^
           %lx
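
One possible way to silence this on every architecture, sketched here as an
assumption rather than as the fix that went into the series: cast the 64-bit
fields to uint64_t and print them with PRIx64, as the log line already does
for the base address. The helper name format_region() is hypothetical and
only stands in for the VFIO_USER_LOG() call.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): formats a region size and
 * offset without assuming how the platform defines its 64-bit kernel types.
 * Casting to uint64_t and printing with PRIx64 is valid whether the
 * underlying type is "unsigned long" or "unsigned long long".
 */
static int
format_region(char *buf, size_t len, uint64_t size, uint64_t offset)
{
	return snprintf(buf, len, "sz(0x%" PRIx64 "), off(0x%" PRIx64 ")",
			size, offset);
}
```

A caller holding a __u64 field would pass it as (uint64_t)field, so the
format string no longer depends on the width of "long".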

Dave

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH v2 8/9] test/vfio_user: introduce functional test
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 8/9] test/vfio_user: introduce functional test Chenbo Xia
@ 2021-01-14 19:03     ` David Christensen
  2021-01-19  3:27       ` Xia, Chenbo
  0 siblings, 1 reply; 43+ messages in thread
From: David Christensen @ 2021-01-14 19:03 UTC (permalink / raw)
  To: dev



On 1/13/21 10:14 PM, Chenbo Xia wrote:
> This patch introduces functional test for vfio_user client and
> server. Note that the test can only be run with server and client
> both started and server should be started first.

Receiving a build warning on RHEL 8.3 (gcc 8.3.1) for POWER with this patch:

../app/test/test_vfio_user.c: In function ‘test_dev_cfg_rw’:
../app/test/test_vfio_user.c:60:3: warning: implicit declaration of 
function ‘memcpy’ [-Wimplicit-function-declaration]
    memcpy(buf, loc, count);
    ^~~~~~
../app/test/test_vfio_user.c:60:3: warning: incompatible implicit 
declaration of built-in function ‘memcpy’
../app/test/test_vfio_user.c:60:3: note: include ‘<string.h>’ or provide 
a declaration of ‘memcpy’
../app/test/test_vfio_user.c:18:1:
+#include <string.h>

../app/test/test_vfio_user.c:60:3:
    memcpy(buf, loc, count);
    ^~~~~~
../app/test/test_vfio_user.c:64:2: warning: incompatible implicit 
declaration of built-in function ‘memcpy’
   memcpy(loc, buf, count);
   ^~~~~~
../app/test/test_vfio_user.c:64:2: note: include ‘<string.h>’ or provide 
a declaration of ‘memcpy’
../app/test/test_vfio_user.c: In function ‘test_get_mem’:
../app/test/test_vfio_user.c:192:2: warning: incompatible implicit 
declaration of built-in function ‘memcpy’
   memcpy(server_mem->entry, mem->entry, entry_sz);
   ^~~~~~
../app/test/test_vfio_user.c:192:2: note: include ‘<string.h>’ or 
provide a declaration of ‘memcpy’
../app/test/test_vfio_user.c: In function ‘test_create_device’:
../app/test/test_vfio_user.c:226:6: warning: implicit declaration of 
function ‘strcmp’ [-Wimplicit-function-declaration]
   if (strcmp(sock, test_sock)) {
       ^~~~~~
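
As the compiler note suggests, including <string.h> should be enough here. A
minimal sketch follows; cfg_read() and sock_matches() are hypothetical
stand-ins and are not the functions in test_vfio_user.c.

```c
#include <stdint.h>
#include <string.h>	/* declares memcpy() and strcmp() */

/*
 * Hypothetical stand-in for the config read path in the test: copies
 * count bytes out of an emulated config space starting at off.
 */
static void
cfg_read(uint8_t *buf, const uint8_t *cfg_space, size_t off, size_t count)
{
	memcpy(buf, cfg_space + off, count);
}

/* Returns 1 when the socket path matches the expected one, else 0. */
static int
sock_matches(const char *sock, const char *expected)
{
	return strcmp(sock, expected) == 0;
}
```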

Also, when running vfio_user_autotest_server, I'm unable to exit the 
application with CTRL-C directly.  If I run a second test with 
vfio_user_autotest_client then the server test runs to completion 
without an error and I'm able to exit the test app normally.  Any way to 
get out of the server test without running the matching client test?
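
For illustration only, and assuming the server test blocks in a wait loop
that never looks at SIGINT (I have not checked how the test actually waits),
an interruptible wait could look like the sketch below. All names here are
hypothetical; only sigaction(), sigemptyset() and nanosleep() are real POSIX
calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

/* Set from the signal handler; sig_atomic_t keeps the store signal-safe. */
static volatile sig_atomic_t stop_requested;

static void
on_sigint(int signum)
{
	(void)signum;
	stop_requested = 1;
}

/* Installs the SIGINT handler; returns 0 on success, -1 on failure. */
static int
install_sigint_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigint;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGINT, &sa, NULL);
}

/*
 * Hypothetical wait loop: polls a "client done" flag, but gives up once
 * SIGINT has arrived. Returns 0 if the client finished, -1 if interrupted.
 */
static int
wait_for_client(const volatile int *client_done)
{
	const struct timespec pause = { 0, 1000000 }; /* 1 ms */

	while (!*client_done) {
		if (stop_requested)
			return -1;
		nanosleep(&pause, NULL);
	}
	return 0;
}
```

With something like this, the server side can return an error from the test
when it is interrupted instead of waiting for a client that never attaches.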

Dave

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/9] Introduce vfio-user library
  2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
                     ` (8 preceding siblings ...)
  2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 9/9] doc: add vfio-user library guide Chenbo Xia
@ 2021-01-15  7:58   ` David Marchand
  2021-01-19  3:13     ` Xia, Chenbo
  9 siblings, 1 reply; 43+ messages in thread
From: David Marchand @ 2021-01-15  7:58 UTC (permalink / raw)
  To: Chenbo Xia
  Cc: dev, Thomas Monjalon, Stephen Hemminger, Liang, Cunming,
	xiuchun.lu, miao.li, Jingjing Wu, Beilei Xing

Hello Chenbo,

On Thu, Jan 14, 2021 at 7:19 AM Chenbo Xia <chenbo.xia@intel.com> wrote:
>
> This series enables DPDK to be an alternative I/O device emulation library of
> building virtualized devices in separate processes outside QEMU. It introduces
> a new library for device emulation (librte_vfio_user).
>
> *librte_vfio_user* library is an implementation of VFIO-over-socket[1] (also
> known as vfio-user) which is a protocol that allows a device to be virtualized
> in a separate process outside of QEMU.
>
> Background & Motivation
> -----------------------
> The disaggregated/multi-process QEMU is using VFIO-over-socket/vfio-user
> as the main transport mechanism to disaggregate IO services from QEMU[2].
> Vfio-user essentially implements the VFIO device model presented to the
> user process by a set of messages over a unix-domain socket. The main
> difference between application using vfio-user and application using vfio
> kernel module is that device manipulation is based on socket messages for
> vfio-user but system calls for vfio kernel module. The vfio-user devices
> consist of a generic VFIO device type, living in QEMU, which is called the
> client[3], and the core device implementation (emulated device), living
> outside of QEMU, which is called the server. With emulated devices removed
> from QEMU enabled by vfio-user implementation, other places should be
> introduced to accommodate virtualized/emulated device. This series introduces
> vfio-user support in DPDK to enable DPDK as one of the living places for
> emulated device except QEMU.
>
> This series introduce the server and client implementation of vfio-user protocol.
> The server plays the role as emulated devices and the client is the device
> consumer. With this implementation, DPDK will be enabled to be both device
> provider and consumer.
>
> Design overview
> ---------------
>
> +--------------+     +--------------+
> | +----------+ |     | +----------+ |
> | | Generic  | |     | | Emulated | |
> | | vfio-dev | |     | | device   | |
> | +----------+ |     | +----|-----+ |
> | +----------+ |     | +----|-----+ |
> | | vfio-user| |     | | vfio-user| |
> | | client   | |<--->| | server   | |
> | +----------+ |     | +----------+ |
> | QEMU/DPDK    |     | DPDK         |
> +--------------+     +--------------+
>
> - Generic vfio-dev.
>   It is the generic vfio framework in vfio applications like QEMU or DPDK.
>   Applications can keep the most of vfio device management and plug in a
>   vfio-user device type. Note that in current implementation, we have not
>   yet integrated client vfio-user into kernel vfio in DPDK but it is viable
>   and good to do so.
>
> - vfio-user client.
>   For DPDK, it is part of librte_vfio_user implementation to provide ways to
>   manipulate a vfio-user based emulated devices. This manipulation is very
>   similar with kernel vfio (i.e., syscalls like ioctl, mmap and pread/pwrite).
>   It is a base for vfio-user device consumer.
>
> - vfio-user server.
>   It is server part of librte_vfio_user. It provides ways to emulate your own
>   device. A device provider could only care about device layout that VFIO
>   defines but does not need to know how it communicates with vfio-user client.
>
> - Emulated device.
>   It is emulated device of any type (e.g., network, crypto and etc.).
>
> References
> ----------
> [1]: https://patchew.org/QEMU/20201130161229.23164-1-thanos.makatos@nutanix.com/
> [2]: https://wiki.qemu.org/Features/MultiProcessQEMU
> [3]: https://github.com/oracle/qemu/tree/vfio-user-v0.2
>
> ----------------------------------
> v2:
>  - Clean up non-static inline function (Stephen)
>  - Naturally pack vfio-user message payload and header (Stephen)
>  - Make function definiton align with coding style (Beilei)
>  - Clean up duplicate code in vfio-user server APIs (Beilei)
>  - Fix some typos


GHA (called by the robot) caught various issues:
- doc: https://github.com/ovsrobot/dpdk/runs/1700373705?check_suite_focus=true#step:14:3407
- clang: https://github.com/ovsrobot/dpdk/runs/1700373722?check_suite_focus=true#step:14:1050
- i386: https://github.com/ovsrobot/dpdk/runs/1700373747?check_suite_focus=true#step:14:2607
- aarch64: https://github.com/ovsrobot/dpdk/runs/1700373764?check_suite_focus=true#step:14:2848
and https://github.com/ovsrobot/dpdk/runs/1700373770?check_suite_focus=true#step:14:2880


-- 
David Marchand


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/9] Introduce vfio-user library
  2021-01-15  7:58   ` [dpdk-dev] [PATCH v2 0/9] Introduce vfio-user library David Marchand
@ 2021-01-19  3:13     ` Xia, Chenbo
  0 siblings, 0 replies; 43+ messages in thread
From: Xia, Chenbo @ 2021-01-19  3:13 UTC (permalink / raw)
  To: David Marchand
  Cc: dev, Thomas Monjalon, Stephen Hemminger, Liang, Cunming, Lu,
	Xiuchun, Li, Miao, Wu, Jingjing, Xing, Beilei

Hi David,

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, January 15, 2021 3:58 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev <dev@dpdk.org>; Thomas Monjalon <thomas@monjalon.net>; Stephen
> Hemminger <stephen@networkplumber.org>; Liang, Cunming
> <cunming.liang@intel.com>; Lu, Xiuchun <xiuchun.lu@intel.com>; Li, Miao
> <miao.li@intel.com>; Wu, Jingjing <jingjing.wu@intel.com>; Xing, Beilei
> <beilei.xing@intel.com>
> Subject: Re: [PATCH v2 0/9] Introduce vfio-user library
> 
> Hello Chenbo,
> 
> 
> GHA (called by the robot) caught various issues:
> - doc: https://github.com/ovsrobot/dpdk/runs/1700373705?check_suite_focus=true#step:14:3407
> - clang: https://github.com/ovsrobot/dpdk/runs/1700373722?check_suite_focus=true#step:14:1050
> - i386: https://github.com/ovsrobot/dpdk/runs/1700373747?check_suite_focus=true#step:14:2607
> - aarch64: https://github.com/ovsrobot/dpdk/runs/1700373764?check_suite_focus=true#step:14:2848
> and https://github.com/ovsrobot/dpdk/runs/1700373770?check_suite_focus=true#step:14:2880

Thanks! Will fix in next version.

Chenbo
> 
> 
> --
> David Marchand


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/9] vfio_user: implement device and region related APIs
  2021-01-14 18:48     ` David Christensen
@ 2021-01-19  3:22       ` Xia, Chenbo
  0 siblings, 0 replies; 43+ messages in thread
From: Xia, Chenbo @ 2021-01-19  3:22 UTC (permalink / raw)
  To: David Christensen, dev

Hi David,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of David Christensen
> Sent: Friday, January 15, 2021 2:49 AM
> To: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 3/9] vfio_user: implement device and region
> related APIs
> 
> 
> 
> On 1/13/21 10:14 PM, Chenbo Xia wrote:
> > This patch introduces device and region related APIs, which are
> > rte_vfio_user_set_dev_info() and rte_vfio_user_set_reg_info().
> > The corresponding vfio-user command handling is also added with
> > the definition of all vfio-user command identity.
> 
> Receiving a build warning on RHEL 8.3 (gcc 8.3.1) for POWER with this patch:
> 
> In file included from ../lib/librte_vfio_user/vfio_user_server.h:10,
>                   from ../lib/librte_vfio_user/vfio_user_server.c:12:
> ../lib/librte_vfio_user/vfio_user_server.c: In function
> ‘vfio_user_device_get_reg_info’:
> ../lib/librte_vfio_user/vfio_user_base.h:24:2: warning: format ‘%llx’
> expects argument of type ‘long long unsigned int’, but argument 7 has
> type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
>    "VFIO_USER: " fmt, ## args)
>    ^~~~~~~~~~~~~
> ../lib/librte_vfio_user/vfio_user_server.c:92:2: note: in expansion of
> macro ‘VFIO_USER_LOG’
>    VFIO_USER_LOG(DEBUG, "Region(%u) info: addr(0x%" PRIx64 "), fd(%d), "
>    ^~~~~~~~~~~~~
> ../lib/librte_vfio_user/vfio_user_server.c:93:12: note: format string is
> defined here
>     "sz(0x%llx), argsz(0x%x), c_off(0x%x), flags(0x%x) "
>           ~~~^
>           %lx
> In file included from ../lib/librte_vfio_user/vfio_user_server.h:10,
>                   from ../lib/librte_vfio_user/vfio_user_server.c:12:
> ../lib/librte_vfio_user/vfio_user_base.h:24:2: warning: format ‘%llx’
> expects argument of type ‘long long unsigned int’, but argument 11 has
> type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=]
>    "VFIO_USER: " fmt, ## args)
>    ^~~~~~~~~~~~~
> ../lib/librte_vfio_user/vfio_user_server.c:92:2: note: in expansion of
> macro ‘VFIO_USER_LOG’
>    VFIO_USER_LOG(DEBUG, "Region(%u) info: addr(0x%" PRIx64 "), fd(%d), "
>    ^~~~~~~~~~~~~
> ../lib/librte_vfio_user/vfio_user_server.c:94:13: note: format string is
> defined here
>     "off(0x%llx)\n", vinfo->index, (uint64_t)reg_info->base,
>            ~~~^
>            %lx

Thanks! Will fix it in the new version.

Chenbo
> 
> Dave

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH v2 8/9] test/vfio_user: introduce functional test
  2021-01-14 19:03     ` David Christensen
@ 2021-01-19  3:27       ` Xia, Chenbo
  2021-01-19 18:26         ` David Christensen
  0 siblings, 1 reply; 43+ messages in thread
From: Xia, Chenbo @ 2021-01-19  3:27 UTC (permalink / raw)
  To: David Christensen, dev

Hi David,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of David Christensen
> Sent: Friday, January 15, 2021 3:03 AM
> To: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 8/9] test/vfio_user: introduce functional
> test
> 
> 
> 
> On 1/13/21 10:14 PM, Chenbo Xia wrote:
> > This patch introduces functional test for vfio_user client and
> > server. Note that the test can only be run with server and client
> > both started and server should be started first.
> 
> Receiving a build warning on RHEL 8.3 (gcc 8.3.1) for POWER with this patch:
> 
> ../app/test/test_vfio_user.c: In function ‘test_dev_cfg_rw’:
> ../app/test/test_vfio_user.c:60:3: warning: implicit declaration of
> function ‘memcpy’ [-Wimplicit-function-declaration]
>     memcpy(buf, loc, count);
>     ^~~~~~
> ../app/test/test_vfio_user.c:60:3: warning: incompatible implicit
> declaration of built-in function ‘memcpy’
> ../app/test/test_vfio_user.c:60:3: note: include ‘<string.h>’ or provide
> a declaration of ‘memcpy’
> ../app/test/test_vfio_user.c:18:1:
> +#include <string.h>
> 
> ../app/test/test_vfio_user.c:60:3:
>     memcpy(buf, loc, count);
>     ^~~~~~
> ../app/test/test_vfio_user.c:64:2: warning: incompatible implicit
> declaration of built-in function ‘memcpy’
>    memcpy(loc, buf, count);
>    ^~~~~~
> ../app/test/test_vfio_user.c:64:2: note: include ‘<string.h>’ or provide
> a declaration of ‘memcpy’
> ../app/test/test_vfio_user.c: In function ‘test_get_mem’:
> ../app/test/test_vfio_user.c:192:2: warning: incompatible implicit
> declaration of built-in function ‘memcpy’
>    memcpy(server_mem->entry, mem->entry, entry_sz);
>    ^~~~~~
> ../app/test/test_vfio_user.c:192:2: note: include ‘<string.h>’ or
> provide a declaration of ‘memcpy’
> ../app/test/test_vfio_user.c: In function ‘test_create_device’:
> ../app/test/test_vfio_user.c:226:6: warning: implicit declaration of
> function ‘strcmp’ [-Wimplicit-function-declaration]
>    if (strcmp(sock, test_sock)) {
>        ^~~~~~

Will fix in new version.

> 
> Also, when running vfio_user_autotest_server, I'm unable to exit the
> application with CTRL-C directly.  If I run a second test with
> vfio_user_autotest_client then the server test runs to completion
> without an error and I'm able to exit the test app normally.  Any way to
> get out of the server test without running the matching client test?

Oops..I didn't realize it cannot exit with ctrl-C. It should be fixed.
And normally, because this library has a client/server model, we need a server
and client both launched to complete the test.

Thanks!
Chenbo

> 
> Dave

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [dpdk-dev] [PATCH v2 8/9] test/vfio_user: introduce functional test
  2021-01-19  3:27       ` Xia, Chenbo
@ 2021-01-19 18:26         ` David Christensen
  0 siblings, 0 replies; 43+ messages in thread
From: David Christensen @ 2021-01-19 18:26 UTC (permalink / raw)
  To: Xia, Chenbo, dev



On 1/18/21 7:27 PM, Xia, Chenbo wrote:
>> Also, when running vfio_user_autotest_server, I'm unable to exit the
>> application with CTRL-C directly.  If I run a second test with
>> vfio_user_autotest_client then the server test runs to completion
>> without an error and I'm able to exit the test app normally.  Any way to
>> get out of the server test without running the matching client test?
> 
> Oops..I didn't realize it cannot exit with ctrl-C. It should be fixed.
> And normally, because this library has a client/server model, we need a server
> and client both launched to complete the test.
> 

Understand the requirement.  Might be nice to include an example of 
running the two instances of dpdk-test with command line parameters. 
I've not seen dpdk-test used in this way before.

Dave

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2021-01-19 18:26 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-18  7:38 [dpdk-dev] [PATCH 0/9] Introduce vfio-user library Chenbo Xia
2020-12-18  7:38 ` [dpdk-dev] [PATCH 1/9] lib: introduce " Chenbo Xia
2020-12-18 17:13   ` Stephen Hemminger
2020-12-19  6:12     ` Xia, Chenbo
2020-12-18 17:17   ` Stephen Hemminger
2020-12-19  6:25     ` Xia, Chenbo
2020-12-18  7:38 ` [dpdk-dev] [PATCH 2/9] vfio_user: implement lifecycle related APIs Chenbo Xia
2021-01-05  8:34   ` Xing, Beilei
2021-01-05  9:58     ` Xia, Chenbo
2020-12-18  7:38 ` [dpdk-dev] [PATCH 3/9] vfio_user: implement device and region " Chenbo Xia
2021-01-06  5:51   ` Xing, Beilei
2021-01-06  7:50     ` Xia, Chenbo
2020-12-18  7:38 ` [dpdk-dev] [PATCH 4/9] vfio_user: implement DMA table and socket address API Chenbo Xia
2020-12-18  7:38 ` [dpdk-dev] [PATCH 5/9] vfio_user: implement interrupt related APIs Chenbo Xia
2020-12-30  1:04   ` Wu, Jingjing
2020-12-30  2:31     ` Xia, Chenbo
2020-12-18  7:38 ` [dpdk-dev] [PATCH 6/9] vfio_user: add client APIs of device attach/detach Chenbo Xia
2020-12-18  7:38 ` [dpdk-dev] [PATCH 7/9] vfio_user: add client APIs of DMA/IRQ/region Chenbo Xia
2021-01-07  2:41   ` Xing, Beilei
2021-01-07  7:26     ` Xia, Chenbo
2020-12-18  7:38 ` [dpdk-dev] [PATCH 8/9] test/vfio_user: introduce functional test Chenbo Xia
2020-12-18  7:38 ` [dpdk-dev] [PATCH 9/9] doc: add vfio-user library guide Chenbo Xia
2021-01-06  5:07   ` Xing, Beilei
2021-01-06  7:43     ` Xia, Chenbo
2020-12-18  9:37 ` [dpdk-dev] [PATCH 0/9] Introduce vfio-user library David Marchand
2020-12-18 14:07   ` Thanos Makatos
2021-01-14  6:14 ` [dpdk-dev] [PATCH v2 " Chenbo Xia
2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 1/9] lib: introduce " Chenbo Xia
2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 2/9] vfio_user: implement lifecycle related APIs Chenbo Xia
2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 3/9] vfio_user: implement device and region " Chenbo Xia
2021-01-14 18:48     ` David Christensen
2021-01-19  3:22       ` Xia, Chenbo
2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 4/9] vfio_user: implement DMA table and socket address API Chenbo Xia
2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 5/9] vfio_user: implement interrupt related APIs Chenbo Xia
2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 6/9] vfio_user: add client APIs of device attach/detach Chenbo Xia
2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 7/9] vfio_user: add client APIs of DMA/IRQ/region Chenbo Xia
2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 8/9] test/vfio_user: introduce functional test Chenbo Xia
2021-01-14 19:03     ` David Christensen
2021-01-19  3:27       ` Xia, Chenbo
2021-01-19 18:26         ` David Christensen
2021-01-14  6:14   ` [dpdk-dev] [PATCH v2 9/9] doc: add vfio-user library guide Chenbo Xia
2021-01-15  7:58   ` [dpdk-dev] [PATCH v2 0/9] Introduce vfio-user library David Marchand
2021-01-19  3:13     ` Xia, Chenbo
