DPDK patches and discussions
* [dpdk-dev] [PATCH 0/4] net/mlx5: rework IPC socket and PMD global data init
@ 2019-03-07  7:33 Yongseok Koh
  2019-03-07  7:33 ` [dpdk-dev] [PATCH 1/4] net/mlx5: fix memory event on secondary process Yongseok Koh
                   ` (5 more replies)
  0 siblings, 6 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-07  7:33 UTC (permalink / raw)
  To: shahafs; +Cc: dev

This series replaces the existing socket-based IPC channel with the rte_mp
APIs of EAL and extends it to request secondary processes to stop/start the
datapath. Initialization of PMD global data, including the new IPC channel,
is also reworked to provide a more generic framework for future use.

Yongseok Koh (4):
  net/mlx5: fix memory event on secondary process
  net/mlx5: replace IPC socket with EAL API
  net/mlx5: rework PMD global data init
  net/mlx5: sync stop/start of datapath with secondary process

 drivers/net/mlx5/Makefile       |   2 +-
 drivers/net/mlx5/meson.build    |   2 +-
 drivers/net/mlx5/mlx5.c         | 257 ++++++++++++++++++++++++---------
 drivers/net/mlx5/mlx5.h         |  49 +++++--
 drivers/net/mlx5/mlx5_ethdev.c  |  29 ----
 drivers/net/mlx5/mlx5_mp.c      | 283 +++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_mr.c      |   2 +
 drivers/net/mlx5/mlx5_rxtx.c    |   2 +
 drivers/net/mlx5/mlx5_socket.c  | 306 ----------------------------------------
 drivers/net/mlx5/mlx5_trigger.c |   5 +
 drivers/net/mlx5/mlx5_txq.c     |   7 +-
 11 files changed, 524 insertions(+), 420 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mp.c
 delete mode 100644 drivers/net/mlx5/mlx5_socket.c

-- 
2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH 1/4] net/mlx5: fix memory event on secondary process
  2019-03-07  7:33 [dpdk-dev] [PATCH 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
@ 2019-03-07  7:33 ` Yongseok Koh
  2019-03-07  7:33 ` [dpdk-dev] [PATCH 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-07  7:33 UTC (permalink / raw)
  To: shahafs; +Cc: dev, stable

Because memory events are propagated to secondary processes, the same event
is processed redundantly, once in each process. It should be processed only
once because the data structures used for MR and the event are global across
processes. Register the callback only in the primary process.
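
A toy model of the problem, not DPDK code: it assumes that every process which
registered the callback runs it once per memory event, and that each run
touches the same shared MR state. All names below (proc_type, process,
shared_mr_state, register_mem_event_cb, deliver_mem_event) are hypothetical
stand-ins used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the EAL process type. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

/* Models state that lives in shared memory, visible to every process. */
struct shared_mr_state {
	int updates; /* How many times the shared MR data was touched. */
};

/* Per-process view: whether this process registered the callback. */
struct process {
	enum proc_type type;
	bool cb_registered;
};

/* Register the callback, optionally restricting it to the primary. */
static void
register_mem_event_cb(struct process *p, bool primary_only)
{
	p->cb_registered = !primary_only || p->type == PROC_PRIMARY;
}

/* Deliver one memory event to all processes; each registered callback runs. */
static void
deliver_mem_event(struct process *procs, size_t n, struct shared_mr_state *s)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (procs[i].cb_registered)
			s->updates++;
}
```

With one primary and two secondaries, unrestricted registration updates the
shared state three times per event; primary-only registration updates it once.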

Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/mlx5.c    | 5 +++--
 drivers/net/mlx5/mlx5_mr.c | 2 ++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 9706e351aa..f0ad5c73af 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -157,9 +157,10 @@ mlx5_prepare_shared_data(void)
 		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
 			LIST_INIT(&mlx5_shared_data->mem_event_cb_list);
 			rte_rwlock_init(&mlx5_shared_data->mem_event_rwlock);
+			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
+							mlx5_mr_mem_event_cb,
+							NULL);
 		}
-		rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
-						mlx5_mr_mem_event_cb, NULL);
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
 }
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 700d83d1bc..d336a77e40 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -891,6 +891,8 @@ mlx5_mr_mem_event_cb(enum rte_mem_event event_type, const void *addr,
 	struct mlx5_priv *priv;
 	struct mlx5_dev_list *dev_list = &mlx5_shared_data->mem_event_cb_list;
 
+	/* Must be called from the primary process. */
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
 	switch (event_type) {
 	case RTE_MEM_EVENT_FREE:
 		rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
-- 
2.11.0


* [dpdk-dev] [PATCH 2/4] net/mlx5: replace IPC socket with EAL API
  2019-03-07  7:33 [dpdk-dev] [PATCH 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
  2019-03-07  7:33 ` [dpdk-dev] [PATCH 1/4] net/mlx5: fix memory event on secondary process Yongseok Koh
@ 2019-03-07  7:33 ` Yongseok Koh
  2019-03-14 12:36   ` Shahaf Shuler
  2019-03-07  7:33 ` [dpdk-dev] [PATCH 3/4] net/mlx5: rework PMD global data init Yongseok Koh
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 45+ messages in thread
From: Yongseok Koh @ 2019-03-07  7:33 UTC (permalink / raw)
  To: shahafs; +Cc: dev

A secondary process currently uses a socket-based IPC channel to acquire the
Verbs command file descriptor from the primary process. The FD is used to
remap the UAR address. EAL now provides multi-process APIs (rte_mp), so
mlx5_socket.c is replaced with mlx5_mp.c, which uses the new APIs.
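
For background, the removed mlx5_socket.c passed the descriptor by hand as
SCM_RIGHTS ancillary data over a Unix domain socket. The following is a
minimal standalone sketch of that mechanism using only POSIX calls; send_fd()
and recv_fd() are hypothetical helper names, not part of DPDK or this driver.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for exactly one descriptor, aligned for cmsghdr. */
union fd_ctrl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send one descriptor over a connected Unix socket. Returns 0 or -1. */
static int
send_fd(int sock, int fd)
{
	char data = 0; /* At least one byte of payload must accompany the FD. */
	struct iovec io = { .iov_base = &data, .iov_len = sizeof(data) };
	union fd_ctrl u;
	struct msghdr msg = {
		.msg_iov = &io,
		.msg_iovlen = 1,
		.msg_control = u.buf,
		.msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
	return sendmsg(sock, &msg, 0) == (ssize_t)sizeof(data) ? 0 : -1;
}

/* Receive one descriptor. Returns the new FD, or -1 on failure. */
static int
recv_fd(int sock)
{
	char data;
	struct iovec io = { .iov_base = &data, .iov_len = sizeof(data) };
	union fd_ctrl u;
	struct msghdr msg = {
		.msg_iov = &io,
		.msg_iovlen = 1,
		.msg_control = u.buf,
		.msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len < CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The received descriptor is a new number in the receiving process that refers
to the same open file, which is why it can be handed to mmap() there.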

As this is PMD global infrastructure, only one IPC channel is established.
Any IPC message type can carry a port_id in the message when it needs to
reference a specific device.
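
The single-channel idea can be sketched without DPDK as follows. This is an
illustration under assumptions: struct ipc_msg is a simplified stand-in and
not the rte_mp_msg layout, and port_cmd_fd[], fill_request() and
handle_request() are hypothetical names. One handler serves the channel,
dispatches on the request type, and uses port_id to find the device.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Simplified stand-in for a generic IPC message (not rte_mp_msg). */
struct ipc_msg {
	char name[32];
	int len_param;
	unsigned char param[64];
};

enum req_type { REQ_VERBS_CMD_FD = 1 };

/* Payload shared by all request types on the channel. */
struct req_param {
	enum req_type type;
	int port_id;
	int result;
};

#define CHANNEL_NAME "net_mlx5_mp"
#define MAX_PORTS 4

/* Hypothetical per-port table standing in for per-device private data. */
static int port_cmd_fd[MAX_PORTS] = { 10, 11, 12, 13 };

/* Build a request: one channel name, the type and port_id in the payload. */
static void
fill_request(struct ipc_msg *msg, enum req_type type, int port_id)
{
	struct req_param p = { .type = type, .port_id = port_id, .result = 0 };

	memset(msg, 0, sizeof(*msg));
	strncpy(msg->name, CHANNEL_NAME, sizeof(msg->name) - 1);
	msg->len_param = sizeof(p);
	memcpy(msg->param, &p, sizeof(p));
}

/* Handle a request. Returns the value asked for, or -EINVAL. */
static int
handle_request(const struct ipc_msg *msg)
{
	struct req_param p;

	if (msg->len_param != (int)sizeof(p))
		return -EINVAL;
	memcpy(&p, msg->param, sizeof(p));
	/* Validate port_id before using it as an index. */
	if (p.port_id < 0 || p.port_id >= MAX_PORTS)
		return -EINVAL;
	switch (p.type) {
	case REQ_VERBS_CMD_FD:
		return port_cmd_fd[p.port_id];
	default:
		return -EINVAL;
	}
}
```

Adding a new request type then only needs a new enum value and a new case in
the handler, with no additional channel.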

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/Makefile      |   2 +-
 drivers/net/mlx5/meson.build   |   2 +-
 drivers/net/mlx5/mlx5.c        |   5 +-
 drivers/net/mlx5/mlx5.h        |  26 ++--
 drivers/net/mlx5/mlx5_ethdev.c |  29 ----
 drivers/net/mlx5/mlx5_mp.c     | 126 +++++++++++++++++
 drivers/net/mlx5/mlx5_socket.c | 306 -----------------------------------------
 7 files changed, 148 insertions(+), 348 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mp.c
 delete mode 100644 drivers/net/mlx5/mlx5_socket.c

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 9a7da18196..34dc957ac0 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -34,7 +34,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_dv.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_tcf.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_verbs.c
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mp.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_devx_cmds.c
 
diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build
index 0cf2f0873e..d99667670c 100644
--- a/drivers/net/mlx5/meson.build
+++ b/drivers/net/mlx5/meson.build
@@ -41,7 +41,7 @@ if build
 		'mlx5_rxmode.c',
 		'mlx5_rxq.c',
 		'mlx5_rxtx.c',
-		'mlx5_socket.c',
+		'mlx5_mp.c',
 		'mlx5_stats.c',
 		'mlx5_trigger.c',
 		'mlx5_txq.c',
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index f0ad5c73af..6ed2418106 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -160,6 +160,7 @@ mlx5_prepare_shared_data(void)
 			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
 							mlx5_mr_mem_event_cb,
 							NULL);
+			mlx5_mp_init();
 		}
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
@@ -291,8 +292,6 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 		rte_free(priv->rss_conf.rss_key);
 	if (priv->reta_idx != NULL)
 		rte_free(priv->reta_idx);
-	if (priv->primary_socket)
-		mlx5_socket_uninit(dev);
 	if (priv->config.vf)
 		mlx5_nl_mac_addr_flush(dev);
 	if (priv->nl_socket_route >= 0)
@@ -929,7 +928,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 			goto error;
 		}
 		/* Receive command fd from primary process */
-		err = mlx5_socket_connect(eth_dev);
+		err = mlx5_mp_req_verbs_cmd_fd(eth_dev);
 		if (err < 0) {
 			err = rte_errno;
 			goto error;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 5384453670..80da97bd46 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -56,6 +56,21 @@ enum {
 	PCI_DEVICE_ID_MELLANOX_CONNECTX6VF = 0x101c,
 };
 
+/* Request types for IPC. */
+enum mlx5_mp_req_type {
+	MLX5_MP_REQ_VERBS_CMD_FD = 1,
+};
+
+/* Parameters for IPC. */
+struct mlx5_mp_param {
+	enum mlx5_mp_req_type type;
+	int port_id;
+	int result;
+};
+
+/** Key string for IPC. */
+#define MLX5_MP_NAME "net_mlx5_mp"
+
 /** Switch information returned by mlx5_nl_switch_info(). */
 struct mlx5_switch_info {
 	uint32_t master:1; /**< Master device. */
@@ -241,9 +256,7 @@ struct mlx5_priv {
 	uint32_t link_speed_capa; /* Link speed capabilities. */
 	struct mlx5_xstats_ctrl xstats_ctrl; /* Extended stats control. */
 	struct mlx5_stats_ctrl stats_ctrl; /* Stats control. */
-	int primary_socket; /* Unix socket for primary process. */
 	void *uar_base; /* Reserved address space for UAR mapping */
-	struct rte_intr_handle intr_handle_socket; /* Interrupt handler. */
 	struct mlx5_dev_config config; /* Device configuration. */
 	struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
 	/* Context for Verbs allocator. */
@@ -399,12 +412,9 @@ int mlx5_ctrl_flow(struct rte_eth_dev *dev,
 int mlx5_flow_create_drop_queue(struct rte_eth_dev *dev);
 void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
-/* mlx5_socket.c */
-
-int mlx5_socket_init(struct rte_eth_dev *priv);
-void mlx5_socket_uninit(struct rte_eth_dev *priv);
-void mlx5_socket_handle(struct rte_eth_dev *priv);
-int mlx5_socket_connect(struct rte_eth_dev *priv);
+/* mlx5_mp.c */
+int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
+void mlx5_mp_init(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index f84f7cf69b..17e65154d2 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1091,20 +1091,6 @@ mlx5_dev_interrupt_handler(void *cb_arg)
 }
 
 /**
- * Handle interrupts from the socket.
- *
- * @param cb_arg
- *   Callback argument.
- */
-static void
-mlx5_dev_handler_socket(void *cb_arg)
-{
-	struct rte_eth_dev *dev = cb_arg;
-
-	mlx5_socket_handle(dev);
-}
-
-/**
  * Uninstall interrupt handler.
  *
  * @param dev
@@ -1119,13 +1105,8 @@ mlx5_dev_interrupt_handler_uninstall(struct rte_eth_dev *dev)
 	    dev->data->dev_conf.intr_conf.rmv)
 		rte_intr_callback_unregister(&priv->intr_handle,
 					     mlx5_dev_interrupt_handler, dev);
-	if (priv->primary_socket)
-		rte_intr_callback_unregister(&priv->intr_handle_socket,
-					     mlx5_dev_handler_socket, dev);
 	priv->intr_handle.fd = 0;
 	priv->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
-	priv->intr_handle_socket.fd = 0;
-	priv->intr_handle_socket.type = RTE_INTR_HANDLE_UNKNOWN;
 }
 
 /**
@@ -1159,16 +1140,6 @@ mlx5_dev_interrupt_handler_install(struct rte_eth_dev *dev)
 		rte_intr_callback_register(&priv->intr_handle,
 					   mlx5_dev_interrupt_handler, dev);
 	}
-	ret = mlx5_socket_init(dev);
-	if (ret)
-		DRV_LOG(ERR, "port %u cannot initialise socket: %s",
-			dev->data->port_id, strerror(rte_errno));
-	else if (priv->primary_socket) {
-		priv->intr_handle_socket.fd = priv->primary_socket;
-		priv->intr_handle_socket.type = RTE_INTR_HANDLE_EXT;
-		rte_intr_callback_register(&priv->intr_handle_socket,
-					   mlx5_dev_handler_socket, dev);
-	}
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
new file mode 100644
index 0000000000..19a1f25f0e
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -0,0 +1,126 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2019 6WIND S.A.
+ * Copyright 2019 Mellanox Technologies, Ltd
+ */
+
+#include <assert.h>
+#include <stdio.h>
+#include <time.h>
+
+#include <rte_eal.h>
+#include <rte_ethdev_driver.h>
+#include <rte_string_fns.h>
+
+#include "mlx5.h"
+#include "mlx5_utils.h"
+
+/**
+ * Initialize IPC message.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ * @param[out] msg
+ *   Pointer to message to fill in.
+ * @param[in] type
+ *   Message type.
+ */
+static inline void
+mp_init_msg(struct rte_eth_dev *dev, struct rte_mp_msg *msg,
+	    enum mlx5_mp_req_type type)
+{
+	struct mlx5_mp_param *param = (struct mlx5_mp_param *)msg->param;
+
+	memset(msg, 0, sizeof(*msg));
+	strlcpy(msg->name, MLX5_MP_NAME, sizeof(msg->name));
+	msg->len_param = sizeof(*param);
+	param->type = type;
+	param->port_id = dev->data->port_id;
+}
+
+/**
+ * IPC message handler of primary process.
+ *
+ * @param[in] mp_msg
+ *   Pointer to the request message.
+ * @param[in] peer
+ *   Pointer to the peer socket path.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_mp_msg mp_res;
+	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
+	const struct mlx5_mp_param *param =
+		(const struct mlx5_mp_param *)mp_msg->param;
+	struct rte_eth_dev *dev = &rte_eth_devices[param->port_id];
+	struct mlx5_priv *priv = dev->data->dev_private;
+	int ret = 0;
+
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	switch (param->type) {
+	case MLX5_MP_REQ_VERBS_CMD_FD:
+		mp_init_msg(dev, &mp_res, param->type);
+		mp_res.num_fds = 1;
+		mp_res.fds[0] = priv->ctx->cmd_fd;
+		res->result = 0;
+		ret = rte_mp_reply(&mp_res, peer);
+		break;
+	default:
+		rte_errno = EINVAL;
+		DRV_LOG(ERR, "port %u invalid mp request type",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	return ret;
+}
+
+/**
+ * Request Verbs command file descriptor for mmap to the primary process.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ *
+ * @return
+ *   fd on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
+{
+	struct rte_mp_msg mp_req;
+	struct rte_mp_msg *mp_res;
+	struct rte_mp_reply mp_rep;
+	struct mlx5_mp_param *res __rte_unused;
+	struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
+	int cmd_fd;
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	mp_init_msg(dev, &mp_req, MLX5_MP_REQ_VERBS_CMD_FD);
+	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
+	if (ret) {
+		DRV_LOG(ERR,
+			"port %u failed to get command FD from primary process",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	assert(mp_rep.nb_received == 1);
+	mp_res = &mp_rep.msgs[0];
+	res = (struct mlx5_mp_param *)mp_res->param;
+	assert(!res->result);
+	assert(mp_res->num_fds == 1);
+	cmd_fd = mp_res->fds[0];
+	free(mp_rep.msgs);
+	DRV_LOG(DEBUG, "port %u command FD from primary is %d",
+		dev->data->port_id, cmd_fd);
+	return cmd_fd;
+}
+
+void
+mlx5_mp_init(void)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+}
diff --git a/drivers/net/mlx5/mlx5_socket.c b/drivers/net/mlx5/mlx5_socket.c
deleted file mode 100644
index 41cac3c6aa..0000000000
--- a/drivers/net/mlx5/mlx5_socket.c
+++ /dev/null
@@ -1,306 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2016 6WIND S.A.
- * Copyright 2016 Mellanox Technologies, Ltd
- */
-
-#include <sys/types.h>
-#include <sys/socket.h>
-#include <sys/un.h>
-#include <fcntl.h>
-#include <stdio.h>
-#include <unistd.h>
-#include <sys/stat.h>
-
-#include "mlx5.h"
-#include "mlx5_utils.h"
-
-/**
- * Initialise the socket to communicate with the secondary process
- *
- * @param[in] dev
- *   Pointer to Ethernet device.
- *
- * @return
- *   0 on success, a negative errno value otherwise and rte_errno is set.
- */
-int
-mlx5_socket_init(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct sockaddr_un sun = {
-		.sun_family = AF_UNIX,
-	};
-	int ret;
-	int flags;
-
-	/*
-	 * Close the last socket that was used to communicate
-	 * with the secondary process
-	 */
-	if (priv->primary_socket)
-		mlx5_socket_uninit(dev);
-	/*
-	 * Initialise the socket to communicate with the secondary
-	 * process.
-	 */
-	ret = socket(AF_UNIX, SOCK_STREAM, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	priv->primary_socket = ret;
-	flags = fcntl(priv->primary_socket, F_GETFL, 0);
-	if (flags == -1) {
-		rte_errno = errno;
-		goto error;
-	}
-	ret = fcntl(priv->primary_socket, F_SETFL, flags | O_NONBLOCK);
-	if (ret < 0) {
-		rte_errno = errno;
-		goto error;
-	}
-	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
-		 MLX5_DRIVER_NAME, priv->primary_socket);
-	remove(sun.sun_path);
-	ret = bind(priv->primary_socket, (const struct sockaddr *)&sun,
-		   sizeof(sun));
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING,
-			"port %u cannot bind socket, secondary process not"
-			" supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto close;
-	}
-	ret = listen(priv->primary_socket, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto close;
-	}
-	return 0;
-close:
-	remove(sun.sun_path);
-error:
-	claim_zero(close(priv->primary_socket));
-	priv->primary_socket = 0;
-	return -rte_errno;
-}
-
-/**
- * Un-Initialise the socket to communicate with the secondary process
- *
- * @param[in] dev
- */
-void
-mlx5_socket_uninit(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-
-	MKSTR(path, "/var/tmp/%s_%d", MLX5_DRIVER_NAME, priv->primary_socket);
-	claim_zero(close(priv->primary_socket));
-	priv->primary_socket = 0;
-	claim_zero(remove(path));
-}
-
-/**
- * Handle socket interrupts.
- *
- * @param dev
- *   Pointer to Ethernet device.
- */
-void
-mlx5_socket_handle(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	int conn_sock;
-	int ret = 0;
-	struct cmsghdr *cmsg = NULL;
-	struct ucred *cred = NULL;
-	char buf[CMSG_SPACE(sizeof(struct ucred))] = { 0 };
-	char vbuf[1024] = { 0 };
-	struct iovec io = {
-		.iov_base = vbuf,
-		.iov_len = sizeof(*vbuf),
-	};
-	struct msghdr msg = {
-		.msg_iov = &io,
-		.msg_iovlen = 1,
-		.msg_control = buf,
-		.msg_controllen = sizeof(buf),
-	};
-	int *fd;
-
-	/* Accept the connection from the client. */
-	conn_sock = accept(priv->primary_socket, NULL, NULL);
-	if (conn_sock < 0) {
-		DRV_LOG(WARNING, "port %u connection failed: %s",
-			dev->data->port_id, strerror(errno));
-		return;
-	}
-	ret = setsockopt(conn_sock, SOL_SOCKET, SO_PASSCRED, &(int){1},
-					 sizeof(int));
-	if (ret < 0) {
-		ret = errno;
-		DRV_LOG(WARNING, "port %u cannot change socket options: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
-	ret = recvmsg(conn_sock, &msg, MSG_WAITALL);
-	if (ret < 0) {
-		ret = errno;
-		DRV_LOG(WARNING, "port %u received an empty message: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
-	/* Expect to receive credentials only. */
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		DRV_LOG(WARNING, "port %u no message", dev->data->port_id);
-		goto error;
-	}
-	if ((cmsg->cmsg_type == SCM_CREDENTIALS) &&
-		(cmsg->cmsg_len >= sizeof(*cred))) {
-		cred = (struct ucred *)CMSG_DATA(cmsg);
-		assert(cred != NULL);
-	}
-	cmsg = CMSG_NXTHDR(&msg, cmsg);
-	if (cmsg != NULL) {
-		DRV_LOG(WARNING, "port %u message wrongly formatted",
-			dev->data->port_id);
-		goto error;
-	}
-	/* Make sure all the ancillary data was received and valid. */
-	if ((cred == NULL) || (cred->uid != getuid()) ||
-	    (cred->gid != getgid())) {
-		DRV_LOG(WARNING, "port %u wrong credentials",
-			dev->data->port_id);
-		goto error;
-	}
-	/* Set-up the ancillary data. */
-	cmsg = CMSG_FIRSTHDR(&msg);
-	assert(cmsg != NULL);
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_RIGHTS;
-	cmsg->cmsg_len = CMSG_LEN(sizeof(priv->ctx->cmd_fd));
-	fd = (int *)CMSG_DATA(cmsg);
-	*fd = priv->ctx->cmd_fd;
-	ret = sendmsg(conn_sock, &msg, 0);
-	if (ret < 0)
-		DRV_LOG(WARNING, "port %u cannot send response",
-			dev->data->port_id);
-error:
-	close(conn_sock);
-}
-
-/**
- * Connect to the primary process.
- *
- * @param[in] dev
- *   Pointer to Ethernet structure.
- *
- * @return
- *   fd on success, negative errno value otherwise and rte_errno is set.
- */
-int
-mlx5_socket_connect(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct sockaddr_un sun = {
-		.sun_family = AF_UNIX,
-	};
-	int socket_fd = -1;
-	int *fd = NULL;
-	int ret;
-	struct ucred *cred;
-	char buf[CMSG_SPACE(sizeof(*cred))] = { 0 };
-	char vbuf[1024] = { 0 };
-	struct iovec io = {
-		.iov_base = vbuf,
-		.iov_len = sizeof(*vbuf),
-	};
-	struct msghdr msg = {
-		.msg_control = buf,
-		.msg_controllen = sizeof(buf),
-		.msg_iov = &io,
-		.msg_iovlen = 1,
-	};
-	struct cmsghdr *cmsg;
-
-	ret = socket(AF_UNIX, SOCK_STREAM, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u cannot connect to primary",
-			dev->data->port_id);
-		goto error;
-	}
-	socket_fd = ret;
-	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
-		 MLX5_DRIVER_NAME, priv->primary_socket);
-	ret = connect(socket_fd, (const struct sockaddr *)&sun, sizeof(sun));
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u cannot connect to primary",
-			dev->data->port_id);
-		goto error;
-	}
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(DEBUG, "port %u cannot get first message",
-			dev->data->port_id);
-		goto error;
-	}
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_CREDENTIALS;
-	cmsg->cmsg_len = CMSG_LEN(sizeof(*cred));
-	cred = (struct ucred *)CMSG_DATA(cmsg);
-	if (cred == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(DEBUG, "port %u no credentials received",
-			dev->data->port_id);
-		goto error;
-	}
-	cred->pid = getpid();
-	cred->uid = getuid();
-	cred->gid = getgid();
-	ret = sendmsg(socket_fd, &msg, MSG_DONTWAIT);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING,
-			"port %u cannot send credentials to primary: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	ret = recvmsg(socket_fd, &msg, MSG_WAITALL);
-	if (ret <= 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u no message from primary: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(WARNING, "port %u no file descriptor received",
-			dev->data->port_id);
-		goto error;
-	}
-	fd = (int *)CMSG_DATA(cmsg);
-	if (*fd < 0) {
-		DRV_LOG(WARNING, "port %u no file descriptor received: %s",
-			dev->data->port_id, strerror(errno));
-		rte_errno = *fd;
-		goto error;
-	}
-	ret = *fd;
-	close(socket_fd);
-	return ret;
-error:
-	if (socket_fd != -1)
-		close(socket_fd);
-	return -rte_errno;
-}
-- 
2.11.0


* [dpdk-dev] [PATCH 3/4] net/mlx5: rework PMD global data init
  2019-03-07  7:33 [dpdk-dev] [PATCH 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
  2019-03-07  7:33 ` [dpdk-dev] [PATCH 1/4] net/mlx5: fix memory event on secondary process Yongseok Koh
  2019-03-07  7:33 ` [dpdk-dev] [PATCH 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
@ 2019-03-07  7:33 ` Yongseok Koh
  2019-03-14 12:36   ` Shahaf Shuler
  2019-03-07  7:33 ` [dpdk-dev] [PATCH 4/4] net/mlx5: sync stop/start of datapath with secondary process Yongseok Koh
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 45+ messages in thread
From: Yongseok Koh @ 2019-03-07  7:33 UTC (permalink / raw)
  To: shahafs; +Cc: dev

There is a growing need for a PMD global data structure. It should be
initialized once per process regardless of how many PMD instances are
probed. mlx5_init_once() is called during probing and makes sure all the
init functions are called once per process. The existing shared memory
is used more extensively for this purpose. As there can be multiple
secondary processes, static storage (local to each process) is also added.
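
The once-per-process guard can be modelled without DPDK as below. This is a
sketch under assumptions: the struct and function names are hypothetical, the
counters exist only to make the behaviour observable, and the locking that the
real code needs around the shared state is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the EAL process type. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

/* Lives in shared memory: one instance for primary and all secondaries. */
struct shared_data {
	bool init_done;    /* Primary-side init has completed. */
	int secondary_cnt; /* Number of initialized secondary processes. */
	int primary_inits; /* Demo counter: times primary init actually ran. */
};

/* Static storage: one instance per process. */
struct local_data {
	bool init_done;      /* This process has completed its init. */
	int secondary_inits; /* Demo counter: times secondary init ran here. */
};

/* Called on every device probe; does real work only the first time. */
static int
init_once(enum proc_type type, struct shared_data *sd, struct local_data *ld)
{
	switch (type) {
	case PROC_PRIMARY:
		if (sd->init_done)
			break;
		sd->primary_inits++; /* Global setup would happen here. */
		sd->init_done = true;
		break;
	case PROC_SECONDARY:
		if (ld->init_done)
			break;
		ld->secondary_inits++; /* Per-process setup would happen here. */
		sd->secondary_cnt++;
		ld->init_done = true;
		break;
	}
	return 0;
}
```

The primary can keep its flag in shared memory because there is only one
primary; each secondary needs its own flag, hence the process-local structure.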

As the reserved virtual address for UAR remap is a PMD global resource, it
does not need to be stored in the device priv structure; it is stored in the
PMD global data instead.
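
Reserving address space without committing memory, as the UAR code does, can
be sketched with plain POSIX mmap(). reserve_va() and release_va() are
hypothetical helper names; the hint address is only a request, so a caller
that needs a specific address has to compare the result with the hint.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* For MAP_ANONYMOUS under strict -std modes. */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Reserve size bytes of virtual address space, preferably at hint.
 * PROT_NONE plus an anonymous private mapping consumes no real memory.
 * Returns the reserved address, or NULL on failure.
 */
static void *
reserve_va(void *hint, size_t size)
{
	void *addr = mmap(hint, size, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return addr == MAP_FAILED ? NULL : addr;
}

/* Release a reservation made by reserve_va(). Returns 0 on success. */
static int
release_va(void *addr, size_t size)
{
	return munmap(addr, size);
}
```

A second process that must use the same range passes the first process's
address as the hint and treats any other returned address as a conflict.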

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/mlx5.c     | 250 ++++++++++++++++++++++++++++++++------------
 drivers/net/mlx5/mlx5.h     |  19 +++-
 drivers/net/mlx5/mlx5_mp.c  |  19 +++-
 drivers/net/mlx5/mlx5_txq.c |   7 +-
 4 files changed, 217 insertions(+), 78 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 6ed2418106..ea8fd55ee6 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -128,16 +128,26 @@ struct mlx5_shared_data *mlx5_shared_data;
 /* Spinlock for mlx5_shared_data allocation. */
 static rte_spinlock_t mlx5_shared_data_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* Process local data for secondary processes. */
+static struct mlx5_local_data mlx5_local_data;
+
 /** Driver-specific log messages type. */
 int mlx5_logtype;
 
 /**
- * Prepare shared data between primary and secondary process.
+ * Initialize shared data between primary and secondary process.
+ *
+ * A memzone is reserved by primary process and secondary processes attach to
+ * the memzone.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static void
-mlx5_prepare_shared_data(void)
+static int
+mlx5_init_shared_data(void)
 {
 	const struct rte_memzone *mz;
+	int ret = 0;
 
 	rte_spinlock_lock(&mlx5_shared_data_lock);
 	if (mlx5_shared_data == NULL) {
@@ -146,22 +156,53 @@ mlx5_prepare_shared_data(void)
 			mz = rte_memzone_reserve(MZ_MLX5_PMD_SHARED_DATA,
 						 sizeof(*mlx5_shared_data),
 						 SOCKET_ID_ANY, 0);
+			if (mz == NULL) {
+				DRV_LOG(ERR,
+					"Cannot allocate mlx5 shared data\n");
+				ret = -rte_errno;
+				goto error;
+			}
+			mlx5_shared_data = mz->addr;
+			memset(mlx5_shared_data, 0, sizeof(*mlx5_shared_data));
+			rte_spinlock_init(&mlx5_shared_data->lock);
 		} else {
 			/* Lookup allocated shared memory. */
 			mz = rte_memzone_lookup(MZ_MLX5_PMD_SHARED_DATA);
+			if (mz == NULL) {
+				DRV_LOG(ERR,
+					"Cannot attach mlx5 shared data\n");
+				ret = -rte_errno;
+				goto error;
+			}
+			mlx5_shared_data = mz->addr;
+			memset(&mlx5_local_data, 0, sizeof(mlx5_local_data));
 		}
-		if (mz == NULL)
-			rte_panic("Cannot allocate mlx5 shared data\n");
-		mlx5_shared_data = mz->addr;
-		/* Initialize shared data. */
+	}
+error:
+	rte_spinlock_unlock(&mlx5_shared_data_lock);
+	return ret;
+}
+
+/**
+ * Uninitialize shared data between primary and secondary process.
+ *
+ * A secondary process resets its local data, while the primary process frees
+ * the memzone. The pointer is cleared in both cases.
+ */
+static void
+mlx5_uninit_shared_data(void)
+{
+	const struct rte_memzone *mz;
+
+	rte_spinlock_lock(&mlx5_shared_data_lock);
+	if (mlx5_shared_data) {
 		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			LIST_INIT(&mlx5_shared_data->mem_event_cb_list);
-			rte_rwlock_init(&mlx5_shared_data->mem_event_rwlock);
-			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
-							mlx5_mr_mem_event_cb,
-							NULL);
-			mlx5_mp_init();
+			mz = rte_memzone_lookup(MZ_MLX5_PMD_SHARED_DATA);
+			rte_memzone_free(mz);
+		} else {
+			memset(&mlx5_local_data, 0, sizeof(mlx5_local_data));
 		}
+		mlx5_shared_data = NULL;
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
 }
@@ -597,15 +638,6 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 
 static struct rte_pci_driver mlx5_driver;
 
-/*
- * Reserved UAR address space for TXQ UAR(hw doorbell) mapping, process
- * local resource used by both primary and secondary to avoid duplicate
- * reservation.
- * The space has to be available on both primary and secondary process,
- * TXQ UAR maps to this area using fixed mmap w/o double check.
- */
-static void *uar_base;
-
 static int
 find_lower_va_bound(const struct rte_memseg_list *msl,
 		const struct rte_memseg *ms, void *arg)
@@ -625,25 +657,24 @@ find_lower_va_bound(const struct rte_memseg_list *msl,
 /**
  * Reserve UAR address space for primary process.
  *
- * @param[in] dev
- *   Pointer to Ethernet device.
+ * Process local resource is used by both primary and secondary to avoid
+ * duplicate reservation. The space has to be available on both primary and
+ * secondary process, TXQ UAR maps to this area using fixed mmap w/o double
+ * check.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_uar_init_primary(struct rte_eth_dev *dev)
+mlx5_uar_init_primary(void)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_shared_data *sd = mlx5_shared_data;
 	void *addr = (void *)0;
 
-	if (uar_base) { /* UAR address space mapped. */
-		priv->uar_base = uar_base;
+	if (sd->uar_base)
 		return 0;
-	}
 	/* find out lower bound of hugepage segments */
 	rte_memseg_walk(find_lower_va_bound, &addr);
-
 	/* keep distance to hugepages to minimize potential conflicts. */
 	addr = RTE_PTR_SUB(addr, (uintptr_t)(MLX5_UAR_OFFSET + MLX5_UAR_SIZE));
 	/* anonymous mmap, no real memory consumption. */
@@ -651,65 +682,154 @@ mlx5_uar_init_primary(struct rte_eth_dev *dev)
 		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 	if (addr == MAP_FAILED) {
 		DRV_LOG(ERR,
-			"port %u failed to reserve UAR address space, please"
-			" adjust MLX5_UAR_SIZE or try --base-virtaddr",
-			dev->data->port_id);
+			"Failed to reserve UAR address space, please"
+			" adjust MLX5_UAR_SIZE or try --base-virtaddr");
 		rte_errno = ENOMEM;
 		return -rte_errno;
 	}
 	/* Accept either same addr or a new addr returned from mmap if target
 	 * range occupied.
 	 */
-	DRV_LOG(INFO, "port %u reserved UAR address space: %p",
-		dev->data->port_id, addr);
-	priv->uar_base = addr; /* for primary and secondary UAR re-mmap. */
-	uar_base = addr; /* process local, don't reserve again. */
+	DRV_LOG(INFO, "Reserved UAR address space: %p", addr);
+	sd->uar_base = addr; /* for primary and secondary UAR re-mmap. */
 	return 0;
 }
 
 /**
- * Reserve UAR address space for secondary process, align with
- * primary process.
- *
- * @param[in] dev
- *   Pointer to Ethernet device.
+ * Unmap UAR address space reserved for primary process.
+ */
+static void
+mlx5_uar_uninit_primary(void)
+{
+	struct mlx5_shared_data *sd = mlx5_shared_data;
+
+	if (!sd->uar_base)
+		return;
+	munmap(sd->uar_base, MLX5_UAR_SIZE);
+	sd->uar_base = NULL;
+}
+
+/**
+ * Reserve UAR address space for secondary process, align with primary process.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_uar_init_secondary(struct rte_eth_dev *dev)
+mlx5_uar_init_secondary(void)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_shared_data *sd = mlx5_shared_data;
+	struct mlx5_local_data *ld = &mlx5_local_data;
 	void *addr;
 
-	assert(priv->uar_base);
-	if (uar_base) { /* already reserved. */
-		assert(uar_base == priv->uar_base);
+	if (ld->uar_base) { /* Already reserved. */
+		assert(sd->uar_base == ld->uar_base);
 		return 0;
 	}
+	assert(sd->uar_base);
 	/* anonymous mmap, no real memory consumption. */
-	addr = mmap(priv->uar_base, MLX5_UAR_SIZE,
+	addr = mmap(sd->uar_base, MLX5_UAR_SIZE,
 		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 	if (addr == MAP_FAILED) {
-		DRV_LOG(ERR, "port %u UAR mmap failed: %p size: %llu",
-			dev->data->port_id, priv->uar_base, MLX5_UAR_SIZE);
+		DRV_LOG(ERR, "UAR mmap failed: %p size: %llu",
+			sd->uar_base, MLX5_UAR_SIZE);
 		rte_errno = ENXIO;
 		return -rte_errno;
 	}
-	if (priv->uar_base != addr) {
+	if (sd->uar_base != addr) {
 		DRV_LOG(ERR,
-			"port %u UAR address %p size %llu occupied, please"
+			"UAR address %p size %llu occupied, please"
 			" adjust MLX5_UAR_OFFSET or try EAL parameter"
 			" --base-virtaddr",
-			dev->data->port_id, priv->uar_base, MLX5_UAR_SIZE);
+			sd->uar_base, MLX5_UAR_SIZE);
 		rte_errno = ENXIO;
 		return -rte_errno;
 	}
-	uar_base = addr; /* process local, don't reserve again */
-	DRV_LOG(INFO, "port %u reserved UAR address space: %p",
-		dev->data->port_id, addr);
+	ld->uar_base = addr;
+	DRV_LOG(INFO, "Reserved UAR address space: %p", addr);
+	return 0;
+}
+
+/**
+ * Unmap UAR address space reserved for secondary process.
+ */
+static void
+mlx5_uar_uninit_secondary(void)
+{
+	struct mlx5_local_data *ld = &mlx5_local_data;
+
+	if (!ld->uar_base)
+		return;
+	munmap(ld->uar_base, MLX5_UAR_SIZE);
+	ld->uar_base = NULL;
+}
+
+/**
+ * PMD global initialization.
+ *
+ * Independent from individual device, this function initializes global
+ * per-PMD data structures distinguishing primary and secondary processes.
+ * Hence, each initialization is called once per a process.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_init_once(void)
+{
+	struct mlx5_shared_data *sd;
+	struct mlx5_local_data *ld = &mlx5_local_data;
+	int ret;
+
+	if (mlx5_init_shared_data())
+		return -rte_errno;
+	sd = mlx5_shared_data;
+	assert(sd);
+	rte_spinlock_lock(&sd->lock);
+	switch (rte_eal_process_type()) {
+	case RTE_PROC_PRIMARY:
+		if (sd->init_done)
+			break;
+		LIST_INIT(&sd->mem_event_cb_list);
+		rte_rwlock_init(&sd->mem_event_rwlock);
+		rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
+						mlx5_mr_mem_event_cb, NULL);
+		mlx5_mp_init_primary();
+		ret = mlx5_uar_init_primary();
+		if (ret)
+			goto error;
+		sd->init_done = true;
+		break;
+	case RTE_PROC_SECONDARY:
+		if (ld->init_done)
+			break;
+		ret = mlx5_uar_init_secondary();
+		if (ret)
+			goto error;
+		++sd->secondary_cnt;
+		ld->init_done = true;
+		break;
+	default:
+		break;
+	}
+	rte_spinlock_unlock(&sd->lock);
 	return 0;
+error:
+	switch (rte_eal_process_type()) {
+	case RTE_PROC_PRIMARY:
+		mlx5_uar_uninit_primary();
+		mlx5_mp_uninit_primary();
+		rte_mem_event_callback_unregister("MLX5_MEM_EVENT_CB", NULL);
+		break;
+	case RTE_PROC_SECONDARY:
+		mlx5_uar_uninit_secondary();
+		break;
+	default:
+		break;
+	}
+	rte_spinlock_unlock(&sd->lock);
+	mlx5_uninit_shared_data();
+	return -rte_errno;
 }
 
 /**
@@ -794,8 +914,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		rte_errno = EEXIST;
 		return NULL;
 	}
-	/* Prepare shared data between primary and secondary process. */
-	mlx5_prepare_shared_data();
 	errno = 0;
 	ctx = mlx5_glue->dv_open_device(ibv_dev);
 	if (ctx) {
@@ -922,11 +1040,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		}
 		eth_dev->device = dpdk_dev;
 		eth_dev->dev_ops = &mlx5_dev_sec_ops;
-		err = mlx5_uar_init_secondary(eth_dev);
-		if (err) {
-			err = rte_errno;
-			goto error;
-		}
 		/* Receive command fd from primary process */
 		err = mlx5_mp_req_verbs_cmd_fd(eth_dev);
 		if (err < 0) {
@@ -1143,11 +1256,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
 	eth_dev->device = dpdk_dev;
-	err = mlx5_uar_init_primary(eth_dev);
-	if (err) {
-		err = rte_errno;
-		goto error;
-	}
 	/* Configure the first MAC address by default. */
 	if (mlx5_get_mac(eth_dev, &mac.addr_bytes)) {
 		DRV_LOG(ERR,
@@ -1361,6 +1469,12 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_config dev_config;
 	int ret;
 
+	ret = mlx5_init_once();
+	if (ret) {
+		DRV_LOG(ERR, "unable to init PMD global data: %s",
+			strerror(rte_errno));
+		return -rte_errno;
+	}
 	assert(pci_drv == &mlx5_driver);
 	errno = 0;
 	ibv_list = mlx5_glue->get_device_list(&ret);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 80da97bd46..4b606190f0 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -81,12 +81,25 @@ struct mlx5_switch_info {
 
 LIST_HEAD(mlx5_dev_list, mlx5_priv);
 
-/* Shared memory between primary and secondary processes. */
+/* Shared data between primary and secondary processes. */
 struct mlx5_shared_data {
+	rte_spinlock_t lock;
+	/* Global spinlock for primary and secondary processes. */
+	int init_done; /* Whether primary has done initialization. */
+	unsigned int secondary_cnt; /* Number of secondary processes init'd. */
+	void *uar_base;
+	/* Reserved UAR address space for TXQ UAR(hw doorbell) mapping. */
 	struct mlx5_dev_list mem_event_cb_list;
 	rte_rwlock_t mem_event_rwlock;
 };
 
+/* Per-process data structure, not visible to other processes. */
+struct mlx5_local_data {
+	int init_done; /* Whether a secondary has done initialization. */
+	void *uar_base;
+	/* Reserved UAR address space for TXQ UAR(hw doorbell) mapping. */
+};
+
 extern struct mlx5_shared_data *mlx5_shared_data;
 
 struct mlx5_counter_ctrl {
@@ -256,7 +269,6 @@ struct mlx5_priv {
 	uint32_t link_speed_capa; /* Link speed capabilities. */
 	struct mlx5_xstats_ctrl xstats_ctrl; /* Extended stats control. */
 	struct mlx5_stats_ctrl stats_ctrl; /* Stats control. */
-	void *uar_base; /* Reserved address space for UAR mapping */
 	struct mlx5_dev_config config; /* Device configuration. */
 	struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
 	/* Context for Verbs allocator. */
@@ -414,7 +426,8 @@ void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
 /* mlx5_mp.c */
 int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
-void mlx5_mp_init(void);
+void mlx5_mp_init_primary(void);
+void mlx5_mp_uninit_primary(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
index 19a1f25f0e..5f18c8f5c1 100644
--- a/drivers/net/mlx5/mlx5_mp.c
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -118,9 +118,22 @@ mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
 	return cmd_fd;
 }
 
+/**
+ * Initialize by primary process.
+ */
+void
+mlx5_mp_init_primary(void)
+{
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+}
+
+/**
+ * Un-initialize by primary process.
+ */
 void
-mlx5_mp_init(void)
+mlx5_mp_uninit_primary(void)
 {
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	rte_mp_action_unregister(MLX5_MP_NAME);
 }
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index d18561740f..5640fe1b91 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -286,7 +286,7 @@ mlx5_tx_uar_remap(struct rte_eth_dev *dev, int fd)
 			}
 		}
 		/* new address in reserved UAR address space. */
-		addr = RTE_PTR_ADD(priv->uar_base,
+		addr = RTE_PTR_ADD(mlx5_shared_data->uar_base,
 				   uar_va & (uintptr_t)(MLX5_UAR_SIZE - 1));
 		if (!already_mapped) {
 			pages[pages_n++] = uar_va;
@@ -844,9 +844,8 @@ mlx5_txq_release(struct rte_eth_dev *dev, uint16_t idx)
 	txq = container_of((*priv->txqs)[idx], struct mlx5_txq_ctrl, txq);
 	if (txq->ibv && !mlx5_txq_ibv_release(txq->ibv))
 		txq->ibv = NULL;
-	if (priv->uar_base)
-		munmap((void *)RTE_ALIGN_FLOOR((uintptr_t)txq->txq.bf_reg,
-		       page_size), page_size);
+	munmap((void *)RTE_ALIGN_FLOOR((uintptr_t)txq->txq.bf_reg, page_size),
+	       page_size);
 	if (rte_atomic32_dec_and_test(&txq->refcnt)) {
 		txq_free_elts(txq);
 		mlx5_mr_btree_free(&txq->txq.mr_ctrl.cache_bh);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH 4/4] net/mlx5: sync stop/start of datapath with secondary process
  2019-03-07  7:33 [dpdk-dev] [PATCH 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
                   ` (2 preceding siblings ...)
  2019-03-07  7:33 ` [dpdk-dev] [PATCH 3/4] net/mlx5: rework PMD global data init Yongseok Koh
@ 2019-03-07  7:33 ` Yongseok Koh
  2019-03-25 19:15 ` [dpdk-dev] [PATCH v2 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
  2019-04-01 21:12 ` [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
  5 siblings, 0 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-07  7:33 UTC (permalink / raw)
  To: shahafs; +Cc: dev

Rx/Tx burst function pointers are stored in the rte_eth_dev structure,
which is local to each process. When the primary process replaces its
function pointers, secondary processes keep running their own copies. With
the rte_mp APIs, the primary can broadcast a request to stop/start the
datapath of secondary processes.
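
The secondary-side half of this handshake can be sketched in isolation as
below. This is only an illustration under stated assumptions, not the driver
code: struct fake_dev, the FAKE_REQ_* values, stopped_burst() and
active_burst() are hypothetical stand-ins for rte_eth_dev, the
MLX5_MP_REQ_* types and the real burst routines, and a C11 fence stands in
for rte_mb().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-process burst callbacks. */
typedef uint16_t (*burst_fn)(void *queue, void **pkts, uint16_t n);

struct fake_dev {
	burst_fn rx_burst;
	burst_fn tx_burst;
};

enum fake_req { FAKE_REQ_START_RXTX = 1, FAKE_REQ_STOP_RXTX };

/* Dummy burst installed while the datapath is stopped: handles nothing. */
static uint16_t
stopped_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue; (void)pkts; (void)n;
	return 0;
}

/* Placeholder for a real burst routine: pretends all packets were handled. */
static uint16_t
active_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue; (void)pkts;
	return n;
}

/* Apply a start/stop request to this process's own copy of the pointers. */
static int
handle_rxtx_request(struct fake_dev *dev, int type)
{
	switch (type) {
	case FAKE_REQ_START_RXTX:
		/* Fence first, then publish the working callbacks. */
		atomic_thread_fence(memory_order_seq_cst);
		dev->rx_burst = active_burst;
		dev->tx_burst = active_burst;
		return 0;
	case FAKE_REQ_STOP_RXTX:
		/* Install the dummies, then fence before replying. */
		dev->rx_burst = stopped_burst;
		dev->tx_burst = stopped_burst;
		atomic_thread_fence(memory_order_seq_cst);
		return 0;
	default:
		return -EINVAL;
	}
}
```

Each process would run such a handler against its own device structure, which
is why the primary has to ask rather than write the pointers itself.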

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/mlx5.c         |   5 ++
 drivers/net/mlx5/mlx5.h         |   6 ++
 drivers/net/mlx5/mlx5_mp.c      | 144 ++++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.c    |   2 +
 drivers/net/mlx5/mlx5_trigger.c |   5 ++
 5 files changed, 162 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index ea8fd55ee6..dd9106296c 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -305,6 +305,9 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	/* Prevent crashes when queues are still in use. */
 	dev->rx_pkt_burst = removed_rx_burst;
 	dev->tx_pkt_burst = removed_tx_burst;
+	rte_wmb();
+	/* Disable datapath on secondary process. */
+	mlx5_mp_req_stop_rxtx(dev);
 	if (priv->rxqs != NULL) {
 		/* XXX race condition if mlx5_rx_burst() is still running. */
 		usleep(1000);
@@ -803,6 +806,7 @@ mlx5_init_once(void)
 	case RTE_PROC_SECONDARY:
 		if (ld->init_done)
 			break;
+		mlx5_mp_init_secondary();
 		ret = mlx5_uar_init_secondary();
 		if (ret)
 			goto error;
@@ -823,6 +827,7 @@ mlx5_init_once(void)
 		break;
 	case RTE_PROC_SECONDARY:
 		mlx5_uar_uninit_secondary();
+		mlx5_mp_uninit_secondary();
 		break;
 	default:
 		break;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 4b606190f0..9c042bc3da 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -59,6 +59,8 @@ enum {
 /* Request types for IPC. */
 enum mlx5_mp_req_type {
 	MLX5_MP_REQ_VERBS_CMD_FD = 1,
+	MLX5_MP_REQ_START_RXTX,
+	MLX5_MP_REQ_STOP_RXTX,
 };
 
 /* Pameters for IPC. */
@@ -425,9 +427,13 @@ int mlx5_flow_create_drop_queue(struct rte_eth_dev *dev);
 void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
 /* mlx5_mp.c */
+void mlx5_mp_req_start_rxtx(struct rte_eth_dev *dev);
+void mlx5_mp_req_stop_rxtx(struct rte_eth_dev *dev);
 int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
 void mlx5_mp_init_primary(void);
 void mlx5_mp_uninit_primary(void);
+void mlx5_mp_init_secondary(void);
+void mlx5_mp_uninit_secondary(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
index 5f18c8f5c1..623dfb8097 100644
--- a/drivers/net/mlx5/mlx5_mp.c
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -12,6 +12,7 @@
 #include <rte_string_fns.h>
 
 #include "mlx5.h"
+#include "mlx5_rxtx.h"
 #include "mlx5_utils.h"
 
 /**
@@ -78,6 +79,129 @@ mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
 }
 
 /**
+ * IPC message handler of a secondary process.
+ *
+ * @param[in] mp_msg
+ *   Pointer to the request message.
+ * @param[in] peer
+ *   Pointer to the peer socket path.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mp_secondary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_mp_msg mp_res;
+	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
+	const struct mlx5_mp_param *param =
+		(const struct mlx5_mp_param *)mp_msg->param;
+	struct rte_eth_dev *dev = &rte_eth_devices[param->port_id];
+	int ret = 0;
+
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	switch (param->type) {
+	case MLX5_MP_REQ_START_RXTX:
+		DRV_LOG(INFO, "port %u starting datapath", dev->data->port_id);
+		rte_mb();
+		dev->rx_pkt_burst = mlx5_select_rx_function(dev);
+		dev->tx_pkt_burst = mlx5_select_tx_function(dev);
+		mp_init_msg(dev, &mp_res, param->type);
+		res->result = 0;
+		ret = rte_mp_reply(&mp_res, peer);
+		break;
+	case MLX5_MP_REQ_STOP_RXTX:
+		DRV_LOG(INFO, "port %u stopping datapath", dev->data->port_id);
+		dev->rx_pkt_burst = removed_rx_burst;
+		dev->tx_pkt_burst = removed_tx_burst;
+		rte_mb();
+		mp_init_msg(dev, &mp_res, param->type);
+		res->result = 0;
+		ret = rte_mp_reply(&mp_res, peer);
+		break;
+	default:
+		rte_errno = EINVAL;
+		DRV_LOG(ERR, "port %u invalid mp request type",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	return ret;
+}
+
+/**
+ * Broadcast request of stopping/starting data-path to secondary processes.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ * @param[in] type
+ *   Request type.
+ */
+static void
+mp_req_on_rxtx(struct rte_eth_dev *dev, enum mlx5_mp_req_type type)
+{
+	struct rte_mp_msg mp_req;
+	struct rte_mp_msg *mp_res;
+	struct rte_mp_reply mp_rep;
+	struct mlx5_mp_param *res __rte_unused;
+	struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	if (!mlx5_shared_data->secondary_cnt)
+		return;
+	if (type != MLX5_MP_REQ_START_RXTX && type != MLX5_MP_REQ_STOP_RXTX) {
+		DRV_LOG(ERR, "port %u unknown request (req_type %d)",
+			dev->data->port_id, type);
+		return;
+	}
+	mp_init_msg(dev, &mp_req, type);
+	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
+	if (ret) {
+		DRV_LOG(ERR,
+			"port %u failed to request stop/start Rx/Tx (%d)",
+			dev->data->port_id, type);
+		goto exit;
+	}
+	if (mp_rep.nb_sent != mp_rep.nb_received) {
+		DRV_LOG(ERR,
+			"port %u not all secondaries responded (req_type %d)",
+			dev->data->port_id, type);
+		goto exit;
+	}
+	mp_res = &mp_rep.msgs[0];
+	res = (struct mlx5_mp_param *)mp_res->param;
+	assert(!res->result);
+exit:
+	free(mp_rep.msgs);
+}
+
+/**
+ * Broadcast request of starting data-path to secondary processes. The request
+ * is synchronous.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ */
+void
+mlx5_mp_req_start_rxtx(struct rte_eth_dev *dev)
+{
+	mp_req_on_rxtx(dev, MLX5_MP_REQ_START_RXTX);
+}
+
+/**
+ * Broadcast request of stopping data-path to secondary processes. The request
+ * is synchronous.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ */
+void
+mlx5_mp_req_stop_rxtx(struct rte_eth_dev *dev)
+{
+	mp_req_on_rxtx(dev, MLX5_MP_REQ_STOP_RXTX);
+}
+
+/**
  * Request Verbs command file descriptor for mmap to the primary process.
  *
  * @param[in] dev
@@ -137,3 +261,23 @@ mlx5_mp_uninit_primary(void)
 	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
 	rte_mp_action_unregister(MLX5_MP_NAME);
 }
+
+/**
+ * Initialize by secondary process.
+ */
+void
+mlx5_mp_init_secondary(void)
+{
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	rte_mp_action_register(MLX5_MP_NAME, mp_secondary_handle);
+}
+
+/**
+ * Un-initialize by secondary process.
+ */
+void
+mlx5_mp_uninit_secondary(void)
+{
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	rte_mp_action_unregister(MLX5_MP_NAME);
+}
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index baa4079c14..b6946eff4a 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -2372,6 +2372,7 @@ removed_tx_burst(void *dpdk_txq __rte_unused,
 		 struct rte_mbuf **pkts __rte_unused,
 		 uint16_t pkts_n __rte_unused)
 {
+	rte_mb();
 	return 0;
 }
 
@@ -2396,6 +2397,7 @@ removed_rx_burst(void *dpdk_txq __rte_unused,
 		 struct rte_mbuf **pkts __rte_unused,
 		 uint16_t pkts_n __rte_unused)
 {
+	rte_mb();
 	return 0;
 }
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 2137bdc461..7c9ff921ab 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -194,8 +194,11 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 			dev->data->port_id);
 		goto error;
 	}
+	rte_wmb();
 	dev->tx_pkt_burst = mlx5_select_tx_function(dev);
 	dev->rx_pkt_burst = mlx5_select_rx_function(dev);
+	/* Enable datapath on secondary process. */
+	mlx5_mp_req_start_rxtx(dev);
 	mlx5_dev_interrupt_handler_install(dev);
 	return 0;
 error:
@@ -228,6 +231,8 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 	dev->rx_pkt_burst = removed_rx_burst;
 	dev->tx_pkt_burst = removed_tx_burst;
 	rte_wmb();
+	/* Disable datapath on secondary process. */
+	mlx5_mp_req_stop_rxtx(dev);
 	usleep(1000 * priv->rxqs_n);
 	DRV_LOG(DEBUG, "port %u stopping device", dev->data->port_id);
 	mlx5_flow_stop(dev, &priv->flows);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 3/4] net/mlx5: rework PMD global data init
  2019-03-07  7:33 ` [dpdk-dev] [PATCH 3/4] net/mlx5: rework PMD global data init Yongseok Koh
@ 2019-03-14 12:36   ` Shahaf Shuler
  2019-03-14 12:36     ` Shahaf Shuler
  2019-03-18 21:21     ` Yongseok Koh
  0 siblings, 2 replies; 45+ messages in thread
From: Shahaf Shuler @ 2019-03-14 12:36 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: dev

Hi Koh,

Thursday, March 7, 2019 9:33 AM, Yongseok Koh:
> Subject: [PATCH 3/4] net/mlx5: rework PMD global data init
> 
> There's more need to have PMD global data structure. It should be initialized
> once per a process regardless of how many PMD instances are probed.
> mlx5_init_once() is called during probing and make sure all the init functions
> are called once per a process. The existing shared memory gets more
> extensively used for this purpose. As there could be multiple secondary
> processes, a static storage (local to process) is also added.

It is hard to understand from the commit log what was missing in the old design.

> 
> As the reserved virtual address for UAR remap is a PMD global resource, this
> doesn't need to be stored in the device priv structure, but in the PMD global
> data.

I thought we agreed to drop those and have a different VA for each process.
If so, is the extra work on the UAR here needed?
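
For context, the reservation under discussion boils down to an anonymous
PROT_NONE mapping requested at a fixed hint, followed by a check that the
kernel honoured the hint. A minimal Linux sketch follows, assuming
hypothetical helper names reserve_va() and release_va(). Unlike the patch,
this sketch unmaps a mapping that landed at the wrong address.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Reserve `size` bytes of address space at `hint` without committing memory.
 * Returns the reserved address, or NULL if the mapping failed or the kernel
 * placed it somewhere other than the requested hint.
 */
static void *
reserve_va(void *hint, size_t size)
{
	void *addr = mmap(hint, size, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (addr == MAP_FAILED)
		return NULL;
	if (hint != NULL && addr != hint) {
		/* Range was occupied; drop the misplaced mapping. */
		munmap(addr, size);
		return NULL;
	}
	return addr;
}

/* Give the reserved range back. */
static void
release_va(void *addr, size_t size)
{
	if (addr != NULL)
		munmap(addr, size);
}
```

With a per-process VA, the hint check is the part that would go away; the
reservation itself would not need to match the primary's address.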

> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5.c     | 250 ++++++++++++++++++++++++++++++++------------
>  drivers/net/mlx5/mlx5.h     |  19 +++-
>  drivers/net/mlx5/mlx5_mp.c  |  19 +++-
>  drivers/net/mlx5/mlx5_txq.c |   7 +-
>  4 files changed, 217 insertions(+), 78 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index 6ed2418106..ea8fd55ee6 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -128,16 +128,26 @@ struct mlx5_shared_data *mlx5_shared_data;
>  /* Spinlock for mlx5_shared_data allocation. */
>  static rte_spinlock_t mlx5_shared_data_lock = RTE_SPINLOCK_INITIALIZER;
> 
> +/* Process local data for secondary processes. */
> +static struct mlx5_local_data mlx5_local_data;

Why not store this context in ethdev->process_private instead of declaring it static?

> +
>  /** Driver-specific log messages type. */
>  int mlx5_logtype;
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 2/4] net/mlx5: replace IPC socket with EAL API
  2019-03-07  7:33 ` [dpdk-dev] [PATCH 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
@ 2019-03-14 12:36   ` Shahaf Shuler
  2019-03-14 12:36     ` Shahaf Shuler
  2019-03-18 21:29     ` Yongseok Koh
  0 siblings, 2 replies; 45+ messages in thread
From: Shahaf Shuler @ 2019-03-14 12:36 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: dev

Hi Koh, 

Thursday, March 7, 2019 9:33 AM, Yongseok Koh:
> Subject: [PATCH 2/4] net/mlx5: replace IPC socket with EAL API
> 
> Socket API is used for IPC in order for secondary process to acquire Verb
> command file descriptor. The FD is used to remap UAR address. The new
> multi-process APIs (rte_mp) in EAL are newly introduced. mlx5_socket.c is
> replaced with mlx5_mp.c, which uses the new APIs.
> 
> As it is PMD global infrastructure, only one IPC channel is established.
> All the IPC message types may have port_id in the message if there is need
> to reference a specific device.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

[...]

>  /**
> diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
> new file mode 100644
> index 0000000000..19a1f25f0e
> --- /dev/null
> +++ b/drivers/net/mlx5/mlx5_mp.c
> @@ -0,0 +1,126 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2019 6WIND S.A.
> + * Copyright 2019 Mellanox Technologies, Ltd
> + */
> +
> +#include <assert.h>
> +#include <stdio.h>
> +#include <time.h>
> +
> +#include <rte_eal.h>
> +#include <rte_ethdev_driver.h>
> +#include <rte_string_fns.h>
> +
> +#include "mlx5.h"
> +#include "mlx5_utils.h"
> +
> +/**
> + * Initialize IPC message.
> + *
> + * @param[in] dev
> + *   Pointer to Ethernet structure.
> + * @param[out] msg
> + *   Pointer to message to fill in.
> + * @param[in] type
> + *   Message type.
> + */
> +static inline void
> +mp_init_msg(struct rte_eth_dev *dev, struct rte_mp_msg *msg,
> +	    enum mlx5_mp_req_type type)
> +{
> +	struct mlx5_mp_param *param = (struct mlx5_mp_param *)msg->param;
> +
> +	memset(msg, 0, sizeof(*msg));
> +	strlcpy(msg->name, MLX5_MP_NAME, sizeof(msg->name));
> +	msg->len_param = sizeof(*param);
> +	param->type = type;
> +	param->port_id = dev->data->port_id;
> +}
> +
> +/**
> + * IPC message handler of primary process.
> + *
> + * @param[in] dev
> + *   Pointer to Ethernet structure.
> + * @param[in] peer
> + *   Pointer to the peer socket path.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
> +{
> +	struct rte_mp_msg mp_res;
> +	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
> +	const struct mlx5_mp_param *param =
> +		(const struct mlx5_mp_param *)mp_msg->param;
> +	struct rte_eth_dev *dev = &rte_eth_devices[param->port_id];

Need to check that dev is valid before continuing. If it is not, log an error (as you do for an invalid request type).
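
A minimal sketch of such a check, written against a mock port table rather
than rte_eth_devices[] (struct fake_port, FAKE_MAX_PORTS and port_lookup()
are hypothetical names). In the driver itself, a helper such as
rte_eth_dev_is_valid_port() would presumably be the natural test, but that
is an assumption about how the fix would be written:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_MAX_PORTS 4 /* hypothetical table size */

/* Hypothetical stand-in for an ethdev slot. */
struct fake_port {
	int attached; /* non-zero once the port has been probed */
	void *priv;
};

static struct fake_port fake_ports[FAKE_MAX_PORTS];

/*
 * Resolve a port_id taken from an untrusted IPC message.
 * Returns 0 and sets *out on success, -ENODEV if the id is out of range
 * or refers to a slot that was never attached.
 */
static int
port_lookup(uint16_t port_id, struct fake_port **out)
{
	if (port_id >= FAKE_MAX_PORTS || !fake_ports[port_id].attached)
		return -ENODEV;
	*out = &fake_ports[port_id];
	return 0;
}
```

The handler would call this before touching dev->data and log an error on
failure, so a stale or bogus port_id never reaches dev_private.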

> +	struct mlx5_priv *priv = dev->data->dev_private;
> +	int ret = 0;
> +
> +	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
> +	switch (param->type) {
> +	case MLX5_MP_REQ_VERBS_CMD_FD:
> +		mp_init_msg(dev, &mp_res, param->type);
> +		mp_res.num_fds = 1;
> +		mp_res.fds[0] = priv->ctx->cmd_fd;
> +		res->result = 0;
> +		ret = rte_mp_reply(&mp_res, peer);
> +		break;
> +	default:
> +		rte_errno = EINVAL;
> +		DRV_LOG(ERR, "port %u invalid mp request type",
> +			dev->data->port_id);
> +		return -rte_errno;
> +	}
> +	return ret;
> +}
> +
> +/**
> + * Request Verbs command file descriptor for mmap to the primary process.
> + *
> + * @param[in] dev
> + *   Pointer to Ethernet structure.
> + *
> + * @return
> + *   fd on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
> +{
> +	struct rte_mp_msg mp_req;
> +	struct rte_mp_msg *mp_res;
> +	struct rte_mp_reply mp_rep;
> +	struct mlx5_mp_param *res __rte_unused;
> +	struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
> +	int cmd_fd;
> +	int ret;
> +
> +	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
> +	mp_init_msg(dev, &mp_req, MLX5_MP_REQ_VERBS_CMD_FD);
> +	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
> +	if (ret) {
> +		DRV_LOG(ERR,
> +			"port %u failed to get command FD from primary process",
> +			dev->data->port_id);
> +		return -rte_errno;
> +	}
> +	assert(mp_rep.nb_received == 1);
> +	mp_res = &mp_rep.msgs[0];
> +	res = (struct mlx5_mp_param *)mp_res->param;
> +	assert(!res->result);

The above should not be an assert but rather an actual check.
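
Something along these lines, sketched with mock types (struct fake_msg,
struct fake_reply and reply_to_cmd_fd() are hypothetical; the real code
would read struct rte_mp_reply and struct mlx5_mp_param, and the choice of
error codes here is only an assumption):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, flattened stand-in for one reply message. */
struct fake_msg {
	int result;  /* status reported by the peer, 0 on success */
	int num_fds; /* number of file descriptors attached */
	int fds[8];
};

/* Hypothetical stand-in for the reply container. */
struct fake_reply {
	int nb_received;
	struct fake_msg *msgs;
};

/*
 * Validate a reply at run time instead of asserting on it.
 * Returns the command fd on success, a negative errno value otherwise.
 */
static int
reply_to_cmd_fd(const struct fake_reply *rep)
{
	const struct fake_msg *m;

	if (rep->nb_received != 1 || rep->msgs == NULL)
		return -EINVAL;
	m = &rep->msgs[0];
	if (m->result != 0)
		return -EIO; /* peer reported a failure */
	if (m->num_fds != 1)
		return -EINVAL;
	return m->fds[0];
}
```

That way a misbehaving primary yields an error return in release builds
instead of an unchecked fd.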

> +	assert(mp_res->num_fds == 1);
> +	cmd_fd = mp_res->fds[0];
> +	free(mp_rep.msgs);
> +	DRV_LOG(DEBUG, "port %u command FD from primary is %d",
> +		dev->data->port_id, cmd_fd);
> +	return cmd_fd;
> +}
> +
> +void
> +mlx5_mp_init(void)
> +{
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
> +}
> diff --git a/drivers/net/mlx5/mlx5_socket.c b/drivers/net/mlx5/mlx5_socket.c
> deleted file mode 100644
> index 41cac3c6aa..0000000000
> --- a/drivers/net/mlx5/mlx5_socket.c
> +++ /dev/null
> @@ -1,306 +0,0 @@
> -/* SPDX-License-Identifier: BSD-3-Clause
> - * Copyright 2016 6WIND S.A.
> - * Copyright 2016 Mellanox Technologies, Ltd
> - */
> -
> -#include <sys/types.h>
> -#include <sys/socket.h>
> -#include <sys/un.h>
> -#include <fcntl.h>
> -#include <stdio.h>
> -#include <unistd.h>
> -#include <sys/stat.h>
> -
> -#include "mlx5.h"
> -#include "mlx5_utils.h"
> -
> -/**
> - * Initialise the socket to communicate with the secondary process
> - *
> - * @param[in] dev
> - *   Pointer to Ethernet device.
> - *
> - * @return
> - *   0 on success, a negative errno value otherwise and rte_errno is set.
> - */
> -int
> -mlx5_socket_init(struct rte_eth_dev *dev)
> -{
> -	struct mlx5_priv *priv = dev->data->dev_private;
> -	struct sockaddr_un sun = {
> -		.sun_family = AF_UNIX,
> -	};
> -	int ret;
> -	int flags;
> -
> -	/*
> -	 * Close the last socket that was used to communicate
> -	 * with the secondary process
> -	 */
> -	if (priv->primary_socket)
> -		mlx5_socket_uninit(dev);
> -	/*
> -	 * Initialise the socket to communicate with the secondary
> -	 * process.
> -	 */
> -	ret = socket(AF_UNIX, SOCK_STREAM, 0);
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
> -			dev->data->port_id, strerror(errno));
> -		goto error;
> -	}
> -	priv->primary_socket = ret;
> -	flags = fcntl(priv->primary_socket, F_GETFL, 0);
> -	if (flags == -1) {
> -		rte_errno = errno;
> -		goto error;
> -	}
> -	ret = fcntl(priv->primary_socket, F_SETFL, flags | O_NONBLOCK);
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		goto error;
> -	}
> -	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
> -		 MLX5_DRIVER_NAME, priv->primary_socket);
> -	remove(sun.sun_path);
> -	ret = bind(priv->primary_socket, (const struct sockaddr *)&sun,
> -		   sizeof(sun));
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING,
> -			"port %u cannot bind socket, secondary process not"
> -			" supported: %s",
> -			dev->data->port_id, strerror(errno));
> -		goto close;
> -	}
> -	ret = listen(priv->primary_socket, 0);
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
> -			dev->data->port_id, strerror(errno));
> -		goto close;
> -	}
> -	return 0;
> -close:
> -	remove(sun.sun_path);
> -error:
> -	claim_zero(close(priv->primary_socket));
> -	priv->primary_socket = 0;
> -	return -rte_errno;
> -}
> -
> -/**
> - * Un-Initialise the socket to communicate with the secondary process
> - *
> - * @param[in] dev
> - */
> -void
> -mlx5_socket_uninit(struct rte_eth_dev *dev)
> -{
> -	struct mlx5_priv *priv = dev->data->dev_private;
> -
> -	MKSTR(path, "/var/tmp/%s_%d", MLX5_DRIVER_NAME, priv->primary_socket);
> -	claim_zero(close(priv->primary_socket));
> -	priv->primary_socket = 0;
> -	claim_zero(remove(path));
> -}
> -
> -/**
> - * Handle socket interrupts.
> - *
> - * @param dev
> - *   Pointer to Ethernet device.
> - */
> -void
> -mlx5_socket_handle(struct rte_eth_dev *dev)
> -{
> -	struct mlx5_priv *priv = dev->data->dev_private;
> -	int conn_sock;
> -	int ret = 0;
> -	struct cmsghdr *cmsg = NULL;
> -	struct ucred *cred = NULL;
> -	char buf[CMSG_SPACE(sizeof(struct ucred))] = { 0 };
> -	char vbuf[1024] = { 0 };
> -	struct iovec io = {
> -		.iov_base = vbuf,
> -		.iov_len = sizeof(*vbuf),
> -	};
> -	struct msghdr msg = {
> -		.msg_iov = &io,
> -		.msg_iovlen = 1,
> -		.msg_control = buf,
> -		.msg_controllen = sizeof(buf),
> -	};
> -	int *fd;
> -
> -	/* Accept the connection from the client. */
> -	conn_sock = accept(priv->primary_socket, NULL, NULL);
> -	if (conn_sock < 0) {
> -		DRV_LOG(WARNING, "port %u connection failed: %s",
> -			dev->data->port_id, strerror(errno));
> -		return;
> -	}
> -	ret = setsockopt(conn_sock, SOL_SOCKET, SO_PASSCRED, &(int){1},
> -					 sizeof(int));
> -	if (ret < 0) {
> -		ret = errno;
> -		DRV_LOG(WARNING, "port %u cannot change socket options: %s",
> -			dev->data->port_id, strerror(rte_errno));
> -		goto error;
> -	}
> -	ret = recvmsg(conn_sock, &msg, MSG_WAITALL);
> -	if (ret < 0) {
> -		ret = errno;
> -		DRV_LOG(WARNING, "port %u received an empty message: %s",
> -			dev->data->port_id, strerror(rte_errno));
> -		goto error;
> -	}
> -	/* Expect to receive credentials only. */
> -	cmsg = CMSG_FIRSTHDR(&msg);
> -	if (cmsg == NULL) {
> -		DRV_LOG(WARNING, "port %u no message", dev->data->port_id);
> -		goto error;
> -	}
> -	if ((cmsg->cmsg_type == SCM_CREDENTIALS) &&
> -		(cmsg->cmsg_len >= sizeof(*cred))) {
> -		cred = (struct ucred *)CMSG_DATA(cmsg);
> -		assert(cred != NULL);
> -	}
> -	cmsg = CMSG_NXTHDR(&msg, cmsg);
> -	if (cmsg != NULL) {
> -		DRV_LOG(WARNING, "port %u message wrongly formatted",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	/* Make sure all the ancillary data was received and valid. */
> -	if ((cred == NULL) || (cred->uid != getuid()) ||
> -	    (cred->gid != getgid())) {
> -		DRV_LOG(WARNING, "port %u wrong credentials",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	/* Set-up the ancillary data. */
> -	cmsg = CMSG_FIRSTHDR(&msg);
> -	assert(cmsg != NULL);
> -	cmsg->cmsg_level = SOL_SOCKET;
> -	cmsg->cmsg_type = SCM_RIGHTS;
> -	cmsg->cmsg_len = CMSG_LEN(sizeof(priv->ctx->cmd_fd));
> -	fd = (int *)CMSG_DATA(cmsg);
> -	*fd = priv->ctx->cmd_fd;
> -	ret = sendmsg(conn_sock, &msg, 0);
> -	if (ret < 0)
> -		DRV_LOG(WARNING, "port %u cannot send response",
> -			dev->data->port_id);
> -error:
> -	close(conn_sock);
> -}
> -
> -/**
> - * Connect to the primary process.
> - *
> - * @param[in] dev
> - *   Pointer to Ethernet structure.
> - *
> - * @return
> - *   fd on success, negative errno value otherwise and rte_errno is set.
> - */
> -int
> -mlx5_socket_connect(struct rte_eth_dev *dev)
> -{
> -	struct mlx5_priv *priv = dev->data->dev_private;
> -	struct sockaddr_un sun = {
> -		.sun_family = AF_UNIX,
> -	};
> -	int socket_fd = -1;
> -	int *fd = NULL;
> -	int ret;
> -	struct ucred *cred;
> -	char buf[CMSG_SPACE(sizeof(*cred))] = { 0 };
> -	char vbuf[1024] = { 0 };
> -	struct iovec io = {
> -		.iov_base = vbuf,
> -		.iov_len = sizeof(*vbuf),
> -	};
> -	struct msghdr msg = {
> -		.msg_control = buf,
> -		.msg_controllen = sizeof(buf),
> -		.msg_iov = &io,
> -		.msg_iovlen = 1,
> -	};
> -	struct cmsghdr *cmsg;
> -
> -	ret = socket(AF_UNIX, SOCK_STREAM, 0);
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING, "port %u cannot connect to primary",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	socket_fd = ret;
> -	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
> -		 MLX5_DRIVER_NAME, priv->primary_socket);
> -	ret = connect(socket_fd, (const struct sockaddr *)&sun, sizeof(sun));
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING, "port %u cannot connect to primary",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	cmsg = CMSG_FIRSTHDR(&msg);
> -	if (cmsg == NULL) {
> -		rte_errno = EINVAL;
> -		DRV_LOG(DEBUG, "port %u cannot get first message",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	cmsg->cmsg_level = SOL_SOCKET;
> -	cmsg->cmsg_type = SCM_CREDENTIALS;
> -	cmsg->cmsg_len = CMSG_LEN(sizeof(*cred));
> -	cred = (struct ucred *)CMSG_DATA(cmsg);
> -	if (cred == NULL) {
> -		rte_errno = EINVAL;
> -		DRV_LOG(DEBUG, "port %u no credentials received",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	cred->pid = getpid();
> -	cred->uid = getuid();
> -	cred->gid = getgid();
> -	ret = sendmsg(socket_fd, &msg, MSG_DONTWAIT);
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING,
> -			"port %u cannot send credentials to primary: %s",
> -			dev->data->port_id, strerror(errno));
> -		goto error;
> -	}
> -	ret = recvmsg(socket_fd, &msg, MSG_WAITALL);
> -	if (ret <= 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING, "port %u no message from primary: %s",
> -			dev->data->port_id, strerror(errno));
> -		goto error;
> -	}
> -	cmsg = CMSG_FIRSTHDR(&msg);
> -	if (cmsg == NULL) {
> -		rte_errno = EINVAL;
> -		DRV_LOG(WARNING, "port %u no file descriptor received",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	fd = (int *)CMSG_DATA(cmsg);
> -	if (*fd < 0) {
> -		DRV_LOG(WARNING, "port %u no file descriptor received: %s",
> -			dev->data->port_id, strerror(errno));
> -		rte_errno = *fd;
> -		goto error;
> -	}
> -	ret = *fd;
> -	close(socket_fd);
> -	return ret;
> -error:
> -	if (socket_fd != -1)
> -		close(socket_fd);
> -	return -rte_errno;
> -}
> --
> 2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 2/4] net/mlx5: replace IPC socket with EAL API
  2019-03-14 12:36   ` Shahaf Shuler
@ 2019-03-14 12:36     ` Shahaf Shuler
  2019-03-18 21:29     ` Yongseok Koh
  1 sibling, 0 replies; 45+ messages in thread
From: Shahaf Shuler @ 2019-03-14 12:36 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: dev

Hi Koh, 

Thursday, March 7, 2019 9:33 AM, Yongseok Koh:
> Subject: [PATCH 2/4] net/mlx5: replace IPC socket with EAL API
> 
> The socket API is used for IPC so that a secondary process can acquire the
> Verbs command file descriptor. The FD is used to remap the UAR address. EAL
> now provides multi-process APIs (rte_mp), so mlx5_socket.c is replaced with
> mlx5_mp.c, which uses the new APIs.
> 
> As this is PMD global infrastructure, only one IPC channel is established.
> Any IPC message type may carry a port_id in the message if it needs to
> reference a specific device.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

[...]

>  /**
> diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
> new file mode 100644
> index 0000000000..19a1f25f0e
> --- /dev/null
> +++ b/drivers/net/mlx5/mlx5_mp.c
> @@ -0,0 +1,126 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2019 6WIND S.A.
> + * Copyright 2019 Mellanox Technologies, Ltd
> + */
> +
> +#include <assert.h>
> +#include <stdio.h>
> +#include <time.h>
> +
> +#include <rte_eal.h>
> +#include <rte_ethdev_driver.h>
> +#include <rte_string_fns.h>
> +
> +#include "mlx5.h"
> +#include "mlx5_utils.h"
> +
> +/**
> + * Initialize IPC message.
> + *
> + * @param[in] dev
> + *   Pointer to Ethernet structure.
> + * @param[out] msg
> + *   Pointer to message to fill in.
> + * @param[in] type
> + *   Message type.
> + */
> +static inline void
> +mp_init_msg(struct rte_eth_dev *dev, struct rte_mp_msg *msg,
> +	    enum mlx5_mp_req_type type)
> +{
> +	struct mlx5_mp_param *param = (struct mlx5_mp_param *)msg->param;
> +
> +	memset(msg, 0, sizeof(*msg));
> +	strlcpy(msg->name, MLX5_MP_NAME, sizeof(msg->name));
> +	msg->len_param = sizeof(*param);
> +	param->type = type;
> +	param->port_id = dev->data->port_id;
> +}
> +
> +/**
> + * IPC message handler of primary process.
> + *
> + * @param[in] dev
> + *   Pointer to Ethernet structure.
> + * @param[in] peer
> + *   Pointer to the peer socket path.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
> +{
> +	struct rte_mp_msg mp_res;
> +	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
> +	const struct mlx5_mp_param *param =
> +		(const struct mlx5_mp_param *)mp_msg->param;
> +	struct rte_eth_dev *dev = &rte_eth_devices[param->port_id];

We need to check that dev is valid before continuing. If it is not, log an error (as you do for an invalid request type).
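
To illustrate the kind of check meant here, below is a minimal standalone sketch. It does not use DPDK: mock_dev, mock_devices, MAX_PORTS, mock_dev_lookup() and mock_handle_request() are all hypothetical names made up for the example. In the driver itself the lookup could presumably rely on rte_eth_dev_is_valid_port() before indexing rte_eth_devices[], but that is an assumption, not something this patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 4 /* hypothetical stand-in for the size of the port table */

/* Hypothetical stand-in for a device entry; not a DPDK structure. */
struct mock_dev {
	bool attached; /* true once the port has been probed */
	int cmd_fd;    /* FD the primary would hand out */
};

static struct mock_dev mock_devices[MAX_PORTS];

/* Return the device for port_id, or NULL if it is out of range or unused. */
static struct mock_dev *
mock_dev_lookup(uint32_t port_id)
{
	if (port_id >= MAX_PORTS || !mock_devices[port_id].attached)
		return NULL;
	return &mock_devices[port_id];
}

/*
 * Request handler: port_id comes from another process, so it is validated
 * at run time before any dereference. The assert only guards a programming
 * error of the caller (a NULL output pointer).
 */
static int
mock_handle_request(uint32_t port_id, int *fd_out)
{
	struct mock_dev *dev = mock_dev_lookup(port_id);

	assert(fd_out != NULL);
	if (dev == NULL)
		return -ENODEV; /* the real handler would also log an error */
	*fd_out = dev->cmd_fd;
	return 0;
}
```

The point is only the ordering: look up and validate first, dereference second, and return an error for a port the primary does not know about.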

> +	struct mlx5_priv *priv = dev->data->dev_private;
> +	int ret = 0;
> +
> +	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
> +	switch (param->type) {
> +	case MLX5_MP_REQ_VERBS_CMD_FD:
> +		mp_init_msg(dev, &mp_res, param->type);
> +		mp_res.num_fds = 1;
> +		mp_res.fds[0] = priv->ctx->cmd_fd;
> +		res->result = 0;
> +		ret = rte_mp_reply(&mp_res, peer);
> +		break;
> +	default:
> +		rte_errno = EINVAL;
> +		DRV_LOG(ERR, "port %u invalid mp request type",
> +			dev->data->port_id);
> +		return -rte_errno;
> +	}
> +	return ret;
> +}
> +
> +/**
> + * Request Verbs command file descriptor for mmap to the primary process.
> + *
> + * @param[in] dev
> + *   Pointer to Ethernet structure.
> + *
> + * @return
> + *   fd on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
> +{
> +	struct rte_mp_msg mp_req;
> +	struct rte_mp_msg *mp_res;
> +	struct rte_mp_reply mp_rep;
> +	struct mlx5_mp_param *res __rte_unused;
> +	struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
> +	int cmd_fd;
> +	int ret;
> +
> +	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
> +	mp_init_msg(dev, &mp_req, MLX5_MP_REQ_VERBS_CMD_FD);
> +	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
> +	if (ret) {
> +		DRV_LOG(ERR,
> +			"port %u failed to get command FD from primary process",
> +			dev->data->port_id);
> +		return -rte_errno;
> +	}
> +	assert(mp_rep.nb_received == 1);
> +	mp_res = &mp_rep.msgs[0];
> +	res = (struct mlx5_mp_param *)mp_res->param;
> +	assert(!res->result);

The above should be an actual check rather than an assert.

> +	assert(mp_res->num_fds == 1);
> +	cmd_fd = mp_res->fds[0];
> +	free(mp_rep.msgs);
> +	DRV_LOG(DEBUG, "port %u command FD from primary is %d",
> +		dev->data->port_id, cmd_fd);
> +	return cmd_fd;
> +}
> +
> +void
> +mlx5_mp_init(void)
> +{
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
> +}
> diff --git a/drivers/net/mlx5/mlx5_socket.c b/drivers/net/mlx5/mlx5_socket.c
> deleted file mode 100644
> index 41cac3c6aa..0000000000
> --- a/drivers/net/mlx5/mlx5_socket.c
> +++ /dev/null
> @@ -1,306 +0,0 @@
> -/* SPDX-License-Identifier: BSD-3-Clause
> - * Copyright 2016 6WIND S.A.
> - * Copyright 2016 Mellanox Technologies, Ltd
> - */
> -
> -#include <sys/types.h>
> -#include <sys/socket.h>
> -#include <sys/un.h>
> -#include <fcntl.h>
> -#include <stdio.h>
> -#include <unistd.h>
> -#include <sys/stat.h>
> -
> -#include "mlx5.h"
> -#include "mlx5_utils.h"
> -
> -/**
> - * Initialise the socket to communicate with the secondary process
> - *
> - * @param[in] dev
> - *   Pointer to Ethernet device.
> - *
> - * @return
> - *   0 on success, a negative errno value otherwise and rte_errno is set.
> - */
> -int
> -mlx5_socket_init(struct rte_eth_dev *dev)
> -{
> -	struct mlx5_priv *priv = dev->data->dev_private;
> -	struct sockaddr_un sun = {
> -		.sun_family = AF_UNIX,
> -	};
> -	int ret;
> -	int flags;
> -
> -	/*
> -	 * Close the last socket that was used to communicate
> -	 * with the secondary process
> -	 */
> -	if (priv->primary_socket)
> -		mlx5_socket_uninit(dev);
> -	/*
> -	 * Initialise the socket to communicate with the secondary
> -	 * process.
> -	 */
> -	ret = socket(AF_UNIX, SOCK_STREAM, 0);
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
> -			dev->data->port_id, strerror(errno));
> -		goto error;
> -	}
> -	priv->primary_socket = ret;
> -	flags = fcntl(priv->primary_socket, F_GETFL, 0);
> -	if (flags == -1) {
> -		rte_errno = errno;
> -		goto error;
> -	}
> -	ret = fcntl(priv->primary_socket, F_SETFL, flags | O_NONBLOCK);
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		goto error;
> -	}
> -	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
> -		 MLX5_DRIVER_NAME, priv->primary_socket);
> -	remove(sun.sun_path);
> -	ret = bind(priv->primary_socket, (const struct sockaddr *)&sun,
> -		   sizeof(sun));
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING,
> -			"port %u cannot bind socket, secondary process not"
> -			" supported: %s",
> -			dev->data->port_id, strerror(errno));
> -		goto close;
> -	}
> -	ret = listen(priv->primary_socket, 0);
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
> -			dev->data->port_id, strerror(errno));
> -		goto close;
> -	}
> -	return 0;
> -close:
> -	remove(sun.sun_path);
> -error:
> -	claim_zero(close(priv->primary_socket));
> -	priv->primary_socket = 0;
> -	return -rte_errno;
> -}
> -
> -/**
> - * Un-Initialise the socket to communicate with the secondary process
> - *
> - * @param[in] dev
> - */
> -void
> -mlx5_socket_uninit(struct rte_eth_dev *dev)
> -{
> -	struct mlx5_priv *priv = dev->data->dev_private;
> -
> -	MKSTR(path, "/var/tmp/%s_%d", MLX5_DRIVER_NAME, priv->primary_socket);
> -	claim_zero(close(priv->primary_socket));
> -	priv->primary_socket = 0;
> -	claim_zero(remove(path));
> -}
> -
> -/**
> - * Handle socket interrupts.
> - *
> - * @param dev
> - *   Pointer to Ethernet device.
> - */
> -void
> -mlx5_socket_handle(struct rte_eth_dev *dev)
> -{
> -	struct mlx5_priv *priv = dev->data->dev_private;
> -	int conn_sock;
> -	int ret = 0;
> -	struct cmsghdr *cmsg = NULL;
> -	struct ucred *cred = NULL;
> -	char buf[CMSG_SPACE(sizeof(struct ucred))] = { 0 };
> -	char vbuf[1024] = { 0 };
> -	struct iovec io = {
> -		.iov_base = vbuf,
> -		.iov_len = sizeof(*vbuf),
> -	};
> -	struct msghdr msg = {
> -		.msg_iov = &io,
> -		.msg_iovlen = 1,
> -		.msg_control = buf,
> -		.msg_controllen = sizeof(buf),
> -	};
> -	int *fd;
> -
> -	/* Accept the connection from the client. */
> -	conn_sock = accept(priv->primary_socket, NULL, NULL);
> -	if (conn_sock < 0) {
> -		DRV_LOG(WARNING, "port %u connection failed: %s",
> -			dev->data->port_id, strerror(errno));
> -		return;
> -	}
> -	ret = setsockopt(conn_sock, SOL_SOCKET, SO_PASSCRED, &(int){1},
> -					 sizeof(int));
> -	if (ret < 0) {
> -		ret = errno;
> -		DRV_LOG(WARNING, "port %u cannot change socket options: %s",
> -			dev->data->port_id, strerror(rte_errno));
> -		goto error;
> -	}
> -	ret = recvmsg(conn_sock, &msg, MSG_WAITALL);
> -	if (ret < 0) {
> -		ret = errno;
> -		DRV_LOG(WARNING, "port %u received an empty message: %s",
> -			dev->data->port_id, strerror(rte_errno));
> -		goto error;
> -	}
> -	/* Expect to receive credentials only. */
> -	cmsg = CMSG_FIRSTHDR(&msg);
> -	if (cmsg == NULL) {
> -		DRV_LOG(WARNING, "port %u no message", dev->data->port_id);
> -		goto error;
> -	}
> -	if ((cmsg->cmsg_type == SCM_CREDENTIALS) &&
> -		(cmsg->cmsg_len >= sizeof(*cred))) {
> -		cred = (struct ucred *)CMSG_DATA(cmsg);
> -		assert(cred != NULL);
> -	}
> -	cmsg = CMSG_NXTHDR(&msg, cmsg);
> -	if (cmsg != NULL) {
> -		DRV_LOG(WARNING, "port %u message wrongly formatted",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	/* Make sure all the ancillary data was received and valid. */
> -	if ((cred == NULL) || (cred->uid != getuid()) ||
> -	    (cred->gid != getgid())) {
> -		DRV_LOG(WARNING, "port %u wrong credentials",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	/* Set-up the ancillary data. */
> -	cmsg = CMSG_FIRSTHDR(&msg);
> -	assert(cmsg != NULL);
> -	cmsg->cmsg_level = SOL_SOCKET;
> -	cmsg->cmsg_type = SCM_RIGHTS;
> -	cmsg->cmsg_len = CMSG_LEN(sizeof(priv->ctx->cmd_fd));
> -	fd = (int *)CMSG_DATA(cmsg);
> -	*fd = priv->ctx->cmd_fd;
> -	ret = sendmsg(conn_sock, &msg, 0);
> -	if (ret < 0)
> -		DRV_LOG(WARNING, "port %u cannot send response",
> -			dev->data->port_id);
> -error:
> -	close(conn_sock);
> -}
> -
> -/**
> - * Connect to the primary process.
> - *
> - * @param[in] dev
> - *   Pointer to Ethernet structure.
> - *
> - * @return
> - *   fd on success, negative errno value otherwise and rte_errno is set.
> - */
> -int
> -mlx5_socket_connect(struct rte_eth_dev *dev)
> -{
> -	struct mlx5_priv *priv = dev->data->dev_private;
> -	struct sockaddr_un sun = {
> -		.sun_family = AF_UNIX,
> -	};
> -	int socket_fd = -1;
> -	int *fd = NULL;
> -	int ret;
> -	struct ucred *cred;
> -	char buf[CMSG_SPACE(sizeof(*cred))] = { 0 };
> -	char vbuf[1024] = { 0 };
> -	struct iovec io = {
> -		.iov_base = vbuf,
> -		.iov_len = sizeof(*vbuf),
> -	};
> -	struct msghdr msg = {
> -		.msg_control = buf,
> -		.msg_controllen = sizeof(buf),
> -		.msg_iov = &io,
> -		.msg_iovlen = 1,
> -	};
> -	struct cmsghdr *cmsg;
> -
> -	ret = socket(AF_UNIX, SOCK_STREAM, 0);
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING, "port %u cannot connect to primary",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	socket_fd = ret;
> -	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
> -		 MLX5_DRIVER_NAME, priv->primary_socket);
> -	ret = connect(socket_fd, (const struct sockaddr *)&sun, sizeof(sun));
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING, "port %u cannot connect to primary",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	cmsg = CMSG_FIRSTHDR(&msg);
> -	if (cmsg == NULL) {
> -		rte_errno = EINVAL;
> -		DRV_LOG(DEBUG, "port %u cannot get first message",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	cmsg->cmsg_level = SOL_SOCKET;
> -	cmsg->cmsg_type = SCM_CREDENTIALS;
> -	cmsg->cmsg_len = CMSG_LEN(sizeof(*cred));
> -	cred = (struct ucred *)CMSG_DATA(cmsg);
> -	if (cred == NULL) {
> -		rte_errno = EINVAL;
> -		DRV_LOG(DEBUG, "port %u no credentials received",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	cred->pid = getpid();
> -	cred->uid = getuid();
> -	cred->gid = getgid();
> -	ret = sendmsg(socket_fd, &msg, MSG_DONTWAIT);
> -	if (ret < 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING,
> -			"port %u cannot send credentials to primary: %s",
> -			dev->data->port_id, strerror(errno));
> -		goto error;
> -	}
> -	ret = recvmsg(socket_fd, &msg, MSG_WAITALL);
> -	if (ret <= 0) {
> -		rte_errno = errno;
> -		DRV_LOG(WARNING, "port %u no message from primary: %s",
> -			dev->data->port_id, strerror(errno));
> -		goto error;
> -	}
> -	cmsg = CMSG_FIRSTHDR(&msg);
> -	if (cmsg == NULL) {
> -		rte_errno = EINVAL;
> -		DRV_LOG(WARNING, "port %u no file descriptor received",
> -			dev->data->port_id);
> -		goto error;
> -	}
> -	fd = (int *)CMSG_DATA(cmsg);
> -	if (*fd < 0) {
> -		DRV_LOG(WARNING, "port %u no file descriptor received: %s",
> -			dev->data->port_id, strerror(errno));
> -		rte_errno = *fd;
> -		goto error;
> -	}
> -	ret = *fd;
> -	close(socket_fd);
> -	return ret;
> -error:
> -	if (socket_fd != -1)
> -		close(socket_fd);
> -	return -rte_errno;
> -}
> --
> 2.11.0


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 3/4] net/mlx5: rework PMD global data init
  2019-03-14 12:36   ` Shahaf Shuler
  2019-03-14 12:36     ` Shahaf Shuler
@ 2019-03-18 21:21     ` Yongseok Koh
  2019-03-18 21:21       ` Yongseok Koh
  2019-03-19  6:54       ` Shahaf Shuler
  1 sibling, 2 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-18 21:21 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: dev

On Thu, Mar 14, 2019 at 05:36:28AM -0700, Shahaf Shuler wrote:
> Hi Koh,
> 
> Thursday, March 7, 2019 9:33 AM, Yongseok Koh:
> > Subject: [PATCH 3/4] net/mlx5: rework PMD global data init
> > 
> > There is a growing need for a PMD global data structure. It should be
> > initialized once per process regardless of how many PMD instances are probed.
> > mlx5_init_once() is called during probing and makes sure all the init
> > functions are called once per process. The existing shared memory is used
> > more extensively for this purpose. As there could be multiple secondary
> > processes, static storage (local to each process) is also added.
> 
> It is hard to understand from the commit log what was missing in the old design.

Okay, will add more comments.

> > As the reserved virtual address for UAR remap is a PMD global resource, it
> > does not need to be stored in the device priv structure and is kept in the
> > PMD global data instead.
> 
> I thought we agreed to drop those and have a different VA for each process.
> If so, is the extra work on the UAR here needed?

My plan was to do that in a separate patch for performance regression.
Let me know if you want it to be done in this patchset.

> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > ---
> >  drivers/net/mlx5/mlx5.c     | 250 ++++++++++++++++++++++++++++++++------------
> >  drivers/net/mlx5/mlx5.h     |  19 +++-
> >  drivers/net/mlx5/mlx5_mp.c  |  19 +++-
> >  drivers/net/mlx5/mlx5_txq.c |   7 +-
> >  4 files changed, 217 insertions(+), 78 deletions(-)
> > 
> > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> > index 6ed2418106..ea8fd55ee6 100644
> > --- a/drivers/net/mlx5/mlx5.c
> > +++ b/drivers/net/mlx5/mlx5.c
> > @@ -128,16 +128,26 @@ struct mlx5_shared_data *mlx5_shared_data;
> >  /* Spinlock for mlx5_shared_data allocation. */
> >  static rte_spinlock_t mlx5_shared_data_lock = RTE_SPINLOCK_INITIALIZER;
> > 
> > +/* Process local data for secondary processes. */
> > +static struct mlx5_local_data mlx5_local_data;
> 
> Why not store this context as part of ethdev->process_private instead of declaring it static?

Because it is not per-device data but per-PMD data.
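
To make the per-PMD versus per-device distinction concrete, here is a rough standalone sketch of once-per-process initialization guarded by static, process-local storage. It is not the code from this patch: pmd_local_data, pmd_init_once() and the atomic_flag lock are hypothetical stand-ins (the driver uses rte_spinlock_t, as in the hunk quoted above).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical process-local PMD state; one instance per process. */
struct pmd_local_data {
	bool init_done; /* set once the one-time init has run */
	int init_calls; /* counts how often the init body actually ran */
};

static struct pmd_local_data pmd_local_data;

/* Minimal spinlock built on C11 atomic_flag, standing in for rte_spinlock_t. */
static atomic_flag pmd_local_lock = ATOMIC_FLAG_INIT;

static void
pmd_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&pmd_local_lock,
						 memory_order_acquire))
		; /* spin until the holder releases the lock */
}

static void
pmd_unlock(void)
{
	atomic_flag_clear_explicit(&pmd_local_lock, memory_order_release);
}

/*
 * Called from every device probe. The body runs only on the first call in
 * this process, no matter how many devices are probed afterwards.
 */
static int
pmd_init_once(void)
{
	pmd_lock();
	if (!pmd_local_data.init_done) {
		pmd_local_data.init_calls++; /* one-time setup would go here */
		pmd_local_data.init_done = true;
	}
	pmd_unlock();
	return 0;
}
```

Because the state is a file-scope static, each process gets its own copy, and probing a second device in the same process does not re-run the setup.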

Will also have to rebase my patchsets when I send out v2.


Thanks,
Yongseok

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 2/4] net/mlx5: replace IPC socket with EAL API
  2019-03-14 12:36   ` Shahaf Shuler
  2019-03-14 12:36     ` Shahaf Shuler
@ 2019-03-18 21:29     ` Yongseok Koh
  2019-03-18 21:29       ` Yongseok Koh
  1 sibling, 1 reply; 45+ messages in thread
From: Yongseok Koh @ 2019-03-18 21:29 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: dev

On Thu, Mar 14, 2019 at 05:36:32AM -0700, Shahaf Shuler wrote:
> Hi Koh, 
> 
> Thursday, March 7, 2019 9:33 AM, Yongseok Koh:
> > Subject: [PATCH 2/4] net/mlx5: replace IPC socket with EAL API
> > 
> > The socket API is used for IPC so that a secondary process can acquire the
> > Verbs command file descriptor. The FD is used to remap the UAR address. EAL
> > now provides multi-process APIs (rte_mp), so mlx5_socket.c is replaced with
> > mlx5_mp.c, which uses the new APIs.
> > 
> > As this is PMD global infrastructure, only one IPC channel is established.
> > Any IPC message type may carry a port_id in the message if it needs to
> > reference a specific device.
> > 
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> 
> [...]
> 
> >  /**
> > diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
> > new file mode 100644
> > index 0000000000..19a1f25f0e
> > --- /dev/null
> > +++ b/drivers/net/mlx5/mlx5_mp.c
> > @@ -0,0 +1,126 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright 2019 6WIND S.A.
> > + * Copyright 2019 Mellanox Technologies, Ltd
> > + */
> > +
> > +#include <assert.h>
> > +#include <stdio.h>
> > +#include <time.h>
> > +
> > +#include <rte_eal.h>
> > +#include <rte_ethdev_driver.h>
> > +#include <rte_string_fns.h>
> > +
> > +#include "mlx5.h"
> > +#include "mlx5_utils.h"
> > +
> > +/**
> > + * Initialize IPC message.
> > + *
> > + * @param[in] dev
> > + *   Pointer to Ethernet structure.
> > + * @param[out] msg
> > + *   Pointer to message to fill in.
> > + * @param[in] type
> > + *   Message type.
> > + */
> > +static inline void
> > +mp_init_msg(struct rte_eth_dev *dev, struct rte_mp_msg *msg,
> > +	    enum mlx5_mp_req_type type)
> > +{
> > +	struct mlx5_mp_param *param = (struct mlx5_mp_param *)msg->param;
> > +
> > +	memset(msg, 0, sizeof(*msg));
> > +	strlcpy(msg->name, MLX5_MP_NAME, sizeof(msg->name));
> > +	msg->len_param = sizeof(*param);
> > +	param->type = type;
> > +	param->port_id = dev->data->port_id;
> > +}
> > +
> > +/**
> > + * IPC message handler of primary process.
> > + *
> > + * @param[in] dev
> > + *   Pointer to Ethernet structure.
> > + * @param[in] peer
> > + *   Pointer to the peer socket path.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +static int
> > +mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
> > +{
> > +	struct rte_mp_msg mp_res;
> > +	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
> > +	const struct mlx5_mp_param *param =
> > +		(const struct mlx5_mp_param *)mp_msg->param;
> > +	struct rte_eth_dev *dev = &rte_eth_devices[param->port_id];
> 
> We need to check that dev is valid before continuing. If it is not, log an error (as you do for an invalid request type).

Good point.

> > +	struct mlx5_priv *priv = dev->data->dev_private;
> > +	int ret = 0;
> > +
> > +	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
> > +	switch (param->type) {
> > +	case MLX5_MP_REQ_VERBS_CMD_FD:
> > +		mp_init_msg(dev, &mp_res, param->type);
> > +		mp_res.num_fds = 1;
> > +		mp_res.fds[0] = priv->ctx->cmd_fd;
> > +		res->result = 0;
> > +		ret = rte_mp_reply(&mp_res, peer);
> > +		break;
> > +	default:
> > +		rte_errno = EINVAL;
> > +		DRV_LOG(ERR, "port %u invalid mp request type",
> > +			dev->data->port_id);
> > +		return -rte_errno;
> > +	}
> > +	return ret;
> > +}
> > +
> > +/**
> > + * Request Verbs command file descriptor for mmap to the primary process.
> > + *
> > + * @param[in] dev
> > + *   Pointer to Ethernet structure.
> > + *
> > + * @return
> > + *   fd on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +int
> > +mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
> > +{
> > +	struct rte_mp_msg mp_req;
> > +	struct rte_mp_msg *mp_res;
> > +	struct rte_mp_reply mp_rep;
> > +	struct mlx5_mp_param *res __rte_unused;
> > +	struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
> > +	int cmd_fd;
> > +	int ret;
> > +
> > +	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
> > +	mp_init_msg(dev, &mp_req, MLX5_MP_REQ_VERBS_CMD_FD);
> > +	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
> > +	if (ret) {
> > +		DRV_LOG(ERR,
> > +			"port %u failed to get command FD from primary process",
> > +			dev->data->port_id);
> > +		return -rte_errno;
> > +	}
> > +	assert(mp_rep.nb_received == 1);
> > +	mp_res = &mp_rep.msgs[0];
> > +	res = (struct mlx5_mp_param *)mp_res->param;
> > +	assert(!res->result);
> 
> The above should be an actual check rather than an assert.

The reason was that the handler only ever sets zero. If the request is
successful, res->result has to be zero. Otherwise, it would mean memory
corruption or a bug in the rte_mp lib. So, I think it is better to have 'if'
instead of 'assert' so that we can prevent returning an invalid cmd_fd.

Thanks,
Yongseok

> > +	assert(mp_res->num_fds == 1);
> > +	cmd_fd = mp_res->fds[0];
> > +	free(mp_rep.msgs);
> > +	DRV_LOG(DEBUG, "port %u command FD from primary is %d",
> > +		dev->data->port_id, cmd_fd);
> > +	return cmd_fd;
> > +}
> > +
> > +void
> > +mlx5_mp_init(void)
> > +{
> > +	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> > +		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
> > +}
> > diff --git a/drivers/net/mlx5/mlx5_socket.c b/drivers/net/mlx5/mlx5_socket.c
> > deleted file mode 100644
> > index 41cac3c6aa..0000000000
> > --- a/drivers/net/mlx5/mlx5_socket.c
> > +++ /dev/null
> > @@ -1,306 +0,0 @@
> > -/* SPDX-License-Identifier: BSD-3-Clause
> > - * Copyright 2016 6WIND S.A.
> > - * Copyright 2016 Mellanox Technologies, Ltd
> > - */
> > -
> > -#include <sys/types.h>
> > -#include <sys/socket.h>
> > -#include <sys/un.h>
> > -#include <fcntl.h>
> > -#include <stdio.h>
> > -#include <unistd.h>
> > -#include <sys/stat.h>
> > -
> > -#include "mlx5.h"
> > -#include "mlx5_utils.h"
> > -
> > -/**
> > - * Initialise the socket to communicate with the secondary process
> > - *
> > - * @param[in] dev
> > - *   Pointer to Ethernet device.
> > - *
> > - * @return
> > - *   0 on success, a negative errno value otherwise and rte_errno is set.
> > - */
> > -int
> > -mlx5_socket_init(struct rte_eth_dev *dev)
> > -{
> > -	struct mlx5_priv *priv = dev->data->dev_private;
> > -	struct sockaddr_un sun = {
> > -		.sun_family = AF_UNIX,
> > -	};
> > -	int ret;
> > -	int flags;
> > -
> > -	/*
> > -	 * Close the last socket that was used to communicate
> > -	 * with the secondary process
> > -	 */
> > -	if (priv->primary_socket)
> > -		mlx5_socket_uninit(dev);
> > -	/*
> > -	 * Initialise the socket to communicate with the secondary
> > -	 * process.
> > -	 */
> > -	ret = socket(AF_UNIX, SOCK_STREAM, 0);
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		goto error;
> > -	}
> > -	priv->primary_socket = ret;
> > -	flags = fcntl(priv->primary_socket, F_GETFL, 0);
> > -	if (flags == -1) {
> > -		rte_errno = errno;
> > -		goto error;
> > -	}
> > -	ret = fcntl(priv->primary_socket, F_SETFL, flags | O_NONBLOCK);
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		goto error;
> > -	}
> > -	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
> > -		 MLX5_DRIVER_NAME, priv->primary_socket);
> > -	remove(sun.sun_path);
> > -	ret = bind(priv->primary_socket, (const struct sockaddr *)&sun,
> > -		   sizeof(sun));
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING,
> > -			"port %u cannot bind socket, secondary process not"
> > -			" supported: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		goto close;
> > -	}
> > -	ret = listen(priv->primary_socket, 0);
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		goto close;
> > -	}
> > -	return 0;
> > -close:
> > -	remove(sun.sun_path);
> > -error:
> > -	claim_zero(close(priv->primary_socket));
> > -	priv->primary_socket = 0;
> > -	return -rte_errno;
> > -}
> > -
> > -/**
> > - * Un-Initialise the socket to communicate with the secondary process
> > - *
> > - * @param[in] dev
> > - */
> > -void
> > -mlx5_socket_uninit(struct rte_eth_dev *dev)
> > -{
> > -	struct mlx5_priv *priv = dev->data->dev_private;
> > -
> > -	MKSTR(path, "/var/tmp/%s_%d", MLX5_DRIVER_NAME, priv->primary_socket);
> > -	claim_zero(close(priv->primary_socket));
> > -	priv->primary_socket = 0;
> > -	claim_zero(remove(path));
> > -}
> > -
> > -/**
> > - * Handle socket interrupts.
> > - *
> > - * @param dev
> > - *   Pointer to Ethernet device.
> > - */
> > -void
> > -mlx5_socket_handle(struct rte_eth_dev *dev)
> > -{
> > -	struct mlx5_priv *priv = dev->data->dev_private;
> > -	int conn_sock;
> > -	int ret = 0;
> > -	struct cmsghdr *cmsg = NULL;
> > -	struct ucred *cred = NULL;
> > -	char buf[CMSG_SPACE(sizeof(struct ucred))] = { 0 };
> > -	char vbuf[1024] = { 0 };
> > -	struct iovec io = {
> > -		.iov_base = vbuf,
> > -		.iov_len = sizeof(*vbuf),
> > -	};
> > -	struct msghdr msg = {
> > -		.msg_iov = &io,
> > -		.msg_iovlen = 1,
> > -		.msg_control = buf,
> > -		.msg_controllen = sizeof(buf),
> > -	};
> > -	int *fd;
> > -
> > -	/* Accept the connection from the client. */
> > -	conn_sock = accept(priv->primary_socket, NULL, NULL);
> > -	if (conn_sock < 0) {
> > -		DRV_LOG(WARNING, "port %u connection failed: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		return;
> > -	}
> > -	ret = setsockopt(conn_sock, SOL_SOCKET, SO_PASSCRED, &(int){1},
> > -					 sizeof(int));
> > -	if (ret < 0) {
> > -		ret = errno;
> > -		DRV_LOG(WARNING, "port %u cannot change socket options: %s",
> > -			dev->data->port_id, strerror(rte_errno));
> > -		goto error;
> > -	}
> > -	ret = recvmsg(conn_sock, &msg, MSG_WAITALL);
> > -	if (ret < 0) {
> > -		ret = errno;
> > -		DRV_LOG(WARNING, "port %u received an empty message: %s",
> > -			dev->data->port_id, strerror(rte_errno));
> > -		goto error;
> > -	}
> > -	/* Expect to receive credentials only. */
> > -	cmsg = CMSG_FIRSTHDR(&msg);
> > -	if (cmsg == NULL) {
> > -		DRV_LOG(WARNING, "port %u no message", dev->data->port_id);
> > -		goto error;
> > -	}
> > -	if ((cmsg->cmsg_type == SCM_CREDENTIALS) &&
> > -		(cmsg->cmsg_len >= sizeof(*cred))) {
> > -		cred = (struct ucred *)CMSG_DATA(cmsg);
> > -		assert(cred != NULL);
> > -	}
> > -	cmsg = CMSG_NXTHDR(&msg, cmsg);
> > -	if (cmsg != NULL) {
> > -		DRV_LOG(WARNING, "port %u message wrongly formatted",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	/* Make sure all the ancillary data was received and valid. */
> > -	if ((cred == NULL) || (cred->uid != getuid()) ||
> > -	    (cred->gid != getgid())) {
> > -		DRV_LOG(WARNING, "port %u wrong credentials",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	/* Set-up the ancillary data. */
> > -	cmsg = CMSG_FIRSTHDR(&msg);
> > -	assert(cmsg != NULL);
> > -	cmsg->cmsg_level = SOL_SOCKET;
> > -	cmsg->cmsg_type = SCM_RIGHTS;
> > -	cmsg->cmsg_len = CMSG_LEN(sizeof(priv->ctx->cmd_fd));
> > -	fd = (int *)CMSG_DATA(cmsg);
> > -	*fd = priv->ctx->cmd_fd;
> > -	ret = sendmsg(conn_sock, &msg, 0);
> > -	if (ret < 0)
> > -		DRV_LOG(WARNING, "port %u cannot send response",
> > -			dev->data->port_id);
> > -error:
> > -	close(conn_sock);
> > -}
> > -
> > -/**
> > - * Connect to the primary process.
> > - *
> > - * @param[in] dev
> > - *   Pointer to Ethernet structure.
> > - *
> > - * @return
> > - *   fd on success, negative errno value otherwise and rte_errno is set.
> > - */
> > -int
> > -mlx5_socket_connect(struct rte_eth_dev *dev)
> > -{
> > -	struct mlx5_priv *priv = dev->data->dev_private;
> > -	struct sockaddr_un sun = {
> > -		.sun_family = AF_UNIX,
> > -	};
> > -	int socket_fd = -1;
> > -	int *fd = NULL;
> > -	int ret;
> > -	struct ucred *cred;
> > -	char buf[CMSG_SPACE(sizeof(*cred))] = { 0 };
> > -	char vbuf[1024] = { 0 };
> > -	struct iovec io = {
> > -		.iov_base = vbuf,
> > -		.iov_len = sizeof(*vbuf),
> > -	};
> > -	struct msghdr msg = {
> > -		.msg_control = buf,
> > -		.msg_controllen = sizeof(buf),
> > -		.msg_iov = &io,
> > -		.msg_iovlen = 1,
> > -	};
> > -	struct cmsghdr *cmsg;
> > -
> > -	ret = socket(AF_UNIX, SOCK_STREAM, 0);
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING, "port %u cannot connect to primary",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	socket_fd = ret;
> > -	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
> > -		 MLX5_DRIVER_NAME, priv->primary_socket);
> > -	ret = connect(socket_fd, (const struct sockaddr *)&sun, sizeof(sun));
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING, "port %u cannot connect to primary",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	cmsg = CMSG_FIRSTHDR(&msg);
> > -	if (cmsg == NULL) {
> > -		rte_errno = EINVAL;
> > -		DRV_LOG(DEBUG, "port %u cannot get first message",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	cmsg->cmsg_level = SOL_SOCKET;
> > -	cmsg->cmsg_type = SCM_CREDENTIALS;
> > -	cmsg->cmsg_len = CMSG_LEN(sizeof(*cred));
> > -	cred = (struct ucred *)CMSG_DATA(cmsg);
> > -	if (cred == NULL) {
> > -		rte_errno = EINVAL;
> > -		DRV_LOG(DEBUG, "port %u no credentials received",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	cred->pid = getpid();
> > -	cred->uid = getuid();
> > -	cred->gid = getgid();
> > -	ret = sendmsg(socket_fd, &msg, MSG_DONTWAIT);
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING,
> > -			"port %u cannot send credentials to primary: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		goto error;
> > -	}
> > -	ret = recvmsg(socket_fd, &msg, MSG_WAITALL);
> > -	if (ret <= 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING, "port %u no message from primary: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		goto error;
> > -	}
> > -	cmsg = CMSG_FIRSTHDR(&msg);
> > -	if (cmsg == NULL) {
> > -		rte_errno = EINVAL;
> > -		DRV_LOG(WARNING, "port %u no file descriptor received",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	fd = (int *)CMSG_DATA(cmsg);
> > -	if (*fd < 0) {
> > -		DRV_LOG(WARNING, "port %u no file descriptor received: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		rte_errno = *fd;
> > -		goto error;
> > -	}
> > -	ret = *fd;
> > -	close(socket_fd);
> > -	return ret;
> > -error:
> > -	if (socket_fd != -1)
> > -		close(socket_fd);
> > -	return -rte_errno;
> > -}
> > --
> > 2.11.0
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 2/4] net/mlx5: replace IPC socket with EAL API
  2019-03-18 21:29     ` Yongseok Koh
@ 2019-03-18 21:29       ` Yongseok Koh
  0 siblings, 0 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-18 21:29 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: dev

On Thu, Mar 14, 2019 at 05:36:32AM -0700, Shahaf Shuler wrote:
> Hi Koh, 
> 
> Thursday, March 7, 2019 9:33 AM, Yongseok Koh:
> > Subject: [PATCH 2/4] net/mlx5: replace IPC socket with EAL API
> > 
> > Socket API is used for IPC in order for the secondary process to acquire the
> > Verbs command file descriptor. The FD is used to remap the UAR address. EAL
> > has recently introduced new multi-process APIs (rte_mp). mlx5_socket.c is
> > replaced with mlx5_mp.c, which uses the new APIs.
> > 
> > As it is PMD global infrastructure, only one IPC channel is established.
> > All the IPC message types may have port_id in the message if there is need
> > to reference a specific device.
> > 
> > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> 
> [...]
> 
> >  /**
> > diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
> > new file mode 100644
> > index 0000000000..19a1f25f0e
> > --- /dev/null
> > +++ b/drivers/net/mlx5/mlx5_mp.c
> > @@ -0,0 +1,126 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright 2019 6WIND S.A.
> > + * Copyright 2019 Mellanox Technologies, Ltd
> > + */
> > +
> > +#include <assert.h>
> > +#include <stdio.h>
> > +#include <time.h>
> > +
> > +#include <rte_eal.h>
> > +#include <rte_ethdev_driver.h>
> > +#include <rte_string_fns.h>
> > +
> > +#include "mlx5.h"
> > +#include "mlx5_utils.h"
> > +
> > +/**
> > + * Initialize IPC message.
> > + *
> > + * @param[in] dev
> > + *   Pointer to Ethernet structure.
> > + * @param[out] msg
> > + *   Pointer to message to fill in.
> > + * @param[in] type
> > + *   Message type.
> > + */
> > +static inline void
> > +mp_init_msg(struct rte_eth_dev *dev, struct rte_mp_msg *msg,
> > +	    enum mlx5_mp_req_type type)
> > +{
> > +	struct mlx5_mp_param *param = (struct mlx5_mp_param *)msg->param;
> > +
> > +	memset(msg, 0, sizeof(*msg));
> > +	strlcpy(msg->name, MLX5_MP_NAME, sizeof(msg->name));
> > +	msg->len_param = sizeof(*param);
> > +	param->type = type;
> > +	param->port_id = dev->data->port_id;
> > +}
> > +
> > +/**
> > + * IPC message handler of primary process.
> > + *
> > + * @param[in] dev
> > + *   Pointer to Ethernet structure.
> > + * @param[in] peer
> > + *   Pointer to the peer socket path.
> > + *
> > + * @return
> > + *   0 on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +static int
> > +mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
> > +{
> > +	struct rte_mp_msg mp_res;
> > +	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
> > +	const struct mlx5_mp_param *param =
> > +		(const struct mlx5_mp_param *)mp_msg->param;
> > +	struct rte_eth_dev *dev = &rte_eth_devices[param->port_id];
> 
> Need to check that dev is a valid one before we continue. If that is not the case, an error log is needed (like you have for the invalid req).

Good point.
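
A minimal sketch of the kind of check being asked for, so the handler never
dereferences an entry for a port that was not probed. Everything prefixed
with sketch_ is a hypothetical stand-in for the ethdev table, not driver
code; in the real handler the test would presumably be done on
param->port_id with rte_eth_dev_is_valid_port() before touching
rte_eth_devices[].

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_MAX_PORTS 32 /* hypothetical stand-in for RTE_MAX_ETHPORTS */

/* Hypothetical stand-in for one entry of the ethdev table. */
struct sketch_port {
	bool attached;     /* true once the port has been probed */
	void *dev_private; /* driver private data, NULL if unset */
};

static struct sketch_port sketch_ports[SKETCH_MAX_PORTS];

/*
 * Validate a port ID received over IPC before using it as an index.
 * Returns 0 if the port is usable, -1 (after logging) otherwise.
 */
static int
sketch_check_port(uint16_t port_id)
{
	if (port_id >= SKETCH_MAX_PORTS ||
	    !sketch_ports[port_id].attached ||
	    sketch_ports[port_id].dev_private == NULL) {
		fprintf(stderr, "port %u invalid port ID\n",
			(unsigned int)port_id);
		return -1;
	}
	return 0;
}
```

The range check comes first so that an out-of-range ID sent by a peer is
rejected before any table access.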

> > +	struct mlx5_priv *priv = dev->data->dev_private;
> > +	int ret = 0;
> > +
> > +	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
> > +	switch (param->type) {
> > +	case MLX5_MP_REQ_VERBS_CMD_FD:
> > +		mp_init_msg(dev, &mp_res, param->type);
> > +		mp_res.num_fds = 1;
> > +		mp_res.fds[0] = priv->ctx->cmd_fd;
> > +		res->result = 0;
> > +		ret = rte_mp_reply(&mp_res, peer);
> > +		break;
> > +	default:
> > +		rte_errno = EINVAL;
> > +		DRV_LOG(ERR, "port %u invalid mp request type",
> > +			dev->data->port_id);
> > +		return -rte_errno;
> > +	}
> > +	return ret;
> > +}
> > +
> > +/**
> > + * Request Verbs command file descriptor for mmap to the primary process.
> > + *
> > + * @param[in] dev
> > + *   Pointer to Ethernet structure.
> > + *
> > + * @return
> > + *   fd on success, a negative errno value otherwise and rte_errno is set.
> > + */
> > +int
> > +mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
> > +{
> > +	struct rte_mp_msg mp_req;
> > +	struct rte_mp_msg *mp_res;
> > +	struct rte_mp_reply mp_rep;
> > +	struct mlx5_mp_param *res __rte_unused;
> > +	struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
> > +	int cmd_fd;
> > +	int ret;
> > +
> > +	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
> > +	mp_init_msg(dev, &mp_req, MLX5_MP_REQ_VERBS_CMD_FD);
> > +	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
> > +	if (ret) {
> > +		DRV_LOG(ERR,
> > +			"port %u failed to get command FD from primary process",
> > +			dev->data->port_id);
> > +		return -rte_errno;
> > +	}
> > +	assert(mp_rep.nb_received == 1);
> > +	mp_res = &mp_rep.msgs[0];
> > +	res = (struct mlx5_mp_param *)mp_res->param;
> > +	assert(!res->result);
> 
> The above should not be an assert but rather an actual check.

The reason was that the handler only ever sets zero. If the request is
successful, res->result has to be zero. Otherwise, it would be memory
corruption or a bug in the rte_mp lib. Still, I think it is better to have
'if' instead of 'assert' so that we can prevent returning an invalid cmd_fd.

Thanks,
Yongseok

> > +	assert(mp_res->num_fds == 1);
> > +	cmd_fd = mp_res->fds[0];
> > +	free(mp_rep.msgs);
> > +	DRV_LOG(DEBUG, "port %u command FD from primary is %d",
> > +		dev->data->port_id, cmd_fd);
> > +	return cmd_fd;
> > +}
> > +
> > +void
> > +mlx5_mp_init(void)
> > +{
> > +	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> > +		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
> > +}
> > diff --git a/drivers/net/mlx5/mlx5_socket.c b/drivers/net/mlx5/mlx5_socket.c
> > deleted file mode 100644
> > index 41cac3c6aa..0000000000
> > --- a/drivers/net/mlx5/mlx5_socket.c
> > +++ /dev/null
> > @@ -1,306 +0,0 @@
> > -/* SPDX-License-Identifier: BSD-3-Clause
> > - * Copyright 2016 6WIND S.A.
> > - * Copyright 2016 Mellanox Technologies, Ltd
> > - */
> > -
> > -#include <sys/types.h>
> > -#include <sys/socket.h>
> > -#include <sys/un.h>
> > -#include <fcntl.h>
> > -#include <stdio.h>
> > -#include <unistd.h>
> > -#include <sys/stat.h>
> > -
> > -#include "mlx5.h"
> > -#include "mlx5_utils.h"
> > -
> > -/**
> > - * Initialise the socket to communicate with the secondary process
> > - *
> > - * @param[in] dev
> > - *   Pointer to Ethernet device.
> > - *
> > - * @return
> > - *   0 on success, a negative errno value otherwise and rte_errno is set.
> > - */
> > -int
> > -mlx5_socket_init(struct rte_eth_dev *dev)
> > -{
> > -	struct mlx5_priv *priv = dev->data->dev_private;
> > -	struct sockaddr_un sun = {
> > -		.sun_family = AF_UNIX,
> > -	};
> > -	int ret;
> > -	int flags;
> > -
> > -	/*
> > -	 * Close the last socket that was used to communicate
> > -	 * with the secondary process
> > -	 */
> > -	if (priv->primary_socket)
> > -		mlx5_socket_uninit(dev);
> > -	/*
> > -	 * Initialise the socket to communicate with the secondary
> > -	 * process.
> > -	 */
> > -	ret = socket(AF_UNIX, SOCK_STREAM, 0);
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		goto error;
> > -	}
> > -	priv->primary_socket = ret;
> > -	flags = fcntl(priv->primary_socket, F_GETFL, 0);
> > -	if (flags == -1) {
> > -		rte_errno = errno;
> > -		goto error;
> > -	}
> > -	ret = fcntl(priv->primary_socket, F_SETFL, flags | O_NONBLOCK);
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		goto error;
> > -	}
> > -	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
> > -		 MLX5_DRIVER_NAME, priv->primary_socket);
> > -	remove(sun.sun_path);
> > -	ret = bind(priv->primary_socket, (const struct sockaddr *)&sun,
> > -		   sizeof(sun));
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING,
> > -			"port %u cannot bind socket, secondary process not"
> > -			" supported: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		goto close;
> > -	}
> > -	ret = listen(priv->primary_socket, 0);
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		goto close;
> > -	}
> > -	return 0;
> > -close:
> > -	remove(sun.sun_path);
> > -error:
> > -	claim_zero(close(priv->primary_socket));
> > -	priv->primary_socket = 0;
> > -	return -rte_errno;
> > -}
> > -
> > -/**
> > - * Un-Initialise the socket to communicate with the secondary process
> > - *
> > - * @param[in] dev
> > - */
> > -void
> > -mlx5_socket_uninit(struct rte_eth_dev *dev)
> > -{
> > -	struct mlx5_priv *priv = dev->data->dev_private;
> > -
> > -	MKSTR(path, "/var/tmp/%s_%d", MLX5_DRIVER_NAME, priv->primary_socket);
> > -	claim_zero(close(priv->primary_socket));
> > -	priv->primary_socket = 0;
> > -	claim_zero(remove(path));
> > -}
> > -
> > -/**
> > - * Handle socket interrupts.
> > - *
> > - * @param dev
> > - *   Pointer to Ethernet device.
> > - */
> > -void
> > -mlx5_socket_handle(struct rte_eth_dev *dev)
> > -{
> > -	struct mlx5_priv *priv = dev->data->dev_private;
> > -	int conn_sock;
> > -	int ret = 0;
> > -	struct cmsghdr *cmsg = NULL;
> > -	struct ucred *cred = NULL;
> > -	char buf[CMSG_SPACE(sizeof(struct ucred))] = { 0 };
> > -	char vbuf[1024] = { 0 };
> > -	struct iovec io = {
> > -		.iov_base = vbuf,
> > -		.iov_len = sizeof(*vbuf),
> > -	};
> > -	struct msghdr msg = {
> > -		.msg_iov = &io,
> > -		.msg_iovlen = 1,
> > -		.msg_control = buf,
> > -		.msg_controllen = sizeof(buf),
> > -	};
> > -	int *fd;
> > -
> > -	/* Accept the connection from the client. */
> > -	conn_sock = accept(priv->primary_socket, NULL, NULL);
> > -	if (conn_sock < 0) {
> > -		DRV_LOG(WARNING, "port %u connection failed: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		return;
> > -	}
> > -	ret = setsockopt(conn_sock, SOL_SOCKET, SO_PASSCRED, &(int){1},
> > -					 sizeof(int));
> > -	if (ret < 0) {
> > -		ret = errno;
> > -		DRV_LOG(WARNING, "port %u cannot change socket options: %s",
> > -			dev->data->port_id, strerror(rte_errno));
> > -		goto error;
> > -	}
> > -	ret = recvmsg(conn_sock, &msg, MSG_WAITALL);
> > -	if (ret < 0) {
> > -		ret = errno;
> > -		DRV_LOG(WARNING, "port %u received an empty message: %s",
> > -			dev->data->port_id, strerror(rte_errno));
> > -		goto error;
> > -	}
> > -	/* Expect to receive credentials only. */
> > -	cmsg = CMSG_FIRSTHDR(&msg);
> > -	if (cmsg == NULL) {
> > -		DRV_LOG(WARNING, "port %u no message", dev->data->port_id);
> > -		goto error;
> > -	}
> > -	if ((cmsg->cmsg_type == SCM_CREDENTIALS) &&
> > -		(cmsg->cmsg_len >= sizeof(*cred))) {
> > -		cred = (struct ucred *)CMSG_DATA(cmsg);
> > -		assert(cred != NULL);
> > -	}
> > -	cmsg = CMSG_NXTHDR(&msg, cmsg);
> > -	if (cmsg != NULL) {
> > -		DRV_LOG(WARNING, "port %u message wrongly formatted",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	/* Make sure all the ancillary data was received and valid. */
> > -	if ((cred == NULL) || (cred->uid != getuid()) ||
> > -	    (cred->gid != getgid())) {
> > -		DRV_LOG(WARNING, "port %u wrong credentials",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	/* Set-up the ancillary data. */
> > -	cmsg = CMSG_FIRSTHDR(&msg);
> > -	assert(cmsg != NULL);
> > -	cmsg->cmsg_level = SOL_SOCKET;
> > -	cmsg->cmsg_type = SCM_RIGHTS;
> > -	cmsg->cmsg_len = CMSG_LEN(sizeof(priv->ctx->cmd_fd));
> > -	fd = (int *)CMSG_DATA(cmsg);
> > -	*fd = priv->ctx->cmd_fd;
> > -	ret = sendmsg(conn_sock, &msg, 0);
> > -	if (ret < 0)
> > -		DRV_LOG(WARNING, "port %u cannot send response",
> > -			dev->data->port_id);
> > -error:
> > -	close(conn_sock);
> > -}
> > -
> > -/**
> > - * Connect to the primary process.
> > - *
> > - * @param[in] dev
> > - *   Pointer to Ethernet structure.
> > - *
> > - * @return
> > - *   fd on success, negative errno value otherwise and rte_errno is set.
> > - */
> > -int
> > -mlx5_socket_connect(struct rte_eth_dev *dev)
> > -{
> > -	struct mlx5_priv *priv = dev->data->dev_private;
> > -	struct sockaddr_un sun = {
> > -		.sun_family = AF_UNIX,
> > -	};
> > -	int socket_fd = -1;
> > -	int *fd = NULL;
> > -	int ret;
> > -	struct ucred *cred;
> > -	char buf[CMSG_SPACE(sizeof(*cred))] = { 0 };
> > -	char vbuf[1024] = { 0 };
> > -	struct iovec io = {
> > -		.iov_base = vbuf,
> > -		.iov_len = sizeof(*vbuf),
> > -	};
> > -	struct msghdr msg = {
> > -		.msg_control = buf,
> > -		.msg_controllen = sizeof(buf),
> > -		.msg_iov = &io,
> > -		.msg_iovlen = 1,
> > -	};
> > -	struct cmsghdr *cmsg;
> > -
> > -	ret = socket(AF_UNIX, SOCK_STREAM, 0);
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING, "port %u cannot connect to primary",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	socket_fd = ret;
> > -	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
> > -		 MLX5_DRIVER_NAME, priv->primary_socket);
> > -	ret = connect(socket_fd, (const struct sockaddr *)&sun, sizeof(sun));
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING, "port %u cannot connect to primary",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	cmsg = CMSG_FIRSTHDR(&msg);
> > -	if (cmsg == NULL) {
> > -		rte_errno = EINVAL;
> > -		DRV_LOG(DEBUG, "port %u cannot get first message",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	cmsg->cmsg_level = SOL_SOCKET;
> > -	cmsg->cmsg_type = SCM_CREDENTIALS;
> > -	cmsg->cmsg_len = CMSG_LEN(sizeof(*cred));
> > -	cred = (struct ucred *)CMSG_DATA(cmsg);
> > -	if (cred == NULL) {
> > -		rte_errno = EINVAL;
> > -		DRV_LOG(DEBUG, "port %u no credentials received",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	cred->pid = getpid();
> > -	cred->uid = getuid();
> > -	cred->gid = getgid();
> > -	ret = sendmsg(socket_fd, &msg, MSG_DONTWAIT);
> > -	if (ret < 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING,
> > -			"port %u cannot send credentials to primary: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		goto error;
> > -	}
> > -	ret = recvmsg(socket_fd, &msg, MSG_WAITALL);
> > -	if (ret <= 0) {
> > -		rte_errno = errno;
> > -		DRV_LOG(WARNING, "port %u no message from primary: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		goto error;
> > -	}
> > -	cmsg = CMSG_FIRSTHDR(&msg);
> > -	if (cmsg == NULL) {
> > -		rte_errno = EINVAL;
> > -		DRV_LOG(WARNING, "port %u no file descriptor received",
> > -			dev->data->port_id);
> > -		goto error;
> > -	}
> > -	fd = (int *)CMSG_DATA(cmsg);
> > -	if (*fd < 0) {
> > -		DRV_LOG(WARNING, "port %u no file descriptor received: %s",
> > -			dev->data->port_id, strerror(errno));
> > -		rte_errno = *fd;
> > -		goto error;
> > -	}
> > -	ret = *fd;
> > -	close(socket_fd);
> > -	return ret;
> > -error:
> > -	if (socket_fd != -1)
> > -		close(socket_fd);
> > -	return -rte_errno;
> > -}
> > --
> > 2.11.0
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 3/4] net/mlx5: rework PMD global data init
  2019-03-18 21:21     ` Yongseok Koh
  2019-03-18 21:21       ` Yongseok Koh
@ 2019-03-19  6:54       ` Shahaf Shuler
  2019-03-19  6:54         ` Shahaf Shuler
  1 sibling, 1 reply; 45+ messages in thread
From: Shahaf Shuler @ 2019-03-19  6:54 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: dev

Monday, March 18, 2019 11:21 PM, Yongseok Koh:
> Subject: Re: [PATCH 3/4] net/mlx5: rework PMD global data init
> 
> On Thu, Mar 14, 2019 at 05:36:28AM -0700, Shahaf Shuler wrote:
> > Hi Koh,
> >
> > Thursday, March 7, 2019 9:33 AM, Yongseok Koh:
> > > Subject: [PATCH 3/4] net/mlx5: rework PMD global data init
> > >
> > > There is a growing need for a PMD global data structure. It should be
> > > initialized once per process regardless of how many PMD instances are
> > > probed. mlx5_init_once() is called during probing and makes sure all the
> > > init functions are called once per process. The existing shared memory
> > > is used more extensively for this purpose. As there could be multiple
> > > secondary processes, a static storage (local to the process) is also added.
> >
> > It is hard to understand from the commit log what was missing in the old
> > design.
> 
> Okay, will add more comments.
> 
> > > As the reserved virtual address for UAR remap is a PMD global
> > > resource, this doesn't need to be stored in the device priv
> > > structure, but in the PMD global data.
> >
> > I thought we agreed to drop those and have a different VA for each process.
> > If so, is the extra work on the UAR here needed?
> 
> My plan was to do that in a separate patch for performance regression.
> Let me know if you want it to be done in this patchset.

I prefer to see how the UAR will look in the final design. If you can include such a patch it will be great.
Otherwise let's keep it as is.

> 
> > > Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > ---
> > >  drivers/net/mlx5/mlx5.c     | 250 ++++++++++++++++++++++++++++++++------------
> > >  drivers/net/mlx5/mlx5.h     |  19 +++-
> > >  drivers/net/mlx5/mlx5_mp.c  |  19 +++-
> > >  drivers/net/mlx5/mlx5_txq.c |   7 +-
> > >  4 files changed, 217 insertions(+), 78 deletions(-)
> > >
> > > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> > > index 6ed2418106..ea8fd55ee6 100644
> > > --- a/drivers/net/mlx5/mlx5.c
> > > +++ b/drivers/net/mlx5/mlx5.c
> > > @@ -128,16 +128,26 @@ struct mlx5_shared_data *mlx5_shared_data;
> > >  /* Spinlock for mlx5_shared_data allocation. */
> > >  static rte_spinlock_t mlx5_shared_data_lock = RTE_SPINLOCK_INITIALIZER;
> > >
> > > +/* Process local data for secondary processes. */
> > > +static struct mlx5_local_data mlx5_local_data;
> >
> > Why not store this context as part of ethdev->process_private instead of
> > declaring it static?
> 
> Because it is not per-device data but per-PMD data.
> 
> Will also have to rebase my patchsets when I send out v2.
> 
> 
> Thanks,
> Yongseok
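
To illustrate the per-PMD versus per-device distinction discussed above, here
is a small sketch of process-local data that is set up on the first probe
only. The sketch_ names are hypothetical and the example assumes
single-threaded probing; the driver itself has a spinlock
(mlx5_shared_data_lock) for the shared part.

```c
#include <stdbool.h>

/* Hypothetical process-local PMD data: one instance per process. */
struct sketch_local_data {
	bool init_done; /* set once the per-process init has run */
	int init_calls; /* how many times the init body actually ran */
};

/* Static storage: each primary or secondary process gets its own copy. */
static struct sketch_local_data sketch_local_data;

/*
 * Called on every device probe. The body runs only on the first call in
 * this process, no matter how many devices are probed afterwards.
 * Not thread-safe as written; a lock would be needed for concurrent probes.
 */
static int
sketch_init_once(void)
{
	if (sketch_local_data.init_done)
		return 0;
	sketch_local_data.init_calls++;
	sketch_local_data.init_done = true;
	return 0;
}
```

Because the state does not belong to any single port, keeping it in a static
object rather than in a per-device private area avoids picking an arbitrary
device to own it.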

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 0/4] net/mlx5: rework IPC socket and PMD global data init
  2019-03-07  7:33 [dpdk-dev] [PATCH 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
                   ` (3 preceding siblings ...)
  2019-03-07  7:33 ` [dpdk-dev] [PATCH 4/4] net/mlx5: sync stop/start of datapath with secondary process Yongseok Koh
@ 2019-03-25 19:15 ` Yongseok Koh
  2019-03-25 19:15   ` Yongseok Koh
                     ` (4 more replies)
  2019-04-01 21:12 ` [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
  5 siblings, 5 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-25 19:15 UTC (permalink / raw)
  To: shahafs; +Cc: dev

The existing socket-based IPC channel is replaced with the new rte_mp APIs of
EAL and extended to request secondary processes to stop/start the dataplane.
Also, initialization of PMD global data, including the new IPC channel, is
reworked to provide a more generic framework for future use.

v2:
* add more sanity checks for eth_dev and the return value of the IPC request.
* expand commit messages
* add MLX5_MP_REQ_TIMEOUT_SEC
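
For illustration only, a named timeout can replace the literal timespec used
in v1 roughly as follows. SKETCH_MP_REQ_TIMEOUT_SEC and sketch_mp_timeout()
are hypothetical names, and the value 5 is assumed from the hard-coded
.tv_sec = 5 in v1.

```c
#include <time.h>

/* Assumed value: v1 hard-coded a 5 second timeout for the IPC request. */
#define SKETCH_MP_REQ_TIMEOUT_SEC 5

/* Build the timeout that would be passed to the synchronous IPC request. */
static struct timespec
sketch_mp_timeout(void)
{
	struct timespec ts = {
		.tv_sec = SKETCH_MP_REQ_TIMEOUT_SEC,
		.tv_nsec = 0,
	};

	return ts;
}
```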

Yongseok Koh (4):
  net/mlx5: fix memory event on secondary process
  net/mlx5: replace IPC socket with EAL API
  net/mlx5: rework PMD global data init
  net/mlx5: sync stop/start of datapath with secondary process

 drivers/net/mlx5/Makefile       |   2 +-
 drivers/net/mlx5/meson.build    |   2 +-
 drivers/net/mlx5/mlx5.c         | 257 ++++++++++++++++++++++++---------
 drivers/net/mlx5/mlx5.h         |  52 +++++--
 drivers/net/mlx5/mlx5_ethdev.c  |  29 ----
 drivers/net/mlx5/mlx5_mp.c      | 308 ++++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_mr.c      |   2 +
 drivers/net/mlx5/mlx5_rxtx.c    |   2 +
 drivers/net/mlx5/mlx5_socket.c  | 306 ---------------------------------------
 drivers/net/mlx5/mlx5_trigger.c |   5 +
 drivers/net/mlx5/mlx5_txq.c     |   7 +-
 11 files changed, 552 insertions(+), 420 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mp.c
 delete mode 100644 drivers/net/mlx5/mlx5_socket.c

-- 
2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 1/4] net/mlx5: fix memory event on secondary process
  2019-03-25 19:15 ` [dpdk-dev] [PATCH v2 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
  2019-03-25 19:15   ` Yongseok Koh
@ 2019-03-25 19:15   ` Yongseok Koh
  2019-03-25 19:15     ` Yongseok Koh
  2019-03-26 12:28     ` Shahaf Shuler
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
                     ` (2 subsequent siblings)
  4 siblings, 2 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-25 19:15 UTC (permalink / raw)
  To: shahafs; +Cc: dev, stable

As the memory event is propagated to secondary processes, the event is
processed redundantly. This should be processed once because the data
structure used for MR and the event is global across the processes.

Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/mlx5.c    | 5 +++--
 drivers/net/mlx5/mlx5_mr.c | 2 ++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index ae4b71695e..dd29eba955 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -157,9 +157,10 @@ mlx5_prepare_shared_data(void)
 		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
 			LIST_INIT(&mlx5_shared_data->mem_event_cb_list);
 			rte_rwlock_init(&mlx5_shared_data->mem_event_rwlock);
+			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
+							mlx5_mr_mem_event_cb,
+							NULL);
 		}
-		rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
-						mlx5_mr_mem_event_cb, NULL);
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
 }
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 700d83d1bc..d336a77e40 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -891,6 +891,8 @@ mlx5_mr_mem_event_cb(enum rte_mem_event event_type, const void *addr,
 	struct mlx5_priv *priv;
 	struct mlx5_dev_list *dev_list = &mlx5_shared_data->mem_event_cb_list;
 
+	/* Must be called from the primary process. */
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
 	switch (event_type) {
 	case RTE_MEM_EVENT_FREE:
 		rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 1/4] net/mlx5: fix memory event on secondary process
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 1/4] net/mlx5: fix memory event on secondary process Yongseok Koh
@ 2019-03-25 19:15     ` Yongseok Koh
  2019-03-26 12:28     ` Shahaf Shuler
  1 sibling, 0 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-25 19:15 UTC (permalink / raw)
  To: shahafs; +Cc: dev, stable

As the memory event is propagated to secondary processes, the event is
processed redundantly. This should be processed once because the data
structure used for MR and the event is global across the processes.

Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
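Note: the effect of registering the callback only in the primary process
can be sketched with a small stand-alone simulation. This is only an
illustration under stated assumptions, not the driver code: every demo_*
name is hypothetical, and a plain struct stands in for the shared data that
mlx5_shared_data points to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the EAL process type. */
enum demo_proc_type { DEMO_PROC_PRIMARY, DEMO_PROC_SECONDARY };

/* Hypothetical stand-in for state that is global across processes. */
struct demo_shared {
	int free_events_handled;
};

typedef void (*demo_mem_event_cb)(struct demo_shared *sh);

/* One entry per simulated process. */
struct demo_process {
	enum demo_proc_type type;
	demo_mem_event_cb cb; /* NULL when no callback is registered. */
};

/* The callback mutates the shared state, so it must run only once. */
static void
demo_mem_event_cb_fn(struct demo_shared *sh)
{
	assert(sh != NULL);
	sh->free_events_handled++;
}

/* Register the callback in the primary process only. */
static void
demo_prepare(struct demo_process *proc)
{
	proc->cb = NULL;
	if (proc->type == DEMO_PROC_PRIMARY)
		proc->cb = demo_mem_event_cb_fn;
}

/* Deliver one event to every process, as a propagated event would be. */
static void
demo_propagate(struct demo_process *procs, size_t n, struct demo_shared *sh)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (procs[i].cb != NULL)
			procs[i].cb(sh);
}
```

With one primary and any number of secondaries, a propagated event updates
the shared state exactly once. Registering in every process would update
it once per process.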
 drivers/net/mlx5/mlx5.c    | 5 +++--
 drivers/net/mlx5/mlx5_mr.c | 2 ++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index ae4b71695e..dd29eba955 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -157,9 +157,10 @@ mlx5_prepare_shared_data(void)
 		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
 			LIST_INIT(&mlx5_shared_data->mem_event_cb_list);
 			rte_rwlock_init(&mlx5_shared_data->mem_event_rwlock);
+			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
+							mlx5_mr_mem_event_cb,
+							NULL);
 		}
-		rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
-						mlx5_mr_mem_event_cb, NULL);
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
 }
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 700d83d1bc..d336a77e40 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -891,6 +891,8 @@ mlx5_mr_mem_event_cb(enum rte_mem_event event_type, const void *addr,
 	struct mlx5_priv *priv;
 	struct mlx5_dev_list *dev_list = &mlx5_shared_data->mem_event_cb_list;
 
+	/* Must be called from the primary process. */
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
 	switch (event_type) {
 	case RTE_MEM_EVENT_FREE:
 		rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
-- 
2.11.0


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 2/4] net/mlx5: replace IPC socket with EAL API
  2019-03-25 19:15 ` [dpdk-dev] [PATCH v2 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
  2019-03-25 19:15   ` Yongseok Koh
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 1/4] net/mlx5: fix memory event on secondary process Yongseok Koh
@ 2019-03-25 19:15   ` Yongseok Koh
  2019-03-25 19:15     ` Yongseok Koh
  2019-03-26 12:31     ` Shahaf Shuler
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 3/4] net/mlx5: rework PMD global data init Yongseok Koh
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 4/4] net/mlx5: sync stop/start of datapath with secondary process Yongseok Koh
  4 siblings, 2 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-25 19:15 UTC (permalink / raw)
  To: shahafs; +Cc: dev

The socket API is used for IPC so that a secondary process can acquire the
Verbs command file descriptor, which is needed to remap the UAR address.
EAL now provides multi-process APIs (rte_mp), so mlx5_socket.c is replaced
with mlx5_mp.c, which uses them.

As this is PMD-global infrastructure, only one IPC channel is established.
Any IPC message type may carry a port_id in the message when it needs to
reference a specific device.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
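Note: for readers unfamiliar with passing a file descriptor between
processes, the mechanism can be sketched without DPDK. The removed
mlx5_socket.c did this by hand with SCM_RIGHTS ancillary data; the sketch
below shows that same low-level step in isolation. It is a minimal
illustration under assumptions, not the rte_mp implementation: the demo_*
names are hypothetical, and struct demo_param is only loosely modeled on
struct mlx5_mp_param.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical payload, loosely modeled on struct mlx5_mp_param. */
struct demo_param {
	int type;
	int port_id;
	int result;
};

/* Send the payload and one descriptor as SCM_RIGHTS ancillary data. */
static int
demo_send_fd(int sock, const struct demo_param *param, int fd)
{
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align; /* Forces correct alignment of buf. */
	} ctrl;
	struct demo_param copy = *param;
	struct iovec io = { .iov_base = &copy, .iov_len = sizeof(copy) };
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &io;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);
	cmsg = CMSG_FIRSTHDR(&msg);
	assert(cmsg != NULL);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
	return sendmsg(sock, &msg, 0) == (ssize_t)sizeof(copy) ? 0 : -1;
}

/* Receive the payload; return the descriptor installed by the kernel. */
static int
demo_recv_fd(int sock, struct demo_param *param)
{
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl;
	struct iovec io = { .iov_base = param, .iov_len = sizeof(*param) };
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &io;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);
	if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(*param))
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The receiver gets a new descriptor number that refers to the same open
file, which is why the secondary process can mmap through the FD it
receives. In the patch, this bookkeeping is delegated to EAL through the
fds[] and num_fds fields of struct rte_mp_msg.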
 drivers/net/mlx5/Makefile      |   2 +-
 drivers/net/mlx5/meson.build   |   2 +-
 drivers/net/mlx5/mlx5.c        |   5 +-
 drivers/net/mlx5/mlx5.h        |  29 ++--
 drivers/net/mlx5/mlx5_ethdev.c |  29 ----
 drivers/net/mlx5/mlx5_mp.c     | 139 +++++++++++++++++++
 drivers/net/mlx5/mlx5_socket.c | 306 -----------------------------------------
 7 files changed, 164 insertions(+), 348 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mp.c
 delete mode 100644 drivers/net/mlx5/mlx5_socket.c

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 9a7da18196..34dc957ac0 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -34,7 +34,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_dv.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_tcf.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_verbs.c
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mp.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_devx_cmds.c
 
diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build
index 0cf2f0873e..d99667670c 100644
--- a/drivers/net/mlx5/meson.build
+++ b/drivers/net/mlx5/meson.build
@@ -41,7 +41,7 @@ if build
 		'mlx5_rxmode.c',
 		'mlx5_rxq.c',
 		'mlx5_rxtx.c',
-		'mlx5_socket.c',
+		'mlx5_mp.c',
 		'mlx5_stats.c',
 		'mlx5_trigger.c',
 		'mlx5_txq.c',
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index dd29eba955..316f34cd05 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -160,6 +160,7 @@ mlx5_prepare_shared_data(void)
 			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
 							mlx5_mr_mem_event_cb,
 							NULL);
+			mlx5_mp_init();
 		}
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
@@ -291,8 +292,6 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 		rte_free(priv->rss_conf.rss_key);
 	if (priv->reta_idx != NULL)
 		rte_free(priv->reta_idx);
-	if (priv->primary_socket)
-		mlx5_socket_uninit(dev);
 	if (priv->config.vf)
 		mlx5_nl_mac_addr_flush(dev);
 	if (priv->nl_socket_route >= 0)
@@ -929,7 +928,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 			goto error;
 		}
 		/* Receive command fd from primary process */
-		err = mlx5_socket_connect(eth_dev);
+		err = mlx5_mp_req_verbs_cmd_fd(eth_dev);
 		if (err < 0) {
 			err = rte_errno;
 			goto error;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 88ffb19247..7030c6f7d7 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -56,6 +56,24 @@ enum {
 	PCI_DEVICE_ID_MELLANOX_CONNECTX6VF = 0x101c,
 };
 
+/* Request types for IPC. */
+enum mlx5_mp_req_type {
+	MLX5_MP_REQ_VERBS_CMD_FD = 1,
+};
+
+/* Parameters for IPC. */
+struct mlx5_mp_param {
+	enum mlx5_mp_req_type type;
+	int port_id;
+	int result;
+};
+
+/** Request timeout for IPC. */
+#define MLX5_MP_REQ_TIMEOUT_SEC 5
+
+/** Key string for IPC. */
+#define MLX5_MP_NAME "net_mlx5_mp"
+
 /** Switch information returned by mlx5_nl_switch_info(). */
 struct mlx5_switch_info {
 	uint32_t master:1; /**< Master device. */
@@ -242,9 +260,7 @@ struct mlx5_priv {
 	uint32_t link_speed_capa; /* Link speed capabilities. */
 	struct mlx5_xstats_ctrl xstats_ctrl; /* Extended stats control. */
 	struct mlx5_stats_ctrl stats_ctrl; /* Stats control. */
-	int primary_socket; /* Unix socket for primary process. */
 	void *uar_base; /* Reserved address space for UAR mapping */
-	struct rte_intr_handle intr_handle_socket; /* Interrupt handler. */
 	struct mlx5_dev_config config; /* Device configuration. */
 	struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
 	/* Context for Verbs allocator. */
@@ -402,12 +418,9 @@ int mlx5_ctrl_flow(struct rte_eth_dev *dev,
 int mlx5_flow_create_drop_queue(struct rte_eth_dev *dev);
 void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
-/* mlx5_socket.c */
-
-int mlx5_socket_init(struct rte_eth_dev *priv);
-void mlx5_socket_uninit(struct rte_eth_dev *priv);
-void mlx5_socket_handle(struct rte_eth_dev *priv);
-int mlx5_socket_connect(struct rte_eth_dev *priv);
+/* mlx5_mp.c */
+int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
+void mlx5_mp_init(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 84d761c8e9..82a77e0d2b 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1091,20 +1091,6 @@ mlx5_dev_interrupt_handler(void *cb_arg)
 }
 
 /**
- * Handle interrupts from the socket.
- *
- * @param cb_arg
- *   Callback argument.
- */
-static void
-mlx5_dev_handler_socket(void *cb_arg)
-{
-	struct rte_eth_dev *dev = cb_arg;
-
-	mlx5_socket_handle(dev);
-}
-
-/**
  * Uninstall interrupt handler.
  *
  * @param dev
@@ -1119,13 +1105,8 @@ mlx5_dev_interrupt_handler_uninstall(struct rte_eth_dev *dev)
 	    dev->data->dev_conf.intr_conf.rmv)
 		rte_intr_callback_unregister(&priv->intr_handle,
 					     mlx5_dev_interrupt_handler, dev);
-	if (priv->primary_socket)
-		rte_intr_callback_unregister(&priv->intr_handle_socket,
-					     mlx5_dev_handler_socket, dev);
 	priv->intr_handle.fd = 0;
 	priv->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
-	priv->intr_handle_socket.fd = 0;
-	priv->intr_handle_socket.type = RTE_INTR_HANDLE_UNKNOWN;
 }
 
 /**
@@ -1159,16 +1140,6 @@ mlx5_dev_interrupt_handler_install(struct rte_eth_dev *dev)
 		rte_intr_callback_register(&priv->intr_handle,
 					   mlx5_dev_interrupt_handler, dev);
 	}
-	ret = mlx5_socket_init(dev);
-	if (ret)
-		DRV_LOG(ERR, "port %u cannot initialise socket: %s",
-			dev->data->port_id, strerror(rte_errno));
-	else if (priv->primary_socket) {
-		priv->intr_handle_socket.fd = priv->primary_socket;
-		priv->intr_handle_socket.type = RTE_INTR_HANDLE_EXT;
-		rte_intr_callback_register(&priv->intr_handle_socket,
-					   mlx5_dev_handler_socket, dev);
-	}
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
new file mode 100644
index 0000000000..b8dd4b5fa7
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2019 6WIND S.A.
+ * Copyright 2019 Mellanox Technologies, Ltd
+ */
+
+#include <assert.h>
+#include <stdio.h>
+#include <time.h>
+
+#include <rte_eal.h>
+#include <rte_ethdev_driver.h>
+#include <rte_string_fns.h>
+
+#include "mlx5.h"
+#include "mlx5_utils.h"
+
+/**
+ * Initialize IPC message.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ * @param[out] msg
+ *   Pointer to message to fill in.
+ * @param[in] type
+ *   Message type.
+ */
+static inline void
+mp_init_msg(struct rte_eth_dev *dev, struct rte_mp_msg *msg,
+	    enum mlx5_mp_req_type type)
+{
+	struct mlx5_mp_param *param = (struct mlx5_mp_param *)msg->param;
+
+	memset(msg, 0, sizeof(*msg));
+	strlcpy(msg->name, MLX5_MP_NAME, sizeof(msg->name));
+	msg->len_param = sizeof(*param);
+	param->type = type;
+	param->port_id = dev->data->port_id;
+}
+
+/**
+ * IPC message handler of primary process.
+ *
+ * @param[in] mp_msg
+ *   Pointer to the request message.
+ * @param[in] peer
+ *   Pointer to the peer socket path.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_mp_msg mp_res;
+	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
+	const struct mlx5_mp_param *param =
+		(const struct mlx5_mp_param *)mp_msg->param;
+	struct rte_eth_dev *dev;
+	struct mlx5_priv *priv;
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	if (!rte_eth_dev_is_valid_port(param->port_id)) {
+		rte_errno = ENODEV;
+		DRV_LOG(ERR, "port %u invalid port ID", param->port_id);
+		return -rte_errno;
+	}
+	dev = &rte_eth_devices[param->port_id];
+	priv = dev->data->dev_private;
+	switch (param->type) {
+	case MLX5_MP_REQ_VERBS_CMD_FD:
+		mp_init_msg(dev, &mp_res, param->type);
+		mp_res.num_fds = 1;
+		mp_res.fds[0] = priv->ctx->cmd_fd;
+		res->result = 0;
+		ret = rte_mp_reply(&mp_res, peer);
+		break;
+	default:
+		rte_errno = EINVAL;
+		DRV_LOG(ERR, "port %u invalid mp request type",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	return ret;
+}
+
+/**
+ * Request the Verbs command file descriptor for mmap from the primary process.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ *
+ * @return
+ *   fd on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
+{
+	struct rte_mp_msg mp_req;
+	struct rte_mp_msg *mp_res;
+	struct rte_mp_reply mp_rep;
+	struct mlx5_mp_param *res;
+	struct timespec ts = {.tv_sec = MLX5_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	mp_init_msg(dev, &mp_req, MLX5_MP_REQ_VERBS_CMD_FD);
+	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
+	if (ret) {
+		DRV_LOG(ERR, "port %u request to primary process failed",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	assert(mp_rep.nb_received == 1);
+	mp_res = &mp_rep.msgs[0];
+	res = (struct mlx5_mp_param *)mp_res->param;
+	if (res->result) {
+		rte_errno = -res->result;
+		DRV_LOG(ERR,
+			"port %u failed to get command FD from primary process",
+			dev->data->port_id);
+		ret = -rte_errno;
+		goto exit;
+	}
+	assert(mp_res->num_fds == 1);
+	ret = mp_res->fds[0];
+	DRV_LOG(DEBUG, "port %u command FD from primary is %d",
+		dev->data->port_id, ret);
+exit:
+	free(mp_rep.msgs);
+	return ret;
+}
+
+void
+mlx5_mp_init(void)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+}
diff --git a/drivers/net/mlx5/mlx5_socket.c b/drivers/net/mlx5/mlx5_socket.c
deleted file mode 100644
index 41cac3c6aa..0000000000
--- a/drivers/net/mlx5/mlx5_socket.c
+++ /dev/null
@@ -1,306 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2016 6WIND S.A.
- * Copyright 2016 Mellanox Technologies, Ltd
- */
-
-#include <sys/types.h>
-#include <sys/socket.h>
-#include <sys/un.h>
-#include <fcntl.h>
-#include <stdio.h>
-#include <unistd.h>
-#include <sys/stat.h>
-
-#include "mlx5.h"
-#include "mlx5_utils.h"
-
-/**
- * Initialise the socket to communicate with the secondary process
- *
- * @param[in] dev
- *   Pointer to Ethernet device.
- *
- * @return
- *   0 on success, a negative errno value otherwise and rte_errno is set.
- */
-int
-mlx5_socket_init(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct sockaddr_un sun = {
-		.sun_family = AF_UNIX,
-	};
-	int ret;
-	int flags;
-
-	/*
-	 * Close the last socket that was used to communicate
-	 * with the secondary process
-	 */
-	if (priv->primary_socket)
-		mlx5_socket_uninit(dev);
-	/*
-	 * Initialise the socket to communicate with the secondary
-	 * process.
-	 */
-	ret = socket(AF_UNIX, SOCK_STREAM, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	priv->primary_socket = ret;
-	flags = fcntl(priv->primary_socket, F_GETFL, 0);
-	if (flags == -1) {
-		rte_errno = errno;
-		goto error;
-	}
-	ret = fcntl(priv->primary_socket, F_SETFL, flags | O_NONBLOCK);
-	if (ret < 0) {
-		rte_errno = errno;
-		goto error;
-	}
-	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
-		 MLX5_DRIVER_NAME, priv->primary_socket);
-	remove(sun.sun_path);
-	ret = bind(priv->primary_socket, (const struct sockaddr *)&sun,
-		   sizeof(sun));
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING,
-			"port %u cannot bind socket, secondary process not"
-			" supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto close;
-	}
-	ret = listen(priv->primary_socket, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto close;
-	}
-	return 0;
-close:
-	remove(sun.sun_path);
-error:
-	claim_zero(close(priv->primary_socket));
-	priv->primary_socket = 0;
-	return -rte_errno;
-}
-
-/**
- * Un-Initialise the socket to communicate with the secondary process
- *
- * @param[in] dev
- */
-void
-mlx5_socket_uninit(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-
-	MKSTR(path, "/var/tmp/%s_%d", MLX5_DRIVER_NAME, priv->primary_socket);
-	claim_zero(close(priv->primary_socket));
-	priv->primary_socket = 0;
-	claim_zero(remove(path));
-}
-
-/**
- * Handle socket interrupts.
- *
- * @param dev
- *   Pointer to Ethernet device.
- */
-void
-mlx5_socket_handle(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	int conn_sock;
-	int ret = 0;
-	struct cmsghdr *cmsg = NULL;
-	struct ucred *cred = NULL;
-	char buf[CMSG_SPACE(sizeof(struct ucred))] = { 0 };
-	char vbuf[1024] = { 0 };
-	struct iovec io = {
-		.iov_base = vbuf,
-		.iov_len = sizeof(*vbuf),
-	};
-	struct msghdr msg = {
-		.msg_iov = &io,
-		.msg_iovlen = 1,
-		.msg_control = buf,
-		.msg_controllen = sizeof(buf),
-	};
-	int *fd;
-
-	/* Accept the connection from the client. */
-	conn_sock = accept(priv->primary_socket, NULL, NULL);
-	if (conn_sock < 0) {
-		DRV_LOG(WARNING, "port %u connection failed: %s",
-			dev->data->port_id, strerror(errno));
-		return;
-	}
-	ret = setsockopt(conn_sock, SOL_SOCKET, SO_PASSCRED, &(int){1},
-					 sizeof(int));
-	if (ret < 0) {
-		ret = errno;
-		DRV_LOG(WARNING, "port %u cannot change socket options: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
-	ret = recvmsg(conn_sock, &msg, MSG_WAITALL);
-	if (ret < 0) {
-		ret = errno;
-		DRV_LOG(WARNING, "port %u received an empty message: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
-	/* Expect to receive credentials only. */
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		DRV_LOG(WARNING, "port %u no message", dev->data->port_id);
-		goto error;
-	}
-	if ((cmsg->cmsg_type == SCM_CREDENTIALS) &&
-		(cmsg->cmsg_len >= sizeof(*cred))) {
-		cred = (struct ucred *)CMSG_DATA(cmsg);
-		assert(cred != NULL);
-	}
-	cmsg = CMSG_NXTHDR(&msg, cmsg);
-	if (cmsg != NULL) {
-		DRV_LOG(WARNING, "port %u message wrongly formatted",
-			dev->data->port_id);
-		goto error;
-	}
-	/* Make sure all the ancillary data was received and valid. */
-	if ((cred == NULL) || (cred->uid != getuid()) ||
-	    (cred->gid != getgid())) {
-		DRV_LOG(WARNING, "port %u wrong credentials",
-			dev->data->port_id);
-		goto error;
-	}
-	/* Set-up the ancillary data. */
-	cmsg = CMSG_FIRSTHDR(&msg);
-	assert(cmsg != NULL);
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_RIGHTS;
-	cmsg->cmsg_len = CMSG_LEN(sizeof(priv->ctx->cmd_fd));
-	fd = (int *)CMSG_DATA(cmsg);
-	*fd = priv->ctx->cmd_fd;
-	ret = sendmsg(conn_sock, &msg, 0);
-	if (ret < 0)
-		DRV_LOG(WARNING, "port %u cannot send response",
-			dev->data->port_id);
-error:
-	close(conn_sock);
-}
-
-/**
- * Connect to the primary process.
- *
- * @param[in] dev
- *   Pointer to Ethernet structure.
- *
- * @return
- *   fd on success, negative errno value otherwise and rte_errno is set.
- */
-int
-mlx5_socket_connect(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct sockaddr_un sun = {
-		.sun_family = AF_UNIX,
-	};
-	int socket_fd = -1;
-	int *fd = NULL;
-	int ret;
-	struct ucred *cred;
-	char buf[CMSG_SPACE(sizeof(*cred))] = { 0 };
-	char vbuf[1024] = { 0 };
-	struct iovec io = {
-		.iov_base = vbuf,
-		.iov_len = sizeof(*vbuf),
-	};
-	struct msghdr msg = {
-		.msg_control = buf,
-		.msg_controllen = sizeof(buf),
-		.msg_iov = &io,
-		.msg_iovlen = 1,
-	};
-	struct cmsghdr *cmsg;
-
-	ret = socket(AF_UNIX, SOCK_STREAM, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u cannot connect to primary",
-			dev->data->port_id);
-		goto error;
-	}
-	socket_fd = ret;
-	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
-		 MLX5_DRIVER_NAME, priv->primary_socket);
-	ret = connect(socket_fd, (const struct sockaddr *)&sun, sizeof(sun));
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u cannot connect to primary",
-			dev->data->port_id);
-		goto error;
-	}
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(DEBUG, "port %u cannot get first message",
-			dev->data->port_id);
-		goto error;
-	}
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_CREDENTIALS;
-	cmsg->cmsg_len = CMSG_LEN(sizeof(*cred));
-	cred = (struct ucred *)CMSG_DATA(cmsg);
-	if (cred == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(DEBUG, "port %u no credentials received",
-			dev->data->port_id);
-		goto error;
-	}
-	cred->pid = getpid();
-	cred->uid = getuid();
-	cred->gid = getgid();
-	ret = sendmsg(socket_fd, &msg, MSG_DONTWAIT);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING,
-			"port %u cannot send credentials to primary: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	ret = recvmsg(socket_fd, &msg, MSG_WAITALL);
-	if (ret <= 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u no message from primary: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(WARNING, "port %u no file descriptor received",
-			dev->data->port_id);
-		goto error;
-	}
-	fd = (int *)CMSG_DATA(cmsg);
-	if (*fd < 0) {
-		DRV_LOG(WARNING, "port %u no file descriptor received: %s",
-			dev->data->port_id, strerror(errno));
-		rte_errno = *fd;
-		goto error;
-	}
-	ret = *fd;
-	close(socket_fd);
-	return ret;
-error:
-	if (socket_fd != -1)
-		close(socket_fd);
-	return -rte_errno;
-}
-- 
2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 2/4] net/mlx5: replace IPC socket with EAL API
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
@ 2019-03-25 19:15     ` Yongseok Koh
  2019-03-26 12:31     ` Shahaf Shuler
  1 sibling, 0 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-25 19:15 UTC (permalink / raw)
  To: shahafs; +Cc: dev

The socket API is used for IPC so that a secondary process can acquire the
Verbs command file descriptor, which is needed to remap the UAR address.
EAL now provides multi-process APIs (rte_mp), so mlx5_socket.c is replaced
with mlx5_mp.c, which uses them.

As this is PMD-global infrastructure, only one IPC channel is established.
Any IPC message type may carry a port_id in the message when it needs to
reference a specific device.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/Makefile      |   2 +-
 drivers/net/mlx5/meson.build   |   2 +-
 drivers/net/mlx5/mlx5.c        |   5 +-
 drivers/net/mlx5/mlx5.h        |  29 ++--
 drivers/net/mlx5/mlx5_ethdev.c |  29 ----
 drivers/net/mlx5/mlx5_mp.c     | 139 +++++++++++++++++++
 drivers/net/mlx5/mlx5_socket.c | 306 -----------------------------------------
 7 files changed, 164 insertions(+), 348 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mp.c
 delete mode 100644 drivers/net/mlx5/mlx5_socket.c

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 9a7da18196..34dc957ac0 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -34,7 +34,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_dv.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_tcf.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_verbs.c
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mp.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_devx_cmds.c
 
diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build
index 0cf2f0873e..d99667670c 100644
--- a/drivers/net/mlx5/meson.build
+++ b/drivers/net/mlx5/meson.build
@@ -41,7 +41,7 @@ if build
 		'mlx5_rxmode.c',
 		'mlx5_rxq.c',
 		'mlx5_rxtx.c',
-		'mlx5_socket.c',
+		'mlx5_mp.c',
 		'mlx5_stats.c',
 		'mlx5_trigger.c',
 		'mlx5_txq.c',
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index dd29eba955..316f34cd05 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -160,6 +160,7 @@ mlx5_prepare_shared_data(void)
 			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
 							mlx5_mr_mem_event_cb,
 							NULL);
+			mlx5_mp_init();
 		}
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
@@ -291,8 +292,6 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 		rte_free(priv->rss_conf.rss_key);
 	if (priv->reta_idx != NULL)
 		rte_free(priv->reta_idx);
-	if (priv->primary_socket)
-		mlx5_socket_uninit(dev);
 	if (priv->config.vf)
 		mlx5_nl_mac_addr_flush(dev);
 	if (priv->nl_socket_route >= 0)
@@ -929,7 +928,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 			goto error;
 		}
 		/* Receive command fd from primary process */
-		err = mlx5_socket_connect(eth_dev);
+		err = mlx5_mp_req_verbs_cmd_fd(eth_dev);
 		if (err < 0) {
 			err = rte_errno;
 			goto error;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 88ffb19247..7030c6f7d7 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -56,6 +56,24 @@ enum {
 	PCI_DEVICE_ID_MELLANOX_CONNECTX6VF = 0x101c,
 };
 
+/* Request types for IPC. */
+enum mlx5_mp_req_type {
+	MLX5_MP_REQ_VERBS_CMD_FD = 1,
+};
+
+/* Parameters for IPC. */
+struct mlx5_mp_param {
+	enum mlx5_mp_req_type type;
+	int port_id;
+	int result;
+};
+
+/** Request timeout for IPC. */
+#define MLX5_MP_REQ_TIMEOUT_SEC 5
+
+/** Key string for IPC. */
+#define MLX5_MP_NAME "net_mlx5_mp"
+
 /** Switch information returned by mlx5_nl_switch_info(). */
 struct mlx5_switch_info {
 	uint32_t master:1; /**< Master device. */
@@ -242,9 +260,7 @@ struct mlx5_priv {
 	uint32_t link_speed_capa; /* Link speed capabilities. */
 	struct mlx5_xstats_ctrl xstats_ctrl; /* Extended stats control. */
 	struct mlx5_stats_ctrl stats_ctrl; /* Stats control. */
-	int primary_socket; /* Unix socket for primary process. */
 	void *uar_base; /* Reserved address space for UAR mapping */
-	struct rte_intr_handle intr_handle_socket; /* Interrupt handler. */
 	struct mlx5_dev_config config; /* Device configuration. */
 	struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
 	/* Context for Verbs allocator. */
@@ -402,12 +418,9 @@ int mlx5_ctrl_flow(struct rte_eth_dev *dev,
 int mlx5_flow_create_drop_queue(struct rte_eth_dev *dev);
 void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
-/* mlx5_socket.c */
-
-int mlx5_socket_init(struct rte_eth_dev *priv);
-void mlx5_socket_uninit(struct rte_eth_dev *priv);
-void mlx5_socket_handle(struct rte_eth_dev *priv);
-int mlx5_socket_connect(struct rte_eth_dev *priv);
+/* mlx5_mp.c */
+int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
+void mlx5_mp_init(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 84d761c8e9..82a77e0d2b 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1091,20 +1091,6 @@ mlx5_dev_interrupt_handler(void *cb_arg)
 }
 
 /**
- * Handle interrupts from the socket.
- *
- * @param cb_arg
- *   Callback argument.
- */
-static void
-mlx5_dev_handler_socket(void *cb_arg)
-{
-	struct rte_eth_dev *dev = cb_arg;
-
-	mlx5_socket_handle(dev);
-}
-
-/**
  * Uninstall interrupt handler.
  *
  * @param dev
@@ -1119,13 +1105,8 @@ mlx5_dev_interrupt_handler_uninstall(struct rte_eth_dev *dev)
 	    dev->data->dev_conf.intr_conf.rmv)
 		rte_intr_callback_unregister(&priv->intr_handle,
 					     mlx5_dev_interrupt_handler, dev);
-	if (priv->primary_socket)
-		rte_intr_callback_unregister(&priv->intr_handle_socket,
-					     mlx5_dev_handler_socket, dev);
 	priv->intr_handle.fd = 0;
 	priv->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
-	priv->intr_handle_socket.fd = 0;
-	priv->intr_handle_socket.type = RTE_INTR_HANDLE_UNKNOWN;
 }
 
 /**
@@ -1159,16 +1140,6 @@ mlx5_dev_interrupt_handler_install(struct rte_eth_dev *dev)
 		rte_intr_callback_register(&priv->intr_handle,
 					   mlx5_dev_interrupt_handler, dev);
 	}
-	ret = mlx5_socket_init(dev);
-	if (ret)
-		DRV_LOG(ERR, "port %u cannot initialise socket: %s",
-			dev->data->port_id, strerror(rte_errno));
-	else if (priv->primary_socket) {
-		priv->intr_handle_socket.fd = priv->primary_socket;
-		priv->intr_handle_socket.type = RTE_INTR_HANDLE_EXT;
-		rte_intr_callback_register(&priv->intr_handle_socket,
-					   mlx5_dev_handler_socket, dev);
-	}
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
new file mode 100644
index 0000000000..b8dd4b5fa7
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2019 6WIND S.A.
+ * Copyright 2019 Mellanox Technologies, Ltd
+ */
+
+#include <assert.h>
+#include <stdio.h>
+#include <time.h>
+
+#include <rte_eal.h>
+#include <rte_ethdev_driver.h>
+#include <rte_string_fns.h>
+
+#include "mlx5.h"
+#include "mlx5_utils.h"
+
+/**
+ * Initialize IPC message.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ * @param[out] msg
+ *   Pointer to message to fill in.
+ * @param[in] type
+ *   Message type.
+ */
+static inline void
+mp_init_msg(struct rte_eth_dev *dev, struct rte_mp_msg *msg,
+	    enum mlx5_mp_req_type type)
+{
+	struct mlx5_mp_param *param = (struct mlx5_mp_param *)msg->param;
+
+	memset(msg, 0, sizeof(*msg));
+	strlcpy(msg->name, MLX5_MP_NAME, sizeof(msg->name));
+	msg->len_param = sizeof(*param);
+	param->type = type;
+	param->port_id = dev->data->port_id;
+}
+
+/**
+ * IPC message handler of primary process.
+ *
+ * @param[in] mp_msg
+ *   Pointer to the request message.
+ * @param[in] peer
+ *   Pointer to the peer socket path.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_mp_msg mp_res;
+	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
+	const struct mlx5_mp_param *param =
+		(const struct mlx5_mp_param *)mp_msg->param;
+	struct rte_eth_dev *dev;
+	struct mlx5_priv *priv;
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	if (!rte_eth_dev_is_valid_port(param->port_id)) {
+		rte_errno = ENODEV;
+		DRV_LOG(ERR, "port %u invalid port ID", param->port_id);
+		return -rte_errno;
+	}
+	dev = &rte_eth_devices[param->port_id];
+	priv = dev->data->dev_private;
+	switch (param->type) {
+	case MLX5_MP_REQ_VERBS_CMD_FD:
+		mp_init_msg(dev, &mp_res, param->type);
+		mp_res.num_fds = 1;
+		mp_res.fds[0] = priv->ctx->cmd_fd;
+		res->result = 0;
+		ret = rte_mp_reply(&mp_res, peer);
+		break;
+	default:
+		rte_errno = EINVAL;
+		DRV_LOG(ERR, "port %u invalid mp request type",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	return ret;
+}
+
+/**
+ * Request the Verbs command file descriptor for mmap from the primary process.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ *
+ * @return
+ *   fd on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
+{
+	struct rte_mp_msg mp_req;
+	struct rte_mp_msg *mp_res;
+	struct rte_mp_reply mp_rep;
+	struct mlx5_mp_param *res;
+	struct timespec ts = {.tv_sec = MLX5_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	mp_init_msg(dev, &mp_req, MLX5_MP_REQ_VERBS_CMD_FD);
+	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
+	if (ret) {
+		DRV_LOG(ERR, "port %u request to primary process failed",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	assert(mp_rep.nb_received == 1);
+	mp_res = &mp_rep.msgs[0];
+	res = (struct mlx5_mp_param *)mp_res->param;
+	if (res->result) {
+		rte_errno = -res->result;
+		DRV_LOG(ERR,
+			"port %u failed to get command FD from primary process",
+			dev->data->port_id);
+		ret = -rte_errno;
+		goto exit;
+	}
+	assert(mp_res->num_fds == 1);
+	ret = mp_res->fds[0];
+	DRV_LOG(DEBUG, "port %u command FD from primary is %d",
+		dev->data->port_id, ret);
+exit:
+	free(mp_rep.msgs);
+	return ret;
+}
+
+void
+mlx5_mp_init(void)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+}
diff --git a/drivers/net/mlx5/mlx5_socket.c b/drivers/net/mlx5/mlx5_socket.c
deleted file mode 100644
index 41cac3c6aa..0000000000
--- a/drivers/net/mlx5/mlx5_socket.c
+++ /dev/null
@@ -1,306 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2016 6WIND S.A.
- * Copyright 2016 Mellanox Technologies, Ltd
- */
-
-#include <sys/types.h>
-#include <sys/socket.h>
-#include <sys/un.h>
-#include <fcntl.h>
-#include <stdio.h>
-#include <unistd.h>
-#include <sys/stat.h>
-
-#include "mlx5.h"
-#include "mlx5_utils.h"
-
-/**
- * Initialise the socket to communicate with the secondary process
- *
- * @param[in] dev
- *   Pointer to Ethernet device.
- *
- * @return
- *   0 on success, a negative errno value otherwise and rte_errno is set.
- */
-int
-mlx5_socket_init(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct sockaddr_un sun = {
-		.sun_family = AF_UNIX,
-	};
-	int ret;
-	int flags;
-
-	/*
-	 * Close the last socket that was used to communicate
-	 * with the secondary process
-	 */
-	if (priv->primary_socket)
-		mlx5_socket_uninit(dev);
-	/*
-	 * Initialise the socket to communicate with the secondary
-	 * process.
-	 */
-	ret = socket(AF_UNIX, SOCK_STREAM, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	priv->primary_socket = ret;
-	flags = fcntl(priv->primary_socket, F_GETFL, 0);
-	if (flags == -1) {
-		rte_errno = errno;
-		goto error;
-	}
-	ret = fcntl(priv->primary_socket, F_SETFL, flags | O_NONBLOCK);
-	if (ret < 0) {
-		rte_errno = errno;
-		goto error;
-	}
-	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
-		 MLX5_DRIVER_NAME, priv->primary_socket);
-	remove(sun.sun_path);
-	ret = bind(priv->primary_socket, (const struct sockaddr *)&sun,
-		   sizeof(sun));
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING,
-			"port %u cannot bind socket, secondary process not"
-			" supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto close;
-	}
-	ret = listen(priv->primary_socket, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto close;
-	}
-	return 0;
-close:
-	remove(sun.sun_path);
-error:
-	claim_zero(close(priv->primary_socket));
-	priv->primary_socket = 0;
-	return -rte_errno;
-}
-
-/**
- * Un-Initialise the socket to communicate with the secondary process
- *
- * @param[in] dev
- */
-void
-mlx5_socket_uninit(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-
-	MKSTR(path, "/var/tmp/%s_%d", MLX5_DRIVER_NAME, priv->primary_socket);
-	claim_zero(close(priv->primary_socket));
-	priv->primary_socket = 0;
-	claim_zero(remove(path));
-}
-
-/**
- * Handle socket interrupts.
- *
- * @param dev
- *   Pointer to Ethernet device.
- */
-void
-mlx5_socket_handle(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	int conn_sock;
-	int ret = 0;
-	struct cmsghdr *cmsg = NULL;
-	struct ucred *cred = NULL;
-	char buf[CMSG_SPACE(sizeof(struct ucred))] = { 0 };
-	char vbuf[1024] = { 0 };
-	struct iovec io = {
-		.iov_base = vbuf,
-		.iov_len = sizeof(*vbuf),
-	};
-	struct msghdr msg = {
-		.msg_iov = &io,
-		.msg_iovlen = 1,
-		.msg_control = buf,
-		.msg_controllen = sizeof(buf),
-	};
-	int *fd;
-
-	/* Accept the connection from the client. */
-	conn_sock = accept(priv->primary_socket, NULL, NULL);
-	if (conn_sock < 0) {
-		DRV_LOG(WARNING, "port %u connection failed: %s",
-			dev->data->port_id, strerror(errno));
-		return;
-	}
-	ret = setsockopt(conn_sock, SOL_SOCKET, SO_PASSCRED, &(int){1},
-					 sizeof(int));
-	if (ret < 0) {
-		ret = errno;
-		DRV_LOG(WARNING, "port %u cannot change socket options: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
-	ret = recvmsg(conn_sock, &msg, MSG_WAITALL);
-	if (ret < 0) {
-		ret = errno;
-		DRV_LOG(WARNING, "port %u received an empty message: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
-	/* Expect to receive credentials only. */
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		DRV_LOG(WARNING, "port %u no message", dev->data->port_id);
-		goto error;
-	}
-	if ((cmsg->cmsg_type == SCM_CREDENTIALS) &&
-		(cmsg->cmsg_len >= sizeof(*cred))) {
-		cred = (struct ucred *)CMSG_DATA(cmsg);
-		assert(cred != NULL);
-	}
-	cmsg = CMSG_NXTHDR(&msg, cmsg);
-	if (cmsg != NULL) {
-		DRV_LOG(WARNING, "port %u message wrongly formatted",
-			dev->data->port_id);
-		goto error;
-	}
-	/* Make sure all the ancillary data was received and valid. */
-	if ((cred == NULL) || (cred->uid != getuid()) ||
-	    (cred->gid != getgid())) {
-		DRV_LOG(WARNING, "port %u wrong credentials",
-			dev->data->port_id);
-		goto error;
-	}
-	/* Set-up the ancillary data. */
-	cmsg = CMSG_FIRSTHDR(&msg);
-	assert(cmsg != NULL);
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_RIGHTS;
-	cmsg->cmsg_len = CMSG_LEN(sizeof(priv->ctx->cmd_fd));
-	fd = (int *)CMSG_DATA(cmsg);
-	*fd = priv->ctx->cmd_fd;
-	ret = sendmsg(conn_sock, &msg, 0);
-	if (ret < 0)
-		DRV_LOG(WARNING, "port %u cannot send response",
-			dev->data->port_id);
-error:
-	close(conn_sock);
-}
-
-/**
- * Connect to the primary process.
- *
- * @param[in] dev
- *   Pointer to Ethernet structure.
- *
- * @return
- *   fd on success, negative errno value otherwise and rte_errno is set.
- */
-int
-mlx5_socket_connect(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct sockaddr_un sun = {
-		.sun_family = AF_UNIX,
-	};
-	int socket_fd = -1;
-	int *fd = NULL;
-	int ret;
-	struct ucred *cred;
-	char buf[CMSG_SPACE(sizeof(*cred))] = { 0 };
-	char vbuf[1024] = { 0 };
-	struct iovec io = {
-		.iov_base = vbuf,
-		.iov_len = sizeof(*vbuf),
-	};
-	struct msghdr msg = {
-		.msg_control = buf,
-		.msg_controllen = sizeof(buf),
-		.msg_iov = &io,
-		.msg_iovlen = 1,
-	};
-	struct cmsghdr *cmsg;
-
-	ret = socket(AF_UNIX, SOCK_STREAM, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u cannot connect to primary",
-			dev->data->port_id);
-		goto error;
-	}
-	socket_fd = ret;
-	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
-		 MLX5_DRIVER_NAME, priv->primary_socket);
-	ret = connect(socket_fd, (const struct sockaddr *)&sun, sizeof(sun));
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u cannot connect to primary",
-			dev->data->port_id);
-		goto error;
-	}
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(DEBUG, "port %u cannot get first message",
-			dev->data->port_id);
-		goto error;
-	}
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_CREDENTIALS;
-	cmsg->cmsg_len = CMSG_LEN(sizeof(*cred));
-	cred = (struct ucred *)CMSG_DATA(cmsg);
-	if (cred == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(DEBUG, "port %u no credentials received",
-			dev->data->port_id);
-		goto error;
-	}
-	cred->pid = getpid();
-	cred->uid = getuid();
-	cred->gid = getgid();
-	ret = sendmsg(socket_fd, &msg, MSG_DONTWAIT);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING,
-			"port %u cannot send credentials to primary: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	ret = recvmsg(socket_fd, &msg, MSG_WAITALL);
-	if (ret <= 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u no message from primary: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(WARNING, "port %u no file descriptor received",
-			dev->data->port_id);
-		goto error;
-	}
-	fd = (int *)CMSG_DATA(cmsg);
-	if (*fd < 0) {
-		DRV_LOG(WARNING, "port %u no file descriptor received: %s",
-			dev->data->port_id, strerror(errno));
-		rte_errno = *fd;
-		goto error;
-	}
-	ret = *fd;
-	close(socket_fd);
-	return ret;
-error:
-	if (socket_fd != -1)
-		close(socket_fd);
-	return -rte_errno;
-}
-- 
2.11.0



* [dpdk-dev] [PATCH v2 3/4] net/mlx5: rework PMD global data init
  2019-03-25 19:15 ` [dpdk-dev] [PATCH v2 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
                     ` (2 preceding siblings ...)
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
@ 2019-03-25 19:15   ` Yongseok Koh
  2019-03-25 19:15     ` Yongseok Koh
  2019-03-26 12:38     ` Shahaf Shuler
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 4/4] net/mlx5: sync stop/start of datapath with secondary process Yongseok Koh
  4 siblings, 2 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-25 19:15 UTC (permalink / raw)
  To: shahafs; +Cc: dev

The PMD increasingly needs global data structures. Such data should be
initialized once per process regardless of how many PMD instances are
probed. mlx5_init_once() is called during probing and makes sure that all
the init functions are called once per process. Currently, the global data
and its initialization functions are scattered. Rather than 'extern'-ing
each variable and calling each function one by one, checking the validity
of the variable to make sure it runs only once, it is better to have a
global storage holding such data and a consolidated function performing
all the initializations. The existing shared memory is used more
extensively for this purpose. As there can be multiple secondary
processes, a static storage (local to each process) is also added.
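
As a rough illustration of the once-per-process idea described above, here
is a minimal standalone sketch. It is not the driver code: demo_local_data,
demo_step() and demo_init_once() are hypothetical names, and a C11
atomic_flag stands in for the rte_spinlock used in the patch. The point it
shows is that the "done" flag is set only after every step succeeded, so a
failed attempt can be retried by a later probe.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PMD's process-local data. */
struct demo_local_data {
	atomic_flag lock; /* Tiny spinlock guarding the fields below. */
	bool init_done;   /* Set only after all init steps succeeded. */
	int init_calls;   /* How many times the init steps actually ran. */
};

static struct demo_local_data demo_ld = { .lock = ATOMIC_FLAG_INIT };

/* Hypothetical init step: 0 on success, negative value on failure. */
static int
demo_step(bool fail)
{
	return fail ? -1 : 0;
}

/* Run the init steps at most once successfully per process. */
static int
demo_init_once(bool fail)
{
	int ret = 0;

	while (atomic_flag_test_and_set(&demo_ld.lock))
		; /* Spin until the lock is free. */
	if (!demo_ld.init_done) {
		++demo_ld.init_calls;
		ret = demo_step(fail);
		if (ret == 0)
			demo_ld.init_done = true;
	}
	atomic_flag_clear(&demo_ld.lock);
	return ret;
}
```

Every probe would call demo_init_once(); only the first successful call
does any work, and later calls return 0 immediately.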

As the reserved virtual address for UAR remap is a PMD global resource, it
does not need to be stored in the device priv structure and is kept in the
PMD global data instead.
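
The reservation scheme can be sketched outside of DPDK as follows. This is
an assumption-laden illustration, not the patch itself: demo_reserve() and
demo_reserve_exact() are hypothetical helpers. The first mirrors what the
primary does (an anonymous PROT_NONE mapping that only claims address
space), the second mirrors the secondary, which must obtain exactly the
address published by the primary and fails otherwise. Unlike the patch, the
sketch unmaps a mapping that landed at the wrong address.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* For MAP_ANONYMOUS under strict -std modes. */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Claim `size` bytes of address space near `hint`; no memory is used. */
static void *
demo_reserve(void *hint, size_t size)
{
	void *addr = mmap(hint, size, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return addr == MAP_FAILED ? NULL : addr;
}

/* Claim the range at exactly `base`; return 0 on success, -1 otherwise. */
static int
demo_reserve_exact(void *base, size_t size)
{
	void *addr = demo_reserve(base, size);

	if (addr == NULL)
		return -1;
	if (addr != base) {
		/* The kernel picked another range: `base` is occupied. */
		munmap(addr, size);
		return -1;
	}
	return 0;
}
```

Because no MAP_FIXED is passed, the address is only a hint, so comparing
the returned address against the requested one is what detects a conflict
in the secondary's address space.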

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/mlx5.c     | 250 ++++++++++++++++++++++++++++++++------------
 drivers/net/mlx5/mlx5.h     |  19 +++-
 drivers/net/mlx5/mlx5_mp.c  |  19 +++-
 drivers/net/mlx5/mlx5_txq.c |   7 +-
 4 files changed, 217 insertions(+), 78 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 316f34cd05..54a1896ea4 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -128,16 +128,26 @@ struct mlx5_shared_data *mlx5_shared_data;
 /* Spinlock for mlx5_shared_data allocation. */
 static rte_spinlock_t mlx5_shared_data_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* Process local data for secondary processes. */
+static struct mlx5_local_data mlx5_local_data;
+
 /** Driver-specific log messages type. */
 int mlx5_logtype;
 
 /**
- * Prepare shared data between primary and secondary process.
+ * Initialize shared data between primary and secondary process.
+ *
+ * A memzone is reserved by primary process and secondary processes attach to
+ * the memzone.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static void
-mlx5_prepare_shared_data(void)
+static int
+mlx5_init_shared_data(void)
 {
 	const struct rte_memzone *mz;
+	int ret = 0;
 
 	rte_spinlock_lock(&mlx5_shared_data_lock);
 	if (mlx5_shared_data == NULL) {
@@ -146,22 +156,53 @@ mlx5_prepare_shared_data(void)
 			mz = rte_memzone_reserve(MZ_MLX5_PMD_SHARED_DATA,
 						 sizeof(*mlx5_shared_data),
 						 SOCKET_ID_ANY, 0);
+			if (mz == NULL) {
+				DRV_LOG(ERR,
+					"Cannot allocate mlx5 shared data\n");
+				ret = -rte_errno;
+				goto error;
+			}
+			mlx5_shared_data = mz->addr;
+			memset(mlx5_shared_data, 0, sizeof(*mlx5_shared_data));
+			rte_spinlock_init(&mlx5_shared_data->lock);
 		} else {
 			/* Lookup allocated shared memory. */
 			mz = rte_memzone_lookup(MZ_MLX5_PMD_SHARED_DATA);
+			if (mz == NULL) {
+				DRV_LOG(ERR,
+					"Cannot attach mlx5 shared data\n");
+				ret = -rte_errno;
+				goto error;
+			}
+			mlx5_shared_data = mz->addr;
+			memset(&mlx5_local_data, 0, sizeof(mlx5_local_data));
 		}
-		if (mz == NULL)
-			rte_panic("Cannot allocate mlx5 shared data\n");
-		mlx5_shared_data = mz->addr;
-		/* Initialize shared data. */
+	}
+error:
+	rte_spinlock_unlock(&mlx5_shared_data_lock);
+	return ret;
+}
+
+/**
+ * Uninitialize shared data between primary and secondary process.
+ *
+ * A secondary process only resets its pointer; the primary process also
+ * frees the memzone.
+ */
+static void
+mlx5_uninit_shared_data(void)
+{
+	const struct rte_memzone *mz;
+
+	rte_spinlock_lock(&mlx5_shared_data_lock);
+	if (mlx5_shared_data) {
 		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			LIST_INIT(&mlx5_shared_data->mem_event_cb_list);
-			rte_rwlock_init(&mlx5_shared_data->mem_event_rwlock);
-			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
-							mlx5_mr_mem_event_cb,
-							NULL);
-			mlx5_mp_init();
+			mz = rte_memzone_lookup(MZ_MLX5_PMD_SHARED_DATA);
+			rte_memzone_free(mz);
+		} else {
+			memset(&mlx5_local_data, 0, sizeof(mlx5_local_data));
 		}
+		mlx5_shared_data = NULL;
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
 }
@@ -597,15 +638,6 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 
 static struct rte_pci_driver mlx5_driver;
 
-/*
- * Reserved UAR address space for TXQ UAR(hw doorbell) mapping, process
- * local resource used by both primary and secondary to avoid duplicate
- * reservation.
- * The space has to be available on both primary and secondary process,
- * TXQ UAR maps to this area using fixed mmap w/o double check.
- */
-static void *uar_base;
-
 static int
 find_lower_va_bound(const struct rte_memseg_list *msl,
 		const struct rte_memseg *ms, void *arg)
@@ -625,25 +657,24 @@ find_lower_va_bound(const struct rte_memseg_list *msl,
 /**
  * Reserve UAR address space for primary process.
  *
- * @param[in] dev
- *   Pointer to Ethernet device.
+ * Process local resource is used by both primary and secondary to avoid
+ * duplicate reservation. The space has to be available on both primary and
+ * secondary process, TXQ UAR maps to this area using fixed mmap w/o double
+ * check.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_uar_init_primary(struct rte_eth_dev *dev)
+mlx5_uar_init_primary(void)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_shared_data *sd = mlx5_shared_data;
 	void *addr = (void *)0;
 
-	if (uar_base) { /* UAR address space mapped. */
-		priv->uar_base = uar_base;
+	if (sd->uar_base)
 		return 0;
-	}
 	/* find out lower bound of hugepage segments */
 	rte_memseg_walk(find_lower_va_bound, &addr);
-
 	/* keep distance to hugepages to minimize potential conflicts. */
 	addr = RTE_PTR_SUB(addr, (uintptr_t)(MLX5_UAR_OFFSET + MLX5_UAR_SIZE));
 	/* anonymous mmap, no real memory consumption. */
@@ -651,65 +682,154 @@ mlx5_uar_init_primary(struct rte_eth_dev *dev)
 		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 	if (addr == MAP_FAILED) {
 		DRV_LOG(ERR,
-			"port %u failed to reserve UAR address space, please"
-			" adjust MLX5_UAR_SIZE or try --base-virtaddr",
-			dev->data->port_id);
+			"Failed to reserve UAR address space, please"
+			" adjust MLX5_UAR_SIZE or try --base-virtaddr");
 		rte_errno = ENOMEM;
 		return -rte_errno;
 	}
 	/* Accept either same addr or a new addr returned from mmap if target
 	 * range occupied.
 	 */
-	DRV_LOG(INFO, "port %u reserved UAR address space: %p",
-		dev->data->port_id, addr);
-	priv->uar_base = addr; /* for primary and secondary UAR re-mmap. */
-	uar_base = addr; /* process local, don't reserve again. */
+	DRV_LOG(INFO, "Reserved UAR address space: %p", addr);
+	sd->uar_base = addr; /* for primary and secondary UAR re-mmap. */
 	return 0;
 }
 
 /**
- * Reserve UAR address space for secondary process, align with
- * primary process.
- *
- * @param[in] dev
- *   Pointer to Ethernet device.
+ * Unmap UAR address space reserved for primary process.
+ */
+static void
+mlx5_uar_uninit_primary(void)
+{
+	struct mlx5_shared_data *sd = mlx5_shared_data;
+
+	if (!sd->uar_base)
+		return;
+	munmap(sd->uar_base, MLX5_UAR_SIZE);
+	sd->uar_base = NULL;
+}
+
+/**
+ * Reserve UAR address space for secondary process, align with primary process.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_uar_init_secondary(struct rte_eth_dev *dev)
+mlx5_uar_init_secondary(void)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_shared_data *sd = mlx5_shared_data;
+	struct mlx5_local_data *ld = &mlx5_local_data;
 	void *addr;
 
-	assert(priv->uar_base);
-	if (uar_base) { /* already reserved. */
-		assert(uar_base == priv->uar_base);
+	if (ld->uar_base) { /* Already reserved. */
+		assert(sd->uar_base == ld->uar_base);
 		return 0;
 	}
+	assert(sd->uar_base);
 	/* anonymous mmap, no real memory consumption. */
-	addr = mmap(priv->uar_base, MLX5_UAR_SIZE,
+	addr = mmap(sd->uar_base, MLX5_UAR_SIZE,
 		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 	if (addr == MAP_FAILED) {
-		DRV_LOG(ERR, "port %u UAR mmap failed: %p size: %llu",
-			dev->data->port_id, priv->uar_base, MLX5_UAR_SIZE);
+		DRV_LOG(ERR, "UAR mmap failed: %p size: %llu",
+			sd->uar_base, MLX5_UAR_SIZE);
 		rte_errno = ENXIO;
 		return -rte_errno;
 	}
-	if (priv->uar_base != addr) {
+	if (sd->uar_base != addr) {
 		DRV_LOG(ERR,
-			"port %u UAR address %p size %llu occupied, please"
+			"UAR address %p size %llu occupied, please"
 			" adjust MLX5_UAR_OFFSET or try EAL parameter"
 			" --base-virtaddr",
-			dev->data->port_id, priv->uar_base, MLX5_UAR_SIZE);
+			sd->uar_base, MLX5_UAR_SIZE);
 		rte_errno = ENXIO;
 		return -rte_errno;
 	}
-	uar_base = addr; /* process local, don't reserve again */
-	DRV_LOG(INFO, "port %u reserved UAR address space: %p",
-		dev->data->port_id, addr);
+	ld->uar_base = addr;
+	DRV_LOG(INFO, "Reserved UAR address space: %p", addr);
+	return 0;
+}
+
+/**
+ * Unmap UAR address space reserved for secondary process.
+ */
+static void
+mlx5_uar_uninit_secondary(void)
+{
+	struct mlx5_local_data *ld = &mlx5_local_data;
+
+	if (!ld->uar_base)
+		return;
+	munmap(ld->uar_base, MLX5_UAR_SIZE);
+	ld->uar_base = NULL;
+}
+
+/**
+ * PMD global initialization.
+ *
+ * Independent from individual device, this function initializes global
+ * per-PMD data structures distinguishing primary and secondary processes.
+ * Hence, each initialization is called once per process.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_init_once(void)
+{
+	struct mlx5_shared_data *sd;
+	struct mlx5_local_data *ld = &mlx5_local_data;
+	int ret;
+
+	if (mlx5_init_shared_data())
+		return -rte_errno;
+	sd = mlx5_shared_data;
+	assert(sd);
+	rte_spinlock_lock(&sd->lock);
+	switch (rte_eal_process_type()) {
+	case RTE_PROC_PRIMARY:
+		if (sd->init_done)
+			break;
+		LIST_INIT(&sd->mem_event_cb_list);
+		rte_rwlock_init(&sd->mem_event_rwlock);
+		rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
+						mlx5_mr_mem_event_cb, NULL);
+		mlx5_mp_init_primary();
+		ret = mlx5_uar_init_primary();
+		if (ret)
+			goto error;
+		sd->init_done = true;
+		break;
+	case RTE_PROC_SECONDARY:
+		if (ld->init_done)
+			break;
+		ret = mlx5_uar_init_secondary();
+		if (ret)
+			goto error;
+		++sd->secondary_cnt;
+		ld->init_done = true;
+		break;
+	default:
+		break;
+	}
+	rte_spinlock_unlock(&sd->lock);
 	return 0;
+error:
+	switch (rte_eal_process_type()) {
+	case RTE_PROC_PRIMARY:
+		mlx5_uar_uninit_primary();
+		mlx5_mp_uninit_primary();
+		rte_mem_event_callback_unregister("MLX5_MEM_EVENT_CB", NULL);
+		break;
+	case RTE_PROC_SECONDARY:
+		mlx5_uar_uninit_secondary();
+		break;
+	default:
+		break;
+	}
+	rte_spinlock_unlock(&sd->lock);
+	mlx5_uninit_shared_data();
+	return -rte_errno;
 }
 
 /**
@@ -794,8 +914,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		rte_errno = EEXIST;
 		return NULL;
 	}
-	/* Prepare shared data between primary and secondary process. */
-	mlx5_prepare_shared_data();
 	errno = 0;
 	ctx = mlx5_glue->dv_open_device(ibv_dev);
 	if (ctx) {
@@ -922,11 +1040,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		}
 		eth_dev->device = dpdk_dev;
 		eth_dev->dev_ops = &mlx5_dev_sec_ops;
-		err = mlx5_uar_init_secondary(eth_dev);
-		if (err) {
-			err = rte_errno;
-			goto error;
-		}
 		/* Receive command fd from primary process */
 		err = mlx5_mp_req_verbs_cmd_fd(eth_dev);
 		if (err < 0) {
@@ -1143,11 +1256,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
 	eth_dev->device = dpdk_dev;
-	err = mlx5_uar_init_primary(eth_dev);
-	if (err) {
-		err = rte_errno;
-		goto error;
-	}
 	/* Configure the first MAC address by default. */
 	if (mlx5_get_mac(eth_dev, &mac.addr_bytes)) {
 		DRV_LOG(ERR,
@@ -1363,6 +1471,12 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_config dev_config;
 	int ret;
 
+	ret = mlx5_init_once();
+	if (ret) {
+		DRV_LOG(ERR, "unable to init PMD global data: %s",
+			strerror(rte_errno));
+		return -rte_errno;
+	}
 	assert(pci_drv == &mlx5_driver);
 	errno = 0;
 	ibv_list = mlx5_glue->get_device_list(&ret);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 7030c6f7d7..cb454e866a 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -85,12 +85,25 @@ struct mlx5_switch_info {
 
 LIST_HEAD(mlx5_dev_list, mlx5_priv);
 
-/* Shared memory between primary and secondary processes. */
+/* Shared data between primary and secondary processes. */
 struct mlx5_shared_data {
+	rte_spinlock_t lock;
+	/* Global spinlock for primary and secondary processes. */
+	int init_done; /* Whether primary has done initialization. */
+	unsigned int secondary_cnt; /* Number of secondary processes init'd. */
+	void *uar_base;
+	/* Reserved UAR address space for TXQ UAR(hw doorbell) mapping. */
 	struct mlx5_dev_list mem_event_cb_list;
 	rte_rwlock_t mem_event_rwlock;
 };
 
+/* Per-process data structure, not visible to other processes. */
+struct mlx5_local_data {
+	int init_done; /* Whether a secondary has done initialization. */
+	void *uar_base;
+	/* Reserved UAR address space for TXQ UAR(hw doorbell) mapping. */
+};
+
 extern struct mlx5_shared_data *mlx5_shared_data;
 
 struct mlx5_counter_ctrl {
@@ -260,7 +273,6 @@ struct mlx5_priv {
 	uint32_t link_speed_capa; /* Link speed capabilities. */
 	struct mlx5_xstats_ctrl xstats_ctrl; /* Extended stats control. */
 	struct mlx5_stats_ctrl stats_ctrl; /* Stats control. */
-	void *uar_base; /* Reserved address space for UAR mapping */
 	struct mlx5_dev_config config; /* Device configuration. */
 	struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
 	/* Context for Verbs allocator. */
@@ -420,7 +432,8 @@ void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
 /* mlx5_mp.c */
 int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
-void mlx5_mp_init(void);
+void mlx5_mp_init_primary(void);
+void mlx5_mp_uninit_primary(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
index b8dd4b5fa7..d0a38c3d52 100644
--- a/drivers/net/mlx5/mlx5_mp.c
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -131,9 +131,22 @@ mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
 	return ret;
 }
 
+/**
+ * Initialize by primary process.
+ */
+void
+mlx5_mp_init_primary(void)
+{
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+}
+
+/**
+ * Un-initialize by primary process.
+ */
 void
-mlx5_mp_init(void)
+mlx5_mp_uninit_primary(void)
 {
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	rte_mp_action_unregister(MLX5_MP_NAME);
 }
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index d18561740f..5640fe1b91 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -286,7 +286,7 @@ mlx5_tx_uar_remap(struct rte_eth_dev *dev, int fd)
 			}
 		}
 		/* new address in reserved UAR address space. */
-		addr = RTE_PTR_ADD(priv->uar_base,
+		addr = RTE_PTR_ADD(mlx5_shared_data->uar_base,
 				   uar_va & (uintptr_t)(MLX5_UAR_SIZE - 1));
 		if (!already_mapped) {
 			pages[pages_n++] = uar_va;
@@ -844,9 +844,8 @@ mlx5_txq_release(struct rte_eth_dev *dev, uint16_t idx)
 	txq = container_of((*priv->txqs)[idx], struct mlx5_txq_ctrl, txq);
 	if (txq->ibv && !mlx5_txq_ibv_release(txq->ibv))
 		txq->ibv = NULL;
-	if (priv->uar_base)
-		munmap((void *)RTE_ALIGN_FLOOR((uintptr_t)txq->txq.bf_reg,
-		       page_size), page_size);
+	munmap((void *)RTE_ALIGN_FLOOR((uintptr_t)txq->txq.bf_reg, page_size),
+	       page_size);
 	if (rte_atomic32_dec_and_test(&txq->refcnt)) {
 		txq_free_elts(txq);
 		mlx5_mr_btree_free(&txq->txq.mr_ctrl.cache_bh);
-- 
2.11.0


* [dpdk-dev] [PATCH v2 3/4] net/mlx5: rework PMD global data init
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 3/4] net/mlx5: rework PMD global data init Yongseok Koh
@ 2019-03-25 19:15     ` Yongseok Koh
  2019-03-26 12:38     ` Shahaf Shuler
  1 sibling, 0 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-25 19:15 UTC (permalink / raw)
  To: shahafs; +Cc: dev

The PMD increasingly needs global data structures. Such data should be
initialized once per process regardless of how many PMD instances are
probed. mlx5_init_once() is called during probing and makes sure that all
the init functions are called once per process. Currently, the global data
and its initialization functions are scattered. Rather than 'extern'-ing
each variable and calling each function one by one, checking the validity
of the variable to make sure it runs only once, it is better to have a
global storage holding such data and a consolidated function performing
all the initializations. The existing shared memory is used more
extensively for this purpose. As there can be multiple secondary
processes, a static storage (local to each process) is also added.

As the reserved virtual address for UAR remap is a PMD global resource, it
does not need to be stored in the device priv structure and is kept in the
PMD global data instead.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/mlx5.c     | 250 ++++++++++++++++++++++++++++++++------------
 drivers/net/mlx5/mlx5.h     |  19 +++-
 drivers/net/mlx5/mlx5_mp.c  |  19 +++-
 drivers/net/mlx5/mlx5_txq.c |   7 +-
 4 files changed, 217 insertions(+), 78 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 316f34cd05..54a1896ea4 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -128,16 +128,26 @@ struct mlx5_shared_data *mlx5_shared_data;
 /* Spinlock for mlx5_shared_data allocation. */
 static rte_spinlock_t mlx5_shared_data_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* Process local data for secondary processes. */
+static struct mlx5_local_data mlx5_local_data;
+
 /** Driver-specific log messages type. */
 int mlx5_logtype;
 
 /**
- * Prepare shared data between primary and secondary process.
+ * Initialize shared data between primary and secondary process.
+ *
+ * A memzone is reserved by primary process and secondary processes attach to
+ * the memzone.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static void
-mlx5_prepare_shared_data(void)
+static int
+mlx5_init_shared_data(void)
 {
 	const struct rte_memzone *mz;
+	int ret = 0;
 
 	rte_spinlock_lock(&mlx5_shared_data_lock);
 	if (mlx5_shared_data == NULL) {
@@ -146,22 +156,53 @@ mlx5_prepare_shared_data(void)
 			mz = rte_memzone_reserve(MZ_MLX5_PMD_SHARED_DATA,
 						 sizeof(*mlx5_shared_data),
 						 SOCKET_ID_ANY, 0);
+			if (mz == NULL) {
+				DRV_LOG(ERR,
+					"Cannot allocate mlx5 shared data\n");
+				ret = -rte_errno;
+				goto error;
+			}
+			mlx5_shared_data = mz->addr;
+			memset(mlx5_shared_data, 0, sizeof(*mlx5_shared_data));
+			rte_spinlock_init(&mlx5_shared_data->lock);
 		} else {
 			/* Lookup allocated shared memory. */
 			mz = rte_memzone_lookup(MZ_MLX5_PMD_SHARED_DATA);
+			if (mz == NULL) {
+				DRV_LOG(ERR,
+					"Cannot attach mlx5 shared data\n");
+				ret = -rte_errno;
+				goto error;
+			}
+			mlx5_shared_data = mz->addr;
+			memset(&mlx5_local_data, 0, sizeof(mlx5_local_data));
 		}
-		if (mz == NULL)
-			rte_panic("Cannot allocate mlx5 shared data\n");
-		mlx5_shared_data = mz->addr;
-		/* Initialize shared data. */
+	}
+error:
+	rte_spinlock_unlock(&mlx5_shared_data_lock);
+	return ret;
+}
+
+/**
+ * Uninitialize shared data between primary and secondary process.
+ *
+ * A secondary process only resets its pointer; the primary process also
+ * frees the memzone.
+ */
+static void
+mlx5_uninit_shared_data(void)
+{
+	const struct rte_memzone *mz;
+
+	rte_spinlock_lock(&mlx5_shared_data_lock);
+	if (mlx5_shared_data) {
 		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			LIST_INIT(&mlx5_shared_data->mem_event_cb_list);
-			rte_rwlock_init(&mlx5_shared_data->mem_event_rwlock);
-			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
-							mlx5_mr_mem_event_cb,
-							NULL);
-			mlx5_mp_init();
+			mz = rte_memzone_lookup(MZ_MLX5_PMD_SHARED_DATA);
+			rte_memzone_free(mz);
+		} else {
+			memset(&mlx5_local_data, 0, sizeof(mlx5_local_data));
 		}
+		mlx5_shared_data = NULL;
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
 }
@@ -597,15 +638,6 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 
 static struct rte_pci_driver mlx5_driver;
 
-/*
- * Reserved UAR address space for TXQ UAR(hw doorbell) mapping, process
- * local resource used by both primary and secondary to avoid duplicate
- * reservation.
- * The space has to be available on both primary and secondary process,
- * TXQ UAR maps to this area using fixed mmap w/o double check.
- */
-static void *uar_base;
-
 static int
 find_lower_va_bound(const struct rte_memseg_list *msl,
 		const struct rte_memseg *ms, void *arg)
@@ -625,25 +657,24 @@ find_lower_va_bound(const struct rte_memseg_list *msl,
 /**
  * Reserve UAR address space for primary process.
  *
- * @param[in] dev
- *   Pointer to Ethernet device.
+ * Process local resource is used by both primary and secondary to avoid
+ * duplicate reservation. The space has to be available on both primary and
+ * secondary process, TXQ UAR maps to this area using fixed mmap w/o double
+ * check.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_uar_init_primary(struct rte_eth_dev *dev)
+mlx5_uar_init_primary(void)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_shared_data *sd = mlx5_shared_data;
 	void *addr = (void *)0;
 
-	if (uar_base) { /* UAR address space mapped. */
-		priv->uar_base = uar_base;
+	if (sd->uar_base)
 		return 0;
-	}
 	/* find out lower bound of hugepage segments */
 	rte_memseg_walk(find_lower_va_bound, &addr);
-
 	/* keep distance to hugepages to minimize potential conflicts. */
 	addr = RTE_PTR_SUB(addr, (uintptr_t)(MLX5_UAR_OFFSET + MLX5_UAR_SIZE));
 	/* anonymous mmap, no real memory consumption. */
@@ -651,65 +682,154 @@ mlx5_uar_init_primary(struct rte_eth_dev *dev)
 		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 	if (addr == MAP_FAILED) {
 		DRV_LOG(ERR,
-			"port %u failed to reserve UAR address space, please"
-			" adjust MLX5_UAR_SIZE or try --base-virtaddr",
-			dev->data->port_id);
+			"Failed to reserve UAR address space, please"
+			" adjust MLX5_UAR_SIZE or try --base-virtaddr");
 		rte_errno = ENOMEM;
 		return -rte_errno;
 	}
 	/* Accept either same addr or a new addr returned from mmap if target
 	 * range occupied.
 	 */
-	DRV_LOG(INFO, "port %u reserved UAR address space: %p",
-		dev->data->port_id, addr);
-	priv->uar_base = addr; /* for primary and secondary UAR re-mmap. */
-	uar_base = addr; /* process local, don't reserve again. */
+	DRV_LOG(INFO, "Reserved UAR address space: %p", addr);
+	sd->uar_base = addr; /* for primary and secondary UAR re-mmap. */
 	return 0;
 }
 
 /**
- * Reserve UAR address space for secondary process, align with
- * primary process.
- *
- * @param[in] dev
- *   Pointer to Ethernet device.
+ * Unmap UAR address space reserved for primary process.
+ */
+static void
+mlx5_uar_uninit_primary(void)
+{
+	struct mlx5_shared_data *sd = mlx5_shared_data;
+
+	if (!sd->uar_base)
+		return;
+	munmap(sd->uar_base, MLX5_UAR_SIZE);
+	sd->uar_base = NULL;
+}
+
+/**
+ * Reserve UAR address space for secondary process, align with primary process.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_uar_init_secondary(struct rte_eth_dev *dev)
+mlx5_uar_init_secondary(void)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_shared_data *sd = mlx5_shared_data;
+	struct mlx5_local_data *ld = &mlx5_local_data;
 	void *addr;
 
-	assert(priv->uar_base);
-	if (uar_base) { /* already reserved. */
-		assert(uar_base == priv->uar_base);
+	if (ld->uar_base) { /* Already reserved. */
+		assert(sd->uar_base == ld->uar_base);
 		return 0;
 	}
+	assert(sd->uar_base);
 	/* anonymous mmap, no real memory consumption. */
-	addr = mmap(priv->uar_base, MLX5_UAR_SIZE,
+	addr = mmap(sd->uar_base, MLX5_UAR_SIZE,
 		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 	if (addr == MAP_FAILED) {
-		DRV_LOG(ERR, "port %u UAR mmap failed: %p size: %llu",
-			dev->data->port_id, priv->uar_base, MLX5_UAR_SIZE);
+		DRV_LOG(ERR, "UAR mmap failed: %p size: %llu",
+			sd->uar_base, MLX5_UAR_SIZE);
 		rte_errno = ENXIO;
 		return -rte_errno;
 	}
-	if (priv->uar_base != addr) {
+	if (sd->uar_base != addr) {
 		DRV_LOG(ERR,
-			"port %u UAR address %p size %llu occupied, please"
+			"UAR address %p size %llu occupied, please"
 			" adjust MLX5_UAR_OFFSET or try EAL parameter"
 			" --base-virtaddr",
-			dev->data->port_id, priv->uar_base, MLX5_UAR_SIZE);
+			sd->uar_base, MLX5_UAR_SIZE);
 		rte_errno = ENXIO;
 		return -rte_errno;
 	}
-	uar_base = addr; /* process local, don't reserve again */
-	DRV_LOG(INFO, "port %u reserved UAR address space: %p",
-		dev->data->port_id, addr);
+	ld->uar_base = addr;
+	DRV_LOG(INFO, "Reserved UAR address space: %p", addr);
+	return 0;
+}
+
+/**
+ * Unmap UAR address space reserved for secondary process.
+ */
+static void
+mlx5_uar_uninit_secondary(void)
+{
+	struct mlx5_local_data *ld = &mlx5_local_data;
+
+	if (!ld->uar_base)
+		return;
+	munmap(ld->uar_base, MLX5_UAR_SIZE);
+	ld->uar_base = NULL;
+}
+
+/**
+ * PMD global initialization.
+ *
+ * Independent from individual device, this function initializes global
+ * per-PMD data structures distinguishing primary and secondary processes.
+ * Hence, each initialization is called once per process.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_init_once(void)
+{
+	struct mlx5_shared_data *sd;
+	struct mlx5_local_data *ld = &mlx5_local_data;
+	int ret;
+
+	if (mlx5_init_shared_data())
+		return -rte_errno;
+	sd = mlx5_shared_data;
+	assert(sd);
+	rte_spinlock_lock(&sd->lock);
+	switch (rte_eal_process_type()) {
+	case RTE_PROC_PRIMARY:
+		if (sd->init_done)
+			break;
+		LIST_INIT(&sd->mem_event_cb_list);
+		rte_rwlock_init(&sd->mem_event_rwlock);
+		rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
+						mlx5_mr_mem_event_cb, NULL);
+		mlx5_mp_init_primary();
+		ret = mlx5_uar_init_primary();
+		if (ret)
+			goto error;
+		sd->init_done = true;
+		break;
+	case RTE_PROC_SECONDARY:
+		if (ld->init_done)
+			break;
+		ret = mlx5_uar_init_secondary();
+		if (ret)
+			goto error;
+		++sd->secondary_cnt;
+		ld->init_done = true;
+		break;
+	default:
+		break;
+	}
+	rte_spinlock_unlock(&sd->lock);
 	return 0;
+error:
+	switch (rte_eal_process_type()) {
+	case RTE_PROC_PRIMARY:
+		mlx5_uar_uninit_primary();
+		mlx5_mp_uninit_primary();
+		rte_mem_event_callback_unregister("MLX5_MEM_EVENT_CB", NULL);
+		break;
+	case RTE_PROC_SECONDARY:
+		mlx5_uar_uninit_secondary();
+		break;
+	default:
+		break;
+	}
+	rte_spinlock_unlock(&sd->lock);
+	mlx5_uninit_shared_data();
+	return -rte_errno;
 }
 
 /**
@@ -794,8 +914,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		rte_errno = EEXIST;
 		return NULL;
 	}
-	/* Prepare shared data between primary and secondary process. */
-	mlx5_prepare_shared_data();
 	errno = 0;
 	ctx = mlx5_glue->dv_open_device(ibv_dev);
 	if (ctx) {
@@ -922,11 +1040,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		}
 		eth_dev->device = dpdk_dev;
 		eth_dev->dev_ops = &mlx5_dev_sec_ops;
-		err = mlx5_uar_init_secondary(eth_dev);
-		if (err) {
-			err = rte_errno;
-			goto error;
-		}
 		/* Receive command fd from primary process */
 		err = mlx5_mp_req_verbs_cmd_fd(eth_dev);
 		if (err < 0) {
@@ -1143,11 +1256,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
 	eth_dev->device = dpdk_dev;
-	err = mlx5_uar_init_primary(eth_dev);
-	if (err) {
-		err = rte_errno;
-		goto error;
-	}
 	/* Configure the first MAC address by default. */
 	if (mlx5_get_mac(eth_dev, &mac.addr_bytes)) {
 		DRV_LOG(ERR,
@@ -1363,6 +1471,12 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_config dev_config;
 	int ret;
 
+	ret = mlx5_init_once();
+	if (ret) {
+		DRV_LOG(ERR, "unable to init PMD global data: %s",
+			strerror(rte_errno));
+		return -rte_errno;
+	}
 	assert(pci_drv == &mlx5_driver);
 	errno = 0;
 	ibv_list = mlx5_glue->get_device_list(&ret);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 7030c6f7d7..cb454e866a 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -85,12 +85,25 @@ struct mlx5_switch_info {
 
 LIST_HEAD(mlx5_dev_list, mlx5_priv);
 
-/* Shared memory between primary and secondary processes. */
+/* Shared data between primary and secondary processes. */
 struct mlx5_shared_data {
+	rte_spinlock_t lock;
+	/* Global spinlock for primary and secondary processes. */
+	int init_done; /* Whether primary has done initialization. */
+	unsigned int secondary_cnt; /* Number of secondary processes init'd. */
+	void *uar_base;
+	/* Reserved UAR address space for TXQ UAR(hw doorbell) mapping. */
 	struct mlx5_dev_list mem_event_cb_list;
 	rte_rwlock_t mem_event_rwlock;
 };
 
+/* Per-process data structure, not visible to other processes. */
+struct mlx5_local_data {
+	int init_done; /* Whether a secondary has done initialization. */
+	void *uar_base;
+	/* Reserved UAR address space for TXQ UAR(hw doorbell) mapping. */
+};
+
 extern struct mlx5_shared_data *mlx5_shared_data;
 
 struct mlx5_counter_ctrl {
@@ -260,7 +273,6 @@ struct mlx5_priv {
 	uint32_t link_speed_capa; /* Link speed capabilities. */
 	struct mlx5_xstats_ctrl xstats_ctrl; /* Extended stats control. */
 	struct mlx5_stats_ctrl stats_ctrl; /* Stats control. */
-	void *uar_base; /* Reserved address space for UAR mapping */
 	struct mlx5_dev_config config; /* Device configuration. */
 	struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
 	/* Context for Verbs allocator. */
@@ -420,7 +432,8 @@ void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
 /* mlx5_mp.c */
 int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
-void mlx5_mp_init(void);
+void mlx5_mp_init_primary(void);
+void mlx5_mp_uninit_primary(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
index b8dd4b5fa7..d0a38c3d52 100644
--- a/drivers/net/mlx5/mlx5_mp.c
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -131,9 +131,22 @@ mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
 	return ret;
 }
 
+/**
+ * Initialize by primary process.
+ */
+void
+mlx5_mp_init_primary(void)
+{
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+}
+
+/**
+ * Un-initialize by primary process.
+ */
 void
-mlx5_mp_init(void)
+mlx5_mp_uninit_primary(void)
 {
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	rte_mp_action_unregister(MLX5_MP_NAME);
 }
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index d18561740f..5640fe1b91 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -286,7 +286,7 @@ mlx5_tx_uar_remap(struct rte_eth_dev *dev, int fd)
 			}
 		}
 		/* new address in reserved UAR address space. */
-		addr = RTE_PTR_ADD(priv->uar_base,
+		addr = RTE_PTR_ADD(mlx5_shared_data->uar_base,
 				   uar_va & (uintptr_t)(MLX5_UAR_SIZE - 1));
 		if (!already_mapped) {
 			pages[pages_n++] = uar_va;
@@ -844,9 +844,8 @@ mlx5_txq_release(struct rte_eth_dev *dev, uint16_t idx)
 	txq = container_of((*priv->txqs)[idx], struct mlx5_txq_ctrl, txq);
 	if (txq->ibv && !mlx5_txq_ibv_release(txq->ibv))
 		txq->ibv = NULL;
-	if (priv->uar_base)
-		munmap((void *)RTE_ALIGN_FLOOR((uintptr_t)txq->txq.bf_reg,
-		       page_size), page_size);
+	munmap((void *)RTE_ALIGN_FLOOR((uintptr_t)txq->txq.bf_reg, page_size),
+	       page_size);
 	if (rte_atomic32_dec_and_test(&txq->refcnt)) {
 		txq_free_elts(txq);
 		mlx5_mr_btree_free(&txq->txq.mr_ctrl.cache_bh);
-- 
2.11.0


^ permalink raw reply	[flat|nested] 45+ messages in thread
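
The reservation logic in mlx5_uar_init_primary() and mlx5_uar_init_secondary()
above can be illustrated outside the driver. What follows is a minimal sketch
using plain POSIX mmap(), not the PMD code: DEMO_UAR_SIZE, demo_uar_reserve()
and demo_uar_release() are hypothetical names, and the size is an arbitrary
stand-in for MLX5_UAR_SIZE. A NULL hint models the primary (any address is
accepted); a non-NULL hint models the secondary, which must land on exactly
the address the primary published.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-in for MLX5_UAR_SIZE; the real value is driver-defined. */
#define DEMO_UAR_SIZE ((size_t)1 << 20)

/*
 * Reserve address space without committing memory (PROT_NONE, anonymous).
 * hint == NULL: accept whatever address the kernel picks (primary case).
 * hint != NULL: succeed only if the kernel honours the hint (secondary case).
 */
static int
demo_uar_reserve(void *hint, void **out)
{
	void *addr;

	assert(out != NULL);
	addr = mmap(hint, DEMO_UAR_SIZE, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return -ENOMEM;
	if (hint != NULL && addr != hint) {
		/* Target range is occupied in this process: undo and report. */
		munmap(addr, DEMO_UAR_SIZE);
		return -ENXIO;
	}
	*out = addr;
	return 0;
}

/* Release a reservation; a no-op when nothing is reserved. */
static void
demo_uar_release(void **base)
{
	assert(base != NULL);
	if (*base == NULL)
		return;
	munmap(*base, DEMO_UAR_SIZE);
	*base = NULL;
}
```

Because no MAP_FIXED is passed, the kernel never maps over an existing range;
a mismatch between the hint and the returned address is how the sketch (and,
as far as the diff shows, the patch) detects that the range is already taken.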

* [dpdk-dev] [PATCH v2 4/4] net/mlx5: sync stop/start of datapath with secondary process
  2019-03-25 19:15 ` [dpdk-dev] [PATCH v2 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
                     ` (3 preceding siblings ...)
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 3/4] net/mlx5: rework PMD global data init Yongseok Koh
@ 2019-03-25 19:15   ` Yongseok Koh
  2019-03-25 19:15     ` Yongseok Koh
  2019-03-26 12:49     ` Shahaf Shuler
  4 siblings, 2 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-03-25 19:15 UTC (permalink / raw)
  To: shahafs; +Cc: dev

Rx/Tx burst function pointers are stored in the rte_eth_dev structure,
which is local to each process. Even when the primary process replaces its
function pointers, secondary processes keep running the old ones. With the
rte_mp APIs, the primary can broadcast a request to stop/start the datapath
of secondary processes.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 drivers/net/mlx5/mlx5.c         |   5 ++
 drivers/net/mlx5/mlx5.h         |   6 ++
 drivers/net/mlx5/mlx5_mp.c      | 156 ++++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.c    |   2 +
 drivers/net/mlx5/mlx5_trigger.c |   5 ++
 5 files changed, 174 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 54a1896ea4..840cd3d307 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -305,6 +305,9 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	/* Prevent crashes when queues are still in use. */
 	dev->rx_pkt_burst = removed_rx_burst;
 	dev->tx_pkt_burst = removed_tx_burst;
+	rte_wmb();
+	/* Disable datapath on secondary process. */
+	mlx5_mp_req_stop_rxtx(dev);
 	if (priv->rxqs != NULL) {
 		/* XXX race condition if mlx5_rx_burst() is still running. */
 		usleep(1000);
@@ -803,6 +806,7 @@ mlx5_init_once(void)
 	case RTE_PROC_SECONDARY:
 		if (ld->init_done)
 			break;
+		mlx5_mp_init_secondary();
 		ret = mlx5_uar_init_secondary();
 		if (ret)
 			goto error;
@@ -823,6 +827,7 @@ mlx5_init_once(void)
 		break;
 	case RTE_PROC_SECONDARY:
 		mlx5_uar_uninit_secondary();
+		mlx5_mp_uninit_secondary();
 		break;
 	default:
 		break;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index cb454e866a..d8a5162bdb 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -59,6 +59,8 @@ enum {
 /* Request types for IPC. */
 enum mlx5_mp_req_type {
 	MLX5_MP_REQ_VERBS_CMD_FD = 1,
+	MLX5_MP_REQ_START_RXTX,
+	MLX5_MP_REQ_STOP_RXTX,
 };
 
 /* Pameters for IPC. */
@@ -431,9 +433,13 @@ int mlx5_flow_create_drop_queue(struct rte_eth_dev *dev);
 void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
 /* mlx5_mp.c */
+void mlx5_mp_req_start_rxtx(struct rte_eth_dev *dev);
+void mlx5_mp_req_stop_rxtx(struct rte_eth_dev *dev);
 int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
 void mlx5_mp_init_primary(void);
 void mlx5_mp_uninit_primary(void);
+void mlx5_mp_init_secondary(void);
+void mlx5_mp_uninit_secondary(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
index d0a38c3d52..657ab6872e 100644
--- a/drivers/net/mlx5/mlx5_mp.c
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -12,6 +12,7 @@
 #include <rte_string_fns.h>
 
 #include "mlx5.h"
+#include "mlx5_rxtx.h"
 #include "mlx5_utils.h"
 
 /**
@@ -85,6 +86,141 @@ mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
 }
 
 /**
+ * IPC message handler of a secondary process.
+ *
+ * @param[in] mp_msg
+ *   Pointer to the received IPC request message.
+ * @param[in] peer
+ *   Pointer to the peer socket path.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mp_secondary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_mp_msg mp_res;
+	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
+	const struct mlx5_mp_param *param =
+		(const struct mlx5_mp_param *)mp_msg->param;
+	struct rte_eth_dev *dev;
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	if (!rte_eth_dev_is_valid_port(param->port_id)) {
+		rte_errno = ENODEV;
+		DRV_LOG(ERR, "port %u invalid port ID", param->port_id);
+		return -rte_errno;
+	}
+	dev = &rte_eth_devices[param->port_id];
+	switch (param->type) {
+	case MLX5_MP_REQ_START_RXTX:
+		DRV_LOG(INFO, "port %u starting datapath", dev->data->port_id);
+		rte_mb();
+		dev->rx_pkt_burst = mlx5_select_rx_function(dev);
+		dev->tx_pkt_burst = mlx5_select_tx_function(dev);
+		mp_init_msg(dev, &mp_res, param->type);
+		res->result = 0;
+		ret = rte_mp_reply(&mp_res, peer);
+		break;
+	case MLX5_MP_REQ_STOP_RXTX:
+		DRV_LOG(INFO, "port %u stopping datapath", dev->data->port_id);
+		dev->rx_pkt_burst = removed_rx_burst;
+		dev->tx_pkt_burst = removed_tx_burst;
+		rte_mb();
+		mp_init_msg(dev, &mp_res, param->type);
+		res->result = 0;
+		ret = rte_mp_reply(&mp_res, peer);
+		break;
+	default:
+		rte_errno = EINVAL;
+		DRV_LOG(ERR, "port %u invalid mp request type",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	return ret;
+}
+
+/**
+ * Broadcast request of stopping/starting data-path to secondary processes.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ * @param[in] type
+ *   Request type.
+ */
+static void
+mp_req_on_rxtx(struct rte_eth_dev *dev, enum mlx5_mp_req_type type)
+{
+	struct rte_mp_msg mp_req;
+	struct rte_mp_msg *mp_res;
+	struct rte_mp_reply mp_rep;
+	struct mlx5_mp_param *res;
+	struct timespec ts = {.tv_sec = MLX5_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
+	int ret;
+	int i;
+
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	if (!mlx5_shared_data->secondary_cnt)
+		return;
+	if (type != MLX5_MP_REQ_START_RXTX && type != MLX5_MP_REQ_STOP_RXTX) {
+		DRV_LOG(ERR, "port %u unknown request (req_type %d)",
+			dev->data->port_id, type);
+		return;
+	}
+	mp_init_msg(dev, &mp_req, type);
+	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
+	if (ret) {
+		DRV_LOG(ERR, "port %u failed to request stop/start Rx/Tx (%d)",
+			dev->data->port_id, type);
+		goto exit;
+	}
+	if (mp_rep.nb_sent != mp_rep.nb_received) {
+		DRV_LOG(ERR,
+			"port %u not all secondaries responded (req_type %d)",
+			dev->data->port_id, type);
+		goto exit;
+	}
+	for (i = 0; i < mp_rep.nb_received; i++) {
+		mp_res = &mp_rep.msgs[i];
+		res = (struct mlx5_mp_param *)mp_res->param;
+		if (res->result) {
+			DRV_LOG(ERR, "port %u request failed on secondary #%d",
+				dev->data->port_id, i);
+			goto exit;
+		}
+	}
+exit:
+	free(mp_rep.msgs);
+}
+
+/**
+ * Broadcast request of starting data-path to secondary processes. The request
+ * is synchronous.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ */
+void
+mlx5_mp_req_start_rxtx(struct rte_eth_dev *dev)
+{
+	mp_req_on_rxtx(dev, MLX5_MP_REQ_START_RXTX);
+}
+
+/**
+ * Broadcast request of stopping data-path to secondary processes. The request
+ * is synchronous.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ */
+void
+mlx5_mp_req_stop_rxtx(struct rte_eth_dev *dev)
+{
+	mp_req_on_rxtx(dev, MLX5_MP_REQ_STOP_RXTX);
+}
+
+/**
  * Request Verbs command file descriptor for mmap to the primary process.
  *
  * @param[in] dev
@@ -150,3 +286,23 @@ mlx5_mp_uninit_primary(void)
 	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
 	rte_mp_action_unregister(MLX5_MP_NAME);
 }
+
+/**
+ * Initialize by secondary process.
+ */
+void
+mlx5_mp_init_secondary(void)
+{
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	rte_mp_action_register(MLX5_MP_NAME, mp_secondary_handle);
+}
+
+/**
+ * Un-initialize by secondary process.
+ */
+void
+mlx5_mp_uninit_secondary(void)
+{
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	rte_mp_action_unregister(MLX5_MP_NAME);
+}
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 38ce0e29a2..3da3f62fa9 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -2373,6 +2373,7 @@ removed_tx_burst(void *dpdk_txq __rte_unused,
 		 struct rte_mbuf **pkts __rte_unused,
 		 uint16_t pkts_n __rte_unused)
 {
+	rte_mb();
 	return 0;
 }
 
@@ -2397,6 +2398,7 @@ removed_rx_burst(void *dpdk_txq __rte_unused,
 		 struct rte_mbuf **pkts __rte_unused,
 		 uint16_t pkts_n __rte_unused)
 {
+	rte_mb();
 	return 0;
 }
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 2137bdc461..7c9ff921ab 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -194,8 +194,11 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 			dev->data->port_id);
 		goto error;
 	}
+	rte_wmb();
 	dev->tx_pkt_burst = mlx5_select_tx_function(dev);
 	dev->rx_pkt_burst = mlx5_select_rx_function(dev);
+	/* Enable datapath on secondary process. */
+	mlx5_mp_req_start_rxtx(dev);
 	mlx5_dev_interrupt_handler_install(dev);
 	return 0;
 error:
@@ -228,6 +231,8 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 	dev->rx_pkt_burst = removed_rx_burst;
 	dev->tx_pkt_burst = removed_tx_burst;
 	rte_wmb();
+	/* Disable datapath on secondary process. */
+	mlx5_mp_req_stop_rxtx(dev);
 	usleep(1000 * priv->rxqs_n);
 	DRV_LOG(DEBUG, "port %u stopping device", dev->data->port_id);
 	mlx5_flow_stop(dev, &priv->flows);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread
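
The handler added by this patch swaps process-local burst pointers on request.
A reduced sketch of that idea, with no DPDK dependency, is given here. All
names (demo_dev, demo_secondary_handle(), the DEMO_REQ_* values) are
hypothetical, and atomic_thread_fence() from C11 is only an assumed stand-in
for rte_mb(); it is not what the driver calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request types, mirroring MLX5_MP_REQ_{START,STOP}_RXTX. */
enum demo_req_type {
	DEMO_REQ_START_RXTX = 2,
	DEMO_REQ_STOP_RXTX,
};

typedef uint16_t (*demo_burst_t)(void *queue, uint16_t n);

/* Process-local device: every process holds its own copy of the pointers. */
struct demo_dev {
	demo_burst_t rx_burst;
	demo_burst_t tx_burst;
};

/* Placeholder burst that reports every requested packet as handled. */
static uint16_t
demo_real_burst(void *queue, uint16_t n)
{
	(void)queue;
	return n;
}

/* Burst installed while the datapath is stopped; never touches a queue. */
static uint16_t
demo_removed_burst(void *queue, uint16_t n)
{
	(void)queue;
	(void)n;
	return 0;
}

/* Handle one request in the receiving process; 0 or a negative errno. */
static int
demo_secondary_handle(struct demo_dev *dev, int type)
{
	assert(dev != NULL);
	switch (type) {
	case DEMO_REQ_START_RXTX:
		/* Order the sender's queue setup before enabling the bursts. */
		atomic_thread_fence(memory_order_seq_cst);
		dev->rx_burst = demo_real_burst;
		dev->tx_burst = demo_real_burst;
		return 0;
	case DEMO_REQ_STOP_RXTX:
		dev->rx_burst = demo_removed_burst;
		dev->tx_burst = demo_removed_burst;
		/* Publish the swap before the reply goes back to the sender. */
		atomic_thread_fence(memory_order_seq_cst);
		return 0;
	default:
		/* Unknown request: leave the datapath untouched. */
		return -EINVAL;
	}
}
```

In the sketch the stop case fences after the swap and the start case fences
before it, matching the ordering visible in mp_secondary_handle() above.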

* Re: [dpdk-dev] [PATCH v2 1/4] net/mlx5: fix memory event on secondary process
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 1/4] net/mlx5: fix memory event on secondary process Yongseok Koh
  2019-03-25 19:15     ` Yongseok Koh
@ 2019-03-26 12:28     ` Shahaf Shuler
  2019-03-26 12:28       ` Shahaf Shuler
  1 sibling, 1 reply; 45+ messages in thread
From: Shahaf Shuler @ 2019-03-26 12:28 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: dev, stable

Monday, March 25, 2019 9:16 PM, Yongseok Koh:
> Subject: [dpdk-dev] [PATCH v2 1/4] net/mlx5: fix memory event on
> secondary process
> 
> As the memory event is propagated to secondary processes, the event is
> processed redundantly. This should be processed once because the data
> structure used for MR and the event is global across the processes.
> 
> Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

Acked-by: Shahaf Shuler <shahafs@mellanox.com>

> ---
>  drivers/net/mlx5/mlx5.c    | 5 +++--
>  drivers/net/mlx5/mlx5_mr.c | 2 ++
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index ae4b71695e..dd29eba955 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -157,9 +157,10 @@ mlx5_prepare_shared_data(void)
>  		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
>  			LIST_INIT(&mlx5_shared_data->mem_event_cb_list);
>  			rte_rwlock_init(&mlx5_shared_data->mem_event_rwlock);
> +			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
> +							mlx5_mr_mem_event_cb,
> +							NULL);
>  		}
> -		rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
> -						mlx5_mr_mem_event_cb, NULL);
>  	}
>  	rte_spinlock_unlock(&mlx5_shared_data_lock);
>  }
> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> index 700d83d1bc..d336a77e40 100644
> --- a/drivers/net/mlx5/mlx5_mr.c
> +++ b/drivers/net/mlx5/mlx5_mr.c
> @@ -891,6 +891,8 @@ mlx5_mr_mem_event_cb(enum rte_mem_event event_type, const void *addr,
>  	struct mlx5_priv *priv;
>  	struct mlx5_dev_list *dev_list = &mlx5_shared_data->mem_event_cb_list;
> 
> +	/* Must be called from the primary process. */
> +	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
>  	switch (event_type) {
>  	case RTE_MEM_EVENT_FREE:
>  		rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
> --
> 2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/4] net/mlx5: replace IPC socket with EAL API
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
  2019-03-25 19:15     ` Yongseok Koh
@ 2019-03-26 12:31     ` Shahaf Shuler
  2019-03-26 12:31       ` Shahaf Shuler
  1 sibling, 1 reply; 45+ messages in thread
From: Shahaf Shuler @ 2019-03-26 12:31 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: dev

Monday, March 25, 2019 9:16 PM, Yongseok Koh:
> Subject: [dpdk-dev] [PATCH v2 2/4] net/mlx5: replace IPC socket with EAL
> API
> 
> The socket API is used for IPC so that a secondary process can acquire the
> Verbs command file descriptor, which is used to remap the UAR address. EAL
> now provides multi-process APIs (rte_mp). mlx5_socket.c is replaced with
> mlx5_mp.c, which uses the new APIs.
> 
> As it is PMD global infrastructure, only one IPC channel is established.
> All the IPC message types may have port_id in the message if there is need
> to reference a specific device.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

Acked-by: Shahaf Shuler <shahafs@mellanox.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/4] net/mlx5: rework PMD global data init
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 3/4] net/mlx5: rework PMD global data init Yongseok Koh
  2019-03-25 19:15     ` Yongseok Koh
@ 2019-03-26 12:38     ` Shahaf Shuler
  2019-03-26 12:38       ` Shahaf Shuler
  1 sibling, 1 reply; 45+ messages in thread
From: Shahaf Shuler @ 2019-03-26 12:38 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: dev

Monday, March 25, 2019 9:16 PM, Yongseok Koh:
> Subject: [dpdk-dev] [PATCH v2 3/4] net/mlx5: rework PMD global data init
> 
> There is a growing need for a PMD global data structure. It should be
> initialized once per process regardless of how many PMD instances are
> probed. mlx5_init_once() is called during probing and makes sure all the
> init functions are called once per process. Currently, such global data and
> its initialization functions are scattered. Rather than 'extern'-ing such
> variables and calling such functions one by one, checking the validity of
> the variables to make sure each is called only once, it is better to have a
> global storage to hold such data and a consolidated function that performs
> all the initializations. The existing shared memory is used more extensively
> for this purpose. As there could be multiple secondary processes, a static
> storage (local to each process) is also added.
> 
> As the reserved virtual address for UAR remap is a PMD global resource, it
> is stored in the PMD global data rather than in the device priv structure.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

Acked-by: Shahaf Shuler <shahafs@mellanox.com>
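
The once-per-process initialization described in the quoted commit message can be pictured with the minimal sketch below. It is an illustration only, not the code of mlx5_init_once(): struct shared_data, struct local_data and init_once() are hypothetical names, the shared part is an ordinary struct instead of real shared memory, and the locking a real driver needs is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for data living in memory shared by all processes. */
struct shared_data {
	bool init_done;
	int init_count; /* number of shared initializations, for illustration */
};

/* Hypothetical stand-in for static data private to one process. */
struct local_data {
	bool init_done;
	int init_count; /* number of local initializations, for illustration */
};

/*
 * Run the one-time initializations for the calling process.
 * Only the primary may set up the shared part; every process sets up
 * its own local part, and both happen on the first call only.
 * NOTE: no locking here; a real driver must serialize concurrent callers.
 */
void
init_once(bool is_primary, struct shared_data *sd, struct local_data *ld)
{
	if (is_primary && !sd->init_done) {
		sd->init_count++;
		sd->init_done = true;
	}
	if (!ld->init_done) {
		ld->init_count++;
		ld->init_done = true;
	}
}
```

Calling init_once() once per probed device then leaves exactly one shared initialization and one local initialization per process, however many devices are probed.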

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 4/4] net/mlx5: sync stop/start of datapath with secondary process
  2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 4/4] net/mlx5: sync stop/start of datapath with secondary process Yongseok Koh
  2019-03-25 19:15     ` Yongseok Koh
@ 2019-03-26 12:49     ` Shahaf Shuler
  2019-03-26 12:49       ` Shahaf Shuler
  1 sibling, 1 reply; 45+ messages in thread
From: Shahaf Shuler @ 2019-03-26 12:49 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: dev

Monday, March 25, 2019 9:16 PM, Yongseok Koh:
> Subject: [dpdk-dev] [PATCH v2 4/4] net/mlx5: sync stop/start of datapath
> with secondary process
> 
> Rx/Tx burst function pointers are stored in the rte_eth_dev structure, which
> is local to a process. Even when the primary process replaces its function
> pointers, a secondary process keeps running the old ones. With the rte_mp
> APIs, the primary can broadcast a request to stop/start the datapath of
> secondary processes.
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

Acked-by: Shahaf Shuler <shahafs@mellanox.com>
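
The problem in the quoted commit message, that each process holds its own copy of the burst function pointers, can be modelled with the small sketch below. It makes several assumptions: struct local_dev, enum dp_req, apply_request() and broadcast_request() are hypothetical names, the "processes" are plain structs in one program, and the broadcast is a loop rather than an rte_mp call. The order in which the real driver updates the primary and the secondaries is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t (*burst_fn)(void **pkts, uint16_t n);

/* Process-local view of a device; every process has its own copy. */
struct local_dev {
	burst_fn rx_burst;
};

/* Hypothetical request types for stopping/starting the datapath. */
enum dp_req { DP_REQ_START, DP_REQ_STOP };

/* Normal burst: pretend every requested packet was received. */
uint16_t
rx_burst_real(void **pkts, uint16_t n)
{
	(void)pkts;
	return n;
}

/* Dummy burst installed while the datapath is stopped. */
uint16_t
rx_burst_removed(void **pkts, uint16_t n)
{
	(void)pkts;
	(void)n;
	return 0;
}

/* What one process does with a request: swap its own pointer. */
void
apply_request(struct local_dev *dev, enum dp_req req)
{
	dev->rx_burst = (req == DP_REQ_START) ? rx_burst_real :
						rx_burst_removed;
}

/* The primary applies the request and forwards it to each secondary. */
void
broadcast_request(struct local_dev *primary, struct local_dev *secondaries,
		  size_t nb_secondaries, enum dp_req req)
{
	size_t i;

	apply_request(primary, req);
	for (i = 0; i < nb_secondaries; i++)
		apply_request(&secondaries[i], req);
}
```

Without the forwarding loop only the primary's pointer would change, which is the situation the patch is meant to fix.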

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init
  2019-03-07  7:33 [dpdk-dev] [PATCH 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
                   ` (4 preceding siblings ...)
  2019-03-25 19:15 ` [dpdk-dev] [PATCH v2 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
@ 2019-04-01 21:12 ` Yongseok Koh
  2019-04-01 21:12   ` Yongseok Koh
                     ` (5 more replies)
  5 siblings, 6 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-04-01 21:12 UTC (permalink / raw)
  To: shahafs; +Cc: dev

The existing socket-based IPC channel is replaced with the new rte_mp APIs of
EAL and extended so that the primary process can request secondary processes
to stop/start the dataplane. Also, initialization of PMD global data,
including the new IPC channel, is reworked to provide a more generic
framework for future use.

v3:
* rebase on the latest branch tip

v2:
* add more sanity check for eth_dev and return value from IPC request
* complement commit messages
* add MLX5_MP_REQ_TIMEOUT_SEC

Yongseok Koh (4):
  net/mlx5: fix memory event on secondary process
  net/mlx5: replace IPC socket with EAL API
  net/mlx5: rework PMD global data init
  net/mlx5: sync stop/start of datapath with secondary process

 drivers/net/mlx5/Makefile       |   2 +-
 drivers/net/mlx5/meson.build    |   2 +-
 drivers/net/mlx5/mlx5.c         | 255 ++++++++++++++++++++++++---------
 drivers/net/mlx5/mlx5.h         |  52 +++++--
 drivers/net/mlx5/mlx5_ethdev.c  |  34 -----
 drivers/net/mlx5/mlx5_mp.c      | 308 ++++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_mr.c      |   2 +
 drivers/net/mlx5/mlx5_rxtx.c    |   2 +
 drivers/net/mlx5/mlx5_socket.c  | 306 ---------------------------------------
 drivers/net/mlx5/mlx5_trigger.c |   5 +
 drivers/net/mlx5/mlx5_txq.c     |   7 +-
 11 files changed, 552 insertions(+), 423 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mp.c
 delete mode 100644 drivers/net/mlx5/mlx5_socket.c

-- 
2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v3 1/4] net/mlx5: fix memory event on secondary process
  2019-04-01 21:12 ` [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
  2019-04-01 21:12   ` Yongseok Koh
@ 2019-04-01 21:12   ` Yongseok Koh
  2019-04-01 21:12     ` Yongseok Koh
  2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 45+ messages in thread
From: Yongseok Koh @ 2019-04-01 21:12 UTC (permalink / raw)
  To: shahafs; +Cc: dev, stable

As a memory event is propagated to secondary processes as well, the event is
processed redundantly. It should be processed only once, because the data
structure used for MR and the event is global across the processes.

Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
---
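
Note: the effect of this change can be illustrated with the stand-alone sketch below. It does not use DPDK; enum proc_type, struct cb_registry, prepare_shared_data() and deliver_mem_event() are hypothetical stand-ins for the EAL process type, the EAL callback list and the event delivery, chosen only to show that registering the callback in the primary alone makes a shared structure get updated once per event.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the EAL process type; not the real DPDK definition. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

typedef void (*mem_event_cb_t)(void *arg);

/* Hypothetical per-process registry holding at most one callback. */
struct cb_registry {
	mem_event_cb_t cb;
	void *arg;
};

/* Data shared by all processes, e.g. the MR bookkeeping. */
struct shared_data {
	int events_handled;
};

/* Callback that updates the shared bookkeeping. */
static void
mr_mem_event_cb(void *arg)
{
	struct shared_data *sd = arg;

	sd->events_handled++;
}

/* Register the callback only when running as the primary process. */
void
prepare_shared_data(enum proc_type type, struct cb_registry *reg,
		    struct shared_data *sd)
{
	reg->cb = NULL;
	reg->arg = NULL;
	if (type == PROC_PRIMARY) {
		reg->cb = mr_mem_event_cb;
		reg->arg = sd;
	}
}

/* Deliver one memory event to a process; no-op without a callback. */
void
deliver_mem_event(const struct cb_registry *reg)
{
	if (reg->cb != NULL)
		reg->cb(reg->arg);
}
```

If the event reaches one primary and one secondary, only the primary's callback runs, so the shared counter advances by one instead of two.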
 drivers/net/mlx5/mlx5.c    | 5 +++--
 drivers/net/mlx5/mlx5_mr.c | 2 ++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1d7ca615bd..2208cc922a 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -322,9 +322,10 @@ mlx5_prepare_shared_data(void)
 		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
 			LIST_INIT(&mlx5_shared_data->mem_event_cb_list);
 			rte_rwlock_init(&mlx5_shared_data->mem_event_rwlock);
+			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
+							mlx5_mr_mem_event_cb,
+							NULL);
 		}
-		rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
-						mlx5_mr_mem_event_cb, NULL);
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
 }
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 0f0a64f0a4..88484dd50b 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -892,6 +892,8 @@ mlx5_mr_mem_event_cb(enum rte_mem_event event_type, const void *addr,
 	struct mlx5_priv *priv;
 	struct mlx5_dev_list *dev_list = &mlx5_shared_data->mem_event_cb_list;
 
+	/* Must be called from the primary process. */
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
 	switch (event_type) {
 	case RTE_MEM_EVENT_FREE:
 		rte_rwlock_write_lock(&mlx5_shared_data->mem_event_rwlock);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v3 2/4] net/mlx5: replace IPC socket with EAL API
  2019-04-01 21:12 ` [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
  2019-04-01 21:12   ` Yongseok Koh
  2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 1/4] net/mlx5: fix memory event on secondary process Yongseok Koh
@ 2019-04-01 21:12   ` Yongseok Koh
  2019-04-01 21:12     ` Yongseok Koh
  2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 3/4] net/mlx5: rework PMD global data init Yongseok Koh
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 45+ messages in thread
From: Yongseok Koh @ 2019-04-01 21:12 UTC (permalink / raw)
  To: shahafs; +Cc: dev

A socket API is currently used for IPC so that a secondary process can
acquire the Verbs command file descriptor. The FD is used to remap the UAR
address. EAL has since introduced the multi-process APIs (rte_mp), so
mlx5_socket.c is replaced with mlx5_mp.c, which uses the new APIs.

As this is PMD global infrastructure, only one IPC channel is established.
Any IPC message type may carry a port_id in the message if it needs to
reference a specific device.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
---
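
Note: the single-channel design, where every request carries a type and a port_id in the parameter area, can be sketched without DPDK as below. struct mp_msg is a simplified stand-in for struct rte_mp_msg (no file descriptors, and MP_MAX_PARAM_LEN is an assumed size), and mp_handle() and mp_msg_port_id() are hypothetical helpers; the real transport and reply calls are not modelled.

```c
#include <assert.h>
#include <string.h>

#define MP_NAME "net_mlx5_mp"	/* same key string as in the patch */
#define MP_MAX_PARAM_LEN 256	/* assumed size of the raw parameter area */

/* Simplified stand-in for struct rte_mp_msg (no file descriptors). */
struct mp_msg {
	char name[64];
	int len_param;
	unsigned char param[MP_MAX_PARAM_LEN];
};

enum mp_req_type { MP_REQ_VERBS_CMD_FD = 1 };

/* Typed payload copied into the raw parameter area. */
struct mp_param {
	enum mp_req_type type;
	int port_id;
	int result;
};

/* Fill a message with the channel name and a typed parameter block. */
void
mp_init_msg(struct mp_msg *msg, enum mp_req_type type, int port_id)
{
	struct mp_param param = {
		.type = type, .port_id = port_id, .result = 0,
	};

	memset(msg, 0, sizeof(*msg));
	strncpy(msg->name, MP_NAME, sizeof(msg->name) - 1);
	memcpy(msg->param, &param, sizeof(param));
	msg->len_param = (int)sizeof(param);
}

/*
 * Handle a request on the single channel and build the reply.
 * Returns 0 on success, -1 for an unknown channel or request type.
 */
int
mp_handle(const struct mp_msg *req, struct mp_msg *res)
{
	struct mp_param param;

	if (strcmp(req->name, MP_NAME) != 0 ||
	    req->len_param != (int)sizeof(param))
		return -1;
	memcpy(&param, req->param, sizeof(param));
	switch (param.type) {
	case MP_REQ_VERBS_CMD_FD:
		mp_init_msg(res, param.type, param.port_id);
		return 0;
	default:
		return -1;
	}
}

/* Read the port ID back out of a message. */
int
mp_msg_port_id(const struct mp_msg *msg)
{
	struct mp_param param;

	memcpy(&param, msg->param, sizeof(param));
	return param.port_id;
}
```

Adding a new request then only needs a new enum value and a new case in the handler; the channel name and message layout stay the same.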
 drivers/net/mlx5/Makefile      |   2 +-
 drivers/net/mlx5/meson.build   |   2 +-
 drivers/net/mlx5/mlx5.c        |   5 +-
 drivers/net/mlx5/mlx5.h        |  29 ++--
 drivers/net/mlx5/mlx5_ethdev.c |  34 -----
 drivers/net/mlx5/mlx5_mp.c     | 139 +++++++++++++++++++
 drivers/net/mlx5/mlx5_socket.c | 306 -----------------------------------------
 7 files changed, 164 insertions(+), 353 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mp.c
 delete mode 100644 drivers/net/mlx5/mlx5_socket.c

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index c3264949a7..e5587e9036 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -34,7 +34,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_dv.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_tcf.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_verbs.c
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mp.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_devx_cmds.c
 
diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build
index e3cb9bc201..fe880b9b2e 100644
--- a/drivers/net/mlx5/meson.build
+++ b/drivers/net/mlx5/meson.build
@@ -41,7 +41,7 @@ if build
 		'mlx5_rxmode.c',
 		'mlx5_rxq.c',
 		'mlx5_rxtx.c',
-		'mlx5_socket.c',
+		'mlx5_mp.c',
 		'mlx5_stats.c',
 		'mlx5_trigger.c',
 		'mlx5_txq.c',
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 2208cc922a..61ef29cdef 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -325,6 +325,7 @@ mlx5_prepare_shared_data(void)
 			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
 							mlx5_mr_mem_event_cb,
 							NULL);
+			mlx5_mp_init();
 		}
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
@@ -454,8 +455,6 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 		rte_free(priv->rss_conf.rss_key);
 	if (priv->reta_idx != NULL)
 		rte_free(priv->reta_idx);
-	if (priv->primary_socket)
-		mlx5_socket_uninit(dev);
 	if (priv->config.vf)
 		mlx5_nl_mac_addr_flush(dev);
 	if (priv->nl_socket_route >= 0)
@@ -970,7 +969,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		if (err)
 			return NULL;
 		/* Receive command fd from primary process */
-		err = mlx5_socket_connect(eth_dev);
+		err = mlx5_mp_req_verbs_cmd_fd(eth_dev);
 		if (err < 0)
 			return NULL;
 		/* Remap UAR for Tx queues. */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 7402798d49..f454adb3c1 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -56,6 +56,24 @@ enum {
 	PCI_DEVICE_ID_MELLANOX_CONNECTX6VF = 0x101c,
 };
 
+/* Request types for IPC. */
+enum mlx5_mp_req_type {
+	MLX5_MP_REQ_VERBS_CMD_FD = 1,
+};
+
+/* Parameters for IPC. */
+struct mlx5_mp_param {
+	enum mlx5_mp_req_type type;
+	int port_id;
+	int result;
+};
+
+/** Request timeout for IPC. */
+#define MLX5_MP_REQ_TIMEOUT_SEC 5
+
+/** Key string for IPC. */
+#define MLX5_MP_NAME "net_mlx5_mp"
+
 /** Switch information returned by mlx5_nl_switch_info(). */
 struct mlx5_switch_info {
 	uint32_t master:1; /**< Master device. */
@@ -272,9 +290,7 @@ struct mlx5_priv {
 	uint32_t link_speed_capa; /* Link speed capabilities. */
 	struct mlx5_xstats_ctrl xstats_ctrl; /* Extended stats control. */
 	struct mlx5_stats_ctrl stats_ctrl; /* Stats control. */
-	int primary_socket; /* Unix socket for primary process. */
 	void *uar_base; /* Reserved address space for UAR mapping */
-	struct rte_intr_handle intr_handle_socket; /* Interrupt handler. */
 	struct mlx5_dev_config config; /* Device configuration. */
 	struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
 	/* Context for Verbs allocator. */
@@ -432,12 +448,9 @@ int mlx5_ctrl_flow(struct rte_eth_dev *dev,
 int mlx5_flow_create_drop_queue(struct rte_eth_dev *dev);
 void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
-/* mlx5_socket.c */
-
-int mlx5_socket_init(struct rte_eth_dev *priv);
-void mlx5_socket_uninit(struct rte_eth_dev *priv);
-void mlx5_socket_handle(struct rte_eth_dev *priv);
-int mlx5_socket_connect(struct rte_eth_dev *priv);
+/* mlx5_mp.c */
+int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
+void mlx5_mp_init(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 7273bd9404..aab8e676c9 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1096,20 +1096,6 @@ mlx5_dev_interrupt_handler(void *cb_arg)
 }
 
 /**
- * Handle interrupts from the socket.
- *
- * @param cb_arg
- *   Callback argument.
- */
-static void
-mlx5_dev_handler_socket(void *cb_arg)
-{
-	struct rte_eth_dev *dev = cb_arg;
-
-	mlx5_socket_handle(dev);
-}
-
-/**
  * Uninstall shared asynchronous device events handler.
  * This function is implemeted to support event sharing
  * between multiple ports of single IB device.
@@ -1208,14 +1194,7 @@ mlx5_dev_shared_handler_install(struct rte_eth_dev *dev)
 void
 mlx5_dev_interrupt_handler_uninstall(struct rte_eth_dev *dev)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
-
 	mlx5_dev_shared_handler_uninstall(dev);
-	if (priv->primary_socket)
-		rte_intr_callback_unregister(&priv->intr_handle_socket,
-					     mlx5_dev_handler_socket, dev);
-	priv->intr_handle_socket.fd = 0;
-	priv->intr_handle_socket.type = RTE_INTR_HANDLE_UNKNOWN;
 }
 
 /**
@@ -1227,20 +1206,7 @@ mlx5_dev_interrupt_handler_uninstall(struct rte_eth_dev *dev)
 void
 mlx5_dev_interrupt_handler_install(struct rte_eth_dev *dev)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
-	int ret;
-
 	mlx5_dev_shared_handler_install(dev);
-	ret = mlx5_socket_init(dev);
-	if (ret)
-		DRV_LOG(ERR, "port %u cannot initialise socket: %s",
-			dev->data->port_id, strerror(rte_errno));
-	else if (priv->primary_socket) {
-		priv->intr_handle_socket.fd = priv->primary_socket;
-		priv->intr_handle_socket.type = RTE_INTR_HANDLE_EXT;
-		rte_intr_callback_register(&priv->intr_handle_socket,
-					   mlx5_dev_handler_socket, dev);
-	}
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
new file mode 100644
index 0000000000..71a2b663fa
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2019 6WIND S.A.
+ * Copyright 2019 Mellanox Technologies, Ltd
+ */
+
+#include <assert.h>
+#include <stdio.h>
+#include <time.h>
+
+#include <rte_eal.h>
+#include <rte_ethdev_driver.h>
+#include <rte_string_fns.h>
+
+#include "mlx5.h"
+#include "mlx5_utils.h"
+
+/**
+ * Initialize IPC message.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ * @param[out] msg
+ *   Pointer to message to fill in.
+ * @param[in] type
+ *   Message type.
+ */
+static inline void
+mp_init_msg(struct rte_eth_dev *dev, struct rte_mp_msg *msg,
+	    enum mlx5_mp_req_type type)
+{
+	struct mlx5_mp_param *param = (struct mlx5_mp_param *)msg->param;
+
+	memset(msg, 0, sizeof(*msg));
+	strlcpy(msg->name, MLX5_MP_NAME, sizeof(msg->name));
+	msg->len_param = sizeof(*param);
+	param->type = type;
+	param->port_id = dev->data->port_id;
+}
+
+/**
+ * IPC message handler of primary process.
+ *
+ * @param[in] mp_msg
+ *   Pointer to the received IPC message.
+ * @param[in] peer
+ *   Pointer to the peer socket path.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_mp_msg mp_res;
+	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
+	const struct mlx5_mp_param *param =
+		(const struct mlx5_mp_param *)mp_msg->param;
+	struct rte_eth_dev *dev;
+	struct mlx5_priv *priv;
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	if (!rte_eth_dev_is_valid_port(param->port_id)) {
+		rte_errno = ENODEV;
+		DRV_LOG(ERR, "port %u invalid port ID", param->port_id);
+		return -rte_errno;
+	}
+	dev = &rte_eth_devices[param->port_id];
+	priv = dev->data->dev_private;
+	switch (param->type) {
+	case MLX5_MP_REQ_VERBS_CMD_FD:
+		mp_init_msg(dev, &mp_res, param->type);
+		mp_res.num_fds = 1;
+		mp_res.fds[0] = priv->sh->ctx->cmd_fd;
+		res->result = 0;
+		ret = rte_mp_reply(&mp_res, peer);
+		break;
+	default:
+		rte_errno = EINVAL;
+		DRV_LOG(ERR, "port %u invalid mp request type",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	return ret;
+}
+
+/**
+ * Request the Verbs command file descriptor for mmap from the primary process.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ *
+ * @return
+ *   fd on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
+{
+	struct rte_mp_msg mp_req;
+	struct rte_mp_msg *mp_res;
+	struct rte_mp_reply mp_rep;
+	struct mlx5_mp_param *res;
+	struct timespec ts = {.tv_sec = MLX5_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	mp_init_msg(dev, &mp_req, MLX5_MP_REQ_VERBS_CMD_FD);
+	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
+	if (ret) {
+		DRV_LOG(ERR, "port %u request to primary process failed",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	assert(mp_rep.nb_received == 1);
+	mp_res = &mp_rep.msgs[0];
+	res = (struct mlx5_mp_param *)mp_res->param;
+	if (res->result) {
+		rte_errno = -res->result;
+		DRV_LOG(ERR,
+			"port %u failed to get command FD from primary process",
+			dev->data->port_id);
+		ret = -rte_errno;
+		goto exit;
+	}
+	assert(mp_res->num_fds == 1);
+	ret = mp_res->fds[0];
+	DRV_LOG(DEBUG, "port %u command FD from primary is %d",
+		dev->data->port_id, ret);
+exit:
+	free(mp_rep.msgs);
+	return ret;
+}
+
+void
+mlx5_mp_init(void)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+}
diff --git a/drivers/net/mlx5/mlx5_socket.c b/drivers/net/mlx5/mlx5_socket.c
deleted file mode 100644
index 8fa64309f3..0000000000
--- a/drivers/net/mlx5/mlx5_socket.c
+++ /dev/null
@@ -1,306 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2016 6WIND S.A.
- * Copyright 2016 Mellanox Technologies, Ltd
- */
-
-#include <sys/types.h>
-#include <sys/socket.h>
-#include <sys/un.h>
-#include <fcntl.h>
-#include <stdio.h>
-#include <unistd.h>
-#include <sys/stat.h>
-
-#include "mlx5.h"
-#include "mlx5_utils.h"
-
-/**
- * Initialise the socket to communicate with the secondary process
- *
- * @param[in] dev
- *   Pointer to Ethernet device.
- *
- * @return
- *   0 on success, a negative errno value otherwise and rte_errno is set.
- */
-int
-mlx5_socket_init(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct sockaddr_un sun = {
-		.sun_family = AF_UNIX,
-	};
-	int ret;
-	int flags;
-
-	/*
-	 * Close the last socket that was used to communicate
-	 * with the secondary process
-	 */
-	if (priv->primary_socket)
-		mlx5_socket_uninit(dev);
-	/*
-	 * Initialise the socket to communicate with the secondary
-	 * process.
-	 */
-	ret = socket(AF_UNIX, SOCK_STREAM, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	priv->primary_socket = ret;
-	flags = fcntl(priv->primary_socket, F_GETFL, 0);
-	if (flags == -1) {
-		rte_errno = errno;
-		goto error;
-	}
-	ret = fcntl(priv->primary_socket, F_SETFL, flags | O_NONBLOCK);
-	if (ret < 0) {
-		rte_errno = errno;
-		goto error;
-	}
-	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
-		 MLX5_DRIVER_NAME, priv->primary_socket);
-	remove(sun.sun_path);
-	ret = bind(priv->primary_socket, (const struct sockaddr *)&sun,
-		   sizeof(sun));
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING,
-			"port %u cannot bind socket, secondary process not"
-			" supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto close;
-	}
-	ret = listen(priv->primary_socket, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto close;
-	}
-	return 0;
-close:
-	remove(sun.sun_path);
-error:
-	claim_zero(close(priv->primary_socket));
-	priv->primary_socket = 0;
-	return -rte_errno;
-}
-
-/**
- * Un-Initialise the socket to communicate with the secondary process
- *
- * @param[in] dev
- */
-void
-mlx5_socket_uninit(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-
-	MKSTR(path, "/var/tmp/%s_%d", MLX5_DRIVER_NAME, priv->primary_socket);
-	claim_zero(close(priv->primary_socket));
-	priv->primary_socket = 0;
-	claim_zero(remove(path));
-}
-
-/**
- * Handle socket interrupts.
- *
- * @param dev
- *   Pointer to Ethernet device.
- */
-void
-mlx5_socket_handle(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	int conn_sock;
-	int ret = 0;
-	struct cmsghdr *cmsg = NULL;
-	struct ucred *cred = NULL;
-	char buf[CMSG_SPACE(sizeof(struct ucred))] = { 0 };
-	char vbuf[1024] = { 0 };
-	struct iovec io = {
-		.iov_base = vbuf,
-		.iov_len = sizeof(*vbuf),
-	};
-	struct msghdr msg = {
-		.msg_iov = &io,
-		.msg_iovlen = 1,
-		.msg_control = buf,
-		.msg_controllen = sizeof(buf),
-	};
-	int *fd;
-
-	/* Accept the connection from the client. */
-	conn_sock = accept(priv->primary_socket, NULL, NULL);
-	if (conn_sock < 0) {
-		DRV_LOG(WARNING, "port %u connection failed: %s",
-			dev->data->port_id, strerror(errno));
-		return;
-	}
-	ret = setsockopt(conn_sock, SOL_SOCKET, SO_PASSCRED, &(int){1},
-					 sizeof(int));
-	if (ret < 0) {
-		ret = errno;
-		DRV_LOG(WARNING, "port %u cannot change socket options: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
-	ret = recvmsg(conn_sock, &msg, MSG_WAITALL);
-	if (ret < 0) {
-		ret = errno;
-		DRV_LOG(WARNING, "port %u received an empty message: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
-	/* Expect to receive credentials only. */
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		DRV_LOG(WARNING, "port %u no message", dev->data->port_id);
-		goto error;
-	}
-	if ((cmsg->cmsg_type == SCM_CREDENTIALS) &&
-		(cmsg->cmsg_len >= sizeof(*cred))) {
-		cred = (struct ucred *)CMSG_DATA(cmsg);
-		assert(cred != NULL);
-	}
-	cmsg = CMSG_NXTHDR(&msg, cmsg);
-	if (cmsg != NULL) {
-		DRV_LOG(WARNING, "port %u message wrongly formatted",
-			dev->data->port_id);
-		goto error;
-	}
-	/* Make sure all the ancillary data was received and valid. */
-	if ((cred == NULL) || (cred->uid != getuid()) ||
-	    (cred->gid != getgid())) {
-		DRV_LOG(WARNING, "port %u wrong credentials",
-			dev->data->port_id);
-		goto error;
-	}
-	/* Set-up the ancillary data. */
-	cmsg = CMSG_FIRSTHDR(&msg);
-	assert(cmsg != NULL);
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_RIGHTS;
-	cmsg->cmsg_len = CMSG_LEN(sizeof(priv->sh->ctx->cmd_fd));
-	fd = (int *)CMSG_DATA(cmsg);
-	*fd = priv->sh->ctx->cmd_fd;
-	ret = sendmsg(conn_sock, &msg, 0);
-	if (ret < 0)
-		DRV_LOG(WARNING, "port %u cannot send response",
-			dev->data->port_id);
-error:
-	close(conn_sock);
-}
-
-/**
- * Connect to the primary process.
- *
- * @param[in] dev
- *   Pointer to Ethernet structure.
- *
- * @return
- *   fd on success, negative errno value otherwise and rte_errno is set.
- */
-int
-mlx5_socket_connect(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct sockaddr_un sun = {
-		.sun_family = AF_UNIX,
-	};
-	int socket_fd = -1;
-	int *fd = NULL;
-	int ret;
-	struct ucred *cred;
-	char buf[CMSG_SPACE(sizeof(*cred))] = { 0 };
-	char vbuf[1024] = { 0 };
-	struct iovec io = {
-		.iov_base = vbuf,
-		.iov_len = sizeof(*vbuf),
-	};
-	struct msghdr msg = {
-		.msg_control = buf,
-		.msg_controllen = sizeof(buf),
-		.msg_iov = &io,
-		.msg_iovlen = 1,
-	};
-	struct cmsghdr *cmsg;
-
-	ret = socket(AF_UNIX, SOCK_STREAM, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u cannot connect to primary",
-			dev->data->port_id);
-		goto error;
-	}
-	socket_fd = ret;
-	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
-		 MLX5_DRIVER_NAME, priv->primary_socket);
-	ret = connect(socket_fd, (const struct sockaddr *)&sun, sizeof(sun));
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u cannot connect to primary",
-			dev->data->port_id);
-		goto error;
-	}
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(DEBUG, "port %u cannot get first message",
-			dev->data->port_id);
-		goto error;
-	}
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_CREDENTIALS;
-	cmsg->cmsg_len = CMSG_LEN(sizeof(*cred));
-	cred = (struct ucred *)CMSG_DATA(cmsg);
-	if (cred == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(DEBUG, "port %u no credentials received",
-			dev->data->port_id);
-		goto error;
-	}
-	cred->pid = getpid();
-	cred->uid = getuid();
-	cred->gid = getgid();
-	ret = sendmsg(socket_fd, &msg, MSG_DONTWAIT);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING,
-			"port %u cannot send credentials to primary: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	ret = recvmsg(socket_fd, &msg, MSG_WAITALL);
-	if (ret <= 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u no message from primary: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(WARNING, "port %u no file descriptor received",
-			dev->data->port_id);
-		goto error;
-	}
-	fd = (int *)CMSG_DATA(cmsg);
-	if (*fd < 0) {
-		DRV_LOG(WARNING, "port %u no file descriptor received: %s",
-			dev->data->port_id, strerror(errno));
-		rte_errno = *fd;
-		goto error;
-	}
-	ret = *fd;
-	close(socket_fd);
-	return ret;
-error:
-	if (socket_fd != -1)
-		close(socket_fd);
-	return -rte_errno;
-}
-- 
2.11.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v3 2/4] net/mlx5: replace IPC socket with EAL API
  2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
@ 2019-04-01 21:12     ` Yongseok Koh
  0 siblings, 0 replies; 45+ messages in thread
From: Yongseok Koh @ 2019-04-01 21:12 UTC (permalink / raw)
  To: shahafs; +Cc: dev

The socket API is used for IPC so that a secondary process can acquire the
Verbs command file descriptor, which is used to remap the UAR address. EAL
now provides multi-process APIs (rte_mp), so mlx5_socket.c is replaced with
mlx5_mp.c, which uses them.

As this is PMD-global infrastructure, only one IPC channel is established.
Any IPC message type may carry a port_id in the message when it needs to
reference a specific device.
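
As a rough illustration of the single-channel design, the sketch below
dispatches on a request type and a port ID carried in the message
parameters. It is a self-contained model and not the driver code:
demo_msg, demo_param, demo_init_msg(), demo_handle() and the demo_cmd_fd[]
table are hypothetical stand-ins for struct rte_mp_msg, struct
mlx5_mp_param, mp_init_msg(), the primary handler and the per-port Verbs
command FD.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define DEMO_MAX_PORTS 4
#define DEMO_PARAM_LEN 64

/* Hypothetical request types; mirrors the idea of enum mlx5_mp_req_type. */
enum demo_req_type {
	DEMO_REQ_CMD_FD = 1,
};

/* Hypothetical parameter block carried inside every message. */
struct demo_param {
	enum demo_req_type type;
	int port_id;
	int result;
};

/* Hypothetical stand-in for the generic IPC message. */
struct demo_msg {
	int len_param;
	int num_fds;
	unsigned char param[DEMO_PARAM_LEN];
	int fds[1];
};

/* Per-port command FDs; -1 marks a port that does not exist. */
static int demo_cmd_fd[DEMO_MAX_PORTS] = { 10, 11, -1, -1 };

/* Fill a message for a given port; one channel serves all ports. */
static void
demo_init_msg(struct demo_msg *msg, enum demo_req_type type, int port_id)
{
	struct demo_param p = { .type = type, .port_id = port_id };

	memset(msg, 0, sizeof(*msg));
	msg->len_param = sizeof(p);
	memcpy(msg->param, &p, sizeof(p));
}

/* Primary-side handler: validate the port, then dispatch on the type. */
static int
demo_handle(const struct demo_msg *req, struct demo_msg *res)
{
	struct demo_param p;

	memcpy(&p, req->param, sizeof(p));
	if (p.port_id < 0 || p.port_id >= DEMO_MAX_PORTS ||
	    demo_cmd_fd[p.port_id] < 0)
		return -ENODEV;
	switch (p.type) {
	case DEMO_REQ_CMD_FD:
		demo_init_msg(res, p.type, p.port_id);
		res->num_fds = 1;
		res->fds[0] = demo_cmd_fd[p.port_id];
		return 0;
	default:
		return -EINVAL;
	}
}
```

Adding a new request in this model only means adding an enum value and a
case in the handler; the channel itself stays untouched.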

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
---
 drivers/net/mlx5/Makefile      |   2 +-
 drivers/net/mlx5/meson.build   |   2 +-
 drivers/net/mlx5/mlx5.c        |   5 +-
 drivers/net/mlx5/mlx5.h        |  29 ++--
 drivers/net/mlx5/mlx5_ethdev.c |  34 -----
 drivers/net/mlx5/mlx5_mp.c     | 139 +++++++++++++++++++
 drivers/net/mlx5/mlx5_socket.c | 306 -----------------------------------------
 7 files changed, 164 insertions(+), 353 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_mp.c
 delete mode 100644 drivers/net/mlx5/mlx5_socket.c

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index c3264949a7..e5587e9036 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -34,7 +34,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_dv.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_tcf.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow_verbs.c
-SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mp.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_nl.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_devx_cmds.c
 
diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build
index e3cb9bc201..fe880b9b2e 100644
--- a/drivers/net/mlx5/meson.build
+++ b/drivers/net/mlx5/meson.build
@@ -41,7 +41,7 @@ if build
 		'mlx5_rxmode.c',
 		'mlx5_rxq.c',
 		'mlx5_rxtx.c',
-		'mlx5_socket.c',
+		'mlx5_mp.c',
 		'mlx5_stats.c',
 		'mlx5_trigger.c',
 		'mlx5_txq.c',
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 2208cc922a..61ef29cdef 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -325,6 +325,7 @@ mlx5_prepare_shared_data(void)
 			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
 							mlx5_mr_mem_event_cb,
 							NULL);
+			mlx5_mp_init();
 		}
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
@@ -454,8 +455,6 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 		rte_free(priv->rss_conf.rss_key);
 	if (priv->reta_idx != NULL)
 		rte_free(priv->reta_idx);
-	if (priv->primary_socket)
-		mlx5_socket_uninit(dev);
 	if (priv->config.vf)
 		mlx5_nl_mac_addr_flush(dev);
 	if (priv->nl_socket_route >= 0)
@@ -970,7 +969,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		if (err)
 			return NULL;
 		/* Receive command fd from primary process */
-		err = mlx5_socket_connect(eth_dev);
+		err = mlx5_mp_req_verbs_cmd_fd(eth_dev);
 		if (err < 0)
 			return NULL;
 		/* Remap UAR for Tx queues. */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 7402798d49..f454adb3c1 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -56,6 +56,24 @@ enum {
 	PCI_DEVICE_ID_MELLANOX_CONNECTX6VF = 0x101c,
 };
 
+/* Request types for IPC. */
+enum mlx5_mp_req_type {
+	MLX5_MP_REQ_VERBS_CMD_FD = 1,
+};
+
+/* Parameters for IPC. */
+struct mlx5_mp_param {
+	enum mlx5_mp_req_type type;
+	int port_id;
+	int result;
+};
+
+/** Request timeout for IPC. */
+#define MLX5_MP_REQ_TIMEOUT_SEC 5
+
+/** Key string for IPC. */
+#define MLX5_MP_NAME "net_mlx5_mp"
+
 /** Switch information returned by mlx5_nl_switch_info(). */
 struct mlx5_switch_info {
 	uint32_t master:1; /**< Master device. */
@@ -272,9 +290,7 @@ struct mlx5_priv {
 	uint32_t link_speed_capa; /* Link speed capabilities. */
 	struct mlx5_xstats_ctrl xstats_ctrl; /* Extended stats control. */
 	struct mlx5_stats_ctrl stats_ctrl; /* Stats control. */
-	int primary_socket; /* Unix socket for primary process. */
 	void *uar_base; /* Reserved address space for UAR mapping */
-	struct rte_intr_handle intr_handle_socket; /* Interrupt handler. */
 	struct mlx5_dev_config config; /* Device configuration. */
 	struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
 	/* Context for Verbs allocator. */
@@ -432,12 +448,9 @@ int mlx5_ctrl_flow(struct rte_eth_dev *dev,
 int mlx5_flow_create_drop_queue(struct rte_eth_dev *dev);
 void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
-/* mlx5_socket.c */
-
-int mlx5_socket_init(struct rte_eth_dev *priv);
-void mlx5_socket_uninit(struct rte_eth_dev *priv);
-void mlx5_socket_handle(struct rte_eth_dev *priv);
-int mlx5_socket_connect(struct rte_eth_dev *priv);
+/* mlx5_mp.c */
+int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
+void mlx5_mp_init(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 7273bd9404..aab8e676c9 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -1096,20 +1096,6 @@ mlx5_dev_interrupt_handler(void *cb_arg)
 }
 
 /**
- * Handle interrupts from the socket.
- *
- * @param cb_arg
- *   Callback argument.
- */
-static void
-mlx5_dev_handler_socket(void *cb_arg)
-{
-	struct rte_eth_dev *dev = cb_arg;
-
-	mlx5_socket_handle(dev);
-}
-
-/**
  * Uninstall shared asynchronous device events handler.
  * This function is implemeted to support event sharing
  * between multiple ports of single IB device.
@@ -1208,14 +1194,7 @@ mlx5_dev_shared_handler_install(struct rte_eth_dev *dev)
 void
 mlx5_dev_interrupt_handler_uninstall(struct rte_eth_dev *dev)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
-
 	mlx5_dev_shared_handler_uninstall(dev);
-	if (priv->primary_socket)
-		rte_intr_callback_unregister(&priv->intr_handle_socket,
-					     mlx5_dev_handler_socket, dev);
-	priv->intr_handle_socket.fd = 0;
-	priv->intr_handle_socket.type = RTE_INTR_HANDLE_UNKNOWN;
 }
 
 /**
@@ -1227,20 +1206,7 @@ mlx5_dev_interrupt_handler_uninstall(struct rte_eth_dev *dev)
 void
 mlx5_dev_interrupt_handler_install(struct rte_eth_dev *dev)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
-	int ret;
-
 	mlx5_dev_shared_handler_install(dev);
-	ret = mlx5_socket_init(dev);
-	if (ret)
-		DRV_LOG(ERR, "port %u cannot initialise socket: %s",
-			dev->data->port_id, strerror(rte_errno));
-	else if (priv->primary_socket) {
-		priv->intr_handle_socket.fd = priv->primary_socket;
-		priv->intr_handle_socket.type = RTE_INTR_HANDLE_EXT;
-		rte_intr_callback_register(&priv->intr_handle_socket,
-					   mlx5_dev_handler_socket, dev);
-	}
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
new file mode 100644
index 0000000000..71a2b663fa
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2019 6WIND S.A.
+ * Copyright 2019 Mellanox Technologies, Ltd
+ */
+
+#include <assert.h>
+#include <stdio.h>
+#include <time.h>
+
+#include <rte_eal.h>
+#include <rte_ethdev_driver.h>
+#include <rte_string_fns.h>
+
+#include "mlx5.h"
+#include "mlx5_utils.h"
+
+/**
+ * Initialize IPC message.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ * @param[out] msg
+ *   Pointer to message to fill in.
+ * @param[in] type
+ *   Message type.
+ */
+static inline void
+mp_init_msg(struct rte_eth_dev *dev, struct rte_mp_msg *msg,
+	    enum mlx5_mp_req_type type)
+{
+	struct mlx5_mp_param *param = (struct mlx5_mp_param *)msg->param;
+
+	memset(msg, 0, sizeof(*msg));
+	strlcpy(msg->name, MLX5_MP_NAME, sizeof(msg->name));
+	msg->len_param = sizeof(*param);
+	param->type = type;
+	param->port_id = dev->data->port_id;
+}
+
+/**
+ * IPC message handler of primary process.
+ *
+ * @param[in] mp_msg
+ *   Pointer to the received request message.
+ * @param[in] peer
+ *   Pointer to the peer socket path.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_mp_msg mp_res;
+	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
+	const struct mlx5_mp_param *param =
+		(const struct mlx5_mp_param *)mp_msg->param;
+	struct rte_eth_dev *dev;
+	struct mlx5_priv *priv;
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	if (!rte_eth_dev_is_valid_port(param->port_id)) {
+		rte_errno = ENODEV;
+		DRV_LOG(ERR, "port %u invalid port ID", param->port_id);
+		return -rte_errno;
+	}
+	dev = &rte_eth_devices[param->port_id];
+	priv = dev->data->dev_private;
+	switch (param->type) {
+	case MLX5_MP_REQ_VERBS_CMD_FD:
+		mp_init_msg(dev, &mp_res, param->type);
+		mp_res.num_fds = 1;
+		mp_res.fds[0] = priv->sh->ctx->cmd_fd;
+		res->result = 0;
+		ret = rte_mp_reply(&mp_res, peer);
+		break;
+	default:
+		rte_errno = EINVAL;
+		DRV_LOG(ERR, "port %u invalid mp request type",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	return ret;
+}
+
+/**
+ * Request Verbs command file descriptor for mmap to the primary process.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ *
+ * @return
+ *   fd on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
+{
+	struct rte_mp_msg mp_req;
+	struct rte_mp_msg *mp_res;
+	struct rte_mp_reply mp_rep;
+	struct mlx5_mp_param *res;
+	struct timespec ts = {.tv_sec = MLX5_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	mp_init_msg(dev, &mp_req, MLX5_MP_REQ_VERBS_CMD_FD);
+	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
+	if (ret) {
+		DRV_LOG(ERR, "port %u request to primary process failed",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	assert(mp_rep.nb_received == 1);
+	mp_res = &mp_rep.msgs[0];
+	res = (struct mlx5_mp_param *)mp_res->param;
+	if (res->result) {
+		rte_errno = -res->result;
+		DRV_LOG(ERR,
+			"port %u failed to get command FD from primary process",
+			dev->data->port_id);
+		ret = -rte_errno;
+		goto exit;
+	}
+	assert(mp_res->num_fds == 1);
+	ret = mp_res->fds[0];
+	DRV_LOG(DEBUG, "port %u command FD from primary is %d",
+		dev->data->port_id, ret);
+exit:
+	free(mp_rep.msgs);
+	return ret;
+}
+
+void
+mlx5_mp_init(void)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+}
diff --git a/drivers/net/mlx5/mlx5_socket.c b/drivers/net/mlx5/mlx5_socket.c
deleted file mode 100644
index 8fa64309f3..0000000000
--- a/drivers/net/mlx5/mlx5_socket.c
+++ /dev/null
@@ -1,306 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2016 6WIND S.A.
- * Copyright 2016 Mellanox Technologies, Ltd
- */
-
-#include <sys/types.h>
-#include <sys/socket.h>
-#include <sys/un.h>
-#include <fcntl.h>
-#include <stdio.h>
-#include <unistd.h>
-#include <sys/stat.h>
-
-#include "mlx5.h"
-#include "mlx5_utils.h"
-
-/**
- * Initialise the socket to communicate with the secondary process
- *
- * @param[in] dev
- *   Pointer to Ethernet device.
- *
- * @return
- *   0 on success, a negative errno value otherwise and rte_errno is set.
- */
-int
-mlx5_socket_init(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct sockaddr_un sun = {
-		.sun_family = AF_UNIX,
-	};
-	int ret;
-	int flags;
-
-	/*
-	 * Close the last socket that was used to communicate
-	 * with the secondary process
-	 */
-	if (priv->primary_socket)
-		mlx5_socket_uninit(dev);
-	/*
-	 * Initialise the socket to communicate with the secondary
-	 * process.
-	 */
-	ret = socket(AF_UNIX, SOCK_STREAM, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	priv->primary_socket = ret;
-	flags = fcntl(priv->primary_socket, F_GETFL, 0);
-	if (flags == -1) {
-		rte_errno = errno;
-		goto error;
-	}
-	ret = fcntl(priv->primary_socket, F_SETFL, flags | O_NONBLOCK);
-	if (ret < 0) {
-		rte_errno = errno;
-		goto error;
-	}
-	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
-		 MLX5_DRIVER_NAME, priv->primary_socket);
-	remove(sun.sun_path);
-	ret = bind(priv->primary_socket, (const struct sockaddr *)&sun,
-		   sizeof(sun));
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING,
-			"port %u cannot bind socket, secondary process not"
-			" supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto close;
-	}
-	ret = listen(priv->primary_socket, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u secondary process not supported: %s",
-			dev->data->port_id, strerror(errno));
-		goto close;
-	}
-	return 0;
-close:
-	remove(sun.sun_path);
-error:
-	claim_zero(close(priv->primary_socket));
-	priv->primary_socket = 0;
-	return -rte_errno;
-}
-
-/**
- * Un-Initialise the socket to communicate with the secondary process
- *
- * @param[in] dev
- */
-void
-mlx5_socket_uninit(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-
-	MKSTR(path, "/var/tmp/%s_%d", MLX5_DRIVER_NAME, priv->primary_socket);
-	claim_zero(close(priv->primary_socket));
-	priv->primary_socket = 0;
-	claim_zero(remove(path));
-}
-
-/**
- * Handle socket interrupts.
- *
- * @param dev
- *   Pointer to Ethernet device.
- */
-void
-mlx5_socket_handle(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	int conn_sock;
-	int ret = 0;
-	struct cmsghdr *cmsg = NULL;
-	struct ucred *cred = NULL;
-	char buf[CMSG_SPACE(sizeof(struct ucred))] = { 0 };
-	char vbuf[1024] = { 0 };
-	struct iovec io = {
-		.iov_base = vbuf,
-		.iov_len = sizeof(*vbuf),
-	};
-	struct msghdr msg = {
-		.msg_iov = &io,
-		.msg_iovlen = 1,
-		.msg_control = buf,
-		.msg_controllen = sizeof(buf),
-	};
-	int *fd;
-
-	/* Accept the connection from the client. */
-	conn_sock = accept(priv->primary_socket, NULL, NULL);
-	if (conn_sock < 0) {
-		DRV_LOG(WARNING, "port %u connection failed: %s",
-			dev->data->port_id, strerror(errno));
-		return;
-	}
-	ret = setsockopt(conn_sock, SOL_SOCKET, SO_PASSCRED, &(int){1},
-					 sizeof(int));
-	if (ret < 0) {
-		ret = errno;
-		DRV_LOG(WARNING, "port %u cannot change socket options: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
-	ret = recvmsg(conn_sock, &msg, MSG_WAITALL);
-	if (ret < 0) {
-		ret = errno;
-		DRV_LOG(WARNING, "port %u received an empty message: %s",
-			dev->data->port_id, strerror(rte_errno));
-		goto error;
-	}
-	/* Expect to receive credentials only. */
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		DRV_LOG(WARNING, "port %u no message", dev->data->port_id);
-		goto error;
-	}
-	if ((cmsg->cmsg_type == SCM_CREDENTIALS) &&
-		(cmsg->cmsg_len >= sizeof(*cred))) {
-		cred = (struct ucred *)CMSG_DATA(cmsg);
-		assert(cred != NULL);
-	}
-	cmsg = CMSG_NXTHDR(&msg, cmsg);
-	if (cmsg != NULL) {
-		DRV_LOG(WARNING, "port %u message wrongly formatted",
-			dev->data->port_id);
-		goto error;
-	}
-	/* Make sure all the ancillary data was received and valid. */
-	if ((cred == NULL) || (cred->uid != getuid()) ||
-	    (cred->gid != getgid())) {
-		DRV_LOG(WARNING, "port %u wrong credentials",
-			dev->data->port_id);
-		goto error;
-	}
-	/* Set-up the ancillary data. */
-	cmsg = CMSG_FIRSTHDR(&msg);
-	assert(cmsg != NULL);
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_RIGHTS;
-	cmsg->cmsg_len = CMSG_LEN(sizeof(priv->sh->ctx->cmd_fd));
-	fd = (int *)CMSG_DATA(cmsg);
-	*fd = priv->sh->ctx->cmd_fd;
-	ret = sendmsg(conn_sock, &msg, 0);
-	if (ret < 0)
-		DRV_LOG(WARNING, "port %u cannot send response",
-			dev->data->port_id);
-error:
-	close(conn_sock);
-}
-
-/**
- * Connect to the primary process.
- *
- * @param[in] dev
- *   Pointer to Ethernet structure.
- *
- * @return
- *   fd on success, negative errno value otherwise and rte_errno is set.
- */
-int
-mlx5_socket_connect(struct rte_eth_dev *dev)
-{
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct sockaddr_un sun = {
-		.sun_family = AF_UNIX,
-	};
-	int socket_fd = -1;
-	int *fd = NULL;
-	int ret;
-	struct ucred *cred;
-	char buf[CMSG_SPACE(sizeof(*cred))] = { 0 };
-	char vbuf[1024] = { 0 };
-	struct iovec io = {
-		.iov_base = vbuf,
-		.iov_len = sizeof(*vbuf),
-	};
-	struct msghdr msg = {
-		.msg_control = buf,
-		.msg_controllen = sizeof(buf),
-		.msg_iov = &io,
-		.msg_iovlen = 1,
-	};
-	struct cmsghdr *cmsg;
-
-	ret = socket(AF_UNIX, SOCK_STREAM, 0);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u cannot connect to primary",
-			dev->data->port_id);
-		goto error;
-	}
-	socket_fd = ret;
-	snprintf(sun.sun_path, sizeof(sun.sun_path), "/var/tmp/%s_%d",
-		 MLX5_DRIVER_NAME, priv->primary_socket);
-	ret = connect(socket_fd, (const struct sockaddr *)&sun, sizeof(sun));
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u cannot connect to primary",
-			dev->data->port_id);
-		goto error;
-	}
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(DEBUG, "port %u cannot get first message",
-			dev->data->port_id);
-		goto error;
-	}
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_CREDENTIALS;
-	cmsg->cmsg_len = CMSG_LEN(sizeof(*cred));
-	cred = (struct ucred *)CMSG_DATA(cmsg);
-	if (cred == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(DEBUG, "port %u no credentials received",
-			dev->data->port_id);
-		goto error;
-	}
-	cred->pid = getpid();
-	cred->uid = getuid();
-	cred->gid = getgid();
-	ret = sendmsg(socket_fd, &msg, MSG_DONTWAIT);
-	if (ret < 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING,
-			"port %u cannot send credentials to primary: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	ret = recvmsg(socket_fd, &msg, MSG_WAITALL);
-	if (ret <= 0) {
-		rte_errno = errno;
-		DRV_LOG(WARNING, "port %u no message from primary: %s",
-			dev->data->port_id, strerror(errno));
-		goto error;
-	}
-	cmsg = CMSG_FIRSTHDR(&msg);
-	if (cmsg == NULL) {
-		rte_errno = EINVAL;
-		DRV_LOG(WARNING, "port %u no file descriptor received",
-			dev->data->port_id);
-		goto error;
-	}
-	fd = (int *)CMSG_DATA(cmsg);
-	if (*fd < 0) {
-		DRV_LOG(WARNING, "port %u no file descriptor received: %s",
-			dev->data->port_id, strerror(errno));
-		rte_errno = *fd;
-		goto error;
-	}
-	ret = *fd;
-	close(socket_fd);
-	return ret;
-error:
-	if (socket_fd != -1)
-		close(socket_fd);
-	return -rte_errno;
-}
-- 
2.11.0


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v3 3/4] net/mlx5: rework PMD global data init
  2019-04-01 21:12 ` [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
                     ` (2 preceding siblings ...)
  2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
@ 2019-04-01 21:12   ` Yongseok Koh
  2019-04-01 21:12     ` Yongseok Koh
  2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 4/4] net/mlx5: sync stop/start of datapath with secondary process Yongseok Koh
  2019-04-02  7:11   ` [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init Shahaf Shuler
  5 siblings, 1 reply; 45+ messages in thread
From: Yongseok Koh @ 2019-04-01 21:12 UTC (permalink / raw)
  To: shahafs; +Cc: dev

There is a growing need for PMD global data structures. Such data should be
initialized once per process regardless of how many PMD instances are
probed. mlx5_init_once() is called during probing and makes sure all the
init functions are called once per process. Currently, such global data
and its initialization functions are scattered. Rather than 'extern'-ing
the variables and calling the functions one by one, each guarded by a
validity check on its variable, it is better to have global storage to
hold the data and one consolidated function performing all the
initializations. The existing shared memory is used more extensively for
this purpose. As there can be multiple secondary processes, static storage
(local to each process) is also added.
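
The once-per-process rule can be modelled roughly as follows. This is a
simplified sketch under stated assumptions, not the code in this patch:
demo_shared_data, demo_local_data and demo_init_once() are hypothetical
names, the two counters stand in for the real init functions, and the
locking that the driver performs around the shared state is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical data visible to every process (a memzone in the driver). */
struct demo_shared_data {
	bool init_done;
	int secondary_cnt;
};

/* Hypothetical data private to one secondary process. */
struct demo_local_data {
	bool init_done;
};

static int demo_primary_inits;   /* Counts primary-side initializations. */
static int demo_secondary_inits; /* Counts secondary-side initializations. */

/*
 * Run the global initialization at most once per process: the primary
 * records completion in shared data, each secondary in its local data.
 */
static int
demo_init_once(struct demo_shared_data *sd, struct demo_local_data *ld,
	       bool is_primary)
{
	if (is_primary) {
		if (sd->init_done)
			return 0;
		++demo_primary_inits;
		sd->init_done = true;
	} else {
		if (ld->init_done)
			return 0;
		++demo_secondary_inits;
		++sd->secondary_cnt;
		ld->init_done = true;
	}
	return 0;
}
```

Probing several devices in the same process then calls the function
repeatedly, but only the first call does any work.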

As the reserved virtual address for UAR remap is a PMD global resource,
it does not need to be stored in the device priv structure; it is stored in
the PMD global data instead.
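
The reservation itself relies on an anonymous PROT_NONE mapping, which
claims address space without committing memory. A minimal sketch of that
idea, assuming a Linux/POSIX mmap(): demo_reserve() and
demo_reserve_exact() are hypothetical helpers, and unlike the driver the
exact variant unmaps the region when the kernel returns a different
address.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* For MAP_ANONYMOUS under strict C modes. */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve address space only: PROT_NONE and anonymous, nothing committed. */
static void *
demo_reserve(void *hint, size_t len)
{
	void *addr = mmap(hint, len, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return addr == MAP_FAILED ? NULL : addr;
}

/* Secondary-style reservation: succeed only at the exact address. */
static int
demo_reserve_exact(void *want, size_t len)
{
	void *addr = demo_reserve(want, len);

	if (addr == NULL)
		return -1;
	if (addr != want) {
		/* Range occupied: the kernel picked another address. */
		munmap(addr, len);
		return -1;
	}
	return 0;
}
```

Because no MAP_FIXED is passed, an already occupied range is never
overwritten; the mismatch is reported instead, which is what lets a
secondary process detect that it cannot align with the primary.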

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
---
 drivers/net/mlx5/mlx5.c     | 248 ++++++++++++++++++++++++++++++++------------
 drivers/net/mlx5/mlx5.h     |  19 +++-
 drivers/net/mlx5/mlx5_mp.c  |  19 +++-
 drivers/net/mlx5/mlx5_txq.c |   7 +-
 4 files changed, 217 insertions(+), 76 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 61ef29cdef..14dfad34aa 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -127,6 +127,9 @@ struct mlx5_shared_data *mlx5_shared_data;
 /* Spinlock for mlx5_shared_data allocation. */
 static rte_spinlock_t mlx5_shared_data_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* Process local data for secondary processes. */
+static struct mlx5_local_data mlx5_local_data;
+
 /** Driver-specific log messages type. */
 int mlx5_logtype;
 
@@ -297,12 +300,19 @@ mlx5_free_shared_ibctx(struct mlx5_ibv_shared *sh)
 }
 
 /**
- * Prepare shared data between primary and secondary process.
+ * Initialize shared data between primary and secondary process.
+ *
+ * A memzone is reserved by primary process and secondary processes attach to
+ * the memzone.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static void
-mlx5_prepare_shared_data(void)
+static int
+mlx5_init_shared_data(void)
 {
 	const struct rte_memzone *mz;
+	int ret = 0;
 
 	rte_spinlock_lock(&mlx5_shared_data_lock);
 	if (mlx5_shared_data == NULL) {
@@ -311,22 +321,53 @@ mlx5_prepare_shared_data(void)
 			mz = rte_memzone_reserve(MZ_MLX5_PMD_SHARED_DATA,
 						 sizeof(*mlx5_shared_data),
 						 SOCKET_ID_ANY, 0);
+			if (mz == NULL) {
+				DRV_LOG(ERR,
+					"Cannot allocate mlx5 shared data\n");
+				ret = -rte_errno;
+				goto error;
+			}
+			mlx5_shared_data = mz->addr;
+			memset(mlx5_shared_data, 0, sizeof(*mlx5_shared_data));
+			rte_spinlock_init(&mlx5_shared_data->lock);
 		} else {
 			/* Lookup allocated shared memory. */
 			mz = rte_memzone_lookup(MZ_MLX5_PMD_SHARED_DATA);
+			if (mz == NULL) {
+				DRV_LOG(ERR,
+					"Cannot attach mlx5 shared data\n");
+				ret = -rte_errno;
+				goto error;
+			}
+			mlx5_shared_data = mz->addr;
+			memset(&mlx5_local_data, 0, sizeof(mlx5_local_data));
 		}
-		if (mz == NULL)
-			rte_panic("Cannot allocate mlx5 shared data\n");
-		mlx5_shared_data = mz->addr;
-		/* Initialize shared data. */
+	}
+error:
+	rte_spinlock_unlock(&mlx5_shared_data_lock);
+	return ret;
+}
+
+/**
+ * Uninitialize shared data between primary and secondary process.
+ *
+ * The pointer of secondary process is cleared and primary process frees
+ * the memzone.
+ */
+static void
+mlx5_uninit_shared_data(void)
+{
+	const struct rte_memzone *mz;
+
+	rte_spinlock_lock(&mlx5_shared_data_lock);
+	if (mlx5_shared_data) {
 		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			LIST_INIT(&mlx5_shared_data->mem_event_cb_list);
-			rte_rwlock_init(&mlx5_shared_data->mem_event_rwlock);
-			rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
-							mlx5_mr_mem_event_cb,
-							NULL);
-			mlx5_mp_init();
+			mz = rte_memzone_lookup(MZ_MLX5_PMD_SHARED_DATA);
+			rte_memzone_free(mz);
+		} else {
+			memset(&mlx5_local_data, 0, sizeof(mlx5_local_data));
 		}
+		mlx5_shared_data = NULL;
 	}
 	rte_spinlock_unlock(&mlx5_shared_data_lock);
 }
@@ -760,15 +801,6 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 
 static struct rte_pci_driver mlx5_driver;
 
-/*
- * Reserved UAR address space for TXQ UAR(hw doorbell) mapping, process
- * local resource used by both primary and secondary to avoid duplicate
- * reservation.
- * The space has to be available on both primary and secondary process,
- * TXQ UAR maps to this area using fixed mmap w/o double check.
- */
-static void *uar_base;
-
 static int
 find_lower_va_bound(const struct rte_memseg_list *msl,
 		const struct rte_memseg *ms, void *arg)
@@ -788,25 +820,24 @@ find_lower_va_bound(const struct rte_memseg_list *msl,
 /**
  * Reserve UAR address space for primary process.
  *
- * @param[in] dev
- *   Pointer to Ethernet device.
+ * Process local resource is used by both primary and secondary to avoid
+ * duplicate reservation. The space has to be available on both primary and
+ * secondary process, TXQ UAR maps to this area using fixed mmap w/o double
+ * check.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_uar_init_primary(struct rte_eth_dev *dev)
+mlx5_uar_init_primary(void)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_shared_data *sd = mlx5_shared_data;
 	void *addr = (void *)0;
 
-	if (uar_base) { /* UAR address space mapped. */
-		priv->uar_base = uar_base;
+	if (sd->uar_base)
 		return 0;
-	}
 	/* find out lower bound of hugepage segments */
 	rte_memseg_walk(find_lower_va_bound, &addr);
-
 	/* keep distance to hugepages to minimize potential conflicts. */
 	addr = RTE_PTR_SUB(addr, (uintptr_t)(MLX5_UAR_OFFSET + MLX5_UAR_SIZE));
 	/* anonymous mmap, no real memory consumption. */
@@ -814,65 +845,154 @@ mlx5_uar_init_primary(struct rte_eth_dev *dev)
 		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 	if (addr == MAP_FAILED) {
 		DRV_LOG(ERR,
-			"port %u failed to reserve UAR address space, please"
-			" adjust MLX5_UAR_SIZE or try --base-virtaddr",
-			dev->data->port_id);
+			"Failed to reserve UAR address space, please"
+			" adjust MLX5_UAR_SIZE or try --base-virtaddr");
 		rte_errno = ENOMEM;
 		return -rte_errno;
 	}
 	/* Accept either same addr or a new addr returned from mmap if target
 	 * range occupied.
 	 */
-	DRV_LOG(INFO, "port %u reserved UAR address space: %p",
-		dev->data->port_id, addr);
-	priv->uar_base = addr; /* for primary and secondary UAR re-mmap. */
-	uar_base = addr; /* process local, don't reserve again. */
+	DRV_LOG(INFO, "Reserved UAR address space: %p", addr);
+	sd->uar_base = addr; /* for primary and secondary UAR re-mmap. */
 	return 0;
 }
 
 /**
- * Reserve UAR address space for secondary process, align with
- * primary process.
- *
- * @param[in] dev
- *   Pointer to Ethernet device.
+ * Unmap UAR address space reserved for primary process.
+ */
+static void
+mlx5_uar_uninit_primary(void)
+{
+	struct mlx5_shared_data *sd = mlx5_shared_data;
+
+	if (!sd->uar_base)
+		return;
+	munmap(sd->uar_base, MLX5_UAR_SIZE);
+	sd->uar_base = NULL;
+}
+
+/**
+ * Reserve UAR address space for secondary process, align with primary process.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_uar_init_secondary(struct rte_eth_dev *dev)
+mlx5_uar_init_secondary(void)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_shared_data *sd = mlx5_shared_data;
+	struct mlx5_local_data *ld = &mlx5_local_data;
 	void *addr;
 
-	assert(priv->uar_base);
-	if (uar_base) { /* already reserved. */
-		assert(uar_base == priv->uar_base);
+	if (ld->uar_base) { /* Already reserved. */
+		assert(sd->uar_base == ld->uar_base);
 		return 0;
 	}
+	assert(sd->uar_base);
 	/* anonymous mmap, no real memory consumption. */
-	addr = mmap(priv->uar_base, MLX5_UAR_SIZE,
+	addr = mmap(sd->uar_base, MLX5_UAR_SIZE,
 		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 	if (addr == MAP_FAILED) {
-		DRV_LOG(ERR, "port %u UAR mmap failed: %p size: %llu",
-			dev->data->port_id, priv->uar_base, MLX5_UAR_SIZE);
+		DRV_LOG(ERR, "UAR mmap failed: %p size: %llu",
+			sd->uar_base, MLX5_UAR_SIZE);
 		rte_errno = ENXIO;
 		return -rte_errno;
 	}
-	if (priv->uar_base != addr) {
+	if (sd->uar_base != addr) {
 		DRV_LOG(ERR,
-			"port %u UAR address %p size %llu occupied, please"
+			"UAR address %p size %llu occupied, please"
 			" adjust MLX5_UAR_OFFSET or try EAL parameter"
 			" --base-virtaddr",
-			dev->data->port_id, priv->uar_base, MLX5_UAR_SIZE);
+			sd->uar_base, MLX5_UAR_SIZE);
 		rte_errno = ENXIO;
 		return -rte_errno;
 	}
-	uar_base = addr; /* process local, don't reserve again */
-	DRV_LOG(INFO, "port %u reserved UAR address space: %p",
-		dev->data->port_id, addr);
+	ld->uar_base = addr;
+	DRV_LOG(INFO, "Reserved UAR address space: %p", addr);
+	return 0;
+}
+
+/**
+ * Unmap UAR address space reserved for secondary process.
+ */
+static void
+mlx5_uar_uninit_secondary(void)
+{
+	struct mlx5_local_data *ld = &mlx5_local_data;
+
+	if (!ld->uar_base)
+		return;
+	munmap(ld->uar_base, MLX5_UAR_SIZE);
+	ld->uar_base = NULL;
+}
+
+/**
+ * PMD global initialization.
+ *
+ * Independent from individual device, this function initializes global
+ * per-PMD data structures distinguishing primary and secondary processes.
+ * Hence, each initialization is called once per a process.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_init_once(void)
+{
+	struct mlx5_shared_data *sd;
+	struct mlx5_local_data *ld = &mlx5_local_data;
+	int ret;
+
+	if (mlx5_init_shared_data())
+		return -rte_errno;
+	sd = mlx5_shared_data;
+	assert(sd);
+	rte_spinlock_lock(&sd->lock);
+	switch (rte_eal_process_type()) {
+	case RTE_PROC_PRIMARY:
+		if (sd->init_done)
+			break;
+		LIST_INIT(&sd->mem_event_cb_list);
+		rte_rwlock_init(&sd->mem_event_rwlock);
+		rte_mem_event_callback_register("MLX5_MEM_EVENT_CB",
+						mlx5_mr_mem_event_cb, NULL);
+		mlx5_mp_init_primary();
+		ret = mlx5_uar_init_primary();
+		if (ret)
+			goto error;
+		sd->init_done = true;
+		break;
+	case RTE_PROC_SECONDARY:
+		if (ld->init_done)
+			break;
+		ret = mlx5_uar_init_secondary();
+		if (ret)
+			goto error;
+		++sd->secondary_cnt;
+		ld->init_done = true;
+		break;
+	default:
+		break;
+	}
+	rte_spinlock_unlock(&sd->lock);
 	return 0;
+error:
+	switch (rte_eal_process_type()) {
+	case RTE_PROC_PRIMARY:
+		mlx5_uar_uninit_primary();
+		mlx5_mp_uninit_primary();
+		rte_mem_event_callback_unregister("MLX5_MEM_EVENT_CB", NULL);
+		break;
+	case RTE_PROC_SECONDARY:
+		mlx5_uar_uninit_secondary();
+		break;
+	default:
+		break;
+	}
+	rte_spinlock_unlock(&sd->lock);
+	mlx5_uninit_shared_data();
+	return -rte_errno;
 }
 
 /**
@@ -953,8 +1073,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		rte_errno = EEXIST;
 		return NULL;
 	}
-	/* Prepare shared data between primary and secondary process. */
-	mlx5_prepare_shared_data();
 	DRV_LOG(DEBUG, "naming Ethernet device \"%s\"", name);
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
 		eth_dev = rte_eth_dev_attach_secondary(name);
@@ -965,9 +1083,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		}
 		eth_dev->device = dpdk_dev;
 		eth_dev->dev_ops = &mlx5_dev_sec_ops;
-		err = mlx5_uar_init_secondary(eth_dev);
-		if (err)
-			return NULL;
 		/* Receive command fd from primary process */
 		err = mlx5_mp_req_verbs_cmd_fd(eth_dev);
 		if (err < 0)
@@ -1286,11 +1401,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
 	eth_dev->device = dpdk_dev;
-	err = mlx5_uar_init_primary(eth_dev);
-	if (err) {
-		err = rte_errno;
-		goto error;
-	}
 	/* Configure the first MAC address by default. */
 	if (mlx5_get_mac(eth_dev, &mac.addr_bytes)) {
 		DRV_LOG(ERR,
@@ -1514,6 +1624,12 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_config dev_config;
 	int ret;
 
+	ret = mlx5_init_once();
+	if (ret) {
+		DRV_LOG(ERR, "unable to init PMD global data: %s",
+			strerror(rte_errno));
+		return -rte_errno;
+	}
 	assert(pci_drv == &mlx5_driver);
 	errno = 0;
 	ibv_list = mlx5_glue->get_device_list(&ret);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index f454adb3c1..fe30353bd0 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -85,12 +85,25 @@ struct mlx5_switch_info {
 
 LIST_HEAD(mlx5_dev_list, mlx5_priv);
 
-/* Shared memory between primary and secondary processes. */
+/* Shared data between primary and secondary processes. */
 struct mlx5_shared_data {
+	rte_spinlock_t lock;
+	/* Global spinlock for primary and secondary processes. */
+	int init_done; /* Whether primary has done initialization. */
+	unsigned int secondary_cnt; /* Number of secondary processes init'd. */
+	void *uar_base;
+	/* Reserved UAR address space for TXQ UAR(hw doorbell) mapping. */
 	struct mlx5_dev_list mem_event_cb_list;
 	rte_rwlock_t mem_event_rwlock;
 };
 
+/* Per-process data structure, not visible to other processes. */
+struct mlx5_local_data {
+	int init_done; /* Whether a secondary has done initialization. */
+	void *uar_base;
+	/* Reserved UAR address space for TXQ UAR(hw doorbell) mapping. */
+};
+
 extern struct mlx5_shared_data *mlx5_shared_data;
 
 struct mlx5_counter_ctrl {
@@ -290,7 +303,6 @@ struct mlx5_priv {
 	uint32_t link_speed_capa; /* Link speed capabilities. */
 	struct mlx5_xstats_ctrl xstats_ctrl; /* Extended stats control. */
 	struct mlx5_stats_ctrl stats_ctrl; /* Stats control. */
-	void *uar_base; /* Reserved address space for UAR mapping */
 	struct mlx5_dev_config config; /* Device configuration. */
 	struct mlx5_verbs_alloc_ctx verbs_alloc_ctx;
 	/* Context for Verbs allocator. */
@@ -450,7 +462,8 @@ void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
 /* mlx5_mp.c */
 int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
-void mlx5_mp_init(void);
+void mlx5_mp_init_primary(void);
+void mlx5_mp_uninit_primary(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
index 71a2b663fa..701ee1d260 100644
--- a/drivers/net/mlx5/mlx5_mp.c
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -131,9 +131,22 @@ mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
 	return ret;
 }
 
+/**
+ * Initialize by primary process.
+ */
+void
+mlx5_mp_init_primary(void)
+{
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+}
+
+/**
+ * Un-initialize by primary process.
+ */
 void
-mlx5_mp_init(void)
+mlx5_mp_uninit_primary(void)
 {
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-		rte_mp_action_register(MLX5_MP_NAME, mp_primary_handle);
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	rte_mp_action_unregister(MLX5_MP_NAME);
 }
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 5062f5c398..1b3d89f2f6 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -286,7 +286,7 @@ mlx5_tx_uar_remap(struct rte_eth_dev *dev, int fd)
 			}
 		}
 		/* new address in reserved UAR address space. */
-		addr = RTE_PTR_ADD(priv->uar_base,
+		addr = RTE_PTR_ADD(mlx5_shared_data->uar_base,
 				   uar_va & (uintptr_t)(MLX5_UAR_SIZE - 1));
 		if (!already_mapped) {
 			pages[pages_n++] = uar_va;
@@ -844,9 +844,8 @@ mlx5_txq_release(struct rte_eth_dev *dev, uint16_t idx)
 	txq = container_of((*priv->txqs)[idx], struct mlx5_txq_ctrl, txq);
 	if (txq->ibv && !mlx5_txq_ibv_release(txq->ibv))
 		txq->ibv = NULL;
-	if (priv->uar_base)
-		munmap((void *)RTE_ALIGN_FLOOR((uintptr_t)txq->txq.bf_reg,
-		       page_size), page_size);
+	munmap((void *)RTE_ALIGN_FLOOR((uintptr_t)txq->txq.bf_reg, page_size),
+	       page_size);
 	if (rte_atomic32_dec_and_test(&txq->refcnt)) {
 		txq_free_elts(txq);
 		mlx5_mr_btree_free(&txq->txq.mr_ctrl.cache_bh);
-- 
2.11.0
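
Reader's note (not part of the patch): the once-per-process logic of
mlx5_init_once() can be illustrated with the minimal sketch below. It
is only a model under stated assumptions: every name in it (proc_role,
shared_data, local_data, setup_step, init_once) is hypothetical, the
shared structure is a plain variable rather than a memzone, and the
spinlock that the patch takes around the whole sequence is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the driver keeps the shared part in a memzone. */
enum proc_role { ROLE_PRIMARY, ROLE_SECONDARY };

struct shared_data {		/* visible to every process */
	bool init_done;
	unsigned int secondary_cnt;
};

struct local_data {		/* private to one process */
	bool init_done;
};

/* Counts how often the per-process setup actually ran. */
static int setup_calls;

/* Pretend setup step; 'fail' lets a caller exercise the error path. */
static int
setup_step(bool fail)
{
	if (fail)
		return -1;
	++setup_calls;
	return 0;
}

/*
 * Run global setup at most once per role, however many devices are probed.
 * Returns 0 on success, -1 if the setup step failed; in that case the
 * flags stay clear so that a later probe can retry.
 */
static int
init_once(enum proc_role role, struct shared_data *sd, struct local_data *ld,
	  bool fail)
{
	switch (role) {
	case ROLE_PRIMARY:
		if (sd->init_done)
			return 0;
		if (setup_step(fail))
			return -1;
		sd->init_done = true;
		break;
	case ROLE_SECONDARY:
		if (ld->init_done)
			return 0;
		if (setup_step(fail))
			return -1;
		++sd->secondary_cnt;
		ld->init_done = true;
		break;
	}
	return 0;
}
```

Calling init_once() once per probed device is then harmless: only the
first successful call per process does any work, and each secondary is
counted exactly once in the shared structure.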

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v3 4/4] net/mlx5: sync stop/start of datapath with secondary process
  2019-04-01 21:12 ` [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
                     ` (3 preceding siblings ...)
  2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 3/4] net/mlx5: rework PMD global data init Yongseok Koh
@ 2019-04-01 21:12   ` Yongseok Koh
  2019-04-01 21:12     ` Yongseok Koh
  2019-04-02  7:11   ` [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init Shahaf Shuler
  5 siblings, 1 reply; 45+ messages in thread
From: Yongseok Koh @ 2019-04-01 21:12 UTC (permalink / raw)
  To: shahafs; +Cc: dev

Rx/Tx burst function pointers are stored in the rte_eth_dev structure,
which is local to each process. When the primary process replaces its
function pointers, secondary processes keep running their old ones.
With the rte_mp APIs, the primary can broadcast a request to stop/start
the datapath of secondary processes.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
---
 drivers/net/mlx5/mlx5.c         |   5 ++
 drivers/net/mlx5/mlx5.h         |   6 ++
 drivers/net/mlx5/mlx5_mp.c      | 156 ++++++++++++++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.c    |   2 +
 drivers/net/mlx5/mlx5_trigger.c |   5 ++
 5 files changed, 174 insertions(+)
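
Reader's note (not part of the patch): the sketch below illustrates why
a request has to reach each secondary. It is a simplified model, not
the driver code: proc_dev, burst_fn, real_rx_burst, stub_rx_burst and
handle_req are hypothetical names, and two structures in one program
stand in for the per-process copies of rte_eth_dev.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request types mirroring the START/STOP idea in this patch. */
enum req_type { REQ_START_RXTX = 1, REQ_STOP_RXTX };

typedef uint16_t (*burst_fn)(void *queue, void **pkts, uint16_t n);

/* Each process owns its own copy of this structure. */
struct proc_dev {
	burst_fn rx_burst;
};

/* Normal datapath stand-in: pretends every requested packet arrived. */
static uint16_t
real_rx_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	return n;
}

/* Placeholder used while the port is stopped: never touches a queue. */
static uint16_t
stub_rx_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	(void)n;
	return 0;
}

/*
 * Handler a secondary would run on an incoming request. It updates only
 * the caller's own copy of the pointer. Returns 0, or -1 for an unknown
 * request type.
 */
static int
handle_req(struct proc_dev *dev, enum req_type type)
{
	switch (type) {
	case REQ_START_RXTX:
		dev->rx_burst = real_rx_burst;
		return 0;
	case REQ_STOP_RXTX:
		dev->rx_burst = stub_rx_burst;
		return 0;
	default:
		return -1;
	}
}
```

Assigning stub_rx_burst to one proc_dev leaves every other copy
untouched, which is the situation the commit message describes; only
a call to handle_req() on the secondary's own copy changes what that
process runs.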

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 14dfad34aa..2b7a6d121f 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -470,6 +470,9 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 	/* Prevent crashes when queues are still in use. */
 	dev->rx_pkt_burst = removed_rx_burst;
 	dev->tx_pkt_burst = removed_tx_burst;
+	rte_wmb();
+	/* Disable datapath on secondary process. */
+	mlx5_mp_req_stop_rxtx(dev);
 	if (priv->rxqs != NULL) {
 		/* XXX race condition if mlx5_rx_burst() is still running. */
 		usleep(1000);
@@ -966,6 +969,7 @@ mlx5_init_once(void)
 	case RTE_PROC_SECONDARY:
 		if (ld->init_done)
 			break;
+		mlx5_mp_init_secondary();
 		ret = mlx5_uar_init_secondary();
 		if (ret)
 			goto error;
@@ -986,6 +990,7 @@ mlx5_init_once(void)
 		break;
 	case RTE_PROC_SECONDARY:
 		mlx5_uar_uninit_secondary();
+		mlx5_mp_uninit_secondary();
 		break;
 	default:
 		break;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index fe30353bd0..12692505c3 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -59,6 +59,8 @@ enum {
 /* Request types for IPC. */
 enum mlx5_mp_req_type {
 	MLX5_MP_REQ_VERBS_CMD_FD = 1,
+	MLX5_MP_REQ_START_RXTX,
+	MLX5_MP_REQ_STOP_RXTX,
 };
 
 /* Pameters for IPC. */
@@ -461,9 +463,13 @@ int mlx5_flow_create_drop_queue(struct rte_eth_dev *dev);
 void mlx5_flow_delete_drop_queue(struct rte_eth_dev *dev);
 
 /* mlx5_mp.c */
+void mlx5_mp_req_start_rxtx(struct rte_eth_dev *dev);
+void mlx5_mp_req_stop_rxtx(struct rte_eth_dev *dev);
 int mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev);
 void mlx5_mp_init_primary(void);
 void mlx5_mp_uninit_primary(void);
+void mlx5_mp_init_secondary(void);
+void mlx5_mp_uninit_secondary(void);
 
 /* mlx5_nl.c */
 
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
index 701ee1d260..45dcc30426 100644
--- a/drivers/net/mlx5/mlx5_mp.c
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -12,6 +12,7 @@
 #include <rte_string_fns.h>
 
 #include "mlx5.h"
+#include "mlx5_rxtx.h"
 #include "mlx5_utils.h"
 
 /**
@@ -85,6 +86,141 @@ mp_primary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
 }
 
 /**
+ * IPC message handler of a secondary process.
+ *
+ * @param[in] mp_msg
+ *   Pointer to the request message.
+ * @param[in] peer
+ *   Pointer to the peer socket path.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mp_secondary_handle(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	struct rte_mp_msg mp_res;
+	struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
+	const struct mlx5_mp_param *param =
+		(const struct mlx5_mp_param *)mp_msg->param;
+	struct rte_eth_dev *dev;
+	int ret;
+
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	if (!rte_eth_dev_is_valid_port(param->port_id)) {
+		rte_errno = ENODEV;
+		DRV_LOG(ERR, "port %u invalid port ID", param->port_id);
+		return -rte_errno;
+	}
+	dev = &rte_eth_devices[param->port_id];
+	switch (param->type) {
+	case MLX5_MP_REQ_START_RXTX:
+		DRV_LOG(INFO, "port %u starting datapath", dev->data->port_id);
+		rte_mb();
+		dev->rx_pkt_burst = mlx5_select_rx_function(dev);
+		dev->tx_pkt_burst = mlx5_select_tx_function(dev);
+		mp_init_msg(dev, &mp_res, param->type);
+		res->result = 0;
+		ret = rte_mp_reply(&mp_res, peer);
+		break;
+	case MLX5_MP_REQ_STOP_RXTX:
+		DRV_LOG(INFO, "port %u stopping datapath", dev->data->port_id);
+		dev->rx_pkt_burst = removed_rx_burst;
+		dev->tx_pkt_burst = removed_tx_burst;
+		rte_mb();
+		mp_init_msg(dev, &mp_res, param->type);
+		res->result = 0;
+		ret = rte_mp_reply(&mp_res, peer);
+		break;
+	default:
+		rte_errno = EINVAL;
+		DRV_LOG(ERR, "port %u invalid mp request type",
+			dev->data->port_id);
+		return -rte_errno;
+	}
+	return ret;
+}
+
+/**
+ * Broadcast request of stopping/starting data-path to secondary processes.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ * @param[in] type
+ *   Request type.
+ */
+static void
+mp_req_on_rxtx(struct rte_eth_dev *dev, enum mlx5_mp_req_type type)
+{
+	struct rte_mp_msg mp_req;
+	struct rte_mp_msg *mp_res;
+	struct rte_mp_reply mp_rep;
+	struct mlx5_mp_param *res;
+	struct timespec ts = {.tv_sec = MLX5_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
+	int ret;
+	int i;
+
+	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
+	if (!mlx5_shared_data->secondary_cnt)
+		return;
+	if (type != MLX5_MP_REQ_START_RXTX && type != MLX5_MP_REQ_STOP_RXTX) {
+		DRV_LOG(ERR, "port %u unknown request (req_type %d)",
+			dev->data->port_id, type);
+		return;
+	}
+	mp_init_msg(dev, &mp_req, type);
+	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
+	if (ret) {
+		DRV_LOG(ERR, "port %u failed to request stop/start Rx/Tx (%d)",
+			dev->data->port_id, type);
+		goto exit;
+	}
+	if (mp_rep.nb_sent != mp_rep.nb_received) {
+		DRV_LOG(ERR,
+			"port %u not all secondaries responded (req_type %d)",
+			dev->data->port_id, type);
+		goto exit;
+	}
+	for (i = 0; i < mp_rep.nb_received; i++) {
+		mp_res = &mp_rep.msgs[i];
+		res = (struct mlx5_mp_param *)mp_res->param;
+		if (res->result) {
+			DRV_LOG(ERR, "port %u request failed on secondary #%d",
+				dev->data->port_id, i);
+			goto exit;
+		}
+	}
+exit:
+	free(mp_rep.msgs);
+}
+
+/**
+ * Broadcast request of starting data-path to secondary processes. The request
+ * is synchronous.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ */
+void
+mlx5_mp_req_start_rxtx(struct rte_eth_dev *dev)
+{
+	mp_req_on_rxtx(dev, MLX5_MP_REQ_START_RXTX);
+}
+
+/**
+ * Broadcast request of stopping data-path to secondary processes. The request
+ * is synchronous.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet structure.
+ */
+void
+mlx5_mp_req_stop_rxtx(struct rte_eth_dev *dev)
+{
+	mp_req_on_rxtx(dev, MLX5_MP_REQ_STOP_RXTX);
+}
+
+/**
  * Request Verbs command file descriptor for mmap to the primary process.
  *
  * @param[in] dev
@@ -150,3 +286,23 @@ mlx5_mp_uninit_primary(void)
 	assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
 	rte_mp_action_unregister(MLX5_MP_NAME);
 }
+
+/**
+ * Initialize by secondary process.
+ */
+void
+mlx5_mp_init_secondary(void)
+{
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	rte_mp_action_register(MLX5_MP_NAME, mp_secondary_handle);
+}
+
+/**
+ * Un-initialize by secondary process.
+ */
+void
+mlx5_mp_uninit_secondary(void)
+{
+	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
+	rte_mp_action_unregister(MLX5_MP_NAME);
+}
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 38ce0e29a2..3da3f62fa9 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -2373,6 +2373,7 @@ removed_tx_burst(void *dpdk_txq __rte_unused,
 		 struct rte_mbuf **pkts __rte_unused,
 		 uint16_t pkts_n __rte_unused)
 {
+	rte_mb();
 	return 0;
 }
 
@@ -2397,6 +2398,7 @@ removed_rx_burst(void *dpdk_txq __rte_unused,
 		 struct rte_mbuf **pkts __rte_unused,
 		 uint16_t pkts_n __rte_unused)
 {
+	rte_mb();
 	return 0;
 }
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index d13a1a1a0f..5b73f0ff03 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -194,8 +194,11 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 			dev->data->port_id);
 		goto error;
 	}
+	rte_wmb();
 	dev->tx_pkt_burst = mlx5_select_tx_function(dev);
 	dev->rx_pkt_burst = mlx5_select_rx_function(dev);
+	/* Enable datapath on secondary process. */
+	mlx5_mp_req_start_rxtx(dev);
 	mlx5_dev_interrupt_handler_install(dev);
 	return 0;
 error:
@@ -228,6 +231,8 @@ mlx5_dev_stop(struct rte_eth_dev *dev)
 	dev->rx_pkt_burst = removed_rx_burst;
 	dev->tx_pkt_burst = removed_tx_burst;
 	rte_wmb();
+	/* Disable datapath on secondary process. */
+	mlx5_mp_req_stop_rxtx(dev);
 	usleep(1000 * priv->rxqs_n);
 	DRV_LOG(DEBUG, "port %u stopping device", dev->data->port_id);
 	mlx5_flow_stop(dev, &priv->flows);
-- 
2.11.0
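
The reason for the IPC request in this patch can be illustrated outside
DPDK. What follows is only a sketch under stated assumptions: it uses plain
fork() and pipe() in place of the rte_mp APIs, and every name in it
(burst_fn, real_burst, removed_burst, secondary_handle,
secondary_burst_after_stop, REQ_START_RXTX, REQ_STOP_RXTX) is hypothetical
and not taken from the driver. It shows that a burst pointer swapped in the
primary stays unchanged in a secondary until the secondary is asked to swap
its own copy.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for the PMD burst functions. */
typedef uint16_t (*burst_fn)(void);

static uint16_t real_burst(void) { return 32; }   /* pretends to move 32 packets */
static uint16_t removed_burst(void) { return 0; } /* datapath disabled */

/* Each process owns a private copy after fork(), like rte_eth_dev. */
static burst_fn rx_burst = real_burst;

enum req_type { REQ_START_RXTX = 1, REQ_STOP_RXTX };

/* Secondary side: apply one request to the process-local pointer. */
static int
secondary_handle(int type)
{
	switch (type) {
	case REQ_START_RXTX:
		rx_burst = real_burst;
		return 0;
	case REQ_STOP_RXTX:
		rx_burst = removed_burst;
		return 0;
	default:
		return -1;
	}
}

/*
 * Fork a "secondary", disable the datapath in the primary and optionally
 * send a stop request. Returns what the secondary's burst function reports
 * afterwards, or -1 on error.
 */
static int
secondary_burst_after_stop(int send_request)
{
	int req[2], rep[2], type = REQ_STOP_RXTX, ok = 1;
	uint16_t n = 0;
	pid_t pid;

	assert(rx_burst == real_burst); /* datapath starts enabled */
	if (pipe(req) != 0 || pipe(rep) != 0)
		return -1;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Secondary: wait for a request or EOF, then run one burst. */
		close(req[1]);
		close(rep[0]);
		if (read(req[0], &type, sizeof(type)) == (ssize_t)sizeof(type))
			secondary_handle(type);
		n = rx_burst();
		_exit(write(rep[1], &n, sizeof(n)) == (ssize_t)sizeof(n) ? 0 : 1);
	}
	/* Primary: this store is invisible to the already forked secondary. */
	close(req[0]);
	close(rep[1]);
	rx_burst = removed_burst;
	if (send_request)
		ok = write(req[1], &type, sizeof(type)) == (ssize_t)sizeof(type);
	close(req[1]); /* EOF unblocks the secondary when nothing was sent */
	ok = ok && read(rep[0], &n, sizeof(n)) == (ssize_t)sizeof(n);
	close(rep[0]);
	waitpid(pid, NULL, 0);
	rx_burst = real_burst; /* restore so the function can be called again */
	return ok ? (int)n : -1;
}
```

Calling secondary_burst_after_stop(0) models the behaviour before this
patch: the secondary keeps running its old burst function. Calling it with 1
models the added request, after which the secondary runs the dummy function.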


* Re: [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init
  2019-04-01 21:12 ` [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
                     ` (4 preceding siblings ...)
  2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 4/4] net/mlx5: sync stop/start of datapath with secondary process Yongseok Koh
@ 2019-04-02  7:11   ` Shahaf Shuler
  2019-04-02  7:11     ` Shahaf Shuler
  5 siblings, 1 reply; 45+ messages in thread
From: Shahaf Shuler @ 2019-04-02  7:11 UTC (permalink / raw)
  To: Yongseok Koh; +Cc: dev

Tuesday, April 2, 2019 12:13 AM, Yongseok Koh:
> Subject: [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD
> global data init
> 
> The existing socket-based IPC channel is replaced with the new rte_mp APIs
> of EAL and extended to request stop/start of dataplane to secondary
> processes.
> Also, initialization of PMD global data including the new IPC channel is
> reworked to provide more generic framework for future use.
> 
> v3:
> * rebase on the latest branch tip
> 
> v2:
> * add more sanity check for eth_dev and return value from IPC request
> * complement commit messages
> * add MLX5_MP_REQ_TIMEOUT_SEC
> 
> Yongseok Koh (4):
>   net/mlx5: fix memory event on secondary process
>   net/mlx5: replace IPC socket with EAL API
>   net/mlx5: rework PMD global data init
>   net/mlx5: sync stop/start of datapath with secondary process

Series applied to next-net-mlx, thanks.
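
The v2 changelog quoted above mentions an added sanity check on the return
value of the IPC request. A minimal sketch of that kind of check follows,
assuming a simplified reply record: struct reply_set, enum check_status and
check_replies() are hypothetical names for illustration and are not the
rte_mp types or the driver's code. The idea is that a broadcast request only
counts as successful when every peer answered and every answer carries a
zero result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reply record: counts plus one result code per reply. */
struct reply_set {
	int nb_sent;        /* requests sent to peers */
	int nb_received;    /* replies that came back in time */
	const int *results; /* nb_received entries, 0 means success */
};

enum check_status {
	CHECK_OK = 0,
	CHECK_MISSING_REPLY = -1, /* at least one peer did not answer */
	CHECK_PEER_FAILED = -2,   /* a peer answered with an error */
};

/* Validate a set of replies collected after a broadcast request. */
static enum check_status
check_replies(const struct reply_set *rs)
{
	int i;

	assert(rs != NULL);
	if (rs->nb_sent != rs->nb_received)
		return CHECK_MISSING_REPLY;
	for (i = 0; i < rs->nb_received; i++)
		if (rs->results[i] != 0)
			return CHECK_PEER_FAILED;
	return CHECK_OK;
}
```

Checking the counts before the per-reply results keeps the loop from
treating a timed-out peer as an implicit success.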


Thread overview: 45+ messages
2019-03-07  7:33 [dpdk-dev] [PATCH 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
2019-03-07  7:33 ` [dpdk-dev] [PATCH 1/4] net/mlx5: fix memory event on secondary process Yongseok Koh
2019-03-07  7:33 ` [dpdk-dev] [PATCH 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
2019-03-14 12:36   ` Shahaf Shuler
2019-03-14 12:36     ` Shahaf Shuler
2019-03-18 21:29     ` Yongseok Koh
2019-03-18 21:29       ` Yongseok Koh
2019-03-07  7:33 ` [dpdk-dev] [PATCH 3/4] net/mlx5: rework PMD global data init Yongseok Koh
2019-03-14 12:36   ` Shahaf Shuler
2019-03-14 12:36     ` Shahaf Shuler
2019-03-18 21:21     ` Yongseok Koh
2019-03-18 21:21       ` Yongseok Koh
2019-03-19  6:54       ` Shahaf Shuler
2019-03-19  6:54         ` Shahaf Shuler
2019-03-07  7:33 ` [dpdk-dev] [PATCH 4/4] net/mlx5: sync stop/start of datapath with secondary process Yongseok Koh
2019-03-25 19:15 ` [dpdk-dev] [PATCH v2 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
2019-03-25 19:15   ` Yongseok Koh
2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 1/4] net/mlx5: fix memory event on secondary process Yongseok Koh
2019-03-25 19:15     ` Yongseok Koh
2019-03-26 12:28     ` Shahaf Shuler
2019-03-26 12:28       ` Shahaf Shuler
2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
2019-03-25 19:15     ` Yongseok Koh
2019-03-26 12:31     ` Shahaf Shuler
2019-03-26 12:31       ` Shahaf Shuler
2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 3/4] net/mlx5: rework PMD global data init Yongseok Koh
2019-03-25 19:15     ` Yongseok Koh
2019-03-26 12:38     ` Shahaf Shuler
2019-03-26 12:38       ` Shahaf Shuler
2019-03-25 19:15   ` [dpdk-dev] [PATCH v2 4/4] net/mlx5: sync stop/start of datapath with secondary process Yongseok Koh
2019-03-25 19:15     ` Yongseok Koh
2019-03-26 12:49     ` Shahaf Shuler
2019-03-26 12:49       ` Shahaf Shuler
2019-04-01 21:12 ` [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init Yongseok Koh
2019-04-01 21:12   ` Yongseok Koh
2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 1/4] net/mlx5: fix memory event on secondary process Yongseok Koh
2019-04-01 21:12     ` Yongseok Koh
2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 2/4] net/mlx5: replace IPC socket with EAL API Yongseok Koh
2019-04-01 21:12     ` Yongseok Koh
2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 3/4] net/mlx5: rework PMD global data init Yongseok Koh
2019-04-01 21:12     ` Yongseok Koh
2019-04-01 21:12   ` [dpdk-dev] [PATCH v3 4/4] net/mlx5: sync stop/start of datapath with secondary process Yongseok Koh
2019-04-01 21:12     ` Yongseok Koh
2019-04-02  7:11   ` [dpdk-dev] [PATCH v3 0/4] net/mlx5: rework IPC socket and PMD global data init Shahaf Shuler
2019-04-02  7:11     ` Shahaf Shuler