patches for DPDK stable branches
 help / color / mirror / Atom feed
* [dpdk-stable] [PATCH 19.11 0/2] net/mlx5: backport secondary process crash fix
@ 2021-02-08  3:03 Suanming Mou
  2021-02-08  3:04 ` [dpdk-stable] [PATCH 19.11 1/2] net/mlx5: fix crash on secondary process port close Suanming Mou
  2021-02-08  3:04 ` [dpdk-stable] [PATCH 19.11 2/2] net/mlx5: fix port attach in secondary process Suanming Mou
  0 siblings, 2 replies; 3+ messages in thread
From: Suanming Mou @ 2021-02-08  3:03 UTC (permalink / raw)
  To: christian.ehrhardt; +Cc: viacheslavo, stable

backport secondary process crash fix

Suanming Mou (2):
  net/mlx5: fix crash on secondary process port close
  net/mlx5: fix port attach in secondary process

 drivers/net/mlx5/mlx5.c     |  6 +++---
 drivers/net/mlx5/mlx5.h     |  6 +++++-
 drivers/net/mlx5/mlx5_mp.c  | 22 ++++++++++++++++++++++
 drivers/net/mlx5/mlx5_txq.c | 21 +++++++++++++--------
 4 files changed, 43 insertions(+), 12 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [dpdk-stable] [PATCH 19.11 1/2] net/mlx5: fix crash on secondary process port close
  2021-02-08  3:03 [dpdk-stable] [PATCH 19.11 0/2] net/mlx5: backport secondary process crash fix Suanming Mou
@ 2021-02-08  3:04 ` Suanming Mou
  2021-02-08  3:04 ` [dpdk-stable] [PATCH 19.11 2/2] net/mlx5: fix port attach in secondary process Suanming Mou
  1 sibling, 0 replies; 3+ messages in thread
From: Suanming Mou @ 2021-02-08  3:04 UTC (permalink / raw)
  To: christian.ehrhardt; +Cc: viacheslavo, stable

[ upstream commit 84a22cbcc467a02772dd24156773940331994c84 ]

When secondary process starts, in rte_eth_dev_attach_secondary()
function, the secondary process port device data in struct rte_eth_dev
will be initialized to be shared with primary process port.

When failsafe sub-port hot-plug happens, both primary and secondary
process will release the sub-port, and primary process will clear the
sub-port device data in fs_dev_remove() deactivate stage first before
request secondary process to release the sub-port. In this case, the
secondary process will not be able to get the priv memory pointer from
the shared device data memory anymore, since the device data memory
has been cleared.

Since what secondary process needs in port detach is the UAR table size
to unmap the UAR addresses. It used Tx queue number as size of UAR table
in priv. In fact the uar_table_sz in struct mlx5_proc_priv means the
size of UAR register table - the number of UAR records. However, the
code set this field incorrectly to the size of mlx5_proc_priv structure.

This commit fixes UAR table size to match with relevant Tx queue number,
uses the UAR table size directly to avoid the secondary process to
access the priv pointer in the shared device data memory when unmapping
the UAR address.

Fixes: 120dc4a7dcd3 ("net/mlx5: remove device register remap")
Cc: stable@dpdk.org

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5.c     |  4 ++--
 drivers/net/mlx5/mlx5_txq.c | 21 +++++++++++++--------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 4307005..ac47452 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1251,13 +1251,13 @@ struct mlx5_flow_id_pool *
 	 */
 	ppriv_size =
 		sizeof(struct mlx5_proc_priv) + priv->txqs_n * sizeof(void *);
-	ppriv = rte_malloc_socket("mlx5_proc_priv", ppriv_size,
+	ppriv = rte_zmalloc_socket("mlx5_proc_priv", ppriv_size,
 				  RTE_CACHE_LINE_SIZE, dev->device->numa_node);
 	if (!ppriv) {
 		rte_errno = ENOMEM;
 		return -rte_errno;
 	}
-	ppriv->uar_table_sz = ppriv_size;
+	ppriv->uar_table_sz = priv->txqs_n;
 	dev->process_private = ppriv;
 	return 0;
 }
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 36aa9b5..0fb3ceb 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -433,18 +433,23 @@
 void
 mlx5_tx_uar_uninit_secondary(struct rte_eth_dev *dev)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct mlx5_txq_data *txq;
-	struct mlx5_txq_ctrl *txq_ctrl;
+	struct mlx5_proc_priv *ppriv = (struct mlx5_proc_priv *)
+					dev->process_private;
+	const size_t page_size = sysconf(_SC_PAGESIZE);
+	void *addr;
 	unsigned int i;
 
+	if (page_size == (size_t)-1) {
+		DRV_LOG(ERR, "Failed to get mem page size");
+		return;
+	}
 	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
-	for (i = 0; i != priv->txqs_n; ++i) {
-		if (!(*priv->txqs)[i])
+	for (i = 0; i != ppriv->uar_table_sz; ++i) {
+		if (!ppriv->uar_table[i])
 			continue;
-		txq = (*priv->txqs)[i];
-		txq_ctrl = container_of(txq, struct mlx5_txq_ctrl, txq);
-		txq_uar_uninit_secondary(txq_ctrl);
+		addr = ppriv->uar_table[i];
+		munmap(RTE_PTR_ALIGN_FLOOR(addr, page_size), page_size);
+
 	}
 }
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [dpdk-stable] [PATCH 19.11 2/2] net/mlx5: fix port attach in secondary process
  2021-02-08  3:03 [dpdk-stable] [PATCH 19.11 0/2] net/mlx5: backport secondary process crash fix Suanming Mou
  2021-02-08  3:04 ` [dpdk-stable] [PATCH 19.11 1/2] net/mlx5: fix crash on secondary process port close Suanming Mou
@ 2021-02-08  3:04 ` Suanming Mou
  1 sibling, 0 replies; 3+ messages in thread
From: Suanming Mou @ 2021-02-08  3:04 UTC (permalink / raw)
  To: christian.ehrhardt; +Cc: viacheslavo, stable

[ upstream commit 2b36c30b8cd87585502dec0d2b89e76193f13fc3 ]

Currently, the secondary process port UAR register mapping used by Tx
queue is done during port initializing.

Unluckily, in port hot-plug case, the secondary process was requested
to initialize the port when primary process did not complete the
device configuration and the port Tx queue number is not configured
yet. Hence, the secondary process gets the zero Tx queue number during
probing, causing the UAR registers not be mapped in the correct
fashion.

This commit checks the configured number of Tx queues in secondary
process when the port start is requested. In case the Tx queue
number mismatch found the UAR mapping is reinitialized accordingly.

Fixes: 2aac5b5d119f ("net/mlx5: sync stop/start with secondary process")
Cc: stable@dpdk.org

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5.c    |  2 +-
 drivers/net/mlx5/mlx5.h    |  6 +++++-
 drivers/net/mlx5/mlx5_mp.c | 22 ++++++++++++++++++++++
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index ac47452..f8de9e3 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1268,7 +1268,7 @@ struct mlx5_flow_id_pool *
  * @param dev
  *   Pointer to Ethernet device structure.
  */
-static void
+void
 mlx5_proc_priv_uninit(struct rte_eth_dev *dev)
 {
 	if (!dev->process_private)
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 206c8e9..0e4a5f8 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -699,7 +699,10 @@ struct mlx5_ibv_shared {
 	struct mlx5_ibv_shared_port port[]; /* per device port data array. */
 };
 
-/* Per-process private structure. */
+/*
+ * Per-process private structure.
+ * Caution, secondary process may rebuild the struct during port start.
+ */
 struct mlx5_proc_priv {
 	size_t uar_table_sz;
 	/* Size of UAR register table. */
@@ -799,6 +802,7 @@ int64_t mlx5_get_dbr(struct rte_eth_dev *dev,
 		     struct mlx5_devx_dbr_page **dbr_page);
 int32_t mlx5_release_dbr(struct rte_eth_dev *dev, uint32_t umem_id,
 			 uint64_t offset);
+void mlx5_proc_priv_uninit(struct rte_eth_dev *dev);
 int mlx5_udp_tunnel_port_add(struct rte_eth_dev *dev,
 			      struct rte_eth_udp_tunnel *udp_tunnel);
 uint16_t mlx5_eth_find_next(uint16_t port_id, struct rte_pci_device *pci_dev);
diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
index 2a031e2..e889247 100644
--- a/drivers/net/mlx5/mlx5_mp.c
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -119,6 +119,8 @@
 	const struct mlx5_mp_param *param =
 		(const struct mlx5_mp_param *)mp_msg->param;
 	struct rte_eth_dev *dev;
+	struct mlx5_proc_priv *ppriv;
+	struct mlx5_priv *priv;
 	int ret;
 
 	assert(rte_eal_process_type() == RTE_PROC_SECONDARY);
@@ -128,12 +130,27 @@
 		return -rte_errno;
 	}
 	dev = &rte_eth_devices[param->port_id];
+	priv = dev->data->dev_private;
 	switch (param->type) {
 	case MLX5_MP_REQ_START_RXTX:
 		DRV_LOG(INFO, "port %u starting datapath", dev->data->port_id);
 		rte_mb();
 		dev->rx_pkt_burst = mlx5_select_rx_function(dev);
 		dev->tx_pkt_burst = mlx5_select_tx_function(dev);
+		ppriv = (struct mlx5_proc_priv *)dev->process_private;
+		/* If Tx queue number changes, re-initialize UAR. */
+		if (ppriv->uar_table_sz != priv->txqs_n) {
+			mlx5_tx_uar_uninit_secondary(dev);
+			mlx5_proc_priv_uninit(dev);
+			ret = mlx5_proc_priv_init(dev);
+			if (ret)
+				return -rte_errno;
+			ret = mlx5_tx_uar_init_secondary(dev, mp_msg->fds[0]);
+			if (ret) {
+				mlx5_proc_priv_uninit(dev);
+				return -rte_errno;
+			}
+		}
 		mp_init_msg(dev, &mp_res, param->type);
 		res->result = 0;
 		ret = rte_mp_reply(&mp_res, peer);
@@ -172,6 +189,7 @@
 	struct rte_mp_reply mp_rep;
 	struct mlx5_mp_param *res;
 	struct timespec ts = {.tv_sec = MLX5_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
+	struct mlx5_priv *priv = dev->data->dev_private;
 	int ret;
 	int i;
 
@@ -184,6 +202,10 @@
 		return;
 	}
 	mp_init_msg(dev, &mp_req, type);
+	if (type == MLX5_MP_REQ_START_RXTX) {
+		mp_req.num_fds = 1;
+		mp_req.fds[0] = ((struct ibv_context *)priv->sh->ctx)->cmd_fd;
+	}
 	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
 	if (ret) {
 		if (rte_errno != ENOTSUP)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-02-08  3:04 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-08  3:03 [dpdk-stable] [PATCH 19.11 0/2] net/mlx5: backport secondary process crash fix Suanming Mou
2021-02-08  3:04 ` [dpdk-stable] [PATCH 19.11 1/2] net/mlx5: fix crash on secondary process port close Suanming Mou
2021-02-08  3:04 ` [dpdk-stable] [PATCH 19.11 2/2] net/mlx5: fix port attach in secondary process Suanming Mou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).