DPDK patches and discussions
From: Ophir Munk <ophirmu@mellanox.com>
To: Gaetan Rivet <gaetan.rivet@6wind.com>
Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>,
	dev@dpdk.org, Thomas Monjalon <thomas@monjalon.net>,
	Olga Shern <olgas@mellanox.com>,
	Ophir Munk <ophirmu@mellanox.com>,
	stable@dpdk.org
Subject: [dpdk-dev] [PATCH] net/failsafe: fix calling device during RMV events
Date: Sat, 23 Sep 2017 21:57:57 +0000
Message-ID: <1506203877-2090-1-git-send-email-ophirmu@mellanox.com>
In-Reply-To: <20170911083117.GM21444@bidouze.vm.6wind.com>

This commit prevents control path operations from failing after a sub
device removal.

The failure steps are as follows:
1. The physical device is removed due to a change in one of the PF
parameters (e.g. MTU).
2. The interrupt thread flags the device.
3. Within 2 seconds the interrupt thread initiates the actual device
removal, then every 2 seconds it tries to re-sync (plug in) the device.
The attempts fail as long as the VF parameter mismatches the PF parameter.
4. A control thread initiates a control operation on failsafe, which
initiates this operation on the device.
5. A race condition occurs between the control thread and the interrupt
thread when accessing the device data structures.

This commit prevents the race condition in step 5. Before this commit, if
a device was removed and a control thread operation was then initiated on
failsafe, in some cases failsafe called the sub device operation instead
of avoiding it. Such cases could lead to operation failures.

This commit fixes the criteria failsafe uses to determine when a device is
removed, so that it avoids calling the sub device operations during that
time and calls them only otherwise.

Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
---
This V2 patch is in reply to <20170911083117.GM21444@bidouze.vm.6wind.com>
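
For review purposes, the selection rule that the new iterator relies on
can be sketched outside of DPDK. The snippet below is only an illustration
under stated assumptions: the struct, the enum values and the helper names
find_next() and count_selected() are simplified, hypothetical stand-ins
and not the definitions in failsafe_private.h. It shows that a sub device
is picked when its state is at least the requested one and, if
check_remove is set, its remove flag is clear.

```c
#include <stdint.h>

/* Simplified stand-ins for the fail-safe PMD types (assumption). */
enum dev_state {
	DEV_UNDEFINED,
	DEV_PARSED,
	DEV_PROBED,
	DEV_ACTIVE,
	DEV_STARTED,
};

struct sub_device {
	enum dev_state state;
	int remove; /* non-zero once an RMV event flagged the device */
};

/*
 * Return the index of the first sub device at or after sid whose state
 * is at least min_state. When check_remove is non-zero, sub devices
 * flagged for removal are skipped as well. Returns tail if none match.
 */
static uint8_t
find_next(const struct sub_device *subs, uint8_t tail, uint8_t sid,
	  enum dev_state min_state, int check_remove)
{
	while (sid < tail) {
		if (subs[sid].state >= min_state &&
		    (check_remove == 0 || subs[sid].remove == 0))
			break;
		sid++;
	}
	return sid;
}

/* Count the sub devices an iterator built on find_next() would visit. */
static unsigned int
count_selected(const struct sub_device *subs, uint8_t tail,
	       enum dev_state min_state, int check_remove)
{
	unsigned int n = 0;
	uint8_t i;

	for (i = find_next(subs, tail, 0, min_state, check_remove);
	     i < tail;
	     i = find_next(subs, tail, (uint8_t)(i + 1), min_state,
			   check_remove))
		n++;
	return n;
}
```

With check_remove == 0 the walk behaves like FOREACH_SUBDEV_STATE before
this patch; with check_remove == 1 it mirrors what
FOREACH_SUBDEV_ACTIVE_SAFE is meant to do. Note that the sketch only
filters on the flag and does not add any locking between threads.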

 drivers/net/failsafe/failsafe_ether.c   |  1 +
 drivers/net/failsafe/failsafe_ops.c     | 31 +++++++++++++++----------------
 drivers/net/failsafe/failsafe_private.h | 26 +++++++++++++++++++++-----
 3 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index a3a8cce..1def110 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -378,6 +378,7 @@ failsafe_eth_dev_state_sync(struct rte_eth_dev *dev)
 				      i);
 				goto err_remove;
 			}
+			sdev->remove = 0;
 		}
 	}
 	/*
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index ff9ad15..721a48a 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -232,7 +232,6 @@ fs_dev_configure(struct rte_eth_dev *dev)
 			dev->data->dev_conf.intr_conf.lsc = 0;
 		}
 		DEBUG("Configuring sub-device %d", i);
-		sdev->remove = 0;
 		ret = rte_eth_dev_configure(PORT_ID(sdev),
 					dev->data->nb_rx_queues,
 					dev->data->nb_tx_queues,
@@ -310,7 +309,7 @@ fs_dev_set_link_up(struct rte_eth_dev *dev)
 	uint8_t i;
 	int ret;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) {
 		DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i);
 		ret = rte_eth_dev_set_link_up(PORT_ID(sdev));
 		if (ret) {
@@ -329,7 +328,7 @@ fs_dev_set_link_down(struct rte_eth_dev *dev)
 	uint8_t i;
 	int ret;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) {
 		DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i);
 		ret = rte_eth_dev_set_link_down(PORT_ID(sdev));
 		if (ret) {
@@ -517,7 +516,7 @@ fs_promiscuous_enable(struct rte_eth_dev *dev)
 	struct sub_device *sdev;
 	uint8_t i;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev)
 		rte_eth_promiscuous_enable(PORT_ID(sdev));
 }
 
@@ -527,7 +526,7 @@ fs_promiscuous_disable(struct rte_eth_dev *dev)
 	struct sub_device *sdev;
 	uint8_t i;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev)
 		rte_eth_promiscuous_disable(PORT_ID(sdev));
 }
 
@@ -537,7 +536,7 @@ fs_allmulticast_enable(struct rte_eth_dev *dev)
 	struct sub_device *sdev;
 	uint8_t i;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev)
 		rte_eth_allmulticast_enable(PORT_ID(sdev));
 }
 
@@ -547,7 +546,7 @@ fs_allmulticast_disable(struct rte_eth_dev *dev)
 	struct sub_device *sdev;
 	uint8_t i;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev)
 		rte_eth_allmulticast_disable(PORT_ID(sdev));
 }
 
@@ -559,7 +558,7 @@ fs_link_update(struct rte_eth_dev *dev,
 	uint8_t i;
 	int ret;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) {
 		DEBUG("Calling link_update on sub_device %d", i);
 		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
 		if (ret && ret != -1) {
@@ -597,7 +596,7 @@ fs_stats_reset(struct rte_eth_dev *dev)
 	struct sub_device *sdev;
 	uint8_t i;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev)
 		rte_eth_stats_reset(PORT_ID(sdev));
 }
 
@@ -692,7 +691,7 @@ fs_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
 	uint8_t i;
 	int ret;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) {
 		DEBUG("Calling rte_eth_dev_set_mtu on sub_device %d", i);
 		ret = rte_eth_dev_set_mtu(PORT_ID(sdev), mtu);
 		if (ret) {
@@ -711,7 +710,7 @@ fs_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 	uint8_t i;
 	int ret;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) {
 		DEBUG("Calling rte_eth_dev_vlan_filter on sub_device %d", i);
 		ret = rte_eth_dev_vlan_filter(PORT_ID(sdev), vlan_id, on);
 		if (ret) {
@@ -745,7 +744,7 @@ fs_flow_ctrl_set(struct rte_eth_dev *dev,
 	uint8_t i;
 	int ret;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) {
 		DEBUG("Calling rte_eth_dev_flow_ctrl_set on sub_device %d", i);
 		ret = rte_eth_dev_flow_ctrl_set(PORT_ID(sdev), fc_conf);
 		if (ret) {
@@ -766,7 +765,7 @@ fs_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
 	/* No check: already done within the rte_eth_dev_mac_addr_remove
 	 * call for the fail-safe device.
 	 */
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev)
 		rte_eth_dev_mac_addr_remove(PORT_ID(sdev),
 				&dev->data->mac_addrs[index]);
 	PRIV(dev)->mac_addr_pool[index] = 0;
@@ -783,7 +782,7 @@ fs_mac_addr_add(struct rte_eth_dev *dev,
 	uint8_t i;
 
 	RTE_ASSERT(index < FAILSAFE_MAX_ETHADDR);
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) {
 		ret = rte_eth_dev_mac_addr_add(PORT_ID(sdev), mac_addr, vmdq);
 		if (ret) {
 			ERROR("Operation rte_eth_dev_mac_addr_add failed for sub_device %"
@@ -805,7 +804,7 @@ fs_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr)
 	struct sub_device *sdev;
 	uint8_t i;
 
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev)
 		rte_eth_dev_default_mac_addr_set(PORT_ID(sdev), mac_addr);
 }
 
@@ -824,7 +823,7 @@ fs_filter_ctrl(struct rte_eth_dev *dev,
 		*(const void **)arg = &fs_flow_ops;
 		return 0;
 	}
-	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
+	FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) {
 		DEBUG("Calling rte_eth_dev_filter_ctrl on sub_device %d", i);
 		ret = rte_eth_dev_filter_ctrl(PORT_ID(sdev), type, op, arg);
 		if (ret) {
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 0361cf4..fda1606 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -221,9 +221,21 @@ extern int mac_from_arg;
  * state: (enum dev_state), minimum acceptable device state
  */
 #define FOREACH_SUBDEV_STATE(s, i, dev, state)				\
-	for (i = fs_find_next((dev), 0, state);				\
+	for (i = fs_find_next((dev), 0, state, 0);			\
 	     i < PRIV(dev)->subs_tail && (s = &PRIV(dev)->subs[i]);	\
-	     i = fs_find_next((dev), i + 1, state))
+	     i = fs_find_next((dev), i + 1, state, 0))
+
+/**
+ * Stateful iterator construct over fail-safe sub-devices
+ * in ACTIVE state and not removed due to RMV event
+ * s:     (struct sub_device *), iterator
+ * i:     (uint8_t), increment
+ * dev:   (struct rte_eth_dev *), fail-safe ethdev
+ */
+#define FOREACH_SUBDEV_ACTIVE_SAFE(s, i, dev)				\
+	for (i = fs_find_next((dev), 0, DEV_ACTIVE, 1);			\
+	     i < PRIV(dev)->subs_tail && (s = &PRIV(dev)->subs[i]);	\
+	     i = fs_find_next((dev), i + 1, DEV_ACTIVE, 1))
 
 /**
  * Iterator construct over fail-safe sub-devices:
@@ -296,11 +308,15 @@ extern int mac_from_arg;
 
 static inline uint8_t
 fs_find_next(struct rte_eth_dev *dev, uint8_t sid,
-		enum dev_state min_state)
+		enum dev_state min_state, int check_remove)
 {
 	while (sid < PRIV(dev)->subs_tail) {
-		if (PRIV(dev)->subs[sid].state >= min_state)
-			break;
+		if (PRIV(dev)->subs[sid].state >= min_state) {
+			if (check_remove == 0)
+				break;
+			if (PRIV(dev)->subs[sid].remove == 0)
+				break;
+		}
 		sid++;
 	}
 	if (sid >= PRIV(dev)->subs_tail)
-- 
2.7.4
