From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <luca.boccassi@gmail.com>
Received: from mail-wr0-f193.google.com (mail-wr0-f193.google.com
 [209.85.128.193]) by dpdk.org (Postfix) with ESMTP id 334A41B868
 for <stable@dpdk.org>; Tue, 15 May 2018 15:49:34 +0200 (CEST)
Received: by mail-wr0-f193.google.com with SMTP id v5-v6so213039wrf.9
 for <stable@dpdk.org>; Tue, 15 May 2018 06:49:34 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=VyCvWnsvFwKk6k12R+phWNdODmdSsT/6F3yT2sM7O5k=;
 b=rclkonAClyjH8Oz+0gChtJHErJc9/rZzxKfxC9bJeLSvs/NG+qEI/guwe2AYHHIePG
 qs469rnJBPbN+M51oLII8wViyco0uPswZNhB5G8vS+eWG2McFqC0GCeCWcKtt6+RcYWm
 4NuqJD8HeisJaMc+dSLBqTHUCLB6msE0f7CgS+YbMs0T4PdLQj36qcxdNzarF7YRpRP5
 OTeArXAZTbYYyvJRZuWv/YwGf27M0kyJLhjV/Otg6x9/EkVRIUA5GgJEwq4LD7VJzVKK
 sWOCcSEJmmXPd8PH3HhOSrF4kGEiR8mpK29sdJnjbXsnrVQ/rds9W3Gsf/S4ccMzzTHw
 X7RQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=VyCvWnsvFwKk6k12R+phWNdODmdSsT/6F3yT2sM7O5k=;
 b=uDMU9IlP79PADn6DvJGOZAFE4EL3243I+vmhWJl3OAJVcQI2scmIWegKEPBIsFYwcE
 pKCKvhd632bDssVfLpHAka7YZrc5lNgnQto6oD6swEGkzhmMD3IWLHCbGx/VzW0z8Izi
 +n5t8bUqOPzRVIwZXOmTNywF5zEF48ZaAnIca8hQceRFOMWRPgMJGeqyrdGXTGwS9nq0
 bLbl91eUKCdLcSprD9mvsABPbskjKNtwS+LoObeBbgHLDUcGW5fNh4OjB1LbQOzdx9Yo
 82ZXGU9NYcyNkJLybUoZEHiTFRsRtfJqEppdo2EXqaZdOOnzJdiHS9lPRaZFke2D5vZz
 9uug==
X-Gm-Message-State: ALKqPwf5h4WnE2rmEI9dZtS96V01S047HRqSyO9N9mvZvAQ5M59BsPdB
 4p1umVjwyMumNqaSG9JIj1c=
X-Google-Smtp-Source: AB8JxZoclaG+YXjsbOm69/N7YfKvmhD4sZCWAWZath2E0UZrA262OjtprdxxePKArV43vefuOU6g4Q==
X-Received: by 2002:adf:9814:: with SMTP id
 v20-v6mr11068123wrb.93.1526392173887; 
 Tue, 15 May 2018 06:49:33 -0700 (PDT)
Received: from localhost (slip139-92-244-193.lon.uk.prserv.net.
 [139.92.244.193])
 by smtp.gmail.com with ESMTPSA id p7-v6sm83091wrj.85.2018.05.15.06.49.32
 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);
 Tue, 15 May 2018 06:49:32 -0700 (PDT)
From: luca.boccassi@gmail.com
To: Matan Azrad <matan@mellanox.com>
Cc: Gaetan Rivet <gaetan.rivet@6wind.com>,
 Stephen Hemminger <stephen@networkplumber.org>,
 dpdk stable <stable@dpdk.org>
Date: Tue, 15 May 2018 14:47:18 +0100
Message-Id: <20180515134731.9337-67-luca.boccassi@gmail.com>
X-Mailer: git-send-email 2.14.2
In-Reply-To: <20180515134731.9337-1-luca.boccassi@gmail.com>
References: <20180503110612.12146-2-luca.boccassi@gmail.com>
 <20180515134731.9337-1-luca.boccassi@gmail.com>
Subject: [dpdk-stable] patch 'net/failsafe: fix sub-device ownership race'
	has been queued to stable release 18.02.2
X-BeenThere: stable@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches for DPDK stable branches <stable.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/stable>,
 <mailto:stable-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/stable/>
List-Post: <mailto:stable@dpdk.org>
List-Help: <mailto:stable-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/stable>,
 <mailto:stable-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 15 May 2018 13:49:34 -0000

Hi,

FYI, your patch has been queued to stable release 18.02.2

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 05/16/18. So please
shout if anyone has objections.

Thanks.

Luca Boccassi

---
>>From 80c21d589c88ac0614ff0c96eacd839384ccfaa7 Mon Sep 17 00:00:00 2001
From: Matan Azrad <matan@mellanox.com>
Date: Fri, 11 May 2018 01:58:35 +0200
Subject: [PATCH] net/failsafe: fix sub-device ownership race

[ upstream commit 7fda13d3a508473d21238bf20de39245f584a38c ]

There is time between the sub-device port probing by the sub-device PMD
to the sub-device port ownership taking by a fail-safe port.

In this time, the port is available for the application usage. For
example, the port will be exposed to the applications which use
RTE_ETH_FOREACH_DEV iterator.

Thus, ownership unaware applications may manage the port in this time
what may cause a lot of problematic behaviors in the fail-safe
sub-device initialization.

Register to the ethdev NEW event to take the sub-device port ownership
before it becomes exposed to the application.

Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
---
 drivers/net/failsafe/failsafe.c         | 20 ++++++++++--
 drivers/net/failsafe/failsafe_eal.c     | 56 ++++++++++++++++++++++-----------
 drivers/net/failsafe/failsafe_ether.c   | 23 ++++++++++++++
 drivers/net/failsafe/failsafe_private.h |  3 ++
 4 files changed, 80 insertions(+), 22 deletions(-)

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index 5b177c7a1..da2c7f07a 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -202,16 +202,25 @@ fs_eth_dev_create(struct rte_vdev_device *vdev)
 	}
 	snprintf(priv->my_owner.name, sizeof(priv->my_owner.name),
 		 FAILSAFE_OWNER_NAME);
+	DEBUG("Failsafe port %u owner info: %s_%016"PRIX64, dev->data->port_id,
+	      priv->my_owner.name, priv->my_owner.id);
+	ret = rte_eth_dev_callback_register(RTE_ETH_ALL, RTE_ETH_EVENT_NEW,
+					    failsafe_eth_new_event_callback,
+					    dev);
+	if (ret) {
+		ERROR("Failed to register NEW callback");
+		goto free_args;
+	}
 	ret = failsafe_eal_init(dev);
 	if (ret)
-		goto free_args;
+		goto unregister_new_callback;
 	ret = fs_mutex_init(priv);
 	if (ret)
-		goto free_args;
+		goto unregister_new_callback;
 	ret = failsafe_hotplug_alarm_install(dev);
 	if (ret) {
 		ERROR("Could not set up plug-in event detection");
-		goto free_args;
+		goto unregister_new_callback;
 	}
 	mac = &dev->data->mac_addrs[0];
 	if (mac_from_arg) {
@@ -260,6 +269,9 @@ fs_eth_dev_create(struct rte_vdev_device *vdev)
 	return 0;
 cancel_alarm:
 	failsafe_hotplug_alarm_cancel(dev);
+unregister_new_callback:
+	rte_eth_dev_callback_unregister(RTE_ETH_ALL, RTE_ETH_EVENT_NEW,
+					failsafe_eth_new_event_callback, dev);
 free_args:
 	failsafe_args_free(dev);
 free_subs:
@@ -279,6 +291,8 @@ fs_rte_eth_free(const char *name)
 	dev = rte_eth_dev_allocated(name);
 	if (dev == NULL)
 		return -ENODEV;
+	rte_eth_dev_callback_unregister(RTE_ETH_ALL, RTE_ETH_EVENT_NEW,
+					failsafe_eth_new_event_callback, dev);
 	ret = failsafe_eal_uninit(dev);
 	if (ret)
 		ERROR("Error while uninitializing sub-EAL");
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index e0fd60598..74b97db5c 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -18,8 +18,9 @@ fs_ethdev_portid_get(const char *name, uint16_t *port_id)
 		return -EINVAL;
 	}
 	len = strlen(name);
-	RTE_ETH_FOREACH_DEV(pid) {
-		if (!strncmp(name, rte_eth_devices[pid].device->name, len)) {
+	for (pid = 0; pid < RTE_MAX_ETHPORTS; pid++) {
+		if (rte_eth_dev_is_valid_port(pid) &&
+		    !strncmp(name, rte_eth_devices[pid].device->name, len)) {
 			*port_id = pid;
 			return 0;
 		}
@@ -41,6 +42,8 @@ fs_bus_init(struct rte_eth_dev *dev)
 			continue;
 		da = &sdev->devargs;
 		if (fs_ethdev_portid_get(da->name, &pid) != 0) {
+			struct rte_eth_dev_owner pid_owner;
+
 			ret = rte_eal_hotplug_add(da->bus->name,
 						  da->name,
 						  da->args);
@@ -55,12 +58,26 @@ fs_bus_init(struct rte_eth_dev *dev)
 				ERROR("sub_device %d init went wrong", i);
 				return -ENODEV;
 			}
+			/*
+			 * The NEW callback tried to take ownership, check
+			 * whether it succeed or didn't.
+			 */
+			rte_eth_dev_owner_get(pid, &pid_owner);
+			if (pid_owner.id != PRIV(dev)->my_owner.id) {
+				INFO("sub_device %d owner(%s_%016"PRIX64") is not my,"
+				     " owner(%s_%016"PRIX64"), will try again later",
+				     i, pid_owner.name, pid_owner.id,
+				     PRIV(dev)->my_owner.name,
+				     PRIV(dev)->my_owner.id);
+				continue;
+			}
 		} else {
+			/* The sub-device port was found. */
 			char devstr[DEVARGS_MAXLEN] = "";
 			struct rte_devargs *probed_da =
 					rte_eth_devices[pid].device->devargs;
 
-			/* Take control of device probed by EAL options. */
+			/* Take control of probed device. */
 			free(da->args);
 			memset(da, 0, sizeof(*da));
 			if (probed_da != NULL)
@@ -77,22 +94,23 @@ fs_bus_init(struct rte_eth_dev *dev)
 			}
 			INFO("Taking control of a probed sub device"
 			      " %d named %s", i, da->name);
-		}
-		ret = rte_eth_dev_owner_set(pid, &PRIV(dev)->my_owner);
-		if (ret < 0) {
-			INFO("sub_device %d owner set failed (%s),"
-			     " will try again later", i, strerror(-ret));
-			continue;
-		} else if (strncmp(rte_eth_devices[pid].device->name, da->name,
-			   strlen(da->name)) != 0) {
-			/*
-			 * The device probably was removed and its port id was
-			 * reallocated before ownership set.
-			 */
-			rte_eth_dev_owner_unset(pid, PRIV(dev)->my_owner.id);
-			INFO("sub_device %d was probably removed before taking"
-			     " ownership, will try again later", i);
-			continue;
+			ret = rte_eth_dev_owner_set(pid, &PRIV(dev)->my_owner);
+			if (ret < 0) {
+				INFO("sub_device %d owner set failed (%s), "
+				     "will try again later", i, strerror(-ret));
+				continue;
+			} else if (strncmp(rte_eth_devices[pid].device->name,
+				   da->name, strlen(da->name)) != 0) {
+				/*
+				 * The device probably was removed and its port
+				 * id was reallocated before ownership set.
+				 */
+				rte_eth_dev_owner_unset(pid,
+							PRIV(dev)->my_owner.id);
+				INFO("sub_device %d was removed before taking"
+				     " ownership, will try again later", i);
+				continue;
+			}
 		}
 		ETH(sdev) = &rte_eth_devices[pid];
 		SUB_ID(sdev) = i;
diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 2c0bf9366..780bfa4bc 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -463,3 +463,26 @@ failsafe_eth_lsc_event_callback(uint16_t port_id __rte_unused,
 	else
 		return 0;
 }
+
+/* Take sub-device ownership before it becomes exposed to the application. */
+int
+failsafe_eth_new_event_callback(uint16_t port_id,
+				enum rte_eth_event_type event __rte_unused,
+				void *cb_arg, void *out __rte_unused)
+{
+	struct rte_eth_dev *fs_dev = cb_arg;
+	struct sub_device *sdev;
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	uint8_t i;
+
+	FOREACH_SUBDEV_STATE(sdev, i, fs_dev, DEV_PARSED) {
+		if (sdev->state >= DEV_PROBED)
+			continue;
+		if (strcmp(sdev->devargs.name, dev->device->name) != 0)
+			continue;
+		rte_eth_dev_owner_set(port_id, &PRIV(fs_dev)->my_owner);
+		/* The actual owner will be checked after the port probing. */
+		break;
+	}
+	return 0;
+}
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 2d16ba4ca..155207423 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -220,6 +220,9 @@ int failsafe_eth_rmv_event_callback(uint16_t port_id,
 int failsafe_eth_lsc_event_callback(uint16_t port_id,
 				    enum rte_eth_event_type event,
 				    void *cb_arg, void *out);
+int failsafe_eth_new_event_callback(uint16_t port_id,
+				    enum rte_eth_event_type event,
+				    void *cb_arg, void *out);
 
 /* GLOBALS */
 
-- 
2.14.2