From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <thomas@monjalon.net>
Received: from out2-smtp.messagingengine.com (out2-smtp.messagingengine.com
 [66.111.4.26]) by dpdk.org (Postfix) with ESMTP id 20A6C378E;
 Sat,  2 Mar 2019 03:43:44 +0100 (CET)
Received: from compute1.internal (compute1.nyi.internal [10.202.2.41])
 by mailout.nyi.internal (Postfix) with ESMTP id C56FC21E1D;
 Fri,  1 Mar 2019 21:43:43 -0500 (EST)
Received: from mailfrontend1 ([10.202.2.162])
 by compute1.internal (MEProxy); Fri, 01 Mar 2019 21:43:43 -0500
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=monjalon.net; h=
 from:to:cc:subject:date:message-id:in-reply-to:references
 :mime-version:content-transfer-encoding; s=mesmtp; bh=F3bGhjmFzR
 Kyk0vPNrtmgzboIiNm5iaCPEZ8ewFd01M=; b=BSU6nWRl1eafLGaud5wcftCrZI
 Vu1l9eIfV8xE/Qi+d+zbVL3YHbqSUtOP7E7pogajGCO6RIGrSnh7AD4U17BKEl4E
 YNGdwkPSAqKnUD47rSGMvB/8e5ZSOm7/nJb8kVo6/rqCHDr47DaO7ZXEkf75CNp4
 98PK6i1hdYAcQCBnA=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
 messagingengine.com; h=cc:content-transfer-encoding:date:from
 :in-reply-to:message-id:mime-version:references:subject:to
 :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=
 fm2; bh=F3bGhjmFzRKyk0vPNrtmgzboIiNm5iaCPEZ8ewFd01M=; b=fsvJJMdg
 ibXcu5gX5ggx8GKkx28FkMbNvdLq620I+02c2xMKlL8lQMq4zFJCUnr2NEWSn75a
 ZLAxJ1OEy1J+ASce93P5v3y8Dk3pKpqF7p/6l5d8t7z5J6VwGVrcHdtiq6OGvTjj
 CUaR9VYPnq2Y6UVDBh8sncC2C9LEiXQhmpz9kZTG2h+mEBtKIOV0cZPW1/aA3Shg
 PgG2dOzhdFIx7ASCSANK6sT+AZbD+MBy/Q1aNeSX09mjP+YXlE1TozBohcSuy70K
 fFjGFzT1POxeAg4ttugs4eZw00Cn+zumZdl6VzAZRiJA0Xp+mipwWLmXN0GdFrYQ
 ETOSQ4D7axrcJQ==
X-ME-Sender: <xms:3-15XLc7VXpQqtw672P1dZcO9ew4Vjhfl7vdQqxzyVUmL4EASJjy4g>
X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedutddrvdeigdegkecutefuodetggdotefrodftvf
 curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu
 uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd
 ertddtnecuhfhrohhmpefvhhhomhgrshcuofhonhhjrghlohhnuceothhhohhmrghssehm
 ohhnjhgrlhhonhdrnhgvtheqnecukfhppeejjedrudefgedrvddtfedrudekgeenucfrrg
 hrrghmpehmrghilhhfrhhomhepthhhohhmrghssehmohhnjhgrlhhonhdrnhgvthenucev
 lhhushhtvghrufhiiigvpedv
X-ME-Proxy: <xmx:3-15XP87iLxQ5vXNy1jxTr6KpGkuQOtobsd2RBXhPvmodZItKWJYAA>
 <xmx:3-15XJGcfHNj_JNJ5aFuC4sAm1QU9yapc3cc55PeIGMbl_v3CYqNbQ>
 <xmx:3-15XJDkRgnKtlOrMYqyOcHMtPdLpldw4v2mudykcRcqzYNeqI6Z1Q>
 <xmx:3-15XFRYjC-CIJNgp8MaOphSsiIXQfLYSnQ3YXUGRQB1Kg9wc5jMWg>
Received: from xps.monjalon.net (184.203.134.77.rev.sfr.net [77.134.203.184])
 by mail.messagingengine.com (Postfix) with ESMTPA id EB2C9E40C1;
 Fri,  1 Mar 2019 21:43:42 -0500 (EST)
From: Thomas Monjalon <thomas@monjalon.net>
To: dev@dpdk.org
Cc: qi.z.zhang@intel.com,
	stable@dpdk.org
Date: Sat,  2 Mar 2019 03:42:53 +0100
Message-Id: <20190302024253.15594-4-thomas@monjalon.net>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190302024253.15594-1-thomas@monjalon.net>
References: <20190302024253.15594-1-thomas@monjalon.net>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [dpdk-stable] [PATCH 3/3] eal: fix multi-process probe failure
	handling
X-BeenThere: stable@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches for DPDK stable branches <stable.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/stable>,
 <mailto:stable-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/stable/>
List-Post: <mailto:stable@dpdk.org>
List-Help: <mailto:stable-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/stable>,
 <mailto:stable-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Sat, 02 Mar 2019 02:43:44 -0000

If probe fails in multi-process context, the device must be removed
in other processes for consistency. This is a rollback mechanism.
However, the rollback should not happen for devices which were
already probed before the current probe transaction.

When probing an already probed device, the driver may reject
with -EEXIST or update and succeed with code 0.
In order to distinguish a successful new probe from a successful
re-probe, the function local_dev_probe() returns the positive
EEXIST code for the latter case.

The functions rte_dev_probe() and __handle_secondary_request()
can test for -EEXIST and +EEXIST, and skip the rollback in that case.

Fixes: 244d5130719c ("eal: enable hotplug on multi-process")
Fixes: ac9e4a17370f ("eal: support attach/detach shared device from secondary")
Cc: qi.z.zhang@intel.com
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
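Note for reviewers: the return-code convention described in the commit
message can be modelled in isolation. The snippet below is only a
minimal sketch under stated assumptions, not the EAL code. The helper
names (sketch_probe_result, sketch_already_probed,
sketch_needs_rollback) and their boolean parameters are hypothetical
and stand in for the bus plug callback result, rte_dev_is_probed()
before and after the plug, and the failure of a later step in the
probe transaction.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of the value local_dev_probe() hands back.
 * plug_ret:   what the bus plug callback returned
 * was_probed: device state before the plug call
 * is_probed:  device state after the plug call
 */
int
sketch_probe_result(int plug_ret, bool was_probed, bool is_probed)
{
	if (plug_ret != 0 && !is_probed)
		return plug_ret; /* never succeeded: plain error */
	if (plug_ret == 0 && was_probed)
		return EEXIST; /* successful re-probe: positive hint */
	return plug_ret; /* 0 for a new probe, or -EEXIST from the driver */
}

/* Both signs mean the device existed before this transaction. */
bool
sketch_already_probed(int ret)
{
	return ret == -EEXIST || ret == EEXIST;
}

/*
 * Rollback is wanted only when a later step of the transaction failed
 * and the device was newly probed by this transaction.
 */
bool
sketch_needs_rollback(int probe_ret, bool later_step_failed)
{
	return later_step_failed && !sketch_already_probed(probe_ret);
}
```

With this model, a re-probe that succeeds yields +EEXIST and a driver
rejection yields -EEXIST; in both cases a later failure leaves the
pre-existing device in place instead of removing it.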
 lib/librte_eal/common/eal_common_dev.c | 12 ++++++++++--
 lib/librte_eal/common/eal_private.h    |  2 +-
 lib/librte_eal/common/hotplug_mp.c     |  8 ++++++--
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index deaaea9345..2c7b1ab071 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -132,6 +132,7 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
 {
 	struct rte_device *dev;
 	struct rte_devargs *da;
+	bool already_probed;
 	int ret;
 
 	*new_dev = NULL;
@@ -171,12 +172,15 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
 	 * those devargs shouldn't be removed manually anymore.
 	 */
 
+	already_probed = rte_dev_is_probed(dev);
 	ret = dev->bus->plug(dev);
 	if (ret && !rte_dev_is_probed(dev)) { /* if hasn't ever succeeded */
 		RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n",
 			dev->name);
 		return ret;
 	}
+	if (ret == 0 && already_probed)
+		ret = EEXIST; /* hint to avoid any rollback */
 
 	*new_dev = dev;
 	return ret;
@@ -194,6 +198,7 @@ rte_dev_probe(const char *devargs)
 {
 	struct eal_dev_mp_req req;
 	struct rte_device *dev;
+	bool already_probed;
 	int ret;
 
 	memset(&req, 0, sizeof(req));
@@ -221,8 +226,8 @@ rte_dev_probe(const char *devargs)
 
 	/* primary attach the new device itself. */
 	ret = local_dev_probe(devargs, &dev);
-
-	if (ret != 0 && ret != -EEXIST) {
+	already_probed = (ret == -EEXIST || ret == EEXIST);
+	if (ret < 0 && !already_probed) {
 		RTE_LOG(ERR, EAL,
 			"Failed to attach device on primary process\n");
 		return ret;
@@ -250,6 +255,9 @@ rte_dev_probe(const char *devargs)
 	return 0;
 
 rollback:
+	if (already_probed)
+		return ret; /* skip rollback */
+
 	req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK;
 
 	/* primary send rollback request to secondary. */
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 798ede553b..a01d252930 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -304,7 +304,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
  * @param new_dev
  *   new device be probed as output.
  * @return
- *   0 on success, negative on error.
+ *   >=0 on success (+EEXIST if already probed), negative on error.
  */
 int local_dev_probe(const char *devargs, struct rte_device **new_dev);
 
diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c
index 69e9a16d6a..9f8ef28a3b 100644
--- a/lib/librte_eal/common/hotplug_mp.c
+++ b/lib/librte_eal/common/hotplug_mp.c
@@ -90,13 +90,15 @@ __handle_secondary_request(void *param)
 	struct rte_devargs da;
 	struct rte_device *dev;
 	struct rte_bus *bus;
+	bool already_probed = false;
 	int ret = 0;
 
 	tmp_req = *req;
 
 	if (req->t == EAL_DEV_REQ_TYPE_ATTACH) {
 		ret = local_dev_probe(req->devargs, &dev);
-		if (ret != 0 && ret != -EEXIST) {
+		already_probed = (ret == -EEXIST || ret == EEXIST);
+		if (ret < 0 && !already_probed) {
 			RTE_LOG(ERR, EAL, "Failed to hotplug add device on primary\n");
 			goto finish;
 		}
@@ -159,7 +161,7 @@ __handle_secondary_request(void *param)
 	goto finish;
 
 rollback:
-	if (req->t == EAL_DEV_REQ_TYPE_ATTACH) {
+	if (req->t == EAL_DEV_REQ_TYPE_ATTACH && !already_probed) {
 		tmp_req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK;
 		eal_dev_hotplug_request_to_secondary(&tmp_req);
 		local_dev_remove(dev);
@@ -238,6 +240,8 @@ static void __handle_primary_request(void *param)
 	case EAL_DEV_REQ_TYPE_ATTACH:
 	case EAL_DEV_REQ_TYPE_DETACH_ROLLBACK:
 		ret = local_dev_probe(req->devargs, &dev);
+		if (ret > 0)
+			ret = 0; /* return only errors */
 		break;
 	case EAL_DEV_REQ_TYPE_DETACH:
 	case EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK:
-- 
2.20.1