From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dariusz.stojaczyk@intel.com>
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
 by dpdk.org (Postfix) with ESMTP id EEC381B130
 for <dev@dpdk.org>; Fri, 23 Nov 2018 16:47:05 +0100 (CET)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
 by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 23 Nov 2018 07:47:04 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.56,270,1539673200"; d="scan'208";a="110595241"
Received: from violet.igk.intel.com ([10.102.54.137])
 by fmsmga001.fm.intel.com with ESMTP; 23 Nov 2018 07:47:03 -0800
From: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
To: dev@dpdk.org
Cc: Darek Stojaczyk <dariusz.stojaczyk@intel.com>, gaetan.rivet@6wind.com,
 thomas@monjalon.net
Date: Fri, 23 Nov 2018 16:43:28 +0100
Message-Id: <20181123154328.97021-1-dariusz.stojaczyk@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181121193827.62540-1-dariusz.stojaczyk@intel.com>
References: <20181121193827.62540-1-dariusz.stojaczyk@intel.com>
Subject: [dpdk-dev] [PATCH v3] dev: don't remove devargs that are still
	referenced
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Fri, 23 Nov 2018 15:47:06 -0000

Even if a device failed to plug, it's still a device
object that references the devargs. Those devargs will
be freed automatically together with the device, but
freeing them any earlier - like it's done in the hotplug
error handling path right now - will give us a dangling
pointer and a segfault scenario.

Consider the following case:
 * secondary process receives the hotplug request IPC message
   * devargs are either created or updated
   * the bus is scanned
     * a new device object is created with the latest devargs
   * the device can't be plugged for whatever reason,
     bus->plug returns error
     * the devargs are freed, even though they're still referenced
       by the device object on the bus

For PCI devices, the generic device name comes from
a buffer within the devargs. Freeing those will make
EAL segfault whenever the device name is checked.

This patch just prevents the hotplug error handling
path from removing the devargs when there's a device
that references them. This is done by simply exiting
early from the hotplug function. As mentioned in the
beginning, those devargs will be freed later, together
with the device itself.

Fixes: 7e8b26650146 ("eal: fix hotplug add / remove")
Cc: gaetan.rivet@6wind.com
Cc: thomas@monjalon.net

Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
---
Changes since v2:
 * added an extra comment (Gaetan)

Changes since v1:
 * described the failing scenario in commit msg (Thomas)

 lib/librte_eal/common/eal_common_dev.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 1fdc9ab17..d7950bc9a 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -166,14 +166,17 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
 		ret = -ENODEV;
 		goto err_devarg;
 	}
+	/* Since there is a matching device, it is now its responsibility
+	 * to manage the devargs we've just inserted. From this point
+	 * those devargs shouldn't be removed manually anymore.
+	 */
 
 	ret = dev->bus->plug(dev);
 	if (ret) {
-		if (rte_dev_is_probed(dev)) /* if already succeeded earlier */
-			return ret; /* no rollback */
-		RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n",
-			dev->name);
-		goto err_devarg;
+		if (!rte_dev_is_probed(dev)) /* if hasn't succeeded earlier */
+			RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n",
+				dev->name);
+		return ret;
 	}
 
 	*new_dev = dev;
-- 
2.17.1