* [dpdk-dev] [PATCH v6 0/7] hotplug failure handle mechanism
@ 2018-07-09  6:01 Jeff Guo
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 1/7] bus: add hotplug failure handler Jeff Guo
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Jeff Guo @ 2018-07-09  6:01 UTC (permalink / raw)
  To: stephen, bruce.richardson, ferruh.yigit, konstantin.ananyev,
	gaetan.rivet, jingjing.wu, thomas, motih, matan,
	harry.van.haaren, qi.z.zhang, shaopeng.he, bernard.iremonger,
	arybchenko
  Cc: jblunck, shreyansh.jain, dev, jia.guo, helin.zhang
As we know, hot plug is an importance feature, either use for the datacenter
device’s fail-safe, or use for SRIOV Live Migration in SDN/NFV. It could bring
the higher flexibility and continuality to the networking services in multiple
use cases in industry. So let we see, dpdk as an importance networking
framework, what can it help to implement hot plug solution for users.
We already have a general device event detect mechanism, failsafe driver,
bonding driver and hot plug/unplug api in framework, app could use these to
develop their hot plug solution.
let’s see the case of hot unplug, it can happen when a hardware device is
be removed physically, or when the software disables it.  App need to call
ether dev API to detach the device, to unplug the device at the bus level and
make access to the device invalid. But the problem is that, the removal of the
device from the software lists is not going to be instantaneous, at this time
if the data(fast) path still read/write the device, it will cause MMIO error
and result of the app crash out.
Seems that we have got fail-safe driver(or app) + RTE_ETH_EVENT_INTR_RMV +
kernel core driver solution to handle it, but still not have failsafe driver
(or app) + RTE_DEV_EVENT_REMOVE + PCIe pmd driver failure handle solution. So
there is an absence in dpdk hot plug solution right now.
Also, we know that kernel only guaranty hot plug on the kernel side, but not for
the user mode side. Firstly we can hardly have a gatekeeper for any MMIO for
multiple PMD driver. Secondly, no more specific 3rd tools such as udev/driverctl
have especially cover these hot plug failure processing. Third, the feasibility
of app’s implement for multiple user mode PMD driver is still a problem. Here,
a general hot plug failure handle mechanism in dpdk framework would be proposed,
it aim to guaranty that, when hot unplug occur, the system will not crash and
app will not be break out, and user space can normally stop and release any
relevant resources, then unplug of the device at the bus level cleanly.
The mechanism should be come across as bellow:
Firstly, app enabled the device event monitor and register the hot plug event’s
callback before running data path. Once the hot unplug behave occur, the
mechanism will detect the removal event and then accordingly do the failure
handle. In order to do that, below functional will be bring in.
 - Add a new bus ops “handle_hot_unplug” to handle bus read/write error, it is
   bus-specific and each kind of bus can implement its own logic.
 - Implement pci bus specific ops “pci_handle_hot_unplug”. It will base on the
   failure address to remap memory for the corresponding device that unplugged.
For the data path or other unexpected control from the control path when hot
unplug occur.
 - Implement a new sigbus handler, it is registered when start device even
   monitoring. The handler is per process. Base on the signal event principle,
   control path thread and data path thread will randomly receive the sigbus
   error, but will go to the common sigbus handler. Once the MMIO sigbus error
   exposure, it will trigger the above hot unplug operation. The sigbus will be
   check if it is cause of the hot unplug or not, if not will info exception as
   the original sigbus handler. If yes, will do memory remapping.
For the control path and the igb uio release:
 - When hot unplug device, the kernel will release the device resource in the
   kernel side, such as the fd sys file will disappear, and the irq will be
   released. At this time, if igb uio driver still try to release this resource,
   it will cause kernel crash.
   On the other hand, something like interrupt disable do not automatically
   process in kernel side. If not handler it, this redundancy and dirty thing
   will affect the interrupt resource be used by other device.
   So the igb_uio driver have to check the hot plug status and corresponding
   process should be taken in igb uio deriver.
   This patch propose to add structure of rte_udev_state into rte_uio_pci_dev
   of igb_uio kernel driver, which will record the state of uio device, such as
   probed/opened/released/removed/unplug. When detect the unexpected removal
   which cause of hot unplug behavior, it will corresponding disable interrupt
   resource, while for the part of releasement which kernel have already handle,
   just skip it to avoid double free or null pointer kernel crash issue.
The mechanism could be use for fail-safe driver and app which want to use hot
plug solution. let testpmd for example:
 - Enable device event monitor->device unplug->failure handle->stop forwarding->
   stop port->close port->detach port.
This process will not breaking the app/fail-safe running, and will not break
other irrelevance device. And app could plug in the device and restart the date
path again by below.
 - Device plug in->bind igb_uio driver ->attached device->start port->
   start forwarding.
patchset history:
v6->v5:
refine some description about bus ops
refine commit log
add some entry check.
v5->v4:
split patches to focus on the failure handle, remove the event usage by testpmd
to another patch.
change the hotplug failure handler name
refine the sigbus handle logic
add lock for udev state in igb uio driver
v4->v3:
split patches to be small and clear
change to use new parameter "--hotplug-mode" in testpmd
to identify the eal hotplug and ethdev hotplug
v3->v2:
change bus ops name to bus_hotplug_handler.
add new API and bus ops of bus_signal_handler
distingush handle generic sigbus and hotplug sigbus
v2->v1(v21):
refine some doc and commit log
fix igb uio kernel issue for control path failure
rebase testpmd code
Since the hot plug solution be discussed serval around in the public, the
scope be changed and the patch set be split into many times. Coming to the
recently RFC and feature design, it just focus on the hot unplug failure
handler at this patch set, so in order let this topic more clear and focus,
summarize privours patch set in history “v1(v21)”, the v2 here go ahead
for further track.
"v1(21)" == v21 as below:
v21->v20:
split function in hot unplug ops
sync failure hanlde to fix multiple process issue fix attach port issue for multiple devices case.
combind rmv callback function to be only one.
v20->v19:
clean the code
refine the remap logic for multiple device.
remove the auto binding
v19->18:
note for limitation of multiple hotplug,fix some typo, sqeeze patch.
v18->v15:
add document, add signal bus handler, refine the code to be more clear.
the prior patch history please check the patch set "add device event monitor framework"
Jeff Guo (7):
  bus: add hotplug failure handler
  bus/pci: implement hotplug failure handler ops
  bus: add sigbus handler
  bus/pci: implement sigbus handler operation
  bus: add helper to handle sigbus
  eal: add failure handle mechanism for hotplug
  igb_uio: fix uio release issue when hot unplug
 drivers/bus/pci/pci_common.c            |  77 +++++++++++++++++++++
 drivers/bus/pci/pci_common_uio.c        |  33 +++++++++
 drivers/bus/pci/private.h               |  12 ++++
 kernel/linux/igb_uio/igb_uio.c          |  51 +++++++++++++-
 lib/librte_eal/common/eal_common_bus.c  |  42 ++++++++++++
 lib/librte_eal/common/eal_private.h     |  12 ++++
 lib/librte_eal/common/include/rte_bus.h |  33 +++++++++
 lib/librte_eal/linuxapp/eal/eal_dev.c   | 114 +++++++++++++++++++++++++++++++-
 8 files changed, 370 insertions(+), 4 deletions(-)
-- 
2.7.4
^ permalink raw reply	[flat|nested] 13+ messages in thread
* [dpdk-dev] [PATCH v6 1/7] bus: add hotplug failure handler
  2018-07-09  6:01 [dpdk-dev] [PATCH v6 0/7] hotplug failure handle mechanism Jeff Guo
@ 2018-07-09  6:01 ` Jeff Guo
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 2/7] bus/pci: implement hotplug failure handler ops Jeff Guo
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Jeff Guo @ 2018-07-09  6:01 UTC (permalink / raw)
  To: stephen, bruce.richardson, ferruh.yigit, konstantin.ananyev,
	gaetan.rivet, jingjing.wu, thomas, motih, matan,
	harry.van.haaren, qi.z.zhang, shaopeng.he, bernard.iremonger,
	arybchenko
  Cc: jblunck, shreyansh.jain, dev, jia.guo, helin.zhang
When device be hotplug out, if app still continue to access device by mmio,
it will cause of memory failure and result the system crash.
This patch introduces a bus ops to handle device hotplug failure, it is a
bus specific behavior, so each kind of bus can implement its own logic case
by case.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
---
v6->v5:
refine some description of bus ops
---
 lib/librte_eal/common/include/rte_bus.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index eb9eded..e3a55a8 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -168,6 +168,20 @@ typedef int (*rte_bus_unplug_t)(struct rte_device *dev);
 typedef int (*rte_bus_parse_t)(const char *name, void *addr);
 
 /**
+ * Implementation a specific hotplug failure handler, which is responsible
+ * for handle the failure when the device be hotplug out from the bus. When
+ * hotplug removal event be detected, it could call this function to handle
+ * failure and guaranty the system would not crash in the case.
+ * @param dev
+ *	Pointer of the device structure.
+ *
+ * @return
+ *	0 on success.
+ *	!0 on error.
+ */
+typedef int (*rte_bus_hotplug_failure_handler_t)(struct rte_device *dev);
+
+/**
  * Bus scan policies
  */
 enum rte_bus_scan_mode {
@@ -211,6 +225,8 @@ struct rte_bus {
 	rte_bus_parse_t parse;       /**< Parse a device name */
 	struct rte_bus_conf conf;    /**< Bus configuration */
 	rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
+	rte_bus_hotplug_failure_handler_t hotplug_failure_handler;
+					/**< handle hotplug failure on bus */
 };
 
 /**
-- 
2.7.4
^ permalink raw reply	[flat|nested] 13+ messages in thread
* [dpdk-dev] [PATCH v6 2/7] bus/pci: implement hotplug failure handler ops
  2018-07-09  6:01 [dpdk-dev] [PATCH v6 0/7] hotplug failure handle mechanism Jeff Guo
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 1/7] bus: add hotplug failure handler Jeff Guo
@ 2018-07-09  6:01 ` Jeff Guo
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 3/7] bus: add sigbus handler Jeff Guo
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Jeff Guo @ 2018-07-09  6:01 UTC (permalink / raw)
  To: stephen, bruce.richardson, ferruh.yigit, konstantin.ananyev,
	gaetan.rivet, jingjing.wu, thomas, motih, matan,
	harry.van.haaren, qi.z.zhang, shaopeng.he, bernard.iremonger,
	arybchenko
  Cc: jblunck, shreyansh.jain, dev, jia.guo, helin.zhang
This patch implements the ops of hotplug failure handler for PCI bus,
it is functional to remap a new dummy memory which overlap to the
failure memory to avoid MMIO read/write error.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
---
v6->v5:
refine some typo
---
 drivers/bus/pci/pci_common.c     | 28 ++++++++++++++++++++++++++++
 drivers/bus/pci/pci_common_uio.c | 33 +++++++++++++++++++++++++++++++++
 drivers/bus/pci/private.h        | 12 ++++++++++++
 3 files changed, 73 insertions(+)
diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 94b0f41..d7abe6c 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -408,6 +408,33 @@ pci_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
 }
 
 static int
+pci_hotplug_failure_handler(struct rte_device *dev)
+{
+	struct rte_pci_device *pdev = NULL;
+	int ret = 0;
+
+	pdev = RTE_DEV_TO_PCI(dev);
+	if (!pdev)
+		return -1;
+
+	switch (pdev->kdrv) {
+	case RTE_KDRV_IGB_UIO:
+	case RTE_KDRV_UIO_GENERIC:
+	case RTE_KDRV_NIC_UIO:
+		/* mmio resource is invalid, remap it to be safe. */
+		ret = pci_uio_remap_resource(pdev);
+		break;
+	default:
+		RTE_LOG(DEBUG, EAL,
+			"Not managed by a supported kernel driver, skipped\n");
+		ret = -1;
+		break;
+	}
+
+	return ret;
+}
+
+static int
 pci_plug(struct rte_device *dev)
 {
 	return pci_probe_all_drivers(RTE_DEV_TO_PCI(dev));
@@ -437,6 +464,7 @@ struct rte_pci_bus rte_pci_bus = {
 		.unplug = pci_unplug,
 		.parse = pci_parse,
 		.get_iommu_class = rte_pci_get_iommu_class,
+		.hotplug_failure_handler = pci_hotplug_failure_handler,
 	},
 	.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
 	.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/drivers/bus/pci/pci_common_uio.c b/drivers/bus/pci/pci_common_uio.c
index 54bc20b..7ea73db 100644
--- a/drivers/bus/pci/pci_common_uio.c
+++ b/drivers/bus/pci/pci_common_uio.c
@@ -146,6 +146,39 @@ pci_uio_unmap(struct mapped_pci_resource *uio_res)
 	}
 }
 
+/* remap the PCI resource of a PCI device in anonymous virtual memory */
+int
+pci_uio_remap_resource(struct rte_pci_device *dev)
+{
+	int i;
+	void *map_address;
+
+	if (dev == NULL)
+		return -1;
+
+	/* Remap all BARs */
+	for (i = 0; i != PCI_MAX_RESOURCE; i++) {
+		/* skip empty BAR */
+		if (dev->mem_resource[i].phys_addr == 0)
+			continue;
+		map_address = mmap(dev->mem_resource[i].addr,
+				(size_t)dev->mem_resource[i].len,
+				PROT_READ | PROT_WRITE,
+				MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
+		if (map_address == MAP_FAILED) {
+			RTE_LOG(ERR, EAL,
+				"Cannot remap resource for device %s\n",
+				dev->name);
+			return -1;
+		}
+		RTE_LOG(INFO, EAL,
+			"Successful remap resource for device %s\n",
+			dev->name);
+	}
+
+	return 0;
+}
+
 static struct mapped_pci_resource *
 pci_uio_find_resource(struct rte_pci_device *dev)
 {
diff --git a/drivers/bus/pci/private.h b/drivers/bus/pci/private.h
index 8ddd03e..6b312e5 100644
--- a/drivers/bus/pci/private.h
+++ b/drivers/bus/pci/private.h
@@ -123,6 +123,18 @@ void pci_uio_free_resource(struct rte_pci_device *dev,
 		struct mapped_pci_resource *uio_res);
 
 /**
+ * Remap the PCI resource of a PCI device in anonymous virtual memory.
+ *
+ * @param dev
+ *   Point to the struct rte pci device.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+int
+pci_uio_remap_resource(struct rte_pci_device *dev);
+
+/**
  * Map device memory to uio resource
  *
  * This function is private to EAL.
-- 
2.7.4
^ permalink raw reply	[flat|nested] 13+ messages in thread
* [dpdk-dev] [PATCH v6 3/7] bus: add sigbus handler
  2018-07-09  6:01 [dpdk-dev] [PATCH v6 0/7] hotplug failure handle mechanism Jeff Guo
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 1/7] bus: add hotplug failure handler Jeff Guo
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 2/7] bus/pci: implement hotplug failure handler ops Jeff Guo
@ 2018-07-09  6:01 ` Jeff Guo
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 4/7] bus/pci: implement sigbus handler operation Jeff Guo
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Jeff Guo @ 2018-07-09  6:01 UTC (permalink / raw)
  To: stephen, bruce.richardson, ferruh.yigit, konstantin.ananyev,
	gaetan.rivet, jingjing.wu, thomas, motih, matan,
	harry.van.haaren, qi.z.zhang, shaopeng.he, bernard.iremonger,
	arybchenko
  Cc: jblunck, shreyansh.jain, dev, jia.guo, helin.zhang
When device be hotplug out, if data path still read/write device, the
sigbus error will occur, this error need to be handled. So a handler
need to be here to capture the signal and handle it correspondingly.
This patch introduces a bus ops to handle sigbus error, it is a bus
specific behavior, so that each kind of bus can implement its own logic
case by case.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
---
v6->v5:
refine some description of bus ops
---
 lib/librte_eal/common/include/rte_bus.h | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index e3a55a8..216ad1e 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -182,6 +182,21 @@ typedef int (*rte_bus_parse_t)(const char *name, void *addr);
 typedef int (*rte_bus_hotplug_failure_handler_t)(struct rte_device *dev);
 
 /**
+ * Implementation a specific sigbus handler, which is responsible for handle
+ * the sigbus error which is either original memory error, or specific memory
+ * error that caused of hot unplug. When sigbus error be captured, it could
+ * call this function to handle sigbus error.
+ * @param failure_addr
+ *	Pointer of the fault address of the sigbus error.
+ *
+ * @return
+ *	0 for success handle the sigbus.
+ *	1 for no bus handle the sigbus.
+ *	-1 for failed to handle the sigbus
+ */
+typedef int (*rte_bus_sigbus_handler_t)(const void *failure_addr);
+
+/**
  * Bus scan policies
  */
 enum rte_bus_scan_mode {
@@ -227,6 +242,8 @@ struct rte_bus {
 	rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
 	rte_bus_hotplug_failure_handler_t hotplug_failure_handler;
 					/**< handle hotplug failure on bus */
+	rte_bus_sigbus_handler_t sigbus_handler; /**< handle sigbus error */
+
 };
 
 /**
-- 
2.7.4
^ permalink raw reply	[flat|nested] 13+ messages in thread
* [dpdk-dev] [PATCH v6 4/7] bus/pci: implement sigbus handler operation
  2018-07-09  6:01 [dpdk-dev] [PATCH v6 0/7] hotplug failure handle mechanism Jeff Guo
                   ` (2 preceding siblings ...)
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 3/7] bus: add sigbus handler Jeff Guo
@ 2018-07-09  6:01 ` Jeff Guo
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 5/7] bus: add helper to handle sigbus Jeff Guo
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Jeff Guo @ 2018-07-09  6:01 UTC (permalink / raw)
  To: stephen, bruce.richardson, ferruh.yigit, konstantin.ananyev,
	gaetan.rivet, jingjing.wu, thomas, motih, matan,
	harry.van.haaren, qi.z.zhang, shaopeng.he, bernard.iremonger,
	arybchenko
  Cc: jblunck, shreyansh.jain, dev, jia.guo, helin.zhang
This patch implements the ops of sigbus handler for PCI bus, it is
functional to find the corresponding pci device which is been hotplug
out, and then call the bus ops of hotplug failure handler to handle
the failure for the device.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
---
v6->v5:
refine some typo
---
 drivers/bus/pci/pci_common.c | 49 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)
diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index d7abe6c..37ad266 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -407,6 +407,32 @@ pci_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
 	return NULL;
 }
 
+/* check the failure address belongs to which device. */
+static struct rte_pci_device *
+pci_find_device_by_addr(const void *failure_addr)
+{
+	struct rte_pci_device *pdev = NULL;
+	int i;
+
+	FOREACH_DEVICE_ON_PCIBUS(pdev) {
+		for (i = 0; i != RTE_DIM(pdev->mem_resource); i++) {
+			if ((uint64_t)(uintptr_t)failure_addr >=
+			    (uint64_t)(uintptr_t)pdev->mem_resource[i].addr &&
+			    (uint64_t)(uintptr_t)failure_addr <
+			    (uint64_t)(uintptr_t)pdev->mem_resource[i].addr +
+			    pdev->mem_resource[i].len) {
+				RTE_LOG(INFO, EAL, "Failure address "
+					"%16.16"PRIx64" belongs to "
+					"device %s!\n",
+					(uint64_t)(uintptr_t)failure_addr,
+					pdev->device.name);
+				return pdev;
+			}
+		}
+	}
+	return NULL;
+}
+
 static int
 pci_hotplug_failure_handler(struct rte_device *dev)
 {
@@ -435,6 +461,28 @@ pci_hotplug_failure_handler(struct rte_device *dev)
 }
 
 static int
+pci_sigbus_handler(const void *failure_addr)
+{
+	struct rte_pci_device *pdev = NULL;
+	int ret = 0;
+
+	pdev = pci_find_device_by_addr(failure_addr);
+	if (!pdev) {
+		/* It is a generic sigbus error, no bus would handle it. */
+		ret = 1;
+	} else {
+		/* The sigbus error is caused of hot removal. */
+		ret = pci_hotplug_failure_handler(&pdev->device);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "Failed to handle hot plug for "
+				"device %s", pdev->name);
+			ret = -1;
+		}
+	}
+	return ret;
+}
+
+static int
 pci_plug(struct rte_device *dev)
 {
 	return pci_probe_all_drivers(RTE_DEV_TO_PCI(dev));
@@ -465,6 +513,7 @@ struct rte_pci_bus rte_pci_bus = {
 		.parse = pci_parse,
 		.get_iommu_class = rte_pci_get_iommu_class,
 		.hotplug_failure_handler = pci_hotplug_failure_handler,
+		.sigbus_handler = pci_sigbus_handler,
 	},
 	.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
 	.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
-- 
2.7.4
^ permalink raw reply	[flat|nested] 13+ messages in thread
* [dpdk-dev] [PATCH v6 5/7] bus: add helper to handle sigbus
  2018-07-09  6:01 [dpdk-dev] [PATCH v6 0/7] hotplug failure handle mechanism Jeff Guo
                   ` (3 preceding siblings ...)
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 4/7] bus/pci: implement sigbus handler operation Jeff Guo
@ 2018-07-09  6:01 ` Jeff Guo
  2018-07-09  6:36   ` Andrew Rybchenko
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 6/7] eal: add failure handle mechanism for hotplug Jeff Guo
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 7/7] igb_uio: fix uio release issue when hot unplug Jeff Guo
  6 siblings, 1 reply; 13+ messages in thread
From: Jeff Guo @ 2018-07-09  6:01 UTC (permalink / raw)
  To: stephen, bruce.richardson, ferruh.yigit, konstantin.ananyev,
	gaetan.rivet, jingjing.wu, thomas, motih, matan,
	harry.van.haaren, qi.z.zhang, shaopeng.he, bernard.iremonger,
	arybchenko
  Cc: jblunck, shreyansh.jain, dev, jia.guo, helin.zhang
This patch aim to add a helper to iterate all buses to find the
corresponding bus to handle the sigbus error.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
---
v6->v5:
refine some coding style.
---
 lib/librte_eal/common/eal_common_bus.c | 42 ++++++++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_private.h    | 12 ++++++++++
 2 files changed, 54 insertions(+)
diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
index 0943851..8856adc 100644
--- a/lib/librte_eal/common/eal_common_bus.c
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -37,6 +37,7 @@
 #include <rte_bus.h>
 #include <rte_debug.h>
 #include <rte_string_fns.h>
+#include <rte_errno.h>
 
 #include "eal_private.h"
 
@@ -242,3 +243,44 @@ rte_bus_get_iommu_class(void)
 	}
 	return mode;
 }
+
+static int
+bus_handle_sigbus(const struct rte_bus *bus,
+			const void *failure_addr)
+{
+	int ret;
+
+	if (!bus->sigbus_handler) {
+		RTE_LOG(ERR, EAL, "Function sigbus_handler not supported by "
+			"bus (%s)\n", bus->name);
+		return -1;
+	}
+
+	ret = bus->sigbus_handler(failure_addr);
+	rte_errno = ret;
+
+	return !(bus->sigbus_handler && ret <= 0);
+}
+
+int
+rte_bus_sigbus_handler(const void *failure_addr)
+{
+	struct rte_bus *bus;
+
+	int ret = 0;
+	int old_errno = rte_errno;
+
+	rte_errno = 0;
+
+	bus = rte_bus_find(NULL, bus_handle_sigbus, failure_addr);
+	/* failed to handle the sigbus, pass the new errno. */
+	if (!bus)
+		ret = 1;
+	else if (rte_errno == -1)
+		return -1;
+
+	/* otherwise restore the old errno. */
+	rte_errno = old_errno;
+
+	return ret;
+}
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index bdadc4d..2337e71 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -258,4 +258,16 @@ int rte_mp_channel_init(void);
  */
 void dev_callback_process(char *device_name, enum rte_dev_event_type event);
 
+/**
+ * Iterate all buses to find the corresponding bus, to handle the sigbus error.
+ * @param failure_addr
+ *	Pointer of the fault address of the sigbus error.
+ *
+ * @return
+ *	 0 success to handle the sigbus.
+ *	-1 failed to handle the sigbus
+ *	 1 no bus can handler the sigbus
+ */
+int rte_bus_sigbus_handler(const void *failure_addr);
+
 #endif /* _EAL_PRIVATE_H_ */
-- 
2.7.4
^ permalink raw reply	[flat|nested] 13+ messages in thread
* [dpdk-dev] [PATCH v6 6/7] eal: add failure handle mechanism for hotplug
  2018-07-09  6:01 [dpdk-dev] [PATCH v6 0/7] hotplug failure handle mechanism Jeff Guo
                   ` (4 preceding siblings ...)
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 5/7] bus: add helper to handle sigbus Jeff Guo
@ 2018-07-09  6:01 ` Jeff Guo
  2018-07-09  6:47   ` Andrew Rybchenko
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 7/7] igb_uio: fix uio release issue when hot unplug Jeff Guo
  6 siblings, 1 reply; 13+ messages in thread
From: Jeff Guo @ 2018-07-09  6:01 UTC (permalink / raw)
  To: stephen, bruce.richardson, ferruh.yigit, konstantin.ananyev,
	gaetan.rivet, jingjing.wu, thomas, motih, matan,
	harry.van.haaren, qi.z.zhang, shaopeng.he, bernard.iremonger,
	arybchenko
  Cc: jblunck, shreyansh.jain, dev, jia.guo, helin.zhang
This patch introduces a failure handle mechanism to handle device
hotplug removal event.
First it can register sigbus handler when enable device event monitor. Once
sigbus error be captured, it will check the failure address and accordingly
remap the invalid memory for the corresponding device. Besed on this
mechanism, it could guaranty the application not crash when the device be
hotplug out.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
---
v6->v5:
refine some doc and coding style
---
 lib/librte_eal/linuxapp/eal/eal_dev.c | 114 +++++++++++++++++++++++++++++++++-
 1 file changed, 113 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c b/lib/librte_eal/linuxapp/eal/eal_dev.c
index 1cf6aeb..cb30729 100644
--- a/lib/librte_eal/linuxapp/eal/eal_dev.c
+++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
@@ -4,6 +4,8 @@
 
 #include <string.h>
 #include <unistd.h>
+#include <fcntl.h>
+#include <signal.h>
 #include <sys/socket.h>
 #include <linux/netlink.h>
 
@@ -14,15 +16,31 @@
 #include <rte_malloc.h>
 #include <rte_interrupts.h>
 #include <rte_alarm.h>
+#include <rte_bus.h>
+#include <rte_eal.h>
+#include <rte_spinlock.h>
+#include <rte_errno.h>
 
 #include "eal_private.h"
 
 static struct rte_intr_handle intr_handle = {.fd = -1 };
 static bool monitor_started;
 
+extern struct rte_bus_list rte_bus_list;
+
 #define EAL_UEV_MSG_LEN 4096
 #define EAL_UEV_MSG_ELEM_LEN 128
 
+/*
+ * spinlock for device failure process, protect the bus and the device
+ * to avoid race condition.
+ */
+static rte_spinlock_t dev_failure_lock = RTE_SPINLOCK_INITIALIZER;
+
+static struct sigaction sigbus_action_old;
+
+static int sigbus_need_recover;
+
 static void dev_uev_handler(__rte_unused void *param);
 
 /* identify the system layer which reports this event. */
@@ -33,6 +51,49 @@ enum eal_dev_event_subsystem {
 	EAL_DEV_EVENT_SUBSYSTEM_MAX
 };
 
+static void
+sigbus_action_recover(void)
+{
+	if (sigbus_need_recover) {
+		sigaction(SIGBUS, &sigbus_action_old, NULL);
+		sigbus_need_recover = 0;
+	}
+}
+
+static void sigbus_handler(int signum, siginfo_t *info,
+				void *ctx __rte_unused)
+{
+	int ret;
+
+	RTE_LOG(INFO, EAL, "Thread[%d] catch SIGBUS, fault address:%p\n",
+		(int)pthread_self(), info->si_addr);
+
+	rte_spinlock_lock(&dev_failure_lock);
+	ret = rte_bus_sigbus_handler(info->si_addr);
+	rte_spinlock_unlock(&dev_failure_lock);
+	if (ret == -1) {
+		rte_exit(EXIT_FAILURE,
+			 "Failed to handle SIGBUS for hotplug, "
+			 "(rte_errno: %s)!", strerror(rte_errno));
+	} else if (ret == 1) {
+		if (sigbus_action_old.sa_handler)
+			(*(sigbus_action_old.sa_handler))(signum);
+		else
+			rte_exit(EXIT_FAILURE,
+				 "Failed to handle generic SIGBUS!");
+	}
+
+	RTE_LOG(INFO, EAL, "Success to handle SIGBUS for hotplug!\n");
+}
+
+static int cmp_dev_name(const struct rte_device *dev,
+	const void *_name)
+{
+	const char *name = _name;
+
+	return strcmp(dev->name, name);
+}
+
 static int
 dev_uev_socket_fd_create(void)
 {
@@ -147,6 +208,9 @@ dev_uev_handler(__rte_unused void *param)
 	struct rte_dev_event uevent;
 	int ret;
 	char buf[EAL_UEV_MSG_LEN];
+	struct rte_bus *bus;
+	struct rte_device *dev;
+	const char *busname = "";
 
 	memset(&uevent, 0, sizeof(struct rte_dev_event));
 	memset(buf, 0, EAL_UEV_MSG_LEN);
@@ -171,13 +235,50 @@ dev_uev_handler(__rte_unused void *param)
 	RTE_LOG(DEBUG, EAL, "receive uevent(name:%s, type:%d, subsystem:%d)\n",
 		uevent.devname, uevent.type, uevent.subsystem);
 
-	if (uevent.devname)
+	switch (uevent.subsystem) {
+	case EAL_DEV_EVENT_SUBSYSTEM_PCI:
+	case EAL_DEV_EVENT_SUBSYSTEM_UIO:
+		busname = "pci";
+		break;
+	default:
+		break;
+	}
+
+	if (uevent.devname) {
+		if (uevent.type == RTE_DEV_EVENT_REMOVE) {
+			rte_spinlock_lock(&dev_failure_lock);
+			bus = rte_bus_find_by_name(busname);
+			if (bus == NULL) {
+				RTE_LOG(ERR, EAL, "Cannot find bus (%s)\n",
+					busname);
+				return;
+			}
+
+			dev = bus->find_device(NULL, cmp_dev_name,
+					       uevent.devname);
+			if (dev == NULL) {
+				RTE_LOG(ERR, EAL, "Cannot find device (%s) on "
+					"bus (%s)\n", uevent.devname, busname);
+				return;
+			}
+
+			ret = bus->hotplug_failure_handler(dev);
+			rte_spinlock_unlock(&dev_failure_lock);
+			if (ret) {
+				RTE_LOG(ERR, EAL, "Can not handle hotplug for "
+					"device (%s)\n", dev->name);
+				return;
+			}
+		}
 		dev_callback_process(uevent.devname, uevent.type);
+	}
 }
 
 int __rte_experimental
 rte_dev_event_monitor_start(void)
 {
+	sigset_t mask;
+	struct sigaction action;
 	int ret;
 
 	if (monitor_started)
@@ -197,6 +298,14 @@ rte_dev_event_monitor_start(void)
 		return -1;
 	}
 
+	/* register sigbus handler */
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGBUS);
+	action.sa_flags = SA_SIGINFO;
+	action.sa_mask = mask;
+	action.sa_sigaction = sigbus_handler;
+	sigbus_need_recover = !sigaction(SIGBUS, &action, &sigbus_action_old);
+
 	monitor_started = true;
 
 	return 0;
@@ -217,8 +326,11 @@ rte_dev_event_monitor_stop(void)
 		return ret;
 	}
 
+	sigbus_action_recover();
+
 	close(intr_handle.fd);
 	intr_handle.fd = -1;
 	monitor_started = false;
+
 	return 0;
 }
-- 
2.7.4
^ permalink raw reply	[flat|nested] 13+ messages in thread
* [dpdk-dev] [PATCH v6 7/7] igb_uio: fix uio release issue when hot unplug
  2018-07-09  6:01 [dpdk-dev] [PATCH v6 0/7] hotplug failure handle mechanism Jeff Guo
                   ` (5 preceding siblings ...)
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 6/7] eal: add failure handle mechanism for hotplug Jeff Guo
@ 2018-07-09  6:01 ` Jeff Guo
  6 siblings, 0 replies; 13+ messages in thread
From: Jeff Guo @ 2018-07-09  6:01 UTC (permalink / raw)
  To: stephen, bruce.richardson, ferruh.yigit, konstantin.ananyev,
	gaetan.rivet, jingjing.wu, thomas, motih, matan,
	harry.van.haaren, qi.z.zhang, shaopeng.he, bernard.iremonger,
	arybchenko
  Cc: jblunck, shreyansh.jain, dev, jia.guo, helin.zhang
When hotplug out device, the kernel will release the device resource in the
kernel side, such as the fd sys file will disappear, and the irq will be
released. At this time, if igb uio driver still try to release this
resource, it will cause kernel crash. On the other hand, something like
interrupt disabling do not automatically process in kernel side. If not
handler it, this redundancy and dirty thing will affect the interrupt
resource be used by other device. So the igb_uio driver have to check the
hotplug status, and the corresponding process should be taken in igb uio
driver.
This patch propose to add structure of rte_udev_state into rte_uio_pci_dev
of igb_uio kernel driver, which will record the state of uio device, such
as probed/opened/released/removed/unplug. When detect the unexpected
removal which cause of hotplug out behavior, it will corresponding disable
interrupt resource, while for the part of releasement which kernel have
already handle, just skip it to avoid double free or null pointer kernel
crash issue.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
---
v6->v5:
no change
---
 kernel/linux/igb_uio/igb_uio.c | 51 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 48 insertions(+), 3 deletions(-)
diff --git a/kernel/linux/igb_uio/igb_uio.c b/kernel/linux/igb_uio/igb_uio.c
index 3398eac..adc8cea 100644
--- a/kernel/linux/igb_uio/igb_uio.c
+++ b/kernel/linux/igb_uio/igb_uio.c
@@ -19,6 +19,15 @@
 
 #include "compat.h"
 
+/* uio pci device state */
+enum rte_udev_state {
+	RTE_UDEV_PROBED,
+	RTE_UDEV_OPENNED,
+	RTE_UDEV_RELEASED,
+	RTE_UDEV_REMOVED,
+	RTE_UDEV_UNPLUG
+};
+
 /**
  * A structure describing the private information for a uio device.
  */
@@ -28,6 +37,7 @@ struct rte_uio_pci_dev {
 	enum rte_intr_mode mode;
 	struct mutex lock;
 	int refcnt;
+	enum rte_udev_state state;
 };
 
 static int wc_activate;
@@ -195,12 +205,22 @@ igbuio_pci_irqhandler(int irq, void *dev_id)
 {
 	struct rte_uio_pci_dev *udev = (struct rte_uio_pci_dev *)dev_id;
 	struct uio_info *info = &udev->info;
+	struct pci_dev *pdev = udev->pdev;
 
 	/* Legacy mode need to mask in hardware */
 	if (udev->mode == RTE_INTR_MODE_LEGACY &&
 	    !pci_check_and_mask_intx(udev->pdev))
 		return IRQ_NONE;
 
+	mutex_lock(&udev->lock);
+	/* check the uevent of the kobj */
+	if ((&pdev->dev.kobj)->state_remove_uevent_sent == 1) {
+		dev_notice(&pdev->dev, "device:%s, sent remove uevent!\n",
+			   (&pdev->dev.kobj)->name);
+		udev->state = RTE_UDEV_UNPLUG;
+	}
+	mutex_unlock(&udev->lock);
+
 	uio_event_notify(info);
 
 	/* Message signal mode, no share IRQ and automasked */
@@ -309,7 +329,6 @@ igbuio_pci_disable_interrupts(struct rte_uio_pci_dev *udev)
 #endif
 }
 
-
 /**
  * This gets called while opening uio device file.
  */
@@ -331,20 +350,29 @@ igbuio_pci_open(struct uio_info *info, struct inode *inode)
 
 	/* enable interrupts */
 	err = igbuio_pci_enable_interrupts(udev);
-	mutex_unlock(&udev->lock);
 	if (err) {
 		dev_err(&dev->dev, "Enable interrupt fails\n");
+		pci_clear_master(dev);
+		mutex_unlock(&udev->lock);
 		return err;
 	}
+	udev->state = RTE_UDEV_OPENNED;
+	mutex_unlock(&udev->lock);
 	return 0;
 }
 
+/**
+ * This gets called while closing uio device file.
+ */
 static int
 igbuio_pci_release(struct uio_info *info, struct inode *inode)
 {
 	struct rte_uio_pci_dev *udev = info->priv;
 	struct pci_dev *dev = udev->pdev;
 
+	if (udev->state == RTE_UDEV_REMOVED)
+		return 0;
+
 	mutex_lock(&udev->lock);
 	if (--udev->refcnt > 0) {
 		mutex_unlock(&udev->lock);
@@ -356,7 +384,7 @@ igbuio_pci_release(struct uio_info *info, struct inode *inode)
 
 	/* stop the device from further DMA */
 	pci_clear_master(dev);
-
+	udev->state = RTE_UDEV_RELEASED;
 	mutex_unlock(&udev->lock);
 	return 0;
 }
@@ -562,6 +590,9 @@ igbuio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 			 (unsigned long long)map_dma_addr, map_addr);
 	}
 
+	mutex_lock(&udev->lock);
+	udev->state = RTE_UDEV_PROBED;
+	mutex_unlock(&udev->lock);
 	return 0;
 
 fail_remove_group:
@@ -579,6 +610,20 @@ static void
 igbuio_pci_remove(struct pci_dev *dev)
 {
 	struct rte_uio_pci_dev *udev = pci_get_drvdata(dev);
+	int ret;
+
+	/* handler hot unplug */
+	if (udev->state == RTE_UDEV_OPENNED ||
+		udev->state == RTE_UDEV_UNPLUG) {
+		dev_notice(&dev->dev, "Unexpected removal!\n");
+		ret = igbuio_pci_release(&udev->info, NULL);
+		if (ret)
+			return;
+		mutex_lock(&udev->lock);
+		udev->state = RTE_UDEV_REMOVED;
+		mutex_unlock(&udev->lock);
+		return;
+	}
 
 	mutex_destroy(&udev->lock);
 	sysfs_remove_group(&dev->dev.kobj, &dev_attr_grp);
-- 
2.7.4
^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH v6 5/7] bus: add helper to handle sigbus
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 5/7] bus: add helper to handle sigbus Jeff Guo
@ 2018-07-09  6:36   ` Andrew Rybchenko
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Rybchenko @ 2018-07-09  6:36 UTC (permalink / raw)
  To: Jeff Guo, stephen, bruce.richardson, ferruh.yigit,
	konstantin.ananyev, gaetan.rivet, jingjing.wu, thomas, motih,
	matan, harry.van.haaren, qi.z.zhang, shaopeng.he,
	bernard.iremonger, arybchenko
  Cc: jblunck, shreyansh.jain, dev, helin.zhang
On 09.07.2018 09:01, Jeff Guo wrote:
> This patch aim to add a helper to iterate all buses to find the
> corresponding bus to handle the sigbus error.
>
> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> Acked-by: Shaopeng He <shaopeng.he@intel.com>
> ---
> v6->v5:
> refine some coding style.
> ---
>   lib/librte_eal/common/eal_common_bus.c | 42 ++++++++++++++++++++++++++++++++++
>   lib/librte_eal/common/eal_private.h    | 12 ++++++++++
>   2 files changed, 54 insertions(+)
>
> diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
> index 0943851..8856adc 100644
> --- a/lib/librte_eal/common/eal_common_bus.c
> +++ b/lib/librte_eal/common/eal_common_bus.c
> @@ -37,6 +37,7 @@
>   #include <rte_bus.h>
>   #include <rte_debug.h>
>   #include <rte_string_fns.h>
> +#include <rte_errno.h>
>   
>   #include "eal_private.h"
>   
> @@ -242,3 +243,44 @@ rte_bus_get_iommu_class(void)
>   	}
>   	return mode;
>   }
> +
> +static int
> +bus_handle_sigbus(const struct rte_bus *bus,
> +			const void *failure_addr)
> +{
> +	int ret;
> +
> +	if (!bus->sigbus_handler) {
> +		RTE_LOG(ERR, EAL, "Function sigbus_handler not supported by "
> +			"bus (%s)\n", bus->name);
> +		return -1;
> +	}
> +
> +	ret = bus->sigbus_handler(failure_addr);
> +	rte_errno = ret;
> +
> +	return !(bus->sigbus_handler && ret <= 0);
Since bus->sigbus_handler is checked above, there is no point to check 
it here.
> +}
> +
> +int
> +rte_bus_sigbus_handler(const void *failure_addr)
> +{
> +	struct rte_bus *bus;
> +
> +	int ret = 0;
> +	int old_errno = rte_errno;
> +
> +	rte_errno = 0;
> +
> +	bus = rte_bus_find(NULL, bus_handle_sigbus, failure_addr);
> +	/* failed to handle the sigbus, pass the new errno. */
> +	if (!bus)
> +		ret = 1;
> +	else if (rte_errno == -1)
Is it OK to preserve negative value in rte_errno here? I think no.
rte_errno should have value from corresponding set of errno values.
> +		return -1;
> +
> +	/* otherwise restore the old errno. */
> +	rte_errno = old_errno;
> +
> +	return ret;
> +}
> diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
> index bdadc4d..2337e71 100644
> --- a/lib/librte_eal/common/eal_private.h
> +++ b/lib/librte_eal/common/eal_private.h
> @@ -258,4 +258,16 @@ int rte_mp_channel_init(void);
>    */
>   void dev_callback_process(char *device_name, enum rte_dev_event_type event);
>   
> +/**
> + * Iterate all buses to find the corresponding bus, to handle the sigbus error.
> + * @param failure_addr
> + *	Pointer of the fault address of the sigbus error.
> + *
> + * @return
> + *	 0 success to handle the sigbus.
> + *	-1 failed to handle the sigbus
> + *	 1 no bus can handler the sigbus
> + */
> +int rte_bus_sigbus_handler(const void *failure_addr);
> +
>   #endif /* _EAL_PRIVATE_H_ */
^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH v6 6/7] eal: add failure handle mechanism for hotplug
  2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 6/7] eal: add failure handle mechanism for hotplug Jeff Guo
@ 2018-07-09  6:47   ` Andrew Rybchenko
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Rybchenko @ 2018-07-09  6:47 UTC (permalink / raw)
  To: Jeff Guo, stephen, bruce.richardson, ferruh.yigit,
	konstantin.ananyev, gaetan.rivet, jingjing.wu, thomas, motih,
	matan, harry.van.haaren, qi.z.zhang, shaopeng.he,
	bernard.iremonger, arybchenko
  Cc: jblunck, shreyansh.jain, dev, helin.zhang, Shreyansh Jain
CC Shreyansh
On 09.07.2018 09:01, Jeff Guo wrote:
> This patch introduces a failure handle mechanism to handle device
> hotplug removal event.
>
> First it can register sigbus handler when enable device event monitor. Once
> sigbus error be captured, it will check the failure address and accordingly
> remap the invalid memory for the corresponding device. Besed on this
> mechanism, it could guaranty the application not crash when the device be
> hotplug out.
>
> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> Acked-by: Shaopeng He <shaopeng.he@intel.com>
> ---
> v6->v5:
> refine some doc and coding style
> ---
>   lib/librte_eal/linuxapp/eal/eal_dev.c | 114 +++++++++++++++++++++++++++++++++-
>   1 file changed, 113 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c b/lib/librte_eal/linuxapp/eal/eal_dev.c
> index 1cf6aeb..cb30729 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_dev.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
> @@ -4,6 +4,8 @@
>   
>   #include <string.h>
>   #include <unistd.h>
> +#include <fcntl.h>
> +#include <signal.h>
>   #include <sys/socket.h>
>   #include <linux/netlink.h>
>   
> @@ -14,15 +16,31 @@
>   #include <rte_malloc.h>
>   #include <rte_interrupts.h>
>   #include <rte_alarm.h>
> +#include <rte_bus.h>
> +#include <rte_eal.h>
> +#include <rte_spinlock.h>
> +#include <rte_errno.h>
>   
>   #include "eal_private.h"
>   
>   static struct rte_intr_handle intr_handle = {.fd = -1 };
>   static bool monitor_started;
>   
> +extern struct rte_bus_list rte_bus_list;
> +
Shreyansh, shouldn't it be done in rte_bus.h?
>   #define EAL_UEV_MSG_LEN 4096
>   #define EAL_UEV_MSG_ELEM_LEN 128
>   
> +/*
> + * spinlock for device failure process, protect the bus and the device
> + * to avoid race condition.
> + */
> +static rte_spinlock_t dev_failure_lock = RTE_SPINLOCK_INITIALIZER;
Unfortunately it is still not very helpful. I don't understand circumstance
when the race condition could happen.
> +
> +static struct sigaction sigbus_action_old;
> +
> +static int sigbus_need_recover;
> +
>   static void dev_uev_handler(__rte_unused void *param);
>   
>   /* identify the system layer which reports this event. */
> @@ -33,6 +51,49 @@ enum eal_dev_event_subsystem {
>   	EAL_DEV_EVENT_SUBSYSTEM_MAX
>   };
>   
> +static void
> +sigbus_action_recover(void)
> +{
> +	if (sigbus_need_recover) {
> +		sigaction(SIGBUS, &sigbus_action_old, NULL);
> +		sigbus_need_recover = 0;
> +	}
> +}
> +
> +static void sigbus_handler(int signum, siginfo_t *info,
> +				void *ctx __rte_unused)
> +{
> +	int ret;
> +
> +	RTE_LOG(INFO, EAL, "Thread[%d] catch SIGBUS, fault address:%p\n",
> +		(int)pthread_self(), info->si_addr);
> +
> +	rte_spinlock_lock(&dev_failure_lock);
> +	ret = rte_bus_sigbus_handler(info->si_addr);
> +	rte_spinlock_unlock(&dev_failure_lock);
> +	if (ret == -1) {
> +		rte_exit(EXIT_FAILURE,
> +			 "Failed to handle SIGBUS for hotplug, "
> +			 "(rte_errno: %s)!", strerror(rte_errno));
> +	} else if (ret == 1) {
> +		if (sigbus_action_old.sa_handler)
> +			(*(sigbus_action_old.sa_handler))(signum);
> +		else
> +			rte_exit(EXIT_FAILURE,
> +				 "Failed to handle generic SIGBUS!");
> +	}
> +
> +	RTE_LOG(INFO, EAL, "Success to handle SIGBUS for hotplug!\n");
> +}
> +
> +static int cmp_dev_name(const struct rte_device *dev,
> +	const void *_name)
> +{
> +	const char *name = _name;
> +
> +	return strcmp(dev->name, name);
> +}
> +
>   static int
>   dev_uev_socket_fd_create(void)
>   {
> @@ -147,6 +208,9 @@ dev_uev_handler(__rte_unused void *param)
>   	struct rte_dev_event uevent;
>   	int ret;
>   	char buf[EAL_UEV_MSG_LEN];
> +	struct rte_bus *bus;
> +	struct rte_device *dev;
> +	const char *busname = "";
>   
>   	memset(&uevent, 0, sizeof(struct rte_dev_event));
>   	memset(buf, 0, EAL_UEV_MSG_LEN);
> @@ -171,13 +235,50 @@ dev_uev_handler(__rte_unused void *param)
>   	RTE_LOG(DEBUG, EAL, "receive uevent(name:%s, type:%d, subsystem:%d)\n",
>   		uevent.devname, uevent.type, uevent.subsystem);
>   
> -	if (uevent.devname)
> +	switch (uevent.subsystem) {
> +	case EAL_DEV_EVENT_SUBSYSTEM_PCI:
> +	case EAL_DEV_EVENT_SUBSYSTEM_UIO:
> +		busname = "pci";
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	if (uevent.devname) {
> +		if (uevent.type == RTE_DEV_EVENT_REMOVE) {
> +			rte_spinlock_lock(&dev_failure_lock);
> +			bus = rte_bus_find_by_name(busname);
I see no point find bus by empty name. I would suggest to initialize
busname to NULL and simply would not even try to find bus with
such name. Just check above and generate appropriate log.
> +			if (bus == NULL) {
> +				RTE_LOG(ERR, EAL, "Cannot find bus (%s)\n",
> +					busname);
> +				return;
> +			}
> +
> +			dev = bus->find_device(NULL, cmp_dev_name,
> +					       uevent.devname);
> +			if (dev == NULL) {
> +				RTE_LOG(ERR, EAL, "Cannot find device (%s) on "
> +					"bus (%s)\n", uevent.devname, busname);
> +				return;
> +			}
> +
> +			ret = bus->hotplug_failure_handler(dev);
> +			rte_spinlock_unlock(&dev_failure_lock);
> +			if (ret) {
> +				RTE_LOG(ERR, EAL, "Can not handle hotplug for "
> +					"device (%s)\n", dev->name);
> +				return;
> +			}
> +		}
>   		dev_callback_process(uevent.devname, uevent.type);
> +	}
>   }
>   
>   int __rte_experimental
>   rte_dev_event_monitor_start(void)
>   {
> +	sigset_t mask;
> +	struct sigaction action;
>   	int ret;
>   
>   	if (monitor_started)
> @@ -197,6 +298,14 @@ rte_dev_event_monitor_start(void)
>   		return -1;
>   	}
>   
> +	/* register sigbus handler */
> +	sigemptyset(&mask);
> +	sigaddset(&mask, SIGBUS);
> +	action.sa_flags = SA_SIGINFO;
> +	action.sa_mask = mask;
> +	action.sa_sigaction = sigbus_handler;
> +	sigbus_need_recover = !sigaction(SIGBUS, &action, &sigbus_action_old);
> +
>   	monitor_started = true;
>   
>   	return 0;
> @@ -217,8 +326,11 @@ rte_dev_event_monitor_stop(void)
>   		return ret;
>   	}
>   
> +	sigbus_action_recover();
> +
>   	close(intr_handle.fd);
>   	intr_handle.fd = -1;
>   	monitor_started = false;
> +
>   	return 0;
>   }
^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH v6 6/7] eal: add failure handle mechanism for hotplug
  2018-07-09  7:42     ` Gaëtan Rivet
@ 2018-07-09  8:12       ` Jeff Guo
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Guo @ 2018-07-09  8:12 UTC (permalink / raw)
  To: Gaëtan Rivet
  Cc: stephen, bruce.richardson, ferruh.yigit, konstantin.ananyev,
	jingjing.wu, thomas, motih, matan, harry.van.haaren, qi.z.zhang,
	shaopeng.he, bernard.iremonger, arybchenko, jblunck,
	shreyansh.jain, dev, helin.zhang
hi, gaetan
On 7/9/2018 3:42 PM, Gaëtan Rivet wrote:
> Hi Jeff,
>
> On Mon, Jul 09, 2018 at 02:51:21PM +0800, Jeff Guo wrote:
>> This patch introduces a failure handle mechanism to handle device
>> hotplug removal event.
>>
>> First it can register sigbus handler when enable device event monitor. Once
>> sigbus error be captured, it will check the failure address and accordingly
>> remap the invalid memory for the corresponding device. Besed on this
>> mechanism, it could guaranty the application not crash when the device be
>> hotplug out.
>>
>> Signed-off-by: Jeff Guo <jia.guo@intel.com>
>> Acked-by: Shaopeng He <shaopeng.he@intel.com>
>> ---
>> v6->v5:
>> refine some doc and coding style
>> ---
>>   lib/librte_eal/linuxapp/eal/eal_dev.c | 114 +++++++++++++++++++++++++++++++++-
>>   1 file changed, 113 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c b/lib/librte_eal/linuxapp/eal/eal_dev.c
>> index 1cf6aeb..cb30729 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_dev.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
>> @@ -4,6 +4,8 @@
>>   
>>   #include <string.h>
>>   #include <unistd.h>
>> +#include <fcntl.h>
>> +#include <signal.h>
>>   #include <sys/socket.h>
>>   #include <linux/netlink.h>
>>   
>> @@ -14,15 +16,31 @@
>>   #include <rte_malloc.h>
>>   #include <rte_interrupts.h>
>>   #include <rte_alarm.h>
>> +#include <rte_bus.h>
>> +#include <rte_eal.h>
>> +#include <rte_spinlock.h>
>> +#include <rte_errno.h>
>>   
>>   #include "eal_private.h"
>>   
>>   static struct rte_intr_handle intr_handle = {.fd = -1 };
>>   static bool monitor_started;
>>   
>> +extern struct rte_bus_list rte_bus_list;
>> +
> Where do you use the rte_bus_list? It seems the reference is a remnant
> from a previous version.
>
> You do not seem to need a direct access on rte_bus_list,
> as you call rte_bus_find instead.
>
> Why do you need this extern? I think its absence is motivated: to keep the
> bus list private and force users to access it through standard exposed ways.
>
> Regards,
i think that is my missing here. Will delete it. Thanks for your info.
^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH v6 6/7] eal: add failure handle mechanism for hotplug
  2018-07-09  6:51   ` [dpdk-dev] [PATCH v6 6/7] eal: add failure handle mechanism for hotplug Jeff Guo
@ 2018-07-09  7:42     ` Gaëtan Rivet
  2018-07-09  8:12       ` Jeff Guo
  0 siblings, 1 reply; 13+ messages in thread
From: Gaëtan Rivet @ 2018-07-09  7:42 UTC (permalink / raw)
  To: Jeff Guo
  Cc: stephen, bruce.richardson, ferruh.yigit, konstantin.ananyev,
	jingjing.wu, thomas, motih, matan, harry.van.haaren, qi.z.zhang,
	shaopeng.he, bernard.iremonger, arybchenko, jblunck,
	shreyansh.jain, dev, helin.zhang
Hi Jeff,
On Mon, Jul 09, 2018 at 02:51:21PM +0800, Jeff Guo wrote:
> This patch introduces a failure handle mechanism to handle device
> hotplug removal event.
> 
> First it can register sigbus handler when enable device event monitor. Once
> sigbus error be captured, it will check the failure address and accordingly
> remap the invalid memory for the corresponding device. Besed on this
> mechanism, it could guaranty the application not crash when the device be
> hotplug out.
> 
> Signed-off-by: Jeff Guo <jia.guo@intel.com>
> Acked-by: Shaopeng He <shaopeng.he@intel.com>
> ---
> v6->v5:
> refine some doc and coding style
> ---
>  lib/librte_eal/linuxapp/eal/eal_dev.c | 114 +++++++++++++++++++++++++++++++++-
>  1 file changed, 113 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c b/lib/librte_eal/linuxapp/eal/eal_dev.c
> index 1cf6aeb..cb30729 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_dev.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
> @@ -4,6 +4,8 @@
>  
>  #include <string.h>
>  #include <unistd.h>
> +#include <fcntl.h>
> +#include <signal.h>
>  #include <sys/socket.h>
>  #include <linux/netlink.h>
>  
> @@ -14,15 +16,31 @@
>  #include <rte_malloc.h>
>  #include <rte_interrupts.h>
>  #include <rte_alarm.h>
> +#include <rte_bus.h>
> +#include <rte_eal.h>
> +#include <rte_spinlock.h>
> +#include <rte_errno.h>
>  
>  #include "eal_private.h"
>  
>  static struct rte_intr_handle intr_handle = {.fd = -1 };
>  static bool monitor_started;
>  
> +extern struct rte_bus_list rte_bus_list;
> +
Where do you use the rte_bus_list? It seems the reference is a remnant
from a previous version.
You do not seem to need a direct access on rte_bus_list,
as you call rte_bus_find instead.
Why do you need this extern? I think its absence is motivated: to keep the
bus list private and force users to access it through standard exposed ways.
Regards,
-- 
Gaëtan Rivet
6WIND
^ permalink raw reply	[flat|nested] 13+ messages in thread
* [dpdk-dev] [PATCH v6 6/7] eal: add failure handle mechanism for hotplug
  2018-07-09  6:51 ` [dpdk-dev] [PATCH v6 0/7] hotplug failure handle mechanism Jeff Guo
@ 2018-07-09  6:51   ` Jeff Guo
  2018-07-09  7:42     ` Gaëtan Rivet
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff Guo @ 2018-07-09  6:51 UTC (permalink / raw)
  To: stephen, bruce.richardson, ferruh.yigit, konstantin.ananyev,
	gaetan.rivet, jingjing.wu, thomas, motih, matan,
	harry.van.haaren, qi.z.zhang, shaopeng.he, bernard.iremonger,
	arybchenko
  Cc: jblunck, shreyansh.jain, dev, jia.guo, helin.zhang
This patch introduces a failure handle mechanism to handle device
hotplug removal event.
First it can register sigbus handler when enable device event monitor. Once
sigbus error be captured, it will check the failure address and accordingly
remap the invalid memory for the corresponding device. Besed on this
mechanism, it could guaranty the application not crash when the device be
hotplug out.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
---
v6->v5:
refine some doc and coding style
---
 lib/librte_eal/linuxapp/eal/eal_dev.c | 114 +++++++++++++++++++++++++++++++++-
 1 file changed, 113 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c b/lib/librte_eal/linuxapp/eal/eal_dev.c
index 1cf6aeb..cb30729 100644
--- a/lib/librte_eal/linuxapp/eal/eal_dev.c
+++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
@@ -4,6 +4,8 @@
 
 #include <string.h>
 #include <unistd.h>
+#include <fcntl.h>
+#include <signal.h>
 #include <sys/socket.h>
 #include <linux/netlink.h>
 
@@ -14,15 +16,31 @@
 #include <rte_malloc.h>
 #include <rte_interrupts.h>
 #include <rte_alarm.h>
+#include <rte_bus.h>
+#include <rte_eal.h>
+#include <rte_spinlock.h>
+#include <rte_errno.h>
 
 #include "eal_private.h"
 
 static struct rte_intr_handle intr_handle = {.fd = -1 };
 static bool monitor_started;
 
+extern struct rte_bus_list rte_bus_list;
+
 #define EAL_UEV_MSG_LEN 4096
 #define EAL_UEV_MSG_ELEM_LEN 128
 
+/*
+ * spinlock for device failure process, protect the bus and the device
+ * to avoid race condition.
+ */
+static rte_spinlock_t dev_failure_lock = RTE_SPINLOCK_INITIALIZER;
+
+static struct sigaction sigbus_action_old;
+
+static int sigbus_need_recover;
+
 static void dev_uev_handler(__rte_unused void *param);
 
 /* identify the system layer which reports this event. */
@@ -33,6 +51,49 @@ enum eal_dev_event_subsystem {
 	EAL_DEV_EVENT_SUBSYSTEM_MAX
 };
 
+static void
+sigbus_action_recover(void)
+{
+	if (sigbus_need_recover) {
+		sigaction(SIGBUS, &sigbus_action_old, NULL);
+		sigbus_need_recover = 0;
+	}
+}
+
+static void sigbus_handler(int signum, siginfo_t *info,
+				void *ctx __rte_unused)
+{
+	int ret;
+
+	RTE_LOG(INFO, EAL, "Thread[%d] catch SIGBUS, fault address:%p\n",
+		(int)pthread_self(), info->si_addr);
+
+	rte_spinlock_lock(&dev_failure_lock);
+	ret = rte_bus_sigbus_handler(info->si_addr);
+	rte_spinlock_unlock(&dev_failure_lock);
+	if (ret == -1) {
+		rte_exit(EXIT_FAILURE,
+			 "Failed to handle SIGBUS for hotplug, "
+			 "(rte_errno: %s)!", strerror(rte_errno));
+	} else if (ret == 1) {
+		if (sigbus_action_old.sa_handler)
+			(*(sigbus_action_old.sa_handler))(signum);
+		else
+			rte_exit(EXIT_FAILURE,
+				 "Failed to handle generic SIGBUS!");
+	}
+
+	RTE_LOG(INFO, EAL, "Success to handle SIGBUS for hotplug!\n");
+}
+
+static int cmp_dev_name(const struct rte_device *dev,
+	const void *_name)
+{
+	const char *name = _name;
+
+	return strcmp(dev->name, name);
+}
+
 static int
 dev_uev_socket_fd_create(void)
 {
@@ -147,6 +208,9 @@ dev_uev_handler(__rte_unused void *param)
 	struct rte_dev_event uevent;
 	int ret;
 	char buf[EAL_UEV_MSG_LEN];
+	struct rte_bus *bus;
+	struct rte_device *dev;
+	const char *busname = "";
 
 	memset(&uevent, 0, sizeof(struct rte_dev_event));
 	memset(buf, 0, EAL_UEV_MSG_LEN);
@@ -171,13 +235,50 @@ dev_uev_handler(__rte_unused void *param)
 	RTE_LOG(DEBUG, EAL, "receive uevent(name:%s, type:%d, subsystem:%d)\n",
 		uevent.devname, uevent.type, uevent.subsystem);
 
-	if (uevent.devname)
+	switch (uevent.subsystem) {
+	case EAL_DEV_EVENT_SUBSYSTEM_PCI:
+	case EAL_DEV_EVENT_SUBSYSTEM_UIO:
+		busname = "pci";
+		break;
+	default:
+		break;
+	}
+
+	if (uevent.devname) {
+		if (uevent.type == RTE_DEV_EVENT_REMOVE) {
+			rte_spinlock_lock(&dev_failure_lock);
+			bus = rte_bus_find_by_name(busname);
+			if (bus == NULL) {
+				RTE_LOG(ERR, EAL, "Cannot find bus (%s)\n",
+					busname);
+				return;
+			}
+
+			dev = bus->find_device(NULL, cmp_dev_name,
+					       uevent.devname);
+			if (dev == NULL) {
+				RTE_LOG(ERR, EAL, "Cannot find device (%s) on "
+					"bus (%s)\n", uevent.devname, busname);
+				return;
+			}
+
+			ret = bus->hotplug_failure_handler(dev);
+			rte_spinlock_unlock(&dev_failure_lock);
+			if (ret) {
+				RTE_LOG(ERR, EAL, "Can not handle hotplug for "
+					"device (%s)\n", dev->name);
+				return;
+			}
+		}
 		dev_callback_process(uevent.devname, uevent.type);
+	}
 }
 
 int __rte_experimental
 rte_dev_event_monitor_start(void)
 {
+	sigset_t mask;
+	struct sigaction action;
 	int ret;
 
 	if (monitor_started)
@@ -197,6 +298,14 @@ rte_dev_event_monitor_start(void)
 		return -1;
 	}
 
+	/* register sigbus handler */
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGBUS);
+	action.sa_flags = SA_SIGINFO;
+	action.sa_mask = mask;
+	action.sa_sigaction = sigbus_handler;
+	sigbus_need_recover = !sigaction(SIGBUS, &action, &sigbus_action_old);
+
 	monitor_started = true;
 
 	return 0;
@@ -217,8 +326,11 @@ rte_dev_event_monitor_stop(void)
 		return ret;
 	}
 
+	sigbus_action_recover();
+
 	close(intr_handle.fd);
 	intr_handle.fd = -1;
 	monitor_started = false;
+
 	return 0;
 }
-- 
2.7.4
^ permalink raw reply	[flat|nested] 13+ messages in thread
end of thread, other threads:[~2018-07-09  8:12 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-09  6:01 [dpdk-dev] [PATCH v6 0/7] hotplug failure handle mechanism Jeff Guo
2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 1/7] bus: add hotplug failure handler Jeff Guo
2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 2/7] bus/pci: implement hotplug failure handler ops Jeff Guo
2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 3/7] bus: add sigbus handler Jeff Guo
2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 4/7] bus/pci: implement sigbus handler operation Jeff Guo
2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 5/7] bus: add helper to handle sigbus Jeff Guo
2018-07-09  6:36   ` Andrew Rybchenko
2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 6/7] eal: add failure handle mechanism for hotplug Jeff Guo
2018-07-09  6:47   ` Andrew Rybchenko
2018-07-09  6:01 ` [dpdk-dev] [PATCH v6 7/7] igb_uio: fix uio release issue when hot unplug Jeff Guo
  -- strict thread matches above, loose matches on Subject: below --
2017-06-29  4:37 [dpdk-dev] [PATCH v3 0/2] add uevent api for hot plug Jeff Guo
2018-07-09  6:51 ` [dpdk-dev] [PATCH v6 0/7] hotplug failure handle mechanism Jeff Guo
2018-07-09  6:51   ` [dpdk-dev] [PATCH v6 6/7] eal: add failure handle mechanism for hotplug Jeff Guo
2018-07-09  7:42     ` Gaëtan Rivet
2018-07-09  8:12       ` Jeff Guo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).