From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jia.guo@intel.com>
Received: from mga06.intel.com (mga06.intel.com [134.134.136.31])
 by dpdk.org (Postfix) with ESMTP id 378455F38
 for <dev@dpdk.org>; Thu,  5 Jul 2018 10:23:55 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga003.jf.intel.com ([10.7.209.27])
 by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 05 Jul 2018 01:23:54 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.51,311,1526367600"; d="scan'208";a="64577753"
Received: from jeffguo-z170x-ud5.sh.intel.com (HELO localhost.localdomain)
 ([10.67.104.10])
 by orsmga003.jf.intel.com with ESMTP; 05 Jul 2018 01:23:50 -0700
From: Jeff Guo <jia.guo@intel.com>
To: stephen@networkplumber.org, bruce.richardson@intel.com,
 ferruh.yigit@intel.com, konstantin.ananyev@intel.com,
 gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net,
 motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com,
 qi.z.zhang@intel.com, shaopeng.he@intel.com, bernard.iremonger@intel.com
Cc: jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org,
 jia.guo@intel.com, helin.zhang@intel.com
Date: Thu,  5 Jul 2018 16:21:42 +0800
Message-Id: <1530778903-31160-1-git-send-email-jia.guo@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1498711073-42917-1-git-send-email-jia.guo@intel.com>
References: <1498711073-42917-1-git-send-email-jia.guo@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: [dpdk-dev] [PATCH V5 0/7] hot plug failure handle mechanism
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 05 Jul 2018 08:23:55 -0000

As we know, hot plug is an importance feature, either use for the datacenter
device’s fail-safe, or use for SRIOV Live Migration in SDN/NFV. It could bring
the higher flexibility and continuality to the networking services in multiple
use cases in industry. So let we see, dpdk as an importance networking
framework, what can it help to implement hot plug solution for users.

We already have a general device event detect mechanism, failsafe driver,
bonding driver and hot plug/unplug api in framework, app could use these to
develop their hot plug solution.

let’s see the case of hot unplug, it can happen when a hardware device is
be removed physically, or when the software disables it.  App need to call
ether dev API to detach the device, to unplug the device at the bus level and
make access to the device invalid. But the problem is that, the removal of the
device from the software lists is not going to be instantaneous, at this time
if the data(fast) path still read/write the device, it will cause MMIO error
and result of the app crash out.

Seems that we have got fail-safe driver(or app) + RTE_ETH_EVENT_INTR_RMV +
kernel core driver solution to handle it, but still not have failsafe driver
(or app) + RTE_DEV_EVENT_REMOVE + PCIe pmd driver failure handle solution. So
there is an absence in dpdk hot plug solution right now.

Also, we know that kernel only guaranty hot plug on the kernel side, but not for
the user mode side. Firstly we can hardly have a gatekeeper for any MMIO for
multiple PMD driver. Secondly, no more specific 3rd tools such as udev/driverctl
have especially cover these hot plug failure processing. Third, the feasibility
of app’s implement for multiple user mode PMD driver is still a problem. Here,
a general hot plug failure handle mechanism in dpdk framework would be proposed,
it aim to guaranty that, when hot unplug occur, the system will not crash and
app will not be break out, and user space can normally stop and release any
relevant resources, then unplug of the device at the bus level cleanly.

The mechanism should be come across as bellow:

Firstly, app enabled the device event monitor and register the hot plug event’s
callback before running data path. Once the hot unplug behave occur, the
mechanism will detect the removal event and then accordingly do the failure
handle. In order to do that, below functional will be bring in.
 - Add a new bus ops “handle_hot_unplug” to handle bus read/write error, it is
   bus-specific and each kind of bus can implement its own logic.
 - Implement pci bus specific ops “pci_handle_hot_unplug”. It will base on the
   failure address to remap memory for the corresponding device that unplugged.

For the data path or other unexpected control from the control path when hot
unplug occur.
 - Implement a new sigbus handler, it is registered when start device even
   monitoring. The handler is per process. Base on the signal event principle,
   control path thread and data path thread will randomly receive the sigbus
   error, but will go to the common sigbus handler. Once the MMIO sigbus error
   exposure, it will trigger the above hot unplug operation. The sigbus will be
   check if it is cause of the hot unplug or not, if not will info exception as
   the original sigbus handler. If yes, will do memory remapping.

For the control path and the igb uio release:
 - When hot unplug device, the kernel will release the device resource in the
   kernel side, such as the fd sys file will disappear, and the irq will be
   released. At this time, if igb uio driver still try to release this resource,
   it will cause kernel crash.
   On the other hand, something like interrupt disable do not automatically
   process in kernel side. If not handler it, this redundancy and dirty thing
   will affect the interrupt resource be used by other device.
   So the igb_uio driver have to check the hot plug status and corresponding
   process should be taken in igb uio deriver.
   This patch propose to add structure of rte_udev_state into rte_uio_pci_dev
   of igb_uio kernel driver, which will record the state of uio device, such as
   probed/opened/released/removed/unplug. When detect the unexpected removal
   which cause of hot unplug behavior, it will corresponding disable interrupt
   resource, while for the part of releasement which kernel have already handle,
   just skip it to avoid double free or null pointer kernel crash issue.

The mechanism could be use for fail-safe driver and app which want to use hot
plug solution. let testpmd for example:
 - Enable device event monitor->device unplug->failure handle->stop forwarding->
   stop port->close port->detach port.

This process will not breaking the app/fail-safe running, and will not break
other irrelevance device. And app could plug in the device and restart the date
path again by below.
 - Device plug in->bind igb_uio driver ->attached device->start port->
   start forwarding.

patchset history:
v5->v4:
split patches to focus on the failure handle, remove the event usage by testpmd
to another patch.
change the hotplug failure handler name
refine the sigbus handle logic
add lock for udev state in igb uio driver 

v4->v3:
split patches to be small and clear
change to use new parameter "--hotplug-mode" in testpmd
to identify the eal hotplug and ethdev hotplug

v3->v2:
change bus ops name to bus_hotplug_handler.
add new API and bus ops of bus_signal_handler
distingush handle generic sigbus and hotplug sigbus

v2->v1(v21):
refine some doc and commit log
fix igb uio kernel issue for control path failure
rebase testpmd code

Since the hot plug solution be discussed serval around in the public, the
scope be changed and the patch set be split into many times. Coming to the
recently RFC and feature design, it just focus on the hot unplug failure
handler at this patch set, so in order let this topic more clear and focus,
summarize privours patch set in history “v1(v21)”, the v2 here go ahead
for further track.

"v1(21)" == v21 as below:
v21->v20:
split function in hot unplug ops
sync failure hanlde to fix multiple process issue fix attach port issue for multiple devices case.
combind rmv callback function to be only one.

v20->v19:
clean the code
refine the remap logic for multiple device.
remove the auto binding

v19->18:
note for limitation of multiple hotplug,fix some typo, sqeeze patch.

v18->v15:
add document, add signal bus handler, refine the code to be more clear.

the prior patch history please check the patch set "add device event monitor framework"

Jeff Guo (7):
  bus: add hotplug failure handler
  bus/pci: implement hotplug failure handler ops
  bus: add sigbus handler
  bus/pci: implement sigbus handler operation
  bus: add helper to handle sigbus
  eal: add failure handle mechanism for hotplug
  igb_uio: fix uio release issue when hot unplug

 drivers/bus/pci/pci_common.c            |  77 ++++++++++++++++++++++
 drivers/bus/pci/pci_common_uio.c        |  33 ++++++++++
 drivers/bus/pci/private.h               |  12 ++++
 kernel/linux/igb_uio/igb_uio.c          |  51 ++++++++++++++-
 lib/librte_eal/common/eal_common_bus.c  |  36 ++++++++++-
 lib/librte_eal/common/eal_private.h     |  12 ++++
 lib/librte_eal/common/include/rte_bus.h |  31 +++++++++
 lib/librte_eal/linuxapp/eal/eal_dev.c   | 111 +++++++++++++++++++++++++++++++-
 8 files changed, 358 insertions(+), 5 deletions(-)

-- 
2.7.4