Date: Mon, 1 Oct 2018 11:00:12 +0200
From: Stephen Hemminger
To: Jeff Guo
Cc: bruce.richardson@intel.com, ferruh.yigit@intel.com,
 konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com,
 thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com,
 harry.van.haaren@intel.com, qi.z.zhang@intel.com, shaopeng.he@intel.com,
 bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com,
 anatoly.burakov@intel.com, jblunck@infradead.org, shreyansh.jain@nxp.com,
 dev@dpdk.org, helin.zhang@intel.com
Message-ID: <20181001110012.273b38fc@shemminger-XPS-13-9360>
In-Reply-To: <1538307003-11836-1-git-send-email-jia.guo@intel.com>
References: <1498711073-42917-1-git-send-email-jia.guo@intel.com>
 <1538307003-11836-1-git-send-email-jia.guo@intel.com>
Subject: Re: [dpdk-dev] [PATCH v11 0/7] hot-unplug failure handle mechanism

On Sun, 30 Sep 2018 19:29:56 +0800
Jeff Guo wrote:

> Hotplug is an important feature for use-cases like the datacenter device's
> fail-safe and for SRIOV Live Migration in SDN/NFV. It could bring higher
> flexibility and continuity to networking services in multiple use-cases
> in the industry. So let's see how DPDK can help users implement hotplug
> solutions.
>
> We already have a general device-event monitor mechanism, failsafe driver,
> and hot plug/unplug API in DPDK.
> We have already got the solution of
> “ethdev event + kernel PMD hotplug handler + failsafe”, but we have not yet
> got an “eal event + hotplug handler for pci PMD + failsafe” implementation,
> and we need to consider 2 different solutions for uio pci and vfio pci.
>
> In the case of hotplug for igb_uio, when a hardware device is removed
> physically or disabled in software, the application needs to be notified
> and detach the device from the bus, and then invalidate the device.
> The problem is that the removal of the device is not instantaneous in
> software. If the application data path tries to read/write to the device
> while removal is still in progress, it will cause an MMIO error and the
> application will crash.
>
> In this patch set, we propose a PCIe bus failure handler mechanism for
> hot-unplug in igb_uio. It aims to guarantee that, when a hot-unplug occurs,
> the application will not crash.
>
> The mechanism should work as below:
>
> First, the application enables the device event monitor, registers the
> hotplug event’s callback and enables hotplug handling before running the
> data path. Once the hot-unplug occurs, the mechanism will detect the
> removal event and then accordingly do the failure handling. In order to
> do that, the below functionality will be required:
>  - Add a new bus ops “hot_unplug_handler” to handle hot-unplug failure.
>  - Implement pci bus specific ops “pci_hot_unplug_handler”. For uio pci,
>    it will remap memory, based on the failure address, for the corresponding
>    device that was unplugged. For vfio pci, this could be implemented
>    separately, case by case.
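
A note on the uio remap step in the list above: the idea can be tried
without any DPDK code. What follows is only a minimal sketch under stated
assumptions, not the code from this patch set. A file-backed mapping stands
in for a device BAR, truncating the file stands in for the unplug, and
struct fake_region, remap_region(), on_sigbus() and demo_fault_and_recover()
are hypothetical names made up for illustration. It assumes Linux, where
calling mmap() from a signal handler works in practice even though POSIX
does not list it as async-signal-safe.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for one mapped device resource (e.g. a BAR). */
struct fake_region {
	void *addr;
	size_t len;
};

static struct fake_region region;         /* the only "device" in this demo */
static volatile sig_atomic_t remap_count; /* number of faults repaired */

/* Replace the dead mapping with anonymous memory at the same address. */
static int remap_region(const struct fake_region *r)
{
	void *p = mmap(r->addr, r->len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

	return p == MAP_FAILED ? -1 : 0;
}

/* SIGBUS handler: repair the fault only if it lies inside our region. */
static void on_sigbus(int sig, siginfo_t *info, void *ucontext)
{
	uintptr_t fault = (uintptr_t)info->si_addr;
	uintptr_t base = (uintptr_t)region.addr;

	(void)sig;
	(void)ucontext;
	if (fault >= base && fault - base < region.len &&
	    remap_region(&region) == 0) {
		remap_count++;
		return; /* the faulting access is retried on the new pages */
	}
	_exit(EXIT_FAILURE); /* not ours: this sketch simply gives up */
}

/* Returns 0 if a read after the simulated unplug survived the SIGBUS. */
int demo_fault_and_recover(void)
{
	char path[] = "/tmp/hotunplug-demo-XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old;
	volatile uint32_t *reg;
	uint32_t value;
	int fd;

	assert(page > 0);
	remap_count = 0;
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(fd, page) != 0)
		return -1;
	region.len = (size_t)page;
	region.addr = mmap(NULL, region.len, PROT_READ | PROT_WRITE,
			   MAP_SHARED, fd, 0);
	if (region.addr == MAP_FAILED)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigbus;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, &old) != 0)
		return -1;

	/* "Unplug": shrink the file so the mapped page loses its backing. */
	if (ftruncate(fd, 0) != 0)
		return -1;

	reg = region.addr;
	value = reg[0]; /* raises SIGBUS; the handler remaps; read is retried */

	sigaction(SIGBUS, &old, NULL);
	munmap(region.addr, region.len);
	close(fd);
	return (remap_count == 1 && value == 0) ? 0 : -1;
}
```

Calling demo_fault_and_recover() from main() returns 0 when the read was
retried against the replacement pages. A real data path would then see
dummy memory instead of the device until the control path has stopped and
detached the port, which is the window the cover letter is concerned with.
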
>
> For the data path or other unexpected behaviors from the control path
> when a hot unplug occurs:
>  - Add a new bus ops “sigbus_handler”, that is responsible for handling
>    the sigbus error which is either an original memory error, or a specific
>    memory error that is caused by a hot unplug. When a sigbus error is
>    captured, it will call this function to handle the sigbus error.
>  - Implement PCI bus specific ops “pci_sigbus_handler”. It will iterate all
>    devices on the PCI bus to find which device encountered the failure.
>  - Implement a "rte_bus_sigbus_handler" to iterate all buses to find a bus
>    to handle the failure.
>  - Add a couple of APIs “rte_dev_hotplug_handle_enable” and
>    “rte_dev_hotplug_handle_disable” to enable/disable hotplug handling.
>    It will monitor the sigbus error by a handler which is per-process.
>    Based on the signal event principle, the control path thread and the
>    data path thread will randomly receive the sigbus error, but will call the
>    common sigbus process. When a sigbus is captured, it will call the above
>    API to find a bus to handle it.
>
> The mechanism could be used by app or PMDs. For example, the whole process
> of hotplug in testpmd is:
>  - Enable device event monitor->Enable hotplug handle->Register event callback
>    ->attach port->start port->start forwarding->Device unplug->failure handle
>    ->stop forwarding->stop port->close port->detach port.
>
> This patch set does not cover hotplug insert and binding, and it only
> implements the igb_uio failure handler; the vfio hotplug failure handler
> will be in the next patch set.
>
> patchset history:
> v11->v10:
> change the ops name, since both uio and vfio will use the hot-unplug ops.
> add experimental tag.
> since we plan to abandon RTE_ETH_EVENT_INTR_RMV, change to use
> RTE_DEV_EVENT_REMOVE, so modify the hotplug event and callback usage.
> move the igb_uio fixing part, since it is a random issue and should be
> considered a kernel driver defect, not part of this failure handler mechanism.
>
> v10->v9:
> modify the api name and expose it for public use.
> add hotplug handle enable/disable APIs
> refine commit log
>
> v9->v8:
> refine commit log to be more readable.
>
> v8->v7:
> refine errno process in sigbus handler.
> refine igb uio release process
>
> v7->v6:
> delete some unused part
>
> v6->v5:
> refine some description about bus ops
> refine commit log
> add some entry check.
>
> v5->v4:
> split patches to focus on the failure handle, remove the event usage
> by testpmd to another patch.
> change the hotplug failure handler name.
> refine the sigbus handle logic.
> add lock for udev state in igb uio driver.
>
> v4->v3:
> split patches to be small and clear.
> change to use new parameter "--hotplug-mode" in testpmd to identify
> the eal hotplug and ethdev hotplug.
>
> v3->v2:
> change bus ops name to bus_hotplug_handler.
> add new API and bus ops of bus_signal_handler to distinguish handling of
> generic sigbus and hotplug sigbus.
>
> v2->v1(v21):
> refine some doc and commit log.
> fix igb uio kernel issue for control path failure, rebase testpmd code.
>
> Since the hot plug solution has been discussed several times in public,
> the scope has changed and the patch set has been split many times. Following
> the recent RFC and feature design, this patch set focuses only on the hot
> unplug failure handler. To keep the topic clear and focused, the previous
> patch set history is summarized as “v1(v21)”, and the v2 here goes ahead
> for further tracking.
>
> "v1(21)" == v21 as below:
> v21->v20:
> split function in hot unplug ops.
> sync failure handle to fix multiple process issue, fix attach port issue for multiple devices case.
> combine rmv callback function to be only one.
>
> v20->v19:
> clean the code.
> refine the remap logic for multiple devices.
> remove the auto binding.
>
> v19->18:
> note for limitation of multiple hotplug, fix some typos, squeeze patch.
>
> v18->v15:
> add document, add signal bus handler, refine the code to be more clear.
>
> for the prior patch history please check the patch set "add device event monitor framework".
>
> Jeff Guo (7):
>   bus: add hot-unplug handler
>   bus/pci: implement hot-unplug handler ops
>   bus: add sigbus handler
>   bus/pci: implement sigbus handler ops
>   bus: add helper to handle sigbus
>   eal: add failure handle mechanism for hot-unplug
>   testpmd: use hot-unplug failure handle mechanism
>
>  app/test-pmd/testpmd.c                  |  39 ++++++--
>  doc/guides/rel_notes/release_18_08.rst  |   5 +
>  drivers/bus/pci/pci_common.c            |  81 ++++++++++++++++
>  drivers/bus/pci/pci_common_uio.c        |  33 +++++++
>  drivers/bus/pci/private.h               |  12 +++
>  lib/librte_eal/bsdapp/eal/eal_dev.c     |  14 +++
>  lib/librte_eal/common/eal_common_bus.c  |  43 +++++++++
>  lib/librte_eal/common/eal_private.h     |  39 ++++++++
>  lib/librte_eal/common/include/rte_bus.h |  34 +++++++
>  lib/librte_eal/common/include/rte_dev.h |  26 +++++
>  lib/librte_eal/linuxapp/eal/eal_dev.c   | 162 +++++++++++++++++++++++++++++++-
>  lib/librte_eal/rte_eal_version.map      |   2 +
>  12 files changed, 481 insertions(+), 9 deletions(-)
>

I am glad to see this; hotplug is needed.

But I have a somewhat controversial point of view. The DPDK project needs to do
more to force users to go to more modern kernels and APIs; there has been
too much effort already to support new DPDK on older kernels and distributions.
This leads to higher testing burden, technical debt and multiple APIs.

To take the extreme point of view:
 * igb_uio should be deprecated and all new work should use only vfio and vfio-noiommu
 * kni should be deprecated and replaced by virtio

When there are N ways of doing things against X kernel versions,
and Y distributions, and multiple device vendors, the combinatorial explosion
of cases means that interfaces don't get the depth of testing they deserve.

So why not support hotplug on VFIO only?
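
For reference, the bus iteration described in the quoted cover letter (a
per-bus "sigbus_handler" ops plus a helper that walks all buses) can be
sketched roughly as below. This is a simplified illustration and not the
code from the patch set: the sk_* types and functions are hypothetical
stand-ins for the real bus and device structures, and the return convention
(0 handled, 1 address not owned by any device, -1 error) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device: one memory window per device. */
struct sk_dev {
	const char *name;
	uintptr_t base;
	size_t len;
	int remapped; /* set when the hot-unplug handling ran */
};

/* Hypothetical, simplified bus with an optional sigbus_handler ops. */
struct sk_bus {
	const char *name;
	struct sk_dev *devs;
	size_t ndevs;
	/* 0 = handled, 1 = address not on this bus, -1 = error */
	int (*sigbus_handler)(struct sk_bus *bus, uintptr_t fault);
};

/* Bus-specific ops: find the device whose window contains the fault. */
static int sk_pci_sigbus_handler(struct sk_bus *bus, uintptr_t fault)
{
	for (size_t i = 0; i < bus->ndevs; i++) {
		struct sk_dev *d = &bus->devs[i];

		if (fault >= d->base && fault - d->base < d->len) {
			d->remapped = 1; /* stand-in for the remap step */
			return 0;
		}
	}
	return 1;
}

/* Helper: walk all buses until one handles the fault or reports an error. */
static int sk_bus_sigbus_handler(struct sk_bus *buses, size_t nbuses,
				 uintptr_t fault)
{
	for (size_t i = 0; i < nbuses; i++) {
		int ret;

		if (buses[i].sigbus_handler == NULL)
			continue; /* this bus has no hot-unplug support */
		ret = buses[i].sigbus_handler(&buses[i], fault);
		if (ret <= 0)
			return ret;
	}
	return 1; /* generic SIGBUS: caller falls back to default handling */
}

/* Returns 0 if dispatch picks the right device and rejects a stray fault. */
int sk_demo(void)
{
	struct sk_dev pci_devs[] = {
		{ "0000:01:00.0", 0x1000, 0x1000, 0 },
		{ "0000:02:00.0", 0x4000, 0x2000, 0 },
	};
	struct sk_bus buses[] = {
		{ "vdev", NULL, 0, NULL },
		{ "pci", pci_devs, 2, sk_pci_sigbus_handler },
	};

	if (sk_bus_sigbus_handler(buses, 2, 0x4800) != 0)
		return -1;
	if (pci_devs[0].remapped != 0 || pci_devs[1].remapped != 1)
		return -1;
	if (sk_bus_sigbus_handler(buses, 2, 0x9000) != 1)
		return -1;
	return 0;
}
```

Every bus type that wants hot-unplug handling has to supply its own ops in
a scheme like this, and each kernel driver behind a bus (uio, vfio) may
need its own recovery step, which is the kind of multiplication of cases
raised above.
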