From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id 332AB3237 for ; Tue, 2 Oct 2018 12:08:51 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 02 Oct 2018 03:08:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,331,1534834800"; d="scan'208";a="96715173" Received: from jguo15x-mobl.ccr.corp.intel.com (HELO [10.249.174.2]) ([10.249.174.2]) by orsmga002.jf.intel.com with ESMTP; 02 Oct 2018 03:08:32 -0700 To: Jerin Jacob , Stephen Hemminger Cc: bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, shaopeng.he@intel.com, bernard.iremonger@intel.com, arybchenko@solarflare.com, wenzhuo.lu@intel.com, anatoly.burakov@intel.com, jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, helin.zhang@intel.com References: <1498711073-42917-1-git-send-email-jia.guo@intel.com> <1538307003-11836-1-git-send-email-jia.guo@intel.com> <20181001110012.273b38fc@shemminger-XPS-13-9360> <20181001095513.GA18062@jerin> From: Jeff Guo Message-ID: <8e0a1e79-3328-8dd8-e8b0-e1340bbe7829@intel.com> Date: Tue, 2 Oct 2018 18:08:31 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.0 MIME-Version: 1.0 In-Reply-To: <20181001095513.GA18062@jerin> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Subject: Re: [dpdk-dev] [PATCH v11 0/7] hot-unplug failure handle mechanism X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Oct 2018 10:08:51 -0000 hi, jerin Thanks for your comment and reply as below. On 10/1/2018 5:55 PM, Jerin Jacob wrote: > -----Original Message----- >> Date: Mon, 1 Oct 2018 11:00:12 +0200 >> From: Stephen Hemminger >> To: Jeff Guo >> Cc: bruce.richardson@intel.com, ferruh.yigit@intel.com, >> konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, >> jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, >> matan@mellanox.com, harry.van.haaren@intel.com, qi.z.zhang@intel.com, >> shaopeng.he@intel.com, bernard.iremonger@intel.com, >> arybchenko@solarflare.com, wenzhuo.lu@intel.com, >> anatoly.burakov@intel.com, jblunck@infradead.org, shreyansh.jain@nxp.com, >> dev@dpdk.org, helin.zhang@intel.com >> Subject: Re: [dpdk-dev] [PATCH v11 0/7] hot-unplug failure handle mechanism >> >> >> On Sun, 30 Sep 2018 19:29:56 +0800 >> Jeff Guo wrote: >> >>> Hotplug is an important feature for use-cases like the datacenter device's >>> fail-safe and for SRIOV Live Migration in SDN/NFV. It could bring higher >>> flexibility and continuality to networking services in multiple use-cases >>> in the industry. So let's see how DPDK can help users implement hotplug >>> solutions. >>> >>> We already have a general device-event monitor mechanism, failsafe driver, >>> and hot plug/unplug API in DPDK. We have already got the solution of >>> “ethdev event + kernel PMD hotplug handler + failsafe”, but we still not >>> got “eal event + hotplug handler for pci PMD + failsafe” implement, and we >>> need to considerate 2 different solutions between uio pci and vfio pci. >>> >>> In the case of hotplug for igb_uio, when a hardware device be removed >>> physically or disabled in software, the application needs to be notified >>> and detach the device out of the bus, and then make the device invalidate. >>> The problem is that, the removal of the device is not instantaneous in >>> software. If the application data path tries to read/write to the device >>> when removal is still in process, it will cause an MMIO error and >>> application will crash. >>> >>> In this patch set, we propose a PCIe bus failure handler mechanism for >>> hot-unplug in igb_uio. It aims to guarantee that, when a hot-unplug occurs, >>> the application will not crash. >>> >>> The mechanism should work as below: >>> >>> First, the application enables the device event monitor, registers the >>> hotplug event’s callback and enable hotplug handling before running the >>> data path. Once the hot-unplug occurs, the mechanism will detect the >>> removal event and then accordingly do the failure handling. In order to >>> do that, the below functionality will be required: >>> - Add a new bus ops “hot_unplug_handler” to handle hot-unplug failure. >>> - Implement pci bus specific ops “pci_hot_unplug_handler”. For uio pci, >>> it will be based on the failure address to remap memory for the corresponding >>> device that unplugged. For vfio pci, could seperate implement case by case. >>> >>> For the data path or other unexpected behaviors from the control path >>> when a hot unplug occurs: >>> - Add a new bus ops “sigbus_handler”, that is responsible for handling >>> the sigbus error which is either an original memory error, or a specific >>> memory error that is caused by a hot unplug. When a sigbus error is >>> captured, it will call this function to handle sigbus error. >>> - Implement PCI bus specific ops “pci_sigbus_handler”. It will iterate all >>> device on PCI bus to find which device encounter the failure. >>> - Implement a "rte_bus_sigbus_handler" to iterate all buses to find a bus >>> to handle the failure. >>> - Add a couple of APIs “rte_dev_hotplug_handle_enable” and >>> “rte_dev_hotplug_handle_diable” to enable/disable hotplug handling. >>> It will monitor the sigbus error by a handler which is per-process. >>> Based on the signal event principle, the control path thread and the >>> data path thread will randomly receive the sigbus error, but will call the >>> common sigbus process. When sigbus be captured, it will call the above API >>> to find bus to handle it. >>> >>> The mechanism could be used by app or PMDs. For example, the whole process >>> of hotplug in testpmd is: >>> - Enable device event monitor->Enable hotplug handle->Register event callback >>> ->attach port->start port->start forwarding->Device unplug->failure handle >>> ->stop forwarding->stop port->close port->detach port. >>> >>> This patch set would not cover hotplug insert and binding, and it is only >>> implement the igb_uio failure handler, the vfio hotplug failure handler >>> will be in next coming patch set. >>> >>> patchset history: >>> v11->v10: >>> change the ops name, since both uio and vfio will use the hot-unplug ops. >>> add experimental tag. >>> since we plan to abandon RTE_ETH_EVENT_INTR_RMV, change to use >>> RTE_DEV_EVENT_REMOVE, so modify the hotplug event and callback usage. >>> move the igb_uio fixing part, since it is random issue and should be considarate >>> as kernel driver defect but not include as this failure handler mechanism. >>> >>> v10->v9: >>> modify the api name and exposure out for public use. >>> add hotplug handle enable/disable APIs >>> refine commit log >>> >>> v9->v8: >>> refine commit log to be more readable. >>> >>> v8->v7: >>> refine errno process in sigbus handler. >>> refine igb uio release process >>> >>> v7->v6: >>> delete some unused part >>> >>> v6->v5: >>> refine some description about bus ops >>> refine commit log >>> add some entry check. >>> >>> v5->v4: >>> split patches to focus on the failure handle, remove the event usage >>> by testpmd to another patch. >>> change the hotplug failure handler name. >>> refine the sigbus handle logic. >>> add lock for udev state in igb uio driver. >>> >>> v4->v3: >>> split patches to be small and clear. >>> change to use new parameter "--hotplug-mode" in testpmd to identify >>> the eal hotplug and ethdev hotplug. >>> >>> v3->v2: >>> change bus ops name to bus_hotplug_handler. >>> add new API and bus ops of bus_signal_handler distingush handle generic. >>> sigbus and hotplug sigbus. >>> >>> v2->v1(v21): >>> refine some doc and commit log. >>> fix igb uio kernel issue for control path failure rebase testpmd code. >>> >>> Since the hot plug solution be discussed serval around in the public, >>> the scope be changed and the patch set be split into many times. Coming >>> to the recently RFC and feature design, it just focus on the hot unplug >>> failure handler at this patch set, so in order let this topic more clear >>> and focus, summarize privours patch set in history “v1(v21)”, the v2 here >>> go ahead for further track. >>> >>> "v1(21)" == v21 as below: >>> v21->v20: >>> split function in hot unplug ops. >>> sync failure hanlde to fix multiple process issue fix attach port issue for multiple devices case. >>> combind rmv callback function to be only one. >>> >>> v20->v19: >>> clean the code. >>> refine the remap logic for multiple device. >>> remove the auto binding. >>> >>> v19->18: >>> note for limitation of multiple hotplug, fix some typo, sqeeze patch. >>> >>> v18->v15: >>> add document, add signal bus handler, refine the code to be more clear. >>> >>> the prior patch history please check the patch set "add device event monitor framework". >>> >>> Jeff Guo (7): >>> bus: add hot-unplug handler >>> bus/pci: implement hot-unplug handler ops >>> bus: add sigbus handler >>> bus/pci: implement sigbus handler ops >>> bus: add helper to handle sigbus >>> eal: add failure handle mechanism for hot-unplug >>> testpmd: use hot-unplug failure handle mechanism >>> >>> app/test-pmd/testpmd.c | 39 ++++++-- >>> doc/guides/rel_notes/release_18_08.rst | 5 + >>> drivers/bus/pci/pci_common.c | 81 ++++++++++++++++ >>> drivers/bus/pci/pci_common_uio.c | 33 +++++++ >>> drivers/bus/pci/private.h | 12 +++ >>> lib/librte_eal/bsdapp/eal/eal_dev.c | 14 +++ >>> lib/librte_eal/common/eal_common_bus.c | 43 +++++++++ >>> lib/librte_eal/common/eal_private.h | 39 ++++++++ >>> lib/librte_eal/common/include/rte_bus.h | 34 +++++++ >>> lib/librte_eal/common/include/rte_dev.h | 26 +++++ >>> lib/librte_eal/linuxapp/eal/eal_dev.c | 162 +++++++++++++++++++++++++++++++- >>> lib/librte_eal/rte_eal_version.map | 2 + >>> 12 files changed, 481 insertions(+), 9 deletions(-) >>> >> I am glad to see this, hotplug is needed. But have a somewhat controversial >> point of view. The DPDK project needs to do more to force users to go to >> more modern kernels and API's; there has been too much effort already to >> support new DPDK on older kernels and distributions. This leads to higher >> testing burden, technical debt and multiple API's. >> >> To take the extreme point of view. >> * igb_uio should be deprecated and all new work only use vfio and vfio-ionommu only >> * kni should be deprecated and replaced by virtio > +1 > > I think, The only feature missing in upstream kernel for DPDK may be to > enable SRIOV on PF VFIO devices controlled by DPDK PMD. > I think, Binding a PF device to DPDK along with VFs(VF can be bound to netdev or DPDK > PMDs, Though binding VF to netdev considered as security breach) will be useful for > a) rte_flow actions like redirecting the traffic to PF or VF on the given pattern > b) Some NICs can support promiscuous mode only on PF > c) Enable Switch representation devices > https://doc.dpdk.org/guides/prog_guide/switch_representation.html > > I think, igb_uio mainly used as the backdoor for this use case. > > I think, there was some work in this area but it is not upstreamed due > to various reasons. > https://lkml.org/lkml/2018/3/8/1122 I think you definite raise some meaningful of the hotplug for SRIOV. The igb_uio still is using now, its usage need to highlight to discuss. >> When there are N ways of doing things against X kernel versions, >> and Y distributions, and multiple device vendors; the combinational explosion of cases means >> that interfaces don't get the depth of testing they deserve. >> >> That means why not support hotplug on VFIO only? >>