DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] PCIe Hot Insert/Remove Support
@ 2016-10-24 18:16 Walker, Benjamin
  2016-10-27 16:08 ` Thomas Monjalon
  2016-11-02  9:23 ` Shreyansh Jain
  0 siblings, 2 replies; 3+ messages in thread
From: Walker, Benjamin @ 2016-10-24 18:16 UTC (permalink / raw)
  To: dev

Hi all,

My name is Ben Walker and I'm the technical lead for SPDK (it's like DPDK, but
for storage devices). SPDK relies on DPDK only for the base functionality in the
EAL - memory management, the rings, and the PCI scanning code. A key feature for
storage devices is support for hot insert and remove, so we're currently working
through how best to implement this for a user space driver. While doing this
work, we've run into a few issues with the current DPDK PCI/device/driver
framework that I'd like to discuss with this list. I'm not entirely ramped up on
all of the current activity in this area or what the future plans are, so please
educate me if something is coming that will address our current issues. I'm
working off of the latest commit on the master branch as of today.

Today, there appears to be two lists - one of PCI devices and one of drivers. To
update the list of PCI devices, you call rte_eal_pci_scan(), which scans the PCI
bus. That call does not attempt to load any drivers. One scan is automatically
performed when the eal is first initialized. To add or remove drivers from the
driver list you call rte_eal_driver_register/unregister. To match drivers in the
driver list to devices in the device list, you call rte_eal_pci_probe.

There are a few problems with how the code works for us. First,
rte_eal_pci_scan's algorithm will not correctly detect devices that are in its
internal list but weren't found by the most recent PCI bus scan (i.e. they were
hot removed). DPDK's scan doesn't seem to comprehend hot remove in any way.
Fortunately there is a public API to remove devices from the device list -
rte_eal_pci_detach. That function will automatically unload any drivers
associated with the device and then remove it from the list. There is a similar
call for adding a device to the list - rte_eal_pci_probe_one, which will add a
device to the device list and then automatically match it to drivers. I think if
rte_eal_pci_scan is going to be a public interface (and it is), it needs to
correctly comprehend the removal of PCI devices. Otherwise, make it a private
API that is only called in response to rte_eal_init and only expose the public
probe_one/detach calls for modifying the list of devices. My preference is for
the former, not the latter.

Second, rte_eal_pci_probe will call the driver initialization functions each
time a probe happens, even if the driver has already been successfully loaded.
This tends to crash a lot of the PMDs. It seems to me like rte_eal_pci_probe is
not safe to call more than once during the lifetime of the program, which is a
real challenge when you have multiple users of the PCI framework. For instance,
an application may manage both storage devices using the rte_eal_pci framework
and NICs, and the initialization routine may go something like:

register NIC drivers
rte_eal_probe()
...
register SSD drivers
rte_eal_probe()

This is almost certainly how any real code is going to function because the code
dealing with NICs is unrelated and probably unaware of the code dealing with the
SSDs. It should be fairly trivial to simply not call the probe() callback for a
device if the driver has already been loaded. Is this a reasonable modification
to make?

Thanks,
Ben

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [dpdk-dev] PCIe Hot Insert/Remove Support
  2016-10-24 18:16 [dpdk-dev] PCIe Hot Insert/Remove Support Walker, Benjamin
@ 2016-10-27 16:08 ` Thomas Monjalon
  2016-11-02  9:23 ` Shreyansh Jain
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Monjalon @ 2016-10-27 16:08 UTC (permalink / raw)
  To: Walker, Benjamin; +Cc: dev

Hi Benjamin,

2016-10-24 18:16, Walker, Benjamin:
> Hi all,
> 
> My name is Ben Walker and I'm the technical lead for SPDK (it's like DPDK, but
> for storage devices). SPDK relies on DPDK only for the base functionality in the
> EAL - memory management, the rings, and the PCI scanning code. A key feature for
> storage devices is support for hot insert and remove, so we're currently working
> through how best to implement this for a user space driver. While doing this
> work, we've run into a few issues with the current DPDK PCI/device/driver
> framework that I'd like to discuss with this list. I'm not entirely ramped up on
> all of the current activity in this area or what the future plans are, so please
> educate me if something is coming that will address our current issues. I'm
> working off of the latest commit on the master branch as of today.
> 
> Today, there appears to be two lists - one of PCI devices and one of drivers. To
> update the list of PCI devices, you call rte_eal_pci_scan(), which scans the PCI
> bus. That call does not attempt to load any drivers. One scan is automatically
> performed when the eal is first initialized. To add or remove drivers from the
> driver list you call rte_eal_driver_register/unregister. To match drivers in the
> driver list to devices in the device list, you call rte_eal_pci_probe.
> 
> There are a few problems with how the code works for us. First,
> rte_eal_pci_scan's algorithm will not correctly detect devices that are in its
> internal list but weren't found by the most recent PCI bus scan (i.e. they were
> hot removed). DPDK's scan doesn't seem to comprehend hot remove in any way.
> Fortunately there is a public API to remove devices from the device list -
> rte_eal_pci_detach. That function will automatically unload any drivers
> associated with the device and then remove it from the list. There is a similar
> call for adding a device to the list - rte_eal_pci_probe_one, which will add a
> device to the device list and then automatically match it to drivers. I think if
> rte_eal_pci_scan is going to be a public interface (and it is), it needs to
> correctly comprehend the removal of PCI devices. Otherwise, make it a private
> API that is only called in response to rte_eal_init and only expose the public
> probe_one/detach calls for modifying the list of devices. My preference is for
> the former, not the latter.
> 
> Second, rte_eal_pci_probe will call the driver initialization functions each
> time a probe happens, even if the driver has already been successfully loaded.
> This tends to crash a lot of the PMDs. It seems to me like rte_eal_pci_probe is
> not safe to call more than once during the lifetime of the program, which is a
> real challenge when you have multiple users of the PCI framework. For instance,
> an application may manage both storage devices using the rte_eal_pci framework
> and NICs, and the initialization routine may go something like:
> 
> register NIC drivers
> rte_eal_probe()
> ...
> register SSD drivers
> rte_eal_probe()
> 
> This is almost certainly how any real code is going to function because the code
> dealing with NICs is unrelated and probably unaware of the code dealing with the
> SSDs. It should be fairly trivial to simply not call the probe() callback for a
> device if the driver has already been loaded. Is this a reasonable modification
> to make?

Yes it seems to be a reasonnable fix.
However we could improve this design a lot.

I'll try to describe the big picture around PCI hotplugging below.

PCI was too much mixed in DPDK code, so we are sorting out what is generic
to every buses and devices, and what is specific to PCI.
Then we want to manage PCI devices as any other device (real or virtual).

>From a generic perspective, we should be able to manage startup and hotplug
with few API functions:
        - scan a bus or all of them (called at startup)
note: scanning for vdev is an argument parsing
        - receive a notification (callback registered by the application)
note: whitelist/blacklist can be handled here at the application level
        - match the device with a driver
        - initialize the device in the matching driver
note: some drivers require binding with a kernel module, and that must
be implemented in EAL

The last missing part for a true hotplug is receiving hardware events
so that we don't need to manually scan anymore.

Hope it will answer your question.
We are waiting for contributions to progress in this direction. Thanks

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [dpdk-dev] PCIe Hot Insert/Remove Support
  2016-10-24 18:16 [dpdk-dev] PCIe Hot Insert/Remove Support Walker, Benjamin
  2016-10-27 16:08 ` Thomas Monjalon
@ 2016-11-02  9:23 ` Shreyansh Jain
  1 sibling, 0 replies; 3+ messages in thread
From: Shreyansh Jain @ 2016-11-02  9:23 UTC (permalink / raw)
  To: Walker, Benjamin; +Cc: dev

Hello Ben,

Apologies for joining this discussion late.

On 10/24/2016 11:46 PM, Walker, Benjamin wrote:
> Hi all,
>
> My name is Ben Walker and I'm the technical lead for SPDK (it's like DPDK, but
> for storage devices). SPDK relies on DPDK only for the base functionality in the
> EAL - memory management, the rings, and the PCI scanning code. A key feature for
> storage devices is support for hot insert and remove, so we're currently working
> through how best to implement this for a user space driver. While doing this
> work, we've run into a few issues with the current DPDK PCI/device/driver
> framework that I'd like to discuss with this list. I'm not entirely ramped up on
> all of the current activity in this area or what the future plans are, so please
> educate me if something is coming that will address our current issues. I'm
> working off of the latest commit on the master branch as of today.

There has been some work recently ([1], [2]) which generalized the overall DPDK EAL framework to not look at devices as PCI only. In that sense, the attach/detach routines were also generalized. But, dynamically removing device during the application initialization is still something which is grey area (from what I understand).
There are some callbacks which the application can register (eal interrupts), but it would entirely depend on drivers ability to handle interrupts generated from device.

A complete restructuring of EAL is still open discussion - which might include this point. There is a patch series floated by Jan (and subsequently managed by me) here [3]. I also initiated a discussion on this lines in DPDK User Summit in Ireland (Slide 18 of [4] might interest you).

[1] http://dpdk.org/ml/archives/dev/2016-January/031390.html
[2] http://dpdk.org/ml/archives/dev/2016-September/047099.html
[3] http://dpdk.org/ml/archives/dev/2016-October/049606.html
[4] https://dpdksummit.com/Archive/pdf/2016Userspace/Day02-Session03-ShreyanshJain-Userspace2016.pdf

>
> Today, there appears to be two lists - one of PCI devices and one of drivers. To
> update the list of PCI devices, you call rte_eal_pci_scan(), which scans the PCI
> bus. That call does not attempt to load any drivers. One scan is automatically
> performed when the eal is first initialized. To add or remove drivers from the
> driver list you call rte_eal_driver_register/unregister. To match drivers in the
> driver list to devices in the device list, you call rte_eal_pci_probe.

I agree with this general flow.

>
> There are a few problems with how the code works for us. First,
> rte_eal_pci_scan's algorithm will not correctly detect devices that are in its
> internal list but weren't found by the most recent PCI bus scan (i.e. they were
> hot removed). DPDK's scan doesn't seem to comprehend hot remove in any way.

Indeed. And this not just limited to failure to scan hot-removed, but also those devices which are either ethernet or crypto but not PCI bus compliant. Essentially, the complete DPDK scan model assumes that PCI compliant devices are available *before* scan is performed and across the scan process.
Hotplugging is limited to applications ability to attach/detach devices.

> Fortunately there is a public API to remove devices from the device list -
> rte_eal_pci_detach. That function will automatically unload any drivers
> associated with the device and then remove it from the list. There is a similar
> call for adding a device to the list - rte_eal_pci_probe_one, which will add a
> device to the device list and then automatically match it to drivers. I think if
> rte_eal_pci_scan is going to be a public interface (and it is), it needs to
> correctly comprehend the removal of PCI devices. Otherwise, make it a private

In some parallel discussions, there have been talk of scan being non-PCI centric - which would make this API hidden beneath another layer of generic scan. Either way, clean handling of a hot-plugging case is indeed desirable. See [4] above.

> API that is only called in response to rte_eal_init and only expose the public
> probe_one/detach calls for modifying the list of devices. My preference is for
> the former, not the latter.

Generic way of DPDK application is to start, rely on DPDK framework to find devices which have already been attached to the PCI drivers (bind script) and then wait for probing to finish  (rte_eal_init) so as to start the I/O (creating pool, queue setup, etc). In the later method suggested by you, this model doesn't bode well. From what I make of it,  applications would then be responsible for attaches thereby making them call extra APIs. In that context, I too prefer the former (handling hotplug).

>
> Second, rte_eal_pci_probe will call the driver initialization functions each
> time a probe happens, even if the driver has already been successfully loaded.
> This tends to crash a lot of the PMDs. It seems to me like rte_eal_pci_probe is
> not safe to call more than once during the lifetime of the program, which is a
> real challenge when you have multiple users of the PCI framework. For instance,
> an application may manage both storage devices using the rte_eal_pci framework
> and NICs, and the initialization routine may go something like:
>
> register NIC drivers
> rte_eal_probe()
> ...
> register SSD drivers
> rte_eal_probe()
>
> This is almost certainly how any real code is going to function because the code
> dealing with NICs is unrelated and probably unaware of the code dealing with the
> SSDs. It should be fairly trivial to simply not call the probe() callback for a
> device if the driver has already been loaded. Is this a reasonable modification
> to make?

I saw your patch on this. It sounds fine to me - indeed initialization is a singleton process and it should remain that way at least for a particular device.

>
> Thanks,
> Ben
>

I am still working on the restructuring. If you would like to collaborate on it or if you have ideas to share, it would be great to chat up on it (ML or offline).
In my opinion, DPDK EAL framework should be agnostic of device's type (PCI, nonPCI) and presence (plugged, unplugged, dynamic hotplug etc).

-
Shreyansh

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-11-02  9:23 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-24 18:16 [dpdk-dev] PCIe Hot Insert/Remove Support Walker, Benjamin
2016-10-27 16:08 ` Thomas Monjalon
2016-11-02  9:23 ` Shreyansh Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).