From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id 8393F56A1 for ; Fri, 30 Mar 2018 05:35:16 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 29 Mar 2018 20:35:15 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.48,378,1517904000"; d="scan'208";a="28176101" Received: from tanjianf-mobl.ccr.corp.intel.com (HELO [10.67.64.55]) ([10.67.64.55]) by fmsmga007.fm.intel.com with ESMTP; 29 Mar 2018 20:35:12 -0700 To: Jeff Guo , stephen@networkplumber.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, konstantin.ananyev@intel.com, gaetan.rivet@6wind.com, jingjing.wu@intel.com, thomas@monjalon.net, motih@mellanox.com, harry.van.haaren@intel.com References: <1515575544-2141-6-git-send-email-jia.guo@intel.com> <1521612693-13378-1-git-send-email-jia.guo@intel.com> Cc: jblunck@infradead.org, shreyansh.jain@nxp.com, dev@dpdk.org, helin.zhang@intel.com From: "Tan, Jianfeng" Message-ID: <93d2a5d1-9bbd-6d95-411d-582c19bc66f4@intel.com> Date: Fri, 30 Mar 2018 11:35:11 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: <1521612693-13378-1-git-send-email-jia.guo@intel.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH V15 1/2] pci_uio: add uevent hotplug failure handler in uio X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Mar 2018 03:35:17 -0000 We shall split this patch to multiple patches. - This helper function is not necessarily exposed to users. Whenever there is a device remove event, before invoking the callbacks, we do the remap. - PCI related code shall be put into librte_pci. - Personally, I don't think we shall add an ops for bus. We don't know if this necessary for all bus types; at least, vdev does not "remap". Even in future, we need to support new bus types, that's just internal changes, which is acceptable. - If there is some issue in igb_uio, fix it in another patch. Thanks, Jianfeng On 3/21/2018 2:11 PM, Jeff Guo wrote: > when detect hot removal uevent of device, the device resource become > invalid, in order to avoid unexpected usage of this resource, remap > the device resource to be a fake memory, that would lead the application > keep running well but not encounter system core dump. > > Signed-off-by: Jeff Guo > --- > v15->v14: > delete find_by_name in bus ops, it is no used. use additional flag and > original pci map resource function to replace of adding a new fixed > memory mapping function. > --- > app/test-pmd/testpmd.c | 4 +++ > drivers/bus/pci/bsd/pci.c | 23 +++++++++++++++++ > drivers/bus/pci/linux/pci.c | 33 +++++++++++++++++++++++++ > drivers/bus/pci/pci_common.c | 21 ++++++++++++++++ > drivers/bus/pci/pci_common_uio.c | 41 +++++++++++++++++++++++++++++++ > drivers/bus/pci/private.h | 12 +++++++++ > drivers/bus/pci/rte_bus_pci.h | 9 +++++++ > drivers/bus/vdev/vdev.c | 7 ++++++ > lib/librte_eal/bsdapp/eal/eal_dev.c | 8 ++++++ > lib/librte_eal/common/eal_common_bus.c | 1 + > lib/librte_eal/common/include/rte_bus.h | 15 +++++++++++ > lib/librte_eal/common/include/rte_dev.h | 18 ++++++++++++++ > lib/librte_eal/linuxapp/eal/eal_dev.c | 34 +++++++++++++++++++++++++ > lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 5 ++++ > 14 files changed, 231 insertions(+) > > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c > index 915532e..1c4afea 100644 > --- a/app/test-pmd/testpmd.c > +++ b/app/test-pmd/testpmd.c > @@ -2079,6 +2079,10 @@ rmv_uevent_callback(void *arg) > if (!in_hotplug_list(rte_eth_devices[port_id].device->name)) > return; > > + /* do failure handler before stop and close the device */ > + rte_dev_failure_handler(rte_eth_devices[port_id].device, > + rte_eth_devices[port_id].data->kdrv); > + > stop_packet_forwarding(); > > stop_port(port_id); > diff --git a/drivers/bus/pci/bsd/pci.c b/drivers/bus/pci/bsd/pci.c > index 655b34b..d7165b9 100644 > --- a/drivers/bus/pci/bsd/pci.c > +++ b/drivers/bus/pci/bsd/pci.c > @@ -97,6 +97,29 @@ rte_pci_unmap_device(struct rte_pci_device *dev) > } > } > > +/* re-map pci device */ > +int > +rte_pci_remap_device(struct rte_pci_device *dev) > +{ > + int ret; > + > + if (dev == NULL) > + return -EINVAL; > + > + switch (dev->kdrv) { > + case RTE_KDRV_NIC_UIO: > + ret = pci_uio_remap_resource(dev); > + break; > + default: > + RTE_LOG(DEBUG, EAL, > + " Not managed by a supported kernel driver, skipped\n"); > + ret = 1; > + break; > + } > + > + return ret; > +} > + > void > pci_uio_free_resource(struct rte_pci_device *dev, > struct mapped_pci_resource *uio_res) > diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c > index abde641..a7dfec7 100644 > --- a/drivers/bus/pci/linux/pci.c > +++ b/drivers/bus/pci/linux/pci.c > @@ -116,6 +116,38 @@ rte_pci_unmap_device(struct rte_pci_device *dev) > } > } > > +/* Map pci device */ > +int > +rte_pci_remap_device(struct rte_pci_device *dev) > +{ > + int ret = -1; > + > + if (dev == NULL) > + return -EINVAL; > + > + switch (dev->kdrv) { > + case RTE_KDRV_VFIO: > +#ifdef VFIO_PRESENT > + /* no thing to do */ > +#endif > + break; > + case RTE_KDRV_IGB_UIO: > + case RTE_KDRV_UIO_GENERIC: > + if (rte_eal_using_phys_addrs()) { > + /* map resources for devices that use uio */ > + ret = pci_uio_remap_resource(dev); > + } > + break; > + default: > + RTE_LOG(DEBUG, EAL, > + " Not managed by a supported kernel driver, skipped\n"); > + ret = 1; > + break; > + } > + > + return ret; > +} > + > void * > pci_find_max_end_va(void) > { > @@ -357,6 +389,7 @@ pci_scan_one(const char *dirname, const struct rte_pci_addr *addr) > rte_pci_add_device(dev); > } > > + dev->device.state = RTE_DEV_PARSED; > return 0; > } > > diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c > index 2a00f36..46921a4 100644 > --- a/drivers/bus/pci/pci_common.c > +++ b/drivers/bus/pci/pci_common.c > @@ -253,6 +253,7 @@ pci_probe_all_drivers(struct rte_pci_device *dev) > if (rc > 0) > /* positive value means driver doesn't support it */ > continue; > + dev->device.state = RTE_DEV_PARSED; > return 0; > } > return 1; > @@ -474,6 +475,25 @@ pci_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, > } > > static int > +pci_remap_device(struct rte_device *dev) > +{ > + struct rte_pci_device *pdev; > + int ret; > + > + if (dev == NULL) > + return -EINVAL; > + > + pdev = RTE_DEV_TO_PCI(dev); > + > + /* remap resources for devices that use igb_uio */ > + ret = rte_pci_remap_device(pdev); > + if (ret != 0) > + RTE_LOG(ERR, EAL, "failed to remap device %s", > + dev->name); > + return ret; > +} > + > +static int > pci_plug(struct rte_device *dev) > { > return pci_probe_all_drivers(RTE_DEV_TO_PCI(dev)); > @@ -503,6 +523,7 @@ struct rte_pci_bus rte_pci_bus = { > .unplug = pci_unplug, > .parse = pci_parse, > .get_iommu_class = rte_pci_get_iommu_class, > + .remap_device = pci_remap_device, > }, > .device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list), > .driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list), > diff --git a/drivers/bus/pci/pci_common_uio.c b/drivers/bus/pci/pci_common_uio.c > index 54bc20b..3a0a2bb 100644 > --- a/drivers/bus/pci/pci_common_uio.c > +++ b/drivers/bus/pci/pci_common_uio.c > @@ -146,6 +146,47 @@ pci_uio_unmap(struct mapped_pci_resource *uio_res) > } > } > > +/* remap the PCI resource of a PCI device in private virtual memory */ > +int > +pci_uio_remap_resource(struct rte_pci_device *dev) > +{ > + int i; > + uint64_t phaddr; > + void *map_address; > + > + if (dev == NULL) > + return -1; > + > + close(dev->intr_handle.fd); > + if (dev->intr_handle.uio_cfg_fd >= 0) { > + close(dev->intr_handle.uio_cfg_fd); > + dev->intr_handle.uio_cfg_fd = -1; > + } > + > + dev->intr_handle.fd = -1; > + dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN; > + > + /* Map all BARs */ > + for (i = 0; i != PCI_MAX_RESOURCE; i++) { > + /* skip empty BAR */ > + phaddr = dev->mem_resource[i].phys_addr; > + if (phaddr == 0) > + continue; > + pci_unmap_resource(dev->mem_resource[i].addr, > + (size_t)dev->mem_resource[i].len); > + map_address = pci_map_resource( > + dev->mem_resource[i].addr, -1, 0, > + (size_t)dev->mem_resource[i].len, > + MAP_ANONYMOUS); > + if (map_address == MAP_FAILED) > + return -1; > + memset(map_address, 0xFF, (size_t)dev->mem_resource[i].len); > + dev->mem_resource[i].addr = map_address; > + } > + > + return 0; > +} > + > static struct mapped_pci_resource * > pci_uio_find_resource(struct rte_pci_device *dev) > { > diff --git a/drivers/bus/pci/private.h b/drivers/bus/pci/private.h > index 88fa587..7a862ef 100644 > --- a/drivers/bus/pci/private.h > +++ b/drivers/bus/pci/private.h > @@ -173,6 +173,18 @@ void pci_uio_free_resource(struct rte_pci_device *dev, > struct mapped_pci_resource *uio_res); > > /** > + * remap the pci uio resource.. > + * > + * @param dev > + * Point to the struct rte pci device. > + * @return > + * - On success, zero. > + * - On failure, a negative value. > + */ > +int > +pci_uio_remap_resource(struct rte_pci_device *dev); > + > +/** > * Map device memory to uio resource > * > * This function is private to EAL. > diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h > index 357afb9..6f9cd8b 100644 > --- a/drivers/bus/pci/rte_bus_pci.h > +++ b/drivers/bus/pci/rte_bus_pci.h > @@ -168,6 +168,15 @@ int rte_pci_map_device(struct rte_pci_device *dev); > void rte_pci_unmap_device(struct rte_pci_device *dev); > > /** > + * Remap this device > + * > + * @param dev > + * A pointer to a rte_pci_device structure describing the device > + * to use > + */ > +int rte_pci_remap_device(struct rte_pci_device *dev); > + > +/** > * Dump the content of the PCI bus. > * > * @param f > diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c > index e4bc724..efc348b 100644 > --- a/drivers/bus/vdev/vdev.c > +++ b/drivers/bus/vdev/vdev.c > @@ -400,6 +400,12 @@ vdev_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, > } > > static int > +vdev_remap_device(struct rte_device *dev) > +{ > + RTE_SET_USED(dev); > + return 0; > +} > +static int > vdev_plug(struct rte_device *dev) > { > return vdev_probe_all_drivers(RTE_DEV_TO_VDEV(dev)); > @@ -418,6 +424,7 @@ static struct rte_bus rte_vdev_bus = { > .plug = vdev_plug, > .unplug = vdev_unplug, > .parse = vdev_parse, > + .remap_device = vdev_remap_device, > }; > > RTE_REGISTER_BUS(vdev, rte_vdev_bus); > diff --git a/lib/librte_eal/bsdapp/eal/eal_dev.c b/lib/librte_eal/bsdapp/eal/eal_dev.c > index 3b7bbf2..a076ec7 100644 > --- a/lib/librte_eal/bsdapp/eal/eal_dev.c > +++ b/lib/librte_eal/bsdapp/eal/eal_dev.c > @@ -31,3 +31,11 @@ rte_dev_event_monitor_stop(void) > RTE_LOG(ERR, EAL, "Not support event monitor for FreeBSD\n"); > return -1; > } > + > +int __rte_experimental > +rte_dev_failure_handler(struct rte_device *dev, > + enum rte_kernel_driver kdrv_type) > +{ > + RTE_LOG(ERR, EAL, "Not support device failure handler for FreeBSD\n"); > + return -1; > +} > diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c > index 3e022d5..5510bbe 100644 > --- a/lib/librte_eal/common/eal_common_bus.c > +++ b/lib/librte_eal/common/eal_common_bus.c > @@ -53,6 +53,7 @@ rte_bus_register(struct rte_bus *bus) > RTE_VERIFY(bus->find_device); > /* Buses supporting driver plug also require unplug. */ > RTE_VERIFY(!bus->plug || bus->unplug); > + RTE_VERIFY(bus->remap_device); > > TAILQ_INSERT_TAIL(&rte_bus_list, bus, next); > RTE_LOG(DEBUG, EAL, "Registered [%s] bus.\n", bus->name); > diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h > index 6fb0834..1f3c09b 100644 > --- a/lib/librte_eal/common/include/rte_bus.h > +++ b/lib/librte_eal/common/include/rte_bus.h > @@ -168,6 +168,20 @@ typedef int (*rte_bus_unplug_t)(struct rte_device *dev); > typedef int (*rte_bus_parse_t)(const char *name, void *addr); > > /** > + * Implementation specific remap function which is responsible for remmaping > + * devices on that bus from original share memory resource to a anonymous > + * memory resource for the sake of device has been removal. > + * > + * @param dev > + * Device pointer that was returned by a previous call to find_device. > + * > + * @return > + * 0 on success. > + * !0 on error. > + */ > +typedef int (*rte_bus_remap_device_t)(struct rte_device *dev); > + > +/** > * Bus scan policies > */ > enum rte_bus_scan_mode { > @@ -209,6 +223,7 @@ struct rte_bus { > rte_bus_plug_t plug; /**< Probe single device for drivers */ > rte_bus_unplug_t unplug; /**< Remove single device from driver */ > rte_bus_parse_t parse; /**< Parse a device name */ > + rte_bus_remap_device_t remap_device; /**< remap a device */ > struct rte_bus_conf conf; /**< Bus configuration */ > rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */ > }; > diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h > index 98ea12b..10a5fcf 100644 > --- a/lib/librte_eal/common/include/rte_dev.h > +++ b/lib/librte_eal/common/include/rte_dev.h > @@ -180,6 +180,7 @@ struct rte_device { > const struct rte_driver *driver;/**< Associated driver */ > int numa_node; /**< NUMA node connection */ > struct rte_devargs *devargs; /**< Device user arguments */ > + enum rte_dev_state state; /**< Device state */ > }; > > /** > @@ -405,4 +406,21 @@ rte_dev_event_monitor_start(void); > */ > int __rte_experimental > rte_dev_event_monitor_stop(void); > + > +/** > + * It can be used to do device failure handler to avoid > + * system core dump when failure occur. > + * > + * @param dev > + * The prointer to device structure. > + * @param kdrv_type > + * The specific kernel driver's type. > + * > + * @return > + * - On success, zero. > + * - On failure, a negative value. > + */ > +int __rte_experimental > +rte_dev_failure_handler(struct rte_device *dev, > + enum rte_kernel_driver kdrv_type); > #endif /* _RTE_DEV_H_ */ > diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c b/lib/librte_eal/linuxapp/eal/eal_dev.c > index 2b34e2c..fa63105 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_dev.c > +++ b/lib/librte_eal/linuxapp/eal/eal_dev.c > @@ -225,3 +225,37 @@ rte_dev_event_monitor_stop(void) > monitor_no_started = true; > return 0; > } > + > +int __rte_experimental > +rte_dev_failure_handler(struct rte_device *dev, > + enum rte_kernel_driver kdrv_type) > +{ > + struct rte_bus *bus = rte_bus_find_by_device_name(dev->name); > + int ret; > + > + switch (kdrv_type) { > + case RTE_KDRV_IGB_UIO: > + if ((!dev) || dev->state == RTE_DEV_UNDEFINED) > + return -1; > + dev->state = RTE_DEV_FAULT; > + /** > + * remap the resource to be fake > + * before user's removal processing > + */ > + ret = bus->remap_device(dev); > + if (ret) { > + RTE_LOG(ERR, EAL, "Driver cannot remap the " > + "device (%s)\n", > + dev->name); > + return -1; > + } > + break; > + case RTE_KDRV_VFIO: > + break; > + case RTE_KDRV_UIO_GENERIC: > + break; > + default: > + break; > + } > + return 0; > +} > diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c > index 4cae4dd..9c50876 100644 > --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c > +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c > @@ -350,6 +350,11 @@ igbuio_pci_release(struct uio_info *info, struct inode *inode) > return 0; > } > > + /* check if device has been remove before release */ > + if ((&dev->dev.kobj)->state_remove_uevent_sent == 1) { > + pr_info("The device has been removed\n"); > + return -1; > + } > /* disable interrupts */ > igbuio_pci_disable_interrupts(udev); >