* [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices @ 2021-08-27 17:20 Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 01/13] raw/ioat: only build if dmadev not present Kevin Laatz ` (21 more replies) 0 siblings, 22 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz This patchset adds a dmadev driver and associated documentation to support Intel Data Streaming Accelerator devices. This driver is intended to ultimately replace the current IDXD part of the IOAT rawdev driver. NOTE: This patchset has several dependencies: - v16 of the dmadev set [1] - rfc of the dmadev test suite [2] [1] http://patches.dpdk.org/project/dpdk/list/?series=18391 [2] http://patches.dpdk.org/project/dpdk/list/?series=18477 Bruce Richardson (1): raw/ioat: only build if dmadev not present Conor Walsh (1): dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz (11): doc: initial commit for dmadevs section dma/idxd: add skeleton for VFIO based DSA device dma/idxd: add bus device probing dma/idxd: create dmadev instances on bus probe dma/idxd: create dmadev instances on pci probe dma/idxd: add datapath structures dma/idxd: add configure and info_get functions dma/idxd: add start and stop functions for pci devices dma/idxd: add data-path job submission functions dma/idxd: add data-path job completion functions dma/idxd: add operation statistic tracking MAINTAINERS | 10 + doc/guides/dmadevs/idxd.rst | 255 +++++++++++ doc/guides/dmadevs/index.rst | 14 + doc/guides/index.rst | 1 + doc/guides/rel_notes/release_21_11.rst | 5 + drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++ drivers/dma/idxd/idxd_bus.c | 375 ++++++++++++++++ drivers/dma/idxd/idxd_common.c | 571 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 130 ++++++ drivers/dma/idxd/idxd_internal.h | 102 +++++ drivers/dma/idxd/idxd_pci.c | 372 ++++++++++++++++ drivers/dma/idxd/meson.build | 10 + 
drivers/dma/idxd/version.map | 3 + drivers/dma/meson.build | 1 + drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +---- drivers/raw/ioat/meson.build | 23 +- 16 files changed, 1987 insertions(+), 120 deletions(-) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 doc/guides/dmadevs/index.rst create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py create mode 100644 drivers/dma/idxd/idxd_bus.c create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_hw_defs.h create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH 01/13] raw/ioat: only build if dmadev not present 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 02/13] doc: initial commit for dmadevs section Kevin Laatz ` (20 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh From: Bruce Richardson <bruce.richardson@intel.com> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not present. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/raw/ioat/meson.build | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build index 0e81cb5951..7bd9ac912b 100644 --- a/drivers/raw/ioat/meson.build +++ b/drivers/raw/ioat/meson.build @@ -2,14 +2,31 @@ # Copyright 2019 Intel Corporation build = dpdk_conf.has('RTE_ARCH_X86') +# only use ioat rawdev driver if we don't have the equivalent dmadev ones +if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') + build = false + subdir_done() +endif + reason = 'only supported on x86' sources = files( - 'idxd_bus.c', - 'idxd_pci.c', 'ioat_common.c', - 'ioat_rawdev.c', 'ioat_rawdev_test.c', ) + +if not dpdk_conf.has('RTE_DMA_IDXD') + sources += files( + 'idxd_bus.c', + 'idxd_pci.c', + ) +endif + +if not dpdk_conf.has('RTE_DMA_IOAT') + sources += files ( + 'ioat_rawdev.c', + ) +endif + deps += ['bus_pci', 'mbuf', 'rawdev'] headers = files( 'rte_ioat_rawdev.h', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH 02/13] doc: initial commit for dmadevs section 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 01/13] raw/ioat: only build if dmadev not present Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 03/13] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz ` (19 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add a new section to the programmer's guide for dmadev devices. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/index.rst | 14 ++++++++++++++ doc/guides/index.rst | 1 + 2 files changed, 15 insertions(+) create mode 100644 doc/guides/dmadevs/index.rst diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst new file mode 100644 index 0000000000..b30004fd65 --- /dev/null +++ b/doc/guides/dmadevs/index.rst @@ -0,0 +1,14 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +DMA Device Drivers +================== + +The following is a list of DMA device PMDs, which can be used from an +application through the DMAdev API. + +.. toctree:: + :maxdepth: 2 + :numbered: + + idxd diff --git a/doc/guides/index.rst b/doc/guides/index.rst index 857f0363d3..ccb71640dd 100644 --- a/doc/guides/index.rst +++ b/doc/guides/index.rst @@ -19,6 +19,7 @@ DPDK documentation bbdevs/index cryptodevs/index compressdevs/index + dmadevs/index vdpadevs/index regexdevs/index eventdevs/index -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH 03/13] dma/idxd: add skeleton for VFIO based DSA device 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 01/13] raw/ioat: only build if dmadev not present Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 02/13] doc: initial commit for dmadevs section Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 04/13] dma/idxd: add bus device probing Kevin Laatz ` (18 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probe/remove skeleton code for DSA device bound to the vfio pci driver. Relevant documentation and MAINTAINERS update also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- MAINTAINERS | 10 +++++ doc/guides/dmadevs/idxd.rst | 58 ++++++++++++++++++++++++++ doc/guides/rel_notes/release_21_11.rst | 5 +++ drivers/dma/idxd/idxd_internal.h | 27 ++++++++++++ drivers/dma/idxd/idxd_pci.c | 55 ++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 7 ++++ drivers/dma/idxd/version.map | 3 ++ drivers/dma/meson.build | 1 + 8 files changed, 166 insertions(+) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 1661428a02..3d275c6bb4 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1199,6 +1199,16 @@ F: doc/guides/compressdevs/zlib.rst F: doc/guides/compressdevs/features/zlib.ini +DMAdev Drivers +-------------- + +Intel IDXD - EXPERIMENTAL +M: Bruce Richardson <bruce.richardson@intel.com> +M: Kevin Laatz <kevin.laatz@intel.com> +F: drivers/dma/idxd/ +F: 
doc/guides/dmadevs/idxd.rst + + RegEx Drivers ------------- diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst new file mode 100644 index 0000000000..924700d17e --- /dev/null +++ b/doc/guides/dmadevs/idxd.rst @@ -0,0 +1,58 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +.. include:: <isonum.txt> + +IDXD DMA Device Driver +====================== + +The ``idxd`` dmadev driver provides a poll-mode driver (PMD) for Intel\ |reg| +Data Streaming Accelerator `(Intel DSA) +<https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html>`_. +This PMD can be used in conjunction with Intel\ |reg| DSA devices to offload +data operations, such as data copies, to hardware, freeing up CPU cycles for +other tasks. + +Hardware Requirements +---------------------- + +The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the +presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma`` +will show all the DMA devices on the system, including IDXD supported devices. +Intel\ |reg| DSA devices currently (at time of writing) appear +as devices with type “0b25”, due to the absence of pci-id database entries for +them at this point. + +Compilation +------------ + +For builds using ``meson`` and ``ninja``, the driver will be built when the +target platform is x86-based. No additional compilation steps are necessary. + +Device Setup +------------- + +Devices using VFIO/UIO drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The HW devices to be used will need to be bound to a user-space IO driver for use. +The ``dpdk-devbind.py`` script can be used to view the state of the devices +and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``. 
+For example:: + + $ dpdk-devbind.py -b vfio-pci 6a:01.0 + +Device Probing and Initialization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will +be found as part of the device scan done at application initialization time without +the need to pass parameters to the application. + +For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the +maximum number of workqueues available on it, partitioning all resources equally +among the queues. +If fewer workqueues are required, then the ``max_queues`` parameter may be passed to +the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: + + $ dpdk-test -a <b:d:f>,max_queues=4 diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index 0d3c38f479..9235ee1847 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -61,6 +61,11 @@ New Features provisioning of hardware and software DMA poll mode drivers, defining generic APIs which support a number of different DMA operations. +* **Added IDXD dmadev driver implementation.** + + The IDXD dmadev driver provides a device driver for Intel DSA devices. + This device driver can be used through the generic dmadev API. + Removed Items ------------- diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h new file mode 100644 index 0000000000..c6a7dcd72f --- /dev/null +++ b/drivers/dma/idxd/idxd_internal.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_INTERNAL_H_ +#define _IDXD_INTERNAL_H_ + +/** + * @file idxd_internal.h + * + * Internal data structures for the idxd/DSA driver for dmadev + * + * @warning + * @b EXPERIMENTAL: these structures and APIs may change without prior notice + */ + +extern int idxd_pmd_logtype; + +#define IDXD_PMD_LOG(level, fmt, args...) 
rte_log(RTE_LOG_ ## level, \ + idxd_pmd_logtype, "IDXD: %s(): " fmt "\n", __func__, ##args) + +#define IDXD_PMD_DEBUG(fmt, args...) IDXD_PMD_LOG(DEBUG, fmt, ## args) +#define IDXD_PMD_INFO(fmt, args...) IDXD_PMD_LOG(INFO, fmt, ## args) +#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) +#define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) + +#endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c new file mode 100644 index 0000000000..79e4aadcab --- /dev/null +++ b/drivers/dma/idxd/idxd_pci.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <rte_bus_pci.h> + +#include "idxd_internal.h" + +#define IDXD_VENDOR_ID 0x8086 +#define IDXD_DEVICE_ID_SPR 0x0B25 + +#define IDXD_PMD_DMADEV_NAME_PCI dmadev_idxd_pci + +const struct rte_pci_id pci_id_idxd_map[] = { + { RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static int +idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + int ret = 0; + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + dev->device.driver = &drv->driver; + + return ret; +} + +static int +idxd_dmadev_remove_pci(struct rte_pci_device *dev) +{ + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + + IDXD_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + + return 0; +} + +struct rte_pci_driver idxd_pmd_drv_pci = { + .id_table = pci_id_idxd_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = idxd_dmadev_probe_pci, + .remove = idxd_dmadev_remove_pci, +}; + +RTE_PMD_REGISTER_PCI(IDXD_PMD_DMADEV_NAME_PCI, idxd_pmd_drv_pci); +RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_DMADEV_NAME_PCI, pci_id_idxd_map); +RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_DMADEV_NAME_PCI, "vfio-pci"); 
+RTE_PMD_REGISTER_PARAM_STRING(dmadev_idxd_pci, "max_queues=0"); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build new file mode 100644 index 0000000000..24c0f3106e --- /dev/null +++ b/drivers/dma/idxd/meson.build @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2021 Intel Corporation + +deps += ['bus_pci'] +sources = files( + 'idxd_pci.c' +) \ No newline at end of file diff --git a/drivers/dma/idxd/version.map b/drivers/dma/idxd/version.map new file mode 100644 index 0000000000..4a76d1d52d --- /dev/null +++ b/drivers/dma/idxd/version.map @@ -0,0 +1,3 @@ +DPDK_21 { + local: *; +}; diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build index 0c2c34cd00..0b01b6a8ab 100644 --- a/drivers/dma/meson.build +++ b/drivers/dma/meson.build @@ -6,6 +6,7 @@ if is_windows endif drivers = [ + 'idxd', 'skeleton', ] std_deps = ['dmadev'] -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
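Note on the ID table in idxd_pci.c: pci_id_idxd_map ends with an entry whose vendor_id is 0, so code that walks the table needs no separate length. The sketch below shows that matching pattern in standalone C. It is only an illustration: struct demo_pci_id, demo_id_map and demo_id_match() are hypothetical stand-ins rather than DPDK symbols, and in DPDK the matching is done by the PCI bus code, not by the driver itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_pci_id; only the two fields used here. */
struct demo_pci_id {
	uint16_t vendor_id;
	uint16_t device_id;
};

/* Same values as IDXD_VENDOR_ID and IDXD_DEVICE_ID_SPR in the patch. */
static const struct demo_pci_id demo_id_map[] = {
	{ .vendor_id = 0x8086, .device_id = 0x0B25 },
	{ .vendor_id = 0 }, /* sentinel: terminates the table */
};

/* Return 1 if (vendor, device) appears before the sentinel, 0 otherwise. */
static int
demo_id_match(const struct demo_pci_id *map, uint16_t vendor, uint16_t device)
{
	for (; map->vendor_id != 0; map++) {
		if (map->vendor_id == vendor && map->device_id == device)
			return 1;
	}
	return 0;
}
```

With this table, demo_id_match(demo_id_map, 0x8086, 0x0B25) returns 1 and any other vendor/device pair returns 0.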
* [dpdk-dev] [PATCH 04/13] dma/idxd: add bus device probing 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (2 preceding siblings ...) 2021-08-27 17:20 ` [dpdk-dev] [PATCH 03/13] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 05/13] dma/idxd: create dmadev instances on bus probe Kevin Laatz ` (17 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probing for DSA devices bound to the IDXD kernel driver. These devices can be configured via sysfs and made available to DPDK if they are found during bus scan. Relevant documentation is included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 +++++++ drivers/dma/idxd/idxd_bus.c | 351 +++++++++++++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 3 files changed, 416 insertions(+) create mode 100644 drivers/dma/idxd/idxd_bus.c diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 924700d17e..ce33e2857a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -32,6 +32,56 @@ target platform is x86-based. No additional compilation steps are necessary. Device Setup ------------- +Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers, +such as ``vfio-pci``. Both are supported by the IDXD PMD. + +Intel\ |reg| DSA devices using IDXD kernel driver +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured. +The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration. + +.. 
note:: + The device configuration can also be done by directly interacting with the sysfs nodes. + An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py`` + included in the driver source directory. + +There are some mandatory configuration steps before being able to use a device with an application. +The internal engines, which do the copies or other operations, +and the work-queues, which are used by applications to assign work to the device, +need to be assigned to groups, and the various other configuration options, +such as priority or queue depth, need to be set for each queue. + +To assign an engine to a group:: + + $ accel-config config-engine dsa0/engine0.0 --group-id=0 + $ accel-config config-engine dsa0/engine0.1 --group-id=1 + +To assign work queues to groups for passing descriptors to the engines, a similar accel-config command can be used. +However, the work queues also need to be configured depending on the use case. +Some configuration options include: + +* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously. +* priority: WQ priority between 1 and 15. Larger value means higher priority. +* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device. +* type: WQ type (kernel/mdev/user). Determines how the device is presented. +* name: identifier given to the WQ. 
+ +Example configuration for a work queue:: + + $ accel-config config-wq dsa0/wq0.0 --group-id=0 \ + --mode=dedicated --priority=10 --wq-size=8 \ + --type=user --name=dpdk_app1 + +Once the devices have been configured, they need to be enabled:: + + $ accel-config enable-device dsa0 + $ accel-config enable-wq dsa0/wq0.0 + +Check the device configuration:: + + $ accel-config list + Devices using VFIO/UIO drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -56,3 +106,17 @@ If fewer workqueues are required, then the ``max_queues`` parameter may be passe the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: $ dpdk-test -a <b:d:f>,max_queues=4 + +For devices bound to the IDXD kernel driver, +the DPDK IDXD driver will automatically perform a scan for available workqueues +to use. Any workqueues found listed in ``/dev/dsa`` on the system will be checked +in ``/sys``, and any which have ``dpdk_`` prefix in their name will be automatically +probed by the driver to make them available to the application. +Alternatively, to support use by multiple DPDK processes simultaneously, +the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue +name prefix, instead of ``dpdk_``, allowing each DPDK application instance to only +use a subset of configured queues. + +Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, +that is a "DMA device type" inside DPDK, and can be accessed using APIs from the +``rte_dmadev`` library. 
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c new file mode 100644 index 0000000000..c08f0f473b --- /dev/null +++ b/drivers/dma/idxd/idxd_bus.c @@ -0,0 +1,351 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <dirent.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/mman.h> + +#include <rte_bus.h> +#include <rte_log.h> +#include <rte_dmadev_pmd.h> +#include <rte_string_fns.h> + +#include "idxd_internal.h" + +/* default value for DSA paths, but allow override in environment for testing */ +#define DSA_DEV_PATH "/dev/dsa" +#define DSA_SYSFS_PATH "/sys/bus/dsa/devices" + +static unsigned int devcount; + +/** unique identifier for a DSA device/WQ instance */ +struct dsa_wq_addr { + uint16_t device_id; + uint16_t wq_id; +}; + +/** a DSA device instance */ +struct rte_dsa_device { + struct rte_device device; /**< Inherit core device */ + TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */ + + char wq_name[32]; /**< the workqueue name/number e.g. 
wq0.1 */ + struct dsa_wq_addr addr; /**< Identifies the specific WQ */ +}; + +/* forward prototypes */ +struct dsa_bus; +static int dsa_scan(void); +static int dsa_probe(void); +static struct rte_device *dsa_find_device(const struct rte_device *start, + rte_dev_cmp_t cmp, const void *data); +static enum rte_iova_mode dsa_get_iommu_class(void); +static int dsa_addr_parse(const char *name, void *addr); + +/** List of devices */ +TAILQ_HEAD(dsa_device_list, rte_dsa_device); + +/** + * Structure describing the DSA bus + */ +struct dsa_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_driver driver; /**< Driver struct for devices to point to */ + struct dsa_device_list device_list; /**< List of DSA devices */ +}; + +struct dsa_bus dsa_bus = { + .bus = { + .scan = dsa_scan, + .probe = dsa_probe, + .find_device = dsa_find_device, + .get_iommu_class = dsa_get_iommu_class, + .parse = dsa_addr_parse, + }, + .driver = { + .name = "dmadev_idxd" + }, + .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list), +}; + +static inline const char * +dsa_get_dev_path(void) +{ + const char *path = getenv("DSA_DEV_PATH"); + return path ? 
path : DSA_SYSFS_PATH; +} + +static void * +idxd_vdev_mmap_wq(struct rte_dsa_device *dev) +{ + void *addr; + char path[PATH_MAX]; + int fd; + + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); + fd = open(path, O_RDWR); + if (fd < 0) { + IDXD_PMD_ERR("Failed to open device path: %s", path); + return NULL; + } + + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + IDXD_PMD_ERR("Failed to mmap device %s", path); + return NULL; + } + + return addr; +} + +static int +read_wq_string(struct rte_dsa_device *dev, const char *filename, + char *value, size_t valuelen) +{ + char sysfs_node[PATH_MAX]; + int len; + int fd; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + fd = open(sysfs_node, O_RDONLY); + if (fd < 0) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + len = read(fd, value, valuelen - 1); + close(fd); + if (len < 0) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + value[len] = '\0'; + return 0; +} + +static int +read_wq_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +read_device_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s", + 
dsa_get_sysfs_path(), dev->addr.device_id, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +idxd_probe_dsa(struct rte_dsa_device *dev) +{ + struct idxd_dmadev idxd = {0}; + int ret = 0; + + IDXD_PMD_INFO("Probing device %s on numa node %d", + dev->wq_name, dev->device.numa_node); + if (read_wq_int(dev, "size", &ret) < 0) + return -1; + idxd.max_batches = ret; + if (read_wq_int(dev, "max_batch_size", &ret) < 0) + return -1; + idxd.max_batch_size = ret; + idxd.qid = dev->addr.wq_id; + idxd.u.vdev.dsa_id = dev->addr.device_id; + idxd.sva_support = 1; + + idxd.portal = idxd_vdev_mmap_wq(dev); + if (idxd.portal == NULL) { + IDXD_PMD_ERR("WQ mmap failed"); + return -ENOENT; + } + + return 0; +} + +static int +is_for_this_process_use(const char *name) +{ + char *runtime_dir = strdup(rte_eal_get_runtime_dir()); + char *prefix = basename(runtime_dir); + int prefixlen = strlen(prefix); + int retval = 0; + + if (strncmp(name, "dpdk_", 5) == 0) + retval = 1; + if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_') + retval = 1; + + free(runtime_dir); + return retval; +} + +static int +dsa_probe(void) +{ + struct rte_dsa_device *dev; + + TAILQ_FOREACH(dev, &dsa_bus.device_list, next) { + char type[64], name[64]; + + if (read_wq_string(dev, "type", type, sizeof(type)) < 0 || + read_wq_string(dev, "name", name, sizeof(name)) < 0) + continue; + + if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) { + dev->device.driver = &dsa_bus.driver; + idxd_probe_dsa(dev); + continue; + } + IDXD_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name); + } + + return 0; +} + +static int +dsa_scan(void) +{ + const char *path = dsa_get_dev_path(); + 
struct dirent *wq; + DIR *dev_dir; + + dev_dir = opendir(path); + if (dev_dir == NULL) { + if (errno == ENOENT) + return 0; /* no bus, return without error */ + IDXD_PMD_ERR("%s(): opendir '%s' failed: %s", + __func__, path, strerror(errno)); + return -1; + } + + while ((wq = readdir(dev_dir)) != NULL) { + struct rte_dsa_device *dev; + int numa_node = -1; + + if (strncmp(wq->d_name, "wq", 2) != 0) + continue; + if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) { + IDXD_PMD_ERR("%s(): wq name too long: '%s', skipping", + __func__, wq->d_name); + continue; + } + IDXD_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name); + + dev = malloc(sizeof(*dev)); + if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) { + IDXD_PMD_ERR("Error parsing WQ name: %s", wq->d_name); + free(dev); + continue; + } + dev->device.bus = &dsa_bus.bus; + strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name)); + TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next); + devcount++; + + read_device_int(dev, "numa_node", &numa_node); + dev->device.numa_node = numa_node; + dev->device.name = dev->wq_name; + } + + closedir(dev_dir); + return 0; +} + +static struct rte_device * +dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list); + + /* the rte_device struct must be at start of dsa structure */ + RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0); + + if (start != NULL) /* jump to start point if given */ + dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next); + while (dev != NULL) { + if (cmp(&dev->device, data) == 0) + return &dev->device; + dev = TAILQ_NEXT(dev, next); + } + return NULL; +} + +static enum rte_iova_mode +dsa_get_iommu_class(void) +{ + /* if there are no devices, report don't care, otherwise VA mode */ + return devcount > 0 ? 
RTE_IOVA_VA : RTE_IOVA_DC; +} + +static int +dsa_addr_parse(const char *name, void *addr) +{ + struct dsa_wq_addr *wq = addr; + unsigned int device_id, wq_id; + + if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) { + IDXD_PMD_DEBUG("Parsing WQ name failed: %s", name); + return -1; + } + + wq->device_id = device_id; + wq->wq_id = wq_id; + return 0; +} + +RTE_REGISTER_BUS(dsa, dsa_bus.bus); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 24c0f3106e..f1fea000a7 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -3,5 +3,6 @@ deps += ['bus_pci'] sources = files( + 'idxd_bus.c', 'idxd_pci.c' ) \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
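Note on WQ selection in idxd_bus.c: a workqueue is picked up only if its /dev/dsa entry parses as "wq<device>.<queue>" and its sysfs name starts with "dpdk_" or with the process file prefix followed by an underscore. The sketch below restates those two string checks from dsa_addr_parse() and is_for_this_process_use() in standalone C so they can be tried outside DPDK. The demo_* names are hypothetical, and the file prefix is passed in directly here, whereas the patch derives it from basename(rte_eal_get_runtime_dir()).

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical copy of struct dsa_wq_addr from the patch. */
struct demo_wq_addr {
	uint16_t device_id;
	uint16_t wq_id;
};

/* Parse "wq<dev>.<wq>"; returns 0 on success, -1 on malformed input. */
static int
demo_wq_addr_parse(const char *name, struct demo_wq_addr *addr)
{
	unsigned int device_id, wq_id;

	if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2)
		return -1;

	addr->device_id = (uint16_t)device_id;
	addr->wq_id = (uint16_t)wq_id;
	return 0;
}

/* Return 1 if the WQ name starts with "dpdk_" or "<file_prefix>_", else 0. */
static int
demo_wq_name_matches(const char *wq_name, const char *file_prefix)
{
	size_t prefixlen = strlen(file_prefix);

	if (strncmp(wq_name, "dpdk_", 5) == 0)
		return 1;
	/* the '_' check only runs once the first prefixlen chars matched */
	if (strncmp(wq_name, file_prefix, prefixlen) == 0 &&
			wq_name[prefixlen] == '_')
		return 1;
	return 0;
}
```

For instance, a WQ named "app2_wq" matches a process started with --file-prefix=app2, while "app2wq" does not, because the underscore separator is required.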
* [dpdk-dev] [PATCH 05/13] dma/idxd: create dmadev instances on bus probe 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (3 preceding siblings ...) 2021-08-27 17:20 ` [dpdk-dev] [PATCH 04/13] dma/idxd: add bus device probing Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 06/13] dma/idxd: create dmadev instances on pci probe Kevin Laatz ` (16 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the bus scan/probe, create a dmadev instance for each HW queue. Internal structures required for device creation are also added. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_bus.c | 20 ++++++++- drivers/dma/idxd/idxd_common.c | 75 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 40 +++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 4 files changed, 135 insertions(+), 1 deletion(-) create mode 100644 drivers/dma/idxd/idxd_common.c diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index c08f0f473b..0f33500dfc 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -84,6 +84,18 @@ dsa_get_sysfs_path(void) return path ? 
path : DSA_SYSFS_PATH; } +static int +idxd_dev_close(struct rte_dmadev *dev) +{ + struct idxd_dmadev *idxd = dev->data->dev_private; + munmap(idxd->portal, 0x1000); + return 0; +} + +static const struct rte_dmadev_ops idxd_vdev_ops = { + .dev_close = idxd_dev_close, +}; + static void * idxd_vdev_mmap_wq(struct rte_dsa_device *dev) { @@ -205,7 +217,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -1; idxd.max_batch_size = ret; idxd.qid = dev->addr.wq_id; - idxd.u.vdev.dsa_id = dev->addr.device_id; + idxd.u.bus.dsa_id = dev->addr.device_id; idxd.sva_support = 1; idxd.portal = idxd_vdev_mmap_wq(dev); @@ -214,6 +226,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -ENOENT; } + ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_vdev_ops); + if (ret) { + IDXD_PMD_ERR("Failed to create dmadev %s", dev->wq_name); + return ret; + } + return 0; } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c new file mode 100644 index 0000000000..7770b2e264 --- /dev/null +++ b/drivers/dma/idxd/idxd_common.c @@ -0,0 +1,75 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> +#include <rte_common.h> + +#include "idxd_internal.h" + +#define IDXD_PMD_NAME_STR "dmadev_idxd" + +int +idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, + const struct rte_dmadev_ops *ops) +{ + struct idxd_dmadev *idxd; + struct rte_dmadev *dmadev = NULL; + int ret = 0; + + if (!name) { + IDXD_PMD_ERR("Invalid name of the device!"); + ret = -EINVAL; + goto cleanup; + } + + /* Allocate device structure */ + dmadev = rte_dmadev_pmd_allocate(name); + if (dmadev == NULL) { + IDXD_PMD_ERR("Unable to allocate dma device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->dev_ops = ops; + dmadev->device = dev; + + idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); + if (idxd == NULL) { + 
IDXD_PMD_ERR("Unable to allocate memory for device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->data->dev_private = idxd; + dmadev->dev_private = idxd; + *idxd = *base_idxd; /* copy over the main fields already passed in */ + idxd->dmadev = dmadev; + + /* allocate batch index ring and completion ring. + * The +1 is because we can never fully use + * the ring, otherwise read == write means both full and empty. + */ + idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) + + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1), + sizeof(idxd->batch_comp_ring[0])); + if (idxd->batch_comp_ring == NULL) { + IDXD_PMD_ERR("Unable to reserve memory for batch data\n"); + ret = -ENOMEM; + goto cleanup; + } + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1]; + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring); + + return 0; + +cleanup: + if (dmadev) + rte_dmadev_pmd_release(dmadev); + + return ret; +} + +int idxd_pmd_logtype; + +RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index c6a7dcd72f..99ab2df925 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -24,4 +24,44 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) 
IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_dmadev { + /* counters to track the batches */ + unsigned short max_batches; + unsigned short batch_idx_read; + unsigned short batch_idx_write; + + /* track descriptors and handles */ + unsigned short desc_ring_mask; + unsigned short ids_avail; /* handles for ops completed */ + unsigned short ids_returned; /* the read pointer for hdls/desc rings */ + unsigned short batch_start; /* start+size == write pointer for hdls/desc */ + unsigned short batch_size; + + void *portal; /* address to write the batch descriptor */ + + struct idxd_completion *batch_comp_ring; + unsigned short *batch_idx_ring; /* store where each batch ends */ + + struct rte_dmadev_stats stats; + + rte_iova_t batch_iova; /* base address of the batch comp ring */ + rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ + + unsigned short max_batch_size; + + struct rte_dmadev *dmadev; + struct rte_dmadev_vchan_conf qcfg; + uint8_t sva_support; + uint8_t qid; + + union { + struct { + unsigned int dsa_id; + } bus; + } u; +}; + +int idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, const struct rte_dmadev_ops *ops); + #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index f1fea000a7..81150e6f25 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -4,5 +4,6 @@ deps += ['bus_pci'] sources = files( 'idxd_bus.c', + 'idxd_common.c', 'idxd_pci.c' ) \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
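The comment in idxd_dmadev_create() above explains that the batch rings are allocated with max_batches + 1 entries because a ring in which read == write would otherwise mean both "full" and "empty". A minimal standalone sketch of that convention follows. It is not taken from the patch: the struct and function names (batch_ring_sketch, ring_push, ring_pop and so on) are hypothetical, and only the index arithmetic is modelled, not the completion or index storage itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the batch index ring bookkeeping.
 * max_batches entries are usable, max_batches + 1 are allocated. */
struct batch_ring_sketch {
	uint16_t max_batches; /* usable entries */
	uint16_t rd;          /* next entry to consume */
	uint16_t wr;          /* next entry to fill */
};

/* Advance an index over the range 0..max_batches inclusive. */
static uint16_t
ring_next(const struct batch_ring_sketch *r, uint16_t idx)
{
	assert(idx <= r->max_batches);
	return (idx == r->max_batches) ? 0 : (uint16_t)(idx + 1);
}

/* read == write can only mean "empty" because one slot is never used. */
static bool
ring_is_empty(const struct batch_ring_sketch *r)
{
	return r->rd == r->wr;
}

/* The ring is full when advancing the write index would hit the read index. */
static bool
ring_is_full(const struct batch_ring_sketch *r)
{
	return ring_next(r, r->wr) == r->rd;
}

/* Reserve one entry; returns false if no entry is free. */
static bool
ring_push(struct batch_ring_sketch *r)
{
	if (ring_is_full(r))
		return false;
	r->wr = ring_next(r, r->wr);
	return true;
}

/* Release one entry; returns false if the ring is empty. */
static bool
ring_pop(struct batch_ring_sketch *r)
{
	if (ring_is_empty(r))
		return false;
	r->rd = ring_next(r, r->rd);
	return true;
}
```

With max_batches set to 3, three pushes succeed and the fourth fails, even though four slots exist; that unused slot is the cost of keeping the full and empty states distinguishable without a separate counter.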
* [dpdk-dev] [PATCH 06/13] dma/idxd: create dmadev instances on pci probe 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (4 preceding siblings ...) 2021-08-27 17:20 ` [dpdk-dev] [PATCH 05/13] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 07/13] dma/idxd: add datapath structures Kevin Laatz ` (15 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the PCI probe, create a dmadev instance for each HW queue. HW definitions required are also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_hw_defs.h | 71 ++++++++ drivers/dma/idxd/idxd_internal.h | 14 ++ drivers/dma/idxd/idxd_pci.c | 268 ++++++++++++++++++++++++++++++- 3 files changed, 350 insertions(+), 3 deletions(-) create mode 100644 drivers/dma/idxd/idxd_hw_defs.h diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h new file mode 100644 index 0000000000..ea627cba6d --- /dev/null +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_HW_DEFS_H_ +#define _IDXD_HW_DEFS_H_ + +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ + +#define IDXD_CMD_SHIFT 20 +enum rte_idxd_cmds { + idxd_enable_dev = 1, + idxd_disable_dev, + idxd_drain_all, + idxd_abort_all, + idxd_reset_device, + idxd_enable_wq, + idxd_disable_wq, + idxd_drain_wq, + idxd_abort_wq, + idxd_reset_wq, +}; + +/* General bar0 registers */ +struct rte_idxd_bar0 { + uint32_t __rte_cache_aligned version; /* offset 0x00 */ + uint64_t __rte_aligned(0x10) gencap; /* offset 0x10 */ + uint64_t __rte_aligned(0x10) wqcap; /* 
offset 0x20 */ + uint64_t __rte_aligned(0x10) grpcap; /* offset 0x30 */ + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ +}; + +/* workqueue config is provided by array of uint32_t. */ +enum rte_idxd_wqcfg { + wq_size_idx, /* size is in first 32-bit value */ + wq_threshold_idx, /* WQ threshold second 32-bits */ + wq_mode_idx, /* WQ mode and other flags */ + wq_sizes_idx, /* WQ transfer and batch sizes */ + wq_occ_int_idx, /* WQ occupancy interrupt handle */ + wq_occ_limit_idx, /* WQ occupancy limit */ + wq_state_idx, /* WQ state and occupancy state */ +}; + +#define WQ_MODE_SHARED 0 +#define WQ_MODE_DEDICATED 1 +#define WQ_PRIORITY_SHIFT 4 +#define WQ_BATCH_SZ_SHIFT 5 +#define WQ_STATE_SHIFT 30 +#define WQ_STATE_MASK 0x3 + +struct rte_idxd_grpcfg { + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ + uint64_t grpengcfg; /* offset 32 */ + uint32_t grpflags; /* offset 40 */ +}; + +#define GENSTS_DEV_STATE_MASK 0x03 +#define CMDSTATUS_ACTIVE_SHIFT 31 +#define CMDSTATUS_ACTIVE_MASK (1 << 31) +#define CMDSTATUS_ERR_MASK 0xFF + +#endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 99ab2df925..85c400c9ec 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -5,6 +5,8 @@ #ifndef _IDXD_INTERNAL_H_ #define _IDXD_INTERNAL_H_ +#include "idxd_hw_defs.h" + /** * @file idxd_internal.h * @@ -24,6 +26,16 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, 
args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_pci_common { + rte_spinlock_t lk; + + uint8_t wq_cfg_sz; + volatile struct rte_idxd_bar0 *regs; + volatile uint32_t *wq_regs_base; + volatile struct rte_idxd_grpcfg *grp_regs; + volatile void *portals; +}; + struct idxd_dmadev { /* counters to track the batches */ unsigned short max_batches; @@ -58,6 +70,8 @@ struct idxd_dmadev { struct { unsigned int dsa_id; } bus; + + struct idxd_pci_common *pci; } u; }; diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 79e4aadcab..d13bcea4d2 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -3,6 +3,9 @@ */ #include <rte_bus_pci.h> +#include <rte_devargs.h> +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> #include "idxd_internal.h" @@ -16,17 +19,276 @@ const struct rte_pci_id pci_id_idxd_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static inline int +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) +{ + uint8_t err_code; + uint16_t qid = idxd->qid; + int i = 0; + + if (command >= idxd_disable_wq && command <= idxd_reset_wq) + qid = (1 << qid); + rte_spinlock_lock(&idxd->u.pci->lk); + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; + + do { + rte_pause(); + err_code = idxd->u.pci->regs->cmdstatus; + if (++i >= 1000) { + IDXD_PMD_ERR("Timeout waiting for command response from HW"); + rte_spinlock_unlock(&idxd->u.pci->lk); + return err_code; + } + } while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK); + rte_spinlock_unlock(&idxd->u.pci->lk); + + return err_code & CMDSTATUS_ERR_MASK; +} + +static uint32_t * +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) +{ + return RTE_PTR_ADD(pci->wq_regs_base, + (uintptr_t)wq_idx << (5 + pci->wq_cfg_sz)); +} + +static int +idxd_is_wq_enabled(struct idxd_dmadev *idxd) +{ + uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx]; + 
return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; +} + +static const struct rte_dmadev_ops idxd_pci_ops = { + +}; + +/* each portal uses 4 x 4k pages */ +#define IDXD_PORTAL_SIZE (4096 * 4) + +static int +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, + unsigned int max_queues) +{ + struct idxd_pci_common *pci; + uint8_t nb_groups, nb_engines, nb_wqs; + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ + uint16_t wq_size, total_wq_size; + uint8_t lg2_max_batch, lg2_max_copy_size; + unsigned int i, err_code; + + pci = malloc(sizeof(*pci)); + if (pci == NULL) { + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); + goto err; + } + rte_spinlock_init(&pci->lk); + + /* assign the bar registers, and then configure device */ + pci->regs = dev->mem_resource[0].addr; + grp_offset = (uint16_t)pci->regs->offsets[0]; + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); + pci->portals = dev->mem_resource[2].addr; + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; + + /* sanity check device status */ + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { + /* need function-level-reset (FLR) or is enabled */ + IDXD_PMD_ERR("Device status is not disabled, cannot init"); + goto err; + } + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { + /* command in progress */ + IDXD_PMD_ERR("Device has a command in progress, cannot init"); + goto err; + } + + /* read basic info about the hardware for use when configuring */ + nb_groups = (uint8_t)pci->regs->grpcap; + nb_engines = (uint8_t)pci->regs->engcap; + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); + total_wq_size = (uint16_t)pci->regs->wqcap; + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F; + lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F; + + IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u", + nb_groups, nb_engines, 
nb_wqs); + + /* zero out any old config */ + for (i = 0; i < nb_groups; i++) { + pci->grp_regs[i].grpengcfg = 0; + pci->grp_regs[i].grpwqcfg[0] = 0; + } + for (i = 0; i < nb_wqs; i++) + idxd_get_wq_cfg(pci, i)[0] = 0; + + /* limit queues if necessary */ + if (max_queues != 0 && nb_wqs > max_queues) { + nb_wqs = max_queues; + if (nb_engines > max_queues) + nb_engines = max_queues; + if (nb_groups > max_queues) + nb_groups = max_queues; + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); + } + + /* put each engine into a separate group to avoid reordering */ + if (nb_groups > nb_engines) + nb_groups = nb_engines; + if (nb_groups < nb_engines) + nb_engines = nb_groups; + + /* assign engines to groups, round-robin style */ + for (i = 0; i < nb_engines; i++) { + IDXD_PMD_DEBUG("Assigning engine %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); + } + + /* now do the same for queues and give work slots to each queue */ + wq_size = total_wq_size / nb_wqs; + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", + wq_size, lg2_max_batch, lg2_max_copy_size); + for (i = 0; i < nb_wqs; i++) { + /* add work queue "i" to a group */ + IDXD_PMD_DEBUG("Assigning work queue %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); + /* now configure it, in terms of size, max batch, mode */ + idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size; + idxd_get_wq_cfg(pci, i)[wq_mode_idx] = (1 << WQ_PRIORITY_SHIFT) | + WQ_MODE_DEDICATED; + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); + } + + /* dump the group configuration to output */ + for (i = 0; i < nb_groups; i++) { + IDXD_PMD_DEBUG("## Group %d", i); + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); + IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg); + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags); + } + + idxd->u.pci = pci; + 
idxd->max_batches = wq_size; + + /* enable the device itself */ + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); + if (err_code) { + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); + goto err; + } + IDXD_PMD_DEBUG("IDXD Device enabled OK"); + + return nb_wqs; + +err: + free(pci); + return -1; +} + static int idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) { - int ret = 0; + struct idxd_dmadev idxd = {0}; + uint8_t nb_wqs; + int qid, ret = 0; char name[PCI_PRI_STR_SIZE]; + unsigned int max_queues = 0; rte_pci_device_name(&dev->addr, name, sizeof(name)); IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); dev->device.driver = &drv->driver; - return ret; + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { + /* if the number of devargs grows beyond just 1, use rte_kvargs */ + if (sscanf(dev->device.devargs->args, + "max_queues=%u", &max_queues) != 1) { + IDXD_PMD_ERR("Invalid device parameter: '%s'", + dev->device.devargs->args); + return -1; + } + } + + ret = init_pci_device(dev, &idxd, max_queues); + if (ret < 0) { + IDXD_PMD_ERR("Error initializing PCI hardware"); + return ret; + } + nb_wqs = (uint8_t)ret; + + /* set up one device for each queue */ + for (qid = 0; qid < nb_wqs; qid++) { + char qname[32]; + + /* add the queue number to each device name */ + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); + idxd.qid = qid; + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, + qid * IDXD_PORTAL_SIZE); + if (idxd_is_wq_enabled(&idxd)) + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); + ret = idxd_dmadev_create(qname, &dev->device, + &idxd, &idxd_pci_ops); + if (ret != 0) { + IDXD_PMD_ERR("Failed to create dmadev %s", qname); + if (qid == 0) /* if no devices using this, free pci */ + free(idxd.u.pci); + return ret; + } + } + + return 0; +} + +static int +idxd_dmadev_destroy(const char *name) +{ + int ret; + uint8_t err_code; + struct rte_dmadev *rdev; + struct idxd_dmadev 
*idxd; + + if (!name) { + IDXD_PMD_ERR("Invalid device name"); + return -EINVAL; + } + + rdev = rte_dmadev_get_device_by_name(name); + if (!rdev) { + IDXD_PMD_ERR("Invalid device name (%s)", name); + return -EINVAL; + } + + idxd = rdev->dev_private; + if (!idxd) { + IDXD_PMD_ERR("Error getting dev_private"); + return -EINVAL; + } + + /* disable the device */ + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); + if (err_code) { + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device disabled OK"); + + /* free device memory */ + IDXD_PMD_DEBUG("Freeing device driver memory"); + rdev->dev_private = NULL; + rte_free(idxd->batch_comp_ring); /* batch_idx_ring lives inside this allocation */ + rte_free(idxd->desc_ring); + + /* rte_dmadev_close is called by pmd_release */ + ret = rte_dmadev_pmd_release(rdev); + if (ret) + IDXD_PMD_DEBUG("Device cleanup failed"); + + return 0; } static int @@ -39,7 +301,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev) IDXD_PMD_INFO("Closing %s on NUMA node %d", name, dev->device.numa_node); - return 0; + return idxd_dmadev_destroy(name); } struct rte_pci_driver idxd_pmd_drv_pci = { -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
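init_pci_device() in the patch above distributes engines, and then work queues, across groups "round-robin style" by setting bit i in the configuration mask of group i % nb_groups. The following standalone sketch shows only that bitmask arithmetic. The function name assign_round_robin and the plain uint64_t array are assumptions made for illustration; the driver writes the equivalent bits into the memory-mapped grpengcfg and grpwqcfg[0] registers instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: spread nb_items (engines or work queues) over
 * nb_groups groups. After the call, bit i of group_masks[g] is set when
 * item i belongs to group g, i.e. when i % nb_groups == g. */
static void
assign_round_robin(unsigned int nb_items, unsigned int nb_groups,
		uint64_t *group_masks)
{
	unsigned int i;

	/* one 64-bit mask per group limits this sketch to 64 items */
	assert(nb_groups > 0 && nb_items <= 64);

	for (i = 0; i < nb_groups; i++)
		group_masks[i] = 0;
	for (i = 0; i < nb_items; i++)
		group_masks[i % nb_groups] |= (1ULL << i);
}
```

For four work queues and two groups this gives masks 0x5 and 0xA: queues 0 and 2 land in group 0, queues 1 and 3 in group 1. Because the patch first clamps nb_groups and nb_engines to the same value, each group ends up with exactly one engine, which is what the "avoid reordering" comment relies on.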
* [dpdk-dev] [PATCH 07/13] dma/idxd: add datapath structures 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (5 preceding siblings ...) 2021-08-27 17:20 ` [dpdk-dev] [PATCH 06/13] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 08/13] dma/idxd: add configure and info_get functions Kevin Laatz ` (14 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data structures required for the data path for IDXD devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 33 ++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 59 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 ++ drivers/dma/idxd/idxd_pci.c | 2 +- 5 files changed, 97 insertions(+), 1 deletion(-) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 0f33500dfc..dc11f829fd 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -94,6 +94,7 @@ idxd_dev_close(struct rte_dmadev *dev) static const struct rte_dmadev_ops idxd_vdev_ops = { .dev_close = idxd_dev_close, + .dev_dump = idxd_dump, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 7770b2e264..9490439fdc 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -10,6 +10,35 @@ #define IDXD_PMD_NAME_STR "dmadev_idxd" +int +idxd_dump(const struct rte_dmadev *dev, FILE *f) +{ + struct idxd_dmadev *idxd = dev->dev_private; + unsigned int i; + + fprintf(f, "== Private Data ==\n"); + fprintf(f, " Portal: %p\n", idxd->portal); + fprintf(f, " Config: { ring_size: %u }\n", + idxd->qcfg.nb_desc); + fprintf(f, " Batch ring 
(sz = %u, max_batches = %u):\n\t", + idxd->max_batches + 1, idxd->max_batches); + for (i = 0; i <= idxd->max_batches; i++) { + fprintf(f, " %u ", idxd->batch_idx_ring[i]); + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) + fprintf(f, "[rd ptr, wr ptr] "); + else if (i == idxd->batch_idx_read) + fprintf(f, "[rd ptr] "); + else if (i == idxd->batch_idx_write) + fprintf(f, "[wr ptr] "); + if (i == idxd->max_batches) + fprintf(f, "\n"); + } + + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, @@ -19,6 +48,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, struct rte_dmadev *dmadev = NULL; int ret = 0; + RTE_BUILD_BUG_ON(sizeof(struct idxd_hw_desc) != 64); + RTE_BUILD_BUG_ON(offsetof(struct idxd_hw_desc, size) != 32); + RTE_BUILD_BUG_ON(sizeof(struct idxd_completion) != 32); + if (!name) { IDXD_PMD_ERR("Invalid name of the device!"); ret = -EINVAL; diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index ea627cba6d..f6b9c25981 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -5,6 +5,65 @@ #ifndef _IDXD_HW_DEFS_H_ #define _IDXD_HW_DEFS_H_ +/* + * Defines used in the data path for interacting with IDXD hardware. + */ +#define IDXD_CMD_OP_SHIFT 24 +enum rte_idxd_ops { + idxd_op_nop = 0, + idxd_op_batch, + idxd_op_drain, + idxd_op_memmove, + idxd_op_fill +}; + +#define IDXD_FLAG_FENCE (1 << 0) +#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2) +#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3) +#define IDXD_FLAG_CACHE_CONTROL (1 << 8) + +/** + * Hardware descriptor used by DSA hardware, for both bursts and + * for individual operations. 
+ */ +struct idxd_hw_desc { + uint32_t pasid; + uint32_t op_flags; + rte_iova_t completion; + + RTE_STD_C11 + union { + rte_iova_t src; /* source address for copy ops etc. */ + rte_iova_t desc_addr; /* descriptor pointer for batch */ + }; + rte_iova_t dst; + + uint32_t size; /* length of data for op, or batch size */ + + uint16_t intr_handle; /* completion interrupt handle */ + + /* remaining 26 bytes are reserved */ + uint16_t __reserved[13]; +} __rte_aligned(64); + +#define IDXD_COMP_STATUS_INCOMPLETE 0 +#define IDXD_COMP_STATUS_SUCCESS 1 +#define IDXD_COMP_STATUS_INVALID_SIZE 0x13 +#define IDXD_COMP_STATUS_SKIPPED 0xFF /* not official IDXD error, but needed as placeholder */ + +/** + * Completion record structure written back by DSA + */ +struct idxd_completion { + uint8_t status; + uint8_t result; + /* 16-bits pad here */ + uint32_t completed_size; /* data length, or descriptors for batch */ + + rte_iova_t fault_address; + uint32_t invalid_flags; +} __rte_aligned(32); + /*** Definitions for Intel(R) Data Streaming Accelerator ***/ #define IDXD_CMD_SHIFT 20 diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 85c400c9ec..09285b4e96 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -37,6 +37,8 @@ struct idxd_pci_common { }; struct idxd_dmadev { + struct idxd_hw_desc *desc_ring; + /* counters to track the batches */ unsigned short max_batches; unsigned short batch_idx_read; @@ -77,5 +79,6 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dmadev_ops *ops); +int idxd_dump(const struct rte_dmadev *dev, FILE *f); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index d13bcea4d2..ddb4e90447 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -60,7 +60,7 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) } static const struct 
rte_dmadev_ops idxd_pci_ops = { - + .dev_dump = idxd_dump, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
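The RTE_BUILD_BUG_ON() checks added to idxd_dmadev_create() in the patch above pin the descriptor to 64 bytes, its size field to offset 32, and the completion record to 32 bytes. The same layout can be checked outside DPDK with plain C11, as in the sketch below. The type names (iova_sketch_t, hw_desc_sketch, completion_sketch) are hypothetical mirrors of the structures in idxd_hw_defs.h, rte_iova_t is assumed to be a 64-bit integer, and _Alignas on the first member stands in for __rte_aligned().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t iova_sketch_t; /* assumed equivalent of rte_iova_t */

/* Hypothetical mirror of struct idxd_hw_desc */
struct hw_desc_sketch {
	_Alignas(64) uint32_t pasid;  /* offset 0, forces 64-byte alignment */
	uint32_t op_flags;            /* offset 4 */
	iova_sketch_t completion;     /* offset 8 */
	union {
		iova_sketch_t src;        /* offset 16: source for copy ops */
		iova_sketch_t desc_addr;  /* offset 16: descriptor list for batch */
	};
	iova_sketch_t dst;            /* offset 24 */
	uint32_t size;                /* offset 32 */
	uint16_t intr_handle;         /* offset 36 */
	uint16_t reserved[13];        /* offsets 38..63, 26 bytes */
};

/* Hypothetical mirror of struct idxd_completion */
struct completion_sketch {
	_Alignas(32) uint8_t status;  /* offset 0, forces 32-byte alignment */
	uint8_t result;               /* offset 1 */
	/* 2 bytes of compiler-inserted padding here */
	uint32_t completed_size;      /* offset 4 */
	iova_sketch_t fault_address;  /* offset 8 */
	uint32_t invalid_flags;       /* offset 16, tail padded to 32 bytes */
};

/* Compile-time equivalents of the RTE_BUILD_BUG_ON() checks */
_Static_assert(sizeof(struct hw_desc_sketch) == 64, "descriptor must be 64 bytes");
_Static_assert(offsetof(struct hw_desc_sketch, size) == 32, "size must be at offset 32");
_Static_assert(sizeof(struct completion_sketch) == 32, "completion must be 32 bytes");
```

The field sizes add up to exactly 64 bytes for the descriptor, so the alignment attribute adds no padding there; for the completion record the fields only cover 20 bytes and the alignment rounds the structure up to 32.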
* [dpdk-dev] [PATCH 08/13] dma/idxd: add configure and info_get functions 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (6 preceding siblings ...) 2021-08-27 17:20 ` [dpdk-dev] [PATCH 07/13] dma/idxd: add datapath structures Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 09/13] dma/idxd: add start and stop functions for pci devices Kevin Laatz ` (13 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add functions for device configuration. The info_get function is included here since it can be useful for checking successful configuration. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/idxd.rst | 33 ++++++++++++++++ drivers/dma/idxd/idxd_bus.c | 3 ++ drivers/dma/idxd/idxd_common.c | 67 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 +++ drivers/dma/idxd/idxd_pci.c | 3 ++ 5 files changed, 111 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index ce33e2857a..66bc9fe744 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -120,3 +120,36 @@ use a subset of configured queues. Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, that is a "DMA device type" inside DPDK, and can be accessed using APIs from the ``rte_dmadev`` library. + +Using IDXD DMAdev Devices +-------------------------- + +To use the devices from an application, the dmadev API can be used. + +Getting Device Information +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Basic information about each dmadev device can be queried using the +``rte_dmadev_info_get()`` API. 
This will return basic device information such as +the ``rte_device`` structure, device capabilities and other device specific values. + +Device Configuration +~~~~~~~~~~~~~~~~~~~~~ + +Configuring an IDXD dmadev device is done using the ``rte_dmadev_configure()`` and +``rte_dmadev_vchan_setup`` APIs. The configurations are passed to these APIs using +the ``rte_dmadev_conf`` and ``rte_dmadev_vchan_conf`` structures, respectively. For +example, these can be used to configure the number of ``vchans`` per device, the +ring size, etc. The ring size must be a power of two, between 64 and 4096. + +The following code shows how the device is configured in +``test_dmadev.c``: + +.. literalinclude:: ../../../app/test/test_dmadev.c + :language: c + :start-after: Setup of the dmadev device. 8< + :end-before: >8 End of setup of the dmadev device. + :dedent: 1 + +Once configured, the device can then be made ready for use by calling the +``rte_dmadev_start()`` API. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index dc11f829fd..ad4f076bad 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -95,6 +95,9 @@ idxd_dev_close(struct rte_dmadev *dev) static const struct rte_dmadev_ops idxd_vdev_ops = { .dev_close = idxd_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 9490439fdc..ea2c0b7f19 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -39,6 +39,73 @@ idxd_dump(const struct rte_dmadev *dev, FILE *f) return 0; } +int +idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *info, uint32_t size) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + if (size < sizeof(*info)) + return -EINVAL; + + *info = (struct rte_dmadev_info) { + .device = dev->device, + .dev_capa = RTE_DMADEV_CAPA_MEM_TO_MEM | + 
RTE_DMADEV_CAPA_OPS_COPY | RTE_DMADEV_CAPA_OPS_FILL, + .max_vchans = 1, + .max_desc = 4096, + .min_desc = 64, + .nb_vchans = (idxd->desc_ring != NULL), /* returns 1 or 0 */ + }; + if (idxd->sva_support) + info->dev_capa |= RTE_DMADEV_CAPA_SVA; + return 0; +} + +int +idxd_configure(struct rte_dmadev *dev __rte_unused, const struct rte_dmadev_conf *dev_conf) +{ + if (dev_conf->nb_vchans != 1) + return -EINVAL; + return 0; +} + +int +idxd_vchan_setup(struct rte_dmadev *dev, uint16_t vchan __rte_unused, + const struct rte_dmadev_vchan_conf *qconf) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t max_desc = qconf->nb_desc; + uint16_t i; + + idxd->qcfg = *qconf; + + if (!rte_is_power_of_2(max_desc)) + max_desc = rte_align32pow2(max_desc); + IDXD_PMD_DEBUG("DMA dev %u using %u descriptors", dev->data->dev_id, max_desc); + idxd->desc_ring_mask = max_desc - 1; + idxd->qcfg.nb_desc = max_desc; + + /* in case we are reconfiguring a device, free any existing memory */ + rte_free(idxd->desc_ring); + + /* allocate the descriptor ring at 2x size as batches can't wrap */ + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); + if (idxd->desc_ring == NULL) + return -ENOMEM; + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); + + idxd->batch_idx_read = 0; + idxd->batch_idx_write = 0; + idxd->batch_start = 0; + idxd->batch_size = 0; + idxd->ids_returned = 0; + idxd->ids_avail = 0; + + for (i = 0; i < max_desc * 2; i++) + idxd->desc_ring[i].completion = __desc_idx_to_iova(idxd, i & (max_desc - 1)); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 09285b4e96..18fc65d00c 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -80,5 +80,10 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev 
*base_idxd, const struct rte_dmadev_ops *ops); int idxd_dump(const struct rte_dmadev *dev, FILE *f); +int idxd_configure(struct rte_dmadev *dev, const struct rte_dmadev_conf *dev_conf); +int idxd_vchan_setup(struct rte_dmadev *dev, uint16_t vchan, + const struct rte_dmadev_vchan_conf *qconf); +int idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *dev_info, + uint32_t size); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index ddb4e90447..46daa13e69 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -61,6 +61,9 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) static const struct rte_dmadev_ops idxd_pci_ops = { .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
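idxd_vchan_setup() in the patch above rounds the requested descriptor count up to a power of two and stores max_desc - 1 as desc_ring_mask, so that ring positions can be wrapped with a bitwise AND rather than a modulo. The sketch below illustrates that arithmetic in isolation. round_up_pow2() is a hypothetical stand-in for rte_align32pow2(), and ring_wrap() is an illustrative helper that does not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_align32pow2(): round v up to the next
 * power of two; values that are already a power of two are unchanged. */
static uint32_t
round_up_pow2(uint32_t v)
{
	assert(v >= 1 && v <= (1u << 31));
	v--;
	/* smear the highest set bit into every lower bit position */
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	return v + 1;
}

/* Wrap a free-running index into a ring whose size is mask + 1. */
static uint16_t
ring_wrap(uint32_t idx, uint16_t mask)
{
	return (uint16_t)(idx & mask);
}
```

A request for 100 descriptors becomes a ring of 128 with mask 127, and index 130 wraps to slot 2. Note that the documentation added in the same patch states that the ring size must be a power of two, whereas the code shown silently rounds other values up.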
* [dpdk-dev] [PATCH 09/13] dma/idxd: add start and stop functions for pci devices 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (7 preceding siblings ...) 2021-08-27 17:20 ` [dpdk-dev] [PATCH 08/13] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 10/13] dma/idxd: add data-path job submission functions Kevin Laatz ` (12 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add device start/stop functions for DSA devices bound to vfio. For devices bound to the IDXD kernel driver, these are not required since the IDXD kernel driver takes care of this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_pci.c | 52 +++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 46daa13e69..9959c81a1e 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -59,11 +59,63 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; } +static int +idxd_pci_dev_stop(struct rte_dmadev *dev) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint8_t err_code; + + if (!idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); + return -EALREADY; + } + + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); + if (err_code || idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", + idxd->qid, err_code); + return -err_code; + } + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); + + return 0; +} + +static int +idxd_pci_dev_start(struct rte_dmadev *dev) +{ + struct idxd_dmadev *idxd = 
dev->dev_private; + uint8_t err_code; + + if (idxd_is_wq_enabled(idxd)) { + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); + return 0; + } + + if (idxd->desc_ring == NULL) { + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); + return -EINVAL; + } + + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); + if (err_code || !idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); + + return 0; +} + static const struct rte_dmadev_ops idxd_pci_ops = { .dev_dump = idxd_dump, .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .dev_start = idxd_pci_dev_start, + .dev_stop = idxd_pci_dev_stop, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH 10/13] dma/idxd: add data-path job submission functions 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (8 preceding siblings ...) 2021-08-27 17:20 ` [dpdk-dev] [PATCH 09/13] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 11/13] dma/idxd: add data-path job completion functions Kevin Laatz ` (11 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data path functions for enqueuing and submitting operations to DSA devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 ++++++++++++++ drivers/dma/idxd/idxd_common.c | 138 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 ++ drivers/dma/idxd/meson.build | 1 + 4 files changed, 208 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 66bc9fe744..0c4c105e0f 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -153,3 +153,67 @@ The following code shows how the device is configured in Once configured, the device can then be made ready for use by calling the ``rte_dmadev_start()`` API. + +Performing Data Copies +~~~~~~~~~~~~~~~~~~~~~~~ + +To perform data copies using IDXD dmadev devices, descriptors should be enqueued +using the ``rte_dmadev_copy()`` API. The HW can be triggered to perform the copy +in two ways, either via a ``RTE_DMA_OP_FLAG_SUBMIT`` flag or by calling +``rte_dmadev_submit()``. Once copies have been completed, the completion will +be reported back when the application calls ``rte_dmadev_completed()`` or +``rte_dmadev_completed_status()``. The latter will also report the status of each +completed operation. 
+ +The ``rte_dmadev_copy()`` function enqueues a single copy to the device ring for +copying at a later point. The parameters to that function include the IOVA addresses +of both the source and destination buffers, as well as the length of the copy. + +The ``rte_dmadev_copy()`` function enqueues a copy operation on the device ring. +If the ``RTE_DMA_OP_FLAG_SUBMIT`` flag is set when calling ``rte_dmadev_copy()``, +the device hardware will be informed of the elements. Alternatively, if the flag +is not set, the application needs to call the ``rte_dmadev_submit()`` function to +notify the device hardware. Once the device hardware is informed of the elements +enqueued on the ring, the device will begin to process them. It is expected +that, for efficiency reasons, a burst of operations will be enqueued to the +device via multiple enqueue calls between calls to the ``rte_dmadev_submit()`` +function. + +The following code demonstrates how to enqueue a burst of copies to the +device and start the hardware processing of them: + +.. code-block:: C + + struct rte_mbuf *srcs[COMP_BURST_SZ], *dsts[COMP_BURST_SZ]; + unsigned int i, j; + + for (i = 0; i < RTE_DIM(srcs); i++) { + uint64_t *src_data; + + srcs[i] = rte_pktmbuf_alloc(pool); + dsts[i] = rte_pktmbuf_alloc(pool); + if (srcs[i] == NULL || dsts[i] == NULL) { + PRINT_ERR("Error allocating buffers\n"); + return -1; + } + src_data = rte_pktmbuf_mtod(srcs[i], uint64_t *); + + for (j = 0; j < COPY_LEN/sizeof(uint64_t); j++) + src_data[j] = rte_rand(); + + if (rte_dmadev_copy(dev_id, vchan, srcs[i]->buf_iova + srcs[i]->data_off, + dsts[i]->buf_iova + dsts[i]->data_off, COPY_LEN, 0) < 0) { + PRINT_ERR("Error with rte_dmadev_copy for buffer %u\n", i); + return -1; + } + } + rte_dmadev_submit(dev_id, vchan); + +Filling an Area of Memory +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The IDXD driver also has support for the ``fill`` operation, where an area +of memory is overwritten, or filled, with a short pattern of data.
+Fill operations can be performed in much the same way as copy operations +described above, just using the ``rte_dmadev_fill()`` function rather than the +``rte_dmadev_copy()`` function. diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index ea2c0b7f19..e2ef7b3b95 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,14 +2,148 @@ * Copyright 2021 Intel Corporation */ +#include <x86intrin.h> + #include <rte_dmadev_pmd.h> #include <rte_malloc.h> #include <rte_common.h> +#include <rte_prefetch.h> #include "idxd_internal.h" #define IDXD_PMD_NAME_STR "dmadev_idxd" +static __rte_always_inline rte_iova_t +__desc_idx_to_iova(struct idxd_dmadev *idxd, uint16_t n) +{ + return idxd->desc_iova + (n * sizeof(struct idxd_hw_desc)); +} + +static __rte_always_inline void +__idxd_movdir64b(volatile void *dst, const struct idxd_hw_desc *src) +{ + asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : + : "a" (dst), "d" (src) + : "memory"); +} + +static __rte_always_inline void +__submit(struct idxd_dmadev *idxd) +{ + /* TODO have flag setting indicating polling on same core as submission */ + rte_prefetch1(&idxd->batch_comp_ring[idxd->batch_idx_read]); + + if (idxd->batch_size == 0) + return; + + /* write completion to batch comp ring */ + rte_iova_t comp_addr = idxd->batch_iova + + (idxd->batch_idx_write * sizeof(struct idxd_completion)); + + if (idxd->batch_size == 1) { + /* submit batch directly */ + struct idxd_hw_desc desc = + idxd->desc_ring[idxd->batch_start & idxd->desc_ring_mask]; + desc.completion = comp_addr; + desc.op_flags |= IDXD_FLAG_REQUEST_COMPLETION; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &desc); + } else { + const struct idxd_hw_desc batch_desc = { + .op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_COMPLETION_ADDR_VALID | + IDXD_FLAG_REQUEST_COMPLETION, + .desc_addr = __desc_idx_to_iova(idxd, + idxd->batch_start & idxd->desc_ring_mask), +
.completion = comp_addr, + .size = idxd->batch_size, + }; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &batch_desc); + } + + if (++idxd->batch_idx_write > idxd->max_batches) + idxd->batch_idx_write = 0; + + idxd->batch_start += idxd->batch_size; + idxd->batch_size = 0; + idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; + _mm256_store_si256((void *)&idxd->batch_comp_ring[idxd->batch_idx_write], + _mm256_setzero_si256()); +} + +static __rte_always_inline int +__idxd_write_desc(struct rte_dmadev *dev, + const uint32_t op_flags, + const rte_iova_t src, + const rte_iova_t dst, + const uint32_t size, + const uint32_t flags) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t mask = idxd->desc_ring_mask; + uint16_t job_id = idxd->batch_start + idxd->batch_size; + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; + + /* first check batch ring space then desc ring space */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + goto failed; + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) + goto failed; + + /* write desc. 
Note: descriptors don't wrap, but the completion address does */ + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, + _mm256_set_epi64x(0, 0, 0, size)); + + idxd->batch_size++; + + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) + __submit(idxd); + + return job_id; + +failed: + return -1; +} + +int +idxd_enqueue_copy(struct rte_dmadev *dev, uint16_t qid __rte_unused, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, + * but check it at compile time to be sure. + */ + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, memmove, src, dst, length, flags); +} + +int +idxd_enqueue_fill(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + uint32_t fill = (idxd_op_fill << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, fill, pattern, dst, length, flags); +} + +int +idxd_submit(struct rte_dmadev *dev, uint16_t qid __rte_unused) +{ + __submit(dev->dev_private); + return 0; +} + int idxd_dump(const struct rte_dmadev *dev, FILE *f) { @@ -135,6 +269,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->dev_ops = ops; dmadev->device = dev; + dmadev->copy = idxd_enqueue_copy; + dmadev->fill = idxd_enqueue_fill; + dmadev->submit = idxd_submit; + idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); if 
(idxd == NULL) { IDXD_PMD_ERR("Unable to allocate memory for device"); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 18fc65d00c..6a6c69fd61 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -85,5 +85,10 @@ int idxd_vchan_setup(struct rte_dmadev *dev, uint16_t vchan, const struct rte_dmadev_vchan_conf *qconf); int idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *dev_info, uint32_t size); +int idxd_enqueue_copy(struct rte_dmadev *dev, uint16_t qid, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_enqueue_fill(struct rte_dmadev *dev, uint16_t qid, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_submit(struct rte_dmadev *dev, uint16_t qid); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 81150e6f25..2de5130fd2 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -2,6 +2,7 @@ # Copyright(c) 2021 Intel Corporation deps += ['bus_pci'] +cflags += '-mavx2' # all platforms with idxd HW support AVX sources = files( 'idxd_bus.c', 'idxd_common.c', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
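The batch ring index handling in this patch is easy to misread, so here is a standalone sketch of just that part. The full-ring test and the wrap rule are copied from `__idxd_write_desc()` and `__submit()`; everything else (`struct model_batch_ring`, the `model_ring_*` helpers, and folding the space check into the submit step) is a simplification assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduction of the batch ring bookkeeping to its three indexes. */
struct model_batch_ring {
	uint16_t max_batches; /* highest valid index, so the ring has max_batches + 1 slots */
	uint16_t idx_read;    /* next batch to be reaped on the completion side */
	uint16_t idx_write;   /* slot the next submitted batch will use */
};

/* Same space check as the enqueue path: one slot is always left unused. */
static bool model_ring_full(const struct model_batch_ring *r)
{
	assert(r != NULL);
	return (r->idx_read == 0 && r->idx_write == r->max_batches) ||
		r->idx_write + 1 == r->idx_read;
}

/* Advance the write index as __submit() does, wrapping past max_batches. */
static bool model_ring_submit(struct model_batch_ring *r)
{
	if (model_ring_full(r))
		return false;
	if (++r->idx_write > r->max_batches)
		r->idx_write = 0;
	return true;
}

/* Advance the read index; fails when no batch is outstanding. */
static bool model_ring_complete(struct model_batch_ring *r)
{
	assert(r != NULL);
	if (r->idx_read == r->idx_write)
		return false;
	if (++r->idx_read > r->max_batches)
		r->idx_read = 0;
	return true;
}
```

With `max_batches` set to 2 the model accepts two outstanding batches and rejects a third until one has been reaped, which matches the "first check batch ring space" comment in the patch.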
* [dpdk-dev] [PATCH 11/13] dma/idxd: add data-path job completion functions 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (9 preceding siblings ...) 2021-08-27 17:20 ` [dpdk-dev] [PATCH 10/13] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 12/13] dma/idxd: add operation statistic tracking Kevin Laatz ` (10 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the data path functions for gathering completed operations. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/idxd.rst | 25 ++++ drivers/dma/idxd/idxd_common.c | 235 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 + 3 files changed, 265 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 0c4c105e0f..8bf99ef453 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -209,6 +209,31 @@ device and start the hardware processing of them: } rte_dmadev_submit(dev_id, vchan); +To retrieve information about completed copies, the ``rte_dmadev_completed()`` and +``rte_dmadev_completed_status()`` APIs should be used. ``rte_dmadev_completed()`` +will return the number of completed operations, along with the index of the last +successfully completed operation and whether or not an error was encountered. If an +error was encountered, ``rte_dmadev_completed_status()`` must be used to kick the +device off to continue processing operations and also to gather the status of each +individual operation, which is written to the ``status`` array provided as a +parameter by the application.
+ +The following code shows how to retrieve the number of successfully completed +copies within a burst and then use ``rte_dmadev_completed_status()`` to check +which operation failed and kick off the device to continue processing operations: + +.. code-block:: C + + enum rte_dma_status_code status[COMP_BURST_SZ]; + uint16_t count, idx, status_count; + bool error = false; + + count = rte_dmadev_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); + + if (error) { + status_count = rte_dmadev_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status); + } + Filling an Area of Memory ~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index e2ef7b3b95..50b205d92f 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -144,6 +144,239 @@ idxd_submit(struct rte_dmadev *dev, uint16_t qid __rte_unused) return 0; } +static enum rte_dma_status_code +get_comp_status(struct idxd_completion *c) +{ + uint8_t st = c->status; + switch (st) { + /* successful descriptors are not written back normally */ + case IDXD_COMP_STATUS_INCOMPLETE: + case IDXD_COMP_STATUS_SUCCESS: + return RTE_DMA_STATUS_SUCCESSFUL; + case IDXD_COMP_STATUS_INVALID_SIZE: + return RTE_DMA_STATUS_INVALID_LENGTH; + case IDXD_COMP_STATUS_SKIPPED: + return RTE_DMA_STATUS_NOT_ATTEMPTED; + default: /* TODO - get more detail */ + return RTE_DMA_STATUS_ERROR_UNKNOWN; + } +} + +static __rte_always_inline int +batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t ret; + uint8_t bstatus; + + if (max_ops == 0) + return 0; + + /* first check if there are any unreturned handles from last time */ + if (idxd->ids_avail != idxd->ids_returned) { + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + + if (idxd->batch_idx_read == idxd->batch_idx_write) +
return 0; + + bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status; + /* now check if next batch is complete and successful */ + if (bstatus == IDXD_COMP_STATUS_SUCCESS) { + /* since the batch idx ring stores the start of each batch, pre-increment to lookup + * start of next batch. + */ + if (++idxd->batch_idx_read > idxd->max_batches) + idxd->batch_idx_read = 0; + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; + + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + /* check if batch is incomplete */ + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) + return 0; + + return -1; /* error case */ +} + +static inline uint16_t +batch_completed(struct idxd_dmadev *idxd, uint16_t max_ops, bool *has_error) +{ + uint16_t i; + uint16_t b_start, b_end, next_batch; + + int ret = batch_ok(idxd, max_ops, NULL); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. + * once we identify the actual failure job, return other jobs, then update + * the batch ring indexes to make it look like the first job of the batch has failed. + * Subsequent calls here will always return zero packets, and the error must be cleared by + * calling the completed_status() function.
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + b_end = idxd->batch_idx_ring[next_batch]; + + if (b_end - b_start == 1) { /* not a batch */ + *has_error = true; + return 0; + } + + for (i = b_start; i < b_end; i++) { + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ + break; + } + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); + if (ret < max_ops) + *has_error = true; /* we got up to the point of error */ + idxd->ids_avail = idxd->ids_returned += ret; + + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { + /* copy over the descriptor status to the batch ring as if no batch */ + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; + } + + return ret; +} + +static uint16_t +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t next_batch; + + int ret = batch_ok(idxd, max_ops, status); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; + const uint16_t b_len = b_end - b_start; + if (b_len == 1) {/* not a batch */ + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + idxd->ids_avail++; + idxd->ids_returned++; + idxd->batch_idx_read = next_batch; + return 1; + } + + /* not a single-element batch, need to process more. + * Scenarios: + * 1. max_ops >= batch_size - can fit everything, simple case + * - loop through completed ops and then add on any not-attempted ones + * 2. max_ops < batch_size - can't fit everything, more complex case + * - loop through completed/incomplete and stop when hit max_ops + * - adjust the batch descriptor to update where we stopped, with appropriate bcount + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as + * non-batch next time. + */ + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; + for (ret = 0; ret < b_len && ret < max_ops; ret++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + } + idxd->ids_avail = idxd->ids_returned += ret; + + /* everything fit */ + if (ret == b_len) { + idxd->batch_idx_read = next_batch; + return ret; + } + + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (ret > bcount) { + /* we have only incomplete ones - set batch completed size to 0 */ + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size = 0; + /* if there is only one descriptor left, job skipped so set flag appropriately */ + if (b_len - ret == 1) + comp->status = IDXD_COMP_STATUS_SKIPPED; + } else { + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size -= ret; + /* if there is only one descriptor left, copy status info straight to desc */ + if (comp->completed_size == 1) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + comp->status = c->status; + /* individual descs can be ok without writeback, but not batches */ + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) + comp->status = IDXD_COMP_STATUS_SUCCESS; + } else if (bcount == b_len) { + /* check if we still have an error, and clear flag if not */ + uint16_t i; + for (i = b_start + ret; i < b_end; i++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) + break; + } + if (i == b_end) /* no errors */ + comp->status = IDXD_COMP_STATUS_SUCCESS; + } + } + + return ret; +} + +uint16_t +idxd_completed(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, bool *has_error) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed(idxd, max_ops - ret, has_error); + ret += batch; + } while (batch > 0 && *has_error == false); + + *last_idx = 
idxd->ids_returned - 1; + return ret; +} + +uint16_t +idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, enum rte_dma_status_code *status) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + uint16_t batch, ret = 0; + + do { + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); + ret += batch; + } while (batch > 0); + + *last_idx = idxd->ids_returned - 1; + return ret; +} + int idxd_dump(const struct rte_dmadev *dev, FILE *f) { @@ -272,6 +505,8 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->copy = idxd_enqueue_copy; dmadev->fill = idxd_enqueue_fill; dmadev->submit = idxd_submit; + dmadev->completed = idxd_completed; + dmadev->completed_status = idxd_completed_status; idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); if (idxd == NULL) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 6a6c69fd61..4bcfe5372b 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -90,5 +90,10 @@ int idxd_enqueue_copy(struct rte_dmadev *dev, uint16_t qid, rte_iova_t src, int idxd_enqueue_fill(struct rte_dmadev *dev, uint16_t qid, uint64_t pattern, rte_iova_t dst, unsigned int length, uint64_t flags); int idxd_submit(struct rte_dmadev *dev, uint16_t qid); +uint16_t idxd_completed(struct rte_dmadev *dev, uint16_t qid, uint16_t max_ops, + uint16_t *last_idx, bool *has_error); +uint16_t idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, + uint16_t max_ops, uint16_t *last_idx, + enum rte_dma_status_code *status); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
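The error path in batch_completed() walks the descriptors of a failed batch to find the first job that did not succeed. A standalone sketch of that walk follows. It assumes, as the patch comments state, that status 0 means incomplete and 1 means success; `model_count_ok()` and the flat `status[]` array are hypothetical stand-ins for the descriptor ring. The loop here compares with `!=` rather than `<` so that a batch whose 16-bit job ids wrap past 65535 is still walked; that choice belongs to this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_STATUS_INCOMPLETE 0 /* no writeback seen, treated as OK within a batch */
#define MODEL_STATUS_SUCCESS    1

/*
 * Count the jobs in [b_start, b_end) that precede the first failed one.
 * Job ids are free-running 16-bit counters; only the ring lookup is masked.
 */
static uint16_t model_count_ok(const uint8_t *status, uint16_t ring_mask,
		uint16_t b_start, uint16_t b_end)
{
	uint16_t i;

	assert(status != NULL);
	assert(((ring_mask + 1) & ring_mask) == 0); /* mask must be 2^n - 1 */

	for (i = b_start; i != b_end; i++) {
		if (status[i & ring_mask] > MODEL_STATUS_SUCCESS)
			break; /* first job with an error status */
	}
	return (uint16_t)(i - b_start);
}
```

For an 8-entry ring with an error recorded at slot 2, a batch covering job ids 65534 through 3 yields four good jobs (ring slots 6, 7, 0 and 1) before the failure.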
* [dpdk-dev] [PATCH 12/13] dma/idxd: add operation statistic tracking 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (10 preceding siblings ...) 2021-08-27 17:20 ` [dpdk-dev] [PATCH 11/13] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-08-27 17:20 ` [dpdk-dev] [PATCH 13/13] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz ` (9 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add statistic tracking for DSA devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/idxd.rst | 11 +++++++++++ drivers/dma/idxd/idxd_bus.c | 2 ++ drivers/dma/idxd/idxd_common.c | 23 +++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 +++ 4 files changed, 39 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 8bf99ef453..ba8bbac8b7 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -242,3 +242,14 @@ of memory is overwritten, or filled, with a short pattern of data. Fill operations can be performed in much the same way as copy operations described above, just using the ``rte_dmadev_fill()`` function rather than the ``rte_dmadev_copy()`` function. + +Querying Device Statistics +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The statistics from the IDXD dmadev device can be retrieved via the stats functions in +the ``rte_dmadev`` library, i.e. ``rte_dmadev_stats_get()``.
The statistics +returned for each device instance are: + +* ``submitted_count`` +* ``completed_fail_count`` +* ``completed_count`` diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index ad4f076bad..9482dab9be 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -98,6 +98,8 @@ static const struct rte_dmadev_ops idxd_vdev_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 50b205d92f..2f5addb359 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -66,6 +66,8 @@ __submit(struct idxd_dmadev *idxd) if (++idxd->batch_idx_write > idxd->max_batches) idxd->batch_idx_write = 0; + idxd->stats.submitted += idxd->batch_size; + idxd->batch_start += idxd->batch_size; idxd->batch_size = 0; idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; @@ -356,6 +358,7 @@ idxd_completed(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_o ret += batch; } while (batch > 0 && *has_error == false); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -373,6 +376,7 @@ idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_ ret += batch; } while (batch > 0); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -406,6 +410,25 @@ idxd_dump(const struct rte_dmadev *dev, FILE *f) return 0; } +int +idxd_stats_get(const struct rte_dmadev *dev, uint16_t vchan __rte_unused, + struct rte_dmadev_stats *stats, uint32_t stats_sz) +{ + struct idxd_dmadev *idxd = dev->dev_private; + if (stats_sz < sizeof(*stats)) + return -EINVAL; + *stats = idxd->stats; + return 0; +} + +int +idxd_stats_reset(struct rte_dmadev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->dev_private; + 
idxd->stats = (struct rte_dmadev_stats){0}; + return 0; +} + int idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *info, uint32_t size) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 4bcfe5372b..5079b4207d 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -95,5 +95,8 @@ uint16_t idxd_completed(struct rte_dmadev *dev, uint16_t qid, uint16_t max_ops, uint16_t idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_ops, uint16_t *last_idx, enum rte_dma_status_code *status); +int idxd_stats_get(const struct rte_dmadev *dev, uint16_t vchan, + struct rte_dmadev_stats *stats, uint32_t stats_sz); +int idxd_stats_reset(struct rte_dmadev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
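The counting scheme added by this patch is small enough to model in isolation: the submitted counter grows by the batch size at submit time, the completed counter grows by whatever the completion calls return, and the get function rejects a caller buffer that is too small. The sketch below assumes a two-field `struct model_stats` and hypothetical `model_*` helpers; the real `struct rte_dmadev_stats` layout is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stats block; field names follow the counters updated in the patch. */
struct model_stats {
	uint64_t submitted;
	uint64_t completed;
};

struct model_dev {
	struct model_stats stats;
	uint16_t batch_size; /* jobs enqueued since the last submit */
};

/* As in __submit(): account for the whole pending batch, then start a new one. */
static void model_submit(struct model_dev *d)
{
	d->stats.submitted += d->batch_size;
	d->batch_size = 0;
}

/* As in idxd_completed(): add the number of jobs handed back to the caller. */
static void model_completed(struct model_dev *d, uint16_t n)
{
	d->stats.completed += n;
}

/* Copy the counters out, refusing a destination smaller than the struct. */
static int model_stats_get(const struct model_dev *d, struct model_stats *out, size_t out_sz)
{
	assert(d != NULL && out != NULL);
	if (out_sz < sizeof(*out))
		return -EINVAL;
	*out = d->stats;
	return 0;
}

static void model_stats_reset(struct model_dev *d)
{
	d->stats = (struct model_stats){0};
}
```

Because the counter is updated inside the submit step, jobs that were enqueued but never submitted do not show up in `submitted`.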
* [dpdk-dev] [PATCH 13/13] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (11 preceding siblings ...) 2021-08-27 17:20 ` [dpdk-dev] [PATCH 12/13] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-08-27 17:20 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (8 subsequent siblings) 21 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-08-27 17:20 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Conor Walsh <conor.walsh@intel.com> Move the example script for configuring IDXD devices bound to the IDXD kernel driver from raw to dma, and create a symlink to still allow use from raw. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++++++++++++++++++++++++++ drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +----------------------------- 2 files changed, 118 insertions(+), 117 deletions(-) create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py diff --git a/drivers/dma/idxd/dpdk_idxd_cfg.py b/drivers/dma/idxd/dpdk_idxd_cfg.py new file mode 100755 index 0000000000..fcc27822ef --- /dev/null +++ b/drivers/dma/idxd/dpdk_idxd_cfg.py @@ -0,0 +1,117 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2020 Intel Corporation + +""" +Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use +""" + +import sys +import argparse +import os +import os.path + + +class SysfsDir: + "Used to read/write paths in a sysfs directory" + def __init__(self, path): + self.path = path + + def read_int(self, filename): + "Return a value from sysfs file" + with open(os.path.join(self.path, filename)) as f: + return int(f.readline()) + + def write_values(self, 
values): + "write dictionary, where key is filename and value is value to write" + for filename, contents in values.items(): + with open(os.path.join(self.path, filename), "w") as f: + f.write(str(contents)) + + +def reset_device(dsa_id): + "Reset the DSA device and all its queues" + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) + + +def get_pci_dir(pci): + "Search for the sysfs directory of the PCI device" + base_dir = '/sys/bus/pci/devices/' + for path, dirs, files in os.walk(base_dir): + for dir in dirs: + if pci in dir: + return os.path.join(base_dir, dir) + sys.exit(f"Could not find sysfs directory for device {pci}") + + +def get_dsa_id(pci): + "Get the DSA instance ID using the PCI address of the device" + pci_dir = get_pci_dir(pci) + for path, dirs, files in os.walk(pci_dir): + for dir in dirs: + if dir.startswith('dsa') and 'wq' not in dir: + return int(dir[3:]) + sys.exit(f"Could not get device ID for device {pci}") + + +def configure_dsa(dsa_id, queues, prefix): + "Configure the DSA instance with appropriate number of queues" + dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + + max_groups = dsa_dir.read_int("max_groups") + max_engines = dsa_dir.read_int("max_engines") + max_queues = dsa_dir.read_int("max_work_queues") + max_work_queues_size = dsa_dir.read_int("max_work_queues_size") + + nb_queues = min(queues, max_queues) + if queues > nb_queues: + print(f"Setting number of queues to max supported value: {max_queues}") + + # we want one engine per group, and no more engines than queues + nb_groups = min(max_engines, max_groups, nb_queues) + for grp in range(nb_groups): + dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) + + # configure each queue + for q in range(nb_queues): + wq_dir = SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) + wq_dir.write_values({"group_id": q % nb_groups, + "type": "user", + "mode": "dedicated", + 
"name": f"{prefix}_wq{dsa_id}.{q}", + "priority": 1, + "size": int(max_work_queues_size / nb_queues)}) + + # enable device and then queues + drv_dir.write_values({"bind": f"dsa{dsa_id}"}) + for q in range(nb_queues): + drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) + + +def main(args): + "Main function, does arg parsing and calls config function" + arg_p = argparse.ArgumentParser( + description="Configure whole DSA device instance for DPDK use") + arg_p.add_argument('dsa_id', + help="Specify DSA instance either via DSA instance number or PCI address") + arg_p.add_argument('-q', metavar='queues', type=int, default=255, + help="Number of queues to set up") + arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', + default="dpdk", + help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") + arg_p.add_argument('--reset', action='store_true', + help="Reset DSA device and its queues") + parsed_args = arg_p.parse_args(args[1:]) + + dsa_id = parsed_args.dsa_id + dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id + if parsed_args.reset: + reset_device(dsa_id) + else: + configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) + + +if __name__ == "__main__": + main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py deleted file mode 100755 index fcc27822ef..0000000000 --- a/drivers/raw/ioat/dpdk_idxd_cfg.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/env python3 -# SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2020 Intel Corporation - -""" -Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use -""" - -import sys -import argparse -import os -import os.path - - -class SysfsDir: - "Used to read/write paths in a sysfs directory" - def __init__(self, path): - self.path = path - - def read_int(self, filename): - "Return a value from sysfs file" - with open(os.path.join(self.path, filename)) as f: - return int(f.readline()) - - def write_values(self, values): - "write 
dictionary, where key is filename and value is value to write" - for filename, contents in values.items(): - with open(os.path.join(self.path, filename), "w") as f: - f.write(str(contents)) - - -def reset_device(dsa_id): - "Reset the DSA device and all its queues" - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) - - -def get_pci_dir(pci): - "Search for the sysfs directory of the PCI device" - base_dir = '/sys/bus/pci/devices/' - for path, dirs, files in os.walk(base_dir): - for dir in dirs: - if pci in dir: - return os.path.join(base_dir, dir) - sys.exit(f"Could not find sysfs directory for device {pci}") - - -def get_dsa_id(pci): - "Get the DSA instance ID using the PCI address of the device" - pci_dir = get_pci_dir(pci) - for path, dirs, files in os.walk(pci_dir): - for dir in dirs: - if dir.startswith('dsa') and 'wq' not in dir: - return int(dir[3:]) - sys.exit(f"Could not get device ID for device {pci}") - - -def configure_dsa(dsa_id, queues, prefix): - "Configure the DSA instance with appropriate number of queues" - dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - - max_groups = dsa_dir.read_int("max_groups") - max_engines = dsa_dir.read_int("max_engines") - max_queues = dsa_dir.read_int("max_work_queues") - max_work_queues_size = dsa_dir.read_int("max_work_queues_size") - - nb_queues = min(queues, max_queues) - if queues > nb_queues: - print(f"Setting number of queues to max supported value: {max_queues}") - - # we want one engine per group, and no more engines than queues - nb_groups = min(max_engines, max_groups, nb_queues) - for grp in range(nb_groups): - dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) - - # configure each queue - for q in range(nb_queues): - wq_dir = SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) - wq_dir.write_values({"group_id": q % nb_groups, - "type": "user", - "mode": "dedicated", - "name": 
f"{prefix}_wq{dsa_id}.{q}", - "priority": 1, - "size": int(max_work_queues_size / nb_queues)}) - - # enable device and then queues - drv_dir.write_values({"bind": f"dsa{dsa_id}"}) - for q in range(nb_queues): - drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) - - -def main(args): - "Main function, does arg parsing and calls config function" - arg_p = argparse.ArgumentParser( - description="Configure whole DSA device instance for DPDK use") - arg_p.add_argument('dsa_id', - help="Specify DSA instance either via DSA instance number or PCI address") - arg_p.add_argument('-q', metavar='queues', type=int, default=255, - help="Number of queues to set up") - arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', - default="dpdk", - help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") - arg_p.add_argument('--reset', action='store_true', - help="Reset DSA device and its queues") - parsed_args = arg_p.parse_args(args[1:]) - - dsa_id = parsed_args.dsa_id - dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id - if parsed_args.reset: - reset_device(dsa_id) - else: - configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) - - -if __name__ == "__main__": - main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py new file mode 120000 index 0000000000..85545548d1 --- /dev/null +++ b/drivers/raw/ioat/dpdk_idxd_cfg.py @@ -0,0 +1 @@ +../../dma/idxd/dpdk_idxd_cfg.py \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
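For readers following the script move above, the resource split done by configure_dsa() can be looked at in isolation. The following is only a sketch of that arithmetic, with no sysfs access; the function name plan_dsa_layout and the capability numbers in the usage note are hypothetical, and the up-front check on the queue count is an addition that the script itself does not make.

```python
def plan_dsa_layout(queues, max_queues, max_groups, max_engines, max_wq_size):
    """Mirror the sizing arithmetic of configure_dsa() without touching sysfs.

    Returns (nb_queues, nb_groups, wq_size, group_ids), where group_ids[q]
    is the group that work queue q would be assigned to.
    """
    if queues < 1:
        # assumption: reject this early; the script would instead fail
        # later with a division by zero
        raise ValueError("at least one queue is required")

    # never configure more queues than the device reports
    nb_queues = min(queues, max_queues)

    # one engine per group, and no more groups than queues
    nb_groups = min(max_engines, max_groups, nb_queues)

    # split the total work queue space evenly across the queues
    wq_size = max_wq_size // nb_queues

    # queues are spread round-robin over the groups
    group_ids = [q % nb_groups for q in range(nb_queues)]
    return nb_queues, nb_groups, wq_size, group_ids
```

With made-up limits of 8 work queues, 4 groups, 4 engines and a total work queue size of 128, the script's default request of 255 queues is clamped to 8 queues of size 16 spread over 4 groups.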
* [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (12 preceding siblings ...) 2021-08-27 17:20 ` [dpdk-dev] [PATCH 13/13] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 01/16] raw/ioat: only build if dmadev not present Kevin Laatz ` (16 more replies) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (7 subsequent siblings) 21 siblings, 17 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz This patchset adds a dmadev driver and associated documentation to support Intel Data Streaming Accelerator devices. This driver is intended to ultimately replace the current IDXD part of the IOAT rawdev driver. NOTE: This patchset has several dependencies: - v19 of the dmadev set [1] - v2 of the dmadev test suite [2] [1] http://patches.dpdk.org/project/dpdk/list/?series=18629 [2] http://patches.dpdk.org/project/dpdk/list/?series=18607 v2: * rebased on above patchsets * added API to check for device being idle * added devbind updates for DMA devices * fixed issue identified by internal coverity scan * other minor miscellaneous changes and fixes Bruce Richardson (1): raw/ioat: only build if dmadev not present Conor Walsh (1): dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz (14): doc: initial commit for dmadevs section dma/idxd: add skeleton for VFIO based DSA device dma/idxd: add bus device probing dma/idxd: create dmadev instances on bus probe dma/idxd: create dmadev instances on pci probe dma/idxd: add datapath structures dma/idxd: add configure and info_get functions dma/idxd: add start and stop functions for pci devices dma/idxd: add data-path job submission functions dma/idxd: add data-path job completion 
functions dma/idxd: add operation statistic tracking dma/idxd: add vchan idle function devbind: add dma device class devbind: move idxd device ID to dmadev class MAINTAINERS | 10 + app/test/test_dmadev.c | 3 +- doc/guides/dmadevs/idxd.rst | 255 +++++++++++ doc/guides/dmadevs/index.rst | 14 + doc/guides/index.rst | 1 + doc/guides/rel_notes/release_21_11.rst | 5 + drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++ drivers/dma/idxd/idxd_bus.c | 376 ++++++++++++++++ drivers/dma/idxd/idxd_common.c | 585 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 131 ++++++ drivers/dma/idxd/idxd_internal.h | 103 +++++ drivers/dma/idxd/idxd_pci.c | 379 ++++++++++++++++ drivers/dma/idxd/meson.build | 10 + drivers/dma/idxd/version.map | 3 + drivers/dma/meson.build | 1 + drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +---- drivers/raw/ioat/meson.build | 23 +- usertools/dpdk-devbind.py | 12 +- 18 files changed, 2022 insertions(+), 124 deletions(-) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 doc/guides/dmadevs/index.rst create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py create mode 100644 drivers/dma/idxd/idxd_bus.c create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_hw_defs.h create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v2 01/16] raw/ioat: only build if dmadev not present 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 02/16] doc: initial commit for dmadevs section Kevin Laatz ` (15 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh From: Bruce Richardson <bruce.richardson@intel.com> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not present. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/raw/ioat/meson.build | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build index 0e81cb5951..7bd9ac912b 100644 --- a/drivers/raw/ioat/meson.build +++ b/drivers/raw/ioat/meson.build @@ -2,14 +2,31 @@ # Copyright 2019 Intel Corporation build = dpdk_conf.has('RTE_ARCH_X86') +# only use ioat rawdev driver if we don't have the equivalent dmadev ones +if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') + build = false + subdir_done() +endif + reason = 'only supported on x86' sources = files( - 'idxd_bus.c', - 'idxd_pci.c', 'ioat_common.c', - 'ioat_rawdev.c', 'ioat_rawdev_test.c', ) + +if not dpdk_conf.has('RTE_DMA_IDXD') + sources += files( + 'idxd_bus.c', + 'idxd_pci.c', + ) +endif + +if not dpdk_conf.has('RTE_DMA_IOAT') + sources += files( + 'ioat_rawdev.c', + ) +endif + deps += ['bus_pci', 'mbuf', 'rawdev'] headers = files( 'rte_ioat_rawdev.h', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
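The selection rule described in the commit message above can be written out as a small truth-table helper. This is a sketch of the stated intent only (rawdev code is built just for the hardware that has no dmadev driver), not of meson itself; the function name rawdev_ioat_sources is hypothetical, and the assumption is that the rawdev directory is skipped entirely only when both dmadev drivers are present.

```python
def rawdev_ioat_sources(has_dma_idxd, has_dma_ioat):
    """Return the rawdev source files to build, or None if the driver is skipped.

    has_dma_idxd / has_dma_ioat stand in for dpdk_conf.has('RTE_DMA_IDXD')
    and dpdk_conf.has('RTE_DMA_IOAT').
    """
    # assumption: nothing is left to build once both dmadev drivers exist
    if has_dma_idxd and has_dma_ioat:
        return None

    # files shared by both rawdev back ends
    sources = ['ioat_common.c', 'ioat_rawdev_test.c']

    # rawdev IDXD support only when the dmadev IDXD driver is absent
    if not has_dma_idxd:
        sources += ['idxd_bus.c', 'idxd_pci.c']

    # rawdev IOAT support only when the dmadev IOAT driver is absent
    if not has_dma_ioat:
        sources += ['ioat_rawdev.c']
    return sources
```

Within this series only the IDXD dmadev driver is added, so the case of interest is has_dma_idxd=True, has_dma_ioat=False, which leaves the IOAT rawdev code building as before.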
* [dpdk-dev] [PATCH v2 02/16] doc: initial commit for dmadevs section 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 01/16] raw/ioat: only build if dmadev not present Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:51 ` Bruce Richardson 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 03/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz ` (14 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add new section to the programmer's guide for dmadev devices. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/index.rst | 14 ++++++++++++++ doc/guides/index.rst | 1 + 2 files changed, 15 insertions(+) create mode 100644 doc/guides/dmadevs/index.rst diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst new file mode 100644 index 0000000000..b30004fd65 --- /dev/null +++ b/doc/guides/dmadevs/index.rst @@ -0,0 +1,14 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +DMA Device Drivers +================== + +The following is a list of DMA device PMDs, which can be used from an +application through the DMAdev API. + +.. toctree:: + :maxdepth: 2 + :numbered: + + idxd diff --git a/doc/guides/index.rst b/doc/guides/index.rst index 857f0363d3..ccb71640dd 100644 --- a/doc/guides/index.rst +++ b/doc/guides/index.rst @@ -19,6 +19,7 @@ DPDK documentation bbdevs/index cryptodevs/index compressdevs/index + dmadevs/index vdpadevs/index regexdevs/index eventdevs/index -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v2 02/16] doc: initial commit for dmadevs section 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 02/16] doc: initial commit for dmadevs section Kevin Laatz @ 2021-09-03 10:51 ` Bruce Richardson 0 siblings, 0 replies; 243+ messages in thread From: Bruce Richardson @ 2021-09-03 10:51 UTC (permalink / raw) To: Kevin Laatz; +Cc: dev, fengchengwen, jerinj, conor.walsh On Fri, Sep 03, 2021 at 10:49:47AM +0000, Kevin Laatz wrote: > Add new section to the programmer's guide for dmadev devices. > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > --- Acked-by: Bruce Richardson <bruce.richardson@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v2 03/16] dma/idxd: add skeleton for VFIO based DSA device 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 01/16] raw/ioat: only build if dmadev not present Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 02/16] doc: initial commit for dmadevs section Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 04/16] dma/idxd: add bus device probing Kevin Laatz ` (13 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probe/remove skeleton code for DSA device bound to the vfio pci driver. Relevant documentation and MAINTAINERS update also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- MAINTAINERS | 10 +++++ doc/guides/dmadevs/idxd.rst | 58 ++++++++++++++++++++++++++ doc/guides/rel_notes/release_21_11.rst | 5 +++ drivers/dma/idxd/idxd_internal.h | 27 ++++++++++++ drivers/dma/idxd/idxd_pci.c | 55 ++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 7 ++++ drivers/dma/idxd/version.map | 3 ++ drivers/dma/meson.build | 1 + 8 files changed, 166 insertions(+) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map diff --git a/MAINTAINERS b/MAINTAINERS index c057a090d6..b4c614a229 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1199,6 +1199,16 @@ F: doc/guides/compressdevs/zlib.rst F: doc/guides/compressdevs/features/zlib.ini +DMAdev Drivers +-------------- + +Intel IDXD - EXPERIMENTAL +M: Bruce Richardson <bruce.richardson@intel.com> +M: Kevin Laatz <kevin.laatz@intel.com> +F: drivers/dma/idxd/ +F: 
doc/guides/dmadevs/idxd.rst + + RegEx Drivers ------------- diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst new file mode 100644 index 0000000000..924700d17e --- /dev/null +++ b/doc/guides/dmadevs/idxd.rst @@ -0,0 +1,58 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +.. include:: <isonum.txt> + +IDXD DMA Device Driver +====================== + +The ``idxd`` dmadev driver provides a poll-mode driver (PMD) for Intel\ |reg| +Data Streaming Accelerator `(Intel DSA) +<https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html>`_. +This PMD can be used in conjunction with Intel\ |reg| DSA devices to offload +data operations, such as data copies, to hardware, freeing up CPU cycles for +other tasks. + +Hardware Requirements +---------------------- + +The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the +presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma`` +will show all the DMA devices on the system, including IDXD supported devices. +Intel\ |reg| DSA devices currently (at time of writing) appear +as devices with type “0b25”, due to the absence of pci-id database entries for +them at this point. + +Compilation +------------ + +For builds using ``meson`` and ``ninja``, the driver will be built when the +target platform is x86-based. No additional compilation steps are necessary. + +Device Setup +------------- + +Devices using VFIO/UIO drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The HW devices to be used will need to be bound to a user-space IO driver for use. +The ``dpdk-devbind.py`` script can be used to view the state of the devices +and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``.
+For example:: + + $ dpdk-devbind.py -b vfio-pci 6a:01.0 + +Device Probing and Initialization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will +be found as part of the device scan done at application initialization time without +the need to pass parameters to the application. + +For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the +maximum number of workqueues available on it, partitioning all resources equally +among the queues. +If fewer workqueues are required, then the ``max_queues`` parameter may be passed to +the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: + + $ dpdk-test -a <b:d:f>,max_queues=4 diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index 78b9691bf3..f7b0fcf3fd 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -60,6 +60,11 @@ New Features The dmadev library provides a DMA device framework for management and provision of hardware and software DMA devices. +* **Added IDXD dmadev driver implementation.** + + The IDXD dmadev driver provides a device driver for Intel DSA devices. + This device driver can be used through the generic dmadev API. + Removed Items ------------- diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h new file mode 100644 index 0000000000..c6a7dcd72f --- /dev/null +++ b/drivers/dma/idxd/idxd_internal.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_INTERNAL_H_ +#define _IDXD_INTERNAL_H_ + +/** + * @file idxd_internal.h + * + * Internal data structures for the idxd/DSA driver for dmadev + * + * @warning + * @b EXPERIMENTAL: these structures and APIs may change without prior notice + */ + +extern int idxd_pmd_logtype; + +#define IDXD_PMD_LOG(level, fmt, args...)
rte_log(RTE_LOG_ ## level, \ + idxd_pmd_logtype, "IDXD: %s(): " fmt "\n", __func__, ##args) + +#define IDXD_PMD_DEBUG(fmt, args...) IDXD_PMD_LOG(DEBUG, fmt, ## args) +#define IDXD_PMD_INFO(fmt, args...) IDXD_PMD_LOG(INFO, fmt, ## args) +#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) +#define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) + +#endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c new file mode 100644 index 0000000000..79e4aadcab --- /dev/null +++ b/drivers/dma/idxd/idxd_pci.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <rte_bus_pci.h> + +#include "idxd_internal.h" + +#define IDXD_VENDOR_ID 0x8086 +#define IDXD_DEVICE_ID_SPR 0x0B25 + +#define IDXD_PMD_DMADEV_NAME_PCI dmadev_idxd_pci + +const struct rte_pci_id pci_id_idxd_map[] = { + { RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static int +idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + int ret = 0; + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + dev->device.driver = &drv->driver; + + return ret; +} + +static int +idxd_dmadev_remove_pci(struct rte_pci_device *dev) +{ + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + + IDXD_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + + return 0; +} + +struct rte_pci_driver idxd_pmd_drv_pci = { + .id_table = pci_id_idxd_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = idxd_dmadev_probe_pci, + .remove = idxd_dmadev_remove_pci, +}; + +RTE_PMD_REGISTER_PCI(IDXD_PMD_DMADEV_NAME_PCI, idxd_pmd_drv_pci); +RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_DMADEV_NAME_PCI, pci_id_idxd_map); +RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_DMADEV_NAME_PCI, "vfio-pci"); 
+RTE_PMD_REGISTER_PARAM_STRING(dmadev_idxd_pci, "max_queues=0"); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build new file mode 100644 index 0000000000..24c0f3106e --- /dev/null +++ b/drivers/dma/idxd/meson.build @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2021 Intel Corporation + +deps += ['bus_pci'] +sources = files( + 'idxd_pci.c' +) \ No newline at end of file diff --git a/drivers/dma/idxd/version.map b/drivers/dma/idxd/version.map new file mode 100644 index 0000000000..4a76d1d52d --- /dev/null +++ b/drivers/dma/idxd/version.map @@ -0,0 +1,3 @@ +DPDK_21 { + local: *; +}; diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build index 0c2c34cd00..0b01b6a8ab 100644 --- a/drivers/dma/meson.build +++ b/drivers/dma/meson.build @@ -6,6 +6,7 @@ if is_windows endif drivers = [ + 'idxd', 'skeleton', ] std_deps = ['dmadev'] -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v2 04/16] dma/idxd: add bus device probing 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (2 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 03/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 05/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz ` (12 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probing for DSA devices bound to the IDXD kernel driver. These devices can be configured via sysfs and made available to DPDK if they are found during bus scan. Relevant documentation is included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 +++++++ drivers/dma/idxd/idxd_bus.c | 351 +++++++++++++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 3 files changed, 416 insertions(+) create mode 100644 drivers/dma/idxd/idxd_bus.c diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 924700d17e..ce33e2857a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -32,6 +32,56 @@ target platform is x86-based. No additional compilation steps are necessary. Device Setup ------------- +Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers, +such as ``vfio-pci``. Both are supported by the IDXD PMD. + +Intel\ |reg| DSA devices using IDXD kernel driver +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured. +The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration. + +.. 
note:: + The device configuration can also be done by directly interacting with the sysfs nodes. + An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py`` + included in the driver source directory. + +There are some mandatory configuration steps before being able to use a device with an application. +The internal engines, which do the copies or other operations, +and the work-queues, which are used by applications to assign work to the device, +need to be assigned to groups, and the various other configuration options, +such as priority or queue depth, need to be set for each queue. + +To assign an engine to a group:: + + $ accel-config config-engine dsa0/engine0.0 --group-id=0 + $ accel-config config-engine dsa0/engine0.1 --group-id=1 + +To assign work queues to groups for passing descriptors to the engines a similar accel-config command can be used. +However, the work queues also need to be configured depending on the use case. +Some configuration options include: + +* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously. +* priority: WQ priority between 1 and 15. Larger value means higher priority. +* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device. +* type: WQ type (kernel/mdev/user). Determines how the device is presented. +* name: identifier given to the WQ.
+ +Example configuration for a work queue:: + + $ accel-config config-wq dsa0/wq0.0 --group-id=0 \ + --mode=dedicated --priority=10 --wq-size=8 \ + --type=user --name=dpdk_app1 + +Once the devices have been configured, they need to be enabled:: + + $ accel-config enable-device dsa0 + $ accel-config enable-wq dsa0/wq0.0 + +Check the device configuration:: + + $ accel-config list + Devices using VFIO/UIO drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -56,3 +106,17 @@ If fewer workqueues are required, then the ``max_queues`` parameter may be passe the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: $ dpdk-test -a <b:d:f>,max_queues=4 + +For devices bound to the IDXD kernel driver, +the DPDK IDXD driver will automatically perform a scan for available workqueues +to use. Any workqueues found listed in ``/dev/dsa`` on the system will be checked +in ``/sys``, and any which have ``dpdk_`` prefix in their name will be automatically +probed by the driver to make them available to the application. +Alternatively, to support use by multiple DPDK processes simultaneously, +the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue +name prefix, instead of ``dpdk_``, allowing each DPDK application instance to only +use a subset of configured queues. + +Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, +that is a "DMA device type" inside DPDK, and can be accessed using APIs from the +``rte_dmadev`` library. 
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c new file mode 100644 index 0000000000..c08f0f473b --- /dev/null +++ b/drivers/dma/idxd/idxd_bus.c @@ -0,0 +1,351 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <dirent.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/mman.h> + +#include <rte_bus.h> +#include <rte_log.h> +#include <rte_dmadev_pmd.h> +#include <rte_string_fns.h> + +#include "idxd_internal.h" + +/* default value for DSA paths, but allow override in environment for testing */ +#define DSA_DEV_PATH "/dev/dsa" +#define DSA_SYSFS_PATH "/sys/bus/dsa/devices" + +static unsigned int devcount; + +/** unique identifier for a DSA device/WQ instance */ +struct dsa_wq_addr { + uint16_t device_id; + uint16_t wq_id; +}; + +/** a DSA device instance */ +struct rte_dsa_device { + struct rte_device device; /**< Inherit core device */ + TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */ + + char wq_name[32]; /**< the workqueue name/number e.g. 
wq0.1 */ + struct dsa_wq_addr addr; /**< Identifies the specific WQ */ +}; + +/* forward prototypes */ +struct dsa_bus; +static int dsa_scan(void); +static int dsa_probe(void); +static struct rte_device *dsa_find_device(const struct rte_device *start, + rte_dev_cmp_t cmp, const void *data); +static enum rte_iova_mode dsa_get_iommu_class(void); +static int dsa_addr_parse(const char *name, void *addr); + +/** List of devices */ +TAILQ_HEAD(dsa_device_list, rte_dsa_device); + +/** + * Structure describing the DSA bus + */ +struct dsa_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_driver driver; /**< Driver struct for devices to point to */ + struct dsa_device_list device_list; /**< List of PCI devices */ +}; + +struct dsa_bus dsa_bus = { + .bus = { + .scan = dsa_scan, + .probe = dsa_probe, + .find_device = dsa_find_device, + .get_iommu_class = dsa_get_iommu_class, + .parse = dsa_addr_parse, + }, + .driver = { + .name = "dmadev_idxd" + }, + .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list), +}; + +static inline const char * +dsa_get_dev_path(void) +{ + const char *path = getenv("DSA_DEV_PATH"); + return path ? path : DSA_DEV_PATH; +} + +static inline const char * +dsa_get_sysfs_path(void) +{ + const char *path = getenv("DSA_SYSFS_PATH"); + return path ? 
path : DSA_SYSFS_PATH; +} + +static void * +idxd_vdev_mmap_wq(struct rte_dsa_device *dev) +{ + void *addr; + char path[PATH_MAX]; + int fd; + + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); + fd = open(path, O_RDWR); + if (fd < 0) { + IDXD_PMD_ERR("Failed to open device path: %s", path); + return NULL; + } + + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + IDXD_PMD_ERR("Failed to mmap device %s", path); + return NULL; + } + + return addr; +} + +static int +read_wq_string(struct rte_dsa_device *dev, const char *filename, + char *value, size_t valuelen) +{ + char sysfs_node[PATH_MAX]; + int len; + int fd; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + fd = open(sysfs_node, O_RDONLY); + if (fd < 0) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + len = read(fd, value, valuelen - 1); + close(fd); + if (len < 0) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + value[len] = '\0'; + return 0; +} + +static int +read_wq_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +read_device_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s", + 
dsa_get_sysfs_path(), dev->addr.device_id, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +idxd_probe_dsa(struct rte_dsa_device *dev) +{ + struct idxd_dmadev idxd = {0}; + int ret = 0; + + IDXD_PMD_INFO("Probing device %s on numa node %d", + dev->wq_name, dev->device.numa_node); + if (read_wq_int(dev, "size", &ret) < 0) + return -1; + idxd.max_batches = ret; + if (read_wq_int(dev, "max_batch_size", &ret) < 0) + return -1; + idxd.max_batch_size = ret; + idxd.qid = dev->addr.wq_id; + idxd.u.vdev.dsa_id = dev->addr.device_id; + idxd.sva_support = 1; + + idxd.portal = idxd_vdev_mmap_wq(dev); + if (idxd.portal == NULL) { + IDXD_PMD_ERR("WQ mmap failed"); + return -ENOENT; + } + + return 0; +} + +static int +is_for_this_process_use(const char *name) +{ + char *runtime_dir = strdup(rte_eal_get_runtime_dir()); + char *prefix = basename(runtime_dir); + int prefixlen = strlen(prefix); + int retval = 0; + + if (strncmp(name, "dpdk_", 5) == 0) + retval = 1; + if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_') + retval = 1; + + free(runtime_dir); + return retval; +} + +static int +dsa_probe(void) +{ + struct rte_dsa_device *dev; + + TAILQ_FOREACH(dev, &dsa_bus.device_list, next) { + char type[64], name[64]; + + if (read_wq_string(dev, "type", type, sizeof(type)) < 0 || + read_wq_string(dev, "name", name, sizeof(name)) < 0) + continue; + + if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) { + dev->device.driver = &dsa_bus.driver; + idxd_probe_dsa(dev); + continue; + } + IDXD_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name); + } + + return 0; +} + +static int +dsa_scan(void) +{ + const char *path = dsa_get_dev_path(); + 
struct dirent *wq; + DIR *dev_dir; + + dev_dir = opendir(path); + if (dev_dir == NULL) { + if (errno == ENOENT) + return 0; /* no bus, return without error */ + IDXD_PMD_ERR("%s(): opendir '%s' failed: %s", + __func__, path, strerror(errno)); + return -1; + } + + while ((wq = readdir(dev_dir)) != NULL) { + struct rte_dsa_device *dev; + int numa_node = -1; + + if (strncmp(wq->d_name, "wq", 2) != 0) + continue; + if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) { + IDXD_PMD_ERR("%s(): wq name too long: '%s', skipping", + __func__, wq->d_name); + continue; + } + IDXD_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name); + + dev = malloc(sizeof(*dev)); + if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) { + IDXD_PMD_ERR("Error parsing WQ name: %s", wq->d_name); + free(dev); + continue; + } + dev->device.bus = &dsa_bus.bus; + strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name)); + TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next); + devcount++; + + read_device_int(dev, "numa_node", &numa_node); + dev->device.numa_node = numa_node; + dev->device.name = dev->wq_name; + } + + closedir(dev_dir); + return 0; +} + +static struct rte_device * +dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list); + + /* the rte_device struct must be at start of dsa structure */ + RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0); + + if (start != NULL) /* jump to start point if given */ + dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next); + while (dev != NULL) { + if (cmp(&dev->device, data) == 0) + return &dev->device; + dev = TAILQ_NEXT(dev, next); + } + return NULL; +} + +static enum rte_iova_mode +dsa_get_iommu_class(void) +{ + /* if there are no devices, report don't care, otherwise VA mode */ + return devcount > 0 ? 
RTE_IOVA_VA : RTE_IOVA_DC; +} + +static int +dsa_addr_parse(const char *name, void *addr) +{ + struct dsa_wq_addr *wq = addr; + unsigned int device_id, wq_id; + + if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) { + IDXD_PMD_DEBUG("Parsing WQ name failed: %s", name); + return -1; + } + + wq->device_id = device_id; + wq->wq_id = wq_id; + return 0; +} + +RTE_REGISTER_BUS(dsa, dsa_bus.bus); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 24c0f3106e..f1fea000a7 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -3,5 +3,6 @@ deps += ['bus_pci'] sources = files( + 'idxd_bus.c', 'idxd_pci.c' ) \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
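The bus scan in the patch above relies on two small string checks: parsing the sysfs directory name ("wq<device>.<queue>") and deciding whether a work queue's configured name marks it as belonging to this process. The following is a minimal standalone sketch of both, not the driver code itself. The names struct wq_addr, parse_wq_name() and wq_name_is_ours() are hypothetical, and the file prefix is passed in as a plain argument here instead of being derived from rte_eal_get_runtime_dir().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the driver's struct dsa_wq_addr. */
struct wq_addr {
	unsigned int device_id;
	unsigned int wq_id;
};

/*
 * Parse a sysfs WQ directory name of the form "wq<device>.<queue>".
 * Returns 0 on success, -1 if the name does not match that pattern.
 */
static int
parse_wq_name(const char *name, struct wq_addr *addr)
{
	unsigned int device_id, wq_id;

	assert(name != NULL && addr != NULL);
	if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2)
		return -1;

	addr->device_id = device_id;
	addr->wq_id = wq_id;
	return 0;
}

/*
 * Return 1 if the WQ name starts with "dpdk_" or with "<prefix>_",
 * otherwise 0. In this sketch the prefix is supplied by the caller
 * (assumption: it plays the role of the EAL file prefix).
 */
static int
wq_name_is_ours(const char *name, const char *prefix)
{
	size_t prefixlen = strlen(prefix);

	if (strncmp(name, "dpdk_", 5) == 0)
		return 1;
	/* name[prefixlen] is only read once the first prefixlen bytes matched */
	if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_')
		return 1;
	return 0;
}
```

With these helpers, a directory entry such as "wq3.12" parses to device 3, queue 12, while a queue named "myappwq0" is rejected for the prefix "myapp" because the underscore separator is missing.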
* [dpdk-dev] [PATCH v2 05/16] dma/idxd: create dmadev instances on bus probe 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (3 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 04/16] dma/idxd: add bus device probing Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 06/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz ` (11 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the bus scan/probe, create a dmadev instance for each HW queue. Internal structures required for device creation are also added. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_bus.c | 20 ++++++++- drivers/dma/idxd/idxd_common.c | 75 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 40 +++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 4 files changed, 135 insertions(+), 1 deletion(-) create mode 100644 drivers/dma/idxd/idxd_common.c diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index c08f0f473b..0f33500dfc 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -84,6 +84,18 @@ dsa_get_sysfs_path(void) return path ? 
path : DSA_SYSFS_PATH; } +static int +idxd_dev_close(struct rte_dmadev *dev) +{ + struct idxd_dmadev *idxd = dev->data->dev_private; + munmap(idxd->portal, 0x1000); + return 0; +} + +static const struct rte_dmadev_ops idxd_vdev_ops = { + .dev_close = idxd_dev_close, +}; + static void * idxd_vdev_mmap_wq(struct rte_dsa_device *dev) { @@ -205,7 +217,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -1; idxd.max_batch_size = ret; idxd.qid = dev->addr.wq_id; - idxd.u.vdev.dsa_id = dev->addr.device_id; + idxd.u.bus.dsa_id = dev->addr.device_id; idxd.sva_support = 1; idxd.portal = idxd_vdev_mmap_wq(dev); @@ -214,6 +226,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -ENOENT; } + ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_vdev_ops); + if (ret) { + IDXD_PMD_ERR("Failed to create dmadev %s", dev->wq_name); + return ret; + } + return 0; } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c new file mode 100644 index 0000000000..7770b2e264 --- /dev/null +++ b/drivers/dma/idxd/idxd_common.c @@ -0,0 +1,75 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> +#include <rte_common.h> + +#include "idxd_internal.h" + +#define IDXD_PMD_NAME_STR "dmadev_idxd" + +int +idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, + const struct rte_dmadev_ops *ops) +{ + struct idxd_dmadev *idxd; + struct rte_dmadev *dmadev = NULL; + int ret = 0; + + if (!name) { + IDXD_PMD_ERR("Invalid name of the device!"); + ret = -EINVAL; + goto cleanup; + } + + /* Allocate device structure */ + dmadev = rte_dmadev_pmd_allocate(name); + if (dmadev == NULL) { + IDXD_PMD_ERR("Unable to allocate dmadev"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->dev_ops = ops; + dmadev->device = dev; + + idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); + if (idxd == NULL) { +
IDXD_PMD_ERR("Unable to allocate memory for device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->data->dev_private = idxd; + dmadev->dev_private = idxd; + *idxd = *base_idxd; /* copy over the main fields already passed in */ + idxd->dmadev = dmadev; + + /* allocate batch index ring and completion ring. + * The +1 is because we can never fully use + * the ring, otherwise read == write means both full and empty. + */ + idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) + + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1), + sizeof(idxd->batch_comp_ring[0])); + if (idxd->batch_comp_ring == NULL) { + IDXD_PMD_ERR("Unable to reserve memory for batch data\n"); + ret = -ENOMEM; + goto cleanup; + } + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1]; + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring); + + return 0; + +cleanup: + if (dmadev) + rte_dmadev_pmd_release(dmadev); + + return ret; +} + +int idxd_pmd_logtype; + +RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index c6a7dcd72f..99ab2df925 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -24,4 +24,44 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) 
IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_dmadev { + /* counters to track the batches */ + unsigned short max_batches; + unsigned short batch_idx_read; + unsigned short batch_idx_write; + + /* track descriptors and handles */ + unsigned short desc_ring_mask; + unsigned short ids_avail; /* handles for ops completed */ + unsigned short ids_returned; /* the read pointer for hdls/desc rings */ + unsigned short batch_start; /* start+size == write pointer for hdls/desc */ + unsigned short batch_size; + + void *portal; /* address to write the batch descriptor */ + + struct idxd_completion *batch_comp_ring; + unsigned short *batch_idx_ring; /* store where each batch ends */ + + struct rte_dmadev_stats stats; + + rte_iova_t batch_iova; /* base address of the batch comp ring */ + rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ + + unsigned short max_batch_size; + + struct rte_dmadev *dmadev; + struct rte_dmadev_vchan_conf qcfg; + uint8_t sva_support; + uint8_t qid; + + union { + struct { + unsigned int dsa_id; + } bus; + } u; +}; + +int idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, const struct rte_dmadev_ops *ops); + #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index f1fea000a7..81150e6f25 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -4,5 +4,6 @@ deps += ['bus_pci'] sources = files( 'idxd_bus.c', + 'idxd_common.c', 'idxd_pci.c' ) \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v2 06/16] dma/idxd: create dmadev instances on pci probe 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (4 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 05/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 07/16] dma/idxd: add datapath structures Kevin Laatz ` (10 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the PCI probe, create a dmadev instance for each HW queue. HW definitions required are also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_hw_defs.h | 71 ++++++++ drivers/dma/idxd/idxd_internal.h | 14 ++ drivers/dma/idxd/idxd_pci.c | 272 ++++++++++++++++++++++++++++++- 3 files changed, 354 insertions(+), 3 deletions(-) create mode 100644 drivers/dma/idxd/idxd_hw_defs.h diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h new file mode 100644 index 0000000000..ea627cba6d --- /dev/null +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_HW_DEFS_H_ +#define _IDXD_HW_DEFS_H_ + +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ + +#define IDXD_CMD_SHIFT 20 +enum rte_idxd_cmds { + idxd_enable_dev = 1, + idxd_disable_dev, + idxd_drain_all, + idxd_abort_all, + idxd_reset_device, + idxd_enable_wq, + idxd_disable_wq, + idxd_drain_wq, + idxd_abort_wq, + idxd_reset_wq, +}; + +/* General bar0 registers */ +struct rte_idxd_bar0 { + uint32_t __rte_cache_aligned version; /* offset 0x00 */ + uint64_t __rte_aligned(0x10) gencap; /* offset 0x10 */ + uint64_t 
__rte_aligned(0x10) wqcap; /* offset 0x20 */ + uint64_t __rte_aligned(0x10) grpcap; /* offset 0x30 */ + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ +}; + +/* workqueue config is provided by array of uint32_t. */ +enum rte_idxd_wqcfg { + wq_size_idx, /* size is in first 32-bit value */ + wq_threshold_idx, /* WQ threshold second 32-bits */ + wq_mode_idx, /* WQ mode and other flags */ + wq_sizes_idx, /* WQ transfer and batch sizes */ + wq_occ_int_idx, /* WQ occupancy interrupt handle */ + wq_occ_limit_idx, /* WQ occupancy limit */ + wq_state_idx, /* WQ state and occupancy state */ +}; + +#define WQ_MODE_SHARED 0 +#define WQ_MODE_DEDICATED 1 +#define WQ_PRIORITY_SHIFT 4 +#define WQ_BATCH_SZ_SHIFT 5 +#define WQ_STATE_SHIFT 30 +#define WQ_STATE_MASK 0x3 + +struct rte_idxd_grpcfg { + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ + uint64_t grpengcfg; /* offset 32 */ + uint32_t grpflags; /* offset 40 */ +}; + +#define GENSTS_DEV_STATE_MASK 0x03 +#define CMDSTATUS_ACTIVE_SHIFT 31 +#define CMDSTATUS_ACTIVE_MASK (1 << 31) +#define CMDSTATUS_ERR_MASK 0xFF + +#endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 99ab2df925..85c400c9ec 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -5,6 +5,8 @@ #ifndef _IDXD_INTERNAL_H_ #define _IDXD_INTERNAL_H_ +#include "idxd_hw_defs.h" + /** * @file idxd_internal.h * @@ -24,6 +26,16 @@ extern int idxd_pmd_logtype; 
#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_pci_common { + rte_spinlock_t lk; + + uint8_t wq_cfg_sz; + volatile struct rte_idxd_bar0 *regs; + volatile uint32_t *wq_regs_base; + volatile struct rte_idxd_grpcfg *grp_regs; + volatile void *portals; +}; + struct idxd_dmadev { /* counters to track the batches */ unsigned short max_batches; @@ -58,6 +70,8 @@ struct idxd_dmadev { struct { unsigned int dsa_id; } bus; + + struct idxd_pci_common *pci; } u; }; diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 79e4aadcab..318931713c 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -3,6 +3,9 @@ */ #include <rte_bus_pci.h> +#include <rte_devargs.h> +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> #include "idxd_internal.h" @@ -16,17 +19,280 @@ const struct rte_pci_id pci_id_idxd_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static inline int +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) +{ + uint8_t err_code; + uint16_t qid = idxd->qid; + int i = 0; + + if (command >= idxd_disable_wq && command <= idxd_reset_wq) + qid = (1 << qid); + rte_spinlock_lock(&idxd->u.pci->lk); + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; + + do { + rte_pause(); + err_code = idxd->u.pci->regs->cmdstatus; + if (++i >= 1000) { + IDXD_PMD_ERR("Timeout waiting for command response from HW"); + rte_spinlock_unlock(&idxd->u.pci->lk); + return err_code; + } + } while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK); + rte_spinlock_unlock(&idxd->u.pci->lk); + + return err_code & CMDSTATUS_ERR_MASK; +} + +static uint32_t * +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) +{ + return RTE_PTR_ADD(pci->wq_regs_base, + (uintptr_t)wq_idx << (5 + pci->wq_cfg_sz)); +} + +static int +idxd_is_wq_enabled(struct idxd_dmadev *idxd) +{ + uint32_t state = idxd_get_wq_cfg(idxd->u.pci, 
idxd->qid)[wq_state_idx]; + return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; +} + +static const struct rte_dmadev_ops idxd_pci_ops = { + +}; + +/* each portal uses 4 x 4k pages */ +#define IDXD_PORTAL_SIZE (4096 * 4) + +static int +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, + unsigned int max_queues) +{ + struct idxd_pci_common *pci; + uint8_t nb_groups, nb_engines, nb_wqs; + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ + uint16_t wq_size, total_wq_size; + uint8_t lg2_max_batch, lg2_max_copy_size; + unsigned int i, err_code; + + pci = malloc(sizeof(*pci)); + if (pci == NULL) { + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); + goto err; + } + rte_spinlock_init(&pci->lk); + + /* assign the bar registers, and then configure device */ + pci->regs = dev->mem_resource[0].addr; + grp_offset = (uint16_t)pci->regs->offsets[0]; + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); + pci->portals = dev->mem_resource[2].addr; + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; + + /* sanity check device status */ + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { + /* need function-level-reset (FLR) or is enabled */ + IDXD_PMD_ERR("Device status is not disabled, cannot init"); + goto err; + } + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { + /* command in progress */ + IDXD_PMD_ERR("Device has a command in progress, cannot init"); + goto err; + } + + /* read basic info about the hardware for use when configuring */ + nb_groups = (uint8_t)pci->regs->grpcap; + nb_engines = (uint8_t)pci->regs->engcap; + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); + total_wq_size = (uint16_t)pci->regs->wqcap; + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F; + lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F; + + IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u", 
+ nb_groups, nb_engines, nb_wqs); + + /* zero out any old config */ + for (i = 0; i < nb_groups; i++) { + pci->grp_regs[i].grpengcfg = 0; + pci->grp_regs[i].grpwqcfg[0] = 0; + } + for (i = 0; i < nb_wqs; i++) + idxd_get_wq_cfg(pci, i)[0] = 0; + + /* limit queues if necessary */ + if (max_queues != 0 && nb_wqs > max_queues) { + nb_wqs = max_queues; + if (nb_engines > max_queues) + nb_engines = max_queues; + if (nb_groups > max_queues) + nb_groups = max_queues; + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); + } + + /* put each engine into a separate group to avoid reordering */ + if (nb_groups > nb_engines) + nb_groups = nb_engines; + if (nb_groups < nb_engines) + nb_engines = nb_groups; + + /* assign engines to groups, round-robin style */ + for (i = 0; i < nb_engines; i++) { + IDXD_PMD_DEBUG("Assigning engine %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); + } + + /* now do the same for queues and give work slots to each queue */ + wq_size = total_wq_size / nb_wqs; + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", + wq_size, lg2_max_batch, lg2_max_copy_size); + for (i = 0; i < nb_wqs; i++) { + /* add work queue "i" to a group */ + IDXD_PMD_DEBUG("Assigning work queue %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); + /* now configure it, in terms of size, max batch, mode */ + idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size; + idxd_get_wq_cfg(pci, i)[wq_mode_idx] = (1 << WQ_PRIORITY_SHIFT) | + WQ_MODE_DEDICATED; + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); + } + + /* dump the group configuration to output */ + for (i = 0; i < nb_groups; i++) { + IDXD_PMD_DEBUG("## Group %d", i); + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); + IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg); + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags); + } + +
idxd->u.pci = pci; + idxd->max_batches = wq_size; + + /* enable the device itself */ + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); + if (err_code) { + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device enabled OK"); + + return nb_wqs; + +err: + free(pci); + return -1; +} + static int idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) { - int ret = 0; + struct idxd_dmadev idxd = {0}; + uint8_t nb_wqs; + int qid, ret = 0; char name[PCI_PRI_STR_SIZE]; + unsigned int max_queues = 0; rte_pci_device_name(&dev->addr, name, sizeof(name)); IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); dev->device.driver = &drv->driver; - return ret; + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { + /* if the number of devargs grows beyond just 1, use rte_kvargs */ + if (sscanf(dev->device.devargs->args, + "max_queues=%u", &max_queues) != 1) { + IDXD_PMD_ERR("Invalid device parameter: '%s'", + dev->device.devargs->args); + return -1; + } + } + + ret = init_pci_device(dev, &idxd, max_queues); + if (ret < 0) { + IDXD_PMD_ERR("Error initializing PCI hardware"); + return ret; + } + if (idxd.u.pci->portals == NULL) { + IDXD_PMD_ERR("Error, invalid portal assigned during initialization\n"); + return -EINVAL; + } + nb_wqs = (uint8_t)ret; + + /* set up one device for each queue */ + for (qid = 0; qid < nb_wqs; qid++) { + char qname[32]; + + /* add the queue number to each device name */ + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); + idxd.qid = qid; + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, + qid * IDXD_PORTAL_SIZE); + if (idxd_is_wq_enabled(&idxd)) + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); + ret = idxd_dmadev_create(qname, &dev->device, + &idxd, &idxd_pci_ops); + if (ret != 0) { + IDXD_PMD_ERR("Failed to create dmadev %s", name); + if (qid == 0) /* if no devices using this, free pci */ + free(idxd.u.pci); + return ret; + } + } + + 
return 0; +} + +static int +idxd_dmadev_destroy(const char *name) +{ + int ret; + uint8_t err_code; + struct rte_dmadev *rdev; + struct idxd_dmadev *idxd; + + if (!name) { + IDXD_PMD_ERR("Invalid device name"); + return -EINVAL; + } + + rdev = rte_dmadev_get_device_by_name(name); + if (!rdev) { + IDXD_PMD_ERR("Invalid device name (%s)", name); + return -EINVAL; + } + + idxd = rdev->dev_private; + if (!idxd) { + IDXD_PMD_ERR("Error getting dev_private"); + return -EINVAL; + } + + /* disable the device */ + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); + if (err_code) { + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device disabled OK"); + + /* free device memory */ + IDXD_PMD_DEBUG("Freeing device driver memory"); + rdev->dev_private = NULL; + rte_free(idxd->batch_idx_ring); + rte_free(idxd->desc_ring); + + /* rte_dmadev_close is called by pmd_release */ + ret = rte_dmadev_pmd_release(rdev); + if (ret) + IDXD_PMD_DEBUG("Device cleanup failed"); + + return 0; } static int @@ -39,7 +305,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev) IDXD_PMD_INFO("Closing %s on NUMA node %d", name, dev->device.numa_node); - return 0; + return idxd_dmadev_destroy(name); } struct rte_pci_driver idxd_pmd_drv_pci = { -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
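Two pieces of arithmetic in init_pci_device() above are easy to check in isolation: the round-robin placement of engines and work queues into groups ("i % nb_groups", recorded as one bit per engine or queue), and the byte offset of a work queue's config block ("wq_idx << (5 + wq_cfg_sz)"). The sketch below reproduces only that arithmetic on plain arrays, with no hardware access. MAX_GROUPS and the function names are hypothetical, chosen for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_GROUPS 8  /* hypothetical upper bound used only by this sketch */

/*
 * Compute per-group bitmasks the same way the patch fills GRPENGCFG and
 * GRPWQCFG[0]: engine/queue i lands in group (i % nb_groups), bit i set.
 */
static void
assign_round_robin(unsigned int nb_groups, unsigned int nb_engines,
		unsigned int nb_wqs,
		uint64_t eng_mask[MAX_GROUPS], uint64_t wq_mask[MAX_GROUPS])
{
	unsigned int i;

	assert(nb_groups > 0 && nb_groups <= MAX_GROUPS);
	assert(nb_engines <= 64 && nb_wqs <= 64);

	for (i = 0; i < MAX_GROUPS; i++) {
		eng_mask[i] = 0;
		wq_mask[i] = 0;
	}
	for (i = 0; i < nb_engines; i++)
		eng_mask[i % nb_groups] |= (1ULL << i);
	for (i = 0; i < nb_wqs; i++)
		wq_mask[i % nb_groups] |= (1ULL << i);
}

/*
 * Byte offset of the config block for work queue wq_idx, where each block
 * occupies 2^(5 + wq_cfg_sz) bytes, as in idxd_get_wq_cfg().
 */
static uintptr_t
wq_cfg_offset(uint8_t wq_idx, uint8_t wq_cfg_sz)
{
	return (uintptr_t)wq_idx << (5 + wq_cfg_sz);
}
```

For example, with 4 groups, 4 engines and 8 work queues, group 0 holds engine 0 and work queues 0 and 4, so its work queue mask is 0x11.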
* [dpdk-dev] [PATCH v2 07/16] dma/idxd: add datapath structures 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (5 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 06/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 08/16] dma/idxd: add configure and info_get functions Kevin Laatz ` (9 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data structures required for the data path for IDXD devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- v2: add completion status for invalid opcode --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 33 ++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 60 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 ++ drivers/dma/idxd/idxd_pci.c | 2 +- 5 files changed, 98 insertions(+), 1 deletion(-) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 0f33500dfc..dc11f829fd 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -94,6 +94,7 @@ idxd_dev_close(struct rte_dmadev *dev) static const struct rte_dmadev_ops idxd_vdev_ops = { .dev_close = idxd_dev_close, + .dev_dump = idxd_dump, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 7770b2e264..9490439fdc 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -10,6 +10,35 @@ #define IDXD_PMD_NAME_STR "dmadev_idxd" +int +idxd_dump(const struct rte_dmadev *dev, FILE *f) +{ + struct idxd_dmadev *idxd = dev->dev_private; + unsigned int i; + + fprintf(f, "== Private Data ==\n"); + fprintf(f, " Portal: %p\n", idxd->portal); + fprintf(f, " Config: { 
ring_size: %u }\n", + idxd->qcfg.nb_desc); + fprintf(f, " Batch ring (sz = %u, max_batches = %u):\n\t", + idxd->max_batches + 1, idxd->max_batches); + for (i = 0; i <= idxd->max_batches; i++) { + fprintf(f, " %u ", idxd->batch_idx_ring[i]); + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) + fprintf(f, "[rd ptr, wr ptr] "); + else if (i == idxd->batch_idx_read) + fprintf(f, "[rd ptr] "); + else if (i == idxd->batch_idx_write) + fprintf(f, "[wr ptr] "); + if (i == idxd->max_batches) + fprintf(f, "\n"); + } + + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, @@ -19,6 +48,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, struct rte_dmadev *dmadev = NULL; int ret = 0; + RTE_BUILD_BUG_ON(sizeof(struct idxd_hw_desc) != 64); + RTE_BUILD_BUG_ON(offsetof(struct idxd_hw_desc, size) != 32); + RTE_BUILD_BUG_ON(sizeof(struct idxd_completion) != 32); + if (!name) { IDXD_PMD_ERR("Invalid name of the device!"); ret = -EINVAL; diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index ea627cba6d..55ca9f7f52 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -5,6 +5,66 @@ #ifndef _IDXD_HW_DEFS_H_ #define _IDXD_HW_DEFS_H_ +/* + * Defines used in the data path for interacting with IDXD hardware. + */ +#define IDXD_CMD_OP_SHIFT 24 +enum rte_idxd_ops { + idxd_op_nop = 0, + idxd_op_batch, + idxd_op_drain, + idxd_op_memmove, + idxd_op_fill +}; + +#define IDXD_FLAG_FENCE (1 << 0) +#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2) +#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3) +#define IDXD_FLAG_CACHE_CONTROL (1 << 8) + +/** + * Hardware descriptor used by DSA hardware, for both bursts and + * for individual operations. 
+ */ +struct idxd_hw_desc { + uint32_t pasid; + uint32_t op_flags; + rte_iova_t completion; + + RTE_STD_C11 + union { + rte_iova_t src; /* source address for copy ops etc. */ + rte_iova_t desc_addr; /* descriptor pointer for batch */ + }; + rte_iova_t dst; + + uint32_t size; /* length of data for op, or batch size */ + + uint16_t intr_handle; /* completion interrupt handle */ + + /* remaining 26 bytes are reserved */ + uint16_t __reserved[13]; +} __rte_aligned(64); + +#define IDXD_COMP_STATUS_INCOMPLETE 0 +#define IDXD_COMP_STATUS_SUCCESS 1 +#define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 +#define IDXD_COMP_STATUS_INVALID_SIZE 0x13 +#define IDXD_COMP_STATUS_SKIPPED 0xFF /* not official IDXD error, needed as placeholder */ + +/** + * Completion record structure written back by DSA + */ +struct idxd_completion { + uint8_t status; + uint8_t result; + /* 16-bits pad here */ + uint32_t completed_size; /* data length, or descriptors for batch */ + + rte_iova_t fault_address; + uint32_t invalid_flags; +} __rte_aligned(32); + /*** Definitions for Intel(R) Data Streaming Accelerator ***/ #define IDXD_CMD_SHIFT 20 diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 85c400c9ec..09285b4e96 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -37,6 +37,8 @@ struct idxd_pci_common { }; struct idxd_dmadev { + struct idxd_hw_desc *desc_ring; + /* counters to track the batches */ unsigned short max_batches; unsigned short batch_idx_read; @@ -77,5 +79,6 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dmadev_ops *ops); +int idxd_dump(const struct rte_dmadev *dev, FILE *f); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 318931713c..96d58c8544 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -60,7 +60,7 @@ idxd_is_wq_enabled(struct 
idxd_dmadev *idxd) } static const struct rte_dmadev_ops idxd_pci_ops = { - + .dev_dump = idxd_dump, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
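The RTE_BUILD_BUG_ON() checks added in this patch pin down the hardware layout: a 64-byte descriptor with the size field at offset 32, and a 32-byte completion record. The sketch below restates those layouts without DPDK headers so the sizes can be verified with a plain C11 compiler. It assumes rte_iova_t is 64 bits wide and uses alignas in place of __rte_aligned; the struct and function names are hypothetical. The make_op_flags() helper shows one plausible way to combine the opcode with IDXD_CMD_OP_SHIFT and the flag bits; that packing is an assumption, since this patch defines the constants but does not yet use them.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t iova_t;  /* assumption: stands in for a 64-bit rte_iova_t */

#define CMD_OP_SHIFT 24   /* same value as IDXD_CMD_OP_SHIFT in the patch */

/* Mirror of struct idxd_hw_desc, using only standard C11 types. */
struct hw_desc {
	alignas(64) uint32_t pasid;
	uint32_t op_flags;
	iova_t completion;
	union {
		iova_t src;        /* source address for copy ops */
		iova_t desc_addr;  /* descriptor pointer for batch */
	};
	iova_t dst;
	uint32_t size;             /* data length, or batch size */
	uint16_t intr_handle;
	uint16_t reserved[13];     /* remaining 26 bytes */
};

/* Mirror of struct idxd_completion. */
struct completion_rec {
	alignas(32) uint8_t status;
	uint8_t result;
	/* 16 bits of padding inserted by the compiler here */
	uint32_t completed_size;
	iova_t fault_address;
	uint32_t invalid_flags;
};

/* Compile-time layout checks, equivalent in spirit to RTE_BUILD_BUG_ON(). */
static_assert(sizeof(struct hw_desc) == 64, "descriptor must be 64 bytes");
static_assert(offsetof(struct hw_desc, size) == 32, "size must sit at 32");
static_assert(sizeof(struct completion_rec) == 32, "completion is 32 bytes");

/* Assumed packing: opcode in the top byte, flag bits in the low bits. */
static uint32_t
make_op_flags(uint32_t opcode, uint32_t flags)
{
	assert(opcode <= 0xFF);
	return (opcode << CMD_OP_SHIFT) | flags;
}
```

If a field were added or reordered so that the descriptor no longer matched the hardware layout, the static_assert lines would fail at compile time instead of producing corrupt descriptors at run time.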
* [dpdk-dev] [PATCH v2 08/16] dma/idxd: add configure and info_get functions 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (6 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 07/16] dma/idxd: add datapath structures Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 09/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz ` (8 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add functions for device configuration. The info_get function is included here since it can be useful for checking successful configuration. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- v2: - fix reconfigure bug in idxd_vchan_setup() - add literal include comment for the docs to pick up --- app/test/test_dmadev.c | 3 +- doc/guides/dmadevs/idxd.rst | 33 ++++++++++++++++ drivers/dma/idxd/idxd_bus.c | 3 ++ drivers/dma/idxd/idxd_common.c | 66 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 +++ drivers/dma/idxd/idxd_pci.c | 3 ++ 6 files changed, 112 insertions(+), 1 deletion(-) diff --git a/app/test/test_dmadev.c b/app/test/test_dmadev.c index c44c3ad9db..d287d480b4 100644 --- a/app/test/test_dmadev.c +++ b/app/test/test_dmadev.c @@ -770,6 +770,7 @@ static int test_dmadev_instance(uint16_t dev_id) { #define TEST_RINGSIZE 512 + /* Setup of the dmadev device. 8< */ struct rte_dmadev_stats stats; struct rte_dmadev_info info; const struct rte_dmadev_conf conf = { .nb_vchans = 1}; @@ -795,7 +796,7 @@ test_dmadev_instance(uint16_t dev_id) PRINT_ERR("Error with queue configuration\n"); return -1; } - + /* >8 End of setup of the dmadev device. 
*/ rte_dmadev_info_get(dev_id, &info); if (info.nb_vchans != 1) { PRINT_ERR("Error, no configured queues reported on device id %u\n", dev_id); diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index ce33e2857a..66bc9fe744 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -120,3 +120,36 @@ use a subset of configured queues. Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, that is a "DMA device type" inside DPDK, and can be accessed using APIs from the ``rte_dmadev`` library. + +Using IDXD DMAdev Devices +-------------------------- + +Applications use these devices through the dmadev API. + +Getting Device Information +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Basic information about each dmadev device can be queried using the +``rte_dmadev_info_get()`` API. It returns information such as +the ``rte_device`` structure, device capabilities and other device-specific values. + +Device Configuration +~~~~~~~~~~~~~~~~~~~~~ + +Configuring an IDXD dmadev device is done using the ``rte_dmadev_configure()`` and +``rte_dmadev_vchan_setup()`` APIs. The configurations are passed to these APIs using +the ``rte_dmadev_conf`` and ``rte_dmadev_vchan_conf`` structures, respectively. For +example, these can be used to configure the number of ``vchans`` per device and the +ring size. The ring size must be a power of two, between 64 and 4096. + +The following code shows how the device is configured in +``test_dmadev.c``: + +.. literalinclude:: ../../../app/test/test_dmadev.c + :language: c + :start-after: Setup of the dmadev device. 8< + :end-before: >8 End of setup of the dmadev device. + :dedent: 1 + +Once configured, the device can then be made ready for use by calling the +``rte_dmadev_start()`` API.
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index dc11f829fd..ad4f076bad 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -95,6 +95,9 @@ idxd_dev_close(struct rte_dmadev *dev) static const struct rte_dmadev_ops idxd_vdev_ops = { .dev_close = idxd_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 9490439fdc..6704a26357 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -39,6 +39,72 @@ idxd_dump(const struct rte_dmadev *dev, FILE *f) return 0; } +int +idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *info, uint32_t size) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + if (size < sizeof(*info)) + return -EINVAL; + + *info = (struct rte_dmadev_info) { + .device = dev->device, + .dev_capa = RTE_DMADEV_CAPA_MEM_TO_MEM | + RTE_DMADEV_CAPA_OPS_COPY | RTE_DMADEV_CAPA_OPS_FILL, + .max_vchans = 1, + .max_desc = 4096, + .min_desc = 64, + .nb_vchans = (idxd->desc_ring != NULL), /* returns 1 or 0 */ + }; + if (idxd->sva_support) + info->dev_capa |= RTE_DMADEV_CAPA_SVA; + return 0; +} + +int +idxd_configure(struct rte_dmadev *dev __rte_unused, const struct rte_dmadev_conf *dev_conf) +{ + if (dev_conf->nb_vchans != 1) + return -EINVAL; + return 0; +} + +int +idxd_vchan_setup(struct rte_dmadev *dev, uint16_t vchan __rte_unused, + const struct rte_dmadev_vchan_conf *qconf) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t max_desc = qconf->nb_desc; + + idxd->qcfg = *qconf; + + if (!rte_is_power_of_2(max_desc)) + max_desc = rte_align32pow2(max_desc); + IDXD_PMD_DEBUG("DMA dev %u using %u descriptors", dev->data->dev_id, max_desc); + idxd->desc_ring_mask = max_desc - 1; + idxd->qcfg.nb_desc = max_desc; + + /* in case we are reconfiguring a device, free any existing memory */ + 
rte_free(idxd->desc_ring); + + /* allocate the descriptor ring at 2x size as batches can't wrap */ + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); + if (idxd->desc_ring == NULL) + return -ENOMEM; + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); + + idxd->batch_idx_read = 0; + idxd->batch_idx_write = 0; + idxd->batch_start = 0; + idxd->batch_size = 0; + idxd->ids_returned = 0; + idxd->ids_avail = 0; + + memset(idxd->batch_comp_ring, 0, sizeof(*idxd->batch_comp_ring) * + (idxd->max_batches + 1)); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 09285b4e96..18fc65d00c 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -80,5 +80,10 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dmadev_ops *ops); int idxd_dump(const struct rte_dmadev *dev, FILE *f); +int idxd_configure(struct rte_dmadev *dev, const struct rte_dmadev_conf *dev_conf); +int idxd_vchan_setup(struct rte_dmadev *dev, uint16_t vchan, + const struct rte_dmadev_vchan_conf *qconf); +int idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *dev_info, + uint32_t size); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 96d58c8544..569df8d04c 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -61,6 +61,9 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) static const struct rte_dmadev_ops idxd_pci_ops = { .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
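The sizing done in idxd_vchan_setup() above can be illustrated with a small standalone sketch. It is not driver code: round_up_pow2() is a hypothetical stand-in for rte_align32pow2(), struct ring_geometry and ring_geometry_for() are names invented for this example, and the 64..4096 bounds are taken from the min_desc/max_desc values reported by idxd_info_get().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_align32pow2(): round v up to a power of two. */
static uint32_t round_up_pow2(uint32_t v)
{
	assert(v != 0 && v <= (UINT32_C(1) << 31));
	v--;
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	return v + 1;
}

/* Illustrative summary of the sizes derived from the requested nb_desc. */
struct ring_geometry {
	uint16_t nb_desc;     /* usable ring size, always a power of two */
	uint16_t mask;        /* nb_desc - 1, used to wrap job ids into the ring */
	uint32_t alloc_slots; /* descriptors allocated: twice the ring size */
};

static struct ring_geometry ring_geometry_for(uint16_t requested)
{
	struct ring_geometry g;

	/* assumed limits, matching min_desc/max_desc in idxd_info_get() */
	assert(requested >= 64 && requested <= 4096);

	g.nb_desc = (uint16_t)round_up_pow2(requested);
	g.mask = (uint16_t)(g.nb_desc - 1);
	/* the ring is allocated at 2x size because batches never wrap */
	g.alloc_slots = (uint32_t)g.nb_desc * 2;
	return g;
}
```

With these assumptions, a request for 100 descriptors is rounded up to a ring of 128 with a mask of 127, backed by 256 descriptor slots.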
* [dpdk-dev] [PATCH v2 09/16] dma/idxd: add start and stop functions for pci devices 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (7 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 08/16] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 10/16] dma/idxd: add data-path job submission functions Kevin Laatz ` (7 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add device start/stop functions for DSA devices bound to vfio. For devices bound to the IDXD kernel driver, these are not required since the IDXD kernel driver takes care of this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_pci.c | 52 +++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 569df8d04c..3c0e3086f7 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -59,11 +59,63 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; } +static int +idxd_pci_dev_stop(struct rte_dmadev *dev) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint8_t err_code; + + if (!idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); + return -EALREADY; + } + + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); + if (err_code || idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); + + return 0; +} + +static int +idxd_pci_dev_start(struct rte_dmadev *dev) +{ + struct idxd_dmadev *idxd = 
dev->dev_private; + uint8_t err_code; + + if (idxd_is_wq_enabled(idxd)) { + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); + return 0; + } + + if (idxd->desc_ring == NULL) { + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); + return -EINVAL; + } + + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); + if (err_code || !idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); + + return 0; +} + static const struct rte_dmadev_ops idxd_pci_ops = { .dev_dump = idxd_dump, .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .dev_start = idxd_pci_dev_start, + .dev_stop = idxd_pci_dev_stop, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
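The start and stop paths above both have to combine two observations: the error code returned by the device command and the work queue state read back afterwards. A minimal sketch of one way to fold the two into a single return value is given below. wq_cmd_result() is a hypothetical helper, not part of the driver, and it assumes the usual convention of returning zero on success and a negative value on failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: fold a device command status and the observed work
 * queue state into a single 0-or-negative return value.
 */
static int wq_cmd_result(uint8_t err_code, bool state_as_expected)
{
	if (err_code == 0 && state_as_expected)
		return 0;
	/* the command reported success but the queue state did not change */
	if (err_code == 0)
		return -1;
	/* propagate the device error code as a negative value */
	return -(int)err_code;
}
```

A caller written this way never returns a positive value or zero on a failed enable or disable, so a plain "ret < 0" check is enough.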
* [dpdk-dev] [PATCH v2 10/16] dma/idxd: add data-path job submission functions 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (8 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 09/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 11/16] dma/idxd: add data-path job completion functions Kevin Laatz ` (6 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data path functions for enqueuing and submitting operations to DSA devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 137 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 ++ drivers/dma/idxd/meson.build | 1 + 4 files changed, 207 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 66bc9fe744..0c4c105e0f 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -153,3 +153,67 @@ The following code shows how the device is configured in Once configured, the device can then be made ready for use by calling the ``rte_dmadev_start()`` API. + +Performing Data Copies +~~~~~~~~~~~~~~~~~~~~~~~ + +To perform data copies using IDXD dmadev devices, descriptors should be enqueued +using the ``rte_dmadev_copy()`` API. The HW can be triggered to perform the copy +in two ways, either via a ``RTE_DMA_OP_FLAG_SUBMIT`` flag or by calling +``rte_dmadev_submit()``. Once copies have been completed, the completion will +be reported back when the application calls ``rte_dmadev_completed()`` or +``rte_dmadev_completed_status()``. The latter will also report the status of each +completed operation. 
+ +The ``rte_dmadev_copy()`` function enqueues a single copy to the device ring for +copying at a later point. The parameters to that function include the IOVA addresses +of both the source and destination buffers, as well as the length of the copy. + +Enqueuing an operation does not by itself notify the device hardware. +If the ``RTE_DMA_OP_FLAG_SUBMIT`` flag is set when calling ``rte_dmadev_copy()``, +the device hardware will be informed of the elements. Alternatively, if the flag +is not set, the application needs to call the ``rte_dmadev_submit()`` function to +notify the device hardware. Once the device hardware is informed of the elements +enqueued on the ring, the device will begin to process them. It is expected +that, for efficiency reasons, a burst of operations will be enqueued to the +device via multiple enqueue calls between calls to the ``rte_dmadev_submit()`` +function. + +The following code demonstrates how to enqueue a burst of copies to the +device and start the hardware processing of them: + +.. code-block:: C + + struct rte_mbuf *srcs[COMP_BURST_SZ], *dsts[COMP_BURST_SZ]; + unsigned int i, j; + + for (i = 0; i < RTE_DIM(srcs); i++) { + uint64_t *src_data; + + srcs[i] = rte_pktmbuf_alloc(pool); + dsts[i] = rte_pktmbuf_alloc(pool); + if (srcs[i] == NULL || dsts[i] == NULL) { + PRINT_ERR("Error allocating buffers\n"); + return -1; + } + src_data = rte_pktmbuf_mtod(srcs[i], uint64_t *); + + for (j = 0; j < COPY_LEN/sizeof(uint64_t); j++) + src_data[j] = rte_rand(); + + if (rte_dmadev_copy(dev_id, vchan, srcs[i]->buf_iova + srcs[i]->data_off, + dsts[i]->buf_iova + dsts[i]->data_off, COPY_LEN, 0) < 0) { + PRINT_ERR("Error with rte_dmadev_copy for buffer %u\n", i); + return -1; + } + } + rte_dmadev_submit(dev_id, vchan); + +Filling an Area of Memory +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The IDXD driver also has support for the ``fill`` operation, where an area +of memory is overwritten, or filled, with a short pattern of data. 
+Fill operations can be performed in much the same way as copy operations +described above, just using the ``rte_dmadev_fill()`` function rather than the +``rte_dmadev_copy()`` function. diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 6704a26357..d72e83537d 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,14 +2,147 @@ * Copyright 2021 Intel Corporation */ +#include <x86intrin.h> + #include <rte_dmadev_pmd.h> #include <rte_malloc.h> #include <rte_common.h> +#include <rte_prefetch.h> #include "idxd_internal.h" #define IDXD_PMD_NAME_STR "dmadev_idxd" +static __rte_always_inline rte_iova_t +__desc_idx_to_iova(struct idxd_dmadev *idxd, uint16_t n) +{ + return idxd->desc_iova + (n * sizeof(struct idxd_hw_desc)); +} + +static __rte_always_inline void +__idxd_movdir64b(volatile void *dst, const struct idxd_hw_desc *src) +{ + asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : + : "a" (dst), "d" (src) + : "memory"); +} + +static __rte_always_inline void +__submit(struct idxd_dmadev *idxd) +{ + rte_prefetch1(&idxd->batch_comp_ring[idxd->batch_idx_read]); + + if (idxd->batch_size == 0) + return; + + /* write completion to batch comp ring */ + rte_iova_t comp_addr = idxd->batch_iova + + (idxd->batch_idx_write * sizeof(struct idxd_completion)); + + if (idxd->batch_size == 1) { + /* submit batch directly */ + struct idxd_hw_desc desc = + idxd->desc_ring[idxd->batch_start & idxd->desc_ring_mask]; + desc.completion = comp_addr; + desc.op_flags |= IDXD_FLAG_REQUEST_COMPLETION; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &desc); + } else { + const struct idxd_hw_desc batch_desc = { + .op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_COMPLETION_ADDR_VALID | + IDXD_FLAG_REQUEST_COMPLETION, + .desc_addr = __desc_idx_to_iova(idxd, + idxd->batch_start & idxd->desc_ring_mask), + .completion = comp_addr, + .size = idxd->batch_size, + }; + _mm_sfence(); /* 
fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &batch_desc); + } + + if (++idxd->batch_idx_write > idxd->max_batches) + idxd->batch_idx_write = 0; + + idxd->batch_start += idxd->batch_size; + idxd->batch_size = 0; + idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; + _mm256_store_si256((void *)&idxd->batch_comp_ring[idxd->batch_idx_write], + _mm256_setzero_si256()); +} + +static __rte_always_inline int +__idxd_write_desc(struct rte_dmadev *dev, + const uint32_t op_flags, + const rte_iova_t src, + const rte_iova_t dst, + const uint32_t size, + const uint32_t flags) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t mask = idxd->desc_ring_mask; + uint16_t job_id = idxd->batch_start + idxd->batch_size; + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; + + /* first check batch ring space then desc ring space */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + goto failed; + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) + goto failed; + + /* write desc. 
Note: descriptors don't wrap, but the completion address does */ + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, + _mm256_set_epi64x(0, 0, 0, size)); + + idxd->batch_size++; + + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) + __submit(idxd); + + return job_id; + +failed: + return -1; +} + +int +idxd_enqueue_copy(struct rte_dmadev *dev, uint16_t qid __rte_unused, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, + * but check it at compile time to be sure. + */ + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, memmove, src, dst, length, flags); +} + +int +idxd_enqueue_fill(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + uint32_t fill = (idxd_op_fill << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, fill, pattern, dst, length, flags); +} + +int +idxd_submit(struct rte_dmadev *dev, uint16_t qid __rte_unused) +{ + __submit(dev->dev_private); + return 0; +} + int idxd_dump(const struct rte_dmadev *dev, FILE *f) { @@ -134,6 +267,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->dev_ops = ops; dmadev->device = dev; + dmadev->copy = idxd_enqueue_copy; + dmadev->fill = idxd_enqueue_fill; + dmadev->submit = idxd_submit; + idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); if 
(idxd == NULL) { IDXD_PMD_ERR("Unable to allocate memory for device"); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 18fc65d00c..6a6c69fd61 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -85,5 +85,10 @@ int idxd_vchan_setup(struct rte_dmadev *dev, uint16_t vchan, const struct rte_dmadev_vchan_conf *qconf); int idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *dev_info, uint32_t size); +int idxd_enqueue_copy(struct rte_dmadev *dev, uint16_t qid, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_enqueue_fill(struct rte_dmadev *dev, uint16_t qid, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_submit(struct rte_dmadev *dev, uint16_t qid); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 81150e6f25..2de5130fd2 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -2,6 +2,7 @@ # Copyright(c) 2021 Intel Corporation deps += ['bus_pci'] +cflags += '-mavx2' # all platforms with idxd HW support AVX sources = files( 'idxd_bus.c', 'idxd_common.c', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
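The index handling in __submit() and __idxd_write_desc() above can be modelled in isolation. The sketch below is only an illustration of that arithmetic under the assumptions visible in the patch: batch ring indexes run from 0 to max_batches inclusive, and the descriptor ring is allocated at twice its nominal size so that a batch never wraps. The helper names batch_idx_next(), batch_ring_full() and desc_write_idx() are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Advance a batch ring index; valid indexes run from 0 to max_batches. */
static uint16_t batch_idx_next(uint16_t idx, uint16_t max_batches)
{
	assert(idx <= max_batches);
	return (idx == max_batches) ? 0 : (uint16_t)(idx + 1);
}

/* The ring is full when advancing the write index would reach the read index. */
static bool batch_ring_full(uint16_t rd, uint16_t wr, uint16_t max_batches)
{
	return batch_idx_next(wr, max_batches) == rd;
}

/*
 * Slot used for the next descriptor of the open batch. Only the batch start
 * is masked; start + size may run into the second half of the 2x-sized ring.
 */
static uint16_t desc_write_idx(uint16_t batch_start, uint16_t batch_size,
		uint16_t mask)
{
	return (uint16_t)((batch_start & mask) + batch_size);
}
```

The single comparison in batch_ring_full() covers both conditions tested in __idxd_write_desc(): the wrap case (read index 0, write index at max_batches) and the ordinary case (write index one behind the read index).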
* [dpdk-dev] [PATCH v2 11/16] dma/idxd: add data-path job completion functions 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (9 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 10/16] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 12/16] dma/idxd: add operation statistic tracking Kevin Laatz ` (5 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the data path functions for gathering completed operations. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- v2: - fixed typo in docs - add completion status for invalid opcode --- doc/guides/dmadevs/idxd.rst | 25 ++++ drivers/dma/idxd/idxd_common.c | 237 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 + 3 files changed, 267 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 0c4c105e0f..b0b5632b48 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -209,6 +209,31 @@ device and start the hardware processing of them: } rte_dmadev_submit(dev_id, vchan); +To retrieve information about completed copies, the ``rte_dmadev_completed()`` and +``rte_dmadev_completed_status()`` APIs should be used. ``rte_dmadev_completed()`` +will return the number of completed operations, along with the index of the last +successfully completed operation and whether or not an error was encountered. If an +error was encountered, ``rte_dmadev_completed_status()`` must be used to kick the +device off to continue processing operations and also to gather the status of each +individual operation, which is written to the ``status`` array provided as a +parameter by the application. 
+ +The following code shows how to retrieve the number of successfully completed +copies within a burst and then use ``rte_dmadev_completed_status()`` to check +which operation failed and kick off the device to continue processing operations: + +.. code-block:: C + + enum rte_dma_status_code status[COMP_BURST_SZ]; + uint16_t count, idx, status_count; + bool error = false; + + count = rte_dmadev_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); + + if (error) { + status_count = rte_dmadev_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status); + } + Filling an Area of Memory ~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index d72e83537d..1bbe313c09 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -143,6 +143,241 @@ idxd_submit(struct rte_dmadev *dev, uint16_t qid __rte_unused) return 0; } +static enum rte_dma_status_code +get_comp_status(struct idxd_completion *c) +{ + uint8_t st = c->status; + switch (st) { + /* successful descriptors are not written back normally */ + case IDXD_COMP_STATUS_INCOMPLETE: + case IDXD_COMP_STATUS_SUCCESS: + return RTE_DMA_STATUS_SUCCESSFUL; + case IDXD_COMP_STATUS_INVALID_OPCODE: + return RTE_DMA_STATUS_INVALID_OPCODE; + case IDXD_COMP_STATUS_INVALID_SIZE: + return RTE_DMA_STATUS_INVALID_LENGTH; + case IDXD_COMP_STATUS_SKIPPED: + return RTE_DMA_STATUS_NOT_ATTEMPTED; + default: + return RTE_DMA_STATUS_ERROR_UNKNOWN; + } +} + +static __rte_always_inline int +batch_ok(struct idxd_dmadev *idxd, uint8_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t ret; + uint8_t bstatus; + + if (max_ops == 0) + return 0; + + /* first check if there are any unreturned handles from last time */ + if (idxd->ids_avail != idxd->ids_returned) { + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + + if 
(idxd->batch_idx_read == idxd->batch_idx_write) + return 0; + + bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status; + /* now check if next batch is complete and successful */ + if (bstatus == IDXD_COMP_STATUS_SUCCESS) { + /* since the batch idx ring stores the start of each batch, pre-increment to lookup + * start of next batch. + */ + if (++idxd->batch_idx_read > idxd->max_batches) + idxd->batch_idx_read = 0; + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; + + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + /* check if batch is incomplete */ + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) + return 0; + + return -1; /* error case */ +} + +static inline uint16_t +batch_completed(struct idxd_dmadev *idxd, uint8_t max_ops, bool *has_error) +{ + uint16_t i; + uint16_t b_start, b_end, next_batch; + + int ret = batch_ok(idxd, max_ops, NULL); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. + * once we identify the actual failure job, return other jobs, then update + * the batch ring indexes to make it look like the first job of the batch has failed. + * Subsequent calls here will always return zero packets, and the error must be cleared by + * calling the completed_status() function. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + b_end = idxd->batch_idx_ring[next_batch]; + + if (b_end - b_start == 1) { /* not a batch */ + *has_error = true; + return 0; + } + + for (i = b_start; i < b_end; i++) { + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ + break; + } + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); + if (ret < max_ops) + *has_error = true; /* we got up to the point of error */ + idxd->ids_avail = idxd->ids_returned += ret; + + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { + /* copy over the descriptor status to the batch ring as if no batch */ + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; + } + + return ret; +} + +static uint16_t +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t next_batch; + + int ret = batch_ok(idxd, max_ops, status); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; + const uint16_t b_len = b_end - b_start; + if (b_len == 1) {/* not a batch */ + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + idxd->ids_avail++; + idxd->ids_returned++; + idxd->batch_idx_read = next_batch; + return 1; + } + + /* not a single-element batch, need to process more. + * Scenarios: + * 1. max_ops >= batch_size - can fit everything, simple case + * - loop through completed ops and then add on any not-attempted ones + * 2. max_ops < batch_size - can't fit everything, more complex case + * - loop through completed/incomplete and stop when hit max_ops + * - adjust the batch descriptor to update where we stopped, with appropriate bcount + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as + * non-batch next time. + */ + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; + for (ret = 0; ret < b_len && ret < max_ops; ret++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + } + idxd->ids_avail = idxd->ids_returned += ret; + + /* everything fit */ + if (ret == b_len) { + idxd->batch_idx_read = next_batch; + return ret; + } + + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (ret > bcount) { + /* we have only incomplete ones - set batch completed size to 0 */ + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size = 0; + /* if there is only one descriptor left, job skipped so set flag appropriately */ + if (b_len - ret == 1) + comp->status = IDXD_COMP_STATUS_SKIPPED; + } else { + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size -= ret; + /* if there is only one descriptor left, copy status info straight to desc */ + if (comp->completed_size == 1) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + comp->status = c->status; + /* individual descs can be ok without writeback, but not batches */ + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) + comp->status = IDXD_COMP_STATUS_SUCCESS; + } else if (bcount == b_len) { + /* check if we still have an error, and clear flag if not */ + uint16_t i; + for (i = b_start + ret; i < b_end; i++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) + break; + } + if (i == b_end) /* no errors */ + comp->status = IDXD_COMP_STATUS_SUCCESS; + } + } + + return ret; +} + +uint16_t +idxd_completed(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, bool *has_error) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed(idxd, max_ops - ret, has_error); + ret += batch; + } while (batch > 0 && *has_error == false); + + *last_idx = 
idxd->ids_returned - 1; + return ret; +} + +uint16_t +idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, enum rte_dma_status_code *status) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + uint16_t batch, ret = 0; + + do { + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); + ret += batch; + } while (batch > 0); + + *last_idx = idxd->ids_returned - 1; + return ret; +} + int idxd_dump(const struct rte_dmadev *dev, FILE *f) { @@ -270,6 +505,8 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->copy = idxd_enqueue_copy; dmadev->fill = idxd_enqueue_fill; dmadev->submit = idxd_submit; + dmadev->completed = idxd_completed; + dmadev->completed_status = idxd_completed_status; idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); if (idxd == NULL) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 6a6c69fd61..4bcfe5372b 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -90,5 +90,10 @@ int idxd_enqueue_copy(struct rte_dmadev *dev, uint16_t qid, rte_iova_t src, int idxd_enqueue_fill(struct rte_dmadev *dev, uint16_t qid, uint64_t pattern, rte_iova_t dst, unsigned int length, uint64_t flags); int idxd_submit(struct rte_dmadev *dev, uint16_t qid); +uint16_t idxd_completed(struct rte_dmadev *dev, uint16_t qid, uint16_t max_ops, + uint16_t *last_idx, bool *has_error); +uint16_t idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, + uint16_t max_ops, uint16_t *last_idx, + enum rte_dma_status_code *status); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
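The completion path above relies on ids_avail and ids_returned being free-running 16-bit counters, so the difference between them stays correct across the 65535 to 0 wrap. A minimal sketch of that arithmetic follows. The helper names ids_to_return() and last_returned_idx() are invented here, and the code is only an illustration of the casts used in batch_ok() and idxd_completed(), not driver code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of job ids that can be handed back now: the count of available but
 * unreturned ids, capped at max_ops. The cast keeps the subtraction modulo
 * 2^16, so it remains valid after the counters wrap.
 */
static uint16_t ids_to_return(uint16_t ids_avail, uint16_t ids_returned,
		uint16_t max_ops)
{
	uint16_t pending = (uint16_t)(ids_avail - ids_returned);

	return pending < max_ops ? pending : max_ops;
}

/* Index reported through last_idx: the id just before the return counter. */
static uint16_t last_returned_idx(uint16_t ids_returned)
{
	return (uint16_t)(ids_returned - 1);
}
```

One consequence of this scheme is that last_idx is reported as 65535 when nothing has been returned yet, since the counter starts at zero.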
* [dpdk-dev] [PATCH v2 12/16] dma/idxd: add operation statistic tracking 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (10 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 11/16] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 13/16] dma/idxd: add vchan idle function Kevin Laatz ` (4 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add statistic tracking for DSA devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/idxd.rst | 11 +++++++++++ drivers/dma/idxd/idxd_bus.c | 2 ++ drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 +++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 45 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index b0b5632b48..634ef58985 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -242,3 +242,14 @@ of memory is overwritten, or filled, with a short pattern of data. Fill operations can be performed in much the same way as copy operations described above, just using the ``rte_dmadev_fill()`` function rather than the ``rte_dmadev_copy()`` function. + +Querying Device Statistics +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The statistics from the IDXD dmadev device can be retrieved via the stats functions in +the ``rte_dmadev`` library, i.e. ``rte_dmadev_stats_get()``. 
The statistics +returned for each device instance are: + +* ``submitted`` +* ``completed`` +* ``errors`` diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index ad4f076bad..9482dab9be 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -98,6 +98,8 @@ static const struct rte_dmadev_ops idxd_vdev_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 1bbe313c09..7c72396dc8 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -65,6 +65,8 @@ __submit(struct idxd_dmadev *idxd) if (++idxd->batch_idx_write > idxd->max_batches) idxd->batch_idx_write = 0; + idxd->stats.submitted += idxd->batch_size; + idxd->batch_start += idxd->batch_size; idxd->batch_size = 0; idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; @@ -278,6 +280,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ const uint16_t b_len = b_end - b_start; if (b_len == 1) {/* not a batch */ *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + if (*status != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; idxd->ids_avail++; idxd->ids_returned++; idxd->batch_idx_read = next_batch; @@ -299,6 +303,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ struct idxd_completion *c = (void *) &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + if (status[ret] != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; } idxd->ids_avail = idxd->ids_returned += ret; @@ -357,6 +363,7 @@ idxd_completed(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_o ret += batch; } while (batch > 0 && *has_error == false); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -374,6 +381,7 @@ idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_ ret += batch; } while (batch > 0); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -407,6 +415,25 @@ idxd_dump(const struct rte_dmadev *dev, FILE *f) return 0; } +int +idxd_stats_get(const struct rte_dmadev *dev, uint16_t vchan __rte_unused, + struct rte_dmadev_stats *stats, uint32_t stats_sz) +{ + struct idxd_dmadev *idxd = dev->dev_private; + if (stats_sz < sizeof(*stats)) + return -EINVAL; + *stats = idxd->stats; + return 0; +} + +int +idxd_stats_reset(struct rte_dmadev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->dev_private; + idxd->stats = (struct rte_dmadev_stats){0}; + return 0; +} + int idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *info, uint32_t size) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 4bcfe5372b..5079b4207d 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -95,5 +95,8 @@ uint16_t idxd_completed(struct rte_dmadev *dev, uint16_t qid, uint16_t max_ops, uint16_t idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_ops, uint16_t *last_idx, enum rte_dma_status_code *status); +int idxd_stats_get(const struct rte_dmadev *dev, uint16_t vchan, + struct rte_dmadev_stats *stats, uint32_t stats_sz); +int idxd_stats_reset(struct rte_dmadev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c 
b/drivers/dma/idxd/idxd_pci.c index 3c0e3086f7..a84232b6e9 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -114,6 +114,8 @@ static const struct rte_dmadev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, }; -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
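The counter updates in the patch above follow a simple pattern: ``submitted`` grows by the batch size at submit time, ``completed`` grows by the number of operations returned to the caller, and ``errors`` grows once per operation whose status is not successful. The following is a minimal Python sketch of that accounting only. It assumes a stand-in value of 0 for the successful status, and the class and helper names are hypothetical, not driver code.

```python
from dataclasses import dataclass

STATUS_SUCCESSFUL = 0  # stand-in for RTE_DMA_STATUS_SUCCESSFUL (assumed to be 0)


@dataclass
class Stats:
    """Hypothetical model of the three per-device counters."""
    submitted: int = 0
    completed: int = 0
    errors: int = 0


def on_submit(stats: Stats, batch_size: int) -> None:
    """Account for a whole batch when it is handed to the device."""
    stats.submitted += batch_size


def on_completed_status(stats: Stats, statuses: list) -> None:
    """Account for operations returned to the caller, one status each."""
    # The status *value* of each operation is compared, not the array itself.
    stats.errors += sum(1 for s in statuses if s != STATUS_SUCCESSFUL)
    stats.completed += len(statuses)
```

With this model, a batch of four jobs where one fails ends up as submitted=4, completed=4, errors=1.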
* [dpdk-dev] [PATCH v2 13/16] dma/idxd: add vchan idle function 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (11 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 12/16] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz ` (3 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When testing dmadev drivers, it is useful to have the HW device in a known state. This patch adds the implementation of the function which will wait for the device to be idle (all jobs completed) before proceeding. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 10 ++++++++++ drivers/dma/idxd/idxd_internal.h | 1 + drivers/dma/idxd/idxd_pci.c | 1 + 4 files changed, 13 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 9482dab9be..4788bbf6d7 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -100,6 +100,7 @@ static const struct rte_dmadev_ops idxd_vdev_ops = { .dev_info_get = idxd_info_get, .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, + .vchan_idle = idxd_vchan_idle, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 7c72396dc8..ab43875409 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -165,6 +165,16 @@ get_comp_status(struct idxd_completion *c) } } +int +idxd_vchan_idle(const struct rte_dmadev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t last_batch_write = idxd->batch_idx_write == 0 ? 
idxd->max_batches : + idxd->batch_idx_write - 1; + + return (idxd->batch_comp_ring[last_batch_write].status != 0); +} + static __rte_always_inline int batch_ok(struct idxd_dmadev *idxd, uint8_t max_ops, enum rte_dma_status_code *status) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 5079b4207d..a865ae1954 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -98,5 +98,6 @@ uint16_t idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused int idxd_stats_get(const struct rte_dmadev *dev, uint16_t vchan, struct rte_dmadev_stats *stats, uint32_t stats_sz); int idxd_stats_reset(struct rte_dmadev *dev, uint16_t vchan); +int idxd_vchan_idle(const struct rte_dmadev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index a84232b6e9..b7a079ac52 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -118,6 +118,7 @@ static const struct rte_dmadev_ops idxd_pci_ops = { .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, + .vchan_idle = idxd_vchan_idle, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
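The index arithmetic in idxd_vchan_idle() above can be modelled in a few lines of Python. This is a sketch under stated assumptions: the write index wraps to 0 once it exceeds max_batches (as __submit() does in this series), a plain list stands in for the batch completion ring, and both function names are hypothetical.

```python
def last_batch_slot(batch_idx_write: int, max_batches: int) -> int:
    """Index of the most recently written batch slot.

    The write index points at the next slot to fill, so step back by one,
    wrapping from 0 to max_batches (the ring is modelled with
    max_batches + 1 slots).
    """
    return max_batches if batch_idx_write == 0 else batch_idx_write - 1


def vchan_is_idle(comp_status: list, batch_idx_write: int, max_batches: int) -> bool:
    """True once the most recently submitted batch has a non-zero status."""
    return comp_status[last_batch_slot(batch_idx_write, max_batches)] != 0
```

Checking only the last-written slot keeps the test cheap, on the assumption that batches complete in order, so a completed final batch implies the earlier ones are done too.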
* [dpdk-dev] [PATCH v2 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (12 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 13/16] dma/idxd: add vchan idle function Kevin Laatz @ 2021-09-03 10:49 ` Kevin Laatz 2021-09-03 10:50 ` [dpdk-dev] [PATCH v2 15/16] devbind: add dma device class Kevin Laatz ` (2 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:49 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Conor Walsh <conor.walsh@intel.com> Move the example script for configuring IDXD devices bound to the IDXD kernel driver from raw to dma, and create a symlink to still allow use from raw. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++++++++++++++++++++++++++ drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +----------------------------- 2 files changed, 118 insertions(+), 117 deletions(-) create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py diff --git a/drivers/dma/idxd/dpdk_idxd_cfg.py b/drivers/dma/idxd/dpdk_idxd_cfg.py new file mode 100755 index 0000000000..fcc27822ef --- /dev/null +++ b/drivers/dma/idxd/dpdk_idxd_cfg.py @@ -0,0 +1,117 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2020 Intel Corporation + +""" +Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use +""" + +import sys +import argparse +import os +import os.path + + +class SysfsDir: + "Used to read/write paths in a sysfs directory" + def __init__(self, path): + self.path = path + + def read_int(self, filename): + "Return a value from sysfs file" + with open(os.path.join(self.path, filename)) as f: + return int(f.readline()) + + def write_values(self, 
values): + "write dictionary, where key is filename and value is value to write" + for filename, contents in values.items(): + with open(os.path.join(self.path, filename), "w") as f: + f.write(str(contents)) + + +def reset_device(dsa_id): + "Reset the DSA device and all its queues" + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) + + +def get_pci_dir(pci): + "Search for the sysfs directory of the PCI device" + base_dir = '/sys/bus/pci/devices/' + for path, dirs, files in os.walk(base_dir): + for dir in dirs: + if pci in dir: + return os.path.join(base_dir, dir) + sys.exit(f"Could not find sysfs directory for device {pci}") + + +def get_dsa_id(pci): + "Get the DSA instance ID using the PCI address of the device" + pci_dir = get_pci_dir(pci) + for path, dirs, files in os.walk(pci_dir): + for dir in dirs: + if dir.startswith('dsa') and 'wq' not in dir: + return int(dir[3:]) + sys.exit(f"Could not get device ID for device {pci}") + + +def configure_dsa(dsa_id, queues, prefix): + "Configure the DSA instance with appropriate number of queues" + dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + + max_groups = dsa_dir.read_int("max_groups") + max_engines = dsa_dir.read_int("max_engines") + max_queues = dsa_dir.read_int("max_work_queues") + max_work_queues_size = dsa_dir.read_int("max_work_queues_size") + + nb_queues = min(queues, max_queues) + if queues > nb_queues: + print(f"Setting number of queues to max supported value: {max_queues}") + + # we want one engine per group, and no more engines than queues + nb_groups = min(max_engines, max_groups, nb_queues) + for grp in range(nb_groups): + dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) + + # configure each queue + for q in range(nb_queues): + wq_dir = SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) + wq_dir.write_values({"group_id": q % nb_groups, + "type": "user", + "mode": "dedicated", + 
"name": f"{prefix}_wq{dsa_id}.{q}", + "priority": 1, + "size": int(max_work_queues_size / nb_queues)}) + + # enable device and then queues + drv_dir.write_values({"bind": f"dsa{dsa_id}"}) + for q in range(nb_queues): + drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) + + +def main(args): + "Main function, does arg parsing and calls config function" + arg_p = argparse.ArgumentParser( + description="Configure whole DSA device instance for DPDK use") + arg_p.add_argument('dsa_id', + help="Specify DSA instance either via DSA instance number or PCI address") + arg_p.add_argument('-q', metavar='queues', type=int, default=255, + help="Number of queues to set up") + arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', + default="dpdk", + help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") + arg_p.add_argument('--reset', action='store_true', + help="Reset DSA device and its queues") + parsed_args = arg_p.parse_args(args[1:]) + + dsa_id = parsed_args.dsa_id + dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id + if parsed_args.reset: + reset_device(dsa_id) + else: + configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) + + +if __name__ == "__main__": + main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py deleted file mode 100755 index fcc27822ef..0000000000 --- a/drivers/raw/ioat/dpdk_idxd_cfg.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/env python3 -# SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2020 Intel Corporation - -""" -Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use -""" - -import sys -import argparse -import os -import os.path - - -class SysfsDir: - "Used to read/write paths in a sysfs directory" - def __init__(self, path): - self.path = path - - def read_int(self, filename): - "Return a value from sysfs file" - with open(os.path.join(self.path, filename)) as f: - return int(f.readline()) - - def write_values(self, values): - "write 
dictionary, where key is filename and value is value to write" - for filename, contents in values.items(): - with open(os.path.join(self.path, filename), "w") as f: - f.write(str(contents)) - - -def reset_device(dsa_id): - "Reset the DSA device and all its queues" - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) - - -def get_pci_dir(pci): - "Search for the sysfs directory of the PCI device" - base_dir = '/sys/bus/pci/devices/' - for path, dirs, files in os.walk(base_dir): - for dir in dirs: - if pci in dir: - return os.path.join(base_dir, dir) - sys.exit(f"Could not find sysfs directory for device {pci}") - - -def get_dsa_id(pci): - "Get the DSA instance ID using the PCI address of the device" - pci_dir = get_pci_dir(pci) - for path, dirs, files in os.walk(pci_dir): - for dir in dirs: - if dir.startswith('dsa') and 'wq' not in dir: - return int(dir[3:]) - sys.exit(f"Could not get device ID for device {pci}") - - -def configure_dsa(dsa_id, queues, prefix): - "Configure the DSA instance with appropriate number of queues" - dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - - max_groups = dsa_dir.read_int("max_groups") - max_engines = dsa_dir.read_int("max_engines") - max_queues = dsa_dir.read_int("max_work_queues") - max_work_queues_size = dsa_dir.read_int("max_work_queues_size") - - nb_queues = min(queues, max_queues) - if queues > nb_queues: - print(f"Setting number of queues to max supported value: {max_queues}") - - # we want one engine per group, and no more engines than queues - nb_groups = min(max_engines, max_groups, nb_queues) - for grp in range(nb_groups): - dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) - - # configure each queue - for q in range(nb_queues): - wq_dir = SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) - wq_dir.write_values({"group_id": q % nb_groups, - "type": "user", - "mode": "dedicated", - "name": 
f"{prefix}_wq{dsa_id}.{q}", - "priority": 1, - "size": int(max_work_queues_size / nb_queues)}) - - # enable device and then queues - drv_dir.write_values({"bind": f"dsa{dsa_id}"}) - for q in range(nb_queues): - drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) - - -def main(args): - "Main function, does arg parsing and calls config function" - arg_p = argparse.ArgumentParser( - description="Configure whole DSA device instance for DPDK use") - arg_p.add_argument('dsa_id', - help="Specify DSA instance either via DSA instance number or PCI address") - arg_p.add_argument('-q', metavar='queues', type=int, default=255, - help="Number of queues to set up") - arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', - default="dpdk", - help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") - arg_p.add_argument('--reset', action='store_true', - help="Reset DSA device and its queues") - parsed_args = arg_p.parse_args(args[1:]) - - dsa_id = parsed_args.dsa_id - dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id - if parsed_args.reset: - reset_device(dsa_id) - else: - configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) - - -if __name__ == "__main__": - main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py new file mode 120000 index 0000000000..85545548d1 --- /dev/null +++ b/drivers/raw/ioat/dpdk_idxd_cfg.py @@ -0,0 +1 @@ +../../dma/idxd/dpdk_idxd_cfg.py \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
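The partitioning arithmetic in configure_dsa() can be looked at in isolation from sysfs. The sketch below mirrors the script's calculations (queue count capped at the device maximum, one engine per group, queues spread round-robin over the groups, work queue space split evenly), but the function name, the returned dictionary layout and the numbers in the usage note are illustrative assumptions, not values read from real hardware.

```python
def plan_dsa_queues(queues, max_groups, max_engines, max_queues, max_wq_size):
    """Return the group and size assignment the script would write to sysfs."""
    # never ask for more queues than the device supports
    nb_queues = min(queues, max_queues)
    # one engine per group, and no more groups than queues
    nb_groups = min(max_engines, max_groups, nb_queues)
    # split the total work queue space evenly between the queues
    wq_size = int(max_wq_size / nb_queues)
    return {
        "engine_groups": {engine: engine for engine in range(nb_groups)},
        "queues": [
            {"wq": q, "group_id": q % nb_groups, "size": wq_size}
            for q in range(nb_queues)
        ],
    }
```

For example, plan_dsa_queues(255, 4, 4, 8, 128) yields 8 queues of size 16 spread over 4 groups, with queue 5 landing in group 1.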
* [dpdk-dev] [PATCH v2 15/16] devbind: add dma device class 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (13 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz @ 2021-09-03 10:50 ` Kevin Laatz 2021-09-03 10:50 ` [dpdk-dev] [PATCH v2 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:50 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add a new class for DMA devices. Devices listed under the DMA class are to be used with the dmadev library. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- usertools/dpdk-devbind.py | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 74d16e4c4b..8bb573f4b0 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -69,12 +69,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] +dma_devices = [] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] -misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, - intel_ntb_skx, intel_ntb_icx, +misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, + intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. 
@@ -583,6 +584,9 @@ def show_status(): if status_dev in ["crypto", "all"]: show_device_status(crypto_devices, "Crypto") + if status_dev in ["dma", "all"]: + show_device_status(dma_devices, "DMA") + if status_dev in ["event", "all"]: show_device_status(eventdev_devices, "Eventdev") @@ -651,7 +655,7 @@ def parse_args(): parser.add_argument( '--status-dev', help="Print the status of given device group.", - choices=['baseband', 'compress', 'crypto', 'event', + choices=['baseband', 'compress', 'crypto', 'dma', 'event', 'mempool', 'misc', 'net', 'regex']) bind_group = parser.add_mutually_exclusive_group() bind_group.add_argument( @@ -732,6 +736,7 @@ def do_arg_actions(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) @@ -754,6 +759,7 @@ def main(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v2 16/16] devbind: move idxd device ID to dmadev class 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (14 preceding siblings ...) 2021-09-03 10:50 ` [dpdk-dev] [PATCH v2 15/16] devbind: add dma device class Kevin Laatz @ 2021-09-03 10:50 ` Kevin Laatz 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-03 10:50 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz The dmadev library is the preferred abstraction for using IDXD devices and will replace the rawdev implementation in future. This patch moves the IDXD device ID to the dmadev class. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- usertools/dpdk-devbind.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 8bb573f4b0..98b698ccc0 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -69,13 +69,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] -dma_devices = [] +dma_devices = [intel_idxd_spr] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, - intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, intel_ntb_icx, + intel_ioat_icx, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
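The effect of the two devbind patches can be illustrated with a simplified model of the class tables. In this sketch the ID dictionaries carry only vendor and device ID (0x8086 and 0x0b25, as used by the idxd PCI driver in this series); the real script's tables carry more fields, and ``classify`` is a hypothetical helper, not a function from dpdk-devbind.py.

```python
# Simplified ID entry: only the fields needed for this illustration.
intel_idxd_spr = {"Vendor": "8086", "Device": "0b25"}

# Class tables as they stand after patch 15/16 (new, empty DMA class) ...
CLASSES_AFTER_15 = {"dma": [], "misc": [intel_idxd_spr]}
# ... and after patch 16/16 (IDXD ID moved from misc to DMA).
CLASSES_AFTER_16 = {"dma": [intel_idxd_spr], "misc": []}


def classify(dev, classes):
    """Return the name of the first class with an ID entry matching dev."""
    for name, id_list in classes.items():
        for ids in id_list:
            if all(dev.get(key) == value for key, value in ids.items()):
                return name
    return None
```

A DSA device is reported under "misc" with the first table and under "dma" with the second, which is what ``--status-dev dma`` relies on.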
* [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz ` (15 preceding siblings ...) 2021-09-03 10:50 ` [dpdk-dev] [PATCH v2 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz @ 2021-09-08 10:29 ` Kevin Laatz 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 01/17] raw/ioat: only build if dmadev not present Kevin Laatz ` (16 more replies) 16 siblings, 17 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:29 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz This patchset adds a dmadev driver and associated documentation to support Intel Data Streaming Accelerator devices. This driver is intended to ultimately replace the current IDXD part of the IOAT rawdev driver. NOTE: This patchset has several dependencies: - v21 of the dmadev lib set [1] - v3 of the dmadev test suite [2] [1] http://patches.dpdk.org/project/dpdk/list/?series=18738 [2] http://patches.dpdk.org/project/dpdk/list/?series=18744 v3: * rebased on above patchsets * added burst capacity API v2: * rebased on above patchsets * added API to check for device being idle * added devbind updates for DMA devices * fixed issue identified by internal coverity scan * other minor miscellaneous changes and fixes Bruce Richardson (1): raw/ioat: only build if dmadev not present Conor Walsh (1): dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz (15): doc: initial commit for dmadevs section dma/idxd: add skeleton for VFIO based DSA device dma/idxd: add bus device probing dma/idxd: create dmadev instances on bus probe dma/idxd: create dmadev instances on pci probe dma/idxd: add datapath structures dma/idxd: add configure and info_get functions dma/idxd: add start and stop functions for pci devices dma/idxd: add data-path job submission functions dma/idxd: add data-path job completion functions dma/idxd: add operation statistic tracking 
dma/idxd: add vchan status function dma/idxd: add burst capacity API devbind: add dma device class devbind: move idxd device ID to dmadev class MAINTAINERS | 10 + app/test/test_dmadev.c | 2 + doc/guides/dmadevs/idxd.rst | 255 ++++++++++ doc/guides/dmadevs/index.rst | 14 + doc/guides/index.rst | 1 + doc/guides/rel_notes/release_21_11.rst | 5 + drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++ drivers/dma/idxd/idxd_bus.c | 378 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 616 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 131 ++++++ drivers/dma/idxd/idxd_internal.h | 108 +++++ drivers/dma/idxd/idxd_pci.c | 381 +++++++++++++++ drivers/dma/idxd/meson.build | 10 + drivers/dma/idxd/version.map | 3 + drivers/dma/meson.build | 1 + drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +---- drivers/raw/ioat/meson.build | 23 +- usertools/dpdk-devbind.py | 12 +- 18 files changed, 2062 insertions(+), 123 deletions(-) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 doc/guides/dmadevs/index.rst create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py create mode 100644 drivers/dma/idxd/idxd_bus.c create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_hw_defs.h create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v3 01/17] raw/ioat: only build if dmadev not present 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-08 16:00 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 02/17] doc: initial commit for dmadevs section Kevin Laatz ` (15 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh From: Bruce Richardson <bruce.richardson@intel.com> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not present. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/raw/ioat/meson.build | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build index 0e81cb5951..7bd9ac912b 100644 --- a/drivers/raw/ioat/meson.build +++ b/drivers/raw/ioat/meson.build @@ -2,14 +2,31 @@ # Copyright 2019 Intel Corporation build = dpdk_conf.has('RTE_ARCH_X86') +# only use ioat rawdev driver if we don't have the equivalent dmadev ones +if not dpdk_conf.has('RTE_DMA_IDXD') and not dpdk_conf.has('RTE_DMA_IOAT') + build = false + subdir_done() +endif + reason = 'only supported on x86' sources = files( - 'idxd_bus.c', - 'idxd_pci.c', 'ioat_common.c', - 'ioat_rawdev.c', 'ioat_rawdev_test.c', ) + +if not dpdk_conf.has('RTE_DMA_IDXD') + sources += files( + 'idxd_bus.c', + 'idxd_pci.c', + ) +endif + +if not dpdk_conf.has('RTE_DMA_IOAT') + sources += files ( + 'ioat_rawdev.c', + ) +endif + deps += ['bus_pci', 'mbuf', 'rawdev'] headers = files( 'rte_ioat_rawdev.h', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v3 01/17] raw/ioat: only build if dmadev not present
From: Conor Walsh @ 2021-09-08 16:00 UTC
To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj

> From: Bruce Richardson <bruce.richardson@intel.com>
>
> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not
> present.
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  drivers/raw/ioat/meson.build | 23 ++++++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build
> index 0e81cb5951..7bd9ac912b 100644
> --- a/drivers/raw/ioat/meson.build
> +++ b/drivers/raw/ioat/meson.build
> @@ -2,14 +2,31 @@
>  # Copyright 2019 Intel Corporation

Minor nit: the copyright should be updated to 2019-2021

>  build = dpdk_conf.has('RTE_ARCH_X86')
> +# only use ioat rawdev driver if we don't have the equivalent dmadev ones
> +if not dpdk_conf.has('RTE_DMA_IDXD') and not dpdk_conf.has('RTE_DMA_IOAT')

When disabling the dmadev drivers using -Ddisable_drivers=dma/* the
rawdev driver isn't available to use for dma devices.

The way this is ATM if dmadev is disabled it doesn't build rawdev.

The logic should be - if dpdk_conf.has('RTE_DMA_IDXD') and
dpdk_conf.has('RTE_DMA_IOAT')

<snip>

Reviewed-by: Conor Walsh <conor.walsh@intel.com>
* Re: [dpdk-dev] [PATCH v3 01/17] raw/ioat: only build if dmadev not present
From: Kevin Laatz @ 2021-09-09 11:11 UTC
To: Conor Walsh, dev; +Cc: bruce.richardson, fengchengwen, jerinj

On 08/09/2021 17:00, Conor Walsh wrote:
>
>> From: Bruce Richardson <bruce.richardson@intel.com>
>>
>> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not
>> present.
>>
>> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>> ---
>>  drivers/raw/ioat/meson.build | 23 ++++++++++++++++++++---
>>  1 file changed, 20 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build
>> index 0e81cb5951..7bd9ac912b 100644
>> --- a/drivers/raw/ioat/meson.build
>> +++ b/drivers/raw/ioat/meson.build
>> @@ -2,14 +2,31 @@
>>  # Copyright 2019 Intel Corporation
> Minor nit: the copyright should be updated to 2019-2021
>>  build = dpdk_conf.has('RTE_ARCH_X86')
>> +# only use ioat rawdev driver if we don't have the equivalent dmadev
>> ones
>> +if not dpdk_conf.has('RTE_DMA_IDXD') and not
>> dpdk_conf.has('RTE_DMA_IOAT')
>
> When disabling the dmadev drivers using -Ddisable_drivers=dma/* the
> rawdev driver isn't available to use for dma devices.
>
> The way this is ATM if dmadev is disabled it doesn't build rawdev.
>
> The logic should be - if dpdk_conf.has('RTE_DMA_IDXD') and
> dpdk_conf.has('RTE_DMA_IOAT')
>

Will fix this in v4, thanks!
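The build condition discussed in this sub-thread is easier to compare as a truth table. The sketch below models the two conditions in Python rather than meson; the function names are hypothetical, and the two booleans stand for dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT').

```python
from itertools import product


def skip_rawdev_as_posted(has_dma_idxd: bool, has_dma_ioat: bool) -> bool:
    """Condition in v3: the rawdev directory is skipped when NEITHER
    dmadev driver is enabled."""
    return (not has_dma_idxd) and (not has_dma_ioat)


def skip_rawdev_as_suggested(has_dma_idxd: bool, has_dma_ioat: bool) -> bool:
    """Condition suggested in review: skip only when BOTH dmadev drivers
    are enabled, since they then cover everything rawdev would."""
    return has_dma_idxd and has_dma_ioat


def truth_table():
    """All four combinations as (idxd, ioat, posted, suggested) tuples."""
    return [
        (idxd, ioat,
         skip_rawdev_as_posted(idxd, ioat),
         skip_rawdev_as_suggested(idxd, ioat))
        for idxd, ioat in product((False, True), repeat=2)
    ]
```

The row for (False, False) is the case raised in the review: with both dmadev drivers disabled, the posted condition skips rawdev as well, whereas the suggested one keeps it.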
* [dpdk-dev] [PATCH v3 02/17] doc: initial commit for dmadevs section
From: Kevin Laatz @ 2021-09-08 10:30 UTC
To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz

Add new section to the programmer's guide for dmadev devices.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/dmadevs/index.rst | 14 ++++++++++++++
 doc/guides/index.rst         |  1 +
 2 files changed, 15 insertions(+)
 create mode 100644 doc/guides/dmadevs/index.rst

diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst
new file mode 100644
index 0000000000..b30004fd65
--- /dev/null
+++ b/doc/guides/dmadevs/index.rst
@@ -0,0 +1,14 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright(c) 2021 Intel Corporation.
+
+DMA Device Drivers
+==================
+
+The following is a list of DMA device PMDs, which can be used from an
+application through the DMAdev API.
+
+.. toctree::
+    :maxdepth: 2
+    :numbered:
+
+    idxd
diff --git a/doc/guides/index.rst b/doc/guides/index.rst
index 857f0363d3..ccb71640dd 100644
--- a/doc/guides/index.rst
+++ b/doc/guides/index.rst
@@ -19,6 +19,7 @@ DPDK documentation
    bbdevs/index
    cryptodevs/index
    compressdevs/index
+   dmadevs/index
    vdpadevs/index
    regexdevs/index
    eventdevs/index
-- 
2.30.2
* Re: [dpdk-dev] [PATCH v3 02/17] doc: initial commit for dmadevs section
From: Conor Walsh @ 2021-09-08 16:00 UTC
To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj

> Add new section to the programmer's guide for dmadev devices.
>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

Reviewed-by: Conor Walsh <conor.walsh@intel.com>

<snip>
* [dpdk-dev] [PATCH v3 03/17] dma/idxd: add skeleton for VFIO based DSA device 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 01/17] raw/ioat: only build if dmadev not present Kevin Laatz 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 02/17] doc: initial commit for dmadevs section Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-08 16:47 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 04/17] dma/idxd: add bus device probing Kevin Laatz ` (13 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probe/remove skeleton code for DSA device bound to the vfio pci driver. Relevant documentation and MAINTAINERS update also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- MAINTAINERS | 10 +++++ doc/guides/dmadevs/idxd.rst | 58 ++++++++++++++++++++++++++ doc/guides/rel_notes/release_21_11.rst | 5 +++ drivers/dma/idxd/idxd_internal.h | 27 ++++++++++++ drivers/dma/idxd/idxd_pci.c | 55 ++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 7 ++++ drivers/dma/idxd/version.map | 3 ++ drivers/dma/meson.build | 1 + 8 files changed, 166 insertions(+) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map diff --git a/MAINTAINERS b/MAINTAINERS index c057a090d6..b4c614a229 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1199,6 +1199,16 @@ F: doc/guides/compressdevs/zlib.rst F: doc/guides/compressdevs/features/zlib.ini +DMAdev Drivers +-------------- + +Intel IDXD - EXPERIMENTAL +M: Bruce Richardson <bruce.richardson@intel.com> +M: Kevin Laatz <kevin.laatz@intel.com> 
+F: drivers/dma/idxd/ +F: doc/guides/dmadevs/idxd.rst + + RegEx Drivers ------------- diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst new file mode 100644 index 0000000000..924700d17e --- /dev/null +++ b/doc/guides/dmadevs/idxd.rst @@ -0,0 +1,58 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +.. include:: <isonum.txt> + +IDXD DMA Device Driver +====================== + +The ``idxd`` dmadev driver provides a poll-mode driver (PMD) for Intel\ |reg| +Data Streaming Accelerator `(Intel DSA) +<https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html>`_. +This PMD can be used in conjunction with Intel\ |reg| DSA devices to offload +data operations, such as data copies, to hardware, freeing up CPU cycles for +other tasks. + +Hardware Requirements +---------------------- + +The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the +presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma`` +will show all the DMA devices on the system, including IDXD supported devices. +Intel\ |reg| DSA devices currently (at time of writing) appear +as devices with type “0b25”, due to the absence of pci-id database entries for +them at this point. + +Compilation +------------ + +For builds using ``meson`` and ``ninja``, the driver will be built when the +target platform is x86-based. No additional compilation steps are necessary. + +Device Setup +------------- + +Devices using VFIO/UIO drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The HW devices to be used will need to be bound to a user-space IO driver for use. +The ``dpdk-devbind.py`` script can be used to view the state of the devices +and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``.
+For example:: + + $ dpdk-devbind.py -b vfio-pci 6a:01.0 + +Device Probing and Initialization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will +be found as part of the device scan done at application initialization time without +the need to pass parameters to the application. + +For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the +maximum number of workqueues available on it, partitioning all resources equally +among the queues. +If fewer workqueues are required, then the ``max_queues`` parameter may be passed to +the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: + + $ dpdk-test -a <b:d:f>,max_queues=4 diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index 3562822b3d..8526646b13 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -67,6 +67,11 @@ New Features The dmadev library provides a DMA device framework for management and provision of hardware and software DMA devices. +* **Added IDXD dmadev driver implementation.** + + The IDXD dmadev driver provide device drivers for the Intel DSA devices. + This device driver can be used through the generic dmadev API. + Removed Items ------------- diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h new file mode 100644 index 0000000000..c6a7dcd72f --- /dev/null +++ b/drivers/dma/idxd/idxd_internal.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_INTERNAL_H_ +#define _IDXD_INTERNAL_H_ + +/** + * @file idxd_internal.h + * + * Internal data structures for the idxd/DSA driver for dmadev + * + * @warning + * @b EXPERIMENTAL: these structures and APIs may change without prior notice + */ + +extern int idxd_pmd_logtype; + +#define IDXD_PMD_LOG(level, fmt, args...) 
rte_log(RTE_LOG_ ## level, \ + idxd_pmd_logtype, "IDXD: %s(): " fmt "\n", __func__, ##args) + +#define IDXD_PMD_DEBUG(fmt, args...) IDXD_PMD_LOG(DEBUG, fmt, ## args) +#define IDXD_PMD_INFO(fmt, args...) IDXD_PMD_LOG(INFO, fmt, ## args) +#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) +#define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) + +#endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c new file mode 100644 index 0000000000..79e4aadcab --- /dev/null +++ b/drivers/dma/idxd/idxd_pci.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <rte_bus_pci.h> + +#include "idxd_internal.h" + +#define IDXD_VENDOR_ID 0x8086 +#define IDXD_DEVICE_ID_SPR 0x0B25 + +#define IDXD_PMD_DMADEV_NAME_PCI dmadev_idxd_pci + +const struct rte_pci_id pci_id_idxd_map[] = { + { RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static int +idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + int ret = 0; + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + dev->device.driver = &drv->driver; + + return ret; +} + +static int +idxd_dmadev_remove_pci(struct rte_pci_device *dev) +{ + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + + IDXD_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + + return 0; +} + +struct rte_pci_driver idxd_pmd_drv_pci = { + .id_table = pci_id_idxd_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = idxd_dmadev_probe_pci, + .remove = idxd_dmadev_remove_pci, +}; + +RTE_PMD_REGISTER_PCI(IDXD_PMD_DMADEV_NAME_PCI, idxd_pmd_drv_pci); +RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_DMADEV_NAME_PCI, pci_id_idxd_map); +RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_DMADEV_NAME_PCI, "vfio-pci"); 
+RTE_PMD_REGISTER_PARAM_STRING(dmadev_idxd_pci, "max_queues=0"); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build new file mode 100644 index 0000000000..24c0f3106e --- /dev/null +++ b/drivers/dma/idxd/meson.build @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2021 Intel Corporation + +deps += ['bus_pci'] +sources = files( + 'idxd_pci.c' +) \ No newline at end of file diff --git a/drivers/dma/idxd/version.map b/drivers/dma/idxd/version.map new file mode 100644 index 0000000000..4a76d1d52d --- /dev/null +++ b/drivers/dma/idxd/version.map @@ -0,0 +1,3 @@ +DPDK_21 { + local: *; +}; diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build index 0c2c34cd00..0b01b6a8ab 100644 --- a/drivers/dma/meson.build +++ b/drivers/dma/meson.build @@ -6,6 +6,7 @@ if is_windows endif drivers = [ + 'idxd', 'skeleton', ] std_deps = ['dmadev'] -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
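The pci_id_idxd_map table in the patch above ends with a sentinel entry whose vendor_id is 0, which is how a table walker knows where to stop. Below is a minimal, standalone sketch of that lookup. The example_* names are hypothetical and not part of DPDK, and the real PCI bus matching also handles wildcard and subsystem IDs, which this sketch ignores.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_pci_id; only the two fields used here. */
struct example_pci_id {
	uint16_t vendor_id;
	uint16_t device_id;
};

#define EXAMPLE_VENDOR_ID      0x8086  /* same value as IDXD_VENDOR_ID in the patch */
#define EXAMPLE_DEVICE_ID_SPR  0x0B25  /* same value as IDXD_DEVICE_ID_SPR in the patch */

static const struct example_pci_id example_id_map[] = {
	{ EXAMPLE_VENDOR_ID, EXAMPLE_DEVICE_ID_SPR },
	{ 0, 0 }, /* sentinel: vendor_id == 0 ends the table */
};

/* Return 1 if (vendor, device) appears in the table before the sentinel. */
static int
example_id_match(const struct example_pci_id *table,
		uint16_t vendor, uint16_t device)
{
	const struct example_pci_id *id;

	for (id = table; id->vendor_id != 0; id++) {
		if (id->vendor_id == vendor && id->device_id == device)
			return 1;
	}
	return 0;
}
```

Calling example_id_match(example_id_map, 0x8086, 0x0B25) reports a match, while any other device ID falls through to the sentinel and is rejected.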
* Re: [dpdk-dev] [PATCH v3 03/17] dma/idxd: add skeleton for VFIO based DSA device 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 03/17] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-09-08 16:47 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-08 16:47 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj On 08/09/2021 11:30, Kevin Laatz wrote: > Add the basic device probe/remove skeleton code for DSA device bound to > the vfio pci driver. Relevant documentation and MAINTAINERS update also > included. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> <snip> ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v3 04/17] dma/idxd: add bus device probing 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (2 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 03/17] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-08 16:47 ` Conor Walsh 2021-09-15 10:12 ` Maxime Coquelin 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 05/17] dma/idxd: create dmadev instances on bus probe Kevin Laatz ` (12 subsequent siblings) 16 siblings, 2 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probing for DSA devices bound to the IDXD kernel driver. These devices can be configured via sysfs and made available to DPDK if they are found during bus scan. Relevant documentation is included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 +++++++ drivers/dma/idxd/idxd_bus.c | 352 +++++++++++++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 3 files changed, 417 insertions(+) create mode 100644 drivers/dma/idxd/idxd_bus.c diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 924700d17e..ce33e2857a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -32,6 +32,56 @@ target platform is x86-based. No additional compilation steps are necessary. Device Setup ------------- +Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers, +such as ``vfio-pci``. Both are supported by the IDXD PMD. + +Intel\ |reg| DSA devices using IDXD kernel driver +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured. 
+The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration. + +.. note:: + The device configuration can also be done by directly interacting with the sysfs nodes. + An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py`` + included in the driver source directory. + +There are some mandatory configuration steps before being able to use a device with an application. +The internal engines, which do the copies or other operations, +and the work-queues, which are used by applications to assign work to the device, +need to be assigned to groups, and the various other configuration options, +such as priority or queue depth, need to be set for each queue. + +To assign an engine to a group:: + + $ accel-config config-engine dsa0/engine0.0 --group-id=0 + $ accel-config config-engine dsa0/engine0.1 --group-id=1 + +To assign work queues to groups for passing descriptors to the engines, a similar accel-config command can be used. +However, the work queues also need to be configured depending on the use case. +Some configuration options include: + +* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously. +* priority: WQ priority between 1 and 15. Larger value means higher priority. +* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device. +* type: WQ type (kernel/mdev/user). Determines how the device is presented. +* name: identifier given to the WQ. 
+ +Example configuration for a work queue:: + + $ accel-config config-wq dsa0/wq0.0 --group-id=0 \ + --mode=dedicated --priority=10 --wq-size=8 \ + --type=user --name=dpdk_app1 + +Once the devices have been configured, they need to be enabled:: + + $ accel-config enable-device dsa0 + $ accel-config enable-wq dsa0/wq0.0 + +Check the device configuration:: + + $ accel-config list + Devices using VFIO/UIO drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -56,3 +106,17 @@ If fewer workqueues are required, then the ``max_queues`` parameter may be passe the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: $ dpdk-test -a <b:d:f>,max_queues=4 + +For devices bound to the IDXD kernel driver, +the DPDK IDXD driver will automatically perform a scan for available workqueues +to use. Any workqueues found listed in ``/dev/dsa`` on the system will be checked +in ``/sys``, and any which have ``dpdk_`` prefix in their name will be automatically +probed by the driver to make them available to the application. +Alternatively, to support use by multiple DPDK processes simultaneously, +the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue +name prefix, instead of ``dpdk_``, allowing each DPDK application instance to only +use a subset of configured queues. + +Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, +that is a "DMA device type" inside DPDK, and can be accessed using APIs from the +``rte_dmadev`` library. 
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c new file mode 100644 index 0000000000..4097ecd940 --- /dev/null +++ b/drivers/dma/idxd/idxd_bus.c @@ -0,0 +1,352 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <dirent.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/mman.h> +#include <libgen.h> + +#include <rte_bus.h> +#include <rte_log.h> +#include <rte_dmadev_pmd.h> +#include <rte_string_fns.h> + +#include "idxd_internal.h" + +/* default value for DSA paths, but allow override in environment for testing */ +#define DSA_DEV_PATH "/dev/dsa" +#define DSA_SYSFS_PATH "/sys/bus/dsa/devices" + +static unsigned int devcount; + +/** unique identifier for a DSA device/WQ instance */ +struct dsa_wq_addr { + uint16_t device_id; + uint16_t wq_id; +}; + +/** a DSA device instance */ +struct rte_dsa_device { + struct rte_device device; /**< Inherit core device */ + TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */ + + char wq_name[32]; /**< the workqueue name/number e.g. 
wq0.1 */ + struct dsa_wq_addr addr; /**< Identifies the specific WQ */ +}; + +/* forward prototypes */ +struct dsa_bus; +static int dsa_scan(void); +static int dsa_probe(void); +static struct rte_device *dsa_find_device(const struct rte_device *start, + rte_dev_cmp_t cmp, const void *data); +static enum rte_iova_mode dsa_get_iommu_class(void); +static int dsa_addr_parse(const char *name, void *addr); + +/** List of devices */ +TAILQ_HEAD(dsa_device_list, rte_dsa_device); + +/** + * Structure describing the DSA bus + */ +struct dsa_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_driver driver; /**< Driver struct for devices to point to */ + struct dsa_device_list device_list; /**< List of PCI devices */ +}; + +struct dsa_bus dsa_bus = { + .bus = { + .scan = dsa_scan, + .probe = dsa_probe, + .find_device = dsa_find_device, + .get_iommu_class = dsa_get_iommu_class, + .parse = dsa_addr_parse, + }, + .driver = { + .name = "dmadev_idxd" + }, + .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list), +}; + +static inline const char * +dsa_get_dev_path(void) +{ + const char *path = getenv("DSA_DEV_PATH"); + return path ? path : DSA_DEV_PATH; +} + +static inline const char * +dsa_get_sysfs_path(void) +{ + const char *path = getenv("DSA_SYSFS_PATH"); + return path ? 
path : DSA_SYSFS_PATH; +} + +static void * +idxd_vdev_mmap_wq(struct rte_dsa_device *dev) +{ + void *addr; + char path[PATH_MAX]; + int fd; + + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); + fd = open(path, O_RDWR); + if (fd < 0) { + IDXD_PMD_ERR("Failed to open device path: %s", path); + return NULL; + } + + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + IDXD_PMD_ERR("Failed to mmap device %s", path); + return NULL; + } + + return addr; +} + +static int +read_wq_string(struct rte_dsa_device *dev, const char *filename, + char *value, size_t valuelen) +{ + char sysfs_node[PATH_MAX]; + int len; + int fd; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + fd = open(sysfs_node, O_RDONLY); + if (fd < 0) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + len = read(fd, value, valuelen - 1); + close(fd); + if (len < 0) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + value[len] = '\0'; + return 0; +} + +static int +read_wq_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +read_device_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s", + 
dsa_get_sysfs_path(), dev->addr.device_id, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +idxd_probe_dsa(struct rte_dsa_device *dev) +{ + struct idxd_dmadev idxd = {0}; + int ret = 0; + + IDXD_PMD_INFO("Probing device %s on numa node %d", + dev->wq_name, dev->device.numa_node); + if (read_wq_int(dev, "size", &ret) < 0) + return -1; + idxd.max_batches = ret; + if (read_wq_int(dev, "max_batch_size", &ret) < 0) + return -1; + idxd.max_batch_size = ret; + idxd.qid = dev->addr.wq_id; + idxd.u.vdev.dsa_id = dev->addr.device_id; + idxd.sva_support = 1; + + idxd.portal = idxd_vdev_mmap_wq(dev); + if (idxd.portal == NULL) { + IDXD_PMD_ERR("WQ mmap failed"); + return -ENOENT; + } + + return 0; +} + +static int +is_for_this_process_use(const char *name) +{ + char *runtime_dir = strdup(rte_eal_get_runtime_dir()); + char *prefix = basename(runtime_dir); + int prefixlen = strlen(prefix); + int retval = 0; + + if (strncmp(name, "dpdk_", 5) == 0) + retval = 1; + if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_') + retval = 1; + + free(runtime_dir); + return retval; +} + +static int +dsa_probe(void) +{ + struct rte_dsa_device *dev; + + TAILQ_FOREACH(dev, &dsa_bus.device_list, next) { + char type[64], name[64]; + + if (read_wq_string(dev, "type", type, sizeof(type)) < 0 || + read_wq_string(dev, "name", name, sizeof(name)) < 0) + continue; + + if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) { + dev->device.driver = &dsa_bus.driver; + idxd_probe_dsa(dev); + continue; + } + IDXD_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name); + } + + return 0; +} + +static int +dsa_scan(void) +{ + const char *path = dsa_get_dev_path(); + 
struct dirent *wq; + DIR *dev_dir; + + dev_dir = opendir(path); + if (dev_dir == NULL) { + if (errno == ENOENT) + return 0; /* no bus, return without error */ + IDXD_PMD_ERR("%s(): opendir '%s' failed: %s", + __func__, path, strerror(errno)); + return -1; + } + + while ((wq = readdir(dev_dir)) != NULL) { + struct rte_dsa_device *dev; + int numa_node = -1; + + if (strncmp(wq->d_name, "wq", 2) != 0) + continue; + if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) { + IDXD_PMD_ERR("%s(): wq name too long: '%s', skipping", + __func__, wq->d_name); + continue; + } + IDXD_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name); + + dev = malloc(sizeof(*dev)); + if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) { + IDXD_PMD_ERR("Error parsing WQ name: %s", wq->d_name); + free(dev); + continue; + } + dev->device.bus = &dsa_bus.bus; + strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name)); + TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next); + devcount++; + + read_device_int(dev, "numa_node", &numa_node); + dev->device.numa_node = numa_node; + dev->device.name = dev->wq_name; + } + + closedir(dev_dir); + return 0; +} + +static struct rte_device * +dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list); + + /* the rte_device struct must be at start of dsa structure */ + RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0); + + if (start != NULL) /* jump to start point if given */ + dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next); + while (dev != NULL) { + if (cmp(&dev->device, data) == 0) + return &dev->device; + dev = TAILQ_NEXT(dev, next); + } + return NULL; +} + +static enum rte_iova_mode +dsa_get_iommu_class(void) +{ + /* if there are no devices, report don't care, otherwise VA mode */ + return devcount > 0 ? 
RTE_IOVA_VA : RTE_IOVA_DC; +} + +static int +dsa_addr_parse(const char *name, void *addr) +{ + struct dsa_wq_addr *wq = addr; + unsigned int device_id, wq_id; + + if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) { + IDXD_PMD_DEBUG("Parsing WQ name failed: %s", name); + return -1; + } + + wq->device_id = device_id; + wq->wq_id = wq_id; + return 0; +} + +RTE_REGISTER_BUS(dsa, dsa_bus.bus); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 24c0f3106e..f1fea000a7 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -3,5 +3,6 @@ deps += ['bus_pci'] sources = files( + 'idxd_bus.c', 'idxd_pci.c' ) \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
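For readers following the probe logic in the patch above, here is a standalone sketch of the two checks it relies on: parsing a "wq<device>.<queue>" name, and deciding whether a configured WQ name belongs to this process (the "dpdk_" prefix or "<file-prefix>_"). The example_* names are hypothetical. The patch derives the prefix from the EAL runtime directory, whereas this sketch takes it as a plain argument, so treat it as an illustration rather than the driver code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of struct dsa_wq_addr from the patch. */
struct example_wq_addr {
	unsigned int device_id;
	unsigned int wq_id;
};

/* Parse "wq<device>.<queue>"; return 0 on success, -1 otherwise. */
static int
example_wq_parse(const char *name, struct example_wq_addr *addr)
{
	unsigned int device_id, wq_id;

	if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2)
		return -1;
	addr->device_id = device_id;
	addr->wq_id = wq_id;
	return 0;
}

/* Return 1 if the configured WQ name starts with "dpdk_" or "<prefix>_". */
static int
example_wq_is_ours(const char *wq_name, const char *file_prefix)
{
	size_t len = strlen(file_prefix);

	if (strncmp(wq_name, "dpdk_", 5) == 0)
		return 1;
	/* the len > 0 guard is an addition in this sketch, not in the patch */
	if (len > 0 && strncmp(wq_name, file_prefix, len) == 0 &&
			wq_name[len] == '_')
		return 1;
	return 0;
}
```

With a file prefix of "rte", a queue named "rte_q0" or "dpdk_app1" would be claimed, while "rteq0" would be left alone because the underscore separator is required.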
* Re: [dpdk-dev] [PATCH v3 04/17] dma/idxd: add bus device probing 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 04/17] dma/idxd: add bus device probing Kevin Laatz @ 2021-09-08 16:47 ` Conor Walsh 2021-09-09 11:10 ` Kevin Laatz 2021-09-15 10:12 ` Maxime Coquelin 1 sibling, 1 reply; 243+ messages in thread From: Conor Walsh @ 2021-09-08 16:47 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj <snip> > +static void * > +idxd_vdev_mmap_wq(struct rte_dsa_device *dev) Some inconsistent naming between vdev and bus in some of these functions the above should be idxd_bus_mmap_wq for example. > +{ > + void *addr; > + char path[PATH_MAX]; > + int fd; > + > + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); > + fd = open(path, O_RDWR); > + if (fd < 0) { > + IDXD_PMD_ERR("Failed to open device path: %s", path); > + return NULL; > + } > + > + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); > + close(fd); > + if (addr == MAP_FAILED) { > + IDXD_PMD_ERR("Failed to mmap device %s", path); > + return NULL; > + } > + > + return addr; > +} Reviewed-by: Conor Walsh <conor.walsh@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v3 04/17] dma/idxd: add bus device probing 2021-09-08 16:47 ` Conor Walsh @ 2021-09-09 11:10 ` Kevin Laatz 0 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-09 11:10 UTC (permalink / raw) To: Conor Walsh, dev; +Cc: bruce.richardson, fengchengwen, jerinj On 08/09/2021 17:47, Conor Walsh wrote: > > <snip> > >> +static void * >> +idxd_vdev_mmap_wq(struct rte_dsa_device *dev) > > Some inconsistent naming between vdev and bus in some of these > functions the above should be idxd_bus_mmap_wq for example. Will fix in v4, thanks. > > ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v3 04/17] dma/idxd: add bus device probing 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 04/17] dma/idxd: add bus device probing Kevin Laatz 2021-09-08 16:47 ` Conor Walsh @ 2021-09-15 10:12 ` Maxime Coquelin 2021-09-15 11:06 ` Bruce Richardson 1 sibling, 1 reply; 243+ messages in thread From: Maxime Coquelin @ 2021-09-15 10:12 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh Hi Kevin, On 9/8/21 12:30 PM, Kevin Laatz wrote: > Add the basic device probing for DSA devices bound to the IDXD kernel > driver. These devices can be configured via sysfs and made available to > DPDK if they are found during bus scan. Relevant documentation is included. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > --- > doc/guides/dmadevs/idxd.rst | 64 +++++++ > drivers/dma/idxd/idxd_bus.c | 352 +++++++++++++++++++++++++++++++++++ > drivers/dma/idxd/meson.build | 1 + > 3 files changed, 417 insertions(+) > create mode 100644 drivers/dma/idxd/idxd_bus.c > Sorry if it has been asked before, but what is the reason this DSA bus driver is not in drivers/bus/? ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v3 04/17] dma/idxd: add bus device probing 2021-09-15 10:12 ` Maxime Coquelin @ 2021-09-15 11:06 ` Bruce Richardson 0 siblings, 0 replies; 243+ messages in thread From: Bruce Richardson @ 2021-09-15 11:06 UTC (permalink / raw) To: Maxime Coquelin; +Cc: Kevin Laatz, dev, fengchengwen, jerinj, conor.walsh On Wed, Sep 15, 2021 at 12:12:34PM +0200, Maxime Coquelin wrote: > Hi Kevin, > > On 9/8/21 12:30 PM, Kevin Laatz wrote: > > Add the basic device probing for DSA devices bound to the IDXD kernel > > driver. These devices can be configured via sysfs and made available to > > DPDK if they are found during bus scan. Relevant documentation is included. > > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > > --- > > doc/guides/dmadevs/idxd.rst | 64 +++++++ > > drivers/dma/idxd/idxd_bus.c | 352 +++++++++++++++++++++++++++++++++++ > > drivers/dma/idxd/meson.build | 1 + > > 3 files changed, 417 insertions(+) > > create mode 100644 drivers/dma/idxd/idxd_bus.c > > > > Sorry if it has been asked before, but what is the reason this DSA bus > driver is not in drivers/bus/? > This bus-driver solution came out of discussion previously when we were looking to add DSA support to the ioat driver. Individual device drivers of any class cannot themselves do any probing to discover devices, but only bus drivers can do that. Therefore, to enable discovery of DSA devices used by the kernel driver, a custom bus driver is necessary. Since this bus driver only supports discovery of DSA devices using a single driver type behind the scenes (referred to in previous discussions as a "singleton" bus driver), it's kept in with the rest of the DSA driver code. It's not a full bus driver, more a minimal driver whose job it is to scan /dev/dsa and initialise the proper DSA/idxd driver. 
So we could move it to drivers/bus, but I think it would lead to overall higher maintenance and I fail to see any real benefits. /Bruce ^ permalink raw reply [flat|nested] 243+ messages in thread
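To make the "minimal bus whose job is to scan /dev/dsa" point concrete, here is a rough standalone sketch of such a scan: open the directory, keep entries whose name starts with "wq", and treat a missing directory as "no bus present" rather than an error. The function name is hypothetical and it only counts entries. The patch itself goes further and builds a device list from them.

```c
#include <dirent.h>
#include <errno.h>
#include <string.h>

/* Count directory entries whose name starts with "wq".
 * Returns 0 when the directory does not exist (no bus present),
 * -1 on any other opendir() failure.
 */
static int
example_count_wqs(const char *dev_path)
{
	struct dirent *entry;
	DIR *dir;
	int count = 0;

	dir = opendir(dev_path);
	if (dir == NULL)
		return errno == ENOENT ? 0 : -1;

	while ((entry = readdir(dir)) != NULL) {
		if (strncmp(entry->d_name, "wq", 2) == 0)
			count++;
	}
	closedir(dir);
	return count;
}
```

On a system without the idxd kernel driver loaded, example_count_wqs("/dev/dsa") would be expected to return 0, matching the "return without error" behaviour of dsa_scan() in the patch.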
* [dpdk-dev] [PATCH v3 05/17] dma/idxd: create dmadev instances on bus probe 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (3 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 04/17] dma/idxd: add bus device probing Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-08 16:47 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 06/17] dma/idxd: create dmadev instances on pci probe Kevin Laatz ` (11 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the bus scan/probe, create a dmadev instance for each HW queue. Internal structures required for device creation are also added. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_bus.c | 20 ++++++++- drivers/dma/idxd/idxd_common.c | 75 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 40 +++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 4 files changed, 135 insertions(+), 1 deletion(-) create mode 100644 drivers/dma/idxd/idxd_common.c diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 4097ecd940..9b55451ad2 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -85,6 +85,18 @@ dsa_get_sysfs_path(void) return path ? 
path : DSA_SYSFS_PATH; } +static int +idxd_dev_close(struct rte_dmadev *dev) +{ + struct idxd_dmadev *idxd = dev->data->dev_private; + munmap(idxd->portal, 0x1000); + return 0; +} + +static const struct rte_dmadev_ops idxd_vdev_ops = { + .dev_close = idxd_dev_close, +}; + static void * idxd_vdev_mmap_wq(struct rte_dsa_device *dev) { @@ -206,7 +218,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -1; idxd.max_batch_size = ret; idxd.qid = dev->addr.wq_id; - idxd.u.vdev.dsa_id = dev->addr.device_id; + idxd.u.bus.dsa_id = dev->addr.device_id; idxd.sva_support = 1; idxd.portal = idxd_vdev_mmap_wq(dev); @@ -215,6 +227,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -ENOENT; } + ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_vdev_ops); + if (ret) { + IDXD_PMD_ERR("Failed to create rawdev %s", dev->wq_name); + return ret; + } + return 0; } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c new file mode 100644 index 0000000000..7770b2e264 --- /dev/null +++ b/drivers/dma/idxd/idxd_common.c @@ -0,0 +1,75 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> +#include <rte_common.h> + +#include "idxd_internal.h" + +#define IDXD_PMD_NAME_STR "dmadev_idxd" + +int +idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, + const struct rte_dmadev_ops *ops) +{ + struct idxd_dmadev *idxd; + struct rte_dmadev *dmadev = NULL; + int ret = 0; + + if (!name) { + IDXD_PMD_ERR("Invalid name of the device!"); + ret = -EINVAL; + goto cleanup; + } + + /* Allocate device structure */ + dmadev = rte_dmadev_pmd_allocate(name); + if (dmadev == NULL) { + IDXD_PMD_ERR("Unable to allocate raw device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->dev_ops = ops; + dmadev->device = dev; + + idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); + if (idxd == NULL) { + 
IDXD_PMD_ERR("Unable to allocate memory for device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->data->dev_private = idxd; + dmadev->dev_private = idxd; + *idxd = *base_idxd; /* copy over the main fields already passed in */ + idxd->dmadev = dmadev; + + /* allocate batch index ring and completion ring. + * The +1 is because we can never fully use + * the ring, otherwise read == write means both full and empty. + */ + idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) + + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1), + sizeof(idxd->batch_comp_ring[0])); + if (idxd->batch_comp_ring == NULL) { + IDXD_PMD_ERR("Unable to reserve memory for batch data\n"); + ret = -ENOMEM; + goto cleanup; + } + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1]; + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring); + + return 0; + +cleanup: + if (dmadev) + rte_dmadev_pmd_release(dmadev); + + return ret; +} + +int idxd_pmd_logtype; + +RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index c6a7dcd72f..99ab2df925 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -24,4 +24,44 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) 
IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_dmadev { + /* counters to track the batches */ + unsigned short max_batches; + unsigned short batch_idx_read; + unsigned short batch_idx_write; + + /* track descriptors and handles */ + unsigned short desc_ring_mask; + unsigned short ids_avail; /* handles for ops completed */ + unsigned short ids_returned; /* the read pointer for hdls/desc rings */ + unsigned short batch_start; /* start+size == write pointer for hdls/desc */ + unsigned short batch_size; + + void *portal; /* address to write the batch descriptor */ + + struct idxd_completion *batch_comp_ring; + unsigned short *batch_idx_ring; /* store where each batch ends */ + + struct rte_dmadev_stats stats; + + rte_iova_t batch_iova; /* base address of the batch comp ring */ + rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ + + unsigned short max_batch_size; + + struct rte_dmadev *dmadev; + struct rte_dmadev_vchan_conf qcfg; + uint8_t sva_support; + uint8_t qid; + + union { + struct { + unsigned int dsa_id; + } bus; + } u; +}; + +int idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, const struct rte_dmadev_ops *ops); + #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index f1fea000a7..81150e6f25 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -4,5 +4,6 @@ deps += ['bus_pci'] sources = files( 'idxd_bus.c', + 'idxd_common.c', 'idxd_pci.c' ) \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
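The comment in idxd_dmadev_create() about allocating max_batches + 1 entries reflects a common ring convention: one slot is always left unused so that read == write can only mean "empty". The sketch below illustrates that convention in isolation. The example_* names and the capacity of 3 are hypothetical, and it uses a modulo wrap for clarity, which may differ from how the driver advances its indices.

```c
#include <stdint.h>

#define EXAMPLE_MAX_BATCHES 3 /* hypothetical usable capacity */

/* One extra slot so that read == write always means "empty". */
struct example_ring {
	uint16_t slots[EXAMPLE_MAX_BATCHES + 1];
	uint16_t read;
	uint16_t write;
};

static uint16_t
example_ring_next(uint16_t idx)
{
	return (uint16_t)((idx + 1) % (EXAMPLE_MAX_BATCHES + 1));
}

static int
example_ring_empty(const struct example_ring *r)
{
	return r->read == r->write;
}

static int
example_ring_full(const struct example_ring *r)
{
	return example_ring_next(r->write) == r->read;
}

/* Return 0 on success, -1 if the ring is full. */
static int
example_ring_push(struct example_ring *r, uint16_t value)
{
	if (example_ring_full(r))
		return -1;
	r->slots[r->write] = value;
	r->write = example_ring_next(r->write);
	return 0;
}

/* Return 0 on success, -1 if the ring is empty. */
static int
example_ring_pop(struct example_ring *r, uint16_t *value)
{
	if (example_ring_empty(r))
		return -1;
	*value = r->slots[r->read];
	r->read = example_ring_next(r->read);
	return 0;
}
```

With four slots allocated, exactly three pushes succeed before the ring reports full, which is why the allocation in the patch is sized from max_batches + 1 rather than max_batches.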
* Re: [dpdk-dev] [PATCH v3 05/17] dma/idxd: create dmadev instances on bus probe 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 05/17] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-09-08 16:47 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-08 16:47 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj <snip> > idxd.qid = dev->addr.wq_id; > - idxd.u.vdev.dsa_id = dev->addr.device_id; > + idxd.u.bus.dsa_id = dev->addr.device_id; This should be done in previous patch. Reviewed-by: Conor Walsh <conor.walsh@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v3 06/17] dma/idxd: create dmadev instances on pci probe 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (4 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 05/17] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-08 16:48 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 07/17] dma/idxd: add datapath structures Kevin Laatz ` (10 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the PCI probe, create a dmadev instance for each HW queue. HW definitions required are also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_hw_defs.h | 71 ++++++++ drivers/dma/idxd/idxd_internal.h | 16 ++ drivers/dma/idxd/idxd_pci.c | 272 ++++++++++++++++++++++++++++++- 3 files changed, 356 insertions(+), 3 deletions(-) create mode 100644 drivers/dma/idxd/idxd_hw_defs.h diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h new file mode 100644 index 0000000000..ea627cba6d --- /dev/null +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_HW_DEFS_H_ +#define _IDXD_HW_DEFS_H_ + +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ + +#define IDXD_CMD_SHIFT 20 +enum rte_idxd_cmds { + idxd_enable_dev = 1, + idxd_disable_dev, + idxd_drain_all, + idxd_abort_all, + idxd_reset_device, + idxd_enable_wq, + idxd_disable_wq, + idxd_drain_wq, + idxd_abort_wq, + idxd_reset_wq, +}; + +/* General bar0 registers */ +struct rte_idxd_bar0 { + uint32_t __rte_cache_aligned version; /* offset 0x00 */ + uint64_t __rte_aligned(0x10) gencap; /* offset 0x10 
*/ + uint64_t __rte_aligned(0x10) wqcap; /* offset 0x20 */ + uint64_t __rte_aligned(0x10) grpcap; /* offset 0x30 */ + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ +}; + +/* workqueue config is provided by array of uint32_t. */ +enum rte_idxd_wqcfg { + wq_size_idx, /* size is in first 32-bit value */ + wq_threshold_idx, /* WQ threshold second 32-bits */ + wq_mode_idx, /* WQ mode and other flags */ + wq_sizes_idx, /* WQ transfer and batch sizes */ + wq_occ_int_idx, /* WQ occupancy interrupt handle */ + wq_occ_limit_idx, /* WQ occupancy limit */ + wq_state_idx, /* WQ state and occupancy state */ +}; + +#define WQ_MODE_SHARED 0 +#define WQ_MODE_DEDICATED 1 +#define WQ_PRIORITY_SHIFT 4 +#define WQ_BATCH_SZ_SHIFT 5 +#define WQ_STATE_SHIFT 30 +#define WQ_STATE_MASK 0x3 + +struct rte_idxd_grpcfg { + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ + uint64_t grpengcfg; /* offset 32 */ + uint32_t grpflags; /* offset 40 */ +}; + +#define GENSTS_DEV_STATE_MASK 0x03 +#define CMDSTATUS_ACTIVE_SHIFT 31 +#define CMDSTATUS_ACTIVE_MASK (1 << 31) +#define CMDSTATUS_ERR_MASK 0xFF + +#endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 99ab2df925..d92d7b3e6f 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -5,6 +5,10 @@ #ifndef _IDXD_INTERNAL_H_ #define _IDXD_INTERNAL_H_ +#include <rte_spinlock.h> + +#include "idxd_hw_defs.h" + /** * @file idxd_internal.h * @@ 
-24,6 +28,16 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_pci_common { + rte_spinlock_t lk; + + uint8_t wq_cfg_sz; + volatile struct rte_idxd_bar0 *regs; + volatile uint32_t *wq_regs_base; + volatile struct rte_idxd_grpcfg *grp_regs; + volatile void *portals; +}; + struct idxd_dmadev { /* counters to track the batches */ unsigned short max_batches; @@ -58,6 +72,8 @@ struct idxd_dmadev { struct { unsigned int dsa_id; } bus; + + struct idxd_pci_common *pci; } u; }; diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 79e4aadcab..318931713c 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -3,6 +3,9 @@ */ #include <rte_bus_pci.h> +#include <rte_devargs.h> +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> #include "idxd_internal.h" @@ -16,17 +19,280 @@ const struct rte_pci_id pci_id_idxd_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static inline int +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) +{ + uint8_t err_code; + uint16_t qid = idxd->qid; + int i = 0; + + if (command >= idxd_disable_wq && command <= idxd_reset_wq) + qid = (1 << qid); + rte_spinlock_lock(&idxd->u.pci->lk); + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; + + do { + rte_pause(); + err_code = idxd->u.pci->regs->cmdstatus; + if (++i >= 1000) { + IDXD_PMD_ERR("Timeout waiting for command response from HW"); + rte_spinlock_unlock(&idxd->u.pci->lk); + return err_code; + } + } while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK); + rte_spinlock_unlock(&idxd->u.pci->lk); + + return err_code & CMDSTATUS_ERR_MASK; +} + +static uint32_t * +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) +{ + return RTE_PTR_ADD(pci->wq_regs_base, + (uintptr_t)wq_idx << (5 + pci->wq_cfg_sz)); +} + +static int +idxd_is_wq_enabled(struct idxd_dmadev *idxd) +{ + uint32_t 
state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx]; + return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; +} + +static const struct rte_dmadev_ops idxd_pci_ops = { + +}; + +/* each portal uses 4 x 4k pages */ +#define IDXD_PORTAL_SIZE (4096 * 4) + +static int +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, + unsigned int max_queues) +{ + struct idxd_pci_common *pci; + uint8_t nb_groups, nb_engines, nb_wqs; + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ + uint16_t wq_size, total_wq_size; + uint8_t lg2_max_batch, lg2_max_copy_size; + unsigned int i, err_code; + + pci = malloc(sizeof(*pci)); + if (pci == NULL) { + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); + goto err; + } + rte_spinlock_init(&pci->lk); + + /* assign the bar registers, and then configure device */ + pci->regs = dev->mem_resource[0].addr; + grp_offset = (uint16_t)pci->regs->offsets[0]; + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); + pci->portals = dev->mem_resource[2].addr; + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; + + /* sanity check device status */ + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { + /* need function-level-reset (FLR) or is enabled */ + IDXD_PMD_ERR("Device status is not disabled, cannot init"); + goto err; + } + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { + /* command in progress */ + IDXD_PMD_ERR("Device has a command in progress, cannot init"); + goto err; + } + + /* read basic info about the hardware for use when configuring */ + nb_groups = (uint8_t)pci->regs->grpcap; + nb_engines = (uint8_t)pci->regs->engcap; + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); + total_wq_size = (uint16_t)pci->regs->wqcap; + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F; + lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F; + + IDXD_PMD_DEBUG("nb_groups 
= %u, nb_engines = %u, nb_wqs = %u", + nb_groups, nb_engines, nb_wqs); + + /* zero out any old config */ + for (i = 0; i < nb_groups; i++) { + pci->grp_regs[i].grpengcfg = 0; + pci->grp_regs[i].grpwqcfg[0] = 0; + } + for (i = 0; i < nb_wqs; i++) + idxd_get_wq_cfg(pci, i)[0] = 0; + + /* limit queues if necessary */ + if (max_queues != 0 && nb_wqs > max_queues) { + nb_wqs = max_queues; + if (nb_engines > max_queues) + nb_engines = max_queues; + if (nb_groups > max_queues) + nb_groups = max_queues; + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); + } + + /* put each engine into a separate group to avoid reordering */ + if (nb_groups > nb_engines) + nb_groups = nb_engines; + if (nb_groups < nb_engines) + nb_engines = nb_groups; + + /* assign engines to groups, round-robin style */ + for (i = 0; i < nb_engines; i++) { + IDXD_PMD_DEBUG("Assigning engine %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); + } + + /* now do the same for queues and give work slots to each queue */ + wq_size = total_wq_size / nb_wqs; + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", + wq_size, lg2_max_batch, lg2_max_copy_size); + for (i = 0; i < nb_wqs; i++) { + /* add work queue "i" to a group */ + IDXD_PMD_DEBUG("Assigning work queue %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); + /* now configure it, in terms of size, max batch, mode */ + idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size; + idxd_get_wq_cfg(pci, i)[wq_mode_idx] = (1 << WQ_PRIORITY_SHIFT) | + WQ_MODE_DEDICATED; + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); + } + + /* dump the group configuration to output */ + for (i = 0; i < nb_groups; i++) { + IDXD_PMD_DEBUG("## Group %d", i); + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); + IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg); + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32,
pci->grp_regs[i].grpflags); + } + + idxd->u.pci = pci; + idxd->max_batches = wq_size; + + /* enable the device itself */ + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); + if (err_code) { + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device enabled OK"); + + return nb_wqs; + +err: + free(pci); + return -1; +} + static int idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) { - int ret = 0; + struct idxd_dmadev idxd = {0}; + uint8_t nb_wqs; + int qid, ret = 0; char name[PCI_PRI_STR_SIZE]; + unsigned int max_queues = 0; rte_pci_device_name(&dev->addr, name, sizeof(name)); IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); dev->device.driver = &drv->driver; - return ret; + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { + /* if the number of devargs grows beyond just 1, use rte_kvargs */ + if (sscanf(dev->device.devargs->args, + "max_queues=%u", &max_queues) != 1) { + IDXD_PMD_ERR("Invalid device parameter: '%s'", + dev->device.devargs->args); + return -1; + } + } + + ret = init_pci_device(dev, &idxd, max_queues); + if (ret < 0) { + IDXD_PMD_ERR("Error initializing PCI hardware"); + return ret; + } + if (idxd.u.pci->portals == NULL) { + IDXD_PMD_ERR("Error, invalid portal assigned during initialization\n"); + return -EINVAL; + } + nb_wqs = (uint8_t)ret; + + /* set up one device for each queue */ + for (qid = 0; qid < nb_wqs; qid++) { + char qname[32]; + + /* add the queue number to each device name */ + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); + idxd.qid = qid; + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, + qid * IDXD_PORTAL_SIZE); + if (idxd_is_wq_enabled(&idxd)) + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); + ret = idxd_dmadev_create(qname, &dev->device, + &idxd, &idxd_pci_ops); + if (ret != 0) { + IDXD_PMD_ERR("Failed to create dmadev %s", name); + if (qid == 0) /* if no devices using this, free pci */ + 
free(idxd.u.pci); + return ret; + } + } + + return 0; +} + +static int +idxd_dmadev_destroy(const char *name) +{ + int ret; + uint8_t err_code; + struct rte_dmadev *rdev; + struct idxd_dmadev *idxd; + + if (!name) { + IDXD_PMD_ERR("Invalid device name"); + return -EINVAL; + } + + rdev = rte_dmadev_get_device_by_name(name); + if (!rdev) { + IDXD_PMD_ERR("Invalid device name (%s)", name); + return -EINVAL; + } + + idxd = rdev->dev_private; + if (!idxd) { + IDXD_PMD_ERR("Error getting dev_private"); + return -EINVAL; + } + + /* disable the device */ + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); + if (err_code) { + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device disabled OK"); + + /* free device memory */ + IDXD_PMD_DEBUG("Freeing device driver memory"); + rdev->dev_private = NULL; + rte_free(idxd->batch_idx_ring); + rte_free(idxd->desc_ring); + + /* rte_dmadev_close is called by pmd_release */ + ret = rte_dmadev_pmd_release(rdev); + if (ret) + IDXD_PMD_DEBUG("Device cleanup failed"); + + return 0; } static int @@ -39,7 +305,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev) IDXD_PMD_INFO("Closing %s on NUMA node %d", name, dev->device.numa_node); - return 0; + return idxd_dmadev_destroy(name); } struct rte_pci_driver idxd_pmd_drv_pci = { -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
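For readers following the register handling in this patch, the command word and the per-queue config offset can be sketched in isolation. This is only an illustration derived from idxd_pci_dev_command() and idxd_get_wq_cfg() above, not driver code: the helper names idxd_cmd_word() and idxd_wq_cfg_offset() and the cmd_* enumerators are made up for the example, and qid is assumed to be below 16 so that the bitmask fits the 16-bit operand used in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define IDXD_CMD_SHIFT 20

/* Same numbering as enum rte_idxd_cmds in the patch; names are local to this sketch. */
enum idxd_cmd {
	cmd_enable_dev = 1,
	cmd_disable_dev,
	cmd_drain_all,
	cmd_abort_all,
	cmd_reset_device,
	cmd_enable_wq,
	cmd_disable_wq,
	cmd_drain_wq,
	cmd_abort_wq,
	cmd_reset_wq,
};

/* Build the 32-bit value that would be written to the cmd register. */
static uint32_t
idxd_cmd_word(enum idxd_cmd command, uint16_t qid)
{
	uint16_t operand = qid;

	/* As in the patch, disable/drain/abort/reset WQ take a one-hot mask
	 * of the queue rather than its index (assumes qid < 16). */
	if (command >= cmd_disable_wq && command <= cmd_reset_wq)
		operand = (uint16_t)(1u << qid);
	return ((uint32_t)command << IDXD_CMD_SHIFT) | operand;
}

/* Byte offset of one work queue's config block from wq_regs_base.
 * Each block is 32 bytes scaled by 2^wq_cfg_sz, hence the (5 + wq_cfg_sz) shift. */
static size_t
idxd_wq_cfg_offset(uint8_t wq_idx, uint8_t wq_cfg_sz)
{
	return (size_t)wq_idx << (5 + wq_cfg_sz);
}
```

With these helpers, enabling WQ 3 gives (6 << 20) | 3, while disabling the same queue gives (7 << 20) | (1 << 3), which matches the range check in the patch that leaves idxd_enable_wq outside the bitmask conversion.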
* Re: [dpdk-dev] [PATCH v3 06/17] dma/idxd: create dmadev instances on pci probe 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 06/17] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-09-08 16:48 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-08 16:48 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj > When a suitable device is found during the PCI probe, create a dmadev > instance for each HW queue. HW definitions required are also included. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> <snip> ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v3 07/17] dma/idxd: add datapath structures 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (5 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 06/17] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-09 11:23 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 08/17] dma/idxd: add configure and info_get functions Kevin Laatz ` (9 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data structures required for the data path for IDXD devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- v2: add completion status for invalid opcode --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 33 ++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 60 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 ++ drivers/dma/idxd/idxd_pci.c | 2 +- 5 files changed, 98 insertions(+), 1 deletion(-) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 9b55451ad2..20d17c20ca 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -95,6 +95,7 @@ idxd_dev_close(struct rte_dmadev *dev) static const struct rte_dmadev_ops idxd_vdev_ops = { .dev_close = idxd_dev_close, + .dev_dump = idxd_dump, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 7770b2e264..9490439fdc 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -10,6 +10,35 @@ #define IDXD_PMD_NAME_STR "dmadev_idxd" +int +idxd_dump(const struct rte_dmadev *dev, FILE *f) +{ + struct idxd_dmadev *idxd = dev->dev_private; + unsigned int i; + + fprintf(f, "== Private Data ==\n"); + fprintf(f, " Portal: %p\n", idxd->portal); + 
fprintf(f, " Config: { ring_size: %u }\n", + idxd->qcfg.nb_desc); + fprintf(f, " Batch ring (sz = %u, max_batches = %u):\n\t", + idxd->max_batches + 1, idxd->max_batches); + for (i = 0; i <= idxd->max_batches; i++) { + fprintf(f, " %u ", idxd->batch_idx_ring[i]); + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) + fprintf(f, "[rd ptr, wr ptr] "); + else if (i == idxd->batch_idx_read) + fprintf(f, "[rd ptr] "); + else if (i == idxd->batch_idx_write) + fprintf(f, "[wr ptr] "); + if (i == idxd->max_batches) + fprintf(f, "\n"); + } + + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, @@ -19,6 +48,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, struct rte_dmadev *dmadev = NULL; int ret = 0; + RTE_BUILD_BUG_ON(sizeof(struct idxd_hw_desc) != 64); + RTE_BUILD_BUG_ON(offsetof(struct idxd_hw_desc, size) != 32); + RTE_BUILD_BUG_ON(sizeof(struct idxd_completion) != 32); + if (!name) { IDXD_PMD_ERR("Invalid name of the device!"); ret = -EINVAL; diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index ea627cba6d..55ca9f7f52 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -5,6 +5,66 @@ #ifndef _IDXD_HW_DEFS_H_ #define _IDXD_HW_DEFS_H_ +/* + * Defines used in the data path for interacting with IDXD hardware. + */ +#define IDXD_CMD_OP_SHIFT 24 +enum rte_idxd_ops { + idxd_op_nop = 0, + idxd_op_batch, + idxd_op_drain, + idxd_op_memmove, + idxd_op_fill +}; + +#define IDXD_FLAG_FENCE (1 << 0) +#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2) +#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3) +#define IDXD_FLAG_CACHE_CONTROL (1 << 8) + +/** + * Hardware descriptor used by DSA hardware, for both bursts and + * for individual operations. 
+ */ +struct idxd_hw_desc { + uint32_t pasid; + uint32_t op_flags; + rte_iova_t completion; + + RTE_STD_C11 + union { + rte_iova_t src; /* source address for copy ops etc. */ + rte_iova_t desc_addr; /* descriptor pointer for batch */ + }; + rte_iova_t dst; + + uint32_t size; /* length of data for op, or batch size */ + + uint16_t intr_handle; /* completion interrupt handle */ + + /* remaining 26 bytes are reserved */ + uint16_t __reserved[13]; +} __rte_aligned(64); + +#define IDXD_COMP_STATUS_INCOMPLETE 0 +#define IDXD_COMP_STATUS_SUCCESS 1 +#define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 +#define IDXD_COMP_STATUS_INVALID_SIZE 0x13 +#define IDXD_COMP_STATUS_SKIPPED 0xFF /* not official IDXD error, needed as placeholder */ + +/** + * Completion record structure written back by DSA + */ +struct idxd_completion { + uint8_t status; + uint8_t result; + /* 16-bits pad here */ + uint32_t completed_size; /* data length, or descriptors for batch */ + + rte_iova_t fault_address; + uint32_t invalid_flags; +} __rte_aligned(32); + /*** Definitions for Intel(R) Data Streaming Accelerator ***/ #define IDXD_CMD_SHIFT 20 diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index d92d7b3e6f..e558258ec4 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -39,6 +39,8 @@ struct idxd_pci_common { }; struct idxd_dmadev { + struct idxd_hw_desc *desc_ring; + /* counters to track the batches */ unsigned short max_batches; unsigned short batch_idx_read; @@ -79,5 +81,6 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dmadev_ops *ops); +int idxd_dump(const struct rte_dmadev *dev, FILE *f); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 318931713c..96d58c8544 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -60,7 +60,7 @@ idxd_is_wq_enabled(struct 
idxd_dmadev *idxd) } static const struct rte_dmadev_ops idxd_pci_ops = { - + .dev_dump = idxd_dump, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
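The three RTE_BUILD_BUG_ON() checks added in this patch can be reproduced outside DPDK as a quick sanity check of the layouts. The following is a sketch under stated assumptions: uint64_t stands in for rte_iova_t, the GCC/Clang aligned attribute stands in for __rte_aligned, C11 _Static_assert stands in for RTE_BUILD_BUG_ON, and desc_op_flags() is a hypothetical helper showing one plausible way the opcode and flags share the op_flags word (inferred from IDXD_CMD_OP_SHIFT, not taken from this patch).

```c
#include <stddef.h>
#include <stdint.h>

#define IDXD_CMD_OP_SHIFT 24
#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2)
#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3)
#define IDXD_FLAG_CACHE_CONTROL (1 << 8)

/* Opcode values as listed in enum rte_idxd_ops. */
enum { op_nop = 0, op_batch, op_drain, op_memmove, op_fill };

typedef uint64_t iova_t; /* assumption: stand-in for rte_iova_t */

struct hw_desc {
	uint32_t pasid;
	uint32_t op_flags;
	iova_t completion;
	union {
		iova_t src;       /* source address for copy ops */
		iova_t desc_addr; /* descriptor pointer for batch */
	};
	iova_t dst;
	uint32_t size;        /* must land at byte offset 32 */
	uint16_t intr_handle;
	uint16_t reserved[13]; /* remaining 26 bytes */
} __attribute__((aligned(64)));

struct completion {
	uint8_t status;
	uint8_t result;
	/* 16 bits of implicit padding here */
	uint32_t completed_size;
	iova_t fault_address;
	uint32_t invalid_flags;
} __attribute__((aligned(32)));

/* Compile-time equivalents of the RTE_BUILD_BUG_ON() checks in the patch. */
_Static_assert(sizeof(struct hw_desc) == 64, "descriptor must be 64 bytes");
_Static_assert(offsetof(struct hw_desc, size) == 32, "size must be at offset 32");
_Static_assert(sizeof(struct completion) == 32, "completion must be 32 bytes");

/* Hypothetical helper: opcode in the top byte, flags in the low bits. */
static uint32_t
desc_op_flags(uint32_t op, uint32_t flags)
{
	return (op << IDXD_CMD_OP_SHIFT) | flags;
}
```

The fields before size add up to exactly 32 bytes (4 + 4 + 8 + 8 + 8), which is why the offsetof() check holds without any explicit padding.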
* Re: [dpdk-dev] [PATCH v3 07/17] dma/idxd: add datapath structures 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 07/17] dma/idxd: add datapath structures Kevin Laatz @ 2021-09-09 11:23 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-09 11:23 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj > Add data structures required for the data path for IDXD devices. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> <snip> > +int > +idxd_dump(const struct rte_dmadev *dev, FILE *f) > +{ > + struct idxd_dmadev *idxd = dev->dev_private; > + unsigned int i; > + > + fprintf(f, "== Private Data ==\n"); Minor nit: could you call out IDXD somewhere here, just to make it clear which driver is being used? It may be helpful when debugging to quickly see whether the correct driver was used. > + fprintf(f, " Portal: %p\n", idxd->portal); > + fprintf(f, " Config: { ring_size: %u }\n", > + idxd->qcfg.nb_desc); > + fprintf(f, " Batch ring (sz = %u, max_batches = %u):\n\t", > + idxd->max_batches + 1, idxd->max_batches); > + for (i = 0; i <= idxd->max_batches; i++) { > + fprintf(f, " %u ", idxd->batch_idx_ring[i]); > + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) > + fprintf(f, "[rd ptr, wr ptr] "); > + else if (i == idxd->batch_idx_read) > + fprintf(f, "[rd ptr] "); > + else if (i == idxd->batch_idx_write) > + fprintf(f, "[wr ptr] "); > + if (i == idxd->max_batches) > + fprintf(f, "\n"); > + } > + > + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); > + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); > + return 0; > +} Reviewed-by: Conor Walsh <conor.walsh@intel.com> <snip> ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v3 08/17] dma/idxd: add configure and info_get functions 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (6 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 07/17] dma/idxd: add datapath structures Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-09 11:23 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 09/17] dma/idxd: add start and stop functions for pci devices Kevin Laatz ` (8 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add functions for device configuration. The info_get function is included here since it can be useful for checking successful configuration. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- v2: - fix reconfigure bug in idxd_vchan_setup() - add literal include comment for the docs to pick up v3: - fixes needed after changes from rebasing --- app/test/test_dmadev.c | 2 + doc/guides/dmadevs/idxd.rst | 33 +++++++++++++++ drivers/dma/idxd/idxd_bus.c | 3 ++ drivers/dma/idxd/idxd_common.c | 73 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 6 +++ drivers/dma/idxd/idxd_pci.c | 3 ++ 6 files changed, 120 insertions(+) diff --git a/app/test/test_dmadev.c b/app/test/test_dmadev.c index 98dddae6d6..a5276d4cce 100644 --- a/app/test/test_dmadev.c +++ b/app/test/test_dmadev.c @@ -739,6 +739,7 @@ test_dmadev_instance(uint16_t dev_id) { #define TEST_RINGSIZE 512 #define CHECK_ERRS true + /* Setup of the dmadev device. 
8< */ struct rte_dmadev_stats stats; struct rte_dmadev_info info; const struct rte_dmadev_conf conf = { .nb_vchans = 1}; @@ -759,6 +760,7 @@ test_dmadev_instance(uint16_t dev_id) if (rte_dmadev_vchan_setup(dev_id, vchan, &qconf) < 0) ERR_RETURN("Error with queue configuration\n"); + /* >8 End of setup of the dmadev device. */ rte_dmadev_info_get(dev_id, &info); if (info.nb_vchans != 1) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index ce33e2857a..66bc9fe744 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -120,3 +120,36 @@ use a subset of configured queues. Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, that is a "DMA device type" inside DPDK, and can be accessed using APIs from the ``rte_dmadev`` library. + +Using IDXD DMAdev Devices +-------------------------- + +To use the devices from an application, the dmadev API can be used. + +Getting Device Information +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Basic information about each dmadev device can be queried using the +``rte_dmadev_info_get()`` API. This will return basic device information such as +the ``rte_device`` structure, device capabilities and other device specific values. + +Device Configuration +~~~~~~~~~~~~~~~~~~~~~ + +Configuring an IDXD dmadev device is done using the ``rte_dmadev_configure()`` and +``rte_dmadev_vchan_setup`` APIs. The configurations are passed to these APIs using +the ``rte_dmadev_conf`` and ``rte_dmadev_vchan_conf`` structures, respectively. For +example, these can be used to configure the number of ``vchans`` per device, the +ring size, etc. The ring size must be a power of two, between 64 and 4096. + +The following code shows how the device is configured in +``test_dmadev.c``: + +.. literalinclude:: ../../../app/test/test_dmadev.c + :language: c + :start-after: Setup of the dmadev device. 8< + :end-before: >8 End of setup of the dmadev device. 
+ :dedent: 1 + +Once configured, the device can then be made ready for use by calling the +``rte_dmadev_start()`` API. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 20d17c20ca..7a6afabd27 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -96,6 +96,9 @@ idxd_dev_close(struct rte_dmadev *dev) static const struct rte_dmadev_ops idxd_vdev_ops = { .dev_close = idxd_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 9490439fdc..9949608293 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -39,6 +39,79 @@ idxd_dump(const struct rte_dmadev *dev, FILE *f) return 0; } +int +idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *info, uint32_t size) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + if (size < sizeof(*info)) + return -EINVAL; + + *info = (struct rte_dmadev_info) { + .device = dev->device, + .dev_capa = RTE_DMADEV_CAPA_MEM_TO_MEM | + RTE_DMADEV_CAPA_OPS_COPY | RTE_DMADEV_CAPA_OPS_FILL, + .max_vchans = 1, + .max_desc = 4096, + .min_desc = 64, + .nb_vchans = (idxd->desc_ring != NULL), /* returns 1 or 0 */ + }; + if (idxd->sva_support) + info->dev_capa |= RTE_DMADEV_CAPA_SVA; + return 0; +} + +int +idxd_configure(struct rte_dmadev *dev __rte_unused, const struct rte_dmadev_conf *dev_conf, + uint32_t conf_sz) +{ + if (sizeof(struct rte_dmadev_conf) != conf_sz) + return -EINVAL; + + if (dev_conf->nb_vchans != 1) + return -EINVAL; + return 0; +} + +int +idxd_vchan_setup(struct rte_dmadev *dev, uint16_t vchan __rte_unused, + const struct rte_dmadev_vchan_conf *qconf, uint32_t qconf_sz) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t max_desc = qconf->nb_desc; + + if (sizeof(struct rte_dmadev_vchan_conf) != qconf_sz) + return -EINVAL; + + idxd->qcfg = *qconf; + + if 
(!rte_is_power_of_2(max_desc)) + max_desc = rte_align32pow2(max_desc); + IDXD_PMD_DEBUG("DMA dev %u using %u descriptors", dev->data->dev_id, max_desc); + idxd->desc_ring_mask = max_desc - 1; + idxd->qcfg.nb_desc = max_desc; + + /* in case we are reconfiguring a device, free any existing memory */ + rte_free(idxd->desc_ring); + + /* allocate the descriptor ring at 2x size as batches can't wrap */ + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); + if (idxd->desc_ring == NULL) + return -ENOMEM; + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); + + idxd->batch_idx_read = 0; + idxd->batch_idx_write = 0; + idxd->batch_start = 0; + idxd->batch_size = 0; + idxd->ids_returned = 0; + idxd->ids_avail = 0; + + memset(idxd->batch_comp_ring, 0, sizeof(*idxd->batch_comp_ring) * + (idxd->max_batches + 1)); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index e558258ec4..dde0d54df4 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -82,5 +82,11 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dmadev_ops *ops); int idxd_dump(const struct rte_dmadev *dev, FILE *f); +int idxd_configure(struct rte_dmadev *dev, const struct rte_dmadev_conf *dev_conf, + uint32_t conf_sz); +int idxd_vchan_setup(struct rte_dmadev *dev, uint16_t vchan, + const struct rte_dmadev_vchan_conf *qconf, uint32_t qconf_sz); +int idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *dev_info, + uint32_t size); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 96d58c8544..569df8d04c 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -61,6 +61,9 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) static 
const struct rte_dmadev_ops idxd_pci_ops = { .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
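The ring sizing done in idxd_vchan_setup() (round the requested descriptor count up to a power of two, then keep count - 1 as an index mask) can be illustrated without DPDK. This is a rough sketch only: round_up_pow2() is a local stand-in for rte_align32pow2(), and struct ring_cfg, ring_cfg_from_request() and ring_next() are hypothetical names. How the driver actually advances its indices is not shown in this patch, so the wrap helper is an assumption about how such a mask is typically used.

```c
#include <stdint.h>

/* Round v up to the next power of two; assumes 1 <= v <= 2^31. */
static uint32_t
round_up_pow2(uint32_t v)
{
	v--;
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	return v + 1;
}

/* Hypothetical container for the two values the patch stores. */
struct ring_cfg {
	uint16_t nb_desc; /* rounded-up ring size, like qcfg.nb_desc */
	uint16_t mask;    /* nb_desc - 1, like desc_ring_mask */
};

static struct ring_cfg
ring_cfg_from_request(uint16_t requested)
{
	uint32_t n = round_up_pow2(requested);
	struct ring_cfg cfg = { (uint16_t)n, (uint16_t)(n - 1) };

	return cfg;
}

/* Advance an index with wrap-around; valid only for power-of-two sizes. */
static uint16_t
ring_next(uint16_t idx, uint16_t mask)
{
	return (uint16_t)((idx + 1) & mask);
}
```

A request for 100 descriptors therefore ends up as a 128-entry ring with mask 127, while a request that is already a power of two, such as TEST_RINGSIZE (512) in test_dmadev.c, is left unchanged.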
* Re: [dpdk-dev] [PATCH v3 08/17] dma/idxd: add configure and info_get functions 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 08/17] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-09-09 11:23 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-09 11:23 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj > Add functions for device configuration. The info_get function is included > here since it can be useful for checking successful configuration. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> <snip> > +The following code shows how the device is configured in > +``test_dmadev.c``: > + > +.. literalinclude:: ../../../app/test/test_dmadev.c > + :language: c > + :start-after: Setup of the dmadev device. 8< > + :end-before: >8 End of setup of the dmadev device. > + :dedent: 1 > + > +Once configured, the device can then be made ready for use by calling the > +``rte_dmadev_start()`` API. Last line should be in the next patch, but it doesn't really matter. Reviewed-by: Conor Walsh <conor.walsh@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v3 09/17] dma/idxd: add start and stop functions for pci devices 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (7 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 08/17] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-09 11:24 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 10/17] dma/idxd: add data-path job submission functions Kevin Laatz ` (7 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add device start/stop functions for DSA devices bound to vfio. For devices bound to the IDXD kernel driver, these are not required since the IDXD kernel driver takes care of this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_pci.c | 52 +++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 569df8d04c..3c0e3086f7 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -59,11 +59,63 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; } +static int +idxd_pci_dev_stop(struct rte_dmadev *dev) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint8_t err_code; + + if (!idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); + return -EALREADY; + } + + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); + if (err_code || idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", + idxd->qid, err_code); + return -err_code; + } + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); + + return 0; +} + +static int +idxd_pci_dev_start(struct rte_dmadev *dev) +{ 
+ struct idxd_dmadev *idxd = dev->dev_private; + uint8_t err_code; + + if (idxd_is_wq_enabled(idxd)) { + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); + return 0; + } + + if (idxd->desc_ring == NULL) { + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); + return -EINVAL; + } + + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); + if (err_code || !idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); + + return 0; +} + static const struct rte_dmadev_ops idxd_pci_ops = { .dev_dump = idxd_dump, .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .dev_start = idxd_pci_dev_start, + .dev_stop = idxd_pci_dev_stop, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v3 09/17] dma/idxd: add start and stop functions for pci devices 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 09/17] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-09-09 11:24 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-09 11:24 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj > Add device start/stop functions for DSA devices bound to vfio. For devices > bound to the IDXD kernel driver, these are not required since the IDXD > kernel driver takes care of this. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
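For illustration only: the start and stop paths in this patch each have to fold two signals into one return value, namely the 8-bit device error code and whether the work queue ended up in the expected state. A minimal standalone sketch of that folding, assuming the usual convention of 0 on success and a negative value on failure, might look like the following. The helper name wq_cmd_result() is hypothetical and does not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: combine a device command
 * error code with a "work queue reached the expected state" check.
 * Returns 0 on success and a negative value on any failure.
 */
static int
wq_cmd_result(uint8_t err_code, bool state_ok)
{
	int ret;

	if (err_code == 0 && state_ok)
		return 0;

	/* no error code reported but the state is wrong: use a generic -1 */
	ret = err_code == 0 ? -1 : -(int)err_code;
	assert(ret < 0); /* a failure must never be reported as 0 */
	return ret;
}
```

The point of the sketch is only that a failure with err_code == 0 still needs a non-zero return, and that the sign is the same on every failure path.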
* [dpdk-dev] [PATCH v3 10/17] dma/idxd: add data-path job submission functions 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (8 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 09/17] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-09 11:24 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 11/17] dma/idxd: add data-path job completion functions Kevin Laatz ` (6 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data path functions for enqueuing and submitting operations to DSA devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 137 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 ++ drivers/dma/idxd/meson.build | 1 + 4 files changed, 207 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 66bc9fe744..0c4c105e0f 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -153,3 +153,67 @@ The following code shows how the device is configured in Once configured, the device can then be made ready for use by calling the ``rte_dmadev_start()`` API. + +Performing Data Copies +~~~~~~~~~~~~~~~~~~~~~~~ + +To perform data copies using IDXD dmadev devices, descriptors should be enqueued +using the ``rte_dmadev_copy()`` API. The HW can be triggered to perform the copy +in two ways, either via a ``RTE_DMA_OP_FLAG_SUBMIT`` flag or by calling +``rte_dmadev_submit()``. Once copies have been completed, the completion will +be reported back when the application calls ``rte_dmadev_completed()`` or +``rte_dmadev_completed_status()``. 
The latter will also report the status of each +completed operation. + +The ``rte_dmadev_copy()`` function enqueues a single copy to the device ring for +copying at a later point. The parameters to that function include the IOVA addresses +of both the source and destination buffers, as well as the length of the copy. + +The ``rte_dmadev_copy()`` function enqueues a copy operation on the device ring. +If the ``RTE_DMA_OP_FLAG_SUBMIT`` flag is set when calling ``rte_dmadev_copy()``, +the device hardware will be informed of the elements. Alternatively, if the flag +is not set, the application needs to call the ``rte_dmadev_submit()`` function to +notify the device hardware. Once the device hardware is informed of the elements +enqueued on the ring, the device will begin to process them. It is expected +that, for efficiency reasons, a burst of operations will be enqueued to the +device via multiple enqueue calls between calls to the ``rte_dmadev_submit()`` +function. + +The following code demonstrates how to enqueue a burst of copies to the +device and start the hardware processing of them: + +.. 
code-block:: C + + struct rte_mbuf *srcs[COMP_BURST_SZ], *dsts[COMP_BURST_SZ]; + unsigned int i, j; + + for (i = 0; i < RTE_DIM(srcs); i++) { + uint64_t *src_data; + + srcs[i] = rte_pktmbuf_alloc(pool); + dsts[i] = rte_pktmbuf_alloc(pool); + if (srcs[i] == NULL || dsts[i] == NULL) { + PRINT_ERR("Error allocating buffers\n"); + return -1; + } + src_data = rte_pktmbuf_mtod(srcs[i], uint64_t *); + + for (j = 0; j < COPY_LEN/sizeof(uint64_t); j++) + src_data[j] = rte_rand(); + + if (rte_dmadev_copy(dev_id, vchan, srcs[i]->buf_iova + srcs[i]->data_off, + dsts[i]->buf_iova + dsts[i]->data_off, COPY_LEN, 0) < 0) { + PRINT_ERR("Error with rte_dmadev_copy for buffer %u\n", i); + return -1; + } + } + rte_dmadev_submit(dev_id, vchan); + +Filling an Area of Memory +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The IDXD driver also has support for the ``fill`` operation, where an area +of memory is overwritten, or filled, with a short pattern of data. +Fill operations can be performed in much the same way as copy operations +described above, just using the ``rte_dmadev_fill()`` function rather than the +``rte_dmadev_copy()`` function. 
diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 9949608293..69851defba 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,14 +2,147 @@ * Copyright 2021 Intel Corporation */ +#include <x86intrin.h> + #include <rte_dmadev_pmd.h> #include <rte_malloc.h> #include <rte_common.h> +#include <rte_prefetch.h> #include "idxd_internal.h" #define IDXD_PMD_NAME_STR "dmadev_idxd" +static __rte_always_inline rte_iova_t +__desc_idx_to_iova(struct idxd_dmadev *idxd, uint16_t n) +{ + return idxd->desc_iova + (n * sizeof(struct idxd_hw_desc)); +} + +static __rte_always_inline void +__idxd_movdir64b(volatile void *dst, const struct idxd_hw_desc *src) +{ + asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : + : "a" (dst), "d" (src) + : "memory"); +} + +static __rte_always_inline void +__submit(struct idxd_dmadev *idxd) +{ + rte_prefetch1(&idxd->batch_comp_ring[idxd->batch_idx_read]); + + if (idxd->batch_size == 0) + return; + + /* write completion to batch comp ring */ + rte_iova_t comp_addr = idxd->batch_iova + + (idxd->batch_idx_write * sizeof(struct idxd_completion)); + + if (idxd->batch_size == 1) { + /* submit batch directly */ + struct idxd_hw_desc desc = + idxd->desc_ring[idxd->batch_start & idxd->desc_ring_mask]; + desc.completion = comp_addr; + desc.op_flags |= IDXD_FLAG_REQUEST_COMPLETION; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &desc); + } else { + const struct idxd_hw_desc batch_desc = { + .op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_COMPLETION_ADDR_VALID | + IDXD_FLAG_REQUEST_COMPLETION, + .desc_addr = __desc_idx_to_iova(idxd, + idxd->batch_start & idxd->desc_ring_mask), + .completion = comp_addr, + .size = idxd->batch_size, + }; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &batch_desc); + } + + if (++idxd->batch_idx_write > idxd->max_batches) + idxd->batch_idx_write = 0; + + 
idxd->batch_start += idxd->batch_size; + idxd->batch_size = 0; + idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; + _mm256_store_si256((void *)&idxd->batch_comp_ring[idxd->batch_idx_write], + _mm256_setzero_si256()); +} + +static __rte_always_inline int +__idxd_write_desc(struct rte_dmadev *dev, + const uint32_t op_flags, + const rte_iova_t src, + const rte_iova_t dst, + const uint32_t size, + const uint32_t flags) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t mask = idxd->desc_ring_mask; + uint16_t job_id = idxd->batch_start + idxd->batch_size; + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; + + /* first check batch ring space then desc ring space */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + goto failed; + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) + goto failed; + + /* write desc. Note: descriptors don't wrap, but the completion address does */ + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, + _mm256_set_epi64x(0, 0, 0, size)); + + idxd->batch_size++; + + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) + __submit(idxd); + + return job_id; + +failed: + return -1; +} + +int +idxd_enqueue_copy(struct rte_dmadev *dev, uint16_t qid __rte_unused, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, + * but check it at compile time to be sure. 
+ */ + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, memmove, src, dst, length, flags); +} + +int +idxd_enqueue_fill(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + uint32_t fill = (idxd_op_fill << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, fill, pattern, dst, length, flags); +} + +int +idxd_submit(struct rte_dmadev *dev, uint16_t qid __rte_unused) +{ + __submit(dev->dev_private); + return 0; +} + int idxd_dump(const struct rte_dmadev *dev, FILE *f) { @@ -141,6 +274,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->dev_ops = ops; dmadev->device = dev; + dmadev->copy = idxd_enqueue_copy; + dmadev->fill = idxd_enqueue_fill; + dmadev->submit = idxd_submit; + idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); if (idxd == NULL) { IDXD_PMD_ERR("Unable to allocate memory for device"); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index dde0d54df4..7017d252b4 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -88,5 +88,10 @@ int idxd_vchan_setup(struct rte_dmadev *dev, uint16_t vchan, const struct rte_dmadev_vchan_conf *qconf, uint32_t qconf_sz); int idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *dev_info, uint32_t size); +int idxd_enqueue_copy(struct rte_dmadev *dev, uint16_t qid, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_enqueue_fill(struct rte_dmadev *dev, uint16_t qid, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_submit(struct rte_dmadev *dev, uint16_t qid); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build 
b/drivers/dma/idxd/meson.build index 81150e6f25..2de5130fd2 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -2,6 +2,7 @@ # Copyright(c) 2021 Intel Corporation deps += ['bus_pci'] +cflags += '-mavx2' # all platforms with idxd HW support AVX sources = files( 'idxd_bus.c', 'idxd_common.c', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v3 10/17] dma/idxd: add data-path job submission functions 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 10/17] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-09-09 11:24 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-09 11:24 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj > Add data path functions for enqueuing and submitting operations to DSA > devices. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> <snip> > +static __rte_always_inline int > +__idxd_write_desc(struct rte_dmadev *dev, > + const uint32_t op_flags, > + const rte_iova_t src, > + const rte_iova_t dst, > + const uint32_t size, > + const uint32_t flags) > +{ > + struct idxd_dmadev *idxd = dev->dev_private; > + uint16_t mask = idxd->desc_ring_mask; > + uint16_t job_id = idxd->batch_start + idxd->batch_size; > + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ > + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; > + > + /* first check batch ring space then desc ring space */ > + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || > + idxd->batch_idx_write + 1 == idxd->batch_idx_read) > + goto failed; > + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) > + goto failed; > + > + /* write desc. 
Note: descriptors don't wrap, but the completion address does */ > + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; > + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); > + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], > + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); > + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, > + _mm256_set_epi64x(0, 0, 0, size)); > + > + idxd->batch_size++; > + > + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); > + > + if (flags & RTE_DMA_OP_FLAG_SUBMIT) > + __submit(idxd); > + > + return job_id; > + > +failed: > + return -1; > +} If the failed goto just returns -1 it would probably be better to remove it and just return -1 in the 2 spots above. Reviewed-by: Conor Walsh <conor.walsh@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
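To make the suggestion above concrete, here is a minimal standalone sketch of the two space checks written with early returns instead of a shared failed: label. It assumes a trimmed-down, hypothetical struct ring_state that keeps only the fields the checks read; the names ring_state and next_job_id() are illustrative and not from the patch, and the descriptor write itself is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct idxd_dmadev: only the fields read by
 * the space checks in __idxd_write_desc() are kept.
 */
struct ring_state {
	uint16_t desc_ring_mask;  /* descriptor ring size - 1 (power of two) */
	uint16_t max_batches;     /* highest valid batch ring index */
	uint16_t batch_idx_read;
	uint16_t batch_idx_write;
	uint16_t batch_start;     /* job id of first descriptor in open batch */
	uint16_t batch_size;      /* descriptors already in the open batch */
	uint16_t ids_returned;    /* next job id to hand back to the caller */
};

/* Return the job id the next descriptor would get, or -1 if a ring is full. */
static int
next_job_id(const struct ring_state *s)
{
	uint16_t mask = s->desc_ring_mask;
	uint16_t job_id = s->batch_start + s->batch_size;
	/* only the start is masked; start + size is allowed to run past it */
	uint16_t write_idx = (s->batch_start & mask) + s->batch_size;

	assert((mask & (mask + 1)) == 0); /* mask must be 2^n - 1 */

	/* batch ring full: write index sits one slot behind the read index */
	if ((s->batch_idx_read == 0 && s->batch_idx_write == s->max_batches) ||
			s->batch_idx_write + 1 == s->batch_idx_read)
		return -1;

	/* descriptor ring full */
	if (((write_idx + 1) & mask) == (s->ids_returned & mask))
		return -1;

	return job_id;
}
```

With only two exit points that do no cleanup, the early returns keep each failure condition next to its consequence.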
* [dpdk-dev] [PATCH v3 11/17] dma/idxd: add data-path job completion functions 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (9 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 10/17] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-09 11:24 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 12/17] dma/idxd: add operation statistic tracking Kevin Laatz ` (5 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the data path functions for gathering completed operations. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- v2: - fixed typo in docs - add completion status for invalid opcode --- doc/guides/dmadevs/idxd.rst | 25 ++++ drivers/dma/idxd/idxd_common.c | 237 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 + 3 files changed, 267 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 0c4c105e0f..b0b5632b48 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -209,6 +209,31 @@ device and start the hardware processing of them: } rte_dmadev_submit(dev_id, vchan); +To retrieve information about completed copies, ``rte_dmadev_completed()`` and +``rte_dmadev_completed_status()`` APIs should be used. ``rte_dmadev_completed()`` +will return the number of completed operations, along with the index of the last +successful completed operation and whether or not an error was encountered. 
If an +error was encountered, ``rte_dmadev_completed_status()`` must be used to kick the +device off to continue processing operations and also to gather the status of each +individual operation, which is written to the ``status`` array provided as a +parameter by the application. + +The following code shows how to retrieve the number of successfully completed +copies within a burst and then use ``rte_dmadev_completed_status()`` to check +which operation failed and kick off the device to continue processing operations: + +.. code-block:: C + + enum rte_dma_status_code status[COMP_BURST_SZ]; + uint16_t count, idx, status_count; + bool error = false; + + count = rte_dmadev_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); + + if (error) { + status_count = rte_dmadev_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status); + } + Filling an Area of Memory ~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 69851defba..8eb73fdcc6 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -143,6 +143,241 @@ idxd_submit(struct rte_dmadev *dev, uint16_t qid __rte_unused) return 0; } +static enum rte_dma_status_code +get_comp_status(struct idxd_completion *c) +{ + uint8_t st = c->status; + switch (st) { + /* successful descriptors are not written back normally */ + case IDXD_COMP_STATUS_INCOMPLETE: + case IDXD_COMP_STATUS_SUCCESS: + return RTE_DMA_STATUS_SUCCESSFUL; + case IDXD_COMP_STATUS_INVALID_OPCODE: + return RTE_DMA_STATUS_INVALID_OPCODE; + case IDXD_COMP_STATUS_INVALID_SIZE: + return RTE_DMA_STATUS_INVALID_LENGTH; + case IDXD_COMP_STATUS_SKIPPED: + return RTE_DMA_STATUS_NOT_ATTEMPTED; + default: + return RTE_DMA_STATUS_ERROR_UNKNOWN; + } +} + +static __rte_always_inline int +batch_ok(struct idxd_dmadev *idxd, uint8_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t ret; + uint8_t bstatus; + + if (max_ops == 0) + return 0; + + /* first check if there are any unreturned 
handles from last time */ + if (idxd->ids_avail != idxd->ids_returned) { + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + + if (idxd->batch_idx_read == idxd->batch_idx_write) + return 0; + + bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status; + /* now check if next batch is complete and successful */ + if (bstatus == IDXD_COMP_STATUS_SUCCESS) { + /* since the batch idx ring stores the start of each batch, pre-increment to lookup + * start of next batch. + */ + if (++idxd->batch_idx_read > idxd->max_batches) + idxd->batch_idx_read = 0; + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; + + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + /* check if batch is incomplete */ + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) + return 0; + + return -1; /* error case */ +} + +static inline uint16_t +batch_completed(struct idxd_dmadev *idxd, uint8_t max_ops, bool *has_error) +{ + uint16_t i; + uint16_t b_start, b_end, next_batch; + + int ret = batch_ok(idxd, max_ops, NULL); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. + * once we identify the actual failure job, return other jobs, then update + * the batch ring indexes to make it look like the first job of the batch has failed. + * Subsequent calls here will always return zero packets, and the error must be cleared by + * calling the completed_status() function. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + b_end = idxd->batch_idx_ring[next_batch]; + + if (b_end - b_start == 1) { /* not a batch */ + *has_error = true; + return 0; + } + + for (i = b_start; i < b_end; i++) { + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ + break; + } + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); + if (ret < max_ops) + *has_error = true; /* we got up to the point of error */ + idxd->ids_avail = idxd->ids_returned += ret; + + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { + /* copy over the descriptor status to the batch ring as if no batch */ + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; + } + + return ret; +} + +static uint16_t +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t next_batch; + + int ret = batch_ok(idxd, max_ops, status); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; + const uint16_t b_len = b_end - b_start; + if (b_len == 1) {/* not a batch */ + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + idxd->ids_avail++; + idxd->ids_returned++; + idxd->batch_idx_read = next_batch; + return 1; + } + + /* not a single-element batch, need to process more. + * Scenarios: + * 1. max_ops >= batch_size - can fit everything, simple case + * - loop through completed ops and then add on any not-attempted ones + * 2. max_ops < batch_size - can't fit everything, more complex case + * - loop through completed/incomplete and stop when hit max_ops + * - adjust the batch descriptor to update where we stopped, with appropriate bcount + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as + * non-batch next time. + */ + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; + for (ret = 0; ret < b_len && ret < max_ops; ret++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + } + idxd->ids_avail = idxd->ids_returned += ret; + + /* everything fit */ + if (ret == b_len) { + idxd->batch_idx_read = next_batch; + return ret; + } + + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (ret > bcount) { + /* we have only incomplete ones - set batch completed size to 0 */ + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size = 0; + /* if there is only one descriptor left, job skipped so set flag appropriately */ + if (b_len - ret == 1) + comp->status = IDXD_COMP_STATUS_SKIPPED; + } else { + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size -= ret; + /* if there is only one descriptor left, copy status info straight to desc */ + if (comp->completed_size == 1) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + comp->status = c->status; + /* individual descs can be ok without writeback, but not batches */ + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) + comp->status = IDXD_COMP_STATUS_SUCCESS; + } else if (bcount == b_len) { + /* check if we still have an error, and clear flag if not */ + uint16_t i; + for (i = b_start + ret; i < b_end; i++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) + break; + } + if (i == b_end) /* no errors */ + comp->status = IDXD_COMP_STATUS_SUCCESS; + } + } + + return ret; +} + +uint16_t +idxd_completed(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, bool *has_error) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed(idxd, max_ops - ret, has_error); + ret += batch; + } while (batch > 0 && *has_error == false); + + *last_idx = 
idxd->ids_returned - 1; + return ret; +} + +uint16_t +idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, enum rte_dma_status_code *status) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + uint16_t batch, ret = 0; + + do { + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); + ret += batch; + } while (batch > 0); + + *last_idx = idxd->ids_returned - 1; + return ret; +} + int idxd_dump(const struct rte_dmadev *dev, FILE *f) { @@ -277,6 +512,8 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->copy = idxd_enqueue_copy; dmadev->fill = idxd_enqueue_fill; dmadev->submit = idxd_submit; + dmadev->completed = idxd_completed; + dmadev->completed_status = idxd_completed_status; idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); if (idxd == NULL) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 7017d252b4..84d45a09d6 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -93,5 +93,10 @@ int idxd_enqueue_copy(struct rte_dmadev *dev, uint16_t qid, rte_iova_t src, int idxd_enqueue_fill(struct rte_dmadev *dev, uint16_t qid, uint64_t pattern, rte_iova_t dst, unsigned int length, uint64_t flags); int idxd_submit(struct rte_dmadev *dev, uint16_t qid); +uint16_t idxd_completed(struct rte_dmadev *dev, uint16_t qid, uint16_t max_ops, + uint16_t *last_idx, bool *has_error); +uint16_t idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, + uint16_t max_ops, uint16_t *last_idx, + enum rte_dma_status_code *status); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v3 11/17] dma/idxd: add data-path job completion functions 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 11/17] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-09-09 11:24 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-09 11:24 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj > Add the data path functions for gathering completed operations. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
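One detail in the completion path that may be worth spelling out is the (uint16_t)(idxd->ids_avail - idxd->ids_returned) expression in batch_ok(). Both counters are free-running 16-bit values, so the unsigned subtraction still yields the number of outstanding ids after either counter has wrapped. The following is a small standalone sketch of that idea, assuming fewer than 65536 ids are ever outstanding. The helper name ids_to_return() is hypothetical, and it takes a uint16_t max_ops where batch_ok() in the patch takes a uint8_t.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: number of job ids that can
 * be handed back now, given two free-running 16-bit counters and a cap.
 */
static uint16_t
ids_to_return(uint16_t ids_avail, uint16_t ids_returned, uint16_t max_ops)
{
	/* the subtraction is done modulo 2^16, so wrap-around is harmless */
	uint16_t pending = (uint16_t)(ids_avail - ids_returned);
	uint16_t n = pending < max_ops ? pending : max_ops;

	assert(n <= max_ops); /* never return more than the caller asked for */
	return n;
}
```

For example, with ids_avail at 3 and ids_returned at 65533 the counters have wrapped, yet the difference is still 6 pending ids.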
* [dpdk-dev] [PATCH v3 12/17] dma/idxd: add operation statistic tracking 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (10 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 11/17] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-09 11:25 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 13/17] dma/idxd: add vchan status function Kevin Laatz ` (4 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add statistic tracking for DSA devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- doc/guides/dmadevs/idxd.rst | 11 +++++++++++ drivers/dma/idxd/idxd_bus.c | 2 ++ drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 +++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 45 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index b0b5632b48..634ef58985 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -242,3 +242,14 @@ of memory is overwritten, or filled, with a short pattern of data. Fill operations can be performed in much the same way as copy operations described above, just using the ``rte_dmadev_fill()`` function rather than the ``rte_dmadev_copy()`` function. + +Querying Device Statistics +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The statistics from the IDXD dmadev device can be retrieved via the stats functions in +the ``rte_dmadev`` library, i.e. ``rte_dmadev_stats_get()``. 
The statistics +returned for each device instance are: + +* ``submitted`` +* ``completed`` +* ``errors`` diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 7a6afabd27..8781195d59 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -99,6 +99,8 @@ static const struct rte_dmadev_ops idxd_vdev_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 8eb73fdcc6..66d1b3432e 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -65,6 +65,8 @@ __submit(struct idxd_dmadev *idxd) if (++idxd->batch_idx_write > idxd->max_batches) idxd->batch_idx_write = 0; + idxd->stats.submitted += idxd->batch_size; + idxd->batch_start += idxd->batch_size; idxd->batch_size = 0; idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; @@ -278,6 +280,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ const uint16_t b_len = b_end - b_start; if (b_len == 1) {/* not a batch */ *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + if (*status != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; idxd->ids_avail++; idxd->ids_returned++; idxd->batch_idx_read = next_batch; @@ -299,6 +303,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ struct idxd_completion *c = (void *) &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + if (status[ret] != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; } idxd->ids_avail = idxd->ids_returned += ret; @@ -357,6 +363,7 @@ idxd_completed(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_o ret += batch; } while (batch > 0 && *has_error == false); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -374,6 +381,7 @@ idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_ ret += batch; } while (batch > 0); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -407,6 +415,25 @@ idxd_dump(const struct rte_dmadev *dev, FILE *f) return 0; } +int +idxd_stats_get(const struct rte_dmadev *dev, uint16_t vchan __rte_unused, + struct rte_dmadev_stats *stats, uint32_t stats_sz) +{ + struct idxd_dmadev *idxd = dev->dev_private; + if (stats_sz < sizeof(*stats)) + return -EINVAL; + *stats = idxd->stats; + return 0; +} + +int +idxd_stats_reset(struct rte_dmadev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->dev_private; + idxd->stats = (struct rte_dmadev_stats){0}; + return 0; +} + int idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *info, uint32_t size) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 84d45a09d6..c04ee002d8 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -98,5 +98,8 @@ uint16_t idxd_completed(struct rte_dmadev *dev, uint16_t qid, uint16_t max_ops, uint16_t idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused, uint16_t max_ops, uint16_t *last_idx, enum rte_dma_status_code *status); +int idxd_stats_get(const struct rte_dmadev *dev, uint16_t vchan, + struct rte_dmadev_stats *stats, uint32_t stats_sz); +int idxd_stats_reset(struct rte_dmadev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c 
b/drivers/dma/idxd/idxd_pci.c index 3c0e3086f7..a84232b6e9 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -114,6 +114,8 @@ static const struct rte_dmadev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, }; -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
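The counter handling in the statistics patch above can be sketched as a small standalone model. This is only an illustration under stated assumptions: every `model_` name is hypothetical and stands in for the driver's `idxd->stats` fields and the dmadev status codes, and nothing here is the driver's actual code. It shows the three counters being updated on submission and completion, the size check performed before copying them out, and the reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes standing in for enum rte_dma_status_code. */
enum model_status {
	MODEL_STATUS_SUCCESSFUL = 0,
	MODEL_STATUS_ERROR,
	MODEL_STATUS_NOT_ATTEMPTED,
};

/* Hypothetical counters mirroring the three documented fields. */
struct model_stats {
	uint64_t submitted;
	uint64_t completed;
	uint64_t errors;
};

/* Account for a batch of n jobs handed to the hardware. */
static void
model_submit(struct model_stats *st, uint16_t n)
{
	assert(st != NULL);
	st->submitted += n;
}

/*
 * Account for n returned jobs. Each status value is inspected, so it is the
 * array element that is compared with the success code, not the pointer.
 */
static void
model_complete(struct model_stats *st, const enum model_status *status, uint16_t n)
{
	assert(st != NULL && (status != NULL || n == 0));
	for (uint16_t i = 0; i < n; i++)
		if (status[i] != MODEL_STATUS_SUCCESSFUL)
			st->errors++;
	st->completed += n;
}

/* Copy the counters out, rejecting a caller buffer that is too small. */
static int
model_stats_get(const struct model_stats *st, struct model_stats *out, uint32_t out_sz)
{
	assert(st != NULL && out != NULL);
	if (out_sz < sizeof(*out))
		return -1;
	*out = *st;
	return 0;
}

/* Zero all counters in one assignment. */
static void
model_stats_reset(struct model_stats *st)
{
	assert(st != NULL);
	*st = (struct model_stats){0};
}
```

In this model a job reported as not attempted counts as an error, which matches the `status[ret] != RTE_DMA_STATUS_SUCCESSFUL` test in the batch path of the patch.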
* Re: [dpdk-dev] [PATCH v3 12/17] dma/idxd: add operation statistic tracking 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 12/17] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-09-09 11:25 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-09 11:25 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj > Add statistic tracking for DSA devices. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v3 13/17] dma/idxd: add vchan status function 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (11 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 12/17] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-09 11:26 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 14/17] dma/idxd: add burst capacity API Kevin Laatz ` (3 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When testing dmadev drivers, it is useful to have the HW device in a known state. This patch adds the implementation of the function which will wait for the device to be idle (all jobs completed) before proceeding. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- v3: update API name to vchan_status --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 14 ++++++++++++++ drivers/dma/idxd/idxd_internal.h | 2 ++ drivers/dma/idxd/idxd_pci.c | 1 + 4 files changed, 18 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 8781195d59..8f0fcad87a 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -101,6 +101,7 @@ static const struct rte_dmadev_ops idxd_vdev_ops = { .dev_info_get = idxd_info_get, .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, + .vchan_status = idxd_vchan_status, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 66d1b3432e..e20b41ae54 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -165,6 +165,20 @@ get_comp_status(struct idxd_completion *c) } } +int +idxd_vchan_status(const struct rte_dmadev *dev, uint16_t vchan __rte_unused, + enum rte_dmadev_vchan_status *status) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t 
last_batch_write = idxd->batch_idx_write == 0 ? idxd->max_batches : + idxd->batch_idx_write - 1; + uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0); + + *status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE; + + return 0; +} + static __rte_always_inline int batch_ok(struct idxd_dmadev *idxd, uint8_t max_ops, enum rte_dma_status_code *status) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index c04ee002d8..fcc0235a1d 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -101,5 +101,7 @@ uint16_t idxd_completed_status(struct rte_dmadev *dev, uint16_t qid __rte_unused int idxd_stats_get(const struct rte_dmadev *dev, uint16_t vchan, struct rte_dmadev_stats *stats, uint32_t stats_sz); int idxd_stats_reset(struct rte_dmadev *dev, uint16_t vchan); +int idxd_vchan_status(const struct rte_dmadev *dev, uint16_t vchan, + enum rte_dmadev_vchan_status *status); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index a84232b6e9..f3a5d2a970 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -118,6 +118,7 @@ static const struct rte_dmadev_ops idxd_pci_ops = { .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, + .vchan_status = idxd_vchan_status, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
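The idle check in the vchan status patch above can be sketched in isolation as follows. This is a simplified model, not the driver code: the `model_` names and the fixed ring size are assumptions made for illustration. It relies on the behaviour visible in `__submit()`, where the batch write index runs from 0 to `max_batches` inclusive, so the slot before index 0 is `max_batches`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_BATCHES 3 /* hypothetical: highest valid batch ring index */

struct model_ring {
	uint16_t batch_idx_write;                   /* next batch slot to be written */
	uint8_t comp_status[MODEL_MAX_BATCHES + 1]; /* 0 means not yet completed */
};

/* Index of the most recently submitted batch, stepping back with wrap-around. */
static uint16_t
model_last_written(const struct model_ring *r)
{
	assert(r->batch_idx_write <= MODEL_MAX_BATCHES);
	return r->batch_idx_write == 0 ? MODEL_MAX_BATCHES : r->batch_idx_write - 1;
}

/* The channel is treated as idle once the last submitted batch has a status. */
static bool
model_is_idle(const struct model_ring *r)
{
	return r->comp_status[model_last_written(r)] != 0;
}
```

Only the last submitted batch is examined, on the assumption that the hardware completes batches in submission order.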
* Re: [dpdk-dev] [PATCH v3 13/17] dma/idxd: add vchan status function
  2021-09-08 10:30   ` [dpdk-dev] [PATCH v3 13/17] dma/idxd: add vchan status function Kevin Laatz
@ 2021-09-09 11:26     ` Conor Walsh
  0 siblings, 0 replies; 243+ messages in thread
From: Conor Walsh @ 2021-09-09 11:26 UTC (permalink / raw)
  To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj

> When testing dmadev drivers, it is useful to have the HW device in a known
> state. This patch adds the implementation of the function which will wait
> for the device to be idle (all jobs completed) before proceeding.
>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>

<snip>

> +int
> +idxd_vchan_status(const struct rte_dmadev *dev, uint16_t vchan __rte_unused,
> +		enum rte_dmadev_vchan_status *status)
> +{
> +	struct idxd_dmadev *idxd = dev->dev_private;
> +	uint16_t last_batch_write = idxd->batch_idx_write == 0 ? idxd->max_batches :
> +			idxd->batch_idx_write - 1;
> +	uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0);
> +
> +	*status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE;
> +
> +	return 0;
> +}

Should there be a comment noting that RTE_DMA_VCHAN_HALTED_ERROR does not apply to IDXD?

Reviewed-by: Conor Walsh <conor.walsh@intel.com>

^ permalink raw reply	[flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v3 14/17] dma/idxd: add burst capacity API 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (12 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 13/17] dma/idxd: add vchan status function Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-09 11:26 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 15/17] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz ` (2 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add support for the burst capacity API. This API will provide the calling application with the remaining capacity of the current burst (limited by max HW batch size). Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 20 ++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 1 + drivers/dma/idxd/idxd_pci.c | 2 ++ 4 files changed, 24 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 8f0fcad87a..e2bcca1c74 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -102,6 +102,7 @@ static const struct rte_dmadev_ops idxd_vdev_ops = { .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, .vchan_status = idxd_vchan_status, + .burst_capacity = idxd_burst_capacity, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index e20b41ae54..ced9f81772 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -470,6 +470,26 @@ idxd_info_get(const struct rte_dmadev *dev, struct rte_dmadev_info *info, uint32 return 0; } +uint16_t +idxd_burst_capacity(const struct rte_dmadev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t write_idx = idxd->batch_start + idxd->batch_size; + uint16_t 
used_space; + + /* Check for space in the batch ring */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return 0; + + /* For descriptors, check for wrap-around on write but not read */ + if (idxd->ids_returned > write_idx) + write_idx += idxd->desc_ring_mask + 1; + used_space = write_idx - idxd->ids_returned; + + return RTE_MIN((idxd->desc_ring_mask - used_space), idxd->max_batch_size); +} + int idxd_configure(struct rte_dmadev *dev __rte_unused, const struct rte_dmadev_conf *dev_conf, uint32_t conf_sz) diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index fcc0235a1d..692d27cf72 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -103,5 +103,6 @@ int idxd_stats_get(const struct rte_dmadev *dev, uint16_t vchan, int idxd_stats_reset(struct rte_dmadev *dev, uint16_t vchan); int idxd_vchan_status(const struct rte_dmadev *dev, uint16_t vchan, enum rte_dmadev_vchan_status *status); +uint16_t idxd_burst_capacity(const struct rte_dmadev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index f3a5d2a970..5da14eb9a2 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -119,6 +119,7 @@ static const struct rte_dmadev_ops idxd_pci_ops = { .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, .vchan_status = idxd_vchan_status, + .burst_capacity = idxd_burst_capacity, }; /* each portal uses 4 x 4k pages */ @@ -232,6 +233,7 @@ init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, idxd->u.pci = pci; idxd->max_batches = wq_size; + idxd->max_batch_size = 1 << lg2_max_batch; /* enable the device itself */ err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v3 14/17] dma/idxd: add burst capacity API 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 14/17] dma/idxd: add burst capacity API Kevin Laatz @ 2021-09-09 11:26 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-09 11:26 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj > Add support for the burst capacity API. This API will provide the calling > application with the remaining capacity of the current burst (limited by > max HW batch size). > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v3 15/17] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (13 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 14/17] dma/idxd: add burst capacity API Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 16/17] devbind: add dma device class Kevin Laatz 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 17/17] devbind: move idxd device ID to dmadev class Kevin Laatz 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Conor Walsh <conor.walsh@intel.com> Move the example script for configuring IDXD devices bound to the IDXD kernel driver from raw to dma, and create a symlink to still allow use from raw. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++++++++++++++++++++++++++ drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +----------------------------- 2 files changed, 118 insertions(+), 117 deletions(-) create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py diff --git a/drivers/dma/idxd/dpdk_idxd_cfg.py b/drivers/dma/idxd/dpdk_idxd_cfg.py new file mode 100755 index 0000000000..fcc27822ef --- /dev/null +++ b/drivers/dma/idxd/dpdk_idxd_cfg.py @@ -0,0 +1,117 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2020 Intel Corporation + +""" +Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use +""" + +import sys +import argparse +import os +import os.path + + +class SysfsDir: + "Used to read/write paths in a sysfs directory" + def __init__(self, path): + self.path = path + + def read_int(self, filename): + "Return a value from sysfs file" + with open(os.path.join(self.path, 
filename)) as f: + return int(f.readline()) + + def write_values(self, values): + "write dictionary, where key is filename and value is value to write" + for filename, contents in values.items(): + with open(os.path.join(self.path, filename), "w") as f: + f.write(str(contents)) + + +def reset_device(dsa_id): + "Reset the DSA device and all its queues" + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) + + +def get_pci_dir(pci): + "Search for the sysfs directory of the PCI device" + base_dir = '/sys/bus/pci/devices/' + for path, dirs, files in os.walk(base_dir): + for dir in dirs: + if pci in dir: + return os.path.join(base_dir, dir) + sys.exit(f"Could not find sysfs directory for device {pci}") + + +def get_dsa_id(pci): + "Get the DSA instance ID using the PCI address of the device" + pci_dir = get_pci_dir(pci) + for path, dirs, files in os.walk(pci_dir): + for dir in dirs: + if dir.startswith('dsa') and 'wq' not in dir: + return int(dir[3:]) + sys.exit(f"Could not get device ID for device {pci}") + + +def configure_dsa(dsa_id, queues, prefix): + "Configure the DSA instance with appropriate number of queues" + dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + + max_groups = dsa_dir.read_int("max_groups") + max_engines = dsa_dir.read_int("max_engines") + max_queues = dsa_dir.read_int("max_work_queues") + max_work_queues_size = dsa_dir.read_int("max_work_queues_size") + + nb_queues = min(queues, max_queues) + if queues > nb_queues: + print(f"Setting number of queues to max supported value: {max_queues}") + + # we want one engine per group, and no more engines than queues + nb_groups = min(max_engines, max_groups, nb_queues) + for grp in range(nb_groups): + dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) + + # configure each queue + for q in range(nb_queues): + wq_dir = SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) + 
wq_dir.write_values({"group_id": q % nb_groups, + "type": "user", + "mode": "dedicated", + "name": f"{prefix}_wq{dsa_id}.{q}", + "priority": 1, + "size": int(max_work_queues_size / nb_queues)}) + + # enable device and then queues + drv_dir.write_values({"bind": f"dsa{dsa_id}"}) + for q in range(nb_queues): + drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) + + +def main(args): + "Main function, does arg parsing and calls config function" + arg_p = argparse.ArgumentParser( + description="Configure whole DSA device instance for DPDK use") + arg_p.add_argument('dsa_id', + help="Specify DSA instance either via DSA instance number or PCI address") + arg_p.add_argument('-q', metavar='queues', type=int, default=255, + help="Number of queues to set up") + arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', + default="dpdk", + help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") + arg_p.add_argument('--reset', action='store_true', + help="Reset DSA device and its queues") + parsed_args = arg_p.parse_args(args[1:]) + + dsa_id = parsed_args.dsa_id + dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id + if parsed_args.reset: + reset_device(dsa_id) + else: + configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) + + +if __name__ == "__main__": + main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py deleted file mode 100755 index fcc27822ef..0000000000 --- a/drivers/raw/ioat/dpdk_idxd_cfg.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/env python3 -# SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2020 Intel Corporation - -""" -Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use -""" - -import sys -import argparse -import os -import os.path - - -class SysfsDir: - "Used to read/write paths in a sysfs directory" - def __init__(self, path): - self.path = path - - def read_int(self, filename): - "Return a value from sysfs file" - with open(os.path.join(self.path, 
filename)) as f: - return int(f.readline()) - - def write_values(self, values): - "write dictionary, where key is filename and value is value to write" - for filename, contents in values.items(): - with open(os.path.join(self.path, filename), "w") as f: - f.write(str(contents)) - - -def reset_device(dsa_id): - "Reset the DSA device and all its queues" - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) - - -def get_pci_dir(pci): - "Search for the sysfs directory of the PCI device" - base_dir = '/sys/bus/pci/devices/' - for path, dirs, files in os.walk(base_dir): - for dir in dirs: - if pci in dir: - return os.path.join(base_dir, dir) - sys.exit(f"Could not find sysfs directory for device {pci}") - - -def get_dsa_id(pci): - "Get the DSA instance ID using the PCI address of the device" - pci_dir = get_pci_dir(pci) - for path, dirs, files in os.walk(pci_dir): - for dir in dirs: - if dir.startswith('dsa') and 'wq' not in dir: - return int(dir[3:]) - sys.exit(f"Could not get device ID for device {pci}") - - -def configure_dsa(dsa_id, queues, prefix): - "Configure the DSA instance with appropriate number of queues" - dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - - max_groups = dsa_dir.read_int("max_groups") - max_engines = dsa_dir.read_int("max_engines") - max_queues = dsa_dir.read_int("max_work_queues") - max_work_queues_size = dsa_dir.read_int("max_work_queues_size") - - nb_queues = min(queues, max_queues) - if queues > nb_queues: - print(f"Setting number of queues to max supported value: {max_queues}") - - # we want one engine per group, and no more engines than queues - nb_groups = min(max_engines, max_groups, nb_queues) - for grp in range(nb_groups): - dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) - - # configure each queue - for q in range(nb_queues): - wq_dir = SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) - 
wq_dir.write_values({"group_id": q % nb_groups, - "type": "user", - "mode": "dedicated", - "name": f"{prefix}_wq{dsa_id}.{q}", - "priority": 1, - "size": int(max_work_queues_size / nb_queues)}) - - # enable device and then queues - drv_dir.write_values({"bind": f"dsa{dsa_id}"}) - for q in range(nb_queues): - drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) - - -def main(args): - "Main function, does arg parsing and calls config function" - arg_p = argparse.ArgumentParser( - description="Configure whole DSA device instance for DPDK use") - arg_p.add_argument('dsa_id', - help="Specify DSA instance either via DSA instance number or PCI address") - arg_p.add_argument('-q', metavar='queues', type=int, default=255, - help="Number of queues to set up") - arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', - default="dpdk", - help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") - arg_p.add_argument('--reset', action='store_true', - help="Reset DSA device and its queues") - parsed_args = arg_p.parse_args(args[1:]) - - dsa_id = parsed_args.dsa_id - dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id - if parsed_args.reset: - reset_device(dsa_id) - else: - configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) - - -if __name__ == "__main__": - main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py new file mode 120000 index 0000000000..85545548d1 --- /dev/null +++ b/drivers/raw/ioat/dpdk_idxd_cfg.py @@ -0,0 +1 @@ +../../dma/idxd/dpdk_idxd_cfg.py \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
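The queue layout that `configure_dsa()` in the script above derives from the sysfs limits can be restated as a short C model, to keep it next to the driver code it configures. The `model_` names are hypothetical and the sketch does no sysfs I/O; it only reproduces the arithmetic: the queue count is clamped to the device maximum, there is one engine per group and no more groups than queues, queues are assigned to groups round-robin, and the work-queue space is divided evenly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical copy of the limits the script reads from sysfs. */
struct model_limits {
	unsigned int max_groups;
	unsigned int max_engines;
	unsigned int max_queues;
	unsigned int max_wq_size; /* total work-queue entries on the device */
};

/* Resulting layout: how many queues and groups, and the size of each queue. */
struct model_layout {
	unsigned int nb_queues;
	unsigned int nb_groups;
	unsigned int wq_size;
};

static unsigned int
model_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

static struct model_layout
model_plan(const struct model_limits *hw, unsigned int requested)
{
	struct model_layout l;

	assert(hw != NULL && requested > 0 && hw->max_queues > 0);
	/* Clamp the request to what the device supports. */
	l.nb_queues = model_min(requested, hw->max_queues);
	/* One engine per group, and no more groups than queues. */
	l.nb_groups = model_min(model_min(hw->max_engines, hw->max_groups), l.nb_queues);
	/* Split the available entries evenly, rounding down. */
	l.wq_size = hw->max_wq_size / l.nb_queues;
	return l;
}

/* Queues are spread across groups round-robin. */
static unsigned int
model_group_for_queue(const struct model_layout *l, unsigned int q)
{
	assert(l != NULL && l->nb_groups > 0);
	return q % l->nb_groups;
}
```

For example, a device with 4 groups, 4 engines, 8 queues and 128 entries, asked for the script's default of 255 queues, ends up with 8 queues of 16 entries spread over 4 groups.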
* [dpdk-dev] [PATCH v3 16/17] devbind: add dma device class 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (14 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 15/17] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-09 11:26 ` Conor Walsh 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 17/17] devbind: move idxd device ID to dmadev class Kevin Laatz 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add a new class for DMA devices. Devices listed under the DMA class are to be used with the dmadev library. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- usertools/dpdk-devbind.py | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 74d16e4c4b..8bb573f4b0 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -69,12 +69,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] +dma_devices = [] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] -misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, - intel_ntb_skx, intel_ntb_icx, +misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, + intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. 
@@ -583,6 +584,9 @@ def show_status(): if status_dev in ["crypto", "all"]: show_device_status(crypto_devices, "Crypto") + if status_dev in ["dma", "all"]: + show_device_status(dma_devices, "DMA") + if status_dev in ["event", "all"]: show_device_status(eventdev_devices, "Eventdev") @@ -651,7 +655,7 @@ def parse_args(): parser.add_argument( '--status-dev', help="Print the status of given device group.", - choices=['baseband', 'compress', 'crypto', 'event', + choices=['baseband', 'compress', 'crypto', 'dma', 'event', 'mempool', 'misc', 'net', 'regex']) bind_group = parser.add_mutually_exclusive_group() bind_group.add_argument( @@ -732,6 +736,7 @@ def do_arg_actions(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) @@ -754,6 +759,7 @@ def main(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v3 16/17] devbind: add dma device class 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 16/17] devbind: add dma device class Kevin Laatz @ 2021-09-09 11:26 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-09 11:26 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj > Add a new class for DMA devices. Devices listed under the DMA class are to > be used with the dmadev library. > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v3 17/17] devbind: move idxd device ID to dmadev class 2021-09-08 10:29 ` [dpdk-dev] [PATCH v3 00/17] add dmadev driver for idxd devices Kevin Laatz ` (15 preceding siblings ...) 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 16/17] devbind: add dma device class Kevin Laatz @ 2021-09-08 10:30 ` Kevin Laatz 2021-09-09 11:27 ` Conor Walsh 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-08 10:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz The dmadev library is the preferred abstraction for using IDXD devices and will replace the rawdev implementation in future. This patch moves the IDXD device ID to the dmadev class. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- usertools/dpdk-devbind.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 8bb573f4b0..98b698ccc0 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -69,13 +69,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] -dma_devices = [] +dma_devices = [intel_idxd_spr] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, - intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, intel_ntb_icx, + intel_ioat_icx, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v3 17/17] devbind: move idxd device ID to dmadev class 2021-09-08 10:30 ` [dpdk-dev] [PATCH v3 17/17] devbind: move idxd device ID to dmadev class Kevin Laatz @ 2021-09-09 11:27 ` Conor Walsh 0 siblings, 0 replies; 243+ messages in thread From: Conor Walsh @ 2021-09-09 11:27 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, fengchengwen, jerinj > The dmadev library is the preferred abstraction for using IDXD devices and > will replace the rawdev implementation in future. This patch moves the IDXD > device ID to the dmadev class. > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (13 preceding siblings ...) 2021-09-03 10:49 ` [dpdk-dev] [PATCH v2 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 01/16] raw/ioat: only build if dmadev not present Kevin Laatz ` (15 more replies) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (6 subsequent siblings) 21 siblings, 16 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz This patchset adds a dmadev driver and associated documentation to support Intel Data Streaming Accelerator devices. This driver is intended to ultimately replace the current IDXD part of the IOAT rawdev driver. NOTE: This patchset has several dependencies: - v22 of the dmadev lib set [1] - v5 of the dmadev test suite [2] [1] http://patches.dpdk.org/project/dpdk/list/?series=18960 [2] http://patches.dpdk.org/project/dpdk/list/?series=19017 v4: * rebased on above patchsets * minor fixes based on review feedback v3: * rebased on above patchsets * added burst capacity API v2: * rebased on above patchsets * added API to check for device being idle * added devbind updates for DMA devices * fixed issue identified by internal coverity scan * other minor miscellaneous changes and fixes Bruce Richardson (1): raw/ioat: only build if dmadev not present Conor Walsh (1): dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz (14): dma/idxd: add skeleton for VFIO based DSA device dma/idxd: add bus device probing dma/idxd: create dmadev instances on bus probe dma/idxd: create dmadev instances on pci probe dma/idxd: add datapath structures dma/idxd: add configure and info_get functions dma/idxd: add start and stop functions for pci devices dma/idxd: 
add data-path job submission functions dma/idxd: add data-path job completion functions dma/idxd: add operation statistic tracking dma/idxd: add vchan status function dma/idxd: add burst capacity API devbind: add dma device class devbind: move idxd device ID to dmadev class MAINTAINERS | 10 + app/test/test_dmadev.c | 2 + doc/guides/dmadevs/idxd.rst | 262 +++++++++++ doc/guides/rawdevs/ioat.rst | 7 + doc/guides/rel_notes/release_21_11.rst | 5 + drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++ drivers/dma/idxd/idxd_bus.c | 378 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 616 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 131 ++++++ drivers/dma/idxd/idxd_internal.h | 108 +++++ drivers/dma/idxd/idxd_pci.c | 387 ++++++++++++++++ drivers/dma/idxd/meson.build | 14 + drivers/dma/idxd/version.map | 3 + drivers/dma/meson.build | 2 + drivers/meson.build | 2 +- drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +---- drivers/raw/ioat/meson.build | 23 +- usertools/dpdk-devbind.py | 12 +- 18 files changed, 2073 insertions(+), 124 deletions(-) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py create mode 100644 drivers/dma/idxd/idxd_bus.c create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_hw_defs.h create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v4 01/16] raw/ioat: only build if dmadev not present 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz ` (14 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Bruce Richardson <bruce.richardson@intel.com> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not present. A note is also added to the documentation to inform users of this change. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v4: - Fix build issue - Add note in raw documentation to outline this change --- doc/guides/rawdevs/ioat.rst | 7 +++++++ drivers/meson.build | 2 +- drivers/raw/ioat/meson.build | 23 ++++++++++++++++++++--- 3 files changed, 28 insertions(+), 4 deletions(-) diff --git a/doc/guides/rawdevs/ioat.rst b/doc/guides/rawdevs/ioat.rst index a28e909935..4fc327f1a4 100644 --- a/doc/guides/rawdevs/ioat.rst +++ b/doc/guides/rawdevs/ioat.rst @@ -34,6 +34,13 @@ Compilation For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based. No additional compilation steps are necessary. +.. note:: + Since the addition of the DMAdev library, the ``ioat`` and ``idxd`` parts of this driver + will only be built if their ``DMAdev`` counterparts are not built. 
The following can be used + to disable the ``DMAdev`` drivers, if the raw drivers are to be used instead:: + + $ meson -Ddisable_drivers=dma/* <build_dir> + Device Setup ------------- diff --git a/drivers/meson.build b/drivers/meson.build index b7d680868a..27ff10a9fc 100644 --- a/drivers/meson.build +++ b/drivers/meson.build @@ -10,6 +10,7 @@ subdirs = [ 'common/qat', # depends on bus. 'common/sfc_efx', # depends on bus. 'mempool', # depends on common and bus. + 'dma', # depends on common and bus. 'net', # depends on common, bus, mempool 'raw', # depends on common, bus and net. 'crypto', # depends on common, bus and mempool (net in future). @@ -18,7 +19,6 @@ subdirs = [ 'vdpa', # depends on common, bus and mempool. 'event', # depends on common, bus, mempool and net. 'baseband', # depends on common and bus. - 'dma', # depends on common and bus. ] if meson.is_cross_build() diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build index 0e81cb5951..9be9d8cc65 100644 --- a/drivers/raw/ioat/meson.build +++ b/drivers/raw/ioat/meson.build @@ -2,14 +2,31 @@ # Copyright 2019 Intel Corporation build = dpdk_conf.has('RTE_ARCH_X86') +# only use ioat rawdev driver if we don't have the equivalent dmadev ones +if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') + build = false + subdir_done() +endif + reason = 'only supported on x86' sources = files( - 'idxd_bus.c', - 'idxd_pci.c', 'ioat_common.c', - 'ioat_rawdev.c', 'ioat_rawdev_test.c', ) + +if not dpdk_conf.has('RTE_DMA_IDXD') + sources += files( + 'idxd_bus.c', + 'idxd_pci.c', + ) +endif + +if not dpdk_conf.has('RTE_DMA_IOAT') + sources += files ( + 'ioat_rawdev.c', + ) +endif + deps += ['bus_pci', 'mbuf', 'rawdev'] headers = files( 'rte_ioat_rawdev.h', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v4 02/16] dma/idxd: add skeleton for VFIO based DSA device 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 01/16] raw/ioat: only build if dmadev not present Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 03/16] dma/idxd: add bus device probing Kevin Laatz ` (13 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probe/remove skeleton code for DSA device bound to the vfio pci driver. Relevant documentation and MAINTAINERS update also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- MAINTAINERS | 10 +++++ doc/guides/dmadevs/idxd.rst | 58 ++++++++++++++++++++++++++ doc/guides/rel_notes/release_21_11.rst | 5 +++ drivers/dma/idxd/idxd_internal.h | 27 ++++++++++++ drivers/dma/idxd/idxd_pci.c | 55 ++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 11 +++++ drivers/dma/idxd/version.map | 3 ++ drivers/dma/meson.build | 2 + 8 files changed, 171 insertions(+) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 3258da194d..9cb59b831d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1200,6 +1200,16 @@ F: doc/guides/compressdevs/zlib.rst F: doc/guides/compressdevs/features/zlib.ini +DMAdev Drivers +-------------- + +Intel IDXD - EXPERIMENTAL +M: Bruce Richardson <bruce.richardson@intel.com> +M: Kevin Laatz <kevin.laatz@intel.com> +F: drivers/dma/idxd/ +F: doc/guides/dmadevs/idxd.rst + + RegEx Drivers 
 -------------
diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst
new file mode 100644
index 0000000000..924700d17e
--- /dev/null
+++ b/doc/guides/dmadevs/idxd.rst
@@ -0,0 +1,58 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2021 Intel Corporation.
+
+.. include:: <isonum.txt>
+
+IDXD DMA Device Driver
+======================
+
+The ``idxd`` dmadev driver provides a poll-mode driver (PMD) for Intel\ |reg|
+Data Streaming Accelerator `(Intel DSA)
+<https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html>`_.
+This PMD can be used in conjunction with Intel\ |reg| DSA devices to offload
+data operations, such as data copies, to hardware, freeing up CPU cycles for
+other tasks.
+
+Hardware Requirements
+----------------------
+
+The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the
+presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma``
+will show all the DMA devices on the system, including IDXD supported devices.
+Intel\ |reg| DSA devices currently (at time of writing) appear
+as devices with type “0b25”, due to the absence of pci-id database entries for
+them at this point.
+
+Compilation
+------------
+
+For builds using ``meson`` and ``ninja``, the driver will be built when the
+target platform is x86-based. No additional compilation steps are necessary.
+
+Device Setup
+-------------
+
+Devices using VFIO/UIO drivers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The HW devices to be used will need to be bound to a user-space IO driver for use.
+The ``dpdk-devbind.py`` script can be used to view the state of the devices
+and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``.
+For example::
+
+	$ dpdk-devbind.py -b vfio-pci 6a:01.0
+
+Device Probing and Initialization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will
+be found as part of the device scan done at application initialization time without
+the need to pass parameters to the application.
+
+For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the
+maximum number of workqueues available on it, partitioning all resources equally
+among the queues.
+If fewer workqueues are required, then the ``max_queues`` parameter may be passed to
+the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.::
+
+	$ dpdk-test -a <b:d:f>,max_queues=4
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index a71853b9c3..c0bfd9c1ba 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -92,6 +92,11 @@ New Features
   * Device allocation and it's multi-process support.
   * Control and data plane functions.
 
+* **Added IDXD dmadev driver implementation.**
+
+  The IDXD dmadev driver provides device drivers for the Intel DSA devices.
+  This device driver can be used through the generic dmadev API.
+
 
 Removed Items
 -------------
diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h
new file mode 100644
index 0000000000..c6a7dcd72f
--- /dev/null
+++ b/drivers/dma/idxd/idxd_internal.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2021 Intel Corporation
+ */
+
+#ifndef _IDXD_INTERNAL_H_
+#define _IDXD_INTERNAL_H_
+
+/**
+ * @file idxd_internal.h
+ *
+ * Internal data structures for the idxd/DSA driver for dmadev
+ *
+ * @warning
+ * @b EXPERIMENTAL: these structures and APIs may change without prior notice
+ */
+
+extern int idxd_pmd_logtype;
+
+#define IDXD_PMD_LOG(level, fmt, args...)
rte_log(RTE_LOG_ ## level, \ + idxd_pmd_logtype, "IDXD: %s(): " fmt "\n", __func__, ##args) + +#define IDXD_PMD_DEBUG(fmt, args...) IDXD_PMD_LOG(DEBUG, fmt, ## args) +#define IDXD_PMD_INFO(fmt, args...) IDXD_PMD_LOG(INFO, fmt, ## args) +#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) +#define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) + +#endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c new file mode 100644 index 0000000000..79e4aadcab --- /dev/null +++ b/drivers/dma/idxd/idxd_pci.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <rte_bus_pci.h> + +#include "idxd_internal.h" + +#define IDXD_VENDOR_ID 0x8086 +#define IDXD_DEVICE_ID_SPR 0x0B25 + +#define IDXD_PMD_DMADEV_NAME_PCI dmadev_idxd_pci + +const struct rte_pci_id pci_id_idxd_map[] = { + { RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static int +idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + int ret = 0; + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + dev->device.driver = &drv->driver; + + return ret; +} + +static int +idxd_dmadev_remove_pci(struct rte_pci_device *dev) +{ + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + + IDXD_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + + return 0; +} + +struct rte_pci_driver idxd_pmd_drv_pci = { + .id_table = pci_id_idxd_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = idxd_dmadev_probe_pci, + .remove = idxd_dmadev_remove_pci, +}; + +RTE_PMD_REGISTER_PCI(IDXD_PMD_DMADEV_NAME_PCI, idxd_pmd_drv_pci); +RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_DMADEV_NAME_PCI, pci_id_idxd_map); +RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_DMADEV_NAME_PCI, "vfio-pci"); 
+RTE_PMD_REGISTER_PARAM_STRING(dmadev_idxd_pci, "max_queues=0");
diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build
new file mode 100644
index 0000000000..9a64d75005
--- /dev/null
+++ b/drivers/dma/idxd/meson.build
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2021 Intel Corporation
+
+if is_windows
+    subdir_done()
+endif
+
+deps += ['bus_pci']
+sources = files(
+        'idxd_pci.c'
+)
\ No newline at end of file
diff --git a/drivers/dma/idxd/version.map b/drivers/dma/idxd/version.map
new file mode 100644
index 0000000000..4a76d1d52d
--- /dev/null
+++ b/drivers/dma/idxd/version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build
index d9c7ede32f..411be7a240 100644
--- a/drivers/dma/meson.build
+++ b/drivers/dma/meson.build
@@ -2,5 +2,7 @@
 # Copyright 2021 HiSilicon Limited
 
 drivers = [
+        'idxd',
         'skeleton',
 ]
+std_deps = ['dmadev']
-- 
2.30.2
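The `pci_id_idxd_map` table in patch 02 ends with an entry whose `vendor_id` is 0, marked as the sentinel. The sketch below shows how a table terminated that way can be walked. It is a simplified illustration only: `struct pci_id` and `pci_id_match()` are hypothetical stand-ins, the real `struct rte_pci_id` carries more fields, and the actual matching is done by the DPDK PCI bus code rather than by the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct rte_pci_id. */
struct pci_id {
	uint16_t vendor_id;
	uint16_t device_id;
};

#define IDXD_VENDOR_ID     0x8086
#define IDXD_DEVICE_ID_SPR 0x0B25

static const struct pci_id idxd_ids[] = {
	{ IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR },
	{ 0, 0 }, /* sentinel: vendor_id == 0 ends the table */
};

/* Return 1 if (vendor, device) is listed before the sentinel, else 0. */
static int
pci_id_match(const struct pci_id *table, uint16_t vendor, uint16_t device)
{
	const struct pci_id *id;

	assert(table != NULL);
	for (id = table; id->vendor_id != 0; id++) {
		if (id->vendor_id == vendor && id->device_id == device)
			return 1;
	}
	return 0;
}
```

Because the loop stops at the sentinel, no separate element count is needed, and further device IDs can be added by inserting entries above the sentinel.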
* [dpdk-dev] [PATCH v4 03/16] dma/idxd: add bus device probing 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 01/16] raw/ioat: only build if dmadev not present Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz ` (12 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probing for DSA devices bound to the IDXD kernel driver. These devices can be configured via sysfs and made available to DPDK if they are found during bus scan. Relevant documentation is included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v4: fix 'vdev' naming, changed to 'bus' --- doc/guides/dmadevs/idxd.rst | 64 +++++++ drivers/dma/idxd/idxd_bus.c | 351 +++++++++++++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 3 files changed, 416 insertions(+) create mode 100644 drivers/dma/idxd/idxd_bus.c diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 924700d17e..ce33e2857a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -32,6 +32,56 @@ target platform is x86-based. No additional compilation steps are necessary. Device Setup ------------- +Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers, +such as ``vfio-pci``. Both are supported by the IDXD PMD. 
+
+Intel\ |reg| DSA devices using IDXD kernel driver
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured.
+The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration.
+
+.. note::
+    The device configuration can also be done by directly interacting with the sysfs nodes.
+    An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py``
+    included in the driver source directory.
+
+There are some mandatory configuration steps before being able to use a device with an application.
+The internal engines, which do the copies or other operations,
+and the work-queues, which are used by applications to assign work to the device,
+need to be assigned to groups, and the various other configuration options,
+such as priority or queue depth, need to be set for each queue.
+
+To assign an engine to a group::
+
+    $ accel-config config-engine dsa0/engine0.0 --group-id=0
+    $ accel-config config-engine dsa0/engine0.1 --group-id=1
+
+To assign work queues to groups for passing descriptors to the engines, a similar accel-config command can be used.
+However, the work queues also need to be configured depending on the use case.
+Some configuration options include:
+
+* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously.
+* priority: WQ priority between 1 and 15. Larger value means higher priority.
+* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device.
+* type: WQ type (kernel/mdev/user). Determines how the device is presented.
+* name: identifier given to the WQ.
+ +Example configuration for a work queue:: + + $ accel-config config-wq dsa0/wq0.0 --group-id=0 \ + --mode=dedicated --priority=10 --wq-size=8 \ + --type=user --name=dpdk_app1 + +Once the devices have been configured, they need to be enabled:: + + $ accel-config enable-device dsa0 + $ accel-config enable-wq dsa0/wq0.0 + +Check the device configuration:: + + $ accel-config list + Devices using VFIO/UIO drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -56,3 +106,17 @@ If fewer workqueues are required, then the ``max_queues`` parameter may be passe the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: $ dpdk-test -a <b:d:f>,max_queues=4 + +For devices bound to the IDXD kernel driver, +the DPDK IDXD driver will automatically perform a scan for available workqueues +to use. Any workqueues found listed in ``/dev/dsa`` on the system will be checked +in ``/sys``, and any which have ``dpdk_`` prefix in their name will be automatically +probed by the driver to make them available to the application. +Alternatively, to support use by multiple DPDK processes simultaneously, +the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue +name prefix, instead of ``dpdk_``, allowing each DPDK application instance to only +use a subset of configured queues. + +Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, +that is a "DMA device type" inside DPDK, and can be accessed using APIs from the +``rte_dmadev`` library. 
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c new file mode 100644 index 0000000000..ef589af30e --- /dev/null +++ b/drivers/dma/idxd/idxd_bus.c @@ -0,0 +1,351 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <dirent.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/mman.h> +#include <libgen.h> + +#include <rte_bus.h> +#include <rte_log.h> +#include <rte_dmadev_pmd.h> +#include <rte_string_fns.h> + +#include "idxd_internal.h" + +/* default value for DSA paths, but allow override in environment for testing */ +#define DSA_DEV_PATH "/dev/dsa" +#define DSA_SYSFS_PATH "/sys/bus/dsa/devices" + +static unsigned int devcount; + +/** unique identifier for a DSA device/WQ instance */ +struct dsa_wq_addr { + uint16_t device_id; + uint16_t wq_id; +}; + +/** a DSA device instance */ +struct rte_dsa_device { + struct rte_device device; /**< Inherit core device */ + TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */ + + char wq_name[32]; /**< the workqueue name/number e.g. 
wq0.1 */ + struct dsa_wq_addr addr; /**< Identifies the specific WQ */ +}; + +/* forward prototypes */ +struct dsa_bus; +static int dsa_scan(void); +static int dsa_probe(void); +static struct rte_device *dsa_find_device(const struct rte_device *start, + rte_dev_cmp_t cmp, const void *data); +static enum rte_iova_mode dsa_get_iommu_class(void); +static int dsa_addr_parse(const char *name, void *addr); + +/** List of devices */ +TAILQ_HEAD(dsa_device_list, rte_dsa_device); + +/** + * Structure describing the DSA bus + */ +struct dsa_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_driver driver; /**< Driver struct for devices to point to */ + struct dsa_device_list device_list; /**< List of PCI devices */ +}; + +struct dsa_bus dsa_bus = { + .bus = { + .scan = dsa_scan, + .probe = dsa_probe, + .find_device = dsa_find_device, + .get_iommu_class = dsa_get_iommu_class, + .parse = dsa_addr_parse, + }, + .driver = { + .name = "dmadev_idxd" + }, + .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list), +}; + +static inline const char * +dsa_get_dev_path(void) +{ + const char *path = getenv("DSA_DEV_PATH"); + return path ? path : DSA_DEV_PATH; +} + +static inline const char * +dsa_get_sysfs_path(void) +{ + const char *path = getenv("DSA_SYSFS_PATH"); + return path ? 
path : DSA_SYSFS_PATH; +} + +static void * +idxd_bus_mmap_wq(struct rte_dsa_device *dev) +{ + void *addr; + char path[PATH_MAX]; + int fd; + + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); + fd = open(path, O_RDWR); + if (fd < 0) { + IDXD_PMD_ERR("Failed to open device path: %s", path); + return NULL; + } + + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + IDXD_PMD_ERR("Failed to mmap device %s", path); + return NULL; + } + + return addr; +} + +static int +read_wq_string(struct rte_dsa_device *dev, const char *filename, + char *value, size_t valuelen) +{ + char sysfs_node[PATH_MAX]; + int len; + int fd; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + fd = open(sysfs_node, O_RDONLY); + if (fd < 0) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + len = read(fd, value, valuelen - 1); + close(fd); + if (len < 0) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + value[len] = '\0'; + return 0; +} + +static int +read_wq_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +read_device_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s", + 
dsa_get_sysfs_path(), dev->addr.device_id, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +idxd_probe_dsa(struct rte_dsa_device *dev) +{ + struct idxd_dmadev idxd = {0}; + int ret = 0; + + IDXD_PMD_INFO("Probing device %s on numa node %d", + dev->wq_name, dev->device.numa_node); + if (read_wq_int(dev, "size", &ret) < 0) + return -1; + idxd.max_batches = ret; + if (read_wq_int(dev, "max_batch_size", &ret) < 0) + return -1; + idxd.max_batch_size = ret; + idxd.qid = dev->addr.wq_id; + idxd.sva_support = 1; + + idxd.portal = idxd_bus_mmap_wq(dev); + if (idxd.portal == NULL) { + IDXD_PMD_ERR("WQ mmap failed"); + return -ENOENT; + } + + return 0; +} + +static int +is_for_this_process_use(const char *name) +{ + char *runtime_dir = strdup(rte_eal_get_runtime_dir()); + char *prefix = basename(runtime_dir); + int prefixlen = strlen(prefix); + int retval = 0; + + if (strncmp(name, "dpdk_", 5) == 0) + retval = 1; + if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_') + retval = 1; + + free(runtime_dir); + return retval; +} + +static int +dsa_probe(void) +{ + struct rte_dsa_device *dev; + + TAILQ_FOREACH(dev, &dsa_bus.device_list, next) { + char type[64], name[64]; + + if (read_wq_string(dev, "type", type, sizeof(type)) < 0 || + read_wq_string(dev, "name", name, sizeof(name)) < 0) + continue; + + if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) { + dev->device.driver = &dsa_bus.driver; + idxd_probe_dsa(dev); + continue; + } + IDXD_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name); + } + + return 0; +} + +static int +dsa_scan(void) +{ + const char *path = dsa_get_dev_path(); + struct dirent *wq; + DIR *dev_dir; + + dev_dir 
= opendir(path); + if (dev_dir == NULL) { + if (errno == ENOENT) + return 0; /* no bus, return without error */ + IDXD_PMD_ERR("%s(): opendir '%s' failed: %s", + __func__, path, strerror(errno)); + return -1; + } + + while ((wq = readdir(dev_dir)) != NULL) { + struct rte_dsa_device *dev; + int numa_node = -1; + + if (strncmp(wq->d_name, "wq", 2) != 0) + continue; + if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) { + IDXD_PMD_ERR("%s(): wq name too long: '%s', skipping", + __func__, wq->d_name); + continue; + } + IDXD_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name); + + dev = malloc(sizeof(*dev)); + if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) { + IDXD_PMD_ERR("Error parsing WQ name: %s", wq->d_name); + free(dev); + continue; + } + dev->device.bus = &dsa_bus.bus; + strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name)); + TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next); + devcount++; + + read_device_int(dev, "numa_node", &numa_node); + dev->device.numa_node = numa_node; + dev->device.name = dev->wq_name; + } + + closedir(dev_dir); + return 0; +} + +static struct rte_device * +dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list); + + /* the rte_device struct must be at start of dsa structure */ + RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0); + + if (start != NULL) /* jump to start point if given */ + dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next); + while (dev != NULL) { + if (cmp(&dev->device, data) == 0) + return &dev->device; + dev = TAILQ_NEXT(dev, next); + } + return NULL; +} + +static enum rte_iova_mode +dsa_get_iommu_class(void) +{ + /* if there are no devices, report don't care, otherwise VA mode */ + return devcount > 0 ? 
RTE_IOVA_VA : RTE_IOVA_DC;
+}
+
+static int
+dsa_addr_parse(const char *name, void *addr)
+{
+	struct dsa_wq_addr *wq = addr;
+	unsigned int device_id, wq_id;
+
+	if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) {
+		IDXD_PMD_DEBUG("Parsing WQ name failed: %s", name);
+		return -1;
+	}
+
+	wq->device_id = device_id;
+	wq->wq_id = wq_id;
+	return 0;
+}
+
+RTE_REGISTER_BUS(dsa, dsa_bus.bus);
diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build
index 9a64d75005..c864fce3b3 100644
--- a/drivers/dma/idxd/meson.build
+++ b/drivers/dma/idxd/meson.build
@@ -7,5 +7,6 @@ endif
 
 deps += ['bus_pci']
 sources = files(
+        'idxd_bus.c',
         'idxd_pci.c'
 )
\ No newline at end of file
-- 
2.30.2
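Patch 03 relies on two naming rules: a work queue node is called `wq<device>.<queue>`, and a queue is claimed by a DPDK process when its configured name starts with `dpdk_` or with the `--file-prefix` value followed by an underscore. The standalone sketch below exercises both rules outside DPDK. The names `struct wq_addr`, `parse_wq_name()` and `wq_name_is_ours()` are hypothetical; they only mirror the logic of `dsa_addr_parse()` and `is_for_this_process_use()` as it appears in the diff, with the file prefix passed in as a plain argument instead of being derived from the EAL runtime directory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct dsa_wq_addr. */
struct wq_addr {
	unsigned int device_id;
	unsigned int wq_id;
};

/* Parse "wq<device>.<queue>"; returns 0 on success, -1 otherwise. */
static int
parse_wq_name(const char *name, struct wq_addr *addr)
{
	unsigned int device_id, wq_id;

	assert(name != NULL && addr != NULL);
	/* both numeric fields must be converted for the name to be valid */
	if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2)
		return -1;

	addr->device_id = device_id;
	addr->wq_id = wq_id;
	return 0;
}

/* Return 1 if the configured WQ name starts with "dpdk_" or "<file_prefix>_". */
static int
wq_name_is_ours(const char *wq_name, const char *file_prefix)
{
	size_t prefixlen;

	assert(wq_name != NULL && file_prefix != NULL);
	prefixlen = strlen(file_prefix);

	if (strncmp(wq_name, "dpdk_", 5) == 0)
		return 1;
	if (strncmp(wq_name, file_prefix, prefixlen) == 0 &&
			wq_name[prefixlen] == '_')
		return 1;
	return 0;
}
```

For example, a queue configured with `--name=dpdk_app1` is accepted whatever the file prefix is, while a queue named `myapp_q0` is accepted only by a process whose file prefix is `myapp`.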
* [dpdk-dev] [PATCH v4 04/16] dma/idxd: create dmadev instances on bus probe
  2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz
                     ` (2 preceding siblings ...)
  2021-09-17 14:02   ` [dpdk-dev] [PATCH v4 03/16] dma/idxd: add bus device probing Kevin Laatz
@ 2021-09-17 14:02   ` Kevin Laatz
  2021-09-17 14:02   ` [dpdk-dev] [PATCH v4 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz
                     ` (11 subsequent siblings)
  15 siblings, 0 replies; 243+ messages in thread
From: Kevin Laatz @ 2021-09-17 14:02 UTC
To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz

When a suitable device is found during the bus scan/probe, create a dmadev
instance for each HW queue. Internal structures required for device
creation are also added.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
---
v4:
- fix 'vdev' naming, changed to 'bus'
- rebase changes
---
 drivers/dma/idxd/idxd_bus.c      | 19 ++++++++
 drivers/dma/idxd/idxd_common.c   | 76 ++++++++++++++++++++++++++++++++
 drivers/dma/idxd/idxd_internal.h | 40 +++++++++++++++++
 drivers/dma/idxd/meson.build     |  1 +
 4 files changed, 136 insertions(+)
 create mode 100644 drivers/dma/idxd/idxd_common.c

diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c
index ef589af30e..b48fa954ed 100644
--- a/drivers/dma/idxd/idxd_bus.c
+++ b/drivers/dma/idxd/idxd_bus.c
@@ -85,6 +85,18 @@ dsa_get_sysfs_path(void)
 	return path ? path : DSA_SYSFS_PATH;
 }
 
+static int
+idxd_dev_close(struct rte_dma_dev *dev)
+{
+	struct idxd_dmadev *idxd = dev->data->dev_private;
+	munmap(idxd->portal, 0x1000);
+	return 0;
+}
+
+static const struct rte_dma_dev_ops idxd_bus_ops = {
+		.dev_close = idxd_dev_close,
+};
+
 static void *
 idxd_bus_mmap_wq(struct rte_dsa_device *dev)
 {
@@ -206,6 +218,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev)
 		return -1;
 	idxd.max_batch_size = ret;
 	idxd.qid = dev->addr.wq_id;
+	idxd.u.bus.dsa_id = dev->addr.device_id;
 	idxd.sva_support = 1;
 
 	idxd.portal = idxd_bus_mmap_wq(dev);
@@ -214,6 +227,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev)
 		return -ENOENT;
 	}
 
+	ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_bus_ops);
+	if (ret) {
+		IDXD_PMD_ERR("Failed to create dmadev %s", dev->wq_name);
+		return ret;
+	}
+
 	return 0;
 }
 
diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c
new file mode 100644
index 0000000000..8afad637fc
--- /dev/null
+++ b/drivers/dma/idxd/idxd_common.c
@@ -0,0 +1,76 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2021 Intel Corporation
+ */
+
+#include <rte_dmadev_pmd.h>
+#include <rte_malloc.h>
+#include <rte_common.h>
+
+#include "idxd_internal.h"
+
+#define IDXD_PMD_NAME_STR "dmadev_idxd"
+
+int
+idxd_dmadev_create(const char *name, struct rte_device *dev,
+		   const struct idxd_dmadev *base_idxd,
+		   const struct rte_dma_dev_ops *ops)
+{
+	struct idxd_dmadev *idxd;
+	struct rte_dma_dev *dmadev = NULL;
+	int ret = 0;
+
+	if (!name) {
+		IDXD_PMD_ERR("Invalid name of the device!");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	/* Allocate device structure */
+	dmadev = rte_dma_pmd_allocate(name, dev->numa_node,
+			sizeof(dmadev->dev_private));
+	if (dmadev == NULL) {
+		IDXD_PMD_ERR("Unable to allocate DMA device");
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	dmadev->dev_ops = ops;
+	dmadev->device = dev;
+
+	idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node);
+	if (idxd == NULL) {
+		IDXD_PMD_ERR("Unable to allocate memory for device");
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	dmadev->data->dev_private = idxd;
+	dmadev->dev_private = idxd;
+	*idxd = *base_idxd; /* copy over the main fields already passed in */
+	idxd->dmadev = dmadev;
+
+	/* allocate batch index ring and completion ring.
+	 * The +1 is because we can never fully use
+	 * the ring, otherwise read == write means both full and empty.
+	 */
+	idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) +
+			sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1),
+			sizeof(idxd->batch_comp_ring[0]));
+	if (idxd->batch_comp_ring == NULL) {
+		IDXD_PMD_ERR("Unable to reserve memory for batch data");
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1];
+	idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring);
+
+	return 0;
+
+cleanup:
+	if (dmadev)
+		rte_dma_pmd_release(name);
+
+	return ret;
+}
+
+int idxd_pmd_logtype;
+
+RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING);
diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h
index c6a7dcd72f..fa6f053f72 100644
--- a/drivers/dma/idxd/idxd_internal.h
+++ b/drivers/dma/idxd/idxd_internal.h
@@ -24,4 +24,44 @@ extern int idxd_pmd_logtype;
 #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args)
 #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args)
 
+struct idxd_dmadev {
+	/* counters to track the batches */
+	unsigned short max_batches;
+	unsigned short batch_idx_read;
+	unsigned short batch_idx_write;
+
+	/* track descriptors and handles */
+	unsigned short desc_ring_mask;
+	unsigned short ids_avail; /* handles for ops completed */
+	unsigned short ids_returned; /* the read pointer for hdls/desc rings */
+	unsigned short batch_start; /* start+size == write pointer for hdls/desc */
+	unsigned short batch_size;
+
+	void *portal; /* address to write the batch descriptor */
+
+	struct idxd_completion *batch_comp_ring;
+	unsigned short *batch_idx_ring; /* store where each batch ends */
+
+	struct rte_dma_stats stats;
+
+	rte_iova_t batch_iova; /* base address of the batch comp ring */
+	rte_iova_t desc_iova; /* base address of desc ring, needed for completions */
+
+	unsigned short max_batch_size;
+
+	struct rte_dma_dev *dmadev;
+	struct rte_dma_vchan_conf qcfg;
+	uint8_t sva_support;
+	uint8_t qid;
+
+	union {
+		struct {
+			unsigned int dsa_id;
+		} bus;
+	} u;
+};
+
+int idxd_dmadev_create(const char *name, struct rte_device *dev,
+		const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops);
+
 #endif /* _IDXD_INTERNAL_H_ */
diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build
index c864fce3b3..36dbd3e518 100644
--- a/drivers/dma/idxd/meson.build
+++ b/drivers/dma/idxd/meson.build
@@ -8,5 +8,6 @@ endif
 deps += ['bus_pci']
 sources = files(
         'idxd_bus.c',
+        'idxd_common.c',
         'idxd_pci.c'
 )
\ No newline at end of file
-- 
2.30.2
* [dpdk-dev] [PATCH v4 05/16] dma/idxd: create dmadev instances on pci probe 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (3 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 06/16] dma/idxd: add datapath structures Kevin Laatz ` (10 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the PCI probe, create a dmadev instance for each HW queue. HW definitions required are also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v4: rebase changes --- drivers/dma/idxd/idxd_hw_defs.h | 71 ++++++++ drivers/dma/idxd/idxd_internal.h | 16 ++ drivers/dma/idxd/idxd_pci.c | 278 ++++++++++++++++++++++++++++++- 3 files changed, 362 insertions(+), 3 deletions(-) create mode 100644 drivers/dma/idxd/idxd_hw_defs.h diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h new file mode 100644 index 0000000000..ea627cba6d --- /dev/null +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_HW_DEFS_H_ +#define _IDXD_HW_DEFS_H_ + +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ + +#define IDXD_CMD_SHIFT 20 +enum rte_idxd_cmds { + idxd_enable_dev = 1, + idxd_disable_dev, + idxd_drain_all, + idxd_abort_all, + idxd_reset_device, + idxd_enable_wq, + idxd_disable_wq, + idxd_drain_wq, + idxd_abort_wq, + idxd_reset_wq, +}; + +/* General bar0 registers */ +struct rte_idxd_bar0 { + uint32_t __rte_cache_aligned version; /* offset 0x00 */ + uint64_t 
__rte_aligned(0x10) gencap; /* offset 0x10 */ + uint64_t __rte_aligned(0x10) wqcap; /* offset 0x20 */ + uint64_t __rte_aligned(0x10) grpcap; /* offset 0x30 */ + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ +}; + +/* workqueue config is provided by array of uint32_t. */ +enum rte_idxd_wqcfg { + wq_size_idx, /* size is in first 32-bit value */ + wq_threshold_idx, /* WQ threshold second 32-bits */ + wq_mode_idx, /* WQ mode and other flags */ + wq_sizes_idx, /* WQ transfer and batch sizes */ + wq_occ_int_idx, /* WQ occupancy interrupt handle */ + wq_occ_limit_idx, /* WQ occupancy limit */ + wq_state_idx, /* WQ state and occupancy state */ +}; + +#define WQ_MODE_SHARED 0 +#define WQ_MODE_DEDICATED 1 +#define WQ_PRIORITY_SHIFT 4 +#define WQ_BATCH_SZ_SHIFT 5 +#define WQ_STATE_SHIFT 30 +#define WQ_STATE_MASK 0x3 + +struct rte_idxd_grpcfg { + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ + uint64_t grpengcfg; /* offset 32 */ + uint32_t grpflags; /* offset 40 */ +}; + +#define GENSTS_DEV_STATE_MASK 0x03 +#define CMDSTATUS_ACTIVE_SHIFT 31 +#define CMDSTATUS_ACTIVE_MASK (1 << 31) +#define CMDSTATUS_ERR_MASK 0xFF + +#endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index fa6f053f72..cb3a68c69b 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -5,6 +5,10 @@ #ifndef _IDXD_INTERNAL_H_ #define _IDXD_INTERNAL_H_ +#include <rte_spinlock.h> + +#include 
"idxd_hw_defs.h" + /** * @file idxd_internal.h * @@ -24,6 +28,16 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_pci_common { + rte_spinlock_t lk; + + uint8_t wq_cfg_sz; + volatile struct rte_idxd_bar0 *regs; + volatile uint32_t *wq_regs_base; + volatile struct rte_idxd_grpcfg *grp_regs; + volatile void *portals; +}; + struct idxd_dmadev { /* counters to track the batches */ unsigned short max_batches; @@ -58,6 +72,8 @@ struct idxd_dmadev { struct { unsigned int dsa_id; } bus; + + struct idxd_pci_common *pci; } u; }; diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 79e4aadcab..171e5ffc07 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -3,6 +3,9 @@ */ #include <rte_bus_pci.h> +#include <rte_devargs.h> +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> #include "idxd_internal.h" @@ -16,17 +19,286 @@ const struct rte_pci_id pci_id_idxd_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static inline int +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) +{ + uint8_t err_code; + uint16_t qid = idxd->qid; + int i = 0; + + if (command >= idxd_disable_wq && command <= idxd_reset_wq) + qid = (1 << qid); + rte_spinlock_lock(&idxd->u.pci->lk); + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; + + do { + rte_pause(); + err_code = idxd->u.pci->regs->cmdstatus; + if (++i >= 1000) { + IDXD_PMD_ERR("Timeout waiting for command response from HW"); + rte_spinlock_unlock(&idxd->u.pci->lk); + return err_code; + } + } while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK); + rte_spinlock_unlock(&idxd->u.pci->lk); + + return err_code & CMDSTATUS_ERR_MASK; +} + +static uint32_t * +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) +{ + return RTE_PTR_ADD(pci->wq_regs_base, + (uintptr_t)wq_idx << (5 + pci->wq_cfg_sz)); +} + +static int 
+idxd_is_wq_enabled(struct idxd_dmadev *idxd) +{ + uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx]; + return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; +} + +static const struct rte_dma_dev_ops idxd_pci_ops = { + +}; + +/* each portal uses 4 x 4k pages */ +#define IDXD_PORTAL_SIZE (4096 * 4) + +static int +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, + unsigned int max_queues) +{ + struct idxd_pci_common *pci; + uint8_t nb_groups, nb_engines, nb_wqs; + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ + uint16_t wq_size, total_wq_size; + uint8_t lg2_max_batch, lg2_max_copy_size; + unsigned int i, err_code; + + pci = malloc(sizeof(*pci)); + if (pci == NULL) { + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); + goto err; + } + rte_spinlock_init(&pci->lk); + + /* assign the bar registers, and then configure device */ + pci->regs = dev->mem_resource[0].addr; + grp_offset = (uint16_t)pci->regs->offsets[0]; + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); + pci->portals = dev->mem_resource[2].addr; + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; + + /* sanity check device status */ + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { + /* need function-level-reset (FLR) or is enabled */ + IDXD_PMD_ERR("Device status is not disabled, cannot init"); + goto err; + } + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { + /* command in progress */ + IDXD_PMD_ERR("Device has a command in progress, cannot init"); + goto err; + } + + /* read basic info about the hardware for use when configuring */ + nb_groups = (uint8_t)pci->regs->grpcap; + nb_engines = (uint8_t)pci->regs->engcap; + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); + total_wq_size = (uint16_t)pci->regs->wqcap; + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F; + lg2_max_batch = 
(uint8_t)(pci->regs->gencap >> 21) & 0x0F; + + IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u", + nb_groups, nb_engines, nb_wqs); + + /* zero out any old config */ + for (i = 0; i < nb_groups; i++) { + pci->grp_regs[i].grpengcfg = 0; + pci->grp_regs[i].grpwqcfg[0] = 0; + } + for (i = 0; i < nb_wqs; i++) + idxd_get_wq_cfg(pci, i)[0] = 0; + + /* limit queues if necessary */ + if (max_queues != 0 && nb_wqs > max_queues) { + nb_wqs = max_queues; + if (nb_engines > max_queues) + nb_engines = max_queues; + if (nb_groups > max_queues) + nb_groups = max_queues; + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); + } + + /* put each engine into a separate group to avoid reordering */ + if (nb_groups > nb_engines) + nb_groups = nb_engines; + if (nb_groups < nb_engines) + nb_engines = nb_groups; + + /* assign engines to groups, round-robin style */ + for (i = 0; i < nb_engines; i++) { + IDXD_PMD_DEBUG("Assigning engine %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); + } + + /* now do the same for queues and give work slots to each queue */ + wq_size = total_wq_size / nb_wqs; + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", + wq_size, lg2_max_batch, lg2_max_copy_size); + for (i = 0; i < nb_wqs; i++) { + /* add work queue "i" to a group */ + IDXD_PMD_DEBUG("Assigning work queue %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); + /* now configure it, in terms of size, max batch, mode */ + idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size; + idxd_get_wq_cfg(pci, i)[wq_mode_idx] = (1 << WQ_PRIORITY_SHIFT) | + WQ_MODE_DEDICATED; + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); + } + + /* dump the group configuration to output */ + for (i = 0; i < nb_groups; i++) { + IDXD_PMD_DEBUG("## Group %d", i); + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); + IDXD_PMD_DEBUG(" GRPENGCFG: 
%"PRIx64, pci->grp_regs[i].grpengcfg); + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags); + } + + idxd->u.pci = pci; + idxd->max_batches = wq_size; + + /* enable the device itself */ + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); + if (err_code) { + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device enabled OK"); + + return nb_wqs; + +err: + free(pci); + return -1; +} + static int idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) { - int ret = 0; + struct idxd_dmadev idxd = {0}; + uint8_t nb_wqs; + int qid, ret = 0; char name[PCI_PRI_STR_SIZE]; + unsigned int max_queues = 0; rte_pci_device_name(&dev->addr, name, sizeof(name)); IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); dev->device.driver = &drv->driver; - return ret; + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { + /* if the number of devargs grows beyond just 1, use rte_kvargs */ + if (sscanf(dev->device.devargs->args, + "max_queues=%u", &max_queues) != 1) { + IDXD_PMD_ERR("Invalid device parameter: '%s'", + dev->device.devargs->args); + return -1; + } + } + + ret = init_pci_device(dev, &idxd, max_queues); + if (ret < 0) { + IDXD_PMD_ERR("Error initializing PCI hardware"); + return ret; + } + if (idxd.u.pci->portals == NULL) { + IDXD_PMD_ERR("Error, invalid portal assigned during initialization\n"); + return -EINVAL; + } + nb_wqs = (uint8_t)ret; + + /* set up one device for each queue */ + for (qid = 0; qid < nb_wqs; qid++) { + char qname[32]; + + /* add the queue number to each device name */ + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); + idxd.qid = qid; + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, + qid * IDXD_PORTAL_SIZE); + if (idxd_is_wq_enabled(&idxd)) + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); + ret = idxd_dmadev_create(qname, &dev->device, + &idxd, &idxd_pci_ops); + if (ret != 0) { + IDXD_PMD_ERR("Failed to create dmadev 
%s", name); + if (qid == 0) /* if no devices using this, free pci */ + free(idxd.u.pci); + return ret; + } + } + + return 0; +} + +static int +idxd_dmadev_destroy(const char *name) +{ + int ret; + uint8_t err_code; + struct rte_dma_dev *dmadev; + struct idxd_dmadev *idxd; + int dev_id = rte_dma_get_dev_id(name); + + if (!name) { + IDXD_PMD_ERR("Invalid device name"); + return -EINVAL; + } + + if (dev_id < 0) { + IDXD_PMD_ERR("Invalid device ID"); + return -EINVAL; + } + + dmadev = &rte_dma_devices[dev_id]; + if (!dmadev) { + IDXD_PMD_ERR("Invalid device name (%s)", name); + return -EINVAL; + } + + idxd = dmadev->dev_private; + if (!idxd) { + IDXD_PMD_ERR("Error getting dev_private"); + return -EINVAL; + } + + /* disable the device */ + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); + if (err_code) { + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device disabled OK"); + + /* free device memory */ + IDXD_PMD_DEBUG("Freeing device driver memory"); + dmadev->dev_private = NULL; + rte_free(idxd->batch_idx_ring); + rte_free(idxd->desc_ring); + + /* rte_dma_close is called by pmd_release */ + ret = rte_dma_pmd_release(name); + if (ret) + IDXD_PMD_DEBUG("Device cleanup failed"); + + return 0; } static int @@ -39,7 +311,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev) IDXD_PMD_INFO("Closing %s on NUMA node %d", name, dev->device.numa_node); - return 0; + return idxd_dmadev_destroy(name); } struct rte_pci_driver idxd_pmd_drv_pci = { -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
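As a reading aid for the register handling in init_pci_device() in the patch above, the following standalone sketch decodes the same WQCAP and GENCAP bit fields and computes the per-queue config offset that idxd_get_wq_cfg() uses. It is an illustration only, not driver code: struct idxd_caps and both helper names are hypothetical, and the register values are passed in as plain integers rather than read from BAR0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields; not part of the driver. */
struct idxd_caps {
	uint16_t total_wq_size;     /* WQCAP bits 15:0  */
	uint8_t nb_wqs;             /* WQCAP bits 23:16 */
	uint8_t wq_cfg_sz;          /* WQCAP bits 27:24 */
	uint8_t lg2_max_copy_size;  /* GENCAP bits 20:16 */
	uint8_t lg2_max_batch;      /* GENCAP bits 24:21 */
};

/* Decode using the same shifts and masks as init_pci_device() in the patch. */
static struct idxd_caps
idxd_decode_caps(uint64_t wqcap, uint64_t gencap)
{
	struct idxd_caps c;

	c.total_wq_size = (uint16_t)wqcap;
	c.nb_wqs = (uint8_t)(wqcap >> 16);
	c.wq_cfg_sz = (uint8_t)((wqcap >> 24) & 0x0F);
	c.lg2_max_copy_size = (uint8_t)((gencap >> 16) & 0x1F);
	c.lg2_max_batch = (uint8_t)((gencap >> 21) & 0x0F);
	return c;
}

/* Byte offset of a work queue's config block from wq_regs_base:
 * each block is 32 bytes shifted left by the reported config size.
 */
static size_t
idxd_wq_cfg_offset(uint8_t wq_idx, uint8_t wq_cfg_sz)
{
	return (size_t)wq_idx << (5 + wq_cfg_sz);
}
```

With wq_cfg_sz equal to 0, consecutive work queue config blocks are 32 bytes apart; each increment of wq_cfg_sz doubles that stride.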
* [dpdk-dev] [PATCH v4 06/16] dma/idxd: add datapath structures 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (4 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 07/16] dma/idxd: add configure and info_get functions Kevin Laatz ` (9 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data structures required for the data path for IDXD devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v2: add completion status for invalid opcode --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 33 ++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 60 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 ++ drivers/dma/idxd/idxd_pci.c | 2 +- 5 files changed, 98 insertions(+), 1 deletion(-) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b48fa954ed..3c0837ec52 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -95,6 +95,7 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, + .dev_dump = idxd_dump, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 8afad637fc..45cde78e88 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -10,6 +10,35 @@ #define IDXD_PMD_NAME_STR "dmadev_idxd" +int +idxd_dump(const struct rte_dma_dev *dev, FILE *f) +{ + struct idxd_dmadev *idxd = dev->dev_private; + unsigned int i; + + fprintf(f, "== IDXD Private Data ==\n"); + fprintf(f, " 
Portal: %p\n", idxd->portal); + fprintf(f, " Config: { ring_size: %u }\n", + idxd->qcfg.nb_desc); + fprintf(f, " Batch ring (sz = %u, max_batches = %u):\n\t", + idxd->max_batches + 1, idxd->max_batches); + for (i = 0; i <= idxd->max_batches; i++) { + fprintf(f, " %u ", idxd->batch_idx_ring[i]); + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) + fprintf(f, "[rd ptr, wr ptr] "); + else if (i == idxd->batch_idx_read) + fprintf(f, "[rd ptr] "); + else if (i == idxd->batch_idx_write) + fprintf(f, "[wr ptr] "); + if (i == idxd->max_batches) + fprintf(f, "\n"); + } + + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, @@ -19,6 +48,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, struct rte_dma_dev *dmadev = NULL; int ret = 0; + RTE_BUILD_BUG_ON(sizeof(struct idxd_hw_desc) != 64); + RTE_BUILD_BUG_ON(offsetof(struct idxd_hw_desc, size) != 32); + RTE_BUILD_BUG_ON(sizeof(struct idxd_completion) != 32); + if (!name) { IDXD_PMD_ERR("Invalid name of the device!"); ret = -EINVAL; diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index ea627cba6d..55ca9f7f52 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -5,6 +5,66 @@ #ifndef _IDXD_HW_DEFS_H_ #define _IDXD_HW_DEFS_H_ +/* + * Defines used in the data path for interacting with IDXD hardware. 
+ */ +#define IDXD_CMD_OP_SHIFT 24 +enum rte_idxd_ops { + idxd_op_nop = 0, + idxd_op_batch, + idxd_op_drain, + idxd_op_memmove, + idxd_op_fill +}; + +#define IDXD_FLAG_FENCE (1 << 0) +#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2) +#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3) +#define IDXD_FLAG_CACHE_CONTROL (1 << 8) + +/** + * Hardware descriptor used by DSA hardware, for both bursts and + * for individual operations. + */ +struct idxd_hw_desc { + uint32_t pasid; + uint32_t op_flags; + rte_iova_t completion; + + RTE_STD_C11 + union { + rte_iova_t src; /* source address for copy ops etc. */ + rte_iova_t desc_addr; /* descriptor pointer for batch */ + }; + rte_iova_t dst; + + uint32_t size; /* length of data for op, or batch size */ + + uint16_t intr_handle; /* completion interrupt handle */ + + /* remaining 26 bytes are reserved */ + uint16_t __reserved[13]; +} __rte_aligned(64); + +#define IDXD_COMP_STATUS_INCOMPLETE 0 +#define IDXD_COMP_STATUS_SUCCESS 1 +#define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 +#define IDXD_COMP_STATUS_INVALID_SIZE 0x13 +#define IDXD_COMP_STATUS_SKIPPED 0xFF /* not official IDXD error, needed as placeholder */ + +/** + * Completion record structure written back by DSA + */ +struct idxd_completion { + uint8_t status; + uint8_t result; + /* 16-bits pad here */ + uint32_t completed_size; /* data length, or descriptors for batch */ + + rte_iova_t fault_address; + uint32_t invalid_flags; +} __rte_aligned(32); + /*** Definitions for Intel(R) Data Streaming Accelerator ***/ #define IDXD_CMD_SHIFT 20 diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index cb3a68c69b..99c8e04302 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -39,6 +39,8 @@ struct idxd_pci_common { }; struct idxd_dmadev { + struct idxd_hw_desc *desc_ring; + /* counters to track the batches */ unsigned short max_batches; unsigned short batch_idx_read; @@ -79,5 +81,6 @@ struct idxd_dmadev { int 
idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); +int idxd_dump(const struct rte_dma_dev *dev, FILE *f); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 171e5ffc07..33cf76adfb 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -60,7 +60,7 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) } static const struct rte_dma_dev_ops idxd_pci_ops = { - + .dev_dump = idxd_dump, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
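The RTE_BUILD_BUG_ON() checks added to idxd_dmadev_create() in the patch above pin the descriptor and completion record layouts at compile time. The sketch below reproduces that idea outside DPDK with C11 _Static_assert, as a way to see why the sizes come out at 64 and 32 bytes. The type names (iova_t, hw_desc, completion) and make_op_flags() are stand-ins chosen for this example, and iova_t is assumed to be 64 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t iova_t; /* stand-in for rte_iova_t, assumed 64-bit */

#define OP_SHIFT 24
#define FLAG_COMPLETION_ADDR_VALID (1u << 2)
#define FLAG_REQUEST_COMPLETION (1u << 3)

/* Mirror of the 64-byte hardware descriptor from the patch. */
struct hw_desc {
	_Alignas(64) uint32_t pasid;
	uint32_t op_flags;
	iova_t completion;
	union {
		iova_t src;        /* source address for copy ops */
		iova_t desc_addr;  /* descriptor pointer for a batch */
	};
	iova_t dst;
	uint32_t size;
	uint16_t intr_handle;
	uint16_t reserved[13];     /* remaining 26 bytes */
};

/* Mirror of the 32-byte completion record from the patch. */
struct completion {
	_Alignas(32) uint8_t status;
	uint8_t result;
	uint16_t pad;
	uint32_t completed_size;
	iova_t fault_address;
	uint32_t invalid_flags;
};

_Static_assert(sizeof(struct hw_desc) == 64, "descriptor must be 64 bytes");
_Static_assert(offsetof(struct hw_desc, size) == 32, "size must be at offset 32");
_Static_assert(sizeof(struct completion) == 32, "completion must be 32 bytes");

/* Combine an opcode and flag bits the way the op_flags field is built. */
static inline uint32_t
make_op_flags(uint32_t opcode, uint32_t flags)
{
	return (opcode << OP_SHIFT) | flags;
}
```

The offset-32 check matters because the data path in a later patch of this series writes each descriptor as two 32-byte stores, the second starting at the size field.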
* [dpdk-dev] [PATCH v4 07/16] dma/idxd: add configure and info_get functions 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (5 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 06/16] dma/idxd: add datapath structures Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz ` (8 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add functions for device configuration. The info_get function is included here since it can be useful for checking successful configuration. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v2: - fix reconfigure bug in idxd_vchan_setup() - add literal include comment for the docs to pick up v3: - fixes needed after changes from rebasing --- app/test/test_dmadev.c | 2 + doc/guides/dmadevs/idxd.rst | 30 +++++++++++++ drivers/dma/idxd/idxd_bus.c | 3 ++ drivers/dma/idxd/idxd_common.c | 72 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 6 +++ drivers/dma/idxd/idxd_pci.c | 3 ++ 6 files changed, 116 insertions(+) diff --git a/app/test/test_dmadev.c b/app/test/test_dmadev.c index 98fcab67f3..5bbe4250e0 100644 --- a/app/test/test_dmadev.c +++ b/app/test/test_dmadev.c @@ -739,6 +739,7 @@ test_dmadev_instance(uint16_t dev_id) { #define TEST_RINGSIZE 512 #define CHECK_ERRS true + /* Setup of the dmadev device. 
8< */ struct rte_dma_stats stats; struct rte_dma_info info; const struct rte_dma_conf conf = { .nb_vchans = 1}; @@ -759,6 +760,7 @@ test_dmadev_instance(uint16_t dev_id) if (rte_dma_vchan_setup(dev_id, vchan, &qconf) < 0) ERR_RETURN("Error with queue configuration\n"); + /* >8 End of setup of the dmadev device. */ rte_dma_info_get(dev_id, &info); if (info.nb_vchans != 1) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index ce33e2857a..abfa5be9ea 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -120,3 +120,33 @@ use a subset of configured queues. Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, that is a "DMA device type" inside DPDK, and can be accessed using APIs from the ``rte_dmadev`` library. + +Using IDXD DMAdev Devices +-------------------------- + +To use the devices from an application, the dmadev API can be used. + +Getting Device Information +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Basic information about each dmadev device can be queried using the +``rte_dma_info_get()`` API. This will return basic device information such as +the ``rte_device`` structure, device capabilities and other device specific values. + +Device Configuration +~~~~~~~~~~~~~~~~~~~~~ + +Configuring an IDXD dmadev device is done using the ``rte_dma_configure()`` and +``rte_dma_vchan_setup`` APIs. The configurations are passed to these APIs using +the ``rte_dma_conf`` and ``rte_dma_vchan_conf`` structures, respectively. For +example, these can be used to configure the number of ``vchans`` per device, the +ring size, etc. The ring size must be a power of two, between 64 and 4096. + +The following code shows how the device is configured in +``test_dmadev.c``: + +.. literalinclude:: ../../../app/test/test_dmadev.c + :language: c + :start-after: Setup of the dmadev device. 8< + :end-before: >8 End of setup of the dmadev device. 
+ :dedent: 1 diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 3c0837ec52..b2acdac4f9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -96,6 +96,9 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 45cde78e88..2c222708cf 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -39,6 +39,78 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + if (size < sizeof(*info)) + return -EINVAL; + + *info = (struct rte_dma_info) { + .dev_capa = RTE_DMA_CAPA_MEM_TO_MEM | + RTE_DMA_CAPA_OPS_COPY | RTE_DMA_CAPA_OPS_FILL, + .max_vchans = 1, + .max_desc = 4096, + .min_desc = 64, + .nb_vchans = (idxd->desc_ring != NULL), /* returns 1 or 0 */ + }; + if (idxd->sva_support) + info->dev_capa |= RTE_DMA_CAPA_SVA; + return 0; +} + +int +idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz) +{ + if (sizeof(struct rte_dma_conf) != conf_sz) + return -EINVAL; + + if (dev_conf->nb_vchans != 1) + return -EINVAL; + return 0; +} + +int +idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t max_desc = qconf->nb_desc; + + if (sizeof(struct rte_dma_vchan_conf) != qconf_sz) + return -EINVAL; + + idxd->qcfg = *qconf; + + if (!rte_is_power_of_2(max_desc)) + max_desc = rte_align32pow2(max_desc); + IDXD_PMD_DEBUG("DMA dev %u using %u descriptors", dev->data->dev_id, max_desc); + 
idxd->desc_ring_mask = max_desc - 1; + idxd->qcfg.nb_desc = max_desc; + + /* in case we are reconfiguring a device, free any existing memory */ + rte_free(idxd->desc_ring); + + /* allocate the descriptor ring at 2x size as batches can't wrap */ + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); + if (idxd->desc_ring == NULL) + return -ENOMEM; + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); + + idxd->batch_idx_read = 0; + idxd->batch_idx_write = 0; + idxd->batch_start = 0; + idxd->batch_size = 0; + idxd->ids_returned = 0; + idxd->ids_avail = 0; + + memset(idxd->batch_comp_ring, 0, sizeof(*idxd->batch_comp_ring) * + (idxd->max_batches + 1)); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 99c8e04302..fdd018ca35 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -82,5 +82,11 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); int idxd_dump(const struct rte_dma_dev *dev, FILE *f); +int idxd_configure(struct rte_dma_dev *dev, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz); +int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); +int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, + uint32_t size); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 33cf76adfb..0216ab80d9 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -61,6 +61,9 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = 
idxd_info_get, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
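To make the ring sizing in idxd_vchan_setup() above easier to follow, here is a small standalone sketch of the round-up-to-power-of-two step and the resulting index mask. It does not use DPDK: align32pow2() is a local reimplementation written for this example, ring_params() is a hypothetical helper, and the 64 to 4096 range check is an assumption that mirrors the min_desc and max_desc values reported by idxd_info_get() rather than something idxd_vchan_setup() itself enforces.

```c
#include <stdint.h>

#define MIN_DESC 64u
#define MAX_DESC 4096u

/* Round up to the next power of two (returns x if already a power of two). */
static uint32_t
align32pow2(uint32_t x)
{
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	return x + 1;
}

/* Compute ring size and mask for a requested descriptor count.
 * Returns 0 on success, -1 if the request is outside the assumed range.
 */
static int
ring_params(uint32_t nb_desc, uint16_t *ring_size, uint16_t *ring_mask)
{
	uint32_t sz;

	if (nb_desc < MIN_DESC || nb_desc > MAX_DESC)
		return -1;

	sz = align32pow2(nb_desc);
	*ring_size = (uint16_t)sz;
	*ring_mask = (uint16_t)(sz - 1); /* index & mask wraps within the ring */
	return 0;
}
```

Because the size is always a power of two, a free-running 16-bit job counter can be reduced to a ring slot with a single AND, which is how the mask is used in the driver's data path.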
* [dpdk-dev] [PATCH v4 08/16] dma/idxd: add start and stop functions for pci devices 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (6 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 07/16] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 09/16] dma/idxd: add data-path job submission functions Kevin Laatz ` (7 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add device start/stop functions for DSA devices bound to vfio. For devices bound to the IDXD kernel driver, these are not required since the IDXD kernel driver takes care of this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 3 +++ drivers/dma/idxd/idxd_pci.c | 52 +++++++++++++++++++++++++++++++++++++ 2 files changed, 55 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index abfa5be9ea..a603c5dd22 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -150,3 +150,6 @@ The following code shows how the device is configured in :start-after: Setup of the dmadev device. 8< :end-before: >8 End of setup of the dmadev device. :dedent: 1 + +Once configured, the device can then be made ready for use by calling the +``rte_dma_start()`` API. 
diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 0216ab80d9..cfb64ce220 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -59,11 +59,63 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; } +static int +idxd_pci_dev_stop(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint8_t err_code; + + if (!idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); + return -EALREADY; + } + + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); + if (err_code || idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", + idxd->qid, err_code); + return -err_code; + } + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); + + return 0; +} + +static int +idxd_pci_dev_start(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint8_t err_code; + + if (idxd_is_wq_enabled(idxd)) { + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); + return 0; + } + + if (idxd->desc_ring == NULL) { + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); + return -EINVAL; + } + + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); + if (err_code || !idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : err_code; + } + + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); + + return 0; +} + static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_dump = idxd_dump, .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .dev_start = idxd_pci_dev_start, + .dev_stop = idxd_pci_dev_stop, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
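The start and stop functions above depend on idxd_pci_dev_command() and idxd_is_wq_enabled() from the earlier PCI probe patch. The following standalone sketch shows the bit manipulation those helpers perform, using plain integers in place of the memory-mapped registers. All names here (encode_cmd, wq_is_enabled, cmd_is_done and the cmd_* enumerators) are made up for the example; the shifts, masks and the operand rule are copied from the patch as posted.

```c
#include <stdint.h>

#define CMD_SHIFT 20
#define WQ_STATE_SHIFT 30
#define WQ_STATE_MASK 0x3u
#define CMDSTATUS_ACTIVE_MASK (1u << 31)
#define CMDSTATUS_ERR_MASK 0xFFu

/* Same ordering as enum rte_idxd_cmds in the patch. */
enum cmds {
	cmd_enable_dev = 1,
	cmd_disable_dev,
	cmd_drain_all,
	cmd_abort_all,
	cmd_reset_device,
	cmd_enable_wq,
	cmd_disable_wq,
	cmd_drain_wq,
	cmd_abort_wq,
	cmd_reset_wq,
};

/* Build the value written to the command register. As in the patch,
 * disable/drain/abort/reset WQ take a bitmask of queues, while the
 * other commands take the queue id as the operand.
 */
static uint32_t
encode_cmd(enum cmds command, uint16_t qid)
{
	uint32_t operand = qid;

	if (command >= cmd_disable_wq && command <= cmd_reset_wq)
		operand = 1u << qid;
	return ((uint32_t)command << CMD_SHIFT) | operand;
}

/* A work queue is enabled when its two state bits hold the value 1. */
static int
wq_is_enabled(uint32_t wq_state_reg)
{
	return ((wq_state_reg >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1;
}

/* Returns 1 and stores the error code once the active bit has cleared,
 * or 0 (leaving *err untouched) while the command is still running.
 */
static int
cmd_is_done(uint32_t cmdstatus, uint8_t *err)
{
	if (cmdstatus & CMDSTATUS_ACTIVE_MASK)
		return 0;
	*err = (uint8_t)(cmdstatus & CMDSTATUS_ERR_MASK);
	return 1;
}
```

A caller would write encode_cmd() to the command register and then poll the status register with cmd_is_done() under a bounded retry count, which is the shape of the loop in idxd_pci_dev_command().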
* [dpdk-dev] [PATCH v4 09/16] dma/idxd: add data-path job submission functions 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (7 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 10/16] dma/idxd: add data-path job completion functions Kevin Laatz ` (6 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data path functions for enqueuing and submitting operations to DSA devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 136 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 ++ drivers/dma/idxd/meson.build | 1 + 4 files changed, 206 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index a603c5dd22..7835461a22 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -153,3 +153,67 @@ The following code shows how the device is configured in Once configured, the device can then be made ready for use by calling the ``rte_dma_start()`` API. + +Performing Data Copies +~~~~~~~~~~~~~~~~~~~~~~~ + +To perform data copies using IDXD dmadev devices, descriptors should be enqueued +using the ``rte_dma_copy()`` API. The HW can be triggered to perform the copy +in two ways, either via a ``RTE_DMA_OP_FLAG_SUBMIT`` flag or by calling +``rte_dma_submit()``. Once copies have been completed, the completion will +be reported back when the application calls ``rte_dma_completed()`` or +``rte_dma_completed_status()``. 
The latter will also report the status of each +completed operation. + +The ``rte_dma_copy()`` function enqueues a single copy to the device ring for +copying at a later point. The parameters to that function include the IOVA addresses +of both the source and destination buffers, as well as the length of the copy. + +The ``rte_dma_copy()`` function enqueues a copy operation on the device ring. +If the ``RTE_DMA_OP_FLAG_SUBMIT`` flag is set when calling ``rte_dma_copy()``, +the device hardware will be informed of the elements. Alternatively, if the flag +is not set, the application needs to call the ``rte_dma_submit()`` function to +notify the device hardware. Once the device hardware is informed of the elements +enqueued on the ring, the device will begin to process them. It is expected +that, for efficiency reasons, a burst of operations will be enqueued to the +device via multiple enqueue calls between calls to the ``rte_dma_submit()`` +function. + +The following code demonstrates how to enqueue a burst of copies to the +device and start the hardware processing of them: + +.. 
code-block:: C + + struct rte_mbuf *srcs[COMP_BURST_SZ], *dsts[COMP_BURST_SZ]; + unsigned int i, j; + + for (i = 0; i < RTE_DIM(srcs); i++) { + uint64_t *src_data; + + srcs[i] = rte_pktmbuf_alloc(pool); + dsts[i] = rte_pktmbuf_alloc(pool); + if (srcs[i] == NULL || dsts[i] == NULL) { + PRINT_ERR("Error allocating buffers\n"); + return -1; + } + src_data = rte_pktmbuf_mtod(srcs[i], uint64_t *); + + for (j = 0; j < COPY_LEN/sizeof(uint64_t); j++) + src_data[j] = rte_rand(); + + if (rte_dma_copy(dev_id, vchan, srcs[i]->buf_iova + srcs[i]->data_off, + dsts[i]->buf_iova + dsts[i]->data_off, COPY_LEN, 0) < 0) { + PRINT_ERR("Error with rte_dma_copy for buffer %u\n", i); + return -1; + } + } + rte_dma_submit(dev_id, vchan); + +Filling an Area of Memory +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The IDXD driver also has support for the ``fill`` operation, where an area +of memory is overwritten, or filled, with a short pattern of data. +Fill operations can be performed in much the same way as copy operations +described above, just using the ``rte_dma_fill()`` function rather than the +``rte_dma_copy()`` function. 
diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 2c222708cf..b01edeab07 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,14 +2,144 @@ * Copyright 2021 Intel Corporation */ +#include <x86intrin.h> + #include <rte_dmadev_pmd.h> #include <rte_malloc.h> #include <rte_common.h> +#include <rte_prefetch.h> #include "idxd_internal.h" #define IDXD_PMD_NAME_STR "dmadev_idxd" +static __rte_always_inline rte_iova_t +__desc_idx_to_iova(struct idxd_dmadev *idxd, uint16_t n) +{ + return idxd->desc_iova + (n * sizeof(struct idxd_hw_desc)); +} + +static __rte_always_inline void +__idxd_movdir64b(volatile void *dst, const struct idxd_hw_desc *src) +{ + asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : + : "a" (dst), "d" (src) + : "memory"); +} + +static __rte_always_inline void +__submit(struct idxd_dmadev *idxd) +{ + rte_prefetch1(&idxd->batch_comp_ring[idxd->batch_idx_read]); + + if (idxd->batch_size == 0) + return; + + /* write completion to batch comp ring */ + rte_iova_t comp_addr = idxd->batch_iova + + (idxd->batch_idx_write * sizeof(struct idxd_completion)); + + if (idxd->batch_size == 1) { + /* submit batch directly */ + struct idxd_hw_desc desc = + idxd->desc_ring[idxd->batch_start & idxd->desc_ring_mask]; + desc.completion = comp_addr; + desc.op_flags |= IDXD_FLAG_REQUEST_COMPLETION; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &desc); + } else { + const struct idxd_hw_desc batch_desc = { + .op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_COMPLETION_ADDR_VALID | + IDXD_FLAG_REQUEST_COMPLETION, + .desc_addr = __desc_idx_to_iova(idxd, + idxd->batch_start & idxd->desc_ring_mask), + .completion = comp_addr, + .size = idxd->batch_size, + }; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &batch_desc); + } + + if (++idxd->batch_idx_write > idxd->max_batches) + idxd->batch_idx_write = 0; + + 
idxd->batch_start += idxd->batch_size; + idxd->batch_size = 0; + idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; + _mm256_store_si256((void *)&idxd->batch_comp_ring[idxd->batch_idx_write], + _mm256_setzero_si256()); +} + +static __rte_always_inline int +__idxd_write_desc(struct rte_dma_dev *dev, + const uint32_t op_flags, + const rte_iova_t src, + const rte_iova_t dst, + const uint32_t size, + const uint32_t flags) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t mask = idxd->desc_ring_mask; + uint16_t job_id = idxd->batch_start + idxd->batch_size; + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; + + /* first check batch ring space then desc ring space */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return -1; + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) + return -1; + + /* write desc. Note: descriptors don't wrap, but the completion address does */ + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, + _mm256_set_epi64x(0, 0, 0, size)); + + idxd->batch_size++; + + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) + __submit(idxd); + + return job_id; +} + +int +idxd_enqueue_copy(struct rte_dma_dev *dev, uint16_t qid __rte_unused, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, + * but check it at compile time to be sure. 
+ */ + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, memmove, src, dst, length, flags); +} + +int +idxd_enqueue_fill(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + uint32_t fill = (idxd_op_fill << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, fill, pattern, dst, length, flags); +} + +int +idxd_submit(struct rte_dma_dev *dev, uint16_t qid __rte_unused) +{ + __submit(dev->dev_private); + return 0; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -141,6 +271,12 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->dev_ops = ops; dmadev->device = dev; + dmadev->copy = idxd_enqueue_copy; + dmadev->fill = idxd_enqueue_fill; + dmadev->submit = idxd_submit; + dmadev->completed = idxd_completed; + dmadev->completed_status = idxd_completed_status; + idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); if (idxd == NULL) { IDXD_PMD_ERR("Unable to allocate memory for device"); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index fdd018ca35..b66c2d0182 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -88,5 +88,10 @@ int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, uint32_t size); +int idxd_enqueue_copy(struct rte_dma_dev *dev, uint16_t qid, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_enqueue_fill(struct rte_dma_dev *dev, uint16_t qid, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_submit(struct rte_dma_dev *dev, uint16_t 
qid); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 36dbd3e518..acb1b10618 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -6,6 +6,7 @@ if is_windows endif deps += ['bus_pci'] +cflags += '-mavx2' # all platforms with idxd HW support AVX sources = files( 'idxd_bus.c', 'idxd_common.c', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
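Note on the batch ring indexing in the patch above (illustrative only, not part of the posted series): __submit() advances batch_idx_write with a pre-increment that wraps to 0 once it passes max_batches, and __idxd_write_desc() refuses new work when the write index sits one slot behind the read index. A minimal standalone sketch of that arithmetic follows. The struct batch_ring_model type and the batch_ring_* helper names are hypothetical and exist only for this sketch; only the three index fields are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: only the batch-ring index fields of struct idxd_dmadev. */
struct batch_ring_model {
	uint16_t max_batches;     /* highest valid index, so the ring has max_batches + 1 slots */
	uint16_t batch_idx_read;  /* oldest batch not yet gathered */
	uint16_t batch_idx_write; /* slot the next submitted batch will use */
};

/* Wrap rule from __submit(): step forward, returning to 0 after max_batches. */
static uint16_t
batch_ring_next(const struct batch_ring_model *r, uint16_t idx)
{
	assert(idx <= r->max_batches);
	return (idx == r->max_batches) ? 0 : (uint16_t)(idx + 1);
}

/* Space check from __idxd_write_desc(): full when write sits one slot behind read. */
static bool
batch_ring_full(const struct batch_ring_model *r)
{
	return (r->batch_idx_read == 0 && r->batch_idx_write == r->max_batches) ||
		r->batch_idx_write + 1 == r->batch_idx_read;
}

/* Advance the write index until the ring reports full; return the number of steps. */
static unsigned int
batch_ring_fill(struct batch_ring_model *r)
{
	unsigned int n = 0;

	while (!batch_ring_full(r)) {
		r->batch_idx_write = batch_ring_next(r, r->batch_idx_write);
		n++;
	}
	return n;
}
```

Under this model one slot is always left unused, so a ring with max_batches = 3 has four slots but accepts only three outstanding batches. That is what lets read == write mean "empty" without a separate counter.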
* [dpdk-dev] [PATCH v4 10/16] dma/idxd: add data-path job completion functions 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (8 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 09/16] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 11/16] dma/idxd: add operation statistic tracking Kevin Laatz ` (5 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the data path functions for gathering completed operations. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v2: - fixed typo in docs - add completion status for invalid opcode --- doc/guides/dmadevs/idxd.rst | 32 +++++ drivers/dma/idxd/idxd_common.c | 235 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 + 3 files changed, 272 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 7835461a22..f942a8aa44 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -209,6 +209,38 @@ device and start the hardware processing of them: } rte_dma_submit(dev_id, vchan); +To retrieve information about completed copies, ``rte_dma_completed()`` and +``rte_dma_completed_status()`` APIs should be used. ``rte_dma_completed()`` +will return the number of completed operations, along with the index of the last +successful completed operation and whether or not an error was encountered. 
If an +error was encountered, ``rte_dma_completed_status()`` must be used to kick off the +device again so that it continues processing operations, and to gather the status of each +individual operation, which is written to the ``status`` array provided as a +parameter by the application. + +The following status codes are supported by IDXD: +* ``RTE_DMA_STATUS_SUCCESSFUL``: The operation was successful. +* ``RTE_DMA_STATUS_INVALID_OPCODE``: The operation failed due to an invalid operation code. +* ``RTE_DMA_STATUS_INVALID_LENGTH``: The operation failed due to an invalid data length. +* ``RTE_DMA_STATUS_NOT_ATTEMPTED``: The operation was not attempted. +* ``RTE_DMA_STATUS_ERROR_UNKNOWN``: The operation failed due to an unspecified error. + +The following code shows how to retrieve the number of successfully completed +copies within a burst and then use ``rte_dma_completed_status()`` to check +which operation failed and kick off the device to continue processing operations: + +.. code-block:: C + + enum rte_dma_status_code status[COMP_BURST_SZ]; + uint16_t count, idx, status_count; + bool error = false; + + count = rte_dma_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); + + if (error) { + status_count = rte_dma_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status); + } + Filling an Area of Memory ~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index b01edeab07..a061a956c2 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -140,6 +140,241 @@ idxd_submit(struct rte_dma_dev *dev, uint16_t qid __rte_unused) return 0; } +static enum rte_dma_status_code +get_comp_status(struct idxd_completion *c) +{ + uint8_t st = c->status; + switch (st) { + /* successful descriptors are not written back normally */ + case IDXD_COMP_STATUS_INCOMPLETE: + case IDXD_COMP_STATUS_SUCCESS: + return RTE_DMA_STATUS_SUCCESSFUL; + case IDXD_COMP_STATUS_INVALID_OPCODE: + return RTE_DMA_STATUS_INVALID_OPCODE; + 
case IDXD_COMP_STATUS_INVALID_SIZE: + return RTE_DMA_STATUS_INVALID_LENGTH; + case IDXD_COMP_STATUS_SKIPPED: + return RTE_DMA_STATUS_NOT_ATTEMPTED; + default: + return RTE_DMA_STATUS_ERROR_UNKNOWN; + } +} + +static __rte_always_inline int +batch_ok(struct idxd_dmadev *idxd, uint8_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t ret; + uint8_t bstatus; + + if (max_ops == 0) + return 0; + + /* first check if there are any unreturned handles from last time */ + if (idxd->ids_avail != idxd->ids_returned) { + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + + if (idxd->batch_idx_read == idxd->batch_idx_write) + return 0; + + bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status; + /* now check if next batch is complete and successful */ + if (bstatus == IDXD_COMP_STATUS_SUCCESS) { + /* since the batch idx ring stores the start of each batch, pre-increment to lookup + * start of next batch. + */ + if (++idxd->batch_idx_read > idxd->max_batches) + idxd->batch_idx_read = 0; + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; + + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + /* check if batch is incomplete */ + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) + return 0; + + return -1; /* error case */ +} + +static inline uint16_t +batch_completed(struct idxd_dmadev *idxd, uint8_t max_ops, bool *has_error) +{ + uint16_t i; + uint16_t b_start, b_end, next_batch; + + int ret = batch_ok(idxd, max_ops, NULL); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ * once we identify the actual failure job, return other jobs, then update + * the batch ring indexes to make it look like the first job of the batch has failed. + * Subsequent calls here will always return zero packets, and the error must be cleared by + * calling the completed_status() function. + */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + b_end = idxd->batch_idx_ring[next_batch]; + + if (b_end - b_start == 1) { /* not a batch */ + *has_error = true; + return 0; + } + + for (i = b_start; i < b_end; i++) { + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ + break; + } + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); + if (ret < max_ops) + *has_error = true; /* we got up to the point of error */ + idxd->ids_avail = idxd->ids_returned += ret; + + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { + /* copy over the descriptor status to the batch ring as if no batch */ + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; + } + + return ret; +} + +static uint16_t +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t next_batch; + + int ret = batch_ok(idxd, max_ops, status); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; + const uint16_t b_len = b_end - b_start; + if (b_len == 1) {/* not a batch */ + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + idxd->ids_avail++; + idxd->ids_returned++; + idxd->batch_idx_read = next_batch; + return 1; + } + + /* not a single-element batch, need to process more. + * Scenarios: + * 1. max_ops >= batch_size - can fit everything, simple case + * - loop through completed ops and then add on any not-attempted ones + * 2. max_ops < batch_size - can't fit everything, more complex case + * - loop through completed/incomplete and stop when hit max_ops + * - adjust the batch descriptor to update where we stopped, with appropriate bcount + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as + * non-batch next time. + */ + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; + for (ret = 0; ret < b_len && ret < max_ops; ret++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + } + idxd->ids_avail = idxd->ids_returned += ret; + + /* everything fit */ + if (ret == b_len) { + idxd->batch_idx_read = next_batch; + return ret; + } + + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (ret > bcount) { + /* we have only incomplete ones - set batch completed size to 0 */ + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size = 0; + /* if there is only one descriptor left, job skipped so set flag appropriately */ + if (b_len - ret == 1) + comp->status = IDXD_COMP_STATUS_SKIPPED; + } else { + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size -= ret; + /* if there is only one descriptor left, copy status info straight to desc */ + if (comp->completed_size == 1) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + comp->status = c->status; + /* individual descs can be ok without writeback, but not batches */ + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) + comp->status = IDXD_COMP_STATUS_SUCCESS; + } else if (bcount == b_len) { + /* check if we still have an error, and clear flag if not */ + uint16_t i; + for (i = b_start + ret; i < b_end; i++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) + break; + } + if (i == b_end) /* no errors */ + comp->status = IDXD_COMP_STATUS_SUCCESS; + } + } + + return ret; +} + +uint16_t +idxd_completed(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, bool *has_error) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed(idxd, max_ops - ret, has_error); + ret += batch; + } while (batch > 0 && *has_error == false); + + *last_idx = 
idxd->ids_returned - 1; + return ret; +} + +uint16_t +idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, enum rte_dma_status_code *status) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + uint16_t batch, ret = 0; + + do { + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); + ret += batch; + } while (batch > 0); + + *last_idx = idxd->ids_returned - 1; + return ret; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index b66c2d0182..15115a0966 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -93,5 +93,10 @@ int idxd_enqueue_copy(struct rte_dma_dev *dev, uint16_t qid, rte_iova_t src, int idxd_enqueue_fill(struct rte_dma_dev *dev, uint16_t qid, uint64_t pattern, rte_iova_t dst, unsigned int length, uint64_t flags); int idxd_submit(struct rte_dma_dev *dev, uint16_t qid); +uint16_t idxd_completed(struct rte_dma_dev *dev, uint16_t qid, uint16_t max_ops, + uint16_t *last_idx, bool *has_error); +uint16_t idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, + uint16_t max_ops, uint16_t *last_idx, + enum rte_dma_status_code *status); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
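Note on the job ID counters in the patch above (illustrative only, not part of the posted series): ids_avail and ids_returned are free-running 16-bit counters, so the completion path relies on unsigned wrap-around rather than on masked values. The sketch below shows that arithmetic in isolation. All helper names are hypothetical, and the power-of-two ring size is an assumption inferred from the use of desc_ring_mask as a bit mask.

```c
#include <assert.h>
#include <stdint.h>

/* IDs gathered from hardware but not yet handed to the caller. Both counters
 * are free-running 16-bit values, so the cast keeps the result correct across
 * the 65535 -> 0 wrap.
 */
static uint16_t
ids_pending(uint16_t ids_avail, uint16_t ids_returned)
{
	return (uint16_t)(ids_avail - ids_returned);
}

/* How many IDs one call may return, as in batch_ok(): min(pending, max_ops). */
static uint16_t
ids_to_return(uint16_t ids_avail, uint16_t ids_returned, uint16_t max_ops)
{
	uint16_t pending = ids_pending(ids_avail, ids_returned);

	return pending < max_ops ? pending : max_ops;
}

/* Descriptor ring slot for a job ID; assumes the ring size is a power of two. */
static uint16_t
desc_slot(uint16_t job_id, uint16_t desc_ring_mask)
{
	assert(((desc_ring_mask + 1) & desc_ring_mask) == 0);
	return job_id & desc_ring_mask;
}

/* last_idx as reported by idxd_completed(): the ID just before ids_returned. */
static uint16_t
last_returned_idx(uint16_t ids_returned)
{
	return (uint16_t)(ids_returned - 1);
}
```

Keeping the counters unmasked means a job ID doubles as a ring position (after masking) and as a monotonically increasing handle for the application, at the cost of the explicit uint16_t casts wherever two counters are subtracted.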
* [dpdk-dev] [PATCH v4 11/16] dma/idxd: add operation statistic tracking 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (9 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 10/16] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 12/16] dma/idxd: add vchan status function Kevin Laatz ` (4 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add statistic tracking for DSA devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 11 +++++++++++ drivers/dma/idxd/idxd_bus.c | 2 ++ drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 +++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 45 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index f942a8aa44..c81f1d15cc 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -249,3 +249,14 @@ of memory is overwritten, or filled, with a short pattern of data. Fill operations can be performed in much the same way as copy operations described above, just using the ``rte_dma_fill()`` function rather than the ``rte_dma_copy()`` function. + +Querying Device Statistics +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The statistics from the IDXD dmadev device can be retrieved via the stats functions in +the ``rte_dmadev`` library, i.e. ``rte_dma_stats_get()``. The statistics +returned for each device instance are: + +* ``submitted``: The number of operations submitted to the device. +* ``completed``: The number of operations which have completed (successful and failed). 
+* ``errors``: The number of operations that completed with an error. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b2acdac4f9..b52ea02854 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -99,6 +99,8 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index a061a956c2..d86c58c12a 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -65,6 +65,8 @@ __submit(struct idxd_dmadev *idxd) if (++idxd->batch_idx_write > idxd->max_batches) idxd->batch_idx_write = 0; + idxd->stats.submitted += idxd->batch_size; + idxd->batch_start += idxd->batch_size; idxd->batch_size = 0; idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; @@ -275,6 +277,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ const uint16_t b_len = b_end - b_start; if (b_len == 1) {/* not a batch */ *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + if (*status != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; idxd->ids_avail++; idxd->ids_returned++; idxd->batch_idx_read = next_batch; @@ -296,6 +300,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ struct idxd_completion *c = (void *) &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + if (status[ret] != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; } idxd->ids_avail = idxd->ids_returned += ret; @@ -354,6 +360,7 @@ idxd_completed(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ ret += batch; } while (batch > 0 && *has_error == false); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -371,6 +378,7 @@ idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16 ret += batch; } while (batch > 0); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -404,6 +412,25 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + struct rte_dma_stats *stats, uint32_t stats_sz) +{ + struct idxd_dmadev *idxd = dev->dev_private; + if (stats_sz < sizeof(*stats)) + return -EINVAL; + *stats = idxd->stats; + return 0; +} + +int +idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->dev_private; + idxd->stats = (struct rte_dma_stats){0}; + return 0; +} + int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 15115a0966..e2a1119ef7 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -98,5 +98,8 @@ uint16_t idxd_completed(struct rte_dma_dev *dev, uint16_t qid, uint16_t max_ops, uint16_t idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ops, uint16_t *last_idx, enum rte_dma_status_code *status); +int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, + struct rte_dma_stats *stats, uint32_t stats_sz); +int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c 
index cfb64ce220..d73845aa3d 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -114,6 +114,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, }; -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
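Note on the statistics handling in the patch above (illustrative only, not part of the posted series): the three counters and the get/reset callbacks can be exercised without any hardware. The sketch below uses a hypothetical struct stats_model in place of struct rte_dma_stats and a hypothetical two-value status enum. It keeps the same size check and compound-literal reset as idxd_stats_get() and idxd_stats_reset(), and counts every gathered job as completed while non-successful ones also count as errors.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three counters described in the documentation. */
struct stats_model {
	uint64_t submitted;
	uint64_t completed;
	uint64_t errors;
};

/* Hypothetical status values; the driver uses enum rte_dma_status_code. */
enum status_model { STATUS_MODEL_OK = 0, STATUS_MODEL_FAILED };

/* Account for one gathered burst: all n jobs completed, failed ones are also errors. */
static void
stats_account(struct stats_model *s, const enum status_model *status, uint16_t n)
{
	uint16_t i;

	assert(s != NULL && status != NULL);
	for (i = 0; i < n; i++)
		if (status[i] != STATUS_MODEL_OK) /* status[i] is the value for job i */
			s->errors++;
	s->completed += n;
}

/* Copy the counters out, rejecting a caller buffer that is too small. */
static int
stats_get_model(const struct stats_model *dev, struct stats_model *out, uint32_t out_sz)
{
	if (out_sz < sizeof(*out))
		return -EINVAL;
	*out = *dev;
	return 0;
}

/* Zero every counter with a compound literal, as the patch does. */
static void
stats_reset_model(struct stats_model *s)
{
	*s = (struct stats_model){0};
}
```

With these definitions, "completed" is a superset of "errors", which matches the documentation's wording that completed covers both successful and failed operations.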
* [dpdk-dev] [PATCH v4 12/16] dma/idxd: add vchan status function 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (10 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 11/16] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 13/16] dma/idxd: add burst capacity API Kevin Laatz ` (3 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When testing dmadev drivers, it is useful to have the HW device in a known state. This patch adds the implementation of the function which will wait for the device to be idle (all jobs completed) before proceeding. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v3: update API name to vchan_status --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 17 +++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 2 ++ drivers/dma/idxd/idxd_pci.c | 1 + 4 files changed, 21 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b52ea02854..e6caa048a9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -101,6 +101,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_info_get = idxd_info_get, .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, + .vchan_status = idxd_vchan_status, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index d86c58c12a..87d84c081e 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -162,6 +162,23 @@ get_comp_status(struct idxd_completion *c) } } +int +idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + enum rte_dma_vchan_status *status) +{ + struct idxd_dmadev *idxd = 
dev->dev_private; + uint16_t last_batch_write = idxd->batch_idx_write == 0 ? idxd->max_batches : + idxd->batch_idx_write - 1; + uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0); + + /* An IDXD device will always be either active or idle. + * RTE_DMA_VCHAN_HALTED_ERROR is therefore not supported by IDXD. + */ + *status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE; + + return 0; +} + static __rte_always_inline int batch_ok(struct idxd_dmadev *idxd, uint8_t max_ops, enum rte_dma_status_code *status) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index e2a1119ef7..a291ad26d9 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -101,5 +101,7 @@ uint16_t idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unuse int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, struct rte_dma_stats *stats, uint32_t stats_sz); int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); +int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, + enum rte_dma_vchan_status *status); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index d73845aa3d..2464d4a06c 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -118,6 +118,7 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, + .vchan_status = idxd_vchan_status, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
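Note on the idle check in the patch above (illustrative only, not part of the posted series): idxd_vchan_status() looks at the completion record of the most recently submitted batch, which lives one slot behind batch_idx_write, and treats any non-zero status byte as "idle". The sketch below isolates that lookup. The enum and function names are hypothetical, and the status bytes are passed as a plain array rather than as struct idxd_completion records.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-state result; the driver reports enum rte_dma_vchan_status. */
enum vchan_state_model { VCHAN_MODEL_IDLE, VCHAN_MODEL_ACTIVE };

/* Slot of the last submitted batch: one behind the write index, with wrap. */
static uint16_t
last_batch_written(uint16_t batch_idx_write, uint16_t max_batches)
{
	assert(batch_idx_write <= max_batches);
	return batch_idx_write == 0 ? max_batches : (uint16_t)(batch_idx_write - 1);
}

/* A status byte of 0 means the hardware has not written the record back yet. */
static enum vchan_state_model
vchan_state(const uint8_t *batch_status, uint16_t batch_idx_write, uint16_t max_batches)
{
	uint16_t last = last_batch_written(batch_idx_write, max_batches);

	return batch_status[last] != 0 ? VCHAN_MODEL_IDLE : VCHAN_MODEL_ACTIVE;
}
```

Checking only the newest batch is enough under the assumption that the device completes batches in submission order. That ordering is not stated in the patch, so treat it as an assumption of this sketch.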
* [dpdk-dev] [PATCH v4 13/16] dma/idxd: add burst capacity API 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (11 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 12/16] dma/idxd: add vchan status function Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz ` (2 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add support for the burst capacity API. This API will provide the calling application with the remaining capacity of the current burst (limited by max HW batch size). Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 20 ++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 1 + drivers/dma/idxd/idxd_pci.c | 2 ++ 4 files changed, 24 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index e6caa048a9..54129e5083 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -102,6 +102,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, .vchan_status = idxd_vchan_status, + .burst_capacity = idxd_burst_capacity, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 87d84c081e..b31611c8a4 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -469,6 +469,26 @@ idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t return 0; } +uint16_t +idxd_burst_capacity(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t write_idx = idxd->batch_start + 
idxd->batch_size; + uint16_t used_space; + + /* Check for space in the batch ring */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return 0; + + /* For descriptors, check for wrap-around on write but not read */ + if (idxd->ids_returned > write_idx) + write_idx += idxd->desc_ring_mask + 1; + used_space = write_idx - idxd->ids_returned; + + return RTE_MIN((idxd->desc_ring_mask - used_space), idxd->max_batch_size); +} + int idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, uint32_t conf_sz) diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index a291ad26d9..3ef2f729a8 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -103,5 +103,6 @@ int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, enum rte_dma_vchan_status *status); +uint16_t idxd_burst_capacity(const struct rte_dma_dev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 2464d4a06c..03ddd63f38 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -119,6 +119,7 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, .vchan_status = idxd_vchan_status, + .burst_capacity = idxd_burst_capacity, }; /* each portal uses 4 x 4k pages */ @@ -232,6 +233,7 @@ init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, idxd->u.pci = pci; idxd->max_batches = wq_size; + idxd->max_batch_size = 1 << lg2_max_batch; /* enable the device itself */ err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
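Note on the capacity calculation in the patch above (illustrative only, not part of the posted series): the burst capacity is the smaller of the free descriptor slots and the hardware's maximum batch size, and it is zero whenever the batch ring has no free slot. The sketch below models that on a hypothetical struct capacity_model. It deliberately swaps the patch's explicit wrap-around branch for a masked 16-bit subtraction; that substitution is a choice made for this sketch, not the code as posted, and it assumes a power-of-two descriptor ring.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the fields of struct idxd_dmadev used by the calculation. */
struct capacity_model {
	uint16_t desc_ring_mask;  /* descriptor ring size - 1 (power of two assumed) */
	uint16_t max_batch_size;  /* largest batch the hardware accepts */
	uint16_t max_batches;     /* highest valid batch ring index */
	uint16_t batch_idx_read;
	uint16_t batch_idx_write;
	uint16_t batch_start;     /* job ID of the first descriptor in the open batch */
	uint16_t batch_size;      /* descriptors already in the open batch */
	uint16_t ids_returned;    /* job IDs already handed back to the application */
};

static uint16_t
burst_capacity_model(const struct capacity_model *m)
{
	uint16_t write_idx = (uint16_t)(m->batch_start + m->batch_size);
	uint16_t used_space, free_space;

	assert(((m->desc_ring_mask + 1) & m->desc_ring_mask) == 0);

	/* no free batch slot means nothing more can be enqueued */
	if ((m->batch_idx_read == 0 && m->batch_idx_write == m->max_batches) ||
			m->batch_idx_write + 1 == m->batch_idx_read)
		return 0;

	/* sketch-only choice: masked subtraction of the free-running counters */
	used_space = (uint16_t)(write_idx - m->ids_returned) & m->desc_ring_mask;
	free_space = (uint16_t)(m->desc_ring_mask - used_space);

	return free_space < m->max_batch_size ? free_space : m->max_batch_size;
}
```

An application would typically compare the returned value against the number of operations it is about to enqueue, and fall back to gathering completions first when the capacity is too small.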
* [dpdk-dev] [PATCH v4 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (12 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 13/16] dma/idxd: add burst capacity API Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 15/16] devbind: add dma device class Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Conor Walsh <conor.walsh@intel.com> Move the example script for configuring IDXD devices bound to the IDXD kernel driver from raw to dma, and create a symlink to still allow use from raw. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++++++++++++++++++++++++++ drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +----------------------------- 2 files changed, 118 insertions(+), 117 deletions(-) create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py diff --git a/drivers/dma/idxd/dpdk_idxd_cfg.py b/drivers/dma/idxd/dpdk_idxd_cfg.py new file mode 100755 index 0000000000..fcc27822ef --- /dev/null +++ b/drivers/dma/idxd/dpdk_idxd_cfg.py @@ -0,0 +1,117 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2020 Intel Corporation + +""" +Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use +""" + +import sys +import argparse +import os +import os.path + + +class SysfsDir: + "Used to read/write paths in a sysfs directory" + def __init__(self, path): + self.path = path + + def read_int(self, filename): + "Return a value from sysfs file" + with open(os.path.join(self.path, 
filename)) as f: + return int(f.readline()) + + def write_values(self, values): + "write dictionary, where key is filename and value is value to write" + for filename, contents in values.items(): + with open(os.path.join(self.path, filename), "w") as f: + f.write(str(contents)) + + +def reset_device(dsa_id): + "Reset the DSA device and all its queues" + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) + + +def get_pci_dir(pci): + "Search for the sysfs directory of the PCI device" + base_dir = '/sys/bus/pci/devices/' + for path, dirs, files in os.walk(base_dir): + for dir in dirs: + if pci in dir: + return os.path.join(base_dir, dir) + sys.exit(f"Could not find sysfs directory for device {pci}") + + +def get_dsa_id(pci): + "Get the DSA instance ID using the PCI address of the device" + pci_dir = get_pci_dir(pci) + for path, dirs, files in os.walk(pci_dir): + for dir in dirs: + if dir.startswith('dsa') and 'wq' not in dir: + return int(dir[3:]) + sys.exit(f"Could not get device ID for device {pci}") + + +def configure_dsa(dsa_id, queues, prefix): + "Configure the DSA instance with appropriate number of queues" + dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + + max_groups = dsa_dir.read_int("max_groups") + max_engines = dsa_dir.read_int("max_engines") + max_queues = dsa_dir.read_int("max_work_queues") + max_work_queues_size = dsa_dir.read_int("max_work_queues_size") + + nb_queues = min(queues, max_queues) + if queues > nb_queues: + print(f"Setting number of queues to max supported value: {max_queues}") + + # we want one engine per group, and no more engines than queues + nb_groups = min(max_engines, max_groups, nb_queues) + for grp in range(nb_groups): + dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) + + # configure each queue + for q in range(nb_queues): + wq_dir = SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) + 
wq_dir.write_values({"group_id": q % nb_groups, + "type": "user", + "mode": "dedicated", + "name": f"{prefix}_wq{dsa_id}.{q}", + "priority": 1, + "size": int(max_work_queues_size / nb_queues)}) + + # enable device and then queues + drv_dir.write_values({"bind": f"dsa{dsa_id}"}) + for q in range(nb_queues): + drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) + + +def main(args): + "Main function, does arg parsing and calls config function" + arg_p = argparse.ArgumentParser( + description="Configure whole DSA device instance for DPDK use") + arg_p.add_argument('dsa_id', + help="Specify DSA instance either via DSA instance number or PCI address") + arg_p.add_argument('-q', metavar='queues', type=int, default=255, + help="Number of queues to set up") + arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', + default="dpdk", + help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") + arg_p.add_argument('--reset', action='store_true', + help="Reset DSA device and its queues") + parsed_args = arg_p.parse_args(args[1:]) + + dsa_id = parsed_args.dsa_id + dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id + if parsed_args.reset: + reset_device(dsa_id) + else: + configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) + + +if __name__ == "__main__": + main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py deleted file mode 100755 index fcc27822ef..0000000000 --- a/drivers/raw/ioat/dpdk_idxd_cfg.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/env python3 -# SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2020 Intel Corporation - -""" -Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use -""" - -import sys -import argparse -import os -import os.path - - -class SysfsDir: - "Used to read/write paths in a sysfs directory" - def __init__(self, path): - self.path = path - - def read_int(self, filename): - "Return a value from sysfs file" - with open(os.path.join(self.path, 
filename)) as f: - return int(f.readline()) - - def write_values(self, values): - "write dictionary, where key is filename and value is value to write" - for filename, contents in values.items(): - with open(os.path.join(self.path, filename), "w") as f: - f.write(str(contents)) - - -def reset_device(dsa_id): - "Reset the DSA device and all its queues" - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) - - -def get_pci_dir(pci): - "Search for the sysfs directory of the PCI device" - base_dir = '/sys/bus/pci/devices/' - for path, dirs, files in os.walk(base_dir): - for dir in dirs: - if pci in dir: - return os.path.join(base_dir, dir) - sys.exit(f"Could not find sysfs directory for device {pci}") - - -def get_dsa_id(pci): - "Get the DSA instance ID using the PCI address of the device" - pci_dir = get_pci_dir(pci) - for path, dirs, files in os.walk(pci_dir): - for dir in dirs: - if dir.startswith('dsa') and 'wq' not in dir: - return int(dir[3:]) - sys.exit(f"Could not get device ID for device {pci}") - - -def configure_dsa(dsa_id, queues, prefix): - "Configure the DSA instance with appropriate number of queues" - dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - - max_groups = dsa_dir.read_int("max_groups") - max_engines = dsa_dir.read_int("max_engines") - max_queues = dsa_dir.read_int("max_work_queues") - max_work_queues_size = dsa_dir.read_int("max_work_queues_size") - - nb_queues = min(queues, max_queues) - if queues > nb_queues: - print(f"Setting number of queues to max supported value: {max_queues}") - - # we want one engine per group, and no more engines than queues - nb_groups = min(max_engines, max_groups, nb_queues) - for grp in range(nb_groups): - dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) - - # configure each queue - for q in range(nb_queues): - wq_dir = SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) - 
wq_dir.write_values({"group_id": q % nb_groups, - "type": "user", - "mode": "dedicated", - "name": f"{prefix}_wq{dsa_id}.{q}", - "priority": 1, - "size": int(max_work_queues_size / nb_queues)}) - - # enable device and then queues - drv_dir.write_values({"bind": f"dsa{dsa_id}"}) - for q in range(nb_queues): - drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) - - -def main(args): - "Main function, does arg parsing and calls config function" - arg_p = argparse.ArgumentParser( - description="Configure whole DSA device instance for DPDK use") - arg_p.add_argument('dsa_id', - help="Specify DSA instance either via DSA instance number or PCI address") - arg_p.add_argument('-q', metavar='queues', type=int, default=255, - help="Number of queues to set up") - arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', - default="dpdk", - help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") - arg_p.add_argument('--reset', action='store_true', - help="Reset DSA device and its queues") - parsed_args = arg_p.parse_args(args[1:]) - - dsa_id = parsed_args.dsa_id - dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id - if parsed_args.reset: - reset_device(dsa_id) - else: - configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) - - -if __name__ == "__main__": - main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py new file mode 120000 index 0000000000..85545548d1 --- /dev/null +++ b/drivers/raw/ioat/dpdk_idxd_cfg.py @@ -0,0 +1 @@ +../../dma/idxd/dpdk_idxd_cfg.py \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
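The partitioning done by configure_dsa() in the script above can be tried without touching sysfs. The sketch below reproduces only the arithmetic (queue count, group count, per-queue group and size). The function name plan_queues, its return shape and the example limits are assumptions for illustration, not part of the patch.

```python
def plan_queues(queues, max_queues, max_groups, max_engines, max_wq_size,
                dsa_id=0, prefix="dpdk"):
    """Return (engine -> group, wq -> settings) as the script would write them."""
    # Never configure more queues than the device reports.
    nb_queues = min(queues, max_queues)
    # One engine per group, and no more groups than queues.
    nb_groups = min(max_engines, max_groups, nb_queues)

    engines = {f"engine{dsa_id}.{grp}": grp for grp in range(nb_groups)}

    wqs = {}
    for q in range(nb_queues):
        wqs[f"wq{dsa_id}.{q}"] = {
            "group_id": q % nb_groups,                 # round-robin over groups
            "name": f"{prefix}_wq{dsa_id}.{q}",
            "size": int(max_wq_size / nb_queues),      # equal share of WQ space
        }
    return engines, wqs


if __name__ == "__main__":
    # Example limits are invented; real values come from the device's sysfs files.
    engines, wqs = plan_queues(255, max_queues=8, max_groups=4,
                               max_engines=4, max_wq_size=128)
    for name, cfg in wqs.items():
        print(name, cfg)
```

Because the default for -q is 255, a plain invocation of the script ends up with every queue the device supports, spread round-robin over the groups.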
* [dpdk-dev] [PATCH v4 15/16] devbind: add dma device class 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (13 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add a new class for DMA devices. Devices listed under the DMA class are to be used with the dmadev library. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- usertools/dpdk-devbind.py | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 74d16e4c4b..8bb573f4b0 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -69,12 +69,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] +dma_devices = [] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] -misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, - intel_ntb_skx, intel_ntb_icx, +misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, + intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. 
@@ -583,6 +584,9 @@ def show_status(): if status_dev in ["crypto", "all"]: show_device_status(crypto_devices, "Crypto") + if status_dev in ["dma", "all"]: + show_device_status(dma_devices, "DMA") + if status_dev in ["event", "all"]: show_device_status(eventdev_devices, "Eventdev") @@ -651,7 +655,7 @@ def parse_args(): parser.add_argument( '--status-dev', help="Print the status of given device group.", - choices=['baseband', 'compress', 'crypto', 'event', + choices=['baseband', 'compress', 'crypto', 'dma', 'event', 'mempool', 'misc', 'net', 'regex']) bind_group = parser.add_mutually_exclusive_group() bind_group.add_argument( @@ -732,6 +736,7 @@ def do_arg_actions(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) @@ -754,6 +759,7 @@ def main(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v4 16/16] devbind: move idxd device ID to dmadev class 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz ` (14 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 15/16] devbind: add dma device class Kevin Laatz @ 2021-09-17 14:02 ` Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 14:02 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz The dmadev library is the preferred abstraction for using IDXD devices and will replace the rawdev implementation in future. This patch moves the IDXD device ID to the dmadev class. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- usertools/dpdk-devbind.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 8bb573f4b0..98b698ccc0 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -69,13 +69,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] -dma_devices = [] +dma_devices = [intel_idxd_spr] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, - intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, intel_ntb_icx, + intel_ioat_icx, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
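The visible effect of this patch is on which group a DSA device is reported under by --status-dev. The following is a deliberately simplified model of that lookup, assuming descriptors with only "Vendor" and "Device" keys. The helper group_of and the placeholder entry are hypothetical, and the real dpdk-devbind.py matching is more involved.

```python
# Vendor/device IDs for the DSA entry are the ones used in idxd_pci.c in this series.
IDXD_SPR = {"Vendor": "8086", "Device": "0b25"}
# Made-up placeholder standing in for the other misc entries.
OTHER_DEV = {"Vendor": "ffff", "Device": "0001"}


def group_of(vendor, device, groups):
    """Return the name of the first group listing this vendor/device pair."""
    for name, devices in groups.items():
        for entry in devices:
            if entry["Vendor"] == vendor and entry["Device"] == device:
                return name
    return None


# State after patch 15/16: the dma class exists but is empty.
before = {"dma": [], "misc": [OTHER_DEV, IDXD_SPR]}
# State after patch 16/16: the IDXD entry has moved to the dma class.
after = {"dma": [IDXD_SPR], "misc": [OTHER_DEV]}

if __name__ == "__main__":
    print(group_of("8086", "0b25", before))
    print(group_of("8086", "0b25", after))
```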
* [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (14 preceding siblings ...) 2021-09-17 14:02 ` [dpdk-dev] [PATCH v4 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 01/16] raw/ioat: only build if dmadev not present Kevin Laatz ` (15 more replies) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (5 subsequent siblings) 21 siblings, 16 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz This patchset adds a dmadev driver and associated documentation to support Intel Data Streaming Accelerator devices. This driver is intended to ultimately replace the current IDXD part of the IOAT rawdev driver. NOTE: This patchset has several dependencies: - v22 of the dmadev lib set [1] - v5 of the dmadev test suite [2] [1] http://patches.dpdk.org/project/dpdk/list/?series=18960 [2] http://patches.dpdk.org/project/dpdk/list/?series=19017 v5: * add missing toctree entry for idxd driver v4: * rebased on above patchsets * minor fixes based on review feedback v3: * rebased on above patchsets * added burst capacity API v2: * rebased on above patchsets * added API to check for device being idle * added devbind updates for DMA devices * fixed issue identified by internal coverity scan * other minor miscellaneous changes and fixes Bruce Richardson (1): raw/ioat: only build if dmadev not present Conor Walsh (1): dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz (14): dma/idxd: add skeleton for VFIO based DSA device dma/idxd: add bus device probing dma/idxd: create dmadev instances on bus probe dma/idxd: create dmadev instances on pci probe dma/idxd: add datapath structures dma/idxd: add configure and info_get functions dma/idxd: add 
start and stop functions for pci devices dma/idxd: add data-path job submission functions dma/idxd: add data-path job completion functions dma/idxd: add operation statistic tracking dma/idxd: add vchan status function dma/idxd: add burst capacity API devbind: add dma device class devbind: move idxd device ID to dmadev class MAINTAINERS | 10 + app/test/test_dmadev.c | 2 + doc/guides/dmadevs/idxd.rst | 262 +++++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/rawdevs/ioat.rst | 7 + doc/guides/rel_notes/release_21_11.rst | 5 + drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++ drivers/dma/idxd/idxd_bus.c | 378 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 616 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 131 ++++++ drivers/dma/idxd/idxd_internal.h | 108 +++++ drivers/dma/idxd/idxd_pci.c | 387 ++++++++++++++++ drivers/dma/idxd/meson.build | 14 + drivers/dma/idxd/version.map | 3 + drivers/dma/meson.build | 2 + drivers/meson.build | 2 +- drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +---- drivers/raw/ioat/meson.build | 23 +- usertools/dpdk-devbind.py | 12 +- 19 files changed, 2075 insertions(+), 124 deletions(-) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py create mode 100644 drivers/dma/idxd/idxd_bus.c create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_hw_defs.h create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v5 01/16] raw/ioat: only build if dmadev not present 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-20 10:15 ` Bruce Richardson 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz ` (14 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Bruce Richardson <bruce.richardson@intel.com> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not present. A not is also added to the documentation to inform users of this change. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v4: - Fix build issue - Add note in raw documentation to outline this change --- doc/guides/rawdevs/ioat.rst | 7 +++++++ drivers/meson.build | 2 +- drivers/raw/ioat/meson.build | 23 ++++++++++++++++++++--- 3 files changed, 28 insertions(+), 4 deletions(-) diff --git a/doc/guides/rawdevs/ioat.rst b/doc/guides/rawdevs/ioat.rst index a28e909935..4fc327f1a4 100644 --- a/doc/guides/rawdevs/ioat.rst +++ b/doc/guides/rawdevs/ioat.rst @@ -34,6 +34,13 @@ Compilation For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based. No additional compilation steps are necessary. +.. note:: + Since the addition of the DMAdev library, the ``ioat`` and ``idxd`` parts of this driver + will only be built if their ``DMAdev`` counterparts are not built. 
The following can be used + to disable the ``DMAdev`` drivers, if the raw drivers are to be used instead:: + + $ meson -Ddisable_drivers=dma/* <build_dir> + Device Setup ------------- diff --git a/drivers/meson.build b/drivers/meson.build index b7d680868a..27ff10a9fc 100644 --- a/drivers/meson.build +++ b/drivers/meson.build @@ -10,6 +10,7 @@ subdirs = [ 'common/qat', # depends on bus. 'common/sfc_efx', # depends on bus. 'mempool', # depends on common and bus. + 'dma', # depends on common and bus. 'net', # depends on common, bus, mempool 'raw', # depends on common, bus and net. 'crypto', # depends on common, bus and mempool (net in future). @@ -18,7 +19,6 @@ subdirs = [ 'vdpa', # depends on common, bus and mempool. 'event', # depends on common, bus, mempool and net. 'baseband', # depends on common and bus. - 'dma', # depends on common and bus. ] if meson.is_cross_build() diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build index 0e81cb5951..9be9d8cc65 100644 --- a/drivers/raw/ioat/meson.build +++ b/drivers/raw/ioat/meson.build @@ -2,14 +2,31 @@ # Copyright 2019 Intel Corporation build = dpdk_conf.has('RTE_ARCH_X86') +# only use ioat rawdev driver if we don't have the equivalent dmadev ones +if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') + build = false + subdir_done() +endif + reason = 'only supported on x86' sources = files( - 'idxd_bus.c', - 'idxd_pci.c', 'ioat_common.c', - 'ioat_rawdev.c', 'ioat_rawdev_test.c', ) + +if not dpdk_conf.has('RTE_DMA_IDXD') + sources += files( + 'idxd_bus.c', + 'idxd_pci.c', + ) +endif + +if not dpdk_conf.has('RTE_DMA_IOAT') + sources += files ( + 'ioat_rawdev.c', + ) +endif + deps += ['bus_pci', 'mbuf', 'rawdev'] headers = files( 'rte_ioat_rawdev.h', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
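The build conditions in the drivers/raw/ioat/meson.build hunk above are easier to check as a truth table. Below is a small Python model of that logic. It is only a sketch: the function name and the boolean parameters are assumptions standing in for the dpdk_conf.has() checks on RTE_ARCH_X86, RTE_DMA_IDXD and RTE_DMA_IOAT.

```python
def rawdev_ioat_plan(is_x86, has_dma_idxd, has_dma_ioat):
    """Return (build, sources) as the meson file would decide them."""
    build = is_x86
    # Both dmadev drivers present: the rawdev driver is skipped entirely.
    if has_dma_idxd and has_dma_ioat:
        return False, []

    # Files that are always part of the rawdev driver when it is built.
    sources = ["ioat_common.c", "ioat_rawdev_test.c"]
    # The IDXD half is added only if its dmadev counterpart is absent.
    if not has_dma_idxd:
        sources += ["idxd_bus.c", "idxd_pci.c"]
    # Likewise for the IOAT half.
    if not has_dma_ioat:
        sources += ["ioat_rawdev.c"]
    return build, sources


if __name__ == "__main__":
    for idxd in (False, True):
        for ioat in (False, True):
            print(idxd, ioat, rawdev_ioat_plan(True, idxd, ioat))
```

Note that the checks only work if the dma drivers are processed before the raw ones, which appears to be why the same patch moves 'dma' ahead of 'raw' in drivers/meson.build.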
* Re: [dpdk-dev] [PATCH v5 01/16] raw/ioat: only build if dmadev not present 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 01/16] raw/ioat: only build if dmadev not present Kevin Laatz @ 2021-09-20 10:15 ` Bruce Richardson 0 siblings, 0 replies; 243+ messages in thread From: Bruce Richardson @ 2021-09-20 10:15 UTC (permalink / raw) To: Kevin Laatz; +Cc: dev, fengchengwen, jerinj, conor.walsh On Fri, Sep 17, 2021 at 03:24:22PM +0000, Kevin Laatz wrote: > From: Bruce Richardson <bruce.richardson@intel.com> > > Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not > present. > > A not is also added to the documentation to inform users of this change. typo: "note" It would also be worthwhile mentioning in the commit log that the order of dependencies is changed so that dmadev comes before rawdev. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > > --- > v4: > - Fix build issue > - Add note in raw documentation to outline this change > --- > doc/guides/rawdevs/ioat.rst | 7 +++++++ > drivers/meson.build | 2 +- > drivers/raw/ioat/meson.build | 23 ++++++++++++++++++++--- > 3 files changed, 28 insertions(+), 4 deletions(-) > > diff --git a/doc/guides/rawdevs/ioat.rst b/doc/guides/rawdevs/ioat.rst > index a28e909935..4fc327f1a4 100644 > --- a/doc/guides/rawdevs/ioat.rst > +++ b/doc/guides/rawdevs/ioat.rst > @@ -34,6 +34,13 @@ Compilation > For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based. > No additional compilation steps are necessary. > > +.. note:: > + Since the addition of the DMAdev library, the ``ioat`` and ``idxd`` parts of this driver > + will only be built if their ``DMAdev`` counterparts are not built. The following can be used > + to disable the ``DMAdev`` drivers, if the raw drivers are to be used instead:: > + Suggest where possible to split lines on punctuation. 
Put a line break after the "." at the end of the first sentence. Similarly if breaking lines, try and do so after commas. > + $ meson -Ddisable_drivers=dma/* <build_dir> > + > Device Setup > ------------- > > diff --git a/drivers/meson.build b/drivers/meson.build > index b7d680868a..27ff10a9fc 100644 > --- a/drivers/meson.build > +++ b/drivers/meson.build > @@ -10,6 +10,7 @@ subdirs = [ > 'common/qat', # depends on bus. > 'common/sfc_efx', # depends on bus. > 'mempool', # depends on common and bus. > + 'dma', # depends on common and bus. > 'net', # depends on common, bus, mempool > 'raw', # depends on common, bus and net. > 'crypto', # depends on common, bus and mempool (net in future). > @@ -18,7 +19,6 @@ subdirs = [ > 'vdpa', # depends on common, bus and mempool. > 'event', # depends on common, bus, mempool and net. > 'baseband', # depends on common and bus. > - 'dma', # depends on common and bus. > ] > As stated above, I think the reason for this change should be noted in the commit log. 
> if meson.is_cross_build() > diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build > index 0e81cb5951..9be9d8cc65 100644 > --- a/drivers/raw/ioat/meson.build > +++ b/drivers/raw/ioat/meson.build > @@ -2,14 +2,31 @@ > # Copyright 2019 Intel Corporation > > build = dpdk_conf.has('RTE_ARCH_X86') > +# only use ioat rawdev driver if we don't have the equivalent dmadev ones > +if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') > + build = false > + subdir_done() > +endif > + > reason = 'only supported on x86' > sources = files( > - 'idxd_bus.c', > - 'idxd_pci.c', > 'ioat_common.c', > - 'ioat_rawdev.c', > 'ioat_rawdev_test.c', > ) > + > +if not dpdk_conf.has('RTE_DMA_IDXD') > + sources += files( > + 'idxd_bus.c', > + 'idxd_pci.c', > + ) > +endif > + > +if not dpdk_conf.has('RTE_DMA_IOAT') > + sources += files ( > + 'ioat_rawdev.c', > + ) > +endif > + > deps += ['bus_pci', 'mbuf', 'rawdev'] > headers = files( > 'rte_ioat_rawdev.h', > -- > 2.30.2 > ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v5 02/16] dma/idxd: add skeleton for VFIO based DSA device 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 01/16] raw/ioat: only build if dmadev not present Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-20 10:23 ` Bruce Richardson 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 03/16] dma/idxd: add bus device probing Kevin Laatz ` (13 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probe/remove skeleton code for DSA device bound to the vfio pci driver. Relevant documentation and MAINTAINERS update also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v5: add missing toctree entry for idxd driver --- MAINTAINERS | 10 +++++ doc/guides/dmadevs/idxd.rst | 58 ++++++++++++++++++++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/rel_notes/release_21_11.rst | 5 +++ drivers/dma/idxd/idxd_internal.h | 27 ++++++++++++ drivers/dma/idxd/idxd_pci.c | 55 ++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 11 +++++ drivers/dma/idxd/version.map | 3 ++ drivers/dma/meson.build | 2 + 9 files changed, 173 insertions(+) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 3258da194d..9cb59b831d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1200,6 +1200,16 @@ F: doc/guides/compressdevs/zlib.rst F: doc/guides/compressdevs/features/zlib.ini +DMAdev Drivers +-------------- + +Intel IDXD - EXPERIMENTAL +M: Bruce Richardson <bruce.richardson@intel.com> 
+M: Kevin Laatz <kevin.laatz@intel.com> +F: drivers/dma/idxd/ +F: doc/guides/dmadevs/idxd.rst + + RegEx Drivers ------------- diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst new file mode 100644 index 0000000000..924700d17e --- /dev/null +++ b/doc/guides/dmadevs/idxd.rst @@ -0,0 +1,58 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +.. include:: <isonum.txt> + +IDXD DMA Device Driver +====================== + +The ``idxd`` dmadev driver provides a poll-mode driver (PMD) for Intel\ |reg| +Data Streaming Accelerator `(Intel DSA) +<https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html>`_. +This PMD can be used in conjunction with Intel\ |reg| DSA devices to offload +data operations, such as data copies, to hardware, freeing up CPU cycles for +other tasks. + +Hardware Requirements +---------------------- + +The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the +presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma`` +will show all the DMA devices on the system, including IDXD supported devices. +Intel\ |reg| DSA devices, are currently (at time of writing) appearing +as devices with type “0b25”, due to the absence of pci-id database entries for +them at this point. + +Compilation +------------ + +For builds using ``meson`` and ``ninja``, the driver will be built when the +target platform is x86-based. No additional compilation steps are necessary. + +Device Setup +------------- + +Devices using VFIO/UIO drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The HW devices to be used will need to be bound to a user-space IO driver for use. +The ``dpdk-devbind.py`` script can be used to view the state of the devices +and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``. 
+For example:: + + $ dpdk-devbind.py -b vfio-pci 6a:01.0 + +Device Probing and Initialization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will +be found as part of the device scan done at application initialization time without +the need to pass parameters to the application. + +For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the +maximum number of workqueues available on it, partitioning all resources equally +among the queues. +If fewer workqueues are required, then the ``max_queues`` parameter may be passed to +the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: + + $ dpdk-test -a <b:d:f>,max_queues=4 diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst index 0bce29d766..5d4abf880e 100644 --- a/doc/guides/dmadevs/index.rst +++ b/doc/guides/dmadevs/index.rst @@ -10,3 +10,5 @@ an application through DMA API. .. toctree:: :maxdepth: 2 :numbered: + + idxd diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index a71853b9c3..c0bfd9c1ba 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -92,6 +92,11 @@ New Features * Device allocation and it's multi-process support. * Control and data plane functions. +* **Added IDXD dmadev driver implementation.** + + The IDXD dmadev driver provide device drivers for the Intel DSA devices. + This device driver can be used through the generic dmadev API. 
+ Removed Items ------------- diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h new file mode 100644 index 0000000000..c6a7dcd72f --- /dev/null +++ b/drivers/dma/idxd/idxd_internal.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_INTERNAL_H_ +#define _IDXD_INTERNAL_H_ + +/** + * @file idxd_internal.h + * + * Internal data structures for the idxd/DSA driver for dmadev + * + * @warning + * @b EXPERIMENTAL: these structures and APIs may change without prior notice + */ + +extern int idxd_pmd_logtype; + +#define IDXD_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \ + idxd_pmd_logtype, "IDXD: %s(): " fmt "\n", __func__, ##args) + +#define IDXD_PMD_DEBUG(fmt, args...) IDXD_PMD_LOG(DEBUG, fmt, ## args) +#define IDXD_PMD_INFO(fmt, args...) IDXD_PMD_LOG(INFO, fmt, ## args) +#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) +#define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) + +#endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c new file mode 100644 index 0000000000..79e4aadcab --- /dev/null +++ b/drivers/dma/idxd/idxd_pci.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <rte_bus_pci.h> + +#include "idxd_internal.h" + +#define IDXD_VENDOR_ID 0x8086 +#define IDXD_DEVICE_ID_SPR 0x0B25 + +#define IDXD_PMD_DMADEV_NAME_PCI dmadev_idxd_pci + +const struct rte_pci_id pci_id_idxd_map[] = { + { RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static int +idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + int ret = 0; + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + dev->device.driver = &drv->driver; + + return ret; +} + +static 
int +idxd_dmadev_remove_pci(struct rte_pci_device *dev) +{ + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + + IDXD_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + + return 0; +} + +struct rte_pci_driver idxd_pmd_drv_pci = { + .id_table = pci_id_idxd_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = idxd_dmadev_probe_pci, + .remove = idxd_dmadev_remove_pci, +}; + +RTE_PMD_REGISTER_PCI(IDXD_PMD_DMADEV_NAME_PCI, idxd_pmd_drv_pci); +RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_DMADEV_NAME_PCI, pci_id_idxd_map); +RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_DMADEV_NAME_PCI, "vfio-pci"); +RTE_PMD_REGISTER_PARAM_STRING(dmadev_idxd_pci, "max_queues=0"); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build new file mode 100644 index 0000000000..9a64d75005 --- /dev/null +++ b/drivers/dma/idxd/meson.build @@ -0,0 +1,11 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2021 Intel Corporation + +if is_windows + subdir_done() +endif + +deps += ['bus_pci'] +sources = files( + 'idxd_pci.c' +) \ No newline at end of file diff --git a/drivers/dma/idxd/version.map b/drivers/dma/idxd/version.map new file mode 100644 index 0000000000..4a76d1d52d --- /dev/null +++ b/drivers/dma/idxd/version.map @@ -0,0 +1,3 @@ +DPDK_21 { + local: *; +}; diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build index d9c7ede32f..411be7a240 100644 --- a/drivers/dma/meson.build +++ b/drivers/dma/meson.build @@ -2,5 +2,7 @@ # Copyright 2021 HiSilicon Limited drivers = [ + 'idxd', 'skeleton', ] +std_deps = ['dmadev'] -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
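Before binding anything to vfio-pci it can be useful to check which PCI addresses would match the ID table in idxd_pci.c. The sketch below scans sysfs for the vendor/device pair used in the patch (0x8086 / 0x0B25). The function name and the base_dir parameter are assumptions for illustration, and it relies on the per-device "vendor" and "device" files that Linux exposes under /sys/bus/pci/devices.

```python
import os

# Values copied from the pci_id_idxd_map table in the patch.
IDXD_VENDOR_ID = 0x8086
IDXD_DEVICE_ID_SPR = 0x0B25


def find_idxd_pci(base_dir="/sys/bus/pci/devices"):
    """Return the sorted PCI addresses whose IDs match the IDXD table."""
    found = []
    for addr in sorted(os.listdir(base_dir)):
        dev_dir = os.path.join(base_dir, addr)
        try:
            # Each file holds a hex string such as "0x8086".
            with open(os.path.join(dev_dir, "vendor")) as f:
                vendor = int(f.read().strip(), 16)
            with open(os.path.join(dev_dir, "device")) as f:
                device = int(f.read().strip(), 16)
        except (OSError, ValueError):
            # Unreadable or malformed entry: skip it rather than fail.
            continue
        if (vendor, device) == (IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR):
            found.append(addr)
    return found
```

The addresses returned are in the form expected by dpdk-devbind.py -b and by the EAL -a option.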
* Re: [dpdk-dev] [PATCH v5 02/16] dma/idxd: add skeleton for VFIO based DSA device
  2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz
@ 2021-09-20 10:23 ` Bruce Richardson
  0 siblings, 0 replies; 243+ messages in thread
From: Bruce Richardson @ 2021-09-20 10:23 UTC (permalink / raw)
  To: Kevin Laatz; +Cc: dev, fengchengwen, jerinj, conor.walsh

On Fri, Sep 17, 2021 at 03:24:23PM +0000, Kevin Laatz wrote:
> Add the basic device probe/remove skeleton code for DSA device bound to
> the vfio pci driver. Relevant documentation and MAINTAINERS update also
> included.
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> Reviewed-by: Conor Walsh <conor.walsh@intel.com>
>
<snip>
> --- /dev/null
> +++ b/drivers/dma/idxd/meson.build
> @@ -0,0 +1,11 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2021 Intel Corporation
> +
> +if is_windows
> + subdir_done()
> +endif
> +
> +deps += ['bus_pci']
> +sources = files(
> + 'idxd_pci.c'
> +)
> \ No newline at end of file

If doing a v6, this should be fixed to have a newline at end.

/Bruce

^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v5 03/16] dma/idxd: add bus device probing 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 01/16] raw/ioat: only build if dmadev not present Kevin Laatz 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz ` (12 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probing for DSA devices bound to the IDXD kernel driver. These devices can be configured via sysfs and made available to DPDK if they are found during bus scan. Relevant documentation is included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v4: fix 'vdev' naming, changed to 'bus' --- doc/guides/dmadevs/idxd.rst | 64 +++++++ drivers/dma/idxd/idxd_bus.c | 351 +++++++++++++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 3 files changed, 416 insertions(+) create mode 100644 drivers/dma/idxd/idxd_bus.c diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 924700d17e..ce33e2857a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -32,6 +32,56 @@ target platform is x86-based. No additional compilation steps are necessary. Device Setup ------------- +Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers, +such as ``vfio-pci``. Both are supported by the IDXD PMD. 
+ +Intel\ |reg| DSA devices using IDXD kernel driver +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured. +The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration. + +.. note:: + The device configuration can also be done by directly interacting with the sysfs nodes. + An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py`` + included in the driver source directory. + +There are some mandatory configuration steps before being able to use a device with an application. +The internal engines, which do the copies or other operations, +and the work-queues, which are used by applications to assign work to the device, +need to be assigned to groups, and the various other configuration options, +such as priority or queue depth, need to be set for each queue. + +To assign an engine to a group:: + + $ accel-config config-engine dsa0/engine0.0 --group-id=0 + $ accel-config config-engine dsa0/engine0.1 --group-id=1 + +To assign work queues to groups for passing descriptors to the engines, a similar accel-config command can be used. +However, the work queues also need to be configured depending on the use case. +Some configuration options include: + +* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously. +* priority: WQ priority between 1 and 15. Larger value means higher priority. +* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device. +* type: WQ type (kernel/mdev/user). Determines how the device is presented. +* name: identifier given to the WQ. 
+ +Example configuration for a work queue:: + + $ accel-config config-wq dsa0/wq0.0 --group-id=0 \ + --mode=dedicated --priority=10 --wq-size=8 \ + --type=user --name=dpdk_app1 + +Once the devices have been configured, they need to be enabled:: + + $ accel-config enable-device dsa0 + $ accel-config enable-wq dsa0/wq0.0 + +Check the device configuration:: + + $ accel-config list + Devices using VFIO/UIO drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -56,3 +106,17 @@ If fewer workqueues are required, then the ``max_queues`` parameter may be passe the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: $ dpdk-test -a <b:d:f>,max_queues=4 + +For devices bound to the IDXD kernel driver, +the DPDK IDXD driver will automatically perform a scan for available workqueues +to use. Any workqueues found listed in ``/dev/dsa`` on the system will be checked +in ``/sys``, and any which have ``dpdk_`` prefix in their name will be automatically +probed by the driver to make them available to the application. +Alternatively, to support use by multiple DPDK processes simultaneously, +the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue +name prefix, instead of ``dpdk_``, allowing each DPDK application instance to only +use a subset of configured queues. + +Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, +that is a "DMA device type" inside DPDK, and can be accessed using APIs from the +``rte_dmadev`` library. 
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c new file mode 100644 index 0000000000..ef589af30e --- /dev/null +++ b/drivers/dma/idxd/idxd_bus.c @@ -0,0 +1,351 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <dirent.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/mman.h> +#include <libgen.h> + +#include <rte_bus.h> +#include <rte_log.h> +#include <rte_dmadev_pmd.h> +#include <rte_string_fns.h> + +#include "idxd_internal.h" + +/* default value for DSA paths, but allow override in environment for testing */ +#define DSA_DEV_PATH "/dev/dsa" +#define DSA_SYSFS_PATH "/sys/bus/dsa/devices" + +static unsigned int devcount; + +/** unique identifier for a DSA device/WQ instance */ +struct dsa_wq_addr { + uint16_t device_id; + uint16_t wq_id; +}; + +/** a DSA device instance */ +struct rte_dsa_device { + struct rte_device device; /**< Inherit core device */ + TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */ + + char wq_name[32]; /**< the workqueue name/number e.g. 
wq0.1 */ + struct dsa_wq_addr addr; /**< Identifies the specific WQ */ +}; + +/* forward prototypes */ +struct dsa_bus; +static int dsa_scan(void); +static int dsa_probe(void); +static struct rte_device *dsa_find_device(const struct rte_device *start, + rte_dev_cmp_t cmp, const void *data); +static enum rte_iova_mode dsa_get_iommu_class(void); +static int dsa_addr_parse(const char *name, void *addr); + +/** List of devices */ +TAILQ_HEAD(dsa_device_list, rte_dsa_device); + +/** + * Structure describing the DSA bus + */ +struct dsa_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_driver driver; /**< Driver struct for devices to point to */ + struct dsa_device_list device_list; /**< List of PCI devices */ +}; + +struct dsa_bus dsa_bus = { + .bus = { + .scan = dsa_scan, + .probe = dsa_probe, + .find_device = dsa_find_device, + .get_iommu_class = dsa_get_iommu_class, + .parse = dsa_addr_parse, + }, + .driver = { + .name = "dmadev_idxd" + }, + .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list), +}; + +static inline const char * +dsa_get_dev_path(void) +{ + const char *path = getenv("DSA_DEV_PATH"); + return path ? path : DSA_DEV_PATH; +} + +static inline const char * +dsa_get_sysfs_path(void) +{ + const char *path = getenv("DSA_SYSFS_PATH"); + return path ? 
path : DSA_SYSFS_PATH; +} + +static void * +idxd_bus_mmap_wq(struct rte_dsa_device *dev) +{ + void *addr; + char path[PATH_MAX]; + int fd; + + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); + fd = open(path, O_RDWR); + if (fd < 0) { + IDXD_PMD_ERR("Failed to open device path: %s", path); + return NULL; + } + + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + IDXD_PMD_ERR("Failed to mmap device %s", path); + return NULL; + } + + return addr; +} + +static int +read_wq_string(struct rte_dsa_device *dev, const char *filename, + char *value, size_t valuelen) +{ + char sysfs_node[PATH_MAX]; + int len; + int fd; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + fd = open(sysfs_node, O_RDONLY); + if (fd < 0) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + len = read(fd, value, valuelen - 1); + close(fd); + if (len < 0) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + value[len] = '\0'; + return 0; +} + +static int +read_wq_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +read_device_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s", + 
dsa_get_sysfs_path(), dev->addr.device_id, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +idxd_probe_dsa(struct rte_dsa_device *dev) +{ + struct idxd_dmadev idxd = {0}; + int ret = 0; + + IDXD_PMD_INFO("Probing device %s on numa node %d", + dev->wq_name, dev->device.numa_node); + if (read_wq_int(dev, "size", &ret) < 0) + return -1; + idxd.max_batches = ret; + if (read_wq_int(dev, "max_batch_size", &ret) < 0) + return -1; + idxd.max_batch_size = ret; + idxd.qid = dev->addr.wq_id; + idxd.sva_support = 1; + + idxd.portal = idxd_bus_mmap_wq(dev); + if (idxd.portal == NULL) { + IDXD_PMD_ERR("WQ mmap failed"); + return -ENOENT; + } + + return 0; +} + +static int +is_for_this_process_use(const char *name) +{ + char *runtime_dir = strdup(rte_eal_get_runtime_dir()); + char *prefix = basename(runtime_dir); + int prefixlen = strlen(prefix); + int retval = 0; + + if (strncmp(name, "dpdk_", 5) == 0) + retval = 1; + if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_') + retval = 1; + + free(runtime_dir); + return retval; +} + +static int +dsa_probe(void) +{ + struct rte_dsa_device *dev; + + TAILQ_FOREACH(dev, &dsa_bus.device_list, next) { + char type[64], name[64]; + + if (read_wq_string(dev, "type", type, sizeof(type)) < 0 || + read_wq_string(dev, "name", name, sizeof(name)) < 0) + continue; + + if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) { + dev->device.driver = &dsa_bus.driver; + idxd_probe_dsa(dev); + continue; + } + IDXD_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name); + } + + return 0; +} + +static int +dsa_scan(void) +{ + const char *path = dsa_get_dev_path(); + struct dirent *wq; + DIR *dev_dir; + + dev_dir 
= opendir(path); + if (dev_dir == NULL) { + if (errno == ENOENT) + return 0; /* no bus, return without error */ + IDXD_PMD_ERR("%s(): opendir '%s' failed: %s", + __func__, path, strerror(errno)); + return -1; + } + + while ((wq = readdir(dev_dir)) != NULL) { + struct rte_dsa_device *dev; + int numa_node = -1; + + if (strncmp(wq->d_name, "wq", 2) != 0) + continue; + if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) { + IDXD_PMD_ERR("%s(): wq name too long: '%s', skipping", + __func__, wq->d_name); + continue; + } + IDXD_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name); + + dev = malloc(sizeof(*dev)); + if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) { + IDXD_PMD_ERR("Error parsing WQ name: %s", wq->d_name); + free(dev); + continue; + } + dev->device.bus = &dsa_bus.bus; + strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name)); + TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next); + devcount++; + + read_device_int(dev, "numa_node", &numa_node); + dev->device.numa_node = numa_node; + dev->device.name = dev->wq_name; + } + + closedir(dev_dir); + return 0; +} + +static struct rte_device * +dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list); + + /* the rte_device struct must be at start of dsa structure */ + RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0); + + if (start != NULL) /* jump to start point if given */ + dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next); + while (dev != NULL) { + if (cmp(&dev->device, data) == 0) + return &dev->device; + dev = TAILQ_NEXT(dev, next); + } + return NULL; +} + +static enum rte_iova_mode +dsa_get_iommu_class(void) +{ + /* if there are no devices, report don't care, otherwise VA mode */ + return devcount > 0 ? 
RTE_IOVA_VA : RTE_IOVA_DC; +} + +static int +dsa_addr_parse(const char *name, void *addr) +{ + struct dsa_wq_addr *wq = addr; + unsigned int device_id, wq_id; + + if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) { + IDXD_PMD_DEBUG("Parsing WQ name failed: %s", name); + return -1; + } + + wq->device_id = device_id; + wq->wq_id = wq_id; + return 0; +} + +RTE_REGISTER_BUS(dsa, dsa_bus.bus); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 9a64d75005..c864fce3b3 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -7,5 +7,6 @@ endif deps += ['bus_pci'] sources = files( + 'idxd_bus.c', 'idxd_pci.c' ) \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
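For anyone following the probe logic in this patch, the two name checks can be tried in isolation. The sketch below is a standalone approximation, not the driver code: `wq_addr`, `parse_wq_name` and `wq_name_matches` are hypothetical names, and the file prefix is passed in as a plain string rather than being derived from the EAL runtime directory. It also rejects an empty prefix, which is an added assumption and not something the patch does.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the driver's struct dsa_wq_addr. */
struct wq_addr {
	uint16_t device_id;
	uint16_t wq_id;
};

/* Parse a node name such as "wq0.1"; returns 0 on success, -1 otherwise. */
static int
parse_wq_name(const char *name, struct wq_addr *addr)
{
	unsigned int device_id, wq_id;

	/* both the device number and the queue number must be present */
	if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2)
		return -1;

	addr->device_id = (uint16_t)device_id;
	addr->wq_id = (uint16_t)wq_id;
	return 0;
}

/*
 * Return 1 if the configured WQ name starts with "dpdk_" or with
 * "<file_prefix>_", 0 otherwise.
 * Assumption: an empty file_prefix never matches.
 */
static int
wq_name_matches(const char *wq_name, const char *file_prefix)
{
	size_t prefixlen = strlen(file_prefix);

	if (strncmp(wq_name, "dpdk_", 5) == 0)
		return 1;
	if (prefixlen > 0 && strncmp(wq_name, file_prefix, prefixlen) == 0 &&
			wq_name[prefixlen] == '_')
		return 1;
	return 0;
}
```

With these helpers, a queue configured with `--name=dpdk_app1` matches for any process, while a name such as `myapp_q0` matches only when the process's file prefix is `myapp`.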
* [dpdk-dev] [PATCH v5 04/16] dma/idxd: create dmadev instances on bus probe 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (2 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 03/16] dma/idxd: add bus device probing Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-22 2:04 ` fengchengwen 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz ` (11 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the bus scan/probe, create a dmadev instance for each HW queue. Internal structures required for device creation are also added. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v4: - fix 'vdev' naming, changed to 'bus' - rebase changes --- drivers/dma/idxd/idxd_bus.c | 19 ++++++++ drivers/dma/idxd/idxd_common.c | 76 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 40 +++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 4 files changed, 136 insertions(+) create mode 100644 drivers/dma/idxd/idxd_common.c diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index ef589af30e..b48fa954ed 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -85,6 +85,18 @@ dsa_get_sysfs_path(void) return path ? 
path : DSA_SYSFS_PATH; } +static int +idxd_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->data->dev_private; + munmap(idxd->portal, 0x1000); + return 0; +} + +static const struct rte_dma_dev_ops idxd_bus_ops = { + .dev_close = idxd_dev_close, +}; + static void * idxd_bus_mmap_wq(struct rte_dsa_device *dev) { @@ -206,6 +218,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -1; idxd.max_batch_size = ret; idxd.qid = dev->addr.wq_id; + idxd.u.bus.dsa_id = dev->addr.device_id; idxd.sva_support = 1; idxd.portal = idxd_bus_mmap_wq(dev); @@ -214,6 +227,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -ENOENT; } + ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_bus_ops); + if (ret) { + IDXD_PMD_ERR("Failed to create rawdev %s", dev->wq_name); + return ret; + } + return 0; } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c new file mode 100644 index 0000000000..8afad637fc --- /dev/null +++ b/drivers/dma/idxd/idxd_common.c @@ -0,0 +1,76 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> +#include <rte_common.h> + +#include "idxd_internal.h" + +#define IDXD_PMD_NAME_STR "dmadev_idxd" + +int +idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, + const struct rte_dma_dev_ops *ops) +{ + struct idxd_dmadev *idxd; + struct rte_dma_dev *dmadev = NULL; + int ret = 0; + + if (!name) { + IDXD_PMD_ERR("Invalid name of the device!"); + ret = -EINVAL; + goto cleanup; + } + + /* Allocate device structure */ + dmadev = rte_dma_pmd_allocate(name, dev->numa_node, + sizeof(dmadev->dev_private)); + if (dmadev == NULL) { + IDXD_PMD_ERR("Unable to allocate raw device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->dev_ops = ops; + dmadev->device = dev; + + idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); + if (idxd == NULL) { + 
IDXD_PMD_ERR("Unable to allocate memory for device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->data->dev_private = idxd; + dmadev->dev_private = idxd; + *idxd = *base_idxd; /* copy over the main fields already passed in */ + idxd->dmadev = dmadev; + + /* allocate batch index ring and completion ring. + * The +1 is because we can never fully use + * the ring, otherwise read == write means both full and empty. + */ + idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) + + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1), + sizeof(idxd->batch_comp_ring[0])); + if (idxd->batch_comp_ring == NULL) { + IDXD_PMD_ERR("Unable to reserve memory for batch data\n"); + ret = -ENOMEM; + goto cleanup; + } + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1]; + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring); + + return 0; + +cleanup: + if (dmadev) + rte_dma_pmd_release(name); + + return ret; +} + +int idxd_pmd_logtype; + +RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index c6a7dcd72f..fa6f053f72 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -24,4 +24,44 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) 
IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_dmadev { + /* counters to track the batches */ + unsigned short max_batches; + unsigned short batch_idx_read; + unsigned short batch_idx_write; + + /* track descriptors and handles */ + unsigned short desc_ring_mask; + unsigned short ids_avail; /* handles for ops completed */ + unsigned short ids_returned; /* the read pointer for hdls/desc rings */ + unsigned short batch_start; /* start+size == write pointer for hdls/desc */ + unsigned short batch_size; + + void *portal; /* address to write the batch descriptor */ + + struct idxd_completion *batch_comp_ring; + unsigned short *batch_idx_ring; /* store where each batch ends */ + + struct rte_dma_stats stats; + + rte_iova_t batch_iova; /* base address of the batch comp ring */ + rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ + + unsigned short max_batch_size; + + struct rte_dma_dev *dmadev; + struct rte_dma_vchan_conf qcfg; + uint8_t sva_support; + uint8_t qid; + + union { + struct { + unsigned int dsa_id; + } bus; + } u; +}; + +int idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); + #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index c864fce3b3..36dbd3e518 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -8,5 +8,6 @@ endif deps += ['bus_pci'] sources = files( 'idxd_bus.c', + 'idxd_common.c', 'idxd_pci.c' ) \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
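The comment in idxd_dmadev_create() about the extra ring slot can be illustrated with a small standalone ring. This is only a sketch under stated assumptions: `toy_ring`, `ring_init`, `ring_push` and `ring_pop` are hypothetical names, and plain calloc() stands in for rte_zmalloc(). With max_entries + 1 slots, read == write always means "empty", and the ring is treated as full when advancing the write index would make it equal to the read index, so max_entries entries stay usable.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical ring with one spare slot to tell "full" from "empty". */
struct toy_ring {
	unsigned short *slots;
	unsigned short nb_slots; /* usable entries + 1 */
	unsigned short read;     /* next slot to pop */
	unsigned short write;    /* next slot to fill */
};

static int
ring_init(struct toy_ring *r, unsigned short max_entries)
{
	if (max_entries == USHRT_MAX)
		return -1; /* max_entries + 1 would not fit in nb_slots */

	r->nb_slots = max_entries + 1;
	r->read = 0;
	r->write = 0;
	r->slots = calloc(r->nb_slots, sizeof(r->slots[0]));
	return r->slots == NULL ? -1 : 0;
}

static int
ring_empty(const struct toy_ring *r)
{
	return r->read == r->write;
}

static int
ring_full(const struct toy_ring *r)
{
	/* full when one more push would make write catch up with read */
	return (r->write + 1) % r->nb_slots == r->read;
}

static int
ring_push(struct toy_ring *r, unsigned short value)
{
	if (ring_full(r))
		return -1;
	r->slots[r->write] = value;
	r->write = (r->write + 1) % r->nb_slots;
	return 0;
}

static int
ring_pop(struct toy_ring *r, unsigned short *value)
{
	if (ring_empty(r))
		return -1;
	*value = r->slots[r->read];
	r->read = (r->read + 1) % r->nb_slots;
	return 0;
}
```

Without the spare slot, a ring holding max_entries items would again have read == write, and a separate count or flag would be needed to distinguish that from the empty case.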
* Re: [dpdk-dev] [PATCH v5 04/16] dma/idxd: create dmadev instances on bus probe 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-09-22 2:04 ` fengchengwen 2021-09-22 9:12 ` Kevin Laatz 0 siblings, 1 reply; 243+ messages in thread From: fengchengwen @ 2021-09-22 2:04 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, jerinj, conor.walsh On 2021/9/17 23:24, Kevin Laatz wrote: > When a suitable device is found during the bus scan/probe, create a dmadev > instance for each HW queue. Internal structures required for device > creation are also added. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > > --- > v4: > - fix 'vdev' naming, changed to 'bus' > - rebase changes > --- > drivers/dma/idxd/idxd_bus.c | 19 ++++++++ > drivers/dma/idxd/idxd_common.c | 76 ++++++++++++++++++++++++++++++++ > drivers/dma/idxd/idxd_internal.h | 40 +++++++++++++++++ > drivers/dma/idxd/meson.build | 1 + > 4 files changed, 136 insertions(+) > create mode 100644 drivers/dma/idxd/idxd_common.c > > diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c > index ef589af30e..b48fa954ed 100644 > --- a/drivers/dma/idxd/idxd_bus.c > +++ b/drivers/dma/idxd/idxd_bus.c > @@ -85,6 +85,18 @@ dsa_get_sysfs_path(void) > return path ? 
path : DSA_SYSFS_PATH; > } > > +static int > +idxd_dev_close(struct rte_dma_dev *dev) > +{ > + struct idxd_dmadev *idxd = dev->data->dev_private; > + munmap(idxd->portal, 0x1000); > + return 0; > +} > + > +static const struct rte_dma_dev_ops idxd_bus_ops = { > + .dev_close = idxd_dev_close, > +}; > + > static void * > idxd_bus_mmap_wq(struct rte_dsa_device *dev) > { > @@ -206,6 +218,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev) > return -1; > idxd.max_batch_size = ret; > idxd.qid = dev->addr.wq_id; > + idxd.u.bus.dsa_id = dev->addr.device_id; > idxd.sva_support = 1; > > idxd.portal = idxd_bus_mmap_wq(dev); > @@ -214,6 +227,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev) > return -ENOENT; > } > > + ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_bus_ops); > + if (ret) { > + IDXD_PMD_ERR("Failed to create rawdev %s", dev->wq_name); > + return ret; > + } > + > return 0; > } > > diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c > new file mode 100644 > index 0000000000..8afad637fc > --- /dev/null > +++ b/drivers/dma/idxd/idxd_common.c > @@ -0,0 +1,76 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright 2021 Intel Corporation > + */ > + > +#include <rte_dmadev_pmd.h> > +#include <rte_malloc.h> > +#include <rte_common.h> > + > +#include "idxd_internal.h" > + > +#define IDXD_PMD_NAME_STR "dmadev_idxd" > + > +int > +idxd_dmadev_create(const char *name, struct rte_device *dev, > + const struct idxd_dmadev *base_idxd, > + const struct rte_dma_dev_ops *ops) > +{ > + struct idxd_dmadev *idxd; > + struct rte_dma_dev *dmadev = NULL; > + int ret = 0; > + > + if (!name) { > + IDXD_PMD_ERR("Invalid name of the device!"); > + ret = -EINVAL; > + goto cleanup; > + } > + > + /* Allocate device structure */ > + dmadev = rte_dma_pmd_allocate(name, dev->numa_node, > + sizeof(dmadev->dev_private)); > + if (dmadev == NULL) { > + IDXD_PMD_ERR("Unable to allocate raw device"); > + ret = -ENOMEM; > + goto cleanup; > + } > + 
dmadev->dev_ops = ops;
> + dmadev->device = dev;
> +
> + idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node);
> + if (idxd == NULL) {
> + IDXD_PMD_ERR("Unable to allocate memory for device");
> + ret = -ENOMEM;
> + goto cleanup;
> + }
> + dmadev->data->dev_private = idxd;
> + dmadev->dev_private = idxd;

dmadev->dev_private and dmadev->data->dev_private are already initialized by
rte_dma_pmd_allocate(), so the driver only needs to pass in the correct parameters.

Recommended:
dmadev = rte_dma_pmd_allocate(name, dev->numa_node, sizeof(struct idxd_dmadev));

> + *idxd = *base_idxd; /* copy over the main fields already passed in */
> + idxd->dmadev = dmadev;
> +
> + /* allocate batch index ring and completion ring.
> + * The +1 is because we can never fully use
> + * the ring, otherwise read == write means both full and empty.
> + */
> + idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) +
> + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1),
> + sizeof(idxd->batch_comp_ring[0]));
> + if (idxd->batch_comp_ring == NULL) {
> + IDXD_PMD_ERR("Unable to reserve memory for batch data\n");
> + ret = -ENOMEM;
> + goto cleanup;
> + }
> + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1];
> + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring);
> +

Once a dmadev has been initialized successfully, the driver needs to change its state to READY, like:
dmadev->state = RTE_DMA_DEV_READY;
This matters when rte_dma_pmd_release() is called: if the state is ready, the lib will call
rte_dma_close() to release the dmadev; otherwise it only cleans up the struct which the lib holds.

> + return 0; > + > +cleanup: > + if (dmadev) > + rte_dma_pmd_release(name); > + > + return ret; > +} > + > +int idxd_pmd_logtype; > + > +RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); > diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h > index c6a7dcd72f..fa6f053f72 100644 > --- a/drivers/dma/idxd/idxd_internal.h > +++ b/drivers/dma/idxd/idxd_internal.h > @@ -24,4 +24,44 @@ extern int idxd_pmd_logtype; > #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) > #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) > > +struct idxd_dmadev { > + /* counters to track the batches */ > + unsigned short max_batches; > + unsigned short batch_idx_read; > + unsigned short batch_idx_write; > + > + /* track descriptors and handles */ > + unsigned short desc_ring_mask; > + unsigned short ids_avail; /* handles for ops completed */ > + unsigned short ids_returned; /* the read pointer for hdls/desc rings */ > + unsigned short batch_start; /* start+size == write pointer for hdls/desc */ > + unsigned short batch_size; > + > + void *portal; /* address to write the batch descriptor */ > + > + struct idxd_completion *batch_comp_ring; > + unsigned short *batch_idx_ring; /* store where each batch ends */ > + > + struct rte_dma_stats stats; > + > + rte_iova_t batch_iova; /* base address of the batch comp ring */ > + rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ > + > + unsigned short max_batch_size; > + > + struct rte_dma_dev *dmadev; > + struct rte_dma_vchan_conf qcfg; > + uint8_t sva_support; > + uint8_t qid; > + > + union { > + struct { > + unsigned int dsa_id; > + } bus; > + } u; > +}; > + > +int idxd_dmadev_create(const char *name, struct rte_device *dev, > + const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); > + > #endif /* _IDXD_INTERNAL_H_ */ > diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build > index c864fce3b3..36dbd3e518 100644 > --- 
a/drivers/dma/idxd/meson.build > +++ b/drivers/dma/idxd/meson.build > @@ -8,5 +8,6 @@ endif > deps += ['bus_pci'] > sources = files( > 'idxd_bus.c', > + 'idxd_common.c', > 'idxd_pci.c' > ) > \ No newline at end of file > ^ permalink raw reply [flat|nested] 243+ messages in thread
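The release behaviour described in the review comments of this reply can be modelled in isolation. The sketch below is a toy model with hypothetical names (`toy_dev`, `TOY_DEV_READY`, `toy_release`), not the dmadev library code. It assumes only what the comment states: release goes through the driver's close callback when the device reached the ready state, and otherwise only resets the bookkeeping held by the library.

```c
#include <stddef.h>

/* Hypothetical device states; the names are not taken from dmadev. */
enum toy_state {
	TOY_DEV_UNUSED = 0,
	TOY_DEV_REGISTERED, /* allocated, driver init not finished */
	TOY_DEV_READY,      /* driver init completed successfully */
};

struct toy_dev {
	enum toy_state state;
	int close_calls; /* counts close callbacks, for illustration only */
	int (*dev_close)(struct toy_dev *dev);
};

/* Stand-in for a driver close callback (e.g. one that unmaps a portal). */
static int
toy_close(struct toy_dev *dev)
{
	dev->close_calls++;
	return 0;
}

/*
 * Release a device: run the driver's close callback only if the device
 * reached the ready state, then mark the slot as unused in either case.
 */
static int
toy_release(struct toy_dev *dev)
{
	int ret = 0;

	if (dev->state == TOY_DEV_READY && dev->dev_close != NULL)
		ret = dev->dev_close(dev);

	dev->state = TOY_DEV_UNUSED;
	return ret;
}
```

In this model, a driver that never sets the ready state would never have its close callback run on release, which is the situation the review comment asks the driver to avoid.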
* Re: [dpdk-dev] [PATCH v5 04/16] dma/idxd: create dmadev instances on bus probe 2021-09-22 2:04 ` fengchengwen @ 2021-09-22 9:12 ` Kevin Laatz 0 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-22 9:12 UTC (permalink / raw) To: fengchengwen, dev; +Cc: bruce.richardson, jerinj, conor.walsh On 22/09/2021 03:04, fengchengwen wrote: > On 2021/9/17 23:24, Kevin Laatz wrote: >> When a suitable device is found during the bus scan/probe, create a dmadev >> instance for each HW queue. Internal structures required for device >> creation are also added. >> >> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> >> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> >> Reviewed-by: Conor Walsh <conor.walsh@intel.com> >> >> --- >> v4: >> - fix 'vdev' naming, changed to 'bus' >> - rebase changes >> --- >> drivers/dma/idxd/idxd_bus.c | 19 ++++++++ >> drivers/dma/idxd/idxd_common.c | 76 ++++++++++++++++++++++++++++++++ >> drivers/dma/idxd/idxd_internal.h | 40 +++++++++++++++++ >> drivers/dma/idxd/meson.build | 1 + >> 4 files changed, 136 insertions(+) >> create mode 100644 drivers/dma/idxd/idxd_common.c >> >> diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c >> index ef589af30e..b48fa954ed 100644 >> --- a/drivers/dma/idxd/idxd_bus.c >> +++ b/drivers/dma/idxd/idxd_bus.c >> @@ -85,6 +85,18 @@ dsa_get_sysfs_path(void) >> return path ? 
path : DSA_SYSFS_PATH; >> } >> >> +static int >> +idxd_dev_close(struct rte_dma_dev *dev) >> +{ >> + struct idxd_dmadev *idxd = dev->data->dev_private; >> + munmap(idxd->portal, 0x1000); >> + return 0; >> +} >> + >> +static const struct rte_dma_dev_ops idxd_bus_ops = { >> + .dev_close = idxd_dev_close, >> +}; >> + >> static void * >> idxd_bus_mmap_wq(struct rte_dsa_device *dev) >> { >> @@ -206,6 +218,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev) >> return -1; >> idxd.max_batch_size = ret; >> idxd.qid = dev->addr.wq_id; >> + idxd.u.bus.dsa_id = dev->addr.device_id; >> idxd.sva_support = 1; >> >> idxd.portal = idxd_bus_mmap_wq(dev); >> @@ -214,6 +227,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev) >> return -ENOENT; >> } >> >> + ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_bus_ops); >> + if (ret) { >> + IDXD_PMD_ERR("Failed to create rawdev %s", dev->wq_name); >> + return ret; >> + } >> + >> return 0; >> } >> >> diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c >> new file mode 100644 >> index 0000000000..8afad637fc >> --- /dev/null >> +++ b/drivers/dma/idxd/idxd_common.c >> @@ -0,0 +1,76 @@ >> +/* SPDX-License-Identifier: BSD-3-Clause >> + * Copyright 2021 Intel Corporation >> + */ >> + >> +#include <rte_dmadev_pmd.h> >> +#include <rte_malloc.h> >> +#include <rte_common.h> >> + >> +#include "idxd_internal.h" >> + >> +#define IDXD_PMD_NAME_STR "dmadev_idxd" >> + >> +int >> +idxd_dmadev_create(const char *name, struct rte_device *dev, >> + const struct idxd_dmadev *base_idxd, >> + const struct rte_dma_dev_ops *ops) >> +{ >> + struct idxd_dmadev *idxd; >> + struct rte_dma_dev *dmadev = NULL; >> + int ret = 0; >> + >> + if (!name) { >> + IDXD_PMD_ERR("Invalid name of the device!"); >> + ret = -EINVAL; >> + goto cleanup; >> + } >> + >> + /* Allocate device structure */ >> + dmadev = rte_dma_pmd_allocate(name, dev->numa_node, >> + sizeof(dmadev->dev_private)); >> + if (dmadev == NULL) { >> + IDXD_PMD_ERR("Unable to 
allocate raw device"); >> + ret = -ENOMEM; >> + goto cleanup; >> + } >> + dmadev->dev_ops = ops; >> + dmadev->device = dev; >> + >> + idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); >> + if (idxd == NULL) { >> + IDXD_PMD_ERR("Unable to allocate memory for device"); >> + ret = -ENOMEM; >> + goto cleanup; >> + } >> + dmadev->data->dev_private = idxd; >> + dmadev->dev_private = idxd; > dmadev->dev_private and dmadev->data->dev_private are already initialized by rte_dma_pmd_allocate, > so the driver only needs to pass in the correct parameters. > > Recommended: > dmadev = rte_dma_pmd_allocate(name, dev->name, sizeof(struct idxd_dmadev)); > > >> + *idxd = *base_idxd; /* copy over the main fields already passed in */ >> + idxd->dmadev = dmadev; >> + >> + /* allocate batch index ring and completion ring. >> + * The +1 is because we can never fully use >> + * the ring, otherwise read == write means both full and empty. >> + */ >> + idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) + >> + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1), >> + sizeof(idxd->batch_comp_ring[0])); >> + if (idxd->batch_comp_ring == NULL) { >> + IDXD_PMD_ERR("Unable to reserve memory for batch data\n"); >> + ret = -ENOMEM; >> + goto cleanup; >> + } >> + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1]; >> + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring); >> + > Once a dmadev has been initialized successfully, the driver needs to change its state to READY, like: > dmadev->state = RTE_DMA_DEV_READY; > This matters when rte_dma_pmd_release is called: if the state is ready, the lib will call > rte_dma_close() to release the dmadev; otherwise it only cleans up the struct that the lib holds. Will make these changes in v6, thanks! ^ permalink raw reply [flat|nested] 243+ messages in thread
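Editorial sketch, not part of the patch: the comment in idxd_dmadev_create() explains that the rings get max_batches + 1 entries because otherwise read == write would mean both full and empty. The C fragment below illustrates that convention on a small stand-alone ring. Every name in it (demo_ring, RING_CAPACITY, demo_ring_push and so on) is hypothetical, and the layout is an assumption made for illustration; it is not the driver's batch ring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical demo ring: RING_CAPACITY usable entries are backed by
 * RING_CAPACITY + 1 slots, so one slot always stays unused. */
#define RING_CAPACITY 4
#define RING_SLOTS (RING_CAPACITY + 1)

struct demo_ring {
	uint16_t slots[RING_SLOTS];
	uint16_t read;  /* index of the next entry to consume */
	uint16_t write; /* index of the next entry to fill */
};

/* Empty: both indexes point at the same slot. */
static bool
demo_ring_empty(const struct demo_ring *r)
{
	return r->read == r->write;
}

/* Full: advancing write by one would collide with read. */
static bool
demo_ring_full(const struct demo_ring *r)
{
	return (uint16_t)((r->write + 1) % RING_SLOTS) == r->read;
}

/* Returns false instead of overwriting when the ring is full. */
static bool
demo_ring_push(struct demo_ring *r, uint16_t v)
{
	if (demo_ring_full(r))
		return false;
	r->slots[r->write] = v;
	r->write = (uint16_t)((r->write + 1) % RING_SLOTS);
	return true;
}

/* Returns false when there is nothing to consume. */
static bool
demo_ring_pop(struct demo_ring *r, uint16_t *v)
{
	if (demo_ring_empty(r))
		return false;
	*v = r->slots[r->read];
	r->read = (uint16_t)((r->read + 1) % RING_SLOTS);
	return true;
}
```

With the spare slot, a zero-initialised ring reports empty, accepts exactly RING_CAPACITY pushes, and then reports full without needing a separate element counter.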
* [dpdk-dev] [PATCH v5 05/16] dma/idxd: create dmadev instances on pci probe 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (3 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-22 2:12 ` fengchengwen 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 06/16] dma/idxd: add datapath structures Kevin Laatz ` (10 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the PCI probe, create a dmadev instance for each HW queue. HW definitions required are also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v4: rebase changes --- drivers/dma/idxd/idxd_hw_defs.h | 71 ++++++++ drivers/dma/idxd/idxd_internal.h | 16 ++ drivers/dma/idxd/idxd_pci.c | 278 ++++++++++++++++++++++++++++++- 3 files changed, 362 insertions(+), 3 deletions(-) create mode 100644 drivers/dma/idxd/idxd_hw_defs.h diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h new file mode 100644 index 0000000000..ea627cba6d --- /dev/null +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_HW_DEFS_H_ +#define _IDXD_HW_DEFS_H_ + +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ + +#define IDXD_CMD_SHIFT 20 +enum rte_idxd_cmds { + idxd_enable_dev = 1, + idxd_disable_dev, + idxd_drain_all, + idxd_abort_all, + idxd_reset_device, + idxd_enable_wq, + idxd_disable_wq, + idxd_drain_wq, + idxd_abort_wq, + idxd_reset_wq, +}; + +/* General bar0 registers */ +struct rte_idxd_bar0 { + uint32_t __rte_cache_aligned version; 
/* offset 0x00 */ + uint64_t __rte_aligned(0x10) gencap; /* offset 0x10 */ + uint64_t __rte_aligned(0x10) wqcap; /* offset 0x20 */ + uint64_t __rte_aligned(0x10) grpcap; /* offset 0x30 */ + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ +}; + +/* workqueue config is provided by array of uint32_t. */ +enum rte_idxd_wqcfg { + wq_size_idx, /* size is in first 32-bit value */ + wq_threshold_idx, /* WQ threshold second 32-bits */ + wq_mode_idx, /* WQ mode and other flags */ + wq_sizes_idx, /* WQ transfer and batch sizes */ + wq_occ_int_idx, /* WQ occupancy interrupt handle */ + wq_occ_limit_idx, /* WQ occupancy limit */ + wq_state_idx, /* WQ state and occupancy state */ +}; + +#define WQ_MODE_SHARED 0 +#define WQ_MODE_DEDICATED 1 +#define WQ_PRIORITY_SHIFT 4 +#define WQ_BATCH_SZ_SHIFT 5 +#define WQ_STATE_SHIFT 30 +#define WQ_STATE_MASK 0x3 + +struct rte_idxd_grpcfg { + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ + uint64_t grpengcfg; /* offset 32 */ + uint32_t grpflags; /* offset 40 */ +}; + +#define GENSTS_DEV_STATE_MASK 0x03 +#define CMDSTATUS_ACTIVE_SHIFT 31 +#define CMDSTATUS_ACTIVE_MASK (1 << 31) +#define CMDSTATUS_ERR_MASK 0xFF + +#endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index fa6f053f72..cb3a68c69b 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -5,6 +5,10 @@ #ifndef _IDXD_INTERNAL_H_ #define _IDXD_INTERNAL_H_ +#include 
<rte_spinlock.h> + +#include "idxd_hw_defs.h" + /** * @file idxd_internal.h * @@ -24,6 +28,16 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_pci_common { + rte_spinlock_t lk; + + uint8_t wq_cfg_sz; + volatile struct rte_idxd_bar0 *regs; + volatile uint32_t *wq_regs_base; + volatile struct rte_idxd_grpcfg *grp_regs; + volatile void *portals; +}; + struct idxd_dmadev { /* counters to track the batches */ unsigned short max_batches; @@ -58,6 +72,8 @@ struct idxd_dmadev { struct { unsigned int dsa_id; } bus; + + struct idxd_pci_common *pci; } u; }; diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 79e4aadcab..171e5ffc07 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -3,6 +3,9 @@ */ #include <rte_bus_pci.h> +#include <rte_devargs.h> +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> #include "idxd_internal.h" @@ -16,17 +19,286 @@ const struct rte_pci_id pci_id_idxd_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static inline int +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) +{ + uint8_t err_code; + uint16_t qid = idxd->qid; + int i = 0; + + if (command >= idxd_disable_wq && command <= idxd_reset_wq) + qid = (1 << qid); + rte_spinlock_lock(&idxd->u.pci->lk); + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; + + do { + rte_pause(); + err_code = idxd->u.pci->regs->cmdstatus; + if (++i >= 1000) { + IDXD_PMD_ERR("Timeout waiting for command response from HW"); + rte_spinlock_unlock(&idxd->u.pci->lk); + return err_code; + } + } while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK); + rte_spinlock_unlock(&idxd->u.pci->lk); + + return err_code & CMDSTATUS_ERR_MASK; +} + +static uint32_t * +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) +{ + return RTE_PTR_ADD(pci->wq_regs_base, + (uintptr_t)wq_idx << (5 + 
pci->wq_cfg_sz)); +} + +static int +idxd_is_wq_enabled(struct idxd_dmadev *idxd) +{ + uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx]; + return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; +} + +static const struct rte_dma_dev_ops idxd_pci_ops = { + +}; + +/* each portal uses 4 x 4k pages */ +#define IDXD_PORTAL_SIZE (4096 * 4) + +static int +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, + unsigned int max_queues) +{ + struct idxd_pci_common *pci; + uint8_t nb_groups, nb_engines, nb_wqs; + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ + uint16_t wq_size, total_wq_size; + uint8_t lg2_max_batch, lg2_max_copy_size; + unsigned int i, err_code; + + pci = malloc(sizeof(*pci)); + if (pci == NULL) { + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); + goto err; + } + rte_spinlock_init(&pci->lk); + + /* assign the bar registers, and then configure device */ + pci->regs = dev->mem_resource[0].addr; + grp_offset = (uint16_t)pci->regs->offsets[0]; + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); + pci->portals = dev->mem_resource[2].addr; + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; + + /* sanity check device status */ + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { + /* need function-level-reset (FLR) or is enabled */ + IDXD_PMD_ERR("Device status is not disabled, cannot init"); + goto err; + } + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { + /* command in progress */ + IDXD_PMD_ERR("Device has a command in progress, cannot init"); + goto err; + } + + /* read basic info about the hardware for use when configuring */ + nb_groups = (uint8_t)pci->regs->grpcap; + nb_engines = (uint8_t)pci->regs->engcap; + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); + total_wq_size = (uint16_t)pci->regs->wqcap; + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 
0x1F; + lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F; + + IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u", + nb_groups, nb_engines, nb_wqs); + + /* zero out any old config */ + for (i = 0; i < nb_groups; i++) { + pci->grp_regs[i].grpengcfg = 0; + pci->grp_regs[i].grpwqcfg[0] = 0; + } + for (i = 0; i < nb_wqs; i++) + idxd_get_wq_cfg(pci, i)[0] = 0; + + /* limit queues if necessary */ + if (max_queues != 0 && nb_wqs > max_queues) { + nb_wqs = max_queues; + if (nb_engines > max_queues) + nb_engines = max_queues; + if (nb_groups > max_queues) + nb_engines = max_queues; + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); + } + + /* put each engine into a separate group to avoid reordering */ + if (nb_groups > nb_engines) + nb_groups = nb_engines; + if (nb_groups < nb_engines) + nb_engines = nb_groups; + + /* assign engines to groups, round-robin style */ + for (i = 0; i < nb_engines; i++) { + IDXD_PMD_DEBUG("Assigning engine %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); + } + + /* now do the same for queues and give work slots to each queue */ + wq_size = total_wq_size / nb_wqs; + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", + wq_size, lg2_max_batch, lg2_max_copy_size); + for (i = 0; i < nb_wqs; i++) { + /* add engine "i" to a group */ + IDXD_PMD_DEBUG("Assigning work queue %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); + /* now configure it, in terms of size, max batch, mode */ + idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size; + idxd_get_wq_cfg(pci, i)[wq_mode_idx] = (1 << WQ_PRIORITY_SHIFT) | + WQ_MODE_DEDICATED; + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); + } + + /* dump the group configuration to output */ + for (i = 0; i < nb_groups; i++) { + IDXD_PMD_DEBUG("## Group %d", i); + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); + 
IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg); + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags); + } + + idxd->u.pci = pci; + idxd->max_batches = wq_size; + + /* enable the device itself */ + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); + if (err_code) { + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device enabled OK"); + + return nb_wqs; + +err: + free(pci); + return -1; +} + static int idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) { - int ret = 0; + struct idxd_dmadev idxd = {0}; + uint8_t nb_wqs; + int qid, ret = 0; char name[PCI_PRI_STR_SIZE]; + unsigned int max_queues = 0; rte_pci_device_name(&dev->addr, name, sizeof(name)); IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); dev->device.driver = &drv->driver; - return ret; + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { + /* if the number of devargs grows beyond just 1, use rte_kvargs */ + if (sscanf(dev->device.devargs->args, + "max_queues=%u", &max_queues) != 1) { + IDXD_PMD_ERR("Invalid device parameter: '%s'", + dev->device.devargs->args); + return -1; + } + } + + ret = init_pci_device(dev, &idxd, max_queues); + if (ret < 0) { + IDXD_PMD_ERR("Error initializing PCI hardware"); + return ret; + } + if (idxd.u.pci->portals == NULL) { + IDXD_PMD_ERR("Error, invalid portal assigned during initialization\n"); + return -EINVAL; + } + nb_wqs = (uint8_t)ret; + + /* set up one device for each queue */ + for (qid = 0; qid < nb_wqs; qid++) { + char qname[32]; + + /* add the queue number to each device name */ + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); + idxd.qid = qid; + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, + qid * IDXD_PORTAL_SIZE); + if (idxd_is_wq_enabled(&idxd)) + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); + ret = idxd_dmadev_create(qname, &dev->device, + &idxd, &idxd_pci_ops); + if (ret != 0) { + 
IDXD_PMD_ERR("Failed to create dmadev %s", name); + if (qid == 0) /* if no devices using this, free pci */ + free(idxd.u.pci); + return ret; + } + } + + return 0; +} + +static int +idxd_dmadev_destroy(const char *name) +{ + int ret; + uint8_t err_code; + struct rte_dma_dev *dmadev; + struct idxd_dmadev *idxd; + int dev_id = rte_dma_get_dev_id(name); + + if (!name) { + IDXD_PMD_ERR("Invalid device name"); + return -EINVAL; + } + + if (dev_id < 0) { + IDXD_PMD_ERR("Invalid device ID"); + return -EINVAL; + } + + dmadev = &rte_dma_devices[dev_id]; + if (!dmadev) { + IDXD_PMD_ERR("Invalid device name (%s)", name); + return -EINVAL; + } + + idxd = dmadev->dev_private; + if (!idxd) { + IDXD_PMD_ERR("Error getting dev_private"); + return -EINVAL; + } + + /* disable the device */ + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); + if (err_code) { + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device disabled OK"); + + /* free device memory */ + IDXD_PMD_DEBUG("Freeing device driver memory"); + dmadev->dev_private = NULL; + rte_free(idxd->batch_idx_ring); + rte_free(idxd->desc_ring); + + /* rte_dma_close is called by pmd_release */ + ret = rte_dma_pmd_release(name); + if (ret) + IDXD_PMD_DEBUG("Device cleanup failed"); + + return 0; } static int @@ -39,7 +311,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev) IDXD_PMD_INFO("Closing %s on NUMA node %d", name, dev->device.numa_node); - return 0; + return idxd_dmadev_destroy(name); } struct rte_pci_driver idxd_pmd_drv_pci = { -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
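Editorial sketch, not part of the patch: idxd_get_wq_cfg() locates a work queue's configuration block by shifting the queue index left by (5 + wq_cfg_sz), which implies a per-queue stride of 32 << wq_cfg_sz bytes, and idxd_is_wq_enabled() extracts a 2-bit state field from the top of the state word. The fragment below reproduces only that arithmetic on plain integers. The demo_ names are hypothetical; the shift and mask values are copied from WQ_STATE_SHIFT and WQ_STATE_MASK in the patch, and no hardware registers are touched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror WQ_STATE_SHIFT and WQ_STATE_MASK from the patch. */
#define DEMO_WQ_STATE_SHIFT 30
#define DEMO_WQ_STATE_MASK 0x3

/* Byte offset of the config block for queue wq_idx, relative to the
 * start of the WQ config area. Stride is 32 << wq_cfg_sz bytes. */
static size_t
demo_wq_cfg_offset(uint8_t wq_idx, uint8_t wq_cfg_sz)
{
	return (size_t)wq_idx << (5 + wq_cfg_sz);
}

/* Same test as the patch: the queue counts as enabled only when the
 * 2-bit state field equals 0x1. */
static int
demo_wq_enabled(uint32_t state_word)
{
	return ((state_word >> DEMO_WQ_STATE_SHIFT) & DEMO_WQ_STATE_MASK) == 0x1;
}
```

For example, with wq_cfg_sz of 1 each block is 64 bytes, so queue 3 sits 192 bytes into the config area.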
* Re: [dpdk-dev] [PATCH v5 05/16] dma/idxd: create dmadev instances on pci probe 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-09-22 2:12 ` fengchengwen 2021-09-22 9:18 ` Kevin Laatz 0 siblings, 1 reply; 243+ messages in thread From: fengchengwen @ 2021-09-22 2:12 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, jerinj, conor.walsh On 2021/9/17 23:24, Kevin Laatz wrote: > When a suitable device is found during the PCI probe, create a dmadev > instance for each HW queue. HW definitions required are also included. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > > --- > v4: rebase changes > --- > drivers/dma/idxd/idxd_hw_defs.h | 71 ++++++++ > drivers/dma/idxd/idxd_internal.h | 16 ++ > drivers/dma/idxd/idxd_pci.c | 278 ++++++++++++++++++++++++++++++- > 3 files changed, 362 insertions(+), 3 deletions(-) > create mode 100644 drivers/dma/idxd/idxd_hw_defs.h > > diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h > new file mode 100644 > index 0000000000..ea627cba6d > --- /dev/null > +++ b/drivers/dma/idxd/idxd_hw_defs.h > @@ -0,0 +1,71 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright 2021 Intel Corporation > + */ > + > +#ifndef _IDXD_HW_DEFS_H_ > +#define _IDXD_HW_DEFS_H_ > + > +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ > + > +#define IDXD_CMD_SHIFT 20 > +enum rte_idxd_cmds { > + idxd_enable_dev = 1, > + idxd_disable_dev, > + idxd_drain_all, > + idxd_abort_all, > + idxd_reset_device, > + idxd_enable_wq, > + idxd_disable_wq, > + idxd_drain_wq, > + idxd_abort_wq, > + idxd_reset_wq, > +}; > + > +/* General bar0 registers */ > +struct rte_idxd_bar0 { > + uint32_t __rte_cache_aligned version; /* offset 0x00 */ > + uint64_t __rte_aligned(0x10) gencap; /* offset 0x10 */ > + uint64_t __rte_aligned(0x10) wqcap; /* 
offset 0x20 */ > + uint64_t __rte_aligned(0x10) grpcap; /* offset 0x30 */ > + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ > + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ > + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ > + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ > + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ > + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ > + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ > + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ > + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ > + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ > +}; > + > +/* workqueue config is provided by array of uint32_t. */ > +enum rte_idxd_wqcfg { > + wq_size_idx, /* size is in first 32-bit value */ > + wq_threshold_idx, /* WQ threshold second 32-bits */ > + wq_mode_idx, /* WQ mode and other flags */ > + wq_sizes_idx, /* WQ transfer and batch sizes */ > + wq_occ_int_idx, /* WQ occupancy interrupt handle */ > + wq_occ_limit_idx, /* WQ occupancy limit */ > + wq_state_idx, /* WQ state and occupancy state */ > +}; > + > +#define WQ_MODE_SHARED 0 > +#define WQ_MODE_DEDICATED 1 > +#define WQ_PRIORITY_SHIFT 4 > +#define WQ_BATCH_SZ_SHIFT 5 > +#define WQ_STATE_SHIFT 30 > +#define WQ_STATE_MASK 0x3 > + > +struct rte_idxd_grpcfg { > + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ > + uint64_t grpengcfg; /* offset 32 */ > + uint32_t grpflags; /* offset 40 */ > +}; > + > +#define GENSTS_DEV_STATE_MASK 0x03 > +#define CMDSTATUS_ACTIVE_SHIFT 31 > +#define CMDSTATUS_ACTIVE_MASK (1 << 31) > +#define CMDSTATUS_ERR_MASK 0xFF > + > +#endif > diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h > index fa6f053f72..cb3a68c69b 100644 > --- a/drivers/dma/idxd/idxd_internal.h > +++ b/drivers/dma/idxd/idxd_internal.h > @@ -5,6 +5,10 @@ > #ifndef _IDXD_INTERNAL_H_ > #define _IDXD_INTERNAL_H_ > > +#include <rte_spinlock.h> > + > 
+#include "idxd_hw_defs.h" > + > /** > * @file idxd_internal.h > * > @@ -24,6 +28,16 @@ extern int idxd_pmd_logtype; > #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) > #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) > > +struct idxd_pci_common { > + rte_spinlock_t lk; > + > + uint8_t wq_cfg_sz; > + volatile struct rte_idxd_bar0 *regs; > + volatile uint32_t *wq_regs_base; > + volatile struct rte_idxd_grpcfg *grp_regs; > + volatile void *portals; > +}; > + > struct idxd_dmadev { > /* counters to track the batches */ > unsigned short max_batches; > @@ -58,6 +72,8 @@ struct idxd_dmadev { > struct { > unsigned int dsa_id; > } bus; > + > + struct idxd_pci_common *pci; > } u; > }; > > diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c > index 79e4aadcab..171e5ffc07 100644 > --- a/drivers/dma/idxd/idxd_pci.c > +++ b/drivers/dma/idxd/idxd_pci.c > @@ -3,6 +3,9 @@ > */ > > #include <rte_bus_pci.h> > +#include <rte_devargs.h> > +#include <rte_dmadev_pmd.h> > +#include <rte_malloc.h> > > #include "idxd_internal.h" > > @@ -16,17 +19,286 @@ const struct rte_pci_id pci_id_idxd_map[] = { > { .vendor_id = 0, /* sentinel */ }, > }; > > +static inline int > +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) > +{ > + uint8_t err_code; > + uint16_t qid = idxd->qid; > + int i = 0; > + > + if (command >= idxd_disable_wq && command <= idxd_reset_wq) > + qid = (1 << qid); > + rte_spinlock_lock(&idxd->u.pci->lk); > + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; > + > + do { > + rte_pause(); > + err_code = idxd->u.pci->regs->cmdstatus; > + if (++i >= 1000) { > + IDXD_PMD_ERR("Timeout waiting for command response from HW"); > + rte_spinlock_unlock(&idxd->u.pci->lk); > + return err_code; > + } > + } while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK); > + rte_spinlock_unlock(&idxd->u.pci->lk); > + > + return err_code & CMDSTATUS_ERR_MASK; > +} > + > +static uint32_t * > 
+idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) > +{ > + return RTE_PTR_ADD(pci->wq_regs_base, > + (uintptr_t)wq_idx << (5 + pci->wq_cfg_sz)); > +} > + > +static int > +idxd_is_wq_enabled(struct idxd_dmadev *idxd) > +{ > + uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx]; > + return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; > +} > + > +static const struct rte_dma_dev_ops idxd_pci_ops = { > + > +}; > + > +/* each portal uses 4 x 4k pages */ > +#define IDXD_PORTAL_SIZE (4096 * 4) > + > +static int > +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, > + unsigned int max_queues) > +{ > + struct idxd_pci_common *pci; > + uint8_t nb_groups, nb_engines, nb_wqs; > + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ > + uint16_t wq_size, total_wq_size; > + uint8_t lg2_max_batch, lg2_max_copy_size; > + unsigned int i, err_code; > + > + pci = malloc(sizeof(*pci)); > + if (pci == NULL) { > + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); > + goto err; > + } > + rte_spinlock_init(&pci->lk); > + > + /* assign the bar registers, and then configure device */ > + pci->regs = dev->mem_resource[0].addr; > + grp_offset = (uint16_t)pci->regs->offsets[0]; > + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); > + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); > + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); > + pci->portals = dev->mem_resource[2].addr; > + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; > + > + /* sanity check device status */ > + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { > + /* need function-level-reset (FLR) or is enabled */ > + IDXD_PMD_ERR("Device status is not disabled, cannot init"); > + goto err; > + } > + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { > + /* command in progress */ > + IDXD_PMD_ERR("Device has a command in progress, cannot init"); > + goto err; > + } > + > + /* read basic info about the hardware for use when 
configuring */ > + nb_groups = (uint8_t)pci->regs->grpcap; > + nb_engines = (uint8_t)pci->regs->engcap; > + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); > + total_wq_size = (uint16_t)pci->regs->wqcap; > + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F; > + lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F; > + > + IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u", > + nb_groups, nb_engines, nb_wqs); > + > + /* zero out any old config */ > + for (i = 0; i < nb_groups; i++) { > + pci->grp_regs[i].grpengcfg = 0; > + pci->grp_regs[i].grpwqcfg[0] = 0; > + } > + for (i = 0; i < nb_wqs; i++) > + idxd_get_wq_cfg(pci, i)[0] = 0; > + > + /* limit queues if necessary */ > + if (max_queues != 0 && nb_wqs > max_queues) { > + nb_wqs = max_queues; > + if (nb_engines > max_queues) > + nb_engines = max_queues; > + if (nb_groups > max_queues) > + nb_engines = max_queues; > + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); > + } > + > + /* put each engine into a separate group to avoid reordering */ > + if (nb_groups > nb_engines) > + nb_groups = nb_engines; > + if (nb_groups < nb_engines) > + nb_engines = nb_groups; > + > + /* assign engines to groups, round-robin style */ > + for (i = 0; i < nb_engines; i++) { > + IDXD_PMD_DEBUG("Assigning engine %u to group %u", > + i, i % nb_groups); > + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); > + } > + > + /* now do the same for queues and give work slots to each queue */ > + wq_size = total_wq_size / nb_wqs; > + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", > + wq_size, lg2_max_batch, lg2_max_copy_size); > + for (i = 0; i < nb_wqs; i++) { > + /* add engine "i" to a group */ > + IDXD_PMD_DEBUG("Assigning work queue %u to group %u", > + i, i % nb_groups); > + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); > + /* now configure it, in terms of size, max batch, mode */ > + idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size; > + idxd_get_wq_cfg(pci, i)[wq_mode_idx] = 
(1 << WQ_PRIORITY_SHIFT) | > + WQ_MODE_DEDICATED; > + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | > + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); > + } > + > + /* dump the group configuration to output */ > + for (i = 0; i < nb_groups; i++) { > + IDXD_PMD_DEBUG("## Group %d", i); > + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); > + IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg); > + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags); > + } > + > + idxd->u.pci = pci; > + idxd->max_batches = wq_size; > + > + /* enable the device itself */ > + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); > + if (err_code) { > + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); > + return err_code; > + } > + IDXD_PMD_DEBUG("IDXD Device enabled OK"); > + > + return nb_wqs; > + > +err: > + free(pci); > + return -1; > +} > + > static int > idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) > { > - int ret = 0; > + struct idxd_dmadev idxd = {0}; > + uint8_t nb_wqs; > + int qid, ret = 0; > char name[PCI_PRI_STR_SIZE]; > + unsigned int max_queues = 0; > > rte_pci_device_name(&dev->addr, name, sizeof(name)); > IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); > dev->device.driver = &drv->driver; > > - return ret; > + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { > + /* if the number of devargs grows beyond just 1, use rte_kvargs */ > + if (sscanf(dev->device.devargs->args, > + "max_queues=%u", &max_queues) != 1) { > + IDXD_PMD_ERR("Invalid device parameter: '%s'", > + dev->device.devargs->args); > + return -1; > + } > + } > + > + ret = init_pci_device(dev, &idxd, max_queues); > + if (ret < 0) { > + IDXD_PMD_ERR("Error initializing PCI hardware"); > + return ret; > + } > + if (idxd.u.pci->portals == NULL) { > + IDXD_PMD_ERR("Error, invalid portal assigned during initialization\n"); > + return -EINVAL; > + } > + nb_wqs = (uint8_t)ret; > + > + 
/* set up one device for each queue */ > + for (qid = 0; qid < nb_wqs; qid++) { > + char qname[32]; > + > + /* add the queue number to each device name */ > + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); > + idxd.qid = qid; > + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, > + qid * IDXD_PORTAL_SIZE); > + if (idxd_is_wq_enabled(&idxd)) > + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); > + ret = idxd_dmadev_create(qname, &dev->device, > + &idxd, &idxd_pci_ops); > + if (ret != 0) { > + IDXD_PMD_ERR("Failed to create dmadev %s", name); > + if (qid == 0) /* if no devices using this, free pci */ > + free(idxd.u.pci); > + return ret; > + } > + } > + > + return 0; > +} > + > +static int > +idxd_dmadev_destroy(const char *name) > +{ > + int ret; > + uint8_t err_code; > + struct rte_dma_dev *dmadev; > + struct idxd_dmadev *idxd; > + int dev_id = rte_dma_get_dev_id(name); > + > + if (!name) { > + IDXD_PMD_ERR("Invalid device name"); > + return -EINVAL; > + } > + > + if (dev_id < 0) { > + IDXD_PMD_ERR("Invalid device ID"); > + return -EINVAL; > + } > + > + dmadev = &rte_dma_devices[dev_id]; > + if (!dmadev) { > + IDXD_PMD_ERR("Invalid device name (%s)", name); > + return -EINVAL; > + } > + > + idxd = dmadev->dev_private; > + if (!idxd) { > + IDXD_PMD_ERR("Error getting dev_private"); > + return -EINVAL; > + } > + > + /* disable the device */ > + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); > + if (err_code) { > + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); > + return err_code; > + } > + IDXD_PMD_DEBUG("IDXD Device disabled OK"); Recommended: move the above idxd device disable into the rte_dma_close() op. On create: mark the state as READY. On destroy: call rte_dma_pmd_release(name) directly; the lib will call rte_dma_close(). > + > + /* free device memory */ > + IDXD_PMD_DEBUG("Freeing device driver memory"); > + dmadev->dev_private = NULL; The dmadev library manages dev_private, so the driver must not free it again.
> + rte_free(idxd->batch_idx_ring); > + rte_free(idxd->desc_ring); Please move the above free operations into the rte_dma_close() op. > + > + /* rte_dma_close is called by pmd_release */ > + ret = rte_dma_pmd_release(name); > + if (ret) > + IDXD_PMD_DEBUG("Device cleanup failed"); > + > + return 0; > } > > static int > @@ -39,7 +311,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev) > IDXD_PMD_INFO("Closing %s on NUMA node %d", > name, dev->device.numa_node); > > - return 0; > + return idxd_dmadev_destroy(name); > } > > struct rte_pci_driver idxd_pmd_drv_pci = { > ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v5 05/16] dma/idxd: create dmadev instances on pci probe 2021-09-22 2:12 ` fengchengwen @ 2021-09-22 9:18 ` Kevin Laatz 0 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-22 9:18 UTC (permalink / raw) To: fengchengwen, dev; +Cc: bruce.richardson, jerinj, conor.walsh On 22/09/2021 03:12, fengchengwen wrote: > On 2021/9/17 23:24, Kevin Laatz wrote: >> When a suitable device is found during the PCI probe, create a dmadev >> instance for each HW queue. HW definitions required are also included. >> >> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> >> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> >> Reviewed-by: Conor Walsh <conor.walsh@intel.com> >> >> --- >> v4: rebase changes >> --- >> drivers/dma/idxd/idxd_hw_defs.h | 71 ++++++++ >> drivers/dma/idxd/idxd_internal.h | 16 ++ >> drivers/dma/idxd/idxd_pci.c | 278 ++++++++++++++++++++++++++++++- >> 3 files changed, 362 insertions(+), 3 deletions(-) >> create mode 100644 drivers/dma/idxd/idxd_hw_defs.h >> [snip] >> + >> +static int >> +idxd_dmadev_destroy(const char *name) >> +{ >> + int ret; >> + uint8_t err_code; >> + struct rte_dma_dev *dmadev; >> + struct idxd_dmadev *idxd; >> + int dev_id = rte_dma_get_dev_id(name); >> + >> + if (!name) { >> + IDXD_PMD_ERR("Invalid device name"); >> + return -EINVAL; >> + } >> + >> + if (dev_id < 0) { >> + IDXD_PMD_ERR("Invalid device ID"); >> + return -EINVAL; >> + } >> + >> + dmadev = &rte_dma_devices[dev_id]; >> + if (!dmadev) { >> + IDXD_PMD_ERR("Invalid device name (%s)", name); >> + return -EINVAL; >> + } >> + >> + idxd = dmadev->dev_private; >> + if (!idxd) { >> + IDXD_PMD_ERR("Error getting dev_private"); >> + return -EINVAL; >> + } >> + >> + /* disable the device */ >> + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); >> + if (err_code) { >> + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); >> + return err_code; >> + } >> + IDXD_PMD_DEBUG("IDXD Device disabled OK"); > Recommended: move the 
above device-disable code to the rte_dma_close() op.
> On create: mark the state as READY.
> On destroy: call rte_dma_pmd_release(name) directly; the lib will call rte_dma_close().
>
>> +
>> +	/* free device memory */
>> +	IDXD_PMD_DEBUG("Freeing device driver memory");
>> +	dmadev->dev_private = NULL;
> The dmadev library manages dev_private, so the driver must not free it again.
>
>> +	rte_free(idxd->batch_idx_ring);
>> +	rte_free(idxd->desc_ring);
> Please move the above free operations to the rte_dma_close() op.
>

Will fix, thanks

^ permalink raw reply	[flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v5 06/16] dma/idxd: add datapath structures 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (4 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 07/16] dma/idxd: add configure and info_get functions Kevin Laatz ` (9 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data structures required for the data path for IDXD devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v2: add completion status for invalid opcode --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 33 ++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 60 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 ++ drivers/dma/idxd/idxd_pci.c | 2 +- 5 files changed, 98 insertions(+), 1 deletion(-) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b48fa954ed..3c0837ec52 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -95,6 +95,7 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, + .dev_dump = idxd_dump, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 8afad637fc..45cde78e88 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -10,6 +10,35 @@ #define IDXD_PMD_NAME_STR "dmadev_idxd" +int +idxd_dump(const struct rte_dma_dev *dev, FILE *f) +{ + struct idxd_dmadev *idxd = dev->dev_private; + unsigned int i; + + fprintf(f, "== IDXD Private Data ==\n"); + fprintf(f, " 
Portal: %p\n", idxd->portal); + fprintf(f, " Config: { ring_size: %u }\n", + idxd->qcfg.nb_desc); + fprintf(f, " Batch ring (sz = %u, max_batches = %u):\n\t", + idxd->max_batches + 1, idxd->max_batches); + for (i = 0; i <= idxd->max_batches; i++) { + fprintf(f, " %u ", idxd->batch_idx_ring[i]); + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) + fprintf(f, "[rd ptr, wr ptr] "); + else if (i == idxd->batch_idx_read) + fprintf(f, "[rd ptr] "); + else if (i == idxd->batch_idx_write) + fprintf(f, "[wr ptr] "); + if (i == idxd->max_batches) + fprintf(f, "\n"); + } + + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, @@ -19,6 +48,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, struct rte_dma_dev *dmadev = NULL; int ret = 0; + RTE_BUILD_BUG_ON(sizeof(struct idxd_hw_desc) != 64); + RTE_BUILD_BUG_ON(offsetof(struct idxd_hw_desc, size) != 32); + RTE_BUILD_BUG_ON(sizeof(struct idxd_completion) != 32); + if (!name) { IDXD_PMD_ERR("Invalid name of the device!"); ret = -EINVAL; diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index ea627cba6d..55ca9f7f52 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -5,6 +5,66 @@ #ifndef _IDXD_HW_DEFS_H_ #define _IDXD_HW_DEFS_H_ +/* + * Defines used in the data path for interacting with IDXD hardware. 
+ */ +#define IDXD_CMD_OP_SHIFT 24 +enum rte_idxd_ops { + idxd_op_nop = 0, + idxd_op_batch, + idxd_op_drain, + idxd_op_memmove, + idxd_op_fill +}; + +#define IDXD_FLAG_FENCE (1 << 0) +#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2) +#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3) +#define IDXD_FLAG_CACHE_CONTROL (1 << 8) + +/** + * Hardware descriptor used by DSA hardware, for both bursts and + * for individual operations. + */ +struct idxd_hw_desc { + uint32_t pasid; + uint32_t op_flags; + rte_iova_t completion; + + RTE_STD_C11 + union { + rte_iova_t src; /* source address for copy ops etc. */ + rte_iova_t desc_addr; /* descriptor pointer for batch */ + }; + rte_iova_t dst; + + uint32_t size; /* length of data for op, or batch size */ + + uint16_t intr_handle; /* completion interrupt handle */ + + /* remaining 26 bytes are reserved */ + uint16_t __reserved[13]; +} __rte_aligned(64); + +#define IDXD_COMP_STATUS_INCOMPLETE 0 +#define IDXD_COMP_STATUS_SUCCESS 1 +#define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 +#define IDXD_COMP_STATUS_INVALID_SIZE 0x13 +#define IDXD_COMP_STATUS_SKIPPED 0xFF /* not official IDXD error, needed as placeholder */ + +/** + * Completion record structure written back by DSA + */ +struct idxd_completion { + uint8_t status; + uint8_t result; + /* 16-bits pad here */ + uint32_t completed_size; /* data length, or descriptors for batch */ + + rte_iova_t fault_address; + uint32_t invalid_flags; +} __rte_aligned(32); + /*** Definitions for Intel(R) Data Streaming Accelerator ***/ #define IDXD_CMD_SHIFT 20 diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index cb3a68c69b..99c8e04302 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -39,6 +39,8 @@ struct idxd_pci_common { }; struct idxd_dmadev { + struct idxd_hw_desc *desc_ring; + /* counters to track the batches */ unsigned short max_batches; unsigned short batch_idx_read; @@ -79,5 +81,6 @@ struct idxd_dmadev { int 
idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); +int idxd_dump(const struct rte_dma_dev *dev, FILE *f); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 171e5ffc07..33cf76adfb 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -60,7 +60,7 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) } static const struct rte_dma_dev_ops idxd_pci_ops = { - + .dev_dump = idxd_dump, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
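The layout checks that this patch adds to idxd_dmadev_create() (descriptor is 64 bytes, "size" sits at offset 32, completion record is 32 bytes) can be tried outside DPDK. The following is only a minimal sketch: it assumes rte_iova_t is a 64-bit integer, replaces __rte_aligned() with C11 alignas, and uses stand-in names (sketch_iova_t, sketch_hw_desc, sketch_completion) that are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: rte_iova_t is 64 bits wide, as on 64-bit DPDK targets. */
typedef uint64_t sketch_iova_t;

/* Hypothetical stand-in mirroring the fields of struct idxd_hw_desc. */
struct sketch_hw_desc {
	alignas(64) uint32_t pasid;      /* offset 0 */
	uint32_t op_flags;               /* offset 4 */
	sketch_iova_t completion;        /* offset 8 */
	union {
		sketch_iova_t src;       /* source address for copy ops */
		sketch_iova_t desc_addr; /* descriptor pointer for batch */
	};                               /* offset 16 */
	sketch_iova_t dst;               /* offset 24 */
	uint32_t size;                   /* offset 32 */
	uint16_t intr_handle;            /* offset 36 */
	uint16_t reserved[13];           /* remaining 26 bytes */
};

/* Hypothetical stand-in mirroring the fields of struct idxd_completion. */
struct sketch_completion {
	alignas(32) uint8_t status;
	uint8_t result;
	/* the compiler inserts 16 bits of padding here */
	uint32_t completed_size;
	sketch_iova_t fault_address;
	uint32_t invalid_flags;
};

/* Compile-time checks equivalent to the RTE_BUILD_BUG_ON() lines in the patch. */
static_assert(sizeof(struct sketch_hw_desc) == 64, "descriptor must be 64 bytes");
static_assert(offsetof(struct sketch_hw_desc, size) == 32, "size must be at offset 32");
static_assert(sizeof(struct sketch_completion) == 32, "completion must be 32 bytes");
```

Without the 32-byte alignment the completion stand-in would only occupy 24 bytes, so in this sketch the alignment attribute is what makes the third check pass.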
* [dpdk-dev] [PATCH v5 07/16] dma/idxd: add configure and info_get functions 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (5 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 06/16] dma/idxd: add datapath structures Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-20 10:27 ` Bruce Richardson 2021-09-22 2:31 ` fengchengwen 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz ` (8 subsequent siblings) 15 siblings, 2 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add functions for device configuration. The info_get function is included here since it can be useful for checking successful configuration. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v2: - fix reconfigure bug in idxd_vchan_setup() - add literal include comment for the docs to pick up v3: - fixes needed after changes from rebasing --- app/test/test_dmadev.c | 2 + doc/guides/dmadevs/idxd.rst | 30 +++++++++++++ drivers/dma/idxd/idxd_bus.c | 3 ++ drivers/dma/idxd/idxd_common.c | 72 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 6 +++ drivers/dma/idxd/idxd_pci.c | 3 ++ 6 files changed, 116 insertions(+) diff --git a/app/test/test_dmadev.c b/app/test/test_dmadev.c index 98fcab67f3..5bbe4250e0 100644 --- a/app/test/test_dmadev.c +++ b/app/test/test_dmadev.c @@ -739,6 +739,7 @@ test_dmadev_instance(uint16_t dev_id) { #define TEST_RINGSIZE 512 #define CHECK_ERRS true + /* Setup of the dmadev device. 
8< */ struct rte_dma_stats stats; struct rte_dma_info info; const struct rte_dma_conf conf = { .nb_vchans = 1}; @@ -759,6 +760,7 @@ test_dmadev_instance(uint16_t dev_id) if (rte_dma_vchan_setup(dev_id, vchan, &qconf) < 0) ERR_RETURN("Error with queue configuration\n"); + /* >8 End of setup of the dmadev device. */ rte_dma_info_get(dev_id, &info); if (info.nb_vchans != 1) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index ce33e2857a..abfa5be9ea 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -120,3 +120,33 @@ use a subset of configured queues. Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, that is a "DMA device type" inside DPDK, and can be accessed using APIs from the ``rte_dmadev`` library. + +Using IDXD DMAdev Devices +-------------------------- + +To use the devices from an application, the dmadev API can be used. + +Getting Device Information +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Basic information about each dmadev device can be queried using the +``rte_dma_info_get()`` API. This will return basic device information such as +the ``rte_device`` structure, device capabilities and other device specific values. + +Device Configuration +~~~~~~~~~~~~~~~~~~~~~ + +Configuring an IDXD dmadev device is done using the ``rte_dma_configure()`` and +``rte_dma_vchan_setup`` APIs. The configurations are passed to these APIs using +the ``rte_dma_conf`` and ``rte_dma_vchan_conf`` structures, respectively. For +example, these can be used to configure the number of ``vchans`` per device, the +ring size, etc. The ring size must be a power of two, between 64 and 4096. + +The following code shows how the device is configured in +``test_dmadev.c``: + +.. literalinclude:: ../../../app/test/test_dmadev.c + :language: c + :start-after: Setup of the dmadev device. 8< + :end-before: >8 End of setup of the dmadev device. 
+ :dedent: 1 diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 3c0837ec52..b2acdac4f9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -96,6 +96,9 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 45cde78e88..2c222708cf 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -39,6 +39,78 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + if (size < sizeof(*info)) + return -EINVAL; + + *info = (struct rte_dma_info) { + .dev_capa = RTE_DMA_CAPA_MEM_TO_MEM | + RTE_DMA_CAPA_OPS_COPY | RTE_DMA_CAPA_OPS_FILL, + .max_vchans = 1, + .max_desc = 4096, + .min_desc = 64, + .nb_vchans = (idxd->desc_ring != NULL), /* returns 1 or 0 */ + }; + if (idxd->sva_support) + info->dev_capa |= RTE_DMA_CAPA_SVA; + return 0; +} + +int +idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz) +{ + if (sizeof(struct rte_dma_conf) != conf_sz) + return -EINVAL; + + if (dev_conf->nb_vchans != 1) + return -EINVAL; + return 0; +} + +int +idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t max_desc = qconf->nb_desc; + + if (sizeof(struct rte_dma_vchan_conf) != qconf_sz) + return -EINVAL; + + idxd->qcfg = *qconf; + + if (!rte_is_power_of_2(max_desc)) + max_desc = rte_align32pow2(max_desc); + IDXD_PMD_DEBUG("DMA dev %u using %u descriptors", dev->data->dev_id, max_desc); + 
idxd->desc_ring_mask = max_desc - 1; + idxd->qcfg.nb_desc = max_desc; + + /* in case we are reconfiguring a device, free any existing memory */ + rte_free(idxd->desc_ring); + + /* allocate the descriptor ring at 2x size as batches can't wrap */ + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); + if (idxd->desc_ring == NULL) + return -ENOMEM; + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); + + idxd->batch_idx_read = 0; + idxd->batch_idx_write = 0; + idxd->batch_start = 0; + idxd->batch_size = 0; + idxd->ids_returned = 0; + idxd->ids_avail = 0; + + memset(idxd->batch_comp_ring, 0, sizeof(*idxd->batch_comp_ring) * + (idxd->max_batches + 1)); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 99c8e04302..fdd018ca35 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -82,5 +82,11 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); int idxd_dump(const struct rte_dma_dev *dev, FILE *f); +int idxd_configure(struct rte_dma_dev *dev, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz); +int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); +int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, + uint32_t size); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 33cf76adfb..0216ab80d9 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -61,6 +61,9 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = 
idxd_info_get, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
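For readers following the ring sizing in idxd_vchan_setup(): the requested descriptor count is rounded up to a power of two so that "count - 1" can serve as an index mask. The sketch below illustrates that arithmetic in plain C. The helper names are hypothetical; sketch_align32pow2() is a local stand-in written to behave like rte_align32pow2() for the values used here, and the 64..4096 range check is assumed to happen elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_is_power_of_2(). */
int
sketch_is_power_of_2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical stand-in for rte_align32pow2(): round up to a power of two. */
uint32_t
sketch_align32pow2(uint32_t x)
{
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	return x + 1;
}

/* Ring size used for a requested descriptor count (range assumed validated). */
uint16_t
sketch_ring_size(uint16_t nb_desc)
{
	uint32_t n = nb_desc;

	if (!sketch_is_power_of_2(n))
		n = sketch_align32pow2(n);
	return (uint16_t)n;
}

/* Map a free-running counter onto a ring slot using the size-minus-one mask. */
uint16_t
sketch_ring_index(uint16_t ring_size, uint32_t counter)
{
	return (uint16_t)(counter & (uint32_t)(ring_size - 1));
}
```

With these helpers a request for 100 descriptors gives a ring of 128 and a mask of 127, while a request for 512 is left unchanged.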
* Re: [dpdk-dev] [PATCH v5 07/16] dma/idxd: add configure and info_get functions
  2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 07/16] dma/idxd: add configure and info_get functions Kevin Laatz
@ 2021-09-20 10:27 ` Bruce Richardson
  2021-09-22 2:31 ` fengchengwen
  1 sibling, 0 replies; 243+ messages in thread
From: Bruce Richardson @ 2021-09-20 10:27 UTC (permalink / raw)
To: Kevin Laatz; +Cc: dev, fengchengwen, jerinj, conor.walsh

On Fri, Sep 17, 2021 at 03:24:28PM +0000, Kevin Laatz wrote:
> Add functions for device configuration. The info_get function is included
> here since it can be useful for checking successful configuration.
>
Since this patch makes a change in the test code to enable use in the docs,
that should be called out here too, I think, for example: "When providing an
example of the function's use in the documentation, use code snippet from
the unit tests".

^ permalink raw reply	[flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v5 07/16] dma/idxd: add configure and info_get functions 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 07/16] dma/idxd: add configure and info_get functions Kevin Laatz 2021-09-20 10:27 ` Bruce Richardson @ 2021-09-22 2:31 ` fengchengwen 1 sibling, 0 replies; 243+ messages in thread From: fengchengwen @ 2021-09-22 2:31 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, jerinj, conor.walsh On 2021/9/17 23:24, Kevin Laatz wrote: > Add functions for device configuration. The info_get function is included > here since it can be useful for checking successful configuration. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > > --- > v2: > - fix reconfigure bug in idxd_vchan_setup() > - add literal include comment for the docs to pick up > v3: > - fixes needed after changes from rebasing > --- > app/test/test_dmadev.c | 2 + > doc/guides/dmadevs/idxd.rst | 30 +++++++++++++ > drivers/dma/idxd/idxd_bus.c | 3 ++ > drivers/dma/idxd/idxd_common.c | 72 ++++++++++++++++++++++++++++++++ > drivers/dma/idxd/idxd_internal.h | 6 +++ > drivers/dma/idxd/idxd_pci.c | 3 ++ > 6 files changed, 116 insertions(+) > > diff --git a/app/test/test_dmadev.c b/app/test/test_dmadev.c > index 98fcab67f3..5bbe4250e0 100644 > --- a/app/test/test_dmadev.c > +++ b/app/test/test_dmadev.c > @@ -739,6 +739,7 @@ test_dmadev_instance(uint16_t dev_id) > { > #define TEST_RINGSIZE 512 > #define CHECK_ERRS true > + /* Setup of the dmadev device. 8< */ > struct rte_dma_stats stats; > struct rte_dma_info info; > const struct rte_dma_conf conf = { .nb_vchans = 1}; > @@ -759,6 +760,7 @@ test_dmadev_instance(uint16_t dev_id) > > if (rte_dma_vchan_setup(dev_id, vchan, &qconf) < 0) > ERR_RETURN("Error with queue configuration\n"); > + /* >8 End of setup of the dmadev device. 
*/
>
>  	rte_dma_info_get(dev_id, &info);
>  	if (info.nb_vchans != 1)
> diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst
> index ce33e2857a..abfa5be9ea 100644
> --- a/doc/guides/dmadevs/idxd.rst
> +++ b/doc/guides/dmadevs/idxd.rst
> @@ -120,3 +120,33 @@ use a subset of configured queues.
>  Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``,
>  that is a "DMA device type" inside DPDK, and can be accessed using APIs from the
>  ``rte_dmadev`` library.
> +
> +Using IDXD DMAdev Devices
> +--------------------------
> +
> +To use the devices from an application, the dmadev API can be used.
> +
> +Getting Device Information
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Basic information about each dmadev device can be queried using the
> +``rte_dma_info_get()`` API. This will return basic device information such as
> +the ``rte_device`` structure, device capabilities and other device specific values.

The info_get op cannot obtain 'rte_device' now.

> +
> +Device Configuration
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +Configuring an IDXD dmadev device is done using the ``rte_dma_configure()`` and
> +``rte_dma_vchan_setup`` APIs. The configurations are passed to these APIs using
> +the ``rte_dma_conf`` and ``rte_dma_vchan_conf`` structures, respectively. For
> +example, these can be used to configure the number of ``vchans`` per device, the
> +ring size, etc. The ring size must be a power of two, between 64 and 4096.
> +
> +The following code shows how the device is configured in
> +``test_dmadev.c``:
> +
> +.. literalinclude:: ../../../app/test/test_dmadev.c
> +   :language: c
> +   :start-after: Setup of the dmadev device. 8<
> +   :end-before: >8 End of setup of the dmadev device.
> +   :dedent: 1
> diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c
> index 3c0837ec52..b2acdac4f9 100644
> --- a/drivers/dma/idxd/idxd_bus.c
> +++ b/drivers/dma/idxd/idxd_bus.c
> @@ -96,6 +96,9 @@ idxd_dev_close(struct rte_dma_dev *dev)
>  static const struct rte_dma_dev_ops idxd_bus_ops = {
>  		.dev_close = idxd_dev_close,
>  		.dev_dump = idxd_dump,
> +		.dev_configure = idxd_configure,
> +		.vchan_setup = idxd_vchan_setup,
> +		.dev_info_get = idxd_info_get,
>  };
>
>  static void *
> diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c
> index 45cde78e88..2c222708cf 100644
> --- a/drivers/dma/idxd/idxd_common.c
> +++ b/drivers/dma/idxd/idxd_common.c
> @@ -39,6 +39,78 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f)
>  	return 0;
>  }
>
> +int
> +idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size)
> +{
> +	struct idxd_dmadev *idxd = dev->dev_private;
> +
> +	if (size < sizeof(*info))
> +		return -EINVAL;
> +
> +	*info = (struct rte_dma_info) {
> +			.dev_capa = RTE_DMA_CAPA_MEM_TO_MEM |
> +				RTE_DMA_CAPA_OPS_COPY | RTE_DMA_CAPA_OPS_FILL,
> +			.max_vchans = 1,
> +			.max_desc = 4096,
> +			.min_desc = 64,
> +			.nb_vchans = (idxd->desc_ring != NULL), /* returns 1 or 0 */

The nb_vchans field is filled in by the library.

> + }; > + if (idxd->sva_support) > + info->dev_capa |= RTE_DMA_CAPA_SVA; > + return 0; > +} > + > +int > +idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, > + uint32_t conf_sz) > +{ > + if (sizeof(struct rte_dma_conf) != conf_sz) > + return -EINVAL; > + > + if (dev_conf->nb_vchans != 1) > + return -EINVAL; > + return 0; > +} > + > +int > +idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan __rte_unused, > + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz) > +{ > + struct idxd_dmadev *idxd = dev->dev_private; > + uint16_t max_desc = qconf->nb_desc; > + > + if (sizeof(struct rte_dma_vchan_conf) != qconf_sz) > + return -EINVAL; > + > + idxd->qcfg = *qconf; > + > + if (!rte_is_power_of_2(max_desc)) > + max_desc = rte_align32pow2(max_desc); > + IDXD_PMD_DEBUG("DMA dev %u using %u descriptors", dev->data->dev_id, max_desc); > + idxd->desc_ring_mask = max_desc - 1; > + idxd->qcfg.nb_desc = max_desc; > + > + /* in case we are reconfiguring a device, free any existing memory */ > + rte_free(idxd->desc_ring); > + > + /* allocate the descriptor ring at 2x size as batches can't wrap */ > + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); > + if (idxd->desc_ring == NULL) > + return -ENOMEM; > + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); > + > + idxd->batch_idx_read = 0; > + idxd->batch_idx_write = 0; > + idxd->batch_start = 0; > + idxd->batch_size = 0; > + idxd->ids_returned = 0; > + idxd->ids_avail = 0; > + > + memset(idxd->batch_comp_ring, 0, sizeof(*idxd->batch_comp_ring) * > + (idxd->max_batches + 1)); > + return 0; > +} > + > int > idxd_dmadev_create(const char *name, struct rte_device *dev, > const struct idxd_dmadev *base_idxd, > diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h > index 99c8e04302..fdd018ca35 100644 > --- a/drivers/dma/idxd/idxd_internal.h > +++ b/drivers/dma/idxd/idxd_internal.h > @@ -82,5 +82,11 @@ struct idxd_dmadev { > 
int idxd_dmadev_create(const char *name, struct rte_device *dev, > const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); > int idxd_dump(const struct rte_dma_dev *dev, FILE *f); > +int idxd_configure(struct rte_dma_dev *dev, const struct rte_dma_conf *dev_conf, > + uint32_t conf_sz); > +int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, > + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); > +int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, > + uint32_t size); > > #endif /* _IDXD_INTERNAL_H_ */ > diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c > index 33cf76adfb..0216ab80d9 100644 > --- a/drivers/dma/idxd/idxd_pci.c > +++ b/drivers/dma/idxd/idxd_pci.c > @@ -61,6 +61,9 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) > > static const struct rte_dma_dev_ops idxd_pci_ops = { > .dev_dump = idxd_dump, > + .dev_configure = idxd_configure, > + .vchan_setup = idxd_vchan_setup, > + .dev_info_get = idxd_info_get, > }; > > /* each portal uses 4 x 4k pages */ > ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v5 08/16] dma/idxd: add start and stop functions for pci devices 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (6 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 07/16] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-22 2:40 ` fengchengwen 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 09/16] dma/idxd: add data-path job submission functions Kevin Laatz ` (7 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add device start/stop functions for DSA devices bound to vfio. For devices bound to the IDXD kernel driver, these are not required since the IDXD kernel driver takes care of this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 3 +++ drivers/dma/idxd/idxd_pci.c | 52 +++++++++++++++++++++++++++++++++++++ 2 files changed, 55 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index abfa5be9ea..a603c5dd22 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -150,3 +150,6 @@ The following code shows how the device is configured in :start-after: Setup of the dmadev device. 8< :end-before: >8 End of setup of the dmadev device. :dedent: 1 + +Once configured, the device can then be made ready for use by calling the +``rte_dma_start()`` API. 
diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 0216ab80d9..cfb64ce220 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -59,11 +59,63 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; } +static int +idxd_pci_dev_stop(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint8_t err_code; + + if (!idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); + return -EALREADY; + } + + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); + if (err_code || idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", + idxd->qid, err_code); + return -err_code; + } + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); + + return 0; +} + +static int +idxd_pci_dev_start(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint8_t err_code; + + if (idxd_is_wq_enabled(idxd)) { + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); + return 0; + } + + if (idxd->desc_ring == NULL) { + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); + return -EINVAL; + } + + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); + if (err_code || !idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : err_code; + } + + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); + + return 0; +} + static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_dump = idxd_dump, .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .dev_start = idxd_pci_dev_start, + .dev_stop = idxd_pci_dev_stop, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v5 08/16] dma/idxd: add start and stop functions for pci devices 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-09-22 2:40 ` fengchengwen 0 siblings, 0 replies; 243+ messages in thread From: fengchengwen @ 2021-09-22 2:40 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, jerinj, conor.walsh On 2021/9/17 23:24, Kevin Laatz wrote: > Add device start/stop functions for DSA devices bound to vfio. For devices > bound to the IDXD kernel driver, these are not required since the IDXD > kernel driver takes care of this. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- > doc/guides/dmadevs/idxd.rst | 3 +++ > drivers/dma/idxd/idxd_pci.c | 52 +++++++++++++++++++++++++++++++++++++ > 2 files changed, 55 insertions(+) > > diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst > index abfa5be9ea..a603c5dd22 100644 > --- a/doc/guides/dmadevs/idxd.rst > +++ b/doc/guides/dmadevs/idxd.rst > @@ -150,3 +150,6 @@ The following code shows how the device is configured in > :start-after: Setup of the dmadev device. 8< > :end-before: >8 End of setup of the dmadev device. > :dedent: 1 > + > +Once configured, the device can then be made ready for use by calling the > +``rte_dma_start()`` API. 
> diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c
> index 0216ab80d9..cfb64ce220 100644
> --- a/drivers/dma/idxd/idxd_pci.c
> +++ b/drivers/dma/idxd/idxd_pci.c
> @@ -59,11 +59,63 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd)
>  	return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1;
>  }
>
> +static int
> +idxd_pci_dev_stop(struct rte_dma_dev *dev)
> +{
> +	struct idxd_dmadev *idxd = dev->dev_private;
> +	uint8_t err_code;
> +
> +	if (!idxd_is_wq_enabled(idxd)) {
> +		IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid);
> +		return -EALREADY;
> +	}
> +
> +	err_code = idxd_pci_dev_command(idxd, idxd_disable_wq);
> +	if (err_code || idxd_is_wq_enabled(idxd)) {
> +		IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x",
> +				idxd->qid, err_code);
> +		return -err_code;

The err_code may be zero here, in which case a failure returns 0.

> +	}
> +	IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid);
> +
> +	return 0;
> +}
> +
> +static int
> +idxd_pci_dev_start(struct rte_dma_dev *dev)
> +{
> +	struct idxd_dmadev *idxd = dev->dev_private;
> +	uint8_t err_code;
> +
> +	if (idxd_is_wq_enabled(idxd)) {
> +		IDXD_PMD_WARN("WQ %d already enabled", idxd->qid);
> +		return 0;
> +	}
> +
> +	if (idxd->desc_ring == NULL) {
> +		IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid);
> +		return -EINVAL;
> +	}
> +
> +	err_code = idxd_pci_dev_command(idxd, idxd_enable_wq);
> +	if (err_code || !idxd_is_wq_enabled(idxd)) {
> +		IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x",
> +				idxd->qid, err_code);
> +		return err_code == 0 ? -1 : err_code;

rte_dma_start() specifies that a negative number is returned on failure.
Suggestion: return err_code == 0 ?
-1 : -err_code;

> +	}
> +
> +	IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid);
> +
> +	return 0;
> +}
> +
>  static const struct rte_dma_dev_ops idxd_pci_ops = {
>  	.dev_dump = idxd_dump,
>  	.dev_configure = idxd_configure,
>  	.vchan_setup = idxd_vchan_setup,
>  	.dev_info_get = idxd_info_get,
> +	.dev_start = idxd_pci_dev_start,
> +	.dev_stop = idxd_pci_dev_stop,
>  };
>
>  /* each portal uses 4 x 4k pages */
>

^ permalink raw reply	[flat|nested] 243+ messages in thread
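The two review comments on the return values come down to one rule: a failed work queue command must never produce 0 or a positive value. A minimal sketch of that rule follows. The helper name and its parameters are hypothetical and do not exist in the driver; the -1 fallback simply follows the reviewer's suggestion, and whether the raw hardware code should instead be translated to an errno value is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn the outcome of a WQ enable/disable command into
 * the "0 on success, negative on failure" convention.
 *
 * err_code: status byte reported by the device command (0 = none reported)
 * state_ok: true if the work queue ended up in the expected state
 */
int
sketch_wq_cmd_result(uint8_t err_code, bool state_ok)
{
	if (err_code != 0)
		return -(int)err_code; /* negate so failures are always < 0 */
	if (!state_ok)
		return -1;             /* no code reported, but state is wrong */
	return 0;
}
```

In this sketch both the start and the stop paths could share the helper, since each one checks an error code and then re-reads the queue state.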
* [dpdk-dev] [PATCH v5 09/16] dma/idxd: add data-path job submission functions 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (7 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-20 10:30 ` Bruce Richardson 2021-09-22 3:22 ` fengchengwen 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 10/16] dma/idxd: add data-path job completion functions Kevin Laatz ` (6 subsequent siblings) 15 siblings, 2 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data path functions for enqueuing and submitting operations to DSA devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 136 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 ++ drivers/dma/idxd/meson.build | 1 + 4 files changed, 206 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index a603c5dd22..7835461a22 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -153,3 +153,67 @@ The following code shows how the device is configured in Once configured, the device can then be made ready for use by calling the ``rte_dma_start()`` API. + +Performing Data Copies +~~~~~~~~~~~~~~~~~~~~~~~ + +To perform data copies using IDXD dmadev devices, descriptors should be enqueued +using the ``rte_dma_copy()`` API. The HW can be triggered to perform the copy +in two ways, either via a ``RTE_DMA_OP_FLAG_SUBMIT`` flag or by calling +``rte_dma_submit()``. 
Once copies have been completed, the completion will +be reported back when the application calls ``rte_dma_completed()`` or +``rte_dma_completed_status()``. The latter will also report the status of each +completed operation. + +The ``rte_dma_copy()`` function enqueues a single copy to the device ring for +copying at a later point. The parameters to that function include the IOVA addresses +of both the source and destination buffers, as well as the length of the copy. + +The ``rte_dma_copy()`` function enqueues a copy operation on the device ring. +If the ``RTE_DMA_OP_FLAG_SUBMIT`` flag is set when calling ``rte_dma_copy()``, +the device hardware will be informed of the elements. Alternatively, if the flag +is not set, the application needs to call the ``rte_dma_submit()`` function to +notify the device hardware. Once the device hardware is informed of the elements +enqueued on the ring, the device will begin to process them. It is expected +that, for efficiency reasons, a burst of operations will be enqueued to the +device via multiple enqueue calls between calls to the ``rte_dma_submit()`` +function. + +The following code demonstrates how to enqueue a burst of copies to the +device and start the hardware processing of them: + +.. 
code-block:: C + + struct rte_mbuf *srcs[COMP_BURST_SZ], *dsts[COMP_BURST_SZ]; + unsigned int i, j; + + for (i = 0; i < RTE_DIM(srcs); i++) { + uint64_t *src_data; + + srcs[i] = rte_pktmbuf_alloc(pool); + dsts[i] = rte_pktmbuf_alloc(pool); + if (srcs[i] == NULL || dsts[i] == NULL) { + PRINT_ERR("Error allocating buffers\n"); + return -1; + } + src_data = rte_pktmbuf_mtod(srcs[i], uint64_t *); + + for (j = 0; j < COPY_LEN/sizeof(uint64_t); j++) + src_data[j] = rte_rand(); + + if (rte_dma_copy(dev_id, vchan, srcs[i]->buf_iova + srcs[i]->data_off, + dsts[i]->buf_iova + dsts[i]->data_off, COPY_LEN, 0) < 0) { + PRINT_ERR("Error with rte_dma_copy for buffer %u\n", i); + return -1; + } + } + rte_dma_submit(dev_id, vchan); + +Filling an Area of Memory +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The IDXD driver also has support for the ``fill`` operation, where an area +of memory is overwritten, or filled, with a short pattern of data. +Fill operations can be performed in much the same way as copy operations +described above, just using the ``rte_dma_fill()`` function rather than the +``rte_dma_copy()`` function.
diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 2c222708cf..b01edeab07 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,14 +2,144 @@ * Copyright 2021 Intel Corporation */ +#include <x86intrin.h> + #include <rte_dmadev_pmd.h> #include <rte_malloc.h> #include <rte_common.h> +#include <rte_prefetch.h> #include "idxd_internal.h" #define IDXD_PMD_NAME_STR "dmadev_idxd" +static __rte_always_inline rte_iova_t +__desc_idx_to_iova(struct idxd_dmadev *idxd, uint16_t n) +{ + return idxd->desc_iova + (n * sizeof(struct idxd_hw_desc)); +} + +static __rte_always_inline void +__idxd_movdir64b(volatile void *dst, const struct idxd_hw_desc *src) +{ + asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : + : "a" (dst), "d" (src) + : "memory"); +} + +static __rte_always_inline void +__submit(struct idxd_dmadev *idxd) +{ + rte_prefetch1(&idxd->batch_comp_ring[idxd->batch_idx_read]); + + if (idxd->batch_size == 0) + return; + + /* write completion to batch comp ring */ + rte_iova_t comp_addr = idxd->batch_iova + + (idxd->batch_idx_write * sizeof(struct idxd_completion)); + + if (idxd->batch_size == 1) { + /* submit batch directly */ + struct idxd_hw_desc desc = + idxd->desc_ring[idxd->batch_start & idxd->desc_ring_mask]; + desc.completion = comp_addr; + desc.op_flags |= IDXD_FLAG_REQUEST_COMPLETION; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &desc); + } else { + const struct idxd_hw_desc batch_desc = { + .op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_COMPLETION_ADDR_VALID | + IDXD_FLAG_REQUEST_COMPLETION, + .desc_addr = __desc_idx_to_iova(idxd, + idxd->batch_start & idxd->desc_ring_mask), + .completion = comp_addr, + .size = idxd->batch_size, + }; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &batch_desc); + } + + if (++idxd->batch_idx_write > idxd->max_batches) + idxd->batch_idx_write = 0; + + 
idxd->batch_start += idxd->batch_size; + idxd->batch_size = 0; + idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; + _mm256_store_si256((void *)&idxd->batch_comp_ring[idxd->batch_idx_write], + _mm256_setzero_si256()); +} + +static __rte_always_inline int +__idxd_write_desc(struct rte_dma_dev *dev, + const uint32_t op_flags, + const rte_iova_t src, + const rte_iova_t dst, + const uint32_t size, + const uint32_t flags) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t mask = idxd->desc_ring_mask; + uint16_t job_id = idxd->batch_start + idxd->batch_size; + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; + + /* first check batch ring space then desc ring space */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return -1; + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) + return -1; + + /* write desc. Note: descriptors don't wrap, but the completion address does */ + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, + _mm256_set_epi64x(0, 0, 0, size)); + + idxd->batch_size++; + + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) + __submit(idxd); + + return job_id; +} + +int +idxd_enqueue_copy(struct rte_dma_dev *dev, uint16_t qid __rte_unused, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, + * but check it at compile time to be sure. 
+ */ + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, memmove, src, dst, length, flags); +} + +int +idxd_enqueue_fill(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + uint32_t fill = (idxd_op_fill << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, fill, pattern, dst, length, flags); +} + +int +idxd_submit(struct rte_dma_dev *dev, uint16_t qid __rte_unused) +{ + __submit(dev->dev_private); + return 0; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -141,6 +271,12 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->dev_ops = ops; dmadev->device = dev; + dmadev->copy = idxd_enqueue_copy; + dmadev->fill = idxd_enqueue_fill; + dmadev->submit = idxd_submit; + dmadev->completed = idxd_completed; + dmadev->completed_status = idxd_completed_status; + idxd = rte_malloc_socket(NULL, sizeof(struct idxd_dmadev), 0, dev->numa_node); if (idxd == NULL) { IDXD_PMD_ERR("Unable to allocate memory for device"); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index fdd018ca35..b66c2d0182 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -88,5 +88,10 @@ int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, uint32_t size); +int idxd_enqueue_copy(struct rte_dma_dev *dev, uint16_t qid, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_enqueue_fill(struct rte_dma_dev *dev, uint16_t qid, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_submit(struct rte_dma_dev *dev, uint16_t 
qid); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 36dbd3e518..acb1b10618 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -6,6 +6,7 @@ if is_windows endif deps += ['bus_pci'] +cflags += '-mavx2' # all platforms with idxd HW support AVX sources = files( 'idxd_bus.c', 'idxd_common.c', -- 2.30.2
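For readers tracing the index arithmetic in __idxd_write_desc() and __desc_idx_to_iova() in the patch above, the following is a minimal standalone sketch, not driver code. The struct ring_model type, the model_* helpers and the 64-byte MODEL_DESC_SIZE are assumptions made purely for illustration (the patch itself uses sizeof(struct idxd_hw_desc)). It shows that job ids are free-running 16-bit values, while only the batch start is masked into the ring, so the write index can run past the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration only: 64-byte hardware descriptors. */
#define MODEL_DESC_SIZE 64u

/* Hypothetical, simplified stand-in for the relevant idxd_dmadev fields. */
struct ring_model {
    uint64_t desc_iova;      /* IOVA of the descriptor ring base */
    uint16_t desc_ring_mask; /* ring size - 1, ring size is a power of two */
    uint16_t batch_start;    /* free-running id of the first job in the open batch */
    uint16_t batch_size;     /* jobs enqueued in the open batch so far */
};

/* Job id handed back to the caller: free-running, wraps at 65536. */
static inline uint16_t
model_job_id(const struct ring_model *r)
{
    return (uint16_t)(r->batch_start + r->batch_size);
}

/* Slot the next descriptor is written to: only the batch start is masked,
 * so the result may exceed desc_ring_mask (batches are not wrapped).
 */
static inline uint16_t
model_write_idx(const struct ring_model *r)
{
    /* the mask must describe a power-of-two ring */
    assert((r->desc_ring_mask & (r->desc_ring_mask + 1)) == 0);
    return (uint16_t)((r->batch_start & r->desc_ring_mask) + r->batch_size);
}

/* IOVA of descriptor slot n, same shape as __desc_idx_to_iova(). */
static inline uint64_t
model_desc_iova(const struct ring_model *r, uint16_t n)
{
    return r->desc_iova + (uint64_t)n * MODEL_DESC_SIZE;
}
```

Since the write index is allowed to run past the mask, the descriptor ring presumably has slack beyond the masked range. That is an inference from the "we never wrap batches" comment in the patch, not something this excerpt confirms.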
* Re: [dpdk-dev] [PATCH v5 09/16] dma/idxd: add data-path job submission functions 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 09/16] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-09-20 10:30 ` Bruce Richardson 2021-09-22 3:22 ` fengchengwen 1 sibling, 0 replies; 243+ messages in thread From: Bruce Richardson @ 2021-09-20 10:30 UTC (permalink / raw) To: Kevin Laatz; +Cc: dev, fengchengwen, jerinj, conor.walsh On Fri, Sep 17, 2021 at 03:24:30PM +0000, Kevin Laatz wrote: > Add data path functions for enqueuing and submitting operations to DSA > devices. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- > doc/guides/dmadevs/idxd.rst | 64 +++++++++++++++ > drivers/dma/idxd/idxd_common.c | 136 +++++++++++++++++++++++++++++++ > drivers/dma/idxd/idxd_internal.h | 5 ++ > drivers/dma/idxd/meson.build | 1 + > 4 files changed, 206 insertions(+) > > diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst > index a603c5dd22..7835461a22 100644 > --- a/doc/guides/dmadevs/idxd.rst > +++ b/doc/guides/dmadevs/idxd.rst > @@ -153,3 +153,67 @@ The following code shows how the device is configured in > > Once configured, the device can then be made ready for use by calling the > ``rte_dma_start()`` API. > + > +Performing Data Copies > +~~~~~~~~~~~~~~~~~~~~~~~ > + > +To perform data copies using IDXD dmadev devices, descriptors should be enqueued > +using the ``rte_dma_copy()`` API. The HW can be triggered to perform the copy > +in two ways, either via a ``RTE_DMA_OP_FLAG_SUBMIT`` flag or by calling > +``rte_dma_submit()``. Once copies have been completed, the completion will > +be reported back when the application calls ``rte_dma_completed()`` or > +``rte_dma_completed_status()``. The latter will also report the status of each > +completed operation. 
> + > +The ``rte_dma_copy()`` function enqueues a single copy to the device ring for > +copying at a later point. The parameters to that function include the IOVA addresses > +of both the source and destination buffers, as well as the length of the copy. > + > +The ``rte_dma_copy()`` function enqueues a copy operation on the device ring. > +If the ``RTE_DMA_OP_FLAG_SUBMIT`` flag is set when calling ``rte_dma_copy()``, > +the device hardware will be informed of the elements. Alternatively, if the flag > +is not set, the application needs to call the ``rte_dma_submit()`` function to > +notify the device hardware. Once the device hardware is informed of the elements > +enqueued on the ring, the device will begin to process them. It is expected > +that, for efficiency reasons, a burst of operations will be enqueued to the > +device via multiple enqueue calls between calls to the ``rte_dma_submit()`` > +function. > + > +The following code demonstrates how to enqueue a burst of copies to the > +device and start the hardware processing of them: > + > +.. 
code-block:: C > + > + struct rte_mbuf *srcs[COMP_BURST_SZ], *dsts[COMP_BURST_SZ]; > + unsigned int i; > + > + for (i = 0; i < RTE_DIM(srcs); i++) { > + uint64_t *src_data; > + > + srcs[i] = rte_pktmbuf_alloc(pool); > + dsts[i] = rte_pktmbuf_alloc(pool); > + src_data = rte_pktmbuf_mtod(srcs[i], uint64_t *); > + if (srcs[i] == NULL || dsts[i] == NULL) { > + PRINT_ERR("Error allocating buffers\n"); > + return -1; > + } > + > + for (j = 0; j < COPY_LEN/sizeof(uint64_t); j++) > + src_data[j] = rte_rand(); > + > + if (rte_dma_copy(dev_id, vchan, srcs[i]->buf_iova + srcs[i]->data_off, > + dsts[i]->buf_iova + dsts[i]->data_off, COPY_LEN, 0) < 0) { > + PRINT_ERR("Error with rte_dma_copy for buffer %u\n", i); > + return -1; > + } > + } > + rte_dma_submit(dev_id, vchan); > + I think this code block is larger than necessary, because it shows buffer allocation and initialization rather than just the basics of copy() and submit() APIs. Furthermore, rather than calling out the generic API use in the idxd-specific docs, can we just include a reference to the dmadev documentation? /Bruce ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v5 09/16] dma/idxd: add data-path job submission functions 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 09/16] dma/idxd: add data-path job submission functions Kevin Laatz 2021-09-20 10:30 ` Bruce Richardson @ 2021-09-22 3:22 ` fengchengwen 1 sibling, 0 replies; 243+ messages in thread From: fengchengwen @ 2021-09-22 3:22 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, jerinj, conor.walsh On 2021/9/17 23:24, Kevin Laatz wrote: > Add data path functions for enqueuing and submitting operations to DSA > devices. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- > doc/guides/dmadevs/idxd.rst | 64 +++++++++++++++ > drivers/dma/idxd/idxd_common.c | 136 +++++++++++++++++++++++++++++++ > drivers/dma/idxd/idxd_internal.h | 5 ++ > drivers/dma/idxd/meson.build | 1 + > 4 files changed, 206 insertions(+) > [snip] > + > +static __rte_always_inline int > +__idxd_write_desc(struct rte_dma_dev *dev, > + const uint32_t op_flags, > + const rte_iova_t src, > + const rte_iova_t dst, > + const uint32_t size, > + const uint32_t flags) > +{ > + struct idxd_dmadev *idxd = dev->dev_private; > + uint16_t mask = idxd->desc_ring_mask; > + uint16_t job_id = idxd->batch_start + idxd->batch_size; > + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ > + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; > + > + /* first check batch ring space then desc ring space */ > + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || > + idxd->batch_idx_write + 1 == idxd->batch_idx_read) > + return -1; > + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) > + return -1; Please return -ENOSPC when the ring is full. > + > + /* write desc. 
Note: descriptors don't wrap, but the completion address does */ > + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; > + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); > + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], > + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); > + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, > + _mm256_set_epi64x(0, 0, 0, size)); > + > + idxd->batch_size++; > + > + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); > + > + if (flags & RTE_DMA_OP_FLAG_SUBMIT) > + __submit(idxd); > + > + return job_id; > +} > + > +int > +idxd_enqueue_copy(struct rte_dma_dev *dev, uint16_t qid __rte_unused, rte_iova_t src, > + rte_iova_t dst, unsigned int length, uint64_t flags) > +{ > + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, > + * but check it at compile time to be sure. > + */ > + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); > + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | > + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); > + return __idxd_write_desc(dev, memmove, src, dst, length, flags); > +} > + [snip]
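To make the -ENOSPC request in this review concrete, here is a hedged sketch of the batch-ring space check with that return value. It is a simplified standalone model, not the driver: struct batch_ring_model and the model_* names are hypothetical, and only the two "full" conditions and the wrap rule are taken from the quoted patch (__idxd_write_desc() and __submit()).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model: batch ring indexes run over [0, max_batches]. */
struct batch_ring_model {
    uint16_t batch_idx_read;
    uint16_t batch_idx_write;
    uint16_t max_batches;
};

/* Returns 0 if another batch slot is free, -ENOSPC if the ring is full. */
static inline int
model_batch_space(const struct batch_ring_model *b)
{
    /* full when advancing the write index (with wrap) would hit the read index */
    if ((b->batch_idx_read == 0 && b->batch_idx_write == b->max_batches) ||
            b->batch_idx_write + 1 == b->batch_idx_read)
        return -ENOSPC;
    return 0;
}

/* Advance the write index using the same wrap rule as __submit(). */
static inline void
model_batch_advance(struct batch_ring_model *b)
{
    assert(b->batch_idx_write <= b->max_batches);
    if (++b->batch_idx_write > b->max_batches)
        b->batch_idx_write = 0;
}
```

A caller can then distinguish "ring full, retry later" from other failures by checking for -ENOSPC, which seems to be the point of the comment.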
* [dpdk-dev] [PATCH v5 10/16] dma/idxd: add data-path job completion functions 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (8 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 09/16] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-20 10:36 ` Bruce Richardson 2021-09-22 3:47 ` fengchengwen 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 11/16] dma/idxd: add operation statistic tracking Kevin Laatz ` (5 subsequent siblings) 15 siblings, 2 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the data path functions for gathering completed operations. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v2: - fixed typo in docs - add completion status for invalid opcode --- doc/guides/dmadevs/idxd.rst | 32 +++++ drivers/dma/idxd/idxd_common.c | 235 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 + 3 files changed, 272 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 7835461a22..f942a8aa44 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -209,6 +209,38 @@ device and start the hardware processing of them: } rte_dma_submit(dev_id, vchan); +To retrieve information about completed copies, ``rte_dma_completed()`` and +``rte_dma_completed_status()`` APIs should be used. ``rte_dma_completed()`` +will return the number of completed operations, along with the index of the last +successful completed operation and whether or not an error was encountered. 
If an +error was encountered, ``rte_dma_completed_status()`` must be used to kick the +device off to continue processing operations and also to gather the status of each +individual operation, which is filled in to the ``status`` array provided as a +parameter by the application. + +The following status codes are supported by IDXD: +* ``RTE_DMA_STATUS_SUCCESSFUL``: The operation was successful. +* ``RTE_DMA_STATUS_INVALID_OPCODE``: The operation failed due to an invalid operation code. +* ``RTE_DMA_STATUS_INVALID_LENGTH``: The operation failed due to an invalid data length. +* ``RTE_DMA_STATUS_NOT_ATTEMPTED``: The operation was not attempted. +* ``RTE_DMA_STATUS_ERROR_UNKNOWN``: The operation failed due to an unspecified error. + +The following code shows how to retrieve the number of successfully completed +copies within a burst and then use ``rte_dma_completed_status()`` to check +which operation failed and kick off the device to continue processing operations: + +.. code-block:: C + + enum rte_dma_status_code status[COMP_BURST_SZ]; + uint16_t count, idx, status_count; + bool error = false; + + count = rte_dma_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); + + if (error) { + status_count = rte_dma_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status); + } + Filling an Area of Memory ~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index b01edeab07..a061a956c2 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -140,6 +140,241 @@ idxd_submit(struct rte_dma_dev *dev, uint16_t qid __rte_unused) return 0; } +static enum rte_dma_status_code +get_comp_status(struct idxd_completion *c) +{ + uint8_t st = c->status; + switch (st) { + /* successful descriptors are not written back normally */ + case IDXD_COMP_STATUS_INCOMPLETE: + case IDXD_COMP_STATUS_SUCCESS: + return RTE_DMA_STATUS_SUCCESSFUL; + case IDXD_COMP_STATUS_INVALID_OPCODE: + return RTE_DMA_STATUS_INVALID_OPCODE; +
case IDXD_COMP_STATUS_INVALID_SIZE: + return RTE_DMA_STATUS_INVALID_LENGTH; + case IDXD_COMP_STATUS_SKIPPED: + return RTE_DMA_STATUS_NOT_ATTEMPTED; + default: + return RTE_DMA_STATUS_ERROR_UNKNOWN; + } +} + +static __rte_always_inline int +batch_ok(struct idxd_dmadev *idxd, uint8_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t ret; + uint8_t bstatus; + + if (max_ops == 0) + return 0; + + /* first check if there are any unreturned handles from last time */ + if (idxd->ids_avail != idxd->ids_returned) { + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + + if (idxd->batch_idx_read == idxd->batch_idx_write) + return 0; + + bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status; + /* now check if next batch is complete and successful */ + if (bstatus == IDXD_COMP_STATUS_SUCCESS) { + /* since the batch idx ring stores the start of each batch, pre-increment to lookup + * start of next batch. + */ + if (++idxd->batch_idx_read > idxd->max_batches) + idxd->batch_idx_read = 0; + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; + + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + /* check if batch is incomplete */ + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) + return 0; + + return -1; /* error case */ +} + +static inline uint16_t +batch_completed(struct idxd_dmadev *idxd, uint8_t max_ops, bool *has_error) +{ + uint16_t i; + uint16_t b_start, b_end, next_batch; + + int ret = batch_ok(idxd, max_ops, NULL); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ * once we identify the actual failure job, return other jobs, then update + * the batch ring indexes to make it look like the first job of the batch has failed. + * Subsequent calls here will always return zero packets, and the error must be cleared by + * calling the completed_status() function. + */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + b_end = idxd->batch_idx_ring[next_batch]; + + if (b_end - b_start == 1) { /* not a batch */ + *has_error = true; + return 0; + } + + for (i = b_start; i < b_end; i++) { + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ + break; + } + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); + if (ret < max_ops) + *has_error = true; /* we got up to the point of error */ + idxd->ids_avail = idxd->ids_returned += ret; + + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { + /* copy over the descriptor status to the batch ring as if no batch */ + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; + } + + return ret; +} + +static uint16_t +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t next_batch; + + int ret = batch_ok(idxd, max_ops, status); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; + const uint16_t b_len = b_end - b_start; + if (b_len == 1) {/* not a batch */ + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + idxd->ids_avail++; + idxd->ids_returned++; + idxd->batch_idx_read = next_batch; + return 1; + } + + /* not a single-element batch, need to process more. + * Scenarios: + * 1. max_ops >= batch_size - can fit everything, simple case + * - loop through completed ops and then add on any not-attempted ones + * 2. max_ops < batch_size - can't fit everything, more complex case + * - loop through completed/incomplete and stop when hit max_ops + * - adjust the batch descriptor to update where we stopped, with appropriate bcount + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as + * non-batch next time. + */ + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; + for (ret = 0; ret < b_len && ret < max_ops; ret++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + } + idxd->ids_avail = idxd->ids_returned += ret; + + /* everything fit */ + if (ret == b_len) { + idxd->batch_idx_read = next_batch; + return ret; + } + + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (ret > bcount) { + /* we have only incomplete ones - set batch completed size to 0 */ + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size = 0; + /* if there is only one descriptor left, job skipped so set flag appropriately */ + if (b_len - ret == 1) + comp->status = IDXD_COMP_STATUS_SKIPPED; + } else { + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size -= ret; + /* if there is only one descriptor left, copy status info straight to desc */ + if (comp->completed_size == 1) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + comp->status = c->status; + /* individual descs can be ok without writeback, but not batches */ + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) + comp->status = IDXD_COMP_STATUS_SUCCESS; + } else if (bcount == b_len) { + /* check if we still have an error, and clear flag if not */ + uint16_t i; + for (i = b_start + ret; i < b_end; i++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) + break; + } + if (i == b_end) /* no errors */ + comp->status = IDXD_COMP_STATUS_SUCCESS; + } + } + + return ret; +} + +uint16_t +idxd_completed(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, bool *has_error) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed(idxd, max_ops - ret, has_error); + ret += batch; + } while (batch > 0 && *has_error == false); + + *last_idx = 
idxd->ids_returned - 1; + return ret; +} + +uint16_t +idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, enum rte_dma_status_code *status) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + uint16_t batch, ret = 0; + + do { + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); + ret += batch; + } while (batch > 0); + + *last_idx = idxd->ids_returned - 1; + return ret; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index b66c2d0182..15115a0966 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -93,5 +93,10 @@ int idxd_enqueue_copy(struct rte_dma_dev *dev, uint16_t qid, rte_iova_t src, int idxd_enqueue_fill(struct rte_dma_dev *dev, uint16_t qid, uint64_t pattern, rte_iova_t dst, unsigned int length, uint64_t flags); int idxd_submit(struct rte_dma_dev *dev, uint16_t qid); +uint16_t idxd_completed(struct rte_dma_dev *dev, uint16_t qid, uint16_t max_ops, + uint16_t *last_idx, bool *has_error); +uint16_t idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, + uint16_t max_ops, uint16_t *last_idx, + enum rte_dma_status_code *status); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
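The documentation text in this patch says rte_dma_completed() reports the number of completed operations plus whether an error was encountered. A minimal standalone sketch of that contract, as a sequential scan that stops at the first failure, might look like the following. This is an illustrative model only: enum model_status and model_completed() are hypothetical names, and the real driver works on batch and descriptor completion records rather than a flat array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-operation result, standing in for hardware completion records. */
enum model_status { MODEL_OK = 0, MODEL_FAILED = 1 };

/*
 * Scan up to max_ops results in order, stopping at the first failure.
 * Returns the number of leading successes; *has_error is set only when
 * the scan was stopped by a failed operation.
 */
static inline uint16_t
model_completed(const enum model_status *ops, uint16_t nb_ops,
        uint16_t max_ops, bool *has_error)
{
    uint16_t i;

    assert(ops != NULL && has_error != NULL);
    *has_error = false;
    for (i = 0; i < nb_ops && i < max_ops; i++) {
        if (ops[i] != MODEL_OK) {
            *has_error = true;
            break;
        }
    }
    return i;
}
```

With this shape a caller can see a non-zero count and an error flag from the same call, so both need to be handled.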
* Re: [dpdk-dev] [PATCH v5 10/16] dma/idxd: add data-path job completion functions 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 10/16] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-09-20 10:36 ` Bruce Richardson 2021-09-22 3:47 ` fengchengwen 1 sibling, 0 replies; 243+ messages in thread From: Bruce Richardson @ 2021-09-20 10:36 UTC (permalink / raw) To: Kevin Laatz; +Cc: dev, fengchengwen, jerinj, conor.walsh On Fri, Sep 17, 2021 at 03:24:31PM +0000, Kevin Laatz wrote: > Add the data path functions for gathering completed operations. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > > --- > v2: > - fixed typo in docs > - add completion status for invalid opcode > --- > doc/guides/dmadevs/idxd.rst | 32 +++++ > drivers/dma/idxd/idxd_common.c | 235 +++++++++++++++++++++++++++++++ > drivers/dma/idxd/idxd_internal.h | 5 + > 3 files changed, 272 insertions(+) > > diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst > index 7835461a22..f942a8aa44 100644 > --- a/doc/guides/dmadevs/idxd.rst > +++ b/doc/guides/dmadevs/idxd.rst > @@ -209,6 +209,38 @@ device and start the hardware processing of them: > } > rte_dma_submit(dev_id, vchan); > > +To retrieve information about completed copies, ``rte_dma_completed()`` and > +``rte_dma_completed_status()`` APIs should be used. ``rte_dma_completed()`` > +will return the number of completed operations, along with the index of the last > +successful completed operation and whether or not an error was encountered. If an > +error was encountered, ``rte_dma_completed_status()`` must be used to kick the > +device off to continue processing operations and also to gather the status of each > +individual operations which is filled in to the ``status`` array provided as > +parameter by the application. 
> + > +The following status codes are supported by IDXD: > +* ``RTE_DMA_STATUS_SUCCESSFUL``: The operation was successful. > +* ``RTE_DMA_STATUS_INVALID_OPCODE``: The operation failed due to an invalid operation code. > +* ``RTE_DMA_STATUS_INVALID_LENGTH``: The operation failed due to an invalid data length. > +* ``RTE_DMA_STATUS_NOT_ATTEMPTED``: The operation was not attempted. > +* ``RTE_DMA_STATUS_ERROR_UNKNOWN``: The operation failed due to an unspecified error. > + > +The following code shows how to retrieve the number of successfully completed > +copies within a burst and then using ``rte_dma_completed_status()`` to check > +which operation failed and kick off the device to continue processing operations: > + > +.. code-block:: C > + > + enum rte_dma_status_code status[COMP_BURST_SZ]; > + uint16_t count, idx, status_count; > + bool error = 0; > + > + count = rte_dma_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); > + > + if (error){ > + status_count = rte_dma_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status); > + } > + As with some of the other documentation text, it should be checked for overlap with the dmadev documentation, and merged with that if appropriate. /Bruce
* Re: [dpdk-dev] [PATCH v5 10/16] dma/idxd: add data-path job completion functions 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 10/16] dma/idxd: add data-path job completion functions Kevin Laatz 2021-09-20 10:36 ` Bruce Richardson @ 2021-09-22 3:47 ` fengchengwen 1 sibling, 0 replies; 243+ messages in thread From: fengchengwen @ 2021-09-22 3:47 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, jerinj, conor.walsh On 2021/9/17 23:24, Kevin Laatz wrote: > Add the data path functions for gathering completed operations. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > > --- > v2: > - fixed typo in docs > - add completion status for invalid opcode > --- > doc/guides/dmadevs/idxd.rst | 32 +++++ > drivers/dma/idxd/idxd_common.c | 235 +++++++++++++++++++++++++++++++ > drivers/dma/idxd/idxd_internal.h | 5 + > 3 files changed, 272 insertions(+) > > diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst > index 7835461a22..f942a8aa44 100644 > --- a/doc/guides/dmadevs/idxd.rst > +++ b/doc/guides/dmadevs/idxd.rst > @@ -209,6 +209,38 @@ device and start the hardware processing of them: > } > rte_dma_submit(dev_id, vchan); > > +To retrieve information about completed copies, ``rte_dma_completed()`` and > +``rte_dma_completed_status()`` APIs should be used. ``rte_dma_completed()`` > +will return the number of completed operations, along with the index of the last > +successful completed operation and whether or not an error was encountered. If an > +error was encountered, ``rte_dma_completed_status()`` must be used to kick the > +device off to continue processing operations and also to gather the status of each > +individual operations which is filled in to the ``status`` array provided as > +parameter by the application. 
> + > +The following status codes are supported by IDXD: > +* ``RTE_DMA_STATUS_SUCCESSFUL``: The operation was successful. > +* ``RTE_DMA_STATUS_INVALID_OPCODE``: The operation failed due to an invalid operation code. > +* ``RTE_DMA_STATUS_INVALID_LENGTH``: The operation failed due to an invalid data length. > +* ``RTE_DMA_STATUS_NOT_ATTEMPTED``: The operation was not attempted. > +* ``RTE_DMA_STATUS_ERROR_UNKNOWN``: The operation failed due to an unspecified error. > + > +The following code shows how to retrieve the number of successfully completed > +copies within a burst and then using ``rte_dma_completed_status()`` to check > +which operation failed and kick off the device to continue processing operations: > + > +.. code-block:: C > + > + enum rte_dma_status_code status[COMP_BURST_SZ]; > + uint16_t count, idx, status_count; > + bool error = 0; > + > + count = rte_dma_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); > + > + if (error){ > + status_count = rte_dma_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status); > + } Because it's a sequential scan to get the completion status, so the count may >0 even error was true. e.g. request 1 successful request 2 successful request 3 fail so the rte_dma_completed(dev_id, vchan, 3, &idx, &error) will return 2 and error was mark with true. If return 0 and error with true in above situation, this means that the driver must scans all completions first. which will lead to low performance. 
Therefore, the recommended handling is as follows: count = rte_dma_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); if (likely(count > 0)) { // process the successfully completed operations } if (unlikely(error)) { // handle the error, e.g. by calling rte_dma_completed_status() } > + > Filling an Area of Memory > ~~~~~~~~~~~~~~~~~~~~~~~~~~ > > diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c > index b01edeab07..a061a956c2 100644 > --- a/drivers/dma/idxd/idxd_common.c > +++ b/drivers/dma/idxd/idxd_common.c > @@ -140,6 +140,241 @@ idxd_submit(struct rte_dma_dev *dev, uint16_t qid __rte_unused) > return 0; > } > > +static enum rte_dma_status_code > +get_comp_status(struct idxd_completion *c) > +{ > + uint8_t st = c->status; > + switch (st) { > + /* successful descriptors are not written back normally */ > + case IDXD_COMP_STATUS_INCOMPLETE: > + case IDXD_COMP_STATUS_SUCCESS: > + return RTE_DMA_STATUS_SUCCESSFUL; > + case IDXD_COMP_STATUS_INVALID_OPCODE: > + return RTE_DMA_STATUS_INVALID_OPCODE; > + case IDXD_COMP_STATUS_INVALID_SIZE: > + return RTE_DMA_STATUS_INVALID_LENGTH; > + case IDXD_COMP_STATUS_SKIPPED: > + return RTE_DMA_STATUS_NOT_ATTEMPTED; > + default: > + return RTE_DMA_STATUS_ERROR_UNKNOWN; > + } > +} > + > +static __rte_always_inline int > +batch_ok(struct idxd_dmadev *idxd, uint8_t max_ops, enum rte_dma_status_code *status) Should this be uint16_t max_ops rather than uint8_t max_ops?
> +{ > + uint16_t ret; > + uint8_t bstatus; > + > + if (max_ops == 0) > + return 0; > + > + /* first check if there are any unreturned handles from last time */ > + if (idxd->ids_avail != idxd->ids_returned) { > + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); > + idxd->ids_returned += ret; > + if (status) > + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); > + return ret; > + } > + > + if (idxd->batch_idx_read == idxd->batch_idx_write) > + return 0; > + > + bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status; > + /* now check if next batch is complete and successful */ > + if (bstatus == IDXD_COMP_STATUS_SUCCESS) { > + /* since the batch idx ring stores the start of each batch, pre-increment to lookup > + * start of next batch. > + */ > + if (++idxd->batch_idx_read > idxd->max_batches) > + idxd->batch_idx_read = 0; > + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; > + > + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); > + idxd->ids_returned += ret; > + if (status) > + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); > + return ret; > + } > + /* check if batch is incomplete */ > + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) > + return 0; > + > + return -1; /* error case */ > +} > + > +static inline uint16_t > +batch_completed(struct idxd_dmadev *idxd, uint8_t max_ops, bool *has_error) uint8_t max_ops -> uint16_t max_ops ? > +{ > + uint16_t i; > + uint16_t b_start, b_end, next_batch; > + > + int ret = batch_ok(idxd, max_ops, NULL); > + if (ret >= 0) > + return ret; > + > + /* ERROR case, not successful, not incomplete */ > + /* Get the batch size, and special case size 1. > + * once we identify the actual failure job, return other jobs, then update > + * the batch ring indexes to make it look like the first job of the batch has failed. 
> + * Subsequent calls here will always return zero packets, and the error must be cleared by > + * calling the completed_status() function. > + */ > + next_batch = (idxd->batch_idx_read + 1); > + if (next_batch > idxd->max_batches) > + next_batch = 0; > + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; > + b_end = idxd->batch_idx_ring[next_batch]; > + > + if (b_end - b_start == 1) { /* not a batch */ > + *has_error = true; > + return 0; > + } > + > + for (i = b_start; i < b_end; i++) { > + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; > + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ > + break; > + } > + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); > + if (ret < max_ops) > + *has_error = true; /* we got up to the point of error */ > + idxd->ids_avail = idxd->ids_returned += ret; > + > + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ > + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; > + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; > + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { > + /* copy over the descriptor status to the batch ring as if no batch */ > + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; > + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; > + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; > + } > + > + return ret; > +} > + > +static uint16_t > +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) > +{ > + uint16_t next_batch; > + > + int ret = batch_ok(idxd, max_ops, status); > + if (ret >= 0) > + return ret; > + > + /* ERROR case, not successful, not incomplete */ > + /* Get the batch size, and special case size 1. 
> + */ > + next_batch = (idxd->batch_idx_read + 1); > + if (next_batch > idxd->max_batches) > + next_batch = 0; > + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; > + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; > + const uint16_t b_len = b_end - b_start; > + if (b_len == 1) {/* not a batch */ > + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); > + idxd->ids_avail++; > + idxd->ids_returned++; > + idxd->batch_idx_read = next_batch; > + return 1; > + } > + > + /* not a single-element batch, need to process more. > + * Scenarios: > + * 1. max_ops >= batch_size - can fit everything, simple case > + * - loop through completed ops and then add on any not-attempted ones > + * 2. max_ops < batch_size - can't fit everything, more complex case > + * - loop through completed/incomplete and stop when hit max_ops > + * - adjust the batch descriptor to update where we stopped, with appropriate bcount > + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as > + * non-batch next time. > + */ > + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; > + for (ret = 0; ret < b_len && ret < max_ops; ret++) { > + struct idxd_completion *c = (void *) > + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; > + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; > + } > + idxd->ids_avail = idxd->ids_returned += ret; > + > + /* everything fit */ > + if (ret == b_len) { > + idxd->batch_idx_read = next_batch; > + return ret; > + } > + > + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ > + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; > + if (ret > bcount) { > + /* we have only incomplete ones - set batch completed size to 0 */ > + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; > + comp->completed_size = 0; > + /* if there is only one descriptor left, job skipped so set flag appropriately */ > + if (b_len - ret == 1) > + comp->status = IDXD_COMP_STATUS_SKIPPED; > + } else { > + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; > + comp->completed_size -= ret; > + /* if there is only one descriptor left, copy status info straight to desc */ > + if (comp->completed_size == 1) { > + struct idxd_completion *c = (void *) > + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; > + comp->status = c->status; > + /* individual descs can be ok without writeback, but not batches */ > + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) > + comp->status = IDXD_COMP_STATUS_SUCCESS; > + } else if (bcount == b_len) { > + /* check if we still have an error, and clear flag if not */ > + uint16_t i; > + for (i = b_start + ret; i < b_end; i++) { > + struct idxd_completion *c = (void *) > + &idxd->desc_ring[i & idxd->desc_ring_mask]; > + if (c->status > IDXD_COMP_STATUS_SUCCESS) > + break; > + } > + if (i == b_end) /* no errors */ > + comp->status = IDXD_COMP_STATUS_SUCCESS; > + } > + } > + > + return ret; > +} > + > +uint16_t > +idxd_completed(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ops, > + uint16_t *last_idx, bool *has_error) > +{ > + struct idxd_dmadev *idxd = dev->dev_private; > + uint16_t batch, ret = 0; > + > + do { > + batch = batch_completed(idxd, max_ops 
- ret, has_error); > + ret += batch; > + } while (batch > 0 && *has_error == false); > + > + *last_idx = idxd->ids_returned - 1; > + return ret; > +} > + > +uint16_t > +idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ops, > + uint16_t *last_idx, enum rte_dma_status_code *status) > +{ > + struct idxd_dmadev *idxd = dev->dev_private; > + > + uint16_t batch, ret = 0; > + > + do { > + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); > + ret += batch; > + } while (batch > 0); > + > + *last_idx = idxd->ids_returned - 1; > + return ret; > +} > + > int > idxd_dump(const struct rte_dma_dev *dev, FILE *f) > { > diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h > index b66c2d0182..15115a0966 100644 > --- a/drivers/dma/idxd/idxd_internal.h > +++ b/drivers/dma/idxd/idxd_internal.h > @@ -93,5 +93,10 @@ int idxd_enqueue_copy(struct rte_dma_dev *dev, uint16_t qid, rte_iova_t src, > int idxd_enqueue_fill(struct rte_dma_dev *dev, uint16_t qid, uint64_t pattern, > rte_iova_t dst, unsigned int length, uint64_t flags); > int idxd_submit(struct rte_dma_dev *dev, uint16_t qid); > +uint16_t idxd_completed(struct rte_dma_dev *dev, uint16_t qid, uint16_t max_ops, > + uint16_t *last_idx, bool *has_error); > +uint16_t idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, > + uint16_t max_ops, uint16_t *last_idx, > + enum rte_dma_status_code *status); > > #endif /* _IDXD_INTERNAL_H_ */ > ^ permalink raw reply [flat|nested] 243+ messages in thread
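The handling order recommended in the review above (consume the successes first, then deal with the error) can be sketched against a small mock. This is only an illustration under stated assumptions: `mock_chan`, `mock_completed()`, `mock_completed_status()`, `poll_once()` and the `MOCK_*` values are hypothetical stand-ins, not DPDK APIs, and they model only the sequential-scan contract described in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_BURST 32

/* Hypothetical status values; the real ones are enum rte_dma_status_code. */
enum mock_status { MOCK_OK = 0, MOCK_FAILED = 1 };

/* Hypothetical stand-in for a virtual channel with finished jobs queued up. */
struct mock_chan {
	const enum mock_status *jobs; /* outcome of each finished job, in order */
	uint16_t nb_jobs;
	uint16_t next;                /* first job not yet returned */
};

/* Sequential scan: return the successes seen before the first failure. */
static uint16_t
mock_completed(struct mock_chan *c, uint16_t max_ops, uint16_t *last_idx,
		bool *has_error)
{
	uint16_t n = 0;

	*has_error = false;
	while (n < max_ops && c->next < c->nb_jobs) {
		if (c->jobs[c->next] != MOCK_OK) {
			*has_error = true;
			break;
		}
		c->next++;
		n++;
	}
	*last_idx = (uint16_t)(c->next - 1);
	return n;
}

/* Return every finished job, good or bad, with its individual status. */
static uint16_t
mock_completed_status(struct mock_chan *c, uint16_t max_ops, uint16_t *last_idx,
		enum mock_status *status)
{
	uint16_t n = 0;

	while (n < max_ops && c->next < c->nb_jobs)
		status[n++] = c->jobs[c->next++];
	*last_idx = (uint16_t)(c->next - 1);
	return n;
}

struct poll_result {
	uint16_t ok;
	uint16_t failed;
};

/* One polling pass: take the successes, then gather per-job statuses. */
static struct poll_result
poll_once(struct mock_chan *c, uint16_t burst)
{
	enum mock_status status[MAX_BURST];
	struct poll_result res = {0, 0};
	uint16_t count, idx, i, n;
	bool error = false;

	assert(burst <= MAX_BURST);

	count = mock_completed(c, burst, &idx, &error);
	if (count > 0)
		res.ok += count; /* successes are valid even if error is set */

	if (error) {
		n = mock_completed_status(c, burst, &idx, status);
		for (i = 0; i < n; i++) {
			if (status[i] == MOCK_OK)
				res.ok++;
			else
				res.failed++;
		}
	}
	return res;
}
```

With two successful jobs followed by a failed one, the first call reports a count of 2 together with the error flag, which is the case the review describes; an application that only looked at the error flag would lose those two completions.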
* [dpdk-dev] [PATCH v5 11/16] dma/idxd: add operation statistic tracking 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (9 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 10/16] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-22 3:51 ` fengchengwen 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 12/16] dma/idxd: add vchan status function Kevin Laatz ` (4 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add statistic tracking for DSA devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 11 +++++++++++ drivers/dma/idxd/idxd_bus.c | 2 ++ drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 +++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 45 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index f942a8aa44..c81f1d15cc 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -249,3 +249,14 @@ of memory is overwritten, or filled, with a short pattern of data. Fill operations can be performed in much the same way as copy operations described above, just using the ``rte_dma_fill()`` function rather than the ``rte_dma_copy()`` function. + +Querying Device Statistics +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The statistics from the IDXD dmadev device can be retrieved via the stats functions in +the ``rte_dmadev`` library, i.e. ``rte_dma_stats_get()``. The statistics +returned for each device instance are: + +* ``submitted``: The number of operations submitted to the device. +* ``completed``: The number of operations which have completed (successful and failed). 
+* ``errors``: The number of operations that completed with an error. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b2acdac4f9..b52ea02854 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -99,6 +99,8 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index a061a956c2..d86c58c12a 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -65,6 +65,8 @@ __submit(struct idxd_dmadev *idxd) if (++idxd->batch_idx_write > idxd->max_batches) idxd->batch_idx_write = 0; + idxd->stats.submitted += idxd->batch_size; + idxd->batch_start += idxd->batch_size; idxd->batch_size = 0; idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; @@ -275,6 +277,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ const uint16_t b_len = b_end - b_start; if (b_len == 1) {/* not a batch */ *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + if (*status != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; idxd->ids_avail++; idxd->ids_returned++; idxd->batch_idx_read = next_batch; @@ -296,6 +300,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ struct idxd_completion *c = (void *) &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + if (status[ret] != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; } idxd->ids_avail = idxd->ids_returned += ret; @@ -354,6 +360,7 @@ idxd_completed(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ ret += batch; } while (batch > 0 && *has_error == false); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -371,6 +378,7 @@ idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16 ret += batch; } while (batch > 0); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -404,6 +412,25 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + struct rte_dma_stats *stats, uint32_t stats_sz) +{ + struct idxd_dmadev *idxd = dev->dev_private; + if (stats_sz < sizeof(*stats)) + return -EINVAL; + *stats = idxd->stats; + return 0; +} + +int +idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->dev_private; + idxd->stats = (struct rte_dma_stats){0}; + return 0; +} + int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 15115a0966..e2a1119ef7 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -98,5 +98,8 @@ uint16_t idxd_completed(struct rte_dma_dev *dev, uint16_t qid, uint16_t max_ops, uint16_t idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ops, uint16_t *last_idx, enum rte_dma_status_code *status); +int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, + struct rte_dma_stats *stats, uint32_t stats_sz); +int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c 
index cfb64ce220..d73845aa3d 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -114,6 +114,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, }; -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
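The accounting added by this patch can be summarised in a few lines: `submitted` grows when a batch is handed to the hardware, `completed` grows by the number of jobs returned to the application, and `errors` grows once per job whose status is not successful. The sketch below illustrates that rule in isolation; `mock_stats`, the `MOCK_*` values and both helper functions are hypothetical names chosen for the example and are not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the three counters named in the patch. */
struct mock_stats {
	uint64_t submitted;
	uint64_t completed;
	uint64_t errors;
};

/* Hypothetical status values; the real ones are enum rte_dma_status_code. */
enum mock_status {
	MOCK_SUCCESSFUL = 0,
	MOCK_INVALID_LENGTH,
	MOCK_NOT_ATTEMPTED
};

/* Called when a batch of batch_size jobs is handed to the hardware. */
static void
account_submit(struct mock_stats *s, uint16_t batch_size)
{
	s->submitted += batch_size;
}

/* Called when n per-job statuses are returned to the application. */
static void
account_completed_status(struct mock_stats *s, const enum mock_status *status,
		uint16_t n)
{
	uint16_t i;

	assert(status != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/* compare the status value itself, not the pointer to it */
		if (status[i] != MOCK_SUCCESSFUL)
			s->errors++;
	}
	s->completed += n; /* counts successes and failures alike */
}
```

Under this rule a job reported as not attempted is counted as an error as well, since only a successful status is excluded from the `errors` counter.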
* Re: [dpdk-dev] [PATCH v5 11/16] dma/idxd: add operation statistic tracking 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 11/16] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-09-22 3:51 ` fengchengwen 0 siblings, 0 replies; 243+ messages in thread From: fengchengwen @ 2021-09-22 3:51 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, jerinj, conor.walsh Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> On 2021/9/17 23:24, Kevin Laatz wrote: > Add statistic tracking for DSA devices. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- > doc/guides/dmadevs/idxd.rst | 11 +++++++++++ > drivers/dma/idxd/idxd_bus.c | 2 ++ > drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ > drivers/dma/idxd/idxd_internal.h | 3 +++ > drivers/dma/idxd/idxd_pci.c | 2 ++ > 5 files changed, 45 insertions(+) > [snip] ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v5 12/16] dma/idxd: add vchan status function 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (10 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 11/16] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 13/16] dma/idxd: add burst capacity API Kevin Laatz ` (3 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When testing dmadev drivers, it is useful to have the HW device in a known state. This patch adds the implementation of the function which will wait for the device to be idle (all jobs completed) before proceeding. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v3: update API name to vchan_status --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 17 +++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 2 ++ drivers/dma/idxd/idxd_pci.c | 1 + 4 files changed, 21 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b52ea02854..e6caa048a9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -101,6 +101,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_info_get = idxd_info_get, .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, + .vchan_status = idxd_vchan_status, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index d86c58c12a..87d84c081e 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -162,6 +162,23 @@ get_comp_status(struct idxd_completion *c) } } +int +idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + enum rte_dma_vchan_status *status) +{ + struct idxd_dmadev *idxd = 
dev->dev_private; + uint16_t last_batch_write = idxd->batch_idx_write == 0 ? idxd->max_batches : + idxd->batch_idx_write - 1; + uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0); + + /* An IDXD device will always be either active or idle. + * RTE_DMA_VCHAN_HALTED_ERROR is therefore not supported by IDXD. + */ + *status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE; + + return 0; +} + static __rte_always_inline int batch_ok(struct idxd_dmadev *idxd, uint8_t max_ops, enum rte_dma_status_code *status) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index e2a1119ef7..a291ad26d9 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -101,5 +101,7 @@ uint16_t idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unuse int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, struct rte_dma_stats *stats, uint32_t stats_sz); int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); +int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, + enum rte_dma_vchan_status *status); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index d73845aa3d..2464d4a06c 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -118,6 +118,7 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, + .vchan_status = idxd_vchan_status, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
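The idle check in this patch rests on two small steps: find the slot of the most recently submitted batch, allowing for wrap-around, and treat any non-zero completion status in that slot as "finished". A minimal sketch of just that logic follows. The function names and the plain `uint8_t` status array are assumptions made for the example; the driver itself reads the status from `batch_comp_ring`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Slot of the most recently submitted batch. Batch indexes run from 0 to
 * max_batches inclusive, so the ring has max_batches + 1 slots.
 */
static uint16_t
last_written_batch(uint16_t batch_idx_write, uint16_t max_batches)
{
	assert(batch_idx_write <= max_batches);
	return batch_idx_write == 0 ? max_batches : (uint16_t)(batch_idx_write - 1);
}

/* Idle if a non-zero status has been written back for that last batch. */
static bool
chan_is_idle(const uint8_t *batch_status, uint16_t batch_idx_write,
		uint16_t max_batches)
{
	return batch_status[last_written_batch(batch_idx_write, max_batches)] != 0;
}
```

Note that a non-zero status covers both success and failure, which matches the comment in the patch that the device is only ever reported as active or idle.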
* [dpdk-dev] [PATCH v5 13/16] dma/idxd: add burst capacity API 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (11 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 12/16] dma/idxd: add vchan status function Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-20 10:39 ` Bruce Richardson 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz ` (2 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add support for the burst capacity API. This API will provide the calling application with the remaining capacity of the current burst (limited by max HW batch size). Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 20 ++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 1 + drivers/dma/idxd/idxd_pci.c | 2 ++ 4 files changed, 24 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index e6caa048a9..54129e5083 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -102,6 +102,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, .vchan_status = idxd_vchan_status, + .burst_capacity = idxd_burst_capacity, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 87d84c081e..b31611c8a4 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -469,6 +469,26 @@ idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t return 0; } +uint16_t +idxd_burst_capacity(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t 
write_idx = idxd->batch_start + idxd->batch_size; + uint16_t used_space; + + /* Check for space in the batch ring */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return 0; + + /* For descriptors, check for wrap-around on write but not read */ + if (idxd->ids_returned > write_idx) + write_idx += idxd->desc_ring_mask + 1; + used_space = write_idx - idxd->ids_returned; + + return RTE_MIN((idxd->desc_ring_mask - used_space), idxd->max_batch_size); +} + int idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, uint32_t conf_sz) diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index a291ad26d9..3ef2f729a8 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -103,5 +103,6 @@ int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, enum rte_dma_vchan_status *status); +uint16_t idxd_burst_capacity(const struct rte_dma_dev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 2464d4a06c..03ddd63f38 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -119,6 +119,7 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, .vchan_status = idxd_vchan_status, + .burst_capacity = idxd_burst_capacity, }; /* each portal uses 4 x 4k pages */ @@ -232,6 +233,7 @@ init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, idxd->u.pci = pci; idxd->max_batches = wq_size; + idxd->max_batch_size = 1 << lg2_max_batch; /* enable the device itself */ err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
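The arithmetic in `idxd_burst_capacity()` is easier to follow with concrete numbers. The sketch below mirrors the calculation as posted, on a hypothetical `mock_idxd` struct that holds only the fields involved. It assumes that `batch_start`, `batch_size` and `ids_returned` are such that the used space never exceeds the ring mask; the struct and function names are illustrative and not the driver's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the driver state used by the capacity check. */
struct mock_idxd {
	uint16_t desc_ring_mask;  /* descriptor ring size minus one */
	uint16_t max_batch_size;
	uint16_t max_batches;
	uint16_t batch_idx_read;
	uint16_t batch_idx_write;
	uint16_t batch_start;
	uint16_t batch_size;
	uint16_t ids_returned;
};

/* The batch ring is full when the write index is one slot behind the read. */
static bool
batch_ring_full(const struct mock_idxd *d)
{
	return (d->batch_idx_read == 0 && d->batch_idx_write == d->max_batches) ||
		d->batch_idx_write + 1 == d->batch_idx_read;
}

static uint16_t
burst_capacity(const struct mock_idxd *d)
{
	uint16_t write_idx = (uint16_t)(d->batch_start + d->batch_size);
	uint16_t used_space, free_space;

	if (batch_ring_full(d))
		return 0;

	/* wrap-around on the write side only, as in the posted patch */
	if (d->ids_returned > write_idx)
		write_idx = (uint16_t)(write_idx + d->desc_ring_mask + 1);
	used_space = (uint16_t)(write_idx - d->ids_returned);

	assert(used_space <= d->desc_ring_mask); /* assumption of this sketch */
	free_space = (uint16_t)(d->desc_ring_mask - used_space);

	/* a single burst can never exceed the hardware batch limit */
	return free_space < d->max_batch_size ? free_space : d->max_batch_size;
}
```

For example, with a 64-entry ring (mask 63), 14 descriptors in use and a maximum batch size of 32, the free space is 49 and the reported capacity is capped at 32. Using the mask rather than the ring size as the upper bound keeps one descriptor slot permanently unused.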
* Re: [dpdk-dev] [PATCH v5 13/16] dma/idxd: add burst capacity API 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 13/16] dma/idxd: add burst capacity API Kevin Laatz @ 2021-09-20 10:39 ` Bruce Richardson 0 siblings, 0 replies; 243+ messages in thread From: Bruce Richardson @ 2021-09-20 10:39 UTC (permalink / raw) To: Kevin Laatz; +Cc: dev, fengchengwen, jerinj, conor.walsh On Fri, Sep 17, 2021 at 03:24:34PM +0000, Kevin Laatz wrote: > Add support for the burst capacity API. This API will provide the calling > application with the remaining capacity of the current burst (limited by > max HW batch size). > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v5 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (12 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 13/16] dma/idxd: add burst capacity API Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-20 10:43 ` Bruce Richardson 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 15/16] devbind: add dma device class Kevin Laatz 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Conor Walsh <conor.walsh@intel.com> Move the example script for configuring IDXD devices bound to the IDXD kernel driver from raw to dma, and create a symlink to still allow use from raw. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> --- drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++++++++++++++++++++++++++ drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +----------------------------- 2 files changed, 118 insertions(+), 117 deletions(-) create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py diff --git a/drivers/dma/idxd/dpdk_idxd_cfg.py b/drivers/dma/idxd/dpdk_idxd_cfg.py new file mode 100755 index 0000000000..fcc27822ef --- /dev/null +++ b/drivers/dma/idxd/dpdk_idxd_cfg.py @@ -0,0 +1,117 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2020 Intel Corporation + +""" +Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use +""" + +import sys +import argparse +import os +import os.path + + +class SysfsDir: + "Used to read/write paths in a sysfs directory" + def __init__(self, path): + self.path = path + + def read_int(self, filename): + "Return a value from sysfs file" + 
with open(os.path.join(self.path, filename)) as f: + return int(f.readline()) + + def write_values(self, values): + "write dictionary, where key is filename and value is value to write" + for filename, contents in values.items(): + with open(os.path.join(self.path, filename), "w") as f: + f.write(str(contents)) + + +def reset_device(dsa_id): + "Reset the DSA device and all its queues" + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) + + +def get_pci_dir(pci): + "Search for the sysfs directory of the PCI device" + base_dir = '/sys/bus/pci/devices/' + for path, dirs, files in os.walk(base_dir): + for dir in dirs: + if pci in dir: + return os.path.join(base_dir, dir) + sys.exit(f"Could not find sysfs directory for device {pci}") + + +def get_dsa_id(pci): + "Get the DSA instance ID using the PCI address of the device" + pci_dir = get_pci_dir(pci) + for path, dirs, files in os.walk(pci_dir): + for dir in dirs: + if dir.startswith('dsa') and 'wq' not in dir: + return int(dir[3:]) + sys.exit(f"Could not get device ID for device {pci}") + + +def configure_dsa(dsa_id, queues, prefix): + "Configure the DSA instance with appropriate number of queues" + dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + + max_groups = dsa_dir.read_int("max_groups") + max_engines = dsa_dir.read_int("max_engines") + max_queues = dsa_dir.read_int("max_work_queues") + max_work_queues_size = dsa_dir.read_int("max_work_queues_size") + + nb_queues = min(queues, max_queues) + if queues > nb_queues: + print(f"Setting number of queues to max supported value: {max_queues}") + + # we want one engine per group, and no more engines than queues + nb_groups = min(max_engines, max_groups, nb_queues) + for grp in range(nb_groups): + dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) + + # configure each queue + for q in range(nb_queues): + wq_dir = SysfsDir(os.path.join(dsa_dir.path, 
f"wq{dsa_id}.{q}")) + wq_dir.write_values({"group_id": q % nb_groups, + "type": "user", + "mode": "dedicated", + "name": f"{prefix}_wq{dsa_id}.{q}", + "priority": 1, + "size": int(max_work_queues_size / nb_queues)}) + + # enable device and then queues + drv_dir.write_values({"bind": f"dsa{dsa_id}"}) + for q in range(nb_queues): + drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) + + +def main(args): + "Main function, does arg parsing and calls config function" + arg_p = argparse.ArgumentParser( + description="Configure whole DSA device instance for DPDK use") + arg_p.add_argument('dsa_id', + help="Specify DSA instance either via DSA instance number or PCI address") + arg_p.add_argument('-q', metavar='queues', type=int, default=255, + help="Number of queues to set up") + arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', + default="dpdk", + help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") + arg_p.add_argument('--reset', action='store_true', + help="Reset DSA device and its queues") + parsed_args = arg_p.parse_args(args[1:]) + + dsa_id = parsed_args.dsa_id + dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id + if parsed_args.reset: + reset_device(dsa_id) + else: + configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) + + +if __name__ == "__main__": + main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py deleted file mode 100755 index fcc27822ef..0000000000 --- a/drivers/raw/ioat/dpdk_idxd_cfg.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/env python3 -# SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2020 Intel Corporation - -""" -Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use -""" - -import sys -import argparse -import os -import os.path - - -class SysfsDir: - "Used to read/write paths in a sysfs directory" - def __init__(self, path): - self.path = path - - def read_int(self, filename): - "Return a value from sysfs file" - with 
open(os.path.join(self.path, filename)) as f: - return int(f.readline()) - - def write_values(self, values): - "write dictionary, where key is filename and value is value to write" - for filename, contents in values.items(): - with open(os.path.join(self.path, filename), "w") as f: - f.write(str(contents)) - - -def reset_device(dsa_id): - "Reset the DSA device and all its queues" - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) - - -def get_pci_dir(pci): - "Search for the sysfs directory of the PCI device" - base_dir = '/sys/bus/pci/devices/' - for path, dirs, files in os.walk(base_dir): - for dir in dirs: - if pci in dir: - return os.path.join(base_dir, dir) - sys.exit(f"Could not find sysfs directory for device {pci}") - - -def get_dsa_id(pci): - "Get the DSA instance ID using the PCI address of the device" - pci_dir = get_pci_dir(pci) - for path, dirs, files in os.walk(pci_dir): - for dir in dirs: - if dir.startswith('dsa') and 'wq' not in dir: - return int(dir[3:]) - sys.exit(f"Could not get device ID for device {pci}") - - -def configure_dsa(dsa_id, queues, prefix): - "Configure the DSA instance with appropriate number of queues" - dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - - max_groups = dsa_dir.read_int("max_groups") - max_engines = dsa_dir.read_int("max_engines") - max_queues = dsa_dir.read_int("max_work_queues") - max_work_queues_size = dsa_dir.read_int("max_work_queues_size") - - nb_queues = min(queues, max_queues) - if queues > nb_queues: - print(f"Setting number of queues to max supported value: {max_queues}") - - # we want one engine per group, and no more engines than queues - nb_groups = min(max_engines, max_groups, nb_queues) - for grp in range(nb_groups): - dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) - - # configure each queue - for q in range(nb_queues): - wq_dir = SysfsDir(os.path.join(dsa_dir.path, 
f"wq{dsa_id}.{q}")) - wq_dir.write_values({"group_id": q % nb_groups, - "type": "user", - "mode": "dedicated", - "name": f"{prefix}_wq{dsa_id}.{q}", - "priority": 1, - "size": int(max_work_queues_size / nb_queues)}) - - # enable device and then queues - drv_dir.write_values({"bind": f"dsa{dsa_id}"}) - for q in range(nb_queues): - drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) - - -def main(args): - "Main function, does arg parsing and calls config function" - arg_p = argparse.ArgumentParser( - description="Configure whole DSA device instance for DPDK use") - arg_p.add_argument('dsa_id', - help="Specify DSA instance either via DSA instance number or PCI address") - arg_p.add_argument('-q', metavar='queues', type=int, default=255, - help="Number of queues to set up") - arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', - default="dpdk", - help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") - arg_p.add_argument('--reset', action='store_true', - help="Reset DSA device and its queues") - parsed_args = arg_p.parse_args(args[1:]) - - dsa_id = parsed_args.dsa_id - dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id - if parsed_args.reset: - reset_device(dsa_id) - else: - configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) - - -if __name__ == "__main__": - main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py new file mode 120000 index 0000000000..85545548d1 --- /dev/null +++ b/drivers/raw/ioat/dpdk_idxd_cfg.py @@ -0,0 +1 @@ +../../dma/idxd/dpdk_idxd_cfg.py \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v5 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz @ 2021-09-20 10:43 ` Bruce Richardson 0 siblings, 0 replies; 243+ messages in thread From: Bruce Richardson @ 2021-09-20 10:43 UTC (permalink / raw) To: Kevin Laatz; +Cc: dev, fengchengwen, jerinj, conor.walsh On Fri, Sep 17, 2021 at 03:24:35PM +0000, Kevin Laatz wrote: > From: Conor Walsh <conor.walsh@intel.com> > > Move the example script for configuring IDXD devices bound to the IDXD > kernel driver from raw to dma, and create a symlink to still allow use from > raw. > > Signed-off-by: Conor Walsh <conor.walsh@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > --- Acked-by: Bruce Richardson <bruce.richardson@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
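For readers following the script that this patch moves: the resource split done by configure_dsa() can be looked at on its own, without touching sysfs. The sketch below is only an illustration of that arithmetic. The function name plan_dsa_layout, the returned dictionaries, the lower-bound check and the example capacity numbers are all assumptions made for this note; none of them are part of dpdk_idxd_cfg.py.

```python
def plan_dsa_layout(queues, max_queues, max_engines, max_groups, max_wq_size):
    """Compute the engine/group and queue/group layout without touching sysfs.

    Hypothetical helper: it mirrors the arithmetic in configure_dsa() but
    returns plain data instead of writing sysfs files.
    """
    nb_queues = min(queues, max_queues)
    if nb_queues < 1:
        # Guard added for this sketch only; the script has no such check.
        raise ValueError("at least one queue is required")

    # One engine per group, and no more groups than queues.
    nb_groups = min(max_engines, max_groups, nb_queues)

    # Engine N is assigned to group N.
    engine_groups = {engine: engine for engine in range(nb_groups)}

    # Queues are spread round-robin over the groups and share the total
    # work-queue space equally.
    wq_size = max_wq_size // nb_queues
    queue_cfg = [
        {"wq": q, "group_id": q % nb_groups, "size": wq_size}
        for q in range(nb_queues)
    ]
    return engine_groups, queue_cfg


if __name__ == "__main__":
    # Made-up device limits, chosen only to make the split easy to follow.
    engines, wqs = plan_dsa_layout(queues=255, max_queues=8, max_engines=4,
                                   max_groups=4, max_wq_size=128)
    print(engines)
    for wq in wqs:
        print(wq)
```

With the default of 255 requested queues, the request is clamped to the device maximum, which is why the script prints its "Setting number of queues to max supported value" message in that case.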
* [dpdk-dev] [PATCH v5 15/16] devbind: add dma device class 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (13 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-20 10:45 ` Bruce Richardson 2021-09-22 2:19 ` fengchengwen 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 2 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add a new class for DMA devices. Devices listed under the DMA class are to be used with the dmadev library. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- usertools/dpdk-devbind.py | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 74d16e4c4b..8bb573f4b0 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -69,12 +69,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] +dma_devices = [] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] -misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, - intel_ntb_skx, intel_ntb_icx, +misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, + intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. 
@@ -583,6 +584,9 @@ def show_status(): if status_dev in ["crypto", "all"]: show_device_status(crypto_devices, "Crypto") + if status_dev in ["dma", "all"]: + show_device_status(dma_devices, "DMA") + if status_dev in ["event", "all"]: show_device_status(eventdev_devices, "Eventdev") @@ -651,7 +655,7 @@ def parse_args(): parser.add_argument( '--status-dev', help="Print the status of given device group.", - choices=['baseband', 'compress', 'crypto', 'event', + choices=['baseband', 'compress', 'crypto', 'dma', 'event', 'mempool', 'misc', 'net', 'regex']) bind_group = parser.add_mutually_exclusive_group() bind_group.add_argument( @@ -732,6 +736,7 @@ def do_arg_actions(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) @@ -754,6 +759,7 @@ def main(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v5 15/16] devbind: add dma device class 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 15/16] devbind: add dma device class Kevin Laatz @ 2021-09-20 10:45 ` Bruce Richardson 2021-09-22 2:19 ` fengchengwen 1 sibling, 0 replies; 243+ messages in thread From: Bruce Richardson @ 2021-09-20 10:45 UTC (permalink / raw) To: Kevin Laatz; +Cc: dev, fengchengwen, jerinj, conor.walsh On Fri, Sep 17, 2021 at 03:24:36PM +0000, Kevin Laatz wrote: > Add a new class for DMA devices. Devices listed under the DMA class are to > be used with the dmadev library. > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- One small comment below to be fixed. Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> > usertools/dpdk-devbind.py | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py > index 74d16e4c4b..8bb573f4b0 100755 > --- a/usertools/dpdk-devbind.py > +++ b/usertools/dpdk-devbind.py > @@ -69,12 +69,13 @@ > network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] > baseband_devices = [acceleration_class] > crypto_devices = [encryption_class, intel_processor_class] > +dma_devices = [] > eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] > mempool_devices = [cavium_fpa, octeontx2_npa] > compress_devices = [cavium_zip] > regex_devices = [octeontx2_ree] > -misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, > - intel_ntb_skx, intel_ntb_icx, > +misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, > + intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, intel_ntb_icx, This looks like a purely cosmetic change, which doesn't really belong in the patch - especially since a number of these entries are to move in later patches for 21.11. /Bruce ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v5 15/16] devbind: add dma device class 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 15/16] devbind: add dma device class Kevin Laatz 2021-09-20 10:45 ` Bruce Richardson @ 2021-09-22 2:19 ` fengchengwen 1 sibling, 0 replies; 243+ messages in thread From: fengchengwen @ 2021-09-22 2:19 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: bruce.richardson, jerinj, conor.walsh Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> On 2021/9/17 23:24, Kevin Laatz wrote: > Add a new class for DMA devices. Devices listed under the DMA class are to > be used with the dmadev library. > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- > usertools/dpdk-devbind.py | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py > index 74d16e4c4b..8bb573f4b0 100755 > --- a/usertools/dpdk-devbind.py > +++ b/usertools/dpdk-devbind.py > @@ -69,12 +69,13 @@ > network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] > baseband_devices = [acceleration_class] > crypto_devices = [encryption_class, intel_processor_class] > +dma_devices = [] > eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] > mempool_devices = [cavium_fpa, octeontx2_npa] > compress_devices = [cavium_zip] > regex_devices = [octeontx2_ree] > -misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, > - intel_ntb_skx, intel_ntb_icx, > +misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, > + intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, intel_ntb_icx, > octeontx2_dma] > > # global dict ethernet devices present. Dictionary indexed by PCI address. 
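As a side note on what the new "dma" choice does in show_status(): each device group is printed when --status-dev names it or when the selection is "all". A table-driven sketch of that selection follows. The DEVICE_GROUPS table and the groups_to_show helper are hypothetical names for this note, and only the three group titles visible in the quoted hunk are listed; dpdk-devbind.py itself uses a sequence of if statements.

```python
# (key accepted by --status-dev, title passed to show_device_status())
# Only the groups visible in the quoted hunk are listed here.
DEVICE_GROUPS = [
    ("crypto", "Crypto"),
    ("dma", "DMA"),
    ("event", "Eventdev"),
]


def groups_to_show(status_dev):
    """Return the titles that would be printed for a given selection."""
    return [title for key, title in DEVICE_GROUPS
            if status_dev in (key, "all")]


if __name__ == "__main__":
    print(groups_to_show("dma"))
    print(groups_to_show("all"))
```

Until a device ID is added to dma_devices (done in 16/16), the "DMA" group exists but has nothing to list.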
> @@ -583,6 +584,9 @@ def show_status(): > if status_dev in ["crypto", "all"]: > show_device_status(crypto_devices, "Crypto") > > + if status_dev in ["dma", "all"]: > + show_device_status(dma_devices, "DMA") > + > if status_dev in ["event", "all"]: > show_device_status(eventdev_devices, "Eventdev") > > @@ -651,7 +655,7 @@ def parse_args(): > parser.add_argument( > '--status-dev', > help="Print the status of given device group.", > - choices=['baseband', 'compress', 'crypto', 'event', > + choices=['baseband', 'compress', 'crypto', 'dma', 'event', > 'mempool', 'misc', 'net', 'regex']) > bind_group = parser.add_mutually_exclusive_group() > bind_group.add_argument( > @@ -732,6 +736,7 @@ def do_arg_actions(): > get_device_details(network_devices) > get_device_details(baseband_devices) > get_device_details(crypto_devices) > + get_device_details(dma_devices) > get_device_details(eventdev_devices) > get_device_details(mempool_devices) > get_device_details(compress_devices) > @@ -754,6 +759,7 @@ def main(): > get_device_details(network_devices) > get_device_details(baseband_devices) > get_device_details(crypto_devices) > + get_device_details(dma_devices) > get_device_details(eventdev_devices) > get_device_details(mempool_devices) > get_device_details(compress_devices) > ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v5 16/16] devbind: move idxd device ID to dmadev class 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz ` (14 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 15/16] devbind: add dma device class Kevin Laatz @ 2021-09-17 15:24 ` Kevin Laatz 2021-09-20 10:46 ` Bruce Richardson 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-09-17 15:24 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz The dmadev library is the preferred abstraction for using IDXD devices and will replace the rawdev implementation in future. This patch moves the IDXD device ID to the dmadev class. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- usertools/dpdk-devbind.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 8bb573f4b0..98b698ccc0 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -69,13 +69,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] -dma_devices = [] +dma_devices = [intel_idxd_spr] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, - intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, intel_ntb_icx, + intel_ioat_icx, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v5 16/16] devbind: move idxd device ID to dmadev class 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz @ 2021-09-20 10:46 ` Bruce Richardson 0 siblings, 0 replies; 243+ messages in thread From: Bruce Richardson @ 2021-09-20 10:46 UTC (permalink / raw) To: Kevin Laatz; +Cc: dev, fengchengwen, jerinj, conor.walsh On Fri, Sep 17, 2021 at 03:24:37PM +0000, Kevin Laatz wrote: > The dmadev library is the preferred abstraction for using IDXD devices and > will replace the rawdev implementation in future. This patch moves the IDXD > device ID to the dmadev class. > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- Acked-by: Bruce Richardson <bruce.richardson@intel.com> ^ permalink raw reply [flat|nested] 243+ messages in thread
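To illustrate the effect of moving intel_idxd_spr between the lists: a device is reported under whichever group contains an entry matching its IDs. The sketch below is a simplified stand-in for that lookup. The dictionary layout, the matches and classify helpers, and the sample device record are assumptions for illustration only; the vendor and device IDs (8086, 0b25) and the 6a:01.0 address are the ones used elsewhere in this series.

```python
# Simplified ID entry; the field names are an assumption for this sketch.
INTEL_IDXD_SPR = {"Vendor": "8086", "Device": "0b25"}


def matches(dev, entry):
    """True if every field named in the entry has the same value in dev."""
    return all(dev.get(key) == value for key, value in entry.items())


def classify(dev, groups):
    """Return the name of the first group with an entry matching dev."""
    for name, entries in groups.items():
        if any(matches(dev, entry) for entry in entries):
            return name
    return None


if __name__ == "__main__":
    dsa = {"Slot": "6a:01.0", "Vendor": "8086", "Device": "0b25"}
    before = {"dma": [], "misc": [INTEL_IDXD_SPR]}  # after patch 15/16
    after = {"dma": [INTEL_IDXD_SPR], "misc": []}   # after patch 16/16
    print(classify(dsa, before), classify(dsa, after))
```

In other words, after this patch the same hardware shows up under --status-dev dma instead of --status-dev misc.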
* [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (15 preceding siblings ...) 2021-09-17 15:24 ` [dpdk-dev] [PATCH v5 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 01/16] raw/ioat: only build if dmadev not present Kevin Laatz ` (15 more replies) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (4 subsequent siblings) 21 siblings, 16 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz This patchset adds a dmadev driver and associated documentation to support Intel Data Streaming Accelerator devices. This driver is intended to ultimately replace the current IDXD part of the IOAT rawdev driver. NOTE: This patchset has several dependencies: - v23 of the dmadev lib set [1] - v6 of the dmadev test suite [2] [1] http://patches.dpdk.org/project/dpdk/list/?series=19140 [2] http://patches.dpdk.org/project/dpdk/list/?series=19138 v6: * set state of device during create * add dev_close function * documentation updates - moved generic pieces from driver doc to lib doc * other small miscellaneous fixes based on rebasing and ML feedback v5: * add missing toctree entry for idxd driver v4: * rebased on above patchsets * minor fixes based on review feedback v3: * rebased on above patchsets * added burst capacity API v2: * rebased on above patchsets * added API to check for device being idle * added devbind updates for DMA devices * fixed issue identified by internal coverity scan Bruce Richardson (1): raw/ioat: only build if dmadev not present Conor Walsh (1): dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz (14): dma/idxd: add skeleton for VFIO based DSA device dma/idxd: add bus device probing dma/idxd: create dmadev 
instances on bus probe dma/idxd: create dmadev instances on pci probe dma/idxd: add datapath structures dma/idxd: add configure and info_get functions dma/idxd: add start and stop functions for pci devices dma/idxd: add data-path job submission functions dma/idxd: add data-path job completion functions dma/idxd: add operation statistic tracking dma/idxd: add vchan status function dma/idxd: add burst capacity API devbind: add dma device class devbind: move idxd device ID to dmadev class MAINTAINERS | 10 + doc/guides/dmadevs/idxd.rst | 183 ++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/prog_guide/dmadev.rst | 34 ++ doc/guides/rawdevs/ioat.rst | 8 + doc/guides/rel_notes/release_21_11.rst | 5 + drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++ drivers/dma/idxd/idxd_bus.c | 377 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 610 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 131 ++++++ drivers/dma/idxd/idxd_internal.h | 108 +++++ drivers/dma/idxd/idxd_pci.c | 393 ++++++++++++++++ drivers/dma/idxd/meson.build | 14 + drivers/dma/idxd/version.map | 3 + drivers/dma/meson.build | 2 + drivers/meson.build | 4 +- drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +---- drivers/raw/ioat/meson.build | 23 +- usertools/dpdk-devbind.py | 10 +- 19 files changed, 2028 insertions(+), 124 deletions(-) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py create mode 100644 drivers/dma/idxd/idxd_bus.c create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_hw_defs.h create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v6 01/16] raw/ioat: only build if dmadev not present 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz ` (14 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Bruce Richardson <bruce.richardson@intel.com> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not present. This change requires the dependencies to be reordered in drivers/meson.build so that rawdev can use the "RTE_DMA_*" build macros to check for the presence of the equivalent dmadev driver. A note is also added to the documentation to inform users of this change. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v4: - Fix build issue - Add note in raw documentation to outline this change v5: - Provide more detail in commit message - Minor doc changes --- doc/guides/rawdevs/ioat.rst | 8 ++++++++ drivers/meson.build | 4 ++-- drivers/raw/ioat/meson.build | 23 ++++++++++++++++++++--- 3 files changed, 30 insertions(+), 5 deletions(-) diff --git a/doc/guides/rawdevs/ioat.rst b/doc/guides/rawdevs/ioat.rst index a28e909935..a65530bd30 100644 --- a/doc/guides/rawdevs/ioat.rst +++ b/doc/guides/rawdevs/ioat.rst @@ -34,6 +34,14 @@ Compilation For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based. No additional compilation steps are necessary. +.. note:: + Since the addition of the dmadev library, the ``ioat`` and ``idxd`` parts of this driver + will only be built if their ``dmadev`` counterparts are not built. 
+ The following can be used to disable the ``dmadev`` drivers, + if the raw drivers are to be used instead:: + + $ meson -Ddisable_drivers=dma/* <build_dir> + Device Setup ------------- diff --git a/drivers/meson.build b/drivers/meson.build index b7d680868a..34c0276487 100644 --- a/drivers/meson.build +++ b/drivers/meson.build @@ -10,15 +10,15 @@ subdirs = [ 'common/qat', # depends on bus. 'common/sfc_efx', # depends on bus. 'mempool', # depends on common and bus. + 'dma', # depends on common and bus. 'net', # depends on common, bus, mempool - 'raw', # depends on common, bus and net. + 'raw', # depends on common, bus, dma and net. 'crypto', # depends on common, bus and mempool (net in future). 'compress', # depends on common, bus, mempool. 'regex', # depends on common, bus, regexdev. 'vdpa', # depends on common, bus and mempool. 'event', # depends on common, bus, mempool and net. 'baseband', # depends on common and bus. - 'dma', # depends on common and bus. ] if meson.is_cross_build() diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build index 0e81cb5951..9be9d8cc65 100644 --- a/drivers/raw/ioat/meson.build +++ b/drivers/raw/ioat/meson.build @@ -2,14 +2,31 @@ # Copyright 2019 Intel Corporation build = dpdk_conf.has('RTE_ARCH_X86') +# only use ioat rawdev driver if we don't have the equivalent dmadev ones +if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') + build = false + subdir_done() +endif + reason = 'only supported on x86' sources = files( - 'idxd_bus.c', - 'idxd_pci.c', 'ioat_common.c', - 'ioat_rawdev.c', 'ioat_rawdev_test.c', ) + +if not dpdk_conf.has('RTE_DMA_IDXD') + sources += files( + 'idxd_bus.c', + 'idxd_pci.c', + ) +endif + +if not dpdk_conf.has('RTE_DMA_IOAT') + sources += files ( + 'ioat_rawdev.c', + ) +endif + deps += ['bus_pci', 'mbuf', 'rawdev'] headers = files( 'rte_ioat_rawdev.h', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
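The source selection in drivers/raw/ioat/meson.build above amounts to a small decision table, which may be easier to review when written out. The sketch below restates it as a plain function. The name raw_ioat_sources and its boolean parameters are hypothetical and stand in for the dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') checks; the x86 check is left out.

```python
def raw_ioat_sources(has_dma_idxd, has_dma_ioat):
    """Return the raw/ioat source files built for a given configuration.

    An empty list means the rawdev driver is skipped entirely, which is
    the case when both dmadev equivalents are being built.
    """
    if has_dma_idxd and has_dma_ioat:
        return []

    # Always built when the driver is enabled at all.
    sources = ["ioat_common.c", "ioat_rawdev_test.c"]

    # The idxd half is only built when dma/idxd is absent.
    if not has_dma_idxd:
        sources += ["idxd_bus.c", "idxd_pci.c"]

    # The ioat half is only built when dma/ioat is absent.
    if not has_dma_ioat:
        sources += ["ioat_rawdev.c"]

    return sources


if __name__ == "__main__":
    for idxd in (False, True):
        for ioat in (False, True):
            print(idxd, ioat, raw_ioat_sources(idxd, ioat))
```

Note that with only dma/idxd present the rawdev driver is still built, just without its idxd sources.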
* [dpdk-dev] [PATCH v6 02/16] dma/idxd: add skeleton for VFIO based DSA device 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 01/16] raw/ioat: only build if dmadev not present Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 03/16] dma/idxd: add bus device probing Kevin Laatz ` (13 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probe/remove skeleton code for DSA device bound to the vfio pci driver. Relevant documentation and MAINTAINERS update also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v5: add missing toctree entry for idxd driver v6: add missing new line at end of meson file --- MAINTAINERS | 10 +++++ doc/guides/dmadevs/idxd.rst | 58 ++++++++++++++++++++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/rel_notes/release_21_11.rst | 5 +++ drivers/dma/idxd/idxd_internal.h | 27 ++++++++++++ drivers/dma/idxd/idxd_pci.c | 55 ++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 11 +++++ drivers/dma/idxd/version.map | 3 ++ drivers/dma/meson.build | 2 + 9 files changed, 173 insertions(+) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 371d80c42c..497219e948 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1200,6 +1200,16 @@ F: doc/guides/compressdevs/zlib.rst F: doc/guides/compressdevs/features/zlib.ini +DMAdev Drivers +-------------- + +Intel IDXD - EXPERIMENTAL +M: Bruce Richardson 
<bruce.richardson@intel.com> +M: Kevin Laatz <kevin.laatz@intel.com> +F: drivers/dma/idxd/ +F: doc/guides/dmadevs/idxd.rst + + RegEx Drivers ------------- diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst new file mode 100644 index 0000000000..924700d17e --- /dev/null +++ b/doc/guides/dmadevs/idxd.rst @@ -0,0 +1,58 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +.. include:: <isonum.txt> + +IDXD DMA Device Driver +====================== + +The ``idxd`` dmadev driver provides a poll-mode driver (PMD) for Intel\ |reg| +Data Streaming Accelerator `(Intel DSA) +<https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html>`_. +This PMD can be used in conjunction with Intel\ |reg| DSA devices to offload +data operations, such as data copies, to hardware, freeing up CPU cycles for +other tasks. + +Hardware Requirements +---------------------- + +The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the +presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma`` +will show all the DMA devices on the system, including IDXD supported devices. +Intel\ |reg| DSA devices, are currently (at time of writing) appearing +as devices with type “0b25”, due to the absence of pci-id database entries for +them at this point. + +Compilation +------------ + +For builds using ``meson`` and ``ninja``, the driver will be built when the +target platform is x86-based. No additional compilation steps are necessary. + +Device Setup +------------- + +Devices using VFIO/UIO drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The HW devices to be used will need to be bound to a user-space IO driver for use. +The ``dpdk-devbind.py`` script can be used to view the state of the devices +and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``. 
+For example:: + + $ dpdk-devbind.py -b vfio-pci 6a:01.0 + +Device Probing and Initialization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will +be found as part of the device scan done at application initialization time without +the need to pass parameters to the application. + +For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the +maximum number of workqueues available on it, partitioning all resources equally +among the queues. +If fewer workqueues are required, then the ``max_queues`` parameter may be passed to +the device driver on the EAL command line, via the ``allowlist`` or ``-a`` flag e.g.:: + + $ dpdk-test -a <b:d:f>,max_queues=4 diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst index 0bce29d766..5d4abf880e 100644 --- a/doc/guides/dmadevs/index.rst +++ b/doc/guides/dmadevs/index.rst @@ -10,3 +10,5 @@ an application through DMA API. .. toctree:: :maxdepth: 2 :numbered: + + idxd diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index 7ef5c3c7b0..c980e729f8 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -98,6 +98,11 @@ New Features * Data plane APIs. * Multi-process support. +* **Added IDXD dmadev driver implementation.** + + The IDXD dmadev driver provides a device driver for Intel DSA devices. + This device driver can be used through the generic dmadev API. 
+ Removed Items ------------- diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h new file mode 100644 index 0000000000..c6a7dcd72f --- /dev/null +++ b/drivers/dma/idxd/idxd_internal.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_INTERNAL_H_ +#define _IDXD_INTERNAL_H_ + +/** + * @file idxd_internal.h + * + * Internal data structures for the idxd/DSA driver for dmadev + * + * @warning + * @b EXPERIMENTAL: these structures and APIs may change without prior notice + */ + +extern int idxd_pmd_logtype; + +#define IDXD_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \ + idxd_pmd_logtype, "IDXD: %s(): " fmt "\n", __func__, ##args) + +#define IDXD_PMD_DEBUG(fmt, args...) IDXD_PMD_LOG(DEBUG, fmt, ## args) +#define IDXD_PMD_INFO(fmt, args...) IDXD_PMD_LOG(INFO, fmt, ## args) +#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) +#define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) + +#endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c new file mode 100644 index 0000000000..79e4aadcab --- /dev/null +++ b/drivers/dma/idxd/idxd_pci.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <rte_bus_pci.h> + +#include "idxd_internal.h" + +#define IDXD_VENDOR_ID 0x8086 +#define IDXD_DEVICE_ID_SPR 0x0B25 + +#define IDXD_PMD_DMADEV_NAME_PCI dmadev_idxd_pci + +const struct rte_pci_id pci_id_idxd_map[] = { + { RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static int +idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + int ret = 0; + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + dev->device.driver = &drv->driver; + + return ret; +} + +static 
int +idxd_dmadev_remove_pci(struct rte_pci_device *dev) +{ + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + + IDXD_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + + return 0; +} + +struct rte_pci_driver idxd_pmd_drv_pci = { + .id_table = pci_id_idxd_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = idxd_dmadev_probe_pci, + .remove = idxd_dmadev_remove_pci, +}; + +RTE_PMD_REGISTER_PCI(IDXD_PMD_DMADEV_NAME_PCI, idxd_pmd_drv_pci); +RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_DMADEV_NAME_PCI, pci_id_idxd_map); +RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_DMADEV_NAME_PCI, "vfio-pci"); +RTE_PMD_REGISTER_PARAM_STRING(dmadev_idxd_pci, "max_queues=0"); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build new file mode 100644 index 0000000000..4426a9f65c --- /dev/null +++ b/drivers/dma/idxd/meson.build @@ -0,0 +1,11 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2021 Intel Corporation + +if is_windows + subdir_done() +endif + +deps += ['bus_pci'] +sources = files( + 'idxd_pci.c' +) diff --git a/drivers/dma/idxd/version.map b/drivers/dma/idxd/version.map new file mode 100644 index 0000000000..4a76d1d52d --- /dev/null +++ b/drivers/dma/idxd/version.map @@ -0,0 +1,3 @@ +DPDK_21 { + local: *; +}; diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build index d9c7ede32f..411be7a240 100644 --- a/drivers/dma/meson.build +++ b/drivers/dma/meson.build @@ -2,5 +2,7 @@ # Copyright 2021 HiSilicon Limited drivers = [ + 'idxd', 'skeleton', ] +std_deps = ['dmadev'] -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
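For anyone unfamiliar with the "-a <b:d:f>,max_queues=4" form documented in this patch: the allowlist argument is a device address followed by comma-separated key=value pairs. The sketch below shows that shape only. It is not how the driver parses its arguments (that happens in C), and the helper names, the reading of 0 as "use every queue the device offers", and the clamping to a hardware maximum are assumptions made for this note, loosely based on the registered "max_queues=0" parameter string.

```python
def parse_allowlist_arg(arg):
    """Split 'address,key=value,...' into (address, {key: value})."""
    address, _, rest = arg.partition(",")
    params = {}
    for item in filter(None, rest.split(",")):
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        params[key] = value
    return address, params


def effective_queues(params, hw_max):
    """Pick the queue count; 0 or absent is assumed to mean 'all'."""
    requested = int(params.get("max_queues", 0))
    return hw_max if requested == 0 else min(requested, hw_max)


if __name__ == "__main__":
    addr, params = parse_allowlist_arg("6a:01.0,max_queues=4")
    # hw_max=8 is a made-up device limit for the example.
    print(addr, params, effective_queues(params, hw_max=8))
```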
* [dpdk-dev] [PATCH v6 03/16] dma/idxd: add bus device probing 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 01/16] raw/ioat: only build if dmadev not present Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz ` (12 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probing for DSA devices bound to the IDXD kernel driver. These devices can be configured via sysfs and made available to DPDK if they are found during bus scan. Relevant documentation is included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v4: fix 'vdev' naming, changed to 'bus' --- doc/guides/dmadevs/idxd.rst | 64 +++++++ drivers/dma/idxd/idxd_bus.c | 351 +++++++++++++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 3 files changed, 416 insertions(+) create mode 100644 drivers/dma/idxd/idxd_bus.c diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 924700d17e..ce33e2857a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -32,6 +32,56 @@ target platform is x86-based. No additional compilation steps are necessary. Device Setup ------------- +Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers, +such as ``vfio-pci``. Both are supported by the IDXD PMD. 
+ +Intel\ |reg| DSA devices using IDXD kernel driver +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured. +The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration. + +.. note:: + The device configuration can also be done by directly interacting with the sysfs nodes. + An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py`` + included in the driver source directory. + +There are some mandatory configuration steps before being able to use a device with an application. +The internal engines, which do the copies or other operations, +and the work-queues, which are used by applications to assign work to the device, +need to be assigned to groups, and the various other configuration options, +such as priority or queue depth, need to be set for each queue. + +To assign an engine to a group:: + + $ accel-config config-engine dsa0/engine0.0 --group-id=0 + $ accel-config config-engine dsa0/engine0.1 --group-id=1 + +To assign work queues to groups for passing descriptors to the engines a similar accel-config command can be used. +However, the work queues also need to be configured depending on the use case. +Some configuration options include: + +* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously. +* priority: WQ priority between 1 and 15. Larger value means higher priority. +* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device. +* type: WQ type (kernel/mdev/user). Determines how the device is presented. +* name: identifier given to the WQ. 
+ +Example configuration for a work queue:: + + $ accel-config config-wq dsa0/wq0.0 --group-id=0 \ + --mode=dedicated --priority=10 --wq-size=8 \ + --type=user --name=dpdk_app1 + +Once the devices have been configured, they need to be enabled:: + + $ accel-config enable-device dsa0 + $ accel-config enable-wq dsa0/wq0.0 + +Check the device configuration:: + + $ accel-config list + Devices using VFIO/UIO drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -56,3 +106,17 @@ If fewer workqueues are required, then the ``max_queues`` parameter may be passe the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: $ dpdk-test -a <b:d:f>,max_queues=4 + +For devices bound to the IDXD kernel driver, +the DPDK IDXD driver will automatically perform a scan for available workqueues +to use. Any workqueues found listed in ``/dev/dsa`` on the system will be checked +in ``/sys``, and any which have ``dpdk_`` prefix in their name will be automatically +probed by the driver to make them available to the application. +Alternatively, to support use by multiple DPDK processes simultaneously, +the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue +name prefix, instead of ``dpdk_``, allowing each DPDK application instance to only +use a subset of configured queues. + +Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, +that is a "DMA device type" inside DPDK, and can be accessed using APIs from the +``rte_dmadev`` library. 
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c new file mode 100644 index 0000000000..ef589af30e --- /dev/null +++ b/drivers/dma/idxd/idxd_bus.c @@ -0,0 +1,351 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <dirent.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/mman.h> +#include <libgen.h> + +#include <rte_bus.h> +#include <rte_log.h> +#include <rte_dmadev_pmd.h> +#include <rte_string_fns.h> + +#include "idxd_internal.h" + +/* default value for DSA paths, but allow override in environment for testing */ +#define DSA_DEV_PATH "/dev/dsa" +#define DSA_SYSFS_PATH "/sys/bus/dsa/devices" + +static unsigned int devcount; + +/** unique identifier for a DSA device/WQ instance */ +struct dsa_wq_addr { + uint16_t device_id; + uint16_t wq_id; +}; + +/** a DSA device instance */ +struct rte_dsa_device { + struct rte_device device; /**< Inherit core device */ + TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */ + + char wq_name[32]; /**< the workqueue name/number e.g. 
wq0.1 */ + struct dsa_wq_addr addr; /**< Identifies the specific WQ */ +}; + +/* forward prototypes */ +struct dsa_bus; +static int dsa_scan(void); +static int dsa_probe(void); +static struct rte_device *dsa_find_device(const struct rte_device *start, + rte_dev_cmp_t cmp, const void *data); +static enum rte_iova_mode dsa_get_iommu_class(void); +static int dsa_addr_parse(const char *name, void *addr); + +/** List of devices */ +TAILQ_HEAD(dsa_device_list, rte_dsa_device); + +/** + * Structure describing the DSA bus + */ +struct dsa_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_driver driver; /**< Driver struct for devices to point to */ + struct dsa_device_list device_list; /**< List of PCI devices */ +}; + +struct dsa_bus dsa_bus = { + .bus = { + .scan = dsa_scan, + .probe = dsa_probe, + .find_device = dsa_find_device, + .get_iommu_class = dsa_get_iommu_class, + .parse = dsa_addr_parse, + }, + .driver = { + .name = "dmadev_idxd" + }, + .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list), +}; + +static inline const char * +dsa_get_dev_path(void) +{ + const char *path = getenv("DSA_DEV_PATH"); + return path ? path : DSA_DEV_PATH; +} + +static inline const char * +dsa_get_sysfs_path(void) +{ + const char *path = getenv("DSA_SYSFS_PATH"); + return path ? 
path : DSA_SYSFS_PATH; +} + +static void * +idxd_bus_mmap_wq(struct rte_dsa_device *dev) +{ + void *addr; + char path[PATH_MAX]; + int fd; + + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); + fd = open(path, O_RDWR); + if (fd < 0) { + IDXD_PMD_ERR("Failed to open device path: %s", path); + return NULL; + } + + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + IDXD_PMD_ERR("Failed to mmap device %s", path); + return NULL; + } + + return addr; +} + +static int +read_wq_string(struct rte_dsa_device *dev, const char *filename, + char *value, size_t valuelen) +{ + char sysfs_node[PATH_MAX]; + int len; + int fd; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + fd = open(sysfs_node, O_RDONLY); + if (fd < 0) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + len = read(fd, value, valuelen - 1); + close(fd); + if (len < 0) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + value[len] = '\0'; + return 0; +} + +static int +read_wq_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +read_device_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s", + 
dsa_get_sysfs_path(), dev->addr.device_id, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +idxd_probe_dsa(struct rte_dsa_device *dev) +{ + struct idxd_dmadev idxd = {0}; + int ret = 0; + + IDXD_PMD_INFO("Probing device %s on numa node %d", + dev->wq_name, dev->device.numa_node); + if (read_wq_int(dev, "size", &ret) < 0) + return -1; + idxd.max_batches = ret; + if (read_wq_int(dev, "max_batch_size", &ret) < 0) + return -1; + idxd.max_batch_size = ret; + idxd.qid = dev->addr.wq_id; + idxd.sva_support = 1; + + idxd.portal = idxd_bus_mmap_wq(dev); + if (idxd.portal == NULL) { + IDXD_PMD_ERR("WQ mmap failed"); + return -ENOENT; + } + + return 0; +} + +static int +is_for_this_process_use(const char *name) +{ + char *runtime_dir = strdup(rte_eal_get_runtime_dir()); + char *prefix = basename(runtime_dir); + int prefixlen = strlen(prefix); + int retval = 0; + + if (strncmp(name, "dpdk_", 5) == 0) + retval = 1; + if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_') + retval = 1; + + free(runtime_dir); + return retval; +} + +static int +dsa_probe(void) +{ + struct rte_dsa_device *dev; + + TAILQ_FOREACH(dev, &dsa_bus.device_list, next) { + char type[64], name[64]; + + if (read_wq_string(dev, "type", type, sizeof(type)) < 0 || + read_wq_string(dev, "name", name, sizeof(name)) < 0) + continue; + + if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) { + dev->device.driver = &dsa_bus.driver; + idxd_probe_dsa(dev); + continue; + } + IDXD_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name); + } + + return 0; +} + +static int +dsa_scan(void) +{ + const char *path = dsa_get_dev_path(); + struct dirent *wq; + DIR *dev_dir; + + dev_dir 
= opendir(path); + if (dev_dir == NULL) { + if (errno == ENOENT) + return 0; /* no bus, return without error */ + IDXD_PMD_ERR("%s(): opendir '%s' failed: %s", + __func__, path, strerror(errno)); + return -1; + } + + while ((wq = readdir(dev_dir)) != NULL) { + struct rte_dsa_device *dev; + int numa_node = -1; + + if (strncmp(wq->d_name, "wq", 2) != 0) + continue; + if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) { + IDXD_PMD_ERR("%s(): wq name too long: '%s', skipping", + __func__, wq->d_name); + continue; + } + IDXD_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name); + + dev = malloc(sizeof(*dev)); + if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) { + IDXD_PMD_ERR("Error parsing WQ name: %s", wq->d_name); + free(dev); + continue; + } + dev->device.bus = &dsa_bus.bus; + strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name)); + TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next); + devcount++; + + read_device_int(dev, "numa_node", &numa_node); + dev->device.numa_node = numa_node; + dev->device.name = dev->wq_name; + } + + closedir(dev_dir); + return 0; +} + +static struct rte_device * +dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list); + + /* the rte_device struct must be at start of dsa structure */ + RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0); + + if (start != NULL) /* jump to start point if given */ + dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next); + while (dev != NULL) { + if (cmp(&dev->device, data) == 0) + return &dev->device; + dev = TAILQ_NEXT(dev, next); + } + return NULL; +} + +static enum rte_iova_mode +dsa_get_iommu_class(void) +{ + /* if there are no devices, report don't care, otherwise VA mode */ + return devcount > 0 ? 
RTE_IOVA_VA : RTE_IOVA_DC; +} + +static int +dsa_addr_parse(const char *name, void *addr) +{ + struct dsa_wq_addr *wq = addr; + unsigned int device_id, wq_id; + + if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) { + IDXD_PMD_DEBUG("Parsing WQ name failed: %s", name); + return -1; + } + + wq->device_id = device_id; + wq->wq_id = wq_id; + return 0; +} + +RTE_REGISTER_BUS(dsa, dsa_bus.bus); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 4426a9f65c..45418077f4 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -7,5 +7,6 @@ endif deps += ['bus_pci'] sources = files( + 'idxd_bus.c', 'idxd_pci.c' ) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
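Note on the workqueue selection rule documented in the patch above: a WQ is picked up when its configured name starts with "dpdk_", or with the value of --file-prefix followed by an underscore, and the device/queue numbers are taken from the "wqX.Y" node name. The standalone sketch below mirrors that rule without any DPDK dependency. The helper names wq_parse_name and wq_name_is_ours are hypothetical, and the file prefix is passed in directly rather than derived from the EAL runtime directory as the patch does.

```c
#include <stdio.h>
#include <string.h>

/* Parse a node name of the form "wq<device>.<queue>".
 * Returns 0 on success, -1 if the name does not match. */
int
wq_parse_name(const char *name, unsigned int *device_id, unsigned int *wq_id)
{
	if (sscanf(name, "wq%u.%u", device_id, wq_id) != 2)
		return -1;
	return 0;
}

/* Return 1 if the configured WQ name starts with "dpdk_" or with
 * "<file_prefix>_", 0 otherwise. */
int
wq_name_is_ours(const char *wq_name, const char *file_prefix)
{
	size_t len = strlen(file_prefix);

	if (strncmp(wq_name, "dpdk_", 5) == 0)
		return 1;
	if (strncmp(wq_name, file_prefix, len) == 0 && wq_name[len] == '_')
		return 1;
	return 0;
}
```

With this rule, a queue named "app2_q0" would be claimed by a process started with --file-prefix=app2, while "app2q0" would not, because the underscore separator is required.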
* [dpdk-dev] [PATCH v6 04/16] dma/idxd: create dmadev instances on bus probe 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (2 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 03/16] dma/idxd: add bus device probing Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz ` (11 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the bus scan/probe, create a dmadev instance for each HW queue. Internal structures required for device creation are also added. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v4: - fix 'vdev' naming, changed to 'bus' - rebase changes v6: - remove redundant struct initialization in create - set device state to ready at end of create --- drivers/dma/idxd/idxd_bus.c | 19 +++++++++ drivers/dma/idxd/idxd_common.c | 70 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 40 ++++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 4 files changed, 130 insertions(+) create mode 100644 drivers/dma/idxd/idxd_common.c diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index ef589af30e..b48fa954ed 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -85,6 +85,18 @@ dsa_get_sysfs_path(void) return path ? 
path : DSA_SYSFS_PATH; } +static int +idxd_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->data->dev_private; + munmap(idxd->portal, 0x1000); + return 0; +} + +static const struct rte_dma_dev_ops idxd_bus_ops = { + .dev_close = idxd_dev_close, +}; + static void * idxd_bus_mmap_wq(struct rte_dsa_device *dev) { @@ -206,6 +218,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -1; idxd.max_batch_size = ret; idxd.qid = dev->addr.wq_id; + idxd.u.bus.dsa_id = dev->addr.device_id; idxd.sva_support = 1; idxd.portal = idxd_bus_mmap_wq(dev); @@ -214,6 +227,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -ENOENT; } + ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_bus_ops); + if (ret) { + IDXD_PMD_ERR("Failed to create dmadev %s", dev->wq_name); + return ret; + } + return 0; } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c new file mode 100644 index 0000000000..130153f7d2 --- /dev/null +++ b/drivers/dma/idxd/idxd_common.c @@ -0,0 +1,70 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> +#include <rte_common.h> + +#include "idxd_internal.h" + +#define IDXD_PMD_NAME_STR "dmadev_idxd" + +int +idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, + const struct rte_dma_dev_ops *ops) +{ + struct idxd_dmadev *idxd = NULL; + struct rte_dma_dev *dmadev = NULL; + int ret = 0; + + if (!name) { + IDXD_PMD_ERR("Invalid name of the device!"); + ret = -EINVAL; + goto cleanup; + } + + /* Allocate device structure */ + dmadev = rte_dma_pmd_allocate(name, dev->numa_node, sizeof(struct idxd_dmadev)); + if (dmadev == NULL) { + IDXD_PMD_ERR("Unable to allocate DMA device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->dev_ops = ops; + dmadev->device = dev; + + idxd = dmadev->dev_private; + *idxd = *base_idxd; /* copy over the main fields already passed in */ + idxd->dmadev = 
dmadev; + + /* allocate batch index ring and completion ring. + * The +1 is because we can never fully use + * the ring, otherwise read == write means both full and empty. + */ + idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) + + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1), + sizeof(idxd->batch_comp_ring[0])); + if (idxd->batch_comp_ring == NULL) { + IDXD_PMD_ERR("Unable to reserve memory for batch data\n"); + ret = -ENOMEM; + goto cleanup; + } + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1]; + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring); + + idxd->dmadev->state = RTE_DMA_DEV_READY; + + return 0; + +cleanup: + if (dmadev) + rte_dma_pmd_release(name); + + return ret; +} + +int idxd_pmd_logtype; + +RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index c6a7dcd72f..fa6f053f72 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -24,4 +24,44 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) 
IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_dmadev { + /* counters to track the batches */ + unsigned short max_batches; + unsigned short batch_idx_read; + unsigned short batch_idx_write; + + /* track descriptors and handles */ + unsigned short desc_ring_mask; + unsigned short ids_avail; /* handles for ops completed */ + unsigned short ids_returned; /* the read pointer for hdls/desc rings */ + unsigned short batch_start; /* start+size == write pointer for hdls/desc */ + unsigned short batch_size; + + void *portal; /* address to write the batch descriptor */ + + struct idxd_completion *batch_comp_ring; + unsigned short *batch_idx_ring; /* store where each batch ends */ + + struct rte_dma_stats stats; + + rte_iova_t batch_iova; /* base address of the batch comp ring */ + rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ + + unsigned short max_batch_size; + + struct rte_dma_dev *dmadev; + struct rte_dma_vchan_conf qcfg; + uint8_t sva_support; + uint8_t qid; + + union { + struct { + unsigned int dsa_id; + } bus; + } u; +}; + +int idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); + #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 45418077f4..da5dc2b019 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -8,5 +8,6 @@ endif deps += ['bus_pci'] sources = files( 'idxd_bus.c', + 'idxd_common.c', 'idxd_pci.c' ) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
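Note on the "+1" in the batch ring allocation above: when a ring is tracked only by a read and a write index, a ring with every slot in use looks the same as an empty one (read == write), so one slot is kept spare. The sketch below shows that convention on a plain modulo ring. The type and function names and the MAX_BATCHES value are illustrative assumptions, not the driver's actual batch bookkeeping.

```c
#include <stdbool.h>

#define MAX_BATCHES 3                 /* usable entries (illustrative value) */
#define RING_SLOTS (MAX_BATCHES + 1)  /* one spare slot, as in the patch */

struct idx_ring {
	unsigned short slot[RING_SLOTS];
	unsigned short read;   /* next entry to consume */
	unsigned short write;  /* next free slot */
};

/* Empty when both indexes coincide. */
bool
ring_is_empty(const struct idx_ring *r)
{
	return r->read == r->write;
}

/* Full when advancing write would make it collide with read. */
bool
ring_is_full(const struct idx_ring *r)
{
	return (r->write + 1) % RING_SLOTS == r->read;
}

/* Store val and advance write; fails if the ring is full. */
bool
ring_push(struct idx_ring *r, unsigned short val)
{
	if (ring_is_full(r))
		return false;
	r->slot[r->write] = val;
	r->write = (r->write + 1) % RING_SLOTS;
	return true;
}

/* Fetch the oldest entry and advance read; fails if the ring is empty. */
bool
ring_pop(struct idx_ring *r, unsigned short *val)
{
	if (ring_is_empty(r))
		return false;
	*val = r->slot[r->read];
	r->read = (r->read + 1) % RING_SLOTS;
	return true;
}
```

The cost of this scheme is one unused slot per ring. In exchange, no separate element counter has to be kept in step with the two indexes.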
* [dpdk-dev] [PATCH v6 05/16] dma/idxd: create dmadev instances on pci probe 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (3 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 06/16] dma/idxd: add datapath structures Kevin Laatz ` (10 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the PCI probe, create a dmadev instance for each HW queue. HW definitions required are also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v4: rebase changes v6: add close function for device destroy and cleanup --- drivers/dma/idxd/idxd_hw_defs.h | 71 ++++++++ drivers/dma/idxd/idxd_internal.h | 16 ++ drivers/dma/idxd/idxd_pci.c | 285 ++++++++++++++++++++++++++++++- 3 files changed, 369 insertions(+), 3 deletions(-) create mode 100644 drivers/dma/idxd/idxd_hw_defs.h diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h new file mode 100644 index 0000000000..ea627cba6d --- /dev/null +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_HW_DEFS_H_ +#define _IDXD_HW_DEFS_H_ + +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ + +#define IDXD_CMD_SHIFT 20 +enum rte_idxd_cmds { + idxd_enable_dev = 1, + idxd_disable_dev, + idxd_drain_all, + idxd_abort_all, + idxd_reset_device, + idxd_enable_wq, + idxd_disable_wq, + idxd_drain_wq, + idxd_abort_wq, + idxd_reset_wq, +}; + +/* General bar0 registers */ +struct rte_idxd_bar0 { + uint32_t 
__rte_cache_aligned version; /* offset 0x00 */ + uint64_t __rte_aligned(0x10) gencap; /* offset 0x10 */ + uint64_t __rte_aligned(0x10) wqcap; /* offset 0x20 */ + uint64_t __rte_aligned(0x10) grpcap; /* offset 0x30 */ + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ +}; + +/* workqueue config is provided by array of uint32_t. */ +enum rte_idxd_wqcfg { + wq_size_idx, /* size is in first 32-bit value */ + wq_threshold_idx, /* WQ threshold second 32-bits */ + wq_mode_idx, /* WQ mode and other flags */ + wq_sizes_idx, /* WQ transfer and batch sizes */ + wq_occ_int_idx, /* WQ occupancy interrupt handle */ + wq_occ_limit_idx, /* WQ occupancy limit */ + wq_state_idx, /* WQ state and occupancy state */ +}; + +#define WQ_MODE_SHARED 0 +#define WQ_MODE_DEDICATED 1 +#define WQ_PRIORITY_SHIFT 4 +#define WQ_BATCH_SZ_SHIFT 5 +#define WQ_STATE_SHIFT 30 +#define WQ_STATE_MASK 0x3 + +struct rte_idxd_grpcfg { + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ + uint64_t grpengcfg; /* offset 32 */ + uint32_t grpflags; /* offset 40 */ +}; + +#define GENSTS_DEV_STATE_MASK 0x03 +#define CMDSTATUS_ACTIVE_SHIFT 31 +#define CMDSTATUS_ACTIVE_MASK (1 << 31) +#define CMDSTATUS_ERR_MASK 0xFF + +#endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index fa6f053f72..cb3a68c69b 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -5,6 +5,10 @@ #ifndef _IDXD_INTERNAL_H_ #define 
_IDXD_INTERNAL_H_ +#include <rte_spinlock.h> + +#include "idxd_hw_defs.h" + /** * @file idxd_internal.h * @@ -24,6 +28,16 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_pci_common { + rte_spinlock_t lk; + + uint8_t wq_cfg_sz; + volatile struct rte_idxd_bar0 *regs; + volatile uint32_t *wq_regs_base; + volatile struct rte_idxd_grpcfg *grp_regs; + volatile void *portals; +}; + struct idxd_dmadev { /* counters to track the batches */ unsigned short max_batches; @@ -58,6 +72,8 @@ struct idxd_dmadev { struct { unsigned int dsa_id; } bus; + + struct idxd_pci_common *pci; } u; }; diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 79e4aadcab..0c03a51449 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -3,6 +3,9 @@ */ #include <rte_bus_pci.h> +#include <rte_devargs.h> +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> #include "idxd_internal.h" @@ -16,17 +19,293 @@ const struct rte_pci_id pci_id_idxd_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static inline int +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) +{ + uint8_t err_code; + uint16_t qid = idxd->qid; + int i = 0; + + if (command >= idxd_disable_wq && command <= idxd_reset_wq) + qid = (1 << qid); + rte_spinlock_lock(&idxd->u.pci->lk); + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; + + do { + rte_pause(); + err_code = idxd->u.pci->regs->cmdstatus; + if (++i >= 1000) { + IDXD_PMD_ERR("Timeout waiting for command response from HW"); + rte_spinlock_unlock(&idxd->u.pci->lk); + return err_code; + } + } while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK); + rte_spinlock_unlock(&idxd->u.pci->lk); + + return err_code & CMDSTATUS_ERR_MASK; +} + +static uint32_t * +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) +{ + return RTE_PTR_ADD(pci->wq_regs_base, + 
(uintptr_t)wq_idx << (5 + pci->wq_cfg_sz)); +} + +static int +idxd_is_wq_enabled(struct idxd_dmadev *idxd) +{ + uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx]; + return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; +} + +static int +idxd_pci_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint8_t err_code; + + /* disable the device */ + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); + if (err_code) { + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device disabled OK"); + + /* free device memory */ + IDXD_PMD_DEBUG("Freeing device driver memory"); + rte_free(idxd->batch_idx_ring); + rte_free(idxd->desc_ring); + + return 0; +} + +static const struct rte_dma_dev_ops idxd_pci_ops = { + .dev_close = idxd_pci_dev_close, +}; + +/* each portal uses 4 x 4k pages */ +#define IDXD_PORTAL_SIZE (4096 * 4) + +static int +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, + unsigned int max_queues) +{ + struct idxd_pci_common *pci; + uint8_t nb_groups, nb_engines, nb_wqs; + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ + uint16_t wq_size, total_wq_size; + uint8_t lg2_max_batch, lg2_max_copy_size; + unsigned int i, err_code; + + pci = malloc(sizeof(*pci)); + if (pci == NULL) { + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); + goto err; + } + rte_spinlock_init(&pci->lk); + + /* assign the bar registers, and then configure device */ + pci->regs = dev->mem_resource[0].addr; + grp_offset = (uint16_t)pci->regs->offsets[0]; + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); + pci->portals = dev->mem_resource[2].addr; + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; + + /* sanity check device status */ + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { + /* need 
function-level-reset (FLR) or is enabled */ + IDXD_PMD_ERR("Device status is not disabled, cannot init"); + goto err; + } + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { + /* command in progress */ + IDXD_PMD_ERR("Device has a command in progress, cannot init"); + goto err; + } + + /* read basic info about the hardware for use when configuring */ + nb_groups = (uint8_t)pci->regs->grpcap; + nb_engines = (uint8_t)pci->regs->engcap; + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); + total_wq_size = (uint16_t)pci->regs->wqcap; + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F; + lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F; + + IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u", + nb_groups, nb_engines, nb_wqs); + + /* zero out any old config */ + for (i = 0; i < nb_groups; i++) { + pci->grp_regs[i].grpengcfg = 0; + pci->grp_regs[i].grpwqcfg[0] = 0; + } + for (i = 0; i < nb_wqs; i++) + idxd_get_wq_cfg(pci, i)[0] = 0; + + /* limit queues if necessary */ + if (max_queues != 0 && nb_wqs > max_queues) { + nb_wqs = max_queues; + if (nb_engines > max_queues) + nb_engines = max_queues; + if (nb_groups > max_queues) + nb_engines = max_queues; + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); + } + + /* put each engine into a separate group to avoid reordering */ + if (nb_groups > nb_engines) + nb_groups = nb_engines; + if (nb_groups < nb_engines) + nb_engines = nb_groups; + + /* assign engines to groups, round-robin style */ + for (i = 0; i < nb_engines; i++) { + IDXD_PMD_DEBUG("Assigning engine %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); + } + + /* now do the same for queues and give work slots to each queue */ + wq_size = total_wq_size / nb_wqs; + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", + wq_size, lg2_max_batch, lg2_max_copy_size); + for (i = 0; i < nb_wqs; i++) { + /* add engine "i" to a group */ + IDXD_PMD_DEBUG("Assigning work queue %u to group 
%u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); + /* now configure it, in terms of size, max batch, mode */ + idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size; + idxd_get_wq_cfg(pci, i)[wq_mode_idx] = (1 << WQ_PRIORITY_SHIFT) | + WQ_MODE_DEDICATED; + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); + } + + /* dump the group configuration to output */ + for (i = 0; i < nb_groups; i++) { + IDXD_PMD_DEBUG("## Group %d", i); + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); + IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg); + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags); + } + + idxd->u.pci = pci; + idxd->max_batches = wq_size; + + /* enable the device itself */ + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); + if (err_code) { + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device enabled OK"); + + return nb_wqs; + +err: + free(pci); + return -1; +} + static int idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) { - int ret = 0; + struct idxd_dmadev idxd = {0}; + uint8_t nb_wqs; + int qid, ret = 0; char name[PCI_PRI_STR_SIZE]; + unsigned int max_queues = 0; rte_pci_device_name(&dev->addr, name, sizeof(name)); IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); dev->device.driver = &drv->driver; - return ret; + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { + /* if the number of devargs grows beyond just 1, use rte_kvargs */ + if (sscanf(dev->device.devargs->args, + "max_queues=%u", &max_queues) != 1) { + IDXD_PMD_ERR("Invalid device parameter: '%s'", + dev->device.devargs->args); + return -1; + } + } + + ret = init_pci_device(dev, &idxd, max_queues); + if (ret < 0) { + IDXD_PMD_ERR("Error initializing PCI hardware"); + return ret; + } + if (idxd.u.pci->portals == NULL) { + IDXD_PMD_ERR("Error, 
invalid portal assigned during initialization\n"); + return -EINVAL; + } + nb_wqs = (uint8_t)ret; + + /* set up one device for each queue */ + for (qid = 0; qid < nb_wqs; qid++) { + char qname[32]; + + /* add the queue number to each device name */ + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); + idxd.qid = qid; + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, + qid * IDXD_PORTAL_SIZE); + if (idxd_is_wq_enabled(&idxd)) + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); + ret = idxd_dmadev_create(qname, &dev->device, + &idxd, &idxd_pci_ops); + if (ret != 0) { + IDXD_PMD_ERR("Failed to create dmadev %s", name); + if (qid == 0) /* if no devices using this, free pci */ + free(idxd.u.pci); + return ret; + } + } + + return 0; +} + +static int +idxd_dmadev_destroy(const char *name) +{ + int ret; + struct rte_dma_dev *dmadev; + struct idxd_dmadev *idxd; + int dev_id = rte_dma_get_dev_id(name); + + if (!name) { + IDXD_PMD_ERR("Invalid device name"); + return -EINVAL; + } + + if (dev_id < 0) { + IDXD_PMD_ERR("Invalid device ID"); + return -EINVAL; + } + + dmadev = &rte_dma_devices[dev_id]; + if (!dmadev) { + IDXD_PMD_ERR("Invalid device name (%s)", name); + return -EINVAL; + } + + idxd = dmadev->dev_private; + if (!idxd) { + IDXD_PMD_ERR("Error getting dev_private"); + return -EINVAL; + } + + /* rte_dma_close is called by pmd_release */ + ret = rte_dma_pmd_release(name); + if (ret) + IDXD_PMD_DEBUG("Device cleanup failed"); + + return 0; } static int @@ -39,7 +318,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev) IDXD_PMD_INFO("Closing %s on NUMA node %d", name, dev->device.numa_node); - return 0; + return idxd_dmadev_destroy(name); } struct rte_pci_driver idxd_pmd_drv_pci = { -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
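The group layout computed in init_pci_device() above can be tried out away from the hardware. What follows is only a minimal sketch under stated assumptions, not the driver code: `struct sketch_layout`, `sketch_plan_layout()` and `SKETCH_MAX_GROUPS` are hypothetical names, and plain arrays stand in for the group registers. It trims the counts so that each group holds one engine, then hands engines and work queues to groups round-robin and splits the total work queue space evenly. One deliberate difference: the third check under the max_queues limit clamps nb_groups here, which is my reading of the intent, whereas the patch as posted assigns nb_engines at that point.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_GROUPS 16 /* hypothetical bound, not taken from the patch */

/* Hypothetical stand-in for the per-group registers and derived counts. */
struct sketch_layout {
	uint8_t nb_groups;
	uint8_t nb_engines;
	uint8_t nb_wqs;
	uint16_t wq_size;
	uint64_t grpengcfg[SKETCH_MAX_GROUPS]; /* bitmask of engines per group */
	uint64_t grpwqcfg[SKETCH_MAX_GROUPS];  /* bitmask of work queues per group */
};

static void
sketch_plan_layout(struct sketch_layout *l, uint8_t nb_groups, uint8_t nb_engines,
		uint8_t nb_wqs, uint16_t total_wq_size, unsigned int max_queues)
{
	unsigned int i;

	assert(nb_groups > 0 && nb_groups <= SKETCH_MAX_GROUPS);
	assert(nb_engines > 0 && nb_engines <= 64);
	assert(nb_wqs > 0 && nb_wqs <= 64);
	memset(l, 0, sizeof(*l));

	/* honour an optional max_queues=N limit (0 means no limit) */
	if (max_queues != 0 && nb_wqs > max_queues) {
		nb_wqs = (uint8_t)max_queues;
		if (nb_engines > max_queues)
			nb_engines = (uint8_t)max_queues;
		if (nb_groups > max_queues)
			nb_groups = (uint8_t)max_queues; /* assumption, see lead-in */
	}

	/* one engine per group, so trim whichever count is larger */
	if (nb_groups > nb_engines)
		nb_groups = nb_engines;
	if (nb_groups < nb_engines)
		nb_engines = nb_groups;

	/* round-robin assignment: item i lands in group i % nb_groups */
	for (i = 0; i < nb_engines; i++)
		l->grpengcfg[i % nb_groups] |= (1ULL << i);
	for (i = 0; i < nb_wqs; i++)
		l->grpwqcfg[i % nb_groups] |= (1ULL << i);

	l->nb_groups = nb_groups;
	l->nb_engines = nb_engines;
	l->nb_wqs = nb_wqs;
	l->wq_size = (uint16_t)(total_wq_size / nb_wqs);
}
```

With 4 groups, 4 engines and 8 work queues, group 0 ends up with work queues 0 and 4. Passing max_queues=2 shrinks the layout to two groups with one engine and one queue each.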
* [dpdk-dev] [PATCH v6 06/16] dma/idxd: add datapath structures 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (4 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 07/16] dma/idxd: add configure and info_get functions Kevin Laatz ` (9 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data structures required for the data path for IDXD devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v2: add completion status for invalid opcode --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 33 ++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 60 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 ++ drivers/dma/idxd/idxd_pci.c | 1 + 5 files changed, 98 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b48fa954ed..3c0837ec52 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -95,6 +95,7 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, + .dev_dump = idxd_dump, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 130153f7d2..b285fda65b 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -10,6 +10,35 @@ #define IDXD_PMD_NAME_STR "dmadev_idxd" +int +idxd_dump(const struct rte_dma_dev *dev, FILE *f) +{ + struct idxd_dmadev *idxd = dev->dev_private; + unsigned int i; + + fprintf(f, "== IDXD Private Data ==\n"); + fprintf(f, " Portal: %p\n", 
idxd->portal); + fprintf(f, " Config: { ring_size: %u }\n", + idxd->qcfg.nb_desc); + fprintf(f, " Batch ring (sz = %u, max_batches = %u):\n\t", + idxd->max_batches + 1, idxd->max_batches); + for (i = 0; i <= idxd->max_batches; i++) { + fprintf(f, " %u ", idxd->batch_idx_ring[i]); + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) + fprintf(f, "[rd ptr, wr ptr] "); + else if (i == idxd->batch_idx_read) + fprintf(f, "[rd ptr] "); + else if (i == idxd->batch_idx_write) + fprintf(f, "[wr ptr] "); + if (i == idxd->max_batches) + fprintf(f, "\n"); + } + + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, @@ -19,6 +48,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, struct rte_dma_dev *dmadev = NULL; int ret = 0; + RTE_BUILD_BUG_ON(sizeof(struct idxd_hw_desc) != 64); + RTE_BUILD_BUG_ON(offsetof(struct idxd_hw_desc, size) != 32); + RTE_BUILD_BUG_ON(sizeof(struct idxd_completion) != 32); + if (!name) { IDXD_PMD_ERR("Invalid name of the device!"); ret = -EINVAL; diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index ea627cba6d..55ca9f7f52 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -5,6 +5,66 @@ #ifndef _IDXD_HW_DEFS_H_ #define _IDXD_HW_DEFS_H_ +/* + * Defines used in the data path for interacting with IDXD hardware. + */ +#define IDXD_CMD_OP_SHIFT 24 +enum rte_idxd_ops { + idxd_op_nop = 0, + idxd_op_batch, + idxd_op_drain, + idxd_op_memmove, + idxd_op_fill +}; + +#define IDXD_FLAG_FENCE (1 << 0) +#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2) +#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3) +#define IDXD_FLAG_CACHE_CONTROL (1 << 8) + +/** + * Hardware descriptor used by DSA hardware, for both bursts and + * for individual operations. 
+ */ +struct idxd_hw_desc { + uint32_t pasid; + uint32_t op_flags; + rte_iova_t completion; + + RTE_STD_C11 + union { + rte_iova_t src; /* source address for copy ops etc. */ + rte_iova_t desc_addr; /* descriptor pointer for batch */ + }; + rte_iova_t dst; + + uint32_t size; /* length of data for op, or batch size */ + + uint16_t intr_handle; /* completion interrupt handle */ + + /* remaining 26 bytes are reserved */ + uint16_t __reserved[13]; +} __rte_aligned(64); + +#define IDXD_COMP_STATUS_INCOMPLETE 0 +#define IDXD_COMP_STATUS_SUCCESS 1 +#define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 +#define IDXD_COMP_STATUS_INVALID_SIZE 0x13 +#define IDXD_COMP_STATUS_SKIPPED 0xFF /* not official IDXD error, needed as placeholder */ + +/** + * Completion record structure written back by DSA + */ +struct idxd_completion { + uint8_t status; + uint8_t result; + /* 16-bits pad here */ + uint32_t completed_size; /* data length, or descriptors for batch */ + + rte_iova_t fault_address; + uint32_t invalid_flags; +} __rte_aligned(32); + /*** Definitions for Intel(R) Data Streaming Accelerator ***/ #define IDXD_CMD_SHIFT 20 diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index cb3a68c69b..99c8e04302 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -39,6 +39,8 @@ struct idxd_pci_common { }; struct idxd_dmadev { + struct idxd_hw_desc *desc_ring; + /* counters to track the batches */ unsigned short max_batches; unsigned short batch_idx_read; @@ -79,5 +81,6 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); +int idxd_dump(const struct rte_dma_dev *dev, FILE *f); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 0c03a51449..add241d172 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -83,6 +83,7 @@ idxd_pci_dev_close(struct 
rte_dma_dev *dev)
 
 static const struct rte_dma_dev_ops idxd_pci_ops = {
 	.dev_close = idxd_pci_dev_close,
+	.dev_dump = idxd_dump,
 };
 
 /* each portal uses 4 x 4k pages */
-- 
2.30.2
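The RTE_BUILD_BUG_ON checks added in this patch pin down the hardware layout: a 64-byte descriptor with the size field at offset 32, and a 32-byte completion record. The same layout can be checked in plain C11 without any DPDK headers. This is a sketch under stated assumptions: the `sketch_` names are hypothetical, `sketch_iova_t` is assumed to be a 64-bit integer like rte_iova_t, and `alignas` on the first member stands in for __rte_aligned.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_iova_t; /* assumed stand-in for rte_iova_t */

/* same ordering as enum rte_idxd_ops in the patch */
enum sketch_ops {
	SKETCH_OP_NOP = 0,
	SKETCH_OP_BATCH,
	SKETCH_OP_DRAIN,
	SKETCH_OP_MEMMOVE,
	SKETCH_OP_FILL
};

#define SKETCH_CMD_OP_SHIFT 24
#define SKETCH_FLAG_CACHE_CONTROL (1u << 8)

struct sketch_hw_desc {
	alignas(64) uint32_t pasid;
	uint32_t op_flags;
	sketch_iova_t completion;
	union {
		sketch_iova_t src;       /* source address for copy ops */
		sketch_iova_t desc_addr; /* descriptor pointer for batch */
	};
	sketch_iova_t dst;
	uint32_t size;        /* data length, or batch size */
	uint16_t intr_handle;
	uint16_t reserved[13]; /* remaining 26 bytes */
};

struct sketch_completion {
	alignas(32) uint8_t status;
	uint8_t result;
	/* 16 bits of compiler-inserted padding here */
	uint32_t completed_size;
	sketch_iova_t fault_address;
	uint32_t invalid_flags;
};

/* compile-time equivalents of the RTE_BUILD_BUG_ON checks */
static_assert(sizeof(struct sketch_hw_desc) == 64, "descriptor must be 64 bytes");
static_assert(offsetof(struct sketch_hw_desc, size) == 32, "size must sit at offset 32");
static_assert(sizeof(struct sketch_completion) == 32, "completion must be 32 bytes");

/* the opcode lives in the top byte of op_flags, flags in the low bits */
static uint32_t
sketch_op_flags(enum sketch_ops op, uint32_t flags)
{
	return ((uint32_t)op << SKETCH_CMD_OP_SHIFT) | flags;
}
```

The offset-32 check matters for the submission path later in the series, which writes each descriptor as two 32-byte stores.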
* [dpdk-dev] [PATCH v6 07/16] dma/idxd: add configure and info_get functions 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (5 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 06/16] dma/idxd: add datapath structures Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz ` (8 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add functions for device configuration. The info_get function is included here since it can be useful for checking successful configuration. Documentation is also updated to add device configuration usage info. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v2: - fix reconfigure bug in idxd_vchan_setup() - add literal include comment for the docs to pick up v3: - fixes needed after changes from rebasing v6: - update doc to reference library documentation to remove duplication - remove nb_vchans from info_get() since the lib fills it - add error handling capability flag to info_get --- doc/guides/dmadevs/idxd.rst | 19 +++++++++ doc/guides/prog_guide/dmadev.rst | 4 ++ drivers/dma/idxd/idxd_bus.c | 3 ++ drivers/dma/idxd/idxd_common.c | 71 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 6 +++ drivers/dma/idxd/idxd_pci.c | 3 ++ 6 files changed, 106 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index ce33e2857a..42efd59594 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -120,3 +120,22 @@ use a subset of configured queues. 
Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, that is a "DMA device type" inside DPDK, and can be accessed using APIs from the ``rte_dmadev`` library. + +Using IDXD DMAdev Devices +-------------------------- + +To use the devices from an application, the dmadev API can be used. + +Device Configuration +~~~~~~~~~~~~~~~~~~~~~ + +Refer to the :ref:`Device Configuration <dmadev_device_configuration>` and +:ref:`Configuration of Virtual DMA Channels <dmadev_vchan_configuration>` sections +of the dmadev library documentation for details on device configuration API usage. + +IDXD configuration requirements: + +* ``ring_size`` must be a power of two, between 64 and 4096. +* Only one ``vchan`` is supported per device (work queue). +* IDXD devices do not support silent mode. +* The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index de8b599d96..3612315325 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -63,6 +63,8 @@ identifiers: - A device name used to designate the DMA device in console messages, for administration or debugging purposes. +.. _dmadev_device_configuration: + Device Configuration ~~~~~~~~~~~~~~~~~~~~ @@ -79,6 +81,8 @@ for the DMA device for example the number of virtual DMA channels to set up, indication of whether to enable silent mode. +.. 
_dmadev_vchan_configuration: + Configuration of Virtual DMA Channels ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 3c0837ec52..b2acdac4f9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -96,6 +96,9 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index b285fda65b..32ddb5f7f8 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -39,6 +39,77 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + if (size < sizeof(*info)) + return -EINVAL; + + *info = (struct rte_dma_info) { + .dev_capa = RTE_DMA_CAPA_MEM_TO_MEM | RTE_DMA_CAPA_HANDLES_ERRORS | + RTE_DMA_CAPA_OPS_COPY | RTE_DMA_CAPA_OPS_FILL, + .max_vchans = 1, + .max_desc = 4096, + .min_desc = 64, + }; + if (idxd->sva_support) + info->dev_capa |= RTE_DMA_CAPA_SVA; + return 0; +} + +int +idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz) +{ + if (sizeof(struct rte_dma_conf) != conf_sz) + return -EINVAL; + + if (dev_conf->nb_vchans != 1) + return -EINVAL; + return 0; +} + +int +idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t max_desc = qconf->nb_desc; + + if (sizeof(struct rte_dma_vchan_conf) != qconf_sz) + return -EINVAL; + + idxd->qcfg = *qconf; + + if (!rte_is_power_of_2(max_desc)) + max_desc = rte_align32pow2(max_desc); + IDXD_PMD_DEBUG("DMA 
dev %u using %u descriptors", dev->data->dev_id, max_desc); + idxd->desc_ring_mask = max_desc - 1; + idxd->qcfg.nb_desc = max_desc; + + /* in case we are reconfiguring a device, free any existing memory */ + rte_free(idxd->desc_ring); + + /* allocate the descriptor ring at 2x size as batches can't wrap */ + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); + if (idxd->desc_ring == NULL) + return -ENOMEM; + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); + + idxd->batch_idx_read = 0; + idxd->batch_idx_write = 0; + idxd->batch_start = 0; + idxd->batch_size = 0; + idxd->ids_returned = 0; + idxd->ids_avail = 0; + + memset(idxd->batch_comp_ring, 0, sizeof(*idxd->batch_comp_ring) * + (idxd->max_batches + 1)); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 99c8e04302..fdd018ca35 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -82,5 +82,11 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); int idxd_dump(const struct rte_dma_dev *dev, FILE *f); +int idxd_configure(struct rte_dma_dev *dev, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz); +int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); +int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, + uint32_t size); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index add241d172..0ac5e5f30a 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -84,6 +84,9 @@ idxd_pci_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_close = idxd_pci_dev_close, .dev_dump = 
idxd_dump,
+	.dev_configure = idxd_configure,
+	.vchan_setup = idxd_vchan_setup,
+	.dev_info_get = idxd_info_get,
 };
 
 /* each portal uses 4 x 4k pages */
-- 
2.30.2
* [dpdk-dev] [PATCH v6 08/16] dma/idxd: add start and stop functions for pci devices 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (6 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 07/16] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 09/16] dma/idxd: add data-path job submission functions Kevin Laatz ` (7 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add device start/stop functions for DSA devices bound to vfio. For devices bound to the IDXD kernel driver, these are not required since the IDXD kernel driver takes care of this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v6: fix return values of start and stop functions --- doc/guides/dmadevs/idxd.rst | 3 +++ drivers/dma/idxd/idxd_pci.c | 51 +++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 42efd59594..da5e51bfa7 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -139,3 +139,6 @@ IDXD configuration requirements: * Only one ``vchan`` is supported per device (work queue). * IDXD devices do not support silent mode. * The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. + +Once configured, the device can then be made ready for use by calling the +``rte_dma_start()`` API. 
diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 0ac5e5f30a..86a033862b 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -59,6 +59,55 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; } +static int +idxd_pci_dev_stop(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint8_t err_code; + + if (!idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); + return -EALREADY; + } + + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); + if (err_code || idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); + + return 0; +} + +static int +idxd_pci_dev_start(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint8_t err_code; + + if (idxd_is_wq_enabled(idxd)) { + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); + return 0; + } + + if (idxd->desc_ring == NULL) { + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); + return -EINVAL; + } + + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); + if (err_code || !idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); + + return 0; +} + static int idxd_pci_dev_close(struct rte_dma_dev *dev) { @@ -87,6 +136,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .dev_start = idxd_pci_dev_start, + .dev_stop = idxd_pci_dev_stop, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
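The start and stop functions in this patch follow a small state machine, and it may be clearer seen in isolation. The sketch below models it with a fake work queue; `struct sketch_wq`, `sketch_command()` and the `cmd_err` field are hypothetical and only imitate the device command path. It keeps the return convention from the patch: 0 on success, the negated error code if the command failed, and -1 if the command reported success but the queue state did not change.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one work queue. */
struct sketch_wq {
	bool enabled;        /* what idxd_is_wq_enabled() would report */
	bool ring_allocated; /* stands in for desc_ring != NULL */
	uint8_t cmd_err;     /* error code the fake device returns next */
};

/* fake device command: only changes state when no error is injected */
static uint8_t
sketch_command(struct sketch_wq *wq, bool enable)
{
	if (wq->cmd_err == 0)
		wq->enabled = enable;
	return wq->cmd_err;
}

/* same mapping as the patch: err_code == 0 ? -1 : -err_code on failure */
static int
sketch_result(uint8_t err_code, bool state_ok)
{
	if (err_code == 0 && state_ok)
		return 0;
	return err_code == 0 ? -1 : -(int)err_code;
}

static int
sketch_start(struct sketch_wq *wq)
{
	uint8_t err;

	assert(wq != NULL);
	if (wq->enabled)
		return 0;       /* already running, only a warning in the patch */
	if (!wq->ring_allocated)
		return -EINVAL; /* vchan setup has not been done yet */

	err = sketch_command(wq, true);
	return sketch_result(err, wq->enabled);
}

static int
sketch_stop(struct sketch_wq *wq)
{
	uint8_t err;

	assert(wq != NULL);
	if (!wq->enabled)
		return -EALREADY;

	err = sketch_command(wq, false);
	return sketch_result(err, !wq->enabled);
}
```

Note the asymmetry carried over from the patch: starting an already enabled queue returns 0, while stopping an already disabled queue returns -EALREADY.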
* [dpdk-dev] [PATCH v6 09/16] dma/idxd: add data-path job submission functions 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (7 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 10/16] dma/idxd: add data-path job completion functions Kevin Laatz ` (6 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data path functions for enqueuing and submitting operations to DSA devices. Documentation updates are included for dmadev library and IDXD driver docs as appropriate. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v6: - add references to dmadev lib docs for generic info - fix return values in "__idxd_write_desc()" --- doc/guides/dmadevs/idxd.rst | 9 ++ doc/guides/prog_guide/dmadev.rst | 19 +++++ drivers/dma/idxd/idxd_common.c | 136 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 ++ drivers/dma/idxd/meson.build | 1 + 5 files changed, 170 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index da5e51bfa7..b3d78482be 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -142,3 +142,12 @@ IDXD configuration requirements: Once configured, the device can then be made ready for use by calling the ``rte_dma_start()`` API. + +Performing Data Copies +~~~~~~~~~~~~~~~~~~~~~~~ + +Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library +documentation for details on operation enqueue and submission API usage. 
+
+It is expected that, for efficiency reasons, a burst of operations will be enqueued to the
+device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function.
diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst
index 3612315325..4908e33762 100644
--- a/doc/guides/prog_guide/dmadev.rst
+++ b/doc/guides/prog_guide/dmadev.rst
@@ -108,6 +108,8 @@ can be used to get the device info and supported features.
 Silent mode is a special device capability which does not require the
 application to invoke dequeue APIs.
 
+.. _dmadev_enqueue_dequeue:
+
 Enqueue / Dequeue APIs
 ~~~~~~~~~~~~~~~~~~~~~~
 
@@ -121,6 +123,23 @@ The ``rte_dma_submit`` API is used to issue doorbell to hardware.
 Alternatively the ``RTE_DMA_OP_FLAG_SUBMIT`` flag can be passed to the enqueue
 APIs to also issue the doorbell to hardware.
 
+The following code demonstrates how to enqueue a burst of copies to the
+device and start the hardware processing of them:
+
+.. code-block:: C
+
+   struct rte_mbuf *srcs[DMA_BURST_SZ], *dsts[DMA_BURST_SZ];
+   unsigned int i;
+
+   for (i = 0; i < RTE_DIM(srcs); i++) {
+      if (rte_dma_copy(dev_id, vchan, rte_pktmbuf_iova(srcs[i]),
+            rte_pktmbuf_iova(dsts[i]), COPY_LEN, 0) < 0) {
+         PRINT_ERR("Error with rte_dma_copy for buffer %u\n", i);
+         return -1;
+      }
+   }
+   rte_dma_submit(dev_id, vchan);
+
 There are two dequeue APIs ``rte_dma_completed`` and
 ``rte_dma_completed_status``, these are used to obtain the results of the
 enqueue requests.
``rte_dma_completed`` will return the number of successfully diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 32ddb5f7f8..1580f5029c 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,14 +2,144 @@ * Copyright 2021 Intel Corporation */ +#include <x86intrin.h> + #include <rte_dmadev_pmd.h> #include <rte_malloc.h> #include <rte_common.h> +#include <rte_prefetch.h> #include "idxd_internal.h" #define IDXD_PMD_NAME_STR "dmadev_idxd" +static __rte_always_inline rte_iova_t +__desc_idx_to_iova(struct idxd_dmadev *idxd, uint16_t n) +{ + return idxd->desc_iova + (n * sizeof(struct idxd_hw_desc)); +} + +static __rte_always_inline void +__idxd_movdir64b(volatile void *dst, const struct idxd_hw_desc *src) +{ + asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : + : "a" (dst), "d" (src) + : "memory"); +} + +static __rte_always_inline void +__submit(struct idxd_dmadev *idxd) +{ + rte_prefetch1(&idxd->batch_comp_ring[idxd->batch_idx_read]); + + if (idxd->batch_size == 0) + return; + + /* write completion to batch comp ring */ + rte_iova_t comp_addr = idxd->batch_iova + + (idxd->batch_idx_write * sizeof(struct idxd_completion)); + + if (idxd->batch_size == 1) { + /* submit batch directly */ + struct idxd_hw_desc desc = + idxd->desc_ring[idxd->batch_start & idxd->desc_ring_mask]; + desc.completion = comp_addr; + desc.op_flags |= IDXD_FLAG_REQUEST_COMPLETION; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &desc); + } else { + const struct idxd_hw_desc batch_desc = { + .op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_COMPLETION_ADDR_VALID | + IDXD_FLAG_REQUEST_COMPLETION, + .desc_addr = __desc_idx_to_iova(idxd, + idxd->batch_start & idxd->desc_ring_mask), + .completion = comp_addr, + .size = idxd->batch_size, + }; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &batch_desc); + } + + if (++idxd->batch_idx_write > 
idxd->max_batches) + idxd->batch_idx_write = 0; + + idxd->batch_start += idxd->batch_size; + idxd->batch_size = 0; + idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; + _mm256_store_si256((void *)&idxd->batch_comp_ring[idxd->batch_idx_write], + _mm256_setzero_si256()); +} + +static __rte_always_inline int +__idxd_write_desc(struct rte_dma_dev *dev, + const uint32_t op_flags, + const rte_iova_t src, + const rte_iova_t dst, + const uint32_t size, + const uint32_t flags) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t mask = idxd->desc_ring_mask; + uint16_t job_id = idxd->batch_start + idxd->batch_size; + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; + + /* first check batch ring space then desc ring space */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return -ENOSPC; + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) + return -ENOSPC; + + /* write desc. 
Note: descriptors don't wrap, but the completion address does */ + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, + _mm256_set_epi64x(0, 0, 0, size)); + + idxd->batch_size++; + + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) + __submit(idxd); + + return job_id; +} + +int +idxd_enqueue_copy(struct rte_dma_dev *dev, uint16_t qid __rte_unused, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, + * but check it at compile time to be sure. + */ + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, memmove, src, dst, length, flags); +} + +int +idxd_enqueue_fill(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + uint32_t fill = (idxd_op_fill << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev, fill, pattern, dst, length, flags); +} + +int +idxd_submit(struct rte_dma_dev *dev, uint16_t qid __rte_unused) +{ + __submit(dev->dev_private); + return 0; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -139,6 +269,12 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->dev_ops = ops; dmadev->device = dev; + dmadev->copy = idxd_enqueue_copy; + dmadev->fill = idxd_enqueue_fill; + dmadev->submit = idxd_submit; + dmadev->completed = idxd_completed; + dmadev->completed_status = idxd_completed_status; + idxd = 
dmadev->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ idxd->dmadev = dmadev; diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index fdd018ca35..b66c2d0182 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -88,5 +88,10 @@ int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, uint32_t size); +int idxd_enqueue_copy(struct rte_dma_dev *dev, uint16_t qid, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_enqueue_fill(struct rte_dma_dev *dev, uint16_t qid, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_submit(struct rte_dma_dev *dev, uint16_t qid); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index da5dc2b019..3b5133c578 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -6,6 +6,7 @@ if is_windows endif deps += ['bus_pci'] +cflags += '-mavx2' # all platforms with idxd HW support AVX sources = files( 'idxd_bus.c', 'idxd_common.c', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
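The space checks and index updates in __idxd_write_desc() and __submit() above can be followed more easily with the hardware writes removed. The sketch below keeps only the bookkeeping; `struct sketch_ring`, `sketch_enqueue()` and `sketch_submit()` are hypothetical names, and the field names mirror those used in the patch. It shows two things: the batch ring has max_batches + 1 slots with one slot kept free, and the descriptor ring likewise refuses the write that would make the write index catch up with ids_returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Bookkeeping only; no descriptors or portals are modelled. */
struct sketch_ring {
	uint16_t desc_ring_mask; /* ring size - 1, ring size a power of two */
	uint16_t max_batches;    /* batch ring has max_batches + 1 slots */
	uint16_t batch_idx_read;
	uint16_t batch_idx_write;
	uint16_t batch_start;    /* job id of first op in the open batch */
	uint16_t batch_size;     /* ops enqueued but not yet submitted */
	uint16_t ids_returned;   /* job ids already handed back to the app */
};

/* returns the job id (0..65535) or -ENOSPC */
static int
sketch_enqueue(struct sketch_ring *r)
{
	uint16_t mask, job_id, write_idx;

	assert(r != NULL);
	mask = r->desc_ring_mask;
	job_id = (uint16_t)(r->batch_start + r->batch_size);
	/* only the batch start is masked; start + size may run past the end */
	write_idx = (uint16_t)((r->batch_start & mask) + r->batch_size);

	/* batch ring full: advancing write would land on read */
	if ((r->batch_idx_read == 0 && r->batch_idx_write == r->max_batches) ||
			r->batch_idx_write + 1 == r->batch_idx_read)
		return -ENOSPC;
	/* descriptor ring full: next slot still belongs to an unreturned job */
	if (((write_idx + 1) & mask) == (r->ids_returned & mask))
		return -ENOSPC;

	r->batch_size++;
	return job_id;
}

static void
sketch_submit(struct sketch_ring *r)
{
	assert(r != NULL);
	if (r->batch_size == 0)
		return;
	if (++r->batch_idx_write > r->max_batches)
		r->batch_idx_write = 0;
	r->batch_start = (uint16_t)(r->batch_start + r->batch_size);
	r->batch_size = 0;
}
```

With max_batches set to 2, two batches can be outstanding before enqueue returns -ENOSPC. With a 4-entry descriptor ring, three jobs fit until the application has had some job ids returned to it.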
* [dpdk-dev] [PATCH v6 10/16] dma/idxd: add data-path job completion functions 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (8 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 09/16] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 11/16] dma/idxd: add operation statistic tracking Kevin Laatz ` (5 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the data path functions for gathering completed operations. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v2: - fixed typo in docs - add completion status for invalid opcode v6: - update documentation to reduce duplication --- doc/guides/dmadevs/idxd.rst | 32 ++++- drivers/dma/idxd/idxd_common.c | 235 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 + 3 files changed, 271 insertions(+), 1 deletion(-) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index b3d78482be..2220e454bc 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -147,7 +147,37 @@ Performing Data Copies ~~~~~~~~~~~~~~~~~~~~~~~ Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library -documentation for details on operation enqueue and submission API usage. +documentation for details on operation enqueue, submission and completion API usage. It is expected that, for efficiency reasons, a burst of operations will be enqueued to the device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. 
+
+When gathering completions, ``rte_dma_completed()`` should be used, up until the point an error
+occurs in an operation. If an error was encountered, ``rte_dma_completed_status()`` must be used
+to kick the device off to continue processing operations and also to gather the status of each
+individual operation, which is filled in to the ``status`` array provided as a parameter by the
+application.
+
+The following status codes are supported by IDXD:
+
+* ``RTE_DMA_STATUS_SUCCESSFUL``: The operation was successful.
+* ``RTE_DMA_STATUS_INVALID_OPCODE``: The operation failed due to an invalid operation code.
+* ``RTE_DMA_STATUS_INVALID_LENGTH``: The operation failed due to an invalid data length.
+* ``RTE_DMA_STATUS_NOT_ATTEMPTED``: The operation was not attempted.
+* ``RTE_DMA_STATUS_ERROR_UNKNOWN``: The operation failed due to an unspecified error.
+
+The following code shows how to retrieve the number of successfully completed
+copies within a burst and then use ``rte_dma_completed_status()`` to check
+which operation failed and kick off the device to continue processing operations:
+
+.. code-block:: C
+
+   enum rte_dma_status_code status[COMP_BURST_SZ];
+   uint16_t count, idx, status_count;
+   bool error = false;
+
+   count = rte_dma_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error);
+
+   if (error) {
+      status_count = rte_dma_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status);
+   }
diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c
index 1580f5029c..76ef7d0378 100644
--- a/drivers/dma/idxd/idxd_common.c
+++ b/drivers/dma/idxd/idxd_common.c
@@ -140,6 +140,241 @@ idxd_submit(struct rte_dma_dev *dev, uint16_t qid __rte_unused)
 	return 0;
 }
 
+static enum rte_dma_status_code
+get_comp_status(struct idxd_completion *c)
+{
+	uint8_t st = c->status;
+	switch (st) {
+	/* successful descriptors are not written back normally */
+	case IDXD_COMP_STATUS_INCOMPLETE:
+	case IDXD_COMP_STATUS_SUCCESS:
+		return RTE_DMA_STATUS_SUCCESSFUL;
+	case IDXD_COMP_STATUS_INVALID_OPCODE:
+		return RTE_DMA_STATUS_INVALID_OPCODE;
+	case IDXD_COMP_STATUS_INVALID_SIZE:
+		return RTE_DMA_STATUS_INVALID_LENGTH;
+	case IDXD_COMP_STATUS_SKIPPED:
+		return RTE_DMA_STATUS_NOT_ATTEMPTED;
+	default:
+		return RTE_DMA_STATUS_ERROR_UNKNOWN;
+	}
+}
+
+static __rte_always_inline int
+batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status)
+{
+	uint16_t ret;
+	uint8_t bstatus;
+
+	if (max_ops == 0)
+		return 0;
+
+	/* first check if there are any unreturned handles from last time */
+	if (idxd->ids_avail != idxd->ids_returned) {
+		ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops);
+		idxd->ids_returned += ret;
+		if (status)
+			memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status));
+		return ret;
+	}
+
+	if (idxd->batch_idx_read == idxd->batch_idx_write)
+		return 0;
+
+	bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status;
+	/* now check if next batch is complete and successful */
+	if (bstatus == IDXD_COMP_STATUS_SUCCESS) {
+		/* since the batch idx ring stores the start of each batch, pre-increment to
lookup + * start of next batch. + */ + if (++idxd->batch_idx_read > idxd->max_batches) + idxd->batch_idx_read = 0; + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; + + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + /* check if batch is incomplete */ + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) + return 0; + + return -1; /* error case */ +} + +static inline uint16_t +batch_completed(struct idxd_dmadev *idxd, uint16_t max_ops, bool *has_error) +{ + uint16_t i; + uint16_t b_start, b_end, next_batch; + + int ret = batch_ok(idxd, max_ops, NULL); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. + * once we identify the actual failure job, return other jobs, then update + * the batch ring indexes to make it look like the first job of the batch has failed. + * Subsequent calls here will always return zero packets, and the error must be cleared by + * calling the completed_status() function. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + b_end = idxd->batch_idx_ring[next_batch]; + + if (b_end - b_start == 1) { /* not a batch */ + *has_error = true; + return 0; + } + + for (i = b_start; i < b_end; i++) { + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ + break; + } + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); + if (ret < max_ops) + *has_error = true; /* we got up to the point of error */ + idxd->ids_avail = idxd->ids_returned += ret; + + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { + /* copy over the descriptor status to the batch ring as if no batch */ + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; + } + + return ret; +} + +static uint16_t +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t next_batch; + + int ret = batch_ok(idxd, max_ops, status); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; + const uint16_t b_len = b_end - b_start; + if (b_len == 1) {/* not a batch */ + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + idxd->ids_avail++; + idxd->ids_returned++; + idxd->batch_idx_read = next_batch; + return 1; + } + + /* not a single-element batch, need to process more. + * Scenarios: + * 1. max_ops >= batch_size - can fit everything, simple case + * - loop through completed ops and then add on any not-attempted ones + * 2. max_ops < batch_size - can't fit everything, more complex case + * - loop through completed/incomplete and stop when hit max_ops + * - adjust the batch descriptor to update where we stopped, with appropriate bcount + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as + * non-batch next time. + */ + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; + for (ret = 0; ret < b_len && ret < max_ops; ret++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + } + idxd->ids_avail = idxd->ids_returned += ret; + + /* everything fit */ + if (ret == b_len) { + idxd->batch_idx_read = next_batch; + return ret; + } + + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (ret > bcount) { + /* we have only incomplete ones - set batch completed size to 0 */ + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size = 0; + /* if there is only one descriptor left, job skipped so set flag appropriately */ + if (b_len - ret == 1) + comp->status = IDXD_COMP_STATUS_SKIPPED; + } else { + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size -= ret; + /* if there is only one descriptor left, copy status info straight to desc */ + if (comp->completed_size == 1) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + comp->status = c->status; + /* individual descs can be ok without writeback, but not batches */ + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) + comp->status = IDXD_COMP_STATUS_SUCCESS; + } else if (bcount == b_len) { + /* check if we still have an error, and clear flag if not */ + uint16_t i; + for (i = b_start + ret; i < b_end; i++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) + break; + } + if (i == b_end) /* no errors */ + comp->status = IDXD_COMP_STATUS_SUCCESS; + } + } + + return ret; +} + +uint16_t +idxd_completed(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, bool *has_error) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed(idxd, max_ops - ret, has_error); + ret += batch; + } while (batch > 0 && *has_error == false); + + *last_idx = 
idxd->ids_returned - 1; + return ret; +} + +uint16_t +idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, enum rte_dma_status_code *status) +{ + struct idxd_dmadev *idxd = dev->dev_private; + + uint16_t batch, ret = 0; + + do { + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); + ret += batch; + } while (batch > 0); + + *last_idx = idxd->ids_returned - 1; + return ret; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index b66c2d0182..15115a0966 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -93,5 +93,10 @@ int idxd_enqueue_copy(struct rte_dma_dev *dev, uint16_t qid, rte_iova_t src, int idxd_enqueue_fill(struct rte_dma_dev *dev, uint16_t qid, uint64_t pattern, rte_iova_t dst, unsigned int length, uint64_t flags); int idxd_submit(struct rte_dma_dev *dev, uint16_t qid); +uint16_t idxd_completed(struct rte_dma_dev *dev, uint16_t qid, uint16_t max_ops, + uint16_t *last_idx, bool *has_error); +uint16_t idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, + uint16_t max_ops, uint16_t *last_idx, + enum rte_dma_status_code *status); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
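Editor's sketch, not part of the patch: the completion flow documented above (gather successes with the plain completed call, then switch to the status variant once an error is flagged) can be modelled without any DPDK dependency. Everything named sketch_* or SK_* below is hypothetical and exists only for illustration; the mock device simply replays a fixed array of per-job results rather than reading hardware completion records.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BURST 4	/* hypothetical completion burst size */

/* Hypothetical status codes for this sketch, not the rte_dma_status_code values. */
enum sketch_status { SK_OK = 0, SK_FAILED, SK_NOT_ATTEMPTED };

/* Hypothetical stand-in for a device: per-job results plus a read cursor. */
struct sketch_dev {
	const enum sketch_status *results;
	uint16_t nb_results;
	uint16_t next;	/* next job to hand back to the application */
};

/* Hand back successful jobs only, stopping in front of the first failure. */
static uint16_t
sketch_completed(struct sketch_dev *d, uint16_t max_ops, uint16_t *last_idx, bool *has_error)
{
	uint16_t n = 0;

	while (n < max_ops && d->next < d->nb_results) {
		if (d->results[d->next] != SK_OK) {
			*has_error = true;
			break;
		}
		d->next++;
		n++;
	}
	*last_idx = (uint16_t)(d->next - 1);
	return n;
}

/* Hand back jobs regardless of outcome, writing one status per job. */
static uint16_t
sketch_completed_status(struct sketch_dev *d, uint16_t max_ops, uint16_t *last_idx,
		enum sketch_status *status)
{
	uint16_t n = 0;

	while (n < max_ops && d->next < d->nb_results)
		status[n++] = d->results[d->next++];
	*last_idx = (uint16_t)(d->next - 1);
	return n;
}

/* Application side: drain everything, counting jobs that did not succeed. */
static uint16_t
sketch_drain(struct sketch_dev *d, uint16_t *nb_failed)
{
	enum sketch_status status[SKETCH_BURST];
	uint16_t total = 0, last_idx, n, i;
	bool has_error;

	*nb_failed = 0;
	do {
		has_error = false;
		n = sketch_completed(d, SKETCH_BURST, &last_idx, &has_error);
		total += n;
		if (has_error) {
			/* the error is only cleared by asking for per-job status */
			n = sketch_completed_status(d, SKETCH_BURST, &last_idx, status);
			for (i = 0; i < n; i++)
				if (status[i] != SK_OK)
					(*nb_failed)++;
			total += n;
		}
	} while (n > 0);

	return total;
}
```

With results of ok, ok, failed, not-attempted, ok, ok, a call to sketch_drain() gathers all six jobs and counts two that did not succeed. The real driver additionally has to deal with batch boundaries and partially returned batches, which this model leaves out.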
* [dpdk-dev] [PATCH v6 11/16] dma/idxd: add operation statistic tracking 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (9 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 10/16] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 12/16] dma/idxd: add vchan status function Kevin Laatz ` (4 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add statistic tracking for DSA devices. The dmadev library documentation is also updated to add a generic section for using the library's statistics APIs. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- v6: move doc update to dmadev library doc --- doc/guides/prog_guide/dmadev.rst | 11 +++++++++++ drivers/dma/idxd/idxd_bus.c | 2 ++ drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 +++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 45 insertions(+) diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index 4908e33762..b268dc8d46 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -148,3 +148,14 @@ completed operations along with the status of each operation (filled into the ``status`` array passed by user). These two APIs can also return the last completed operation's ``ring_idx`` which could help user track operations within their own application-defined rings. + + +Querying Device Statistics +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The statistics from a dmadev device can be retrieved via the statistics functions, +i.e. ``rte_dma_stats_get()``. 
The statistics returned for each device instance are: + +* ``submitted``: The number of operations submitted to the device. +* ``completed``: The number of operations which have completed (successful and failed). +* ``errors``: The number of operations that completed with error. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b2acdac4f9..b52ea02854 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -99,6 +99,8 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 76ef7d0378..7a3eb0a4c1 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -65,6 +65,8 @@ __submit(struct idxd_dmadev *idxd) if (++idxd->batch_idx_write > idxd->max_batches) idxd->batch_idx_write = 0; + idxd->stats.submitted += idxd->batch_size; + idxd->batch_start += idxd->batch_size; idxd->batch_size = 0; idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; @@ -275,6 +277,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ const uint16_t b_len = b_end - b_start; if (b_len == 1) {/* not a batch */ *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + if (*status != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; idxd->ids_avail++; idxd->ids_returned++; idxd->batch_idx_read = next_batch; @@ -296,6 +300,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ struct idxd_completion *c = (void *) &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + if (status[ret] != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; } idxd->ids_avail = idxd->ids_returned += ret; @@ -354,6 +360,7 @@ idxd_completed(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ ret += batch; } while (batch > 0 && *has_error == false); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -371,6 +378,7 @@ idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16 ret += batch; } while (batch > 0); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -404,6 +412,25 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + struct rte_dma_stats *stats, uint32_t stats_sz) +{ + struct idxd_dmadev *idxd = dev->dev_private; + if (stats_sz < sizeof(*stats)) + return -EINVAL; + *stats = idxd->stats; + return 0; +} + +int +idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->dev_private; + idxd->stats = (struct rte_dma_stats){0}; + return 0; +} + int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 15115a0966..e2a1119ef7 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -98,5 +98,8 @@ uint16_t idxd_completed(struct rte_dma_dev *dev, uint16_t qid, uint16_t max_ops, uint16_t idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unused, uint16_t max_ops, uint16_t *last_idx, enum rte_dma_status_code *status); +int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, + struct rte_dma_stats *stats, uint32_t stats_sz); +int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c 
index 86a033862b..cf91eb9c5e 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -136,6 +136,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, }; -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
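Editor's sketch, not part of the patch: the three counters described in the documentation hunk above (submitted, completed, errors) and the size-checked get/reset pair can be illustrated in isolation. The struct and the sketch_* functions are hypothetical stand-ins, not the rte_dma_stats definition, and status values are modelled as plain ints where zero is assumed to mean success.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical counter block mirroring the three documented statistics. */
struct sketch_stats {
	uint64_t submitted;	/* operations handed to the device */
	uint64_t completed;	/* operations returned, successful or not */
	uint64_t errors;	/* subset of completed that did not succeed */
};

/* Account for a burst of returned operations; a non-zero status is a failure. */
static void
sketch_account(struct sketch_stats *s, const int *status, uint16_t n)
{
	uint16_t i;

	for (i = 0; i < n; i++)
		if (status[i] != 0)	/* compare the status value, not the pointer */
			s->errors++;
	s->completed += n;
}

/* Copy the counters out, rejecting a caller buffer that is too small. */
static int
sketch_stats_get(const struct sketch_stats *dev_stats, struct sketch_stats *out, uint32_t out_sz)
{
	if (out_sz < sizeof(*out))
		return -EINVAL;
	*out = *dev_stats;
	return 0;
}

/* Zero all counters. */
static void
sketch_stats_reset(struct sketch_stats *s)
{
	*s = (struct sketch_stats){0};
}
```

One consequence of counting this way is that the number of operations still in flight can be estimated as submitted minus completed, provided the application drains completions regularly.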
* [dpdk-dev] [PATCH v6 12/16] dma/idxd: add vchan status function 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (10 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 11/16] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 13/16] dma/idxd: add burst capacity API Kevin Laatz ` (3 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When testing dmadev drivers, it is useful to have the HW device in a known state. This patch adds the implementation of the function which will wait for the device to be idle (all jobs completed) before proceeding. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v3: update API name to vchan_status --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 17 +++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 2 ++ drivers/dma/idxd/idxd_pci.c | 1 + 4 files changed, 21 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b52ea02854..e6caa048a9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -101,6 +101,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_info_get = idxd_info_get, .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, + .vchan_status = idxd_vchan_status, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 7a3eb0a4c1..12c113a93b 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -162,6 +162,23 @@ get_comp_status(struct idxd_completion *c) } } +int +idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + enum rte_dma_vchan_status *status) +{ + struct idxd_dmadev *idxd = 
dev->dev_private; + uint16_t last_batch_write = idxd->batch_idx_write == 0 ? idxd->max_batches : + idxd->batch_idx_write - 1; + uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0); + + /* An IDXD device will always be either active or idle. + * RTE_DMA_VCHAN_HALTED_ERROR is therefore not supported by IDXD. + */ + *status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE; + + return 0; +} + static __rte_always_inline int batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index e2a1119ef7..a291ad26d9 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -101,5 +101,7 @@ uint16_t idxd_completed_status(struct rte_dma_dev *dev, uint16_t qid __rte_unuse int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, struct rte_dma_stats *stats, uint32_t stats_sz); int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); +int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, + enum rte_dma_vchan_status *status); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index cf91eb9c5e..3152ec1289 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -140,6 +140,7 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, + .vchan_status = idxd_vchan_status, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
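Editor's sketch, not part of the patch: the idle check above looks at the completion status of the most recently written batch, stepping back one slot from the write index with wrap-around. The model below assumes, as the wrap test in the driver suggests, that the batch ring has max_batches + 1 slots indexed 0..max_batches, and that a status byte of zero means "not yet written back". The sketch_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Index of the last batch written, stepping back from the write index with wrap. */
static uint16_t
sketch_last_batch(uint16_t batch_idx_write, uint16_t max_batches)
{
	return batch_idx_write == 0 ? max_batches : (uint16_t)(batch_idx_write - 1);
}

/* The channel is treated as idle once the last written batch has any non-zero status. */
static bool
sketch_is_idle(const uint8_t *batch_status, uint16_t batch_idx_write, uint16_t max_batches)
{
	return batch_status[sketch_last_batch(batch_idx_write, max_batches)] != 0;
}
```

Note that any non-zero status counts as idle in this model, so a batch that finished with an error is not reported separately, in line with the comment in the patch that the halted-error state is not supported.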
* [dpdk-dev] [PATCH v6 13/16] dma/idxd: add burst capacity API 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (11 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 12/16] dma/idxd: add vchan status function Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz ` (2 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add support for the burst capacity API. This API will provide the calling application with the remaining capacity of the current burst (limited by max HW batch size). Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> --- v6: updates for burst capacity api moving to fastpath --- drivers/dma/idxd/idxd_common.c | 21 +++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 1 + drivers/dma/idxd/idxd_pci.c | 1 + 3 files changed, 23 insertions(+) diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 12c113a93b..a00fadc431 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -468,6 +468,26 @@ idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t return 0; } +uint16_t +idxd_burst_capacity(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->dev_private; + uint16_t write_idx = idxd->batch_start + idxd->batch_size; + uint16_t used_space; + + /* Check for space in the batch ring */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return 0; + + /* For descriptors, check for wrap-around on write but not read */ + if 
(idxd->ids_returned > write_idx) + write_idx += idxd->desc_ring_mask + 1; + used_space = write_idx - idxd->ids_returned; + + return RTE_MIN((idxd->desc_ring_mask - used_space), idxd->max_batch_size); +} + int idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, uint32_t conf_sz) @@ -553,6 +573,7 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->submit = idxd_submit; dmadev->completed = idxd_completed; dmadev->completed_status = idxd_completed_status; + dmadev->burst_capacity = idxd_burst_capacity; idxd = dmadev->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index a291ad26d9..3ef2f729a8 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -103,5 +103,6 @@ int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, enum rte_dma_vchan_status *status); +uint16_t idxd_burst_capacity(const struct rte_dma_dev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 3152ec1289..f76383710c 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -254,6 +254,7 @@ init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, idxd->u.pci = pci; idxd->max_batches = wq_size; + idxd->max_batch_size = 1 << lg2_max_batch; /* enable the device itself */ err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
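Editor's sketch, not part of the patch: the burst capacity calculation above combines two limits, free slots in the descriptor ring and the maximum hardware batch size, after first checking that the batch ring itself is not full. The standalone model below copies that arithmetic into a hypothetical sketch_ring struct so it can be tried with concrete numbers; the field names follow the patch, but the struct and the values used with it are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical subset of the driver state used by the capacity calculation. */
struct sketch_ring {
	uint16_t batch_idx_read;
	uint16_t batch_idx_write;
	uint16_t max_batches;
	uint16_t batch_start;	/* first descriptor of the batch being built */
	uint16_t batch_size;	/* descriptors enqueued in that batch so far */
	uint16_t ids_returned;	/* descriptors already handed back to the app */
	uint16_t desc_ring_mask;	/* descriptor ring size minus one */
	uint16_t max_batch_size;
};

static uint16_t
sketch_burst_capacity(const struct sketch_ring *r)
{
	uint16_t write_idx = (uint16_t)(r->batch_start + r->batch_size);
	uint16_t used_space, free_space;

	/* no room for another batch entry: nothing can be enqueued */
	if ((r->batch_idx_read == 0 && r->batch_idx_write == r->max_batches) ||
			r->batch_idx_write + 1 == r->batch_idx_read)
		return 0;

	/* same wrap handling as in the patch: adjust the write side only */
	if (r->ids_returned > write_idx)
		write_idx += r->desc_ring_mask + 1;
	used_space = (uint16_t)(write_idx - r->ids_returned);

	free_space = (uint16_t)(r->desc_ring_mask - used_space);
	return free_space < r->max_batch_size ? free_space : r->max_batch_size;
}
```

As a worked example with a 64-entry ring (mask 63) and a maximum batch size of 32: with batch_start 10, batch_size 4 and ids_returned 6, eight descriptors are in use, 55 are free, and the result is capped at 32. With batch_start 40, batch_size 10 and ids_returned 0, fifty are in use and the result is 13.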
* [dpdk-dev] [PATCH v6 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (12 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 13/16] dma/idxd: add burst capacity API Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 15/16] devbind: add dma device class Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Conor Walsh <conor.walsh@intel.com> Move the example script for configuring IDXD devices bound to the IDXD kernel driver from raw to dma, and create a symlink to still allow use from raw. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++++++++++++++++++++++++++ drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +----------------------------- 2 files changed, 118 insertions(+), 117 deletions(-) create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py diff --git a/drivers/dma/idxd/dpdk_idxd_cfg.py b/drivers/dma/idxd/dpdk_idxd_cfg.py new file mode 100755 index 0000000000..fcc27822ef --- /dev/null +++ b/drivers/dma/idxd/dpdk_idxd_cfg.py @@ -0,0 +1,117 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2020 Intel Corporation + +""" +Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use +""" + +import sys +import argparse +import os +import os.path + + +class SysfsDir: + "Used to read/write paths in a sysfs directory" + def __init__(self, path): + self.path = path + + def read_int(self, filename): + "Return a 
value from sysfs file" + with open(os.path.join(self.path, filename)) as f: + return int(f.readline()) + + def write_values(self, values): + "write dictionary, where key is filename and value is value to write" + for filename, contents in values.items(): + with open(os.path.join(self.path, filename), "w") as f: + f.write(str(contents)) + + +def reset_device(dsa_id): + "Reset the DSA device and all its queues" + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) + + +def get_pci_dir(pci): + "Search for the sysfs directory of the PCI device" + base_dir = '/sys/bus/pci/devices/' + for path, dirs, files in os.walk(base_dir): + for dir in dirs: + if pci in dir: + return os.path.join(base_dir, dir) + sys.exit(f"Could not find sysfs directory for device {pci}") + + +def get_dsa_id(pci): + "Get the DSA instance ID using the PCI address of the device" + pci_dir = get_pci_dir(pci) + for path, dirs, files in os.walk(pci_dir): + for dir in dirs: + if dir.startswith('dsa') and 'wq' not in dir: + return int(dir[3:]) + sys.exit(f"Could not get device ID for device {pci}") + + +def configure_dsa(dsa_id, queues, prefix): + "Configure the DSA instance with appropriate number of queues" + dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + + max_groups = dsa_dir.read_int("max_groups") + max_engines = dsa_dir.read_int("max_engines") + max_queues = dsa_dir.read_int("max_work_queues") + max_work_queues_size = dsa_dir.read_int("max_work_queues_size") + + nb_queues = min(queues, max_queues) + if queues > nb_queues: + print(f"Setting number of queues to max supported value: {max_queues}") + + # we want one engine per group, and no more engines than queues + nb_groups = min(max_engines, max_groups, nb_queues) + for grp in range(nb_groups): + dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) + + # configure each queue + for q in range(nb_queues): + wq_dir = 
SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) + wq_dir.write_values({"group_id": q % nb_groups, + "type": "user", + "mode": "dedicated", + "name": f"{prefix}_wq{dsa_id}.{q}", + "priority": 1, + "size": int(max_work_queues_size / nb_queues)}) + + # enable device and then queues + drv_dir.write_values({"bind": f"dsa{dsa_id}"}) + for q in range(nb_queues): + drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) + + +def main(args): + "Main function, does arg parsing and calls config function" + arg_p = argparse.ArgumentParser( + description="Configure whole DSA device instance for DPDK use") + arg_p.add_argument('dsa_id', + help="Specify DSA instance either via DSA instance number or PCI address") + arg_p.add_argument('-q', metavar='queues', type=int, default=255, + help="Number of queues to set up") + arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', + default="dpdk", + help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") + arg_p.add_argument('--reset', action='store_true', + help="Reset DSA device and its queues") + parsed_args = arg_p.parse_args(args[1:]) + + dsa_id = parsed_args.dsa_id + dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id + if parsed_args.reset: + reset_device(dsa_id) + else: + configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) + + +if __name__ == "__main__": + main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py deleted file mode 100755 index fcc27822ef..0000000000 --- a/drivers/raw/ioat/dpdk_idxd_cfg.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/env python3 -# SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2020 Intel Corporation - -""" -Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use -""" - -import sys -import argparse -import os -import os.path - - -class SysfsDir: - "Used to read/write paths in a sysfs directory" - def __init__(self, path): - self.path = path - - def read_int(self, filename): - "Return a 
value from sysfs file" - with open(os.path.join(self.path, filename)) as f: - return int(f.readline()) - - def write_values(self, values): - "write dictionary, where key is filename and value is value to write" - for filename, contents in values.items(): - with open(os.path.join(self.path, filename), "w") as f: - f.write(str(contents)) - - -def reset_device(dsa_id): - "Reset the DSA device and all its queues" - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) - - -def get_pci_dir(pci): - "Search for the sysfs directory of the PCI device" - base_dir = '/sys/bus/pci/devices/' - for path, dirs, files in os.walk(base_dir): - for dir in dirs: - if pci in dir: - return os.path.join(base_dir, dir) - sys.exit(f"Could not find sysfs directory for device {pci}") - - -def get_dsa_id(pci): - "Get the DSA instance ID using the PCI address of the device" - pci_dir = get_pci_dir(pci) - for path, dirs, files in os.walk(pci_dir): - for dir in dirs: - if dir.startswith('dsa') and 'wq' not in dir: - return int(dir[3:]) - sys.exit(f"Could not get device ID for device {pci}") - - -def configure_dsa(dsa_id, queues, prefix): - "Configure the DSA instance with appropriate number of queues" - dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - - max_groups = dsa_dir.read_int("max_groups") - max_engines = dsa_dir.read_int("max_engines") - max_queues = dsa_dir.read_int("max_work_queues") - max_work_queues_size = dsa_dir.read_int("max_work_queues_size") - - nb_queues = min(queues, max_queues) - if queues > nb_queues: - print(f"Setting number of queues to max supported value: {max_queues}") - - # we want one engine per group, and no more engines than queues - nb_groups = min(max_engines, max_groups, nb_queues) - for grp in range(nb_groups): - dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) - - # configure each queue - for q in range(nb_queues): - wq_dir = 
SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) - wq_dir.write_values({"group_id": q % nb_groups, - "type": "user", - "mode": "dedicated", - "name": f"{prefix}_wq{dsa_id}.{q}", - "priority": 1, - "size": int(max_work_queues_size / nb_queues)}) - - # enable device and then queues - drv_dir.write_values({"bind": f"dsa{dsa_id}"}) - for q in range(nb_queues): - drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) - - -def main(args): - "Main function, does arg parsing and calls config function" - arg_p = argparse.ArgumentParser( - description="Configure whole DSA device instance for DPDK use") - arg_p.add_argument('dsa_id', - help="Specify DSA instance either via DSA instance number or PCI address") - arg_p.add_argument('-q', metavar='queues', type=int, default=255, - help="Number of queues to set up") - arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', - default="dpdk", - help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") - arg_p.add_argument('--reset', action='store_true', - help="Reset DSA device and its queues") - parsed_args = arg_p.parse_args(args[1:]) - - dsa_id = parsed_args.dsa_id - dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id - if parsed_args.reset: - reset_device(dsa_id) - else: - configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) - - -if __name__ == "__main__": - main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py new file mode 120000 index 0000000000..85545548d1 --- /dev/null +++ b/drivers/raw/ioat/dpdk_idxd_cfg.py @@ -0,0 +1 @@ +../../dma/idxd/dpdk_idxd_cfg.py \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
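Editor's sketch, not part of the patch: the configuration script above clamps the requested queue count to what the device reports, uses one engine per group with no more groups than queues, assigns queues to groups round-robin, and splits the total work queue size evenly. The same arithmetic is shown below in C to stay consistent with the driver code in this thread; the sketch_* names are hypothetical, the input values in the example are made up, and the requested queue count is assumed to be non-zero.

```c
#include <stdint.h>

/* Hypothetical result of the planning step performed by the script. */
struct sketch_wq_plan {
	uint32_t nb_queues;	/* queues that will actually be configured */
	uint32_t nb_groups;	/* groups in use, one engine each */
	uint32_t wq_size;	/* size given to every work queue */
};

static uint32_t
sketch_min(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Clamp the request to device limits and split the queue space evenly. */
static struct sketch_wq_plan
sketch_plan(uint32_t requested, uint32_t max_queues, uint32_t max_engines,
		uint32_t max_groups, uint32_t max_wq_size)
{
	struct sketch_wq_plan p;

	p.nb_queues = sketch_min(requested, max_queues);
	p.nb_groups = sketch_min(sketch_min(max_engines, max_groups), p.nb_queues);
	p.wq_size = max_wq_size / p.nb_queues;	/* assumes nb_queues > 0 */
	return p;
}

/* Queues are spread over the groups round-robin. */
static uint32_t
sketch_group_of(const struct sketch_wq_plan *p, uint32_t q)
{
	return q % p->nb_groups;
}
```

For instance, requesting 255 queues on a hypothetical device reporting 8 work queues, 4 engines, 4 groups and a total work queue size of 128 gives 8 queues in 4 groups with 16 entries each, and queue 5 lands in group 1.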
* [dpdk-dev] [PATCH v6 15/16] devbind: add dma device class 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (13 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add a new class for DMA devices. Devices listed under the DMA class are to be used with the dmadev library. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- v6: remove purely cosmetic change from patch --- usertools/dpdk-devbind.py | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 74d16e4c4b..fb43e3c0b1 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -69,6 +69,7 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] +dma_devices = [] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] @@ -583,6 +584,9 @@ def show_status(): if status_dev in ["crypto", "all"]: show_device_status(crypto_devices, "Crypto") + if status_dev in ["dma", "all"]: + show_device_status(dma_devices, "DMA") + if status_dev in ["event", "all"]: show_device_status(eventdev_devices, "Eventdev") @@ -651,7 +655,7 @@ def parse_args(): parser.add_argument( '--status-dev', help="Print the status of given device group.", - 
choices=['baseband', 'compress', 'crypto', 'event', + choices=['baseband', 'compress', 'crypto', 'dma', 'event', 'mempool', 'misc', 'net', 'regex']) bind_group = parser.add_mutually_exclusive_group() bind_group.add_argument( @@ -732,6 +736,7 @@ def do_arg_actions(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) @@ -754,6 +759,7 @@ def main(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v6 16/16] devbind: move idxd device ID to dmadev class 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz ` (14 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 15/16] devbind: add dma device class Kevin Laatz @ 2021-09-24 13:39 ` Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-09-24 13:39 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz The dmadev library is the preferred abstraction for using IDXD devices and will replace the rawdev implementation in future. This patch moves the IDXD device ID to the dmadev class. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- usertools/dpdk-devbind.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index fb43e3c0b1..15d438715f 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -69,12 +69,12 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] -dma_devices = [] +dma_devices = [intel_idxd_spr] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] -misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, +misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, intel_ioat_icx, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
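The effect of moving intel_idxd_spr between the class lists can be illustrated with a small sketch. The dictionary layout below is a simplified assumption modelled on the device descriptors in dpdk-devbind.py (only vendor and device ID are matched here), and groups_for is a hypothetical helper, not a function from the script. The vendor ID 8086 and device ID 0b25 are the values used by the idxd PCI driver in this series.

```python
# Simplified descriptor: the real script carries more fields (assumption).
IDXD_SPR = {"Vendor": "8086", "Device": "0b25"}


def groups_for(vendor, device, class_lists):
    """Return the names of the device groups that claim this vendor/device pair."""
    return [
        name
        for name, members in class_lists.items()
        if any(
            m["Vendor"] == vendor and device in m["Device"].split(",")
            for m in members
        )
    ]


# Before patch 16/16 the IDXD ID is listed under "misc"; afterwards under "dma".
before = {"dma": [], "misc": [IDXD_SPR]}
after = {"dma": [IDXD_SPR], "misc": []}
```

Under these assumptions, a device with ID 8086:0b25 is reported by "--status-dev misc" before the patch and by "--status-dev dma" after it.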
* [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (16 preceding siblings ...) 2021-09-24 13:39 ` [dpdk-dev] [PATCH v6 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 01/16] raw/ioat: only build if dmadev not present Kevin Laatz ` (15 more replies) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (3 subsequent siblings) 21 siblings, 16 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz This patchset adds a dmadev driver and associated documentation to support Intel Data Streaming Accelerator devices. This driver is intended to ultimately replace the current IDXD part of the IOAT rawdev driver. NOTE: This patchset has several dependencies: - v26 of the dmadev lib set [1] - v7 of the dmadev test suite [2] [1] http://patches.dpdk.org/project/dpdk/list/?series=19594 [2] http://patches.dpdk.org/project/dpdk/list/?series=19599 v7: * rebase on above patchsets * add meson reason for rawdev build v6: * set state of device during create * add dev_close function * documentation updates - moved generic pieces from driver doc to lib doc * other small miscellaneous fixes based on rebasing and ML feedback v5: * add missing toctree entry for idxd driver v4: * rebased on above patchsets * minor fixes based on review feedback v3: * rebased on above patchsets * added burst capacity API v2: * rebased on above patchsets * added API to check for device being idle * added devbind updates for DMA devices * fixed issue identified by internal coverity scan Bruce Richardson (1): raw/ioat: only build if dmadev not present Conor Walsh (1): dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz (14): dma/idxd: add skeleton for VFIO based DSA 
device dma/idxd: add bus device probing dma/idxd: create dmadev instances on bus probe dma/idxd: create dmadev instances on pci probe dma/idxd: add datapath structures dma/idxd: add configure and info_get functions dma/idxd: add start and stop functions for pci devices dma/idxd: add data-path job submission functions dma/idxd: add data-path job completion functions dma/idxd: add operation statistic tracking dma/idxd: add vchan status function dma/idxd: add burst capacity API devbind: add dma device class devbind: move idxd device ID to dmadev class MAINTAINERS | 10 + doc/guides/dmadevs/idxd.rst | 179 ++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/prog_guide/dmadev.rst | 30 ++ doc/guides/rawdevs/ioat.rst | 8 + doc/guides/rel_notes/release_21_11.rst | 5 + drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++ drivers/dma/idxd/idxd_bus.c | 377 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 612 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 131 ++++++ drivers/dma/idxd/idxd_internal.h | 108 +++++ drivers/dma/idxd/idxd_pci.c | 386 ++++++++++++++++ drivers/dma/idxd/meson.build | 14 + drivers/dma/idxd/version.map | 3 + drivers/dma/meson.build | 2 + drivers/meson.build | 4 +- drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +---- drivers/raw/ioat/meson.build | 24 +- usertools/dpdk-devbind.py | 10 +- 19 files changed, 2016 insertions(+), 124 deletions(-) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py create mode 100644 drivers/dma/idxd/idxd_bus.c create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_hw_defs.h create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v7 01/16] raw/ioat: only build if dmadev not present 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz ` (14 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Bruce Richardson <bruce.richardson@intel.com> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not present. This change requires the dependencies to be reordered in drivers/meson.build so that rawdev can use the "RTE_DMA_* build macros to check for the presence of the equivalent dmadev driver. A note is also added to the documentation to inform users of this change. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v7: add meson reason for not building rawdev --- doc/guides/rawdevs/ioat.rst | 8 ++++++++ drivers/meson.build | 4 ++-- drivers/raw/ioat/meson.build | 24 +++++++++++++++++++++--- 3 files changed, 31 insertions(+), 5 deletions(-) diff --git a/doc/guides/rawdevs/ioat.rst b/doc/guides/rawdevs/ioat.rst index a28e909935..a65530bd30 100644 --- a/doc/guides/rawdevs/ioat.rst +++ b/doc/guides/rawdevs/ioat.rst @@ -34,6 +34,14 @@ Compilation For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based. No additional compilation steps are necessary. +.. note:: + Since the addition of the dmadev library, the ``ioat`` and ``idxd`` parts of this driver + will only be built if their ``dmadev`` counterparts are not built. 
+ The following can be used to disable the ``dmadev`` drivers, + if the raw drivers are to be used instead:: + + $ meson -Ddisable_drivers=dma/* <build_dir> + Device Setup ------------- diff --git a/drivers/meson.build b/drivers/meson.build index b7d680868a..34c0276487 100644 --- a/drivers/meson.build +++ b/drivers/meson.build @@ -10,15 +10,15 @@ subdirs = [ 'common/qat', # depends on bus. 'common/sfc_efx', # depends on bus. 'mempool', # depends on common and bus. + 'dma', # depends on common and bus. 'net', # depends on common, bus, mempool - 'raw', # depends on common, bus and net. + 'raw', # depends on common, bus, dma and net. 'crypto', # depends on common, bus and mempool (net in future). 'compress', # depends on common, bus, mempool. 'regex', # depends on common, bus, regexdev. 'vdpa', # depends on common, bus and mempool. 'event', # depends on common, bus, mempool and net. 'baseband', # depends on common and bus. - 'dma', # depends on common and bus. ] if meson.is_cross_build() diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build index 0e81cb5951..1b866aab74 100644 --- a/drivers/raw/ioat/meson.build +++ b/drivers/raw/ioat/meson.build @@ -2,14 +2,32 @@ # Copyright 2019 Intel Corporation build = dpdk_conf.has('RTE_ARCH_X86') +# only use ioat rawdev driver if we don't have the equivalent dmadev ones +if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') + build = false + reason = 'replaced by dmadev drivers' + subdir_done() +endif + reason = 'only supported on x86' sources = files( - 'idxd_bus.c', - 'idxd_pci.c', 'ioat_common.c', - 'ioat_rawdev.c', 'ioat_rawdev_test.c', ) + +if not dpdk_conf.has('RTE_DMA_IDXD') + sources += files( + 'idxd_bus.c', + 'idxd_pci.c', + ) +endif + +if not dpdk_conf.has('RTE_DMA_IOAT') + sources += files ( + 'ioat_rawdev.c', + ) +endif + deps += ['bus_pci', 'mbuf', 'rawdev'] headers = files( 'rte_ioat_rawdev.h', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
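The source selection in the drivers/raw/ioat/meson.build hunk above can be read as a small decision table. The following sketch restates it in Python purely for illustration; rawdev_ioat_sources is a hypothetical name, and the three booleans stand in for the RTE_ARCH_X86, RTE_DMA_IDXD and RTE_DMA_IOAT checks on dpdk_conf.

```python
def rawdev_ioat_sources(is_x86, has_dma_idxd, has_dma_ioat):
    """Return the raw/ioat sources that would be compiled, or [] if the driver is skipped."""
    if not is_x86:
        return []  # reason: 'only supported on x86'
    if has_dma_idxd and has_dma_ioat:
        return []  # reason: 'replaced by dmadev drivers'

    # Common files are always part of the rawdev build.
    sources = ["ioat_common.c", "ioat_rawdev_test.c"]
    if not has_dma_idxd:
        # No dmadev IDXD driver: keep the rawdev IDXD implementation.
        sources += ["idxd_bus.c", "idxd_pci.c"]
    if not has_dma_ioat:
        # No dmadev IOAT driver: keep the rawdev IOAT implementation.
        sources += ["ioat_rawdev.c"]
    return sources
```

For example, with only the IDXD dmadev driver enabled, the rawdev driver is still built but contains just the common files plus ioat_rawdev.c.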
* [dpdk-dev] [PATCH v7 02/16] dma/idxd: add skeleton for VFIO based DSA device 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 01/16] raw/ioat: only build if dmadev not present Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-18 10:32 ` Thomas Monjalon 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 03/16] dma/idxd: add bus device probing Kevin Laatz ` (13 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probe/remove skeleton code for DSA device bound to the vfio pci driver. Relevant documentation and MAINTAINERS update also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- MAINTAINERS | 10 +++++ doc/guides/dmadevs/idxd.rst | 58 ++++++++++++++++++++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/rel_notes/release_21_11.rst | 5 +++ drivers/dma/idxd/idxd_internal.h | 27 ++++++++++++ drivers/dma/idxd/idxd_pci.c | 55 ++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 11 +++++ drivers/dma/idxd/version.map | 3 ++ drivers/dma/meson.build | 2 + 9 files changed, 173 insertions(+) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 5387ffd4fc..423d8a73ce 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1200,6 +1200,16 @@ F: doc/guides/compressdevs/zlib.rst F: doc/guides/compressdevs/features/zlib.ini +DMAdev Drivers +-------------- + +Intel IDXD - EXPERIMENTAL +M: Bruce Richardson <bruce.richardson@intel.com> +M: Kevin Laatz <kevin.laatz@intel.com> +F: 
drivers/dma/idxd/ +F: doc/guides/dmadevs/idxd.rst + + RegEx Drivers ------------- diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst new file mode 100644 index 0000000000..924700d17e --- /dev/null +++ b/doc/guides/dmadevs/idxd.rst @@ -0,0 +1,58 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +.. include:: <isonum.txt> + +IDXD DMA Device Driver +====================== + +The ``idxd`` dmadev driver provides a poll-mode driver (PMD) for Intel\ |reg| +Data Streaming Accelerator `(Intel DSA) +<https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html>`_. +This PMD can be used in conjunction with Intel\ |reg| DSA devices to offload +data operations, such as data copies, to hardware, freeing up CPU cycles for +other tasks. + +Hardware Requirements +---------------------- + +The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the +presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma`` +will show all the DMA devices on the system, including IDXD supported devices. +Intel\ |reg| DSA devices, are currently (at time of writing) appearing +as devices with type “0b25”, due to the absence of pci-id database entries for +them at this point. + +Compilation +------------ + +For builds using ``meson`` and ``ninja``, the driver will be built when the +target platform is x86-based. No additional compilation steps are necessary. + +Device Setup +------------- + +Devices using VFIO/UIO drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The HW devices to be used will need to be bound to a user-space IO driver for use. +The ``dpdk-devbind.py`` script can be used to view the state of the devices +and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``. 
+For example:: + + $ dpdk-devbind.py -b vfio-pci 6a:01.0 + +Device Probing and Initialization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will +be found as part of the device scan done at application initialization time without +the need to pass parameters to the application. + +For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the +maximum number of workqueues available on it, partitioning all resources equally +among the queues. +If fewer workqueues are required, then the ``max_queues`` parameter may be passed to +the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: + + $ dpdk-test -a <b:d:f>,max_queues=4 diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst index 0bce29d766..5d4abf880e 100644 --- a/doc/guides/dmadevs/index.rst +++ b/doc/guides/dmadevs/index.rst @@ -10,3 +10,5 @@ an application through DMA API. .. toctree:: :maxdepth: 2 :numbered: + + idxd diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index f2c926d7fb..65868a730a 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -80,6 +80,11 @@ New Features operations. * Added multi-process support. +* **Added IDXD dmadev driver implementation.** + + The IDXD dmadev driver provides device drivers for the Intel DSA devices. + This device driver can be used through the generic dmadev API. + * **Updated af_packet ethdev driver.** * Default VLAN strip behavior was changed.
VLAN tag won't be stripped diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h new file mode 100644 index 0000000000..c6a7dcd72f --- /dev/null +++ b/drivers/dma/idxd/idxd_internal.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_INTERNAL_H_ +#define _IDXD_INTERNAL_H_ + +/** + * @file idxd_internal.h + * + * Internal data structures for the idxd/DSA driver for dmadev + * + * @warning + * @b EXPERIMENTAL: these structures and APIs may change without prior notice + */ + +extern int idxd_pmd_logtype; + +#define IDXD_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \ + idxd_pmd_logtype, "IDXD: %s(): " fmt "\n", __func__, ##args) + +#define IDXD_PMD_DEBUG(fmt, args...) IDXD_PMD_LOG(DEBUG, fmt, ## args) +#define IDXD_PMD_INFO(fmt, args...) IDXD_PMD_LOG(INFO, fmt, ## args) +#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) +#define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) + +#endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c new file mode 100644 index 0000000000..79e4aadcab --- /dev/null +++ b/drivers/dma/idxd/idxd_pci.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <rte_bus_pci.h> + +#include "idxd_internal.h" + +#define IDXD_VENDOR_ID 0x8086 +#define IDXD_DEVICE_ID_SPR 0x0B25 + +#define IDXD_PMD_DMADEV_NAME_PCI dmadev_idxd_pci + +const struct rte_pci_id pci_id_idxd_map[] = { + { RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static int +idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + int ret = 0; + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + dev->device.driver = &drv->driver; + + return ret; +} + +static int 
+idxd_dmadev_remove_pci(struct rte_pci_device *dev) +{ + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + + IDXD_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + + return 0; +} + +struct rte_pci_driver idxd_pmd_drv_pci = { + .id_table = pci_id_idxd_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = idxd_dmadev_probe_pci, + .remove = idxd_dmadev_remove_pci, +}; + +RTE_PMD_REGISTER_PCI(IDXD_PMD_DMADEV_NAME_PCI, idxd_pmd_drv_pci); +RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_DMADEV_NAME_PCI, pci_id_idxd_map); +RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_DMADEV_NAME_PCI, "vfio-pci"); +RTE_PMD_REGISTER_PARAM_STRING(dmadev_idxd_pci, "max_queues=0"); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build new file mode 100644 index 0000000000..4426a9f65c --- /dev/null +++ b/drivers/dma/idxd/meson.build @@ -0,0 +1,11 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2021 Intel Corporation + +if is_windows + subdir_done() +endif + +deps += ['bus_pci'] +sources = files( + 'idxd_pci.c' +) diff --git a/drivers/dma/idxd/version.map b/drivers/dma/idxd/version.map new file mode 100644 index 0000000000..4a76d1d52d --- /dev/null +++ b/drivers/dma/idxd/version.map @@ -0,0 +1,3 @@ +DPDK_21 { + local: *; +}; diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build index d9c7ede32f..411be7a240 100644 --- a/drivers/dma/meson.build +++ b/drivers/dma/meson.build @@ -2,5 +2,7 @@ # Copyright 2021 HiSilicon Limited drivers = [ + 'idxd', 'skeleton', ] +std_deps = ['dmadev'] -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v7 02/16] dma/idxd: add skeleton for VFIO based DSA device
From: Thomas Monjalon @ 2021-10-18 10:32 UTC
To: Kevin Laatz; +Cc: dev, bruce.richardson, fengchengwen, jerinj, conor.walsh

13/10/2021 18:30, Kevin Laatz:
> Add the basic device probe/remove skeleton code for DSA device bound to
> the vfio pci driver. Relevant documentation and MAINTAINERS update also
> included.

It seems there is a compilation issue with this patch:
undefined reference to `idxd_pmd_logtype'

Please check compilation per-patch.
* Re: [dpdk-dev] [PATCH v7 02/16] dma/idxd: add skeleton for VFIO based DSA device
From: Kevin Laatz @ 2021-10-18 10:41 UTC
To: Thomas Monjalon; +Cc: dev, bruce.richardson, fengchengwen, jerinj, conor.walsh

On 18/10/2021 11:32, Thomas Monjalon wrote:
> 13/10/2021 18:30, Kevin Laatz:
>> Add the basic device probe/remove skeleton code for DSA device bound to
>> the vfio pci driver. Relevant documentation and MAINTAINERS update also
>> included.
> It seems there is a compilation issue with this patch:
> undefined reference to `idxd_pmd_logtype'
>
> Please check compilation per-patch.
>

I'll take a look and fix, thanks!

/Kevin
* [dpdk-dev] [PATCH v7 03/16] dma/idxd: add bus device probing 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 01/16] raw/ioat: only build if dmadev not present Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz ` (12 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probing for DSA devices bound to the IDXD kernel driver. These devices can be configured via sysfs and made available to DPDK if they are found during bus scan. Relevant documentation is included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 +++++++ drivers/dma/idxd/idxd_bus.c | 351 +++++++++++++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 3 files changed, 416 insertions(+) create mode 100644 drivers/dma/idxd/idxd_bus.c diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 924700d17e..ce33e2857a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -32,6 +32,56 @@ target platform is x86-based. No additional compilation steps are necessary. Device Setup ------------- +Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers, +such as ``vfio-pci``. Both are supported by the IDXD PMD. + +Intel\ |reg| DSA devices using IDXD kernel driver +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured. 
+The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration. + +.. note:: + The device configuration can also be done by directly interacting with the sysfs nodes. + An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py`` + included in the driver source directory. + +There are some mandatory configuration steps before being able to use a device with an application. +The internal engines, which do the copies or other operations, +and the work-queues, which are used by applications to assign work to the device, +need to be assigned to groups, and the various other configuration options, +such as priority or queue depth, need to be set for each queue. + +To assign an engine to a group:: + + $ accel-config config-engine dsa0/engine0.0 --group-id=0 + $ accel-config config-engine dsa0/engine0.1 --group-id=1 + +To assign work queues to groups for passing descriptors to the engines a similar accel-config command can be used. +However, the work queues also need to be configured depending on the use case. +Some configuration options include: + +* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously. +* priority: WQ priority between 1 and 15. Larger value means higher priority. +* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device. +* type: WQ type (kernel/mdev/user). Determines how the device is presented. +* name: identifier given to the WQ.
+ +Example configuration for a work queue:: + + $ accel-config config-wq dsa0/wq0.0 --group-id=0 \ + --mode=dedicated --priority=10 --wq-size=8 \ + --type=user --name=dpdk_app1 + +Once the devices have been configured, they need to be enabled:: + + $ accel-config enable-device dsa0 + $ accel-config enable-wq dsa0/wq0.0 + +Check the device configuration:: + + $ accel-config list + Devices using VFIO/UIO drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -56,3 +106,17 @@ If fewer workqueues are required, then the ``max_queues`` parameter may be passe the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: $ dpdk-test -a <b:d:f>,max_queues=4 + +For devices bound to the IDXD kernel driver, +the DPDK IDXD driver will automatically perform a scan for available workqueues +to use. Any workqueues found listed in ``/dev/dsa`` on the system will be checked +in ``/sys``, and any which have ``dpdk_`` prefix in their name will be automatically +probed by the driver to make them available to the application. +Alternatively, to support use by multiple DPDK processes simultaneously, +the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue +name prefix, instead of ``dpdk_``, allowing each DPDK application instance to only +use a subset of configured queues. + +Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, +that is a "DMA device type" inside DPDK, and can be accessed using APIs from the +``rte_dmadev`` library. 
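The workqueue name matching described in the documentation above can be sketched as follows. This is an illustrative Python restatement, not the driver code: the function name and the example runtime directory are assumptions, and the file prefix is taken to be the last path component of the EAL runtime directory.

```python
import os


def wq_is_for_this_process(wq_name, runtime_dir):
    """Return True if a workqueue name marks the queue for use by this DPDK process."""
    # Assumption: the --file-prefix value is the final component of the
    # EAL runtime directory, e.g. "app1" in ".../dpdk/app1".
    prefix = os.path.basename(runtime_dir)
    # Accept the generic "dpdk_" prefix, or "<file-prefix>_" for
    # queues reserved for one particular DPDK instance.
    return wq_name.startswith("dpdk_") or wq_name.startswith(prefix + "_")
```

With a hypothetical runtime directory of /var/run/dpdk/app1, queues named dpdk_wq0.0 and app1_wq0.1 would be picked up, while app2_wq0.2 would be left for another process.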
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c new file mode 100644 index 0000000000..ef589af30e --- /dev/null +++ b/drivers/dma/idxd/idxd_bus.c @@ -0,0 +1,351 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <dirent.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/mman.h> +#include <libgen.h> + +#include <rte_bus.h> +#include <rte_log.h> +#include <rte_dmadev_pmd.h> +#include <rte_string_fns.h> + +#include "idxd_internal.h" + +/* default value for DSA paths, but allow override in environment for testing */ +#define DSA_DEV_PATH "/dev/dsa" +#define DSA_SYSFS_PATH "/sys/bus/dsa/devices" + +static unsigned int devcount; + +/** unique identifier for a DSA device/WQ instance */ +struct dsa_wq_addr { + uint16_t device_id; + uint16_t wq_id; +}; + +/** a DSA device instance */ +struct rte_dsa_device { + struct rte_device device; /**< Inherit core device */ + TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */ + + char wq_name[32]; /**< the workqueue name/number e.g. 
wq0.1 */ + struct dsa_wq_addr addr; /**< Identifies the specific WQ */ +}; + +/* forward prototypes */ +struct dsa_bus; +static int dsa_scan(void); +static int dsa_probe(void); +static struct rte_device *dsa_find_device(const struct rte_device *start, + rte_dev_cmp_t cmp, const void *data); +static enum rte_iova_mode dsa_get_iommu_class(void); +static int dsa_addr_parse(const char *name, void *addr); + +/** List of devices */ +TAILQ_HEAD(dsa_device_list, rte_dsa_device); + +/** + * Structure describing the DSA bus + */ +struct dsa_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_driver driver; /**< Driver struct for devices to point to */ + struct dsa_device_list device_list; /**< List of PCI devices */ +}; + +struct dsa_bus dsa_bus = { + .bus = { + .scan = dsa_scan, + .probe = dsa_probe, + .find_device = dsa_find_device, + .get_iommu_class = dsa_get_iommu_class, + .parse = dsa_addr_parse, + }, + .driver = { + .name = "dmadev_idxd" + }, + .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list), +}; + +static inline const char * +dsa_get_dev_path(void) +{ + const char *path = getenv("DSA_DEV_PATH"); + return path ? path : DSA_DEV_PATH; +} + +static inline const char * +dsa_get_sysfs_path(void) +{ + const char *path = getenv("DSA_SYSFS_PATH"); + return path ? 
path : DSA_SYSFS_PATH; +} + +static void * +idxd_bus_mmap_wq(struct rte_dsa_device *dev) +{ + void *addr; + char path[PATH_MAX]; + int fd; + + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); + fd = open(path, O_RDWR); + if (fd < 0) { + IDXD_PMD_ERR("Failed to open device path: %s", path); + return NULL; + } + + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + IDXD_PMD_ERR("Failed to mmap device %s", path); + return NULL; + } + + return addr; +} + +static int +read_wq_string(struct rte_dsa_device *dev, const char *filename, + char *value, size_t valuelen) +{ + char sysfs_node[PATH_MAX]; + int len; + int fd; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + fd = open(sysfs_node, O_RDONLY); + if (fd < 0) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + len = read(fd, value, valuelen - 1); + close(fd); + if (len < 0) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + value[len] = '\0'; + return 0; +} + +static int +read_wq_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +read_device_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s", + 
dsa_get_sysfs_path(), dev->addr.device_id, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +idxd_probe_dsa(struct rte_dsa_device *dev) +{ + struct idxd_dmadev idxd = {0}; + int ret = 0; + + IDXD_PMD_INFO("Probing device %s on numa node %d", + dev->wq_name, dev->device.numa_node); + if (read_wq_int(dev, "size", &ret) < 0) + return -1; + idxd.max_batches = ret; + if (read_wq_int(dev, "max_batch_size", &ret) < 0) + return -1; + idxd.max_batch_size = ret; + idxd.qid = dev->addr.wq_id; + idxd.sva_support = 1; + + idxd.portal = idxd_bus_mmap_wq(dev); + if (idxd.portal == NULL) { + IDXD_PMD_ERR("WQ mmap failed"); + return -ENOENT; + } + + return 0; +} + +static int +is_for_this_process_use(const char *name) +{ + char *runtime_dir = strdup(rte_eal_get_runtime_dir()); + char *prefix = basename(runtime_dir); + int prefixlen = strlen(prefix); + int retval = 0; + + if (strncmp(name, "dpdk_", 5) == 0) + retval = 1; + if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_') + retval = 1; + + free(runtime_dir); + return retval; +} + +static int +dsa_probe(void) +{ + struct rte_dsa_device *dev; + + TAILQ_FOREACH(dev, &dsa_bus.device_list, next) { + char type[64], name[64]; + + if (read_wq_string(dev, "type", type, sizeof(type)) < 0 || + read_wq_string(dev, "name", name, sizeof(name)) < 0) + continue; + + if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) { + dev->device.driver = &dsa_bus.driver; + idxd_probe_dsa(dev); + continue; + } + IDXD_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name); + } + + return 0; +} + +static int +dsa_scan(void) +{ + const char *path = dsa_get_dev_path(); + struct dirent *wq; + DIR *dev_dir; + + dev_dir 
= opendir(path); + if (dev_dir == NULL) { + if (errno == ENOENT) + return 0; /* no bus, return without error */ + IDXD_PMD_ERR("%s(): opendir '%s' failed: %s", + __func__, path, strerror(errno)); + return -1; + } + + while ((wq = readdir(dev_dir)) != NULL) { + struct rte_dsa_device *dev; + int numa_node = -1; + + if (strncmp(wq->d_name, "wq", 2) != 0) + continue; + if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) { + IDXD_PMD_ERR("%s(): wq name too long: '%s', skipping", + __func__, wq->d_name); + continue; + } + IDXD_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name); + + dev = malloc(sizeof(*dev)); + if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) { + IDXD_PMD_ERR("Error parsing WQ name: %s", wq->d_name); + free(dev); + continue; + } + dev->device.bus = &dsa_bus.bus; + strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name)); + TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next); + devcount++; + + read_device_int(dev, "numa_node", &numa_node); + dev->device.numa_node = numa_node; + dev->device.name = dev->wq_name; + } + + closedir(dev_dir); + return 0; +} + +static struct rte_device * +dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list); + + /* the rte_device struct must be at start of dsa structure */ + RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0); + + if (start != NULL) /* jump to start point if given */ + dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next); + while (dev != NULL) { + if (cmp(&dev->device, data) == 0) + return &dev->device; + dev = TAILQ_NEXT(dev, next); + } + return NULL; +} + +static enum rte_iova_mode +dsa_get_iommu_class(void) +{ + /* if there are no devices, report don't care, otherwise VA mode */ + return devcount > 0 ? 
RTE_IOVA_VA : RTE_IOVA_DC;
+}
+
+static int
+dsa_addr_parse(const char *name, void *addr)
+{
+	struct dsa_wq_addr *wq = addr;
+	unsigned int device_id, wq_id;
+
+	if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) {
+		IDXD_PMD_DEBUG("Parsing WQ name failed: %s", name);
+		return -1;
+	}
+
+	wq->device_id = device_id;
+	wq->wq_id = wq_id;
+	return 0;
+}
+
+RTE_REGISTER_BUS(dsa, dsa_bus.bus);
diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build
index 4426a9f65c..45418077f4 100644
--- a/drivers/dma/idxd/meson.build
+++ b/drivers/dma/idxd/meson.build
@@ -7,5 +7,6 @@ endif
 
 deps += ['bus_pci']
 sources = files(
+        'idxd_bus.c',
         'idxd_pci.c'
 )
-- 
2.30.2
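For readers who want to try the work-queue name parsing from dsa_addr_parse() outside of DPDK, the following is a minimal sketch. It assumes a stand-in address struct (`struct wq_addr`) and helper name (`wq_name_parse`), both hypothetical and not part of the patch; only the "wq%u.%u" format string is taken from the code above.

```c
#include <stdio.h>

/* Hypothetical stand-in for the patch's struct dsa_wq_addr. */
struct wq_addr {
	unsigned int device_id;
	unsigned int wq_id;
};

/*
 * Parse a name of the form "wq<device>.<queue>", e.g. "wq0.1".
 * Returns 0 on success and -1 if both numbers could not be read.
 */
static int
wq_name_parse(const char *name, struct wq_addr *addr)
{
	unsigned int device_id, wq_id;

	/* sscanf returns the number of converted fields; both are required */
	if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2)
		return -1;

	addr->device_id = device_id;
	addr->wq_id = wq_id;
	return 0;
}
```

Note that sscanf() stops after the second conversion, so a name with trailing characters after the queue number would still be accepted by this sketch.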
* [dpdk-dev] [PATCH v7 04/16] dma/idxd: create dmadev instances on bus probe 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (2 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 03/16] dma/idxd: add bus device probing Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz ` (11 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the bus scan/probe, create a dmadev instance for each HW queue. Internal structures required for device creation are also added. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 19 +++++++++ drivers/dma/idxd/idxd_common.c | 72 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 40 ++++++++++++++++++ drivers/dma/idxd/meson.build | 1 + 4 files changed, 132 insertions(+) create mode 100644 drivers/dma/idxd/idxd_common.c diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index ef589af30e..b48fa954ed 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -85,6 +85,18 @@ dsa_get_sysfs_path(void) return path ? 
path : DSA_SYSFS_PATH;
 }
 
+static int
+idxd_dev_close(struct rte_dma_dev *dev)
+{
+	struct idxd_dmadev *idxd = dev->data->dev_private;
+	munmap(idxd->portal, 0x1000);
+	return 0;
+}
+
+static const struct rte_dma_dev_ops idxd_bus_ops = {
+		.dev_close = idxd_dev_close,
+};
+
 static void *
 idxd_bus_mmap_wq(struct rte_dsa_device *dev)
 {
@@ -206,6 +218,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev)
 		return -1;
 	idxd.max_batch_size = ret;
 	idxd.qid = dev->addr.wq_id;
+	idxd.u.bus.dsa_id = dev->addr.device_id;
 	idxd.sva_support = 1;
 
 	idxd.portal = idxd_bus_mmap_wq(dev);
@@ -214,6 +227,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev)
 		return -ENOENT;
 	}
 
+	ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_bus_ops);
+	if (ret) {
+		IDXD_PMD_ERR("Failed to create dmadev %s", dev->wq_name);
+		return ret;
+	}
+
 	return 0;
 }
 
diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c
new file mode 100644
index 0000000000..be8f684bd5
--- /dev/null
+++ b/drivers/dma/idxd/idxd_common.c
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2021 Intel Corporation
+ */
+
+#include <rte_dmadev_pmd.h>
+#include <rte_malloc.h>
+#include <rte_common.h>
+
+#include "idxd_internal.h"
+
+#define IDXD_PMD_NAME_STR "dmadev_idxd"
+
+int
+idxd_dmadev_create(const char *name, struct rte_device *dev,
+		   const struct idxd_dmadev *base_idxd,
+		   const struct rte_dma_dev_ops *ops)
+{
+	struct idxd_dmadev *idxd = NULL;
+	struct rte_dma_dev *dmadev = NULL;
+	int ret = 0;
+
+	if (!name) {
+		IDXD_PMD_ERR("Invalid name of the device!");
+		ret = -EINVAL;
+		goto cleanup;
+	}
+
+	/* Allocate device structure */
+	dmadev = rte_dma_pmd_allocate(name, dev->numa_node, sizeof(struct idxd_dmadev));
+	if (dmadev == NULL) {
+		IDXD_PMD_ERR("Unable to allocate DMA device");
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+	dmadev->dev_ops = ops;
+	dmadev->device = dev;
+
+	idxd = dmadev->data->dev_private;
+	*idxd = *base_idxd; /* copy over the main fields already passed in */
+
idxd->dmadev = dmadev; + + /* allocate batch index ring and completion ring. + * The +1 is because we can never fully use + * the ring, otherwise read == write means both full and empty. + */ + idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) + + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1), + sizeof(idxd->batch_comp_ring[0])); + if (idxd->batch_comp_ring == NULL) { + IDXD_PMD_ERR("Unable to reserve memory for batch data\n"); + ret = -ENOMEM; + goto cleanup; + } + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1]; + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring); + + dmadev->fp_obj->dev_private = idxd; + + idxd->dmadev->state = RTE_DMA_DEV_READY; + + return 0; + +cleanup: + if (dmadev) + rte_dma_pmd_release(name); + + return ret; +} + +int idxd_pmd_logtype; + +RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index c6a7dcd72f..fa6f053f72 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -24,4 +24,44 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) 
IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_dmadev { + /* counters to track the batches */ + unsigned short max_batches; + unsigned short batch_idx_read; + unsigned short batch_idx_write; + + /* track descriptors and handles */ + unsigned short desc_ring_mask; + unsigned short ids_avail; /* handles for ops completed */ + unsigned short ids_returned; /* the read pointer for hdls/desc rings */ + unsigned short batch_start; /* start+size == write pointer for hdls/desc */ + unsigned short batch_size; + + void *portal; /* address to write the batch descriptor */ + + struct idxd_completion *batch_comp_ring; + unsigned short *batch_idx_ring; /* store where each batch ends */ + + struct rte_dma_stats stats; + + rte_iova_t batch_iova; /* base address of the batch comp ring */ + rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ + + unsigned short max_batch_size; + + struct rte_dma_dev *dmadev; + struct rte_dma_vchan_conf qcfg; + uint8_t sva_support; + uint8_t qid; + + union { + struct { + unsigned int dsa_id; + } bus; + } u; +}; + +int idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); + #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 45418077f4..da5dc2b019 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -8,5 +8,6 @@ endif deps += ['bus_pci'] sources = files( 'idxd_bus.c', + 'idxd_common.c', 'idxd_pci.c' ) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
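The comment in idxd_dmadev_create() notes that the batch rings are allocated with max_batches + 1 entries, because a ring where read == write must mean "empty" and so can never be completely filled. A minimal sketch of that convention follows. The struct and helper names are hypothetical, and the wrap-around rule shown here is an assumption for illustration; this patch only allocates the rings and does not yet contain the datapath that advances the indexes.

```c
#include <stdbool.h>

/*
 * Hypothetical miniature of the batch index ring: max_batches usable
 * entries are backed by max_batches + 1 slots, so valid indexes run
 * from 0 to max_batches inclusive.
 */
struct batch_ring {
	unsigned short max_batches; /* usable capacity */
	unsigned short rd;          /* oldest outstanding batch */
	unsigned short wr;          /* next slot to be written */
};

/* Advance an index by one slot, wrapping after the spare slot. */
static unsigned short
ring_next(const struct batch_ring *r, unsigned short idx)
{
	return (unsigned short)(idx == r->max_batches ? 0 : idx + 1);
}

/* read == write is reserved to mean "empty" */
static bool
ring_empty(const struct batch_ring *r)
{
	return r->rd == r->wr;
}

/* full when one more write would make write catch up with read */
static bool
ring_full(const struct batch_ring *r)
{
	return ring_next(r, r->wr) == r->rd;
}

/* Claim one slot; returns false if the ring is full. */
static bool
ring_push(struct batch_ring *r)
{
	if (ring_full(r))
		return false;
	r->wr = ring_next(r, r->wr);
	return true;
}

/* Release the oldest slot; returns false if the ring is empty. */
static bool
ring_pop(struct batch_ring *r)
{
	if (ring_empty(r))
		return false;
	r->rd = ring_next(r, r->rd);
	return true;
}
```

With max_batches set to 4, four pushes succeed and the fifth is rejected, which is the behaviour the extra slot is there to provide.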
* [dpdk-dev] [PATCH v7 05/16] dma/idxd: create dmadev instances on pci probe 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (3 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 06/16] dma/idxd: add datapath structures Kevin Laatz ` (10 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the PCI probe, create a dmadev instance for each HW queue. HW definitions required are also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_hw_defs.h | 71 ++++++++ drivers/dma/idxd/idxd_internal.h | 16 ++ drivers/dma/idxd/idxd_pci.c | 278 ++++++++++++++++++++++++++++++- 3 files changed, 362 insertions(+), 3 deletions(-) create mode 100644 drivers/dma/idxd/idxd_hw_defs.h diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h new file mode 100644 index 0000000000..ea627cba6d --- /dev/null +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_HW_DEFS_H_ +#define _IDXD_HW_DEFS_H_ + +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ + +#define IDXD_CMD_SHIFT 20 +enum rte_idxd_cmds { + idxd_enable_dev = 1, + idxd_disable_dev, + idxd_drain_all, + idxd_abort_all, + idxd_reset_device, + idxd_enable_wq, + idxd_disable_wq, + idxd_drain_wq, + idxd_abort_wq, + idxd_reset_wq, +}; + +/* General bar0 registers */ +struct rte_idxd_bar0 { + uint32_t __rte_cache_aligned version; /* offset 0x00 */ + uint64_t __rte_aligned(0x10) 
gencap; /* offset 0x10 */ + uint64_t __rte_aligned(0x10) wqcap; /* offset 0x20 */ + uint64_t __rte_aligned(0x10) grpcap; /* offset 0x30 */ + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ +}; + +/* workqueue config is provided by array of uint32_t. */ +enum rte_idxd_wqcfg { + wq_size_idx, /* size is in first 32-bit value */ + wq_threshold_idx, /* WQ threshold second 32-bits */ + wq_mode_idx, /* WQ mode and other flags */ + wq_sizes_idx, /* WQ transfer and batch sizes */ + wq_occ_int_idx, /* WQ occupancy interrupt handle */ + wq_occ_limit_idx, /* WQ occupancy limit */ + wq_state_idx, /* WQ state and occupancy state */ +}; + +#define WQ_MODE_SHARED 0 +#define WQ_MODE_DEDICATED 1 +#define WQ_PRIORITY_SHIFT 4 +#define WQ_BATCH_SZ_SHIFT 5 +#define WQ_STATE_SHIFT 30 +#define WQ_STATE_MASK 0x3 + +struct rte_idxd_grpcfg { + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ + uint64_t grpengcfg; /* offset 32 */ + uint32_t grpflags; /* offset 40 */ +}; + +#define GENSTS_DEV_STATE_MASK 0x03 +#define CMDSTATUS_ACTIVE_SHIFT 31 +#define CMDSTATUS_ACTIVE_MASK (1 << 31) +#define CMDSTATUS_ERR_MASK 0xFF + +#endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index fa6f053f72..cb3a68c69b 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -5,6 +5,10 @@ #ifndef _IDXD_INTERNAL_H_ #define _IDXD_INTERNAL_H_ +#include <rte_spinlock.h> + +#include "idxd_hw_defs.h" + /** * @file 
idxd_internal.h * @@ -24,6 +28,16 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_pci_common { + rte_spinlock_t lk; + + uint8_t wq_cfg_sz; + volatile struct rte_idxd_bar0 *regs; + volatile uint32_t *wq_regs_base; + volatile struct rte_idxd_grpcfg *grp_regs; + volatile void *portals; +}; + struct idxd_dmadev { /* counters to track the batches */ unsigned short max_batches; @@ -58,6 +72,8 @@ struct idxd_dmadev { struct { unsigned int dsa_id; } bus; + + struct idxd_pci_common *pci; } u; }; diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 79e4aadcab..2589d59a7d 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -3,6 +3,9 @@ */ #include <rte_bus_pci.h> +#include <rte_devargs.h> +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> #include "idxd_internal.h" @@ -16,17 +19,286 @@ const struct rte_pci_id pci_id_idxd_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static inline int +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) +{ + uint8_t err_code; + uint16_t qid = idxd->qid; + int i = 0; + + if (command >= idxd_disable_wq && command <= idxd_reset_wq) + qid = (1 << qid); + rte_spinlock_lock(&idxd->u.pci->lk); + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; + + do { + rte_pause(); + err_code = idxd->u.pci->regs->cmdstatus; + if (++i >= 1000) { + IDXD_PMD_ERR("Timeout waiting for command response from HW"); + rte_spinlock_unlock(&idxd->u.pci->lk); + return err_code; + } + } while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK); + rte_spinlock_unlock(&idxd->u.pci->lk); + + return err_code & CMDSTATUS_ERR_MASK; +} + +static uint32_t * +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) +{ + return RTE_PTR_ADD(pci->wq_regs_base, + (uintptr_t)wq_idx << (5 + pci->wq_cfg_sz)); +} + +static int +idxd_is_wq_enabled(struct idxd_dmadev 
*idxd) +{ + uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx]; + return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; +} + +static int +idxd_pci_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + /* disable the device */ + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); + if (err_code) { + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device disabled OK"); + + /* free device memory */ + IDXD_PMD_DEBUG("Freeing device driver memory"); + rte_free(idxd->batch_idx_ring); + rte_free(idxd->desc_ring); + + return 0; +} + +static const struct rte_dma_dev_ops idxd_pci_ops = { + .dev_close = idxd_pci_dev_close, +}; + +/* each portal uses 4 x 4k pages */ +#define IDXD_PORTAL_SIZE (4096 * 4) + +static int +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, + unsigned int max_queues) +{ + struct idxd_pci_common *pci; + uint8_t nb_groups, nb_engines, nb_wqs; + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ + uint16_t wq_size, total_wq_size; + uint8_t lg2_max_batch, lg2_max_copy_size; + unsigned int i, err_code; + + pci = malloc(sizeof(*pci)); + if (pci == NULL) { + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); + goto err; + } + rte_spinlock_init(&pci->lk); + + /* assign the bar registers, and then configure device */ + pci->regs = dev->mem_resource[0].addr; + grp_offset = (uint16_t)pci->regs->offsets[0]; + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); + pci->portals = dev->mem_resource[2].addr; + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; + + /* sanity check device status */ + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { + /* need function-level-reset (FLR) or is enabled */ + IDXD_PMD_ERR("Device status is not disabled, cannot 
init"); + goto err; + } + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { + /* command in progress */ + IDXD_PMD_ERR("Device has a command in progress, cannot init"); + goto err; + } + + /* read basic info about the hardware for use when configuring */ + nb_groups = (uint8_t)pci->regs->grpcap; + nb_engines = (uint8_t)pci->regs->engcap; + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); + total_wq_size = (uint16_t)pci->regs->wqcap; + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F; + lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F; + + IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u", + nb_groups, nb_engines, nb_wqs); + + /* zero out any old config */ + for (i = 0; i < nb_groups; i++) { + pci->grp_regs[i].grpengcfg = 0; + pci->grp_regs[i].grpwqcfg[0] = 0; + } + for (i = 0; i < nb_wqs; i++) + idxd_get_wq_cfg(pci, i)[0] = 0; + + /* limit queues if necessary */ + if (max_queues != 0 && nb_wqs > max_queues) { + nb_wqs = max_queues; + if (nb_engines > max_queues) + nb_engines = max_queues; + if (nb_groups > max_queues) + nb_engines = max_queues; + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); + } + + /* put each engine into a separate group to avoid reordering */ + if (nb_groups > nb_engines) + nb_groups = nb_engines; + if (nb_groups < nb_engines) + nb_engines = nb_groups; + + /* assign engines to groups, round-robin style */ + for (i = 0; i < nb_engines; i++) { + IDXD_PMD_DEBUG("Assigning engine %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); + } + + /* now do the same for queues and give work slots to each queue */ + wq_size = total_wq_size / nb_wqs; + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", + wq_size, lg2_max_batch, lg2_max_copy_size); + for (i = 0; i < nb_wqs; i++) { + /* add engine "i" to a group */ + IDXD_PMD_DEBUG("Assigning work queue %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); + /* now 
configure it, in terms of size, max batch, mode */ + idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size; + idxd_get_wq_cfg(pci, i)[wq_mode_idx] = (1 << WQ_PRIORITY_SHIFT) | + WQ_MODE_DEDICATED; + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); + } + + /* dump the group configuration to output */ + for (i = 0; i < nb_groups; i++) { + IDXD_PMD_DEBUG("## Group %d", i); + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); + IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg); + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags); + } + + idxd->u.pci = pci; + idxd->max_batches = wq_size; + + /* enable the device itself */ + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); + if (err_code) { + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device enabled OK"); + + return nb_wqs; + +err: + free(pci); + return -1; +} + static int idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) { - int ret = 0; + struct idxd_dmadev idxd = {0}; + uint8_t nb_wqs; + int qid, ret = 0; char name[PCI_PRI_STR_SIZE]; + unsigned int max_queues = 0; rte_pci_device_name(&dev->addr, name, sizeof(name)); IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); dev->device.driver = &drv->driver; - return ret; + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { + /* if the number of devargs grows beyond just 1, use rte_kvargs */ + if (sscanf(dev->device.devargs->args, + "max_queues=%u", &max_queues) != 1) { + IDXD_PMD_ERR("Invalid device parameter: '%s'", + dev->device.devargs->args); + return -1; + } + } + + ret = init_pci_device(dev, &idxd, max_queues); + if (ret < 0) { + IDXD_PMD_ERR("Error initializing PCI hardware"); + return ret; + } + if (idxd.u.pci->portals == NULL) { + IDXD_PMD_ERR("Error, invalid portal assigned during initialization\n"); + return -EINVAL; + } + nb_wqs = 
(uint8_t)ret; + + /* set up one device for each queue */ + for (qid = 0; qid < nb_wqs; qid++) { + char qname[32]; + + /* add the queue number to each device name */ + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); + idxd.qid = qid; + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, + qid * IDXD_PORTAL_SIZE); + if (idxd_is_wq_enabled(&idxd)) + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); + ret = idxd_dmadev_create(qname, &dev->device, + &idxd, &idxd_pci_ops); + if (ret != 0) { + IDXD_PMD_ERR("Failed to create dmadev %s", name); + if (qid == 0) /* if no devices using this, free pci */ + free(idxd.u.pci); + return ret; + } + } + + return 0; +} + +static int +idxd_dmadev_destroy(const char *name) +{ + int ret; + struct rte_dma_dev *dmadev; + int dev_id = rte_dma_get_dev_id_by_name(name); + + if (!name) { + IDXD_PMD_ERR("Invalid device name"); + return -EINVAL; + } + + if (dev_id < 0) { + IDXD_PMD_ERR("Invalid device ID"); + return -EINVAL; + } + + dmadev = &rte_dma_devices[dev_id]; + if (!dmadev) { + IDXD_PMD_ERR("Invalid device name (%s)", name); + return -EINVAL; + } + + /* rte_dma_close is called by pmd_release */ + ret = rte_dma_pmd_release(name); + if (ret) + IDXD_PMD_DEBUG("Device cleanup failed"); + + return 0; } static int @@ -39,7 +311,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev) IDXD_PMD_INFO("Closing %s on NUMA node %d", name, dev->device.numa_node); - return 0; + return idxd_dmadev_destroy(name); } struct rte_pci_driver idxd_pmd_drv_pci = { -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
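The round-robin assignment of engines and work queues to groups in init_pci_device() boils down to setting bit i in the mask of group i % nb_groups. A minimal sketch of that arithmetic, detached from the hardware registers, is shown below. The function name and the plain array standing in for the grpengcfg/grpwqcfg registers are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Build one 64-bit mask per group, placing item i (an engine or a work
 * queue) into group i % nb_groups. The masks stand in for the per-group
 * configuration registers written by the driver.
 * Returns 0 on success, -1 if the arguments cannot be represented.
 */
static int
assign_round_robin(uint64_t *group_masks, unsigned int nb_groups,
		unsigned int nb_items)
{
	unsigned int i;

	if (nb_groups == 0 || nb_items > 64)
		return -1;

	/* zero out any old config, as the driver does before assigning */
	for (i = 0; i < nb_groups; i++)
		group_masks[i] = 0;

	for (i = 0; i < nb_items; i++)
		group_masks[i % nb_groups] |= (1ULL << i);

	return 0;
}
```

For example, four work queues spread over two groups give the masks 0x5 (queues 0 and 2) and 0xA (queues 1 and 3).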
* [dpdk-dev] [PATCH v7 06/16] dma/idxd: add datapath structures 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (4 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 07/16] dma/idxd: add configure and info_get functions Kevin Laatz ` (9 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data structures required for the data path for IDXD devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 33 ++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 60 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 ++ drivers/dma/idxd/idxd_pci.c | 1 + 5 files changed, 98 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b48fa954ed..3c0837ec52 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -95,6 +95,7 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, + .dev_dump = idxd_dump, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index be8f684bd5..5480a29c7b 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -10,6 +10,35 @@ #define IDXD_PMD_NAME_STR "dmadev_idxd" +int +idxd_dump(const struct rte_dma_dev *dev, FILE *f) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + unsigned int i; + + fprintf(f, "== IDXD Private Data ==\n"); + fprintf(f, " Portal: %p\n", idxd->portal); + fprintf(f, " Config: { 
ring_size: %u }\n", + idxd->qcfg.nb_desc); + fprintf(f, " Batch ring (sz = %u, max_batches = %u):\n\t", + idxd->max_batches + 1, idxd->max_batches); + for (i = 0; i <= idxd->max_batches; i++) { + fprintf(f, " %u ", idxd->batch_idx_ring[i]); + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) + fprintf(f, "[rd ptr, wr ptr] "); + else if (i == idxd->batch_idx_read) + fprintf(f, "[rd ptr] "); + else if (i == idxd->batch_idx_write) + fprintf(f, "[wr ptr] "); + if (i == idxd->max_batches) + fprintf(f, "\n"); + } + + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, @@ -19,6 +48,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, struct rte_dma_dev *dmadev = NULL; int ret = 0; + RTE_BUILD_BUG_ON(sizeof(struct idxd_hw_desc) != 64); + RTE_BUILD_BUG_ON(offsetof(struct idxd_hw_desc, size) != 32); + RTE_BUILD_BUG_ON(sizeof(struct idxd_completion) != 32); + if (!name) { IDXD_PMD_ERR("Invalid name of the device!"); ret = -EINVAL; diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index ea627cba6d..55ca9f7f52 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -5,6 +5,66 @@ #ifndef _IDXD_HW_DEFS_H_ #define _IDXD_HW_DEFS_H_ +/* + * Defines used in the data path for interacting with IDXD hardware. + */ +#define IDXD_CMD_OP_SHIFT 24 +enum rte_idxd_ops { + idxd_op_nop = 0, + idxd_op_batch, + idxd_op_drain, + idxd_op_memmove, + idxd_op_fill +}; + +#define IDXD_FLAG_FENCE (1 << 0) +#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2) +#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3) +#define IDXD_FLAG_CACHE_CONTROL (1 << 8) + +/** + * Hardware descriptor used by DSA hardware, for both bursts and + * for individual operations. 
+ */ +struct idxd_hw_desc { + uint32_t pasid; + uint32_t op_flags; + rte_iova_t completion; + + RTE_STD_C11 + union { + rte_iova_t src; /* source address for copy ops etc. */ + rte_iova_t desc_addr; /* descriptor pointer for batch */ + }; + rte_iova_t dst; + + uint32_t size; /* length of data for op, or batch size */ + + uint16_t intr_handle; /* completion interrupt handle */ + + /* remaining 26 bytes are reserved */ + uint16_t __reserved[13]; +} __rte_aligned(64); + +#define IDXD_COMP_STATUS_INCOMPLETE 0 +#define IDXD_COMP_STATUS_SUCCESS 1 +#define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 +#define IDXD_COMP_STATUS_INVALID_SIZE 0x13 +#define IDXD_COMP_STATUS_SKIPPED 0xFF /* not official IDXD error, needed as placeholder */ + +/** + * Completion record structure written back by DSA + */ +struct idxd_completion { + uint8_t status; + uint8_t result; + /* 16-bits pad here */ + uint32_t completed_size; /* data length, or descriptors for batch */ + + rte_iova_t fault_address; + uint32_t invalid_flags; +} __rte_aligned(32); + /*** Definitions for Intel(R) Data Streaming Accelerator ***/ #define IDXD_CMD_SHIFT 20 diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index cb3a68c69b..99c8e04302 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -39,6 +39,8 @@ struct idxd_pci_common { }; struct idxd_dmadev { + struct idxd_hw_desc *desc_ring; + /* counters to track the batches */ unsigned short max_batches; unsigned short batch_idx_read; @@ -79,5 +81,6 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); +int idxd_dump(const struct rte_dma_dev *dev, FILE *f); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 2589d59a7d..72d9723fe7 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -83,6 +83,7 @@ idxd_pci_dev_close(struct 
rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_close = idxd_pci_dev_close, + .dev_dump = idxd_dump, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
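The RTE_BUILD_BUG_ON() checks added to idxd_dmadev_create() pin down the descriptor layout: 64 bytes in total, with the size field at byte offset 32. The same checks can be reproduced in plain C11, as in the sketch below. It assumes rte_iova_t is a 64-bit integer and uses hypothetical type names (`iova_t`, `struct hw_desc_sketch`); the field order is copied from struct idxd_hw_desc in the patch, and the 64-byte alignment attribute is left out since it does not change the size here.

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

typedef uint64_t iova_t; /* assumption: stand-in for rte_iova_t */

/* Field-for-field copy of the layout of struct idxd_hw_desc. */
struct hw_desc_sketch {
	uint32_t pasid;
	uint32_t op_flags;
	iova_t completion;
	union {
		iova_t src;       /* source address for copy ops */
		iova_t desc_addr; /* descriptor pointer for a batch */
	};
	iova_t dst;
	uint32_t size;            /* data length, or batch size */
	uint16_t intr_handle;     /* completion interrupt handle */
	uint16_t reserved[13];    /* remaining 26 bytes are reserved */
};

/* compile-time equivalents of the RTE_BUILD_BUG_ON checks */
static_assert(sizeof(struct hw_desc_sketch) == 64,
		"descriptor must be 64 bytes");
static_assert(offsetof(struct hw_desc_sketch, size) == 32,
		"size field must sit at byte offset 32");
```

Checking the layout at build time means a field added or reordered by mistake fails the compile rather than producing descriptors the hardware misreads.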
* [dpdk-dev] [PATCH v7 07/16] dma/idxd: add configure and info_get functions 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (5 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 06/16] dma/idxd: add datapath structures Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz ` (8 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add functions for device configuration. The info_get function is included here since it can be useful for checking successful configuration. Documentation is also updated to add device configuration usage info. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v7: remove invalid doc references after rebase --- doc/guides/dmadevs/idxd.rst | 15 +++++++ drivers/dma/idxd/idxd_bus.c | 3 ++ drivers/dma/idxd/idxd_common.c | 71 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 6 +++ drivers/dma/idxd/idxd_pci.c | 3 ++ 5 files changed, 98 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index ce33e2857a..62ffd39ee0 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -120,3 +120,18 @@ use a subset of configured queues. Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, that is a "DMA device type" inside DPDK, and can be accessed using APIs from the ``rte_dmadev`` library. + +Using IDXD DMAdev Devices +-------------------------- + +To use the devices from an application, the dmadev API can be used. 
+ +Device Configuration +~~~~~~~~~~~~~~~~~~~~~ + +IDXD configuration requirements: + +* ``ring_size`` must be a power of two, between 64 and 4096. +* Only one ``vchan`` is supported per device (work queue). +* IDXD devices do not support silent mode. +* The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 3c0837ec52..b2acdac4f9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -96,6 +96,9 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 5480a29c7b..378a1d93a6 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -39,6 +39,77 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + + if (size < sizeof(*info)) + return -EINVAL; + + *info = (struct rte_dma_info) { + .dev_capa = RTE_DMA_CAPA_MEM_TO_MEM | RTE_DMA_CAPA_HANDLES_ERRORS | + RTE_DMA_CAPA_OPS_COPY | RTE_DMA_CAPA_OPS_FILL, + .max_vchans = 1, + .max_desc = 4096, + .min_desc = 64, + }; + if (idxd->sva_support) + info->dev_capa |= RTE_DMA_CAPA_SVA; + return 0; +} + +int +idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz) +{ + if (sizeof(struct rte_dma_conf) != conf_sz) + return -EINVAL; + + if (dev_conf->nb_vchans != 1) + return -EINVAL; + return 0; +} + +int +idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz) +{ + struct idxd_dmadev *idxd = 
dev->fp_obj->dev_private; + uint16_t max_desc = qconf->nb_desc; + + if (sizeof(struct rte_dma_vchan_conf) != qconf_sz) + return -EINVAL; + + idxd->qcfg = *qconf; + + if (!rte_is_power_of_2(max_desc)) + max_desc = rte_align32pow2(max_desc); + IDXD_PMD_DEBUG("DMA dev %u using %u descriptors", dev->data->dev_id, max_desc); + idxd->desc_ring_mask = max_desc - 1; + idxd->qcfg.nb_desc = max_desc; + + /* in case we are reconfiguring a device, free any existing memory */ + rte_free(idxd->desc_ring); + + /* allocate the descriptor ring at 2x size as batches can't wrap */ + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); + if (idxd->desc_ring == NULL) + return -ENOMEM; + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); + + idxd->batch_idx_read = 0; + idxd->batch_idx_write = 0; + idxd->batch_start = 0; + idxd->batch_size = 0; + idxd->ids_returned = 0; + idxd->ids_avail = 0; + + memset(idxd->batch_comp_ring, 0, sizeof(*idxd->batch_comp_ring) * + (idxd->max_batches + 1)); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 99c8e04302..fdd018ca35 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -82,5 +82,11 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); int idxd_dump(const struct rte_dma_dev *dev, FILE *f); +int idxd_configure(struct rte_dma_dev *dev, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz); +int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); +int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, + uint32_t size); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 
72d9723fe7..7cf2f61138 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -84,6 +84,9 @@ idxd_pci_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_close = idxd_pci_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
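The descriptor-count handling in idxd_vchan_setup() above can be illustrated in isolation. What follows is a minimal standalone sketch, not the driver code: round_up_pow2() is a stand-in for rte_align32pow2(), and struct ring_params and ring_params_for() are hypothetical names introduced only for this example. It assumes the requested count has already been range-checked by the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rte_align32pow2(): round x up to the next power of two.
 * Assumes 1 <= x <= 2^31. */
static uint32_t
round_up_pow2(uint32_t x)
{
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	return x + 1;
}

/* Hypothetical container for the values the setup step derives. */
struct ring_params {
	uint16_t nb_desc;     /* descriptor count actually used */
	uint16_t mask;        /* nb_desc - 1, used to wrap ring indexes */
	size_t alloc_entries; /* entries allocated: 2x, as batches can't wrap */
};

static struct ring_params
ring_params_for(uint16_t requested)
{
	struct ring_params p;
	uint32_t n = round_up_pow2(requested);

	p.nb_desc = (uint16_t)n;
	p.mask = (uint16_t)(n - 1);
	p.alloc_entries = (size_t)n * 2;
	return p;
}
```

Under these assumptions a request for 100 descriptors is rounded up to 128, giving a mask of 127 and an allocation of 256 entries, while a request that is already a power of two is used unchanged.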
* [dpdk-dev] [PATCH v7 08/16] dma/idxd: add start and stop functions for pci devices 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (6 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 07/16] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 09/16] dma/idxd: add data-path job submission functions Kevin Laatz ` (7 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add device start/stop functions for DSA devices bound to vfio. For devices bound to the IDXD kernel driver, these are not required since the IDXD kernel driver takes care of this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 3 +++ drivers/dma/idxd/idxd_pci.c | 51 +++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 62ffd39ee0..711890bd9e 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -135,3 +135,6 @@ IDXD configuration requirements: * Only one ``vchan`` is supported per device (work queue). * IDXD devices do not support silent mode. * The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. + +Once configured, the device can then be made ready for use by calling the +``rte_dma_start()`` API. 
diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 7cf2f61138..68d8660285 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -59,6 +59,55 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; } +static int +idxd_pci_dev_stop(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + if (!idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); + return -EALREADY; + } + + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); + if (err_code || idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); + + return 0; +} + +static int +idxd_pci_dev_start(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + if (idxd_is_wq_enabled(idxd)) { + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); + return 0; + } + + if (idxd->desc_ring == NULL) { + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); + return -EINVAL; + } + + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); + if (err_code || !idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); + + return 0; +} + static int idxd_pci_dev_close(struct rte_dma_dev *dev) { @@ -87,6 +136,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .dev_start = idxd_pci_dev_start, + .dev_stop = idxd_pci_dev_stop, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
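The return-code behaviour of the start and stop functions above can be modelled without hardware. This is a rough sketch under stated assumptions: struct fake_wq and the fake_wq_*() functions are hypothetical stand-ins for the work queue state and for idxd_pci_dev_command(), and are not part of the driver.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the state the driver reads from the device. */
struct fake_wq {
	bool enabled;        /* plays the role of idxd_is_wq_enabled() */
	bool ring_allocated; /* plays the role of desc_ring != NULL */
	uint8_t cmd_err;     /* error code the next command will report */
};

/* Pretend to issue an enable/disable command; state changes only on success. */
static uint8_t
fake_wq_command(struct fake_wq *wq, bool enable)
{
	if (wq->cmd_err == 0)
		wq->enabled = enable;
	return wq->cmd_err;
}

static int
fake_wq_start(struct fake_wq *wq)
{
	uint8_t err;

	if (wq->enabled)
		return 0;       /* already enabled: warn and succeed */
	if (!wq->ring_allocated)
		return -EINVAL; /* vchan setup has not been done yet */
	err = fake_wq_command(wq, true);
	if (err || !wq->enabled)
		return err == 0 ? -1 : -err;
	return 0;
}

static int
fake_wq_stop(struct fake_wq *wq)
{
	uint8_t err;

	if (!wq->enabled)
		return -EALREADY; /* a second stop is reported as an error */
	err = fake_wq_command(wq, false);
	if (err || wq->enabled)
		return err == 0 ? -1 : -err;
	return 0;
}
```

The asymmetry is worth noting: in this model, as in the patch, starting an already enabled queue succeeds, whereas stopping an already disabled queue returns -EALREADY.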
* [dpdk-dev] [PATCH v7 09/16] dma/idxd: add data-path job submission functions 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (7 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 10/16] dma/idxd: add data-path job completion functions Kevin Laatz ` (6 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data path functions for enqueuing and submitting operations to DSA devices. Documentation updates are included for dmadev library and IDXD driver docs as appropriate. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 9 ++ doc/guides/prog_guide/dmadev.rst | 19 +++++ drivers/dma/idxd/idxd_common.c | 137 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 ++ drivers/dma/idxd/meson.build | 1 + 5 files changed, 171 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 711890bd9e..d548c4751a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -138,3 +138,12 @@ IDXD configuration requirements: Once configured, the device can then be made ready for use by calling the ``rte_dma_start()`` API. + +Performing Data Copies +~~~~~~~~~~~~~~~~~~~~~~~ + +Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library +documentation for details on operation enqueue and submission API usage. + +It is expected that, for efficiency reasons, a burst of operations will be enqueued to the +device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. 
diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index 32f7147862..e853ffda3a 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -67,6 +67,8 @@ can be used to get the device info and supported features. Silent mode is a special device capability which does not require the application to invoke dequeue APIs. +.. _dmadev_enqueue_dequeue: + Enqueue / Dequeue APIs ~~~~~~~~~~~~~~~~~~~~~~ @@ -80,6 +82,23 @@ The ``rte_dma_submit`` API is used to issue doorbell to hardware. Alternatively the ``RTE_DMA_OP_FLAG_SUBMIT`` flag can be passed to the enqueue APIs to also issue the doorbell to hardware. +The following code demonstrates how to enqueue a burst of copies to the +device and start the hardware processing of them: + +.. code-block:: C + + struct rte_mbuf *srcs[DMA_BURST_SZ], *dsts[DMA_BURST_SZ]; + unsigned int i; + + for (i = 0; i < RTE_DIM(srcs); i++) { + if (rte_dma_copy(dev_id, vchan, rte_pktmbuf_iova(srcs[i]), + rte_pktmbuf_iova(dsts[i]), COPY_LEN, 0) < 0) { + PRINT_ERR("Error with rte_dma_copy for buffer %u\n", i); + return -1; + } + } + rte_dma_submit(dev_id, vchan); + There are two dequeue APIs ``rte_dma_completed`` and ``rte_dma_completed_status``, these are used to obtain the results of the enqueue requests.
``rte_dma_completed`` will return the number of successfully diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 378a1d93a6..dfa7c36d45 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,14 +2,145 @@ * Copyright 2021 Intel Corporation */ +#include <x86intrin.h> + #include <rte_dmadev_pmd.h> #include <rte_malloc.h> #include <rte_common.h> +#include <rte_prefetch.h> #include "idxd_internal.h" #define IDXD_PMD_NAME_STR "dmadev_idxd" +static __rte_always_inline rte_iova_t +__desc_idx_to_iova(struct idxd_dmadev *idxd, uint16_t n) +{ + return idxd->desc_iova + (n * sizeof(struct idxd_hw_desc)); +} + +static __rte_always_inline void +__idxd_movdir64b(volatile void *dst, const struct idxd_hw_desc *src) +{ + asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : + : "a" (dst), "d" (src) + : "memory"); +} + +static __rte_always_inline void +__submit(struct idxd_dmadev *idxd) +{ + rte_prefetch1(&idxd->batch_comp_ring[idxd->batch_idx_read]); + + if (idxd->batch_size == 0) + return; + + /* write completion to batch comp ring */ + rte_iova_t comp_addr = idxd->batch_iova + + (idxd->batch_idx_write * sizeof(struct idxd_completion)); + + if (idxd->batch_size == 1) { + /* submit batch directly */ + struct idxd_hw_desc desc = + idxd->desc_ring[idxd->batch_start & idxd->desc_ring_mask]; + desc.completion = comp_addr; + desc.op_flags |= IDXD_FLAG_REQUEST_COMPLETION; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &desc); + } else { + const struct idxd_hw_desc batch_desc = { + .op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_COMPLETION_ADDR_VALID | + IDXD_FLAG_REQUEST_COMPLETION, + .desc_addr = __desc_idx_to_iova(idxd, + idxd->batch_start & idxd->desc_ring_mask), + .completion = comp_addr, + .size = idxd->batch_size, + }; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &batch_desc); + } + + if (++idxd->batch_idx_write > 
idxd->max_batches) + idxd->batch_idx_write = 0; + + idxd->batch_start += idxd->batch_size; + idxd->batch_size = 0; + idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; + _mm256_store_si256((void *)&idxd->batch_comp_ring[idxd->batch_idx_write], + _mm256_setzero_si256()); +} + +static __rte_always_inline int +__idxd_write_desc(struct idxd_dmadev *idxd, + const uint32_t op_flags, + const rte_iova_t src, + const rte_iova_t dst, + const uint32_t size, + const uint32_t flags) +{ + uint16_t mask = idxd->desc_ring_mask; + uint16_t job_id = idxd->batch_start + idxd->batch_size; + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; + + /* first check batch ring space then desc ring space */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return -ENOSPC; + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) + return -ENOSPC; + + /* write desc. Note: descriptors don't wrap, but the completion address does */ + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, + _mm256_set_epi64x(0, 0, 0, size)); + + idxd->batch_size++; + + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) + __submit(idxd); + + return job_id; +} + +int +idxd_enqueue_copy(void *dev_private, uint16_t qid __rte_unused, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, + * but check it at compile time to be sure. 
+ */ + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev_private, memmove, src, dst, length, + flags); +} + +int +idxd_enqueue_fill(void *dev_private, uint16_t qid __rte_unused, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + uint32_t fill = (idxd_op_fill << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev_private, fill, pattern, dst, length, + flags); +} + +int +idxd_submit(void *dev_private, uint16_t qid __rte_unused) +{ + __submit(dev_private); + return 0; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -139,6 +270,12 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->dev_ops = ops; dmadev->device = dev; + dmadev->fp_obj->copy = idxd_enqueue_copy; + dmadev->fp_obj->fill = idxd_enqueue_fill; + dmadev->fp_obj->submit = idxd_submit; + dmadev->fp_obj->completed = idxd_completed; + dmadev->fp_obj->completed_status = idxd_completed_status; + idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ idxd->dmadev = dmadev; diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index fdd018ca35..5b83e6dc60 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -88,5 +88,10 @@ int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, uint32_t size); +int idxd_enqueue_copy(void *dev_private, uint16_t qid, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_enqueue_fill(void *dev_private, uint16_t qid, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_submit(void *dev_private, uint16_t 
qid); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index da5dc2b019..3b5133c578 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -6,6 +6,7 @@ if is_windows endif deps += ['bus_pci'] +cflags += '-mavx2' # all platforms with idxd HW support AVX sources = files( 'idxd_bus.c', 'idxd_common.c', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
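The space checks and index arithmetic used by __submit() and __idxd_write_desc() above can be sketched on their own. The example below is a simplified illustration, not the driver code: struct batch_ring and the three helper functions are hypothetical names, and only the index logic is modelled (no descriptors are written).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced view of the batch-tracking fields. */
struct batch_ring {
	uint16_t max_batches; /* the ring has max_batches + 1 slots */
	uint16_t idx_read;
	uint16_t idx_write;
};

/* Full when advancing the write index would land on the read index. */
static bool
batch_ring_full(const struct batch_ring *r)
{
	return (r->idx_read == 0 && r->idx_write == r->max_batches) ||
		r->idx_write + 1 == r->idx_read;
}

/* Advance the write index, wrapping after the last slot. */
static void
batch_ring_advance_write(struct batch_ring *r)
{
	if (++r->idx_write > r->max_batches)
		r->idx_write = 0;
}

/* Slot for the next descriptor: only the batch start is masked, so the
 * result may run past the nominal ring size into the second half of the
 * 2x-sized allocation instead of wrapping mid-batch. */
static uint16_t
next_desc_slot(uint16_t batch_start, uint16_t batch_size, uint16_t mask)
{
	return (uint16_t)((batch_start & mask) + batch_size);
}
```

With a 64-entry ring (mask 63), a batch starting at index 62 that already holds 5 descriptors places the next one at slot 67, which is why the descriptor ring is allocated at twice the configured size.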
* [dpdk-dev] [PATCH v7 10/16] dma/idxd: add data-path job completion functions 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (8 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 09/16] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 11/16] dma/idxd: add operation statistic tracking Kevin Laatz ` (5 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the data path functions for gathering completed operations. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 32 ++++- drivers/dma/idxd/idxd_common.c | 234 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 + 3 files changed, 270 insertions(+), 1 deletion(-) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index d548c4751a..d4a210b854 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -143,7 +143,37 @@ Performing Data Copies ~~~~~~~~~~~~~~~~~~~~~~~ Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library -documentation for details on operation enqueue and submission API usage. +documentation for details on operation enqueue, submission and completion API usage. It is expected that, for efficiency reasons, a burst of operations will be enqueued to the device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. + +When gathering completions, ``rte_dma_completed()`` should be used, up until the point an error +occurs in an operation. 
If an error was encountered, ``rte_dma_completed_status()`` must be used +to kick the device off to continue processing operations and also to gather the status of each +individual operation, which is filled into the ``status`` array provided as a parameter by the +application. + +The following status codes are supported by IDXD: + +* ``RTE_DMA_STATUS_SUCCESSFUL``: The operation was successful. +* ``RTE_DMA_STATUS_INVALID_OPCODE``: The operation failed due to an invalid operation code. +* ``RTE_DMA_STATUS_INVALID_LENGTH``: The operation failed due to an invalid data length. +* ``RTE_DMA_STATUS_NOT_ATTEMPTED``: The operation was not attempted. +* ``RTE_DMA_STATUS_ERROR_UNKNOWN``: The operation failed due to an unspecified error. + +The following code shows how to retrieve the number of successfully completed +copies within a burst and then use ``rte_dma_completed_status()`` to check +which operation failed and kick off the device to continue processing operations: + +.. code-block:: C + + enum rte_dma_status_code status[COMP_BURST_SZ]; + uint16_t count, idx, status_count; + bool error = false; + + count = rte_dma_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); + + if (error) { + status_count = rte_dma_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status); + } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index dfa7c36d45..6efe09121f 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -141,6 +141,240 @@ idxd_submit(void *dev_private, uint16_t qid __rte_unused) return 0; } +static enum rte_dma_status_code +get_comp_status(struct idxd_completion *c) +{ + uint8_t st = c->status; + switch (st) { + /* successful descriptors are not written back normally */ + case IDXD_COMP_STATUS_INCOMPLETE: + case IDXD_COMP_STATUS_SUCCESS: + return RTE_DMA_STATUS_SUCCESSFUL; + case IDXD_COMP_STATUS_INVALID_OPCODE: + return RTE_DMA_STATUS_INVALID_OPCODE; + case IDXD_COMP_STATUS_INVALID_SIZE: + return
RTE_DMA_STATUS_INVALID_LENGTH; + case IDXD_COMP_STATUS_SKIPPED: + return RTE_DMA_STATUS_NOT_ATTEMPTED; + default: + return RTE_DMA_STATUS_ERROR_UNKNOWN; + } +} + +static __rte_always_inline int +batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t ret; + uint8_t bstatus; + + if (max_ops == 0) + return 0; + + /* first check if there are any unreturned handles from last time */ + if (idxd->ids_avail != idxd->ids_returned) { + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + + if (idxd->batch_idx_read == idxd->batch_idx_write) + return 0; + + bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status; + /* now check if next batch is complete and successful */ + if (bstatus == IDXD_COMP_STATUS_SUCCESS) { + /* since the batch idx ring stores the start of each batch, pre-increment to lookup + * start of next batch. + */ + if (++idxd->batch_idx_read > idxd->max_batches) + idxd->batch_idx_read = 0; + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; + + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + /* check if batch is incomplete */ + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) + return 0; + + return -1; /* error case */ +} + +static inline uint16_t +batch_completed(struct idxd_dmadev *idxd, uint16_t max_ops, bool *has_error) +{ + uint16_t i; + uint16_t b_start, b_end, next_batch; + + int ret = batch_ok(idxd, max_ops, NULL); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ * once we identify the actual failure job, return other jobs, then update + * the batch ring indexes to make it look like the first job of the batch has failed. + * Subsequent calls here will always return zero packets, and the error must be cleared by + * calling the completed_status() function. + */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + b_end = idxd->batch_idx_ring[next_batch]; + + if (b_end - b_start == 1) { /* not a batch */ + *has_error = true; + return 0; + } + + for (i = b_start; i < b_end; i++) { + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ + break; + } + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); + if (ret < max_ops) + *has_error = true; /* we got up to the point of error */ + idxd->ids_avail = idxd->ids_returned += ret; + + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { + /* copy over the descriptor status to the batch ring as if no batch */ + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; + } + + return ret; +} + +static uint16_t +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t next_batch; + + int ret = batch_ok(idxd, max_ops, status); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; + const uint16_t b_len = b_end - b_start; + if (b_len == 1) {/* not a batch */ + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + idxd->ids_avail++; + idxd->ids_returned++; + idxd->batch_idx_read = next_batch; + return 1; + } + + /* not a single-element batch, need to process more. + * Scenarios: + * 1. max_ops >= batch_size - can fit everything, simple case + * - loop through completed ops and then add on any not-attempted ones + * 2. max_ops < batch_size - can't fit everything, more complex case + * - loop through completed/incomplete and stop when hit max_ops + * - adjust the batch descriptor to update where we stopped, with appropriate bcount + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as + * non-batch next time. + */ + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; + for (ret = 0; ret < b_len && ret < max_ops; ret++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + } + idxd->ids_avail = idxd->ids_returned += ret; + + /* everything fit */ + if (ret == b_len) { + idxd->batch_idx_read = next_batch; + return ret; + } + + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (ret > bcount) { + /* we have only incomplete ones - set batch completed size to 0 */ + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size = 0; + /* if there is only one descriptor left, job skipped so set flag appropriately */ + if (b_len - ret == 1) + comp->status = IDXD_COMP_STATUS_SKIPPED; + } else { + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size -= ret; + /* if there is only one descriptor left, copy status info straight to desc */ + if (comp->completed_size == 1) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + comp->status = c->status; + /* individual descs can be ok without writeback, but not batches */ + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) + comp->status = IDXD_COMP_STATUS_SUCCESS; + } else if (bcount == b_len) { + /* check if we still have an error, and clear flag if not */ + uint16_t i; + for (i = b_start + ret; i < b_end; i++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) + break; + } + if (i == b_end) /* no errors */ + comp->status = IDXD_COMP_STATUS_SUCCESS; + } + } + + return ret; +} + +uint16_t +idxd_completed(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, bool *has_error) +{ + struct idxd_dmadev *idxd = dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed(idxd, max_ops - ret, has_error); + ret += batch; + } while (batch > 0 && *has_error == false); + + *last_idx = idxd->ids_returned - 1; 
+ return ret; +} + +uint16_t +idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, enum rte_dma_status_code *status) +{ + struct idxd_dmadev *idxd = dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); + ret += batch; + } while (batch > 0); + + *last_idx = idxd->ids_returned - 1; + return ret; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 5b83e6dc60..840f8ce345 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -93,5 +93,10 @@ int idxd_enqueue_copy(void *dev_private, uint16_t qid, rte_iova_t src, int idxd_enqueue_fill(void *dev_private, uint16_t qid, uint64_t pattern, rte_iova_t dst, unsigned int length, uint64_t flags); int idxd_submit(void *dev_private, uint16_t qid); +uint16_t idxd_completed(void *dev_private, uint16_t qid, uint16_t max_ops, + uint16_t *last_idx, bool *has_error); +uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, + uint16_t max_ops, uint16_t *last_idx, + enum rte_dma_status_code *status); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
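How batch_completed_status() fills the status array for a failed batch can be shown with a small standalone model. This is a sketch under assumptions, not the driver code: the enum names and fill_batch_status() are hypothetical, the hardware code values other than 0 (incomplete) and 1 (success) are placeholders chosen for the example, and the driver's separate "skipped" code is left out.

```c
#include <stdint.h>

/* Hardware completion codes. The patch comments confirm incomplete is 0
 * and success is 1; the other values are placeholders for this sketch. */
enum hw_status {
	HW_INCOMPLETE = 0,
	HW_SUCCESS = 1,
	HW_INVALID_OPCODE = 0x10,
	HW_INVALID_SIZE = 0x13,
};

/* Hypothetical mirror of the dmadev status codes listed in the docs. */
enum op_status {
	OP_SUCCESSFUL,
	OP_INVALID_OPCODE,
	OP_INVALID_LENGTH,
	OP_NOT_ATTEMPTED,
	OP_ERROR_UNKNOWN,
};

static enum op_status
map_status(uint8_t hw)
{
	switch (hw) {
	case HW_INCOMPLETE: /* successful descriptors are normally not written back */
	case HW_SUCCESS:
		return OP_SUCCESSFUL;
	case HW_INVALID_OPCODE:
		return OP_INVALID_OPCODE;
	case HW_INVALID_SIZE:
		return OP_INVALID_LENGTH;
	default:
		return OP_ERROR_UNKNOWN;
	}
}

/* Report up to max_ops statuses for a batch of b_len jobs. The first
 * bcount jobs (the batch's completed_size) carry their own status; the
 * remainder were never attempted by the hardware. */
static uint16_t
fill_batch_status(const uint8_t *hw, uint16_t b_len, uint16_t bcount,
		uint16_t max_ops, enum op_status *out)
{
	uint16_t i;

	for (i = 0; i < b_len && i < max_ops; i++)
		out[i] = (i < bcount) ? map_status(hw[i]) : OP_NOT_ATTEMPTED;
	return i;
}
```

For a batch of five jobs in which the third failed with an invalid opcode and completed_size is 3, this model reports two successes, one invalid-opcode error and two not-attempted jobs, provided the caller's array has room for all five.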
* [dpdk-dev] [PATCH v7 11/16] dma/idxd: add operation statistic tracking 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (9 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 10/16] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 12/16] dma/idxd: add vchan status function Kevin Laatz ` (4 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add statistic tracking for DSA devices. The dmadev library documentation is also updated to add a generic section for using the library's statistics APIs. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- doc/guides/prog_guide/dmadev.rst | 11 +++++++++++ drivers/dma/idxd/idxd_bus.c | 2 ++ drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 3 +++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 45 insertions(+) diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index e853ffda3a..139eaff299 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -107,3 +107,14 @@ completed operations along with the status of each operation (filled into the ``status`` array passed by user). These two APIs can also return the last completed operation's ``ring_idx`` which could help user track operations within their own application-defined rings. + + +Querying Device Statistics +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The statistics from a dmadev device can be retrieved via the statistics functions, +i.e. ``rte_dma_stats_get()``.
The statistics returned for each device instance are: + +* ``submitted``: The number of operations submitted to the device. +* ``completed``: The number of operations which have completed (successful and failed). +* ``errors``: The number of operations that completed with error. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b2acdac4f9..b52ea02854 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -99,6 +99,8 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 6efe09121f..c36cd96aa6 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -65,6 +65,8 @@ __submit(struct idxd_dmadev *idxd) if (++idxd->batch_idx_write > idxd->max_batches) idxd->batch_idx_write = 0; + idxd->stats.submitted += idxd->batch_size; + idxd->batch_start += idxd->batch_size; idxd->batch_size = 0; idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; @@ -276,6 +278,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ const uint16_t b_len = b_end - b_start; if (b_len == 1) {/* not a batch */ *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + if (*status != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; idxd->ids_avail++; idxd->ids_returned++; idxd->batch_idx_read = next_batch; @@ -297,6 +301,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ struct idxd_completion *c = (void *) &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; status[ret] = (ret < bcount) ?
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + if (status[ret] != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; } idxd->ids_avail = idxd->ids_returned += ret; @@ -355,6 +361,7 @@ idxd_completed(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, ret += batch; } while (batch > 0 && *has_error == false); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -371,6 +378,7 @@ idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max ret += batch; } while (batch > 0); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -404,6 +412,25 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + struct rte_dma_stats *stats, uint32_t stats_sz) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + if (stats_sz < sizeof(*stats)) + return -EINVAL; + *stats = idxd->stats; + return 0; +} + +int +idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + idxd->stats = (struct rte_dma_stats){0}; + return 0; +} + int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 840f8ce345..7842936936 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -98,5 +98,8 @@ uint16_t idxd_completed(void *dev_private, uint16_t qid, uint16_t max_ops, uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, uint16_t *last_idx, enum rte_dma_status_code *status); +int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, + struct rte_dma_stats *stats, uint32_t stats_sz); +int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c 
b/drivers/dma/idxd/idxd_pci.c index 68d8660285..307bfe6151 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -136,6 +136,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, }; -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
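Note on the accounting in 11/16: ``submitted`` grows by the batch size when a batch is handed to hardware, ``completed`` grows by the number of operations returned to the caller, and ``errors`` grows once for every returned status that is not "successful" (which includes "not attempted"). The error counter has to be driven by the status value itself, not by the address of the status array. The sketch below is a standalone model of that bookkeeping and is not the driver code: struct model_stats, enum model_status, model_submit() and model_complete() are hypothetical stand-ins for struct rte_dma_stats, enum rte_dma_status_code and the idxd data-path functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for enum rte_dma_status_code. */
enum model_status {
	MODEL_STATUS_SUCCESSFUL = 0,
	MODEL_STATUS_ERROR,
	MODEL_STATUS_NOT_ATTEMPTED,
};

/* Hypothetical stand-in for struct rte_dma_stats. */
struct model_stats {
	uint64_t submitted;
	uint64_t completed;
	uint64_t errors;
};

/* A batch of n operations is handed to hardware. */
static void
model_submit(struct model_stats *st, uint16_t n)
{
	st->submitted += n;
}

/* n finished operations are handed back to the application. */
static uint16_t
model_complete(struct model_stats *st, const enum model_status *status, uint16_t n)
{
	uint16_t i;

	assert(status != NULL);
	for (i = 0; i < n; i++) {
		/* test the status value, not the pointer to it */
		if (status[i] != MODEL_STATUS_SUCCESSFUL)
			st->errors++;
	}
	st->completed += n;
	return n;
}
```

With these rules ``completed`` counts both good and failed operations, so ``completed - errors`` gives the number of successful ones.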
* [dpdk-dev] [PATCH v7 12/16] dma/idxd: add vchan status function 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (10 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 11/16] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 13/16] dma/idxd: add burst capacity API Kevin Laatz ` (3 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When testing dmadev drivers, it is useful to have the HW device in a known state. This patch adds the implementation of the function which will wait for the device to be idle (all jobs completed) before proceeding. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 17 +++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 2 ++ drivers/dma/idxd/idxd_pci.c | 1 + 4 files changed, 21 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b52ea02854..e6caa048a9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -101,6 +101,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_info_get = idxd_info_get, .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, + .vchan_status = idxd_vchan_status, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index c36cd96aa6..a2edc8a91f 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -163,6 +163,23 @@ get_comp_status(struct idxd_completion *c) } } +int +idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + enum rte_dma_vchan_status *status) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint16_t 
last_batch_write = idxd->batch_idx_write == 0 ? idxd->max_batches : + idxd->batch_idx_write - 1; + uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0); + + /* An IDXD device will always be either active or idle. + * RTE_DMA_VCHAN_HALTED_ERROR is therefore not supported by IDXD. + */ + *status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE; + + return 0; +} + static __rte_always_inline int batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 7842936936..2b16a358e3 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -101,5 +101,7 @@ uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, struct rte_dma_stats *stats, uint32_t stats_sz); int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); +int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, + enum rte_dma_vchan_status *status); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 307bfe6151..5abf2ad55b 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -140,6 +140,7 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, + .vchan_status = idxd_vchan_status, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
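Note on the idle check in 12/16: the batch ring uses slots 0..max_batches (the write index wraps to 0 only after it exceeds max_batches), so the slot written most recently is batch_idx_write - 1, or max_batches when the write index has just wrapped. The channel is reported idle once that last slot has a non-zero completion status. The following standalone sketch models only this index arithmetic; enum model_vchan_status, model_last_batch() and model_vchan_state() are hypothetical names, and the plain uint8_t array stands in for the status bytes of the real completion ring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for enum rte_dma_vchan_status (idle/active only). */
enum model_vchan_status {
	MODEL_VCHAN_IDLE,
	MODEL_VCHAN_ACTIVE,
};

/* Slot of the batch that was submitted most recently. */
static uint16_t
model_last_batch(uint16_t batch_idx_write, uint16_t max_batches)
{
	assert(batch_idx_write <= max_batches);
	return batch_idx_write == 0 ? max_batches : (uint16_t)(batch_idx_write - 1);
}

/* Idle when the most recent batch already has a completion status. */
static enum model_vchan_status
model_vchan_state(const uint8_t *comp_status, uint16_t batch_idx_write,
		uint16_t max_batches)
{
	uint16_t last;

	assert(comp_status != NULL);
	last = model_last_batch(batch_idx_write, max_batches);
	return comp_status[last] != 0 ? MODEL_VCHAN_IDLE : MODEL_VCHAN_ACTIVE;
}
```

Only the newest batch needs to be inspected under the assumption that batches on a queue complete in submission order.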
* [dpdk-dev] [PATCH v7 13/16] dma/idxd: add burst capacity API 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (11 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 12/16] dma/idxd: add vchan status function Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz ` (2 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add support for the burst capacity API. This API will provide the calling application with the remaining capacity of the current burst (limited by max HW batch size). Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/dma/idxd/idxd_common.c | 21 +++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 1 + drivers/dma/idxd/idxd_pci.c | 1 + 3 files changed, 23 insertions(+) diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index a2edc8a91f..fcad437275 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -468,6 +468,26 @@ idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t return 0; } +uint16_t +idxd_burst_capacity(const void *dev_private, uint16_t vchan __rte_unused) +{ + const struct idxd_dmadev *idxd = dev_private; + uint16_t write_idx = idxd->batch_start + idxd->batch_size; + uint16_t used_space; + + /* Check for space in the batch ring */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return 0; + + /* For descriptors, check for wrap-around on write but not read */ + if (idxd->ids_returned > write_idx) + write_idx += idxd->desc_ring_mask + 1; + 
used_space = write_idx - idxd->ids_returned; + + return RTE_MIN((idxd->desc_ring_mask - used_space), idxd->max_batch_size); +} + int idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, uint32_t conf_sz) @@ -553,6 +573,7 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->fp_obj->submit = idxd_submit; dmadev->fp_obj->completed = idxd_completed; dmadev->fp_obj->completed_status = idxd_completed_status; + dmadev->fp_obj->burst_capacity = idxd_burst_capacity; idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 2b16a358e3..67ee4afc7b 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -103,5 +103,6 @@ int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, enum rte_dma_vchan_status *status); +uint16_t idxd_burst_capacity(const void *dev_private, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 5abf2ad55b..916af296c2 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -254,6 +254,7 @@ init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, idxd->u.pci = pci; idxd->max_batches = wq_size; + idxd->max_batch_size = 1 << lg2_max_batch; /* enable the device itself */ err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
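Note on the capacity calculation in 13/16: the function first returns 0 when the batch ring has no free slot, then works out how many descriptors are in use (un-wrapping the write position when it has wrapped past the read position), and finally caps the free space at the maximum hardware batch size. The sketch below restates that arithmetic outside the driver so it can be tried with small numbers. struct model_ring and model_burst_capacity() are hypothetical names, the field names merely mirror those in the patch, and the model assumes the indices stay within the descriptor ring.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the driver state used by the calculation. */
struct model_ring {
	uint16_t desc_ring_mask;  /* descriptor ring size - 1 (power of two) */
	uint16_t batch_start;     /* first descriptor of the open batch */
	uint16_t batch_size;      /* descriptors already in the open batch */
	uint16_t ids_returned;    /* descriptors already returned to the app */
	uint16_t batch_idx_read;
	uint16_t batch_idx_write;
	uint16_t max_batches;
	uint16_t max_batch_size;
};

static uint16_t
model_burst_capacity(const struct model_ring *r)
{
	uint16_t write_idx = (uint16_t)(r->batch_start + r->batch_size);
	uint16_t used_space;
	int free_space;

	/* ring size must be a power of two for the mask arithmetic */
	assert(((r->desc_ring_mask + 1) & r->desc_ring_mask) == 0);

	/* no free slot in the batch ring */
	if ((r->batch_idx_read == 0 && r->batch_idx_write == r->max_batches) ||
			r->batch_idx_write + 1 == r->batch_idx_read)
		return 0;

	/* un-wrap the write position if it is behind the read position */
	if (r->ids_returned > write_idx)
		write_idx += r->desc_ring_mask + 1;
	used_space = (uint16_t)(write_idx - r->ids_returned);

	/* one descriptor slot is always left unused */
	free_space = r->desc_ring_mask - used_space;
	return (uint16_t)(free_space < r->max_batch_size ?
			free_space : r->max_batch_size);
}
```

For a 64-entry ring with 8 descriptors outstanding, the free space is 55, so the result is 55 with a 64-entry batch limit and 32 with a 32-entry limit.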
* [dpdk-dev] [PATCH v7 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (12 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 13/16] dma/idxd: add burst capacity API Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 15/16] devbind: add dma device class Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Conor Walsh <conor.walsh@intel.com> Move the example script for configuring IDXD devices bound to the IDXD kernel driver from raw to dma, and create a symlink to still allow use from raw. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++++++++++++++++++++++++++ drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +----------------------------- 2 files changed, 118 insertions(+), 117 deletions(-) create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py diff --git a/drivers/dma/idxd/dpdk_idxd_cfg.py b/drivers/dma/idxd/dpdk_idxd_cfg.py new file mode 100755 index 0000000000..fcc27822ef --- /dev/null +++ b/drivers/dma/idxd/dpdk_idxd_cfg.py @@ -0,0 +1,117 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2020 Intel Corporation + +""" +Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use +""" + +import sys +import argparse +import os +import os.path + + +class SysfsDir: + "Used to read/write paths in a sysfs directory" + def __init__(self, path): + self.path = path + + def read_int(self, filename): + "Return a 
value from sysfs file" + with open(os.path.join(self.path, filename)) as f: + return int(f.readline()) + + def write_values(self, values): + "write dictionary, where key is filename and value is value to write" + for filename, contents in values.items(): + with open(os.path.join(self.path, filename), "w") as f: + f.write(str(contents)) + + +def reset_device(dsa_id): + "Reset the DSA device and all its queues" + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) + + +def get_pci_dir(pci): + "Search for the sysfs directory of the PCI device" + base_dir = '/sys/bus/pci/devices/' + for path, dirs, files in os.walk(base_dir): + for dir in dirs: + if pci in dir: + return os.path.join(base_dir, dir) + sys.exit(f"Could not find sysfs directory for device {pci}") + + +def get_dsa_id(pci): + "Get the DSA instance ID using the PCI address of the device" + pci_dir = get_pci_dir(pci) + for path, dirs, files in os.walk(pci_dir): + for dir in dirs: + if dir.startswith('dsa') and 'wq' not in dir: + return int(dir[3:]) + sys.exit(f"Could not get device ID for device {pci}") + + +def configure_dsa(dsa_id, queues, prefix): + "Configure the DSA instance with appropriate number of queues" + dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + + max_groups = dsa_dir.read_int("max_groups") + max_engines = dsa_dir.read_int("max_engines") + max_queues = dsa_dir.read_int("max_work_queues") + max_work_queues_size = dsa_dir.read_int("max_work_queues_size") + + nb_queues = min(queues, max_queues) + if queues > nb_queues: + print(f"Setting number of queues to max supported value: {max_queues}") + + # we want one engine per group, and no more engines than queues + nb_groups = min(max_engines, max_groups, nb_queues) + for grp in range(nb_groups): + dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) + + # configure each queue + for q in range(nb_queues): + wq_dir = 
SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) + wq_dir.write_values({"group_id": q % nb_groups, + "type": "user", + "mode": "dedicated", + "name": f"{prefix}_wq{dsa_id}.{q}", + "priority": 1, + "size": int(max_work_queues_size / nb_queues)}) + + # enable device and then queues + drv_dir.write_values({"bind": f"dsa{dsa_id}"}) + for q in range(nb_queues): + drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) + + +def main(args): + "Main function, does arg parsing and calls config function" + arg_p = argparse.ArgumentParser( + description="Configure whole DSA device instance for DPDK use") + arg_p.add_argument('dsa_id', + help="Specify DSA instance either via DSA instance number or PCI address") + arg_p.add_argument('-q', metavar='queues', type=int, default=255, + help="Number of queues to set up") + arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', + default="dpdk", + help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") + arg_p.add_argument('--reset', action='store_true', + help="Reset DSA device and its queues") + parsed_args = arg_p.parse_args(args[1:]) + + dsa_id = parsed_args.dsa_id + dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id + if parsed_args.reset: + reset_device(dsa_id) + else: + configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) + + +if __name__ == "__main__": + main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py deleted file mode 100755 index fcc27822ef..0000000000 --- a/drivers/raw/ioat/dpdk_idxd_cfg.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/env python3 -# SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2020 Intel Corporation - -""" -Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use -""" - -import sys -import argparse -import os -import os.path - - -class SysfsDir: - "Used to read/write paths in a sysfs directory" - def __init__(self, path): - self.path = path - - def read_int(self, filename): - "Return a 
value from sysfs file" - with open(os.path.join(self.path, filename)) as f: - return int(f.readline()) - - def write_values(self, values): - "write dictionary, where key is filename and value is value to write" - for filename, contents in values.items(): - with open(os.path.join(self.path, filename), "w") as f: - f.write(str(contents)) - - -def reset_device(dsa_id): - "Reset the DSA device and all its queues" - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) - - -def get_pci_dir(pci): - "Search for the sysfs directory of the PCI device" - base_dir = '/sys/bus/pci/devices/' - for path, dirs, files in os.walk(base_dir): - for dir in dirs: - if pci in dir: - return os.path.join(base_dir, dir) - sys.exit(f"Could not find sysfs directory for device {pci}") - - -def get_dsa_id(pci): - "Get the DSA instance ID using the PCI address of the device" - pci_dir = get_pci_dir(pci) - for path, dirs, files in os.walk(pci_dir): - for dir in dirs: - if dir.startswith('dsa') and 'wq' not in dir: - return int(dir[3:]) - sys.exit(f"Could not get device ID for device {pci}") - - -def configure_dsa(dsa_id, queues, prefix): - "Configure the DSA instance with appropriate number of queues" - dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - - max_groups = dsa_dir.read_int("max_groups") - max_engines = dsa_dir.read_int("max_engines") - max_queues = dsa_dir.read_int("max_work_queues") - max_work_queues_size = dsa_dir.read_int("max_work_queues_size") - - nb_queues = min(queues, max_queues) - if queues > nb_queues: - print(f"Setting number of queues to max supported value: {max_queues}") - - # we want one engine per group, and no more engines than queues - nb_groups = min(max_engines, max_groups, nb_queues) - for grp in range(nb_groups): - dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) - - # configure each queue - for q in range(nb_queues): - wq_dir = 
SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) - wq_dir.write_values({"group_id": q % nb_groups, - "type": "user", - "mode": "dedicated", - "name": f"{prefix}_wq{dsa_id}.{q}", - "priority": 1, - "size": int(max_work_queues_size / nb_queues)}) - - # enable device and then queues - drv_dir.write_values({"bind": f"dsa{dsa_id}"}) - for q in range(nb_queues): - drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) - - -def main(args): - "Main function, does arg parsing and calls config function" - arg_p = argparse.ArgumentParser( - description="Configure whole DSA device instance for DPDK use") - arg_p.add_argument('dsa_id', - help="Specify DSA instance either via DSA instance number or PCI address") - arg_p.add_argument('-q', metavar='queues', type=int, default=255, - help="Number of queues to set up") - arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', - default="dpdk", - help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") - arg_p.add_argument('--reset', action='store_true', - help="Reset DSA device and its queues") - parsed_args = arg_p.parse_args(args[1:]) - - dsa_id = parsed_args.dsa_id - dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id - if parsed_args.reset: - reset_device(dsa_id) - else: - configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) - - -if __name__ == "__main__": - main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py new file mode 120000 index 0000000000..85545548d1 --- /dev/null +++ b/drivers/raw/ioat/dpdk_idxd_cfg.py @@ -0,0 +1 @@ +../../dma/idxd/dpdk_idxd_cfg.py \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
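Note on the sizing done by configure_dsa() in the script moved by 14/16: the requested queue count is clamped to max_work_queues, the number of groups is the smallest of max_engines, max_groups and the queue count (one engine per group), queues are spread over the groups round-robin, and each queue gets an equal share of max_work_queues_size. The sketch below restates only that arithmetic; it is written in C to match the driver code in this series rather than the script's Python. struct model_dsa_caps, struct model_dsa_plan, model_plan() and model_group_for_queue() are hypothetical names, and any capability values fed to it are illustrative, not values read from real hardware.

```c
#include <assert.h>

/* Hypothetical copy of the limits the script reads from sysfs. */
struct model_dsa_caps {
	int max_groups;
	int max_engines;
	int max_queues;
	int max_work_queues_size;
};

/* Result of the sizing step. */
struct model_dsa_plan {
	int nb_queues;
	int nb_groups;
	int wq_size;
};

static int
min_int(int a, int b)
{
	return a < b ? a : b;
}

static struct model_dsa_plan
model_plan(const struct model_dsa_caps *caps, int queues)
{
	struct model_dsa_plan p;

	assert(queues > 0);
	assert(caps->max_queues > 0 && caps->max_groups > 0 && caps->max_engines > 0);

	/* never configure more queues than the device offers */
	p.nb_queues = min_int(queues, caps->max_queues);
	/* one engine per group, and no more groups than queues */
	p.nb_groups = min_int(min_int(caps->max_engines, caps->max_groups),
			p.nb_queues);
	/* split the work-queue space evenly (integer division) */
	p.wq_size = caps->max_work_queues_size / p.nb_queues;
	return p;
}

/* Queues are assigned to groups round-robin. */
static int
model_group_for_queue(const struct model_dsa_plan *p, int q)
{
	return q % p->nb_groups;
}
```

With the script's default request of 255 queues, the clamp to max_work_queues is what decides the final queue count.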
* [dpdk-dev] [PATCH v7 15/16] devbind: add dma device class 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (13 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add a new class for DMA devices. Devices listed under the DMA class are to be used with the dmadev library. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- usertools/dpdk-devbind.py | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 5f0e817055..da89b87816 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -71,6 +71,7 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] +dma_devices = [] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] @@ -585,6 +586,9 @@ def show_status(): if status_dev in ["crypto", "all"]: show_device_status(crypto_devices, "Crypto") + if status_dev in ["dma", "all"]: + show_device_status(dma_devices, "DMA") + if status_dev in ["event", "all"]: show_device_status(eventdev_devices, "Eventdev") @@ -653,7 +657,7 @@ def parse_args(): parser.add_argument( '--status-dev', help="Print the status of given device group.", - choices=['baseband', 'compress', 'crypto', 
'event', + choices=['baseband', 'compress', 'crypto', 'dma', 'event', 'mempool', 'misc', 'net', 'regex']) bind_group = parser.add_mutually_exclusive_group() bind_group.add_argument( @@ -734,6 +738,7 @@ def do_arg_actions(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) @@ -756,6 +761,7 @@ def main(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v7 16/16] devbind: move idxd device ID to dmadev class 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz ` (14 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 15/16] devbind: add dma device class Kevin Laatz @ 2021-10-13 16:30 ` Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-13 16:30 UTC (permalink / raw) To: dev; +Cc: bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz The dmadev library is the preferred abstraction for using IDXD devices and will replace the rawdev implementation in future. This patch moves the IDXD device ID to the dmadev class. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- usertools/dpdk-devbind.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index da89b87816..ba18e2a487 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -71,13 +71,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] -dma_devices = [] +dma_devices = [intel_idxd_spr] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] misc_devices = [cnxk_bphy, cnxk_bphy_cgx, cnxk_inl_dev, intel_ioat_bdw, - intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, + intel_ioat_skx, intel_ioat_icx, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (17 preceding siblings ...) 2021-10-13 16:30 ` [dpdk-dev] [PATCH v7 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 01/16] raw/ioat: only build if dmadev not present Kevin Laatz ` (15 more replies) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (2 subsequent siblings) 21 siblings, 16 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz This patchset adds a dmadev driver and associated documentation to support Intel Data Streaming Accelerator devices. This driver is intended to ultimately replace the current IDXD part of the IOAT rawdev driver. v8: * fix compilation issues of individual patches v7: * rebase on above patchsets * add meson reason for rawdev build v6: * set state of device during create * add dev_close function * documentation updates - moved generic pieces from driver doc to lib doc * other small miscellaneous fixes based on rebasing and ML feedback v5: * add missing toctree entry for idxd driver v4: * rebased on above patchsets * minor fixes based on review feedback v3: * rebased on above patchsets * added burst capacity API v2: * rebased on above patchsets * added API to check for device being idle * added devbind updates for DMA devices * fixed issue identified by internal coverity scan Bruce Richardson (1): raw/ioat: only build if dmadev not present Conor Walsh (1): dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz (14): dma/idxd: add skeleton for VFIO based DSA device dma/idxd: add bus device probing dma/idxd: create dmadev instances on bus probe dma/idxd: create dmadev instances on pci probe dma/idxd: add datapath structures dma/idxd: 
add configure and info_get functions dma/idxd: add start and stop functions for pci devices dma/idxd: add data-path job submission functions dma/idxd: add data-path job completion functions dma/idxd: add operation statistic tracking dma/idxd: add vchan status function dma/idxd: add burst capacity API devbind: add dma device class devbind: move idxd device ID to dmadev class MAINTAINERS | 10 + doc/guides/dmadevs/idxd.rst | 179 ++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/prog_guide/dmadev.rst | 30 ++ doc/guides/rawdevs/ioat.rst | 8 + doc/guides/rel_notes/release_21_11.rst | 5 + drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++ drivers/dma/idxd/idxd_bus.c | 377 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 612 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 131 ++++++ drivers/dma/idxd/idxd_internal.h | 109 +++++ drivers/dma/idxd/idxd_pci.c | 368 +++++++++++++++ drivers/dma/idxd/meson.build | 14 + drivers/dma/idxd/version.map | 3 + drivers/dma/meson.build | 2 + drivers/meson.build | 4 +- drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +---- drivers/raw/ioat/meson.build | 24 +- lib/dmadev/rte_dmadev.h | 1 + usertools/dpdk-devbind.py | 10 +- 20 files changed, 2000 insertions(+), 124 deletions(-) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py create mode 100644 drivers/dma/idxd/idxd_bus.c create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_hw_defs.h create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v8 01/16] raw/ioat: only build if dmadev not present 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz ` (14 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Bruce Richardson <bruce.richardson@intel.com> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not present. This change requires the dependencies to be reordered in drivers/meson.build so that rawdev can use the "RTE_DMA_*" build macros to check for the presence of the equivalent dmadev driver. A note is also added to the documentation to inform users of this change. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/rawdevs/ioat.rst | 8 ++++++++ drivers/meson.build | 4 ++-- drivers/raw/ioat/meson.build | 24 +++++++++++++++++++++--- 3 files changed, 31 insertions(+), 5 deletions(-) diff --git a/doc/guides/rawdevs/ioat.rst b/doc/guides/rawdevs/ioat.rst index a28e909935..a65530bd30 100644 --- a/doc/guides/rawdevs/ioat.rst +++ b/doc/guides/rawdevs/ioat.rst @@ -34,6 +34,14 @@ Compilation For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based. No additional compilation steps are necessary. +.. note:: + Since the addition of the dmadev library, the ``ioat`` and ``idxd`` parts of this driver + will only be built if their ``dmadev`` counterparts are not built.
+ The following can be used to disable the ``dmadev`` drivers, + if the raw drivers are to be used instead:: + + $ meson -Ddisable_drivers=dma/* <build_dir> + Device Setup ------------- diff --git a/drivers/meson.build b/drivers/meson.build index b7d680868a..34c0276487 100644 --- a/drivers/meson.build +++ b/drivers/meson.build @@ -10,15 +10,15 @@ subdirs = [ 'common/qat', # depends on bus. 'common/sfc_efx', # depends on bus. 'mempool', # depends on common and bus. + 'dma', # depends on common and bus. 'net', # depends on common, bus, mempool - 'raw', # depends on common, bus and net. + 'raw', # depends on common, bus, dma and net. 'crypto', # depends on common, bus and mempool (net in future). 'compress', # depends on common, bus, mempool. 'regex', # depends on common, bus, regexdev. 'vdpa', # depends on common, bus and mempool. 'event', # depends on common, bus, mempool and net. 'baseband', # depends on common and bus. - 'dma', # depends on common and bus. ] if meson.is_cross_build() diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build index 0e81cb5951..1b866aab74 100644 --- a/drivers/raw/ioat/meson.build +++ b/drivers/raw/ioat/meson.build @@ -2,14 +2,32 @@ # Copyright 2019 Intel Corporation build = dpdk_conf.has('RTE_ARCH_X86') +# only use ioat rawdev driver if we don't have the equivalent dmadev ones +if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') + build = false + reason = 'replaced by dmadev drivers' + subdir_done() +endif + reason = 'only supported on x86' sources = files( - 'idxd_bus.c', - 'idxd_pci.c', 'ioat_common.c', - 'ioat_rawdev.c', 'ioat_rawdev_test.c', ) + +if not dpdk_conf.has('RTE_DMA_IDXD') + sources += files( + 'idxd_bus.c', + 'idxd_pci.c', + ) +endif + +if not dpdk_conf.has('RTE_DMA_IOAT') + sources += files ( + 'ioat_rawdev.c', + ) +endif + deps += ['bus_pci', 'mbuf', 'rawdev'] headers = files( 'rte_ioat_rawdev.h', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v8 02/16] dma/idxd: add skeleton for VFIO based DSA device 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 01/16] raw/ioat: only build if dmadev not present Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 03/16] dma/idxd: add bus device probing Kevin Laatz ` (13 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probe/remove skeleton code for DSA device bound to the vfio pci driver. Relevant documentation and MAINTAINERS update also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v8: fix compile issue --- MAINTAINERS | 10 +++++ doc/guides/dmadevs/idxd.rst | 58 ++++++++++++++++++++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/rel_notes/release_21_11.rst | 5 +++ drivers/dma/idxd/idxd_common.c | 11 +++++ drivers/dma/idxd/idxd_internal.h | 27 ++++++++++++ drivers/dma/idxd/idxd_pci.c | 55 ++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 12 ++++++ drivers/dma/idxd/version.map | 3 ++ drivers/dma/meson.build | 2 + 10 files changed, 185 insertions(+) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 5387ffd4fc..423d8a73ce 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1200,6 +1200,16 @@ F: doc/guides/compressdevs/zlib.rst F: doc/guides/compressdevs/features/zlib.ini +DMAdev Drivers +-------------- + +Intel IDXD - EXPERIMENTAL +M: 
Bruce Richardson <bruce.richardson@intel.com> +M: Kevin Laatz <kevin.laatz@intel.com> +F: drivers/dma/idxd/ +F: doc/guides/dmadevs/idxd.rst + + RegEx Drivers ------------- diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst new file mode 100644 index 0000000000..924700d17e --- /dev/null +++ b/doc/guides/dmadevs/idxd.rst @@ -0,0 +1,58 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +.. include:: <isonum.txt> + +IDXD DMA Device Driver +====================== + +The ``idxd`` dmadev driver provides a poll-mode driver (PMD) for Intel\ |reg| +Data Streaming Accelerator `(Intel DSA) +<https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html>`_. +This PMD can be used in conjunction with Intel\ |reg| DSA devices to offload +data operations, such as data copies, to hardware, freeing up CPU cycles for +other tasks. + +Hardware Requirements +---------------------- + +The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the +presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma`` +will show all the DMA devices on the system, including IDXD supported devices. +Intel\ |reg| DSA devices, are currently (at time of writing) appearing +as devices with type “0b25”, due to the absence of pci-id database entries for +them at this point. + +Compilation +------------ + +For builds using ``meson`` and ``ninja``, the driver will be built when the +target platform is x86-based. No additional compilation steps are necessary. + +Device Setup +------------- + +Devices using VFIO/UIO drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The HW devices to be used will need to be bound to a user-space IO driver for use. +The ``dpdk-devbind.py`` script can be used to view the state of the devices +and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``. 
+For example:: + + $ dpdk-devbind.py -b vfio-pci 6a:01.0 + +Device Probing and Initialization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will +be found as part of the device scan done at application initialization time without +the need to pass parameters to the application. + +For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the +maximum number of workqueues available on it, partitioning all resources equally +among the queues. +If fewer workqueues are required, then the ``max_queues`` parameter may be passed to +the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: + + $ dpdk-test -a <b:d:f>,max_queues=4 diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst index 0bce29d766..5d4abf880e 100644 --- a/doc/guides/dmadevs/index.rst +++ b/doc/guides/dmadevs/index.rst @@ -10,3 +10,5 @@ an application through DMA API. .. toctree:: :maxdepth: 2 :numbered: + + idxd diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index d5435a64aa..f8678efa94 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -75,6 +75,11 @@ New Features operations. * Added multi-process support. +* **Added IDXD dmadev driver implementation.** + + The IDXD dmadev driver provides device drivers for the Intel DSA devices. + This device driver can be used through the generic dmadev API. 
+ * **Added new RSS offload types for IPv4/L4 checksum in RSS flow.** Added macros ETH_RSS_IPV4_CHKSUM and ETH_RSS_L4_CHKSUM, now IPv4 and diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c new file mode 100644 index 0000000000..e00ddbe5ef --- /dev/null +++ b/drivers/dma/idxd/idxd_common.c @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#include <rte_log.h> + +#include "idxd_internal.h" + +int idxd_pmd_logtype; + +RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h new file mode 100644 index 0000000000..c6a7dcd72f --- /dev/null +++ b/drivers/dma/idxd/idxd_internal.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_INTERNAL_H_ +#define _IDXD_INTERNAL_H_ + +/** + * @file idxd_internal.h + * + * Internal data structures for the idxd/DSA driver for dmadev + * + * @warning + * @b EXPERIMENTAL: these structures and APIs may change without prior notice + */ + +extern int idxd_pmd_logtype; + +#define IDXD_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \ + idxd_pmd_logtype, "IDXD: %s(): " fmt "\n", __func__, ##args) + +#define IDXD_PMD_DEBUG(fmt, args...) IDXD_PMD_LOG(DEBUG, fmt, ## args) +#define IDXD_PMD_INFO(fmt, args...) IDXD_PMD_LOG(INFO, fmt, ## args) +#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) +#define IDXD_PMD_WARN(fmt, args...) 
IDXD_PMD_LOG(WARNING, fmt, ## args) + +#endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c new file mode 100644 index 0000000000..79e4aadcab --- /dev/null +++ b/drivers/dma/idxd/idxd_pci.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <rte_bus_pci.h> + +#include "idxd_internal.h" + +#define IDXD_VENDOR_ID 0x8086 +#define IDXD_DEVICE_ID_SPR 0x0B25 + +#define IDXD_PMD_DMADEV_NAME_PCI dmadev_idxd_pci + +const struct rte_pci_id pci_id_idxd_map[] = { + { RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static int +idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + int ret = 0; + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + dev->device.driver = &drv->driver; + + return ret; +} + +static int +idxd_dmadev_remove_pci(struct rte_pci_device *dev) +{ + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + + IDXD_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + + return 0; +} + +struct rte_pci_driver idxd_pmd_drv_pci = { + .id_table = pci_id_idxd_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = idxd_dmadev_probe_pci, + .remove = idxd_dmadev_remove_pci, +}; + +RTE_PMD_REGISTER_PCI(IDXD_PMD_DMADEV_NAME_PCI, idxd_pmd_drv_pci); +RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_DMADEV_NAME_PCI, pci_id_idxd_map); +RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_DMADEV_NAME_PCI, "vfio-pci"); +RTE_PMD_REGISTER_PARAM_STRING(dmadev_idxd_pci, "max_queues=0"); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build new file mode 100644 index 0000000000..ec5bed7636 --- /dev/null +++ b/drivers/dma/idxd/meson.build @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2021 Intel Corporation + +if 
is_windows + subdir_done() +endif + +deps += ['bus_pci'] +sources = files( + 'idxd_common.c', + 'idxd_pci.c' +) diff --git a/drivers/dma/idxd/version.map b/drivers/dma/idxd/version.map new file mode 100644 index 0000000000..4a76d1d52d --- /dev/null +++ b/drivers/dma/idxd/version.map @@ -0,0 +1,3 @@ +DPDK_21 { + local: *; +}; diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build index d9c7ede32f..411be7a240 100644 --- a/drivers/dma/meson.build +++ b/drivers/dma/meson.build @@ -2,5 +2,7 @@ # Copyright 2021 HiSilicon Limited drivers = [ + 'idxd', 'skeleton', ] +std_deps = ['dmadev'] -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
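Note on the ``max_queues`` device argument documented in the patch above: the skeleton only registers the parameter string and does not yet parse it. The following is a minimal sketch of how such an argument string could be parsed, assuming the argument arrives as a single "max_queues=N" string. The helper name parse_max_queues() is hypothetical, and this is not presented as the parsing the driver itself uses.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): parse a devargs string of
 * the form "max_queues=N".
 * Returns 0 and stores N in *max_queues on success, -1 otherwise.
 */
static int
parse_max_queues(const char *args, unsigned int *max_queues)
{
	unsigned int val;

	/* sscanf returns the number of converted fields; exactly one is expected */
	if (args == NULL || sscanf(args, "max_queues=%u", &val) != 1)
		return -1;

	*max_queues = val;
	return 0;
}
```

With the command line from the documentation hunk, the string after the device address would be "max_queues=4". A caller would then still need to clamp the value to what the hardware reports, which this sketch leaves out.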
* [dpdk-dev] [PATCH v8 03/16] dma/idxd: add bus device probing 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 01/16] raw/ioat: only build if dmadev not present Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz ` (12 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probing for DSA devices bound to the IDXD kernel driver. These devices can be configured via sysfs and made available to DPDK if they are found during bus scan. Relevant documentation is included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 ++++++ drivers/dma/idxd/idxd_bus.c | 351 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 34 +++ drivers/dma/idxd/meson.build | 1 + lib/dmadev/rte_dmadev.h | 1 + 5 files changed, 451 insertions(+) create mode 100644 drivers/dma/idxd/idxd_bus.c diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 924700d17e..ce33e2857a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -32,6 +32,56 @@ target platform is x86-based. No additional compilation steps are necessary. Device Setup ------------- +Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers, +such as ``vfio-pci``. Both are supported by the IDXD PMD. 
+ +Intel\ |reg| DSA devices using IDXD kernel driver +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured. +The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration. + +.. note:: + The device configuration can also be done by directly interacting with the sysfs nodes. + An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py`` + included in the driver source directory. + +There are some mandatory configuration steps before being able to use a device with an application. +The internal engines, which do the copies or other operations, +and the work-queues, which are used by applications to assign work to the device, +need to be assigned to groups, and the various other configuration options, +such as priority or queue depth, need to be set for each queue. + +To assign an engine to a group:: + + $ accel-config config-engine dsa0/engine0.0 --group-id=0 + $ accel-config config-engine dsa0/engine0.1 --group-id=1 + +To assign work queues to groups for passing descriptors to the engines a similar accel-config command can be used. +However, the work queues also need to be configured depending on the use case. +Some configuration options include: + +* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously. +* priority: WQ priority between 1 and 15. Larger value means higher priority. +* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device. +* type: WQ type (kernel/mdev/user). Determines how the device is presented. +* name: identifier given to the WQ. 
+ +Example configuration for a work queue:: + + $ accel-config config-wq dsa0/wq0.0 --group-id=0 \ + --mode=dedicated --priority=10 --wq-size=8 \ + --type=user --name=dpdk_app1 + +Once the devices have been configured, they need to be enabled:: + + $ accel-config enable-device dsa0 + $ accel-config enable-wq dsa0/wq0.0 + +Check the device configuration:: + + $ accel-config list + Devices using VFIO/UIO drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -56,3 +106,17 @@ If fewer workqueues are required, then the ``max_queues`` parameter may be passe the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: $ dpdk-test -a <b:d:f>,max_queues=4 + +For devices bound to the IDXD kernel driver, +the DPDK IDXD driver will automatically perform a scan for available workqueues +to use. Any workqueues found listed in ``/dev/dsa`` on the system will be checked +in ``/sys``, and any which have ``dpdk_`` prefix in their name will be automatically +probed by the driver to make them available to the application. +Alternatively, to support use by multiple DPDK processes simultaneously, +the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue +name prefix, instead of ``dpdk_``, allowing each DPDK application instance to only +use a subset of configured queues. + +Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, +that is a "DMA device type" inside DPDK, and can be accessed using APIs from the +``rte_dmadev`` library. 
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c new file mode 100644 index 0000000000..ef589af30e --- /dev/null +++ b/drivers/dma/idxd/idxd_bus.c @@ -0,0 +1,351 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <dirent.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/mman.h> +#include <libgen.h> + +#include <rte_bus.h> +#include <rte_log.h> +#include <rte_dmadev_pmd.h> +#include <rte_string_fns.h> + +#include "idxd_internal.h" + +/* default value for DSA paths, but allow override in environment for testing */ +#define DSA_DEV_PATH "/dev/dsa" +#define DSA_SYSFS_PATH "/sys/bus/dsa/devices" + +static unsigned int devcount; + +/** unique identifier for a DSA device/WQ instance */ +struct dsa_wq_addr { + uint16_t device_id; + uint16_t wq_id; +}; + +/** a DSA device instance */ +struct rte_dsa_device { + struct rte_device device; /**< Inherit core device */ + TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */ + + char wq_name[32]; /**< the workqueue name/number e.g. 
wq0.1 */ + struct dsa_wq_addr addr; /**< Identifies the specific WQ */ +}; + +/* forward prototypes */ +struct dsa_bus; +static int dsa_scan(void); +static int dsa_probe(void); +static struct rte_device *dsa_find_device(const struct rte_device *start, + rte_dev_cmp_t cmp, const void *data); +static enum rte_iova_mode dsa_get_iommu_class(void); +static int dsa_addr_parse(const char *name, void *addr); + +/** List of devices */ +TAILQ_HEAD(dsa_device_list, rte_dsa_device); + +/** + * Structure describing the DSA bus + */ +struct dsa_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_driver driver; /**< Driver struct for devices to point to */ + struct dsa_device_list device_list; /**< List of PCI devices */ +}; + +struct dsa_bus dsa_bus = { + .bus = { + .scan = dsa_scan, + .probe = dsa_probe, + .find_device = dsa_find_device, + .get_iommu_class = dsa_get_iommu_class, + .parse = dsa_addr_parse, + }, + .driver = { + .name = "dmadev_idxd" + }, + .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list), +}; + +static inline const char * +dsa_get_dev_path(void) +{ + const char *path = getenv("DSA_DEV_PATH"); + return path ? path : DSA_DEV_PATH; +} + +static inline const char * +dsa_get_sysfs_path(void) +{ + const char *path = getenv("DSA_SYSFS_PATH"); + return path ? 
path : DSA_SYSFS_PATH; +} + +static void * +idxd_bus_mmap_wq(struct rte_dsa_device *dev) +{ + void *addr; + char path[PATH_MAX]; + int fd; + + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); + fd = open(path, O_RDWR); + if (fd < 0) { + IDXD_PMD_ERR("Failed to open device path: %s", path); + return NULL; + } + + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + IDXD_PMD_ERR("Failed to mmap device %s", path); + return NULL; + } + + return addr; +} + +static int +read_wq_string(struct rte_dsa_device *dev, const char *filename, + char *value, size_t valuelen) +{ + char sysfs_node[PATH_MAX]; + int len; + int fd; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + fd = open(sysfs_node, O_RDONLY); + if (fd < 0) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + len = read(fd, value, valuelen - 1); + close(fd); + if (len < 0) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + value[len] = '\0'; + return 0; +} + +static int +read_wq_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +read_device_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s", + 
dsa_get_sysfs_path(), dev->addr.device_id, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +idxd_probe_dsa(struct rte_dsa_device *dev) +{ + struct idxd_dmadev idxd = {0}; + int ret = 0; + + IDXD_PMD_INFO("Probing device %s on numa node %d", + dev->wq_name, dev->device.numa_node); + if (read_wq_int(dev, "size", &ret) < 0) + return -1; + idxd.max_batches = ret; + if (read_wq_int(dev, "max_batch_size", &ret) < 0) + return -1; + idxd.max_batch_size = ret; + idxd.qid = dev->addr.wq_id; + idxd.sva_support = 1; + + idxd.portal = idxd_bus_mmap_wq(dev); + if (idxd.portal == NULL) { + IDXD_PMD_ERR("WQ mmap failed"); + return -ENOENT; + } + + return 0; +} + +static int +is_for_this_process_use(const char *name) +{ + char *runtime_dir = strdup(rte_eal_get_runtime_dir()); + char *prefix = basename(runtime_dir); + int prefixlen = strlen(prefix); + int retval = 0; + + if (strncmp(name, "dpdk_", 5) == 0) + retval = 1; + if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_') + retval = 1; + + free(runtime_dir); + return retval; +} + +static int +dsa_probe(void) +{ + struct rte_dsa_device *dev; + + TAILQ_FOREACH(dev, &dsa_bus.device_list, next) { + char type[64], name[64]; + + if (read_wq_string(dev, "type", type, sizeof(type)) < 0 || + read_wq_string(dev, "name", name, sizeof(name)) < 0) + continue; + + if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) { + dev->device.driver = &dsa_bus.driver; + idxd_probe_dsa(dev); + continue; + } + IDXD_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name); + } + + return 0; +} + +static int +dsa_scan(void) +{ + const char *path = dsa_get_dev_path(); + struct dirent *wq; + DIR *dev_dir; + + dev_dir 
= opendir(path); + if (dev_dir == NULL) { + if (errno == ENOENT) + return 0; /* no bus, return without error */ + IDXD_PMD_ERR("%s(): opendir '%s' failed: %s", + __func__, path, strerror(errno)); + return -1; + } + + while ((wq = readdir(dev_dir)) != NULL) { + struct rte_dsa_device *dev; + int numa_node = -1; + + if (strncmp(wq->d_name, "wq", 2) != 0) + continue; + if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) { + IDXD_PMD_ERR("%s(): wq name too long: '%s', skipping", + __func__, wq->d_name); + continue; + } + IDXD_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name); + + dev = malloc(sizeof(*dev)); + if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) { + IDXD_PMD_ERR("Error parsing WQ name: %s", wq->d_name); + free(dev); + continue; + } + dev->device.bus = &dsa_bus.bus; + strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name)); + TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next); + devcount++; + + read_device_int(dev, "numa_node", &numa_node); + dev->device.numa_node = numa_node; + dev->device.name = dev->wq_name; + } + + closedir(dev_dir); + return 0; +} + +static struct rte_device * +dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list); + + /* the rte_device struct must be at start of dsa structure */ + RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0); + + if (start != NULL) /* jump to start point if given */ + dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next); + while (dev != NULL) { + if (cmp(&dev->device, data) == 0) + return &dev->device; + dev = TAILQ_NEXT(dev, next); + } + return NULL; +} + +static enum rte_iova_mode +dsa_get_iommu_class(void) +{ + /* if there are no devices, report don't care, otherwise VA mode */ + return devcount > 0 ? 
RTE_IOVA_VA : RTE_IOVA_DC; +} + +static int +dsa_addr_parse(const char *name, void *addr) +{ + struct dsa_wq_addr *wq = addr; + unsigned int device_id, wq_id; + + if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) { + IDXD_PMD_DEBUG("Parsing WQ name failed: %s", name); + return -1; + } + + wq->device_id = device_id; + wq->wq_id = wq_id; + return 0; +} + +RTE_REGISTER_BUS(dsa, dsa_bus.bus); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index c6a7dcd72f..b8a7d7dab6 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -24,4 +24,38 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_dmadev { + /* counters to track the batches */ + unsigned short max_batches; + unsigned short batch_idx_read; + unsigned short batch_idx_write; + + /* track descriptors and handles */ + unsigned short desc_ring_mask; + unsigned short ids_avail; /* handles for ops completed */ + unsigned short ids_returned; /* the read pointer for hdls/desc rings */ + unsigned short batch_start; /* start+size == write pointer for hdls/desc */ + unsigned short batch_size; + + void *portal; /* address to write the batch descriptor */ + + struct idxd_completion *batch_comp_ring; + unsigned short *batch_idx_ring; /* store where each batch ends */ + + rte_iova_t batch_iova; /* base address of the batch comp ring */ + rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ + + unsigned short max_batch_size; + + struct rte_dma_dev *dmadev; + uint8_t sva_support; + uint8_t qid; + + union { + struct { + unsigned int dsa_id; + } bus; + } u; +}; + #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index ec5bed7636..da5dc2b019 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -7,6 +7,7 @@ endif deps += ['bus_pci'] 
sources = files( + 'idxd_bus.c', 'idxd_common.c', 'idxd_pci.c' ) diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h index f5d23017b1..7394f0932b 100644 --- a/lib/dmadev/rte_dmadev.h +++ b/lib/dmadev/rte_dmadev.h @@ -149,6 +149,7 @@ #include <rte_bitops.h> #include <rte_common.h> #include <rte_compat.h> +#include <rte_eal.h> #include <rte_dev.h> #ifdef __cplusplus -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
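Note on the workqueue name matching described in the patch above: a workqueue is picked up when its configured name starts with "dpdk_" or with the process file prefix followed by an underscore. The following is a standalone sketch of that rule, taking the prefix as a plain argument instead of deriving it from the EAL runtime directory as the patch does. The helper name wq_name_matches() is hypothetical, and the check for an empty prefix is an addition that the patch does not have.

```c
#include <string.h>

/*
 * Hypothetical standalone version of the name check: returns 1 if the
 * workqueue name is intended for a process using the given file prefix,
 * 0 otherwise.
 */
static int
wq_name_matches(const char *wq_name, const char *file_prefix)
{
	size_t prefixlen = strlen(file_prefix);

	/* generic prefix: usable by any DPDK process */
	if (strncmp(wq_name, "dpdk_", 5) == 0)
		return 1;

	/*
	 * per-process prefix: "<file_prefix>_...". If strncmp() matches,
	 * wq_name holds at least prefixlen characters, so the index is valid.
	 * The empty-prefix guard is an addition in this sketch.
	 */
	if (prefixlen > 0 && strncmp(wq_name, file_prefix, prefixlen) == 0 &&
			wq_name[prefixlen] == '_')
		return 1;

	return 0;
}
```

For example, a queue configured with --name=dpdk_app1 matches for every process, while a queue named app2_q0 matches only a process started with --file-prefix=app2.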
* [dpdk-dev] [PATCH v8 04/16] dma/idxd: create dmadev instances on bus probe 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (2 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 03/16] dma/idxd: add bus device probing Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz ` (11 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the bus scan/probe, create a dmadev instance for each HW queue. Internal structures required for device creation are also added. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 19 ++++++++++ drivers/dma/idxd/idxd_common.c | 61 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 27 ++++++++++++++ drivers/dma/idxd/idxd_internal.h | 7 ++++ 4 files changed, 114 insertions(+) create mode 100644 drivers/dma/idxd/idxd_hw_defs.h diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index ef589af30e..b48fa954ed 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -85,6 +85,18 @@ dsa_get_sysfs_path(void) return path ? 
 path : DSA_SYSFS_PATH; } +static int +idxd_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->data->dev_private; + munmap(idxd->portal, 0x1000); + return 0; +} + +static const struct rte_dma_dev_ops idxd_bus_ops = { + .dev_close = idxd_dev_close, +}; + static void * idxd_bus_mmap_wq(struct rte_dsa_device *dev) { @@ -206,6 +218,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -1; idxd.max_batch_size = ret; idxd.qid = dev->addr.wq_id; + idxd.u.bus.dsa_id = dev->addr.device_id; idxd.sva_support = 1; idxd.portal = idxd_bus_mmap_wq(dev); @@ -214,6 +227,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -ENOENT; } + ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_bus_ops); + if (ret) { + IDXD_PMD_ERR("Failed to create dmadev %s", dev->wq_name); + return ret; + } + return 0; } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index e00ddbe5ef..5abff34292 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,10 +2,71 @@ * Copyright 2021 Intel Corporation */ +#include <rte_malloc.h> +#include <rte_common.h> #include <rte_log.h> #include "idxd_internal.h" +#define IDXD_PMD_NAME_STR "dmadev_idxd" + +int +idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, + const struct rte_dma_dev_ops *ops) +{ + struct idxd_dmadev *idxd = NULL; + struct rte_dma_dev *dmadev = NULL; + int ret = 0; + + if (!name) { + IDXD_PMD_ERR("Invalid name of the device!"); + ret = -EINVAL; + goto cleanup; + } + + /* Allocate device structure */ + dmadev = rte_dma_pmd_allocate(name, dev->numa_node, sizeof(struct idxd_dmadev)); + if (dmadev == NULL) { + IDXD_PMD_ERR("Unable to allocate DMA device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->dev_ops = ops; + dmadev->device = dev; + + idxd = dmadev->data->dev_private; + *idxd = *base_idxd; /* copy over the main fields already passed in */ + idxd->dmadev = dmadev; + + /* allocate batch index ring and 
completion ring. + * The +1 is because we can never fully use + * the ring, otherwise read == write means both full and empty. + */ + idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) + + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1), + sizeof(idxd->batch_comp_ring[0])); + if (idxd->batch_comp_ring == NULL) { + IDXD_PMD_ERR("Unable to reserve memory for batch data\n"); + ret = -ENOMEM; + goto cleanup; + } + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1]; + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring); + + dmadev->fp_obj->dev_private = idxd; + + idxd->dmadev->state = RTE_DMA_DEV_READY; + + return 0; + +cleanup: + if (dmadev) + rte_dma_pmd_release(name); + + return ret; +} + int idxd_pmd_logtype; RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h new file mode 100644 index 0000000000..a92d462d01 --- /dev/null +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_HW_DEFS_H_ +#define _IDXD_HW_DEFS_H_ + +#define IDXD_COMP_STATUS_INCOMPLETE 0 +#define IDXD_COMP_STATUS_SUCCESS 1 +#define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 +#define IDXD_COMP_STATUS_INVALID_SIZE 0x13 +#define IDXD_COMP_STATUS_SKIPPED 0xFF /* not official IDXD error, needed as placeholder */ + +/** + * Completion record structure written back by DSA + */ +struct idxd_completion { + uint8_t status; + uint8_t result; + /* 16-bits pad here */ + uint32_t completed_size; /* data length, or descriptors for batch */ + + rte_iova_t fault_address; + uint32_t invalid_flags; +} __rte_aligned(32); + +#endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index b8a7d7dab6..8f1cdf6102 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -5,6 +5,10 @@ #ifndef _IDXD_INTERNAL_H_ #define 
_IDXD_INTERNAL_H_ +#include <rte_dmadev_pmd.h> + +#include "idxd_hw_defs.h" + /** * @file idxd_internal.h * @@ -58,4 +62,7 @@ struct idxd_dmadev { } u; }; +int idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); + #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
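Note on the "+1" in the ring allocation in the patch above: the comment in idxd_dmadev_create() says the ring can never be fully used, since otherwise read == write would mean both full and empty. The following is a standalone sketch of that convention using only indexes, with no descriptor storage. The struct and function names are hypothetical and are not the driver's data path.

```c
#include <stdint.h>

/* Hypothetical index-only ring: one slot is always left unused. */
struct batch_ring {
	uint16_t nslots;	/* max_batches + 1 */
	uint16_t read;		/* next slot to complete */
	uint16_t write;		/* next slot to fill */
};

static void
ring_init(struct batch_ring *r, uint16_t max_batches)
{
	r->nslots = (uint16_t)(max_batches + 1);
	r->read = 0;
	r->write = 0;
}

/* advance an index by one slot, wrapping at the end of the ring */
static uint16_t
ring_next(const struct batch_ring *r, uint16_t idx)
{
	return (uint16_t)(idx + 1 == r->nslots ? 0 : idx + 1);
}

/* empty: nothing outstanding between read and write */
static int
ring_empty(const struct batch_ring *r)
{
	return r->read == r->write;
}

/* full: advancing write would collide with read */
static int
ring_full(const struct batch_ring *r)
{
	return ring_next(r, r->write) == r->read;
}

/* claim one slot; returns 0 on success, -1 if the ring is full */
static int
ring_push(struct batch_ring *r)
{
	if (ring_full(r))
		return -1;
	r->write = ring_next(r, r->write);
	return 0;
}

/* release one slot; returns 0 on success, -1 if the ring is empty */
static int
ring_pop(struct batch_ring *r)
{
	if (ring_empty(r))
		return -1;
	r->read = ring_next(r, r->read);
	return 0;
}
```

With max_batches = 3 the ring has four slots but accepts only three outstanding batches, so the two states stay distinguishable without a separate counter.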
* [dpdk-dev] [PATCH v8 05/16] dma/idxd: create dmadev instances on pci probe 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (3 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 06/16] dma/idxd: add datapath structures Kevin Laatz ` (10 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the PCI probe, create a dmadev instance for each HW queue. HW definitions required are also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_hw_defs.h | 63 ++++++++ drivers/dma/idxd/idxd_internal.h | 13 ++ drivers/dma/idxd/idxd_pci.c | 259 ++++++++++++++++++++++++++++++- 3 files changed, 332 insertions(+), 3 deletions(-) diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index a92d462d01..86f7f3526b 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -24,4 +24,67 @@ struct idxd_completion { uint32_t invalid_flags; } __rte_aligned(32); +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ + +#define IDXD_CMD_SHIFT 20 +enum rte_idxd_cmds { + idxd_enable_dev = 1, + idxd_disable_dev, + idxd_drain_all, + idxd_abort_all, + idxd_reset_device, + idxd_enable_wq, + idxd_disable_wq, + idxd_drain_wq, + idxd_abort_wq, + idxd_reset_wq, +}; + +/* General bar0 registers */ +struct rte_idxd_bar0 { + uint32_t __rte_cache_aligned version; /* offset 0x00 */ + uint64_t __rte_aligned(0x10) gencap; /* offset 0x10 */ + uint64_t __rte_aligned(0x10) wqcap; /* offset 0x20 */ + uint64_t 
__rte_aligned(0x10) grpcap; /* offset 0x30 */ + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ +}; + +/* workqueue config is provided by array of uint32_t. */ +enum rte_idxd_wqcfg { + wq_size_idx, /* size is in first 32-bit value */ + wq_threshold_idx, /* WQ threshold second 32-bits */ + wq_mode_idx, /* WQ mode and other flags */ + wq_sizes_idx, /* WQ transfer and batch sizes */ + wq_occ_int_idx, /* WQ occupancy interrupt handle */ + wq_occ_limit_idx, /* WQ occupancy limit */ + wq_state_idx, /* WQ state and occupancy state */ +}; + +#define WQ_MODE_SHARED 0 +#define WQ_MODE_DEDICATED 1 +#define WQ_PRIORITY_SHIFT 4 +#define WQ_BATCH_SZ_SHIFT 5 +#define WQ_STATE_SHIFT 30 +#define WQ_STATE_MASK 0x3 + +struct rte_idxd_grpcfg { + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ + uint64_t grpengcfg; /* offset 32 */ + uint32_t grpflags; /* offset 40 */ +}; + +#define GENSTS_DEV_STATE_MASK 0x03 +#define CMDSTATUS_ACTIVE_SHIFT 31 +#define CMDSTATUS_ACTIVE_MASK (1 << 31) +#define CMDSTATUS_ERR_MASK 0xFF + #endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 8f1cdf6102..8473bf939f 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -6,6 +6,7 @@ #define _IDXD_INTERNAL_H_ #include <rte_dmadev_pmd.h> +#include <rte_spinlock.h> #include "idxd_hw_defs.h" @@ -28,6 +29,16 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) 
IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_pci_common { + rte_spinlock_t lk; + + uint8_t wq_cfg_sz; + volatile struct rte_idxd_bar0 *regs; + volatile uint32_t *wq_regs_base; + volatile struct rte_idxd_grpcfg *grp_regs; + volatile void *portals; +}; + struct idxd_dmadev { /* counters to track the batches */ unsigned short max_batches; @@ -59,6 +70,8 @@ struct idxd_dmadev { struct { unsigned int dsa_id; } bus; + + struct idxd_pci_common *pci; } u; }; diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 79e4aadcab..7127483b10 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -3,6 +3,9 @@ */ #include <rte_bus_pci.h> +#include <rte_devargs.h> +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> #include "idxd_internal.h" @@ -16,17 +19,267 @@ const struct rte_pci_id pci_id_idxd_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static inline int +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) +{ + uint8_t err_code; + uint16_t qid = idxd->qid; + int i = 0; + + if (command >= idxd_disable_wq && command <= idxd_reset_wq) + qid = (1 << qid); + rte_spinlock_lock(&idxd->u.pci->lk); + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; + + do { + rte_pause(); + err_code = idxd->u.pci->regs->cmdstatus; + if (++i >= 1000) { + IDXD_PMD_ERR("Timeout waiting for command response from HW"); + rte_spinlock_unlock(&idxd->u.pci->lk); + return err_code; + } + } while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK); + rte_spinlock_unlock(&idxd->u.pci->lk); + + return err_code & CMDSTATUS_ERR_MASK; +} + +static uint32_t * +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) +{ + return RTE_PTR_ADD(pci->wq_regs_base, + (uintptr_t)wq_idx << (5 + pci->wq_cfg_sz)); +} + +static int +idxd_is_wq_enabled(struct idxd_dmadev *idxd) +{ + uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx]; + return ((state 
>> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; +} + +static int +idxd_pci_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + /* disable the device */ + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); + if (err_code) { + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device disabled OK"); + + /* free device memory */ + IDXD_PMD_DEBUG("Freeing device driver memory"); + rte_free(idxd->batch_idx_ring); + + return 0; +} + +static const struct rte_dma_dev_ops idxd_pci_ops = { + .dev_close = idxd_pci_dev_close, +}; + +/* each portal uses 4 x 4k pages */ +#define IDXD_PORTAL_SIZE (4096 * 4) + +static int +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, + unsigned int max_queues) +{ + struct idxd_pci_common *pci; + uint8_t nb_groups, nb_engines, nb_wqs; + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ + uint16_t wq_size, total_wq_size; + uint8_t lg2_max_batch, lg2_max_copy_size; + unsigned int i, err_code; + + pci = malloc(sizeof(*pci)); + if (pci == NULL) { + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); + goto err; + } + rte_spinlock_init(&pci->lk); + + /* assign the bar registers, and then configure device */ + pci->regs = dev->mem_resource[0].addr; + grp_offset = (uint16_t)pci->regs->offsets[0]; + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); + pci->portals = dev->mem_resource[2].addr; + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; + + /* sanity check device status */ + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { + /* need function-level-reset (FLR) or is enabled */ + IDXD_PMD_ERR("Device status is not disabled, cannot init"); + goto err; + } + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { + /* command in progress */ + IDXD_PMD_ERR("Device has 
a command in progress, cannot init"); + goto err; + } + + /* read basic info about the hardware for use when configuring */ + nb_groups = (uint8_t)pci->regs->grpcap; + nb_engines = (uint8_t)pci->regs->engcap; + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); + total_wq_size = (uint16_t)pci->regs->wqcap; + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F; + lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F; + + IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u", + nb_groups, nb_engines, nb_wqs); + + /* zero out any old config */ + for (i = 0; i < nb_groups; i++) { + pci->grp_regs[i].grpengcfg = 0; + pci->grp_regs[i].grpwqcfg[0] = 0; + } + for (i = 0; i < nb_wqs; i++) + idxd_get_wq_cfg(pci, i)[0] = 0; + + /* limit queues if necessary */ + if (max_queues != 0 && nb_wqs > max_queues) { + nb_wqs = max_queues; + if (nb_engines > max_queues) + nb_engines = max_queues; + if (nb_groups > max_queues) + nb_engines = max_queues; + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); + } + + /* put each engine into a separate group to avoid reordering */ + if (nb_groups > nb_engines) + nb_groups = nb_engines; + if (nb_groups < nb_engines) + nb_engines = nb_groups; + + /* assign engines to groups, round-robin style */ + for (i = 0; i < nb_engines; i++) { + IDXD_PMD_DEBUG("Assigning engine %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); + } + + /* now do the same for queues and give work slots to each queue */ + wq_size = total_wq_size / nb_wqs; + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", + wq_size, lg2_max_batch, lg2_max_copy_size); + for (i = 0; i < nb_wqs; i++) { + /* add engine "i" to a group */ + IDXD_PMD_DEBUG("Assigning work queue %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); + /* now configure it, in terms of size, max batch, mode */ + idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size; + idxd_get_wq_cfg(pci, i)[wq_mode_idx] 
= (1 << WQ_PRIORITY_SHIFT) | + WQ_MODE_DEDICATED; + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); + } + + /* dump the group configuration to output */ + for (i = 0; i < nb_groups; i++) { + IDXD_PMD_DEBUG("## Group %d", i); + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); + IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg); + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags); + } + + idxd->u.pci = pci; + idxd->max_batches = wq_size; + + /* enable the device itself */ + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); + if (err_code) { + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device enabled OK"); + + return nb_wqs; + +err: + free(pci); + return -1; +} + static int idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) { - int ret = 0; + struct idxd_dmadev idxd = {0}; + uint8_t nb_wqs; + int qid, ret = 0; char name[PCI_PRI_STR_SIZE]; + unsigned int max_queues = 0; rte_pci_device_name(&dev->addr, name, sizeof(name)); IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); dev->device.driver = &drv->driver; - return ret; + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { + /* if the number of devargs grows beyond just 1, use rte_kvargs */ + if (sscanf(dev->device.devargs->args, + "max_queues=%u", &max_queues) != 1) { + IDXD_PMD_ERR("Invalid device parameter: '%s'", + dev->device.devargs->args); + return -1; + } + } + + ret = init_pci_device(dev, &idxd, max_queues); + if (ret < 0) { + IDXD_PMD_ERR("Error initializing PCI hardware"); + return ret; + } + if (idxd.u.pci->portals == NULL) { + IDXD_PMD_ERR("Error, invalid portal assigned during initialization\n"); + return -EINVAL; + } + nb_wqs = (uint8_t)ret; + + /* set up one device for each queue */ + for (qid = 0; qid < nb_wqs; qid++) { + char qname[32]; + + /* add the queue number to each 
device name */ + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); + idxd.qid = qid; + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, + qid * IDXD_PORTAL_SIZE); + if (idxd_is_wq_enabled(&idxd)) + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); + ret = idxd_dmadev_create(qname, &dev->device, + &idxd, &idxd_pci_ops); + if (ret != 0) { + IDXD_PMD_ERR("Failed to create dmadev %s", name); + if (qid == 0) /* if no devices using this, free pci */ + free(idxd.u.pci); + return ret; + } + } + + return 0; +} + +static int +idxd_dmadev_destroy(const char *name) +{ + int ret; + + /* rte_dma_close is called by pmd_release */ + ret = rte_dma_pmd_release(name); + if (ret) + IDXD_PMD_DEBUG("Device cleanup failed"); + + return 0; } static int @@ -39,7 +292,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev) IDXD_PMD_INFO("Closing %s on NUMA node %d", name, dev->device.numa_node); - return 0; + return idxd_dmadev_destroy(name); } struct rte_pci_driver idxd_pmd_drv_pci = { -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
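For readers following the register handling in init_pci_device() and idxd_get_wq_cfg(), the bit-field extraction can be sketched in isolation. This is only an illustration: the shifts and masks are copied from the patch, but struct dsa_caps, decode_caps() and wq_cfg_offset() are hypothetical names that do not exist in the driver, and plain integers stand in for the memory-mapped BAR0 registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields init_pci_device() reads. */
struct dsa_caps {
	uint8_t nb_wqs;            /* WQCAP bits 23:16 */
	uint16_t total_wq_size;    /* WQCAP bits 15:0 */
	uint8_t wq_cfg_sz;         /* WQCAP bits 27:24 */
	uint8_t lg2_max_copy_size; /* GENCAP bits 20:16 */
	uint8_t lg2_max_batch;     /* GENCAP bits 24:21 */
};

/* Same shifts and masks as the patch, applied to plain values. */
static struct dsa_caps
decode_caps(uint64_t gencap, uint64_t wqcap)
{
	struct dsa_caps c;

	c.nb_wqs = (uint8_t)(wqcap >> 16);
	c.total_wq_size = (uint16_t)wqcap;
	c.wq_cfg_sz = (uint8_t)((wqcap >> 24) & 0x0F);
	c.lg2_max_copy_size = (uint8_t)((gencap >> 16) & 0x1F);
	c.lg2_max_batch = (uint8_t)((gencap >> 21) & 0x0F);
	return c;
}

/*
 * Byte offset of one work queue's config block from wq_regs_base,
 * following idxd_get_wq_cfg(): each block is 32 << wq_cfg_sz bytes.
 */
static size_t
wq_cfg_offset(uint8_t wq_idx, uint8_t wq_cfg_sz)
{
	assert(wq_cfg_sz <= 0x0F); /* the field is only four bits wide */
	return (size_t)wq_idx << (5 + wq_cfg_sz);
}
```

With these helpers, the per-queue slot count used later in the function is simply total_wq_size / nb_wqs, which is why the patch gives every work queue an equal share.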
* [dpdk-dev] [PATCH v8 06/16] dma/idxd: add datapath structures 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (4 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 07/16] dma/idxd: add configure and info_get functions Kevin Laatz ` (9 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data structures required for the data path for IDXD devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 33 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 41 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 4 ++++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 81 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b48fa954ed..3c0837ec52 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -95,6 +95,7 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, + .dev_dump = idxd_dump, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 5abff34292..f972260a56 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -10,6 +10,35 @@ #define IDXD_PMD_NAME_STR "dmadev_idxd" +int +idxd_dump(const struct rte_dma_dev *dev, FILE *f) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + unsigned int i; + + fprintf(f, "== IDXD Private Data ==\n"); + fprintf(f, " Portal: %p\n", idxd->portal); + fprintf(f, " 
Config: { ring_size: %u }\n", + idxd->qcfg.nb_desc); + fprintf(f, " Batch ring (sz = %u, max_batches = %u):\n\t", + idxd->max_batches + 1, idxd->max_batches); + for (i = 0; i <= idxd->max_batches; i++) { + fprintf(f, " %u ", idxd->batch_idx_ring[i]); + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) + fprintf(f, "[rd ptr, wr ptr] "); + else if (i == idxd->batch_idx_read) + fprintf(f, "[rd ptr] "); + else if (i == idxd->batch_idx_write) + fprintf(f, "[wr ptr] "); + if (i == idxd->max_batches) + fprintf(f, "\n"); + } + + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, @@ -19,6 +48,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, struct rte_dma_dev *dmadev = NULL; int ret = 0; + RTE_BUILD_BUG_ON(sizeof(struct idxd_hw_desc) != 64); + RTE_BUILD_BUG_ON(offsetof(struct idxd_hw_desc, size) != 32); + RTE_BUILD_BUG_ON(sizeof(struct idxd_completion) != 32); + if (!name) { IDXD_PMD_ERR("Invalid name of the device!"); ret = -EINVAL; diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index 86f7f3526b..55ca9f7f52 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -5,6 +5,47 @@ #ifndef _IDXD_HW_DEFS_H_ #define _IDXD_HW_DEFS_H_ +/* + * Defines used in the data path for interacting with IDXD hardware. + */ +#define IDXD_CMD_OP_SHIFT 24 +enum rte_idxd_ops { + idxd_op_nop = 0, + idxd_op_batch, + idxd_op_drain, + idxd_op_memmove, + idxd_op_fill +}; + +#define IDXD_FLAG_FENCE (1 << 0) +#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2) +#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3) +#define IDXD_FLAG_CACHE_CONTROL (1 << 8) + +/** + * Hardware descriptor used by DSA hardware, for both bursts and + * for individual operations. 
+ */ +struct idxd_hw_desc { + uint32_t pasid; + uint32_t op_flags; + rte_iova_t completion; + + RTE_STD_C11 + union { + rte_iova_t src; /* source address for copy ops etc. */ + rte_iova_t desc_addr; /* descriptor pointer for batch */ + }; + rte_iova_t dst; + + uint32_t size; /* length of data for op, or batch size */ + + uint16_t intr_handle; /* completion interrupt handle */ + + /* remaining 26 bytes are reserved */ + uint16_t __reserved[13]; +} __rte_aligned(64); + #define IDXD_COMP_STATUS_INCOMPLETE 0 #define IDXD_COMP_STATUS_SUCCESS 1 #define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 8473bf939f..5e253fdfbc 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -40,6 +40,8 @@ struct idxd_pci_common { }; struct idxd_dmadev { + struct idxd_hw_desc *desc_ring; + /* counters to track the batches */ unsigned short max_batches; unsigned short batch_idx_read; @@ -63,6 +65,7 @@ struct idxd_dmadev { unsigned short max_batch_size; struct rte_dma_dev *dmadev; + struct rte_dma_vchan_conf qcfg; uint8_t sva_support; uint8_t qid; @@ -77,5 +80,6 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); +int idxd_dump(const struct rte_dma_dev *dev, FILE *f); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 7127483b10..96c8c65cc0 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -76,12 +76,14 @@ idxd_pci_dev_close(struct rte_dma_dev *dev) /* free device memory */ IDXD_PMD_DEBUG("Freeing device driver memory"); rte_free(idxd->batch_idx_ring); + rte_free(idxd->desc_ring); return 0; } static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_close = idxd_pci_dev_close, + .dev_dump = idxd_dump, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ 
messages in thread
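The layout constraints that the RTE_BUILD_BUG_ON() lines enforce on struct idxd_hw_desc can be checked outside of DPDK with a small stand-alone sketch. The assumptions here are that uint64_t stands in for rte_iova_t, that C11 _Alignas and _Static_assert replace the DPDK macros, and that struct hw_desc, the OP_/FLAG_ names and make_copy_desc() are illustrative names only. The opcode and flag values are the ones defined in idxd_hw_defs.h above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_NOP = 0, OP_BATCH, OP_DRAIN, OP_MEMMOVE, OP_FILL };

#define OP_SHIFT                   24
#define FLAG_FENCE                 (1u << 0)
#define FLAG_COMPLETION_ADDR_VALID (1u << 2)
#define FLAG_REQUEST_COMPLETION    (1u << 3)
#define FLAG_CACHE_CONTROL         (1u << 8)

/* Stand-in for struct idxd_hw_desc, uint64_t replacing rte_iova_t. */
struct hw_desc {
	_Alignas(64) uint32_t pasid;
	uint32_t op_flags;
	uint64_t completion;
	union {
		uint64_t src;       /* source address for copy ops */
		uint64_t desc_addr; /* descriptor pointer for a batch */
	};
	uint64_t dst;
	uint32_t size;
	uint16_t intr_handle;
	uint16_t reserved[13];  /* remaining 26 bytes */
};

/* Same checks the patch adds to idxd_dmadev_create(). */
_Static_assert(sizeof(struct hw_desc) == 64, "descriptor must be 64 bytes");
_Static_assert(offsetof(struct hw_desc, size) == 32, "size must be at byte 32");

/* Build a memmove descriptor: opcode in the top byte, flags below it. */
static struct hw_desc
make_copy_desc(uint64_t src, uint64_t dst, uint32_t len, uint64_t comp)
{
	struct hw_desc d = {0};

	d.op_flags = ((uint32_t)OP_MEMMOVE << OP_SHIFT) |
			FLAG_CACHE_CONTROL | FLAG_COMPLETION_ADDR_VALID;
	d.completion = comp;
	d.src = src;
	d.dst = dst;
	d.size = len;
	return d;
}
```

Keeping size at byte 32 splits the descriptor into two 32-byte halves, which would allow each half to be written with a single 256-bit store.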
* [dpdk-dev] [PATCH v8 07/16] dma/idxd: add configure and info_get functions 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (5 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 06/16] dma/idxd: add datapath structures Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz ` (8 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add functions for device configuration. The info_get function is included here since it can be useful for checking successful configuration. Documentation is also updated to add device configuration usage info. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 15 +++++++ drivers/dma/idxd/idxd_bus.c | 3 ++ drivers/dma/idxd/idxd_common.c | 71 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 6 +++ drivers/dma/idxd/idxd_pci.c | 3 ++ 5 files changed, 98 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index ce33e2857a..62ffd39ee0 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -120,3 +120,18 @@ use a subset of configured queues. Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, that is a "DMA device type" inside DPDK, and can be accessed using APIs from the ``rte_dmadev`` library. + +Using IDXD DMAdev Devices +-------------------------- + +To use the devices from an application, the dmadev API can be used. + +Device Configuration +~~~~~~~~~~~~~~~~~~~~~ + +IDXD configuration requirements: + +* ``ring_size`` must be a power of two, between 64 and 4096. 
+* Only one ``vchan`` is supported per device (work queue). +* IDXD devices do not support silent mode. +* The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 3c0837ec52..b2acdac4f9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -96,6 +96,9 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index f972260a56..b0c79a2e42 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -39,6 +39,77 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + + if (size < sizeof(*info)) + return -EINVAL; + + *info = (struct rte_dma_info) { + .dev_capa = RTE_DMA_CAPA_MEM_TO_MEM | RTE_DMA_CAPA_HANDLES_ERRORS | + RTE_DMA_CAPA_OPS_COPY | RTE_DMA_CAPA_OPS_FILL, + .max_vchans = 1, + .max_desc = 4096, + .min_desc = 64, + }; + if (idxd->sva_support) + info->dev_capa |= RTE_DMA_CAPA_SVA; + return 0; +} + +int +idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz) +{ + if (sizeof(struct rte_dma_conf) != conf_sz) + return -EINVAL; + + if (dev_conf->nb_vchans != 1) + return -EINVAL; + return 0; +} + +int +idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint16_t max_desc = qconf->nb_desc; + + if (sizeof(struct rte_dma_vchan_conf) != qconf_sz) + return -EINVAL; + + idxd->qcfg 
= *qconf; + + if (!rte_is_power_of_2(max_desc)) + max_desc = rte_align32pow2(max_desc); + IDXD_PMD_DEBUG("DMA dev %u using %u descriptors", dev->data->dev_id, max_desc); + idxd->desc_ring_mask = max_desc - 1; + idxd->qcfg.nb_desc = max_desc; + + /* in case we are reconfiguring a device, free any existing memory */ + rte_free(idxd->desc_ring); + + /* allocate the descriptor ring at 2x size as batches can't wrap */ + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); + if (idxd->desc_ring == NULL) + return -ENOMEM; + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); + + idxd->batch_idx_read = 0; + idxd->batch_idx_write = 0; + idxd->batch_start = 0; + idxd->batch_size = 0; + idxd->ids_returned = 0; + idxd->ids_avail = 0; + + memset(idxd->batch_comp_ring, 0, sizeof(*idxd->batch_comp_ring) * + (idxd->max_batches + 1)); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 5e253fdfbc..1dbe31abcd 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -81,5 +81,11 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); int idxd_dump(const struct rte_dma_dev *dev, FILE *f); +int idxd_configure(struct rte_dma_dev *dev, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz); +int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); +int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, + uint32_t size); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 96c8c65cc0..681bb55efe 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -84,6 +84,9 @@ idxd_pci_dev_close(struct rte_dma_dev 
*dev) static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_close = idxd_pci_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
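The ring sizing done in idxd_vchan_setup() may be easier to follow as a stand-alone calculation. The sketch below makes several assumptions: align32pow2() is a local stand-in written to behave like rte_align32pow2(), struct ring_params and ring_params_from_request() are hypothetical names, and the 64 to 4096 range check simply mirrors the limits reported by idxd_info_get(). Where that range is actually enforced is not shown in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_DESC 64u   /* min_desc reported by idxd_info_get() */
#define MAX_DESC 4096u /* max_desc reported by idxd_info_get() */

/* Round up to the next power of two (stand-in for rte_align32pow2). */
static uint32_t
align32pow2(uint32_t x)
{
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	return x + 1;
}

struct ring_params {
	uint16_t nb_desc;     /* descriptors exposed to the application */
	uint16_t mask;        /* nb_desc - 1, used to wrap ring indexes */
	uint32_t alloc_slots; /* slots allocated: 2x, as batches can't wrap */
};

/* Returns 0 and fills *out on success, -1 if the request is out of range. */
static int
ring_params_from_request(uint16_t requested, struct ring_params *out)
{
	uint32_t n;

	assert(out != NULL);
	if (requested < MIN_DESC || requested > MAX_DESC)
		return -1;

	n = align32pow2(requested);
	out->nb_desc = (uint16_t)n;
	out->mask = (uint16_t)(n - 1);
	out->alloc_slots = n * 2;
	return 0;
}
```

A request for 100 descriptors therefore ends up with a 128-entry ring, a mask of 127 and 256 allocated slots, so a batch that starts near the end of the ring can run past index 127 without wrapping.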
* [dpdk-dev] [PATCH v8 08/16] dma/idxd: add start and stop functions for pci devices 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (6 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 07/16] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 09/16] dma/idxd: add data-path job submission functions Kevin Laatz ` (7 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add device start/stop functions for DSA devices bound to vfio. For devices bound to the IDXD kernel driver, these are not required since the IDXD kernel driver takes care of this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 3 +++ drivers/dma/idxd/idxd_pci.c | 51 +++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 62ffd39ee0..711890bd9e 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -135,3 +135,6 @@ IDXD configuration requirements: * Only one ``vchan`` is supported per device (work queue). * IDXD devices do not support silent mode. * The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. + +Once configured, the device can then be made ready for use by calling the +``rte_dma_start()`` API. 
diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 681bb55efe..ed5bf99425 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -59,6 +59,55 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; } +static int +idxd_pci_dev_stop(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + if (!idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); + return -EALREADY; + } + + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); + if (err_code || idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); + + return 0; +} + +static int +idxd_pci_dev_start(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + if (idxd_is_wq_enabled(idxd)) { + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); + return 0; + } + + if (idxd->desc_ring == NULL) { + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); + return -EINVAL; + } + + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); + if (err_code || !idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); + + return 0; +} + static int idxd_pci_dev_close(struct rte_dma_dev *dev) { @@ -87,6 +136,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .dev_start = idxd_pci_dev_start, + .dev_stop = idxd_pci_dev_stop, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
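The start and stop paths combine three small pieces: the command word written by idxd_pci_dev_command(), the WQ state check, and the mapping of the hardware error code to a return value. A minimal sketch of that logic on plain integers follows. The enum values and shifts match the definitions from patch 05, while build_cmd(), wq_is_enabled(), start_result() and stop_result() are illustrative names and not functions in the driver.

```c
#include <assert.h>
#include <stdint.h>

enum dsa_cmd {
	CMD_ENABLE_DEV = 1, CMD_DISABLE_DEV, CMD_DRAIN_ALL, CMD_ABORT_ALL,
	CMD_RESET_DEVICE, CMD_ENABLE_WQ, CMD_DISABLE_WQ, CMD_DRAIN_WQ,
	CMD_ABORT_WQ, CMD_RESET_WQ,
};

#define CMD_SHIFT      20
#define WQ_STATE_SHIFT 30
#define WQ_STATE_MASK  0x3

/*
 * Command word as built in idxd_pci_dev_command(): the disable, drain,
 * abort and reset WQ commands take a bitmask, the others a plain index.
 */
static uint32_t
build_cmd(enum dsa_cmd command, uint16_t qid)
{
	assert(qid < 16); /* keeps the bitmask within 16 bits */
	if (command >= CMD_DISABLE_WQ && command <= CMD_RESET_WQ)
		qid = (uint16_t)(1u << qid);
	return ((uint32_t)command << CMD_SHIFT) | qid;
}

/* A WQ counts as enabled when the two state bits read 0x1. */
static int
wq_is_enabled(uint32_t wq_state_reg)
{
	return ((wq_state_reg >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1;
}

/* Return value of the start path after the enable command completes. */
static int
start_result(uint8_t err_code, uint32_t wq_state_reg)
{
	if (err_code != 0 || !wq_is_enabled(wq_state_reg))
		return err_code == 0 ? -1 : -(int)err_code;
	return 0;
}

/* Return value of the stop path after the disable command completes. */
static int
stop_result(uint8_t err_code, uint32_t wq_state_reg)
{
	if (err_code != 0 || wq_is_enabled(wq_state_reg))
		return err_code == 0 ? -1 : -(int)err_code;
	return 0;
}
```

Both paths re-read the WQ state after the command, so a command that reports no error but leaves the queue in the wrong state is still treated as a failure and returns -1.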
* [dpdk-dev] [PATCH v8 09/16] dma/idxd: add data-path job submission functions 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (7 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-19 7:04 ` Thomas Monjalon 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 10/16] dma/idxd: add data-path job completion functions Kevin Laatz ` (6 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data path functions for enqueuing and submitting operations to DSA devices. Documentation updates are included for dmadev library and IDXD driver docs as appropriate. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 9 +++ doc/guides/prog_guide/dmadev.rst | 19 +++++ drivers/dma/idxd/idxd_common.c | 135 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 ++ drivers/dma/idxd/meson.build | 1 + 5 files changed, 169 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 711890bd9e..d548c4751a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -138,3 +138,12 @@ IDXD configuration requirements: Once configured, the device can then be made ready for use by calling the ``rte_dma_start()`` API. + +Performing Data Copies +~~~~~~~~~~~~~~~~~~~~~~~ + +Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library +documentation for details on operation enqueue and submission API usage. 
+ +It is expected that, for efficiency reasons, a burst of operations will be enqueued to the +device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index 32f7147862..e853ffda3a 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -67,6 +67,8 @@ can be used to get the device info and supported features. Silent mode is a special device capability which does not require the application to invoke dequeue APIs. +.. _dmadev_enqueue_dequeue: + Enqueue / Dequeue APIs ~~~~~~~~~~~~~~~~~~~~~~ @@ -80,6 +82,23 @@ The ``rte_dma_submit`` API is used to issue doorbell to hardware. Alternatively the ``RTE_DMA_OP_FLAG_SUBMIT`` flag can be passed to the enqueue APIs to also issue the doorbell to hardware. +The following code demonstrates how to enqueue a burst of copies to the +device and start the hardware processing of them: + +.. code-block:: C + + struct rte_mbuf *srcs[DMA_BURST_SZ], *dsts[DMA_BURST_SZ]; + unsigned int i; + + for (i = 0; i < RTE_DIM(srcs); i++) { + if (rte_dma_copy(dev_id, vchan, rte_pktmbuf_iova(srcs[i]), + rte_pktmbuf_iova(dsts[i]), COPY_LEN, 0) < 0) { + PRINT_ERR("Error with rte_dma_copy for buffer %u\n", i); + return -1; + } + } + rte_dma_submit(dev_id, vchan); + There are two dequeue APIs ``rte_dma_completed`` and ``rte_dma_completed_status``, these are used to obtain the results of the enqueue requests. 
``rte_dma_completed`` will return the number of successfully diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index b0c79a2e42..a686ad421c 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,14 +2,145 @@ * Copyright 2021 Intel Corporation */ +#include <x86intrin.h> + #include <rte_malloc.h> #include <rte_common.h> #include <rte_log.h> +#include <rte_prefetch.h> #include "idxd_internal.h" #define IDXD_PMD_NAME_STR "dmadev_idxd" +static __rte_always_inline rte_iova_t +__desc_idx_to_iova(struct idxd_dmadev *idxd, uint16_t n) +{ + return idxd->desc_iova + (n * sizeof(struct idxd_hw_desc)); +} + +static __rte_always_inline void +__idxd_movdir64b(volatile void *dst, const struct idxd_hw_desc *src) +{ + asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : + : "a" (dst), "d" (src) + : "memory"); +} + +static __rte_always_inline void +__submit(struct idxd_dmadev *idxd) +{ + rte_prefetch1(&idxd->batch_comp_ring[idxd->batch_idx_read]); + + if (idxd->batch_size == 0) + return; + + /* write completion to batch comp ring */ + rte_iova_t comp_addr = idxd->batch_iova + + (idxd->batch_idx_write * sizeof(struct idxd_completion)); + + if (idxd->batch_size == 1) { + /* submit batch directly */ + struct idxd_hw_desc desc = + idxd->desc_ring[idxd->batch_start & idxd->desc_ring_mask]; + desc.completion = comp_addr; + desc.op_flags |= IDXD_FLAG_REQUEST_COMPLETION; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &desc); + } else { + const struct idxd_hw_desc batch_desc = { + .op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_COMPLETION_ADDR_VALID | + IDXD_FLAG_REQUEST_COMPLETION, + .desc_addr = __desc_idx_to_iova(idxd, + idxd->batch_start & idxd->desc_ring_mask), + .completion = comp_addr, + .size = idxd->batch_size, + }; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &batch_desc); + } + + if (++idxd->batch_idx_write > 
idxd->max_batches) + idxd->batch_idx_write = 0; + + idxd->batch_start += idxd->batch_size; + idxd->batch_size = 0; + idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; + _mm256_store_si256((void *)&idxd->batch_comp_ring[idxd->batch_idx_write], + _mm256_setzero_si256()); +} + +static __rte_always_inline int +__idxd_write_desc(struct idxd_dmadev *idxd, + const uint32_t op_flags, + const rte_iova_t src, + const rte_iova_t dst, + const uint32_t size, + const uint32_t flags) +{ + uint16_t mask = idxd->desc_ring_mask; + uint16_t job_id = idxd->batch_start + idxd->batch_size; + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; + + /* first check batch ring space then desc ring space */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return -ENOSPC; + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) + return -ENOSPC; + + /* write desc. Note: descriptors don't wrap, but the completion address does */ + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, + _mm256_set_epi64x(0, 0, 0, size)); + + idxd->batch_size++; + + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) + __submit(idxd); + + return job_id; +} + +int +idxd_enqueue_copy(void *dev_private, uint16_t qid __rte_unused, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, + * but check it at compile time to be sure. 
+ */ + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev_private, memmove, src, dst, length, + flags); +} + +int +idxd_enqueue_fill(void *dev_private, uint16_t qid __rte_unused, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + uint32_t fill = (idxd_op_fill << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev_private, fill, pattern, dst, length, + flags); +} + +int +idxd_submit(void *dev_private, uint16_t qid __rte_unused) +{ + __submit(dev_private); + return 0; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -139,6 +270,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->dev_ops = ops; dmadev->device = dev; + dmadev->fp_obj->copy = idxd_enqueue_copy; + dmadev->fp_obj->fill = idxd_enqueue_fill; + dmadev->fp_obj->submit = idxd_submit; + idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ idxd->dmadev = dmadev; diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 1dbe31abcd..ab4d71095e 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -87,5 +87,10 @@ int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, uint32_t size); +int idxd_enqueue_copy(void *dev_private, uint16_t qid, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_enqueue_fill(void *dev_private, uint16_t qid, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_submit(void *dev_private, uint16_t qid); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build 
index da5dc2b019..3b5133c578 100644
--- a/drivers/dma/idxd/meson.build
+++ b/drivers/dma/idxd/meson.build
@@ -6,6 +6,7 @@ if is_windows
 endif
 
 deps += ['bus_pci']
+cflags += '-mavx2' # all platforms with idxd HW support AVX
 sources = files(
         'idxd_bus.c',
         'idxd_common.c',
--
2.30.2
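The batch-ring index handling used by the submission path above can be studied in isolation. The following is a minimal sketch, not the driver code: `struct batch_ring` and both helper names are hypothetical stand-ins, and only the full check and the wrap rule are taken from the patch. It assumes, as the wrap test in `__submit()` suggests, that valid indices run from 0 to `max_batches` inclusive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the batch-ring fields of struct idxd_dmadev. */
struct batch_ring {
	uint16_t max_batches;     /* highest valid index (indices are 0..max_batches) */
	uint16_t batch_idx_read;  /* next batch to be checked for completion */
	uint16_t batch_idx_write; /* slot the next submitted batch will use */
};

/* Same test as the enqueue path: full when advancing write would reach read. */
static bool
batch_ring_full(const struct batch_ring *r)
{
	return (r->batch_idx_read == 0 && r->batch_idx_write == r->max_batches) ||
			r->batch_idx_write + 1 == r->batch_idx_read;
}

/* Advance the write index the way __submit() does, wrapping after max_batches. */
static void
batch_ring_advance_write(struct batch_ring *r)
{
	assert(!batch_ring_full(r)); /* the enqueue path returns -ENOSPC instead */
	if (++r->batch_idx_write > r->max_batches)
		r->batch_idx_write = 0;
}
```

One slot is always left unused, so a ring with `max_batches + 1` slots holds at most `max_batches` outstanding batches; this is what lets "read == write" mean empty rather than full.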
* Re: [dpdk-dev] [PATCH v8 09/16] dma/idxd: add data-path job submission functions
From: Thomas Monjalon @ 2021-10-19 7:04 UTC
To: Kevin Laatz; +Cc: dev, bruce.richardson, fengchengwen, jerinj, conor.walsh

18/10/2021 14:28, Kevin Laatz:
> --- a/drivers/dma/idxd/meson.build
> +++ b/drivers/dma/idxd/meson.build
> +cflags += '-mavx2' # all platforms with idxd HW support AVX

It would work only if the driver was disabled on non-x86 targets.
For now it produces:
aarch64-linux-gnu-gcc: error: unrecognized command-line option '-mavx2'
* [dpdk-dev] [PATCH v8 10/16] dma/idxd: add data-path job completion functions 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (8 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 09/16] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 11/16] dma/idxd: add operation statistic tracking Kevin Laatz ` (5 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the data path functions for gathering completed operations. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 32 ++++- drivers/dma/idxd/idxd_common.c | 236 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 + 3 files changed, 272 insertions(+), 1 deletion(-) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index d548c4751a..d4a210b854 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -143,7 +143,37 @@ Performing Data Copies ~~~~~~~~~~~~~~~~~~~~~~~ Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library -documentation for details on operation enqueue and submission API usage. +documentation for details on operation enqueue, submission and completion API usage. It is expected that, for efficiency reasons, a burst of operations will be enqueued to the device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. + +When gathering completions, ``rte_dma_completed()`` should be used, up until the point an error +occurs in an operation. 
If an error was encountered, ``rte_dma_completed_status()`` must be used
+to kick the device off to continue processing operations and also to gather the status of each
+individual operation, which is filled into the ``status`` array provided as a parameter by the
+application.
+
+The following status codes are supported by IDXD:
+
+* ``RTE_DMA_STATUS_SUCCESSFUL``: The operation was successful.
+* ``RTE_DMA_STATUS_INVALID_OPCODE``: The operation failed due to an invalid operation code.
+* ``RTE_DMA_STATUS_INVALID_LENGTH``: The operation failed due to an invalid data length.
+* ``RTE_DMA_STATUS_NOT_ATTEMPTED``: The operation was not attempted.
+* ``RTE_DMA_STATUS_ERROR_UNKNOWN``: The operation failed due to an unspecified error.
+
+The following code shows how to retrieve the number of successfully completed
+copies within a burst and then use ``rte_dma_completed_status()`` to check
+which operation failed and kick off the device to continue processing operations:
+
+.. code-block:: C
+
+   enum rte_dma_status_code status[COMP_BURST_SZ];
+   uint16_t count, idx, status_count;
+   bool error = false;
+
+   count = rte_dma_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error);
+
+   if (error) {
+      status_count = rte_dma_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status);
+   }
diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c
index a686ad421c..76bc2e1364 100644
--- a/drivers/dma/idxd/idxd_common.c
+++ b/drivers/dma/idxd/idxd_common.c
@@ -141,6 +141,240 @@ idxd_submit(void *dev_private, uint16_t qid __rte_unused)
 	return 0;
 }
 
+static enum rte_dma_status_code
+get_comp_status(struct idxd_completion *c)
+{
+	uint8_t st = c->status;
+	switch (st) {
+	/* successful descriptors are not written back normally */
+	case IDXD_COMP_STATUS_INCOMPLETE:
+	case IDXD_COMP_STATUS_SUCCESS:
+		return RTE_DMA_STATUS_SUCCESSFUL;
+	case IDXD_COMP_STATUS_INVALID_OPCODE:
+		return RTE_DMA_STATUS_INVALID_OPCODE;
+	case IDXD_COMP_STATUS_INVALID_SIZE:
+		return
RTE_DMA_STATUS_INVALID_LENGTH; + case IDXD_COMP_STATUS_SKIPPED: + return RTE_DMA_STATUS_NOT_ATTEMPTED; + default: + return RTE_DMA_STATUS_ERROR_UNKNOWN; + } +} + +static __rte_always_inline int +batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t ret; + uint8_t bstatus; + + if (max_ops == 0) + return 0; + + /* first check if there are any unreturned handles from last time */ + if (idxd->ids_avail != idxd->ids_returned) { + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + + if (idxd->batch_idx_read == idxd->batch_idx_write) + return 0; + + bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status; + /* now check if next batch is complete and successful */ + if (bstatus == IDXD_COMP_STATUS_SUCCESS) { + /* since the batch idx ring stores the start of each batch, pre-increment to lookup + * start of next batch. + */ + if (++idxd->batch_idx_read > idxd->max_batches) + idxd->batch_idx_read = 0; + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; + + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + /* check if batch is incomplete */ + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) + return 0; + + return -1; /* error case */ +} + +static inline uint16_t +batch_completed(struct idxd_dmadev *idxd, uint16_t max_ops, bool *has_error) +{ + uint16_t i; + uint16_t b_start, b_end, next_batch; + + int ret = batch_ok(idxd, max_ops, NULL); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ * once we identify the actual failure job, return other jobs, then update + * the batch ring indexes to make it look like the first job of the batch has failed. + * Subsequent calls here will always return zero packets, and the error must be cleared by + * calling the completed_status() function. + */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + b_end = idxd->batch_idx_ring[next_batch]; + + if (b_end - b_start == 1) { /* not a batch */ + *has_error = true; + return 0; + } + + for (i = b_start; i < b_end; i++) { + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ + break; + } + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); + if (ret < max_ops) + *has_error = true; /* we got up to the point of error */ + idxd->ids_avail = idxd->ids_returned += ret; + + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { + /* copy over the descriptor status to the batch ring as if no batch */ + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; + } + + return ret; +} + +static uint16_t +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t next_batch; + + int ret = batch_ok(idxd, max_ops, status); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; + const uint16_t b_len = b_end - b_start; + if (b_len == 1) {/* not a batch */ + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + idxd->ids_avail++; + idxd->ids_returned++; + idxd->batch_idx_read = next_batch; + return 1; + } + + /* not a single-element batch, need to process more. + * Scenarios: + * 1. max_ops >= batch_size - can fit everything, simple case + * - loop through completed ops and then add on any not-attempted ones + * 2. max_ops < batch_size - can't fit everything, more complex case + * - loop through completed/incomplete and stop when hit max_ops + * - adjust the batch descriptor to update where we stopped, with appropriate bcount + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as + * non-batch next time. + */ + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; + for (ret = 0; ret < b_len && ret < max_ops; ret++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + } + idxd->ids_avail = idxd->ids_returned += ret; + + /* everything fit */ + if (ret == b_len) { + idxd->batch_idx_read = next_batch; + return ret; + } + + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (ret > bcount) { + /* we have only incomplete ones - set batch completed size to 0 */ + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size = 0; + /* if there is only one descriptor left, job skipped so set flag appropriately */ + if (b_len - ret == 1) + comp->status = IDXD_COMP_STATUS_SKIPPED; + } else { + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size -= ret; + /* if there is only one descriptor left, copy status info straight to desc */ + if (comp->completed_size == 1) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + comp->status = c->status; + /* individual descs can be ok without writeback, but not batches */ + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) + comp->status = IDXD_COMP_STATUS_SUCCESS; + } else if (bcount == b_len) { + /* check if we still have an error, and clear flag if not */ + uint16_t i; + for (i = b_start + ret; i < b_end; i++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) + break; + } + if (i == b_end) /* no errors */ + comp->status = IDXD_COMP_STATUS_SUCCESS; + } + } + + return ret; +} + +uint16_t +idxd_completed(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, bool *has_error) +{ + struct idxd_dmadev *idxd = dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed(idxd, max_ops - ret, has_error); + ret += batch; + } while (batch > 0 && *has_error == false); + + *last_idx = idxd->ids_returned - 1; 
+ return ret; +} + +uint16_t +idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, enum rte_dma_status_code *status) +{ + struct idxd_dmadev *idxd = dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); + ret += batch; + } while (batch > 0); + + *last_idx = idxd->ids_returned - 1; + return ret; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -273,6 +507,8 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->fp_obj->copy = idxd_enqueue_copy; dmadev->fp_obj->fill = idxd_enqueue_fill; dmadev->fp_obj->submit = idxd_submit; + dmadev->fp_obj->completed = idxd_completed; + dmadev->fp_obj->completed_status = idxd_completed_status; idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index ab4d71095e..4208b0dee8 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -92,5 +92,10 @@ int idxd_enqueue_copy(void *dev_private, uint16_t qid, rte_iova_t src, int idxd_enqueue_fill(void *dev_private, uint16_t qid, uint64_t pattern, rte_iova_t dst, unsigned int length, uint64_t flags); int idxd_submit(void *dev_private, uint16_t qid); +uint16_t idxd_completed(void *dev_private, uint16_t qid, uint16_t max_ops, + uint16_t *last_idx, bool *has_error); +uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, + uint16_t max_ops, uint16_t *last_idx, + enum rte_dma_status_code *status); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
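The way the completion path above turns a failed batch into per-operation status codes can be modelled without any hardware. The sketch below is a simplified illustration, not the driver code: the enum names and `fill_batch_status()` are hypothetical, the hardware values 0 (incomplete) and 1 (success) follow the comments in the patch, and the remaining hardware values are placeholders chosen only for the model.

```c
#include <assert.h>
#include <stdint.h>

/* Model-only hardware status values; only 0 and 1 are taken from the patch comments. */
enum hw_status {
	HW_INCOMPLETE = 0,
	HW_SUCCESS = 1,
	HW_INVALID_OPCODE = 0x10, /* placeholder value */
	HW_INVALID_SIZE = 0x13,   /* placeholder value */
};

/* Model-only generic status codes, mirroring the list in the idxd guide. */
enum op_status {
	OP_SUCCESSFUL,
	OP_INVALID_OPCODE,
	OP_INVALID_LENGTH,
	OP_NOT_ATTEMPTED,
	OP_ERROR_UNKNOWN,
};

static enum op_status
map_status(uint8_t st)
{
	switch (st) {
	case HW_INCOMPLETE: /* successful descriptors are normally not written back */
	case HW_SUCCESS:
		return OP_SUCCESSFUL;
	case HW_INVALID_OPCODE:
		return OP_INVALID_OPCODE;
	case HW_INVALID_SIZE:
		return OP_INVALID_LENGTH;
	default:
		return OP_ERROR_UNKNOWN;
	}
}

/*
 * Fill out[] for a failed batch of b_len operations, of which the hardware
 * processed the first bcount. Stops at max_ops; returns the number written.
 */
static uint16_t
fill_batch_status(const uint8_t *hw, uint16_t b_len, uint16_t bcount,
		uint16_t max_ops, enum op_status *out)
{
	uint16_t n;

	assert(bcount <= b_len);
	for (n = 0; n < b_len && n < max_ops; n++)
		out[n] = (n < bcount) ? map_status(hw[n]) : OP_NOT_ATTEMPTED;
	return n;
}
```

When `max_ops` is smaller than the batch, the driver additionally rewrites the batch ring entry so that the next call resumes where this one stopped; that bookkeeping is left out of the model.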
* [dpdk-dev] [PATCH v8 11/16] dma/idxd: add operation statistic tracking 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (9 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 10/16] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 12/16] dma/idxd: add vchan status function Kevin Laatz ` (4 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add statistic tracking for DSA devices. The dmadev library documentation is also updated to add a generic section for using the library's statistics APIs. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- doc/guides/prog_guide/dmadev.rst | 11 +++++++++++ drivers/dma/idxd/idxd_bus.c | 2 ++ drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 +++++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 47 insertions(+) diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index e853ffda3a..139eaff299 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -107,3 +107,14 @@ completed operations along with the status of each operation (filled into the ``status`` array passed by user). These two APIs can also return the last completed operation's ``ring_idx`` which could help user track operations within their own application-defined rings. + + +Querying Device Statistics +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The statistics from a dmadev device can be got via the statistics functions, +i.e. ``rte_dma_stats_get()``. 
The statistics returned for each device instance are:
+
+* ``submitted``: The number of operations submitted to the device.
+* ``completed``: The number of operations which have completed (successful and failed).
+* ``errors``: The number of operations that completed with error.
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c
index b2acdac4f9..b52ea02854 100644
--- a/drivers/dma/idxd/idxd_bus.c
+++ b/drivers/dma/idxd/idxd_bus.c
@@ -99,6 +99,8 @@ static const struct rte_dma_dev_ops idxd_bus_ops = {
 		.dev_configure = idxd_configure,
 		.vchan_setup = idxd_vchan_setup,
 		.dev_info_get = idxd_info_get,
+		.stats_get = idxd_stats_get,
+		.stats_reset = idxd_stats_reset,
 };
 
 static void *
diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c
index 76bc2e1364..fd81418b7c 100644
--- a/drivers/dma/idxd/idxd_common.c
+++ b/drivers/dma/idxd/idxd_common.c
@@ -65,6 +65,8 @@ __submit(struct idxd_dmadev *idxd)
 	if (++idxd->batch_idx_write > idxd->max_batches)
 		idxd->batch_idx_write = 0;
 
+	idxd->stats.submitted += idxd->batch_size;
+
 	idxd->batch_start += idxd->batch_size;
 	idxd->batch_size = 0;
 	idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start;
@@ -276,6 +278,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_
 	const uint16_t b_len = b_end - b_start;
 	if (b_len == 1) {/* not a batch */
 		*status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]);
+		if (*status != RTE_DMA_STATUS_SUCCESSFUL)
+			idxd->stats.errors++;
 		idxd->ids_avail++;
 		idxd->ids_returned++;
 		idxd->batch_idx_read = next_batch;
@@ -297,6 +301,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_
 		struct idxd_completion *c = (void *)
 				&idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask];
 		status[ret] = (ret < bcount) ?
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + if (status[ret] != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; } idxd->ids_avail = idxd->ids_returned += ret; @@ -355,6 +361,7 @@ idxd_completed(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, ret += batch; } while (batch > 0 && *has_error == false); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -371,6 +378,7 @@ idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max ret += batch; } while (batch > 0); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -404,6 +412,25 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + struct rte_dma_stats *stats, uint32_t stats_sz) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + if (stats_sz < sizeof(*stats)) + return -EINVAL; + *stats = idxd->stats; + return 0; +} + +int +idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + idxd->stats = (struct rte_dma_stats){0}; + return 0; +} + int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 4208b0dee8..a85a1fb79e 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -59,6 +59,8 @@ struct idxd_dmadev { struct idxd_completion *batch_comp_ring; unsigned short *batch_idx_ring; /* store where each batch ends */ + struct rte_dma_stats stats; + rte_iova_t batch_iova; /* base address of the batch comp ring */ rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ @@ -97,5 +99,8 @@ uint16_t idxd_completed(void *dev_private, uint16_t qid, uint16_t max_ops, uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, uint16_t 
*last_idx, enum rte_dma_status_code *status); +int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, + struct rte_dma_stats *stats, uint32_t stats_sz); +int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index ed5bf99425..9d7f0531d5 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -136,6 +136,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, }; -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
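The relationship between the three counters added by this patch can be sketched with a small stand-alone model. This is an illustration under simplifying assumptions, not the driver code: `struct model_stats` and the helper names are hypothetical, a status of 0 stands for success, and errors are counted for every completion here, whereas in the patch they are only counted on the `completed_status` path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the submitted/completed/errors counters. */
struct model_stats {
	uint64_t submitted;
	uint64_t completed;
	uint64_t errors;
};

/* Called once per doorbell: the whole batch counts as submitted. */
static void
stats_on_submit(struct model_stats *s, uint16_t batch_size)
{
	s->submitted += batch_size;
}

/* Account for n returned operations; status[i] == 0 means success in this model. */
static void
stats_on_completed(struct model_stats *s, const int *status, uint16_t n)
{
	uint16_t i;

	assert(s->completed + n <= s->submitted);
	for (i = 0; i < n; i++)
		if (status[i] != 0) /* compare the status value, not the pointer */
			s->errors++;
	s->completed += n;
}

/* Operations handed to the device but not yet returned to the application. */
static uint64_t
stats_in_flight(const struct model_stats *s)
{
	return s->submitted - s->completed;
}
```

An application could derive the same in-flight figure from the values returned by ``rte_dma_stats_get()``, since ``completed`` includes both successful and failed operations.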
* [dpdk-dev] [PATCH v8 12/16] dma/idxd: add vchan status function 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (10 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 11/16] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 13/16] dma/idxd: add burst capacity API Kevin Laatz ` (3 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When testing dmadev drivers, it is useful to have the HW device in a known state. This patch adds the implementation of the function which will wait for the device to be idle (all jobs completed) before proceeding. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 17 +++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 2 ++ drivers/dma/idxd/idxd_pci.c | 1 + 4 files changed, 21 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b52ea02854..e6caa048a9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -101,6 +101,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_info_get = idxd_info_get, .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, + .vchan_status = idxd_vchan_status, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index fd81418b7c..3c8cff15c0 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -163,6 +163,23 @@ get_comp_status(struct idxd_completion *c) } } +int +idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + enum rte_dma_vchan_status *status) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint16_t 
last_batch_write = idxd->batch_idx_write == 0 ? idxd->max_batches : + idxd->batch_idx_write - 1; + uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0); + + /* An IDXD device will always be either active or idle. + * RTE_DMA_VCHAN_HALTED_ERROR is therefore not supported by IDXD. + */ + *status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE; + + return 0; +} + static __rte_always_inline int batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index a85a1fb79e..50acb82d3d 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -102,5 +102,7 @@ uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, struct rte_dma_stats *stats, uint32_t stats_sz); int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); +int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, + enum rte_dma_vchan_status *status); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 9d7f0531d5..23c10c0fb0 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -140,6 +140,7 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, + .vchan_status = idxd_vchan_status, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
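The idle/active decision in `idxd_vchan_status()` depends only on the completion status of the most recently written batch slot. A minimal model of that lookup follows; the enum and function names are hypothetical, and the batch completion ring is reduced to a plain array of status bytes in which 0 means "not yet completed".

```c
#include <assert.h>
#include <stdint.h>

enum model_vchan_state { MODEL_VCHAN_IDLE, MODEL_VCHAN_ACTIVE };

/* Slot used by the last submitted batch: one behind the write index, with wrap. */
static uint16_t
last_written_slot(uint16_t batch_idx_write, uint16_t max_batches)
{
	assert(batch_idx_write <= max_batches);
	return batch_idx_write == 0 ? max_batches : batch_idx_write - 1;
}

/* Idle once the last submitted batch has any non-zero completion status. */
static enum model_vchan_state
model_query_state(const uint8_t *batch_status, uint16_t batch_idx_write,
		uint16_t max_batches)
{
	uint16_t last = last_written_slot(batch_idx_write, max_batches);

	return batch_status[last] != 0 ? MODEL_VCHAN_IDLE : MODEL_VCHAN_ACTIVE;
}
```

Because any non-zero status counts as "done", a batch that completed with an error also reports idle, which matches the comment in the patch that a halted-error state is not reported by this driver.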
* [dpdk-dev] [PATCH v8 13/16] dma/idxd: add burst capacity API 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (11 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 12/16] dma/idxd: add vchan status function Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz ` (2 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add support for the burst capacity API. This API will provide the calling application with the remaining capacity of the current burst (limited by max HW batch size). Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/dma/idxd/idxd_common.c | 21 +++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 1 + drivers/dma/idxd/idxd_pci.c | 1 + 3 files changed, 23 insertions(+) diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 3c8cff15c0..ff4647f579 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -468,6 +468,26 @@ idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t return 0; } +uint16_t +idxd_burst_capacity(const void *dev_private, uint16_t vchan __rte_unused) +{ + const struct idxd_dmadev *idxd = dev_private; + uint16_t write_idx = idxd->batch_start + idxd->batch_size; + uint16_t used_space; + + /* Check for space in the batch ring */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return 0; + + /* For descriptors, check for wrap-around on write but not read */ + if (idxd->ids_returned > write_idx) + write_idx += idxd->desc_ring_mask + 1; + 
used_space = write_idx - idxd->ids_returned; + + return RTE_MIN((idxd->desc_ring_mask - used_space), idxd->max_batch_size); +} + int idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, uint32_t conf_sz) @@ -553,6 +573,7 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->fp_obj->submit = idxd_submit; dmadev->fp_obj->completed = idxd_completed; dmadev->fp_obj->completed_status = idxd_completed_status; + dmadev->fp_obj->burst_capacity = idxd_burst_capacity; idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 50acb82d3d..3375600217 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -104,5 +104,6 @@ int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, enum rte_dma_vchan_status *status); +uint16_t idxd_burst_capacity(const void *dev_private, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 23c10c0fb0..beef3848aa 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -254,6 +254,7 @@ init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, idxd->u.pci = pci; idxd->max_batches = wq_size; + idxd->max_batch_size = 1 << lg2_max_batch; /* enable the device itself */ err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
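The descriptor-space arithmetic behind the burst capacity calculation can be checked with a few concrete numbers. The sketch below is a simplified model only: `model_burst_capacity()` is a hypothetical name, it omits the batch-ring-full check that makes the driver return 0, and it assumes the 16-bit job counters have not wrapped, so the wrap-around adjustment in the patch is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Free descriptor slots, capped at the hardware batch size.
 * Model assumption: batch_start + batch_size has not wrapped past 65535.
 */
static uint16_t
model_burst_capacity(uint16_t desc_ring_mask, uint16_t batch_start,
		uint16_t batch_size, uint16_t ids_returned, uint16_t max_batch_size)
{
	uint16_t write_idx = (uint16_t)(batch_start + batch_size);
	uint16_t used_space, free_space;

	assert(write_idx >= ids_returned);
	used_space = (uint16_t)(write_idx - ids_returned);
	assert(used_space <= desc_ring_mask);

	/* one slot is kept free, hence mask (ring size - 1) rather than ring size */
	free_space = (uint16_t)(desc_ring_mask - used_space);

	return free_space < max_batch_size ? free_space : max_batch_size;
}
```

For example, with a 1024-entry ring (mask 1023), 8 descriptors outstanding and a maximum batch size of 64, the model reports 64; with 1010 outstanding it reports the 13 slots that remain.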
* [dpdk-dev] [PATCH v8 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (12 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 13/16] dma/idxd: add burst capacity API Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 15/16] devbind: add dma device class Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Conor Walsh <conor.walsh@intel.com> Move the example script for configuring IDXD devices bound to the IDXD kernel driver from raw to dma, and create a symlink to still allow use from raw. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++++++++++++++++++++++++++ drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +----------------------------- 2 files changed, 118 insertions(+), 117 deletions(-) create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py diff --git a/drivers/dma/idxd/dpdk_idxd_cfg.py b/drivers/dma/idxd/dpdk_idxd_cfg.py new file mode 100755 index 0000000000..fcc27822ef --- /dev/null +++ b/drivers/dma/idxd/dpdk_idxd_cfg.py @@ -0,0 +1,117 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2020 Intel Corporation + +""" +Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use +""" + +import sys +import argparse +import os +import os.path + + +class SysfsDir: + "Used to read/write paths in a sysfs directory" + def __init__(self, path): + self.path = path + + def read_int(self, filename): + "Return 
a value from sysfs file" + with open(os.path.join(self.path, filename)) as f: + return int(f.readline()) + + def write_values(self, values): + "write dictionary, where key is filename and value is value to write" + for filename, contents in values.items(): + with open(os.path.join(self.path, filename), "w") as f: + f.write(str(contents)) + + +def reset_device(dsa_id): + "Reset the DSA device and all its queues" + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) + + +def get_pci_dir(pci): + "Search for the sysfs directory of the PCI device" + base_dir = '/sys/bus/pci/devices/' + for path, dirs, files in os.walk(base_dir): + for dir in dirs: + if pci in dir: + return os.path.join(base_dir, dir) + sys.exit(f"Could not find sysfs directory for device {pci}") + + +def get_dsa_id(pci): + "Get the DSA instance ID using the PCI address of the device" + pci_dir = get_pci_dir(pci) + for path, dirs, files in os.walk(pci_dir): + for dir in dirs: + if dir.startswith('dsa') and 'wq' not in dir: + return int(dir[3:]) + sys.exit(f"Could not get device ID for device {pci}") + + +def configure_dsa(dsa_id, queues, prefix): + "Configure the DSA instance with appropriate number of queues" + dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + + max_groups = dsa_dir.read_int("max_groups") + max_engines = dsa_dir.read_int("max_engines") + max_queues = dsa_dir.read_int("max_work_queues") + max_work_queues_size = dsa_dir.read_int("max_work_queues_size") + + nb_queues = min(queues, max_queues) + if queues > nb_queues: + print(f"Setting number of queues to max supported value: {max_queues}") + + # we want one engine per group, and no more engines than queues + nb_groups = min(max_engines, max_groups, nb_queues) + for grp in range(nb_groups): + dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) + + # configure each queue + for q in range(nb_queues): + wq_dir = 
SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) + wq_dir.write_values({"group_id": q % nb_groups, + "type": "user", + "mode": "dedicated", + "name": f"{prefix}_wq{dsa_id}.{q}", + "priority": 1, + "size": int(max_work_queues_size / nb_queues)}) + + # enable device and then queues + drv_dir.write_values({"bind": f"dsa{dsa_id}"}) + for q in range(nb_queues): + drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) + + +def main(args): + "Main function, does arg parsing and calls config function" + arg_p = argparse.ArgumentParser( + description="Configure whole DSA device instance for DPDK use") + arg_p.add_argument('dsa_id', + help="Specify DSA instance either via DSA instance number or PCI address") + arg_p.add_argument('-q', metavar='queues', type=int, default=255, + help="Number of queues to set up") + arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', + default="dpdk", + help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") + arg_p.add_argument('--reset', action='store_true', + help="Reset DSA device and its queues") + parsed_args = arg_p.parse_args(args[1:]) + + dsa_id = parsed_args.dsa_id + dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id + if parsed_args.reset: + reset_device(dsa_id) + else: + configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) + + +if __name__ == "__main__": + main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py deleted file mode 100755 index fcc27822ef..0000000000 --- a/drivers/raw/ioat/dpdk_idxd_cfg.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/env python3 -# SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2020 Intel Corporation - -""" -Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use -""" - -import sys -import argparse -import os -import os.path - - -class SysfsDir: - "Used to read/write paths in a sysfs directory" - def __init__(self, path): - self.path = path - - def read_int(self, filename): - "Return a 
value from sysfs file" - with open(os.path.join(self.path, filename)) as f: - return int(f.readline()) - - def write_values(self, values): - "write dictionary, where key is filename and value is value to write" - for filename, contents in values.items(): - with open(os.path.join(self.path, filename), "w") as f: - f.write(str(contents)) - - -def reset_device(dsa_id): - "Reset the DSA device and all its queues" - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) - - -def get_pci_dir(pci): - "Search for the sysfs directory of the PCI device" - base_dir = '/sys/bus/pci/devices/' - for path, dirs, files in os.walk(base_dir): - for dir in dirs: - if pci in dir: - return os.path.join(base_dir, dir) - sys.exit(f"Could not find sysfs directory for device {pci}") - - -def get_dsa_id(pci): - "Get the DSA instance ID using the PCI address of the device" - pci_dir = get_pci_dir(pci) - for path, dirs, files in os.walk(pci_dir): - for dir in dirs: - if dir.startswith('dsa') and 'wq' not in dir: - return int(dir[3:]) - sys.exit(f"Could not get device ID for device {pci}") - - -def configure_dsa(dsa_id, queues, prefix): - "Configure the DSA instance with appropriate number of queues" - dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - - max_groups = dsa_dir.read_int("max_groups") - max_engines = dsa_dir.read_int("max_engines") - max_queues = dsa_dir.read_int("max_work_queues") - max_work_queues_size = dsa_dir.read_int("max_work_queues_size") - - nb_queues = min(queues, max_queues) - if queues > nb_queues: - print(f"Setting number of queues to max supported value: {max_queues}") - - # we want one engine per group, and no more engines than queues - nb_groups = min(max_engines, max_groups, nb_queues) - for grp in range(nb_groups): - dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) - - # configure each queue - for q in range(nb_queues): - wq_dir = 
SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) - wq_dir.write_values({"group_id": q % nb_groups, - "type": "user", - "mode": "dedicated", - "name": f"{prefix}_wq{dsa_id}.{q}", - "priority": 1, - "size": int(max_work_queues_size / nb_queues)}) - - # enable device and then queues - drv_dir.write_values({"bind": f"dsa{dsa_id}"}) - for q in range(nb_queues): - drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) - - -def main(args): - "Main function, does arg parsing and calls config function" - arg_p = argparse.ArgumentParser( - description="Configure whole DSA device instance for DPDK use") - arg_p.add_argument('dsa_id', - help="Specify DSA instance either via DSA instance number or PCI address") - arg_p.add_argument('-q', metavar='queues', type=int, default=255, - help="Number of queues to set up") - arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', - default="dpdk", - help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") - arg_p.add_argument('--reset', action='store_true', - help="Reset DSA device and its queues") - parsed_args = arg_p.parse_args(args[1:]) - - dsa_id = parsed_args.dsa_id - dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id - if parsed_args.reset: - reset_device(dsa_id) - else: - configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) - - -if __name__ == "__main__": - main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py new file mode 120000 index 0000000000..85545548d1 --- /dev/null +++ b/drivers/raw/ioat/dpdk_idxd_cfg.py @@ -0,0 +1 @@ +../../dma/idxd/dpdk_idxd_cfg.py \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
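The resource split performed by configure_dsa() in the script above can be sketched without touching sysfs. This is only an illustration of the arithmetic visible in the patch: the function name `plan_dsa_layout` and the hardware limits passed in the example are assumptions, not part of the script.

```python
def plan_dsa_layout(queues, max_queues, max_groups, max_engines, max_wq_size):
    """Return (engine->group, queue->group, per-queue size) as configure_dsa() would.

    Hypothetical helper: it mirrors the script's arithmetic only and performs
    no sysfs reads or writes. Like the script, it assumes queues >= 1.
    """
    # Clamp the requested queue count to what the device supports.
    nb_queues = min(queues, max_queues)
    # One engine per group, and no more groups than queues.
    nb_groups = min(max_engines, max_groups, nb_queues)
    # Engine N is placed in group N.
    engine_groups = {engine: engine for engine in range(nb_groups)}
    # Queues are spread over the groups round-robin.
    queue_groups = {q: q % nb_groups for q in range(nb_queues)}
    # The total work-queue space is divided equally between the queues.
    wq_size = int(max_wq_size / nb_queues)
    return engine_groups, queue_groups, wq_size


if __name__ == "__main__":
    # Example limits are made up; real values come from the device's sysfs nodes.
    engines, wqs, size = plan_dsa_layout(
        queues=255, max_queues=8, max_groups=4, max_engines=4, max_wq_size=128)
    print(engines)
    print(wqs)
    print(size)
```

With the script's default of 255 requested queues, the request is always clamped to the device maximum, so by default every work queue the device offers is configured.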
* [dpdk-dev] [PATCH v8 15/16] devbind: add dma device class 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (13 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add a new class for DMA devices. Devices listed under the DMA class are to be used with the dmadev library. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- usertools/dpdk-devbind.py | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 5f0e817055..da89b87816 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -71,6 +71,7 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] +dma_devices = [] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] @@ -585,6 +586,9 @@ def show_status(): if status_dev in ["crypto", "all"]: show_device_status(crypto_devices, "Crypto") + if status_dev in ["dma", "all"]: + show_device_status(dma_devices, "DMA") + if status_dev in ["event", "all"]: show_device_status(eventdev_devices, "Eventdev") @@ -653,7 +657,7 @@ def parse_args(): parser.add_argument( '--status-dev', help="Print the status of given device group.", - choices=['baseband', 'compress', 'crypto', 
'event', + choices=['baseband', 'compress', 'crypto', 'dma', 'event', 'mempool', 'misc', 'net', 'regex']) bind_group = parser.add_mutually_exclusive_group() bind_group.add_argument( @@ -734,6 +738,7 @@ def do_arg_actions(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) @@ -756,6 +761,7 @@ def main(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v8 16/16] devbind: move idxd device ID to dmadev class 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz ` (14 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 15/16] devbind: add dma device class Kevin Laatz @ 2021-10-18 12:28 ` Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-18 12:28 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz The dmadev library is the preferred abstraction for using IDXD devices and will replace the rawdev implementation in future. This patch moves the IDXD device ID to the dmadev class. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- usertools/dpdk-devbind.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index da89b87816..ba18e2a487 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -71,13 +71,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] -dma_devices = [] +dma_devices = [intel_idxd_spr] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] misc_devices = [cnxk_bphy, cnxk_bphy_cgx, cnxk_inl_dev, intel_ioat_bdw, - intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, + intel_ioat_skx, intel_ioat_icx, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
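The effect of patches 15/16 and 16/16 on device classification can be sketched roughly as follows. This is a simplified stand-in, not the real dpdk-devbind.py code: the dictionary layout, the `find_group` helper and the second device entry are assumptions made for illustration, and only the IDXD vendor/device pair (0x8086 / 0x0B25) is taken from the series.

```python
# Simplified, hypothetical stand-ins for the device tables in dpdk-devbind.py.
# Only the IDXD vendor/device pair comes from the patches; the second entry
# is invented purely for contrast.
intel_idxd_spr = {"Vendor": "8086", "Device": "0b25"}
example_misc_dev = {"Vendor": "ffff", "Device": "0001"}

# After patch 16/16 the IDXD entry is listed in the DMA group, not in misc.
DEVICE_GROUPS = {
    "dma": [intel_idxd_spr],
    "misc": [example_misc_dev],
}


def find_group(vendor, device, groups=DEVICE_GROUPS):
    """Return the name of the first group listing this vendor/device pair."""
    for group_name, entries in groups.items():
        for entry in entries:
            if entry["Vendor"] == vendor and entry["Device"] == device:
                return group_name
    return None


if __name__ == "__main__":
    print(find_group("8086", "0b25"))
    print(find_group("ffff", "0001"))
    print(find_group("1234", "5678"))
```

Under this model, a status query restricted to the DMA group would list the IDXD device, and a query for misc devices would no longer do so.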
* [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (18 preceding siblings ...) 2021-10-18 12:28 ` [dpdk-dev] [PATCH v8 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 01/16] raw/ioat: only build if dmadev not present Kevin Laatz ` (15 more replies) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz 21 siblings, 16 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz This patchset adds a dmadev driver and associated documentation to support Intel Data Streaming Accelerator devices. This driver is intended to ultimately replace the current IDXD part of the IOAT rawdev driver. 
v9: * add missing meson check for x86 v8: * fix compilation issues of individual patches v7: * rebase on above patchsets * add meson reason for rawdev build v6: * set state of device during create * add dev_close function * documentation updates - moved generic pieces from driver doc to lib doc * other small miscellaneous fixes based on rebasing and ML feedback v5: * add missing toctree entry for idxd driver v4: * rebased on above patchsets * minor fixes based on review feedback v3: * rebased on above patchsets * added burst capacity API v2: * rebased on above patchsets * added API to check for device being idle * added devbind updates for DMA devices * fixed issue identified by internal coverity scan Bruce Richardson (1): raw/ioat: only build if dmadev not present Conor Walsh (1): dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz (14): dma/idxd: add skeleton for VFIO based DSA device dma/idxd: add bus device probing dma/idxd: create dmadev instances on bus probe dma/idxd: create dmadev instances on pci probe dma/idxd: add datapath structures dma/idxd: add configure and info_get functions dma/idxd: add start and stop functions for pci devices dma/idxd: add data-path job submission functions dma/idxd: add data-path job completion functions dma/idxd: add operation statistic tracking dma/idxd: add vchan status function dma/idxd: add burst capacity API devbind: add dma device class devbind: move idxd device ID to dmadev class MAINTAINERS | 10 + doc/guides/dmadevs/idxd.rst | 179 ++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/prog_guide/dmadev.rst | 30 ++ doc/guides/rawdevs/ioat.rst | 8 + doc/guides/rel_notes/release_21_11.rst | 5 + drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++ drivers/dma/idxd/idxd_bus.c | 377 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 612 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 131 ++++++ drivers/dma/idxd/idxd_internal.h | 109 +++++ drivers/dma/idxd/idxd_pci.c | 368 +++++++++++++++ 
drivers/dma/idxd/meson.build | 17 + drivers/dma/idxd/version.map | 3 + drivers/dma/meson.build | 2 + drivers/meson.build | 4 +- drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +---- drivers/raw/ioat/meson.build | 24 +- lib/dmadev/rte_dmadev.h | 1 + usertools/dpdk-devbind.py | 10 +- 20 files changed, 2003 insertions(+), 124 deletions(-) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py create mode 100644 drivers/dma/idxd/idxd_bus.c create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_hw_defs.h create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v9 01/16] raw/ioat: only build if dmadev not present 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz ` (14 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Bruce Richardson <bruce.richardson@intel.com> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not present. This change requires the dependencies to be reordered in drivers/meson.build so that rawdev can use the "RTE_DMA_*" build macros to check for the presence of the equivalent dmadev driver. A note is also added to the documentation to inform users of this change. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/rawdevs/ioat.rst | 8 ++++++++ drivers/meson.build | 4 ++-- drivers/raw/ioat/meson.build | 24 +++++++++++++++++++++--- 3 files changed, 31 insertions(+), 5 deletions(-) diff --git a/doc/guides/rawdevs/ioat.rst b/doc/guides/rawdevs/ioat.rst index a28e909935..a65530bd30 100644 --- a/doc/guides/rawdevs/ioat.rst +++ b/doc/guides/rawdevs/ioat.rst @@ -34,6 +34,14 @@ Compilation For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based. No additional compilation steps are necessary. +.. note:: + Since the addition of the dmadev library, the ``ioat`` and ``idxd`` parts of this driver + will only be built if their ``dmadev`` counterparts are not built.
+ The following can be used to disable the ``dmadev`` drivers, + if the raw drivers are to be used instead:: + + $ meson -Ddisable_drivers=dma/* <build_dir> + Device Setup ------------- diff --git a/drivers/meson.build b/drivers/meson.build index b7d680868a..34c0276487 100644 --- a/drivers/meson.build +++ b/drivers/meson.build @@ -10,15 +10,15 @@ subdirs = [ 'common/qat', # depends on bus. 'common/sfc_efx', # depends on bus. 'mempool', # depends on common and bus. + 'dma', # depends on common and bus. 'net', # depends on common, bus, mempool - 'raw', # depends on common, bus and net. + 'raw', # depends on common, bus, dma and net. 'crypto', # depends on common, bus and mempool (net in future). 'compress', # depends on common, bus, mempool. 'regex', # depends on common, bus, regexdev. 'vdpa', # depends on common, bus and mempool. 'event', # depends on common, bus, mempool and net. 'baseband', # depends on common and bus. - 'dma', # depends on common and bus. ] if meson.is_cross_build() diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build index 0e81cb5951..1b866aab74 100644 --- a/drivers/raw/ioat/meson.build +++ b/drivers/raw/ioat/meson.build @@ -2,14 +2,32 @@ # Copyright 2019 Intel Corporation build = dpdk_conf.has('RTE_ARCH_X86') +# only use ioat rawdev driver if we don't have the equivalent dmadev ones +if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') + build = false + reason = 'replaced by dmadev drivers' + subdir_done() +endif + reason = 'only supported on x86' sources = files( - 'idxd_bus.c', - 'idxd_pci.c', 'ioat_common.c', - 'ioat_rawdev.c', 'ioat_rawdev_test.c', ) + +if not dpdk_conf.has('RTE_DMA_IDXD') + sources += files( + 'idxd_bus.c', + 'idxd_pci.c', + ) +endif + +if not dpdk_conf.has('RTE_DMA_IOAT') + sources += files ( + 'ioat_rawdev.c', + ) +endif + deps += ['bus_pci', 'mbuf', 'rawdev'] headers = files( 'rte_ioat_rawdev.h', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
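The source selection in the drivers/raw/ioat/meson.build hunk above can be modelled as a small decision function. This is a sketch of the logic as it appears in the diff, written in Python for illustration only; the function name is an assumption and the separate x86 check is deliberately left out.

```python
def rawdev_ioat_sources(has_dma_idxd, has_dma_ioat):
    """Model which rawdev sources the patched meson.build would compile.

    Hypothetical helper: the two flags stand for dpdk_conf.has('RTE_DMA_IDXD')
    and dpdk_conf.has('RTE_DMA_IOAT'). Returns None when the whole rawdev
    driver is skipped ('replaced by dmadev drivers').
    """
    # Both dmadev drivers present: nothing left for the rawdev driver to do.
    if has_dma_idxd and has_dma_ioat:
        return None
    # Common files are always built when the driver is built at all.
    sources = ["ioat_common.c", "ioat_rawdev_test.c"]
    # The IDXD part is only built when its dmadev counterpart is absent.
    if not has_dma_idxd:
        sources += ["idxd_bus.c", "idxd_pci.c"]
    # Likewise for the IOAT part.
    if not has_dma_ioat:
        sources += ["ioat_rawdev.c"]
    return sources


if __name__ == "__main__":
    for idxd in (False, True):
        for ioat in (False, True):
            print(idxd, ioat, rawdev_ioat_sources(idxd, ioat))
```

Note that the early exit requires both macros to be set, so with only one dmadev driver built the rawdev driver is still compiled with the remaining half of its sources.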
* [dpdk-dev] [PATCH v9 02/16] dma/idxd: add skeleton for VFIO based DSA device 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 01/16] raw/ioat: only build if dmadev not present Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 03/16] dma/idxd: add bus device probing Kevin Laatz ` (13 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probe/remove skeleton code for DSA device bound to the vfio pci driver. Relevant documentation and MAINTAINERS update also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v8: fix compile issue v9: add meson check for x86 --- MAINTAINERS | 10 +++++ doc/guides/dmadevs/idxd.rst | 58 ++++++++++++++++++++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/rel_notes/release_21_11.rst | 5 +++ drivers/dma/idxd/idxd_common.c | 11 +++++ drivers/dma/idxd/idxd_internal.h | 27 ++++++++++++ drivers/dma/idxd/idxd_pci.c | 55 ++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 15 +++++++ drivers/dma/idxd/version.map | 3 ++ drivers/dma/meson.build | 2 + 10 files changed, 188 insertions(+) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 5387ffd4fc..423d8a73ce 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1200,6 +1200,16 @@ F: doc/guides/compressdevs/zlib.rst F: doc/guides/compressdevs/features/zlib.ini +DMAdev Drivers +-------------- + +Intel 
IDXD - EXPERIMENTAL +M: Bruce Richardson <bruce.richardson@intel.com> +M: Kevin Laatz <kevin.laatz@intel.com> +F: drivers/dma/idxd/ +F: doc/guides/dmadevs/idxd.rst + + RegEx Drivers ------------- diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst new file mode 100644 index 0000000000..924700d17e --- /dev/null +++ b/doc/guides/dmadevs/idxd.rst @@ -0,0 +1,58 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +.. include:: <isonum.txt> + +IDXD DMA Device Driver +====================== + +The ``idxd`` dmadev driver provides a poll-mode driver (PMD) for Intel\ |reg| +Data Streaming Accelerator `(Intel DSA) +<https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html>`_. +This PMD can be used in conjunction with Intel\ |reg| DSA devices to offload +data operations, such as data copies, to hardware, freeing up CPU cycles for +other tasks. + +Hardware Requirements +---------------------- + +The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the +presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma`` +will show all the DMA devices on the system, including IDXD supported devices. +Intel\ |reg| DSA devices currently (at time of writing) appear +as devices with type “0b25”, due to the absence of pci-id database entries for +them at this point. + +Compilation +------------ + +For builds using ``meson`` and ``ninja``, the driver will be built when the +target platform is x86-based. No additional compilation steps are necessary. + +Device Setup +------------- + +Devices using VFIO/UIO drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The HW devices to be used will need to be bound to a user-space IO driver for use. +The ``dpdk-devbind.py`` script can be used to view the state of the devices +and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``.
+For example:: + + $ dpdk-devbind.py -b vfio-pci 6a:01.0 + +Device Probing and Initialization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will +be found as part of the device scan done at application initialization time without +the need to pass parameters to the application. + +For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the +maximum number of workqueues available on it, partitioning all resources equally +among the queues. +If fewer workqueues are required, then the ``max_queues`` parameter may be passed to +the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: + + $ dpdk-test -a <b:d:f>,max_queues=4 diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst index 0bce29d766..5d4abf880e 100644 --- a/doc/guides/dmadevs/index.rst +++ b/doc/guides/dmadevs/index.rst @@ -10,3 +10,5 @@ an application through DMA API. .. toctree:: :maxdepth: 2 :numbered: + + idxd diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index d5435a64aa..f8678efa94 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -75,6 +75,11 @@ New Features operations. * Added multi-process support. +* **Added IDXD dmadev driver implementation.** + + The IDXD dmadev driver provides a device driver for Intel DSA devices. + This device driver can be used through the generic dmadev API.
+ * **Added new RSS offload types for IPv4/L4 checksum in RSS flow.** Added macros ETH_RSS_IPV4_CHKSUM and ETH_RSS_L4_CHKSUM, now IPv4 and diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c new file mode 100644 index 0000000000..e00ddbe5ef --- /dev/null +++ b/drivers/dma/idxd/idxd_common.c @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#include <rte_log.h> + +#include "idxd_internal.h" + +int idxd_pmd_logtype; + +RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h new file mode 100644 index 0000000000..c6a7dcd72f --- /dev/null +++ b/drivers/dma/idxd/idxd_internal.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_INTERNAL_H_ +#define _IDXD_INTERNAL_H_ + +/** + * @file idxd_internal.h + * + * Internal data structures for the idxd/DSA driver for dmadev + * + * @warning + * @b EXPERIMENTAL: these structures and APIs may change without prior notice + */ + +extern int idxd_pmd_logtype; + +#define IDXD_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \ + idxd_pmd_logtype, "IDXD: %s(): " fmt "\n", __func__, ##args) + +#define IDXD_PMD_DEBUG(fmt, args...) IDXD_PMD_LOG(DEBUG, fmt, ## args) +#define IDXD_PMD_INFO(fmt, args...) IDXD_PMD_LOG(INFO, fmt, ## args) +#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) +#define IDXD_PMD_WARN(fmt, args...) 
IDXD_PMD_LOG(WARNING, fmt, ## args) + +#endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c new file mode 100644 index 0000000000..79e4aadcab --- /dev/null +++ b/drivers/dma/idxd/idxd_pci.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <rte_bus_pci.h> + +#include "idxd_internal.h" + +#define IDXD_VENDOR_ID 0x8086 +#define IDXD_DEVICE_ID_SPR 0x0B25 + +#define IDXD_PMD_DMADEV_NAME_PCI dmadev_idxd_pci + +const struct rte_pci_id pci_id_idxd_map[] = { + { RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static int +idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + int ret = 0; + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + dev->device.driver = &drv->driver; + + return ret; +} + +static int +idxd_dmadev_remove_pci(struct rte_pci_device *dev) +{ + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + + IDXD_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + + return 0; +} + +struct rte_pci_driver idxd_pmd_drv_pci = { + .id_table = pci_id_idxd_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = idxd_dmadev_probe_pci, + .remove = idxd_dmadev_remove_pci, +}; + +RTE_PMD_REGISTER_PCI(IDXD_PMD_DMADEV_NAME_PCI, idxd_pmd_drv_pci); +RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_DMADEV_NAME_PCI, pci_id_idxd_map); +RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_DMADEV_NAME_PCI, "vfio-pci"); +RTE_PMD_REGISTER_PARAM_STRING(dmadev_idxd_pci, "max_queues=0"); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build new file mode 100644 index 0000000000..e984b85637 --- /dev/null +++ b/drivers/dma/idxd/meson.build @@ -0,0 +1,15 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2021 Intel Corporation + +build = 
dpdk_conf.has('RTE_ARCH_X86') +reason = 'only supported on x86' + +if is_windows + subdir_done() +endif + +deps += ['bus_pci'] +sources = files( + 'idxd_common.c', + 'idxd_pci.c' +) diff --git a/drivers/dma/idxd/version.map b/drivers/dma/idxd/version.map new file mode 100644 index 0000000000..4a76d1d52d --- /dev/null +++ b/drivers/dma/idxd/version.map @@ -0,0 +1,3 @@ +DPDK_21 { + local: *; +}; diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build index d9c7ede32f..411be7a240 100644 --- a/drivers/dma/meson.build +++ b/drivers/dma/meson.build @@ -2,5 +2,7 @@ # Copyright 2021 HiSilicon Limited drivers = [ + 'idxd', 'skeleton', ] +std_deps = ['dmadev'] -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v9 03/16] dma/idxd: add bus device probing 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 01/16] raw/ioat: only build if dmadev not present Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz ` (12 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probing for DSA devices bound to the IDXD kernel driver. These devices can be configured via sysfs and made available to DPDK if they are found during bus scan. Relevant documentation is included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 ++++++ drivers/dma/idxd/idxd_bus.c | 351 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 34 +++ drivers/dma/idxd/meson.build | 1 + lib/dmadev/rte_dmadev.h | 1 + 5 files changed, 451 insertions(+) create mode 100644 drivers/dma/idxd/idxd_bus.c diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 924700d17e..ce33e2857a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -32,6 +32,56 @@ target platform is x86-based. No additional compilation steps are necessary. Device Setup ------------- +Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers, +such as ``vfio-pci``. Both are supported by the IDXD PMD. 
+ +Intel\ |reg| DSA devices using IDXD kernel driver +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured. +The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration. + +.. note:: + The device configuration can also be done by directly interacting with the sysfs nodes. + An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py`` + included in the driver source directory. + +There are some mandatory configuration steps before being able to use a device with an application. +The internal engines, which do the copies or other operations, +and the work-queues, which are used by applications to assign work to the device, +need to be assigned to groups, and the various other configuration options, +such as priority or queue depth, need to be set for each queue. + +To assign an engine to a group:: + + $ accel-config config-engine dsa0/engine0.0 --group-id=0 + $ accel-config config-engine dsa0/engine0.1 --group-id=1 + +To assign work queues to groups for passing descriptors to the engines a similar accel-config command can be used. +However, the work queues also need to be configured depending on the use case. +Some configuration options include: + +* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously. +* priority: WQ priority between 1 and 15. Larger value means higher priority. +* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device. +* type: WQ type (kernel/mdev/user). Determines how the device is presented. +* name: identifier given to the WQ.
+ +Example configuration for a work queue:: + + $ accel-config config-wq dsa0/wq0.0 --group-id=0 \ + --mode=dedicated --priority=10 --wq-size=8 \ + --type=user --name=dpdk_app1 + +Once the devices have been configured, they need to be enabled:: + + $ accel-config enable-device dsa0 + $ accel-config enable-wq dsa0/wq0.0 + +Check the device configuration:: + + $ accel-config list + Devices using VFIO/UIO drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -56,3 +106,17 @@ If fewer workqueues are required, then the ``max_queues`` parameter may be passe the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: $ dpdk-test -a <b:d:f>,max_queues=4 + +For devices bound to the IDXD kernel driver, +the DPDK IDXD driver will automatically perform a scan for available workqueues +to use. Any workqueues found listed in ``/dev/dsa`` on the system will be checked +in ``/sys``, and any which have ``dpdk_`` prefix in their name will be automatically +probed by the driver to make them available to the application. +Alternatively, to support use by multiple DPDK processes simultaneously, +the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue +name prefix, instead of ``dpdk_``, allowing each DPDK application instance to only +use a subset of configured queues. + +Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, +that is a "DMA device type" inside DPDK, and can be accessed using APIs from the +``rte_dmadev`` library. 
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c new file mode 100644 index 0000000000..ef589af30e --- /dev/null +++ b/drivers/dma/idxd/idxd_bus.c @@ -0,0 +1,351 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <dirent.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/mman.h> +#include <libgen.h> + +#include <rte_bus.h> +#include <rte_log.h> +#include <rte_dmadev_pmd.h> +#include <rte_string_fns.h> + +#include "idxd_internal.h" + +/* default value for DSA paths, but allow override in environment for testing */ +#define DSA_DEV_PATH "/dev/dsa" +#define DSA_SYSFS_PATH "/sys/bus/dsa/devices" + +static unsigned int devcount; + +/** unique identifier for a DSA device/WQ instance */ +struct dsa_wq_addr { + uint16_t device_id; + uint16_t wq_id; +}; + +/** a DSA device instance */ +struct rte_dsa_device { + struct rte_device device; /**< Inherit core device */ + TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */ + + char wq_name[32]; /**< the workqueue name/number e.g. 
wq0.1 */ + struct dsa_wq_addr addr; /**< Identifies the specific WQ */ +}; + +/* forward prototypes */ +struct dsa_bus; +static int dsa_scan(void); +static int dsa_probe(void); +static struct rte_device *dsa_find_device(const struct rte_device *start, + rte_dev_cmp_t cmp, const void *data); +static enum rte_iova_mode dsa_get_iommu_class(void); +static int dsa_addr_parse(const char *name, void *addr); + +/** List of devices */ +TAILQ_HEAD(dsa_device_list, rte_dsa_device); + +/** + * Structure describing the DSA bus + */ +struct dsa_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_driver driver; /**< Driver struct for devices to point to */ + struct dsa_device_list device_list; /**< List of PCI devices */ +}; + +struct dsa_bus dsa_bus = { + .bus = { + .scan = dsa_scan, + .probe = dsa_probe, + .find_device = dsa_find_device, + .get_iommu_class = dsa_get_iommu_class, + .parse = dsa_addr_parse, + }, + .driver = { + .name = "dmadev_idxd" + }, + .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list), +}; + +static inline const char * +dsa_get_dev_path(void) +{ + const char *path = getenv("DSA_DEV_PATH"); + return path ? path : DSA_DEV_PATH; +} + +static inline const char * +dsa_get_sysfs_path(void) +{ + const char *path = getenv("DSA_SYSFS_PATH"); + return path ? 
path : DSA_SYSFS_PATH; +} + +static void * +idxd_bus_mmap_wq(struct rte_dsa_device *dev) +{ + void *addr; + char path[PATH_MAX]; + int fd; + + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); + fd = open(path, O_RDWR); + if (fd < 0) { + IDXD_PMD_ERR("Failed to open device path: %s", path); + return NULL; + } + + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + IDXD_PMD_ERR("Failed to mmap device %s", path); + return NULL; + } + + return addr; +} + +static int +read_wq_string(struct rte_dsa_device *dev, const char *filename, + char *value, size_t valuelen) +{ + char sysfs_node[PATH_MAX]; + int len; + int fd; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + fd = open(sysfs_node, O_RDONLY); + if (fd < 0) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + len = read(fd, value, valuelen - 1); + close(fd); + if (len < 0) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + value[len] = '\0'; + return 0; +} + +static int +read_wq_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +read_device_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s", + 
dsa_get_sysfs_path(), dev->addr.device_id, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +idxd_probe_dsa(struct rte_dsa_device *dev) +{ + struct idxd_dmadev idxd = {0}; + int ret = 0; + + IDXD_PMD_INFO("Probing device %s on numa node %d", + dev->wq_name, dev->device.numa_node); + if (read_wq_int(dev, "size", &ret) < 0) + return -1; + idxd.max_batches = ret; + if (read_wq_int(dev, "max_batch_size", &ret) < 0) + return -1; + idxd.max_batch_size = ret; + idxd.qid = dev->addr.wq_id; + idxd.sva_support = 1; + + idxd.portal = idxd_bus_mmap_wq(dev); + if (idxd.portal == NULL) { + IDXD_PMD_ERR("WQ mmap failed"); + return -ENOENT; + } + + return 0; +} + +static int +is_for_this_process_use(const char *name) +{ + char *runtime_dir = strdup(rte_eal_get_runtime_dir()); + char *prefix = basename(runtime_dir); + int prefixlen = strlen(prefix); + int retval = 0; + + if (strncmp(name, "dpdk_", 5) == 0) + retval = 1; + if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_') + retval = 1; + + free(runtime_dir); + return retval; +} + +static int +dsa_probe(void) +{ + struct rte_dsa_device *dev; + + TAILQ_FOREACH(dev, &dsa_bus.device_list, next) { + char type[64], name[64]; + + if (read_wq_string(dev, "type", type, sizeof(type)) < 0 || + read_wq_string(dev, "name", name, sizeof(name)) < 0) + continue; + + if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) { + dev->device.driver = &dsa_bus.driver; + idxd_probe_dsa(dev); + continue; + } + IDXD_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name); + } + + return 0; +} + +static int +dsa_scan(void) +{ + const char *path = dsa_get_dev_path(); + struct dirent *wq; + DIR *dev_dir; + + dev_dir 
= opendir(path); + if (dev_dir == NULL) { + if (errno == ENOENT) + return 0; /* no bus, return without error */ + IDXD_PMD_ERR("%s(): opendir '%s' failed: %s", + __func__, path, strerror(errno)); + return -1; + } + + while ((wq = readdir(dev_dir)) != NULL) { + struct rte_dsa_device *dev; + int numa_node = -1; + + if (strncmp(wq->d_name, "wq", 2) != 0) + continue; + if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) { + IDXD_PMD_ERR("%s(): wq name too long: '%s', skipping", + __func__, wq->d_name); + continue; + } + IDXD_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name); + + dev = malloc(sizeof(*dev)); + if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) { + IDXD_PMD_ERR("Error parsing WQ name: %s", wq->d_name); + free(dev); + continue; + } + dev->device.bus = &dsa_bus.bus; + strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name)); + TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next); + devcount++; + + read_device_int(dev, "numa_node", &numa_node); + dev->device.numa_node = numa_node; + dev->device.name = dev->wq_name; + } + + closedir(dev_dir); + return 0; +} + +static struct rte_device * +dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list); + + /* the rte_device struct must be at start of dsa structure */ + RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0); + + if (start != NULL) /* jump to start point if given */ + dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next); + while (dev != NULL) { + if (cmp(&dev->device, data) == 0) + return &dev->device; + dev = TAILQ_NEXT(dev, next); + } + return NULL; +} + +static enum rte_iova_mode +dsa_get_iommu_class(void) +{ + /* if there are no devices, report don't care, otherwise VA mode */ + return devcount > 0 ? 
RTE_IOVA_VA : RTE_IOVA_DC; +} + +static int +dsa_addr_parse(const char *name, void *addr) +{ + struct dsa_wq_addr *wq = addr; + unsigned int device_id, wq_id; + + if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) { + IDXD_PMD_DEBUG("Parsing WQ name failed: %s", name); + return -1; + } + + wq->device_id = device_id; + wq->wq_id = wq_id; + return 0; +} + +RTE_REGISTER_BUS(dsa, dsa_bus.bus); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index c6a7dcd72f..b8a7d7dab6 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -24,4 +24,38 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_dmadev { + /* counters to track the batches */ + unsigned short max_batches; + unsigned short batch_idx_read; + unsigned short batch_idx_write; + + /* track descriptors and handles */ + unsigned short desc_ring_mask; + unsigned short ids_avail; /* handles for ops completed */ + unsigned short ids_returned; /* the read pointer for hdls/desc rings */ + unsigned short batch_start; /* start+size == write pointer for hdls/desc */ + unsigned short batch_size; + + void *portal; /* address to write the batch descriptor */ + + struct idxd_completion *batch_comp_ring; + unsigned short *batch_idx_ring; /* store where each batch ends */ + + rte_iova_t batch_iova; /* base address of the batch comp ring */ + rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ + + unsigned short max_batch_size; + + struct rte_dma_dev *dmadev; + uint8_t sva_support; + uint8_t qid; + + union { + struct { + unsigned int dsa_id; + } bus; + } u; +}; + #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index e984b85637..cc8450a096 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -10,6 +10,7 @@ endif deps += ['bus_pci'] 
sources = files( + 'idxd_bus.c', 'idxd_common.c', 'idxd_pci.c' ) diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h index f5d23017b1..7394f0932b 100644 --- a/lib/dmadev/rte_dmadev.h +++ b/lib/dmadev/rte_dmadev.h @@ -149,6 +149,7 @@ #include <rte_bitops.h> #include <rte_common.h> #include <rte_compat.h> +#include <rte_eal.h> #include <rte_dev.h> #ifdef __cplusplus -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
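For readers following the bus scan/probe logic in the patch above, here is a minimal standalone sketch of its two string checks: parsing a "wq<device>.<queue>" name as dsa_addr_parse() does, and the name-prefix test from is_for_this_process_use(). The names struct wq_addr, wq_addr_parse() and wq_name_matches() are hypothetical stand-ins, not part of the driver, and the prefix is passed in directly here rather than derived from the EAL runtime directory.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the patch's struct dsa_wq_addr. */
struct wq_addr {
	uint16_t device_id;
	uint16_t wq_id;
};

/* Parse "wq<dev>.<wq>", e.g. "wq0.1". Returns 0 on success, -1 otherwise. */
static int
wq_addr_parse(const char *name, struct wq_addr *addr)
{
	unsigned int device_id, wq_id;

	assert(name != NULL && addr != NULL);
	if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2)
		return -1;

	addr->device_id = (uint16_t)device_id;
	addr->wq_id = (uint16_t)wq_id;
	return 0;
}

/*
 * Return 1 if the configured WQ name starts with "dpdk_" or with
 * "<prefix>_", where prefix plays the role of the --file-prefix value.
 */
static int
wq_name_matches(const char *name, const char *prefix)
{
	size_t prefixlen = strlen(prefix);

	if (strncmp(name, "dpdk_", 5) == 0)
		return 1;
	if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_')
		return 1;
	return 0;
}
```

With this sketch, a WQ named "dpdk_app1" matches for any process, while "app2_wq" matches only when the prefix is "app2", which is how the patch lets several DPDK processes each claim a subset of the configured queues.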
* [dpdk-dev] [PATCH v9 04/16] dma/idxd: create dmadev instances on bus probe 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (2 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 03/16] dma/idxd: add bus device probing Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz ` (11 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the bus scan/probe, create a dmadev instance for each HW queue. Internal structures required for device creation are also added. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 19 ++++++++++ drivers/dma/idxd/idxd_common.c | 61 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 27 ++++++++++++++ drivers/dma/idxd/idxd_internal.h | 7 ++++ 4 files changed, 114 insertions(+) create mode 100644 drivers/dma/idxd/idxd_hw_defs.h diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index ef589af30e..b48fa954ed 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -85,6 +85,18 @@ dsa_get_sysfs_path(void) return path ? 
path : DSA_SYSFS_PATH; } +static int +idxd_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->data->dev_private; + munmap(idxd->portal, 0x1000); + return 0; +} + +static const struct rte_dma_dev_ops idxd_bus_ops = { + .dev_close = idxd_dev_close, +}; + static void * idxd_bus_mmap_wq(struct rte_dsa_device *dev) { @@ -206,6 +218,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -1; idxd.max_batch_size = ret; idxd.qid = dev->addr.wq_id; + idxd.u.bus.dsa_id = dev->addr.device_id; idxd.sva_support = 1; idxd.portal = idxd_bus_mmap_wq(dev); @@ -214,6 +227,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -ENOENT; } + ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_bus_ops); + if (ret) { + IDXD_PMD_ERR("Failed to create dmadev %s", dev->wq_name); + return ret; + } + return 0; } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index e00ddbe5ef..5abff34292 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,10 +2,71 @@ * Copyright 2021 Intel Corporation */ +#include <rte_malloc.h> +#include <rte_common.h> #include <rte_log.h> #include "idxd_internal.h" +#define IDXD_PMD_NAME_STR "dmadev_idxd" + +int +idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, + const struct rte_dma_dev_ops *ops) +{ + struct idxd_dmadev *idxd = NULL; + struct rte_dma_dev *dmadev = NULL; + int ret = 0; + + if (!name) { + IDXD_PMD_ERR("Invalid name of the device!"); + ret = -EINVAL; + goto cleanup; + } + + /* Allocate device structure */ + dmadev = rte_dma_pmd_allocate(name, dev->numa_node, sizeof(struct idxd_dmadev)); + if (dmadev == NULL) { + IDXD_PMD_ERR("Unable to allocate dma device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->dev_ops = ops; + dmadev->device = dev; + + idxd = dmadev->data->dev_private; + *idxd = *base_idxd; /* copy over the main fields already passed in */ + idxd->dmadev = dmadev; + + /* allocate batch index ring and
completion ring. + * The +1 is because we can never fully use + * the ring, otherwise read == write means both full and empty. + */ + idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) + + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1), + sizeof(idxd->batch_comp_ring[0])); + if (idxd->batch_comp_ring == NULL) { + IDXD_PMD_ERR("Unable to reserve memory for batch data\n"); + ret = -ENOMEM; + goto cleanup; + } + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1]; + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring); + + dmadev->fp_obj->dev_private = idxd; + + idxd->dmadev->state = RTE_DMA_DEV_READY; + + return 0; + +cleanup: + if (dmadev) + rte_dma_pmd_release(name); + + return ret; +} + int idxd_pmd_logtype; RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h new file mode 100644 index 0000000000..a92d462d01 --- /dev/null +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_HW_DEFS_H_ +#define _IDXD_HW_DEFS_H_ + +#define IDXD_COMP_STATUS_INCOMPLETE 0 +#define IDXD_COMP_STATUS_SUCCESS 1 +#define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 +#define IDXD_COMP_STATUS_INVALID_SIZE 0x13 +#define IDXD_COMP_STATUS_SKIPPED 0xFF /* not official IDXD error, needed as placeholder */ + +/** + * Completion record structure written back by DSA + */ +struct idxd_completion { + uint8_t status; + uint8_t result; + /* 16-bits pad here */ + uint32_t completed_size; /* data length, or descriptors for batch */ + + rte_iova_t fault_address; + uint32_t invalid_flags; +} __rte_aligned(32); + +#endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index b8a7d7dab6..8f1cdf6102 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -5,6 +5,10 @@ #ifndef _IDXD_INTERNAL_H_ #define 
_IDXD_INTERNAL_H_ +#include <rte_dmadev_pmd.h> + +#include "idxd_hw_defs.h" + /** * @file idxd_internal.h * @@ -58,4 +62,7 @@ struct idxd_dmadev { } u; }; +int idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); + #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
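The comment in idxd_dmadev_create() above explains that the batch rings are allocated with max_batches + 1 entries because read == write would otherwise mean both full and empty. A minimal index-only sketch of that convention follows. It is an illustration under assumptions, not the driver's datapath: struct batch_ring and the batch_ring_*() helpers are hypothetical names, and the modulo wrap-around is just one simple way to advance the indices.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical index-only ring; slots is max_batches + 1. */
struct batch_ring {
	uint16_t slots;
	uint16_t rd; /* read index */
	uint16_t wr; /* write index */
};

static void
batch_ring_init(struct batch_ring *r, uint16_t max_batches)
{
	assert(max_batches > 0 && max_batches < UINT16_MAX);
	r->slots = (uint16_t)(max_batches + 1);
	r->rd = 0;
	r->wr = 0;
}

/* Empty: read and write indices coincide. */
static int
batch_ring_empty(const struct batch_ring *r)
{
	return r->rd == r->wr;
}

/* Full: advancing the write index would make it equal to the read index. */
static int
batch_ring_full(const struct batch_ring *r)
{
	return (uint16_t)((r->wr + 1) % r->slots) == r->rd;
}

/* Claim one slot. Returns 0 on success, -1 if the ring is full. */
static int
batch_ring_push(struct batch_ring *r)
{
	if (batch_ring_full(r))
		return -1;
	r->wr = (uint16_t)((r->wr + 1) % r->slots);
	return 0;
}

/* Release one slot. Returns 0 on success, -1 if the ring is empty. */
static int
batch_ring_pop(struct batch_ring *r)
{
	if (batch_ring_empty(r))
		return -1;
	r->rd = (uint16_t)((r->rd + 1) % r->slots);
	return 0;
}
```

One slot always stays unused, so a ring initialised with max_batches = 3 accepts exactly three outstanding batches before reporting full, and the empty and full states remain distinguishable without a separate counter.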
* [dpdk-dev] [PATCH v9 05/16] dma/idxd: create dmadev instances on pci probe 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (3 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 06/16] dma/idxd: add datapath structures Kevin Laatz ` (10 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the PCI probe, create a dmadev instance for each HW queue. HW definitions required are also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_hw_defs.h | 63 ++++++++ drivers/dma/idxd/idxd_internal.h | 13 ++ drivers/dma/idxd/idxd_pci.c | 259 ++++++++++++++++++++++++++++++- 3 files changed, 332 insertions(+), 3 deletions(-) diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index a92d462d01..86f7f3526b 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -24,4 +24,67 @@ struct idxd_completion { uint32_t invalid_flags; } __rte_aligned(32); +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ + +#define IDXD_CMD_SHIFT 20 +enum rte_idxd_cmds { + idxd_enable_dev = 1, + idxd_disable_dev, + idxd_drain_all, + idxd_abort_all, + idxd_reset_device, + idxd_enable_wq, + idxd_disable_wq, + idxd_drain_wq, + idxd_abort_wq, + idxd_reset_wq, +}; + +/* General bar0 registers */ +struct rte_idxd_bar0 { + uint32_t __rte_cache_aligned version; /* offset 0x00 */ + uint64_t __rte_aligned(0x10) gencap; /* offset 0x10 */ + uint64_t __rte_aligned(0x10) wqcap; /* offset 0x20 */ + uint64_t 
__rte_aligned(0x10) grpcap; /* offset 0x30 */ + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ +}; + +/* workqueue config is provided by array of uint32_t. */ +enum rte_idxd_wqcfg { + wq_size_idx, /* size is in first 32-bit value */ + wq_threshold_idx, /* WQ threshold second 32-bits */ + wq_mode_idx, /* WQ mode and other flags */ + wq_sizes_idx, /* WQ transfer and batch sizes */ + wq_occ_int_idx, /* WQ occupancy interrupt handle */ + wq_occ_limit_idx, /* WQ occupancy limit */ + wq_state_idx, /* WQ state and occupancy state */ +}; + +#define WQ_MODE_SHARED 0 +#define WQ_MODE_DEDICATED 1 +#define WQ_PRIORITY_SHIFT 4 +#define WQ_BATCH_SZ_SHIFT 5 +#define WQ_STATE_SHIFT 30 +#define WQ_STATE_MASK 0x3 + +struct rte_idxd_grpcfg { + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ + uint64_t grpengcfg; /* offset 32 */ + uint32_t grpflags; /* offset 40 */ +}; + +#define GENSTS_DEV_STATE_MASK 0x03 +#define CMDSTATUS_ACTIVE_SHIFT 31 +#define CMDSTATUS_ACTIVE_MASK (1 << 31) +#define CMDSTATUS_ERR_MASK 0xFF + #endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 8f1cdf6102..8473bf939f 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -6,6 +6,7 @@ #define _IDXD_INTERNAL_H_ #include <rte_dmadev_pmd.h> +#include <rte_spinlock.h> #include "idxd_hw_defs.h" @@ -28,6 +29,16 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) 
IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_pci_common { + rte_spinlock_t lk; + + uint8_t wq_cfg_sz; + volatile struct rte_idxd_bar0 *regs; + volatile uint32_t *wq_regs_base; + volatile struct rte_idxd_grpcfg *grp_regs; + volatile void *portals; +}; + struct idxd_dmadev { /* counters to track the batches */ unsigned short max_batches; @@ -59,6 +70,8 @@ struct idxd_dmadev { struct { unsigned int dsa_id; } bus; + + struct idxd_pci_common *pci; } u; }; diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 79e4aadcab..7127483b10 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -3,6 +3,9 @@ */ #include <rte_bus_pci.h> +#include <rte_devargs.h> +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> #include "idxd_internal.h" @@ -16,17 +19,267 @@ const struct rte_pci_id pci_id_idxd_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static inline int +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) +{ + uint8_t err_code; + uint16_t qid = idxd->qid; + int i = 0; + + if (command >= idxd_disable_wq && command <= idxd_reset_wq) + qid = (1 << qid); + rte_spinlock_lock(&idxd->u.pci->lk); + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; + + do { + rte_pause(); + err_code = idxd->u.pci->regs->cmdstatus; + if (++i >= 1000) { + IDXD_PMD_ERR("Timeout waiting for command response from HW"); + rte_spinlock_unlock(&idxd->u.pci->lk); + return err_code; + } + } while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK); + rte_spinlock_unlock(&idxd->u.pci->lk); + + return err_code & CMDSTATUS_ERR_MASK; +} + +static uint32_t * +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) +{ + return RTE_PTR_ADD(pci->wq_regs_base, + (uintptr_t)wq_idx << (5 + pci->wq_cfg_sz)); +} + +static int +idxd_is_wq_enabled(struct idxd_dmadev *idxd) +{ + uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx]; + return ((state 
>> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; +} + +static int +idxd_pci_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + /* disable the device */ + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); + if (err_code) { + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device disabled OK"); + + /* free device memory */ + IDXD_PMD_DEBUG("Freeing device driver memory"); + rte_free(idxd->batch_idx_ring); + + return 0; +} + +static const struct rte_dma_dev_ops idxd_pci_ops = { + .dev_close = idxd_pci_dev_close, +}; + +/* each portal uses 4 x 4k pages */ +#define IDXD_PORTAL_SIZE (4096 * 4) + +static int +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, + unsigned int max_queues) +{ + struct idxd_pci_common *pci; + uint8_t nb_groups, nb_engines, nb_wqs; + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ + uint16_t wq_size, total_wq_size; + uint8_t lg2_max_batch, lg2_max_copy_size; + unsigned int i, err_code; + + pci = malloc(sizeof(*pci)); + if (pci == NULL) { + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); + goto err; + } + rte_spinlock_init(&pci->lk); + + /* assign the bar registers, and then configure device */ + pci->regs = dev->mem_resource[0].addr; + grp_offset = (uint16_t)pci->regs->offsets[0]; + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); + pci->portals = dev->mem_resource[2].addr; + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; + + /* sanity check device status */ + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { + /* need function-level-reset (FLR) or is enabled */ + IDXD_PMD_ERR("Device status is not disabled, cannot init"); + goto err; + } + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { + /* command in progress */ + IDXD_PMD_ERR("Device has 
a command in progress, cannot init"); + goto err; + } + + /* read basic info about the hardware for use when configuring */ + nb_groups = (uint8_t)pci->regs->grpcap; + nb_engines = (uint8_t)pci->regs->engcap; + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); + total_wq_size = (uint16_t)pci->regs->wqcap; + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F; + lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F; + + IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u", + nb_groups, nb_engines, nb_wqs); + + /* zero out any old config */ + for (i = 0; i < nb_groups; i++) { + pci->grp_regs[i].grpengcfg = 0; + pci->grp_regs[i].grpwqcfg[0] = 0; + } + for (i = 0; i < nb_wqs; i++) + idxd_get_wq_cfg(pci, i)[0] = 0; + + /* limit queues if necessary */ + if (max_queues != 0 && nb_wqs > max_queues) { + nb_wqs = max_queues; + if (nb_engines > max_queues) + nb_engines = max_queues; + if (nb_groups > max_queues) + nb_engines = max_queues; + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); + } + + /* put each engine into a separate group to avoid reordering */ + if (nb_groups > nb_engines) + nb_groups = nb_engines; + if (nb_groups < nb_engines) + nb_engines = nb_groups; + + /* assign engines to groups, round-robin style */ + for (i = 0; i < nb_engines; i++) { + IDXD_PMD_DEBUG("Assigning engine %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); + } + + /* now do the same for queues and give work slots to each queue */ + wq_size = total_wq_size / nb_wqs; + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", + wq_size, lg2_max_batch, lg2_max_copy_size); + for (i = 0; i < nb_wqs; i++) { + /* add engine "i" to a group */ + IDXD_PMD_DEBUG("Assigning work queue %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); + /* now configure it, in terms of size, max batch, mode */ + idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size; + idxd_get_wq_cfg(pci, i)[wq_mode_idx] 
= (1 << WQ_PRIORITY_SHIFT) | + WQ_MODE_DEDICATED; + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); + } + + /* dump the group configuration to output */ + for (i = 0; i < nb_groups; i++) { + IDXD_PMD_DEBUG("## Group %d", i); + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); + IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg); + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags); + } + + idxd->u.pci = pci; + idxd->max_batches = wq_size; + + /* enable the device itself */ + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); + if (err_code) { + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device enabled OK"); + + return nb_wqs; + +err: + free(pci); + return -1; +} + static int idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) { - int ret = 0; + struct idxd_dmadev idxd = {0}; + uint8_t nb_wqs; + int qid, ret = 0; char name[PCI_PRI_STR_SIZE]; + unsigned int max_queues = 0; rte_pci_device_name(&dev->addr, name, sizeof(name)); IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); dev->device.driver = &drv->driver; - return ret; + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { + /* if the number of devargs grows beyond just 1, use rte_kvargs */ + if (sscanf(dev->device.devargs->args, + "max_queues=%u", &max_queues) != 1) { + IDXD_PMD_ERR("Invalid device parameter: '%s'", + dev->device.devargs->args); + return -1; + } + } + + ret = init_pci_device(dev, &idxd, max_queues); + if (ret < 0) { + IDXD_PMD_ERR("Error initializing PCI hardware"); + return ret; + } + if (idxd.u.pci->portals == NULL) { + IDXD_PMD_ERR("Error, invalid portal assigned during initialization\n"); + return -EINVAL; + } + nb_wqs = (uint8_t)ret; + + /* set up one device for each queue */ + for (qid = 0; qid < nb_wqs; qid++) { + char qname[32]; + + /* add the queue number to each 
device name */ + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); + idxd.qid = qid; + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, + qid * IDXD_PORTAL_SIZE); + if (idxd_is_wq_enabled(&idxd)) + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); + ret = idxd_dmadev_create(qname, &dev->device, + &idxd, &idxd_pci_ops); + if (ret != 0) { + IDXD_PMD_ERR("Failed to create dmadev %s", name); + if (qid == 0) /* if no devices using this, free pci */ + free(idxd.u.pci); + return ret; + } + } + + return 0; +} + +static int +idxd_dmadev_destroy(const char *name) +{ + int ret; + + /* rte_dma_close is called by pmd_release */ + ret = rte_dma_pmd_release(name); + if (ret) + IDXD_PMD_DEBUG("Device cleanup failed"); + + return 0; } static int @@ -39,7 +292,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev) IDXD_PMD_INFO("Closing %s on NUMA node %d", name, dev->device.numa_node); - return 0; + return idxd_dmadev_destroy(name); } struct rte_pci_driver idxd_pmd_drv_pci = { -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
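Two pieces of arithmetic in init_pci_device() and idxd_get_wq_cfg() above may be easier to follow in isolation: the round-robin assignment of engines (and work queues) to groups via bitmasks, and the byte offset of a work queue's config block. The sketch below uses plain arrays instead of the BAR0 registers; assign_round_robin() and wq_cfg_offset() are hypothetical names, and the reading that each WQ config block spans 32 << wq_cfg_sz bytes is inferred only from the shift used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Set bit i in the mask of group (i % nb_groups) for each item i,
 * mirroring the "grpengcfg |= (1ULL << i)" loop in the patch.
 * The caller provides nb_groups zero-initialised masks.
 */
static void
assign_round_robin(uint64_t *group_masks, unsigned int nb_groups,
		unsigned int nb_items)
{
	unsigned int i;

	assert(group_masks != NULL && nb_groups > 0 && nb_items <= 64);
	for (i = 0; i < nb_items; i++)
		group_masks[i % nb_groups] |= (1ULL << i);
}

/*
 * Byte offset of work queue wq_idx from the WQ register base,
 * using the same shift as idxd_get_wq_cfg() in the patch.
 */
static size_t
wq_cfg_offset(uint8_t wq_idx, uint8_t wq_cfg_sz)
{
	return (size_t)wq_idx << (5 + wq_cfg_sz);
}
```

For example, spreading four engines over two groups gives group 0 the mask 0x5 (engines 0 and 2) and group 1 the mask 0xA (engines 1 and 3). Because the patch first trims nb_groups and nb_engines to the same value, the real driver ends up with one engine per group, which is what keeps operations within a queue from being reordered.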
* [dpdk-dev] [PATCH v9 06/16] dma/idxd: add datapath structures 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (4 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 07/16] dma/idxd: add configure and info_get functions Kevin Laatz ` (9 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data structures required for the data path for IDXD devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 33 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 41 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 4 ++++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 81 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b48fa954ed..3c0837ec52 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -95,6 +95,7 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, + .dev_dump = idxd_dump, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 5abff34292..f972260a56 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -10,6 +10,35 @@ #define IDXD_PMD_NAME_STR "dmadev_idxd" +int +idxd_dump(const struct rte_dma_dev *dev, FILE *f) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + unsigned int i; + + fprintf(f, "== IDXD Private Data ==\n"); + fprintf(f, " Portal: %p\n", idxd->portal); + fprintf(f, " 
Config: { ring_size: %u }\n", + idxd->qcfg.nb_desc); + fprintf(f, " Batch ring (sz = %u, max_batches = %u):\n\t", + idxd->max_batches + 1, idxd->max_batches); + for (i = 0; i <= idxd->max_batches; i++) { + fprintf(f, " %u ", idxd->batch_idx_ring[i]); + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) + fprintf(f, "[rd ptr, wr ptr] "); + else if (i == idxd->batch_idx_read) + fprintf(f, "[rd ptr] "); + else if (i == idxd->batch_idx_write) + fprintf(f, "[wr ptr] "); + if (i == idxd->max_batches) + fprintf(f, "\n"); + } + + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, @@ -19,6 +48,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, struct rte_dma_dev *dmadev = NULL; int ret = 0; + RTE_BUILD_BUG_ON(sizeof(struct idxd_hw_desc) != 64); + RTE_BUILD_BUG_ON(offsetof(struct idxd_hw_desc, size) != 32); + RTE_BUILD_BUG_ON(sizeof(struct idxd_completion) != 32); + if (!name) { IDXD_PMD_ERR("Invalid name of the device!"); ret = -EINVAL; diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index 86f7f3526b..55ca9f7f52 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -5,6 +5,47 @@ #ifndef _IDXD_HW_DEFS_H_ #define _IDXD_HW_DEFS_H_ +/* + * Defines used in the data path for interacting with IDXD hardware. + */ +#define IDXD_CMD_OP_SHIFT 24 +enum rte_idxd_ops { + idxd_op_nop = 0, + idxd_op_batch, + idxd_op_drain, + idxd_op_memmove, + idxd_op_fill +}; + +#define IDXD_FLAG_FENCE (1 << 0) +#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2) +#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3) +#define IDXD_FLAG_CACHE_CONTROL (1 << 8) + +/** + * Hardware descriptor used by DSA hardware, for both bursts and + * for individual operations. 
+ */ +struct idxd_hw_desc { + uint32_t pasid; + uint32_t op_flags; + rte_iova_t completion; + + RTE_STD_C11 + union { + rte_iova_t src; /* source address for copy ops etc. */ + rte_iova_t desc_addr; /* descriptor pointer for batch */ + }; + rte_iova_t dst; + + uint32_t size; /* length of data for op, or batch size */ + + uint16_t intr_handle; /* completion interrupt handle */ + + /* remaining 26 bytes are reserved */ + uint16_t __reserved[13]; +} __rte_aligned(64); + #define IDXD_COMP_STATUS_INCOMPLETE 0 #define IDXD_COMP_STATUS_SUCCESS 1 #define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 8473bf939f..5e253fdfbc 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -40,6 +40,8 @@ struct idxd_pci_common { }; struct idxd_dmadev { + struct idxd_hw_desc *desc_ring; + /* counters to track the batches */ unsigned short max_batches; unsigned short batch_idx_read; @@ -63,6 +65,7 @@ struct idxd_dmadev { unsigned short max_batch_size; struct rte_dma_dev *dmadev; + struct rte_dma_vchan_conf qcfg; uint8_t sva_support; uint8_t qid; @@ -77,5 +80,6 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); +int idxd_dump(const struct rte_dma_dev *dev, FILE *f); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 7127483b10..96c8c65cc0 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -76,12 +76,14 @@ idxd_pci_dev_close(struct rte_dma_dev *dev) /* free device memory */ IDXD_PMD_DEBUG("Freeing device driver memory"); rte_free(idxd->batch_idx_ring); + rte_free(idxd->desc_ring); return 0; } static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_close = idxd_pci_dev_close, + .dev_dump = idxd_dump, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ 
messages in thread
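As a reading aid for the descriptor layout in this patch, here is a minimal standalone sketch of the same 64-byte layout using plain C types. The `demo_` names are hypothetical, `demo_iova_t` assumes a 64-bit `rte_iova_t`, and the `aligned` attribute assumes GCC or Clang. It shows why the build-time checks hold: `size` lands at byte 32, which later allows the enqueue path to fill a descriptor with two 32-byte stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_iova_t, assumed to be 64 bits wide. */
typedef uint64_t demo_iova_t;

#define DEMO_CMD_OP_SHIFT 24               /* mirrors IDXD_CMD_OP_SHIFT */
#define DEMO_FLAG_CACHE_CONTROL (1u << 8)  /* mirrors IDXD_FLAG_CACHE_CONTROL */
#define DEMO_OP_MEMMOVE 3u                 /* idxd_op_memmove: nop=0, batch=1, drain=2, memmove=3 */

/* Same field order as struct idxd_hw_desc in the patch. */
struct demo_hw_desc {
	uint32_t pasid;          /* bytes 0-3 */
	uint32_t op_flags;       /* bytes 4-7 */
	demo_iova_t completion;  /* bytes 8-15 */
	union {
		demo_iova_t src;       /* source address for copy ops */
		demo_iova_t desc_addr; /* descriptor pointer for batch */
	};                       /* bytes 16-23 */
	demo_iova_t dst;         /* bytes 24-31 */
	uint32_t size;           /* bytes 32-35 */
	uint16_t intr_handle;    /* bytes 36-37 */
	uint16_t reserved[13];   /* bytes 38-63 */
} __attribute__((aligned(64)));

/* Same invariants the driver checks with RTE_BUILD_BUG_ON. */
_Static_assert(sizeof(struct demo_hw_desc) == 64, "descriptor must be 64 bytes");
_Static_assert(offsetof(struct demo_hw_desc, size) == 32, "size must start the second half");

/* Opcode goes in the top byte of op_flags, flags in the low bits. */
static inline uint32_t
demo_memmove_op_flags(void)
{
	return (DEMO_OP_MEMMOVE << DEMO_CMD_OP_SHIFT) | DEMO_FLAG_CACHE_CONTROL;
}
```

With these definitions, a memmove descriptor with cache control set carries `op_flags` of `0x03000100`.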
* [dpdk-dev] [PATCH v9 07/16] dma/idxd: add configure and info_get functions 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (5 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 06/16] dma/idxd: add datapath structures Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz ` (8 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add functions for device configuration. The info_get function is included here since it can be useful for checking successful configuration. Documentation is also updated to add device configuration usage info. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 15 +++++++ drivers/dma/idxd/idxd_bus.c | 3 ++ drivers/dma/idxd/idxd_common.c | 71 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 6 +++ drivers/dma/idxd/idxd_pci.c | 3 ++ 5 files changed, 98 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index ce33e2857a..62ffd39ee0 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -120,3 +120,18 @@ use a subset of configured queues. Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, that is a "DMA device type" inside DPDK, and can be accessed using APIs from the ``rte_dmadev`` library. + +Using IDXD DMAdev Devices +-------------------------- + +To use the devices from an application, the dmadev API can be used. + +Device Configuration +~~~~~~~~~~~~~~~~~~~~~ + +IDXD configuration requirements: + +* ``ring_size`` must be a power of two, between 64 and 4096. 
+* Only one ``vchan`` is supported per device (work queue). +* IDXD devices do not support silent mode. +* The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 3c0837ec52..b2acdac4f9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -96,6 +96,9 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index f972260a56..b0c79a2e42 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -39,6 +39,77 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + + if (size < sizeof(*info)) + return -EINVAL; + + *info = (struct rte_dma_info) { + .dev_capa = RTE_DMA_CAPA_MEM_TO_MEM | RTE_DMA_CAPA_HANDLES_ERRORS | + RTE_DMA_CAPA_OPS_COPY | RTE_DMA_CAPA_OPS_FILL, + .max_vchans = 1, + .max_desc = 4096, + .min_desc = 64, + }; + if (idxd->sva_support) + info->dev_capa |= RTE_DMA_CAPA_SVA; + return 0; +} + +int +idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz) +{ + if (sizeof(struct rte_dma_conf) != conf_sz) + return -EINVAL; + + if (dev_conf->nb_vchans != 1) + return -EINVAL; + return 0; +} + +int +idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint16_t max_desc = qconf->nb_desc; + + if (sizeof(struct rte_dma_vchan_conf) != qconf_sz) + return -EINVAL; + + idxd->qcfg 
= *qconf; + + if (!rte_is_power_of_2(max_desc)) + max_desc = rte_align32pow2(max_desc); + IDXD_PMD_DEBUG("DMA dev %u using %u descriptors", dev->data->dev_id, max_desc); + idxd->desc_ring_mask = max_desc - 1; + idxd->qcfg.nb_desc = max_desc; + + /* in case we are reconfiguring a device, free any existing memory */ + rte_free(idxd->desc_ring); + + /* allocate the descriptor ring at 2x size as batches can't wrap */ + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); + if (idxd->desc_ring == NULL) + return -ENOMEM; + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); + + idxd->batch_idx_read = 0; + idxd->batch_idx_write = 0; + idxd->batch_start = 0; + idxd->batch_size = 0; + idxd->ids_returned = 0; + idxd->ids_avail = 0; + + memset(idxd->batch_comp_ring, 0, sizeof(*idxd->batch_comp_ring) * + (idxd->max_batches + 1)); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 5e253fdfbc..1dbe31abcd 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -81,5 +81,11 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); int idxd_dump(const struct rte_dma_dev *dev, FILE *f); +int idxd_configure(struct rte_dma_dev *dev, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz); +int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); +int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, + uint32_t size); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 96c8c65cc0..681bb55efe 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -84,6 +84,9 @@ idxd_pci_dev_close(struct rte_dma_dev 
*dev) static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_close = idxd_pci_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
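The ring sizing in `idxd_vchan_setup()` can be sketched in isolation. The following is a rough standalone model, not driver code: `demo_round_up_pow2()` is a hypothetical stand-in for `rte_align32pow2()`, and the 64 to 4096 range check reflects the documented limits rather than a check the driver performs in this function. It shows how a non-power-of-two request is rounded up and how the resulting mask wraps job IDs onto ring slots.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round v up to the next power of two (v itself if already one). */
static uint32_t
demo_round_up_pow2(uint32_t v)
{
	assert(v > 0 && v <= (1u << 31));
	v--;
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	return v + 1;
}

struct demo_ring_cfg {
	uint16_t nb_desc; /* descriptor count actually used */
	uint16_t mask;    /* nb_desc - 1, used to wrap indexes */
};

/* Model of the sizing step: round up, then derive the index mask. */
static struct demo_ring_cfg
demo_ring_setup(uint16_t requested)
{
	struct demo_ring_cfg cfg;
	uint32_t n;

	/* assumed range, taken from the documented 64..4096 limits */
	assert(requested >= 64 && requested <= 4096);
	n = demo_round_up_pow2(requested);
	cfg.nb_desc = (uint16_t)n;
	cfg.mask = (uint16_t)(n - 1);
	return cfg;
}
```

For a request of 100 descriptors this yields a ring of 128 with a mask of 127, so job ID 130 maps to slot 2. The driver then allocates twice that many descriptors because a batch is never allowed to wrap around the end of the ring.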
* [dpdk-dev] [PATCH v9 08/16] dma/idxd: add start and stop functions for pci devices 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (6 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 07/16] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 09/16] dma/idxd: add data-path job submission functions Kevin Laatz ` (7 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add device start/stop functions for DSA devices bound to vfio. For devices bound to the IDXD kernel driver, these are not required since the IDXD kernel driver takes care of this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 3 +++ drivers/dma/idxd/idxd_pci.c | 51 +++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 62ffd39ee0..711890bd9e 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -135,3 +135,6 @@ IDXD configuration requirements: * Only one ``vchan`` is supported per device (work queue). * IDXD devices do not support silent mode. * The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. + +Once configured, the device can then be made ready for use by calling the +``rte_dma_start()`` API. 
diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 681bb55efe..ed5bf99425 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -59,6 +59,55 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; } +static int +idxd_pci_dev_stop(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + if (!idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); + return -EALREADY; + } + + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); + if (err_code || idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); + + return 0; +} + +static int +idxd_pci_dev_start(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + if (idxd_is_wq_enabled(idxd)) { + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); + return 0; + } + + if (idxd->desc_ring == NULL) { + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); + return -EINVAL; + } + + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); + if (err_code || !idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); + + return 0; +} + static int idxd_pci_dev_close(struct rte_dma_dev *dev) { @@ -87,6 +136,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .dev_start = idxd_pci_dev_start, + .dev_stop = idxd_pci_dev_stop, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
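The return-value conventions of the start and stop callbacks can be summarised with a small standalone model. This is only a sketch under simplifying assumptions: `struct demo_wq` and the `demo_` functions are hypothetical, and a boolean flip stands in for the real enable/disable device commands (which can also fail with a hardware error code).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-queue state the callbacks inspect. */
struct demo_wq {
	bool enabled;     /* stands in for idxd_is_wq_enabled() */
	void *desc_ring;  /* NULL until vchan setup has run */
};

/* Start: already enabled is tolerated, an unconfigured queue is rejected. */
static int
demo_start(struct demo_wq *wq)
{
	if (wq->enabled)
		return 0;          /* driver logs a warning and returns success */
	if (wq->desc_ring == NULL)
		return -EINVAL;    /* not fully configured */
	wq->enabled = true;    /* stands in for the enable-WQ command */
	return 0;
}

/* Stop: stopping an already disabled queue is reported as -EALREADY. */
static int
demo_stop(struct demo_wq *wq)
{
	if (!wq->enabled)
		return -EALREADY;
	wq->enabled = false;   /* stands in for the disable-WQ command */
	return 0;
}
```

The asymmetry is worth noting: a repeated start succeeds, while a repeated stop returns an error.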
* [dpdk-dev] [PATCH v9 09/16] dma/idxd: add data-path job submission functions 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (7 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 10/16] dma/idxd: add data-path job completion functions Kevin Laatz ` (6 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data path functions for enqueuing and submitting operations to DSA devices. Documentation updates are included for dmadev library and IDXD driver docs as appropriate. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 9 +++ doc/guides/prog_guide/dmadev.rst | 19 +++++ drivers/dma/idxd/idxd_common.c | 135 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 ++ drivers/dma/idxd/meson.build | 1 + 5 files changed, 169 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 711890bd9e..d548c4751a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -138,3 +138,12 @@ IDXD configuration requirements: Once configured, the device can then be made ready for use by calling the ``rte_dma_start()`` API. + +Performing Data Copies +~~~~~~~~~~~~~~~~~~~~~~~ + +Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library +documentation for details on operation enqueue and submission API usage. 
+ +It is expected that, for efficiency reasons, a burst of operations will be enqueued to the +device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index 32f7147862..e853ffda3a 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -67,6 +67,8 @@ can be used to get the device info and supported features. Silent mode is a special device capability which does not require the application to invoke dequeue APIs. +.. _dmadev_enqueue_dequeue: + Enqueue / Dequeue APIs ~~~~~~~~~~~~~~~~~~~~~~ @@ -80,6 +82,23 @@ The ``rte_dma_submit`` API is used to issue doorbell to hardware. Alternatively the ``RTE_DMA_OP_FLAG_SUBMIT`` flag can be passed to the enqueue APIs to also issue the doorbell to hardware. +The following code demonstrates how to enqueue a burst of copies to the +device and start the hardware processing of them: + +.. code-block:: C + + struct rte_mbuf *srcs[DMA_BURST_SZ], *dsts[DMA_BURST_SZ]; + unsigned int i; + + for (i = 0; i < RTE_DIM(srcs); i++) { + if (rte_dma_copy(dev_id, vchan, rte_pktmbuf_iova(srcs[i]), + rte_pktmbuf_iova(dsts[i]), COPY_LEN, 0) < 0) { + PRINT_ERR("Error with rte_dma_copy for buffer %u\n", i); + return -1; + } + } + rte_dma_submit(dev_id, vchan); + There are two dequeue APIs ``rte_dma_completed`` and ``rte_dma_completed_status``, these are used to obtain the results of the enqueue requests.
``rte_dma_completed`` will return the number of successfully diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index b0c79a2e42..a686ad421c 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,14 +2,145 @@ * Copyright 2021 Intel Corporation */ +#include <x86intrin.h> + #include <rte_malloc.h> #include <rte_common.h> #include <rte_log.h> +#include <rte_prefetch.h> #include "idxd_internal.h" #define IDXD_PMD_NAME_STR "dmadev_idxd" +static __rte_always_inline rte_iova_t +__desc_idx_to_iova(struct idxd_dmadev *idxd, uint16_t n) +{ + return idxd->desc_iova + (n * sizeof(struct idxd_hw_desc)); +} + +static __rte_always_inline void +__idxd_movdir64b(volatile void *dst, const struct idxd_hw_desc *src) +{ + asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : + : "a" (dst), "d" (src) + : "memory"); +} + +static __rte_always_inline void +__submit(struct idxd_dmadev *idxd) +{ + rte_prefetch1(&idxd->batch_comp_ring[idxd->batch_idx_read]); + + if (idxd->batch_size == 0) + return; + + /* write completion to batch comp ring */ + rte_iova_t comp_addr = idxd->batch_iova + + (idxd->batch_idx_write * sizeof(struct idxd_completion)); + + if (idxd->batch_size == 1) { + /* submit batch directly */ + struct idxd_hw_desc desc = + idxd->desc_ring[idxd->batch_start & idxd->desc_ring_mask]; + desc.completion = comp_addr; + desc.op_flags |= IDXD_FLAG_REQUEST_COMPLETION; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &desc); + } else { + const struct idxd_hw_desc batch_desc = { + .op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_COMPLETION_ADDR_VALID | + IDXD_FLAG_REQUEST_COMPLETION, + .desc_addr = __desc_idx_to_iova(idxd, + idxd->batch_start & idxd->desc_ring_mask), + .completion = comp_addr, + .size = idxd->batch_size, + }; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &batch_desc); + } + + if (++idxd->batch_idx_write > 
idxd->max_batches) + idxd->batch_idx_write = 0; + + idxd->batch_start += idxd->batch_size; + idxd->batch_size = 0; + idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; + _mm256_store_si256((void *)&idxd->batch_comp_ring[idxd->batch_idx_write], + _mm256_setzero_si256()); +} + +static __rte_always_inline int +__idxd_write_desc(struct idxd_dmadev *idxd, + const uint32_t op_flags, + const rte_iova_t src, + const rte_iova_t dst, + const uint32_t size, + const uint32_t flags) +{ + uint16_t mask = idxd->desc_ring_mask; + uint16_t job_id = idxd->batch_start + idxd->batch_size; + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; + + /* first check batch ring space then desc ring space */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return -ENOSPC; + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) + return -ENOSPC; + + /* write desc. Note: descriptors don't wrap, but the completion address does */ + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, + _mm256_set_epi64x(0, 0, 0, size)); + + idxd->batch_size++; + + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) + __submit(idxd); + + return job_id; +} + +int +idxd_enqueue_copy(void *dev_private, uint16_t qid __rte_unused, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, + * but check it at compile time to be sure. 
+ */ + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev_private, memmove, src, dst, length, + flags); +} + +int +idxd_enqueue_fill(void *dev_private, uint16_t qid __rte_unused, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + uint32_t fill = (idxd_op_fill << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev_private, fill, pattern, dst, length, + flags); +} + +int +idxd_submit(void *dev_private, uint16_t qid __rte_unused) +{ + __submit(dev_private); + return 0; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -139,6 +270,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->dev_ops = ops; dmadev->device = dev; + dmadev->fp_obj->copy = idxd_enqueue_copy; + dmadev->fp_obj->fill = idxd_enqueue_fill; + dmadev->fp_obj->submit = idxd_submit; + idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ idxd->dmadev = dmadev; diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 1dbe31abcd..ab4d71095e 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -87,5 +87,10 @@ int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, uint32_t size); +int idxd_enqueue_copy(void *dev_private, uint16_t qid, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_enqueue_fill(void *dev_private, uint16_t qid, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_submit(void *dev_private, uint16_t qid); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build 
index cc8450a096..db5a76f073 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -9,6 +9,7 @@ if is_windows endif deps += ['bus_pci'] +cflags += '-mavx2' # all platforms with idxd HW support AVX sources = files( 'idxd_bus.c', 'idxd_common.c', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
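The batch ring bookkeeping used by `__submit()` and `__idxd_write_desc()` can be modelled on its own. The sketch below is a simplified standalone model with hypothetical `demo_` names. It keeps only the read and write indexes and reproduces the two rules from the patch: an index wraps to zero once it exceeds `max_batches` (the ring has `max_batches + 1` slots), and the ring is treated as full when advancing the write index would make it equal to the read index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: only the fields needed for the space check. */
struct demo_batch_ring {
	uint16_t max_batches; /* ring has max_batches + 1 slots */
	uint16_t rd;          /* mirrors batch_idx_read */
	uint16_t wr;          /* mirrors batch_idx_write */
};

/* Same condition as the -ENOSPC check on the batch ring in the patch. */
static bool
demo_ring_full(const struct demo_batch_ring *r)
{
	return (r->rd == 0 && r->wr == r->max_batches) ||
			r->wr + 1 == r->rd;
}

/* Same wrap rule as "if (++idx > max_batches) idx = 0". */
static uint16_t
demo_advance(uint16_t idx, uint16_t max_batches)
{
	if (++idx > max_batches)
		idx = 0;
	return idx;
}
```

One slot is always left unused, so that a full ring can be told apart from an empty one (where read and write indexes are equal). With `max_batches` set to 3, the ring therefore reports full after three submitted batches with none completed.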
* [dpdk-dev] [PATCH v9 10/16] dma/idxd: add data-path job completion functions 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (8 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 09/16] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 11/16] dma/idxd: add operation statistic tracking Kevin Laatz ` (5 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the data path functions for gathering completed operations. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 32 ++++- drivers/dma/idxd/idxd_common.c | 236 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 + 3 files changed, 272 insertions(+), 1 deletion(-) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index d548c4751a..d4a210b854 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -143,7 +143,37 @@ Performing Data Copies ~~~~~~~~~~~~~~~~~~~~~~~ Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library -documentation for details on operation enqueue and submission API usage. +documentation for details on operation enqueue, submission and completion API usage. It is expected that, for efficiency reasons, a burst of operations will be enqueued to the device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. + +When gathering completions, ``rte_dma_completed()`` should be used, up until the point an error +occurs in an operation. 
If an error was encountered, ``rte_dma_completed_status()`` must be used +to kick the device off to continue processing operations and also to gather the status of each +individual operation, which is filled in to the ``status`` array provided as a parameter by the +application. + +The following status codes are supported by IDXD: + +* ``RTE_DMA_STATUS_SUCCESSFUL``: The operation was successful. +* ``RTE_DMA_STATUS_INVALID_OPCODE``: The operation failed due to an invalid operation code. +* ``RTE_DMA_STATUS_INVALID_LENGTH``: The operation failed due to an invalid data length. +* ``RTE_DMA_STATUS_NOT_ATTEMPTED``: The operation was not attempted. +* ``RTE_DMA_STATUS_ERROR_UNKNOWN``: The operation failed due to an unspecified error. + +The following code shows how to retrieve the number of successfully completed +copies within a burst and then use ``rte_dma_completed_status()`` to check +which operation failed and kick off the device to continue processing operations: + +.. code-block:: C + + enum rte_dma_status_code status[COMP_BURST_SZ]; + uint16_t count, idx, status_count; + bool error = false; + + count = rte_dma_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); + + if (error) { + status_count = rte_dma_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status); + } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index a686ad421c..76bc2e1364 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -141,6 +141,240 @@ idxd_submit(void *dev_private, uint16_t qid __rte_unused) return 0; } +static enum rte_dma_status_code +get_comp_status(struct idxd_completion *c) +{ + uint8_t st = c->status; + switch (st) { + /* successful descriptors are not written back normally */ + case IDXD_COMP_STATUS_INCOMPLETE: + case IDXD_COMP_STATUS_SUCCESS: + return RTE_DMA_STATUS_SUCCESSFUL; + case IDXD_COMP_STATUS_INVALID_OPCODE: + return RTE_DMA_STATUS_INVALID_OPCODE; + case IDXD_COMP_STATUS_INVALID_SIZE: + return
RTE_DMA_STATUS_INVALID_LENGTH; + case IDXD_COMP_STATUS_SKIPPED: + return RTE_DMA_STATUS_NOT_ATTEMPTED; + default: + return RTE_DMA_STATUS_ERROR_UNKNOWN; + } +} + +static __rte_always_inline int +batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t ret; + uint8_t bstatus; + + if (max_ops == 0) + return 0; + + /* first check if there are any unreturned handles from last time */ + if (idxd->ids_avail != idxd->ids_returned) { + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + + if (idxd->batch_idx_read == idxd->batch_idx_write) + return 0; + + bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status; + /* now check if next batch is complete and successful */ + if (bstatus == IDXD_COMP_STATUS_SUCCESS) { + /* since the batch idx ring stores the start of each batch, pre-increment to lookup + * start of next batch. + */ + if (++idxd->batch_idx_read > idxd->max_batches) + idxd->batch_idx_read = 0; + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; + + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + /* check if batch is incomplete */ + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) + return 0; + + return -1; /* error case */ +} + +static inline uint16_t +batch_completed(struct idxd_dmadev *idxd, uint16_t max_ops, bool *has_error) +{ + uint16_t i; + uint16_t b_start, b_end, next_batch; + + int ret = batch_ok(idxd, max_ops, NULL); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ * once we identify the actual failure job, return other jobs, then update + * the batch ring indexes to make it look like the first job of the batch has failed. + * Subsequent calls here will always return zero packets, and the error must be cleared by + * calling the completed_status() function. + */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + b_end = idxd->batch_idx_ring[next_batch]; + + if (b_end - b_start == 1) { /* not a batch */ + *has_error = true; + return 0; + } + + for (i = b_start; i < b_end; i++) { + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ + break; + } + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); + if (ret < max_ops) + *has_error = true; /* we got up to the point of error */ + idxd->ids_avail = idxd->ids_returned += ret; + + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { + /* copy over the descriptor status to the batch ring as if no batch */ + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; + } + + return ret; +} + +static uint16_t +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t next_batch; + + int ret = batch_ok(idxd, max_ops, status); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; + const uint16_t b_len = b_end - b_start; + if (b_len == 1) {/* not a batch */ + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + idxd->ids_avail++; + idxd->ids_returned++; + idxd->batch_idx_read = next_batch; + return 1; + } + + /* not a single-element batch, need to process more. + * Scenarios: + * 1. max_ops >= batch_size - can fit everything, simple case + * - loop through completed ops and then add on any not-attempted ones + * 2. max_ops < batch_size - can't fit everything, more complex case + * - loop through completed/incomplete and stop when hit max_ops + * - adjust the batch descriptor to update where we stopped, with appropriate bcount + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as + * non-batch next time. + */ + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; + for (ret = 0; ret < b_len && ret < max_ops; ret++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + } + idxd->ids_avail = idxd->ids_returned += ret; + + /* everything fit */ + if (ret == b_len) { + idxd->batch_idx_read = next_batch; + return ret; + } + + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (ret > bcount) { + /* we have only incomplete ones - set batch completed size to 0 */ + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size = 0; + /* if there is only one descriptor left, job skipped so set flag appropriately */ + if (b_len - ret == 1) + comp->status = IDXD_COMP_STATUS_SKIPPED; + } else { + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size -= ret; + /* if there is only one descriptor left, copy status info straight to desc */ + if (comp->completed_size == 1) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + comp->status = c->status; + /* individual descs can be ok without writeback, but not batches */ + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) + comp->status = IDXD_COMP_STATUS_SUCCESS; + } else if (bcount == b_len) { + /* check if we still have an error, and clear flag if not */ + uint16_t i; + for (i = b_start + ret; i < b_end; i++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) + break; + } + if (i == b_end) /* no errors */ + comp->status = IDXD_COMP_STATUS_SUCCESS; + } + } + + return ret; +} + +uint16_t +idxd_completed(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, bool *has_error) +{ + struct idxd_dmadev *idxd = dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed(idxd, max_ops - ret, has_error); + ret += batch; + } while (batch > 0 && *has_error == false); + + *last_idx = idxd->ids_returned - 1; 
+ return ret; +} + +uint16_t +idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, enum rte_dma_status_code *status) +{ + struct idxd_dmadev *idxd = dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); + ret += batch; + } while (batch > 0); + + *last_idx = idxd->ids_returned - 1; + return ret; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -273,6 +507,8 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->fp_obj->copy = idxd_enqueue_copy; dmadev->fp_obj->fill = idxd_enqueue_fill; dmadev->fp_obj->submit = idxd_submit; + dmadev->fp_obj->completed = idxd_completed; + dmadev->fp_obj->completed_status = idxd_completed_status; idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index ab4d71095e..4208b0dee8 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -92,5 +92,10 @@ int idxd_enqueue_copy(void *dev_private, uint16_t qid, rte_iova_t src, int idxd_enqueue_fill(void *dev_private, uint16_t qid, uint64_t pattern, rte_iova_t dst, unsigned int length, uint64_t flags); int idxd_submit(void *dev_private, uint16_t qid); +uint16_t idxd_completed(void *dev_private, uint16_t qid, uint16_t max_ops, + uint16_t *last_idx, bool *has_error); +uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, + uint16_t max_ops, uint16_t *last_idx, + enum rte_dma_status_code *status); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
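For readers following the error path in batch_completed_status(), the per-job status fill can be modelled outside the driver. This is a minimal sketch under stated assumptions, not the driver code: `enum op_status`, `fill_batch_status()` and the `hw_status` array are hypothetical stand-ins for `enum rte_dma_status_code`, the loop in the patch, and the descriptor completion records. It only shows the rule that jobs before the batch's completed count report their own status and the rest are marked as not attempted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the rte_dma_status_code values used by the patch. */
enum op_status {
	OP_SUCCESSFUL,
	OP_ERROR,
	OP_NOT_ATTEMPTED,
};

/*
 * Fill status[] for one batch of b_len jobs, of which the hardware processed
 * the first bcount. hw_status[] holds the per-descriptor result of each
 * processed job. Returns the number of entries written, at most max_ops.
 */
static uint16_t
fill_batch_status(const enum op_status *hw_status, uint16_t b_len,
		uint16_t bcount, uint16_t max_ops, enum op_status *status)
{
	uint16_t ret;

	assert(bcount <= b_len);
	for (ret = 0; ret < b_len && ret < max_ops; ret++)
		status[ret] = (ret < bcount) ? hw_status[ret] : OP_NOT_ATTEMPTED;
	return ret;
}
```

When max_ops is smaller than the batch length the function stops early, which corresponds to scenario 2 in the patch comment where the batch descriptor has to be adjusted for the next call.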
* [dpdk-dev] [PATCH v9 11/16] dma/idxd: add operation statistic tracking 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (9 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 10/16] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 12/16] dma/idxd: add vchan status function Kevin Laatz ` (4 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add statistic tracking for DSA devices. The dmadev library documentation is also updated to add a generic section for using the library's statistics APIs. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- doc/guides/prog_guide/dmadev.rst | 11 +++++++++++ drivers/dma/idxd/idxd_bus.c | 2 ++ drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 +++++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 47 insertions(+) diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index e853ffda3a..139eaff299 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -107,3 +107,14 @@ completed operations along with the status of each operation (filled into the ``status`` array passed by user). These two APIs can also return the last completed operation's ``ring_idx`` which could help user track operations within their own application-defined rings. + + +Querying Device Statistics +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The statistics from a dmadev device can be got via the statistics functions, +i.e. ``rte_dma_stats_get()``. 
The statistics returned for each device instance are: + +* ``submitted``: The number of operations submitted to the device. +* ``completed``: The number of operations which have completed (successful and failed). +* ``errors``: The number of operations that completed with error. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b2acdac4f9..b52ea02854 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -99,6 +99,8 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 76bc2e1364..fd81418b7c 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -65,6 +65,8 @@ __submit(struct idxd_dmadev *idxd) if (++idxd->batch_idx_write > idxd->max_batches) idxd->batch_idx_write = 0; + idxd->stats.submitted += idxd->batch_size; + idxd->batch_start += idxd->batch_size; idxd->batch_size = 0; idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; @@ -276,6 +278,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ const uint16_t b_len = b_end - b_start; if (b_len == 1) {/* not a batch */ *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + if (*status != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; idxd->ids_avail++; idxd->ids_returned++; idxd->batch_idx_read = next_batch; @@ -297,6 +301,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ struct idxd_completion *c = (void *) &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; status[ret] = (ret < bcount) ?
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + if (status[ret] != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; } idxd->ids_avail = idxd->ids_returned += ret; @@ -355,6 +361,7 @@ idxd_completed(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, ret += batch; } while (batch > 0 && *has_error == false); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -371,6 +378,7 @@ idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max ret += batch; } while (batch > 0); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -404,6 +412,25 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + struct rte_dma_stats *stats, uint32_t stats_sz) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + if (stats_sz < sizeof(*stats)) + return -EINVAL; + *stats = idxd->stats; + return 0; +} + +int +idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + idxd->stats = (struct rte_dma_stats){0}; + return 0; +} + int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 4208b0dee8..a85a1fb79e 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -59,6 +59,8 @@ struct idxd_dmadev { struct idxd_completion *batch_comp_ring; unsigned short *batch_idx_ring; /* store where each batch ends */ + struct rte_dma_stats stats; + rte_iova_t batch_iova; /* base address of the batch comp ring */ rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ @@ -97,5 +99,8 @@ uint16_t idxd_completed(void *dev_private, uint16_t qid, uint16_t max_ops, uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, uint16_t 
*last_idx, enum rte_dma_status_code *status); +int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, + struct rte_dma_stats *stats, uint32_t stats_sz); +int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index ed5bf99425..9d7f0531d5 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -136,6 +136,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, }; -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
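The counter updates in this patch can be illustrated with a small standalone model. This is a sketch, assuming a hypothetical `struct dma_stats` with the same three fields the documentation lists for `struct rte_dma_stats`; `stats_on_submit()` and `stats_on_completed()` are made-up helper names, not dmadev or driver functions. It shows where each counter moves: `submitted` on batch submission, `completed` once per returned job, and `errors` for every returned status value other than success.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for enum rte_dma_status_code. */
enum op_status {
	OP_SUCCESSFUL = 0,
	OP_ERROR,
};

/* Hypothetical mirror of the three documented counters. */
struct dma_stats {
	uint64_t submitted;
	uint64_t completed;
	uint64_t errors;
};

/* Called when a batch of batch_size jobs is handed to the hardware. */
static void
stats_on_submit(struct dma_stats *s, uint16_t batch_size)
{
	s->submitted += batch_size;
}

/* Called after n job statuses have been returned to the application. */
static void
stats_on_completed(struct dma_stats *s, const enum op_status *status, uint16_t n)
{
	uint16_t i;

	assert(status != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/* compare the status value itself, not the pointer to the array */
		if (status[i] != OP_SUCCESSFUL)
			s->errors++;
	}
	s->completed += n;
}
```

In an application the same numbers would be read through ``rte_dma_stats_get()``, as the documentation hunk above describes.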
* [dpdk-dev] [PATCH v9 12/16] dma/idxd: add vchan status function 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (10 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 11/16] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 13/16] dma/idxd: add burst capacity API Kevin Laatz ` (3 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When testing dmadev drivers, it is useful to have the HW device in a known state. This patch adds the implementation of the function which will wait for the device to be idle (all jobs completed) before proceeding. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 17 +++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 2 ++ drivers/dma/idxd/idxd_pci.c | 1 + 4 files changed, 21 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b52ea02854..e6caa048a9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -101,6 +101,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_info_get = idxd_info_get, .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, + .vchan_status = idxd_vchan_status, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index fd81418b7c..3c8cff15c0 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -163,6 +163,23 @@ get_comp_status(struct idxd_completion *c) } } +int +idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + enum rte_dma_vchan_status *status) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint16_t 
last_batch_write = idxd->batch_idx_write == 0 ? idxd->max_batches : + idxd->batch_idx_write - 1; + uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0); + + /* An IDXD device will always be either active or idle. + * RTE_DMA_VCHAN_HALTED_ERROR is therefore not supported by IDXD. + */ + *status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE; + + return 0; +} + static __rte_always_inline int batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index a85a1fb79e..50acb82d3d 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -102,5 +102,7 @@ uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, struct rte_dma_stats *stats, uint32_t stats_sz); int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); +int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, + enum rte_dma_vchan_status *status); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 9d7f0531d5..23c10c0fb0 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -140,6 +140,7 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, + .vchan_status = idxd_vchan_status, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
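The idle check in idxd_vchan_status() depends on finding the most recently submitted batch in a ring whose slots run from 0 to max_batches inclusive. A minimal sketch of that index arithmetic follows, assuming a plain array of status bytes in place of the batch completion ring; `enum chan_state`, `last_batch_slot()` and `get_vchan_state()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for enum rte_dma_vchan_status. */
enum chan_state {
	CHAN_IDLE,
	CHAN_ACTIVE,
};

/* Slot of the most recently submitted batch; the ring has slots 0..max_batches. */
static uint16_t
last_batch_slot(uint16_t batch_idx_write, uint16_t max_batches)
{
	assert(batch_idx_write <= max_batches);
	return batch_idx_write == 0 ? max_batches : (uint16_t)(batch_idx_write - 1);
}

/*
 * A non-zero status byte means the hardware has written back a completion
 * for the last batch, so nothing is in flight and the channel is idle.
 */
static enum chan_state
get_vchan_state(const uint8_t *batch_status, uint16_t batch_idx_write,
		uint16_t max_batches)
{
	uint16_t last = last_batch_slot(batch_idx_write, max_batches);

	return batch_status[last] != 0 ? CHAN_IDLE : CHAN_ACTIVE;
}
```

Note that the function only reports the current state. A caller that wants to wait for idle would poll it in a loop.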
* [dpdk-dev] [PATCH v9 13/16] dma/idxd: add burst capacity API 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (11 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 12/16] dma/idxd: add vchan status function Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz ` (2 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add support for the burst capacity API. This API will provide the calling application with the remaining capacity of the current burst (limited by max HW batch size). Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/dma/idxd/idxd_common.c | 21 +++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 1 + drivers/dma/idxd/idxd_pci.c | 1 + 3 files changed, 23 insertions(+) diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 3c8cff15c0..ff4647f579 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -468,6 +468,26 @@ idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t return 0; } +uint16_t +idxd_burst_capacity(const void *dev_private, uint16_t vchan __rte_unused) +{ + const struct idxd_dmadev *idxd = dev_private; + uint16_t write_idx = idxd->batch_start + idxd->batch_size; + uint16_t used_space; + + /* Check for space in the batch ring */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return 0; + + /* For descriptors, check for wrap-around on write but not read */ + if (idxd->ids_returned > write_idx) + write_idx += idxd->desc_ring_mask + 1; + 
used_space = write_idx - idxd->ids_returned; + + return RTE_MIN((idxd->desc_ring_mask - used_space), idxd->max_batch_size); +} + int idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, uint32_t conf_sz) @@ -553,6 +573,7 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->fp_obj->submit = idxd_submit; dmadev->fp_obj->completed = idxd_completed; dmadev->fp_obj->completed_status = idxd_completed_status; + dmadev->fp_obj->burst_capacity = idxd_burst_capacity; idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 50acb82d3d..3375600217 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -104,5 +104,6 @@ int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, enum rte_dma_vchan_status *status); +uint16_t idxd_burst_capacity(const void *dev_private, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 23c10c0fb0..beef3848aa 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -254,6 +254,7 @@ init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, idxd->u.pci = pci; idxd->max_batches = wq_size; + idxd->max_batch_size = 1 << lg2_max_batch; /* enable the device itself */ err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
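The capacity calculation above can be tried out in isolation. The sketch below mirrors the arithmetic of idxd_burst_capacity() as posted in this patch, which is not necessarily what the driver does in later revisions. `struct ring_state` is a hypothetical cut-down copy of the relevant `struct idxd_dmadev` fields, and the model assumes the used space never exceeds `desc_ring_mask`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of struct idxd_dmadev needed for the calculation. */
struct ring_state {
	uint16_t batch_idx_read;
	uint16_t batch_idx_write;
	uint16_t max_batches;
	uint16_t batch_start;
	uint16_t batch_size;
	uint16_t ids_returned;
	uint16_t desc_ring_mask;
	uint16_t max_batch_size;
};

static uint16_t
burst_capacity(const struct ring_state *r)
{
	uint16_t write_idx = (uint16_t)(r->batch_start + r->batch_size);
	uint16_t used_space, free_space;

	/* no free slot in the batch ring means nothing more can be queued */
	if ((r->batch_idx_read == 0 && r->batch_idx_write == r->max_batches) ||
			r->batch_idx_write + 1 == r->batch_idx_read)
		return 0;

	/* for descriptors, handle wrap-around on write but not read */
	if (r->ids_returned > write_idx)
		write_idx = (uint16_t)(write_idx + r->desc_ring_mask + 1);
	used_space = (uint16_t)(write_idx - r->ids_returned);

	/* assumption of this model: the descriptor ring is never over-full */
	assert(used_space <= r->desc_ring_mask);
	free_space = (uint16_t)(r->desc_ring_mask - used_space);

	/* capacity is capped by the maximum hardware batch size */
	return free_space < r->max_batch_size ? free_space : r->max_batch_size;
}
```

With a 64-entry descriptor ring, 8 descriptors in use and a maximum batch size of 32, the result is capped at 32. Raising the batch limit to 64 exposes the 55 free descriptor slots instead.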
* [dpdk-dev] [PATCH v9 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (12 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 13/16] dma/idxd: add burst capacity API Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 15/16] devbind: add dma device class Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Conor Walsh <conor.walsh@intel.com> Move the example script for configuring IDXD devices bound to the IDXD kernel driver from raw to dma, and create a symlink to still allow use from raw. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++++++++++++++++++++++++++ drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +----------------------------- 2 files changed, 118 insertions(+), 117 deletions(-) create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py diff --git a/drivers/dma/idxd/dpdk_idxd_cfg.py b/drivers/dma/idxd/dpdk_idxd_cfg.py new file mode 100755 index 0000000000..fcc27822ef --- /dev/null +++ b/drivers/dma/idxd/dpdk_idxd_cfg.py @@ -0,0 +1,117 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2020 Intel Corporation + +""" +Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use +""" + +import sys +import argparse +import os +import os.path + + +class SysfsDir: + "Used to read/write paths in a sysfs directory" + def __init__(self, path): + self.path = path + + def read_int(self, filename): + "Return 
a value from sysfs file" + with open(os.path.join(self.path, filename)) as f: + return int(f.readline()) + + def write_values(self, values): + "write dictionary, where key is filename and value is value to write" + for filename, contents in values.items(): + with open(os.path.join(self.path, filename), "w") as f: + f.write(str(contents)) + + +def reset_device(dsa_id): + "Reset the DSA device and all its queues" + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) + + +def get_pci_dir(pci): + "Search for the sysfs directory of the PCI device" + base_dir = '/sys/bus/pci/devices/' + for path, dirs, files in os.walk(base_dir): + for dir in dirs: + if pci in dir: + return os.path.join(base_dir, dir) + sys.exit(f"Could not find sysfs directory for device {pci}") + + +def get_dsa_id(pci): + "Get the DSA instance ID using the PCI address of the device" + pci_dir = get_pci_dir(pci) + for path, dirs, files in os.walk(pci_dir): + for dir in dirs: + if dir.startswith('dsa') and 'wq' not in dir: + return int(dir[3:]) + sys.exit(f"Could not get device ID for device {pci}") + + +def configure_dsa(dsa_id, queues, prefix): + "Configure the DSA instance with appropriate number of queues" + dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + + max_groups = dsa_dir.read_int("max_groups") + max_engines = dsa_dir.read_int("max_engines") + max_queues = dsa_dir.read_int("max_work_queues") + max_work_queues_size = dsa_dir.read_int("max_work_queues_size") + + nb_queues = min(queues, max_queues) + if queues > nb_queues: + print(f"Setting number of queues to max supported value: {max_queues}") + + # we want one engine per group, and no more engines than queues + nb_groups = min(max_engines, max_groups, nb_queues) + for grp in range(nb_groups): + dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) + + # configure each queue + for q in range(nb_queues): + wq_dir = 
SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) + wq_dir.write_values({"group_id": q % nb_groups, + "type": "user", + "mode": "dedicated", + "name": f"{prefix}_wq{dsa_id}.{q}", + "priority": 1, + "size": int(max_work_queues_size / nb_queues)}) + + # enable device and then queues + drv_dir.write_values({"bind": f"dsa{dsa_id}"}) + for q in range(nb_queues): + drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) + + +def main(args): + "Main function, does arg parsing and calls config function" + arg_p = argparse.ArgumentParser( + description="Configure whole DSA device instance for DPDK use") + arg_p.add_argument('dsa_id', + help="Specify DSA instance either via DSA instance number or PCI address") + arg_p.add_argument('-q', metavar='queues', type=int, default=255, + help="Number of queues to set up") + arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', + default="dpdk", + help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") + arg_p.add_argument('--reset', action='store_true', + help="Reset DSA device and its queues") + parsed_args = arg_p.parse_args(args[1:]) + + dsa_id = parsed_args.dsa_id + dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id + if parsed_args.reset: + reset_device(dsa_id) + else: + configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) + + +if __name__ == "__main__": + main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py deleted file mode 100755 index fcc27822ef..0000000000 --- a/drivers/raw/ioat/dpdk_idxd_cfg.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/env python3 -# SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2020 Intel Corporation - -""" -Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use -""" - -import sys -import argparse -import os -import os.path - - -class SysfsDir: - "Used to read/write paths in a sysfs directory" - def __init__(self, path): - self.path = path - - def read_int(self, filename): - "Return a 
value from sysfs file" - with open(os.path.join(self.path, filename)) as f: - return int(f.readline()) - - def write_values(self, values): - "write dictionary, where key is filename and value is value to write" - for filename, contents in values.items(): - with open(os.path.join(self.path, filename), "w") as f: - f.write(str(contents)) - - -def reset_device(dsa_id): - "Reset the DSA device and all its queues" - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) - - -def get_pci_dir(pci): - "Search for the sysfs directory of the PCI device" - base_dir = '/sys/bus/pci/devices/' - for path, dirs, files in os.walk(base_dir): - for dir in dirs: - if pci in dir: - return os.path.join(base_dir, dir) - sys.exit(f"Could not find sysfs directory for device {pci}") - - -def get_dsa_id(pci): - "Get the DSA instance ID using the PCI address of the device" - pci_dir = get_pci_dir(pci) - for path, dirs, files in os.walk(pci_dir): - for dir in dirs: - if dir.startswith('dsa') and 'wq' not in dir: - return int(dir[3:]) - sys.exit(f"Could not get device ID for device {pci}") - - -def configure_dsa(dsa_id, queues, prefix): - "Configure the DSA instance with appropriate number of queues" - dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - - max_groups = dsa_dir.read_int("max_groups") - max_engines = dsa_dir.read_int("max_engines") - max_queues = dsa_dir.read_int("max_work_queues") - max_work_queues_size = dsa_dir.read_int("max_work_queues_size") - - nb_queues = min(queues, max_queues) - if queues > nb_queues: - print(f"Setting number of queues to max supported value: {max_queues}") - - # we want one engine per group, and no more engines than queues - nb_groups = min(max_engines, max_groups, nb_queues) - for grp in range(nb_groups): - dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) - - # configure each queue - for q in range(nb_queues): - wq_dir = 
SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) - wq_dir.write_values({"group_id": q % nb_groups, - "type": "user", - "mode": "dedicated", - "name": f"{prefix}_wq{dsa_id}.{q}", - "priority": 1, - "size": int(max_work_queues_size / nb_queues)}) - - # enable device and then queues - drv_dir.write_values({"bind": f"dsa{dsa_id}"}) - for q in range(nb_queues): - drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) - - -def main(args): - "Main function, does arg parsing and calls config function" - arg_p = argparse.ArgumentParser( - description="Configure whole DSA device instance for DPDK use") - arg_p.add_argument('dsa_id', - help="Specify DSA instance either via DSA instance number or PCI address") - arg_p.add_argument('-q', metavar='queues', type=int, default=255, - help="Number of queues to set up") - arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', - default="dpdk", - help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") - arg_p.add_argument('--reset', action='store_true', - help="Reset DSA device and its queues") - parsed_args = arg_p.parse_args(args[1:]) - - dsa_id = parsed_args.dsa_id - dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id - if parsed_args.reset: - reset_device(dsa_id) - else: - configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) - - -if __name__ == "__main__": - main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py new file mode 120000 index 0000000000..85545548d1 --- /dev/null +++ b/drivers/raw/ioat/dpdk_idxd_cfg.py @@ -0,0 +1 @@ +../../dma/idxd/dpdk_idxd_cfg.py \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
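The sizing arithmetic in configure_dsa() is easy to check by hand. As a rough sketch, written in C to match the driver rather than the script, the same rules are: clamp the queue count to the device maximum, use one engine per group with no more groups than queues, spread queues over groups round-robin, and split the work-queue space evenly. `struct wq_plan`, `plan_dsa()` and `group_for_queue()` are hypothetical names, and nothing here touches sysfs.

```c
#include <assert.h>

/* Hypothetical result of the planning step. */
struct wq_plan {
	unsigned int nb_queues;
	unsigned int nb_groups;
	unsigned int wq_size;
};

static unsigned int
min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Derive queue, group and per-queue size values from the device limits. */
static struct wq_plan
plan_dsa(unsigned int requested, unsigned int max_queues,
		unsigned int max_engines, unsigned int max_groups,
		unsigned int max_wq_size)
{
	struct wq_plan p;

	assert(requested > 0 && max_queues > 0);
	assert(max_engines > 0 && max_groups > 0);

	p.nb_queues = min_u(requested, max_queues);
	/* one engine per group, and no more groups than queues */
	p.nb_groups = min_u(min_u(max_engines, max_groups), p.nb_queues);
	/* total work-queue space is divided evenly, rounding down */
	p.wq_size = max_wq_size / p.nb_queues;
	return p;
}

/* Queues are assigned to groups round-robin. */
static unsigned int
group_for_queue(const struct wq_plan *p, unsigned int q)
{
	return q % p->nb_groups;
}
```

For example, the default request of 255 queues on a device reporting 8 work queues, 4 engines, 4 groups and a total work-queue size of 128 gives 8 queues of size 16 spread across 4 groups.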
* [dpdk-dev] [PATCH v9 15/16] devbind: add dma device class 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (13 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add a new class for DMA devices. Devices listed under the DMA class are to be used with the dmadev library. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- usertools/dpdk-devbind.py | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 5f0e817055..da89b87816 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -71,6 +71,7 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] +dma_devices = [] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] @@ -585,6 +586,9 @@ def show_status(): if status_dev in ["crypto", "all"]: show_device_status(crypto_devices, "Crypto") + if status_dev in ["dma", "all"]: + show_device_status(dma_devices, "DMA") + if status_dev in ["event", "all"]: show_device_status(eventdev_devices, "Eventdev") @@ -653,7 +657,7 @@ def parse_args(): parser.add_argument( '--status-dev', help="Print the status of given device group.", - choices=['baseband', 'compress', 'crypto', 
'event', + choices=['baseband', 'compress', 'crypto', 'dma', 'event', 'mempool', 'misc', 'net', 'regex']) bind_group = parser.add_mutually_exclusive_group() bind_group.add_argument( @@ -734,6 +738,7 @@ def do_arg_actions(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) @@ -756,6 +761,7 @@ def main(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v9 16/16] devbind: move idxd device ID to dmadev class 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz ` (14 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 15/16] devbind: add dma device class Kevin Laatz @ 2021-10-19 11:25 ` Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 11:25 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz The dmadev library is the preferred abstraction for using IDXD devices and will replace the rawdev implementation in future. This patch moves the IDXD device ID to the dmadev class. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- usertools/dpdk-devbind.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index da89b87816..ba18e2a487 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -71,13 +71,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] -dma_devices = [] +dma_devices = [intel_idxd_spr] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] misc_devices = [cnxk_bphy, cnxk_bphy_cgx, cnxk_inl_dev, intel_ioat_bdw, - intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, + intel_ioat_skx, intel_ioat_icx, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (19 preceding siblings ...) 2021-10-19 11:25 ` [dpdk-dev] [PATCH v9 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 01/16] raw/ioat: only build if dmadev not present Kevin Laatz ` (15 more replies) 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz 21 siblings, 16 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz This patchset adds a dmadev driver and associated documentation to support Intel Data Streaming Accelerator devices. This driver is intended to ultimately replace the current IDXD part of the IOAT rawdev driver. v10: * meson fix to ensure Windows and BSD builds compile v9: * add missing meson check for x86 v8: * fix compilation issues of individual patches v7: * rebase on above patchsets * add meson reason for rawdev build v6: * set state of device during create * add dev_close function * documentation updates - moved generic pieces from driver doc to lib doc * other small miscellaneous fixes based on rebasing and ML feedback v5: * add missing toctree entry for idxd driver v4: * rebased on above patchsets * minor fixes based on review feedback v3: * rebased on above patchsets * added burst capacity API v2: * rebased on above patchsets * added API to check for device being idle * added devbind updates for DMA devices * fixed issue identified by internal coverity scan Bruce Richardson (1): raw/ioat: only build if dmadev not present Conor Walsh (1): dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz (14): dma/idxd: add skeleton for VFIO based DSA device dma/idxd: add bus device probing dma/idxd: create dmadev instances on bus probe dma/idxd: create 
dmadev instances on pci probe dma/idxd: add datapath structures dma/idxd: add configure and info_get functions dma/idxd: add start and stop functions for pci devices dma/idxd: add data-path job submission functions dma/idxd: add data-path job completion functions dma/idxd: add operation statistic tracking dma/idxd: add vchan status function dma/idxd: add burst capacity API devbind: add dma device class devbind: move idxd device ID to dmadev class MAINTAINERS | 10 + doc/guides/dmadevs/idxd.rst | 179 ++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/prog_guide/dmadev.rst | 30 ++ doc/guides/rawdevs/ioat.rst | 8 + doc/guides/rel_notes/release_21_11.rst | 5 + drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++ drivers/dma/idxd/idxd_bus.c | 377 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 612 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 131 ++++++ drivers/dma/idxd/idxd_internal.h | 109 +++++ drivers/dma/idxd/idxd_pci.c | 368 +++++++++++++++ drivers/dma/idxd/meson.build | 16 + drivers/dma/idxd/version.map | 3 + drivers/dma/meson.build | 2 + drivers/meson.build | 4 +- drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +---- drivers/raw/ioat/meson.build | 24 +- lib/dmadev/rte_dmadev.h | 1 + usertools/dpdk-devbind.py | 10 +- 20 files changed, 2002 insertions(+), 124 deletions(-) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py create mode 100644 drivers/dma/idxd/idxd_bus.c create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_hw_defs.h create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v10 01/16] raw/ioat: only build if dmadev not present 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz ` (14 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Bruce Richardson <bruce.richardson@intel.com> Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not present. This change requires the dependencies to be reordered in drivers/meson.build so that rawdev can use the "RTE_DMA_*" build macros to check for the presence of the equivalent dmadev driver. A note is also added to the documentation to inform users of this change. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/rawdevs/ioat.rst | 8 ++++++++ drivers/meson.build | 4 ++-- drivers/raw/ioat/meson.build | 24 +++++++++++++++++++++--- 3 files changed, 31 insertions(+), 5 deletions(-) diff --git a/doc/guides/rawdevs/ioat.rst b/doc/guides/rawdevs/ioat.rst index a28e909935..a65530bd30 100644 --- a/doc/guides/rawdevs/ioat.rst +++ b/doc/guides/rawdevs/ioat.rst @@ -34,6 +34,14 @@ Compilation For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based. No additional compilation steps are necessary. +.. note:: + Since the addition of the dmadev library, the ``ioat`` and ``idxd`` parts of this driver + will only be built if their ``dmadev`` counterparts are not built. 
+ The following can be used to disable the ``dmadev`` drivers, + if the raw drivers are to be used instead:: + + $ meson -Ddisable_drivers=dma/* <build_dir> + Device Setup ------------- diff --git a/drivers/meson.build b/drivers/meson.build index b7d680868a..34c0276487 100644 --- a/drivers/meson.build +++ b/drivers/meson.build @@ -10,15 +10,15 @@ subdirs = [ 'common/qat', # depends on bus. 'common/sfc_efx', # depends on bus. 'mempool', # depends on common and bus. + 'dma', # depends on common and bus. 'net', # depends on common, bus, mempool - 'raw', # depends on common, bus and net. + 'raw', # depends on common, bus, dma and net. 'crypto', # depends on common, bus and mempool (net in future). 'compress', # depends on common, bus, mempool. 'regex', # depends on common, bus, regexdev. 'vdpa', # depends on common, bus and mempool. 'event', # depends on common, bus, mempool and net. 'baseband', # depends on common and bus. - 'dma', # depends on common and bus. ] if meson.is_cross_build() diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build index 0e81cb5951..1b866aab74 100644 --- a/drivers/raw/ioat/meson.build +++ b/drivers/raw/ioat/meson.build @@ -2,14 +2,32 @@ # Copyright 2019 Intel Corporation build = dpdk_conf.has('RTE_ARCH_X86') +# only use ioat rawdev driver if we don't have the equivalent dmadev ones +if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') + build = false + reason = 'replaced by dmadev drivers' + subdir_done() +endif + reason = 'only supported on x86' sources = files( - 'idxd_bus.c', - 'idxd_pci.c', 'ioat_common.c', - 'ioat_rawdev.c', 'ioat_rawdev_test.c', ) + +if not dpdk_conf.has('RTE_DMA_IDXD') + sources += files( + 'idxd_bus.c', + 'idxd_pci.c', + ) +endif + +if not dpdk_conf.has('RTE_DMA_IOAT') + sources += files ( + 'ioat_rawdev.c', + ) +endif + deps += ['bus_pci', 'mbuf', 'rawdev'] headers = files( 'rte_ioat_rawdev.h', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v10 02/16] dma/idxd: add skeleton for VFIO based DSA device 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 01/16] raw/ioat: only build if dmadev not present Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 03/16] dma/idxd: add bus device probing Kevin Laatz ` (13 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probe/remove skeleton code for DSA device bound to the vfio pci driver. Relevant documentation and MAINTAINERS update also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v8: fix compile issue v9: add meson check for x86 --- MAINTAINERS | 10 +++++ doc/guides/dmadevs/idxd.rst | 58 ++++++++++++++++++++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/rel_notes/release_21_11.rst | 5 +++ drivers/dma/idxd/idxd_common.c | 11 +++++ drivers/dma/idxd/idxd_internal.h | 27 ++++++++++++ drivers/dma/idxd/idxd_pci.c | 55 ++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 11 +++++ drivers/dma/idxd/version.map | 3 ++ drivers/dma/meson.build | 2 + 10 files changed, 184 insertions(+) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 5387ffd4fc..423d8a73ce 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1200,6 +1200,16 @@ F: doc/guides/compressdevs/zlib.rst F: doc/guides/compressdevs/features/zlib.ini +DMAdev Drivers +-------------- + 
+Intel IDXD - EXPERIMENTAL +M: Bruce Richardson <bruce.richardson@intel.com> +M: Kevin Laatz <kevin.laatz@intel.com> +F: drivers/dma/idxd/ +F: doc/guides/dmadevs/idxd.rst + + RegEx Drivers ------------- diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst new file mode 100644 index 0000000000..924700d17e --- /dev/null +++ b/doc/guides/dmadevs/idxd.rst @@ -0,0 +1,58 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2021 Intel Corporation. + +.. include:: <isonum.txt> + +IDXD DMA Device Driver +====================== + +The ``idxd`` dmadev driver provides a poll-mode driver (PMD) for Intel\ |reg| +Data Streaming Accelerator `(Intel DSA) +<https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html>`_. +This PMD can be used in conjunction with Intel\ |reg| DSA devices to offload +data operations, such as data copies, to hardware, freeing up CPU cycles for +other tasks. + +Hardware Requirements +---------------------- + +The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the +presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma`` +will show all the DMA devices on the system, including IDXD supported devices. +Intel\ |reg| DSA devices currently (at time of writing) appear +as devices with type “0b25”, due to the absence of pci-id database entries for +them at this point. + +Compilation +------------ + +For builds using ``meson`` and ``ninja``, the driver will be built when the +target platform is x86-based. No additional compilation steps are necessary. + +Device Setup +------------- + +Devices using VFIO/UIO drivers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The HW devices to be used will need to be bound to a user-space IO driver for use. +The ``dpdk-devbind.py`` script can be used to view the state of the devices +and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``. 
+For example:: + + $ dpdk-devbind.py -b vfio-pci 6a:01.0 + +Device Probing and Initialization +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will +be found as part of the device scan done at application initialization time without +the need to pass parameters to the application. + +For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the +maximum number of workqueues available on it, partitioning all resources equally +among the queues. +If fewer workqueues are required, then the ``max_queues`` parameter may be passed to +the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag, e.g.:: + + $ dpdk-test -a <b:d:f>,max_queues=4 diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst index 0bce29d766..5d4abf880e 100644 --- a/doc/guides/dmadevs/index.rst +++ b/doc/guides/dmadevs/index.rst @@ -10,3 +10,5 @@ an application through DMA API. .. toctree:: :maxdepth: 2 :numbered: + + idxd diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index d5435a64aa..f8678efa94 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -75,6 +75,11 @@ New Features operations. * Added multi-process support. +* **Added IDXD dmadev driver implementation.** + + The IDXD dmadev driver provides device drivers for the Intel DSA devices. + This device driver can be used through the generic dmadev API. 
+ * **Added new RSS offload types for IPv4/L4 checksum in RSS flow.** Added macros ETH_RSS_IPV4_CHKSUM and ETH_RSS_L4_CHKSUM, now IPv4 and diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c new file mode 100644 index 0000000000..e00ddbe5ef --- /dev/null +++ b/drivers/dma/idxd/idxd_common.c @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#include <rte_log.h> + +#include "idxd_internal.h" + +int idxd_pmd_logtype; + +RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h new file mode 100644 index 0000000000..c6a7dcd72f --- /dev/null +++ b/drivers/dma/idxd/idxd_internal.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_INTERNAL_H_ +#define _IDXD_INTERNAL_H_ + +/** + * @file idxd_internal.h + * + * Internal data structures for the idxd/DSA driver for dmadev + * + * @warning + * @b EXPERIMENTAL: these structures and APIs may change without prior notice + */ + +extern int idxd_pmd_logtype; + +#define IDXD_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \ + idxd_pmd_logtype, "IDXD: %s(): " fmt "\n", __func__, ##args) + +#define IDXD_PMD_DEBUG(fmt, args...) IDXD_PMD_LOG(DEBUG, fmt, ## args) +#define IDXD_PMD_INFO(fmt, args...) IDXD_PMD_LOG(INFO, fmt, ## args) +#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) +#define IDXD_PMD_WARN(fmt, args...) 
IDXD_PMD_LOG(WARNING, fmt, ## args) + +#endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c new file mode 100644 index 0000000000..79e4aadcab --- /dev/null +++ b/drivers/dma/idxd/idxd_pci.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <rte_bus_pci.h> + +#include "idxd_internal.h" + +#define IDXD_VENDOR_ID 0x8086 +#define IDXD_DEVICE_ID_SPR 0x0B25 + +#define IDXD_PMD_DMADEV_NAME_PCI dmadev_idxd_pci + +const struct rte_pci_id pci_id_idxd_map[] = { + { RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static int +idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + int ret = 0; + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + dev->device.driver = &drv->driver; + + return ret; +} + +static int +idxd_dmadev_remove_pci(struct rte_pci_device *dev) +{ + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + + IDXD_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + + return 0; +} + +struct rte_pci_driver idxd_pmd_drv_pci = { + .id_table = pci_id_idxd_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = idxd_dmadev_probe_pci, + .remove = idxd_dmadev_remove_pci, +}; + +RTE_PMD_REGISTER_PCI(IDXD_PMD_DMADEV_NAME_PCI, idxd_pmd_drv_pci); +RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_DMADEV_NAME_PCI, pci_id_idxd_map); +RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_DMADEV_NAME_PCI, "vfio-pci"); +RTE_PMD_REGISTER_PARAM_STRING(dmadev_idxd_pci, "max_queues=0"); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build new file mode 100644 index 0000000000..11620ba156 --- /dev/null +++ b/drivers/dma/idxd/meson.build @@ -0,0 +1,11 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2021 Intel Corporation + +build = 
dpdk_conf.has('RTE_ARCH_X86') +reason = 'only supported on x86' + +deps += ['bus_pci'] +sources = files( + 'idxd_common.c', + 'idxd_pci.c' +) diff --git a/drivers/dma/idxd/version.map b/drivers/dma/idxd/version.map new file mode 100644 index 0000000000..4a76d1d52d --- /dev/null +++ b/drivers/dma/idxd/version.map @@ -0,0 +1,3 @@ +DPDK_21 { + local: *; +}; diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build index d9c7ede32f..411be7a240 100644 --- a/drivers/dma/meson.build +++ b/drivers/dma/meson.build @@ -2,5 +2,7 @@ # Copyright 2021 HiSilicon Limited drivers = [ + 'idxd', 'skeleton', ] +std_deps = ['dmadev'] -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
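Note for readers of this archive (not part of the patch): the pci_id_idxd_map table in idxd_pci.c above ends with a zeroed sentinel entry. Below is a minimal standalone sketch of how such a sentinel-terminated table can be walked. The names struct pci_id, idxd_ids and id_table_matches() are hypothetical and exist only for illustration; in DPDK the matching is done by the PCI bus code against struct rte_pci_id, which also carries class and subsystem fields that this sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_pci_id. */
struct pci_id {
	uint16_t vendor_id;
	uint16_t device_id;
};

/* Values taken from idxd_pci.c in the patch. */
#define IDXD_VENDOR_ID 0x8086
#define IDXD_DEVICE_ID_SPR 0x0B25

/* Same shape as pci_id_idxd_map: one entry plus a zeroed sentinel. */
static const struct pci_id idxd_ids[] = {
	{ .vendor_id = IDXD_VENDOR_ID, .device_id = IDXD_DEVICE_ID_SPR },
	{ .vendor_id = 0, /* sentinel */ },
};

/* Return 1 if (vendor, device) appears before the sentinel, else 0. */
static int
id_table_matches(const struct pci_id *table, uint16_t vendor, uint16_t device)
{
	for (; table->vendor_id != 0; table++) {
		if (table->vendor_id == vendor && table->device_id == device)
			return 1;
	}
	return 0;
}
```

With this table, id_table_matches(idxd_ids, 0x8086, 0x0B25) returns 1 and any other pair returns 0. The sentinel means the table length does not have to be passed alongside the pointer.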
* [dpdk-dev] [PATCH v10 03/16] dma/idxd: add bus device probing 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 01/16] raw/ioat: only build if dmadev not present Kevin Laatz 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-20 6:54 ` fengchengwen 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz ` (12 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probing for DSA devices bound to the IDXD kernel driver. These devices can be configured via sysfs and made available to DPDK if they are found during bus scan. Relevant documentation is included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 ++++++ drivers/dma/idxd/idxd_bus.c | 351 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 34 +++ drivers/dma/idxd/meson.build | 4 + lib/dmadev/rte_dmadev.h | 1 + 5 files changed, 454 insertions(+) create mode 100644 drivers/dma/idxd/idxd_bus.c diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 924700d17e..ce33e2857a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -32,6 +32,56 @@ target platform is x86-based. No additional compilation steps are necessary. Device Setup ------------- +Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers, +such as ``vfio-pci``. Both are supported by the IDXD PMD. 
+ +Intel\ |reg| DSA devices using IDXD kernel driver +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured. +The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration. + +.. note:: + The device configuration can also be done by directly interacting with the sysfs nodes. + An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py`` + included in the driver source directory. + +There are some mandatory configuration steps before being able to use a device with an application. +The internal engines, which do the copies or other operations, +and the work-queues, which are used by applications to assign work to the device, +need to be assigned to groups, and the various other configuration options, +such as priority or queue depth, need to be set for each queue. + +To assign an engine to a group:: + + $ accel-config config-engine dsa0/engine0.0 --group-id=0 + $ accel-config config-engine dsa0/engine0.1 --group-id=1 + +To assign work queues to groups for passing descriptors to the engines a similar accel-config command can be used. +However, the work queues also need to be configured depending on the use case. +Some configuration options include: + +* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously. +* priority: WQ priority between 1 and 15. Larger value means higher priority. +* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device. +* type: WQ type (kernel/mdev/user). Determines how the device is presented. +* name: identifier given to the WQ. 
+ +Example configuration for a work queue:: + + $ accel-config config-wq dsa0/wq0.0 --group-id=0 \ + --mode=dedicated --priority=10 --wq-size=8 \ + --type=user --name=dpdk_app1 + +Once the devices have been configured, they need to be enabled:: + + $ accel-config enable-device dsa0 + $ accel-config enable-wq dsa0/wq0.0 + +Check the device configuration:: + + $ accel-config list + Devices using VFIO/UIO drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -56,3 +106,17 @@ If fewer workqueues are required, then the ``max_queues`` parameter may be passe the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: $ dpdk-test -a <b:d:f>,max_queues=4 + +For devices bound to the IDXD kernel driver, +the DPDK IDXD driver will automatically perform a scan for available workqueues +to use. Any workqueues found listed in ``/dev/dsa`` on the system will be checked +in ``/sys``, and any which have ``dpdk_`` prefix in their name will be automatically +probed by the driver to make them available to the application. +Alternatively, to support use by multiple DPDK processes simultaneously, +the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue +name prefix, instead of ``dpdk_``, allowing each DPDK application instance to only +use a subset of configured queues. + +Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, +that is a "DMA device type" inside DPDK, and can be accessed using APIs from the +``rte_dmadev`` library. 
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c new file mode 100644 index 0000000000..ef589af30e --- /dev/null +++ b/drivers/dma/idxd/idxd_bus.c @@ -0,0 +1,351 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <dirent.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/mman.h> +#include <libgen.h> + +#include <rte_bus.h> +#include <rte_log.h> +#include <rte_dmadev_pmd.h> +#include <rte_string_fns.h> + +#include "idxd_internal.h" + +/* default value for DSA paths, but allow override in environment for testing */ +#define DSA_DEV_PATH "/dev/dsa" +#define DSA_SYSFS_PATH "/sys/bus/dsa/devices" + +static unsigned int devcount; + +/** unique identifier for a DSA device/WQ instance */ +struct dsa_wq_addr { + uint16_t device_id; + uint16_t wq_id; +}; + +/** a DSA device instance */ +struct rte_dsa_device { + struct rte_device device; /**< Inherit core device */ + TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */ + + char wq_name[32]; /**< the workqueue name/number e.g. 
wq0.1 */ + struct dsa_wq_addr addr; /**< Identifies the specific WQ */ +}; + +/* forward prototypes */ +struct dsa_bus; +static int dsa_scan(void); +static int dsa_probe(void); +static struct rte_device *dsa_find_device(const struct rte_device *start, + rte_dev_cmp_t cmp, const void *data); +static enum rte_iova_mode dsa_get_iommu_class(void); +static int dsa_addr_parse(const char *name, void *addr); + +/** List of devices */ +TAILQ_HEAD(dsa_device_list, rte_dsa_device); + +/** + * Structure describing the DSA bus + */ +struct dsa_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_driver driver; /**< Driver struct for devices to point to */ + struct dsa_device_list device_list; /**< List of PCI devices */ +}; + +struct dsa_bus dsa_bus = { + .bus = { + .scan = dsa_scan, + .probe = dsa_probe, + .find_device = dsa_find_device, + .get_iommu_class = dsa_get_iommu_class, + .parse = dsa_addr_parse, + }, + .driver = { + .name = "dmadev_idxd" + }, + .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list), +}; + +static inline const char * +dsa_get_dev_path(void) +{ + const char *path = getenv("DSA_DEV_PATH"); + return path ? path : DSA_DEV_PATH; +} + +static inline const char * +dsa_get_sysfs_path(void) +{ + const char *path = getenv("DSA_SYSFS_PATH"); + return path ? 
path : DSA_SYSFS_PATH; +} + +static void * +idxd_bus_mmap_wq(struct rte_dsa_device *dev) +{ + void *addr; + char path[PATH_MAX]; + int fd; + + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); + fd = open(path, O_RDWR); + if (fd < 0) { + IDXD_PMD_ERR("Failed to open device path: %s", path); + return NULL; + } + + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + IDXD_PMD_ERR("Failed to mmap device %s", path); + return NULL; + } + + return addr; +} + +static int +read_wq_string(struct rte_dsa_device *dev, const char *filename, + char *value, size_t valuelen) +{ + char sysfs_node[PATH_MAX]; + int len; + int fd; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + fd = open(sysfs_node, O_RDONLY); + if (fd < 0) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + len = read(fd, value, valuelen - 1); + close(fd); + if (len < 0) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + value[len] = '\0'; + return 0; +} + +static int +read_wq_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +read_device_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s", + 
dsa_get_sysfs_path(), dev->addr.device_id, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +idxd_probe_dsa(struct rte_dsa_device *dev) +{ + struct idxd_dmadev idxd = {0}; + int ret = 0; + + IDXD_PMD_INFO("Probing device %s on numa node %d", + dev->wq_name, dev->device.numa_node); + if (read_wq_int(dev, "size", &ret) < 0) + return -1; + idxd.max_batches = ret; + if (read_wq_int(dev, "max_batch_size", &ret) < 0) + return -1; + idxd.max_batch_size = ret; + idxd.qid = dev->addr.wq_id; + idxd.sva_support = 1; + + idxd.portal = idxd_bus_mmap_wq(dev); + if (idxd.portal == NULL) { + IDXD_PMD_ERR("WQ mmap failed"); + return -ENOENT; + } + + return 0; +} + +static int +is_for_this_process_use(const char *name) +{ + char *runtime_dir = strdup(rte_eal_get_runtime_dir()); + char *prefix = basename(runtime_dir); + int prefixlen = strlen(prefix); + int retval = 0; + + if (strncmp(name, "dpdk_", 5) == 0) + retval = 1; + if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_') + retval = 1; + + free(runtime_dir); + return retval; +} + +static int +dsa_probe(void) +{ + struct rte_dsa_device *dev; + + TAILQ_FOREACH(dev, &dsa_bus.device_list, next) { + char type[64], name[64]; + + if (read_wq_string(dev, "type", type, sizeof(type)) < 0 || + read_wq_string(dev, "name", name, sizeof(name)) < 0) + continue; + + if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) { + dev->device.driver = &dsa_bus.driver; + idxd_probe_dsa(dev); + continue; + } + IDXD_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name); + } + + return 0; +} + +static int +dsa_scan(void) +{ + const char *path = dsa_get_dev_path(); + struct dirent *wq; + DIR *dev_dir; + + dev_dir 
= opendir(path); + if (dev_dir == NULL) { + if (errno == ENOENT) + return 0; /* no bus, return without error */ + IDXD_PMD_ERR("%s(): opendir '%s' failed: %s", + __func__, path, strerror(errno)); + return -1; + } + + while ((wq = readdir(dev_dir)) != NULL) { + struct rte_dsa_device *dev; + int numa_node = -1; + + if (strncmp(wq->d_name, "wq", 2) != 0) + continue; + if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) { + IDXD_PMD_ERR("%s(): wq name too long: '%s', skipping", + __func__, wq->d_name); + continue; + } + IDXD_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name); + + dev = malloc(sizeof(*dev)); + if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) { + IDXD_PMD_ERR("Error parsing WQ name: %s", wq->d_name); + free(dev); + continue; + } + dev->device.bus = &dsa_bus.bus; + strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name)); + TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next); + devcount++; + + read_device_int(dev, "numa_node", &numa_node); + dev->device.numa_node = numa_node; + dev->device.name = dev->wq_name; + } + + closedir(dev_dir); + return 0; +} + +static struct rte_device * +dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list); + + /* the rte_device struct must be at start of dsa structure */ + RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0); + + if (start != NULL) /* jump to start point if given */ + dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next); + while (dev != NULL) { + if (cmp(&dev->device, data) == 0) + return &dev->device; + dev = TAILQ_NEXT(dev, next); + } + return NULL; +} + +static enum rte_iova_mode +dsa_get_iommu_class(void) +{ + /* if there are no devices, report don't care, otherwise VA mode */ + return devcount > 0 ? 
RTE_IOVA_VA : RTE_IOVA_DC; +} + +static int +dsa_addr_parse(const char *name, void *addr) +{ + struct dsa_wq_addr *wq = addr; + unsigned int device_id, wq_id; + + if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) { + IDXD_PMD_DEBUG("Parsing WQ name failed: %s", name); + return -1; + } + + wq->device_id = device_id; + wq->wq_id = wq_id; + return 0; +} + +RTE_REGISTER_BUS(dsa, dsa_bus.bus); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index c6a7dcd72f..b8a7d7dab6 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -24,4 +24,38 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_dmadev { + /* counters to track the batches */ + unsigned short max_batches; + unsigned short batch_idx_read; + unsigned short batch_idx_write; + + /* track descriptors and handles */ + unsigned short desc_ring_mask; + unsigned short ids_avail; /* handles for ops completed */ + unsigned short ids_returned; /* the read pointer for hdls/desc rings */ + unsigned short batch_start; /* start+size == write pointer for hdls/desc */ + unsigned short batch_size; + + void *portal; /* address to write the batch descriptor */ + + struct idxd_completion *batch_comp_ring; + unsigned short *batch_idx_ring; /* store where each batch ends */ + + rte_iova_t batch_iova; /* base address of the batch comp ring */ + rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ + + unsigned short max_batch_size; + + struct rte_dma_dev *dmadev; + uint8_t sva_support; + uint8_t qid; + + union { + struct { + unsigned int dsa_id; + } bus; + } u; +}; + #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 11620ba156..37af6e1b8f 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -9,3 +9,7 @@ sources = files( 
'idxd_common.c', 'idxd_pci.c' ) + +if is_linux + sources += files('idxd_bus.c') +endif diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h index f5d23017b1..7394f0932b 100644 --- a/lib/dmadev/rte_dmadev.h +++ b/lib/dmadev/rte_dmadev.h @@ -149,6 +149,7 @@ #include <rte_bitops.h> #include <rte_common.h> #include <rte_compat.h> +#include <rte_eal.h> #include <rte_dev.h> #ifdef __cplusplus -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
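Note for readers of this archive (not part of the patch): the bus scan above turns a /dev/dsa entry name such as wq0.1 into a device/WQ pair, and the probe step then keeps only workqueues whose configured name starts with dpdk_ or with the process's file prefix followed by an underscore. The standalone sketch below mirrors that logic without any DPDK dependency. The names struct wq_addr, parse_wq_name() and wq_is_for_process() are hypothetical; the prefix is passed in directly here, whereas the driver derives it from basename(rte_eal_get_runtime_dir()).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical standalone mirror of the driver's struct dsa_wq_addr. */
struct wq_addr {
	unsigned int device_id;
	unsigned int wq_id;
};

/* Parse "wq<device>.<wq>" the way dsa_addr_parse() does, using sscanf(). */
static int
parse_wq_name(const char *name, struct wq_addr *addr)
{
	unsigned int device_id, wq_id;

	if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2)
		return -1;
	addr->device_id = device_id;
	addr->wq_id = wq_id;
	return 0;
}

/*
 * Return 1 if the WQ's configured name starts with "dpdk_" or with
 * "<file_prefix>_", otherwise 0. file_prefix is an assumption of this
 * sketch: the caller supplies it instead of querying the EAL.
 */
static int
wq_is_for_process(const char *wq_label, const char *file_prefix)
{
	size_t prefixlen = strlen(file_prefix);

	if (strncmp(wq_label, "dpdk_", 5) == 0)
		return 1;
	if (strncmp(wq_label, file_prefix, prefixlen) == 0 &&
			wq_label[prefixlen] == '_')
		return 1;
	return 0;
}
```

For example, a WQ configured with --name=dpdk_app1 is accepted by any DPDK process, while one named app2_wq is accepted only by a process whose prefix is app2. The underscore check keeps a prefix of app2 from also claiming a queue named app2wq.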
* Re: [dpdk-dev] [PATCH v10 03/16] dma/idxd: add bus device probing
  2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 03/16] dma/idxd: add bus device probing Kevin Laatz
@ 2021-10-20 6:54 ` fengchengwen
  0 siblings, 0 replies; 243+ messages in thread
From: fengchengwen @ 2021-10-20 6:54 UTC (permalink / raw)
  To: Kevin Laatz, dev; +Cc: thomas, bruce.richardson, jerinj, conor.walsh

On 2021/10/19 22:10, Kevin Laatz wrote:
> Add the basic device probing for DSA devices bound to the IDXD kernel
> driver. These devices can be configured via sysfs and made available to
> DPDK if they are found during bus scan. Relevant documentation is included.
>

[snip]

> diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
> index f5d23017b1..7394f0932b 100644
> --- a/lib/dmadev/rte_dmadev.h
> +++ b/lib/dmadev/rte_dmadev.h
> @@ -149,6 +149,7 @@
> #include <rte_bitops.h>
> #include <rte_common.h>
> #include <rte_compat.h>
> +#include <rte_eal.h>

Why add rte_eal.h to rte_dmadev.h, just use rte_eal_get_runtime_dir() ?

Suggest add rte_eal.h in the PMD's C files.

> #include <rte_dev.h>
>
> #ifdef __cplusplus
>

^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v10 04/16] dma/idxd: create dmadev instances on bus probe 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (2 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 03/16] dma/idxd: add bus device probing Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-20 7:10 ` fengchengwen 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz ` (11 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the bus scan/probe, create a dmadev instance for each HW queue. Internal structures required for device creation are also added. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 19 ++++++++++ drivers/dma/idxd/idxd_common.c | 61 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 27 ++++++++++++++ drivers/dma/idxd/idxd_internal.h | 7 ++++ 4 files changed, 114 insertions(+) create mode 100644 drivers/dma/idxd/idxd_hw_defs.h diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index ef589af30e..b48fa954ed 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -85,6 +85,18 @@ dsa_get_sysfs_path(void) return path ? 
path : DSA_SYSFS_PATH; } +static int +idxd_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->data->dev_private; + munmap(idxd->portal, 0x1000); + return 0; +} + +static const struct rte_dma_dev_ops idxd_bus_ops = { + .dev_close = idxd_dev_close, +}; + static void * idxd_bus_mmap_wq(struct rte_dsa_device *dev) { @@ -206,6 +218,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -1; idxd.max_batch_size = ret; idxd.qid = dev->addr.wq_id; + idxd.u.bus.dsa_id = dev->addr.device_id; idxd.sva_support = 1; idxd.portal = idxd_bus_mmap_wq(dev); @@ -214,6 +227,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -ENOENT; } + ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_bus_ops); + if (ret) { + IDXD_PMD_ERR("Failed to create rawdev %s", dev->wq_name); + return ret; + } + return 0; } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index e00ddbe5ef..5abff34292 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,10 +2,71 @@ * Copyright 2021 Intel Corporation */ +#include <rte_malloc.h> +#include <rte_common.h> #include <rte_log.h> #include "idxd_internal.h" +#define IDXD_PMD_NAME_STR "dmadev_idxd" + +int +idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, + const struct rte_dma_dev_ops *ops) +{ + struct idxd_dmadev *idxd = NULL; + struct rte_dma_dev *dmadev = NULL; + int ret = 0; + + if (!name) { + IDXD_PMD_ERR("Invalid name of the device!"); + ret = -EINVAL; + goto cleanup; + } + + /* Allocate device structure */ + dmadev = rte_dma_pmd_allocate(name, dev->numa_node, sizeof(struct idxd_dmadev)); + if (dmadev == NULL) { + IDXD_PMD_ERR("Unable to allocate raw device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->dev_ops = ops; + dmadev->device = dev; + + idxd = dmadev->data->dev_private; + *idxd = *base_idxd; /* copy over the main fields already passed in */ + idxd->dmadev = dmadev; + + /* allocate batch index ring and 
completion ring. + * The +1 is because we can never fully use + * the ring, otherwise read == write means both full and empty. + */ + idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) + + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1), + sizeof(idxd->batch_comp_ring[0])); + if (idxd->batch_comp_ring == NULL) { + IDXD_PMD_ERR("Unable to reserve memory for batch data\n"); + ret = -ENOMEM; + goto cleanup; + } + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1]; + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring); + + dmadev->fp_obj->dev_private = idxd; + + idxd->dmadev->state = RTE_DMA_DEV_READY; + + return 0; + +cleanup: + if (dmadev) + rte_dma_pmd_release(name); + + return ret; +} + int idxd_pmd_logtype; RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h new file mode 100644 index 0000000000..a92d462d01 --- /dev/null +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_HW_DEFS_H_ +#define _IDXD_HW_DEFS_H_ + +#define IDXD_COMP_STATUS_INCOMPLETE 0 +#define IDXD_COMP_STATUS_SUCCESS 1 +#define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 +#define IDXD_COMP_STATUS_INVALID_SIZE 0x13 +#define IDXD_COMP_STATUS_SKIPPED 0xFF /* not official IDXD error, needed as placeholder */ + +/** + * Completion record structure written back by DSA + */ +struct idxd_completion { + uint8_t status; + uint8_t result; + /* 16-bits pad here */ + uint32_t completed_size; /* data length, or descriptors for batch */ + + rte_iova_t fault_address; + uint32_t invalid_flags; +} __rte_aligned(32); + +#endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index b8a7d7dab6..8f1cdf6102 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -5,6 +5,10 @@ #ifndef _IDXD_INTERNAL_H_ #define 
_IDXD_INTERNAL_H_ +#include <rte_dmadev_pmd.h> + +#include "idxd_hw_defs.h" + /** * @file idxd_internal.h * @@ -58,4 +62,7 @@ struct idxd_dmadev { } u; }; +int idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); + #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
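The allocation comment in idxd_dmadev_create() says the rings get max_batches + 1 entries because a ring in which every slot is usable cannot tell "full" from "empty" (both would have read == write). A minimal sketch of that convention follows, in plain C. The struct and function names are hypothetical and are not taken from the driver; the patch only shows the allocation, not how the indices are advanced.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring state; field names are illustrative, not the driver's. */
struct batch_ring {
	uint16_t nb_slots; /* allocated slots, i.e. usable batches + 1 */
	uint16_t read;     /* oldest batch not yet completed */
	uint16_t write;    /* next slot to fill */
};

/* Empty is the only state in which the two indices are equal. */
static bool
ring_empty(const struct batch_ring *r)
{
	return r->read == r->write;
}

/* Full means one more push would make write catch up with read. */
static bool
ring_full(const struct batch_ring *r)
{
	return (r->write + 1) % r->nb_slots == r->read;
}

/* Claim the next slot; returns -1 if the ring is full. */
static int
ring_push(struct batch_ring *r)
{
	if (ring_full(r))
		return -1;
	r->write = (uint16_t)((r->write + 1) % r->nb_slots);
	return 0;
}

/* Retire the oldest slot; returns -1 if the ring is empty. */
static int
ring_pop(struct batch_ring *r)
{
	if (ring_empty(r))
		return -1;
	r->read = (uint16_t)((r->read + 1) % r->nb_slots);
	return 0;
}
```

With nb_slots set to max_batches + 1, exactly max_batches pushes succeed before ring_full() reports true, which matches the sizing used in the rte_zmalloc() call.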
* Re: [dpdk-dev] [PATCH v10 04/16] dma/idxd: create dmadev instances on bus probe
From: fengchengwen @ 2021-10-20 7:10 UTC
To: Kevin Laatz, dev; +Cc: thomas, bruce.richardson, jerinj, conor.walsh

On 2021/10/19 22:10, Kevin Laatz wrote:
> When a suitable device is found during the bus scan/probe, create a dmadev
> instance for each HW queue. Internal structures required for device
> creation are also added.
>

[snip]

>  static void *
>  idxd_bus_mmap_wq(struct rte_dsa_device *dev)
>  {
> @@ -206,6 +218,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev)
>  		return -1;
>  	idxd.max_batch_size = ret;
>  	idxd.qid = dev->addr.wq_id;
> +	idxd.u.bus.dsa_id = dev->addr.device_id;
>  	idxd.sva_support = 1;
>
>  	idxd.portal = idxd_bus_mmap_wq(dev);
> @@ -214,6 +227,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev)
>  		return -ENOENT;
>  	}
>
> +	ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_bus_ops);
> +	if (ret) {
> +		IDXD_PMD_ERR("Failed to create rawdev %s", dev->wq_name);

rawdev -> dmadev

> +		return ret;
> +	}
> +
>  	return 0;
>  }
>
> diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c
> index e00ddbe5ef..5abff34292 100644
> --- a/drivers/dma/idxd/idxd_common.c
> +++ b/drivers/dma/idxd/idxd_common.c
> @@ -2,10 +2,71 @@
>   * Copyright 2021 Intel Corporation
>   */
>
> +#include <rte_malloc.h>
> +#include <rte_common.h>
>  #include <rte_log.h>
>
>  #include "idxd_internal.h"
>
> +#define IDXD_PMD_NAME_STR "dmadev_idxd"
> +
> +int
> +idxd_dmadev_create(const char *name, struct rte_device *dev,
> +		const struct idxd_dmadev *base_idxd,
> +		const struct rte_dma_dev_ops *ops)
> +{
> +	struct idxd_dmadev *idxd = NULL;
> +	struct rte_dma_dev *dmadev = NULL;
> +	int ret = 0;
> +
> +	if (!name) {
> +		IDXD_PMD_ERR("Invalid name of the device!");
> +		ret = -EINVAL;
> +		goto cleanup;
> +	}
> +
> +	/* Allocate device structure */
> +	dmadev = rte_dma_pmd_allocate(name, dev->numa_node, sizeof(struct idxd_dmadev));
> +	if (dmadev == NULL) {
> +		IDXD_PMD_ERR("Unable to allocate raw device");

raw -> dma
It would be worth checking the whole patch set for the 'raw' keyword.

> +		ret = -ENOMEM;
> +		goto cleanup;
> +	}
> +	dmadev->dev_ops = ops;
> +	dmadev->device = dev;
> +
> +	idxd = dmadev->data->dev_private;
> +	*idxd = *base_idxd; /* copy over the main fields already passed in */
> +	idxd->dmadev = dmadev;
> +
> +	/* allocate batch index ring and completion ring.
> +	 * The +1 is because we can never fully use
> +	 * the ring, otherwise read == write means both full and empty.
> +	 */
> +	idxd->batch_comp_ring = rte_zmalloc(NULL, (sizeof(idxd->batch_idx_ring[0]) +
> +			sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1),
> +			sizeof(idxd->batch_comp_ring[0]));

I infer that batch_comp_ring will be accessed by hardware, so it may be
better to use rte_zmalloc_socket(): rte_zmalloc() will use rte_socket_id(),
which may be a different socket from the device's when it is called.

> +	if (idxd->batch_comp_ring == NULL) {
> +		IDXD_PMD_ERR("Unable to reserve memory for batch data\n");
> +		ret = -ENOMEM;
>

[snip]
* [dpdk-dev] [PATCH v10 05/16] dma/idxd: create dmadev instances on pci probe 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (3 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-20 7:34 ` fengchengwen 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 06/16] dma/idxd: add datapath structures Kevin Laatz ` (10 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the PCI probe, create a dmadev instance for each HW queue. HW definitions required are also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_hw_defs.h | 63 ++++++++ drivers/dma/idxd/idxd_internal.h | 13 ++ drivers/dma/idxd/idxd_pci.c | 259 ++++++++++++++++++++++++++++++- 3 files changed, 332 insertions(+), 3 deletions(-) diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index a92d462d01..86f7f3526b 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -24,4 +24,67 @@ struct idxd_completion { uint32_t invalid_flags; } __rte_aligned(32); +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ + +#define IDXD_CMD_SHIFT 20 +enum rte_idxd_cmds { + idxd_enable_dev = 1, + idxd_disable_dev, + idxd_drain_all, + idxd_abort_all, + idxd_reset_device, + idxd_enable_wq, + idxd_disable_wq, + idxd_drain_wq, + idxd_abort_wq, + idxd_reset_wq, +}; + +/* General bar0 registers */ +struct rte_idxd_bar0 { + uint32_t __rte_cache_aligned version; /* offset 0x00 */ + uint64_t __rte_aligned(0x10) gencap; /* offset 0x10 */ + uint64_t __rte_aligned(0x10) wqcap; /* offset 
0x20 */ + uint64_t __rte_aligned(0x10) grpcap; /* offset 0x30 */ + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ +}; + +/* workqueue config is provided by array of uint32_t. */ +enum rte_idxd_wqcfg { + wq_size_idx, /* size is in first 32-bit value */ + wq_threshold_idx, /* WQ threshold second 32-bits */ + wq_mode_idx, /* WQ mode and other flags */ + wq_sizes_idx, /* WQ transfer and batch sizes */ + wq_occ_int_idx, /* WQ occupancy interrupt handle */ + wq_occ_limit_idx, /* WQ occupancy limit */ + wq_state_idx, /* WQ state and occupancy state */ +}; + +#define WQ_MODE_SHARED 0 +#define WQ_MODE_DEDICATED 1 +#define WQ_PRIORITY_SHIFT 4 +#define WQ_BATCH_SZ_SHIFT 5 +#define WQ_STATE_SHIFT 30 +#define WQ_STATE_MASK 0x3 + +struct rte_idxd_grpcfg { + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ + uint64_t grpengcfg; /* offset 32 */ + uint32_t grpflags; /* offset 40 */ +}; + +#define GENSTS_DEV_STATE_MASK 0x03 +#define CMDSTATUS_ACTIVE_SHIFT 31 +#define CMDSTATUS_ACTIVE_MASK (1 << 31) +#define CMDSTATUS_ERR_MASK 0xFF + #endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 8f1cdf6102..8473bf939f 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -6,6 +6,7 @@ #define _IDXD_INTERNAL_H_ #include <rte_dmadev_pmd.h> +#include <rte_spinlock.h> #include "idxd_hw_defs.h" @@ -28,6 +29,16 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) 
IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_pci_common { + rte_spinlock_t lk; + + uint8_t wq_cfg_sz; + volatile struct rte_idxd_bar0 *regs; + volatile uint32_t *wq_regs_base; + volatile struct rte_idxd_grpcfg *grp_regs; + volatile void *portals; +}; + struct idxd_dmadev { /* counters to track the batches */ unsigned short max_batches; @@ -59,6 +70,8 @@ struct idxd_dmadev { struct { unsigned int dsa_id; } bus; + + struct idxd_pci_common *pci; } u; }; diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 79e4aadcab..7127483b10 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -3,6 +3,9 @@ */ #include <rte_bus_pci.h> +#include <rte_devargs.h> +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> #include "idxd_internal.h" @@ -16,17 +19,267 @@ const struct rte_pci_id pci_id_idxd_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static inline int +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) +{ + uint8_t err_code; + uint16_t qid = idxd->qid; + int i = 0; + + if (command >= idxd_disable_wq && command <= idxd_reset_wq) + qid = (1 << qid); + rte_spinlock_lock(&idxd->u.pci->lk); + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; + + do { + rte_pause(); + err_code = idxd->u.pci->regs->cmdstatus; + if (++i >= 1000) { + IDXD_PMD_ERR("Timeout waiting for command response from HW"); + rte_spinlock_unlock(&idxd->u.pci->lk); + return err_code; + } + } while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK); + rte_spinlock_unlock(&idxd->u.pci->lk); + + return err_code & CMDSTATUS_ERR_MASK; +} + +static uint32_t * +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) +{ + return RTE_PTR_ADD(pci->wq_regs_base, + (uintptr_t)wq_idx << (5 + pci->wq_cfg_sz)); +} + +static int +idxd_is_wq_enabled(struct idxd_dmadev *idxd) +{ + uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx]; + return ((state 
>> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; +} + +static int +idxd_pci_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + /* disable the device */ + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); + if (err_code) { + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device disabled OK"); + + /* free device memory */ + IDXD_PMD_DEBUG("Freeing device driver memory"); + rte_free(idxd->batch_idx_ring); + + return 0; +} + +static const struct rte_dma_dev_ops idxd_pci_ops = { + .dev_close = idxd_pci_dev_close, +}; + +/* each portal uses 4 x 4k pages */ +#define IDXD_PORTAL_SIZE (4096 * 4) + +static int +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, + unsigned int max_queues) +{ + struct idxd_pci_common *pci; + uint8_t nb_groups, nb_engines, nb_wqs; + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ + uint16_t wq_size, total_wq_size; + uint8_t lg2_max_batch, lg2_max_copy_size; + unsigned int i, err_code; + + pci = malloc(sizeof(*pci)); + if (pci == NULL) { + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); + goto err; + } + rte_spinlock_init(&pci->lk); + + /* assign the bar registers, and then configure device */ + pci->regs = dev->mem_resource[0].addr; + grp_offset = (uint16_t)pci->regs->offsets[0]; + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); + pci->portals = dev->mem_resource[2].addr; + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; + + /* sanity check device status */ + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { + /* need function-level-reset (FLR) or is enabled */ + IDXD_PMD_ERR("Device status is not disabled, cannot init"); + goto err; + } + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { + /* command in progress */ + IDXD_PMD_ERR("Device has 
a command in progress, cannot init"); + goto err; + } + + /* read basic info about the hardware for use when configuring */ + nb_groups = (uint8_t)pci->regs->grpcap; + nb_engines = (uint8_t)pci->regs->engcap; + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); + total_wq_size = (uint16_t)pci->regs->wqcap; + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F; + lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F; + + IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u", + nb_groups, nb_engines, nb_wqs); + + /* zero out any old config */ + for (i = 0; i < nb_groups; i++) { + pci->grp_regs[i].grpengcfg = 0; + pci->grp_regs[i].grpwqcfg[0] = 0; + } + for (i = 0; i < nb_wqs; i++) + idxd_get_wq_cfg(pci, i)[0] = 0; + + /* limit queues if necessary */ + if (max_queues != 0 && nb_wqs > max_queues) { + nb_wqs = max_queues; + if (nb_engines > max_queues) + nb_engines = max_queues; + if (nb_groups > max_queues) + nb_engines = max_queues; + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); + } + + /* put each engine into a separate group to avoid reordering */ + if (nb_groups > nb_engines) + nb_groups = nb_engines; + if (nb_groups < nb_engines) + nb_engines = nb_groups; + + /* assign engines to groups, round-robin style */ + for (i = 0; i < nb_engines; i++) { + IDXD_PMD_DEBUG("Assigning engine %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); + } + + /* now do the same for queues and give work slots to each queue */ + wq_size = total_wq_size / nb_wqs; + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", + wq_size, lg2_max_batch, lg2_max_copy_size); + for (i = 0; i < nb_wqs; i++) { + /* add engine "i" to a group */ + IDXD_PMD_DEBUG("Assigning work queue %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); + /* now configure it, in terms of size, max batch, mode */ + idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size; + idxd_get_wq_cfg(pci, i)[wq_mode_idx] 
= (1 << WQ_PRIORITY_SHIFT) | + WQ_MODE_DEDICATED; + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); + } + + /* dump the group configuration to output */ + for (i = 0; i < nb_groups; i++) { + IDXD_PMD_DEBUG("## Group %d", i); + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); + IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg); + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags); + } + + idxd->u.pci = pci; + idxd->max_batches = wq_size; + + /* enable the device itself */ + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); + if (err_code) { + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device enabled OK"); + + return nb_wqs; + +err: + free(pci); + return -1; +} + static int idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) { - int ret = 0; + struct idxd_dmadev idxd = {0}; + uint8_t nb_wqs; + int qid, ret = 0; char name[PCI_PRI_STR_SIZE]; + unsigned int max_queues = 0; rte_pci_device_name(&dev->addr, name, sizeof(name)); IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); dev->device.driver = &drv->driver; - return ret; + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { + /* if the number of devargs grows beyond just 1, use rte_kvargs */ + if (sscanf(dev->device.devargs->args, + "max_queues=%u", &max_queues) != 1) { + IDXD_PMD_ERR("Invalid device parameter: '%s'", + dev->device.devargs->args); + return -1; + } + } + + ret = init_pci_device(dev, &idxd, max_queues); + if (ret < 0) { + IDXD_PMD_ERR("Error initializing PCI hardware"); + return ret; + } + if (idxd.u.pci->portals == NULL) { + IDXD_PMD_ERR("Error, invalid portal assigned during initialization\n"); + return -EINVAL; + } + nb_wqs = (uint8_t)ret; + + /* set up one device for each queue */ + for (qid = 0; qid < nb_wqs; qid++) { + char qname[32]; + + /* add the queue number to each 
device name */ + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); + idxd.qid = qid; + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, + qid * IDXD_PORTAL_SIZE); + if (idxd_is_wq_enabled(&idxd)) + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); + ret = idxd_dmadev_create(qname, &dev->device, + &idxd, &idxd_pci_ops); + if (ret != 0) { + IDXD_PMD_ERR("Failed to create dmadev %s", name); + if (qid == 0) /* if no devices using this, free pci */ + free(idxd.u.pci); + return ret; + } + } + + return 0; +} + +static int +idxd_dmadev_destroy(const char *name) +{ + int ret; + + /* rte_dma_close is called by pmd_release */ + ret = rte_dma_pmd_release(name); + if (ret) + IDXD_PMD_DEBUG("Device cleanup failed"); + + return 0; } static int @@ -39,7 +292,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev) IDXD_PMD_INFO("Closing %s on NUMA node %d", name, dev->device.numa_node); - return 0; + return idxd_dmadev_destroy(name); } struct rte_pci_driver idxd_pmd_drv_pci = { -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
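Two pieces of arithmetic in init_pci_device() are easy to misread in diff form: the per-WQ config stride implied by idxd_get_wq_cfg(), and the round-robin placement of engines and work queues into groups. The sketch below restates both in standalone C. The helper names are hypothetical; the 32-byte base stride is only read off the patch's shift by (5 + wq_cfg_sz) and has not been checked against the hardware specification.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Byte offset of a work queue's config block from the WQ register base.
 * Mirrors the patch's (wq_idx << (5 + wq_cfg_sz)): each block spans
 * 32 bytes scaled by 2^wq_cfg_sz.
 */
static size_t
wq_cfg_offset(uint8_t wq_idx, uint8_t wq_cfg_sz)
{
	return (size_t)wq_idx << (5 + wq_cfg_sz);
}

/*
 * Spread nb_items (engines or work queues, at most 64) over nb_groups
 * bitmasks, item i going to group i % nb_groups.
 * The caller must zero group_masks first and pass nb_groups > 0.
 */
static void
assign_round_robin(uint64_t *group_masks, unsigned int nb_groups,
		unsigned int nb_items)
{
	unsigned int i;

	for (i = 0; i < nb_items; i++)
		group_masks[i % nb_groups] |= (UINT64_C(1) << i);
}
```

For example, four work queues over two groups give masks 0x5 and 0xA, which is the pattern the GRPWQCFG debug output in the patch would be expected to show under the same assumptions.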
* Re: [dpdk-dev] [PATCH v10 05/16] dma/idxd: create dmadev instances on pci probe
From: fengchengwen @ 2021-10-20 7:34 UTC
To: Kevin Laatz, dev; +Cc: thomas, bruce.richardson, jerinj, conor.walsh

On 2021/10/19 22:10, Kevin Laatz wrote:
> When a suitable device is found during the PCI probe, create a dmadev
> instance for each HW queue. HW definitions required are also included.
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> Reviewed-by: Conor Walsh <conor.walsh@intel.com>
> ---

[snip]

>
> +static inline int
> +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command)
> +{
> +	uint8_t err_code;
> +	uint16_t qid = idxd->qid;
> +	int i = 0;
> +
> +	if (command >= idxd_disable_wq && command <= idxd_reset_wq)
> +		qid = (1 << qid);
> +	rte_spinlock_lock(&idxd->u.pci->lk);
> +	idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid;
> +
> +	do {
> +		rte_pause();
> +		err_code = idxd->u.pci->regs->cmdstatus;
> +		if (++i >= 1000) {
> +			IDXD_PMD_ERR("Timeout waiting for command response from HW");
> +			rte_spinlock_unlock(&idxd->u.pci->lk);
> +			return err_code;
> +		}
> +	} while (idxd->u.pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK);

Why not use while (err_code & CMDSTATUS_ACTIVE_MASK)? The cmdstatus
register may change between the load into err_code and the check in the
while condition, so I suggest always using err_code.

> +	rte_spinlock_unlock(&idxd->u.pci->lk);
> +
> +	return err_code & CMDSTATUS_ERR_MASK;
> +}
> +
> +static uint32_t *
> +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx)
> +{
> +	return RTE_PTR_ADD(pci->wq_regs_base,
> +			(uintptr_t)wq_idx << (5 + pci->wq_cfg_sz));
> +}
> +
> +static int
> +idxd_is_wq_enabled(struct idxd_dmadev *idxd)
> +{
> +	uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx];
> +	return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1;
> +}
> +
> +static int
> +idxd_pci_dev_close(struct rte_dma_dev *dev)
> +{
> +	struct idxd_dmadev *idxd = dev->fp_obj->dev_private;
> +	uint8_t err_code;
> +
> +	/* disable the device */
> +	err_code = idxd_pci_dev_command(idxd, idxd_disable_dev);
> +	if (err_code) {
> +		IDXD_PMD_ERR("Error disabling device: code %#x", err_code);
> +		return err_code;
> +	}
> +	IDXD_PMD_DEBUG("IDXD Device disabled OK");
> +
> +	/* free device memory */
> +	IDXD_PMD_DEBUG("Freeing device driver memory");
> +	rte_free(idxd->batch_idx_ring);
> +
> +	return 0;
> +}
> +
> +static const struct rte_dma_dev_ops idxd_pci_ops = {
> +	.dev_close = idxd_pci_dev_close,
> +};
> +
> +/* each portal uses 4 x 4k pages */
> +#define IDXD_PORTAL_SIZE (4096 * 4)
> +
> +static int
> +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd,
> +		unsigned int max_queues)
> +{
> +	struct idxd_pci_common *pci;
> +	uint8_t nb_groups, nb_engines, nb_wqs;
> +	uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */
> +	uint16_t wq_size, total_wq_size;
> +	uint8_t lg2_max_batch, lg2_max_copy_size;
> +	unsigned int i, err_code;
> +
> +	pci = malloc(sizeof(*pci));
> +	if (pci == NULL) {
> +		IDXD_PMD_ERR("%s: Can't allocate memory", __func__);
> +		goto err;
> +	}
> +	rte_spinlock_init(&pci->lk);
> +
> +	/* assign the bar registers, and then configure device */
> +	pci->regs = dev->mem_resource[0].addr;
> +	grp_offset = (uint16_t)pci->regs->offsets[0];
> +	pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset *
0x100);
> +	wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16);
> +	pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100);
> +	pci->portals = dev->mem_resource[2].addr;
> +	pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F;
> +
> +	/* sanity check device status */
> +	if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) {
> +		/* need function-level-reset (FLR) or is enabled */
> +		IDXD_PMD_ERR("Device status is not disabled, cannot init");
> +		goto err;
> +	}
> +	if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) {
> +		/* command in progress */
> +		IDXD_PMD_ERR("Device has a command in progress, cannot init");
> +		goto err;
> +	}
> +
> +	/* read basic info about the hardware for use when configuring */
> +	nb_groups = (uint8_t)pci->regs->grpcap;
> +	nb_engines = (uint8_t)pci->regs->engcap;
> +	nb_wqs = (uint8_t)(pci->regs->wqcap >> 16);
> +	total_wq_size = (uint16_t)pci->regs->wqcap;
> +	lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F;
> +	lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F;
> +
> +	IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u",
> +			nb_groups, nb_engines, nb_wqs);
> +
> +	/* zero out any old config */
> +	for (i = 0; i < nb_groups; i++) {
> +		pci->grp_regs[i].grpengcfg = 0;
> +		pci->grp_regs[i].grpwqcfg[0] = 0;
> +	}
> +	for (i = 0; i < nb_wqs; i++)
> +		idxd_get_wq_cfg(pci, i)[0] = 0;
> +
> +	/* limit queues if necessary */
> +	if (max_queues != 0 && nb_wqs > max_queues) {
> +		nb_wqs = max_queues;
> +		if (nb_engines > max_queues)
> +			nb_engines = max_queues;
> +		if (nb_groups > max_queues)
> +			nb_engines = max_queues;
> +		IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs);
> +	}
> +
> +	/* put each engine into a separate group to avoid reordering */
> +	if (nb_groups > nb_engines)
> +		nb_groups = nb_engines;
> +	if (nb_groups < nb_engines)
> +		nb_engines = nb_groups;
> +
> +	/* assign engines to groups, round-robin style */
> +	for (i = 0; i < nb_engines; i++) {
> +		IDXD_PMD_DEBUG("Assigning engine %u to group %u",
> +				i, i % nb_groups);
> +		pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i);
> +	}
> +
> +	/* now do the same for queues and give work slots to each queue */
> +	wq_size = total_wq_size / nb_wqs;
> +	IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u",
> +			wq_size, lg2_max_batch, lg2_max_copy_size);
> +	for (i = 0; i < nb_wqs; i++) {
> +		/* add engine "i" to a group */
> +		IDXD_PMD_DEBUG("Assigning work queue %u to group %u",
> +				i, i % nb_groups);
> +		pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i);
> +		/* now configure it, in terms of size, max batch, mode */
> +		idxd_get_wq_cfg(pci, i)[wq_size_idx] = wq_size;
> +		idxd_get_wq_cfg(pci, i)[wq_mode_idx] = (1 << WQ_PRIORITY_SHIFT) |
> +				WQ_MODE_DEDICATED;
> +		idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size |
> +				(lg2_max_batch << WQ_BATCH_SZ_SHIFT);
> +	}
> +
> +	/* dump the group configuration to output */
> +	for (i = 0; i < nb_groups; i++) {
> +		IDXD_PMD_DEBUG("## Group %d", i);
> +		IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]);
> +		IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg);
> +		IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags);
> +	}
> +
> +	idxd->u.pci = pci;
> +	idxd->max_batches = wq_size;
> +
> +	/* enable the device itself */
> +	err_code = idxd_pci_dev_command(idxd, idxd_enable_dev);
> +	if (err_code) {
> +		IDXD_PMD_ERR("Error enabling device: code %#x", err_code);
> +		return err_code;

1. err_code comes from idxd->u.pci->regs->cmdstatus, which may be > 0;
   I think it is better to return an explicit negative value.
2. I suggest using goto err here, which also frees the pci memory.

> +	}
> +	IDXD_PMD_DEBUG("IDXD Device enabled OK");
> +
> +	return nb_wqs;
> +
> +err:
> +	free(pci);
> +	return -1;
> +}
> +
>  static int
>  idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev)
>  {
> -	int ret = 0;
> +	struct idxd_dmadev idxd = {0};
> +	uint8_t nb_wqs;
> +	int qid, ret = 0;
>  	char name[PCI_PRI_STR_SIZE];
> +	unsigned int max_queues = 0;
>
>  	rte_pci_device_name(&dev->addr, name, sizeof(name));
>  	IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node);
>  	dev->device.driver = &drv->driver;
>
> -	return ret;
> +	if (dev->device.devargs && dev->device.devargs->args[0] != '\0') {
> +		/* if the number of devargs grows beyond just 1, use rte_kvargs */
> +		if (sscanf(dev->device.devargs->args,
> +				"max_queues=%u", &max_queues) != 1) {
> +			IDXD_PMD_ERR("Invalid device parameter: '%s'",
> +					dev->device.devargs->args);
> +			return -1;
> +		}
> +	}
> +
> +	ret = init_pci_device(dev, &idxd, max_queues);
> +	if (ret < 0) {
> +		IDXD_PMD_ERR("Error initializing PCI hardware");
> +		return ret;
> +	}
> +	if (idxd.u.pci->portals == NULL) {
> +		IDXD_PMD_ERR("Error, invalid portal assigned during initialization\n");

idxd.u.pci's memory needs to be freed here.

> +		return -EINVAL;
> +	}
> +	nb_wqs = (uint8_t)ret;
> +
> +	/* set up one device for each queue */
> +	for (qid = 0; qid < nb_wqs; qid++) {
> +		char qname[32];
> +
> +		/* add the queue number to each device name */
> +		snprintf(qname, sizeof(qname), "%s-q%d", name, qid);
> +		idxd.qid = qid;
> +		idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals,
> +				qid * IDXD_PORTAL_SIZE);
> +		if (idxd_is_wq_enabled(&idxd))
> +			IDXD_PMD_ERR("Error, WQ %u seems enabled", qid);
> +		ret = idxd_dmadev_create(qname, &dev->device,
> +				&idxd, &idxd_pci_ops);
> +		if (ret != 0) {
> +			IDXD_PMD_ERR("Failed to create dmadev %s", name);
> +			if (qid == 0) /* if no devices using this, free pci */
> +				free(idxd.u.pci);
> +			return ret;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +idxd_dmadev_destroy(const char *name)
> +{
> +	int ret;
> +
> +	/* rte_dma_close is called by pmd_release */
> +	ret = rte_dma_pmd_release(name);
> +	if (ret)
> +		IDXD_PMD_DEBUG("Device cleanup failed");
> +
> +	return 0;
>  }
>
>  static int
> @@ -39,7 +292,7 @@ idxd_dmadev_remove_pci(struct rte_pci_device *dev)
>  	IDXD_PMD_INFO("Closing %s on NUMA node %d",
>  			name, dev->device.numa_node);
>
> -	return 0;
> +	return idxd_dmadev_destroy(name);

The name should be the per-queue one built with
'snprintf(qname, sizeof(qname), "%s-q%d", name, qid)', and auxiliary
memory such as idxd.u.pci should also be freed.

>  }
>
>  struct rte_pci_driver idxd_pmd_drv_pci = {
>
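Two of the review comments above concern idxd_pci_dev_command(): test the same value that was loaded from cmdstatus rather than re-reading the register, and return explicit negative error codes. A minimal standalone sketch of how that could look is given below. It is not the author's follow-up patch. The function name, the poll limit constant and the choice of -EIO / -ETIMEDOUT are assumptions, and the rte_pause() call and spinlock are left out so the block compiles without DPDK. Note also that err_code is a uint8_t in the patch, so masking it with bit 31 would always yield zero; the sketch therefore keeps a 32-bit snapshot.

```c
#include <errno.h>
#include <stdint.h>

#define CMDSTATUS_ACTIVE_MASK (UINT32_C(1) << 31)
#define CMDSTATUS_ERR_MASK    UINT32_C(0xFF)
#define CMD_POLL_LIMIT        1000 /* same iteration bound as the patch */

/*
 * Poll a command status register until the active bit clears.
 * Returns 0 on success, -EIO if hardware reported an error (code stored
 * in *hw_err), or -ETIMEDOUT if the active bit never cleared.
 * Hypothetical helper: a real driver would also pause between reads.
 */
static int
wait_for_command(const volatile uint32_t *cmdstatus, uint8_t *hw_err)
{
	uint32_t status;
	int i;

	for (i = 0; i < CMD_POLL_LIMIT; i++) {
		/* one read per iteration; every decision uses this snapshot */
		status = *cmdstatus;
		if (!(status & CMDSTATUS_ACTIVE_MASK)) {
			*hw_err = (uint8_t)(status & CMDSTATUS_ERR_MASK);
			return *hw_err ? -EIO : 0;
		}
	}
	return -ETIMEDOUT;
}
```

A caller in the style of init_pci_device() could then treat any negative return as a failure and jump to its err label, so that the pci structure is freed on every error path.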
* [dpdk-dev] [PATCH v10 06/16] dma/idxd: add datapath structures 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (4 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-20 7:44 ` fengchengwen 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 07/16] dma/idxd: add configure and info_get functions Kevin Laatz ` (9 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data structures required for the data path for IDXD devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 33 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 41 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 4 ++++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 81 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b48fa954ed..3c0837ec52 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -95,6 +95,7 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, + .dev_dump = idxd_dump, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 5abff34292..f972260a56 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -10,6 +10,35 @@ #define IDXD_PMD_NAME_STR "dmadev_idxd" +int +idxd_dump(const struct rte_dma_dev *dev, FILE *f) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + unsigned int i; + + fprintf(f, "== IDXD Private Data ==\n"); + fprintf(f, " Portal: %p\n", 
idxd->portal); + fprintf(f, " Config: { ring_size: %u }\n", + idxd->qcfg.nb_desc); + fprintf(f, " Batch ring (sz = %u, max_batches = %u):\n\t", + idxd->max_batches + 1, idxd->max_batches); + for (i = 0; i <= idxd->max_batches; i++) { + fprintf(f, " %u ", idxd->batch_idx_ring[i]); + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) + fprintf(f, "[rd ptr, wr ptr] "); + else if (i == idxd->batch_idx_read) + fprintf(f, "[rd ptr] "); + else if (i == idxd->batch_idx_write) + fprintf(f, "[wr ptr] "); + if (i == idxd->max_batches) + fprintf(f, "\n"); + } + + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, @@ -19,6 +48,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, struct rte_dma_dev *dmadev = NULL; int ret = 0; + RTE_BUILD_BUG_ON(sizeof(struct idxd_hw_desc) != 64); + RTE_BUILD_BUG_ON(offsetof(struct idxd_hw_desc, size) != 32); + RTE_BUILD_BUG_ON(sizeof(struct idxd_completion) != 32); + if (!name) { IDXD_PMD_ERR("Invalid name of the device!"); ret = -EINVAL; diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index 86f7f3526b..55ca9f7f52 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -5,6 +5,47 @@ #ifndef _IDXD_HW_DEFS_H_ #define _IDXD_HW_DEFS_H_ +/* + * Defines used in the data path for interacting with IDXD hardware. + */ +#define IDXD_CMD_OP_SHIFT 24 +enum rte_idxd_ops { + idxd_op_nop = 0, + idxd_op_batch, + idxd_op_drain, + idxd_op_memmove, + idxd_op_fill +}; + +#define IDXD_FLAG_FENCE (1 << 0) +#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2) +#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3) +#define IDXD_FLAG_CACHE_CONTROL (1 << 8) + +/** + * Hardware descriptor used by DSA hardware, for both bursts and + * for individual operations. 
+ */ +struct idxd_hw_desc { + uint32_t pasid; + uint32_t op_flags; + rte_iova_t completion; + + RTE_STD_C11 + union { + rte_iova_t src; /* source address for copy ops etc. */ + rte_iova_t desc_addr; /* descriptor pointer for batch */ + }; + rte_iova_t dst; + + uint32_t size; /* length of data for op, or batch size */ + + uint16_t intr_handle; /* completion interrupt handle */ + + /* remaining 26 bytes are reserved */ + uint16_t __reserved[13]; +} __rte_aligned(64); + #define IDXD_COMP_STATUS_INCOMPLETE 0 #define IDXD_COMP_STATUS_SUCCESS 1 #define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 8473bf939f..5e253fdfbc 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -40,6 +40,8 @@ struct idxd_pci_common { }; struct idxd_dmadev { + struct idxd_hw_desc *desc_ring; + /* counters to track the batches */ unsigned short max_batches; unsigned short batch_idx_read; @@ -63,6 +65,7 @@ struct idxd_dmadev { unsigned short max_batch_size; struct rte_dma_dev *dmadev; + struct rte_dma_vchan_conf qcfg; uint8_t sva_support; uint8_t qid; @@ -77,5 +80,6 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); +int idxd_dump(const struct rte_dma_dev *dev, FILE *f); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 7127483b10..96c8c65cc0 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -76,12 +76,14 @@ idxd_pci_dev_close(struct rte_dma_dev *dev) /* free device memory */ IDXD_PMD_DEBUG("Freeing device driver memory"); rte_free(idxd->batch_idx_ring); + rte_free(idxd->desc_ring); return 0; } static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_close = idxd_pci_dev_close, + .dev_dump = idxd_dump, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ 
messages in thread
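For readers following the op_flags layout introduced in this patch, here is a small stand-alone sketch of how the opcode and flag bits combine into one 32-bit word. The defines and enum values are copied from idxd_hw_defs.h above; the two helper functions (demo_make_op_flags, demo_op_from_flags) are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Values copied from the patch (idxd_hw_defs.h). */
#define IDXD_CMD_OP_SHIFT 24
enum rte_idxd_ops {
	idxd_op_nop = 0,
	idxd_op_batch,
	idxd_op_drain,
	idxd_op_memmove,
	idxd_op_fill
};

#define IDXD_FLAG_FENCE                 (1 << 0)
#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2)
#define IDXD_FLAG_REQUEST_COMPLETION    (1 << 3)
#define IDXD_FLAG_CACHE_CONTROL         (1 << 8)

/* Hypothetical helper: opcode goes in the top byte, flags in the low bits. */
static inline uint32_t
demo_make_op_flags(enum rte_idxd_ops op, uint32_t flags)
{
	return ((uint32_t)op << IDXD_CMD_OP_SHIFT) | flags;
}

/* Hypothetical helper: recover the opcode from a combined word. */
static inline enum rte_idxd_ops
demo_op_from_flags(uint32_t op_flags)
{
	return (enum rte_idxd_ops)(op_flags >> IDXD_CMD_OP_SHIFT);
}
```

With these values a memmove with cache control encodes as 0x03000100, and a batch descriptor requesting a completion record encodes as 0x0100000c.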
* Re: [dpdk-dev] [PATCH v10 06/16] dma/idxd: add datapath structures 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 06/16] dma/idxd: add datapath structures Kevin Laatz @ 2021-10-20 7:44 ` fengchengwen 2021-10-20 8:20 ` Bruce Richardson 0 siblings, 1 reply; 243+ messages in thread From: fengchengwen @ 2021-10-20 7:44 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: thomas, bruce.richardson, jerinj, conor.walsh On 2021/10/19 22:10, Kevin Laatz wrote: > Add data structures required for the data path for IDXD devices. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- > drivers/dma/idxd/idxd_bus.c | 1 + > drivers/dma/idxd/idxd_common.c | 33 +++++++++++++++++++++++++ > drivers/dma/idxd/idxd_hw_defs.h | 41 ++++++++++++++++++++++++++++++++ > drivers/dma/idxd/idxd_internal.h | 4 ++++ > drivers/dma/idxd/idxd_pci.c | 2 ++ > 5 files changed, 81 insertions(+) [snip] > +/** > + * Hardware descriptor used by DSA hardware, for both bursts and > + * for individual operations. > + */ > +struct idxd_hw_desc { > + uint32_t pasid; > + uint32_t op_flags; > + rte_iova_t completion; > + > + RTE_STD_C11 > + union { > + rte_iova_t src; /* source address for copy ops etc. */ > + rte_iova_t desc_addr; /* descriptor pointer for batch */ > + }; > + rte_iova_t dst; > + > + uint32_t size; /* length of data for op, or batch size */ > + > + uint16_t intr_handle; /* completion interrupt handle */ > + > + /* remaining 26 bytes are reserved */ > + uint16_t __reserved[13]; The non-reserved take about 30+B, and the struct align 64, so the __reserved[13] could delete. 
It's a minor problem, so: Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> > +} __rte_aligned(64); > + > #define IDXD_COMP_STATUS_INCOMPLETE 0 > #define IDXD_COMP_STATUS_SUCCESS 1 > #define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 > diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h > index 8473bf939f..5e253fdfbc 100644 > --- a/drivers/dma/idxd/idxd_internal.h > +++ b/drivers/dma/idxd/idxd_internal.h > @@ -40,6 +40,8 @@ struct idxd_pci_common { > }; [snip] ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v10 06/16] dma/idxd: add datapath structures 2021-10-20 7:44 ` fengchengwen @ 2021-10-20 8:20 ` Bruce Richardson 0 siblings, 0 replies; 243+ messages in thread From: Bruce Richardson @ 2021-10-20 8:20 UTC (permalink / raw) To: fengchengwen; +Cc: Kevin Laatz, dev, thomas, jerinj, conor.walsh On Wed, Oct 20, 2021 at 03:44:28PM +0800, fengchengwen wrote: > On 2021/10/19 22:10, Kevin Laatz wrote: > > Add data structures required for the data path for IDXD devices. > > > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > > --- > > drivers/dma/idxd/idxd_bus.c | 1 + > > drivers/dma/idxd/idxd_common.c | 33 +++++++++++++++++++++++++ > > drivers/dma/idxd/idxd_hw_defs.h | 41 ++++++++++++++++++++++++++++++++ > > drivers/dma/idxd/idxd_internal.h | 4 ++++ > > drivers/dma/idxd/idxd_pci.c | 2 ++ > > 5 files changed, 81 insertions(+) > > [snip] > > > +/** > > + * Hardware descriptor used by DSA hardware, for both bursts and > > + * for individual operations. > > + */ > > +struct idxd_hw_desc { > > + uint32_t pasid; > > + uint32_t op_flags; > > + rte_iova_t completion; > > + > > + RTE_STD_C11 > > + union { > > + rte_iova_t src; /* source address for copy ops etc. */ > > + rte_iova_t desc_addr; /* descriptor pointer for batch */ > > + }; > > + rte_iova_t dst; > > + > > + uint32_t size; /* length of data for op, or batch size */ > > + > > + uint16_t intr_handle; /* completion interrupt handle */ > > + > > + /* remaining 26 bytes are reserved */ > > + uint16_t __reserved[13]; > > The non-reserved take about 30+B, and the struct align 64, so the __reserved[13] could delete. > > It's a minor problem, so: > Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> > There are actually cases where that reserved field makes a difference. 
If we go to initialize a descriptor as a local variable the compiler is required to initialize all unspecified fields to 0, which means that if we don't explicitly put in place those reserved fields those bytes will be uninitialized. Since the hardware requires all unused fields to be zero, we need to keep this field in place to simplify the code and save us having to do extra memsets to zero the unused space. ^ permalink raw reply [flat|nested] 243+ messages in thread
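The point about the reserved field can be shown with a stand-alone sketch. It assumes rte_iova_t is a 64-bit integer and uses C11 _Alignas in place of __rte_aligned(64); the struct and function names (demo_hw_desc, demo_desc_tail_is_zero) are illustrative only. Because the reserved bytes are a named member, a designated initializer is guaranteed to zero them, whereas trailing padding added only by the alignment attribute carries no such guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the descriptor layout, with uint64_t for rte_iova_t. */
struct demo_hw_desc {
	_Alignas(64) uint32_t pasid;
	uint32_t op_flags;
	uint64_t completion;
	union {
		uint64_t src;       /* source address for copy ops */
		uint64_t desc_addr; /* descriptor pointer for batch */
	};
	uint64_t dst;
	uint32_t size;
	uint16_t intr_handle;
	uint16_t reserved[13];      /* named, so initializers zero it */
};

/* Same layout checks as the RTE_BUILD_BUG_ON lines in the patch. */
_Static_assert(sizeof(struct demo_hw_desc) == 64, "descriptor must be 64 bytes");
_Static_assert(offsetof(struct demo_hw_desc, size) == 32, "size must be at offset 32");

/* Returns 1 if every field not named in the initializer reads back as zero. */
static int
demo_desc_tail_is_zero(uint32_t op_flags, uint32_t size)
{
	const struct demo_hw_desc d = { .op_flags = op_flags, .size = size };
	size_t i;

	for (i = 0; i < sizeof(d.reserved) / sizeof(d.reserved[0]); i++)
		if (d.reserved[i] != 0)
			return 0;
	return d.intr_handle == 0 && d.pasid == 0 && d.dst == 0;
}
```

The named fields occupy 38 bytes and the reserved array the remaining 26, so the struct has no implicit padding at all.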
* [dpdk-dev] [PATCH v10 07/16] dma/idxd: add configure and info_get functions 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (5 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 06/16] dma/idxd: add datapath structures Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-20 7:54 ` fengchengwen 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz ` (8 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add functions for device configuration. The info_get function is included here since it can be useful for checking successful configuration. Documentation is also updated to add device configuration usage info. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 15 +++++++ drivers/dma/idxd/idxd_bus.c | 3 ++ drivers/dma/idxd/idxd_common.c | 71 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 6 +++ drivers/dma/idxd/idxd_pci.c | 3 ++ 5 files changed, 98 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index ce33e2857a..62ffd39ee0 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -120,3 +120,18 @@ use a subset of configured queues. Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, that is a "DMA device type" inside DPDK, and can be accessed using APIs from the ``rte_dmadev`` library. + +Using IDXD DMAdev Devices +-------------------------- + +To use the devices from an application, the dmadev API can be used. 
+ +Device Configuration +~~~~~~~~~~~~~~~~~~~~~ + +IDXD configuration requirements: + +* ``ring_size`` must be a power of two, between 64 and 4096. +* Only one ``vchan`` is supported per device (work queue). +* IDXD devices do not support silent mode. +* The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 3c0837ec52..b2acdac4f9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -96,6 +96,9 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index f972260a56..b0c79a2e42 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -39,6 +39,77 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + + if (size < sizeof(*info)) + return -EINVAL; + + *info = (struct rte_dma_info) { + .dev_capa = RTE_DMA_CAPA_MEM_TO_MEM | RTE_DMA_CAPA_HANDLES_ERRORS | + RTE_DMA_CAPA_OPS_COPY | RTE_DMA_CAPA_OPS_FILL, + .max_vchans = 1, + .max_desc = 4096, + .min_desc = 64, + }; + if (idxd->sva_support) + info->dev_capa |= RTE_DMA_CAPA_SVA; + return 0; +} + +int +idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz) +{ + if (sizeof(struct rte_dma_conf) != conf_sz) + return -EINVAL; + + if (dev_conf->nb_vchans != 1) + return -EINVAL; + return 0; +} + +int +idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz) +{ + struct idxd_dmadev *idxd = 
dev->fp_obj->dev_private; + uint16_t max_desc = qconf->nb_desc; + + if (sizeof(struct rte_dma_vchan_conf) != qconf_sz) + return -EINVAL; + + idxd->qcfg = *qconf; + + if (!rte_is_power_of_2(max_desc)) + max_desc = rte_align32pow2(max_desc); + IDXD_PMD_DEBUG("DMA dev %u using %u descriptors", dev->data->dev_id, max_desc); + idxd->desc_ring_mask = max_desc - 1; + idxd->qcfg.nb_desc = max_desc; + + /* in case we are reconfiguring a device, free any existing memory */ + rte_free(idxd->desc_ring); + + /* allocate the descriptor ring at 2x size as batches can't wrap */ + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); + if (idxd->desc_ring == NULL) + return -ENOMEM; + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); + + idxd->batch_idx_read = 0; + idxd->batch_idx_write = 0; + idxd->batch_start = 0; + idxd->batch_size = 0; + idxd->ids_returned = 0; + idxd->ids_avail = 0; + + memset(idxd->batch_comp_ring, 0, sizeof(*idxd->batch_comp_ring) * + (idxd->max_batches + 1)); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 5e253fdfbc..1dbe31abcd 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -81,5 +81,11 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); int idxd_dump(const struct rte_dma_dev *dev, FILE *f); +int idxd_configure(struct rte_dma_dev *dev, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz); +int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); +int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, + uint32_t size); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 
96c8c65cc0..681bb55efe 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -84,6 +84,9 @@ idxd_pci_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_close = idxd_pci_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
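As a note on the ring sizing in idxd_vchan_setup above: a requested nb_desc that is not a power of two is rounded up, and the ring mask is then the size minus one. A minimal stand-alone sketch of that arithmetic follows. demo_align32pow2 is a local stand-in written for this example and assumed to behave like rte_align32pow2 for non-zero inputs; demo_ring_size is a hypothetical helper, not driver code.

```c
#include <stdint.h>

/* Round up to the next power of two (stand-in for rte_align32pow2).
 * An input of 0 yields 0, so callers should reject it beforehand.
 */
static uint32_t
demo_align32pow2(uint32_t x)
{
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	return x + 1;
}

/* Compute the ring size actually used and the index mask that goes with it. */
static uint16_t
demo_ring_size(uint16_t nb_desc, uint16_t *mask)
{
	uint16_t n = (uint16_t)demo_align32pow2(nb_desc);

	*mask = (uint16_t)(n - 1);
	return n;
}
```

A request for 100 descriptors therefore yields a ring of 128 with mask 0x7f, and a free-running 16-bit job id maps to a ring slot with a single AND.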
* Re: [dpdk-dev] [PATCH v10 07/16] dma/idxd: add configure and info_get functions 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 07/16] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-10-20 7:54 ` fengchengwen 0 siblings, 0 replies; 243+ messages in thread From: fengchengwen @ 2021-10-20 7:54 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: thomas, bruce.richardson, jerinj, conor.walsh Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> On 2021/10/19 22:10, Kevin Laatz wrote: > Add functions for device configuration. The info_get function is included > here since it can be useful for checking successful configuration. > > Documentation is also updated to add device configuration usage info. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- [snip] ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v10 08/16] dma/idxd: add start and stop functions for pci devices 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (6 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 07/16] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-20 8:04 ` fengchengwen 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 09/16] dma/idxd: add data-path job submission functions Kevin Laatz ` (7 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add device start/stop functions for DSA devices bound to vfio. For devices bound to the IDXD kernel driver, these are not required since the IDXD kernel driver takes care of this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 3 +++ drivers/dma/idxd/idxd_pci.c | 51 +++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 62ffd39ee0..711890bd9e 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -135,3 +135,6 @@ IDXD configuration requirements: * Only one ``vchan`` is supported per device (work queue). * IDXD devices do not support silent mode. * The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. + +Once configured, the device can then be made ready for use by calling the +``rte_dma_start()`` API. 
diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 681bb55efe..ed5bf99425 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -59,6 +59,55 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; } +static int +idxd_pci_dev_stop(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + if (!idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); + return -EALREADY; + } + + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); + if (err_code || idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); + + return 0; +} + +static int +idxd_pci_dev_start(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + if (idxd_is_wq_enabled(idxd)) { + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); + return 0; + } + + if (idxd->desc_ring == NULL) { + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); + return -EINVAL; + } + + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); + if (err_code || !idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); + + return 0; +} + static int idxd_pci_dev_close(struct rte_dma_dev *dev) { @@ -87,6 +136,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .dev_start = idxd_pci_dev_start, + .dev_stop = idxd_pci_dev_stop, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v10 08/16] dma/idxd: add start and stop functions for pci devices 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-10-20 8:04 ` fengchengwen 0 siblings, 0 replies; 243+ messages in thread From: fengchengwen @ 2021-10-20 8:04 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: thomas, bruce.richardson, jerinj, conor.walsh On 2021/10/19 22:10, Kevin Laatz wrote: > Add device start/stop functions for DSA devices bound to vfio. For devices > bound to the IDXD kernel driver, these are not required since the IDXD > kernel driver takes care of this. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- [snip] > > +static int > +idxd_pci_dev_stop(struct rte_dma_dev *dev) > +{ > + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; > + uint8_t err_code; > + > + if (!idxd_is_wq_enabled(idxd)) { > + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); > + return -EALREADY; suggest return 0. > + } > + > + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); > + if (err_code || idxd_is_wq_enabled(idxd)) { > + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", > + idxd->qid, err_code); > + return err_code == 0 ? 
-1 : -err_code; > + } > + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); > + > + return 0; > +} > + > +static int > +idxd_pci_dev_start(struct rte_dma_dev *dev) > +{ > + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; > + uint8_t err_code; > + > + if (idxd_is_wq_enabled(idxd)) { > + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); > + return 0; > + } > + > + if (idxd->desc_ring == NULL) { > + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); > + return -EINVAL; > + } > + > + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); > + if (err_code || !idxd_is_wq_enabled(idxd)) { > + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", > + idxd->qid, err_code); > + return err_code == 0 ? -1 : -err_code; > + } > + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); > + > + return 0; > +} > + > static int > idxd_pci_dev_close(struct rte_dma_dev *dev) > { > @@ -87,6 +136,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { > .dev_configure = idxd_configure, > .vchan_setup = idxd_vchan_setup, > .dev_info_get = idxd_info_get, > + .dev_start = idxd_pci_dev_start, > + .dev_stop = idxd_pci_dev_stop, > }; > > /* each portal uses 4 x 4k pages */ > ^ permalink raw reply [flat|nested] 243+ messages in thread
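To illustrate the suggestion above, here is a hedged sketch of a stop routine that treats "already disabled" as success, together with the error mapping used in the patch (a device error code is negated, and a command that reports no error but leaves the queue in the wrong state maps to -1). The demo_wq type and both functions are hypothetical; no hardware access is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a work queue: current state plus the error code
 * the next disable command will report (0 means the command succeeds).
 */
struct demo_wq {
	bool enabled;
	uint8_t next_err;
};

/* Same mapping as the patch: negate the device error code, or return -1
 * when the command reported success but the state did not change.
 */
static int
demo_cmd_result(uint8_t err_code, bool state_ok)
{
	if (err_code == 0 && state_ok)
		return 0;
	return err_code == 0 ? -1 : -(int)err_code;
}

/* Idempotent stop: stopping an already-stopped queue is not an error. */
static int
demo_wq_stop(struct demo_wq *wq)
{
	uint8_t err;

	if (!wq->enabled)
		return 0;

	err = wq->next_err;
	if (err == 0)
		wq->enabled = false;
	return demo_cmd_result(err, !wq->enabled);
}
```

Returning 0 here lets callers invoke stop unconditionally during teardown without special-casing an "already stopped" error code.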
* [dpdk-dev] [PATCH v10 09/16] dma/idxd: add data-path job submission functions 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (7 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-20 8:27 ` fengchengwen 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 10/16] dma/idxd: add data-path job completion functions Kevin Laatz ` (6 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data path functions for enqueuing and submitting operations to DSA devices. Documentation updates are included for dmadev library and IDXD driver docs as appropriate. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 9 +++ doc/guides/prog_guide/dmadev.rst | 19 +++++ drivers/dma/idxd/idxd_common.c | 135 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 ++ drivers/dma/idxd/meson.build | 1 + 5 files changed, 169 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 711890bd9e..d548c4751a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -138,3 +138,12 @@ IDXD configuration requirements: Once configured, the device can then be made ready for use by calling the ``rte_dma_start()`` API. + +Performing Data Copies +~~~~~~~~~~~~~~~~~~~~~~~ + +Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library +documentation for details on operation enqueue and submission API usage. 
+ +It is expected that, for efficiency reasons, a burst of operations will be enqueued to the +device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index 32f7147862..e853ffda3a 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -67,6 +67,8 @@ can be used to get the device info and supported features. Silent mode is a special device capability which does not require the application to invoke dequeue APIs. +.. _dmadev_enqueue_dequeue: + Enqueue / Dequeue APIs ~~~~~~~~~~~~~~~~~~~~~~ @@ -80,6 +82,23 @@ The ``rte_dma_submit`` API is used to issue doorbell to hardware. Alternatively the ``RTE_DMA_OP_FLAG_SUBMIT`` flag can be passed to the enqueue APIs to also issue the doorbell to hardware. +The following code demonstrates how to enqueue a burst of copies to the +device and start the hardware processing of them: + +.. code-block:: C + + struct rte_mbuf *srcs[DMA_BURST_SZ], *dsts[DMA_BURST_SZ]; + unsigned int i; + + for (i = 0; i < RTE_DIM(srcs); i++) { + if (rte_dma_copy(dev_id, vchan, rte_pktmbuf_iova(srcs[i]), + rte_pktmbuf_iova(dsts[i]), COPY_LEN, 0) < 0) { + PRINT_ERR("Error with rte_dma_copy for buffer %u\n", i); + return -1; + } + } + rte_dma_submit(dev_id, vchan); + There are two dequeue APIs ``rte_dma_completed`` and ``rte_dma_completed_status``, these are used to obtain the results of the enqueue requests.
``rte_dma_completed`` will return the number of successfully diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index b0c79a2e42..a686ad421c 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,14 +2,145 @@ * Copyright 2021 Intel Corporation */ +#include <x86intrin.h> + #include <rte_malloc.h> #include <rte_common.h> #include <rte_log.h> +#include <rte_prefetch.h> #include "idxd_internal.h" #define IDXD_PMD_NAME_STR "dmadev_idxd" +static __rte_always_inline rte_iova_t +__desc_idx_to_iova(struct idxd_dmadev *idxd, uint16_t n) +{ + return idxd->desc_iova + (n * sizeof(struct idxd_hw_desc)); +} + +static __rte_always_inline void +__idxd_movdir64b(volatile void *dst, const struct idxd_hw_desc *src) +{ + asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : + : "a" (dst), "d" (src) + : "memory"); +} + +static __rte_always_inline void +__submit(struct idxd_dmadev *idxd) +{ + rte_prefetch1(&idxd->batch_comp_ring[idxd->batch_idx_read]); + + if (idxd->batch_size == 0) + return; + + /* write completion to batch comp ring */ + rte_iova_t comp_addr = idxd->batch_iova + + (idxd->batch_idx_write * sizeof(struct idxd_completion)); + + if (idxd->batch_size == 1) { + /* submit batch directly */ + struct idxd_hw_desc desc = + idxd->desc_ring[idxd->batch_start & idxd->desc_ring_mask]; + desc.completion = comp_addr; + desc.op_flags |= IDXD_FLAG_REQUEST_COMPLETION; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &desc); + } else { + const struct idxd_hw_desc batch_desc = { + .op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_COMPLETION_ADDR_VALID | + IDXD_FLAG_REQUEST_COMPLETION, + .desc_addr = __desc_idx_to_iova(idxd, + idxd->batch_start & idxd->desc_ring_mask), + .completion = comp_addr, + .size = idxd->batch_size, + }; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &batch_desc); + } + + if (++idxd->batch_idx_write > 
idxd->max_batches) + idxd->batch_idx_write = 0; + + idxd->batch_start += idxd->batch_size; + idxd->batch_size = 0; + idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; + _mm256_store_si256((void *)&idxd->batch_comp_ring[idxd->batch_idx_write], + _mm256_setzero_si256()); +} + +static __rte_always_inline int +__idxd_write_desc(struct idxd_dmadev *idxd, + const uint32_t op_flags, + const rte_iova_t src, + const rte_iova_t dst, + const uint32_t size, + const uint32_t flags) +{ + uint16_t mask = idxd->desc_ring_mask; + uint16_t job_id = idxd->batch_start + idxd->batch_size; + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; + + /* first check batch ring space then desc ring space */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return -ENOSPC; + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) + return -ENOSPC; + + /* write desc. Note: descriptors don't wrap, but the completion address does */ + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, + _mm256_set_epi64x(0, 0, 0, size)); + + idxd->batch_size++; + + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) + __submit(idxd); + + return job_id; +} + +int +idxd_enqueue_copy(void *dev_private, uint16_t qid __rte_unused, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, + * but check it at compile time to be sure. 
+ */ + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev_private, memmove, src, dst, length, + flags); +} + +int +idxd_enqueue_fill(void *dev_private, uint16_t qid __rte_unused, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + uint32_t fill = (idxd_op_fill << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev_private, fill, pattern, dst, length, + flags); +} + +int +idxd_submit(void *dev_private, uint16_t qid __rte_unused) +{ + __submit(dev_private); + return 0; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -139,6 +270,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->dev_ops = ops; dmadev->device = dev; + dmadev->fp_obj->copy = idxd_enqueue_copy; + dmadev->fp_obj->fill = idxd_enqueue_fill; + dmadev->fp_obj->submit = idxd_submit; + idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ idxd->dmadev = dmadev; diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 1dbe31abcd..ab4d71095e 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -87,5 +87,10 @@ int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, uint32_t size); +int idxd_enqueue_copy(void *dev_private, uint16_t qid, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_enqueue_fill(void *dev_private, uint16_t qid, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_submit(void *dev_private, uint16_t qid); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build 
index 37af6e1b8f..fdfce81a94 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -5,6 +5,7 @@ build = dpdk_conf.has('RTE_ARCH_X86') reason = 'only supported on x86' deps += ['bus_pci'] +cflags += '-mavx2' # all platforms with idxd HW support AVX sources = files( 'idxd_common.c', 'idxd_pci.c' -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v10 09/16] dma/idxd: add data-path job submission functions 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 09/16] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-10-20 8:27 ` fengchengwen 0 siblings, 0 replies; 243+ messages in thread From: fengchengwen @ 2021-10-20 8:27 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: thomas, bruce.richardson, jerinj, conor.walsh On 2021/10/19 22:10, Kevin Laatz wrote: > Add data path functions for enqueuing and submitting operations to DSA > devices. > > Documentation updates are included for dmadev library and IDXD driver docs > as appropriate. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- > doc/guides/dmadevs/idxd.rst | 9 +++ > doc/guides/prog_guide/dmadev.rst | 19 +++++ > drivers/dma/idxd/idxd_common.c | 135 +++++++++++++++++++++++++++++++ > drivers/dma/idxd/idxd_internal.h | 5 ++ > drivers/dma/idxd/meson.build | 1 + > 5 files changed, 169 insertions(+) > > diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst > index 711890bd9e..d548c4751a 100644 > --- a/doc/guides/dmadevs/idxd.rst > +++ b/doc/guides/dmadevs/idxd.rst > @@ -138,3 +138,12 @@ IDXD configuration requirements: > > Once configured, the device can then be made ready for use by calling the > ``rte_dma_start()`` API. > + > +Performing Data Copies > +~~~~~~~~~~~~~~~~~~~~~~~ > + > +Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library > +documentation for details on operation enqueue and submission API usage. > + > +It is expected that, for efficiency reasons, a burst of operations will be enqueued to the > +device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. 
> diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst
> index 32f7147862..e853ffda3a 100644
> --- a/doc/guides/prog_guide/dmadev.rst
> +++ b/doc/guides/prog_guide/dmadev.rst
> @@ -67,6 +67,8 @@ can be used to get the device info and supported features.
> Silent mode is a special device capability which does not require the
> application to invoke dequeue APIs.
>
> +.. _dmadev_enqueue_dequeue:
> +
>
> Enqueue / Dequeue APIs
> ~~~~~~~~~~~~~~~~~~~~~~
> @@ -80,6 +82,23 @@ The ``rte_dma_submit`` API is used to issue doorbell to hardware.
> Alternatively the ``RTE_DMA_OP_FLAG_SUBMIT`` flag can be passed to the enqueue
> APIs to also issue the doorbell to hardware.
>
> +The following code demonstrates how to enqueue a burst of copies to the
> +device and start the hardware processing of them:
> +
> +.. code-block:: C
> +
> +   struct rte_mbuf *srcs[DMA_BURST_SZ], *dsts[DMA_BURST_SZ];
> +   unsigned int i;
> +
> +   for (i = 0; i < RTE_DIM(srcs); i++) {
> +      if (rte_dma_copy(dev_id, vchan, rte_pktmbuf_iova(srcs),
> +            rte_pktmbuf_iova(dsts), COPY_LEN, 0) < 0) {

srcs -> srcs[i]
dsts -> dsts[i]

You can add my Reviewed-by once this is fixed, thanks.

> +         PRINT_ERR("Error with rte_dma_copy for buffer %u\n", i);
> +         return -1;
> +      }
> +   }
> +   rte_dma_submit(dev_id, vchan);
> +
> There are two dequeue APIs ``rte_dma_completed`` and
> ``rte_dma_completed_status``, these are used to obtain the results of the
> enqueue requests. ``rte_dma_completed`` will return the number of successfully
> diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c
> index b0c79a2e42..a686ad421c 100644
> --- a/drivers/dma/idxd/idxd_common.c
> +++ b/drivers/dma/idxd/idxd_common.c
> @@ -2,14 +2,145 @@
> * Copyright 2021 Intel Corporation
> */
>
[snip]

^ permalink raw reply [flat|nested] 243+ messages in thread
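The review comment above asks for each loop iteration to pass the i-th buffer rather than the array itself. A minimal sketch of that per-element indexing follows. It does not use DPDK: `struct buf`, `buf_iova()`, `fake_dma_copy()` and `fake_dma_submit()` are hypothetical stand-ins for `struct rte_mbuf`, `rte_pktmbuf_iova()`, `rte_dma_copy()` and `rte_dma_submit()`, so that the loop can be compiled and checked on its own.

```c
#include <stdint.h>

#define DMA_BURST_SZ 4
#define COPY_LEN 64

/* Hypothetical stand-in for struct rte_mbuf: only the IOVA matters here. */
struct buf { uint64_t iova; };

static uint64_t buf_iova(const struct buf *b) { return b->iova; }

/* Hypothetical stand-in for a DMA device: records what was enqueued. */
struct fake_dev {
	uint64_t src[DMA_BURST_SZ];
	uint64_t dst[DMA_BURST_SZ];
	unsigned int n_enq;        /* jobs enqueued so far */
	unsigned int n_submitted;  /* jobs handed to "hardware" */
};

/* Stand-in for the enqueue call: returns the job index, or -1 when full. */
static int fake_dma_copy(struct fake_dev *d, uint64_t src, uint64_t dst, uint32_t len)
{
	(void)len;
	if (d->n_enq >= DMA_BURST_SZ)
		return -1;
	d->src[d->n_enq] = src;
	d->dst[d->n_enq] = dst;
	return (int)d->n_enq++;
}

/* Stand-in for the doorbell: everything enqueued becomes submitted. */
static void fake_dma_submit(struct fake_dev *d) { d->n_submitted = d->n_enq; }

/* Enqueue n copies, indexing srcs[i]/dsts[i] per iteration, then submit once. */
static int enqueue_burst(struct fake_dev *d, struct buf *srcs[], struct buf *dsts[],
		unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (fake_dma_copy(d, buf_iova(srcs[i]), buf_iova(dsts[i]), COPY_LEN) < 0)
			return -1;
	}
	fake_dma_submit(d);
	return (int)n;
}
```

As in the documentation snippet under review, the doorbell is rung once after the whole burst rather than once per job.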
* [dpdk-dev] [PATCH v10 10/16] dma/idxd: add data-path job completion functions 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (8 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 09/16] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 11/16] dma/idxd: add operation statistic tracking Kevin Laatz ` (5 subsequent siblings) 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the data path functions for gathering completed operations. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 32 ++++- drivers/dma/idxd/idxd_common.c | 236 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 + 3 files changed, 272 insertions(+), 1 deletion(-) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index d548c4751a..d4a210b854 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -143,7 +143,37 @@ Performing Data Copies ~~~~~~~~~~~~~~~~~~~~~~~ Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library -documentation for details on operation enqueue and submission API usage. +documentation for details on operation enqueue, submission and completion API usage. It is expected that, for efficiency reasons, a burst of operations will be enqueued to the device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. + +When gathering completions, ``rte_dma_completed()`` should be used, up until the point an error +occurs in an operation. 
If an error was encountered, ``rte_dma_completed_status()`` must be used +to kick the device off to continue processing operations and also to gather the status of each +individual operations which is filled in to the ``status`` array provided as parameter by the +application. + +The following status codes are supported by IDXD: + +* ``RTE_DMA_STATUS_SUCCESSFUL``: The operation was successful. +* ``RTE_DMA_STATUS_INVALID_OPCODE``: The operation failed due to an invalid operation code. +* ``RTE_DMA_STATUS_INVALID_LENGTH``: The operation failed due to an invalid data length. +* ``RTE_DMA_STATUS_NOT_ATTEMPTED``: The operation was not attempted. +* ``RTE_DMA_STATUS_ERROR_UNKNOWN``: The operation failed due to an unspecified error. + +The following code shows how to retrieve the number of successfully completed +copies within a burst and then using ``rte_dma_completed_status()`` to check +which operation failed and kick off the device to continue processing operations: + +.. code-block:: C + + enum rte_dma_status_code status[COMP_BURST_SZ]; + uint16_t count, idx, status_count; + bool error = 0; + + count = rte_dma_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); + + if (error){ + status_count = rte_dma_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status); + } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index a686ad421c..76bc2e1364 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -141,6 +141,240 @@ idxd_submit(void *dev_private, uint16_t qid __rte_unused) return 0; } +static enum rte_dma_status_code +get_comp_status(struct idxd_completion *c) +{ + uint8_t st = c->status; + switch (st) { + /* successful descriptors are not written back normally */ + case IDXD_COMP_STATUS_INCOMPLETE: + case IDXD_COMP_STATUS_SUCCESS: + return RTE_DMA_STATUS_SUCCESSFUL; + case IDXD_COMP_STATUS_INVALID_OPCODE: + return RTE_DMA_STATUS_INVALID_OPCODE; + case IDXD_COMP_STATUS_INVALID_SIZE: + return 
RTE_DMA_STATUS_INVALID_LENGTH; + case IDXD_COMP_STATUS_SKIPPED: + return RTE_DMA_STATUS_NOT_ATTEMPTED; + default: + return RTE_DMA_STATUS_ERROR_UNKNOWN; + } +} + +static __rte_always_inline int +batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t ret; + uint8_t bstatus; + + if (max_ops == 0) + return 0; + + /* first check if there are any unreturned handles from last time */ + if (idxd->ids_avail != idxd->ids_returned) { + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + + if (idxd->batch_idx_read == idxd->batch_idx_write) + return 0; + + bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status; + /* now check if next batch is complete and successful */ + if (bstatus == IDXD_COMP_STATUS_SUCCESS) { + /* since the batch idx ring stores the start of each batch, pre-increment to lookup + * start of next batch. + */ + if (++idxd->batch_idx_read > idxd->max_batches) + idxd->batch_idx_read = 0; + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; + + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + /* check if batch is incomplete */ + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) + return 0; + + return -1; /* error case */ +} + +static inline uint16_t +batch_completed(struct idxd_dmadev *idxd, uint16_t max_ops, bool *has_error) +{ + uint16_t i; + uint16_t b_start, b_end, next_batch; + + int ret = batch_ok(idxd, max_ops, NULL); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ * once we identify the actual failure job, return other jobs, then update + * the batch ring indexes to make it look like the first job of the batch has failed. + * Subsequent calls here will always return zero packets, and the error must be cleared by + * calling the completed_status() function. + */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + b_end = idxd->batch_idx_ring[next_batch]; + + if (b_end - b_start == 1) { /* not a batch */ + *has_error = true; + return 0; + } + + for (i = b_start; i < b_end; i++) { + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ + break; + } + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); + if (ret < max_ops) + *has_error = true; /* we got up to the point of error */ + idxd->ids_avail = idxd->ids_returned += ret; + + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { + /* copy over the descriptor status to the batch ring as if no batch */ + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; + } + + return ret; +} + +static uint16_t +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t next_batch; + + int ret = batch_ok(idxd, max_ops, status); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; + const uint16_t b_len = b_end - b_start; + if (b_len == 1) {/* not a batch */ + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + idxd->ids_avail++; + idxd->ids_returned++; + idxd->batch_idx_read = next_batch; + return 1; + } + + /* not a single-element batch, need to process more. + * Scenarios: + * 1. max_ops >= batch_size - can fit everything, simple case + * - loop through completed ops and then add on any not-attempted ones + * 2. max_ops < batch_size - can't fit everything, more complex case + * - loop through completed/incomplete and stop when hit max_ops + * - adjust the batch descriptor to update where we stopped, with appropriate bcount + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as + * non-batch next time. + */ + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; + for (ret = 0; ret < b_len && ret < max_ops; ret++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + } + idxd->ids_avail = idxd->ids_returned += ret; + + /* everything fit */ + if (ret == b_len) { + idxd->batch_idx_read = next_batch; + return ret; + } + + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (ret > bcount) { + /* we have only incomplete ones - set batch completed size to 0 */ + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size = 0; + /* if there is only one descriptor left, job skipped so set flag appropriately */ + if (b_len - ret == 1) + comp->status = IDXD_COMP_STATUS_SKIPPED; + } else { + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size -= ret; + /* if there is only one descriptor left, copy status info straight to desc */ + if (comp->completed_size == 1) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + comp->status = c->status; + /* individual descs can be ok without writeback, but not batches */ + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) + comp->status = IDXD_COMP_STATUS_SUCCESS; + } else if (bcount == b_len) { + /* check if we still have an error, and clear flag if not */ + uint16_t i; + for (i = b_start + ret; i < b_end; i++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) + break; + } + if (i == b_end) /* no errors */ + comp->status = IDXD_COMP_STATUS_SUCCESS; + } + } + + return ret; +} + +uint16_t +idxd_completed(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, bool *has_error) +{ + struct idxd_dmadev *idxd = dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed(idxd, max_ops - ret, has_error); + ret += batch; + } while (batch > 0 && *has_error == false); + + *last_idx = idxd->ids_returned - 1; 
+ return ret; +} + +uint16_t +idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, enum rte_dma_status_code *status) +{ + struct idxd_dmadev *idxd = dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); + ret += batch; + } while (batch > 0); + + *last_idx = idxd->ids_returned - 1; + return ret; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -273,6 +507,8 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->fp_obj->copy = idxd_enqueue_copy; dmadev->fp_obj->fill = idxd_enqueue_fill; dmadev->fp_obj->submit = idxd_submit; + dmadev->fp_obj->completed = idxd_completed; + dmadev->fp_obj->completed_status = idxd_completed_status; idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index ab4d71095e..4208b0dee8 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -92,5 +92,10 @@ int idxd_enqueue_copy(void *dev_private, uint16_t qid, rte_iova_t src, int idxd_enqueue_fill(void *dev_private, uint16_t qid, uint64_t pattern, rte_iova_t dst, unsigned int length, uint64_t flags); int idxd_submit(void *dev_private, uint16_t qid); +uint16_t idxd_completed(void *dev_private, uint16_t qid, uint16_t max_ops, + uint16_t *last_idx, bool *has_error); +uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, + uint16_t max_ops, uint16_t *last_idx, + enum rte_dma_status_code *status); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
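Two pieces of index arithmetic in batch_ok() above are easy to misread in diff form: the 16-bit subtraction `ids_avail - ids_returned`, which stays correct when the job ids wrap past 65535, and the batch ring advance, which resets to 0 only once the index exceeds `max_batches` (so the ring appears to hold max_batches + 1 slots). The sketch below isolates both. `struct id_tracker`, `take_completed()` and `next_batch_idx()` are hypothetical names for a reduced model, not driver code.

```c
#include <stdint.h>

/* Hypothetical reduced model of the id bookkeeping used by batch_ok(). */
struct id_tracker {
	uint16_t ids_avail;    /* one past the last job id known to be complete */
	uint16_t ids_returned; /* one past the last job id handed back to the app */
};

/* Hand back up to max_ops completed ids; safe across the 16-bit wrap. */
static uint16_t take_completed(struct id_tracker *t, uint16_t max_ops)
{
	/* the cast truncates the int result back to modulo-65536 arithmetic */
	uint16_t pending = (uint16_t)(t->ids_avail - t->ids_returned);
	uint16_t n = pending < max_ops ? pending : max_ops;

	t->ids_returned = (uint16_t)(t->ids_returned + n);
	return n;
}

/* Advance an index in a ring whose valid slots are 0..max_batches inclusive. */
static uint16_t next_batch_idx(uint16_t idx, uint16_t max_batches)
{
	uint16_t next = (uint16_t)(idx + 1);

	return next > max_batches ? 0 : next;
}
```

With `ids_avail = 2` and `ids_returned = 65534`, four ids are pending even though the raw subtraction is negative as an int; a call with `max_ops = 3` returns 3 and leaves one id for the next call.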
* [dpdk-dev] [PATCH v10 11/16] dma/idxd: add operation statistic tracking 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (9 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 10/16] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-20 9:18 ` fengchengwen 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 12/16] dma/idxd: add vchan status function Kevin Laatz ` (4 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add statistic tracking for DSA devices. The dmadev library documentation is also updated to add a generic section for using the library's statistics APIs. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- doc/guides/prog_guide/dmadev.rst | 11 +++++++++++ drivers/dma/idxd/idxd_bus.c | 2 ++ drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 +++++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 47 insertions(+) diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index e853ffda3a..139eaff299 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -107,3 +107,14 @@ completed operations along with the status of each operation (filled into the ``status`` array passed by user). These two APIs can also return the last completed operation's ``ring_idx`` which could help user track operations within their own application-defined rings. + + +Querying Device Statistics +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The statistics from a dmadev device can be got via the statistics functions, +i.e. ``rte_dma_stats_get()``. 
The statistics returned for each device instance are: + +* ``submitted``: The number of operations submitted to the device. +* ``completed``: The number of operations which have completed (successful and failed). +* ``errors``: The number of operations that completed with error. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b2acdac4f9..b52ea02854 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -99,6 +99,8 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 76bc2e1364..fd81418b7c 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -65,6 +65,8 @@ __submit(struct idxd_dmadev *idxd) if (++idxd->batch_idx_write > idxd->max_batches) idxd->batch_idx_write = 0; + idxd->stats.submitted += idxd->batch_size; + idxd->batch_start += idxd->batch_size; idxd->batch_size = 0; idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; @@ -276,6 +278,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ const uint16_t b_len = b_end - b_start; if (b_len == 1) {/* not a batch */ *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + if (status != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; idxd->ids_avail++; idxd->ids_returned++; idxd->batch_idx_read = next_batch; @@ -297,6 +301,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ struct idxd_completion *c = (void *) &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + if (status[ret] != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; } idxd->ids_avail = idxd->ids_returned += ret; @@ -355,6 +361,7 @@ idxd_completed(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, ret += batch; } while (batch > 0 && *has_error == false); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -371,6 +378,7 @@ idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max ret += batch; } while (batch > 0); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -404,6 +412,25 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + struct rte_dma_stats *stats, uint32_t stats_sz) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + if (stats_sz < sizeof(*stats)) + return -EINVAL; + *stats = idxd->stats; + return 0; +} + +int +idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + idxd->stats = (struct rte_dma_stats){0}; + return 0; +} + int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 4208b0dee8..a85a1fb79e 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -59,6 +59,8 @@ struct idxd_dmadev { struct idxd_completion *batch_comp_ring; unsigned short *batch_idx_ring; /* store where each batch ends */ + struct rte_dma_stats stats; + rte_iova_t batch_iova; /* base address of the batch comp ring */ rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ @@ -97,5 +99,8 @@ uint16_t idxd_completed(void *dev_private, uint16_t qid, uint16_t max_ops, uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, uint16_t 
*last_idx, enum rte_dma_status_code *status); +int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, + struct rte_dma_stats *stats, uint32_t stats_sz); +int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index ed5bf99425..9d7f0531d5 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -136,6 +136,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, }; -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
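One detail in the hunk above may deserve a second look: in the single-descriptor branch of batch_completed_status() the check reads `status != RTE_DMA_STATUS_SUCCESSFUL`, which compares the pointer rather than the value it points to; `*status` looks like what was intended. That branch seems to be reached only after batch_ok() has reported an error, so the resulting count is probably the same in practice. The sketch below shows the value comparison in isolation. `enum op_status`, `struct op_stats` and `account_completions()` are hypothetical stand-ins, not the dmadev types.

```c
#include <stdint.h>

/* Hypothetical stand-in for enum rte_dma_status_code. */
enum op_status {
	OP_SUCCESSFUL = 0,
	OP_INVALID_LENGTH,
	OP_NOT_ATTEMPTED
};

/* Hypothetical stand-in for struct rte_dma_stats. */
struct op_stats {
	uint64_t submitted;
	uint64_t completed;
	uint64_t errors;
};

/* Account for n returned operations: all count as completed, and every
 * status other than successful also counts as an error.
 */
static void account_completions(struct op_stats *st, const enum op_status *status,
		uint16_t n)
{
	uint16_t i;

	for (i = 0; i < n; i++) {
		if (status[i] != OP_SUCCESSFUL) /* compare the value, not the pointer */
			st->errors++;
	}
	st->completed += n;
}
```

This matches the documented meaning of the counters in the patch: `completed` covers successful and failed operations, while `errors` counts only those that completed with an error.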
* Re: [dpdk-dev] [PATCH v10 11/16] dma/idxd: add operation statistic tracking 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 11/16] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-10-20 9:18 ` fengchengwen 0 siblings, 0 replies; 243+ messages in thread From: fengchengwen @ 2021-10-20 9:18 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: thomas, bruce.richardson, jerinj, conor.walsh On 2021/10/19 22:10, Kevin Laatz wrote: > Add statistic tracking for DSA devices. > > The dmadev library documentation is also updated to add a generic section > for using the library's statistics APIs. > > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> > --- > doc/guides/prog_guide/dmadev.rst | 11 +++++++++++ > drivers/dma/idxd/idxd_bus.c | 2 ++ > drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ > drivers/dma/idxd/idxd_internal.h | 5 +++++ > drivers/dma/idxd/idxd_pci.c | 2 ++ > 5 files changed, 47 insertions(+) > > diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst > index e853ffda3a..139eaff299 100644 > --- a/doc/guides/prog_guide/dmadev.rst > +++ b/doc/guides/prog_guide/dmadev.rst > @@ -107,3 +107,14 @@ completed operations along with the status of each operation (filled into the > ``status`` array passed by user). These two APIs can also return the last > completed operation's ``ring_idx`` which could help user track operations within > their own application-defined rings. > + > + > +Querying Device Statistics > +~~~~~~~~~~~~~~~~~~~~~~~~~~~ could remove last ~ anyway, Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> > + > +The statistics from a dmadev device can be got via the statistics functions, > +i.e. ``rte_dma_stats_get()``. The statistics returned for each device instance are: > + > +* ``submitted``: The number of operations submitted to the device. 
> +* ``completed``: The number of operations which have completed (successful and failed). > +* ``errors``: The number of operations that completed with error. > diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c > index b2acdac4f9..b52ea02854 100644 > --- a/drivers/dma/idxd/idxd_bus.c > +++ b/drivers/dma/idxd/idxd_bus.c > @@ -99,6 +99,8 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { > .dev_configure = idxd_configure, > .vchan_setup = idxd_vchan_setup, > .dev_info_get = idxd_info_get, > + .stats_get = idxd_stats_get, > + .stats_reset = idxd_stats_reset, > }; > [snip] > ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v10 12/16] dma/idxd: add vchan status function 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (10 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 11/16] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-20 9:30 ` fengchengwen 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 13/16] dma/idxd: add burst capacity API Kevin Laatz ` (3 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When testing dmadev drivers, it is useful to have the HW device in a known state. This patch adds the implementation of the function which will wait for the device to be idle (all jobs completed) before proceeding. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 17 +++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 2 ++ drivers/dma/idxd/idxd_pci.c | 1 + 4 files changed, 21 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index b52ea02854..e6caa048a9 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -101,6 +101,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_info_get = idxd_info_get, .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, + .vchan_status = idxd_vchan_status, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index fd81418b7c..3c8cff15c0 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -163,6 +163,23 @@ get_comp_status(struct idxd_completion *c) } } +int +idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + enum rte_dma_vchan_status *status) +{ + struct idxd_dmadev *idxd = 
dev->fp_obj->dev_private; + uint16_t last_batch_write = idxd->batch_idx_write == 0 ? idxd->max_batches : + idxd->batch_idx_write - 1; + uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0); + + /* An IDXD device will always be either active or idle. + * RTE_DMA_VCHAN_HALTED_ERROR is therefore not supported by IDXD. + */ + *status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE; + + return 0; +} + static __rte_always_inline int batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index a85a1fb79e..50acb82d3d 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -102,5 +102,7 @@ uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, struct rte_dma_stats *stats, uint32_t stats_sz); int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); +int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, + enum rte_dma_vchan_status *status); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 9d7f0531d5..23c10c0fb0 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -140,6 +140,7 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, + .vchan_status = idxd_vchan_status, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v10 12/16] dma/idxd: add vchan status function 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 12/16] dma/idxd: add vchan status function Kevin Laatz @ 2021-10-20 9:30 ` fengchengwen 2021-10-20 9:52 ` Bruce Richardson 0 siblings, 1 reply; 243+ messages in thread From: fengchengwen @ 2021-10-20 9:30 UTC (permalink / raw) To: Kevin Laatz, dev; +Cc: thomas, bruce.richardson, jerinj, conor.walsh On 2021/10/19 22:10, Kevin Laatz wrote: > When testing dmadev drivers, it is useful to have the HW device in a known > state. This patch adds the implementation of the function which will wait > for the device to be idle (all jobs completed) before proceeding. > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > --- > drivers/dma/idxd/idxd_bus.c | 1 + > drivers/dma/idxd/idxd_common.c | 17 +++++++++++++++++ > drivers/dma/idxd/idxd_internal.h | 2 ++ > drivers/dma/idxd/idxd_pci.c | 1 + > 4 files changed, 21 insertions(+) > > diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c > index b52ea02854..e6caa048a9 100644 > --- a/drivers/dma/idxd/idxd_bus.c > +++ b/drivers/dma/idxd/idxd_bus.c > @@ -101,6 +101,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { > .dev_info_get = idxd_info_get, > .stats_get = idxd_stats_get, > .stats_reset = idxd_stats_reset, > + .vchan_status = idxd_vchan_status, > }; > > static void * > diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c > index fd81418b7c..3c8cff15c0 100644 > --- a/drivers/dma/idxd/idxd_common.c > +++ b/drivers/dma/idxd/idxd_common.c > @@ -163,6 +163,23 @@ get_comp_status(struct idxd_completion *c) > } > } > > +int > +idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, > + enum rte_dma_vchan_status *status) > +{ > + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; > + uint16_t last_batch_write = idxd->batch_idx_write == 0 ? 
idxd->max_batches : > + idxd->batch_idx_write - 1; > + uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0); > + > + /* An IDXD device will always be either active or idle. > + * RTE_DMA_VCHAN_HALTED_ERROR is therefore not supported by IDXD. > + */ > + *status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE; why not use stats.submitted and completed ? or I miss some thing about this API? does this api must called after rte_dma_submit() ? If not the following seq will function fail: enqueue multiple copy request submit to hardware enqueue multiple copy request invoke rte_dma_vchan_status to query status --because the copy requests not submit, the last comp will be non-zero, so it will return IDLE. > + > + return 0; > +} > + > static __rte_always_inline int > batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) > { > diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h > index a85a1fb79e..50acb82d3d 100644 > --- a/drivers/dma/idxd/idxd_internal.h > +++ b/drivers/dma/idxd/idxd_internal.h > @@ -102,5 +102,7 @@ uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, > int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, > struct rte_dma_stats *stats, uint32_t stats_sz); > int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); > +int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, > + enum rte_dma_vchan_status *status); > > #endif /* _IDXD_INTERNAL_H_ */ > diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c > index 9d7f0531d5..23c10c0fb0 100644 > --- a/drivers/dma/idxd/idxd_pci.c > +++ b/drivers/dma/idxd/idxd_pci.c > @@ -140,6 +140,7 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { > .stats_reset = idxd_stats_reset, > .dev_start = idxd_pci_dev_start, > .dev_stop = idxd_pci_dev_stop, > + .vchan_status = idxd_vchan_status, > }; > > /* each portal uses 4 x 4k pages */ > ^ permalink raw reply [flat|nested] 
* Re: [dpdk-dev] [PATCH v10 12/16] dma/idxd: add vchan status function 2021-10-20 9:30 ` fengchengwen @ 2021-10-20 9:52 ` Bruce Richardson 0 siblings, 0 replies; 243+ messages in thread From: Bruce Richardson @ 2021-10-20 9:52 UTC (permalink / raw) To: fengchengwen; +Cc: Kevin Laatz, dev, thomas, jerinj, conor.walsh On Wed, Oct 20, 2021 at 05:30:13PM +0800, fengchengwen wrote: > On 2021/10/19 22:10, Kevin Laatz wrote: > > When testing dmadev drivers, it is useful to have the HW device in a known > > state. This patch adds the implementation of the function which will wait > > for the device to be idle (all jobs completed) before proceeding. > > > > Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> > > Reviewed-by: Conor Walsh <conor.walsh@intel.com> > > --- > > drivers/dma/idxd/idxd_bus.c | 1 + > > drivers/dma/idxd/idxd_common.c | 17 +++++++++++++++++ > > drivers/dma/idxd/idxd_internal.h | 2 ++ > > drivers/dma/idxd/idxd_pci.c | 1 + > > 4 files changed, 21 insertions(+) > > > > diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c > > index b52ea02854..e6caa048a9 100644 > > --- a/drivers/dma/idxd/idxd_bus.c > > +++ b/drivers/dma/idxd/idxd_bus.c > > @@ -101,6 +101,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { > > .dev_info_get = idxd_info_get, > > .stats_get = idxd_stats_get, > > .stats_reset = idxd_stats_reset, > > + .vchan_status = idxd_vchan_status, > > }; > > > > static void * > > diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c > > index fd81418b7c..3c8cff15c0 100644 > > --- a/drivers/dma/idxd/idxd_common.c > > +++ b/drivers/dma/idxd/idxd_common.c > > @@ -163,6 +163,23 @@ get_comp_status(struct idxd_completion *c) > > } > > } > > > > +int > > +idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, > > + enum rte_dma_vchan_status *status) > > +{ > > + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; > > + uint16_t last_batch_write = idxd->batch_idx_write == 0 ? 
idxd->max_batches : > > + idxd->batch_idx_write - 1; > > + uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0); > > + > > + /* An IDXD device will always be either active or idle. > > + * RTE_DMA_VCHAN_HALTED_ERROR is therefore not supported by IDXD. > > + */ > > + *status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE; > > why not use stats.submitted and completed ? or I miss some thing about this API? > > does this api must called after rte_dma_submit() ? If not the following seq will function fail: > enqueue multiple copy request > submit to hardware > enqueue multiple copy request > invoke rte_dma_vchan_status to query status --because the copy requests not submit, the last comp will be non-zero, so it will return IDLE. > That is correct. Until the jobs are submitted the device HW is idle as it is not processing any job. This API is to return the HW state, because that is the key concern here, whether the HW is in the process of doing DMA or not, since that is what can cause race conditions. The timing of sending down jobs to the device is under app control. ^ permalink raw reply [flat|nested] 243+ messages in thread
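A minimal Python sketch of the status check discussed above, for readers who want to step through the sequence Chengwen describes. `ToyIdxd` and its methods are hypothetical stand-ins, not driver code. Two assumptions: the batch ring has `max_batches + 1` slots (inferred from the wrap expression in the patch), and a fresh device starts with every slot marked complete.

```python
from dataclasses import dataclass, field

IDLE = "RTE_DMA_VCHAN_IDLE"
ACTIVE = "RTE_DMA_VCHAN_ACTIVE"


@dataclass
class ToyIdxd:
    """Hypothetical model holding only the fields the status check reads."""
    max_batches: int
    batch_idx_write: int = 0
    pending: int = 0  # jobs enqueued by the app but not yet submitted
    comp_status: list = field(default_factory=list)

    def __post_init__(self):
        # Assumption: every completion slot starts non-zero (complete).
        self.comp_status = [1] * (self.max_batches + 1)

    def enqueue(self, n):
        # Enqueue only builds up a batch in software; hardware sees nothing.
        self.pending += n

    def submit(self):
        # Hand the pending batch to "hardware": clear its completion slot.
        if self.pending == 0:
            return None
        slot = self.batch_idx_write
        self.comp_status[slot] = 0
        self.batch_idx_write = 0 if slot == self.max_batches else slot + 1
        self.pending = 0
        return slot

    def hw_complete(self, slot):
        # Hardware writes a non-zero status when the batch finishes.
        self.comp_status[slot] = 1

    def vchan_status(self):
        # Same index arithmetic as the patch: look at the last submitted batch.
        if self.batch_idx_write == 0:
            last = self.max_batches
        else:
            last = self.batch_idx_write - 1
        return IDLE if self.comp_status[last] != 0 else ACTIVE
```

In this model, enqueue / submit / enqueue followed by hardware completion of the first batch reports IDLE even though jobs are still waiting in software, which is the behaviour Bruce confirms is intended: the call reports the hardware state, not the state of the application's queue.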
* [dpdk-dev] [PATCH v10 13/16] dma/idxd: add burst capacity API 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (11 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 12/16] dma/idxd: add vchan status function Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-20 9:32 ` fengchengwen 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz ` (2 subsequent siblings) 15 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add support for the burst capacity API. This API will provide the calling application with the remaining capacity of the current burst (limited by max HW batch size). Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/dma/idxd/idxd_common.c | 21 +++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 1 + drivers/dma/idxd/idxd_pci.c | 1 + 3 files changed, 23 insertions(+) diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 3c8cff15c0..ff4647f579 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -468,6 +468,26 @@ idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t return 0; } +uint16_t +idxd_burst_capacity(const void *dev_private, uint16_t vchan __rte_unused) +{ + const struct idxd_dmadev *idxd = dev_private; + uint16_t write_idx = idxd->batch_start + idxd->batch_size; + uint16_t used_space; + + /* Check for space in the batch ring */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return 0; + + /* For descriptors, check for wrap-around on write but not read */ + if (idxd->ids_returned > write_idx) + 
write_idx += idxd->desc_ring_mask + 1; + used_space = write_idx - idxd->ids_returned; + + return RTE_MIN((idxd->desc_ring_mask - used_space), idxd->max_batch_size); +} + int idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, uint32_t conf_sz) @@ -553,6 +573,7 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->fp_obj->submit = idxd_submit; dmadev->fp_obj->completed = idxd_completed; dmadev->fp_obj->completed_status = idxd_completed_status; + dmadev->fp_obj->burst_capacity = idxd_burst_capacity; idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 50acb82d3d..3375600217 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -104,5 +104,6 @@ int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, enum rte_dma_vchan_status *status); +uint16_t idxd_burst_capacity(const void *dev_private, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 23c10c0fb0..beef3848aa 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -254,6 +254,7 @@ init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, idxd->u.pci = pci; idxd->max_batches = wq_size; + idxd->max_batch_size = 1 << lg2_max_batch; /* enable the device itself */ err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
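The arithmetic in `idxd_burst_capacity()` can be tried out with a small Python model. This is only a sketch under stated assumptions: the function name and keyword arguments mirror the struct fields in the patch but are otherwise hypothetical, Python integers are used so the `uint16_t` wrap-around of the C code is not modelled, and the ring sizes in the usage note are made up.

```python
def burst_capacity(*, batch_start, batch_size, ids_returned, desc_ring_mask,
                   batch_idx_read, batch_idx_write, max_batches,
                   max_batch_size):
    """Model of the capacity calculation in the patch (no uint16 wrap)."""
    # No room in the batch ring: the write index is directly behind the read
    # index, either at the wrap point or in the middle of the ring.
    if ((batch_idx_read == 0 and batch_idx_write == max_batches)
            or batch_idx_write + 1 == batch_idx_read):
        return 0

    # Position just past the last descriptor of the batch being built.
    write_idx = batch_start + batch_size

    # If the write side has wrapped but the read side has not, unwrap it.
    if ids_returned > write_idx:
        write_idx += desc_ring_mask + 1
    used_space = write_idx - ids_returned

    # Free descriptor slots, capped by the largest batch the HW accepts.
    return min(desc_ring_mask - used_space, max_batch_size)
```

With a 64-entry descriptor ring (`desc_ring_mask=63`), `batch_start=10`, `batch_size=5` and `ids_returned=4`, eleven slots are in use, so 52 are free; the result is then whichever is smaller, 52 or `max_batch_size`.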
* Re: [dpdk-dev] [PATCH v10 13/16] dma/idxd: add burst capacity API
From: fengchengwen @ 2021-10-20 9:32 UTC
To: Kevin Laatz, dev
Cc: thomas, bruce.richardson, jerinj, conor.walsh

Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>

On 2021/10/19 22:10, Kevin Laatz wrote:
> Add support for the burst capacity API. This API will provide the calling
> application with the remaining capacity of the current burst (limited by
> max HW batch size).
>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
> Reviewed-by: Conor Walsh <conor.walsh@intel.com>
> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
>  drivers/dma/idxd/idxd_common.c   | 21 +++++++++++++++++++++
>  drivers/dma/idxd/idxd_internal.h |  1 +
>  drivers/dma/idxd/idxd_pci.c      |  1 +
>  3 files changed, 23 insertions(+)
>
> diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c
> index 3c8cff15c0..ff4647f579 100644
> --- a/drivers/dma/idxd/idxd_common.c
> +++ b/drivers/dma/idxd/idxd_common.c
> @@ -468,6 +468,26 @@ idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t
>  	return 0;
>  }
>
[snip]
* [dpdk-dev] [PATCH v10 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (12 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 13/16] dma/idxd: add burst capacity API Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 15/16] devbind: add dma device class Kevin Laatz 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Conor Walsh <conor.walsh@intel.com> Move the example script for configuring IDXD devices bound to the IDXD kernel driver from raw to dma, and create a symlink to still allow use from raw. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++++++++++++++++++++++++++ drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +----------------------------- 2 files changed, 118 insertions(+), 117 deletions(-) create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py diff --git a/drivers/dma/idxd/dpdk_idxd_cfg.py b/drivers/dma/idxd/dpdk_idxd_cfg.py new file mode 100755 index 0000000000..fcc27822ef --- /dev/null +++ b/drivers/dma/idxd/dpdk_idxd_cfg.py @@ -0,0 +1,117 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2020 Intel Corporation + +""" +Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use +""" + +import sys +import argparse +import os +import os.path + + +class SysfsDir: + "Used to read/write paths in a sysfs directory" + def __init__(self, path): + self.path = path + + def read_int(self, filename): + 
"Return a value from sysfs file" + with open(os.path.join(self.path, filename)) as f: + return int(f.readline()) + + def write_values(self, values): + "write dictionary, where key is filename and value is value to write" + for filename, contents in values.items(): + with open(os.path.join(self.path, filename), "w") as f: + f.write(str(contents)) + + +def reset_device(dsa_id): + "Reset the DSA device and all its queues" + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) + + +def get_pci_dir(pci): + "Search for the sysfs directory of the PCI device" + base_dir = '/sys/bus/pci/devices/' + for path, dirs, files in os.walk(base_dir): + for dir in dirs: + if pci in dir: + return os.path.join(base_dir, dir) + sys.exit(f"Could not find sysfs directory for device {pci}") + + +def get_dsa_id(pci): + "Get the DSA instance ID using the PCI address of the device" + pci_dir = get_pci_dir(pci) + for path, dirs, files in os.walk(pci_dir): + for dir in dirs: + if dir.startswith('dsa') and 'wq' not in dir: + return int(dir[3:]) + sys.exit(f"Could not get device ID for device {pci}") + + +def configure_dsa(dsa_id, queues, prefix): + "Configure the DSA instance with appropriate number of queues" + dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") + drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") + + max_groups = dsa_dir.read_int("max_groups") + max_engines = dsa_dir.read_int("max_engines") + max_queues = dsa_dir.read_int("max_work_queues") + max_work_queues_size = dsa_dir.read_int("max_work_queues_size") + + nb_queues = min(queues, max_queues) + if queues > nb_queues: + print(f"Setting number of queues to max supported value: {max_queues}") + + # we want one engine per group, and no more engines than queues + nb_groups = min(max_engines, max_groups, nb_queues) + for grp in range(nb_groups): + dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) + + # configure each queue + for q in range(nb_queues): + wq_dir = 
SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) + wq_dir.write_values({"group_id": q % nb_groups, + "type": "user", + "mode": "dedicated", + "name": f"{prefix}_wq{dsa_id}.{q}", + "priority": 1, + "size": int(max_work_queues_size / nb_queues)}) + + # enable device and then queues + drv_dir.write_values({"bind": f"dsa{dsa_id}"}) + for q in range(nb_queues): + drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) + + +def main(args): + "Main function, does arg parsing and calls config function" + arg_p = argparse.ArgumentParser( + description="Configure whole DSA device instance for DPDK use") + arg_p.add_argument('dsa_id', + help="Specify DSA instance either via DSA instance number or PCI address") + arg_p.add_argument('-q', metavar='queues', type=int, default=255, + help="Number of queues to set up") + arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', + default="dpdk", + help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") + arg_p.add_argument('--reset', action='store_true', + help="Reset DSA device and its queues") + parsed_args = arg_p.parse_args(args[1:]) + + dsa_id = parsed_args.dsa_id + dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id + if parsed_args.reset: + reset_device(dsa_id) + else: + configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) + + +if __name__ == "__main__": + main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py deleted file mode 100755 index fcc27822ef..0000000000 --- a/drivers/raw/ioat/dpdk_idxd_cfg.py +++ /dev/null @@ -1,117 +0,0 @@ -#!/usr/bin/env python3 -# SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2020 Intel Corporation - -""" -Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use -""" - -import sys -import argparse -import os -import os.path - - -class SysfsDir: - "Used to read/write paths in a sysfs directory" - def __init__(self, path): - self.path = path - - def read_int(self, filename): - "Return a 
value from sysfs file" - with open(os.path.join(self.path, filename)) as f: - return int(f.readline()) - - def write_values(self, values): - "write dictionary, where key is filename and value is value to write" - for filename, contents in values.items(): - with open(os.path.join(self.path, filename), "w") as f: - f.write(str(contents)) - - -def reset_device(dsa_id): - "Reset the DSA device and all its queues" - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - drv_dir.write_values({"unbind": f"dsa{dsa_id}"}) - - -def get_pci_dir(pci): - "Search for the sysfs directory of the PCI device" - base_dir = '/sys/bus/pci/devices/' - for path, dirs, files in os.walk(base_dir): - for dir in dirs: - if pci in dir: - return os.path.join(base_dir, dir) - sys.exit(f"Could not find sysfs directory for device {pci}") - - -def get_dsa_id(pci): - "Get the DSA instance ID using the PCI address of the device" - pci_dir = get_pci_dir(pci) - for path, dirs, files in os.walk(pci_dir): - for dir in dirs: - if dir.startswith('dsa') and 'wq' not in dir: - return int(dir[3:]) - sys.exit(f"Could not get device ID for device {pci}") - - -def configure_dsa(dsa_id, queues, prefix): - "Configure the DSA instance with appropriate number of queues" - dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}") - drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa") - - max_groups = dsa_dir.read_int("max_groups") - max_engines = dsa_dir.read_int("max_engines") - max_queues = dsa_dir.read_int("max_work_queues") - max_work_queues_size = dsa_dir.read_int("max_work_queues_size") - - nb_queues = min(queues, max_queues) - if queues > nb_queues: - print(f"Setting number of queues to max supported value: {max_queues}") - - # we want one engine per group, and no more engines than queues - nb_groups = min(max_engines, max_groups, nb_queues) - for grp in range(nb_groups): - dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp}) - - # configure each queue - for q in range(nb_queues): - wq_dir = 
SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}")) - wq_dir.write_values({"group_id": q % nb_groups, - "type": "user", - "mode": "dedicated", - "name": f"{prefix}_wq{dsa_id}.{q}", - "priority": 1, - "size": int(max_work_queues_size / nb_queues)}) - - # enable device and then queues - drv_dir.write_values({"bind": f"dsa{dsa_id}"}) - for q in range(nb_queues): - drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"}) - - -def main(args): - "Main function, does arg parsing and calls config function" - arg_p = argparse.ArgumentParser( - description="Configure whole DSA device instance for DPDK use") - arg_p.add_argument('dsa_id', - help="Specify DSA instance either via DSA instance number or PCI address") - arg_p.add_argument('-q', metavar='queues', type=int, default=255, - help="Number of queues to set up") - arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix', - default="dpdk", - help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']") - arg_p.add_argument('--reset', action='store_true', - help="Reset DSA device and its queues") - parsed_args = arg_p.parse_args(args[1:]) - - dsa_id = parsed_args.dsa_id - dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id - if parsed_args.reset: - reset_device(dsa_id) - else: - configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix) - - -if __name__ == "__main__": - main(sys.argv) diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py new file mode 120000 index 0000000000..85545548d1 --- /dev/null +++ b/drivers/raw/ioat/dpdk_idxd_cfg.py @@ -0,0 +1 @@ +../../dma/idxd/dpdk_idxd_cfg.py \ No newline at end of file -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
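The partitioning that `configure_dsa()` performs can be previewed without touching sysfs. The sketch below repeats the same `min()`, modulo and division steps as the script but returns the planned values instead of writing them. `plan_dsa_config` is a hypothetical helper, not part of the patch, and the hardware limits in the usage note are made-up numbers rather than values read from a real device.

```python
def plan_dsa_config(dsa_id, queues, prefix, *, max_groups, max_engines,
                    max_queues, max_wq_size):
    """Return (engine_writes, wq_writes) that configure_dsa() would issue."""
    # Never ask for more queues than the device supports.
    nb_queues = min(queues, max_queues)

    # One engine per group, and no more groups than queues.
    nb_groups = min(max_engines, max_groups, nb_queues)
    engine_writes = {f"engine{dsa_id}.{grp}/group_id": grp
                     for grp in range(nb_groups)}

    # Queues are spread round-robin over the groups and share the
    # work-queue space equally.
    wq_writes = {}
    for q in range(nb_queues):
        wq_writes[f"wq{dsa_id}.{q}"] = {
            "group_id": q % nb_groups,
            "type": "user",
            "mode": "dedicated",
            "name": f"{prefix}_wq{dsa_id}.{q}",
            "priority": 1,
            "size": int(max_wq_size / nb_queues),
        }
    return engine_writes, wq_writes
```

For example, with the script's default request of 255 queues on a device assumed to report 4 groups, 4 engines, 8 work queues and a total work-queue size of 128, the plan has 8 queues of size 16 spread over 4 groups, with `wq0.5` landing in group 1.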
* [dpdk-dev] [PATCH v10 15/16] devbind: add dma device class 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (13 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add a new class for DMA devices. Devices listed under the DMA class are to be used with the dmadev library. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- usertools/dpdk-devbind.py | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 5f0e817055..da89b87816 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -71,6 +71,7 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] +dma_devices = [] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] @@ -585,6 +586,9 @@ def show_status(): if status_dev in ["crypto", "all"]: show_device_status(crypto_devices, "Crypto") + if status_dev in ["dma", "all"]: + show_device_status(dma_devices, "DMA") + if status_dev in ["event", "all"]: show_device_status(eventdev_devices, "Eventdev") @@ -653,7 +657,7 @@ def parse_args(): parser.add_argument( '--status-dev', help="Print the status of given device group.", - choices=['baseband', 'compress', 
'crypto', 'event', + choices=['baseband', 'compress', 'crypto', 'dma', 'event', 'mempool', 'misc', 'net', 'regex']) bind_group = parser.add_mutually_exclusive_group() bind_group.add_argument( @@ -734,6 +738,7 @@ def do_arg_actions(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) @@ -756,6 +761,7 @@ def main(): get_device_details(network_devices) get_device_details(baseband_devices) get_device_details(crypto_devices) + get_device_details(dma_devices) get_device_details(eventdev_devices) get_device_details(mempool_devices) get_device_details(compress_devices) -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v10 16/16] devbind: move idxd device ID to dmadev class 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz ` (14 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 15/16] devbind: add dma device class Kevin Laatz @ 2021-10-19 14:10 ` Kevin Laatz 15 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-19 14:10 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz The dmadev library is the preferred abstraction for using IDXD devices and will replace the rawdev implementation in future. This patch moves the IDXD device ID to the dmadev class. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- usertools/dpdk-devbind.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index da89b87816..ba18e2a487 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -71,13 +71,13 @@ network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] crypto_devices = [encryption_class, intel_processor_class] -dma_devices = [] +dma_devices = [intel_idxd_spr] eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso] mempool_devices = [cavium_fpa, octeontx2_npa] compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] misc_devices = [cnxk_bphy, cnxk_bphy_cgx, cnxk_inl_dev, intel_ioat_bdw, - intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, intel_ntb_skx, + intel_ioat_skx, intel_ioat_icx, intel_ntb_skx, intel_ntb_icx, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
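The effect of moving `intel_idxd_spr` between the device lists can be illustrated with a simplified matcher. This is a sketch, not the matching code in `dpdk-devbind.py`: `matches` and `classify` are hypothetical helpers, the matching is a plain exact comparison, and the class/vendor/device values are assumptions (only the "0b25" device type is stated in this thread's documentation patch).

```python
# Assumed ID dictionaries, in the same shape devbind uses for its classes.
intel_idxd_spr = {"Class": "08", "Vendor": "8086", "Device": "0b25"}
intel_ioat_skx = {"Class": "08", "Vendor": "8086", "Device": "2021"}

# State after this patch: IDXD is listed under DMA, IOAT stays under misc.
dma_devices = [intel_idxd_spr]
misc_devices = [intel_ioat_skx]


def matches(dev, class_list):
    """True if every key of some entry in class_list equals the device's value."""
    return any(all(dev.get(key) == value for key, value in entry.items())
               for entry in class_list)


def classify(dev, groups):
    """Return the names of all groups whose ID lists match the device."""
    return [name for name, class_list in groups.items()
            if matches(dev, class_list)]
```

With these lists, a device reporting device ID `0b25` is classified under `dma` only, which is why it would show up for `--status-dev dma` rather than `--status-dev misc` once the patch is applied.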
* [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices 2021-08-27 17:20 [dpdk-dev] [PATCH 00/13] add dmadev driver for idxd devices Kevin Laatz ` (20 preceding siblings ...) 2021-10-19 14:10 ` [dpdk-dev] [PATCH v10 00/16] add dmadev driver for idxd devices Kevin Laatz @ 2021-10-20 16:29 ` Kevin Laatz 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 01/16] raw/ioat: only build if dmadev not present Kevin Laatz ` (16 more replies) 21 siblings, 17 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:29 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz This patchset adds a dmadev driver and associated documentation to support Intel Data Streaming Accelerator devices. This driver is intended to ultimately replace the current IDXD part of the IOAT rawdev driver. v11: * addressed ML feedback from Chengwen v10: * meson fix to ensure Windows and BSD builds compile v9: * add missing meson check for x86 v8: * fix compilation issues of individual patches v7: * rebase on above patchsets * add meson reason for rawdev build v6: * set state of device during create * add dev_close function * documentation updates - moved generic pieces from driver doc to lib doc * other small miscellaneous fixes based on rebasing and ML feedback v5: * add missing toctree entry for idxd driver v4: * rebased on above patchsets * minor fixes based on review feedback v3: * rebased on above patchsets * added burst capacity API v2: * rebased on above patchsets * added API to check for device being idle * added devbind updates for DMA devices * fixed issue identified by internal coverity scan Bruce Richardson (1): raw/ioat: only build if dmadev not present Conor Walsh (1): dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz (14): dma/idxd: add skeleton for VFIO based DSA device dma/idxd: add bus device probing dma/idxd: create dmadev instances on bus probe dma/idxd: create dmadev instances on pci probe dma/idxd: add datapath 
structures dma/idxd: add configure and info_get functions dma/idxd: add start and stop functions for pci devices dma/idxd: add data-path job submission functions dma/idxd: add data-path job completion functions dma/idxd: add operation statistic tracking dma/idxd: add vchan status function dma/idxd: add burst capacity API devbind: add dma device class devbind: move idxd device ID to dmadev class MAINTAINERS | 10 + doc/guides/dmadevs/idxd.rst | 179 ++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/prog_guide/dmadev.rst | 30 ++ doc/guides/rawdevs/ioat.rst | 8 + doc/guides/rel_notes/release_21_11.rst | 5 + drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++ drivers/dma/idxd/idxd_bus.c | 378 +++++++++++++++ drivers/dma/idxd/idxd_common.c | 612 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 131 ++++++ drivers/dma/idxd/idxd_internal.h | 109 +++++ drivers/dma/idxd/idxd_pci.c | 380 +++++++++++++++ drivers/dma/idxd/meson.build | 16 + drivers/dma/idxd/version.map | 3 + drivers/dma/meson.build | 2 + drivers/meson.build | 4 +- drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +---- drivers/raw/ioat/meson.build | 24 +- usertools/dpdk-devbind.py | 10 +- 19 files changed, 2014 insertions(+), 124 deletions(-) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py create mode 100644 drivers/dma/idxd/idxd_bus.c create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_hw_defs.h create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v11 01/16] raw/ioat: only build if dmadev not present
From: Kevin Laatz @ 2021-10-20 16:29 UTC
To: dev
Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz

From: Bruce Richardson <bruce.richardson@intel.com>

Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not
present. This change requires the dependencies to be reordered in
drivers/meson.build so that rawdev can use the "RTE_DMA_*" build macros to
check for the presence of the equivalent dmadev driver.

A note is also added to the documentation to inform users of this change.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
---
 doc/guides/rawdevs/ioat.rst  |  8 ++++++++
 drivers/meson.build          |  4 ++--
 drivers/raw/ioat/meson.build | 24 +++++++++++++++++++++---
 3 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/doc/guides/rawdevs/ioat.rst b/doc/guides/rawdevs/ioat.rst
index a28e909935..a65530bd30 100644
--- a/doc/guides/rawdevs/ioat.rst
+++ b/doc/guides/rawdevs/ioat.rst
@@ -34,6 +34,14 @@ Compilation
 For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based.
 No additional compilation steps are necessary.
 
+.. note::
+    Since the addition of the dmadev library, the ``ioat`` and ``idxd`` parts of this driver
+    will only be built if their ``dmadev`` counterparts are not built.
+ The following can be used to disable the ``dmadev`` drivers, + if the raw drivers are to be used instead:: + + $ meson -Ddisable_drivers=dma/* <build_dir> + Device Setup ------------- diff --git a/drivers/meson.build b/drivers/meson.build index b7d680868a..34c0276487 100644 --- a/drivers/meson.build +++ b/drivers/meson.build @@ -10,15 +10,15 @@ subdirs = [ 'common/qat', # depends on bus. 'common/sfc_efx', # depends on bus. 'mempool', # depends on common and bus. + 'dma', # depends on common and bus. 'net', # depends on common, bus, mempool - 'raw', # depends on common, bus and net. + 'raw', # depends on common, bus, dma and net. 'crypto', # depends on common, bus and mempool (net in future). 'compress', # depends on common, bus, mempool. 'regex', # depends on common, bus, regexdev. 'vdpa', # depends on common, bus and mempool. 'event', # depends on common, bus, mempool and net. 'baseband', # depends on common and bus. - 'dma', # depends on common and bus. ] if meson.is_cross_build() diff --git a/drivers/raw/ioat/meson.build b/drivers/raw/ioat/meson.build index 0e81cb5951..1b866aab74 100644 --- a/drivers/raw/ioat/meson.build +++ b/drivers/raw/ioat/meson.build @@ -2,14 +2,32 @@ # Copyright 2019 Intel Corporation build = dpdk_conf.has('RTE_ARCH_X86') +# only use ioat rawdev driver if we don't have the equivalent dmadev ones +if dpdk_conf.has('RTE_DMA_IDXD') and dpdk_conf.has('RTE_DMA_IOAT') + build = false + reason = 'replaced by dmadev drivers' + subdir_done() +endif + reason = 'only supported on x86' sources = files( - 'idxd_bus.c', - 'idxd_pci.c', 'ioat_common.c', - 'ioat_rawdev.c', 'ioat_rawdev_test.c', ) + +if not dpdk_conf.has('RTE_DMA_IDXD') + sources += files( + 'idxd_bus.c', + 'idxd_pci.c', + ) +endif + +if not dpdk_conf.has('RTE_DMA_IOAT') + sources += files ( + 'ioat_rawdev.c', + ) +endif + deps += ['bus_pci', 'mbuf', 'rawdev'] headers = files( 'rte_ioat_rawdev.h', -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v11 02/16] dma/idxd: add skeleton for VFIO based DSA device 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 01/16] raw/ioat: only build if dmadev not present Kevin Laatz @ 2021-10-20 16:29 ` Kevin Laatz 2021-10-22 15:47 ` Thomas Monjalon 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 03/16] dma/idxd: add bus device probing Kevin Laatz ` (14 subsequent siblings) 16 siblings, 1 reply; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:29 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probe/remove skeleton code for DSA device bound to the vfio pci driver. Relevant documentation and MAINTAINERS update also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v8: fix compile issue v9: add meson check for x86 --- MAINTAINERS | 10 +++++ doc/guides/dmadevs/idxd.rst | 58 ++++++++++++++++++++++++++ doc/guides/dmadevs/index.rst | 2 + doc/guides/rel_notes/release_21_11.rst | 5 +++ drivers/dma/idxd/idxd_common.c | 11 +++++ drivers/dma/idxd/idxd_internal.h | 27 ++++++++++++ drivers/dma/idxd/idxd_pci.c | 55 ++++++++++++++++++++++++ drivers/dma/idxd/meson.build | 11 +++++ drivers/dma/idxd/version.map | 3 ++ drivers/dma/meson.build | 2 + 10 files changed, 184 insertions(+) create mode 100644 doc/guides/dmadevs/idxd.rst create mode 100644 drivers/dma/idxd/idxd_common.c create mode 100644 drivers/dma/idxd/idxd_internal.h create mode 100644 drivers/dma/idxd/idxd_pci.c create mode 100644 drivers/dma/idxd/meson.build create mode 100644 drivers/dma/idxd/version.map diff --git a/MAINTAINERS b/MAINTAINERS index 5387ffd4fc..423d8a73ce 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1200,6 +1200,16 @@ F: doc/guides/compressdevs/zlib.rst F: doc/guides/compressdevs/features/zlib.ini 
+DMAdev Drivers
+--------------
+
+Intel IDXD - EXPERIMENTAL
+M: Bruce Richardson <bruce.richardson@intel.com>
+M: Kevin Laatz <kevin.laatz@intel.com>
+F: drivers/dma/idxd/
+F: doc/guides/dmadevs/idxd.rst
+
+
 RegEx Drivers
 -------------
 
diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst
new file mode 100644
index 0000000000..924700d17e
--- /dev/null
+++ b/doc/guides/dmadevs/idxd.rst
@@ -0,0 +1,58 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2021 Intel Corporation.
+
+.. include:: <isonum.txt>
+
+IDXD DMA Device Driver
+======================
+
+The ``idxd`` dmadev driver provides a poll-mode driver (PMD) for Intel\ |reg|
+Data Streaming Accelerator `(Intel DSA)
+<https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html>`_.
+This PMD can be used in conjunction with Intel\ |reg| DSA devices to offload
+data operations, such as data copies, to hardware, freeing up CPU cycles for
+other tasks.
+
+Hardware Requirements
+----------------------
+
+The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the
+presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma``
+will show all the DMA devices on the system, including IDXD supported devices.
+Intel\ |reg| DSA devices currently (at time of writing) appear
+as devices with type “0b25”, due to the absence of pci-id database entries for
+them at this point.
+
+Compilation
+------------
+
+For builds using ``meson`` and ``ninja``, the driver will be built when the
+target platform is x86-based. No additional compilation steps are necessary.
+
+Device Setup
+-------------
+
+Devices using VFIO/UIO drivers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The HW devices to be used will need to be bound to a user-space IO driver for use.
+The ``dpdk-devbind.py`` script can be used to view the state of the devices
+and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``.
+For example::
+
+    $ dpdk-devbind.py -b vfio-pci 6a:01.0
+
+Device Probing and Initialization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For devices bound to a suitable DPDK-supported VFIO/UIO driver, the HW devices will
+be found as part of the device scan done at application initialization time without
+the need to pass parameters to the application.
+
+For Intel\ |reg| DSA devices, DPDK will automatically configure the device with the
+maximum number of workqueues available on it, partitioning all resources equally
+among the queues.
+If fewer workqueues are required, then the ``max_queues`` parameter may be passed to
+the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.::
+
+    $ dpdk-test -a <b:d:f>,max_queues=4
diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst
index 0bce29d766..5d4abf880e 100644
--- a/doc/guides/dmadevs/index.rst
+++ b/doc/guides/dmadevs/index.rst
@@ -10,3 +10,5 @@ an application through DMA API.
 .. toctree::
    :maxdepth: 2
    :numbered:
+
+   idxd
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index d5435a64aa..f8678efa94 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -75,6 +75,11 @@ New Features
     operations.
   * Added multi-process support.
 
+* **Added IDXD dmadev driver implementation.**
+
+  The IDXD dmadev driver provides device drivers for the Intel DSA devices.
+  This device driver can be used through the generic dmadev API.
+ * **Added new RSS offload types for IPv4/L4 checksum in RSS flow.** Added macros ETH_RSS_IPV4_CHKSUM and ETH_RSS_L4_CHKSUM, now IPv4 and diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c new file mode 100644 index 0000000000..e00ddbe5ef --- /dev/null +++ b/drivers/dma/idxd/idxd_common.c @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#include <rte_log.h> + +#include "idxd_internal.h" + +int idxd_pmd_logtype; + +RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h new file mode 100644 index 0000000000..c6a7dcd72f --- /dev/null +++ b/drivers/dma/idxd/idxd_internal.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_INTERNAL_H_ +#define _IDXD_INTERNAL_H_ + +/** + * @file idxd_internal.h + * + * Internal data structures for the idxd/DSA driver for dmadev + * + * @warning + * @b EXPERIMENTAL: these structures and APIs may change without prior notice + */ + +extern int idxd_pmd_logtype; + +#define IDXD_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \ + idxd_pmd_logtype, "IDXD: %s(): " fmt "\n", __func__, ##args) + +#define IDXD_PMD_DEBUG(fmt, args...) IDXD_PMD_LOG(DEBUG, fmt, ## args) +#define IDXD_PMD_INFO(fmt, args...) IDXD_PMD_LOG(INFO, fmt, ## args) +#define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) +#define IDXD_PMD_WARN(fmt, args...) 
IDXD_PMD_LOG(WARNING, fmt, ## args) + +#endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c new file mode 100644 index 0000000000..79e4aadcab --- /dev/null +++ b/drivers/dma/idxd/idxd_pci.c @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <rte_bus_pci.h> + +#include "idxd_internal.h" + +#define IDXD_VENDOR_ID 0x8086 +#define IDXD_DEVICE_ID_SPR 0x0B25 + +#define IDXD_PMD_DMADEV_NAME_PCI dmadev_idxd_pci + +const struct rte_pci_id pci_id_idxd_map[] = { + { RTE_PCI_DEVICE(IDXD_VENDOR_ID, IDXD_DEVICE_ID_SPR) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static int +idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + int ret = 0; + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + dev->device.driver = &drv->driver; + + return ret; +} + +static int +idxd_dmadev_remove_pci(struct rte_pci_device *dev) +{ + char name[PCI_PRI_STR_SIZE]; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + + IDXD_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + + return 0; +} + +struct rte_pci_driver idxd_pmd_drv_pci = { + .id_table = pci_id_idxd_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = idxd_dmadev_probe_pci, + .remove = idxd_dmadev_remove_pci, +}; + +RTE_PMD_REGISTER_PCI(IDXD_PMD_DMADEV_NAME_PCI, idxd_pmd_drv_pci); +RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_DMADEV_NAME_PCI, pci_id_idxd_map); +RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_DMADEV_NAME_PCI, "vfio-pci"); +RTE_PMD_REGISTER_PARAM_STRING(dmadev_idxd_pci, "max_queues=0"); diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build new file mode 100644 index 0000000000..11620ba156 --- /dev/null +++ b/drivers/dma/idxd/meson.build @@ -0,0 +1,11 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2021 Intel Corporation + +build = 
dpdk_conf.has('RTE_ARCH_X86') +reason = 'only supported on x86' + +deps += ['bus_pci'] +sources = files( + 'idxd_common.c', + 'idxd_pci.c' +) diff --git a/drivers/dma/idxd/version.map b/drivers/dma/idxd/version.map new file mode 100644 index 0000000000..4a76d1d52d --- /dev/null +++ b/drivers/dma/idxd/version.map @@ -0,0 +1,3 @@ +DPDK_21 { + local: *; +}; diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build index d9c7ede32f..411be7a240 100644 --- a/drivers/dma/meson.build +++ b/drivers/dma/meson.build @@ -2,5 +2,7 @@ # Copyright 2021 HiSilicon Limited drivers = [ + 'idxd', 'skeleton', ] +std_deps = ['dmadev'] -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v11 02/16] dma/idxd: add skeleton for VFIO based DSA device
  2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz
@ 2021-10-22 15:47 ` Thomas Monjalon
  0 siblings, 0 replies; 243+ messages in thread
From: Thomas Monjalon @ 2021-10-22 15:47 UTC (permalink / raw)
  To: Kevin Laatz; +Cc: dev, bruce.richardson, fengchengwen, jerinj, conor.walsh

20/10/2021 18:29, Kevin Laatz:
> --- /dev/null
> +++ b/drivers/dma/idxd/version.map
> @@ -0,0 +1,3 @@
> +DPDK_21 {
> +    local: *;
> +};

Should be DPDK_22
Will fix while merging.

^ permalink raw reply [flat|nested] 243+ messages in thread
* [dpdk-dev] [PATCH v11 03/16] dma/idxd: add bus device probing 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 01/16] raw/ioat: only build if dmadev not present Kevin Laatz 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 02/16] dma/idxd: add skeleton for VFIO based DSA device Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz ` (13 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the basic device probing for DSA devices bound to the IDXD kernel driver. These devices can be configured via sysfs and made available to DPDK if they are found during bus scan. Relevant documentation is included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 64 ++++++ drivers/dma/idxd/idxd_bus.c | 352 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 34 +++ drivers/dma/idxd/meson.build | 4 + 4 files changed, 454 insertions(+) create mode 100644 drivers/dma/idxd/idxd_bus.c diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 924700d17e..ce33e2857a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -32,6 +32,56 @@ target platform is x86-based. No additional compilation steps are necessary. Device Setup ------------- +Intel\ |reg| DSA devices can use the IDXD kernel driver or DPDK-supported drivers, +such as ``vfio-pci``. Both are supported by the IDXD PMD. 
+
+Intel\ |reg| DSA devices using IDXD kernel driver
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To use an Intel\ |reg| DSA device bound to the IDXD kernel driver, the device must first be configured.
+The `accel-config <https://github.com/intel/idxd-config>`_ utility library can be used for configuration.
+
+.. note::
+    The device configuration can also be done by directly interacting with the sysfs nodes.
+    An example of how this may be done can be seen in the script ``dpdk_idxd_cfg.py``
+    included in the driver source directory.
+
+There are some mandatory configuration steps before being able to use a device with an application.
+The internal engines, which do the copies or other operations,
+and the work-queues, which are used by applications to assign work to the device,
+need to be assigned to groups, and the various other configuration options,
+such as priority or queue depth, need to be set for each queue.
+
+To assign an engine to a group::
+
+    $ accel-config config-engine dsa0/engine0.0 --group-id=0
+    $ accel-config config-engine dsa0/engine0.1 --group-id=1
+
+To assign work queues to groups for passing descriptors to the engines a similar accel-config command can be used.
+However, the work queues also need to be configured depending on the use case.
+Some configuration options include:
+
+* mode (Dedicated/Shared): Indicates whether a WQ may accept jobs from multiple queues simultaneously.
+* priority: WQ priority between 1 and 15. Larger value means higher priority.
+* wq-size: the size of the WQ. Sum of all WQ sizes must be less than the total-size defined by the device.
+* type: WQ type (kernel/mdev/user). Determines how the device is presented.
+* name: identifier given to the WQ.
+ +Example configuration for a work queue:: + + $ accel-config config-wq dsa0/wq0.0 --group-id=0 \ + --mode=dedicated --priority=10 --wq-size=8 \ + --type=user --name=dpdk_app1 + +Once the devices have been configured, they need to be enabled:: + + $ accel-config enable-device dsa0 + $ accel-config enable-wq dsa0/wq0.0 + +Check the device configuration:: + + $ accel-config list + Devices using VFIO/UIO drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -56,3 +106,17 @@ If fewer workqueues are required, then the ``max_queues`` parameter may be passe the device driver on the EAL commandline, via the ``allowlist`` or ``-a`` flag e.g.:: $ dpdk-test -a <b:d:f>,max_queues=4 + +For devices bound to the IDXD kernel driver, +the DPDK IDXD driver will automatically perform a scan for available workqueues +to use. Any workqueues found listed in ``/dev/dsa`` on the system will be checked +in ``/sys``, and any which have ``dpdk_`` prefix in their name will be automatically +probed by the driver to make them available to the application. +Alternatively, to support use by multiple DPDK processes simultaneously, +the value used as the DPDK ``--file-prefix`` parameter may be used as a workqueue +name prefix, instead of ``dpdk_``, allowing each DPDK application instance to only +use a subset of configured queues. + +Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, +that is a "DMA device type" inside DPDK, and can be accessed using APIs from the +``rte_dmadev`` library. 
diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c new file mode 100644 index 0000000000..c0f862a965 --- /dev/null +++ b/drivers/dma/idxd/idxd_bus.c @@ -0,0 +1,352 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Intel Corporation + */ + +#include <dirent.h> +#include <fcntl.h> +#include <unistd.h> +#include <sys/mman.h> +#include <libgen.h> + +#include <rte_bus.h> +#include <rte_eal.h> +#include <rte_log.h> +#include <rte_dmadev_pmd.h> +#include <rte_string_fns.h> + +#include "idxd_internal.h" + +/* default value for DSA paths, but allow override in environment for testing */ +#define DSA_DEV_PATH "/dev/dsa" +#define DSA_SYSFS_PATH "/sys/bus/dsa/devices" + +static unsigned int devcount; + +/** unique identifier for a DSA device/WQ instance */ +struct dsa_wq_addr { + uint16_t device_id; + uint16_t wq_id; +}; + +/** a DSA device instance */ +struct rte_dsa_device { + struct rte_device device; /**< Inherit core device */ + TAILQ_ENTRY(rte_dsa_device) next; /**< next dev in list */ + + char wq_name[32]; /**< the workqueue name/number e.g. 
wq0.1 */ + struct dsa_wq_addr addr; /**< Identifies the specific WQ */ +}; + +/* forward prototypes */ +struct dsa_bus; +static int dsa_scan(void); +static int dsa_probe(void); +static struct rte_device *dsa_find_device(const struct rte_device *start, + rte_dev_cmp_t cmp, const void *data); +static enum rte_iova_mode dsa_get_iommu_class(void); +static int dsa_addr_parse(const char *name, void *addr); + +/** List of devices */ +TAILQ_HEAD(dsa_device_list, rte_dsa_device); + +/** + * Structure describing the DSA bus + */ +struct dsa_bus { + struct rte_bus bus; /**< Inherit the generic class */ + struct rte_driver driver; /**< Driver struct for devices to point to */ + struct dsa_device_list device_list; /**< List of PCI devices */ +}; + +struct dsa_bus dsa_bus = { + .bus = { + .scan = dsa_scan, + .probe = dsa_probe, + .find_device = dsa_find_device, + .get_iommu_class = dsa_get_iommu_class, + .parse = dsa_addr_parse, + }, + .driver = { + .name = "dmadev_idxd" + }, + .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list), +}; + +static inline const char * +dsa_get_dev_path(void) +{ + const char *path = getenv("DSA_DEV_PATH"); + return path ? path : DSA_DEV_PATH; +} + +static inline const char * +dsa_get_sysfs_path(void) +{ + const char *path = getenv("DSA_SYSFS_PATH"); + return path ? 
path : DSA_SYSFS_PATH; +} + +static void * +idxd_bus_mmap_wq(struct rte_dsa_device *dev) +{ + void *addr; + char path[PATH_MAX]; + int fd; + + snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name); + fd = open(path, O_RDWR); + if (fd < 0) { + IDXD_PMD_ERR("Failed to open device path: %s", path); + return NULL; + } + + addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + IDXD_PMD_ERR("Failed to mmap device %s", path); + return NULL; + } + + return addr; +} + +static int +read_wq_string(struct rte_dsa_device *dev, const char *filename, + char *value, size_t valuelen) +{ + char sysfs_node[PATH_MAX]; + int len; + int fd; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + fd = open(sysfs_node, O_RDONLY); + if (fd < 0) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + len = read(fd, value, valuelen - 1); + close(fd); + if (len < 0) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + value[len] = '\0'; + return 0; +} + +static int +read_wq_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s", + dsa_get_sysfs_path(), dev->wq_name, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +read_device_int(struct rte_dsa_device *dev, const char *filename, + int *value) +{ + char sysfs_node[PATH_MAX]; + FILE *f; + int ret = 0; + + snprintf(sysfs_node, sizeof(sysfs_node), "%s/dsa%d/%s", + 
dsa_get_sysfs_path(), dev->addr.device_id, filename); + f = fopen(sysfs_node, "r"); + if (f == NULL) { + IDXD_PMD_ERR("%s(): opening file '%s' failed: %s", + __func__, sysfs_node, strerror(errno)); + return -1; + } + + if (fscanf(f, "%d", value) != 1) { + IDXD_PMD_ERR("%s(): error reading file '%s': %s", + __func__, sysfs_node, strerror(errno)); + ret = -1; + } + + fclose(f); + return ret; +} + +static int +idxd_probe_dsa(struct rte_dsa_device *dev) +{ + struct idxd_dmadev idxd = {0}; + int ret = 0; + + IDXD_PMD_INFO("Probing device %s on numa node %d", + dev->wq_name, dev->device.numa_node); + if (read_wq_int(dev, "size", &ret) < 0) + return -1; + idxd.max_batches = ret; + if (read_wq_int(dev, "max_batch_size", &ret) < 0) + return -1; + idxd.max_batch_size = ret; + idxd.qid = dev->addr.wq_id; + idxd.sva_support = 1; + + idxd.portal = idxd_bus_mmap_wq(dev); + if (idxd.portal == NULL) { + IDXD_PMD_ERR("WQ mmap failed"); + return -ENOENT; + } + + return 0; +} + +static int +is_for_this_process_use(const char *name) +{ + char *runtime_dir = strdup(rte_eal_get_runtime_dir()); + char *prefix = basename(runtime_dir); + int prefixlen = strlen(prefix); + int retval = 0; + + if (strncmp(name, "dpdk_", 5) == 0) + retval = 1; + if (strncmp(name, prefix, prefixlen) == 0 && name[prefixlen] == '_') + retval = 1; + + free(runtime_dir); + return retval; +} + +static int +dsa_probe(void) +{ + struct rte_dsa_device *dev; + + TAILQ_FOREACH(dev, &dsa_bus.device_list, next) { + char type[64], name[64]; + + if (read_wq_string(dev, "type", type, sizeof(type)) < 0 || + read_wq_string(dev, "name", name, sizeof(name)) < 0) + continue; + + if (strncmp(type, "user", 4) == 0 && is_for_this_process_use(name)) { + dev->device.driver = &dsa_bus.driver; + idxd_probe_dsa(dev); + continue; + } + IDXD_PMD_DEBUG("WQ '%s', not allocated to DPDK", dev->wq_name); + } + + return 0; +} + +static int +dsa_scan(void) +{ + const char *path = dsa_get_dev_path(); + struct dirent *wq; + DIR *dev_dir; + + dev_dir 
= opendir(path); + if (dev_dir == NULL) { + if (errno == ENOENT) + return 0; /* no bus, return without error */ + IDXD_PMD_ERR("%s(): opendir '%s' failed: %s", + __func__, path, strerror(errno)); + return -1; + } + + while ((wq = readdir(dev_dir)) != NULL) { + struct rte_dsa_device *dev; + int numa_node = -1; + + if (strncmp(wq->d_name, "wq", 2) != 0) + continue; + if (strnlen(wq->d_name, sizeof(dev->wq_name)) == sizeof(dev->wq_name)) { + IDXD_PMD_ERR("%s(): wq name too long: '%s', skipping", + __func__, wq->d_name); + continue; + } + IDXD_PMD_DEBUG("%s(): found %s/%s", __func__, path, wq->d_name); + + dev = malloc(sizeof(*dev)); + if (dsa_addr_parse(wq->d_name, &dev->addr) < 0) { + IDXD_PMD_ERR("Error parsing WQ name: %s", wq->d_name); + free(dev); + continue; + } + dev->device.bus = &dsa_bus.bus; + strlcpy(dev->wq_name, wq->d_name, sizeof(dev->wq_name)); + TAILQ_INSERT_TAIL(&dsa_bus.device_list, dev, next); + devcount++; + + read_device_int(dev, "numa_node", &numa_node); + dev->device.numa_node = numa_node; + dev->device.name = dev->wq_name; + } + + closedir(dev_dir); + return 0; +} + +static struct rte_device * +dsa_find_device(const struct rte_device *start, rte_dev_cmp_t cmp, + const void *data) +{ + struct rte_dsa_device *dev = TAILQ_FIRST(&dsa_bus.device_list); + + /* the rte_device struct must be at start of dsa structure */ + RTE_BUILD_BUG_ON(offsetof(struct rte_dsa_device, device) != 0); + + if (start != NULL) /* jump to start point if given */ + dev = TAILQ_NEXT((const struct rte_dsa_device *)start, next); + while (dev != NULL) { + if (cmp(&dev->device, data) == 0) + return &dev->device; + dev = TAILQ_NEXT(dev, next); + } + return NULL; +} + +static enum rte_iova_mode +dsa_get_iommu_class(void) +{ + /* if there are no devices, report don't care, otherwise VA mode */ + return devcount > 0 ? 
RTE_IOVA_VA : RTE_IOVA_DC; +} + +static int +dsa_addr_parse(const char *name, void *addr) +{ + struct dsa_wq_addr *wq = addr; + unsigned int device_id, wq_id; + + if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2) { + IDXD_PMD_DEBUG("Parsing WQ name failed: %s", name); + return -1; + } + + wq->device_id = device_id; + wq->wq_id = wq_id; + return 0; +} + +RTE_REGISTER_BUS(dsa, dsa_bus.bus); diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index c6a7dcd72f..b8a7d7dab6 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -24,4 +24,38 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_dmadev { + /* counters to track the batches */ + unsigned short max_batches; + unsigned short batch_idx_read; + unsigned short batch_idx_write; + + /* track descriptors and handles */ + unsigned short desc_ring_mask; + unsigned short ids_avail; /* handles for ops completed */ + unsigned short ids_returned; /* the read pointer for hdls/desc rings */ + unsigned short batch_start; /* start+size == write pointer for hdls/desc */ + unsigned short batch_size; + + void *portal; /* address to write the batch descriptor */ + + struct idxd_completion *batch_comp_ring; + unsigned short *batch_idx_ring; /* store where each batch ends */ + + rte_iova_t batch_iova; /* base address of the batch comp ring */ + rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ + + unsigned short max_batch_size; + + struct rte_dma_dev *dmadev; + uint8_t sva_support; + uint8_t qid; + + union { + struct { + unsigned int dsa_id; + } bus; + } u; +}; + #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build index 11620ba156..37af6e1b8f 100644 --- a/drivers/dma/idxd/meson.build +++ b/drivers/dma/idxd/meson.build @@ -9,3 +9,7 @@ sources = files( 
'idxd_common.c', 'idxd_pci.c' ) + +if is_linux + sources += files('idxd_bus.c') +endif -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
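
Editorial sketch for patch 03/16: the bus scan above accepts only names of the form ``wq<device>.<queue>``, and the probe then claims a queue only if its sysfs ``name`` starts with ``dpdk_`` or with the process file prefix followed by ``_``. The standalone C below illustrates those two checks without any DPDK dependency. The names ``wq_addr``, ``wq_name_parse`` and ``wq_is_for_process`` are hypothetical, the file prefix is passed in as a parameter rather than derived from ``rte_eal_get_runtime_dir()``, and the empty-prefix guard is an addition that the patch does not have.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the patch's struct dsa_wq_addr. */
struct wq_addr {
	unsigned int device_id;
	unsigned int wq_id;
};

/*
 * Parse a work queue name of the form "wq<device>.<queue>".
 * Returns 0 on success, -1 if the name does not match.
 */
int
wq_name_parse(const char *name, struct wq_addr *addr)
{
	unsigned int device_id, wq_id;

	assert(name != NULL && addr != NULL);
	/* both numeric fields must convert, so "wq3" or "dsa0" are rejected */
	if (sscanf(name, "wq%u.%u", &device_id, &wq_id) != 2)
		return -1;
	addr->device_id = device_id;
	addr->wq_id = wq_id;
	return 0;
}

/*
 * Return 1 if a WQ label marks the queue for this process:
 * it starts with "dpdk_" or with "<file_prefix>_"; otherwise return 0.
 */
int
wq_is_for_process(const char *wq_label, const char *file_prefix)
{
	size_t prefixlen;

	assert(wq_label != NULL && file_prefix != NULL);
	prefixlen = strlen(file_prefix);

	if (strncmp(wq_label, "dpdk_", 5) == 0)
		return 1;
	/* assumption: an empty prefix never matches (not checked in the patch) */
	if (prefixlen > 0 && strncmp(wq_label, file_prefix, prefixlen) == 0 &&
			wq_label[prefixlen] == '_')
		return 1;
	return 0;
}
```

With these helpers, a queue configured with ``--name=dpdk_app1`` would be claimed by any DPDK process, while one named ``app2_q0`` would be claimed only when the hypothetical ``file_prefix`` argument is ``app2``.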
* [dpdk-dev] [PATCH v11 04/16] dma/idxd: create dmadev instances on bus probe 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz ` (2 preceding siblings ...) 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 03/16] dma/idxd: add bus device probing Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz ` (12 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the bus scan/probe, create a dmadev instance for each HW queue. Internal structures required for device creation are also added. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 19 ++++++++++ drivers/dma/idxd/idxd_common.c | 61 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 27 ++++++++++++++ drivers/dma/idxd/idxd_internal.h | 7 ++++ 4 files changed, 114 insertions(+) create mode 100644 drivers/dma/idxd/idxd_hw_defs.h diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index c0f862a965..f5bd10191a 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -86,6 +86,18 @@ dsa_get_sysfs_path(void) return path ? 
path : DSA_SYSFS_PATH; } +static int +idxd_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->data->dev_private; + munmap(idxd->portal, 0x1000); + return 0; +} + +static const struct rte_dma_dev_ops idxd_bus_ops = { + .dev_close = idxd_dev_close, +}; + static void * idxd_bus_mmap_wq(struct rte_dsa_device *dev) { @@ -207,6 +219,7 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -1; idxd.max_batch_size = ret; idxd.qid = dev->addr.wq_id; + idxd.u.bus.dsa_id = dev->addr.device_id; idxd.sva_support = 1; idxd.portal = idxd_bus_mmap_wq(dev); @@ -215,6 +228,12 @@ idxd_probe_dsa(struct rte_dsa_device *dev) return -ENOENT; } + ret = idxd_dmadev_create(dev->wq_name, &dev->device, &idxd, &idxd_bus_ops); + if (ret) { + IDXD_PMD_ERR("Failed to create dmadev %s", dev->wq_name); + return ret; + } + return 0; } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index e00ddbe5ef..08ed3e4998 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,10 +2,71 @@ * Copyright 2021 Intel Corporation */ +#include <rte_malloc.h> +#include <rte_common.h> #include <rte_log.h> #include "idxd_internal.h" +#define IDXD_PMD_NAME_STR "dmadev_idxd" + +int +idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, + const struct rte_dma_dev_ops *ops) +{ + struct idxd_dmadev *idxd = NULL; + struct rte_dma_dev *dmadev = NULL; + int ret = 0; + + if (!name) { + IDXD_PMD_ERR("Invalid name of the device!"); + ret = -EINVAL; + goto cleanup; + } + + /* Allocate device structure */ + dmadev = rte_dma_pmd_allocate(name, dev->numa_node, sizeof(struct idxd_dmadev)); + if (dmadev == NULL) { + IDXD_PMD_ERR("Unable to allocate dma device"); + ret = -ENOMEM; + goto cleanup; + } + dmadev->dev_ops = ops; + dmadev->device = dev; + + idxd = dmadev->data->dev_private; + *idxd = *base_idxd; /* copy over the main fields already passed in */ + idxd->dmadev = dmadev; + + /* allocate batch index ring and 
completion ring. + * The +1 is because we can never fully use + * the ring, otherwise read == write means both full and empty. + */ + idxd->batch_comp_ring = rte_zmalloc_socket(NULL, (sizeof(idxd->batch_idx_ring[0]) + + sizeof(idxd->batch_comp_ring[0])) * (idxd->max_batches + 1), + sizeof(idxd->batch_comp_ring[0]), dev->numa_node); + if (idxd->batch_comp_ring == NULL) { + IDXD_PMD_ERR("Unable to reserve memory for batch data\n"); + ret = -ENOMEM; + goto cleanup; + } + idxd->batch_idx_ring = (void *)&idxd->batch_comp_ring[idxd->max_batches+1]; + idxd->batch_iova = rte_mem_virt2iova(idxd->batch_comp_ring); + + dmadev->fp_obj->dev_private = idxd; + + idxd->dmadev->state = RTE_DMA_DEV_READY; + + return 0; + +cleanup: + if (dmadev) + rte_dma_pmd_release(name); + + return ret; +} + int idxd_pmd_logtype; RTE_LOG_REGISTER_DEFAULT(idxd_pmd_logtype, WARNING); diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h new file mode 100644 index 0000000000..a92d462d01 --- /dev/null +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2021 Intel Corporation + */ + +#ifndef _IDXD_HW_DEFS_H_ +#define _IDXD_HW_DEFS_H_ + +#define IDXD_COMP_STATUS_INCOMPLETE 0 +#define IDXD_COMP_STATUS_SUCCESS 1 +#define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 +#define IDXD_COMP_STATUS_INVALID_SIZE 0x13 +#define IDXD_COMP_STATUS_SKIPPED 0xFF /* not official IDXD error, needed as placeholder */ + +/** + * Completion record structure written back by DSA + */ +struct idxd_completion { + uint8_t status; + uint8_t result; + /* 16-bits pad here */ + uint32_t completed_size; /* data length, or descriptors for batch */ + + rte_iova_t fault_address; + uint32_t invalid_flags; +} __rte_aligned(32); + +#endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index b8a7d7dab6..8f1cdf6102 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -5,6 +5,10 @@ #ifndef 
_IDXD_INTERNAL_H_ #define _IDXD_INTERNAL_H_ +#include <rte_dmadev_pmd.h> + +#include "idxd_hw_defs.h" + /** * @file idxd_internal.h * @@ -58,4 +62,7 @@ struct idxd_dmadev { } u; }; +int idxd_dmadev_create(const char *name, struct rte_device *dev, + const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); + #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
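
Editorial sketch for patch 04/16: the comment in ``idxd_dmadev_create()`` explains that the rings are sized ``max_batches + 1`` because a ring in which ``read == write`` would otherwise mean both full and empty. The standalone C below shows that convention on a generic index ring. All names (``idx_ring`` and its functions) are hypothetical and the modulo wrap is a simplification chosen for the example; it is not the driver's data path.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical index ring; one slot always stays unused. */
struct idx_ring {
	unsigned short *slots;
	unsigned int nb_slots;	/* usable entries + 1 */
	unsigned int read;	/* next slot to pop */
	unsigned int write;	/* next slot to push */
};

/* Allocate room for max_entries usable entries plus the reserved slot. */
int
idx_ring_init(struct idx_ring *r, unsigned int max_entries)
{
	assert(r != NULL);
	r->nb_slots = max_entries + 1;
	r->read = 0;
	r->write = 0;
	r->slots = calloc(r->nb_slots, sizeof(r->slots[0]));
	return r->slots == NULL ? -1 : 0;
}

/* Empty: the read and write indexes coincide. */
int
idx_ring_empty(const struct idx_ring *r)
{
	return r->read == r->write;
}

/* Full: advancing write would make it collide with read. */
int
idx_ring_full(const struct idx_ring *r)
{
	return (r->write + 1) % r->nb_slots == r->read;
}

/* Store one value; returns -1 when the ring is full. */
int
idx_ring_push(struct idx_ring *r, unsigned short v)
{
	if (idx_ring_full(r))
		return -1;
	r->slots[r->write] = v;
	r->write = (r->write + 1) % r->nb_slots;
	return 0;
}

/* Fetch the oldest value; returns -1 when the ring is empty. */
int
idx_ring_pop(struct idx_ring *r, unsigned short *v)
{
	if (idx_ring_empty(r))
		return -1;
	*v = r->slots[r->read];
	r->read = (r->read + 1) % r->nb_slots;
	return 0;
}
```

Reserving one slot costs a single entry of memory but removes the need for a separate element counter, which is presumably why the patch pays for the extra ``+ 1`` in its allocation.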
* [dpdk-dev] [PATCH v11 05/16] dma/idxd: create dmadev instances on pci probe 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz ` (3 preceding siblings ...) 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 04/16] dma/idxd: create dmadev instances on bus probe Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 06/16] dma/idxd: add datapath structures Kevin Laatz ` (11 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When a suitable device is found during the PCI probe, create a dmadev instance for each HW queue. HW definitions required are also included. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- v11: fix destroy --- drivers/dma/idxd/idxd_hw_defs.h | 63 +++++++ drivers/dma/idxd/idxd_internal.h | 13 ++ drivers/dma/idxd/idxd_pci.c | 271 ++++++++++++++++++++++++++++++- 3 files changed, 344 insertions(+), 3 deletions(-) diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index a92d462d01..86f7f3526b 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -24,4 +24,67 @@ struct idxd_completion { uint32_t invalid_flags; } __rte_aligned(32); +/*** Definitions for Intel(R) Data Streaming Accelerator ***/ + +#define IDXD_CMD_SHIFT 20 +enum rte_idxd_cmds { + idxd_enable_dev = 1, + idxd_disable_dev, + idxd_drain_all, + idxd_abort_all, + idxd_reset_device, + idxd_enable_wq, + idxd_disable_wq, + idxd_drain_wq, + idxd_abort_wq, + idxd_reset_wq, +}; + +/* General bar0 registers */ +struct rte_idxd_bar0 { + uint32_t __rte_cache_aligned version; /* offset 0x00 */ + uint64_t __rte_aligned(0x10) gencap; /* offset 0x10 */ + uint64_t __rte_aligned(0x10) wqcap; /* offset 0x20 */ + 
uint64_t __rte_aligned(0x10) grpcap; /* offset 0x30 */ + uint64_t __rte_aligned(0x08) engcap; /* offset 0x38 */ + uint64_t __rte_aligned(0x10) opcap; /* offset 0x40 */ + uint64_t __rte_aligned(0x20) offsets[2]; /* offset 0x60 */ + uint32_t __rte_aligned(0x20) gencfg; /* offset 0x80 */ + uint32_t __rte_aligned(0x08) genctrl; /* offset 0x88 */ + uint32_t __rte_aligned(0x10) gensts; /* offset 0x90 */ + uint32_t __rte_aligned(0x08) intcause; /* offset 0x98 */ + uint32_t __rte_aligned(0x10) cmd; /* offset 0xA0 */ + uint32_t __rte_aligned(0x08) cmdstatus; /* offset 0xA8 */ + uint64_t __rte_aligned(0x20) swerror[4]; /* offset 0xC0 */ +}; + +/* workqueue config is provided by array of uint32_t. */ +enum rte_idxd_wqcfg { + wq_size_idx, /* size is in first 32-bit value */ + wq_threshold_idx, /* WQ threshold second 32-bits */ + wq_mode_idx, /* WQ mode and other flags */ + wq_sizes_idx, /* WQ transfer and batch sizes */ + wq_occ_int_idx, /* WQ occupancy interrupt handle */ + wq_occ_limit_idx, /* WQ occupancy limit */ + wq_state_idx, /* WQ state and occupancy state */ +}; + +#define WQ_MODE_SHARED 0 +#define WQ_MODE_DEDICATED 1 +#define WQ_PRIORITY_SHIFT 4 +#define WQ_BATCH_SZ_SHIFT 5 +#define WQ_STATE_SHIFT 30 +#define WQ_STATE_MASK 0x3 + +struct rte_idxd_grpcfg { + uint64_t grpwqcfg[4] __rte_cache_aligned; /* 64-byte register set */ + uint64_t grpengcfg; /* offset 32 */ + uint32_t grpflags; /* offset 40 */ +}; + +#define GENSTS_DEV_STATE_MASK 0x03 +#define CMDSTATUS_ACTIVE_SHIFT 31 +#define CMDSTATUS_ACTIVE_MASK (1 << 31) +#define CMDSTATUS_ERR_MASK 0xFF + #endif diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 8f1cdf6102..8473bf939f 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -6,6 +6,7 @@ #define _IDXD_INTERNAL_H_ #include <rte_dmadev_pmd.h> +#include <rte_spinlock.h> #include "idxd_hw_defs.h" @@ -28,6 +29,16 @@ extern int idxd_pmd_logtype; #define IDXD_PMD_ERR(fmt, args...) 
IDXD_PMD_LOG(ERR, fmt, ## args) #define IDXD_PMD_WARN(fmt, args...) IDXD_PMD_LOG(WARNING, fmt, ## args) +struct idxd_pci_common { + rte_spinlock_t lk; + + uint8_t wq_cfg_sz; + volatile struct rte_idxd_bar0 *regs; + volatile uint32_t *wq_regs_base; + volatile struct rte_idxd_grpcfg *grp_regs; + volatile void *portals; +}; + struct idxd_dmadev { /* counters to track the batches */ unsigned short max_batches; @@ -59,6 +70,8 @@ struct idxd_dmadev { struct { unsigned int dsa_id; } bus; + + struct idxd_pci_common *pci; } u; }; diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 79e4aadcab..6d26574917 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -3,6 +3,9 @@ */ #include <rte_bus_pci.h> +#include <rte_devargs.h> +#include <rte_dmadev_pmd.h> +#include <rte_malloc.h> #include "idxd_internal.h" @@ -16,28 +19,290 @@ const struct rte_pci_id pci_id_idxd_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static inline int +idxd_pci_dev_command(struct idxd_dmadev *idxd, enum rte_idxd_cmds command) +{ + uint8_t err_code; + uint16_t qid = idxd->qid; + int i = 0; + + if (command >= idxd_disable_wq && command <= idxd_reset_wq) + qid = (1 << qid); + rte_spinlock_lock(&idxd->u.pci->lk); + idxd->u.pci->regs->cmd = (command << IDXD_CMD_SHIFT) | qid; + + do { + rte_pause(); + err_code = idxd->u.pci->regs->cmdstatus; + if (++i >= 1000) { + IDXD_PMD_ERR("Timeout waiting for command response from HW"); + rte_spinlock_unlock(&idxd->u.pci->lk); + return err_code; + } + } while (err_code & CMDSTATUS_ACTIVE_MASK); + rte_spinlock_unlock(&idxd->u.pci->lk); + + err_code &= CMDSTATUS_ERR_MASK; + return -err_code; +} + +static uint32_t * +idxd_get_wq_cfg(struct idxd_pci_common *pci, uint8_t wq_idx) +{ + return RTE_PTR_ADD(pci->wq_regs_base, + (uintptr_t)wq_idx << (5 + pci->wq_cfg_sz)); +} + +static int +idxd_is_wq_enabled(struct idxd_dmadev *idxd) +{ + uint32_t state = idxd_get_wq_cfg(idxd->u.pci, idxd->qid)[wq_state_idx]; + return ((state >> 
WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; +} + +static int +idxd_pci_dev_close(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + /* disable the device */ + err_code = idxd_pci_dev_command(idxd, idxd_disable_dev); + if (err_code) { + IDXD_PMD_ERR("Error disabling device: code %#x", err_code); + return err_code; + } + IDXD_PMD_DEBUG("IDXD Device disabled OK"); + + /* free device memory */ + IDXD_PMD_DEBUG("Freeing device driver memory"); + rte_free(idxd->batch_idx_ring); + + return 0; +} + +static const struct rte_dma_dev_ops idxd_pci_ops = { + .dev_close = idxd_pci_dev_close, +}; + +/* each portal uses 4 x 4k pages */ +#define IDXD_PORTAL_SIZE (4096 * 4) + +static int +init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, + unsigned int max_queues) +{ + struct idxd_pci_common *pci; + uint8_t nb_groups, nb_engines, nb_wqs; + uint16_t grp_offset, wq_offset; /* how far into bar0 the regs are */ + uint16_t wq_size, total_wq_size; + uint8_t lg2_max_batch, lg2_max_copy_size; + unsigned int i, err_code; + + pci = malloc(sizeof(*pci)); + if (pci == NULL) { + IDXD_PMD_ERR("%s: Can't allocate memory", __func__); + err_code = -1; + goto err; + } + rte_spinlock_init(&pci->lk); + + /* assign the bar registers, and then configure device */ + pci->regs = dev->mem_resource[0].addr; + grp_offset = (uint16_t)pci->regs->offsets[0]; + pci->grp_regs = RTE_PTR_ADD(pci->regs, grp_offset * 0x100); + wq_offset = (uint16_t)(pci->regs->offsets[0] >> 16); + pci->wq_regs_base = RTE_PTR_ADD(pci->regs, wq_offset * 0x100); + pci->portals = dev->mem_resource[2].addr; + pci->wq_cfg_sz = (pci->regs->wqcap >> 24) & 0x0F; + + /* sanity check device status */ + if (pci->regs->gensts & GENSTS_DEV_STATE_MASK) { + /* need function-level-reset (FLR) or is enabled */ + IDXD_PMD_ERR("Device status is not disabled, cannot init"); + err_code = -1; + goto err; + } + if (pci->regs->cmdstatus & CMDSTATUS_ACTIVE_MASK) { + /* command in progress 
*/ + IDXD_PMD_ERR("Device has a command in progress, cannot init"); + err_code = -1; + goto err; + } + + /* read basic info about the hardware for use when configuring */ + nb_groups = (uint8_t)pci->regs->grpcap; + nb_engines = (uint8_t)pci->regs->engcap; + nb_wqs = (uint8_t)(pci->regs->wqcap >> 16); + total_wq_size = (uint16_t)pci->regs->wqcap; + lg2_max_copy_size = (uint8_t)(pci->regs->gencap >> 16) & 0x1F; + lg2_max_batch = (uint8_t)(pci->regs->gencap >> 21) & 0x0F; + + IDXD_PMD_DEBUG("nb_groups = %u, nb_engines = %u, nb_wqs = %u", + nb_groups, nb_engines, nb_wqs); + + /* zero out any old config */ + for (i = 0; i < nb_groups; i++) { + pci->grp_regs[i].grpengcfg = 0; + pci->grp_regs[i].grpwqcfg[0] = 0; + } + for (i = 0; i < nb_wqs; i++) + idxd_get_wq_cfg(pci, i)[0] = 0; + + /* limit queues if necessary */ + if (max_queues != 0 && nb_wqs > max_queues) { + nb_wqs = max_queues; + if (nb_engines > max_queues) + nb_engines = max_queues; + if (nb_groups > max_queues) + nb_groups = max_queues; + IDXD_PMD_DEBUG("Limiting queues to %u", nb_wqs); + } + + /* put each engine into a separate group to avoid reordering */ + if (nb_groups > nb_engines) + nb_groups = nb_engines; + if (nb_groups < nb_engines) + nb_engines = nb_groups; + + /* assign engines to groups, round-robin style */ + for (i = 0; i < nb_engines; i++) { + IDXD_PMD_DEBUG("Assigning engine %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpengcfg |= (1ULL << i); + } + + /* now do the same for queues and give work slots to each queue */ + wq_size = total_wq_size / nb_wqs; + IDXD_PMD_DEBUG("Work queue size = %u, max batch = 2^%u, max copy = 2^%u", + wq_size, lg2_max_batch, lg2_max_copy_size); + for (i = 0; i < nb_wqs; i++) { + /* add work queue "i" to a group */ + IDXD_PMD_DEBUG("Assigning work queue %u to group %u", + i, i % nb_groups); + pci->grp_regs[i % nb_groups].grpwqcfg[0] |= (1ULL << i); + /* now configure it, in terms of size, max batch, mode */ + idxd_get_wq_cfg(pci, i)[wq_size_idx] = 
wq_size; + idxd_get_wq_cfg(pci, i)[wq_mode_idx] = (1 << WQ_PRIORITY_SHIFT) | + WQ_MODE_DEDICATED; + idxd_get_wq_cfg(pci, i)[wq_sizes_idx] = lg2_max_copy_size | + (lg2_max_batch << WQ_BATCH_SZ_SHIFT); + } + + /* dump the group configuration to output */ + for (i = 0; i < nb_groups; i++) { + IDXD_PMD_DEBUG("## Group %d", i); + IDXD_PMD_DEBUG(" GRPWQCFG: %"PRIx64, pci->grp_regs[i].grpwqcfg[0]); + IDXD_PMD_DEBUG(" GRPENGCFG: %"PRIx64, pci->grp_regs[i].grpengcfg); + IDXD_PMD_DEBUG(" GRPFLAGS: %"PRIx32, pci->grp_regs[i].grpflags); + } + + idxd->u.pci = pci; + idxd->max_batches = wq_size; + + /* enable the device itself */ + err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); + if (err_code) { + IDXD_PMD_ERR("Error enabling device: code %#x", err_code); + goto err; + } + IDXD_PMD_DEBUG("IDXD Device enabled OK"); + + return nb_wqs; + +err: + free(pci); + return err_code; +} + static int idxd_dmadev_probe_pci(struct rte_pci_driver *drv, struct rte_pci_device *dev) { - int ret = 0; + struct idxd_dmadev idxd = {0}; + uint8_t nb_wqs; + int qid, ret = 0; char name[PCI_PRI_STR_SIZE]; + unsigned int max_queues = 0; rte_pci_device_name(&dev->addr, name, sizeof(name)); IDXD_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); dev->device.driver = &drv->driver; + if (dev->device.devargs && dev->device.devargs->args[0] != '\0') { + /* if the number of devargs grows beyond just 1, use rte_kvargs */ + if (sscanf(dev->device.devargs->args, + "max_queues=%u", &max_queues) != 1) { + IDXD_PMD_ERR("Invalid device parameter: '%s'", + dev->device.devargs->args); + return -1; + } + } + + ret = init_pci_device(dev, &idxd, max_queues); + if (ret < 0) { + IDXD_PMD_ERR("Error initializing PCI hardware"); + return ret; + } + if (idxd.u.pci->portals == NULL) { + IDXD_PMD_ERR("Error, invalid portal assigned during initialization\n"); + free(idxd.u.pci); + return -EINVAL; + } + nb_wqs = (uint8_t)ret; + + /* set up one device for each queue */ + for (qid = 0; qid < nb_wqs; qid++) { + 
char qname[32]; + + /* add the queue number to each device name */ + snprintf(qname, sizeof(qname), "%s-q%d", name, qid); + idxd.qid = qid; + idxd.portal = RTE_PTR_ADD(idxd.u.pci->portals, + qid * IDXD_PORTAL_SIZE); + if (idxd_is_wq_enabled(&idxd)) + IDXD_PMD_ERR("Error, WQ %u seems enabled", qid); + ret = idxd_dmadev_create(qname, &dev->device, + &idxd, &idxd_pci_ops); + if (ret != 0) { + IDXD_PMD_ERR("Failed to create dmadev %s", qname); + if (qid == 0) /* if no devices using this, free pci */ + free(idxd.u.pci); + return ret; + } + } + + return 0; +} + +static int +idxd_dmadev_destroy(const char *name) +{ + int ret = 0; + + /* rte_dma_close is called by pmd_release */ + ret = rte_dma_pmd_release(name); + if (ret) + IDXD_PMD_DEBUG("Device cleanup failed"); + return ret; } static int idxd_dmadev_remove_pci(struct rte_pci_device *dev) { + int i = 0; char name[PCI_PRI_STR_SIZE]; rte_pci_device_name(&dev->addr, name, sizeof(name)); - IDXD_PMD_INFO("Closing %s on NUMA node %d", - name, dev->device.numa_node); + IDXD_PMD_INFO("Closing %s on NUMA node %d", name, dev->device.numa_node); + + RTE_DMA_FOREACH_DEV(i) { + struct rte_dma_info info = {0}; + rte_dma_info_get(i, &info); + if (strncmp(name, info.dev_name, strlen(name)) == 0) + idxd_dmadev_destroy(info.dev_name); + } return 0; } -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
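Editorial note on the capability decoding in init_pci_device() above: the following is a minimal standalone sketch, assuming the same bit positions the patch reads from GENCAP, WQCAP, GRPCAP and ENGCAP. The names idxd_caps_sketch, decode_caps_sketch and wq_size_per_queue are hypothetical and are not part of the driver; the register values are passed in as plain integers rather than read from BAR0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields; not a driver structure. */
struct idxd_caps_sketch {
	uint8_t nb_groups;         /* low byte of GRPCAP */
	uint8_t nb_engines;        /* low byte of ENGCAP */
	uint8_t nb_wqs;            /* WQCAP bits 16..23 */
	uint16_t total_wq_size;    /* WQCAP bits 0..15 */
	uint8_t wq_cfg_sz;         /* WQCAP bits 24..27 */
	uint8_t lg2_max_copy_size; /* GENCAP bits 16..20 */
	uint8_t lg2_max_batch;     /* GENCAP bits 21..24 */
};

/* Decode the fields using the same shifts and masks as the patch. */
static struct idxd_caps_sketch
decode_caps_sketch(uint64_t gencap, uint64_t wqcap, uint64_t grpcap,
		uint64_t engcap)
{
	struct idxd_caps_sketch c;

	c.nb_groups = (uint8_t)grpcap;
	c.nb_engines = (uint8_t)engcap;
	c.nb_wqs = (uint8_t)(wqcap >> 16);
	c.total_wq_size = (uint16_t)wqcap;
	c.wq_cfg_sz = (uint8_t)((wqcap >> 24) & 0x0F);
	c.lg2_max_copy_size = (uint8_t)(gencap >> 16) & 0x1F;
	c.lg2_max_batch = (uint8_t)(gencap >> 21) & 0x0F;
	return c;
}

/* Work slots are split evenly between the queues in use. */
static uint16_t
wq_size_per_queue(uint16_t total_wq_size, uint8_t nb_wqs)
{
	assert(nb_wqs != 0); /* the division below needs at least one queue */
	return (uint16_t)(total_wq_size / nb_wqs);
}
```

With a WQCAP reporting 8 queues and 128 total slots, each queue would get 16 slots under this even split, which is also the value the patch stores in max_batches.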
* [dpdk-dev] [PATCH v11 06/16] dma/idxd: add datapath structures 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz ` (4 preceding siblings ...) 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 05/16] dma/idxd: create dmadev instances on pci probe Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 07/16] dma/idxd: add configure and info_get functions Kevin Laatz ` (10 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data structures required for the data path for IDXD devices. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 33 +++++++++++++++++++++++++ drivers/dma/idxd/idxd_hw_defs.h | 41 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 4 ++++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 81 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index f5bd10191a..2d5490b2df 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -96,6 +96,7 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, + .dev_dump = idxd_dump, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 08ed3e4998..46598c368c 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -10,6 +10,35 @@ #define IDXD_PMD_NAME_STR "dmadev_idxd" +int +idxd_dump(const struct rte_dma_dev *dev, FILE *f) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + unsigned int i; + + fprintf(f, "== IDXD Private Data ==\n"); + 
fprintf(f, " Portal: %p\n", idxd->portal); + fprintf(f, " Config: { ring_size: %u }\n", + idxd->qcfg.nb_desc); + fprintf(f, " Batch ring (sz = %u, max_batches = %u):\n\t", + idxd->max_batches + 1, idxd->max_batches); + for (i = 0; i <= idxd->max_batches; i++) { + fprintf(f, " %u ", idxd->batch_idx_ring[i]); + if (i == idxd->batch_idx_read && i == idxd->batch_idx_write) + fprintf(f, "[rd ptr, wr ptr] "); + else if (i == idxd->batch_idx_read) + fprintf(f, "[rd ptr] "); + else if (i == idxd->batch_idx_write) + fprintf(f, "[wr ptr] "); + if (i == idxd->max_batches) + fprintf(f, "\n"); + } + + fprintf(f, " Curr batch: start = %u, size = %u\n", idxd->batch_start, idxd->batch_size); + fprintf(f, " IDS: avail = %u, returned: %u\n", idxd->ids_avail, idxd->ids_returned); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, @@ -19,6 +48,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, struct rte_dma_dev *dmadev = NULL; int ret = 0; + RTE_BUILD_BUG_ON(sizeof(struct idxd_hw_desc) != 64); + RTE_BUILD_BUG_ON(offsetof(struct idxd_hw_desc, size) != 32); + RTE_BUILD_BUG_ON(sizeof(struct idxd_completion) != 32); + if (!name) { IDXD_PMD_ERR("Invalid name of the device!"); ret = -EINVAL; diff --git a/drivers/dma/idxd/idxd_hw_defs.h b/drivers/dma/idxd/idxd_hw_defs.h index 86f7f3526b..55ca9f7f52 100644 --- a/drivers/dma/idxd/idxd_hw_defs.h +++ b/drivers/dma/idxd/idxd_hw_defs.h @@ -5,6 +5,47 @@ #ifndef _IDXD_HW_DEFS_H_ #define _IDXD_HW_DEFS_H_ +/* + * Defines used in the data path for interacting with IDXD hardware. 
+ */ +#define IDXD_CMD_OP_SHIFT 24 +enum rte_idxd_ops { + idxd_op_nop = 0, + idxd_op_batch, + idxd_op_drain, + idxd_op_memmove, + idxd_op_fill +}; + +#define IDXD_FLAG_FENCE (1 << 0) +#define IDXD_FLAG_COMPLETION_ADDR_VALID (1 << 2) +#define IDXD_FLAG_REQUEST_COMPLETION (1 << 3) +#define IDXD_FLAG_CACHE_CONTROL (1 << 8) + +/** + * Hardware descriptor used by DSA hardware, for both bursts and + * for individual operations. + */ +struct idxd_hw_desc { + uint32_t pasid; + uint32_t op_flags; + rte_iova_t completion; + + RTE_STD_C11 + union { + rte_iova_t src; /* source address for copy ops etc. */ + rte_iova_t desc_addr; /* descriptor pointer for batch */ + }; + rte_iova_t dst; + + uint32_t size; /* length of data for op, or batch size */ + + uint16_t intr_handle; /* completion interrupt handle */ + + /* remaining 26 bytes are reserved */ + uint16_t __reserved[13]; +} __rte_aligned(64); + #define IDXD_COMP_STATUS_INCOMPLETE 0 #define IDXD_COMP_STATUS_SUCCESS 1 #define IDXD_COMP_STATUS_INVALID_OPCODE 0x10 diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 8473bf939f..5e253fdfbc 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -40,6 +40,8 @@ struct idxd_pci_common { }; struct idxd_dmadev { + struct idxd_hw_desc *desc_ring; + /* counters to track the batches */ unsigned short max_batches; unsigned short batch_idx_read; @@ -63,6 +65,7 @@ struct idxd_dmadev { unsigned short max_batch_size; struct rte_dma_dev *dmadev; + struct rte_dma_vchan_conf qcfg; uint8_t sva_support; uint8_t qid; @@ -77,5 +80,6 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); +int idxd_dump(const struct rte_dma_dev *dev, FILE *f); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 6d26574917..0b3a6ee4bc 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ 
b/drivers/dma/idxd/idxd_pci.c @@ -77,12 +77,14 @@ idxd_pci_dev_close(struct rte_dma_dev *dev) /* free device memory */ IDXD_PMD_DEBUG("Freeing device driver memory"); rte_free(idxd->batch_idx_ring); + rte_free(idxd->desc_ring); return 0; } static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_close = idxd_pci_dev_close, + .dev_dump = idxd_dump, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
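Editorial note on the descriptor layout above: the RTE_BUILD_BUG_ON checks in idxd_dmadev_create() can be reproduced outside DPDK with a small sketch. It assumes a 64-bit IOVA type (iova_sketch_t stands in for rte_iova_t) and the GCC/Clang aligned attribute in place of __rte_aligned; hw_desc_sketch and desc_op_flags are hypothetical names, not driver symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t iova_sketch_t; /* assumption: stand-in for rte_iova_t */

#define SKETCH_CMD_OP_SHIFT 24  /* same value as IDXD_CMD_OP_SHIFT */

/* Field-for-field mirror of struct idxd_hw_desc from the patch. */
struct hw_desc_sketch {
	uint32_t pasid;
	uint32_t op_flags;
	iova_sketch_t completion;
	union {
		iova_sketch_t src;       /* source address for copy ops */
		iova_sketch_t desc_addr; /* descriptor pointer for batch */
	};
	iova_sketch_t dst;
	uint32_t size;           /* data length, or number of batch entries */
	uint16_t intr_handle;
	uint16_t reserved[13];   /* remaining 26 bytes */
} __attribute__((aligned(64)));

/* Compile-time equivalents of the RTE_BUILD_BUG_ON checks. */
static_assert(sizeof(struct hw_desc_sketch) == 64, "descriptor must be 64 bytes");
static_assert(offsetof(struct hw_desc_sketch, size) == 32, "size must sit at offset 32");

/* The opcode lives in the top byte of op_flags; flags occupy the low bits. */
static uint32_t
desc_op_flags(uint32_t opcode, uint32_t flags)
{
	return (opcode << SKETCH_CMD_OP_SHIFT) | flags;
}
```

The offset-32 check matters because the data path in this series writes each descriptor as two 32-byte stores, the second one starting at the size field.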
* [dpdk-dev] [PATCH v11 07/16] dma/idxd: add configure and info_get functions 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz ` (5 preceding siblings ...) 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 06/16] dma/idxd: add datapath structures Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz ` (9 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add functions for device configuration. The info_get function is included here since it can be useful for checking successful configuration. Documentation is also updated to add device configuration usage info. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- doc/guides/dmadevs/idxd.rst | 15 +++++++ drivers/dma/idxd/idxd_bus.c | 3 ++ drivers/dma/idxd/idxd_common.c | 71 ++++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 6 +++ drivers/dma/idxd/idxd_pci.c | 3 ++ 5 files changed, 98 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index ce33e2857a..62ffd39ee0 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -120,3 +120,18 @@ use a subset of configured queues. Once probed successfully, irrespective of kernel driver, the device will appear as a ``dmadev``, that is a "DMA device type" inside DPDK, and can be accessed using APIs from the ``rte_dmadev`` library. + +Using IDXD DMAdev Devices +-------------------------- + +To use the devices from an application, the dmadev API can be used. 
+ +Device Configuration +~~~~~~~~~~~~~~~~~~~~~ + +IDXD configuration requirements: + +* ``ring_size`` must be a power of two, between 64 and 4096. +* Only one ``vchan`` is supported per device (work queue). +* IDXD devices do not support silent mode. +* The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 2d5490b2df..971fe34b88 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -97,6 +97,9 @@ idxd_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_close = idxd_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 46598c368c..70d094e3a2 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -39,6 +39,77 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + + if (size < sizeof(*info)) + return -EINVAL; + + *info = (struct rte_dma_info) { + .dev_capa = RTE_DMA_CAPA_MEM_TO_MEM | RTE_DMA_CAPA_HANDLES_ERRORS | + RTE_DMA_CAPA_OPS_COPY | RTE_DMA_CAPA_OPS_FILL, + .max_vchans = 1, + .max_desc = 4096, + .min_desc = 64, + }; + if (idxd->sva_support) + info->dev_capa |= RTE_DMA_CAPA_SVA; + return 0; +} + +int +idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz) +{ + if (sizeof(struct rte_dma_conf) != conf_sz) + return -EINVAL; + + if (dev_conf->nb_vchans != 1) + return -EINVAL; + return 0; +} + +int +idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz) +{ + struct idxd_dmadev *idxd = 
dev->fp_obj->dev_private; + uint16_t max_desc = qconf->nb_desc; + + if (sizeof(struct rte_dma_vchan_conf) != qconf_sz) + return -EINVAL; + + idxd->qcfg = *qconf; + + if (!rte_is_power_of_2(max_desc)) + max_desc = rte_align32pow2(max_desc); + IDXD_PMD_DEBUG("DMA dev %u using %u descriptors", dev->data->dev_id, max_desc); + idxd->desc_ring_mask = max_desc - 1; + idxd->qcfg.nb_desc = max_desc; + + /* in case we are reconfiguring a device, free any existing memory */ + rte_free(idxd->desc_ring); + + /* allocate the descriptor ring at 2x size as batches can't wrap */ + idxd->desc_ring = rte_zmalloc(NULL, sizeof(*idxd->desc_ring) * max_desc * 2, 0); + if (idxd->desc_ring == NULL) + return -ENOMEM; + idxd->desc_iova = rte_mem_virt2iova(idxd->desc_ring); + + idxd->batch_idx_read = 0; + idxd->batch_idx_write = 0; + idxd->batch_start = 0; + idxd->batch_size = 0; + idxd->ids_returned = 0; + idxd->ids_avail = 0; + + memset(idxd->batch_comp_ring, 0, sizeof(*idxd->batch_comp_ring) * + (idxd->max_batches + 1)); + return 0; +} + int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 5e253fdfbc..1dbe31abcd 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -81,5 +81,11 @@ struct idxd_dmadev { int idxd_dmadev_create(const char *name, struct rte_device *dev, const struct idxd_dmadev *base_idxd, const struct rte_dma_dev_ops *ops); int idxd_dump(const struct rte_dma_dev *dev, FILE *f); +int idxd_configure(struct rte_dma_dev *dev, const struct rte_dma_conf *dev_conf, + uint32_t conf_sz); +int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, + const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); +int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, + uint32_t size); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 
0b3a6ee4bc..c9e193a11d 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -85,6 +85,9 @@ idxd_pci_dev_close(struct rte_dma_dev *dev) static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_close = idxd_pci_dev_close, .dev_dump = idxd_dump, + .dev_configure = idxd_configure, + .vchan_setup = idxd_vchan_setup, + .dev_info_get = idxd_info_get, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
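Editorial note on the ring sizing in idxd_vchan_setup() above: the requested descriptor count is rounded up to a power of two and the ring mask is derived from it. A minimal sketch of that arithmetic follows, assuming the usual bit-smearing round-up; align32pow2_sketch is a local reimplementation for illustration, not the DPDK rte_align32pow2() itself, and ring_mask_sketch and ring_slot_sketch are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next power of two (x itself if already one). */
static uint32_t
align32pow2_sketch(uint32_t x)
{
	assert(x != 0); /* zero is not a meaningful ring size */
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	return x + 1;
}

/* Mask used to turn a free-running job id into a ring slot. */
static uint16_t
ring_mask_sketch(uint16_t nb_desc)
{
	return (uint16_t)(align32pow2_sketch(nb_desc) - 1);
}

/* Slot for a given job id; ids are allowed to overflow and wrap. */
static uint16_t
ring_slot_sketch(uint16_t job_id, uint16_t mask)
{
	return job_id & mask;
}
```

For example, a request for 1000 descriptors would be rounded up to 1024 with a mask of 1023. The documentation added in this patch asks for a power-of-two ring_size between 64 and 4096, so an application that follows it never triggers the rounding.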
* [dpdk-dev] [PATCH v11 08/16] dma/idxd: add start and stop functions for pci devices 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz ` (6 preceding siblings ...) 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 07/16] dma/idxd: add configure and info_get functions Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 09/16] dma/idxd: add data-path job submission functions Kevin Laatz ` (8 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add device start/stop functions for DSA devices bound to vfio. For devices bound to the IDXD kernel driver, these are not required since the IDXD kernel driver takes care of this. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 3 +++ drivers/dma/idxd/idxd_pci.c | 51 +++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 62ffd39ee0..711890bd9e 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -135,3 +135,6 @@ IDXD configuration requirements: * Only one ``vchan`` is supported per device (work queue). * IDXD devices do not support silent mode. * The transfer direction must be set to ``RTE_DMA_DIR_MEM_TO_MEM`` to copy from memory to memory. + +Once configured, the device can then be made ready for use by calling the +``rte_dma_start()`` API. 
diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index c9e193a11d..58760d2e74 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -60,6 +60,55 @@ idxd_is_wq_enabled(struct idxd_dmadev *idxd) return ((state >> WQ_STATE_SHIFT) & WQ_STATE_MASK) == 0x1; } +static int +idxd_pci_dev_stop(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + if (!idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Work queue %d already disabled", idxd->qid); + return 0; + } + + err_code = idxd_pci_dev_command(idxd, idxd_disable_wq); + if (err_code || idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed disabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d disabled OK", idxd->qid); + + return 0; +} + +static int +idxd_pci_dev_start(struct rte_dma_dev *dev) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint8_t err_code; + + if (idxd_is_wq_enabled(idxd)) { + IDXD_PMD_WARN("WQ %d already enabled", idxd->qid); + return 0; + } + + if (idxd->desc_ring == NULL) { + IDXD_PMD_ERR("WQ %d has not been fully configured", idxd->qid); + return -EINVAL; + } + + err_code = idxd_pci_dev_command(idxd, idxd_enable_wq); + if (err_code || !idxd_is_wq_enabled(idxd)) { + IDXD_PMD_ERR("Failed enabling work queue %d, error code: %#x", + idxd->qid, err_code); + return err_code == 0 ? -1 : -err_code; + } + IDXD_PMD_DEBUG("Work queue %d enabled OK", idxd->qid); + + return 0; +} + static int idxd_pci_dev_close(struct rte_dma_dev *dev) { @@ -88,6 +137,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .dev_start = idxd_pci_dev_start, + .dev_stop = idxd_pci_dev_stop, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
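Editorial note on the start/stop path above: both functions rely on the command word built in idxd_pci_dev_command() and on the state bits tested by idxd_is_wq_enabled(). The sketch below reproduces those two pieces as pure functions, with the enum values and shifts copied from the series; cmd_word_sketch and wq_state_is_enabled_sketch are hypothetical names, and no hardware register is touched.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CMD_SHIFT 20      /* same value as IDXD_CMD_SHIFT */
#define SKETCH_WQ_STATE_SHIFT 30 /* same value as WQ_STATE_SHIFT */
#define SKETCH_WQ_STATE_MASK 0x3 /* same value as WQ_STATE_MASK */

/* Command codes in the order given by enum rte_idxd_cmds. */
enum sketch_cmds {
	sketch_enable_dev = 1,
	sketch_disable_dev,
	sketch_drain_all,
	sketch_abort_all,
	sketch_reset_device,
	sketch_enable_wq,
	sketch_disable_wq,
	sketch_drain_wq,
	sketch_abort_wq,
	sketch_reset_wq,
};

/*
 * Build the value written to the cmd register. As in the posted patch,
 * commands from disable_wq to reset_wq take a bitmask of queues, while
 * the others (including enable_wq) take the queue index directly.
 */
static uint32_t
cmd_word_sketch(enum sketch_cmds command, uint16_t qid)
{
	if (command >= sketch_disable_wq && command <= sketch_reset_wq) {
		assert(qid < 16); /* the mask must fit in 16 bits */
		qid = (uint16_t)(1u << qid);
	}
	return ((uint32_t)command << SKETCH_CMD_SHIFT) | qid;
}

/* A work queue counts as enabled when its 2-bit state field equals 1. */
static int
wq_state_is_enabled_sketch(uint32_t wq_state_reg)
{
	return ((wq_state_reg >> SKETCH_WQ_STATE_SHIFT) & SKETCH_WQ_STATE_MASK) == 0x1;
}
```

For queue 2, the disable command would therefore be encoded as 0x700004 and the enable command as 0x600002.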
* [dpdk-dev] [PATCH v11 09/16] dma/idxd: add data-path job submission functions 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz ` (7 preceding siblings ...) 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 08/16] dma/idxd: add start and stop functions for pci devices Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 10/16] dma/idxd: add data-path job completion functions Kevin Laatz ` (7 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add data path functions for enqueuing and submitting operations to DSA devices. Documentation updates are included for dmadev library and IDXD driver docs as appropriate. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- doc/guides/dmadevs/idxd.rst | 9 +++ doc/guides/prog_guide/dmadev.rst | 19 +++++ drivers/dma/idxd/idxd_common.c | 135 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 ++ drivers/dma/idxd/meson.build | 1 + 5 files changed, 169 insertions(+) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index 711890bd9e..d548c4751a 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -138,3 +138,12 @@ IDXD configuration requirements: Once configured, the device can then be made ready for use by calling the ``rte_dma_start()`` API. + +Performing Data Copies +~~~~~~~~~~~~~~~~~~~~~~~ + +Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library +documentation for details on operation enqueue and submission API usage. 
+ +It is expected that, for efficiency reasons, a burst of operations will be enqueued to the +device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index 32f7147862..30734f3a36 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -67,6 +67,8 @@ can be used to get the device info and supported features. Silent mode is a special device capability which does not require the application to invoke dequeue APIs. +.. _dmadev_enqueue_dequeue: + Enqueue / Dequeue APIs ~~~~~~~~~~~~~~~~~~~~~~ @@ -80,6 +82,23 @@ The ``rte_dma_submit`` API is used to issue doorbell to hardware. Alternatively the ``RTE_DMA_OP_FLAG_SUBMIT`` flag can be passed to the enqueue APIs to also issue the doorbell to hardware. +The following code demonstrates how to enqueue a burst of copies to the +device and start the hardware processing of them: + +.. code-block:: C + + struct rte_mbuf *srcs[DMA_BURST_SZ], *dsts[DMA_BURST_SZ]; + unsigned int i; + + for (i = 0; i < RTE_DIM(srcs); i++) { + if (rte_dma_copy(dev_id, vchan, rte_pktmbuf_iova(srcs[i]), + rte_pktmbuf_iova(dsts[i]), COPY_LEN, 0) < 0) { + PRINT_ERR("Error with rte_dma_copy for buffer %u\n", i); + return -1; + } + } + rte_dma_submit(dev_id, vchan); + There are two dequeue APIs ``rte_dma_completed`` and ``rte_dma_completed_status``, these are used to obtain the results of the enqueue requests. 
``rte_dma_completed`` will return the number of successfully diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 70d094e3a2..616829c962 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -2,14 +2,145 @@ * Copyright 2021 Intel Corporation */ +#include <x86intrin.h> + #include <rte_malloc.h> #include <rte_common.h> #include <rte_log.h> +#include <rte_prefetch.h> #include "idxd_internal.h" #define IDXD_PMD_NAME_STR "dmadev_idxd" +static __rte_always_inline rte_iova_t +__desc_idx_to_iova(struct idxd_dmadev *idxd, uint16_t n) +{ + return idxd->desc_iova + (n * sizeof(struct idxd_hw_desc)); +} + +static __rte_always_inline void +__idxd_movdir64b(volatile void *dst, const struct idxd_hw_desc *src) +{ + asm volatile (".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : + : "a" (dst), "d" (src) + : "memory"); +} + +static __rte_always_inline void +__submit(struct idxd_dmadev *idxd) +{ + rte_prefetch1(&idxd->batch_comp_ring[idxd->batch_idx_read]); + + if (idxd->batch_size == 0) + return; + + /* write completion to batch comp ring */ + rte_iova_t comp_addr = idxd->batch_iova + + (idxd->batch_idx_write * sizeof(struct idxd_completion)); + + if (idxd->batch_size == 1) { + /* submit batch directly */ + struct idxd_hw_desc desc = + idxd->desc_ring[idxd->batch_start & idxd->desc_ring_mask]; + desc.completion = comp_addr; + desc.op_flags |= IDXD_FLAG_REQUEST_COMPLETION; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &desc); + } else { + const struct idxd_hw_desc batch_desc = { + .op_flags = (idxd_op_batch << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_COMPLETION_ADDR_VALID | + IDXD_FLAG_REQUEST_COMPLETION, + .desc_addr = __desc_idx_to_iova(idxd, + idxd->batch_start & idxd->desc_ring_mask), + .completion = comp_addr, + .size = idxd->batch_size, + }; + _mm_sfence(); /* fence before writing desc to device */ + __idxd_movdir64b(idxd->portal, &batch_desc); + } + + if (++idxd->batch_idx_write > 
idxd->max_batches) + idxd->batch_idx_write = 0; + + idxd->batch_start += idxd->batch_size; + idxd->batch_size = 0; + idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; + _mm256_store_si256((void *)&idxd->batch_comp_ring[idxd->batch_idx_write], + _mm256_setzero_si256()); +} + +static __rte_always_inline int +__idxd_write_desc(struct idxd_dmadev *idxd, + const uint32_t op_flags, + const rte_iova_t src, + const rte_iova_t dst, + const uint32_t size, + const uint32_t flags) +{ + uint16_t mask = idxd->desc_ring_mask; + uint16_t job_id = idxd->batch_start + idxd->batch_size; + /* we never wrap batches, so we only mask the start and allow start+size to overflow */ + uint16_t write_idx = (idxd->batch_start & mask) + idxd->batch_size; + + /* first check batch ring space then desc ring space */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return -ENOSPC; + if (((write_idx + 1) & mask) == (idxd->ids_returned & mask)) + return -ENOSPC; + + /* write desc. Note: descriptors don't wrap, but the completion address does */ + const uint64_t op_flags64 = (uint64_t)(op_flags | IDXD_FLAG_COMPLETION_ADDR_VALID) << 32; + const uint64_t comp_addr = __desc_idx_to_iova(idxd, write_idx & mask); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx], + _mm256_set_epi64x(dst, src, comp_addr, op_flags64)); + _mm256_store_si256((void *)&idxd->desc_ring[write_idx].size, + _mm256_set_epi64x(0, 0, 0, size)); + + idxd->batch_size++; + + rte_prefetch0_write(&idxd->desc_ring[write_idx + 1]); + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) + __submit(idxd); + + return job_id; +} + +int +idxd_enqueue_copy(void *dev_private, uint16_t qid __rte_unused, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + /* we can take advantage of the fact that the fence flag in dmadev and DSA are the same, + * but check it at compile time to be sure. 
+ */ + RTE_BUILD_BUG_ON(RTE_DMA_OP_FLAG_FENCE != IDXD_FLAG_FENCE); + uint32_t memmove = (idxd_op_memmove << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev_private, memmove, src, dst, length, + flags); +} + +int +idxd_enqueue_fill(void *dev_private, uint16_t qid __rte_unused, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags) +{ + uint32_t fill = (idxd_op_fill << IDXD_CMD_OP_SHIFT) | + IDXD_FLAG_CACHE_CONTROL | (flags & IDXD_FLAG_FENCE); + return __idxd_write_desc(dev_private, fill, pattern, dst, length, + flags); +} + +int +idxd_submit(void *dev_private, uint16_t qid __rte_unused) +{ + __submit(dev_private); + return 0; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -139,6 +270,10 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->dev_ops = ops; dmadev->device = dev; + dmadev->fp_obj->copy = idxd_enqueue_copy; + dmadev->fp_obj->fill = idxd_enqueue_fill; + dmadev->fp_obj->submit = idxd_submit; + idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ idxd->dmadev = dmadev; diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 1dbe31abcd..ab4d71095e 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -87,5 +87,10 @@ int idxd_vchan_setup(struct rte_dma_dev *dev, uint16_t vchan, const struct rte_dma_vchan_conf *qconf, uint32_t qconf_sz); int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *dev_info, uint32_t size); +int idxd_enqueue_copy(void *dev_private, uint16_t qid, rte_iova_t src, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_enqueue_fill(void *dev_private, uint16_t qid, uint64_t pattern, + rte_iova_t dst, unsigned int length, uint64_t flags); +int idxd_submit(void *dev_private, uint16_t qid); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/meson.build b/drivers/dma/idxd/meson.build 
index 37af6e1b8f..fdfce81a94 100644
--- a/drivers/dma/idxd/meson.build
+++ b/drivers/dma/idxd/meson.build
@@ -5,6 +5,7 @@ build = dpdk_conf.has('RTE_ARCH_X86')
 reason = 'only supported on x86'
 
 deps += ['bus_pci']
+cflags += '-mavx2' # all platforms with idxd HW support AVX
 sources = files(
         'idxd_common.c',
         'idxd_pci.c'
-- 
2.30.2
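The batch ring bookkeeping in __submit() and __idxd_write_desc() can be modelled in isolation, which may help when reviewing the wrap and ring-full checks. The sketch below is standalone C, not driver code: batch_idx_next() and batch_ring_full() are hypothetical helper names, and it assumes, as the patch appears to, that valid batch indexes run from 0 to max_batches inclusive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: advance a batch-ring index.
 * Valid indexes are 0..max_batches inclusive, so the ring has
 * max_batches + 1 slots and wraps back to 0 after max_batches.
 */
static inline uint16_t
batch_idx_next(uint16_t idx, uint16_t max_batches)
{
	assert(idx <= max_batches);
	return (idx >= max_batches) ? 0 : (uint16_t)(idx + 1);
}

/* Hypothetical helper: the ring is full when advancing the write index
 * would make it equal to the read index. This is the same condition as
 * the two-part check in __idxd_write_desc():
 *   (read == 0 && write == max_batches) || write + 1 == read
 */
static inline bool
batch_ring_full(uint16_t rd, uint16_t wr, uint16_t max_batches)
{
	return batch_idx_next(wr, max_batches) == rd;
}
```

Expressing the full check through the same "next index" helper used for the wrap keeps the two pieces of logic from drifting apart, at the cost of one extra comparison.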
* [dpdk-dev] [PATCH v11 10/16] dma/idxd: add data-path job completion functions 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz ` (8 preceding siblings ...) 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 09/16] dma/idxd: add data-path job submission functions Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 11/16] dma/idxd: add operation statistic tracking Kevin Laatz ` (6 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add the data path functions for gathering completed operations. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- doc/guides/dmadevs/idxd.rst | 32 ++++- drivers/dma/idxd/idxd_common.c | 236 +++++++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 + 3 files changed, 272 insertions(+), 1 deletion(-) diff --git a/doc/guides/dmadevs/idxd.rst b/doc/guides/dmadevs/idxd.rst index d548c4751a..d4a210b854 100644 --- a/doc/guides/dmadevs/idxd.rst +++ b/doc/guides/dmadevs/idxd.rst @@ -143,7 +143,37 @@ Performing Data Copies ~~~~~~~~~~~~~~~~~~~~~~~ Refer to the :ref:`Enqueue / Dequeue APIs <dmadev_enqueue_dequeue>` section of the dmadev library -documentation for details on operation enqueue and submission API usage. +documentation for details on operation enqueue, submission and completion API usage. It is expected that, for efficiency reasons, a burst of operations will be enqueued to the device via multiple enqueue calls between calls to the ``rte_dma_submit()`` function. + +When gathering completions, ``rte_dma_completed()`` should be used, up until the point an error +occurs in an operation. 
If an error was encountered, ``rte_dma_completed_status()`` must be used +to restart the device so that it continues processing operations, and to gather the status of each +individual operation, which is written to the ``status`` array provided as a parameter by the +application. + +The following status codes are supported by IDXD: + +* ``RTE_DMA_STATUS_SUCCESSFUL``: The operation was successful. +* ``RTE_DMA_STATUS_INVALID_OPCODE``: The operation failed due to an invalid operation code. +* ``RTE_DMA_STATUS_INVALID_LENGTH``: The operation failed due to an invalid data length. +* ``RTE_DMA_STATUS_NOT_ATTEMPTED``: The operation was not attempted. +* ``RTE_DMA_STATUS_ERROR_UNKNOWN``: The operation failed due to an unspecified error. + +The following code shows how to retrieve the number of successfully completed +copies within a burst and then use ``rte_dma_completed_status()`` to check +which operation failed and restart the device to continue processing operations: + +.. code-block:: C + + enum rte_dma_status_code status[COMP_BURST_SZ]; + uint16_t count, idx, status_count; + bool error = false; + + count = rte_dma_completed(dev_id, vchan, COMP_BURST_SZ, &idx, &error); + + if (error) { + status_count = rte_dma_completed_status(dev_id, vchan, COMP_BURST_SZ, &idx, status); + } diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 616829c962..86056db02b 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -141,6 +141,240 @@ idxd_submit(void *dev_private, uint16_t qid __rte_unused) return 0; } +static enum rte_dma_status_code +get_comp_status(struct idxd_completion *c) +{ + uint8_t st = c->status; + switch (st) { + /* successful descriptors are not written back normally */ + case IDXD_COMP_STATUS_INCOMPLETE: + case IDXD_COMP_STATUS_SUCCESS: + return RTE_DMA_STATUS_SUCCESSFUL; + case IDXD_COMP_STATUS_INVALID_OPCODE: + return RTE_DMA_STATUS_INVALID_OPCODE; + case IDXD_COMP_STATUS_INVALID_SIZE: + return 
RTE_DMA_STATUS_INVALID_LENGTH; + case IDXD_COMP_STATUS_SKIPPED: + return RTE_DMA_STATUS_NOT_ATTEMPTED; + default: + return RTE_DMA_STATUS_ERROR_UNKNOWN; + } +} + +static __rte_always_inline int +batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t ret; + uint8_t bstatus; + + if (max_ops == 0) + return 0; + + /* first check if there are any unreturned handles from last time */ + if (idxd->ids_avail != idxd->ids_returned) { + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + + if (idxd->batch_idx_read == idxd->batch_idx_write) + return 0; + + bstatus = idxd->batch_comp_ring[idxd->batch_idx_read].status; + /* now check if next batch is complete and successful */ + if (bstatus == IDXD_COMP_STATUS_SUCCESS) { + /* since the batch idx ring stores the start of each batch, pre-increment to lookup + * start of next batch. + */ + if (++idxd->batch_idx_read > idxd->max_batches) + idxd->batch_idx_read = 0; + idxd->ids_avail = idxd->batch_idx_ring[idxd->batch_idx_read]; + + ret = RTE_MIN((uint16_t)(idxd->ids_avail - idxd->ids_returned), max_ops); + idxd->ids_returned += ret; + if (status) + memset(status, RTE_DMA_STATUS_SUCCESSFUL, ret * sizeof(*status)); + return ret; + } + /* check if batch is incomplete */ + else if (bstatus == IDXD_COMP_STATUS_INCOMPLETE) + return 0; + + return -1; /* error case */ +} + +static inline uint16_t +batch_completed(struct idxd_dmadev *idxd, uint16_t max_ops, bool *has_error) +{ + uint16_t i; + uint16_t b_start, b_end, next_batch; + + int ret = batch_ok(idxd, max_ops, NULL); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ * once we identify the actual failure job, return other jobs, then update + * the batch ring indexes to make it look like the first job of the batch has failed. + * Subsequent calls here will always return zero packets, and the error must be cleared by + * calling the completed_status() function. + */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + b_end = idxd->batch_idx_ring[next_batch]; + + if (b_end - b_start == 1) { /* not a batch */ + *has_error = true; + return 0; + } + + for (i = b_start; i < b_end; i++) { + struct idxd_completion *c = (void *)&idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) /* ignore incomplete(0) and success(1) */ + break; + } + ret = RTE_MIN((uint16_t)(i - idxd->ids_returned), max_ops); + if (ret < max_ops) + *has_error = true; /* we got up to the point of error */ + idxd->ids_avail = idxd->ids_returned += ret; + + /* to ensure we can call twice and just return 0, set start of batch to where we finished */ + idxd->batch_comp_ring[idxd->batch_idx_read].completed_size -= ret; + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (idxd->batch_idx_ring[next_batch] - idxd->batch_idx_ring[idxd->batch_idx_read] == 1) { + /* copy over the descriptor status to the batch ring as if no batch */ + uint16_t d_idx = idxd->batch_idx_ring[idxd->batch_idx_read] & idxd->desc_ring_mask; + struct idxd_completion *desc_comp = (void *)&idxd->desc_ring[d_idx]; + idxd->batch_comp_ring[idxd->batch_idx_read].status = desc_comp->status; + } + + return ret; +} + +static uint16_t +batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) +{ + uint16_t next_batch; + + int ret = batch_ok(idxd, max_ops, status); + if (ret >= 0) + return ret; + + /* ERROR case, not successful, not incomplete */ + /* Get the batch size, and special case size 1. 
+ */ + next_batch = (idxd->batch_idx_read + 1); + if (next_batch > idxd->max_batches) + next_batch = 0; + const uint16_t b_start = idxd->batch_idx_ring[idxd->batch_idx_read]; + const uint16_t b_end = idxd->batch_idx_ring[next_batch]; + const uint16_t b_len = b_end - b_start; + if (b_len == 1) {/* not a batch */ + *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + idxd->ids_avail++; + idxd->ids_returned++; + idxd->batch_idx_read = next_batch; + return 1; + } + + /* not a single-element batch, need to process more. + * Scenarios: + * 1. max_ops >= batch_size - can fit everything, simple case + * - loop through completed ops and then add on any not-attempted ones + * 2. max_ops < batch_size - can't fit everything, more complex case + * - loop through completed/incomplete and stop when hit max_ops + * - adjust the batch descriptor to update where we stopped, with appropriate bcount + * - if bcount is to be exactly 1, update the batch descriptor as it will be treated as + * non-batch next time. + */ + const uint16_t bcount = idxd->batch_comp_ring[idxd->batch_idx_read].completed_size; + for (ret = 0; ret < b_len && ret < max_ops; ret++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + } + idxd->ids_avail = idxd->ids_returned += ret; + + /* everything fit */ + if (ret == b_len) { + idxd->batch_idx_read = next_batch; + return ret; + } + + /* set up for next time, update existing batch descriptor & start idx at batch_idx_read */ + idxd->batch_idx_ring[idxd->batch_idx_read] += ret; + if (ret > bcount) { + /* we have only incomplete ones - set batch completed size to 0 */ + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size = 0; + /* if there is only one descriptor left, job skipped so set flag appropriately */ + if (b_len - ret == 1) + comp->status = IDXD_COMP_STATUS_SKIPPED; + } else { + struct idxd_completion *comp = &idxd->batch_comp_ring[idxd->batch_idx_read]; + comp->completed_size -= ret; + /* if there is only one descriptor left, copy status info straight to desc */ + if (comp->completed_size == 1) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; + comp->status = c->status; + /* individual descs can be ok without writeback, but not batches */ + if (comp->status == IDXD_COMP_STATUS_INCOMPLETE) + comp->status = IDXD_COMP_STATUS_SUCCESS; + } else if (bcount == b_len) { + /* check if we still have an error, and clear flag if not */ + uint16_t i; + for (i = b_start + ret; i < b_end; i++) { + struct idxd_completion *c = (void *) + &idxd->desc_ring[i & idxd->desc_ring_mask]; + if (c->status > IDXD_COMP_STATUS_SUCCESS) + break; + } + if (i == b_end) /* no errors */ + comp->status = IDXD_COMP_STATUS_SUCCESS; + } + } + + return ret; +} + +uint16_t +idxd_completed(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, bool *has_error) +{ + struct idxd_dmadev *idxd = dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed(idxd, max_ops - ret, has_error); + ret += batch; + } while (batch > 0 && *has_error == false); + + *last_idx = idxd->ids_returned - 1; 
+ return ret; +} + +uint16_t +idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, + uint16_t *last_idx, enum rte_dma_status_code *status) +{ + struct idxd_dmadev *idxd = dev_private; + uint16_t batch, ret = 0; + + do { + batch = batch_completed_status(idxd, max_ops - ret, &status[ret]); + ret += batch; + } while (batch > 0); + + *last_idx = idxd->ids_returned - 1; + return ret; +} + int idxd_dump(const struct rte_dma_dev *dev, FILE *f) { @@ -273,6 +507,8 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->fp_obj->copy = idxd_enqueue_copy; dmadev->fp_obj->fill = idxd_enqueue_fill; dmadev->fp_obj->submit = idxd_submit; + dmadev->fp_obj->completed = idxd_completed; + dmadev->fp_obj->completed_status = idxd_completed_status; idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index ab4d71095e..4208b0dee8 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -92,5 +92,10 @@ int idxd_enqueue_copy(void *dev_private, uint16_t qid, rte_iova_t src, int idxd_enqueue_fill(void *dev_private, uint16_t qid, uint64_t pattern, rte_iova_t dst, unsigned int length, uint64_t flags); int idxd_submit(void *dev_private, uint16_t qid); +uint16_t idxd_completed(void *dev_private, uint16_t qid, uint16_t max_ops, + uint16_t *last_idx, bool *has_error); +uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, + uint16_t max_ops, uint16_t *last_idx, + enum rte_dma_status_code *status); #endif /* _IDXD_INTERNAL_H_ */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
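The error path in batch_completed() scans the descriptors of a failed batch to find the first job that did not succeed, treating both "incomplete" (0) and "success" (1) as good. A minimal standalone sketch of that scan follows. The function name and the plain uint8_t status array are hypothetical stand-ins for the descriptor ring, and the loop uses `!=` rather than the patch's `<` so that it also behaves when the 16-bit job IDs wrap between start and end; that is a deliberate difference from the patch, not a description of it.

```c
#include <assert.h>
#include <stdint.h>

#define COMP_INCOMPLETE 0 /* no write-back seen for this descriptor */
#define COMP_SUCCESS    1 /* descriptor written back as successful */

/* Hypothetical helper: count the jobs in [start, end) that precede the
 * first failed one. start and end are free-running 16-bit job IDs and
 * ring_mask is (ring size - 1), with the ring size a power of two.
 * The caller must ensure the range is no longer than the ring.
 */
static uint16_t
jobs_before_first_error(const uint8_t *status, uint16_t start,
		uint16_t end, uint16_t ring_mask)
{
	uint16_t i;

	assert(status);
	for (i = start; i != end; i++) {
		/* anything above COMP_SUCCESS is an error code */
		if (status[i & ring_mask] > COMP_SUCCESS)
			break;
	}
	return (uint16_t)(i - start);
}
```

The returned count corresponds to the number of jobs that can be handed back as successful before the caller has to switch to the status-reporting path.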
* [dpdk-dev] [PATCH v11 11/16] dma/idxd: add operation statistic tracking 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz ` (9 preceding siblings ...) 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 10/16] dma/idxd: add data-path job completion functions Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 12/16] dma/idxd: add vchan status function Kevin Laatz ` (5 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add statistic tracking for DSA devices. The dmadev library documentation is also updated to add a generic section for using the library's statistics APIs. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- doc/guides/prog_guide/dmadev.rst | 11 +++++++++++ drivers/dma/idxd/idxd_bus.c | 2 ++ drivers/dma/idxd/idxd_common.c | 27 +++++++++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 5 +++++ drivers/dma/idxd/idxd_pci.c | 2 ++ 5 files changed, 47 insertions(+) diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst index 30734f3a36..77863f8028 100644 --- a/doc/guides/prog_guide/dmadev.rst +++ b/doc/guides/prog_guide/dmadev.rst @@ -107,3 +107,14 @@ completed operations along with the status of each operation (filled into the ``status`` array passed by user). These two APIs can also return the last completed operation's ``ring_idx`` which could help user track operations within their own application-defined rings. + + +Querying Device Statistics +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The statistics from a dmadev device can be got via the statistics functions, +i.e. ``rte_dma_stats_get()``. 
The statistics returned for each device instance are: + +* ``submitted``: The number of operations submitted to the device. +* ``completed``: The number of operations which have completed (successful and failed). +* ``errors``: The number of operations that completed with an error. diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 971fe34b88..78299daf5e 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -100,6 +100,8 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 86056db02b..60dbd87efb 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -65,6 +65,8 @@ __submit(struct idxd_dmadev *idxd) if (++idxd->batch_idx_write > idxd->max_batches) idxd->batch_idx_write = 0; + idxd->stats.submitted += idxd->batch_size; + idxd->batch_start += idxd->batch_size; idxd->batch_size = 0; idxd->batch_idx_ring[idxd->batch_idx_write] = idxd->batch_start; @@ -276,6 +278,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ const uint16_t b_len = b_end - b_start; if (b_len == 1) {/* not a batch */ *status = get_comp_status(&idxd->batch_comp_ring[idxd->batch_idx_read]); + if (*status != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; idxd->ids_avail++; idxd->ids_returned++; idxd->batch_idx_read = next_batch; @@ -297,6 +301,8 @@ batch_completed_status(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_ struct idxd_completion *c = (void *) &idxd->desc_ring[(b_start + ret) & idxd->desc_ring_mask]; status[ret] = (ret < bcount) ? 
get_comp_status(c) : RTE_DMA_STATUS_NOT_ATTEMPTED; + if (status[ret] != RTE_DMA_STATUS_SUCCESSFUL) + idxd->stats.errors++; } idxd->ids_avail = idxd->ids_returned += ret; @@ -355,6 +361,7 @@ idxd_completed(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, ret += batch; } while (batch > 0 && *has_error == false); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -371,6 +378,7 @@ idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max ret += batch; } while (batch > 0); + idxd->stats.completed += ret; *last_idx = idxd->ids_returned - 1; return ret; } @@ -404,6 +412,25 @@ idxd_dump(const struct rte_dma_dev *dev, FILE *f) return 0; } +int +idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + struct rte_dma_stats *stats, uint32_t stats_sz) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + if (stats_sz < sizeof(*stats)) + return -EINVAL; + *stats = idxd->stats; + return 0; +} + +int +idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan __rte_unused) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + idxd->stats = (struct rte_dma_stats){0}; + return 0; +} + int idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t size) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 4208b0dee8..a85a1fb79e 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -59,6 +59,8 @@ struct idxd_dmadev { struct idxd_completion *batch_comp_ring; unsigned short *batch_idx_ring; /* store where each batch ends */ + struct rte_dma_stats stats; + rte_iova_t batch_iova; /* base address of the batch comp ring */ rte_iova_t desc_iova; /* base address of desc ring, needed for completions */ @@ -97,5 +99,8 @@ uint16_t idxd_completed(void *dev_private, uint16_t qid, uint16_t max_ops, uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, uint16_t max_ops, uint16_t 
*last_idx, enum rte_dma_status_code *status); +int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, + struct rte_dma_stats *stats, uint32_t stats_sz); +int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 58760d2e74..c0b78021ad 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -137,6 +137,8 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .dev_configure = idxd_configure, .vchan_setup = idxd_vchan_setup, .dev_info_get = idxd_info_get, + .stats_get = idxd_stats_get, + .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, }; -- 2.30.2
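The three counters added by this patch (submitted, completed, errors) follow a simple accounting pattern: submissions add to one counter, every returned job adds to "completed", and each non-successful status adds to "errors". The standalone sketch below models that pattern outside the driver. struct dma_counters, enum job_status and the helper names are hypothetical, and in_flight() is an application-side convenience assumed here for illustration, not something the patch provides.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct rte_dma_stats and the status codes. */
struct dma_counters {
	uint64_t submitted;
	uint64_t completed;
	uint64_t errors;
};

enum job_status { JOB_OK = 0, JOB_FAILED, JOB_NOT_ATTEMPTED };

/* Account for n jobs handed to the hardware in one submit. */
static void
account_submit(struct dma_counters *c, uint16_t n)
{
	c->submitted += n;
}

/* Account for n returned jobs: all count as completed, and every
 * status other than JOB_OK also counts as an error. Note that the
 * status value itself is inspected, not the pointer to it.
 */
static void
account_status(struct dma_counters *c, const enum job_status *st, uint16_t n)
{
	uint16_t i;

	assert(c && st);
	for (i = 0; i < n; i++) {
		if (st[i] != JOB_OK)
			c->errors++;
	}
	c->completed += n;
}

/* Jobs submitted but not yet returned to the application. */
static uint64_t
in_flight(const struct dma_counters *c)
{
	return c->submitted - c->completed;
}
```

Under this model an application could poll the counters periodically and treat a growing in-flight value as a sign that completions are not being gathered often enough.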
* [dpdk-dev] [PATCH v11 12/16] dma/idxd: add vchan status function 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz ` (10 preceding siblings ...) 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 11/16] dma/idxd: add operation statistic tracking Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 13/16] dma/idxd: add burst capacity API Kevin Laatz ` (4 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz When testing dmadev drivers, it is useful to have the HW device in a known state. This patch adds the implementation of the function which will wait for the device to be idle (all jobs completed) before proceeding. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> --- drivers/dma/idxd/idxd_bus.c | 1 + drivers/dma/idxd/idxd_common.c | 17 +++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 2 ++ drivers/dma/idxd/idxd_pci.c | 1 + 4 files changed, 21 insertions(+) diff --git a/drivers/dma/idxd/idxd_bus.c b/drivers/dma/idxd/idxd_bus.c index 78299daf5e..08639e9dce 100644 --- a/drivers/dma/idxd/idxd_bus.c +++ b/drivers/dma/idxd/idxd_bus.c @@ -102,6 +102,7 @@ static const struct rte_dma_dev_ops idxd_bus_ops = { .dev_info_get = idxd_info_get, .stats_get = idxd_stats_get, .stats_reset = idxd_stats_reset, + .vchan_status = idxd_vchan_status, }; static void * diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 60dbd87efb..5ba0b69e12 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -163,6 +163,23 @@ get_comp_status(struct idxd_completion *c) } } +int +idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan __rte_unused, + enum rte_dma_vchan_status *status) +{ + struct idxd_dmadev *idxd = dev->fp_obj->dev_private; + uint16_t 
last_batch_write = idxd->batch_idx_write == 0 ? idxd->max_batches : + idxd->batch_idx_write - 1; + uint8_t bstatus = (idxd->batch_comp_ring[last_batch_write].status != 0); + + /* An IDXD device will always be either active or idle. + * RTE_DMA_VCHAN_HALTED_ERROR is therefore not supported by IDXD. + */ + *status = bstatus ? RTE_DMA_VCHAN_IDLE : RTE_DMA_VCHAN_ACTIVE; + + return 0; +} + static __rte_always_inline int batch_ok(struct idxd_dmadev *idxd, uint16_t max_ops, enum rte_dma_status_code *status) { diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index a85a1fb79e..50acb82d3d 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -102,5 +102,7 @@ uint16_t idxd_completed_status(void *dev_private, uint16_t qid __rte_unused, int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, struct rte_dma_stats *stats, uint32_t stats_sz); int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); +int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, + enum rte_dma_vchan_status *status); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index c0b78021ad..81952cfc40 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -141,6 +141,7 @@ static const struct rte_dma_dev_ops idxd_pci_ops = { .stats_reset = idxd_stats_reset, .dev_start = idxd_pci_dev_start, .dev_stop = idxd_pci_dev_stop, + .vchan_status = idxd_vchan_status, }; /* each portal uses 4 x 4k pages */ -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
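The idle check in idxd_vchan_status() looks at the completion record of the most recently submitted batch, which sits one slot behind the write index (wrapping to max_batches when the write index is 0). A non-zero status there means the hardware has finished that batch. The sketch below reproduces that decision in standalone form. The enum, the function name and the plain uint8_t array standing in for batch_comp_ring are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum chan_state { CHAN_IDLE, CHAN_ACTIVE };

/* Hypothetical model of the vchan status decision.
 * batch_status[] has max_batches + 1 entries; wr is the current
 * batch write index, i.e. the slot the NEXT batch will use.
 */
static enum chan_state
chan_state_of(const uint8_t *batch_status, uint16_t wr, uint16_t max_batches)
{
	uint16_t last;

	assert(batch_status && wr <= max_batches);
	/* step back one slot, wrapping from 0 to the last valid index */
	last = (wr == 0) ? max_batches : (uint16_t)(wr - 1);

	/* any non-zero completion status means the last batch is done */
	return batch_status[last] != 0 ? CHAN_IDLE : CHAN_ACTIVE;
}
```

A test harness could spin on such a check until it reports idle before comparing source and destination buffers.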
* [dpdk-dev] [PATCH v11 13/16] dma/idxd: add burst capacity API 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz ` (11 preceding siblings ...) 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 12/16] dma/idxd: add vchan status function Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz ` (3 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz Add support for the burst capacity API. This API will provide the calling application with the remaining capacity of the current burst (limited by max HW batch size). Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> --- drivers/dma/idxd/idxd_common.c | 21 +++++++++++++++++++++ drivers/dma/idxd/idxd_internal.h | 1 + drivers/dma/idxd/idxd_pci.c | 1 + 3 files changed, 23 insertions(+) diff --git a/drivers/dma/idxd/idxd_common.c b/drivers/dma/idxd/idxd_common.c index 5ba0b69e12..fc11b11337 100644 --- a/drivers/dma/idxd/idxd_common.c +++ b/drivers/dma/idxd/idxd_common.c @@ -468,6 +468,26 @@ idxd_info_get(const struct rte_dma_dev *dev, struct rte_dma_info *info, uint32_t return 0; } +uint16_t +idxd_burst_capacity(const void *dev_private, uint16_t vchan __rte_unused) +{ + const struct idxd_dmadev *idxd = dev_private; + uint16_t write_idx = idxd->batch_start + idxd->batch_size; + uint16_t used_space; + + /* Check for space in the batch ring */ + if ((idxd->batch_idx_read == 0 && idxd->batch_idx_write == idxd->max_batches) || + idxd->batch_idx_write + 1 == idxd->batch_idx_read) + return 0; + + /* For descriptors, check for wrap-around on write but not read */ + if 
(idxd->ids_returned > write_idx) + write_idx += idxd->desc_ring_mask + 1; + used_space = write_idx - idxd->ids_returned; + + return RTE_MIN((idxd->desc_ring_mask - used_space), idxd->max_batch_size); +} + int idxd_configure(struct rte_dma_dev *dev __rte_unused, const struct rte_dma_conf *dev_conf, uint32_t conf_sz) @@ -553,6 +573,7 @@ idxd_dmadev_create(const char *name, struct rte_device *dev, dmadev->fp_obj->submit = idxd_submit; dmadev->fp_obj->completed = idxd_completed; dmadev->fp_obj->completed_status = idxd_completed_status; + dmadev->fp_obj->burst_capacity = idxd_burst_capacity; idxd = dmadev->data->dev_private; *idxd = *base_idxd; /* copy over the main fields already passed in */ diff --git a/drivers/dma/idxd/idxd_internal.h b/drivers/dma/idxd/idxd_internal.h index 50acb82d3d..3375600217 100644 --- a/drivers/dma/idxd/idxd_internal.h +++ b/drivers/dma/idxd/idxd_internal.h @@ -104,5 +104,6 @@ int idxd_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, int idxd_stats_reset(struct rte_dma_dev *dev, uint16_t vchan); int idxd_vchan_status(const struct rte_dma_dev *dev, uint16_t vchan, enum rte_dma_vchan_status *status); +uint16_t idxd_burst_capacity(const void *dev_private, uint16_t vchan); #endif /* _IDXD_INTERNAL_H_ */ diff --git a/drivers/dma/idxd/idxd_pci.c b/drivers/dma/idxd/idxd_pci.c index 81952cfc40..1aabacee41 100644 --- a/drivers/dma/idxd/idxd_pci.c +++ b/drivers/dma/idxd/idxd_pci.c @@ -258,6 +258,7 @@ init_pci_device(struct rte_pci_device *dev, struct idxd_dmadev *idxd, idxd->u.pci = pci; idxd->max_batches = wq_size; + idxd->max_batch_size = 1 << lg2_max_batch; /* enable the device itself */ err_code = idxd_pci_dev_command(idxd, idxd_enable_dev); -- 2.30.2 ^ permalink raw reply [flat|nested] 243+ messages in thread
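The descriptor-ring part of the capacity calculation can be tried out on its own. The sketch below is one possible formulation, not the patch's code: it assumes the job IDs are free-running 16-bit counters and computes the used space by modular subtraction followed by a mask, instead of the explicit wrap-around adjustment used in idxd_burst_capacity(). The function name is hypothetical, and the batch-ring space check from the patch is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: remaining capacity of the current burst.
 * write_idx    - ID the next enqueued job would get (batch_start + batch_size)
 * ids_returned - ID of the next job to be returned to the application
 * ring_mask    - descriptor ring size minus one (ring size is a power of two)
 * max_batch    - largest batch the hardware accepts
 */
static uint16_t
burst_capacity_sketch(uint16_t write_idx, uint16_t ids_returned,
		uint16_t ring_mask, uint16_t max_batch)
{
	/* unsigned 16-bit subtraction handles counter wrap-around */
	uint16_t used = (uint16_t)(write_idx - ids_returned) & ring_mask;
	/* one slot always stays empty, so at most ring_mask are usable */
	uint16_t free_slots = (uint16_t)(ring_mask - used);

	assert(max_batch > 0);
	return free_slots < max_batch ? free_slots : max_batch;
}
```

For example, with a 1024-entry ring, ids_returned at 65530 and write_idx at 4 (after the counter has wrapped), ten descriptors are in use and the result is limited only by max_batch.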
* [dpdk-dev] [PATCH v11 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma 2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz ` (12 preceding siblings ...) 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 13/16] dma/idxd: add burst capacity API Kevin Laatz @ 2021-10-20 16:30 ` Kevin Laatz 2021-10-20 16:30 ` [dpdk-dev] [PATCH v11 15/16] devbind: add dma device class Kevin Laatz ` (2 subsequent siblings) 16 siblings, 0 replies; 243+ messages in thread From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw) To: dev Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz From: Conor Walsh <conor.walsh@intel.com> Move the example script for configuring IDXD devices bound to the IDXD kernel driver from raw to dma, and create a symlink to still allow use from raw. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> --- drivers/dma/idxd/dpdk_idxd_cfg.py | 117 +++++++++++++++++++++++++++++ drivers/raw/ioat/dpdk_idxd_cfg.py | 118 +----------------------------- 2 files changed, 118 insertions(+), 117 deletions(-) create mode 100755 drivers/dma/idxd/dpdk_idxd_cfg.py mode change 100755 => 120000 drivers/raw/ioat/dpdk_idxd_cfg.py diff --git a/drivers/dma/idxd/dpdk_idxd_cfg.py b/drivers/dma/idxd/dpdk_idxd_cfg.py new file mode 100755 index 0000000000..fcc27822ef --- /dev/null +++ b/drivers/dma/idxd/dpdk_idxd_cfg.py @@ -0,0 +1,117 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2020 Intel Corporation + +""" +Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use +""" + +import sys +import argparse +import os +import os.path + + +class SysfsDir: + "Used to read/write paths in a sysfs directory" + def __init__(self, path): + self.path = path + + def read_int(self, filename): + "Return a value from sysfs file" + with open(os.path.join(self.path, filename)) as 
f:
+            return int(f.readline())
+
+    def write_values(self, values):
+        "write dictionary, where key is filename and value is value to write"
+        for filename, contents in values.items():
+            with open(os.path.join(self.path, filename), "w") as f:
+                f.write(str(contents))
+
+
+def reset_device(dsa_id):
+    "Reset the DSA device and all its queues"
+    drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa")
+    drv_dir.write_values({"unbind": f"dsa{dsa_id}"})
+
+
+def get_pci_dir(pci):
+    "Search for the sysfs directory of the PCI device"
+    base_dir = '/sys/bus/pci/devices/'
+    for path, dirs, files in os.walk(base_dir):
+        for dir in dirs:
+            if pci in dir:
+                return os.path.join(base_dir, dir)
+    sys.exit(f"Could not find sysfs directory for device {pci}")
+
+
+def get_dsa_id(pci):
+    "Get the DSA instance ID using the PCI address of the device"
+    pci_dir = get_pci_dir(pci)
+    for path, dirs, files in os.walk(pci_dir):
+        for dir in dirs:
+            if dir.startswith('dsa') and 'wq' not in dir:
+                return int(dir[3:])
+    sys.exit(f"Could not get device ID for device {pci}")
+
+
+def configure_dsa(dsa_id, queues, prefix):
+    "Configure the DSA instance with appropriate number of queues"
+    dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}")
+    drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa")
+
+    max_groups = dsa_dir.read_int("max_groups")
+    max_engines = dsa_dir.read_int("max_engines")
+    max_queues = dsa_dir.read_int("max_work_queues")
+    max_work_queues_size = dsa_dir.read_int("max_work_queues_size")
+
+    nb_queues = min(queues, max_queues)
+    if queues > nb_queues:
+        print(f"Setting number of queues to max supported value: {max_queues}")
+
+    # we want one engine per group, and no more engines than queues
+    nb_groups = min(max_engines, max_groups, nb_queues)
+    for grp in range(nb_groups):
+        dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp})
+
+    # configure each queue
+    for q in range(nb_queues):
+        wq_dir = SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}"))
+        wq_dir.write_values({"group_id": q % nb_groups,
+                             "type": "user",
+                             "mode": "dedicated",
+                             "name": f"{prefix}_wq{dsa_id}.{q}",
+                             "priority": 1,
+                             "size": int(max_work_queues_size / nb_queues)})
+
+    # enable device and then queues
+    drv_dir.write_values({"bind": f"dsa{dsa_id}"})
+    for q in range(nb_queues):
+        drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"})
+
+
+def main(args):
+    "Main function, does arg parsing and calls config function"
+    arg_p = argparse.ArgumentParser(
+        description="Configure whole DSA device instance for DPDK use")
+    arg_p.add_argument('dsa_id',
+                       help="Specify DSA instance either via DSA instance number or PCI address")
+    arg_p.add_argument('-q', metavar='queues', type=int, default=255,
+                       help="Number of queues to set up")
+    arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix',
+                       default="dpdk",
+                       help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']")
+    arg_p.add_argument('--reset', action='store_true',
+                       help="Reset DSA device and its queues")
+    parsed_args = arg_p.parse_args(args[1:])
+
+    dsa_id = parsed_args.dsa_id
+    dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id
+    if parsed_args.reset:
+        reset_device(dsa_id)
+    else:
+        configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix)
+
+
+if __name__ == "__main__":
+    main(sys.argv)
diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py
deleted file mode 100755
index fcc27822ef..0000000000
--- a/drivers/raw/ioat/dpdk_idxd_cfg.py
+++ /dev/null
@@ -1,117 +0,0 @@
-#!/usr/bin/env python3
-# SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2020 Intel Corporation
-
-"""
-Configure an entire Intel DSA instance, using idxd kernel driver, for DPDK use
-"""
-
-import sys
-import argparse
-import os
-import os.path
-
-
-class SysfsDir:
-    "Used to read/write paths in a sysfs directory"
-    def __init__(self, path):
-        self.path = path
-
-    def read_int(self, filename):
-        "Return a value from sysfs file"
-        with open(os.path.join(self.path, filename)) as f:
-            return int(f.readline())
-
-    def write_values(self, values):
-        "write dictionary, where key is filename and value is value to write"
-        for filename, contents in values.items():
-            with open(os.path.join(self.path, filename), "w") as f:
-                f.write(str(contents))
-
-
-def reset_device(dsa_id):
-    "Reset the DSA device and all its queues"
-    drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa")
-    drv_dir.write_values({"unbind": f"dsa{dsa_id}"})
-
-
-def get_pci_dir(pci):
-    "Search for the sysfs directory of the PCI device"
-    base_dir = '/sys/bus/pci/devices/'
-    for path, dirs, files in os.walk(base_dir):
-        for dir in dirs:
-            if pci in dir:
-                return os.path.join(base_dir, dir)
-    sys.exit(f"Could not find sysfs directory for device {pci}")
-
-
-def get_dsa_id(pci):
-    "Get the DSA instance ID using the PCI address of the device"
-    pci_dir = get_pci_dir(pci)
-    for path, dirs, files in os.walk(pci_dir):
-        for dir in dirs:
-            if dir.startswith('dsa') and 'wq' not in dir:
-                return int(dir[3:])
-    sys.exit(f"Could not get device ID for device {pci}")
-
-
-def configure_dsa(dsa_id, queues, prefix):
-    "Configure the DSA instance with appropriate number of queues"
-    dsa_dir = SysfsDir(f"/sys/bus/dsa/devices/dsa{dsa_id}")
-    drv_dir = SysfsDir("/sys/bus/dsa/drivers/dsa")
-
-    max_groups = dsa_dir.read_int("max_groups")
-    max_engines = dsa_dir.read_int("max_engines")
-    max_queues = dsa_dir.read_int("max_work_queues")
-    max_work_queues_size = dsa_dir.read_int("max_work_queues_size")
-
-    nb_queues = min(queues, max_queues)
-    if queues > nb_queues:
-        print(f"Setting number of queues to max supported value: {max_queues}")
-
-    # we want one engine per group, and no more engines than queues
-    nb_groups = min(max_engines, max_groups, nb_queues)
-    for grp in range(nb_groups):
-        dsa_dir.write_values({f"engine{dsa_id}.{grp}/group_id": grp})
-
-    # configure each queue
-    for q in range(nb_queues):
-        wq_dir = SysfsDir(os.path.join(dsa_dir.path, f"wq{dsa_id}.{q}"))
-        wq_dir.write_values({"group_id": q % nb_groups,
-                             "type": "user",
-                             "mode": "dedicated",
-                             "name": f"{prefix}_wq{dsa_id}.{q}",
-                             "priority": 1,
-                             "size": int(max_work_queues_size / nb_queues)})
-
-    # enable device and then queues
-    drv_dir.write_values({"bind": f"dsa{dsa_id}"})
-    for q in range(nb_queues):
-        drv_dir.write_values({"bind": f"wq{dsa_id}.{q}"})
-
-
-def main(args):
-    "Main function, does arg parsing and calls config function"
-    arg_p = argparse.ArgumentParser(
-        description="Configure whole DSA device instance for DPDK use")
-    arg_p.add_argument('dsa_id',
-                       help="Specify DSA instance either via DSA instance number or PCI address")
-    arg_p.add_argument('-q', metavar='queues', type=int, default=255,
-                       help="Number of queues to set up")
-    arg_p.add_argument('--name-prefix', metavar='prefix', dest='prefix',
-                       default="dpdk",
-                       help="Prefix for workqueue name to mark for DPDK use [default: 'dpdk']")
-    arg_p.add_argument('--reset', action='store_true',
-                       help="Reset DSA device and its queues")
-    parsed_args = arg_p.parse_args(args[1:])
-
-    dsa_id = parsed_args.dsa_id
-    dsa_id = get_dsa_id(dsa_id) if ':' in dsa_id else dsa_id
-    if parsed_args.reset:
-        reset_device(dsa_id)
-    else:
-        configure_dsa(dsa_id, parsed_args.q, parsed_args.prefix)
-
-
-if __name__ == "__main__":
-    main(sys.argv)
diff --git a/drivers/raw/ioat/dpdk_idxd_cfg.py b/drivers/raw/ioat/dpdk_idxd_cfg.py
new file mode 120000
index 0000000000..85545548d1
--- /dev/null
+++ b/drivers/raw/ioat/dpdk_idxd_cfg.py
@@ -0,0 +1 @@
+../../dma/idxd/dpdk_idxd_cfg.py
\ No newline at end of file
-- 
2.30.2

^ permalink raw reply [flat|nested] 243+ messages in thread
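For readers following the script moved by this patch, the queue and group sizing done in `configure_dsa` can be looked at in isolation. The following is a minimal sketch, not part of the patch: `plan_queues` is a hypothetical helper, and the capability values passed to it are made-up examples rather than values read from sysfs on real hardware.

```python
def plan_queues(requested, max_queues, max_engines, max_groups, max_wq_size):
    """Mirror the sizing arithmetic of configure_dsa without touching sysfs.

    Hypothetical helper for illustration only; it is not code from the patch.
    """
    # never configure more queues than the device supports
    nb_queues = min(requested, max_queues)
    # one engine per group, and no more groups than queues
    nb_groups = min(max_engines, max_groups, nb_queues)
    # the total work-queue space is split evenly between the queues
    size = int(max_wq_size / nb_queues)
    # queue q lands in group q % nb_groups, as in the script
    return [{"queue": q, "group_id": q % nb_groups, "size": size}
            for q in range(nb_queues)]


# Example capability values (assumed, not read from a device)
plan = plan_queues(requested=255, max_queues=8, max_engines=4,
                   max_groups=4, max_wq_size=128)
```

With these assumed limits, the default request of 255 queues is clamped to 8, and the queues are spread round-robin over 4 groups.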
* [dpdk-dev] [PATCH v11 15/16] devbind: add dma device class
  2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz
                     ` (13 preceding siblings ...)
  2021-10-20 16:30   ` [dpdk-dev] [PATCH v11 14/16] dma/idxd: move dpdk_idxd_cfg.py from raw to dma Kevin Laatz
@ 2021-10-20 16:30   ` Kevin Laatz
  2021-10-20 16:30   ` [dpdk-dev] [PATCH v11 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz
  2021-10-22 18:07   ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Thomas Monjalon
  16 siblings, 0 replies; 243+ messages in thread
From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw)
  To: dev
  Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz

Add a new class for DMA devices. Devices listed under the DMA class are to
be used with the dmadev library.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
---
 usertools/dpdk-devbind.py | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py
index 5f0e817055..da89b87816 100755
--- a/usertools/dpdk-devbind.py
+++ b/usertools/dpdk-devbind.py
@@ -71,6 +71,7 @@
 network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class]
 baseband_devices = [acceleration_class]
 crypto_devices = [encryption_class, intel_processor_class]
+dma_devices = []
 eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso]
 mempool_devices = [cavium_fpa, octeontx2_npa]
 compress_devices = [cavium_zip]
@@ -585,6 +586,9 @@ def show_status():
     if status_dev in ["crypto", "all"]:
         show_device_status(crypto_devices, "Crypto")
 
+    if status_dev in ["dma", "all"]:
+        show_device_status(dma_devices, "DMA")
+
     if status_dev in ["event", "all"]:
         show_device_status(eventdev_devices, "Eventdev")
 
@@ -653,7 +657,7 @@ def parse_args():
     parser.add_argument(
         '--status-dev',
         help="Print the status of given device group.",
-        choices=['baseband', 'compress', 'crypto', 'event',
+        choices=['baseband', 'compress', 'crypto', 'dma', 'event',
                  'mempool', 'misc', 'net', 'regex'])
     bind_group = parser.add_mutually_exclusive_group()
     bind_group.add_argument(
@@ -734,6 +738,7 @@ def do_arg_actions():
             get_device_details(network_devices)
             get_device_details(baseband_devices)
             get_device_details(crypto_devices)
+            get_device_details(dma_devices)
             get_device_details(eventdev_devices)
             get_device_details(mempool_devices)
             get_device_details(compress_devices)
@@ -756,6 +761,7 @@ def main():
     get_device_details(network_devices)
     get_device_details(baseband_devices)
     get_device_details(crypto_devices)
+    get_device_details(dma_devices)
     get_device_details(eventdev_devices)
     get_device_details(mempool_devices)
     get_device_details(compress_devices)
-- 
2.30.2

^ permalink raw reply [flat|nested] 243+ messages in thread
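The selection pattern this patch extends, where each `--status-dev` value picks one device group and `all` picks every group, can be illustrated standalone. This is a rough sketch under stated assumptions: `GROUP_TITLES` and `groups_to_show` are hypothetical constructs and do not exist in `dpdk-devbind.py`, and only the three groups visible in the hunk above are modelled.

```python
import argparse

# Hypothetical table covering only the groups visible in the patch context.
# The titles are the strings passed to show_device_status() in the hunk.
GROUP_TITLES = {"crypto": "Crypto", "dma": "DMA", "event": "Eventdev"}


def groups_to_show(status_dev):
    """Return the titles selected by a --status-dev value ('all' = every group)."""
    return [title for key, title in GROUP_TITLES.items()
            if status_dev in (key, "all")]


parser = argparse.ArgumentParser()
parser.add_argument("--status-dev", choices=sorted(GROUP_TITLES))

# Parse an explicit argument list instead of sys.argv so the sketch is repeatable.
args = parser.parse_args(["--status-dev", "dma"])
selected = groups_to_show(args.status_dev)
```

Adding a new device class under this pattern means touching two places, the `choices` list and the status check, which matches the shape of the patch.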
* [dpdk-dev] [PATCH v11 16/16] devbind: move idxd device ID to dmadev class
  2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz
                     ` (14 preceding siblings ...)
  2021-10-20 16:30   ` [dpdk-dev] [PATCH v11 15/16] devbind: add dma device class Kevin Laatz
@ 2021-10-20 16:30   ` Kevin Laatz
  2021-10-22 18:07   ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Thomas Monjalon
  16 siblings, 0 replies; 243+ messages in thread
From: Kevin Laatz @ 2021-10-20 16:30 UTC (permalink / raw)
  To: dev
  Cc: thomas, bruce.richardson, fengchengwen, jerinj, conor.walsh, Kevin Laatz

The dmadev library is the preferred abstraction for using IDXD devices and
will replace the rawdev implementation in future. This patch moves the IDXD
device ID to the dmadev class.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 usertools/dpdk-devbind.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py
index da89b87816..ba18e2a487 100755
--- a/usertools/dpdk-devbind.py
+++ b/usertools/dpdk-devbind.py
@@ -71,13 +71,13 @@
 network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class]
 baseband_devices = [acceleration_class]
 crypto_devices = [encryption_class, intel_processor_class]
-dma_devices = []
+dma_devices = [intel_idxd_spr]
 eventdev_devices = [cavium_sso, cavium_tim, intel_dlb, octeontx2_sso]
 mempool_devices = [cavium_fpa, octeontx2_npa]
 compress_devices = [cavium_zip]
 regex_devices = [octeontx2_ree]
 misc_devices = [cnxk_bphy, cnxk_bphy_cgx, cnxk_inl_dev, intel_ioat_bdw,
-                intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, intel_ntb_skx,
+                intel_ioat_skx, intel_ioat_icx, intel_ntb_skx,
                 intel_ntb_icx, octeontx2_dma]
 
 # global dict ethernet devices present. Dictionary indexed by PCI address.
-- 
2.30.2

^ permalink raw reply [flat|nested] 243+ messages in thread
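The effect of moving `intel_idxd_spr` from `misc_devices` to `dma_devices` is that a matching device is reported under the DMA group instead of the misc group. A simplified sketch of that classification follows. It makes several assumptions: the descriptor field values are illustrative and should be checked against the script, `matches` and `classify` are hypothetical helpers, and the exact-equality comparison is a simplification of whatever matching `dpdk-devbind.py` actually performs.

```python
# Illustrative descriptors; the field values are assumptions for this sketch.
intel_idxd_spr = {"Class": "08", "Vendor": "8086", "Device": "0b25"}
intel_ioat_skx = {"Class": "08", "Vendor": "8086", "Device": "2021"}

# Group membership as it stands after the patch (other entries omitted).
dma_devices = [intel_idxd_spr]
misc_devices = [intel_ioat_skx]


def matches(dev, descriptor):
    """True if every field named in the descriptor equals the device's field."""
    return all(dev.get(key) == value for key, value in descriptor.items())


def classify(dev):
    """Return the name of the first group with a matching descriptor, or None."""
    for name, group in (("dma", dma_devices), ("misc", misc_devices)):
        if any(matches(dev, desc) for desc in group):
            return name
    return None


# A made-up device record; the Slot value is an example PCI address.
found = classify({"Class": "08", "Vendor": "8086", "Device": "0b25",
                  "Slot": "0000:6a:01.0"})
```

Because membership is just list membership, the patch needs no logic change: editing the two lists is enough to change where the device is reported.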
* Re: [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices
  2021-10-20 16:29 ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Kevin Laatz
                     ` (15 preceding siblings ...)
  2021-10-20 16:30   ` [dpdk-dev] [PATCH v11 16/16] devbind: move idxd device ID to dmadev class Kevin Laatz
@ 2021-10-22 18:07   ` Thomas Monjalon
  2021-10-23  6:55     ` David Marchand
  16 siblings, 1 reply; 243+ messages in thread
From: Thomas Monjalon @ 2021-10-22 18:07 UTC (permalink / raw)
  To: Kevin Laatz; +Cc: dev, bruce.richardson, fengchengwen, jerinj, conor.walsh

20/10/2021 18:29, Kevin Laatz:
> This patchset adds a dmadev driver and associated documentation to support
> Intel Data Streaming Accelerator devices. This driver is intended to
> ultimately replace the current IDXD part of the IOAT rawdev driver.

Applied (with notified fix), thanks.

^ permalink raw reply [flat|nested] 243+ messages in thread
* Re: [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices
  2021-10-22 18:07   ` [dpdk-dev] [PATCH v11 00/16] add dmadev driver for idxd devices Thomas Monjalon
@ 2021-10-23  6:55     ` David Marchand
  0 siblings, 0 replies; 243+ messages in thread
From: David Marchand @ 2021-10-23 6:55 UTC (permalink / raw)
  To: Kevin Laatz, Conor Walsh, Thomas Monjalon
  Cc: dev, Bruce Richardson, Chengwen Feng, Jerin Jacob Kollanukkaran

On Fri, Oct 22, 2021 at 8:08 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 20/10/2021 18:29, Kevin Laatz:
> > This patchset adds a dmadev driver and associated documentation to support
> > Intel Data Streaming Accelerator devices. This driver is intended to
> > ultimately replace the current IDXD part of the IOAT rawdev driver.
>
> Applied (with notified fix), thanks.

Something is wrong in main branch with Windows compilation.
It was not caught before, since this series crossed the road while
Windows compilation on dma drivers passed.

I am not entirely sure on the fix, I'll post a patch to see if CI agrees.
(I still wonder how stdint.h gets included for uintX definitions, but
that's either "include luck" or my lack of coffee).

-- 
David Marchand

^ permalink raw reply [flat|nested] 243+ messages in thread