From: Selwin Sebastian
To: dev@dpdk.org
Cc: Selwin Sebastian
Date: Mon, 6 Sep 2021 21:29:11 +0530
Message-Id: <20210906155911.61829-1-selwin.sebastian@amd.com>
Subject: [dpdk-dev] [RFC PATCH v2] raw/ptdma: introduce ptdma driver

From: Selwin Sebastian

Add support for PTDMA driver

Signed-off-by: Selwin Sebastian
---
 MAINTAINERS                              |   5 +
 doc/guides/rawdevs/ptdma.rst             | 220 ++++++++++++++
 drivers/raw/meson.build                  |   1 +
 drivers/raw/ptdma/meson.build            |  16 +
 drivers/raw/ptdma/ptdma_dev.c            | 135 +++++++++
 drivers/raw/ptdma/ptdma_pmd_private.h    |  41 +++
 drivers/raw/ptdma/ptdma_rawdev.c         | 266 +++++++++++++++++
 drivers/raw/ptdma/ptdma_rawdev_spec.h    | 362 +++++++++++++++++++++++
 drivers/raw/ptdma/ptdma_rawdev_test.c    | 272 +++++++++++++++++
 drivers/raw/ptdma/rte_ptdma_rawdev.h     | 124 ++++++++
 drivers/raw/ptdma/rte_ptdma_rawdev_fns.h | 298 +++++++++++++++++++
 drivers/raw/ptdma/version.map            |   5 +
 usertools/dpdk-devbind.py                |   4 +-
 13 files changed, 1748 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/rawdevs/ptdma.rst
 create mode 100644 drivers/raw/ptdma/meson.build
 create mode 100644 drivers/raw/ptdma/ptdma_dev.c
 create mode 100644 drivers/raw/ptdma/ptdma_pmd_private.h
 create mode 100644 drivers/raw/ptdma/ptdma_rawdev.c
 create mode 100644 drivers/raw/ptdma/ptdma_rawdev_spec.h
 create mode 100644 drivers/raw/ptdma/ptdma_rawdev_test.c
 create mode 100644 drivers/raw/ptdma/rte_ptdma_rawdev.h
 create mode 100644 drivers/raw/ptdma/rte_ptdma_rawdev_fns.h
 create mode 100644 drivers/raw/ptdma/version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 266f5ac1da..f4afd1a072 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1305,6 +1305,11 @@ F: doc/guides/rawdevs/ioat.rst
 F: examples/ioat/
 F: doc/guides/sample_app_ug/ioat.rst
 
+PTDMA Rawdev
+M: Selwin Sebastian
+F: drivers/raw/ptdma/
+F: doc/guides/rawdevs/ptdma.rst
+
 NXP DPAA2 QDMA
 M: Nipun Gupta
 F: drivers/raw/dpaa2_qdma/
diff --git a/doc/guides/rawdevs/ptdma.rst b/doc/guides/rawdevs/ptdma.rst
new file mode 100644
index 0000000000..50772f9f3b
--- /dev/null
+++ b/doc/guides/rawdevs/ptdma.rst
@@ -0,0 +1,220 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2021 Advanced Micro Devices, Inc. All rights reserved.
+
+PTDMA Rawdev Driver
+===================
+
+The ``ptdma`` rawdev driver provides a poll-mode driver (PMD) for the AMD PTDMA device.
+
+Hardware Requirements
+----------------------
+
+The ``dpdk-devbind.py`` script, included with DPDK,
+can be used to show the presence of supported hardware.
+Running ``dpdk-devbind.py --status-dev misc`` will show all the miscellaneous,
+or rawdev-based devices on the system.
+
+Sample output from a system with PTDMA is shown below::
+
+   Misc (rawdev) devices using DPDK-compatible driver
+   ==================================================
+   0000:01:00.2 'Starship/Matisse PTDMA 1498' drv=igb_uio unused=vfio-pci
+   0000:02:00.2 'Starship/Matisse PTDMA 1498' drv=igb_uio unused=vfio-pci
+
+Devices using UIO drivers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The HW devices to be used will need to be bound to a user-space IO driver for use.
+The ``dpdk-devbind.py`` script can be used to view the state of the PTDMA devices
+and to bind them to a suitable DPDK-supported driver, such as ``igb_uio``.
+For example::
+
+        $ sudo ./usertools/dpdk-devbind.py --force --bind=igb_uio 0000:01:00.2 0000:02:00.2
+
+Compilation
+------------
+
+For builds using ``meson`` and ``ninja``, the driver will be built when the target platform is x86-based.
+No additional compilation steps are necessary.
+
+
+Using PTDMA Rawdev Devices
+--------------------------
+
+To use the devices from an application, the rawdev API can be used, along
+with definitions taken from the device-specific header file
+``rte_ptdma_rawdev.h``. This header is needed to get the definition of
+structure parameters used by some of the rawdev APIs for PTDMA rawdev
+devices, as well as providing key functions for using the device for memory
+copies.
+
+Getting Device Information
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Basic information about each rawdev device can be queried using the
+``rte_rawdev_info_get()`` API. For most applications, this API will be
+needed to verify that the rawdev in question is of the expected type. For
+example, the following code snippet can be used to identify a PTDMA
+rawdev device for use by an application:
+
+.. code-block:: C
+
+        for (i = 0; i < count && !found; i++) {
+                struct rte_rawdev_info info = { .dev_private = NULL };
+                found = (rte_rawdev_info_get(i, &info, 0) == 0 &&
+                                strcmp(info.driver_name,
+                                                PTDMA_PMD_RAWDEV_NAME) == 0);
+        }
+
+When calling the ``rte_rawdev_info_get()`` API for a PTDMA rawdev device,
+the ``dev_private`` field in the ``rte_rawdev_info`` struct should either
+be NULL, or else be set to point to a structure of type
+``rte_ptdma_rawdev_config``, in which case the size of the configured device
+input ring will be returned in that structure.
+
+Device Configuration
+~~~~~~~~~~~~~~~~~~~~~
+
+Configuring a PTDMA rawdev device is done using the
+``rte_rawdev_configure()`` API, which takes the same structure parameters
+as the previously referenced ``rte_rawdev_info_get()`` API. The main
+difference is that, because the parameter is used as input rather than
+output, the ``dev_private`` structure element cannot be NULL, and must
+point to a valid ``rte_ptdma_rawdev_config`` structure, containing the ring
+size to be used by the device. The ring size must be a power of two,
+between 64 and 8192.
+If it is not needed, the driver's tracking of user-provided completion
+handles may also be disabled by setting the ``hdls_disable`` flag in
+the configuration structure.
+
+The following code shows how the device is configured in
+``ptdma_rawdev_test.c``:
+
+.. code-block:: C
+
+        #define PTDMA_TEST_RINGSIZE 512
+        struct rte_ptdma_rawdev_config p = { .ring_size = -1 };
+        struct rte_rawdev_info info = { .dev_private = &p };
+
+        /* ... */
+
+        p.ring_size = PTDMA_TEST_RINGSIZE;
+        if (rte_rawdev_configure(dev_id, &info, sizeof(p)) != 0) {
+                printf("Error with rte_rawdev_configure()\n");
+                return -1;
+        }
+
+Once configured, the device can then be made ready for use by calling the
+``rte_rawdev_start()`` API.
+
+Performing Data Copies
+~~~~~~~~~~~~~~~~~~~~~~~
+
+To perform data copies using PTDMA rawdev devices, the functions
+``rte_ptdma_enqueue_copy()`` and ``rte_ptdma_perform_ops()`` should be used.
+Once copies have been completed, the completion will be reported back when
+the application calls ``rte_ptdma_completed_ops()``.
+
+The ``rte_ptdma_enqueue_copy()`` function enqueues a single copy to the
+device ring for copying at a later point. The parameters to that function
+include the IOVA addresses of both the source and destination buffers,
+as well as two "handles" to be returned to the user when the copy is
+completed. These handles can be arbitrary values, but two are provided so
+that the library can track handles for both source and destination on
+behalf of the user, e.g. virtual addresses for the buffers, or mbuf
+pointers if packet data is being copied.
+
+While the ``rte_ptdma_enqueue_copy()`` function enqueues a copy operation on
+the device ring, the copy will not actually be performed until after the
+application calls the ``rte_ptdma_perform_ops()`` function. This function
+informs the device hardware of the elements enqueued on the ring, and the
+device will begin to process them. It is expected that, for efficiency
+reasons, a burst of operations will be enqueued to the device via multiple
+enqueue calls between calls to the ``rte_ptdma_perform_ops()`` function.
+
+The following code from ``ptdma_rawdev_test.c`` demonstrates how to enqueue
+a burst of copies to the device and start the hardware processing of them:
+
+.. code-block:: C
+
+        struct rte_mbuf *srcs[32], *dsts[32];
+        unsigned int j;
+
+        for (i = 0; i < RTE_DIM(srcs); i++) {
+                char *src_data;
+
+                srcs[i] = rte_pktmbuf_alloc(pool);
+                dsts[i] = rte_pktmbuf_alloc(pool);
+                srcs[i]->data_len = srcs[i]->pkt_len = length;
+                dsts[i]->data_len = dsts[i]->pkt_len = length;
+                src_data = rte_pktmbuf_mtod(srcs[i], char *);
+
+                for (j = 0; j < length; j++)
+                        src_data[j] = rand() & 0xFF;
+
+                if (rte_ptdma_enqueue_copy(dev_id,
+                                srcs[i]->buf_iova + srcs[i]->data_off,
+                                dsts[i]->buf_iova + dsts[i]->data_off,
+                                length,
+                                (uintptr_t)srcs[i],
+                                (uintptr_t)dsts[i]) != 1) {
+                        printf("Error with rte_ptdma_enqueue_copy for buffer %u\n",
+                                        i);
+                        return -1;
+                }
+        }
+        rte_ptdma_perform_ops(dev_id);
+
+To retrieve information about completed copies, the API
+``rte_ptdma_completed_ops()`` should be used. This API will return to the
+application a set of completion handles passed in when the relevant copies
+were enqueued.
+
+The following code from ``ptdma_rawdev_test.c`` shows the test code
+retrieving information about the completed copies and validating the data
+is correct before freeing the data buffers using the returned handles:
+
+.. code-block:: C
+
+        if (rte_ptdma_completed_ops(dev_id, 64, (void *)completed_src,
+                        (void *)completed_dst) != RTE_DIM(srcs)) {
+                printf("Error with rte_ptdma_completed_ops\n");
+                return -1;
+        }
+        for (i = 0; i < RTE_DIM(srcs); i++) {
+                char *src_data, *dst_data;
+
+                if (completed_src[i] != srcs[i]) {
+                        printf("Error with source pointer %u\n", i);
+                        return -1;
+                }
+                if (completed_dst[i] != dsts[i]) {
+                        printf("Error with dest pointer %u\n", i);
+                        return -1;
+                }
+
+                src_data = rte_pktmbuf_mtod(srcs[i], char *);
+                dst_data = rte_pktmbuf_mtod(dsts[i], char *);
+                for (j = 0; j < length; j++)
+                        if (src_data[j] != dst_data[j]) {
+                                printf("Error with copy of packet %u, byte %u\n",
+                                                i, j);
+                                return -1;
+                        }
+                rte_pktmbuf_free(srcs[i]);
+                rte_pktmbuf_free(dsts[i]);
+        }
+
+Querying Device Statistics
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The statistics from the PTDMA rawdev device can be retrieved via the xstats
+functions in the ``rte_rawdev`` library, i.e.
+``rte_rawdev_xstats_names_get()``, ``rte_rawdev_xstats_get()`` and
+``rte_rawdev_xstats_by_name_get()``. The statistics returned for each device
+instance are:
+
+* ``failed_enqueues``
+* ``successful_enqueues``
+* ``copies_started``
+* ``copies_completed``
diff --git a/drivers/raw/meson.build b/drivers/raw/meson.build
index b51536f8a7..e896745d9c 100644
--- a/drivers/raw/meson.build
+++ b/drivers/raw/meson.build
@@ -14,6 +14,7 @@ drivers = [
         'ntb',
         'octeontx2_dma',
         'octeontx2_ep',
+        'ptdma',
         'skeleton',
 ]
 std_deps = ['rawdev']
diff --git a/drivers/raw/ptdma/meson.build b/drivers/raw/ptdma/meson.build
new file mode 100644
index 0000000000..a3eab8dbfd
--- /dev/null
+++ b/drivers/raw/ptdma/meson.build
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2021 Advanced Micro Devices, Inc. All rights reserved.
+ +build = dpdk_conf.has('RTE_ARCH_X86') +reason = 'only supported on x86' +sources = files( + 'ptdma_rawdev.c', + 'ptdma_dev.c', + 'ptdma_rawdev_test.c') +deps += ['bus_pci', + 'bus_vdev', + 'mbuf', + 'rawdev'] + +headers = files('rte_ptdma_rawdev.h', + 'rte_ptdma_rawdev_fns.h') diff --git a/drivers/raw/ptdma/ptdma_dev.c b/drivers/raw/ptdma/ptdma_dev.c new file mode 100644 index 0000000000..1d0207a9af --- /dev/null +++ b/drivers/raw/ptdma/ptdma_dev.c @@ -0,0 +1,135 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Advanced Micro Devices, Inc. All rights reserved. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +#include "ptdma_rawdev_spec.h" +#include "ptdma_pmd_private.h" +#include "rte_ptdma_rawdev_fns.h" + +static int ptdma_dev_id; + +static const struct rte_memzone * +ptdma_queue_dma_zone_reserve(const char *queue_name, + uint32_t queue_size, + int socket_id) +{ + const struct rte_memzone *mz; + + mz = rte_memzone_lookup(queue_name); + if (mz != 0) { + if (((size_t)queue_size <= mz->len) && + ((socket_id == SOCKET_ID_ANY) || + (socket_id == mz->socket_id))) { + PTDMA_PMD_INFO("re-use memzone already " + "allocated for %s", queue_name); + return mz; + } + PTDMA_PMD_ERR("Incompatible memzone already " + "allocated %s, size %u, socket %d. 
" + "Requested size %u, socket %u", + queue_name, (uint32_t)mz->len, + mz->socket_id, queue_size, socket_id); + return NULL; + } + + PTDMA_PMD_INFO("Allocate memzone for %s, size %u on socket %u", + queue_name, queue_size, socket_id); + + return rte_memzone_reserve_aligned(queue_name, queue_size, + socket_id, RTE_MEMZONE_IOVA_CONTIG, queue_size); +} + +int +ptdma_add_queue(struct rte_ptdma_rawdev *dev) +{ + int i; + uint32_t dma_addr_lo, dma_addr_hi; + uint32_t ptdma_version = 0; + struct ptdma_cmd_queue *cmd_q; + const struct rte_memzone *q_mz; + void *vaddr; + + if (dev == NULL) + return -1; + + dev->id = ptdma_dev_id++; + dev->qidx = 0; + vaddr = (void *)(dev->pci.mem_resource[2].addr); + + PTDMA_WRITE_REG(vaddr, CMD_REQID_CONFIG_OFFSET, 0x0); + ptdma_version = PTDMA_READ_REG(vaddr, CMD_PTDMA_VERSION); + PTDMA_PMD_INFO("PTDMA VERSION = 0x%x", ptdma_version); + + dev->cmd_q_count = 0; + /* Find available queues */ + for (i = 0; i < MAX_HW_QUEUES; i++) { + cmd_q = &dev->cmd_q[dev->cmd_q_count++]; + cmd_q->dev = dev; + cmd_q->id = i; + cmd_q->qidx = 0; + cmd_q->qsize = Q_SIZE(Q_DESC_SIZE); + + cmd_q->reg_base = (uint8_t *)vaddr + + CMD_Q_STATUS_INCR * (i + 1); + + /* PTDMA queue memory */ + snprintf(cmd_q->memz_name, sizeof(cmd_q->memz_name), + "%s_%d_%s_%d_%s", + "ptdma_dev", + (int)dev->id, "queue", + (int)cmd_q->id, "mem"); + q_mz = ptdma_queue_dma_zone_reserve(cmd_q->memz_name, + cmd_q->qsize, rte_socket_id()); + cmd_q->qbase_addr = (void *)q_mz->addr; + cmd_q->qbase_desc = (void *)q_mz->addr; + cmd_q->qbase_phys_addr = q_mz->iova; + + cmd_q->qcontrol = 0; + /* init control reg to zero */ + PTDMA_WRITE_REG(cmd_q->reg_base, CMD_Q_CONTROL_BASE, + cmd_q->qcontrol); + + /* Disable the interrupts */ + PTDMA_WRITE_REG(cmd_q->reg_base, CMD_Q_INT_ENABLE_BASE, 0x00); + PTDMA_READ_REG(cmd_q->reg_base, CMD_Q_INT_STATUS_BASE); + PTDMA_READ_REG(cmd_q->reg_base, CMD_Q_STATUS_BASE); + + /* Clear the interrupts */ + PTDMA_WRITE_REG(cmd_q->reg_base, CMD_Q_INTERRUPT_STATUS_BASE, 
+ ALL_INTERRUPTS); + + /* Configure size of each virtual queue accessible to host */ + cmd_q->qcontrol &= ~(CMD_Q_SIZE << CMD_Q_SHIFT); + cmd_q->qcontrol |= QUEUE_SIZE_VAL << CMD_Q_SHIFT; + + dma_addr_lo = low32_value(cmd_q->qbase_phys_addr); + PTDMA_WRITE_REG(cmd_q->reg_base, CMD_Q_TAIL_LO_BASE, + (uint32_t)dma_addr_lo); + PTDMA_WRITE_REG(cmd_q->reg_base, CMD_Q_HEAD_LO_BASE, + (uint32_t)dma_addr_lo); + + dma_addr_hi = high32_value(cmd_q->qbase_phys_addr); + cmd_q->qcontrol |= (dma_addr_hi << 16); + PTDMA_WRITE_REG(cmd_q->reg_base, CMD_Q_CONTROL_BASE, + cmd_q->qcontrol); + + } + return 0; +} diff --git a/drivers/raw/ptdma/ptdma_pmd_private.h b/drivers/raw/ptdma/ptdma_pmd_private.h new file mode 100644 index 0000000000..0c25e737f5 --- /dev/null +++ b/drivers/raw/ptdma/ptdma_pmd_private.h @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Advanced Micro Devices, Inc. All rights reserved. + */ + +#ifndef _PTDMA_PMD_PRIVATE_H_ +#define _PTDMA_PMD_PRIVATE_H_ + +#include +#include "ptdma_rawdev_spec.h" + +extern int ptdma_pmd_logtype; + +#define PTDMA_PMD_LOG(level, fmt, args...) \ + rte_log(RTE_LOG_ ## level, ptdma_pmd_logtype, "%s(): " fmt "\n", \ + __func__, ##args) + +#define PTDMA_PMD_FUNC_TRACE() PTDMA_PMD_LOG(DEBUG, ">>") + +#define PTDMA_PMD_ERR(fmt, args...) \ + PTDMA_PMD_LOG(ERR, fmt, ## args) +#define PTDMA_PMD_WARN(fmt, args...) \ + PTDMA_PMD_LOG(WARNING, fmt, ## args) +#define PTDMA_PMD_DEBUG(fmt, args...) \ + PTDMA_PMD_LOG(DEBUG, fmt, ## args) +#define PTDMA_PMD_INFO(fmt, args...) 
\ + PTDMA_PMD_LOG(INFO, fmt, ## args) + +int ptdma_xstats_get(const struct rte_rawdev *dev, const unsigned int ids[], + uint64_t values[], unsigned int n); +int ptdma_xstats_get_names(const struct rte_rawdev *dev, + struct rte_rawdev_xstats_name *names, + unsigned int size); +int ptdma_xstats_reset(struct rte_rawdev *dev, const uint32_t *ids, + uint32_t nb_ids); +int ptdma_add_queue(struct rte_ptdma_rawdev *dev); + +extern int ptdma_rawdev_test(uint16_t dev_id); + +#endif /* _PTDMA_PMD_PRIVATE_H_ */ + + diff --git a/drivers/raw/ptdma/ptdma_rawdev.c b/drivers/raw/ptdma/ptdma_rawdev.c new file mode 100644 index 0000000000..cfa57d81ed --- /dev/null +++ b/drivers/raw/ptdma/ptdma_rawdev.c @@ -0,0 +1,266 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Advanced Micro Devices, Inc. All rights reserved. + */ + +#include +#include +#include +#include +#include + +#include "rte_ptdma_rawdev.h" +#include "ptdma_rawdev_spec.h" +#include "ptdma_pmd_private.h" + +RTE_LOG_REGISTER(ptdma_pmd_logtype, rawdev.ptdma, INFO); + +uint8_t ptdma_rawdev_driver_id; +static struct rte_pci_driver ptdma_pmd_drv; + +#define AMD_VENDOR_ID 0x1022 +#define PTDMA_DEVICE_ID 0x1498 +#define COMPLETION_SZ sizeof(__m128i) + +static const struct rte_pci_id pci_id_ptdma_map[] = { + { RTE_PCI_DEVICE(AMD_VENDOR_ID, PTDMA_DEVICE_ID) }, + { .vendor_id = 0, /* sentinel */ }, +}; + +static const char * const xstat_names[] = { + "failed_enqueues", "successful_enqueues", + "copies_started", "copies_completed" +}; + +static int +ptdma_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config, + size_t config_size) +{ + struct rte_ptdma_rawdev_config *params = config; + struct rte_ptdma_rawdev *ptdma_priv = dev->dev_private; + + if (dev->started) + return -EBUSY; + if (params == NULL || config_size != sizeof(*params)) + return -EINVAL; + if (params->ring_size > 8192 || params->ring_size < 64 || + !rte_is_power_of_2(params->ring_size)) + return -EINVAL; + ptdma_priv->ring_size = 
params->ring_size; + ptdma_priv->hdls_disable = params->hdls_disable; + ptdma_priv->hdls = rte_zmalloc_socket("ptdma_hdls", + ptdma_priv->ring_size * sizeof(*ptdma_priv->hdls), + RTE_CACHE_LINE_SIZE, rte_socket_id()); + return 0; +} + +static int +ptdma_rawdev_remove(struct rte_pci_device *dev); + +int +ptdma_xstats_get(const struct rte_rawdev *dev, const unsigned int ids[], + uint64_t values[], unsigned int n) +{ + const struct rte_ptdma_rawdev *ptdma = dev->dev_private; + const uint64_t *stats = (const void *)&ptdma->xstats; + unsigned int i; + + for (i = 0; i < n; i++) { + if (ids[i] >= sizeof(ptdma->xstats)/sizeof(*stats)) + values[i] = 0; + else + values[i] = stats[ids[i]]; + } + return n; +} + +int +ptdma_xstats_get_names(const struct rte_rawdev *dev, + struct rte_rawdev_xstats_name *names, + unsigned int size) +{ + unsigned int i; + + RTE_SET_USED(dev); + if (size < RTE_DIM(xstat_names)) + return RTE_DIM(xstat_names); + for (i = 0; i < RTE_DIM(xstat_names); i++) + strlcpy(names[i].name, xstat_names[i], sizeof(names[i].name)); + return RTE_DIM(xstat_names); +} + +int +ptdma_xstats_reset(struct rte_rawdev *dev, const uint32_t *ids, + uint32_t nb_ids) +{ + struct rte_ptdma_rawdev *ptdma = dev->dev_private; + uint64_t *stats = (void *)&ptdma->xstats; + unsigned int i; + + if (!ids) { + memset(&ptdma->xstats, 0, sizeof(ptdma->xstats)); + return 0; + } + for (i = 0; i < nb_ids; i++) + if (ids[i] < sizeof(ptdma->xstats)/sizeof(*stats)) + stats[ids[i]] = 0; + return 0; +} + +static int +ptdma_dev_start(struct rte_rawdev *dev) +{ + RTE_SET_USED(dev); + return 0; +} + +static void +ptdma_dev_stop(struct rte_rawdev *dev) +{ + RTE_SET_USED(dev); +} + +static int +ptdma_dev_close(struct rte_rawdev *dev __rte_unused) +{ + return 0; +} + +static int +ptdma_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info, + size_t dev_info_size) +{ + struct rte_ptdma_rawdev_config *cfg = dev_info; + struct rte_ptdma_rawdev *ptdma = dev->dev_private; + + if (dev_info == NULL || 
dev_info_size != sizeof(*cfg)) + return -EINVAL; + cfg->ring_size = ptdma->ring_size; + cfg->hdls_disable = ptdma->hdls_disable; + return 0; +} + +static int +ptdma_rawdev_create(const char *name, struct rte_pci_device *dev) +{ + static const struct rte_rawdev_ops ptdma_rawdev_ops = { + .dev_configure = ptdma_dev_configure, + .dev_start = ptdma_dev_start, + .dev_stop = ptdma_dev_stop, + .dev_close = ptdma_dev_close, + .dev_info_get = ptdma_dev_info_get, + .xstats_get = ptdma_xstats_get, + .xstats_get_names = ptdma_xstats_get_names, + .xstats_reset = ptdma_xstats_reset, + .dev_selftest = ptdma_rawdev_test, + }; + struct rte_rawdev *rawdev = NULL; + struct rte_ptdma_rawdev *ptdma_priv = NULL; + int ret = 0; + if (!name) { + PTDMA_PMD_ERR("Invalid name of the device!"); + ret = -EINVAL; + goto cleanup; + } + /* Allocate device structure */ + rawdev = rte_rawdev_pmd_allocate(name, sizeof(struct rte_rawdev), + rte_socket_id()); + if (rawdev == NULL) { + PTDMA_PMD_ERR("Unable to allocate raw device"); + ret = -ENOMEM; + goto cleanup; + } + + rawdev->dev_id = ptdma_rawdev_driver_id++; + PTDMA_PMD_INFO("dev_id = %d", rawdev->dev_id); + PTDMA_PMD_INFO("driver_name = %s", dev->device.driver->name); + + rawdev->dev_ops = &ptdma_rawdev_ops; + rawdev->device = &dev->device; + rawdev->driver_name = dev->device.driver->name; + + ptdma_priv = rte_zmalloc_socket("ptdma_priv", sizeof(*ptdma_priv), + RTE_CACHE_LINE_SIZE, rte_socket_id()); + rawdev->dev_private = ptdma_priv; + ptdma_priv->rawdev = rawdev; + ptdma_priv->ring_size = 0; + ptdma_priv->pci = *dev; + + /* device is valid, add queue details */ + if (ptdma_add_queue(ptdma_priv)) + goto init_error; + + return 0; + +cleanup: + if (rawdev) + rte_rawdev_pmd_release(rawdev); + return ret; +init_error: + PTDMA_PMD_ERR("driver %s(): failed", __func__); + ptdma_rawdev_remove(dev); + return -EFAULT; +} + +static int +ptdma_rawdev_destroy(const char *name) +{ + int ret; + struct rte_rawdev *rdev; + if (!name) { + PTDMA_PMD_ERR("Invalid 
device name"); + return -EINVAL; + } + rdev = rte_rawdev_pmd_get_named_dev(name); + if (!rdev) { + PTDMA_PMD_ERR("Invalid device name (%s)", name); + return -EINVAL; + } + + if (rdev->dev_private != NULL) + rte_free(rdev->dev_private); + + /* rte_rawdev_close is called by pmd_release */ + ret = rte_rawdev_pmd_release(rdev); + + if (ret) + PTDMA_PMD_DEBUG("Device cleanup failed"); + return 0; +} +static int +ptdma_rawdev_probe(struct rte_pci_driver *drv, struct rte_pci_device *dev) +{ + char name[32]; + int ret = 0; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + PTDMA_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node); + + dev->device.driver = &drv->driver; + ret = ptdma_rawdev_create(name, dev); + return ret; +} + +static int +ptdma_rawdev_remove(struct rte_pci_device *dev) +{ + char name[32]; + int ret; + + rte_pci_device_name(&dev->addr, name, sizeof(name)); + PTDMA_PMD_INFO("Closing %s on NUMA node %d", + name, dev->device.numa_node); + ret = ptdma_rawdev_destroy(name); + return ret; +} + +static struct rte_pci_driver ptdma_pmd_drv = { + .id_table = pci_id_ptdma_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC, + .probe = ptdma_rawdev_probe, + .remove = ptdma_rawdev_remove, +}; + +RTE_PMD_REGISTER_PCI(PTDMA_PMD_RAWDEV_NAME, ptdma_pmd_drv); +RTE_PMD_REGISTER_PCI_TABLE(PTDMA_PMD_RAWDEV_NAME, pci_id_ptdma_map); +RTE_PMD_REGISTER_KMOD_DEP(PTDMA_PMD_RAWDEV_NAME, "* igb_uio | uio_pci_generic"); + diff --git a/drivers/raw/ptdma/ptdma_rawdev_spec.h b/drivers/raw/ptdma/ptdma_rawdev_spec.h new file mode 100644 index 0000000000..73511bec95 --- /dev/null +++ b/drivers/raw/ptdma/ptdma_rawdev_spec.h @@ -0,0 +1,362 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Advanced Micro Devices, Inc. All rights reserved. 
+ */ + +#ifndef __PT_DEV_H__ +#define __PT_DEV_H__ + +#include +#include +#include +#include +#include +#include + +#ifdef __cplusplus +extern "C" { +#endif + +#define BIT(nr) (1 << (nr)) + +#define BITS_PER_LONG (__SIZEOF_LONG__ * 8) +#define GENMASK(h, l) (((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h)))) + +#define MAX_HW_QUEUES 1 + +/* Register Mappings */ + +#define CMD_QUEUE_PRIO_OFFSET 0x00 +#define CMD_REQID_CONFIG_OFFSET 0x04 +#define CMD_TIMEOUT_OFFSET 0x08 +#define CMD_TIMEOUT_GRANULARITY 0x0C +#define CMD_PTDMA_VERSION 0x10 + +#define CMD_Q_CONTROL_BASE 0x0000 +#define CMD_Q_TAIL_LO_BASE 0x0004 +#define CMD_Q_HEAD_LO_BASE 0x0008 +#define CMD_Q_INT_ENABLE_BASE 0x000C +#define CMD_Q_INTERRUPT_STATUS_BASE 0x0010 + +#define CMD_Q_STATUS_BASE 0x0100 +#define CMD_Q_INT_STATUS_BASE 0x0104 +#define CMD_Q_DMA_STATUS_BASE 0x0108 +#define CMD_Q_DMA_READ_STATUS_BASE 0x010C +#define CMD_Q_DMA_WRITE_STATUS_BASE 0x0110 +#define CMD_Q_ABORT_BASE 0x0114 +#define CMD_Q_AX_CACHE_BASE 0x0118 + +#define CMD_CONFIG_OFFSET 0x1120 +#define CMD_CLK_GATE_CTL_OFFSET 0x6004 + +#define CMD_DESC_DW0_VAL 0x500012 + +/* Address offset for virtual queue registers */ +#define CMD_Q_STATUS_INCR 0x1000 + +/* Bit masks */ +#define CMD_CONFIG_REQID 0 +#define CMD_TIMEOUT_DISABLE 0 +#define CMD_CLK_DYN_GATING_DIS 0 +#define CMD_CLK_SW_GATE_MODE 0 +#define CMD_CLK_GATE_CTL 0 +#define CMD_QUEUE_PRIO GENMASK(2, 1) +#define CMD_CONFIG_VHB_EN BIT(0) +#define CMD_CLK_DYN_GATING_EN BIT(0) +#define CMD_CLK_HW_GATE_MODE BIT(0) +#define CMD_CLK_GATE_ON_DELAY BIT(12) +#define CMD_CLK_GATE_OFF_DELAY BIT(12) + +#define CMD_CLK_GATE_CONFIG (CMD_CLK_GATE_CTL | \ + CMD_CLK_HW_GATE_MODE | \ + CMD_CLK_GATE_ON_DELAY | \ + CMD_CLK_DYN_GATING_EN | \ + CMD_CLK_GATE_OFF_DELAY) + +#define CMD_Q_LEN 32 +#define CMD_Q_RUN BIT(0) +#define CMD_Q_HALT BIT(1) +#define CMD_Q_MEM_LOCATION BIT(2) +#define CMD_Q_SIZE GENMASK(4, 0) +#define CMD_Q_SHIFT GENMASK(1, 0) +#define COMMANDS_PER_QUEUE 8192 + + +#define 
QUEUE_SIZE_VAL ((ffs(COMMANDS_PER_QUEUE) - 2) & \ + CMD_Q_SIZE) +#define Q_PTR_MASK (2 << (QUEUE_SIZE_VAL + 5) - 1) +#define Q_DESC_SIZE sizeof(struct ptdma_desc) +#define Q_SIZE(n) (COMMANDS_PER_QUEUE * (n)) + +#define INT_COMPLETION BIT(0) +#define INT_ERROR BIT(1) +#define INT_QUEUE_STOPPED BIT(2) +#define INT_EMPTY_QUEUE BIT(3) +#define SUPPORTED_INTERRUPTS (INT_COMPLETION | INT_ERROR) +#define ALL_INTERRUPTS (INT_COMPLETION | INT_ERROR | \ + INT_QUEUE_STOPPED) + +/****** Local Storage Block ******/ +#define LSB_START 0 +#define LSB_END 127 +#define LSB_COUNT (LSB_END - LSB_START + 1) + +#define LSB_REGION_WIDTH 5 +#define MAX_LSB_CNT 8 + +#define LSB_SIZE 16 +#define LSB_ITEM_SIZE 128 +#define SLSB_MAP_SIZE (MAX_LSB_CNT * LSB_SIZE) +#define LSB_ENTRY_NUMBER(LSB_ADDR) (LSB_ADDR / LSB_ITEM_SIZE) + + +#define PT_DMAPOOL_MAX_SIZE 64 +#define PT_DMAPOOL_ALIGN BIT(5) + +#define PT_PASSTHRU_BLOCKSIZE 512 + +/* General PTDMA Defines */ + +#define PTDMA_SB_BYTES 32 +#define PTDMA_ENGINE_PASSTHRU 0x5 + +/* Word 0 */ +#define PTDMA_CMD_DW0(p) ((p)->dw0) +#define PTDMA_CMD_SOC(p) (PTDMA_CMD_DW0(p).soc) +#define PTDMA_CMD_IOC(p) (PTDMA_CMD_DW0(p).ioc) +#define PTDMA_CMD_INIT(p) (PTDMA_CMD_DW0(p).init) +#define PTDMA_CMD_EOM(p) (PTDMA_CMD_DW0(p).eom) +#define PTDMA_CMD_FUNCTION(p) (PTDMA_CMD_DW0(p).function) +#define PTDMA_CMD_ENGINE(p) (PTDMA_CMD_DW0(p).engine) +#define PTDMA_CMD_PROT(p) (PTDMA_CMD_DW0(p).prot) + +/* Word 1 */ +#define PTDMA_CMD_DW1(p) ((p)->length) +#define PTDMA_CMD_LEN(p) (PTDMA_CMD_DW1(p)) + +/* Word 2 */ +#define PTDMA_CMD_DW2(p) ((p)->src_lo) +#define PTDMA_CMD_SRC_LO(p) (PTDMA_CMD_DW2(p)) + +/* Word 3 */ +#define PTDMA_CMD_DW3(p) ((p)->dw3) +#define PTDMA_CMD_SRC_MEM(p) ((p)->dw3.src_mem) +#define PTDMA_CMD_SRC_HI(p) ((p)->dw3.src_hi) +#define PTDMA_CMD_LSB_ID(p) ((p)->dw3.lsb_cxt_id) +#define PTDMA_CMD_FIX_SRC(p) ((p)->dw3.fixed) + +/* Words 4/5 */ +#define PTDMA_CMD_DST_LO(p) ((p)->dst_lo) +#define PTDMA_CMD_DW5(p) ((p)->dw5.dst_hi) +#define 
PTDMA_CMD_DST_HI(p) (PTDMA_CMD_DW5(p)) +#define PTDMA_CMD_DST_MEM(p) ((p)->dw5.dst_mem) +#define PTDMA_CMD_FIX_DST(p) ((p)->dw5.fixed) + +/* bitmap */ +enum { + BITS_PER_WORD = sizeof(unsigned long) * CHAR_BIT +}; + +#define WORD_OFFSET(b) ((b) / BITS_PER_WORD) +#define BIT_OFFSET(b) ((b) % BITS_PER_WORD) + +#define PTDMA_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d)) +#define PTDMA_BITMAP_SIZE(nr) \ + PTDMA_DIV_ROUND_UP(nr, CHAR_BIT * sizeof(unsigned long)) + +#define PTDMA_BITMAP_FIRST_WORD_MASK(start) \ + (~0UL << ((start) & (BITS_PER_WORD - 1))) +#define PTDMA_BITMAP_LAST_WORD_MASK(nbits) \ + (~0UL >> (-(nbits) & (BITS_PER_WORD - 1))) + +#define __ptdma_round_mask(x, y) ((typeof(x))((y)-1)) +#define ptdma_round_down(x, y) ((x) & ~__ptdma_round_mask(x, y)) + +/** PTDMA registers Write/Read */ +static inline void ptdma_pci_reg_write(void *base, int offset, + uint32_t value) +{ + volatile void *reg_addr = ((uint8_t *)base + offset); + rte_write32((rte_cpu_to_le_32(value)), reg_addr); +} + +static inline uint32_t ptdma_pci_reg_read(void *base, int offset) +{ + volatile void *reg_addr = ((uint8_t *)base + offset); + return rte_le_to_cpu_32(rte_read32(reg_addr)); +} + +#define PTDMA_READ_REG(hw_addr, reg_offset) \ + ptdma_pci_reg_read(hw_addr, reg_offset) + +#define PTDMA_WRITE_REG(hw_addr, reg_offset, value) \ + ptdma_pci_reg_write(hw_addr, reg_offset, value) + +/** + * A structure describing a PTDMA command queue. 
+ */ +struct ptdma_cmd_queue { + struct rte_ptdma_rawdev *dev; + char memz_name[RTE_MEMZONE_NAMESIZE]; + + /* Queue identifier */ + uint64_t id; /**< queue id */ + uint64_t qidx; /**< queue index */ + uint64_t qsize; /**< queue size */ + + /* Queue address */ + struct ptdma_desc *qbase_desc; + void *qbase_addr; + phys_addr_t qbase_phys_addr; + /**< queue-page registers addr */ + void *reg_base; + uint32_t qcontrol; + /**< queue ctrl reg */ + uint32_t head_offset; + uint32_t tail_offset; + + int lsb; + /**< lsb region assigned to queue */ + unsigned long lsbmask; + /**< lsb regions queue can access */ + unsigned long lsbmap[PTDMA_BITMAP_SIZE(LSB_COUNT)]; + /**< all lsb resources which queue is using */ + uint32_t sb_key; + /**< lsb assigned for queue */ +} __rte_cache_aligned; + +/* Passthru engine */ + +#define PTDMA_PT_BYTESWAP(p) ((p)->pt.byteswap) +#define PTDMA_PT_BITWISE(p) ((p)->pt.bitwise) + +/** + * passthru_bitwise - type of bitwise passthru operation + * + * @PTDMA_PASSTHRU_BITWISE_NOOP: no bitwise operation performed + * @PTDMA_PASSTHRU_BITWISE_AND: perform bitwise AND of src with mask + * @PTDMA_PASSTHRU_BITWISE_OR: perform bitwise OR of src with mask + * @PTDMA_PASSTHRU_BITWISE_XOR: perform bitwise XOR of src with mask + * @PTDMA_PASSTHRU_BITWISE_MASK: overwrite with mask + */ +enum ptdma_passthru_bitwise { + PTDMA_PASSTHRU_BITWISE_NOOP = 0, + PTDMA_PASSTHRU_BITWISE_AND, + PTDMA_PASSTHRU_BITWISE_OR, + PTDMA_PASSTHRU_BITWISE_XOR, + PTDMA_PASSTHRU_BITWISE_MASK, + PTDMA_PASSTHRU_BITWISE__LAST, +}; + +/** + * ptdma_passthru_byteswap - type of byteswap passthru operation + * + * @PTDMA_PASSTHRU_BYTESWAP_NOOP: no byte swapping performed + * @PTDMA_PASSTHRU_BYTESWAP_32BIT: swap bytes within 32-bit words + * @PTDMA_PASSTHRU_BYTESWAP_256BIT: swap bytes within 256-bit words + */ +enum ptdma_passthru_byteswap { + PTDMA_PASSTHRU_BYTESWAP_NOOP = 0, + PTDMA_PASSTHRU_BYTESWAP_32BIT, + PTDMA_PASSTHRU_BYTESWAP_256BIT, + PTDMA_PASSTHRU_BYTESWAP__LAST, +}; + +/** + * 
PTDMA passthru + */ +struct ptdma_passthru { + phys_addr_t src_addr; + phys_addr_t dest_addr; + enum ptdma_passthru_bitwise bit_mod; + enum ptdma_passthru_byteswap byte_swap; + int len; +}; + +union ptdma_function { + struct { + uint16_t byteswap:2; + uint16_t bitwise:3; + uint16_t reflect:2; + uint16_t rsvd:8; + } pt; + uint16_t raw; +}; + +/** + * ptdma memory type + */ +enum ptdma_memtype { + PTDMA_MEMTYPE_SYSTEM = 0, + PTDMA_MEMTYPE_SB, + PTDMA_MEMTYPE_LOCAL, + PTDMA_MEMTYPE_LAST, +}; + +/* + * descriptor for PTDMA commands + * 8 32-bit words: + * word 0: function; engine; control bits + * word 1: length of source data + * word 2: low 32 bits of source pointer + * word 3: upper 16 bits of source pointer; source memory type + * word 4: low 32 bits of destination pointer + * word 5: upper 16 bits of destination pointer; destination memory type + * word 6: reserved 32 bits + * word 7: reserved 32 bits + */ + +union dword0 { + struct { + uint32_t soc:1; + uint32_t ioc:1; + uint32_t rsvd1:1; + uint32_t init:1; + uint32_t eom:1; + uint32_t function:15; + uint32_t engine:4; + uint32_t prot:1; + uint32_t rsvd2:7; + }; + uint32_t val; +}; + +struct dword3 { + uint32_t src_hi:16; + uint32_t src_mem:2; + uint32_t lsb_cxt_id:8; + uint32_t rsvd1:5; + uint32_t fixed:1; +}; + +struct dword5 { + uint32_t dst_hi:16; + uint32_t dst_mem:2; + uint32_t rsvd1:13; + uint32_t fixed:1; +}; + +struct ptdma_desc { + union dword0 dw0; + uint32_t length; + uint32_t src_lo; + struct dword3 dw3; + uint32_t dst_lo; + struct dword5 dw5; + uint32_t rsvd1; + uint32_t rsvd2; +}; + + +static inline uint32_t +low32_value(unsigned long addr) +{ + return ((uint64_t)addr) & 0x0ffffffff; +} + +static inline uint32_t +high32_value(unsigned long addr) +{ + return ((uint64_t)addr >> 32) & 0x00000ffff; +} + +#endif diff --git a/drivers/raw/ptdma/ptdma_rawdev_test.c b/drivers/raw/ptdma/ptdma_rawdev_test.c new file mode 100644 index 0000000000..fbbcd66c8d --- /dev/null +++ 
b/drivers/raw/ptdma/ptdma_rawdev_test.c @@ -0,0 +1,272 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Advanced Micro Devices, Inc. All rights reserved. + **/ + +#include +#include +#include +#include "rte_rawdev.h" +#include "rte_ptdma_rawdev.h" +#include "ptdma_pmd_private.h" + +#define MAX_SUPPORTED_RAWDEVS 16 +#define TEST_SKIPPED 77 + + +static struct rte_mempool *pool; +static unsigned short expected_ring_size[MAX_SUPPORTED_RAWDEVS]; + +#define PRINT_ERR(...) print_err(__func__, __LINE__, __VA_ARGS__) + +static inline int +__rte_format_printf(3, 4) +print_err(const char *func, int lineno, const char *format, ...) +{ + va_list ap; + int ret; + + ret = fprintf(stderr, "In %s:%d - ", func, lineno); + va_start(ap, format); + ret += vfprintf(stderr, format, ap); + va_end(ap); + + return ret; +} + +static int +test_enqueue_copies(int dev_id) +{ + const unsigned int length = 1024; + unsigned int i = 0; + do { + struct rte_mbuf *src, *dst; + char *src_data, *dst_data; + struct rte_mbuf *completed[2] = {0}; + + /* test doing a single copy */ + src = rte_pktmbuf_alloc(pool); + dst = rte_pktmbuf_alloc(pool); + src->data_len = src->pkt_len = length; + dst->data_len = dst->pkt_len = length; + src_data = rte_pktmbuf_mtod(src, char *); + dst_data = rte_pktmbuf_mtod(dst, char *); + + for (i = 0; i < length; i++) + src_data[i] = rand() & 0xFF; + + if (rte_ptdma_enqueue_copy(dev_id, + src->buf_iova + src->data_off, + dst->buf_iova + dst->data_off, + length, + (uintptr_t)src, + (uintptr_t)dst) != 1) { + PRINT_ERR("Error with rte_ptdma_enqueue_copy - 1\n"); + return -1; + } + rte_ptdma_perform_ops(dev_id); + usleep(10); + + if (rte_ptdma_completed_ops(dev_id, 1, (void *)&completed[0], + (void *)&completed[1]) != 1) { + PRINT_ERR("Error with rte_ptdma_completed_ops - 1\n"); + return -1; + } + if (completed[0] != src || completed[1] != dst) { + PRINT_ERR("Error with completions: got (%p, %p), not (%p,%p)\n", + completed[0], completed[1], src, dst); + return -1; 
+ } + + for (i = 0; i < length; i++) + if (dst_data[i] != src_data[i]) { + PRINT_ERR("Data mismatch at char %u - 1\n", i); + return -1; + } + rte_pktmbuf_free(src); + rte_pktmbuf_free(dst); + + + } while (0); + + /* test doing multiple copies */ + do { + struct rte_mbuf *srcs[32], *dsts[32]; + struct rte_mbuf *completed_src[64]; + struct rte_mbuf *completed_dst[64]; + unsigned int j; + + for (i = 0; i < RTE_DIM(srcs) ; i++) { + char *src_data; + + srcs[i] = rte_pktmbuf_alloc(pool); + dsts[i] = rte_pktmbuf_alloc(pool); + srcs[i]->data_len = srcs[i]->pkt_len = length; + dsts[i]->data_len = dsts[i]->pkt_len = length; + src_data = rte_pktmbuf_mtod(srcs[i], char *); + + for (j = 0; j < length; j++) + src_data[j] = rand() & 0xFF; + + if (rte_ptdma_enqueue_copy(dev_id, + srcs[i]->buf_iova + srcs[i]->data_off, + dsts[i]->buf_iova + dsts[i]->data_off, + length, + (uintptr_t)srcs[i], + (uintptr_t)dsts[i]) != 1) { + PRINT_ERR("Error with rte_ptdma_enqueue_copy for buffer %u\n", + i); + return -1; + } + } + rte_ptdma_perform_ops(dev_id); + usleep(100); + + if (rte_ptdma_completed_ops(dev_id, 64, (void *)completed_src, + (void *)completed_dst) != RTE_DIM(srcs)) { + PRINT_ERR("Error with rte_ptdma_completed_ops\n"); + return -1; + } + + for (i = 0; i < RTE_DIM(srcs) ; i++) { + char *src_data, *dst_data; + if (completed_src[i] != srcs[i]) { + PRINT_ERR("Error with source pointer %u\n", i); + return -1; + } + if (completed_dst[i] != dsts[i]) { + PRINT_ERR("Error with dest pointer %u\n", i); + return -1; + } + + src_data = rte_pktmbuf_mtod(srcs[i], char *); + dst_data = rte_pktmbuf_mtod(dsts[i], char *); + for (j = 0; j < length; j++) + if (src_data[j] != dst_data[j]) { + PRINT_ERR("Error with copy of packet %u, byte %u\n", + i, j); + return -1; + } + + rte_pktmbuf_free(srcs[i]); + rte_pktmbuf_free(dsts[i]); + } + + } while (0); + + return 0; +} + +int +ptdma_rawdev_test(uint16_t dev_id) +{ +#define PTDMA_TEST_RINGSIZE 512 + struct rte_ptdma_rawdev_config p = { .ring_size = -1 }; + 
struct rte_rawdev_info info = { .dev_private = &p }; + struct rte_rawdev_xstats_name *snames = NULL; + uint64_t *stats = NULL; + unsigned int *ids = NULL; + unsigned int nb_xstats; + unsigned int i; + + if (dev_id >= MAX_SUPPORTED_RAWDEVS) { + printf("Skipping test. Cannot test rawdevs with id's greater than %d\n", + MAX_SUPPORTED_RAWDEVS); + return TEST_SKIPPED; + } + + rte_rawdev_info_get(dev_id, &info, sizeof(p)); + if (p.ring_size != expected_ring_size[dev_id]) { + PRINT_ERR("Error, initial ring size is not as expected (Actual: %d, Expected: %d)\n", + (int)p.ring_size, expected_ring_size[dev_id]); + return -1; + } + + p.ring_size = PTDMA_TEST_RINGSIZE; + if (rte_rawdev_configure(dev_id, &info, sizeof(p)) != 0) { + PRINT_ERR("Error with rte_rawdev_configure()\n"); + return -1; + } + rte_rawdev_info_get(dev_id, &info, sizeof(p)); + if (p.ring_size != PTDMA_TEST_RINGSIZE) { + PRINT_ERR("Error, ring size is not %d (%d)\n", + PTDMA_TEST_RINGSIZE, (int)p.ring_size); + return -1; + } + expected_ring_size[dev_id] = p.ring_size; + + if (rte_rawdev_start(dev_id) != 0) { + PRINT_ERR("Error with rte_rawdev_start()\n"); + return -1; + } + + pool = rte_pktmbuf_pool_create("TEST_PTDMA_POOL", + 256, /* n == num elements */ + 32, /* cache size */ + 0, /* priv size */ + 2048, /* data room size */ + info.socket_id); + if (pool == NULL) { + PRINT_ERR("Error with mempool creation\n"); + return -1; + } + + /* allocate memory for xstats names and values */ + nb_xstats = rte_rawdev_xstats_names_get(dev_id, NULL, 0); + + snames = malloc(sizeof(*snames) * nb_xstats); + if (snames == NULL) { + PRINT_ERR("Error allocating xstat names memory\n"); + goto err; + } + rte_rawdev_xstats_names_get(dev_id, snames, nb_xstats); + + ids = malloc(sizeof(*ids) * nb_xstats); + if (ids == NULL) { + PRINT_ERR("Error allocating xstat ids memory\n"); + goto err; + } + for (i = 0; i < nb_xstats; i++) + ids[i] = i; + + stats = malloc(sizeof(*stats) * nb_xstats); + if (stats == NULL) { + PRINT_ERR("Error 
allocating xstat memory\n"); + goto err; + } + + /* run the test cases */ + printf("Running Copy Tests\n"); + for (i = 0; i < 100; i++) { + unsigned int j; + + if (test_enqueue_copies(dev_id) != 0) + goto err; + + rte_rawdev_xstats_get(dev_id, ids, stats, nb_xstats); + for (j = 0; j < nb_xstats; j++) + printf("%s: %"PRIu64" ", snames[j].name, stats[j]); + printf("\r"); + } + printf("\n"); + + rte_rawdev_stop(dev_id); + if (rte_rawdev_xstats_reset(dev_id, NULL, 0) != 0) { + PRINT_ERR("Error resetting xstat values\n"); + goto err; + } + + rte_mempool_free(pool); + free(snames); + free(stats); + free(ids); + return 0; + +err: + rte_rawdev_stop(dev_id); + rte_rawdev_xstats_reset(dev_id, NULL, 0); + rte_mempool_free(pool); + free(snames); + free(stats); + free(ids); + return -1; +} diff --git a/drivers/raw/ptdma/rte_ptdma_rawdev.h b/drivers/raw/ptdma/rte_ptdma_rawdev.h new file mode 100644 index 0000000000..84eccbc4e8 --- /dev/null +++ b/drivers/raw/ptdma/rte_ptdma_rawdev.h @@ -0,0 +1,124 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Advanced Micro Devices, Inc. All rights reserved. + */ + +#ifndef _RTE_PTMDA_RAWDEV_H_ +#define _RTE_PTMDA_RAWDEV_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * @file rte_ptdma_rawdev.h + * + * Definitions for using the ptdma rawdev device driver + * + * @warning + * @b EXPERIMENTAL: these structures and APIs may change without prior notice + */ + +#include + +/** Name of the device driver */ +#define PTDMA_PMD_RAWDEV_NAME rawdev_ptdma +/** String reported as the device driver name by rte_rawdev_info_get() */ +#define PTDMA_PMD_RAWDEV_NAME_STR "rawdev_ptdma" + +/** + * Configuration structure for a ptdma rawdev instance + * + * This structure is to be passed as the ".dev_private" parameter when + * calling the rte_rawdev_info_get() and rte_rawdev_configure() APIs on + * a ptdma rawdev instance. 
+ */ +struct rte_ptdma_rawdev_config { + unsigned short ring_size; /**< size of job submission descriptor ring */ + bool hdls_disable; /**< if set, ignore user-supplied handle params */ +}; + +/** + * Enqueue a copy operation onto the ptdma device + * + * This queues up a copy operation to be performed by hardware, but does not + * trigger hardware to begin that operation. + * + * @param dev_id + * The rawdev device id of the ptdma instance + * @param src + * The physical address of the source buffer + * @param dst + * The physical address of the destination buffer + * @param length + * The length of the data to be copied + * @param src_hdl + * An opaque handle for the source data, to be returned when this operation + * has been completed and the user polls for the completion details. + * NOTE: If hdls_disable configuration option for the device is set, this + * parameter is ignored. + * @param dst_hdl + * An opaque handle for the destination data, to be returned when this + * operation has been completed and the user polls for the completion details. + * NOTE: If hdls_disable configuration option for the device is set, this + * parameter is ignored. + * @return + * Number of operations enqueued, either 0 or 1 + */ +static inline int +__rte_experimental +rte_ptdma_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst, + unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl); + + +/** + * Trigger hardware to begin performing enqueued operations + * + * This API is used to write to the hardware to trigger it + * to begin the operations previously enqueued by rte_ptdma_enqueue_copy() + * + * @param dev_id + * The rawdev device id of the ptdma instance + */ +static inline void +__rte_experimental +rte_ptdma_perform_ops(int dev_id); + +/** + * Returns details of operations that have been completed + * + * This function returns number of newly-completed operations. 
+ * + * @param dev_id + * The rawdev device id of the ptdma instance + * @param max_copies + * The number of entries which can fit in the src_hdls and dst_hdls + * arrays, i.e. max number of completed operations to report. + * NOTE: If hdls_disable configuration option for the device is set, this + * parameter is ignored. + * @param src_hdls + * Array to hold the source handle parameters of the completed ops. + * NOTE: If hdls_disable configuration option for the device is set, this + * parameter is ignored. + * @param dst_hdls + * Array to hold the destination handle parameters of the completed ops. + * NOTE: If hdls_disable configuration option for the device is set, this + * parameter is ignored. + * @return + * -1 on error, with rte_errno set appropriately. + * Otherwise number of completed operations i.e. number of entries written + * to the src_hdls and dst_hdls array parameters. + */ +static inline int +__rte_experimental +rte_ptdma_completed_ops(int dev_id, uint8_t max_copies, + uintptr_t *src_hdls, uintptr_t *dst_hdls); + + +/* include the implementation details from a separate file */ +#include "rte_ptdma_rawdev_fns.h" + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_PTMDA_RAWDEV_H_ */ diff --git a/drivers/raw/ptdma/rte_ptdma_rawdev_fns.h b/drivers/raw/ptdma/rte_ptdma_rawdev_fns.h new file mode 100644 index 0000000000..f4dced3bef --- /dev/null +++ b/drivers/raw/ptdma/rte_ptdma_rawdev_fns.h @@ -0,0 +1,298 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Advanced Micro Devices, Inc. All rights reserved. 
+ */ +#ifndef _RTE_PTDMA_RAWDEV_FNS_H_ +#define _RTE_PTDMA_RAWDEV_FNS_H_ + +#include +#include +#include +#include +#include "ptdma_rawdev_spec.h" +#include "ptdma_pmd_private.h" + +/** + * @internal + * some statistics for tracking, if added/changed update xstats fns + */ +struct rte_ptdma_xstats { + uint64_t enqueue_failed; + uint64_t enqueued; + uint64_t started; + uint64_t completed; +}; + +/** + * @internal + * Structure representing a PTDMA device instance + */ +struct rte_ptdma_rawdev { + struct rte_rawdev *rawdev; + struct rte_ptdma_xstats xstats; + unsigned short ring_size; + + bool hdls_disable; + __m128i *hdls; /* completion handles for returning to user */ + unsigned short next_read; + unsigned short next_write; + + int id; /**< ptdma dev id on platform */ + struct ptdma_cmd_queue cmd_q[MAX_HW_QUEUES]; /**< ptdma queue */ + int cmd_q_count; /**< no. of ptdma Queues */ + struct rte_pci_device pci; /**< ptdma pci identifier */ + int qidx; + +}; + +static __rte_always_inline void +ptdma_dump_registers(int dev_id) +{ + struct rte_ptdma_rawdev *ptdma_priv = + (struct rte_ptdma_rawdev *)rte_rawdevs[dev_id].dev_private; + struct ptdma_cmd_queue *cmd_q; + uint32_t cur_head_offset; + uint32_t cur_tail_offset; + + cmd_q = &ptdma_priv->cmd_q[0]; + + PTDMA_PMD_DEBUG("cmd_q->head_offset = %d\n", cmd_q->head_offset); + PTDMA_PMD_DEBUG("cmd_q->tail_offset = %d\n", cmd_q->tail_offset); + PTDMA_PMD_DEBUG("cmd_q->id = %" PRIx64 "\n", cmd_q->id); + PTDMA_PMD_DEBUG("cmd_q->qidx = %" PRIx64 "\n", cmd_q->qidx); + PTDMA_PMD_DEBUG("cmd_q->qsize = %" PRIx64 "\n", cmd_q->qsize); + + cur_head_offset = PTDMA_READ_REG(cmd_q->reg_base, + CMD_Q_HEAD_LO_BASE); + cur_tail_offset = PTDMA_READ_REG(cmd_q->reg_base, + CMD_Q_TAIL_LO_BASE); + + PTDMA_PMD_DEBUG("cur_head_offset = %d\n", cur_head_offset); + PTDMA_PMD_DEBUG("cur_tail_offset = %d\n", cur_tail_offset); + PTDMA_PMD_DEBUG("Q_CONTROL_BASE = 0x%x\n", + PTDMA_READ_REG(cmd_q->reg_base, + CMD_Q_CONTROL_BASE)); + 
PTDMA_PMD_DEBUG("Q_STATUS_BASE = 0x%x\n", + PTDMA_READ_REG(cmd_q->reg_base, + CMD_Q_STATUS_BASE)); + PTDMA_PMD_DEBUG("Q_INT_STATUS_BASE = 0x%x\n", + PTDMA_READ_REG(cmd_q->reg_base, + CMD_Q_INT_STATUS_BASE)); + PTDMA_PMD_DEBUG("Q_DMA_STATUS_BASE = 0x%x\n", + PTDMA_READ_REG(cmd_q->reg_base, + CMD_Q_DMA_STATUS_BASE)); + PTDMA_PMD_DEBUG("Q_DMA_RD_STS_BASE = 0x%x\n", + PTDMA_READ_REG(cmd_q->reg_base, + CMD_Q_DMA_READ_STATUS_BASE)); + PTDMA_PMD_DEBUG("Q_DMA_WRT_STS_BASE = 0x%x\n", + PTDMA_READ_REG(cmd_q->reg_base, + CMD_Q_DMA_WRITE_STATUS_BASE)); +} + +static __rte_always_inline void +ptdma_perform_passthru(struct ptdma_passthru *pst, + struct ptdma_cmd_queue *cmd_q) +{ + struct ptdma_desc *desc; + union ptdma_function function; + + desc = &cmd_q->qbase_desc[cmd_q->qidx]; + + PTDMA_CMD_ENGINE(desc) = PTDMA_ENGINE_PASSTHRU; + + PTDMA_CMD_SOC(desc) = 0; + PTDMA_CMD_IOC(desc) = 0; + PTDMA_CMD_INIT(desc) = 0; + PTDMA_CMD_EOM(desc) = 0; + PTDMA_CMD_PROT(desc) = 0; + + function.raw = 0; + PTDMA_PT_BYTESWAP(&function) = pst->byte_swap; + PTDMA_PT_BITWISE(&function) = pst->bit_mod; + PTDMA_CMD_FUNCTION(desc) = function.raw; + PTDMA_CMD_LEN(desc) = pst->len; + + PTDMA_CMD_SRC_LO(desc) = (uint32_t)(pst->src_addr); + PTDMA_CMD_SRC_HI(desc) = high32_value(pst->src_addr); + PTDMA_CMD_SRC_MEM(desc) = PTDMA_MEMTYPE_SYSTEM; + + PTDMA_CMD_DST_LO(desc) = (uint32_t)(pst->dest_addr); + PTDMA_CMD_DST_HI(desc) = high32_value(pst->dest_addr); + PTDMA_CMD_DST_MEM(desc) = PTDMA_MEMTYPE_SYSTEM; + + cmd_q->qidx = (cmd_q->qidx + 1) % COMMANDS_PER_QUEUE; + +} + + +static __rte_always_inline int +ptdma_ops_to_enqueue(int dev_id, uint32_t op, uint64_t src, phys_addr_t dst, + unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl) +{ + struct rte_ptdma_rawdev *ptdma_priv = + (struct rte_ptdma_rawdev *)rte_rawdevs[dev_id].dev_private; + struct ptdma_cmd_queue *cmd_q; + struct ptdma_passthru pst; + uint32_t cmd_q_ctrl; + unsigned short write = ptdma_priv->next_write; + unsigned short read = 
ptdma_priv->next_read; + unsigned short mask = ptdma_priv->ring_size - 1; + unsigned short space = mask + read - write; + + cmd_q = &ptdma_priv->cmd_q[0]; + cmd_q_ctrl = PTDMA_READ_REG(cmd_q->reg_base, CMD_Q_CONTROL_BASE); + + if (cmd_q_ctrl & CMD_Q_RUN) { + /* Turn the queue off using control register */ + PTDMA_WRITE_REG(cmd_q->reg_base, CMD_Q_CONTROL_BASE, + cmd_q_ctrl & ~CMD_Q_RUN); + do { + cmd_q_ctrl = PTDMA_READ_REG(cmd_q->reg_base, + CMD_Q_CONTROL_BASE); + } while (!(cmd_q_ctrl & CMD_Q_HALT)); + } + + if (space == 0) { + ptdma_priv->xstats.enqueue_failed++; + return 0; + } + + ptdma_priv->next_write = write + 1; + write &= mask; + + if (!op) + pst.src_addr = src; + else + PTDMA_PMD_DEBUG("Operation not supported by PTDMA\n"); + + pst.dest_addr = dst; + pst.len = length; + pst.bit_mod = PTDMA_PASSTHRU_BITWISE_NOOP; + pst.byte_swap = PTDMA_PASSTHRU_BYTESWAP_NOOP; + + cmd_q = &ptdma_priv->cmd_q[0]; + + cmd_q->head_offset = (uint32_t)(PTDMA_READ_REG(cmd_q->reg_base, + CMD_Q_HEAD_LO_BASE)); + + ptdma_perform_passthru(&pst, cmd_q); + + cmd_q->tail_offset = (uint32_t)(cmd_q->qbase_phys_addr + cmd_q->qidx * + Q_DESC_SIZE); + rte_wmb(); + + /* Write the new tail address back to the queue register */ + PTDMA_WRITE_REG(cmd_q->reg_base, CMD_Q_TAIL_LO_BASE, + cmd_q->tail_offset); + + if (!ptdma_priv->hdls_disable) + ptdma_priv->hdls[write] = + _mm_set_epi64x((int64_t)dst_hdl, + (int64_t)src_hdl); + ptdma_priv->xstats.enqueued++; + + return 1; +} + +static __rte_always_inline int +ptdma_ops_to_dequeue(int dev_id, int max_copies, uintptr_t *src_hdls, + uintptr_t *dst_hdls) +{ + struct rte_ptdma_rawdev *ptdma_priv = + (struct rte_ptdma_rawdev *)rte_rawdevs[dev_id].dev_private; + struct ptdma_cmd_queue *cmd_q; + uint32_t cur_head_offset; + short end_read; + unsigned short count; + unsigned short read = ptdma_priv->next_read; + unsigned short write = ptdma_priv->next_write; + unsigned short mask = ptdma_priv->ring_size - 1; + int i = 0; + + cmd_q = &ptdma_priv->cmd_q[0]; + + 
cur_head_offset = PTDMA_READ_REG(cmd_q->reg_base, + CMD_Q_HEAD_LO_BASE); + + end_read = cur_head_offset - cmd_q->head_offset; + + if (end_read < 0) + end_read = COMMANDS_PER_QUEUE - cmd_q->head_offset + + cur_head_offset; + if (end_read < max_copies) + return 0; + + if (end_read != 0) + count = (write - (read & mask)) & mask; + else + return 0; + + if (ptdma_priv->hdls_disable) { + read += count; + goto end; + } + + if (count > max_copies) + count = max_copies; + + for (; i < count - 1; i += 2, read += 2) { + __m128i hdls0 = + _mm_load_si128(&ptdma_priv->hdls[read & mask]); + __m128i hdls1 = + _mm_load_si128(&ptdma_priv->hdls[(read + 1) & mask]); + _mm_storeu_si128((__m128i *)&src_hdls[i], + _mm_unpacklo_epi64(hdls0, hdls1)); + _mm_storeu_si128((__m128i *)&dst_hdls[i], + _mm_unpackhi_epi64(hdls0, hdls1)); + } + + for (; i < count; i++, read++) { + uintptr_t *hdls = + (uintptr_t *)&ptdma_priv->hdls[read & mask]; + src_hdls[i] = hdls[0]; + dst_hdls[i] = hdls[1]; + } +end: + ptdma_priv->next_read = read; + ptdma_priv->xstats.completed += count; + + return count; +} + +static inline int +rte_ptdma_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst, + unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl) +{ + return ptdma_ops_to_enqueue(dev_id, 0, src, dst, length, + src_hdl, dst_hdl); +} + +static inline void +rte_ptdma_perform_ops(int dev_id) +{ + struct rte_ptdma_rawdev *ptdma_priv = + (struct rte_ptdma_rawdev *)rte_rawdevs[dev_id].dev_private; + struct ptdma_cmd_queue *cmd_q; + uint32_t cmd_q_ctrl; + + cmd_q = &ptdma_priv->cmd_q[0]; + cmd_q_ctrl = PTDMA_READ_REG(cmd_q->reg_base, CMD_Q_CONTROL_BASE); + + /* Turn the queue on using control register */ + PTDMA_WRITE_REG(cmd_q->reg_base, CMD_Q_CONTROL_BASE, + cmd_q_ctrl | CMD_Q_RUN); + + ptdma_priv->xstats.started = ptdma_priv->xstats.enqueued; +} + +static inline int +rte_ptdma_completed_ops(int dev_id, uint8_t max_copies, + uintptr_t *src_hdls, uintptr_t *dst_hdls) +{ + int ret = 0; + + ret = 
ptdma_ops_to_dequeue(dev_id, max_copies, src_hdls, dst_hdls); + + return ret; +} + +#endif diff --git a/drivers/raw/ptdma/version.map b/drivers/raw/ptdma/version.map new file mode 100644 index 0000000000..45917242ca --- /dev/null +++ b/drivers/raw/ptdma/version.map @@ -0,0 +1,5 @@ +DPDK_21 { + + local: *; +}; + diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py index 74d16e4c4b..30c11e92ba 100755 --- a/usertools/dpdk-devbind.py +++ b/usertools/dpdk-devbind.py @@ -65,6 +65,8 @@ 'SVendor': None, 'SDevice': None} intel_ntb_icx = {'Class': '06', 'Vendor': '8086', 'Device': '347e', 'SVendor': None, 'SDevice': None} +amd_ptdma = {'Class': '10', 'Vendor': '1022', 'Device': '1498', + 'SVendor': None, 'SDevice': None} network_devices = [network_class, cavium_pkx, avp_vnic, ifpga_class] baseband_devices = [acceleration_class] @@ -74,7 +76,7 @@ compress_devices = [cavium_zip] regex_devices = [octeontx2_ree] misc_devices = [cnxk_bphy, cnxk_bphy_cgx, intel_ioat_bdw, intel_ioat_skx, intel_ioat_icx, intel_idxd_spr, - intel_ntb_skx, intel_ntb_icx, + intel_ntb_skx, intel_ntb_icx, amd_ptdma, octeontx2_dma] # global dict ethernet devices present. Dictionary indexed by PCI address. -- 2.25.1