From: Chengwen Feng
Date: Mon, 11 Oct 2021 15:33:45 +0800
Message-ID: <20211011073348.8235-4-fengchengwen@huawei.com>
In-Reply-To: <20211011073348.8235-1-fengchengwen@huawei.com>
References: <1625231891-2963-1-git-send-email-fengchengwen@huawei.com> <20211011073348.8235-1-fengchengwen@huawei.com>
Subject: [dpdk-dev] [PATCH v25 3/6] dmadev: add data plane API support

This patch adds the data plane API for dmadev.

Signed-off-by: Chengwen Feng
Acked-by: Bruce Richardson
Acked-by: Morten Brørup
Reviewed-by: Kevin Laatz
Reviewed-by: Conor Walsh
---
 doc/guides/prog_guide/dmadev.rst       |  22 ++
 doc/guides/rel_notes/release_21_11.rst |   2 +-
 lib/dmadev/meson.build                 |   1 +
 lib/dmadev/rte_dmadev.c                | 112 ++++++
 lib/dmadev/rte_dmadev.h                | 451 +++++++++++++++++++++++++
 lib/dmadev/rte_dmadev_core.h           |  81 +++++
 lib/dmadev/rte_dmadev_pmd.h            |   2 +
 lib/dmadev/version.map                 |   7 +
 8 files changed, 677 insertions(+), 1 deletion(-)
 create mode 100644 lib/dmadev/rte_dmadev_core.h

diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst
index 5c70ad3d6a..2e2a4bb62a 100644
--- a/doc/guides/prog_guide/dmadev.rst
+++ b/doc/guides/prog_guide/dmadev.rst
@@ -96,3 +96,25 @@ can be used to get the device info and supported features.
 
 Silent mode is a special device capability which does not require the
 application to invoke dequeue APIs.
+
+
+Enqueue / Dequeue APIs
+~~~~~~~~~~~~~~~~~~~~~~
+
+Enqueue APIs such as ``rte_dma_copy`` and ``rte_dma_fill`` can be used to
+enqueue operations to hardware. If an enqueue is successful, a ``ring_idx`` is
+returned. This ``ring_idx`` can be used by applications to track per-operation
+metadata in an application-defined circular ring.
+
+The ``rte_dma_submit`` API is used to issue the doorbell to hardware.
+Alternatively the ``RTE_DMA_OP_FLAG_SUBMIT`` flag can be passed to the enqueue
+APIs to also issue the doorbell to hardware.
+
+There are two dequeue APIs, ``rte_dma_completed`` and
+``rte_dma_completed_status``, which are used to obtain the results of the
+enqueue requests. ``rte_dma_completed`` will return the number of successfully
+completed operations. ``rte_dma_completed_status`` will return the number of
+completed operations along with the status of each operation (filled into the
+``status`` array passed by the user). These two APIs can also return the last
+completed operation's ``ring_idx`` which can help users track operations within
+their own application-defined rings.
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index f935a3f395..d1d7abf694 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -144,7 +144,7 @@ New Features
 * **Introduced dmadev library with:**
 
   * Device allocation functions.
-  * Control plane API.
+  * Control and data plane API.
 
 
 Removed Items
diff --git a/lib/dmadev/meson.build b/lib/dmadev/meson.build
index f8d54c6e74..d2fc85e8c7 100644
--- a/lib/dmadev/meson.build
+++ b/lib/dmadev/meson.build
@@ -3,4 +3,5 @@
 
 sources = files('rte_dmadev.c')
 headers = files('rte_dmadev.h')
+indirect_headers += files('rte_dmadev_core.h')
 driver_sdk_headers += files('rte_dmadev_pmd.h')
diff --git a/lib/dmadev/rte_dmadev.c b/lib/dmadev/rte_dmadev.c
index a6a5680d2b..4080ba63bd 100644
--- a/lib/dmadev/rte_dmadev.c
+++ b/lib/dmadev/rte_dmadev.c
@@ -17,6 +17,7 @@
 
 static int16_t dma_devices_max;
 
+struct rte_dma_fp_object *rte_dma_fp_objs;
 struct rte_dma_dev *rte_dma_devices;
 
 RTE_LOG_REGISTER_DEFAULT(rte_dma_logtype, INFO);
@@ -97,6 +98,38 @@ dma_find_by_name(const char *name)
 	return NULL;
 }
 
+static void dma_fp_object_dummy(struct rte_dma_fp_object *obj);
+
+static int
+dma_fp_data_prepare(void)
+{
+	size_t size;
+	void *ptr;
+	int i;
+
+	if (rte_dma_fp_objs != NULL)
+		return 0;
+
+	/* Fast-path objects must be cache-line aligned, but the pointer
+	 * returned by malloc may not be. Therefore, extra memory is
+	 * allocated to allow realignment.
+	 * note: We do not call posix_memalign/aligned_alloc because their
+	 * availability depends on the libc version.
+	 */
+	size = dma_devices_max * sizeof(struct rte_dma_fp_object) +
+		RTE_CACHE_LINE_SIZE;
+	ptr = malloc(size);
+	if (ptr == NULL)
+		return -ENOMEM;
+	memset(ptr, 0, size);
+
+	rte_dma_fp_objs = RTE_PTR_ALIGN(ptr, RTE_CACHE_LINE_SIZE);
+	for (i = 0; i < dma_devices_max; i++)
+		dma_fp_object_dummy(&rte_dma_fp_objs[i]);
+
+	return 0;
+}
+
 static int
 dma_dev_data_prepare(void)
 {
@@ -117,8 +150,15 @@ dma_dev_data_prepare(void)
 static int
 dma_data_prepare(void)
 {
+	int ret;
+
 	if (dma_devices_max == 0)
 		dma_devices_max = RTE_DMADEV_DEFAULT_MAX;
+
+	ret = dma_fp_data_prepare();
+	if (ret)
+		return ret;
+
 	return dma_dev_data_prepare();
 }
 
@@ -161,6 +201,8 @@ dma_allocate(const char *name, int numa_node, size_t private_data_size)
 	dev->dev_id = dev_id;
 	dev->numa_node = numa_node;
 	dev->dev_private = dev_private;
+	dev->fp_obj = &rte_dma_fp_objs[dev_id];
+	dma_fp_object_dummy(dev->fp_obj);
 
 	return dev;
 }
@@ -169,6 +211,7 @@ static void
 dma_release(struct rte_dma_dev *dev)
 {
 	rte_free(dev->dev_private);
+	dma_fp_object_dummy(dev->fp_obj);
 	memset(dev, 0, sizeof(struct rte_dma_dev));
 }
 
@@ -604,3 +647,72 @@ rte_dma_dump(int16_t dev_id, FILE *f)
 
 	return 0;
 }
+
+static int
+dummy_copy(__rte_unused void *dev_private, __rte_unused uint16_t vchan,
+	   __rte_unused rte_iova_t src, __rte_unused rte_iova_t dst,
+	   __rte_unused uint32_t length, __rte_unused uint64_t flags)
+{
+	RTE_DMA_LOG(ERR, "copy is not configured or not supported.");
+	return -EINVAL;
+}
+
+static int
+dummy_copy_sg(__rte_unused void *dev_private, __rte_unused uint16_t vchan,
+	      __rte_unused const struct rte_dma_sge *src,
+	      __rte_unused const struct rte_dma_sge *dst,
+	      __rte_unused uint16_t nb_src, __rte_unused uint16_t nb_dst,
+	      __rte_unused uint64_t flags)
+{
+	RTE_DMA_LOG(ERR, "copy_sg is not configured or not supported.");
+	return -EINVAL;
+}
+
+static int
+dummy_fill(__rte_unused void *dev_private, __rte_unused uint16_t vchan,
+	   __rte_unused uint64_t pattern, __rte_unused rte_iova_t dst,
+	   __rte_unused uint32_t length, __rte_unused uint64_t flags)
+{
+	RTE_DMA_LOG(ERR, "fill is not configured or not supported.");
+	return -EINVAL;
+}
+
+static int
+dummy_submit(__rte_unused void *dev_private, __rte_unused uint16_t vchan)
+{
+	RTE_DMA_LOG(ERR, "submit is not configured or not supported.");
+	return -EINVAL;
+}
+
+static uint16_t
+dummy_completed(__rte_unused void *dev_private, __rte_unused uint16_t vchan,
+		__rte_unused const uint16_t nb_cpls,
+		__rte_unused uint16_t *last_idx, __rte_unused bool *has_error)
+{
+	RTE_DMA_LOG(ERR, "completed is not configured or not supported.");
+	return 0;
+}
+
+static uint16_t
+dummy_completed_status(__rte_unused void *dev_private,
+		       __rte_unused uint16_t vchan,
+		       __rte_unused const uint16_t nb_cpls,
+		       __rte_unused uint16_t *last_idx,
+		       __rte_unused enum rte_dma_status_code *status)
+{
+	RTE_DMA_LOG(ERR,
+		    "completed_status is not configured or not supported.");
+	return 0;
+}
+
+static void
+dma_fp_object_dummy(struct rte_dma_fp_object *obj)
+{
+	obj->dev_private = NULL;
+	obj->copy = dummy_copy;
+	obj->copy_sg = dummy_copy_sg;
+	obj->fill = dummy_fill;
+	obj->submit = dummy_submit;
+	obj->completed = dummy_completed;
+	obj->completed_status = dummy_completed_status;
+}
diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
index 34a4c26851..95b6a0a810 100644
--- a/lib/dmadev/rte_dmadev.h
+++ b/lib/dmadev/rte_dmadev.h
@@ -65,6 +65,77 @@
  * Finally, an application can close a dmadev by invoking the rte_dma_close()
  * function.
  *
+ * The dataplane APIs include two parts:
+ * The first part is the submission of operation requests:
+ *     - rte_dma_copy()
+ *     - rte_dma_copy_sg()
+ *     - rte_dma_fill()
+ *     - rte_dma_submit()
+ *
+ * These APIs could work with different virtual DMA channels which have
+ * different contexts.
+ *
+ * The first three APIs are used to submit the operation request to the virtual
+ * DMA channel. If the submission is successful, a non-negative
+ * ring_idx <= UINT16_MAX is returned, otherwise a negative number is returned.
+ *
+ * The last API is used to issue the doorbell to hardware. Alternatively, the
+ * flags parameter (@see RTE_DMA_OP_FLAG_SUBMIT) of the first three APIs can do
+ * the same work.
+ * @note When enqueuing a set of jobs to the device, having a separate submit
+ * outside a loop makes for clearer code than having a check for the last
+ * iteration inside the loop to set a special submit flag. However, for cases
+ * where one item alone is to be submitted or there is a small set of jobs to
+ * be submitted sequentially, having a submit flag provides a lower-overhead
+ * way of doing the submission while still keeping the code clean.
+ *
+ * The second part is to obtain the result of requests:
+ *   - rte_dma_completed()
+ *       - return the number of operation requests completed successfully.
+ *   - rte_dma_completed_status()
+ *       - return the number of operation requests completed.
+ *
+ * @note If the dmadev works in silent mode (@see RTE_DMA_CAPA_SILENT),
+ * the application does not invoke the above two completion APIs.
+ *
+ * About the ring_idx which enqueue APIs (e.g. rte_dma_copy(), rte_dma_fill())
+ * return, the rules are as follows:
+ *   - ring_idx for each virtual DMA channel are independent.
+ *   - For a virtual DMA channel, the ring_idx is monotonically incremented;
+ *     when it reaches UINT16_MAX, it wraps back to zero.
+ *   - This ring_idx can be used by applications to track per-operation
+ *     metadata in an application-defined circular ring.
+ *   - The initial ring_idx of a virtual DMA channel is zero. After the
+ *     device is stopped, the ring_idx needs to be reset to zero.
+ *
+ * One example:
+ *   - step-1: start one dmadev
+ *   - step-2: enqueue a copy operation, the returned ring_idx is 0
+ *   - step-3: enqueue a copy operation again, the returned ring_idx is 1
+ *   - ...
+ *   - step-101: stop the dmadev
+ *   - step-102: start the dmadev
+ *   - step-103: enqueue a copy operation, the returned ring_idx is 0
+ *   - ...
+ *   - step-x+0: enqueue a fill operation, the returned ring_idx is 65535
+ *   - step-x+1: enqueue a copy operation, the returned ring_idx is 0
+ *   - ...
+ *
+ * The DMA operation address used in enqueue APIs (i.e. rte_dma_copy(),
+ * rte_dma_copy_sg(), rte_dma_fill()) is defined as rte_iova_t type.
+ *
+ * The dmadev supports two types of address: memory address and device address.
+ *
+ * - memory address: the source and destination address of the memory-to-memory
+ * transfer type, or the source address of the memory-to-device transfer type,
+ * or the destination address of the device-to-memory transfer type.
+ * @note If the device supports SVA (@see RTE_DMA_CAPA_SVA), the memory address
+ * can be any VA address, otherwise it must be an IOVA address.
+ *
+ * - device address: the source and destination address of the device-to-device
+ * transfer type, or the source address of the device-to-memory transfer type,
+ * or the destination address of the memory-to-device transfer type.
+ *
  * About MT-safe, all the functions of the dmadev API implemented by a PMD are
  * lock-free functions which assume to not be invoked in parallel on different
  * logical cores to work on the same target dmadev object.
@@ -590,6 +661,386 @@ int rte_dma_stats_reset(int16_t dev_id, uint16_t vchan);
 __rte_experimental
 int rte_dma_dump(int16_t dev_id, FILE *f);
 
+/**
+ * DMA transfer result status code defines.
+ *
+ * @see rte_dma_completed_status
+ */
+enum rte_dma_status_code {
+	/** The operation completed successfully. */
+	RTE_DMA_STATUS_SUCCESSFUL,
+	/** The operation failed to complete due to abort by user.
+	 * This is mainly used when processing dev_stop: the user could modify
+	 * the descriptors (e.g. change one bit to tell hardware to abort this
+	 * job), which allows outstanding requests to complete as far as
+	 * possible and so reduces the time needed to stop the device.
+	 */
+	RTE_DMA_STATUS_USER_ABORT,
+	/** The operation failed to complete in the following scenario:
+	 * The jobs in a particular batch are not attempted because they
+	 * appeared after a fence where a previous job failed. In some HW
+	 * implementations it is possible for jobs from later batches to be
+	 * completed, though, so report the status from the not attempted jobs
+	 * before reporting those newer completed jobs.
+	 */
+	RTE_DMA_STATUS_NOT_ATTEMPTED,
+	/** The operation failed to complete due to invalid source address. */
+	RTE_DMA_STATUS_INVALID_SRC_ADDR,
+	/** The operation failed to complete due to invalid destination address. */
+	RTE_DMA_STATUS_INVALID_DST_ADDR,
+	/** The operation failed to complete due to invalid source or
+	 * destination address. This covers the case where an address error
+	 * is known, but not which address caused it.
+	 */
+	RTE_DMA_STATUS_INVALID_ADDR,
+	/** The operation failed to complete due to invalid length. */
+	RTE_DMA_STATUS_INVALID_LENGTH,
+	/** The operation failed to complete due to invalid opcode.
+	 * The DMA descriptor could have multiple formats, which are
+	 * distinguished by the opcode field.
+	 */
+	RTE_DMA_STATUS_INVALID_OPCODE,
+	/** The operation failed to complete due to bus read error. */
+	RTE_DMA_STATUS_BUS_READ_ERROR,
+	/** The operation failed to complete due to bus write error. */
+	RTE_DMA_STATUS_BUS_WRITE_ERROR,
+	/** The operation failed to complete due to bus error. This covers the
+	 * case where a bus error is known, but not its direction.
+	 */
+	RTE_DMA_STATUS_BUS_ERROR,
+	/** The operation failed to complete due to data poison. */
+	RTE_DMA_STATUS_DATA_POISION,
+	/** The operation failed to complete due to descriptor read error. */
+	RTE_DMA_STATUS_DESCRIPTOR_READ_ERROR,
+	/** The operation failed to complete due to device link error.
+	 * Used to indicate a link error in the memory-to-device/
+	 * device-to-memory/device-to-device transfer scenario.
+	 */
+	RTE_DMA_STATUS_DEV_LINK_ERROR,
+	/** The operation failed to complete due to lookup page fault. */
+	RTE_DMA_STATUS_PAGE_FAULT,
+	/** The operation failed to complete due to unknown reason.
+	 * The initial value is 256, which reserves space for future errors.
+	 */
+	RTE_DMA_STATUS_ERROR_UNKNOWN = 0x100,
+};
+
+/**
+ * A structure used to hold a scatter-gather DMA operation request entry.
+ *
+ * @see rte_dma_copy_sg
+ */
+struct rte_dma_sge {
+	rte_iova_t addr; /**< The DMA operation address. */
+	uint32_t length; /**< The DMA operation length. */
+};
+
+#include "rte_dmadev_core.h"
+
+/**@{@name DMA operation flag
+ * @see rte_dma_copy()
+ * @see rte_dma_copy_sg()
+ * @see rte_dma_fill()
+ */
+#define RTE_DMA_OP_FLAG_FENCE RTE_BIT64(0)
+/**< Fence flag.
+ * It means the operation with this flag must be processed only after all
+ * previous operations are completed.
+ * If the specific DMA HW works in-order (i.e. it has a default fence between
+ * operations), this flag could be NOP.
+ */
+#define RTE_DMA_OP_FLAG_SUBMIT RTE_BIT64(1)
+/**< Submit flag.
+ * It means the operation with this flag must issue the doorbell to hardware
+ * after the jobs are enqueued.
+ */
+#define RTE_DMA_OP_FLAG_LLC RTE_BIT64(2)
+/**< Write data to low level cache hint.
+ * Used for performance optimization. This is just a hint and there is no
+ * capability bit for it; the driver should not return an error if it is set.
+ */
+/**@}*/
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a copy operation onto the virtual DMA channel.
+ *
+ * This queues up a copy operation to be performed by hardware. If the 'flags'
+ * parameter contains RTE_DMA_OP_FLAG_SUBMIT, the doorbell is triggered to
+ * begin this operation; otherwise the doorbell is not triggered.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of the virtual DMA channel.
+ * @param src
+ *   The address of the source buffer.
+ * @param dst
+ *   The address of the destination buffer.
+ * @param length
+ *   The length of the data to be copied.
+ * @param flags
+ *   Flags for this operation.
+ *   @see RTE_DMA_OP_FLAG_*
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued job.
+ *   - -ENOSPC: if no space left to enqueue.
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+static inline int
+rte_dma_copy(int16_t dev_id, uint16_t vchan, rte_iova_t src, rte_iova_t dst,
+	     uint32_t length, uint64_t flags)
+{
+	struct rte_dma_fp_object *obj = &rte_dma_fp_objs[dev_id];
+
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dma_is_valid(dev_id) || length == 0)
+		return -EINVAL;
+	RTE_FUNC_PTR_OR_ERR_RET(*obj->copy, -ENOTSUP);
+#endif
+
+	return (*obj->copy)(obj->dev_private, vchan, src, dst, length, flags);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a scatter-gather list copy operation onto the virtual DMA channel.
+ *
+ * This queues up a scatter-gather list copy operation to be performed by
+ * hardware. If the 'flags' parameter contains RTE_DMA_OP_FLAG_SUBMIT, the
+ * doorbell is triggered to begin this operation; otherwise it is not.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of the virtual DMA channel.
+ * @param src
+ *   Pointer to the source scatter-gather entry array.
+ * @param dst
+ *   Pointer to the destination scatter-gather entry array.
+ * @param nb_src
+ *   The number of source scatter-gather entries.
+ *   @see struct rte_dma_info::max_sges
+ * @param nb_dst
+ *   The number of destination scatter-gather entries.
+ *   @see struct rte_dma_info::max_sges
+ * @param flags
+ *   Flags for this operation.
+ *   @see RTE_DMA_OP_FLAG_*
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued job.
+ *   - -ENOSPC: if no space left to enqueue.
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+static inline int
+rte_dma_copy_sg(int16_t dev_id, uint16_t vchan, struct rte_dma_sge *src,
+		struct rte_dma_sge *dst, uint16_t nb_src, uint16_t nb_dst,
+		uint64_t flags)
+{
+	struct rte_dma_fp_object *obj = &rte_dma_fp_objs[dev_id];
+
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dma_is_valid(dev_id) || src == NULL || dst == NULL ||
+	    nb_src == 0 || nb_dst == 0)
+		return -EINVAL;
+	RTE_FUNC_PTR_OR_ERR_RET(*obj->copy_sg, -ENOTSUP);
+#endif
+
+	return (*obj->copy_sg)(obj->dev_private, vchan, src, dst, nb_src,
+			       nb_dst, flags);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Enqueue a fill operation onto the virtual DMA channel.
+ *
+ * This queues up a fill operation to be performed by hardware. If the 'flags'
+ * parameter contains RTE_DMA_OP_FLAG_SUBMIT, the doorbell is triggered to
+ * begin this operation; otherwise the doorbell is not triggered.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of the virtual DMA channel.
+ * @param pattern
+ *   The pattern to populate the destination buffer with.
+ * @param dst
+ *   The address of the destination buffer.
+ * @param length
+ *   The length of the destination buffer.
+ * @param flags
+ *   Flags for this operation.
+ *   @see RTE_DMA_OP_FLAG_*
+ *
+ * @return
+ *   - 0..UINT16_MAX: index of enqueued job.
+ *   - -ENOSPC: if no space left to enqueue.
+ *   - other values < 0 on failure.
+ */
+__rte_experimental
+static inline int
+rte_dma_fill(int16_t dev_id, uint16_t vchan, uint64_t pattern,
+	     rte_iova_t dst, uint32_t length, uint64_t flags)
+{
+	struct rte_dma_fp_object *obj = &rte_dma_fp_objs[dev_id];
+
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dma_is_valid(dev_id) || length == 0)
+		return -EINVAL;
+	RTE_FUNC_PTR_OR_ERR_RET(*obj->fill, -ENOTSUP);
+#endif
+
+	return (*obj->fill)(obj->dev_private, vchan, pattern, dst, length,
+			    flags);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Trigger hardware to begin performing enqueued operations.
+ *
+ * This API is used to write the "doorbell" to the hardware to trigger it
+ * to begin the operations previously enqueued by rte_dma_copy/fill().
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of the virtual DMA channel.
+ *
+ * @return
+ *   0 on success. Otherwise a negative value is returned.
+ */
+__rte_experimental
+static inline int
+rte_dma_submit(int16_t dev_id, uint16_t vchan)
+{
+	struct rte_dma_fp_object *obj = &rte_dma_fp_objs[dev_id];
+
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dma_is_valid(dev_id))
+		return -EINVAL;
+	RTE_FUNC_PTR_OR_ERR_RET(*obj->submit, -ENOTSUP);
+#endif
+
+	return (*obj->submit)(obj->dev_private, vchan);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Return the number of operations that have been successfully completed.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of the virtual DMA channel.
+ * @param nb_cpls
+ *   The maximum number of completed operations that can be processed.
+ * @param[out] last_idx
+ *   The last completed operation's ring_idx.
+ *   If not required, NULL can be passed in.
+ * @param[out] has_error
+ *   Indicates if there is a transfer error.
+ *   If not required, NULL can be passed in.
+ *
+ * @return
+ *   The number of operations that successfully completed. This return value
+ *   must be less than or equal to the value of nb_cpls.
+ */
+__rte_experimental
+static inline uint16_t
+rte_dma_completed(int16_t dev_id, uint16_t vchan, const uint16_t nb_cpls,
+		  uint16_t *last_idx, bool *has_error)
+{
+	struct rte_dma_fp_object *obj = &rte_dma_fp_objs[dev_id];
+	uint16_t idx;
+	bool err;
+
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dma_is_valid(dev_id) || nb_cpls == 0)
+		return 0;
+	RTE_FUNC_PTR_OR_ERR_RET(*obj->completed, 0);
+#endif
+
+	/* Ensure the pointer values are non-null to simplify drivers.
+	 * In most cases these should be compile time evaluated, since this is
+	 * an inline function.
+	 * - If NULL is explicitly passed as a parameter, the compiler knows
+	 *   the value is NULL.
+	 * - If the address of a local variable is passed as a parameter, the
+	 *   compiler can know it is non-NULL.
+	 */
+	if (last_idx == NULL)
+		last_idx = &idx;
+	if (has_error == NULL)
+		has_error = &err;
+
+	*has_error = false;
+	return (*obj->completed)(obj->dev_private, vchan, nb_cpls, last_idx,
+				 has_error);
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Return the number of operations that have been completed; each operation
+ * may have succeeded or failed.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param vchan
+ *   The identifier of the virtual DMA channel.
+ * @param nb_cpls
+ *   Indicates the size of the status array.
+ * @param[out] last_idx
+ *   The last completed operation's ring_idx.
+ *   If not required, NULL can be passed in.
+ * @param[out] status
+ *   This is a pointer to an array of length 'nb_cpls' that holds the completion
+ *   status code of each operation.
+ *   @see enum rte_dma_status_code
+ *
+ * @return
+ *   The number of operations that completed. This return value must be less
+ *   than or equal to the value of nb_cpls.
+ *   If this number is greater than zero (assuming n), then n values in the
+ *   status array are also set.
+ */
+__rte_experimental
+static inline uint16_t
+rte_dma_completed_status(int16_t dev_id, uint16_t vchan,
+			 const uint16_t nb_cpls, uint16_t *last_idx,
+			 enum rte_dma_status_code *status)
+{
+	struct rte_dma_fp_object *obj = &rte_dma_fp_objs[dev_id];
+	uint16_t idx;
+
+#ifdef RTE_DMADEV_DEBUG
+	if (!rte_dma_is_valid(dev_id) || nb_cpls == 0 || status == NULL)
+		return 0;
+	RTE_FUNC_PTR_OR_ERR_RET(*obj->completed_status, 0);
+#endif
+
+	if (last_idx == NULL)
+		last_idx = &idx;
+
+	return (*obj->completed_status)(obj->dev_private, vchan, nb_cpls,
+					last_idx, status);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
new file mode 100644
index 0000000000..6947091924
--- /dev/null
+++ b/lib/dmadev/rte_dmadev_core.h
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 HiSilicon Limited
+ * Copyright(c) 2021 Intel Corporation
+ */
+
+#ifndef RTE_DMADEV_CORE_H
+#define RTE_DMADEV_CORE_H
+
+/**
+ * @file
+ *
+ * DMA Device internal header.
+ *
+ * This header contains internal data types which are used by dataplane inline
+ * functions.
+ *
+ * Applications should not use these functions directly.
+ */
+
+/** @internal Used to enqueue a copy operation. */
+typedef int (*rte_dma_copy_t)(void *dev_private, uint16_t vchan,
+			      rte_iova_t src, rte_iova_t dst,
+			      uint32_t length, uint64_t flags);
+
+/** @internal Used to enqueue a scatter-gather list copy operation. */
+typedef int (*rte_dma_copy_sg_t)(void *dev_private, uint16_t vchan,
+				 const struct rte_dma_sge *src,
+				 const struct rte_dma_sge *dst,
+				 uint16_t nb_src, uint16_t nb_dst,
+				 uint64_t flags);
+
+/** @internal Used to enqueue a fill operation. */
+typedef int (*rte_dma_fill_t)(void *dev_private, uint16_t vchan,
+			      uint64_t pattern, rte_iova_t dst,
+			      uint32_t length, uint64_t flags);
+
+/** @internal Used to trigger hardware to begin working. */
+typedef int (*rte_dma_submit_t)(void *dev_private, uint16_t vchan);
+
+/** @internal Used to return number of successfully completed operations. */
+typedef uint16_t (*rte_dma_completed_t)(void *dev_private,
+				uint16_t vchan, const uint16_t nb_cpls,
+				uint16_t *last_idx, bool *has_error);
+
+/** @internal Used to return number of completed operations. */
+typedef uint16_t (*rte_dma_completed_status_t)(void *dev_private,
+			uint16_t vchan, const uint16_t nb_cpls,
+			uint16_t *last_idx, enum rte_dma_status_code *status);
+
+/**
+ * @internal
+ * Fast-path dmadev functions and related data are held in a flat array.
+ * One entry per dmadev.
+ *
+ * On 64-bit systems contents of this structure occupy exactly two 64B lines.
+ * On 32-bit systems contents of this structure fits into one 64B line.
+ *
+ * The 'dev_private' field was placed in the first cache line to optimize
+ * performance because the PMD driver mainly depends on this field.
+ */
+struct rte_dma_fp_object {
+	/** PMD-specific private data. The driver should copy
+	 * rte_dma_dev.dev_private to this field during initialization.
+	 */
+	void *dev_private;
+	rte_dma_copy_t             copy;
+	rte_dma_copy_sg_t          copy_sg;
+	rte_dma_fill_t             fill;
+	rte_dma_submit_t           submit;
+	rte_dma_completed_t        completed;
+	rte_dma_completed_status_t completed_status;
+	void *reserved_cl0;
+	/** Reserve space for future IO functions, while keeping data and
+	 * dev_ops pointers on the second cacheline.
+	 */
+	void *reserved_cl1[6];
+} __rte_cache_aligned;
+
+extern struct rte_dma_fp_object *rte_dma_fp_objs;
+
+#endif /* RTE_DMADEV_CORE_H */
diff --git a/lib/dmadev/rte_dmadev_pmd.h b/lib/dmadev/rte_dmadev_pmd.h
index 5fcf0f60b8..d6d2161306 100644
--- a/lib/dmadev/rte_dmadev_pmd.h
+++ b/lib/dmadev/rte_dmadev_pmd.h
@@ -100,6 +100,8 @@ struct rte_dma_dev {
 	void *dev_private; /**< PMD-specific private data. */
 	/** Device info which supplied during device initialization. */
 	struct rte_device *device;
+	/** Fast-path functions and related data. */
+	struct rte_dma_fp_object *fp_obj;
 	/** Functions implemented by PMD. */
 	const struct rte_dma_dev_ops *dev_ops;
 	struct rte_dma_conf dev_conf; /**< DMA device configuration. */
diff --git a/lib/dmadev/version.map b/lib/dmadev/version.map
index e925dfcd6d..e17207b212 100644
--- a/lib/dmadev/version.map
+++ b/lib/dmadev/version.map
@@ -2,10 +2,15 @@ EXPERIMENTAL {
 	global:
 
 	rte_dma_close;
+	rte_dma_completed;
+	rte_dma_completed_status;
 	rte_dma_configure;
+	rte_dma_copy;
+	rte_dma_copy_sg;
 	rte_dma_count_avail;
 	rte_dma_dev_max;
 	rte_dma_dump;
+	rte_dma_fill;
 	rte_dma_get_dev_id_by_name;
 	rte_dma_info_get;
 	rte_dma_is_valid;
@@ -13,6 +18,7 @@ EXPERIMENTAL {
 	rte_dma_stats_get;
 	rte_dma_stats_reset;
 	rte_dma_stop;
+	rte_dma_submit;
 	rte_dma_vchan_setup;
 
 	local: *;
@@ -22,6 +28,7 @@ INTERNAL {
 	global:
 
 	rte_dma_devices;
+	rte_dma_fp_objs;
 	rte_dma_pmd_allocate;
 	rte_dma_pmd_release;
 
-- 
2.33.0
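
Illustration only, not part of the patch: the rte_dmadev.h comment says that ring_idx is per virtual channel, wraps from UINT16_MAX back to zero, and can be used to track per-operation metadata in an application-defined circular ring. A minimal sketch of that bookkeeping is given here. It uses no DPDK calls. APP_RING_SIZE, struct app_meta, app_slot(), app_meta_for() and first_completed_idx() are hypothetical names, and the sketch assumes that completions on a vchan are reported as a contiguous ring_idx range ending at last_idx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application ring size. A power of two divides 65536, so the
 * slot mapping stays consistent when ring_idx wraps from 65535 to 0.
 */
#define APP_RING_SIZE 64u

static_assert((APP_RING_SIZE & (APP_RING_SIZE - 1u)) == 0,
	      "APP_RING_SIZE must be a power of two");

/* Hypothetical per-operation metadata kept by the application. */
struct app_meta {
	void *cookie;
};

static struct app_meta app_ring[APP_RING_SIZE];

/* Map a ring_idx (as returned by an enqueue call) to an application slot. */
static inline uint16_t
app_slot(uint16_t ring_idx)
{
	return (uint16_t)(ring_idx & (APP_RING_SIZE - 1u));
}

/* Look up the metadata slot that belongs to a given ring_idx. */
static inline struct app_meta *
app_meta_for(uint16_t ring_idx)
{
	return &app_ring[app_slot(ring_idx)];
}

/* Given the last completed ring_idx and the number of completions reported,
 * compute the first ring_idx of that batch. The cast to uint16_t handles the
 * wrap-around case. Assumes n >= 1 and a contiguous range (an assumption of
 * this sketch, not a statement from the patch).
 */
static inline uint16_t
first_completed_idx(uint16_t last_idx, uint16_t n)
{
	return (uint16_t)(last_idx - n + 1u);
}
```

An application following this sketch would store its metadata at app_meta_for(ring_idx) right after a successful enqueue, and after a completion call would walk n slots starting at first_completed_idx(last_idx, n). The ring size must also be at least the number of operations that can be in flight on the vchan, otherwise slots would be reused too early.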