From: Bruce Richardson
To: dev@dpdk.org
Cc: patrick.fu@intel.com, thomas@monjalon.net, Kevin Laatz, Bruce Richardson, Radu Nicolau
Date: Wed, 7 Oct 2020 17:30:23 +0100
Message-Id: <20201007163023.2817-26-bruce.richardson@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20201007163023.2817-1-bruce.richardson@intel.com>
References: <20200721095140.719297-1-bruce.richardson@intel.com> <20201007163023.2817-1-bruce.richardson@intel.com>
Subject: [dpdk-dev] [PATCH v5 25/25] raw/ioat: add fill operation

From: Kevin Laatz

Add fill operation enqueue support for IOAT and IDXD. The fill enqueue
is similar to the copy enqueue, but takes a 'pattern' rather than a
source address to transfer to the destination address. This patch also
includes an additional test case for the new operation type.

Signed-off-by: Kevin Laatz
Signed-off-by: Bruce Richardson
Acked-by: Radu Nicolau
---
 doc/guides/rawdevs/ioat.rst            | 10 ++++
 doc/guides/rel_notes/release_20_11.rst |  2 +
 drivers/raw/ioat/ioat_rawdev_test.c    | 62 ++++++++++++++++++++++++
 drivers/raw/ioat/rte_ioat_rawdev.h     | 26 +++++++++++
 drivers/raw/ioat/rte_ioat_rawdev_fns.h | 65 ++++++++++++++++++++++++--
 5 files changed, 160 insertions(+), 5 deletions(-)

diff --git a/doc/guides/rawdevs/ioat.rst b/doc/guides/rawdevs/ioat.rst
index 7c2a2d457..250cfc48a 100644
--- a/doc/guides/rawdevs/ioat.rst
+++ b/doc/guides/rawdevs/ioat.rst
@@ -285,6 +285,16 @@ is correct before freeing the data buffers using the returned handles:
 
     }
 
+Filling an Area of Memory
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The IOAT driver also has support for the ``fill`` operation, where an area
+of memory is overwritten, or filled, with a short pattern of data.
+Fill operations can be performed in much the same way as copy operations
+described above, just using the ``rte_ioat_enqueue_fill()`` function rather
+than the ``rte_ioat_enqueue_copy()`` function.
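+
+As an illustration only (assuming, as in the copy example above, an already
+configured and started device ``dev_id`` and an allocated mbuf ``dst``; these
+names are placeholders), a fill might be enqueued as follows:
+
+.. code-block:: C
+
+   /* fill part of the mbuf data area with a repeating 8-byte pattern */
+   uint64_t pattern = 0x0123456789abcdef;
+   unsigned int length = 64; /* illustrative length, must fit in the mbuf */
+
+   if (rte_ioat_enqueue_fill(dev_id, pattern,
+           dst->buf_iova + dst->data_off, length,
+           (uintptr_t)dst) != 1) {
+       /* retry or report the error */
+   }
+
+The fill is then triggered and completed in exactly the same way as a copy,
+using ``rte_ioat_perform_ops()`` and ``rte_ioat_completed_ops()``.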
+
+
 
 Querying Device Statistics
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index e48e6ea75..943ec83fd 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -122,6 +122,8 @@ New Features
 
   * Added support for Intel\ |reg| Data Streaming Accelerator hardware.
     For more information, see https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator
+  * Added support for the fill operation via the API ``rte_ioat_enqueue_fill()``,
+    where the hardware fills an area of memory with a repeating pattern.
   * Added a per-device configuration flag to disable management of user-provided completion handles
   * Renamed the ``rte_ioat_do_copies()`` API to ``rte_ioat_perform_ops()``,
     and renamed the ``rte_ioat_completed_copies()`` API to ``rte_ioat_completed_ops()``
diff --git a/drivers/raw/ioat/ioat_rawdev_test.c b/drivers/raw/ioat/ioat_rawdev_test.c
index 60d189b62..101f24a67 100644
--- a/drivers/raw/ioat/ioat_rawdev_test.c
+++ b/drivers/raw/ioat/ioat_rawdev_test.c
@@ -155,6 +155,52 @@ test_enqueue_copies(int dev_id)
 	return 0;
 }
 
+static int
+test_enqueue_fill(int dev_id)
+{
+	const unsigned int length[] = {8, 64, 1024, 50, 100, 89};
+	struct rte_mbuf *dst = rte_pktmbuf_alloc(pool);
+	char *dst_data = rte_pktmbuf_mtod(dst, char *);
+	struct rte_mbuf *completed[2] = {0};
+	uint64_t pattern = 0xfedcba9876543210;
+	unsigned int i, j;
+
+	for (i = 0; i < RTE_DIM(length); i++) {
+		/* reset dst_data */
+		memset(dst_data, 0, length[i]);
+
+		/* perform the fill operation */
+		if (rte_ioat_enqueue_fill(dev_id, pattern,
+				dst->buf_iova + dst->data_off, length[i],
+				(uintptr_t)dst) != 1) {
+			PRINT_ERR("Error with rte_ioat_enqueue_fill\n");
+			return -1;
+		}
+
+		rte_ioat_perform_ops(dev_id);
+		usleep(100);
+
+		if (rte_ioat_completed_ops(dev_id, 1, (void *)&completed[0],
+				(void *)&completed[1]) != 1) {
+			PRINT_ERR("Error with completed ops\n");
+			return -1;
+		}
+		/* check the result */
+		for (j = 0; j < length[i]; j++) {
+			char pat_byte = ((char *)&pattern)[j % 8];
+			if (dst_data[j] != pat_byte) {
+				PRINT_ERR("Error with fill operation (length = %u): got (%x), not (%x)\n",
+						length[i], dst_data[j],
+						pat_byte);
+				return -1;
+			}
+		}
+	}
+
+	rte_pktmbuf_free(dst);
+	return 0;
+}
+
 int
 ioat_rawdev_test(uint16_t dev_id)
 {
@@ -234,6 +280,7 @@ ioat_rawdev_test(uint16_t dev_id)
 	}
 
 	/* run the test cases */
+	printf("Running Copy Tests\n");
 	for (i = 0; i < 100; i++) {
 		unsigned int j;
 
@@ -247,6 +294,21 @@ ioat_rawdev_test(uint16_t dev_id)
 	}
 	printf("\n");
 
+	/* test enqueue fill operation */
+	printf("Running Fill Tests\n");
+	for (i = 0; i < 100; i++) {
+		unsigned int j;
+
+		if (test_enqueue_fill(dev_id) != 0)
+			goto err;
+
+		rte_rawdev_xstats_get(dev_id, ids, stats, nb_xstats);
+		for (j = 0; j < nb_xstats; j++)
+			printf("%s: %"PRIu64" ", snames[j].name, stats[j]);
+		printf("\r");
+	}
+	printf("\n");
+
 	rte_rawdev_stop(dev_id);
 	if (rte_rawdev_xstats_reset(dev_id, NULL, 0) != 0) {
 		PRINT_ERR("Error resetting xstat values\n");
diff --git a/drivers/raw/ioat/rte_ioat_rawdev.h b/drivers/raw/ioat/rte_ioat_rawdev.h
index 6b891cd44..b7632ebf3 100644
--- a/drivers/raw/ioat/rte_ioat_rawdev.h
+++ b/drivers/raw/ioat/rte_ioat_rawdev.h
@@ -37,6 +37,32 @@ struct rte_ioat_rawdev_config {
 	bool hdls_disable;    /**< if set, ignore user-supplied handle params */
 };
 
+/**
+ * Enqueue a fill operation onto the ioat device
+ *
+ * This queues up a fill operation to be performed by hardware, but does not
+ * trigger hardware to begin that operation.
+ *
+ * @param dev_id
+ *   The rawdev device id of the ioat instance
+ * @param pattern
+ *   The pattern to populate the destination buffer with
+ * @param dst
+ *   The physical address of the destination buffer
+ * @param length
+ *   The length of the destination buffer
+ * @param dst_hdl
+ *   An opaque handle for the destination data, to be returned when this
+ *   operation has been completed and the user polls for the completion details.
+ *   NOTE: If the hdls_disable configuration option for the device is set, this
+ *   parameter is ignored.
+ * @return
+ *   Number of operations enqueued, either 0 or 1
+ */
+static inline int
+rte_ioat_enqueue_fill(int dev_id, uint64_t pattern, phys_addr_t dst,
+		unsigned int length, uintptr_t dst_hdl);
+
 /**
  * Enqueue a copy operation onto the ioat device
  *
diff --git a/drivers/raw/ioat/rte_ioat_rawdev_fns.h b/drivers/raw/ioat/rte_ioat_rawdev_fns.h
index d0045d8a4..c2c4601ca 100644
--- a/drivers/raw/ioat/rte_ioat_rawdev_fns.h
+++ b/drivers/raw/ioat/rte_ioat_rawdev_fns.h
@@ -115,6 +115,13 @@ enum rte_idxd_ops {
 #define IDXD_FLAG_REQUEST_COMPLETION	(1 << 3)
 #define IDXD_FLAG_CACHE_CONTROL		(1 << 8)
 
+#define IOAT_COMP_UPDATE_SHIFT	3
+#define IOAT_CMD_OP_SHIFT	24
+enum rte_ioat_ops {
+	ioat_op_copy = 0,	/* Standard DMA Operation */
+	ioat_op_fill		/* Block Fill */
+};
+
 /**
  * Hardware descriptor used by DSA hardware, for both bursts and
  * for individual operations.
@@ -203,11 +210,8 @@ struct rte_idxd_rawdev {
 	struct rte_idxd_desc_batch *batch_ring;
 };
 
-/*
- * Enqueue a copy operation onto the ioat device
- */
 static __rte_always_inline int
-__ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
+__ioat_write_desc(int dev_id, uint32_t op, uint64_t src, phys_addr_t dst,
 		unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
 {
 	struct rte_ioat_rawdev *ioat =
@@ -229,7 +233,8 @@ __ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
 	desc = &ioat->desc_ring[write];
 	desc->size = length;
 	/* set descriptor write-back every 16th descriptor */
-	desc->u.control_raw = (uint32_t)((!(write & 0xF)) << 3);
+	desc->u.control_raw = (uint32_t)((op << IOAT_CMD_OP_SHIFT) |
+			(!(write & 0xF) << IOAT_COMP_UPDATE_SHIFT));
 	desc->src_addr = src;
 	desc->dest_addr = dst;
 
@@ -242,6 +247,27 @@ __ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
 	return 1;
 }
 
+static __rte_always_inline int
+__ioat_enqueue_fill(int dev_id, uint64_t pattern, phys_addr_t dst,
+		unsigned int length, uintptr_t dst_hdl)
+{
+	static const uintptr_t null_hdl;
+
+	return __ioat_write_desc(dev_id, ioat_op_fill, pattern, dst, length,
+			null_hdl, dst_hdl);
+}
+
+/*
+ * Enqueue a copy operation onto the ioat device
+ */
+static __rte_always_inline int
+__ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
+		unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
+{
+	return __ioat_write_desc(dev_id, ioat_op_copy, src, dst, length,
+			src_hdl, dst_hdl);
+}
+
 /* add fence to last written descriptor */
 static __rte_always_inline int
 __ioat_fence(int dev_id)
@@ -380,6 +406,23 @@ __idxd_write_desc(int dev_id, const struct rte_idxd_hw_desc *desc,
 	return 0;
 }
 
+static __rte_always_inline int
+__idxd_enqueue_fill(int dev_id, uint64_t pattern, rte_iova_t dst,
+		unsigned int length, uintptr_t dst_hdl)
+{
+	const struct rte_idxd_hw_desc desc = {
+			.op_flags = (idxd_op_fill << IDXD_CMD_OP_SHIFT) |
+				IDXD_FLAG_CACHE_CONTROL,
+			.src = pattern,
+			.dst = dst,
+			.size = length
+	};
+	const struct rte_idxd_user_hdl hdl = {
+			.dst = dst_hdl
+	};
+	return __idxd_write_desc(dev_id, &desc, &hdl);
+}
+
 static __rte_always_inline int
 __idxd_enqueue_copy(int dev_id, rte_iova_t src, rte_iova_t dst,
 		unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
@@ -475,6 +518,18 @@ __idxd_completed_ops(int dev_id, uint8_t max_ops,
 	return n;
 }
 
+static inline int
+rte_ioat_enqueue_fill(int dev_id, uint64_t pattern, phys_addr_t dst,
+		unsigned int len, uintptr_t dst_hdl)
+{
+	enum rte_ioat_dev_type *type =
+			(enum rte_ioat_dev_type *)rte_rawdevs[dev_id].dev_private;
+	if (*type == RTE_IDXD_DEV)
+		return __idxd_enqueue_fill(dev_id, pattern, dst, len, dst_hdl);
+	else
+		return __ioat_enqueue_fill(dev_id, pattern, dst, len, dst_hdl);
+}
+
 static inline int
 rte_ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
 		unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
-- 
2.25.1
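
For reference, a minimal sketch of how an application might use the fill API
added by this patch is shown below. It is illustrative only: it assumes
"dev_id" is an ioat/idxd rawdev that has already been configured and started,
"pool" is an existing mbuf pool, and the fixed usleep() poll mirrors the
simplification used in the self-test above; the function and variable names
are placeholders, not part of the patch.

#include <unistd.h>

#include <rte_mbuf.h>
#include <rte_ioat_rawdev.h>

/* Illustrative sketch: fill 64 bytes of an mbuf with a repeating 8-byte
 * pattern, then wait for the hardware to report the operation complete.
 */
static int
fill_one_mbuf(int dev_id, struct rte_mempool *pool)
{
	const uint64_t pattern = 0x0123456789abcdef;
	const unsigned int length = 64;
	uintptr_t src_hdls[1], dst_hdls[1];
	struct rte_mbuf *dst = rte_pktmbuf_alloc(pool);

	if (dst == NULL)
		return -1;

	/* queue the fill; returns 1 on success, 0 if the descriptor ring is full */
	if (rte_ioat_enqueue_fill(dev_id, pattern,
			dst->buf_iova + dst->data_off, length,
			(uintptr_t)dst) != 1) {
		rte_pktmbuf_free(dst);
		return -1;
	}

	/* tell the hardware to start processing the queued descriptor */
	rte_ioat_perform_ops(dev_id);

	/* simple fixed-delay poll; a real application would check for
	 * completions as part of its main loop instead
	 */
	usleep(100);
	if (rte_ioat_completed_ops(dev_id, 1, src_hdls, dst_hdls) != 1) {
		rte_pktmbuf_free(dst);
		return -1;
	}

	/* dst_hdls[0] now holds the (uintptr_t)dst handle passed above */
	rte_pktmbuf_free(dst);
	return 0;
}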