From: Bruce Richardson <bruce.richardson@intel.com>
To: dev@dpdk.org
Cc: patrick.fu@intel.com, thomas@monjalon.net,
 Kevin Laatz <kevin.laatz@intel.com>,
 Bruce Richardson <bruce.richardson@intel.com>,
 Radu Nicolau <radu.nicolau@intel.com>
Date: Wed,  7 Oct 2020 17:30:23 +0100
Message-Id: <20201007163023.2817-26-bruce.richardson@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20201007163023.2817-1-bruce.richardson@intel.com>
References: <20200721095140.719297-1-bruce.richardson@intel.com>
 <20201007163023.2817-1-bruce.richardson@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [dpdk-dev] [PATCH v5 25/25] raw/ioat: add fill operation

From: Kevin Laatz <kevin.laatz@intel.com>

Add fill operation enqueue support for IOAT and IDXD. The fill enqueue
is similar to the copy enqueue, but takes a 'pattern' to be written to
the destination address rather than a source address to copy from. This
patch also includes an additional test case for the new operation type.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
---
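A note for reviewers, not part of the commit itself: a minimal usage
sketch of the new API, under the assumption that dev_id names an
already-configured and started ioat rawdev and that buf_iova/buf_len
(hypothetical names, as is the fill_buffer() wrapper) describe an
IOVA-addressable buffer. It enqueues a single fill, triggers the
hardware with rte_ioat_perform_ops(), then busy-polls
rte_ioat_completed_ops() for the completion.

#include <rte_ioat_rawdev.h>

static int
fill_buffer(int dev_id, phys_addr_t buf_iova, unsigned int buf_len)
{
	/* 8-byte pattern, repeated across the destination buffer */
	uint64_t pattern = 0xa5a5a5a5a5a5a5a5;
	uintptr_t src_hdls[1], dst_hdls[1];
	int done;

	/* queue the descriptor; returns 1 on success, 0 if the ring is full */
	if (rte_ioat_enqueue_fill(dev_id, pattern, buf_iova, buf_len,
			(uintptr_t)0) != 1)
		return -1;

	/* kick the device to start processing the queued descriptor */
	rte_ioat_perform_ops(dev_id);

	/* poll until the single operation reports completion (or error) */
	do {
		done = rte_ioat_completed_ops(dev_id, 1, src_hdls, dst_hdls);
	} while (done == 0);

	return done == 1 ? 0 : -1;
}
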
 doc/guides/rawdevs/ioat.rst            | 10 ++++
 doc/guides/rel_notes/release_20_11.rst |  2 +
 drivers/raw/ioat/ioat_rawdev_test.c    | 62 ++++++++++++++++++++++++
 drivers/raw/ioat/rte_ioat_rawdev.h     | 26 +++++++++++
 drivers/raw/ioat/rte_ioat_rawdev_fns.h | 65 ++++++++++++++++++++++++--
 5 files changed, 160 insertions(+), 5 deletions(-)
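
A further note on the refactor below: __ioat_enqueue_copy() is
generalised into __ioat_write_desc(), which packs the operation type
and the periodic completion-update flag into the descriptor control
word. As a worked example of that encoding, a fill descriptor written
at a ring slot where (write & 0xF) == 0 gets:

	control_raw = (ioat_op_fill << IOAT_CMD_OP_SHIFT)
		    | (1 << IOAT_COMP_UPDATE_SHIFT)
		    = (1 << 24) | (1 << 3)
		    = 0x01000008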

diff --git a/doc/guides/rawdevs/ioat.rst b/doc/guides/rawdevs/ioat.rst
index 7c2a2d457..250cfc48a 100644
--- a/doc/guides/rawdevs/ioat.rst
+++ b/doc/guides/rawdevs/ioat.rst
@@ -285,6 +285,16 @@ is correct before freeing the data buffers using the returned handles:
         }
 
 
+Filling an Area of Memory
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The IOAT driver also has support for the ``fill`` operation, where an area
+of memory is overwritten, or filled, with a short pattern of data.
+Fill operations can be performed in much the same way as copy operations
+described above, just using the ``rte_ioat_enqueue_fill()`` function rather
+than the ``rte_ioat_enqueue_copy()`` function.
+
+
 Querying Device Statistics
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index e48e6ea75..943ec83fd 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -122,6 +122,8 @@ New Features
 
   * Added support for Intel\ |reg| Data Streaming Accelerator hardware.
     For more information, see https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator
+  * Added support for the fill operation via the API ``rte_ioat_enqueue_fill()``,
+    where the hardware fills an area of memory with a repeating pattern.
   * Added a per-device configuration flag to disable management of user-provided completion handles
   * Renamed the ``rte_ioat_do_copies()`` API to ``rte_ioat_perform_ops()``,
     and renamed the ``rte_ioat_completed_copies()`` API to ``rte_ioat_completed_ops()``
diff --git a/drivers/raw/ioat/ioat_rawdev_test.c b/drivers/raw/ioat/ioat_rawdev_test.c
index 60d189b62..101f24a67 100644
--- a/drivers/raw/ioat/ioat_rawdev_test.c
+++ b/drivers/raw/ioat/ioat_rawdev_test.c
@@ -155,6 +155,52 @@ test_enqueue_copies(int dev_id)
 	return 0;
 }
 
+static int
+test_enqueue_fill(int dev_id)
+{
+	const unsigned int length[] = {8, 64, 1024, 50, 100, 89};
+	struct rte_mbuf *dst = rte_pktmbuf_alloc(pool);
+	char *dst_data = rte_pktmbuf_mtod(dst, char *);
+	struct rte_mbuf *completed[2] = {0};
+	uint64_t pattern = 0xfedcba9876543210;
+	unsigned int i, j;
+
+	for (i = 0; i < RTE_DIM(length); i++) {
+		/* reset dst_data */
+		memset(dst_data, 0, length[i]);
+
+		/* perform the fill operation */
+		if (rte_ioat_enqueue_fill(dev_id, pattern,
+				dst->buf_iova + dst->data_off, length[i],
+				(uintptr_t)dst) != 1) {
+			PRINT_ERR("Error with rte_ioat_enqueue_fill\n");
+			return -1;
+		}
+
+		rte_ioat_perform_ops(dev_id);
+		usleep(100);
+
+		if (rte_ioat_completed_ops(dev_id, 1, (void *)&completed[0],
+			(void *)&completed[1]) != 1) {
+			PRINT_ERR("Error with completed ops\n");
+			return -1;
+		}
+		/* check the result */
+		for (j = 0; j < length[i]; j++) {
+			char pat_byte = ((char *)&pattern)[j % 8];
+			if (dst_data[j] != pat_byte) {
+				PRINT_ERR("Error with fill operation (length = %u): got (%x), not (%x)\n",
+						length[i], dst_data[j],
+						pat_byte);
+				return -1;
+			}
+		}
+	}
+
+	rte_pktmbuf_free(dst);
+	return 0;
+}
+
 int
 ioat_rawdev_test(uint16_t dev_id)
 {
@@ -234,6 +280,7 @@ ioat_rawdev_test(uint16_t dev_id)
 	}
 
 	/* run the test cases */
+	printf("Running Copy Tests\n");
 	for (i = 0; i < 100; i++) {
 		unsigned int j;
 
@@ -247,6 +294,21 @@ ioat_rawdev_test(uint16_t dev_id)
 	}
 	printf("\n");
 
+	/* test enqueue fill operation */
+	printf("Running Fill Tests\n");
+	for (i = 0; i < 100; i++) {
+		unsigned int j;
+
+		if (test_enqueue_fill(dev_id) != 0)
+			goto err;
+
+		rte_rawdev_xstats_get(dev_id, ids, stats, nb_xstats);
+		for (j = 0; j < nb_xstats; j++)
+			printf("%s: %"PRIu64"   ", snames[j].name, stats[j]);
+		printf("\r");
+	}
+	printf("\n");
+
 	rte_rawdev_stop(dev_id);
 	if (rte_rawdev_xstats_reset(dev_id, NULL, 0) != 0) {
 		PRINT_ERR("Error resetting xstat values\n");
diff --git a/drivers/raw/ioat/rte_ioat_rawdev.h b/drivers/raw/ioat/rte_ioat_rawdev.h
index 6b891cd44..b7632ebf3 100644
--- a/drivers/raw/ioat/rte_ioat_rawdev.h
+++ b/drivers/raw/ioat/rte_ioat_rawdev.h
@@ -37,6 +37,32 @@ struct rte_ioat_rawdev_config {
 	bool hdls_disable;    /**< if set, ignore user-supplied handle params */
 };
 
+/**
+ * Enqueue a fill operation onto the ioat device
+ *
+ * This queues up a fill operation to be performed by hardware, but does not
+ * trigger hardware to begin that operation.
+ *
+ * @param dev_id
+ *   The rawdev device id of the ioat instance
+ * @param pattern
+ *   The pattern to populate the destination buffer with
+ * @param dst
+ *   The physical address of the destination buffer
+ * @param length
+ *   The length of the destination buffer
+ * @param dst_hdl
+ *   An opaque handle for the destination data, to be returned when this
+ *   operation has been completed and the user polls for the completion details.
+ *   NOTE: If the hdls_disable configuration option for the device is set,
+ *   this parameter is ignored.
+ * @return
+ *   Number of operations enqueued, either 0 or 1
+ */
+static inline int
+rte_ioat_enqueue_fill(int dev_id, uint64_t pattern, phys_addr_t dst,
+		unsigned int length, uintptr_t dst_hdl);
+
 /**
  * Enqueue a copy operation onto the ioat device
  *
diff --git a/drivers/raw/ioat/rte_ioat_rawdev_fns.h b/drivers/raw/ioat/rte_ioat_rawdev_fns.h
index d0045d8a4..c2c4601ca 100644
--- a/drivers/raw/ioat/rte_ioat_rawdev_fns.h
+++ b/drivers/raw/ioat/rte_ioat_rawdev_fns.h
@@ -115,6 +115,13 @@ enum rte_idxd_ops {
 #define IDXD_FLAG_REQUEST_COMPLETION    (1 << 3)
 #define IDXD_FLAG_CACHE_CONTROL         (1 << 8)
 
+#define IOAT_COMP_UPDATE_SHIFT	3
+#define IOAT_CMD_OP_SHIFT	24
+enum rte_ioat_ops {
+	ioat_op_copy = 0,	/* Standard DMA Operation */
+	ioat_op_fill		/* Block Fill */
+};
+
 /**
  * Hardware descriptor used by DSA hardware, for both bursts and
  * for individual operations.
@@ -203,11 +210,8 @@ struct rte_idxd_rawdev {
 	struct rte_idxd_desc_batch *batch_ring;
 };
 
-/*
- * Enqueue a copy operation onto the ioat device
- */
 static __rte_always_inline int
-__ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
+__ioat_write_desc(int dev_id, uint32_t op, uint64_t src, phys_addr_t dst,
 		unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
 {
 	struct rte_ioat_rawdev *ioat =
@@ -229,7 +233,8 @@ __ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
 	desc = &ioat->desc_ring[write];
 	desc->size = length;
 	/* set descriptor write-back every 16th descriptor */
-	desc->u.control_raw = (uint32_t)((!(write & 0xF)) << 3);
+	desc->u.control_raw = (uint32_t)((op << IOAT_CMD_OP_SHIFT) |
+			(!(write & 0xF) << IOAT_COMP_UPDATE_SHIFT));
 	desc->src_addr = src;
 	desc->dest_addr = dst;
 
@@ -242,6 +247,27 @@ __ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
 	return 1;
 }
 
+static __rte_always_inline int
+__ioat_enqueue_fill(int dev_id, uint64_t pattern, phys_addr_t dst,
+		unsigned int length, uintptr_t dst_hdl)
+{
+	static const uintptr_t null_hdl;
+
+	return __ioat_write_desc(dev_id, ioat_op_fill, pattern, dst, length,
+			null_hdl, dst_hdl);
+}
+
+/*
+ * Enqueue a copy operation onto the ioat device
+ */
+static __rte_always_inline int
+__ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
+		unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
+{
+	return __ioat_write_desc(dev_id, ioat_op_copy, src, dst, length,
+			src_hdl, dst_hdl);
+}
+
 /* add fence to last written descriptor */
 static __rte_always_inline int
 __ioat_fence(int dev_id)
@@ -380,6 +406,23 @@ __idxd_write_desc(int dev_id, const struct rte_idxd_hw_desc *desc,
 	return 0;
 }
 
+static __rte_always_inline int
+__idxd_enqueue_fill(int dev_id, uint64_t pattern, rte_iova_t dst,
+		unsigned int length, uintptr_t dst_hdl)
+{
+	const struct rte_idxd_hw_desc desc = {
+			.op_flags =  (idxd_op_fill << IDXD_CMD_OP_SHIFT) |
+				IDXD_FLAG_CACHE_CONTROL,
+			.src = pattern,
+			.dst = dst,
+			.size = length
+	};
+	const struct rte_idxd_user_hdl hdl = {
+			.dst = dst_hdl
+	};
+	return __idxd_write_desc(dev_id, &desc, &hdl);
+}
+
 static __rte_always_inline int
 __idxd_enqueue_copy(int dev_id, rte_iova_t src, rte_iova_t dst,
 		unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
@@ -475,6 +518,18 @@ __idxd_completed_ops(int dev_id, uint8_t max_ops,
 	return n;
 }
 
+static inline int
+rte_ioat_enqueue_fill(int dev_id, uint64_t pattern, phys_addr_t dst,
+		unsigned int len, uintptr_t dst_hdl)
+{
+	enum rte_ioat_dev_type *type =
+			(enum rte_ioat_dev_type *)rte_rawdevs[dev_id].dev_private;
+	if (*type == RTE_IDXD_DEV)
+		return __idxd_enqueue_fill(dev_id, pattern, dst, len, dst_hdl);
+	else
+		return __ioat_enqueue_fill(dev_id, pattern, dst, len, dst_hdl);
+}
+
 static inline int
 rte_ioat_enqueue_copy(int dev_id, phys_addr_t src, phys_addr_t dst,
 		unsigned int length, uintptr_t src_hdl, uintptr_t dst_hdl)
-- 
2.25.1