DPDK patches and discussions
* [RFC v2] ethdev: an API for cache stashing hints
@ 2024-07-15 22:11 Wathsala Vithanage
  2024-07-17  2:27 ` Stephen Hemminger
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Wathsala Vithanage @ 2024-07-15 22:11 UTC (permalink / raw)
  To: dev, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko
  Cc: nd, Wathsala Vithanage, Dhruv Tripathi

An application provides cache stashing hints to the ethernet devices to
improve memory access latencies from the CPU and the NIC. This patch
introduces four distinct hints for this purpose.

The RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED hint indicates that the host
(CPU) requires the data written by the NIC immediately. This implies
that the CPU expects to read data from its local cache rather than LLC
or main memory if possible. This would improve memory access latency in
the Rx path. For PCI devices with TPH capability, this hint translates
into the DWHR (Device Writes Host Reads) access pattern. This hint is
only valid for receive queues.

The RTE_ETH_DEV_STASH_HINT_BI_DIR_DATA hint indicates that the host and
the device access the data structure equally. Rx/Tx queue descriptors
fit the description of such data. This hint applies to both Rx and Tx
directions.  In the PCI TPH context, this hint translates into a
Bi-Directional access pattern.

The RTE_ETH_DEV_STASH_HINT_DEV_ONLY hint indicates that the CPU is not
involved in a given device's receive or transmit paths. This implies
that only devices are involved in the IO path. Depending on the
implementation, this hint may result in data getting placed in a cache
close to the device or not cached at all. For PCI devices with TPH
capability, this hint translates into D*D* (DWDR, DRDW, DWDW, DRDR)
access patterns. This is a bidirectional hint, and it can be applied to
both Rx and Tx queues.  

The RTE_ETH_DEV_STASH_HINT_HOST_DONTNEED hint indicates that the device
reads data written by the host (CPU) that may still be in the host's
local cache but is not required by the host anytime soon. This hint is
intended to prevent unnecessary cache invalidations that cause
interconnect latencies when a device writes to a buffer already in host
cache memory. In DPDK, this could happen with the recycling of mbufs,
where an mbuf is placed in the Tx queue, returns to the mempool, and is
then recycled back into the Rx queue, all while a copy is held in the
CPU's local cache unnecessarily. By using this hint on supported
platforms, the cached copy of the mbuf is invalidated after the device
finishes reading the buffer, well before the buffer is recycled and
updated in the Rx path. This hint is only valid for transmit queues.
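
As a rough illustration of the direction rules above (a sketch, not part
of the patch): the hint values and mask values below are copied from this
patch's rte_ethdev.h changes (the mask names are shortened), while
hint_within() and stash_hint_fits_queue() are hypothetical helper names
that mirror the direction check in rte_eth_dev_validate_stashing_hints().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hint values copied from this patch. */
#define RTE_ETH_DEV_STASH_HINT_HOST_DONTNEED	0x001
#define RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED	0x010
#define RTE_ETH_DEV_STASH_HINT_BI_DIR_DATA	0x100
#define RTE_ETH_DEV_STASH_HINT_DEV_ONLY		0x200

/* Mask values copied from this patch; names shortened for the sketch. */
#define STASH_HINT_TX_MASK	0x00f
#define STASH_HINT_RX_MASK	0x0f0
#define STASH_HINT_RXTX_MASK	0xf00

/* Queue directions, as defined in this patch. */
#define RTE_ETH_DEV_QUEUE_TYPE_RX	0
#define RTE_ETH_DEV_QUEUE_TYPE_TX	1

/* True when h is non-zero and every set bit lies inside mask. */
static bool
hint_within(uint16_t h, uint16_t mask)
{
	return h != 0 && (h & ~mask) == 0;
}

/*
 * Hypothetical helper (not in the patch): returns true when the hint
 * vector may be applied to a queue of the given direction.
 */
static bool
stash_hint_fits_queue(uint16_t hints, uint8_t queue_direction)
{
	if (hint_within(hints, STASH_HINT_RXTX_MASK))
		return true;	/* BI_DIR_DATA, DEV_ONLY: either direction */
	if (hint_within(hints, STASH_HINT_RX_MASK))
		return queue_direction == RTE_ETH_DEV_QUEUE_TYPE_RX;
	if (hint_within(hints, STASH_HINT_TX_MASK))
		return queue_direction == RTE_ETH_DEV_QUEUE_TYPE_TX;
	return false;	/* empty vector, or hints from different groups */
}
```

With these definitions, HOST_WILLNEED fits only an Rx queue,
HOST_DONTNEED fits only a Tx queue, and BI_DIR_DATA or DEV_ONLY fit
either. Read literally, the same check also rejects a mix such as
BI_DIR_DATA | HOST_WILLNEED, because that value lies within no single
mask.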

Applications use three main interfaces in the ethdev library to discover
and set cache stashing hints. The rte_eth_dev_stashing_hints_tx interface
sets hints on a Tx queue, and the rte_eth_dev_stashing_hints_rx interface
sets hints on an Rx queue. Both functions take the following parameters
as inputs: a port_id (the id of the ethernet device), a cpuid (the
target CPU), a cache_level (the level of the cache hierarchy the data
should be stashed into), and a queue_id (the queue the hints are applied
to). In addition, a types parameter indicates the types of objects the
application expects the hardware to stash. The supported types vary with
the hardware. For example, Intel E810 NICs support the stashing of Rx/Tx
descriptors, packet headers, and packet payloads. These are indicated by
the macros RTE_ETH_DEV_STASH_TYPE_DESC, RTE_ETH_DEV_STASH_TYPE_HEADER,
and RTE_ETH_DEV_STASH_TYPE_PAYLOAD. Hardware capable of stashing data at
any given offset into a packet can use the RTE_ETH_DEV_STASH_TYPE_OFFSET
type. When this type is used, the offset parameter of the above two
functions should be set appropriately. Finally, a hints parameter
carries the stashing hints described above.
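
A minimal sketch of how an application might prepare the types argument,
assuming the type values from this patch: stash_types_valid() mirrors the
RTE_ETH_DEV_STASH_TYPE_VALID macro, and stash_effective_offset() is a
hypothetical convention (not defined by the patch) that ignores the
offset unless RTE_ETH_DEV_STASH_TYPE_OFFSET is requested.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>	/* off_t */

/* Object type values copied from this patch. */
#define RTE_ETH_DEV_STASH_TYPE_OFFSET	0x0001
#define RTE_ETH_DEV_STASH_TYPE_DESC	0x0002
#define RTE_ETH_DEV_STASH_TYPE_HEADER	0x0004
#define RTE_ETH_DEV_STASH_TYPE_PAYLOAD	0x0008
#define STASH_TYPE_MASK			0x000f	/* name shortened */

/* Non-empty and made up of known type bits only. */
static bool
stash_types_valid(uint16_t types)
{
	return types != 0 && (types & ~STASH_TYPE_MASK) == 0;
}

/* Hypothetical: the offset only matters when TYPE_OFFSET is requested. */
static off_t
stash_effective_offset(uint16_t types, off_t offset)
{
	return (types & RTE_ETH_DEV_STASH_TYPE_OFFSET) ? offset : 0;
}
```

For instance, a types vector of DESC | HEADER is valid and carries no
offset, whereas an empty vector or one with a bit outside the mask would
be rejected with -EINVAL by the validation in this patch.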

rte_eth_dev_stashing_hints_discover is used to discover the object types
and hints supported by the platform and the device. The function takes
types and hints pointers, each used as a bit vector, through which it
reports the types and hints supported by the NIC. An application that
intends to use stashing hints should first discover the supported hints
and types and then use rte_eth_dev_stashing_hints_tx and
rte_eth_dev_stashing_hints_rx as required to set stashing hints
accordingly. The eth_dev_ops structure has been updated with two new ops
that a PMD should implement to support cache stashing hints. A PMD that
intends to support cache stashing hints should initialize the
set_stashing_hints function pointer to a function that issues hints to
the underlying hardware in compliance with platform capabilities. The
same PMD should also implement a function that returns two bit fields
indicating the supported types and hints, and initialize the
discover_stashing_hints function pointer with it. If the NIC supports
cache stashing hints, its PMD should always set the
RTE_ETH_DEV_CAPA_CACHE_STASHING device capability.
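
The discover-then-set flow described above could look roughly like the
sketch below. Since the proposed API needs a DPDK build with this patch,
mock_discover() and mock_set_rx() are hypothetical stand-ins for
rte_eth_dev_stashing_hints_discover() and rte_eth_dev_stashing_hints_rx()
with the same parameter lists, and setup_rx_stashing() is an illustrative
application helper, not part of the patch. The sketch never requests
offset-based stashing, so it always passes an offset of 0.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>	/* off_t */

/* Values copied from this patch. */
#define RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED	0x010
#define RTE_ETH_DEV_STASH_HINT_BI_DIR_DATA	0x100
#define RTE_ETH_DEV_STASH_TYPE_DESC		0x0002
#define RTE_ETH_DEV_STASH_TYPE_HEADER		0x0004
#define RTE_ETH_DEV_STASH_TYPE_PAYLOAD		0x0008

/* Stand-in for the discover call: a NIC that stashes desc + header. */
static int
mock_discover(uint16_t port_id, uint16_t *types, uint16_t *hints)
{
	(void)port_id;
	if (types == NULL || hints == NULL)
		return -EINVAL;
	*types = RTE_ETH_DEV_STASH_TYPE_DESC | RTE_ETH_DEV_STASH_TYPE_HEADER;
	*hints = RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED;
	return 0;
}

/* Stand-in for the Rx set call: records what was requested. */
static uint16_t last_types, last_hints;

static int
mock_set_rx(uint16_t port_id, uint16_t cpuid, uint8_t cache_level,
	    uint16_t queue_id, uint16_t types, off_t offset, uint16_t hints)
{
	(void)port_id; (void)cpuid; (void)cache_level;
	(void)queue_id; (void)offset;
	last_types = types;
	last_hints = hints;
	return 0;
}

/*
 * Discover first, then request only what the application wants AND the
 * device supports. Returns -ENOTSUP when nothing overlaps.
 */
static int
setup_rx_stashing(uint16_t port_id, uint16_t cpuid, uint8_t cache_level,
		  uint16_t queue_id, uint16_t want_types, uint16_t want_hints)
{
	uint16_t types, hints;
	int ret = mock_discover(port_id, &types, &hints);

	if (ret < 0)
		return ret;
	types &= want_types;
	hints &= want_hints;
	if (types == 0 || hints == 0)
		return -ENOTSUP;
	return mock_set_rx(port_id, cpuid, cache_level, queue_id,
			   types, 0, hints);
}
```

Intersecting the requested vectors with the discovered ones lets the
same application code run on devices with different capabilities, at the
cost of silently dropping unsupported types. An application that needs
every requested type could instead compare the masked value against
want_types and fail early.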

Signed-off-by: Wathsala Vithanage <wathsala.vithanage@arm.com>
Reviewed-by: Dhruv Tripathi <dhruv.tripathi@arm.com>
---
 .mailmap                   |   1 +
 lib/ethdev/ethdev_driver.h |  67 +++++++++++
 lib/ethdev/rte_ethdev.c    | 153 +++++++++++++++++++++++++
 lib/ethdev/rte_ethdev.h    | 225 +++++++++++++++++++++++++++++++++++++
 lib/ethdev/version.map     |   6 +
 5 files changed, 452 insertions(+)

diff --git a/.mailmap b/.mailmap
index f1e64286a1..9c28b74655 100644
--- a/.mailmap
+++ b/.mailmap
@@ -338,6 +338,7 @@ Dexia Li <dexia.li@jaguarmicro.com>
 Dexuan Cui <decui@microsoft.com>
 Dharmik Thakkar <dharmikjayesh.thakkar@arm.com> <dharmik.thakkar@arm.com>
 Dheemanth Mallikarjun <dheemanthm@vmware.com>
+Dhruv Tripathi <dhruv.tripathi@arm.com>
 Diana Wang <na.wang@corigine.com>
 Didier Pallard <didier.pallard@6wind.com>
 Dilshod Urazov <dilshod.urazov@oktetlabs.ru>
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 883e59a927..b90dc8793b 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1235,6 +1235,70 @@ typedef int (*eth_count_aggr_ports_t)(struct rte_eth_dev *dev);
 typedef int (*eth_map_aggr_tx_affinity_t)(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 					  uint8_t affinity);
 
+/**
+ * @internal
+ * Set cache stashing hint in the ethernet device.
+ *
+ * @param dev
+ *   Port (ethdev) handle.
+ * @param cpuid
+ *   ID of the targeted CPU.
+ * @param cache_level
+ *   Level of the cache to stash data.
+ * @param queue_id
+ *   Index of the Rx or Tx queue to which the hints are applied.
+ * @param queue_direction
+ *   RTE_ETH_DEV_QUEUE_TYPE_RX if queue that corresponds to queue_id is an
+ *   rx queue.
+ *   RTE_ETH_DEV_QUEUE_TYPE_TX if queue that corresponds to queue_id is a
+ *   tx queue.
+ * @param types
+ *   A vector of stashing types to apply hints on a given queue direction.
+ *   hints are applied on the types specified in types vector.
+ *   types can include queue descriptors (RTE_ETH_DEV_STASH_TYPE_DESC),
+ *   packet headers (RTE_ETH_DEV_STASH_TYPE_HEADER),
+ *   packet payloads (RTE_ETH_DEV_STASH_TYPE_PAYLOAD) or
+ *   to an offset (RTE_ETH_DEV_STASH_TYPE_OFFSET) into a packet.
+ *   types have to be compatible with the queue_direction or an -EINVAL will
+ *   be returned.
+ * @param hints
+ *   Cache stashing hints
+ * @param offset
+ *   Offset into the packet if RTE_ETH_DEV_STASH_TYPE_OFFSET is set in types.
+ *
+ * @return
+ *   -ENOTSUP if the device or the platform does not support cache stashing.
+ *   -ENOSYS  if the underlying PMD hasn't implemented cache stashing feature.
+ *   -EINVAL  on invalid arguments.
+ *   0 on success.
+ */
+typedef int (*eth_set_stashing_hints_t)(struct rte_eth_dev *dev, uint16_t cpuid,
+					uint8_t cache_level,
+					uint16_t queue_id, uint8_t queue_direction,
+					uint16_t types, uint16_t hints, off_t offset);
+
+/**
+ * @internal
+ * Discover cache stashing hints and object types supported in the ethernet device.
+ *
+ * @param dev
+ *   Port (ethdev) handle.
+ * @param types
+ *   Set bits for supported object types.
+ * @param hints
+ *   Set bits for supported stashing hints.
+ *
+ * @return
+ *   -ENOTSUP if the device or the platform does not support cache stashing.
+ *   -ENOSYS  if the underlying PMD hasn't implemented cache stashing feature.
+ *   -EINVAL  on NULL values for types or hints parameters.
+ *   On return, types and hints parameters will have bits set for supported
+ *   object types and hints.
+ *   0 on success.
+ */
+typedef int (*eth_discover_stashing_hints_t)(struct rte_eth_dev *dev,
+					     uint16_t *types, uint16_t *hints);
+
 /**
  * @internal A structure containing the functions exported by an Ethernet driver.
  */
@@ -1257,6 +1321,9 @@ struct eth_dev_ops {
 	eth_mac_addr_remove_t      mac_addr_remove; /**< Remove MAC address */
 	eth_mac_addr_add_t         mac_addr_add;  /**< Add a MAC address */
 	eth_mac_addr_set_t         mac_addr_set;  /**< Set a MAC address */
+	eth_set_stashing_hints_t   set_stashing_hints; /**< Set cache stashing hints */
+	/** Discover supported stashing hints */
+	eth_discover_stashing_hints_t discover_stashing_hints;
 	/** Set list of multicast addresses */
 	eth_set_mc_addr_list_t     set_mc_addr_list;
 	mtu_set_t                  mtu_set;       /**< Set MTU */
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index f1c658f49e..fafc94223e 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -153,6 +153,7 @@ static const struct {
 	{RTE_ETH_DEV_CAPA_RXQ_SHARE, "RXQ_SHARE"},
 	{RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP, "FLOW_RULE_KEEP"},
 	{RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP, "FLOW_SHARED_OBJECT_KEEP"},
+	{RTE_ETH_DEV_CAPA_CACHE_STASHING, "CACHE_STASHING"},
 };
 
 enum {
@@ -7008,4 +7009,156 @@ int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t tx_queue_id,
 	return ret;
 }
 
+int
+rte_eth_dev_validate_stashing_hints(uint16_t port_id, uint16_t queue_id,
+				    uint8_t queue_direction, uint16_t types,
+				    uint16_t hints)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev_info dev_info;
+	uint16_t nb_queues;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+
+	/*
+	 * Check for invalid types
+	 */
+	if (!RTE_ETH_DEV_STASH_TYPE_VALID(types)) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Invalid stashing type");
+		return -EINVAL;
+	}
+
+	/*
+	 * Ensure that hints (HOST_DONTNEED, HOST_WILLNEED, BI_DIR_DATA, and
+	 * DEV_ONLY) are not mixed incorrectly in the hints argument.
+	 * Only hints of one queue direction (Rx or Tx) can be combined in the
+	 * hint argument. If the hint argument contains hint types compatible
+	 * with both Rx and Tx directions it can be applied to any queue of the
+	 * two queue types.
+	 */
+	if (!RTE_ETH_DEV_STASH_HINT_IS_RXTX(hints)) {
+		/*
+		 * This is not a Rx and a Tx hint.
+		 * Therefore it can only be applied to single queue direction.
+		 */
+		if (RTE_ETH_DEV_STASH_HINT_IS_TX(hints) ==
+		    RTE_ETH_DEV_STASH_HINT_IS_RX(hints)) {
+			RTE_ETHDEV_LOG_LINE(ERR, "This hint is not compatible "
+					    "with both Rx and Tx paths");
+			return -EINVAL;
+		}
+		/*
+		 * Ensure that hint is compatible with the specified queue
+		 * direction in the queue_direction argument.
+		 */
+		if (((queue_direction == RTE_ETH_DEV_QUEUE_TYPE_TX) &&
+		    RTE_ETH_DEV_STASH_HINT_IS_RX(hints)) ||
+		    ((queue_direction == RTE_ETH_DEV_QUEUE_TYPE_RX) &&
+		    RTE_ETH_DEV_STASH_HINT_IS_TX(hints))) {
+			RTE_ETHDEV_LOG_LINE(ERR, "Hints are not applicable to "
+					    "this queue type");
+			return -EINVAL;
+		}
+	}
+
+	dev = &rte_eth_devices[port_id];
+
+	nb_queues = (queue_direction == RTE_ETH_DEV_QUEUE_TYPE_RX) ?
+				      dev->data->nb_rx_queues :
+				      dev->data->nb_tx_queues;
+
+	if (queue_id >= nb_queues) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Invalid queue_id=%u", queue_id);
+		return -EINVAL;
+	}
+
+	rte_eth_dev_info_get(port_id, &dev_info);
+
+	if ((dev_info.dev_capa & RTE_ETH_DEV_CAPA_CACHE_STASHING) !=
+	    RTE_ETH_DEV_CAPA_CACHE_STASHING)
+		return -ENOTSUP;
+
+	if (*dev->dev_ops->set_stashing_hints == NULL) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Stashing hints are not implemented "
+				    "in %s for %s", dev_info.driver_name,
+				    dev_info.device->name);
+		return -ENOSYS;
+	}
+
+	return 0;
+}
+
+int
+rte_eth_dev_stashing_hints_rx(uint16_t port_id, uint16_t cpuid,
+			      uint8_t cache_level, uint16_t queue_id,
+			      uint16_t types, off_t offset,
+			      uint16_t hints)
+{
+	struct rte_eth_dev *dev;
+
+	int ret = rte_eth_dev_validate_stashing_hints(port_id, queue_id,
+						      RTE_ETH_DEV_QUEUE_TYPE_RX,
+						      types, hints);
+	if (ret < 0)
+		return ret;
+
+	dev = &rte_eth_devices[port_id];
+
+	return eth_err(port_id, (*dev->dev_ops->set_stashing_hints)(dev, cpuid,
+		       cache_level, queue_id, RTE_ETH_DEV_QUEUE_TYPE_RX,
+		       types, hints, offset));
+}
+
+int
+rte_eth_dev_stashing_hints_tx(uint16_t port_id, uint16_t cpuid,
+			      uint8_t cache_level, uint16_t queue_id,
+			      uint16_t types, off_t offset,
+			      uint16_t hints)
+{
+	struct rte_eth_dev *dev;
+
+	int ret = rte_eth_dev_validate_stashing_hints(port_id, queue_id,
+						      RTE_ETH_DEV_QUEUE_TYPE_TX,
+						      types, hints);
+	if (ret < 0)
+		return ret;
+
+	dev = &rte_eth_devices[port_id];
+
+	return eth_err(port_id,
+		       (*dev->dev_ops->set_stashing_hints) (dev, cpuid,
+		       cache_level, queue_id, RTE_ETH_DEV_QUEUE_TYPE_TX, types,
+		       hints, offset));
+}
+
+int
+rte_eth_dev_stashing_hints_discover(uint16_t port_id, uint16_t *types,
+				    uint16_t *hints)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev_info dev_info;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+
+	if (!types || !hints)
+		return -EINVAL;
+
+	dev = &rte_eth_devices[port_id];
+	rte_eth_dev_info_get(port_id, &dev_info);
+
+	if ((dev_info.dev_capa & RTE_ETH_DEV_CAPA_CACHE_STASHING) !=
+	    RTE_ETH_DEV_CAPA_CACHE_STASHING)
+		return -ENOTSUP;
+
+	if (*dev->dev_ops->discover_stashing_hints == NULL) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Stashing hints are not implemented "
+				    "in %s for %s", dev_info.driver_name,
+				    dev_info.device->name);
+		return -ENOSYS;
+	}
+	return eth_err(port_id,
+		       (*dev->dev_ops->discover_stashing_hints)
+		       (dev, types, hints));
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 548fada1c7..a42f272885 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1648,6 +1648,9 @@ struct rte_eth_conf {
 #define RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP RTE_BIT64(4)
 /**@}*/
 
+/** Device supports stashing to CPU/system caches. */
+#define RTE_ETH_DEV_CAPA_CACHE_STASHING RTE_BIT64(5)
+
 /*
  * Fallback default preferred Rx/Tx port parameters.
  * These are used if an application requests default parameters
@@ -1819,6 +1822,8 @@ struct rte_eth_dev_info {
 	struct rte_eth_dev_portconf default_txportconf;
 	/** Generic device capabilities (RTE_ETH_DEV_CAPA_). */
 	uint64_t dev_capa;
+	uint16_t stashing_hints_capa;
+	uint16_t stashing_types_capa;
 	/**
 	 * Switching information for ports on a device with a
 	 * embedded managed interconnect/switch.
@@ -5964,6 +5969,226 @@ int rte_eth_cman_config_set(uint16_t port_id, const struct rte_eth_cman_config *
 __rte_experimental
 int rte_eth_cman_config_get(uint16_t port_id, struct rte_eth_cman_config *config);
 
+
+
+/** Queue type is RX. */
+#define RTE_ETH_DEV_QUEUE_TYPE_RX		0
+/** Queue type is TX. */
+#define RTE_ETH_DEV_QUEUE_TYPE_TX		1
+
+/**@{@name Ethernet device cache stashing hints
+ *@see rte_eth_dev_stashing_hints_discover
+ *@see rte_eth_dev_stashing_hints_rx
+ *@see rte_eth_dev_stashing_hints_tx
+ */
+/**
+ * Data read by the device could still be in a CPU local cache memory but
+ * not required by the CPU before ethernet device is done with Tx.
+ * In other words CPU does not mind evicting the relevant cache line(s)
+ * from its local cache.
+ */
+#define RTE_ETH_DEV_STASH_HINT_HOST_DONTNEED	0x001
+
+/**
+ * Data is read and written equally by the CPU and the NIC.
+ */
+#define RTE_ETH_DEV_STASH_HINT_BI_DIR_DATA	0x100
+
+/**
+ * Data written by the device is read by a CPU immediately. CPU prefers
+ * availability of the data in its local cache memory by the time read
+ * takes place.
+ */
+#define RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED	0x010
+
+/**
+ * Data written by the device is only read by device.
+ * Host CPUs do not read this data or write to the location of the data.
+ */
+#define RTE_ETH_DEV_STASH_HINT_DEV_ONLY		0x200
+
+
+#define __RTE_ETH_DEV_STASH_HINT_TX_MASK	0x00f
+
+#define __RTE_ETH_DEV_STASH_HINT_RX_MASK	0x0f0
+
+#define __RTE_ETH_DEV_STASH_HINT_RXTX_MASK	0xf00
+
+
+/**@}*/
+
+#define RTE_ETH_DEV_STASH_HINT_IS_TX(h)				\
+	((!((h) & ~(__RTE_ETH_DEV_STASH_HINT_TX_MASK))) && (h))
+
+#define RTE_ETH_DEV_STASH_HINT_IS_RX(h)				\
+	((!((h) & ~(__RTE_ETH_DEV_STASH_HINT_RX_MASK))) && (h))
+
+#define RTE_ETH_DEV_STASH_HINT_IS_RXTX(h)		\
+	((!((h) & ~(__RTE_ETH_DEV_STASH_HINT_RXTX_MASK))) && (h))
+
+/**@{@name Stashable Rx/Tx queue object types supported by the ethernet device
+ *@see rte_eth_dev_stashing_hints_discover
+ *@see rte_eth_dev_stashing_hints_rx
+ *@see rte_eth_dev_stashing_hints_tx
+ */
+
+/**
+ * Apply stashing hint to data at a given offset from the start of a
+ * packet.
+ */
+#define RTE_ETH_DEV_STASH_TYPE_OFFSET	0x0001
+
+/** Apply stashing hint to an Rx or Tx descriptor. */
+#define RTE_ETH_DEV_STASH_TYPE_DESC	0x0002
+
+/** Apply stashing hint to the header of a packet. */
+#define RTE_ETH_DEV_STASH_TYPE_HEADER	0x0004
+
+/** Apply stashing hint to the payload of a packet. */
+#define RTE_ETH_DEV_STASH_TYPE_PAYLOAD	0x0008
+#define __RTE_ETH_DEV_STASH_TYPE_MASK	0x000f
+/**@}*/
+
+#define RTE_ETH_DEV_STASH_TYPE_VALID(t)				\
+	((!((t) & (~__RTE_ETH_DEV_STASH_TYPE_MASK))) && (t))
+
+/**
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * @internal
+ * Helper function to validate stashing hints.
+ */
+__rte_experimental
+int rte_eth_dev_validate_stashing_hints(uint16_t port_id, uint16_t queue_id,
+					uint8_t queue_direction, uint16_t type,
+					uint16_t hint);
+/**
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Provide cache stashing hints for improved memory access latencies for
+ * packets received by the NIC. Hints the underlying hardware that CPU indicated
+ * in cpuid parameter prefers to have the data specified in the type parameter
+ * at a level in the memory hierarchy specified in cache_level parameter for
+ * access pattern(s) specified in hints parameter.
+ * This feature is available only in supported NICs and platforms.
+ *
+ * @param port_id
+ *  The port identifier of the Ethernet device.
+ * @param cpuid
+ *  ID of the targeted CPU for the hint.
+ * @param cache_level
+ *  The preferred level of the cache the packets are expected at the time of
+ *  retrieval.
+ * @param queue_id
+ *  The index of the receive queue to which hints are applied.
+ * @param types
+ *  A vector of stashing types to apply hints on receive queue.
+ *  Hints are applied on the types specified in types vector.
+ *  types can include receive queue descriptors (RTE_ETH_DEV_STASH_TYPE_DESC),
+ *  packet headers (RTE_ETH_DEV_STASH_TYPE_HEADER),
+ *  packet payloads (RTE_ETH_DEV_STASH_TYPE_PAYLOAD) or
+ *  to an offset (RTE_ETH_DEV_STASH_TYPE_OFFSET) into a packet.
+ *  Types used should be compatible with RX queues, if not -EINVAL will be
+ *  returned.
+ * @param offset
+ *  Offset into the packet if RTE_ETH_DEV_STASH_TYPE_OFFSET is set in types.
+ * @param hints
+ *  A vector of stashing hints to the device and the platform.
+ * @return
+ *  - (-ENODEV) on incorrect port_ids.
+ *  - (-EINVAL) if RX-only and TX-only hints are used in conjunction in the
+ *  hints parameter, or if the types parameter is invalid.
+ *  - (-EINVAL) if hints are incompatible with RX queues.
+ *  - (-EINVAL) on invalid queue_id.
+ *  - (-ENOTSUP) if RTE_ETH_DEV_CAPA_CACHE_STASHING capability is unavailable.
+ *  - (-ENOSYS) if PMD does not implement cache stashing hints.
+ *  - (0) on Success.
+ */
+__rte_experimental
+int rte_eth_dev_stashing_hints_rx(uint16_t port_id, uint16_t cpuid,
+				 uint8_t cache_level, uint16_t queue_id,
+				 uint16_t types, off_t offset, uint16_t hints);
+
+/**
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Provide cache stashing hints for improved memory access latencies for
+ * packets being transmitted by the NIC. Hints the underlying hardware that CPU
+ * prefers to have the data specified in the type parameter at a level in the
+ * memory hierarchy specified in cache_level parameter for an access pattern
+ * specified in hints parameter.
+ * This feature is available only in supported NICs and platforms.
+ *
+ * @param port_id
+ *  The port identifier of the Ethernet device.
+ * @param cpuid
+ *  ID of the targeted CPU for the hint.
+ * @param cache_level
+ *  The preferred level of the cache the packets are expected at the time of
+ *  transmission.
+ * @param queue_id
+ *  The index of the transmit queue which hints are applied to.
+ * @param types
+ *  A vector of stashing types to apply hints on transmit queue.
+ *  hints are applied on types specified in types vector.
+ *  types can include transmit queue descriptors (RTE_ETH_DEV_STASH_TYPE_DESC),
+ *  packet headers (RTE_ETH_DEV_STASH_TYPE_HEADER),
+ *  packet payloads (RTE_ETH_DEV_STASH_TYPE_PAYLOAD) or
+ *  to an offset (RTE_ETH_DEV_STASH_TYPE_OFFSET) into a packet.
+ *  Types used should be compatible with TX queues, if not -EINVAL will be
+ *  returned.
+ * @param offset
+ *  Offset into the packet if RTE_ETH_DEV_STASH_TYPE_OFFSET is set in types.
+ * @param hints
+ *  A vector of stashing hints to the device and the platform.
+ * @return
+ *  - (-ENODEV) on incorrect port_ids.
+ *  - (-EINVAL) if RX-only and TX-only hints are used in conjunction in the
+ *  hints parameter, or if the types parameter is invalid.
+ *  - (-EINVAL) if hints are incompatible with TX queues.
+ *  - (-EINVAL) on invalid queue_id.
+ *  - (-ENOTSUP) if RTE_ETH_DEV_CAPA_CACHE_STASHING capability is unavailable.
+ *  - (-ENOSYS) if PMD does not implement cache stashing hints.
+ *  - (0) on Success.
+ */
+__rte_experimental
+int rte_eth_dev_stashing_hints_tx(uint16_t port_id, uint16_t cpuid,
+				 uint8_t cache_level, uint16_t queue_id,
+				 uint16_t types, off_t offset, uint16_t hints);
+
+/**
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Discover cache stashing hints and object types supported in the ethernet
+ * device.
+ *
+ * @param port_id
+ *  The port identifier of the Ethernet device.
+ * @param types
+ *  Supported types vector set by the ethernet device.
+ * @param hints
+ *  Supported hints vector set by the ethernet device.
+ * @return
+ *  On return, types and hints parameters will have bits set for supported
+ *  object types and hints.
+ *  - (-ENOTSUP) if the device or the platform does not support cache stashing.
+ *  - (-ENOSYS)  if the underlying PMD hasn't implemented cache stashing
+ *  feature.
+ *  - (-EINVAL)  on NULL values for types or hints parameters.
+ *  - (0) on success.
+ */
+__rte_experimental
+int rte_eth_dev_stashing_hints_discover(uint16_t port_id, uint16_t *types,
+					uint16_t *hints);
+
 #include <rte_ethdev_core.h>
 
 /**
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 79f6f5293b..5eef0b4540 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -325,6 +325,12 @@ EXPERIMENTAL {
 	rte_flow_template_table_resizable;
 	rte_flow_template_table_resize;
 	rte_flow_template_table_resize_complete;
+
+	# added in 24.07
+	rte_eth_dev_stashing_hints_rx;
+	rte_eth_dev_stashing_hints_tx;
+	rte_eth_dev_stashing_hints_discover;
+	rte_eth_dev_validate_stashing_hints;
 };
 
 INTERNAL {
-- 
2.34.1



* Re: [RFC v2] ethdev: an API for cache stashing hints
  2024-07-15 22:11 [RFC v2] ethdev: an API for cache stashing hints Wathsala Vithanage
@ 2024-07-17  2:27 ` Stephen Hemminger
  2024-07-18 18:48   ` Wathsala Wathawana Vithanage
  2024-07-20  3:05   ` Honnappa Nagarahalli
  2024-07-17 10:32 ` Konstantin Ananyev
  2024-07-22 11:18 ` Ferruh Yigit
  2 siblings, 2 replies; 7+ messages in thread
From: Stephen Hemminger @ 2024-07-17  2:27 UTC (permalink / raw)
  To: Wathsala Vithanage
  Cc: dev, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, nd, Dhruv Tripathi

On Mon, 15 Jul 2024 22:11:41 +0000
Wathsala Vithanage <wathsala.vithanage@arm.com> wrote:

> An application provides cache stashing hints to the ethernet devices to
> improve memory access latencies from the CPU and the NIC. This patch
> introduces three distinct hints for this purpose.
> 
> The RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED hint indicates that the host
> (CPU) requires the data written by the NIC immediately. This implies
> that the CPU expects to read data from its local cache rather than LLC
> or main memory if possible. This would improve memory access latency in
> the Rx path. For PCI devices with TPH capability, these hints translate
> into DWHR (Device Writes Host Reads) access pattern. This hint is only
> valid for receive queues.
> 
> The RTE_ETH_DEV_STASH_HINT_BI_DIR_DATA hint indicates that the host and
> the device access the data structure equally. Rx/Tx queue descriptors
> fit the description of such data. This hint applies to both Rx and Tx
> directions.  In the PCI TPH context, this hint translates into a
> Bi-Directional access pattern.
> 
> RTE_ETH_DEV_STASH_HINT_DEV_ONLY hint indicates that the CPU is not
> involved in a given device's receive or transmit paths. This implies
> that only devices are involved in the IO path. Depending on the
> implementation, this hint may result in data getting placed in a cache
> close to the device or not cached at all. For PCI devices with TPH
> capability, this hint translates into D*D* (DWDR, DRDW, DWDW, DRDR)
> access patterns. This is a bidirectional hint, and it can be applied to
> both Rx and Tx queues.  
> 
> The RTE_ETH_DEV_STASH_HINT_HOST_DONTNEED hint indicates that the device
> reads data written by the host (CPU) that may still be in the host's
> local cache but is not required by the host anytime soon. This hint is
> intended to prevent unnecessary cache invalidations that cause
> interconnect latencies when a device writes to a buffer already in host
> cache memory. In DPDK, this could happen with the recycling of mbufs
> where a mbuf is placed in the Tx queue that then gets back into mempool
> and gets recycled back into the Rx queue, all while a copy is being held
> in the CPU's local cache unnecessarily. By using this hint on supported
> platforms, the mbuf will be invalidated after the device completes the
> buffer reading, but it will be well before the buffer gets recycled and
> updated in the Rx path. This hint is only valid for transmit queues. 
> 
> Applications use three main interfaces in the ethdev library to discover
> and set cache stashing hints. rte_eth_dev_stashing_hints_tx interface is
> used to set hints on a Tx queue. rte_eth_dev_stashing_hints_rx interface
> is used to set hints on an Rx queue. Both of these functions take the
> following parameters as inputs: a port_id (the id of the ethernet
> device), a cpu_id (the target CPU), a cache_level (the level of the
> cache hierarchy the data should be stashed into), a queue_id (the queue
> the hints are applied to). In addition to the above list of parameters,
> a type parameter indicates the type of the object the application
> expects to be stashed by the hardware. Depending on the hardware, these
> may vary. Intel E810 NICs support the stashing of Rx/Tx descriptors,
> packet headers, and packet payloads. These are indicated by the macros
> RTE_ETH_DEV_STASH_TYPE_DESC, RTE_ETH_DEV_STASH_TYPE_HEADER,
> RTE_ETH_DEV_STASH_TYPE_PAYLOAD. Hardware capable of stashing data at any
> given offset into a packet can use the RTE_ETH_DEV_STASH_TYPE_OFFSET
> type. When an offset is used, the offset parameter in the above two
> functions should be set appropriately.
> 
> rte_eth_dev_stashing_hints_discover is used to discover the object types
> and hints supported in the platform and the device. The function takes
> types and hints pointers used as a bit vector to indicate hints and
> types supported by the NIC. An application that intends to use stashing
> hints should first discover supported hints and types and then use the
> functions rte_eth_dev_stashing_hints_tx and
> rte_eth_dev_stashing_hints_rx as required to set stashing hints
> accordingly. eth_dev_ops structure has been updated with two new ops
> that a PMD should implement to support cache stashing hints. A PMD that
> intends to support cache stashing hints should initialize the
> set_stashing_hints function pointer to a function that issues hints to
> the underlying hardware in compliance with platform capabilities. The
> same PMD should also implement a function that can return two-bit fields
> indicating supported types and hints and then initialize the
> discover_stashing_hints function pointer with it. If the NIC supports
> cache stashing hints, the NIC should always set the
> RTE_ETH_DEV_CAPA_CACHE_STASHING device capability.
> 
> Signed-off-by: Wathsala Vithanage <wathsala.vithanage@arm.com>
> Reviewed-by: Dhruv Tripathi <dhruv.tripathi@arm.com>

My initial reaction is negative on this. The DPDK does not need more nerd knobs
for performance. If it is a performance win, it should be automatic and handled
by the driver.

If you absolutely have to have another flag, then it should be in existing config
(yes, extend the ABI) rather than adding more flags and calls in ethdev.


* RE: [RFC v2] ethdev: an API for cache stashing hints
  2024-07-15 22:11 [RFC v2] ethdev: an API for cache stashing hints Wathsala Vithanage
  2024-07-17  2:27 ` Stephen Hemminger
@ 2024-07-17 10:32 ` Konstantin Ananyev
  2024-07-22 11:18 ` Ferruh Yigit
  2 siblings, 0 replies; 7+ messages in thread
From: Konstantin Ananyev @ 2024-07-17 10:32 UTC (permalink / raw)
  To: Wathsala Vithanage, dev, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko
  Cc: nd, Dhruv Tripathi



> An application provides cache stashing hints to the ethernet devices to
> improve memory access latencies from the CPU and the NIC. This patch
> introduces three distinct hints for this purpose.
> 
> The RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED hint indicates that the host
> (CPU) requires the data written by the NIC immediately. This implies
> that the CPU expects to read data from its local cache rather than LLC
> or main memory if possible. This would improve memory access latency in
> the Rx path. For PCI devices with TPH capability, these hints translate
> into DWHR (Device Writes Host Reads) access pattern. This hint is only
> valid for receive queues.
> 
> The RTE_ETH_DEV_STASH_HINT_BI_DIR_DATA hint indicates that the host and
> the device access the data structure equally. Rx/Tx queue descriptors
> fit the description of such data. This hint applies to both Rx and Tx
> directions.  In the PCI TPH context, this hint translates into a
> Bi-Directional access pattern.
> 
> RTE_ETH_DEV_STASH_HINT_DEV_ONLY hint indicates that the CPU is not
> involved in a given device's receive or transmit paths. This implies
> that only devices are involved in the IO path. Depending on the
> implementation, this hint may result in data getting placed in a cache
> close to the device or not cached at all. For PCI devices with TPH
> capability, this hint translates into D*D* (DWDR, DRDW, DWDW, DRDR)
> access patterns. This is a bidirectional hint, and it can be applied to
> both Rx and Tx queues.
> 
> The RTE_ETH_DEV_STASH_HINT_HOST_DONTNEED hint indicates that the device
> reads data written by the host (CPU) that may still be in the host's
> local cache but is not required by the host anytime soon. This hint is
> intended to prevent unnecessary cache invalidations that cause
> interconnect latencies when a device writes to a buffer already in host
> cache memory. In DPDK, this could happen with the recycling of mbufs
> where a mbuf is placed in the Tx queue that then gets back into mempool
> and gets recycled back into the Rx queue, all while a copy is being held
> in the CPU's local cache unnecessarily. By using this hint on supported
> platforms, the mbuf will be invalidated after the device completes the
> buffer reading, but it will be well before the buffer gets recycled and
> updated in the Rx path. This hint is only valid for transmit queues.
> 
> Applications use three main interfaces in the ethdev library to discover
> and set cache stashing hints. rte_eth_dev_stashing_hints_tx interface is
> used to set hints on a Tx queue. rte_eth_dev_stashing_hints_rx interface
> is used to set hints on an Rx queue. Both of these functions take the
> following parameters as inputs: a port_id (the id of the ethernet
> device), a cpu_id (the target CPU), a cache_level (the level of the
> cache hierarchy the data should be stashed into), a queue_id (the queue
> the hints are applied to). In addition to the above list of parameters,
> a type parameter indicates the type of the object the application
> expects to be stashed by the hardware. Depending on the hardware, these
> may vary. Intel E810 NICs support the stashing of Rx/Tx descriptors,
> packet headers, and packet payloads. These are indicated by the macros
> RTE_ETH_DEV_STASH_TYPE_DESC, RTE_ETH_DEV_STASH_TYPE_HEADER,
> RTE_ETH_DEV_STASH_TYPE_PAYLOAD. Hardware capable of stashing data at any
> given offset into a packet can use the RTE_ETH_DEV_STASH_TYPE_OFFSET
> type. When an offset is used, the offset parameter in the above two
> functions should be set appropriately.
> 
> rte_eth_dev_stashing_hints_discover is used to discover the object types
> and hints supported in the platform and the device. The function takes
> types and hints pointers used as a bit vector to indicate hints and
> types supported by the NIC. An application that intends to use stashing
> hints should first discover supported hints and types and then use the
> functions rte_eth_dev_stashing_hints_tx and
> rte_eth_dev_stashing_hints_rx as required to set stashing hints
> accordingly. eth_dev_ops structure has been updated with two new ops
> that a PMD should implement to support cache stashing hints. A PMD that
> intends to support cache stashing hints should initialize the
> set_stashing_hints function pointer to a function that issues hints to
> the underlying hardware in compliance with platform capabilities. The
> same PMD should also implement a function that can return two-bit fields
> indicating supported types and hints and then initialize the
> discover_stashing_hints function pointer with it. If the NIC supports
> cache stashing hints, the NIC should always set the
> RTE_ETH_DEV_CAPA_CACHE_STASHING device capability.

Sounds like an interesting idea...
Do you plan to have a reference implementation in one (or few) actual PMDs?
 


* RE: [RFC v2] ethdev: an API for cache stashing hints
  2024-07-17  2:27 ` Stephen Hemminger
@ 2024-07-18 18:48   ` Wathsala Wathawana Vithanage
  2024-07-20  3:05   ` Honnappa Nagarahalli
  1 sibling, 0 replies; 7+ messages in thread
From: Wathsala Wathawana Vithanage @ 2024-07-18 18:48 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, thomas, Ferruh Yigit, Andrew Rybchenko, nd, Dhruv Tripathi,
	Honnappa Nagarahalli, nd

> 
> My initial reaction is negative on this. The DPDK does not need more nerd
> knobs for performance. If it is a performance win, it should be automatic and
> handled by the driver.
> 
> If you absolutely have to have another flag, then it should be in existing config
> (yes, extend the ABI) rather than adding more flags and calls in ethdev.


Thanks, Steve, for the feedback. My thesis is that in a DPDK-based packet processing system,
the application is more knowledgeable of memory buffer (packets) usage than the generic
underlying hardware or the PMD (I have provided some examples below with the hint they
would map into). Recognizing such cases, PCI-SIG introduced TLP Processing Hints (TPH).
Consequently, many interconnect designers enabled support for TPH in their interconnects so
that based on steering tags provided by an application to a NIC, which sets them in the TLP
header, memory buffers can be targeted toward a CPU at the desired level in the cache hierarchy.
With this proposed API, applications provide cache-stashing hints to ethernet devices to reduce
memory access latencies from the CPU and the NIC and thereby improve system performance.

Listed below are some use cases.

- A run-to-completion application may not need the next packet immediately in L1D. It may rather
issue a prefetch and do other work with packet and application data already in L1D before it needs
the next packet. A generic PMD will not know such subtleties in the application endpoint, and it
would resort to stashing buffers into the L1D indiscriminately or not stashing them at all. But with a
hint from the application, the packet buffers can be stashed at a cache level suitable for the
application. (like UNIX MADV_DONTNEED but for mbufs at cache line granularity)

- Similarly, a pipelined application may use a hint that advises that the buffers are needed in L1D as
soon as they arrive. (parallels MADV_WILLNEED)

- Let's call the time between a mbuf being allocated into an Rx queue, freed back into mempool in
the Tx path, and once again reallocated back in the same Rx queue the "buffer recycle window".
The length of the buffer recycle window is a function of the application in question; the PMD or the
NIC has no prior knowledge of this property of an application. A buffer may stay in the L1D of a CPU
throughout the entire recycle window if the window is short enough for that application.
An application with a short buffer recycle window may hint to the platform that the Tx buffer is not
needed in the CPU cache anytime soon, avoiding unnecessary cache invalidations when
the buffer gets written by the Rx packet for the second time. (parallels MADV_DONTNEED)
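
To make the three cases above concrete, here is a minimal sketch of how an
application might turn its own knowledge into a cache level and a Tx hint.
Only the two hint names come from the RFC; their numeric values, the
app_profile structure, the helper functions, and the L1D residency estimate
are all assumptions made for illustration, not part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hint names follow the RFC; the bit values are placeholders. */
#define RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED (1u << 0)
#define RTE_ETH_DEV_STASH_HINT_HOST_DONTNEED (1u << 3)

/* Hypothetical: what the application knows about itself. */
enum app_model { APP_RUN_TO_COMPLETION, APP_PIPELINED };

struct app_profile {
	enum app_model model;
	uint32_t recycle_window_ns; /* measured buffer recycle window */
	uint32_t l1d_residency_ns;  /* assumed time a line survives in L1D */
};

/* Cache level to request for Rx stashing: 1 = L1D, 2 = L2. */
static uint8_t
pick_rx_cache_level(const struct app_profile *p)
{
	assert(p != NULL);
	/* A pipelined stage consumes the packet on arrival, so ask for L1D.
	 * A run-to-completion loop prefetches later, so L2 is close enough. */
	return p->model == APP_PIPELINED ? 1 : 2;
}

/* Tx hints: request DONTNEED only when the buffer would otherwise still
 * be cached when the NIC rewrites it on the next Rx. */
static uint32_t
pick_tx_hints(const struct app_profile *p)
{
	assert(p != NULL);
	if (p->recycle_window_ns < p->l1d_residency_ns)
		return RTE_ETH_DEV_STASH_HINT_HOST_DONTNEED;
	return 0;
}
```

The point of the sketch is that both inputs (the processing model and the
recycle window) live in the application, which is why a PMD cannot pick these
values on its own.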


* Re: [RFC v2] ethdev: an API for cache stashing hints
  2024-07-17  2:27 ` Stephen Hemminger
  2024-07-18 18:48   ` Wathsala Wathawana Vithanage
@ 2024-07-20  3:05   ` Honnappa Nagarahalli
  1 sibling, 0 replies; 7+ messages in thread
From: Honnappa Nagarahalli @ 2024-07-20  3:05 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Wathsala Wathawana Vithanage, dev, thomas, Ferruh Yigit,
	Andrew Rybchenko, nd, Dhruv Tripathi



> On Jul 16, 2024, at 9:27 PM, Stephen Hemminger <stephen@networkplumber.org> wrote:
> 
> On Mon, 15 Jul 2024 22:11:41 +0000
> Wathsala Vithanage <wathsala.vithanage@arm.com> wrote:
> 
>> An application provides cache stashing hints to the ethernet devices to
>> improve memory access latencies from the CPU and the NIC. This patch
>> introduces three distinct hints for this purpose.
>> 
>> The RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED hint indicates that the host
>> (CPU) requires the data written by the NIC immediately. This implies
>> that the CPU expects to read data from its local cache rather than LLC
>> or main memory if possible. This would improve memory access latency in
>> the Rx path. For PCI devices with TPH capability, these hints translate
>> into DWHR (Device Writes Host Reads) access pattern. This hint is only
>> valid for receive queues.
>> 
>> The RTE_ETH_DEV_STASH_HINT_BI_DIR_DATA hint indicates that the host and
>> the device access the data structure equally. Rx/Tx queue descriptors
>> fit the description of such data. This hint applies to both Rx and Tx
>> directions.  In the PCI TPH context, this hint translates into a
>> Bi-Directional access pattern.
>> 
>> RTE_ETH_DEV_STASH_HINT_DEV_ONLY hint indicates that the CPU is not
>> involved in a given device's receive or transmit paths. This implies
>> that only devices are involved in the IO path. Depending on the
>> implementation, this hint may result in data getting placed in a cache
>> close to the device or not cached at all. For PCI devices with TPH
>> capability, this hint translates into D*D* (DWDR, DRDW, DWDW, DRDR)
>> access patterns. This is a bidirectional hint, and it can be applied to
>> both Rx and Tx queues.  
>> 
>> The RTE_ETH_DEV_STASH_HINT_HOST_DONTNEED hint indicates that the device
>> reads data written by the host (CPU) that may still be in the host's
>> local cache but is not required by the host anytime soon. This hint is
>> intended to prevent unnecessary cache invalidations that cause
>> interconnect latencies when a device writes to a buffer already in host
>> cache memory. In DPDK, this could happen with the recycling of mbufs
>> where a mbuf is placed in the Tx queue that then gets back into mempool
>> and gets recycled back into the Rx queue, all while a copy is being held
>> in the CPU's local cache unnecessarily. By using this hint on supported
>> platforms, the mbuf will be invalidated after the device completes the
>> buffer reading, but it will be well before the buffer gets recycled and
>> updated in the Rx path. This hint is only valid for transmit queues. 
>> 
>> Applications use three main interfaces in the ethdev library to discover
>> and set cache stashing hints. rte_eth_dev_stashing_hints_tx interface is
>> used to set hints on a Tx queue. rte_eth_dev_stashing_hints_rx interface
>> is used to set hints on an Rx queue. Both of these functions take the
>> following parameters as inputs: a port_id (the id of the ethernet
>> device), a cpu_id (the target CPU), a cache_level (the level of the
>> cache hierarchy the data should be stashed into), a queue_id (the queue
>> the hints are applied to). In addition to the above list of parameters,
>> a type parameter indicates the type of the object the application
>> expects to be stashed by the hardware. Depending on the hardware, these
>> may vary. Intel E810 NICs support the stashing of Rx/Tx descriptors,
>> packet headers, and packet payloads. These are indicated by the macros
>> RTE_ETH_DEV_STASH_TYPE_DESC, RTE_ETH_DEV_STASH_TYPE_HEADER,
>> RTE_ETH_DEV_STASH_TYPE_PAYLOAD. Hardware capable of stashing data at any
>> given offset into a packet can use the RTE_ETH_DEV_STASH_TYPE_OFFSET
>> type. When an offset is used, the offset parameter in the above two
>> functions should be set appropriately.
>> 
>> rte_eth_dev_stashing_hints_discover is used to discover the object types
>> and hints supported in the platform and the device. The function takes
>> types and hints pointers used as a bit vector to indicate hints and
>> types supported by the NIC. An application that intends to use stashing
>> hints should first discover supported hints and types and then use the
>> functions rte_eth_dev_stashing_hints_tx and
>> rte_eth_dev_stashing_hints_rx as required to set stashing hints
>> accordingly. eth_dev_ops structure has been updated with two new ops
>> that a PMD should implement to support cache stashing hints. A PMD that
>> intends to support cache stashing hints should initialize the
>> set_stashing_hints function pointer to a function that issues hints to
>> the underlying hardware in compliance with platform capabilities. The
>> same PMD should also implement a function that can return two-bit fields
>> indicating supported types and hints and then initialize the
>> discover_stashing_hints function pointer with it. If the NIC supports
>> cache stashing hints, the NIC should always set the
>> RTE_ETH_DEV_CAPA_CACHE_STASHING device capability.
>> 
>> Signed-off-by: Wathsala Vithanage <wathsala.vithanage@arm.com>
>> Reviewed-by: Dhruv Tripathi <dhruv.tripathi@arm.com>
> 
> My initial reaction is negative on this. The DPDK does not need more nerd knobs
> for performance. If it is a performance win, it should be automatic and handled
> by the driver.
> 
IMO, DPDK provides low-level APIs, and they should give users the flexibility to control which part of the data from the NIC is stashed where. For example, currently available systems across multiple architectures provide a system-wide configuration to control stashing data from the NIC to the system cache. That configuration allows either all of the data from the NIC to be stashed or none of it, whereas some applications need access to just the headers and others need access to all the packet data.
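
The header-only versus full-packet distinction could be expressed with the
type bit vector from the RFC. The sketch below is illustrative only: the
three type names come from the patch description, but the bit positions and
the stash_types_select() helper are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Type names follow the RFC; the bit positions are placeholders. */
#define RTE_ETH_DEV_STASH_TYPE_DESC    (1u << 0)
#define RTE_ETH_DEV_STASH_TYPE_HEADER  (1u << 1)
#define RTE_ETH_DEV_STASH_TYPE_PAYLOAD (1u << 2)

#define STASH_TYPE_ALL (RTE_ETH_DEV_STASH_TYPE_DESC | \
			RTE_ETH_DEV_STASH_TYPE_HEADER | \
			RTE_ETH_DEV_STASH_TYPE_PAYLOAD)

/* Hypothetical helper: keep only the requested object types that the
 * NIC reported as supported during discovery. */
static uint32_t
stash_types_select(uint32_t wanted, uint32_t supported)
{
	/* Unknown bits in the request indicate a caller bug. */
	assert((wanted & ~STASH_TYPE_ALL) == 0);
	return wanted & supported;
}
```

A forwarding application would pass DESC | HEADER as the wanted set, while an
application that inspects payloads would pass STASH_TYPE_ALL; in both cases
the result is limited to what the device advertises.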

> If you absolutely have to have another flag, then it should be in existing config
> (yes, extend the ABI) rather than adding more flags and calls in ethdev.
Agree. Extending the ABI would be a better solution than adding another set of APIs.



* Re: [RFC v2] ethdev: an API for cache stashing hints
  2024-07-15 22:11 [RFC v2] ethdev: an API for cache stashing hints Wathsala Vithanage
  2024-07-17  2:27 ` Stephen Hemminger
  2024-07-17 10:32 ` Konstantin Ananyev
@ 2024-07-22 11:18 ` Ferruh Yigit
  2024-07-26 20:01   ` Wathsala Wathawana Vithanage
  2 siblings, 1 reply; 7+ messages in thread
From: Ferruh Yigit @ 2024-07-22 11:18 UTC (permalink / raw)
  To: Wathsala Vithanage, dev, Thomas Monjalon, Andrew Rybchenko
  Cc: nd, Dhruv Tripathi

On 7/15/2024 11:11 PM, Wathsala Vithanage wrote:
> An application provides cache stashing hints to the ethernet devices to
> improve memory access latencies from the CPU and the NIC. This patch
> introduces three distinct hints for this purpose.
> 
> The RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED hint indicates that the host
> (CPU) requires the data written by the NIC immediately. This implies
> that the CPU expects to read data from its local cache rather than LLC
> or main memory if possible. This would improve memory access latency in
> the Rx path. For PCI devices with TPH capability, these hints translate
> into DWHR (Device Writes Host Reads) access pattern. This hint is only
> valid for receive queues.
> 
> The RTE_ETH_DEV_STASH_HINT_BI_DIR_DATA hint indicates that the host and
> the device access the data structure equally. Rx/Tx queue descriptors
> fit the description of such data. This hint applies to both Rx and Tx
> directions.  In the PCI TPH context, this hint translates into a
> Bi-Directional access pattern.
> 
> RTE_ETH_DEV_STASH_HINT_DEV_ONLY hint indicates that the CPU is not
> involved in a given device's receive or transmit paths. This implies
> that only devices are involved in the IO path. Depending on the
> implementation, this hint may result in data getting placed in a cache
> close to the device or not cached at all. For PCI devices with TPH
> capability, this hint translates into D*D* (DWDR, DRDW, DWDW, DRDR)
> access patterns. This is a bidirectional hint, and it can be applied to
> both Rx and Tx queues.  
> 
> The RTE_ETH_DEV_STASH_HINT_HOST_DONTNEED hint indicates that the device
> reads data written by the host (CPU) that may still be in the host's
> local cache but is not required by the host anytime soon. This hint is
> intended to prevent unnecessary cache invalidations that cause
> interconnect latencies when a device writes to a buffer already in host
> cache memory. In DPDK, this could happen with the recycling of mbufs
> where a mbuf is placed in the Tx queue that then gets back into mempool
> and gets recycled back into the Rx queue, all while a copy is being held
> in the CPU's local cache unnecessarily. By using this hint on supported
> platforms, the mbuf will be invalidated after the device completes the
> buffer reading, but it will be well before the buffer gets recycled and
> updated in the Rx path. This hint is only valid for transmit queues. 
> 
> Applications use three main interfaces in the ethdev library to discover
> and set cache stashing hints. rte_eth_dev_stashing_hints_tx interface is
> used to set hints on a Tx queue. rte_eth_dev_stashing_hints_rx interface
> is used to set hints on an Rx queue. Both of these functions take the
> following parameters as inputs: a port_id (the id of the ethernet
> device), a cpu_id (the target CPU), a cache_level (the level of the
> cache hierarchy the data should be stashed into), a queue_id (the queue
> the hints are applied to). In addition to the above list of parameters,
> a type parameter indicates the type of the object the application
> expects to be stashed by the hardware. Depending on the hardware, these
> may vary. Intel E810 NICs support the stashing of Rx/Tx descriptors,
> packet headers, and packet payloads. These are indicated by the macros
> RTE_ETH_DEV_STASH_TYPE_DESC, RTE_ETH_DEV_STASH_TYPE_HEADER,
> RTE_ETH_DEV_STASH_TYPE_PAYLOAD. Hardware capable of stashing data at any
> given offset into a packet can use the RTE_ETH_DEV_STASH_TYPE_OFFSET
> type. When an offset is used, the offset parameter in the above two
> functions should be set appropriately.
> 
> rte_eth_dev_stashing_hints_discover is used to discover the object types
> and hints supported in the platform and the device. The function takes
> types and hints pointers used as a bit vector to indicate hints and
> types supported by the NIC. An application that intends to use stashing
> hints should first discover supported hints and types and then use the
> functions rte_eth_dev_stashing_hints_tx and
> rte_eth_dev_stashing_hints_rx as required to set stashing hints
> accordingly. eth_dev_ops structure has been updated with two new ops
> that a PMD should implement to support cache stashing hints. A PMD that
> intends to support cache stashing hints should initialize the
> set_stashing_hints function pointer to a function that issues hints to
> the underlying hardware in compliance with platform capabilities. The
> same PMD should also implement a function that can return two-bit fields
> indicating supported types and hints and then initialize the
> discover_stashing_hints function pointer with it. If the NIC supports
> cache stashing hints, the NIC should always set the
> RTE_ETH_DEV_CAPA_CACHE_STASHING device capability.
> 
> Signed-off-by: Wathsala Vithanage <wathsala.vithanage@arm.com>
> Reviewed-by: Dhruv Tripathi <dhruv.tripathi@arm.com>
> 

This is a fine-grained config for performance improvement; it may help
to see the performance impact and driver implementation complexity
before deciding how practical it is.
As for the ethdev API, as long as it is a separate set of APIs, I don't see
much concern in having them.

In existing FEC APIs, and now in the speed lane patchset [1], we are
following similar design with three APIs:
rte_eth_X_set()
rte_eth_X_get()
rte_eth_X_get_capability()

Instead of adding an RTE_ETH_DEV_CAPA_ macro and contaminating
'rte_eth_dev_info' with this edge use case, what do you think about following
the above design and having a dedicated get capability API?

And I can see set() has two different APIs,
'rte_eth_dev_stashing_hints_rx' & 'rte_eth_dev_stashing_hints_tx'. Is
there a reason to have two separate APIs instead of having one which
takes the direction (Rx or Tx) as an argument, as done in the internal device ops?



[1]
https://patches.dpdk.org/project/dpdk/list/?series=32450&state=*





* RE: [RFC v2] ethdev: an API for cache stashing hints
  2024-07-22 11:18 ` Ferruh Yigit
@ 2024-07-26 20:01   ` Wathsala Wathawana Vithanage
  0 siblings, 0 replies; 7+ messages in thread
From: Wathsala Wathawana Vithanage @ 2024-07-26 20:01 UTC (permalink / raw)
  To: Ferruh Yigit, dev, thomas, Andrew Rybchenko
  Cc: nd, Dhruv Tripathi, Honnappa Nagarahalli, nd

> rte_eth_X_get_capability()
> 

rte_eth_dev_stashing_hints_discover is somewhat similar.

> Instead of adding RTE_ETH_DEV_CAPA_ macro and contaminating
> 'rte_eth_dev_info' with this edge use case, what do you think follow above
> design and have dedicated get capability API?

I think it's better to have a dedicated API, given that we already have a
fine-grained capabilities discovery function. I will incorporate this feedback
into V3 of the RFC.

> 
> And I can see set() has two different APIs, 'rte_eth_dev_stashing_hints_rx' &
> 'rte_eth_dev_stashing_hints_tx', is there a reason to have two separate APIs
> instead of having one which gets RX & TX as argument, as done in internal
> device ops?

Some types/hints may only apply to a single queue direction, so I thought it
would be better to separate them out into separate Rx and Tx APIs for ease
of comprehension/use for the developer.
In fact, underneath, it uses one API for both Rx and Tx.
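
As a rough illustration of that layering, the sketch below shows two thin
direction-specific wrappers over one shared routine that rejects hints which
do not apply to the given direction. The hint names and their Rx-only and
Tx-only restrictions are taken from the RFC text; the bit values, the
stash_dir enum, and the simplified function names and signatures are
assumptions and do not match the actual patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hint names follow the RFC; the bit values are placeholders. */
#define RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED (1u << 0) /* Rx only */
#define RTE_ETH_DEV_STASH_HINT_BI_DIR_DATA   (1u << 1) /* Rx and Tx */
#define RTE_ETH_DEV_STASH_HINT_DEV_ONLY      (1u << 2) /* Rx and Tx */
#define RTE_ETH_DEV_STASH_HINT_HOST_DONTNEED (1u << 3) /* Tx only */

enum stash_dir { STASH_DIR_RX, STASH_DIR_TX }; /* hypothetical */

/* Shared routine: validate the hints against the queue direction. */
static int
stash_hints_set(enum stash_dir dir, uint16_t queue_id, uint32_t hints)
{
	uint32_t invalid;

	assert(dir == STASH_DIR_RX || dir == STASH_DIR_TX);
	invalid = (dir == STASH_DIR_RX) ?
		RTE_ETH_DEV_STASH_HINT_HOST_DONTNEED :
		RTE_ETH_DEV_STASH_HINT_HOST_WILLNEED;
	if (hints & invalid)
		return -EINVAL;

	(void)queue_id; /* a real PMD op would program the queue here */
	return 0;
}

/* Direction-specific wrappers, so callers cannot mix up directions. */
static int
stash_hints_rx(uint16_t queue_id, uint32_t hints)
{
	return stash_hints_set(STASH_DIR_RX, queue_id, hints);
}

static int
stash_hints_tx(uint16_t queue_id, uint32_t hints)
{
	return stash_hints_set(STASH_DIR_TX, queue_id, hints);
}
```

With this shape, the direction check lives in one place while the public
names still tell the developer which queue type a call targets.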



