DPDK patches and discussions
From: Wathsala Vithanage <wathsala.vithanage@arm.com>
Cc: dev@dpdk.org, nd@arm.com,
	Wathsala Vithanage <wathsala.vithanage@arm.com>
Subject: [RFC v3 0/2] An API for Stashing Packets into CPU caches
Date: Mon, 21 Oct 2024 01:52:44 +0000
Message-ID: <20241021015246.304431-1-wathsala.vithanage@arm.com>
In-Reply-To: <20240715221141.16153-1-wathsala.vithanage@arm.com>

DPDK applications benefit from Direct Cache Access (DCA) features like
Intel DDIO and Arm's write-allocate-to-SLC. However, those features do
not allow fine-grained control of direct cache access, such as stashing
packets into upper-level caches (L2 caches) of a processor or the shared
cache of a chiplet. PCIe TLP Processing Hints (TPH) addresses this need
in a vendor-agnostic manner. TPH capability has existed since
PCI Express Base Specification revision 3.0; today, numerous Network
Interface Cards and interconnects from different vendors support TPH
capability. TPH comprises a steering tag (ST) and a processing hint
(PH). ST specifies the cache level of a CPU at which the data should be
written to (or DCAed into), while PH is a hint provided by the PCIe
requester to the completer on an upcoming traffic pattern. Some NIC
vendors bundle TPH capability with fine-grained control over the type of
objects that can be stashed into CPU caches, such as

- Rx/Tx queue descriptors
- Packet-headers
- Packet-payloads
- Data from a given offset from the start of a packet

Note that stashable object types are outside the scope of PCIe standard;
therefore, vendors could support any combination of the above items as
they see fit.
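
As a rough illustration of the ST/PH pair described above, the sketch
below models a hint in C. The four PH encodings are the ones listed in
the PCI Express Base Specification; every sketch_* name and the packed
layout are hypothetical and exist only for this example, they are not
part of the proposed API or of any device's register format.

```c
#include <stdint.h>

/* The 2-bit Processing Hint encodings from the PCIe Base Specification. */
enum sketch_tph_ph {
	SKETCH_TPH_PH_BIDIRECTIONAL = 0x0, /* host and device both access the data */
	SKETCH_TPH_PH_REQUESTER     = 0x1, /* mostly accessed by the device */
	SKETCH_TPH_PH_TARGET        = 0x2, /* mostly accessed by the host */
	SKETCH_TPH_PH_TARGET_PRIO   = 0x3, /* target, with priority */
};

/* Hypothetical container for one hint: an 8-bit ST plus a PH. */
struct sketch_tph_hint {
	uint8_t st;
	enum sketch_tph_ph ph;
};

/*
 * Made-up packing for illustration only: ST in bits 9:2, PH in bits 1:0.
 * Real devices define their own queue-context layout.
 */
static inline uint16_t
sketch_tph_hint_pack(struct sketch_tph_hint h)
{
	return (uint16_t)(((uint16_t)h.st << 2) | ((uint16_t)h.ph & 0x3u));
}
```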

To enable TPH and fine-grained packet stashing, this API extends the
ethdev library, the PCI library, and the PCI bus driver. In this design,
the application uses the ethdev stashing API to give the PMD hints that
tell the underlying hardware at which processor and cache level it
prefers a packet to end up. Once the PMD receives a CPU and cache-level
combination, it must extract the matching ST from the TPH ACPI _DSM of
the PCIe root port to which the NIC is connected. To facilitate the
extraction of STs, the PCI library and the PCI bus driver APIs are
extended.

The PMD's implementations of the eth_dev_ops function pointers
stashing_rx_hints_set and stashing_tx_hints_set are responsible for
extracting the ST. The PCI bus driver provides a generic TPH ST
extraction API that any PMD driving a PCIe device can use. The
extraction process begins by calling the rte_pci_extract_tph_st()
function in drivers/bus/pci/rte_bus_pci.h, which takes an initialized
input object rte_tph_acpi__dsm_args and a pointer to
rte_tph_acpi__dsm_return to store the ST returned by the TPH _DSM. The
rte_tph_acpi__dsm_args and rte_tph_acpi__dsm_return objects are declared
in lib/pci/rte_pci_tph.h and follow the definitions in the PCIe firmware
specification and the associated ECN titled "Revised _DSM for Cache
Locality TPH Features".

The helper function rte_init_tph_acpi__dsm_args is used by
rte_pci_extract_tph_st() to convert the lcore_id and cache_level
provided by the PMD into a well-formatted rte_tph_acpi__dsm_args. The
processor or, in some cases, container ID (which is synonymous with a
core complex of a chiplet die) and the cache level in the
rte_tph_acpi__dsm_args structure are not the same as the lcore_id and
the cache_level that the application provides to the ethdev library and
that the PMD passes down to the rte_pci_extract_tph_st() function. The
rte_init_tph_acpi__dsm_args helper converts lcore_id to an APIC
processor-id, or to a PPTT processor-container-id if the application
requested the container of the lcore_id as the target. Similarly, it
must convert cache_level to a PPTT cache-reference-id. These conversions
are possible with the hwloc library or some other library DPDK may
eventually provide.

However, DPDK cannot execute the TPH _DSM directly, as that can only be
done with kernel privileges. Therefore, appropriate mechanisms must be
established in the supported operating systems (Linux, FreeBSD, and
Windows) to expose the _DSM return value for a given argument. For
instance, on Linux, this mechanism could be sysfs. For this reason,
rte_pci_extract_tph_st() is implemented in the OS-specific files
drivers/bus/pci/{bsd, linux, windows}/pci.c.
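
The extraction flow described above (convert lcore_id and cache_level
into _DSM arguments, then ask an OS-specific backend for the ST) could
be sketched roughly as follows. This is not the code from the patches:
all sketch_* types and functions, the field names, the 4-lcores-per-
container topology, and the cache reference arithmetic are assumptions
made only so the example is self-contained. The real structure layouts
are the ones in lib/pci/rte_pci_tph.h.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the _DSM argument and return objects. */
struct sketch_dsm_args {
	uint32_t target_id;    /* processor id, or container id */
	bool     is_container; /* true if target_id names a container */
	uint32_t cache_ref;    /* PPTT-style cache reference id */
};

struct sketch_dsm_return {
	uint16_t st;    /* steering tag reported by firmware */
	bool     valid; /* false if firmware has no tag for the target */
};

/* OS-specific backend, for example a sysfs read on Linux. */
typedef int (*sketch_dsm_query_t)(const struct sketch_dsm_args *args,
				  struct sketch_dsm_return *ret);

/*
 * Placeholder topology: ids map 1:1 and 4 lcores share a container.
 * A real conversion would consult hwloc or a similar library.
 */
static void
sketch_init_dsm_args(unsigned int lcore_id, unsigned int cache_level,
		     bool container, struct sketch_dsm_args *args)
{
	args->is_container = container;
	args->target_id = container ? lcore_id / 4 : lcore_id;
	args->cache_ref = args->target_id * 8 + cache_level; /* made-up id */
}

static int
sketch_extract_tph_st(unsigned int lcore_id, unsigned int cache_level,
		      bool container, sketch_dsm_query_t query,
		      struct sketch_dsm_return *ret)
{
	struct sketch_dsm_args args;
	int rc;

	if (query == NULL || ret == NULL)
		return -EINVAL;

	sketch_init_dsm_args(lcore_id, cache_level, container, &args);
	rc = query(&args, ret);
	if (rc != 0)
		return rc;
	return ret->valid ? 0 : -ENOTSUP;
}

/* Fake backend so the sketch runs without firmware: the tag is simply
 * the low byte of the cache reference. */
static int
sketch_fake_query(const struct sketch_dsm_args *args,
		  struct sketch_dsm_return *ret)
{
	ret->st = (uint16_t)(args->cache_ref & 0xff);
	ret->valid = true;
	return 0;
}

/* Backend modelling an OS that exposes no _DSM mechanism. */
static int
sketch_unsupported_query(const struct sketch_dsm_args *args,
			 struct sketch_dsm_return *ret)
{
	(void)args;
	(void)ret;
	return -ENOTSUP;
}
```

Passing the backend as a function pointer mirrors the split between the
generic API and the per-OS pci.c files: the conversion logic stays
common, and only the query step differs per operating system.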

Once the ST is acquired from the OS-specific method described earlier,
the stashing_rx_hints_set/stashing_tx_hints_set PMD implementations are
ready to set the ST. As per the PCIe specification, hints can be placed
in the MSI-X table or set using a device-specific method. Accordingly,
many NICs that support TPH allow setting steering tags and processing
hints in the device's MSI-X table and in queue contexts. For PMDs,
setting the ST in queue contexts is the only viable method of using TPH.
Therefore, DPDK can only support setting the ST in queue contexts.

An application uses the cache stashing ethdev API by first calling the
rte_eth_dev_stashing_capabilities_get() function to find out which of
the object types in the bulleted list above the NIC can stash into a
processor cache. This function takes a port_id and a pointer to a
uint16_t through which it reports the object type flags. The PMD
implements the stashing_capabilities_get function pointer in
eth_dev_ops. If the underlying platform or the NIC does not support TPH,
this function returns -ENOTSUP, and the application should consider any
value stored through the objects pointer invalid.
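
The capability check described above might be used along these lines.
Because the proposed ethdev call is not available outside this series,
the sketch uses a stand-in, sketch_capabilities_get(), with the same
shape (a port_id and a uint16_t pointer, -ENOTSUP when TPH is missing).
The SKETCH_STASH_OBJECT_* flag names and values, and the pretend rule
that only port 0 supports stashing, are assumptions for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags, one per stashable object type. */
#define SKETCH_STASH_OBJECT_DESC    (UINT16_C(1) << 0) /* Rx/Tx descriptors */
#define SKETCH_STASH_OBJECT_HEADER  (UINT16_C(1) << 1) /* packet headers */
#define SKETCH_STASH_OBJECT_PAYLOAD (UINT16_C(1) << 2) /* packet payloads */
#define SKETCH_STASH_OBJECT_OFFSET  (UINT16_C(1) << 3) /* data at an offset */

/*
 * Stand-in for the capability query. In this sketch only port 0 has
 * TPH support, and it can stash descriptors and headers.
 */
static int
sketch_capabilities_get(uint16_t port_id, uint16_t *objects)
{
	if (objects == NULL)
		return -EINVAL;
	if (port_id != 0)
		return -ENOTSUP;
	*objects = SKETCH_STASH_OBJECT_DESC | SKETCH_STASH_OBJECT_HEADER;
	return 0;
}

/*
 * Return the subset of 'wanted' that the port can stash, or 0 if the
 * query fails. On failure the reported value is never read, since it
 * must be treated as invalid.
 */
static uint16_t
sketch_select_objects(uint16_t port_id, uint16_t wanted)
{
	uint16_t supported = 0;

	if (sketch_capabilities_get(port_id, &supported) != 0)
		return 0;
	return (uint16_t)(wanted & supported);
}
```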

Once the application knows which object types can be stashed, the next
step is to set the steering tags for the packets associated with Rx and
Tx queues via the rte_eth_dev_stashing_rx_config_set() and
rte_eth_dev_stashing_tx_config_set() ethdev library functions,
respectively. These functions invoke the PMD implementations of the
stashing functionality through the eth_dev_ops function pointers
stashing_rx_hints_set and stashing_tx_hints_set, respectively, which in
turn execute rte_pci_extract_tph_st(). Both functions have an identical
signature: a port_id, a queue_id, and a config object. The port_id and
the queue_id are used to locate the device and the queue. The config
object is of type struct rte_eth_stashing_config, which specifies the
lcore_id and the cache_level, indicating where objects from this queue
should be stashed. It also has the field 'container' to indicate whether
the target should be the container of the processor specified by the
lcore_id in a chiplet-based SoC. The 'objects' field in the config sets
the types of objects the application wishes to stash, based on the
capabilities found earlier. If the objects field includes the flag
RTE_ETH_DEV_STASH_OBJECT_OFFSET, the 'offset' field must be used to set
the desired offset.
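
To make the config object more concrete, the sketch below models the
fields named above (lcore_id, container, cache_level, objects, offset)
and a validation step a PMD or the ethdev layer might perform. The
struct is a stand-in, not the actual struct rte_eth_stashing_config;
field types, the SKETCH_* flag values, and the rule that a non-zero
offset without the offset flag is rejected are all assumptions made for
this example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object-type flags (values are illustrative). */
#define SKETCH_STASH_OBJECT_DESC    (UINT16_C(1) << 0)
#define SKETCH_STASH_OBJECT_HEADER  (UINT16_C(1) << 1)
#define SKETCH_STASH_OBJECT_PAYLOAD (UINT16_C(1) << 2)
#define SKETCH_STASH_OBJECT_OFFSET  (UINT16_C(1) << 3)

/* Stand-in for the per-queue stashing configuration. */
struct sketch_stashing_config {
	unsigned int lcore_id;    /* lcore whose cache is the target */
	uint8_t      container;   /* 1: target the lcore's container instead */
	uint8_t      cache_level; /* cache level to stash into */
	uint16_t     objects;     /* SKETCH_STASH_OBJECT_* flags */
	uint32_t     offset;      /* used only with SKETCH_STASH_OBJECT_OFFSET */
};

/*
 * Check a config against the capability flags reported for the port.
 * Returns 0 if usable, -ENOTSUP if it asks for an unsupported object
 * type, and -EINVAL for an empty or inconsistent request.
 */
static int
sketch_stashing_config_check(const struct sketch_stashing_config *cfg,
			     uint16_t supported)
{
	if (cfg == NULL || cfg->objects == 0)
		return -EINVAL;
	if ((cfg->objects & (uint16_t)~supported) != 0)
		return -ENOTSUP;
	/* Assumed rule: an offset is meaningful only with the offset flag. */
	if (!(cfg->objects & SKETCH_STASH_OBJECT_OFFSET) && cfg->offset != 0)
		return -EINVAL;
	return 0;
}
```

An application would fill one such config per queue, for example
requesting headers plus 64 bytes from a given offset into the L2 cache
of the lcore that polls the queue, and pass it to the Rx or Tx setter.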


Wathsala Vithanage (2):
  pci: introduce the PCIe TLP Processing Hints API
  ethdev: introduce the cache stashing hints API

 drivers/bus/pci/bsd/pci.c     |  12 +++
 drivers/bus/pci/linux/pci.c   |  12 +++
 drivers/bus/pci/rte_bus_pci.h |  22 +++++
 drivers/bus/pci/version.map   |   3 +
 drivers/bus/pci/windows/pci.c |  14 +++
 lib/ethdev/ethdev_driver.h    |  66 ++++++++++++++
 lib/ethdev/rte_ethdev.c       | 120 ++++++++++++++++++++++++++
 lib/ethdev/rte_ethdev.h       | 156 ++++++++++++++++++++++++++++++++++
 lib/ethdev/version.map        |   4 +
 lib/pci/meson.build           |   2 +
 lib/pci/rte_pci.h             |   2 +
 lib/pci/rte_pci_tph.c         |  20 +++++
 lib/pci/rte_pci_tph.h         | 111 ++++++++++++++++++++++++
 13 files changed, 544 insertions(+)
 create mode 100644 lib/pci/rte_pci_tph.c
 create mode 100644 lib/pci/rte_pci_tph.h

-- 
2.34.1

