From: Wathsala Vithanage
Cc: dev@dpdk.org, nd@arm.com, Wathsala Vithanage
Subject: [RFC v3 0/2] An API for Stashing Packets into CPU caches
Date: Mon, 21 Oct 2024 01:52:44 +0000
Message-Id: <20241021015246.304431-1-wathsala.vithanage@arm.com>
In-Reply-To: <20240715221141.16153-1-wathsala.vithanage@arm.com>
References: <20240715221141.16153-1-wathsala.vithanage@arm.com>

DPDK applications benefit from Direct Cache Access (DCA) features like
Intel DDIO and Arm's write-allocate-to-SLC. However, those features do
not allow fine-grained control of direct cache access, such as stashing
packets into upper-level caches (L2 caches) of a processor or the shared
cache of a chiplet. PCIe TLP Processing Hints (TPH) addresses this need
in a vendor-agnostic manner.
TPH capability has existed since PCI Express Base Specification revision
3.0; today, numerous Network Interface Cards and interconnects from
different vendors support it. TPH comprises a steering tag (ST) and a
processing hint (PH). The ST specifies the CPU cache level to which the
data should be written (or DCAed into), while the PH is a hint from the
PCIe requester to the completer about an upcoming traffic pattern. Some
NIC vendors bundle TPH capability with fine-grained control over the
type of objects that can be stashed into CPU caches, such as

- Rx/Tx queue descriptors
- Packet headers
- Packet payloads
- Data from a given offset from the start of a packet

Note that stashable object types are outside the scope of the PCIe
standard; therefore, vendors could support any combination of the above
items as they see fit.

To enable TPH and fine-grained packet stashing, this API extends the
ethdev library, the PCI library, and the PCI driver. In this design, the
application uses the ethdev stashing API to give the PMD hints that tell
the underlying hardware at which processor and cache level it prefers a
packet to end up. Once the PMD receives a CPU and cache-level
combination, it must extract the matching ST from the TPH ACPI _DSM of
the PCIe root port to which the NIC is connected. To facilitate the
extraction of STs, the PCI library and the PCI driver APIs are extended.
The PMD's implementations of the eth_dev_ops function pointers
stashing_rx_hints_set and stashing_tx_hints_set are responsible for
extracting the ST. The PCI bus driver provides the generic TPH ST
extraction API, which can be used by any PMD that drives a PCIe device.
The extraction process begins with a call to the rte_pci_extract_tph_st()
function in drivers/bus/pci/rte_bus_pci.h, which takes an initialized
input object rte_tph_acpi__dsm_args and a pointer to an
rte_tph_acpi__dsm_return in which to store the ST returned by the TPH
_DSM.
The rte_tph_acpi__dsm_args and rte_tph_acpi__dsm_return objects are
declared in lib/pci/rte_pci_tph.h and follow the definitions in the PCIe
firmware specification and the associated ECN titled "Revised _DSM for
Cache Locality TPH Features". The helper function
rte_init_tph_acpi__dsm_args is used by rte_pci_extract_tph_st() to
convert the lcore_id and cache_level provided by the PMD into a
well-formatted rte_tph_acpi__dsm_args. The processor (or, in some cases,
a container ID, which is synonymous with a core complex of a chiplet
die) and the cache level in the rte_tph_acpi__dsm_args structure are not
the same as the lcore_id and the cache_level that the application
provides to the ethdev library and that the PMD passes down to the
rte_pci_extract_tph_st() function. The rte_init_tph_acpi__dsm_args
helper converts lcore_id to an APIC processor-id, or to a PPTT
processor-container-id if the application requested the container of the
lcore_id as the target. Similarly, it must convert cache_level to a PPTT
cache-reference-id. These conversions are possible with the hwloc
library or some other library DPDK may eventually provide.

However, DPDK cannot execute the TPH _DSM directly, as that requires
kernel privileges. Therefore, appropriate mechanisms must be established
in the supported operating systems (Linux, FreeBSD, and Windows) to
expose the _DSM return for a given argument. For instance, on Linux,
this mechanism could be sysfs. For this reason, rte_pci_extract_tph_st()
is implemented in the OS-specific files
drivers/bus/pci/{bsd, linux, windows}/pci.c.

Once the ST is acquired through the OS-specific method described above,
the stashing_rx_hints_set/stashing_tx_hints_set PMD implementations are
ready to set the ST. As per the PCIe specification, hints can be placed
in the MSI-X table or set using a device-specific method. Accordingly,
many NICs that support TPH allow setting steering tags and processing
hints in the device's MSI-X table and in queue contexts.
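The conversion and lookup steps described above can be sketched as
follows. This is a minimal, self-contained illustration under stated
assumptions, not the code in this series: every type, field, function,
and ID value below (sketch_dsm_args, sketch_lookup_st, the topology
tables) is hypothetical. The tables stand in for a hwloc or PPTT lookup,
the cache reference is indexed by level only for brevity, and
sketch_lookup_st() fabricates an ST where a real implementation would
read the _DSM return exposed by the OS.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the _DSM argument and return objects. */
struct sketch_dsm_args {
	uint32_t target_id;  /* processor ID, or container ID if is_container */
	bool is_container;   /* true when the whole container is the target */
	uint32_t cache_ref;  /* stand-in for a PPTT cache-reference-id */
};

struct sketch_dsm_return {
	uint16_t st;         /* steering tag reported for the target */
};

#define SKETCH_NB_LCORES 4
#define SKETCH_NB_CACHE_LEVELS 3

/* Made-up topology; a real implementation would query hwloc or similar. */
static const uint32_t sketch_proc_id[SKETCH_NB_LCORES] =
	{ 0x10, 0x11, 0x20, 0x21 };
static const uint32_t sketch_container_id[SKETCH_NB_LCORES] =
	{ 0x100, 0x100, 0x200, 0x200 };
static const uint32_t sketch_cache_ref[SKETCH_NB_CACHE_LEVELS] =
	{ 0xA1, 0xA2, 0xA3 };

/* Convert the (lcore_id, cache_level) pair into firmware-level IDs. */
static int
sketch_init_dsm_args(unsigned int lcore_id, unsigned int cache_level,
		     bool container, struct sketch_dsm_args *args)
{
	if (args == NULL || lcore_id >= SKETCH_NB_LCORES ||
	    cache_level < 1 || cache_level > SKETCH_NB_CACHE_LEVELS)
		return -EINVAL;

	args->is_container = container;
	args->target_id = container ? sketch_container_id[lcore_id]
				    : sketch_proc_id[lcore_id];
	args->cache_ref = sketch_cache_ref[cache_level - 1];
	return 0;
}

/* Stand-in for the OS-specific query; it fabricates an ST value. */
static int
sketch_lookup_st(const struct sketch_dsm_args *args,
		 struct sketch_dsm_return *ret)
{
	if (args == NULL || ret == NULL)
		return -EINVAL;
	ret->st = (uint16_t)((args->target_id ^ args->cache_ref) & 0xFF);
	return 0;
}

/* What a PMD hints_set callback might do before programming the device. */
static int
sketch_extract_st(unsigned int lcore_id, unsigned int cache_level,
		  bool container, uint16_t *st)
{
	struct sketch_dsm_args args;
	struct sketch_dsm_return ret;
	int rc;

	if (st == NULL)
		return -EINVAL;
	rc = sketch_init_dsm_args(lcore_id, cache_level, container, &args);
	if (rc != 0)
		return rc;
	rc = sketch_lookup_st(&args, &ret);
	if (rc != 0)
		return rc;
	*st = ret.st;
	return 0;
}
```

A caller would invoke sketch_extract_st(lcore_id, cache_level, container,
&st) and, on success, hand st to the device-specific code that programs
the hardware.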
For PMDs, setting the ST on queue contexts is the only viable method of
using TPH. Therefore, DPDK can only support setting the ST in queue
contexts.

An application uses the cache stashing ethdev API by first calling the
rte_eth_dev_stashing_capabilities_get() function to find out which of
the object types in the bulleted list above the NIC can stash into a
processor cache. This function takes a port_id and a pointer to a
uint16_t through which it reports the object type flags. The PMD
implements the stashing_capabilities_get function pointer in
eth_dev_ops. If the underlying platform or the NIC does not support TPH,
this function returns -ENOTSUP, and the application should consider any
value stored through the objects pointer invalid.

Once the application knows which object types can be stashed, the next
step is to set the steering tags for the packets associated with Rx and
Tx queues via the rte_eth_dev_stashing_rx_config_set() and
rte_eth_dev_stashing_tx_config_set() ethdev library functions,
respectively. These functions execute rte_pci_extract_tph_st() via the
eth_dev_ops pointers stashing_rx_hints_set and stashing_tx_hints_set.
Both functions have an identical signature: a port_id, a queue_id, and a
config object. The port_id and the queue_id are used to locate the
device and the queue. The config object is of type struct
rte_eth_stashing_config, which specifies the lcore_id and the
cache_level, indicating where objects from this queue should be stashed.
It also has the field 'container' to indicate whether the target should
be the container of the processor specified by the lcore_id in a
chiplet-based SoC. The 'objects' field in the config sets the types of
objects the application wishes to stash, based on the capabilities found
earlier. If the objects field includes the flag
RTE_ETH_DEV_STASH_OBJECT_OFFSET, the 'offset' field must be used to set
the desired offset.
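The application-side sequence described above (query capabilities, mask
the wanted object types, then configure a queue) can be sketched as
follows. The proposed functions are not part of any released DPDK, so
this sketch runs against mock stand-ins rather than the real API: all
mock_ and demo_ names, the MOCK_STASH_OBJECT_* flag names and values,
and the field types of the config structure are assumptions made for
illustration. Only the field names (lcore_id, cache_level, container,
objects, offset) and the -ENOTSUP behavior come from the text above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object-type flags; names and values are placeholders. */
#define MOCK_STASH_OBJECT_DESC    0x0001
#define MOCK_STASH_OBJECT_HEADER  0x0002
#define MOCK_STASH_OBJECT_PAYLOAD 0x0004
#define MOCK_STASH_OBJECT_OFFSET  0x0008

/* Mirrors the fields named for struct rte_eth_stashing_config;
 * the field types are guesses. */
struct mock_stashing_config {
	unsigned int lcore_id;
	unsigned int cache_level;
	bool container;
	uint16_t objects;
	uint32_t offset;
};

/* Last configuration accepted by the mock, kept for inspection. */
static struct mock_stashing_config mock_last_cfg;

/* Mock capability query: only port 0 supports stashing, without OFFSET. */
static int
mock_stashing_capabilities_get(uint16_t port_id, uint16_t *objects)
{
	if (objects == NULL)
		return -EINVAL;
	if (port_id != 0)
		return -ENOTSUP;
	*objects = (uint16_t)(MOCK_STASH_OBJECT_DESC |
			      MOCK_STASH_OBJECT_HEADER |
			      MOCK_STASH_OBJECT_PAYLOAD);
	return 0;
}

/* Mock Rx configuration: rejects object types the port cannot stash. */
static int
mock_stashing_rx_config_set(uint16_t port_id, uint16_t queue_id,
			    const struct mock_stashing_config *cfg)
{
	uint16_t caps = 0;
	int rc;

	(void)queue_id; /* a real PMD would locate the queue context here */
	if (cfg == NULL)
		return -EINVAL;
	rc = mock_stashing_capabilities_get(port_id, &caps);
	if (rc != 0)
		return rc;
	if ((uint16_t)(cfg->objects & ~caps) != 0)
		return -EINVAL;
	mock_last_cfg = *cfg;
	return 0;
}

/* Application flow: query, mask the wanted types, configure the queue. */
static int
demo_configure_rx_stashing(uint16_t port_id, uint16_t queue_id,
			   unsigned int lcore_id, unsigned int cache_level,
			   uint16_t wanted)
{
	struct mock_stashing_config cfg = { 0 };
	uint16_t caps = 0;
	int rc;

	rc = mock_stashing_capabilities_get(port_id, &caps);
	if (rc != 0)
		return rc; /* caps must be treated as invalid on failure */

	cfg.objects = (uint16_t)(wanted & caps);
	if (cfg.objects == 0)
		return -ENOTSUP; /* nothing requested can be stashed */

	cfg.lcore_id = lcore_id;
	cfg.cache_level = cache_level;
	cfg.container = false; /* target the core, not its container */
	cfg.offset = 0;        /* only meaningful with the OFFSET flag */
	return mock_stashing_rx_config_set(port_id, queue_id, &cfg);
}
```

Masking the request against the reported capabilities before configuring
the queue keeps the application portable across NICs that support
different subsets of stashable objects. The Tx path would follow the
same pattern.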
Both rte_eth_dev_stashing_rx_config_set() and
rte_eth_dev_stashing_tx_config_set() invoke the PMD implementations of
the stashing functionality through the stashing_rx_hints_set and
stashing_tx_hints_set function pointers in eth_dev_ops, respectively.

Wathsala Vithanage (2):
  pci: introduce the PCIe TLP Processing Hints API
  ethdev: introduce the cache stashing hints API

 drivers/bus/pci/bsd/pci.c     |  12 +++
 drivers/bus/pci/linux/pci.c   |  12 +++
 drivers/bus/pci/rte_bus_pci.h |  22 +++++
 drivers/bus/pci/version.map   |   3 +
 drivers/bus/pci/windows/pci.c |  14 +++
 lib/ethdev/ethdev_driver.h    |  66 ++++++++++++++
 lib/ethdev/rte_ethdev.c       | 120 ++++++++++++++++++++++++++
 lib/ethdev/rte_ethdev.h       | 156 ++++++++++++++++++++++++++++++++++
 lib/ethdev/version.map        |   4 +
 lib/pci/meson.build           |   2 +
 lib/pci/rte_pci.h             |   2 +
 lib/pci/rte_pci_tph.c         |  20 +++++
 lib/pci/rte_pci_tph.h         | 111 ++++++++++++++++++++++++
 13 files changed, 544 insertions(+)
 create mode 100644 lib/pci/rte_pci_tph.c
 create mode 100644 lib/pci/rte_pci_tph.h

--
2.34.1