DPDK patches and discussions
From: Stephen Hemminger <stephen@networkplumber.org>
To: Wathsala Vithanage <wathsala.vithanage@arm.com>
Cc: dev@dpdk.org, nd@arm.com
Subject: Re: [PATCH v5 0/4] An API for Cache Stashing with TPH
Date: Wed, 4 Jun 2025 09:51:14 -0700
Message-ID: <20250604095114.2e540905@hermes.local>
In-Reply-To: <20250602223805.816816-1-wathsala.vithanage@arm.com>

On Mon,  2 Jun 2025 22:38:00 +0000
Wathsala Vithanage <wathsala.vithanage@arm.com> wrote:

> Today, DPDK applications benefit from Direct Cache Access (DCA) features
> like Intel DDIO and Arm's write-allocate-to-SLC. However, those features
> do not allow fine-grained control of direct cache access, such as
> stashing packets into upper-level caches (L2 caches) of a processor or
> the shared cache of a chiplet. PCIe TLP Processing Hints (TPH) addresses
> this need in a vendor-agnostic manner. TPH capability has existed since
> PCI Express Base Specification revision 3.0; today, numerous Network
> Interface Cards and interconnects from different vendors support TPH
> capability. TPH comprises a steering tag (ST) and a processing hint
> (PH). ST specifies the cache level of a CPU at which the data should be
> written to (or DCAed into), while PH is a hint provided by the PCIe
> requester to the completer on an upcoming traffic pattern. Some NIC
> vendors bundle TPH capability with fine-grained control over the type of
> objects that can be stashed into CPU caches, such as
> 
> - Rx/Tx queue descriptors
> - Packet-headers
> - Packet-payloads
> - Data from a given offset from the start of a packet
> 
> Note that stashable object types are outside the scope of the PCIe
> standard; therefore, vendors could support any combination of the above
> items as they see fit.
> 
> To enable TPH and fine-grained packet stashing, this API extends the
> ethdev library and the PCI bus driver. In this design, the application
> provides hints to the PMD via the ethdev stashing API to indicate the
> underlying hardware at which CPU and cache level it prefers a packet to
> end up. Once the PMD receives a CPU and a cache-level combination (or a
> list of such combinations), it must extract the matching ST from the PCI
> bus driver for such combinations. The PCI bus driver implements the TPH
> functions in an OS specific way; for Linux, it depends on the TPH
> capabilities of the VFIO kernel driver.
> 
> An application uses the cache stashing ethdev API by first calling the
> rte_eth_dev_stashing_capabilities_get() function to find out what object
> types can be stashed into a CPU cache by the NIC out of the object types
> in the bulleted list above. This function takes a port_id and a pointer
> to a uint16_t to report back the object type flags. PMD implements the
> stashing_capabilities_get function pointer in eth_dev_ops. If the
> underlying platform or the NIC does not support TPH, this function
> returns -ENOTSUP, and the application should consider any values stored
> in the object invalid.
> 
> Once the application knows the supported object types that can be
> stashed, the next step is to set the steering tags for the packets
> associated with Rx and Tx queues via
> rte_eth_dev_stashing_{rx,tx}_config_set() ethdev library functions. Both
> functions have an identical signature, a port_id, a queue_id, and a
> config object. The port_id and the queue_id are used to locate the
> device and the queue. The config object is of type struct
> rte_eth_stashing_config, which specifies the lcore_id and the
> cache_level, indicating where objects from this queue should be stashed.
> The 'objects' field in the config sets the types of objects the
> application wishes to stash based on the capabilities found earlier.
> Note that if the 'objects' field includes the flag
> RTE_ETH_DEV_STASH_OBJECT_OFFSET, the 'offset' field must be used to set
> the desired offset. These functions invoke PMD implementations of the
> stashing functionality via the stashing_{rx,tx}_hints_set function
> callbacks in the eth_dev_ops, respectively.
> 
> The PMD's implementation of the stashing_rx_hints_set() and
> stashing_tx_hints_set() functions is ultimately responsible for
> extracting the ST via the API provided by the PCI bus driver. Before
> extracting STs, the PMD should enable the TPH capability in the endpoint
> device by calling the rte_pci_tph_enable() function.  The application
> begins the ST extraction process by calling the rte_pci_tph_st_get()
> function in drivers/bus/pci/rte_bus_pci.h, which returns STs via the
> same rte_tph_info objects array passed into it as an argument.  Once PMD
> acquires ST, the stashing_{rx,tx}_hints_set callbacks implemented in the
> PMD are ready to set the ST as per the rte_eth_stashing_config object
> passed to them by the higher-level ethdev functions
> rte_eth_dev_stashing_{rx,tx}_hints(). As per the PCIe specification, STs
> can be placed on the MSI-X tables or in a device-specific location. For
> PMDs, setting the STs on queue contexts is the only viable way of using
> TPH. Therefore, the PMDs should only enable TPH in device-specific mode.
> 
> V4->V5:
>  * Enable stashing-hints (TPH) in Intel i40e driver.
>  * Update exported symbol version from 25.03 to 25.07.
>  * Add TPH mode macros.
> 
> V3->V4:
>  * Add VFIO IOCTL based ST extraction mechanism to Linux PCI bus driver
>  * Remove ST extraction via direct access to ACPI _DSM
>  * Replace rte_pci_extract_tph_st() with rte_pci_tph_st_get() in PCI
>    bus driver.
> 
> Wathsala Vithanage (4):
>   pci: add non-merged Linux uAPI changes
>   bus/pci: introduce the PCIe TLP Processing Hints API
>   ethdev: introduce the cache stashing hints API
>   net/i40e: enable TPH in i40e
> 
>  drivers/bus/pci/bsd/pci.c            |  43 +++++++
>  drivers/bus/pci/bus_pci_driver.h     |  52 ++++++++
>  drivers/bus/pci/linux/pci.c          | 100 ++++++++++++++++
>  drivers/bus/pci/linux/pci_init.h     |  14 +++
>  drivers/bus/pci/linux/pci_vfio.c     | 170 +++++++++++++++++++++++++++
>  drivers/bus/pci/private.h            |   8 ++
>  drivers/bus/pci/rte_bus_pci.h        |  67 +++++++++++
>  drivers/bus/pci/windows/pci.c        |  43 +++++++
>  drivers/net/intel/i40e/i40e_ethdev.c | 127 ++++++++++++++++++++
>  kernel/linux/uapi/linux/vfio_tph.h   | 102 ++++++++++++++++
>  lib/ethdev/ethdev_driver.h           |  66 +++++++++++
>  lib/ethdev/rte_ethdev.c              | 149 +++++++++++++++++++++++
>  lib/ethdev/rte_ethdev.h              | 158 +++++++++++++++++++++++++
>  lib/pci/rte_pci.h                    |  15 +++
>  14 files changed, 1114 insertions(+)
>  create mode 100644 kernel/linux/uapi/linux/vfio_tph.h
> 
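
For illustration only, the application-side flow described in the cover
letter (query capabilities, then set a per-queue config) can be sketched
as below. This is a mock under stated assumptions, not the proposed
implementation: every mock_* and MOCK_* name, the flag values, and the
field types are hypothetical stand-ins. Only the field names lcore_id,
cache_level, objects and offset, the uint16_t capability word, and the
-ENOTSUP return come from the cover letter. The -EINVAL check for
unsupported object types is also an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the cover letter names only an OFFSET flag. */
#define MOCK_STASH_OBJECT_DESC    (1u << 0)
#define MOCK_STASH_OBJECT_HEADER  (1u << 1)
#define MOCK_STASH_OBJECT_PAYLOAD (1u << 2)
#define MOCK_STASH_OBJECT_OFFSET  (1u << 3)

/* Field names follow the cover letter; the types are assumptions. */
struct mock_stashing_config {
	uint32_t lcore_id;    /* CPU the data should land near */
	uint32_t cache_level; /* cache level to stash into */
	uint16_t objects;     /* object type flags to stash */
	uint32_t offset;      /* used only with MOCK_STASH_OBJECT_OFFSET */
};

/* Stand-in for one port and its single Rx queue. */
struct mock_port {
	int tph_supported;
	uint16_t caps;
	int rx_configured;
	struct mock_stashing_config rx_cfg;
};

/* Mirrors the described capabilities query: -ENOTSUP without TPH. */
static int
mock_stashing_capabilities_get(const struct mock_port *port, uint16_t *objects)
{
	assert(port != NULL && objects != NULL);
	if (!port->tph_supported)
		return -ENOTSUP;
	*objects = port->caps;
	return 0;
}

/* Mirrors the described Rx config setter; rejecting unsupported
 * object types with -EINVAL is an assumption of this sketch. */
static int
mock_stashing_rx_config_set(struct mock_port *port,
			    const struct mock_stashing_config *cfg)
{
	assert(port != NULL && cfg != NULL);
	if (!port->tph_supported)
		return -ENOTSUP;
	if ((cfg->objects & ~port->caps) != 0)
		return -EINVAL;
	port->rx_cfg = *cfg;
	port->rx_configured = 1;
	return 0;
}

/* Application flow: ask what is stashable, then request headers only. */
static int
app_request_header_stashing(struct mock_port *port, uint32_t lcore_id,
			    uint32_t cache_level)
{
	uint16_t caps = 0;
	int ret = mock_stashing_capabilities_get(port, &caps);

	if (ret != 0)
		return ret; /* caps must be treated as invalid here */

	struct mock_stashing_config cfg = {
		.lcore_id = lcore_id,
		.cache_level = cache_level,
		.objects = (uint16_t)(caps & MOCK_STASH_OBJECT_HEADER),
		.offset = 0,
	};

	if (cfg.objects == 0)
		return -ENOTSUP; /* NIC cannot stash packet headers */
	return mock_stashing_rx_config_set(port, &cfg);
}
```

A caller would treat any non-zero return as "no stashing configured" and
carry on with the queue unchanged.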

How will this impact existing applications that never use the API?
It is crucial that existing third-party applications just work without
modifications. We don't want to hear from Network Virtual Appliance
vendors that there is a performance regression in DPDK. They are already
reluctant to keep up with DPDK versions.

I.e., if the application does nothing, caching must still be enabled.
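
One way to state that requirement as a checkable invariant is sketched
below. It is a mock, not code from the patch series: the mock_* names
and the queue-context layout are hypothetical. The idea is that queue
setup never turns on TPH steering by itself, so a queue that never
receives a stashing hint keeps whatever placement the platform already
provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum mock_rx_placement {
	MOCK_RX_PLATFORM_DEFAULT, /* existing platform behaviour, untouched */
	MOCK_RX_TPH_STEERED,      /* explicit steering tag in use */
};

/* Hypothetical per-queue context kept by a PMD. */
struct mock_rxq_ctx {
	int tph_enabled;       /* set only by an explicit stashing request */
	uint16_t steering_tag; /* meaningful only when tph_enabled != 0 */
};

/* Queue setup path: must not enable TPH on its own. */
static void
mock_rxq_setup(struct mock_rxq_ctx *q)
{
	assert(q != NULL);
	q->tph_enabled = 0;
	q->steering_tag = 0;
}

/* Opt-in path: reached only when the application asks for stashing. */
static void
mock_rxq_stash_hint_set(struct mock_rxq_ctx *q, uint16_t st)
{
	assert(q != NULL);
	q->tph_enabled = 1;
	q->steering_tag = st;
}

/* What the queue will do with incoming data. */
static enum mock_rx_placement
mock_rxq_placement(const struct mock_rxq_ctx *q)
{
	assert(q != NULL);
	return q->tph_enabled ? MOCK_RX_TPH_STEERED : MOCK_RX_PLATFORM_DEFAULT;
}
```

A regression test along these lines would set up a queue, never call the
hint setter, and check that the placement is still the platform default.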

