From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
To: Gregory Etelson <getelson@nvidia.com>,
"Richardson, Bruce" <bruce.richardson@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [RFC] add Rust API for basic port operations
Date: Thu, 17 Apr 2025 12:26:44 +0000 [thread overview]
Message-ID: <PH8PR11MB680361C556B7E2164363C5E3D7BC2@PH8PR11MB6803.namprd11.prod.outlook.com> (raw)
In-Reply-To: <IA1PR12MB633009C4430CF2081D824D3AA5BD2@IA1PR12MB6330.namprd12.prod.outlook.com>
> From: Gregory Etelson
> Sent: Wednesday, April 16, 2025 12:42 PM
> To: Van Haaren, Harry; Richardson, Bruce
> Cc: dev@dpdk.org
> Subject: [RFC] add Rust API for basic port operations
>
> This RFC proposes Rust API for basic DPDK port management and IO operations.
Thanks for posting this RFC. I'll share my API RFC in the coming days too, for now
some feedback (which is already integrated/existing in the API I've been working on)
Bigger picture; the APIs below implement "DPDK C API thinking" in Rust. I'd like to rework the APIs to be "DPDK with Rust Ergonomics" instead:
Example:
1) DpdkPortConf::new_from(0, c_dev_conf_struct, c_rx_conf_struct, c_tx_conf_struct, 8, 8, 512, 512, mempool (or None));
- the API here is not use case focussed. The values passed in are documented, but not easy to explore/understand.
2) let ports = dpdk::get_ports(); // returns a Vec<DpdkPort>
let p = ports.get(0);
p.rxqs(8, rx_mempool)?;
p.rx_crc_strip(true);
p.set_rss_hash(&[0,1,2,3..N])?;
// Notes: configure RX values: note the type-safe, and english-syntax functions.
// These functions prompt in modern IDEs, making "exploring" APIs easy/fast/intuitive (struct member bit-fields are not as easy to explore)
// The "rxqs()" function takes the parameters it needs (u16 count, and a Arc<Dpdk::mempool> object)
// - this avoids exposing an "Option<Arc<Mempool>>" publically, hiding impl detail, giving better ergonomics
// Notice the "?" operator, this can be used "inline" to early-return and error (e.g. rss_hash array has incorrect length)
The ability to error check individual functions/APIs allow better error reporting to the user,
specifically at the point where the error occurred. As these are not datapath APIs, allocations and
overhead in the failure case should not be a problem - user-experience is the goal:
1) on error: -EINVAL returned from rte_eth_rxq_configure(), difficult to see what parameter caused the -EINVAL.
2) on error: "rss_hash has invalid lenght of 42" (or whatever we make the Debug/std::error::Error impl say)
To summarize, "use-case focused APIs" are easier to read, debug, and maintain.
The work here is to design/build these Rust ergonomic APIs: it takes DPDK knowledge, and some Rust API
best-practices to figure these out. There's probably a "playbook" of like 3 rules/strategies, which will
cover 90%+ of the API design.
Hopefully this email is a starting point; some resources for future reader's reference;
- https://burntsushi.net/rust-error-handling/
- https://corrode.dev/blog/idiomatic-rust-resources/ (look for "error-handling" tag)
> ```rust
> /// Configuration details for a DPDK port.
> ///
> /// # Overview
> ///
> /// `DpdkPortConf` is used to initialize and configure ports in a DPDK environment. It includes:
> /// - Device information (`dev_info`) pulled from DPDK.
> /// - Transmit and receive configurations, including queue settings and memory pools.
> /// - Details about the number of queues and descriptors for transmission and reception.
> ///
> /// This struct is typically instantiated using the [`DpdkPortConf::new_from`] method.
> ///
> /// # Example
> ///
> /// ```
> /// let conf = DpdkPortConf::new_from(
> /// port_id,
> /// dev_conf,
> /// tx_conf,
> /// rx_conf,
> /// 8, // Number of RX queues
> /// 8, // Number of TX queues
> /// 512, // RX descriptors
> /// 512, // TX descriptors
> /// 0, // RX queue socket ID
> /// 0, // TX queue socket ID
> /// Some(rx_mempool) // RX queue mempool
> /// ).unwrap();
A few observations; The DpdkPortConf itself is a Rust struct, but internally it exposes a number
of "pub" fields (rte_eth_dev_info, _eth_conf, rte_eth_rxconf, ...), which are directly bindgen-ed
from the DPDK C API. While this works, it will not provide the ergonomics we'd like for Rust code.
For example, the eth_dev_info->driver_name is a CString (aka, const char *), which needs to be
converted to a safe Rust CString or CStr (see https://doc.rust-lang.org/beta/std/ffi/struct.CStr.html#method.from_ptr)
A big value add in Rust development is the automatic "Debug" implementations, allowing one to just
println!("{dev_info:?}");
a struct, and that the compiler is able to identify the contents, and present them automatically.
That will not work if we have C structures with pointers, resulting in bad debug/logging for users.
> ///
> #[derive(Clone)]
> pub struct DpdkPortConf {
> /// Information about the DPDK Ethernet device (e.g., driver, capabilities, etc.).
> pub dev_info: rte_eth_dev_info,
>
> /// Configuration for the Ethernet device. Determines the overall behavior of the device.
> pub dev_conf: rte_eth_conf,
>
> /// Configuration for transmitting (Tx) packets on the Ethernet device.
> pub tx_conf: rte_eth_txconf,
>
> /// Configuration for receiving (Rx) packets on the Ethernet device.
> pub rx_conf: rte_eth_rxconf,
>
> /// Number of receive (Rx) queues configured on this port.
> pub rxq_num: u16,
>
> /// Number of transmit (Tx) queues configured on this port.
> pub txq_num: u16,
>
> /// Number of descriptors for each transmit (Tx) queue.
> /// Descriptors represent items in a queue to handle packets.
> pub tx_desc_num: u16,
>
> /// Number of descriptors for each receive (Rx) queue.
> pub rx_desc_num: u16,
>
> /// NUMA socket ID associated with the memory used by receive (Rx) queues.
> pub rxq_socket_id: u32,
>
> /// NUMA socket ID associated with the memory used by transmit (Tx) queues.
> pub txq_socket_id: u32,
>
> /// Memory pool associated with receive (Rx) queues.
> /// This manages the buffers used for storing incoming packets.
> pub rxq_mempool: Option<Arc<DpdkMempool>>,
> }
Any reaons that rxq_mempool has an Option<>..? Is it for TX-only port configurations?
That's the only thing I could imagine is a cause. See above API where RX and TX are
configured independently - this avoids passing an "RX mempool" at all, if only TX is configured!
> /// A trait defining basic operations for managing and interacting with a DPDK port.
> ///
> /// # Overview
> ///
> /// The `DpdkPort` trait standardizes how to operate on DPDK ports, making it possible to:
> /// - Configure the Ethernet device using [`configure`].
> /// - Start the device using [`start`].
> /// - Handle Rx and Tx bursts of packets using [`rx_burst`] and [`tx_burst`].
> ///
> pub trait DpdkPort: Send + Sync {
This trait provides a very C API centric view. Unfortunately, that view does not include the
Rust API semantics we need for RXQ and TXQ being Send/!Sync (see https://youtu.be/lb6xn2xQ-NQ?t=802)
Example of INCORRECT usage of this RFC API (that Rust compiler cannot help with)
(refer to https://github.com/getelson-at-mellanox/rdpdk/blob/main/port/mlx5/mlx5_port.rs#L30 for port creation)
thread 1:
let port = Mlx5::from(0, &port_conf);
let pkts = &mut [*mut rte_mbuf; 32];
port.rx_burst(0, &mut pkts); // segfault simulateous access on (port/queue combo)
thread 2:
let port = Mlx5::from(0, &port_conf);
let pkts = &mut [*mut rte_mbuf; 32];
port.rx_burst(0, &mut pkts); // segfault simulateous access on (port/queue combo)
The problem here is that the user was able to create two "handles" to the same "port".
And from that handle of a port, it was possible to call rx_burst(), (or tx_burst(), etc).
The Rust compiler cannot identify that Mlx5::from(0) is not safe to call twice.
My current preferred solution is to have a Dpdk::eth::get_ports() API, which returns
all the port instances. This removes the user's ability to "create" ports at all, and hence
it cannot be mis-used either.
All in all, the method in which C DPDK APIs work (pass in integer ID for port/queue), is
a very different to how safe Rust APIs will look when building in Send/Sync safety.
> /// Returns the port ID of the DPDK port.
> ///
> /// # Return Value
> /// A `u16` that uniquely identifies the DPDK port.
> ///
> /// # Example
> /// ```
> /// let port_id = dpdk_port.port_id();
> /// println!("DPDK port ID: {}", port_id);
> /// ```
> fn port_id(&self) -> u16;
>
> /// Returns a reference to the configuration object of the DPDK port.
> ///
> /// # Return Value
> /// A reference to [`DpdkPortConf`], which contains various settings like Rx/Tx queue configurations,
> /// memory pools, and NUMA socket IDs.
> ///
> /// # Example
> /// ```
> /// let port_config = dpdk_port.port_conf();
> /// println!("Rx queues: {}", port_config.rxq_num);
> /// ```
> fn port_conf(&self) -> &DpdkPortConf;
>
> /// Configures the DPDK Ethernet device with the settings specified in the port configuration.
> ///
> /// This method is typically called before starting the port to ensure it is prepared for Rx and Tx operations.
> ///
> /// # Return Value
> /// - `Ok(())` if the configuration was applied successfully.
> /// - `Err(String)` with a descriptive error message if the configuration failed.
> ///
> /// # Example
> /// ```
> /// let result = dpdk_port.configure();
> /// if let Err(err) = result {
> /// eprintln!("Failed to configure the port: {}", err);
> /// }
> /// ```
> fn configure(&mut self) -> Result<(), String>;
>
> /// Starts the DPDK Ethernet device.
> ///
> /// This method initializes the Rx and Tx queues, making the port ready for data transmission
> /// and reception.
> ///
> /// # Return Value
> /// - `Ok(())` if the port was started successfully.
> /// - `Err(String)` if the startup process failed, with a descriptive error message.
> ///
> /// # Example
> /// ```
> /// let result = dpdk_port.start();
> /// if let Err(err) = result {
> /// eprintln!("Failed to start the port: {}", err);
> /// }
> /// ```
> fn start(&mut self) -> Result<(), String>;
>
> /// Receives a burst of packets on the specified Rx queue.
> ///
> /// # Parameters
> /// - `queue_id`: The ID of the Rx queue to receive packets from.
> /// - `pkts`: A mutable reference to an array of packet buffers (`*mut rte_mbuf`) where received packets
> /// will be written.
> ///
> /// # Return Value
> /// - `Ok(u16)` containing the number of packets successfully received.
> /// - `Err(String)` if the operation failed.
> ///
> /// # Example
> /// ```
> /// let pkts: Vec<*mut rte_mbuf> = vec![std::ptr::null_mut(); 32];
> /// let received = dpdk_port.rx_burst(0, &pkts);
> /// match received {
> /// Ok(count) => println!("Received {} packets", count),
> /// Err(err) => eprintln!("Rx burst failed: {}", err),
> /// }
> /// ```
> fn rx_burst(&mut self, queue_id: u16, pkts: &[*mut rte_mbuf]) -> Result<u16, String>;
>
> /// Sends a burst of packets on the specified Tx queue.
> ///
> /// # Parameters
> /// - `queue_id`: The ID of the Tx queue to send packets on.
> /// - `pkts`: A reference to an array of packet buffers (`*mut rte_mbuf`) to send.
> ///
> /// # Return Value
> /// - `Ok(u16)` containing the number of packets successfully sent.
> /// - `Err(String)` if the operation failed.
> ///
> /// # Example
> /// ```
> /// let pkts: Vec<*mut rte_mbuf> = vec![some_packet_ptr1, some_packet_ptr2];
> /// let sent = dpdk_port.tx_burst(0, &pkts);
> /// match sent {
> /// Ok(count) => println!("Sent {} packets", count),
> /// Err(err) => eprintln!("Tx burst failed: {}", err),
> /// }
> /// ```
> fn tx_burst(&mut self, queue_id: u16, pkts: &[*mut rte_mbuf]) -> Result<u16, String>;
> }
> ```
Having a single trait with configure(), start() and rx_burst() makes it impossible to encode the
Send/Sync attributes of the "Port" concept, and the "RxQueue" concept seperately.
So perhaps a solution with two traits could work; I haven't investigated in detail yet.
> The API uses raw FFI pointers in DpdkPort rx_burst() and tx_burst() to preserve DPDK performance.
Cool, nice trick... but does it remove all virtual functions/function-pointer dispatch?
- https://github.com/getelson-at-mellanox/rdpdk/blob/main/app/runpmd/runpmd.rs#L143
This seems to accept a "&mut Vec<Box<dyn DpdkPort>>", which would mean the ".rx_burst()"
is a virtual-function into the trait, and then a function-pointer into the PMD?
It could be interesting to change "do_io" to a generic function, taking a generic DpdkPort instance? Then the first level
trait-resolution would become compile-time (through Rust compiler monomorphisation of the generic do_io<Pmd: impl DpdkPort>() into a
more specific "do_io<Mlx5>()" code?
I don't believe this to *actually* matter for performance. It might matter for an IO-forward, 0 packet data processing test.
So micro-benchmark bragging, yes this has impact... but touch the packet data and it no longer matters!
> The API implementation is here:
>
> API definition:
> https://github.com/getelson-at-mellanox/rdpdk/blob/main/lib/port/port.rs
>
> API implementation with RTE function calls:
> https://github.com/getelson-at-mellanox/rdpdk/blob/main/lib/port/raw_port.rs
>
> Implement direct calls to mlx5 Rx/Tx IO functions:
> https://github.com/getelson-at-mellanox/rdpdk/blob/main/port/mlx5/mlx5_port.rs
>
>
> Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Thanks for sharing the links and RFC!
I will share the RFC API that communicates the Send/Sync requirements to the Rust compiler for Ethdev Port Rxq as a sample,
and we can try compare/converge the "trait DpdkPort" and the compile-time-thread-safe approaches into the best API?
Regards, -Harry
next prev parent reply other threads:[~2025-04-17 12:27 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-16 11:42 Gregory Etelson
2025-04-17 12:26 ` Van Haaren, Harry [this message]
2025-04-17 16:19 ` Etelson, Gregory
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=PH8PR11MB680361C556B7E2164363C5E3D7BC2@PH8PR11MB6803.namprd11.prod.outlook.com \
--to=harry.van.haaren@intel.com \
--cc=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
--cc=getelson@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).