DPDK patches and discussions
 help / color / mirror / Atom feed
* [RFC v2 00/12] Packet capture using port mirroring
       [not found] <0250411234927.114568-1-stephen@networkplumber.org>
@ 2025-07-09 17:33 ` Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 01/12] ethdev: allow start/stop from secondary process Stephen Hemminger
                     ` (11 more replies)
  0 siblings, 12 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

This is a rework of how packet capture is done in DPDK.
The existing mechanism using callbacks has a number of problems:
it won't work when packets are sent and received in secondary process because
callbacks only function in the primary process; pdump requires that the
main program "opt-in" to having packet capture which requires changes
to existing network appliances.

The new mechanism builds on the concept of port mirroring used
in Linux/FreeBSD and also router vendors (SPAN ports).
See: https://en.wikipedia.org/wiki/Port_mirroring for a general description.

The implementation uses two new ethdev API's one to add
a port mirror, and one to remove it.

The internals of dumpcap are redone but the program command
syntax is unchanged (no doc change needed). Using the new mirror mechanism,
the dumpcap program creates a port using ring PMD and
then directs a port mirror to that new ring PMD instance.
Then the packets are extracted from the ring.
If capturing on multiple devices, they all get mirrored to
the same ring PMD.

The dumpcap program needs to be able to start the ring PMD it created
but the existing ethdev API is broken when used from secondary process.
One of the patches fixes that, it was also reported previously in Bugzilla.

The mirror API has a restriction that the port being mirrored to must allow
lock free transmit. This is because both Rx and Tx as well as multiple
queues need to go over the same transmit queue. Although it might be possible
to have some complex mapping between multiple ports/queues being mirrored
to multiple transmit queues, that gets to be a lot of overhead in API's
and implementation. Fixing the ring and null PMD to allow lockless
transmit is good enough.

Added tests for the new features.
The performance has not been measured but the overhead of checking for
mirror port is no more that the overhead of looking at the callback list.

TODO items:
  - some names are too verbose
  - need release notes for new features
  - more through testing on real hardware

Stephen Hemminger (12):
  ethdev: allow start/stop from secondary process
  test: add test for hotplug and secondary process operations
  net/ring: allow lockfree transmit if ring supports it
  net/ring: add new devargs for dumpcap use
  mbuf: add dynamic flag for use by port mirroring
  ethdev: add port mirroring feature
  pcapng: split packet copy from header insertion
  pcapng: make queue optional
  test: add tests for ethdev mirror
  app/testpmd: support for port mirroring
  app/dumpcap: use port mirror instead of pdump
  pdump: mark API and application as deprecated

 app/dumpcap/main.c                          | 443 ++++++++++++++++----
 app/dumpcap/meson.build                     |   2 +-
 app/pdump/main.c                            |   3 +
 app/pdump/meson.build                       |   1 +
 app/test-pmd/cmdline.c                      |   1 +
 app/test-pmd/cmdline_mirror.c               | 129 ++++++
 app/test-pmd/cmdline_mirror.h               |  12 +
 app/test-pmd/meson.build                    |   2 +
 app/test/meson.build                        |   7 +-
 app/test/test_ethdev_mirror.c               | 285 +++++++++++++
 app/test/test_mp_secondary.c                |  51 ++-
 config/rte_config.h                         |   1 +
 doc/guides/rel_notes/deprecation.rst        |   3 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  15 +
 drivers/net/ring/rte_eth_ring.c             |  91 +++-
 lib/ethdev/ethdev_driver.c                  |  11 +-
 lib/ethdev/ethdev_driver.h                  |   6 +
 lib/ethdev/ethdev_private.c                 | 133 ++++++
 lib/ethdev/ethdev_private.h                 |  29 ++
 lib/ethdev/ethdev_trace.h                   |  17 +
 lib/ethdev/ethdev_trace_points.c            |   6 +
 lib/ethdev/meson.build                      |   1 +
 lib/ethdev/rte_ethdev.c                     | 101 +++--
 lib/ethdev/rte_ethdev.h                     |  23 +-
 lib/ethdev/rte_ethdev_core.h                |   6 +-
 lib/ethdev/rte_mirror.c                     | 344 +++++++++++++++
 lib/ethdev/rte_mirror.h                     | 113 +++++
 lib/mbuf/rte_mbuf_dyn.h                     |   9 +
 lib/pcapng/rte_pcapng.c                     | 179 ++++----
 lib/pcapng/rte_pcapng.h                     |  27 +-
 lib/pdump/rte_pdump.h                       |  12 +-
 31 files changed, 1851 insertions(+), 212 deletions(-)
 create mode 100644 app/test-pmd/cmdline_mirror.c
 create mode 100644 app/test-pmd/cmdline_mirror.h
 create mode 100644 app/test/test_ethdev_mirror.c
 create mode 100644 lib/ethdev/rte_mirror.c
 create mode 100644 lib/ethdev/rte_mirror.h

-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 01/12] ethdev: allow start/stop from secondary process
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  2025-07-09 17:47     ` Khadem Ullah
  2025-07-09 17:33   ` [RFC v2 02/12] test: add test for hotplug and secondary process operations Stephen Hemminger
                     ` (10 subsequent siblings)
  11 siblings, 1 reply; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Anatoly Burakov

Before this patch if secondary process called start/stop
it would only impact the secondary process, the ethdev on the
primary process was not started.

With this patch, when start/stop is called from secondary,
it calls the primary and does the operation there. The design
is generic, and we can later add queue and other operations
as needed.

Bugzilla ID: 73

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/ethdev/ethdev_driver.c  | 11 ++++-
 lib/ethdev/ethdev_private.c | 93 +++++++++++++++++++++++++++++++++++++
 lib/ethdev/ethdev_private.h | 26 +++++++++++
 lib/ethdev/rte_ethdev.c     | 84 +++++++++++++++++++--------------
 4 files changed, 177 insertions(+), 37 deletions(-)

diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
index ec0c1e1176..66ba7dafc9 100644
--- a/lib/ethdev/ethdev_driver.c
+++ b/lib/ethdev/ethdev_driver.c
@@ -114,6 +114,14 @@ rte_eth_dev_allocate(const char *name)
 		goto unlock;
 	}
 
+	if (eth_dev_shared_data->allocated_ports == 0 &&
+	    rte_mp_action_register(ETHDEV_MP, ethdev_server) &&
+	    rte_errno != ENOTSUP) {
+		RTE_ETHDEV_LOG_LINE(ERR,
+			 "Could not start %s service", ETHDEV_MP);
+		goto unlock;
+	}
+
 	eth_dev = eth_dev_get(port_id);
 	eth_dev->flow_fp_ops = &rte_flow_fp_default_ops;
 	strlcpy(eth_dev->data->name, name, sizeof(eth_dev->data->name));
@@ -282,7 +290,8 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
 		memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
 		eth_dev->data = NULL;
 
-		eth_dev_shared_data->allocated_ports--;
+		if (--eth_dev_shared_data->allocated_ports == 0)
+			rte_mp_action_unregister(ETHDEV_MP);
 		eth_dev_shared_data_release();
 	}
 
diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
index 285d377d91..184cf33f99 100644
--- a/lib/ethdev/ethdev_private.c
+++ b/lib/ethdev/ethdev_private.c
@@ -2,6 +2,8 @@
  * Copyright(c) 2018 Gaëtan Rivet
  */
 
+#include <assert.h>
+
 #include <eal_export.h>
 #include <rte_debug.h>
 
@@ -477,3 +479,94 @@ eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
 	dev->data->nb_tx_queues = nb_queues;
 	return 0;
 }
+
+static int
+ethdev_handle_request(const struct ethdev_mp_request *req)
+{
+	switch (req->operation) {
+	case ETH_REQ_START:
+		return rte_eth_dev_start(req->port_id);
+
+	case ETH_REQ_STOP:
+		return rte_eth_dev_stop(req->port_id);
+
+	default:
+		return -EINVAL;
+	}
+}
+
+static_assert(sizeof(struct ethdev_mp_request) <= RTE_MP_MAX_PARAM_LEN,
+	"ethdev MP request bigger than available param space");
+
+static_assert(sizeof(struct ethdev_mp_response) <= RTE_MP_MAX_PARAM_LEN,
+	"ethdev MP response bigger than available param space");
+
+int
+ethdev_server(const struct rte_mp_msg *mp_msg, const void *peer)
+{
+	const struct ethdev_mp_request *req
+		= (const struct ethdev_mp_request *)mp_msg->param;
+
+	struct rte_mp_msg mp_resp = {
+		.name = ETHDEV_MP,
+	};
+	struct ethdev_mp_response *resp;
+
+	resp = (struct ethdev_mp_response *)mp_resp.param;
+	mp_resp.len_param = sizeof(*resp);
+	resp->res_op = req->operation;
+
+	/* recv client requests */
+	if (mp_msg->len_param != sizeof(*req))
+		resp->err_value = -EINVAL;
+	else
+		resp->err_value = ethdev_handle_request(req);
+
+	return rte_mp_reply(&mp_resp, peer);
+}
+
+int
+ethdev_request(uint16_t port_id, enum ethdev_mp_operation operation,
+	       const void *buf, size_t buf_len)
+{
+	struct rte_mp_msg mp_req = { };
+	struct rte_mp_reply mp_reply;
+	struct ethdev_mp_request *req;
+	struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
+	int ret;
+
+	if (sizeof(*req) + buf_len > RTE_MP_MAX_PARAM_LEN) {
+		RTE_ETHDEV_LOG_LINE(ERR,
+				    "request %u port %u invalid len %zu",
+				    operation, port_id, buf_len);
+		return -EINVAL;
+	}
+
+	strlcpy(mp_req.name, ETHDEV_MP, RTE_MP_MAX_NAME_LEN);
+	mp_req.len_param = sizeof(*req) + buf_len;
+
+	req = (struct ethdev_mp_request *)mp_req.param;
+	req->operation = operation;
+	req->port_id = port_id;
+
+	if (buf_len > 0)
+		memcpy(req->config, buf, buf_len);
+
+	ret = rte_mp_request_sync(&mp_req, &mp_reply, &ts);
+	if (ret == 0) {
+		const struct rte_mp_msg *mp_rep = &mp_reply.msgs[0];
+		const struct ethdev_mp_response *resp
+			= (const struct ethdev_mp_response *)mp_rep->param;
+
+		if (resp->err_value == 0)
+			ret = 0;
+		else
+			rte_errno = -resp->err_value;
+		free(mp_reply.msgs);
+	}
+
+	if (ret < 0)
+		RTE_ETHDEV_LOG_LINE(ERR,
+		       "port %up ethdev op %u failed", port_id, operation);
+	return ret;
+}
diff --git a/lib/ethdev/ethdev_private.h b/lib/ethdev/ethdev_private.h
index b07b1b4c42..f58f161871 100644
--- a/lib/ethdev/ethdev_private.h
+++ b/lib/ethdev/ethdev_private.h
@@ -79,4 +79,30 @@ void eth_dev_txq_release(struct rte_eth_dev *dev, uint16_t qid);
 int eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues);
 int eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues);
 
+/* Used to allow start/stop from secondary */
+#define ETHDEV_MP	"mp_ethdev"
+
+enum ethdev_mp_operation {
+	ETH_REQ_START,
+	ETH_REQ_STOP,
+};
+
+struct ethdev_mp_request {
+	uint16_t operation;
+	uint16_t port_id;
+	uint32_t reserved;
+
+	uint8_t config[]; /* operation specific */
+};
+
+struct ethdev_mp_response {
+	uint16_t res_op;
+	uint16_t port_id;
+	int32_t err_value;
+};
+
+int ethdev_server(const struct rte_mp_msg *mp_msg, const void *peer);
+int ethdev_request(uint16_t port_id, enum ethdev_mp_operation op,
+		   const void *buf, size_t buf_len);
+
 #endif /* _ETH_PRIVATE_H_ */
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index dd7c00bc94..41363af2c3 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1800,54 +1800,61 @@ rte_eth_dev_start(uint16_t port_id)
 
 	if (dev->data->dev_configured == 0) {
 		RTE_ETHDEV_LOG_LINE(INFO,
-			"Device with port_id=%"PRIu16" is not configured.",
-			port_id);
+				    "Device with port_id=%"PRIu16" is not configured.",
+				    port_id);
 		return -EINVAL;
 	}
 
 	if (dev->data->dev_started != 0) {
 		RTE_ETHDEV_LOG_LINE(INFO,
-			"Device with port_id=%"PRIu16" already started",
-			port_id);
+				    "Device with port_id=%"PRIu16" already started",
+				    port_id);
 		return 0;
 	}
 
-	ret = rte_eth_dev_info_get(port_id, &dev_info);
-	if (ret != 0)
-		return ret;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		ret = rte_eth_dev_info_get(port_id, &dev_info);
+		if (ret != 0)
+			return ret;
 
-	restore_flags = rte_eth_get_restore_flags(dev, RTE_ETH_START);
+		restore_flags = rte_eth_get_restore_flags(dev, RTE_ETH_START);
 
-	/* Lets restore MAC now if device does not support live change */
-	if ((*dev_info.dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR) &&
-	    (restore_flags & RTE_ETH_RESTORE_MAC_ADDR))
-		eth_dev_mac_restore(dev, &dev_info);
+		/* Restore MAC now if device does not support live change */
+		if ((*dev_info.dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR) &&
+		    (restore_flags & RTE_ETH_RESTORE_MAC_ADDR))
+			eth_dev_mac_restore(dev, &dev_info);
 
-	diag = dev->dev_ops->dev_start(dev);
-	if (diag == 0)
-		dev->data->dev_started = 1;
-	else
-		return eth_err(port_id, diag);
+		diag = dev->dev_ops->dev_start(dev);
+		if (diag == 0)
+			dev->data->dev_started = 1;
+		else
+			return eth_err(port_id, diag);
 
-	ret = eth_dev_config_restore(dev, &dev_info, restore_flags, port_id);
-	if (ret != 0) {
-		RTE_ETHDEV_LOG_LINE(ERR,
-			"Error during restoring configuration for device (port %u): %s",
-			port_id, rte_strerror(-ret));
-		ret_stop = rte_eth_dev_stop(port_id);
-		if (ret_stop != 0) {
+		ret = eth_dev_config_restore(dev, &dev_info, restore_flags, port_id);
+		if (ret != 0) {
 			RTE_ETHDEV_LOG_LINE(ERR,
-				"Failed to stop device (port %u): %s",
-				port_id, rte_strerror(-ret_stop));
-		}
+				"Error during restoring configuration for device (port %u): %s",
+				 port_id, rte_strerror(-ret));
+			ret_stop = rte_eth_dev_stop(port_id);
+			if (ret_stop != 0) {
+				RTE_ETHDEV_LOG_LINE(ERR,
+					"Failed to stop device (port %u): %s",
+					 port_id, rte_strerror(-ret_stop));
+			}
 
-		return ret;
-	}
+			return ret;
+		}
 
-	if (dev->data->dev_conf.intr_conf.lsc == 0) {
-		if (dev->dev_ops->link_update == NULL)
-			return -ENOTSUP;
-		dev->dev_ops->link_update(dev, 0);
+		if (dev->data->dev_conf.intr_conf.lsc == 0) {
+			if (dev->dev_ops->link_update == NULL)
+				return -ENOTSUP;
+			dev->dev_ops->link_update(dev, 0);
+		}
+	} else {
+		/* in secondary, proxy to primary */
+		ret = ethdev_request(port_id, ETH_REQ_START, NULL, 0);
+		if (ret != 0)
+			return ret;
 	}
 
 	/* expose selection of PMD fast-path functions */
@@ -1880,9 +1887,14 @@ rte_eth_dev_stop(uint16_t port_id)
 	/* point fast-path functions to dummy ones */
 	eth_dev_fp_ops_reset(rte_eth_fp_ops + port_id);
 
-	ret = dev->dev_ops->dev_stop(dev);
-	if (ret == 0)
-		dev->data->dev_started = 0;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		ret = dev->dev_ops->dev_stop(dev);
+		if (ret == 0)
+			dev->data->dev_started = 0;
+	} else {
+		ret = ethdev_request(port_id, ETH_REQ_STOP, NULL, 0);
+	}
+
 	rte_ethdev_trace_stop(port_id, ret);
 
 	return ret;
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 02/12] test: add test for hotplug and secondary process operations
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 01/12] ethdev: allow start/stop from secondary process Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 03/12] net/ring: allow lockfree transmit if ring supports it Stephen Hemminger
                     ` (9 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Anatoly Burakov

Use null device to exercise ethdev start/stop in secondary process.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/test/test_mp_secondary.c | 51 +++++++++++++++++++++++++++++++++---
 1 file changed, 48 insertions(+), 3 deletions(-)

diff --git a/app/test/test_mp_secondary.c b/app/test/test_mp_secondary.c
index f3694530a8..09611cc5c2 100644
--- a/app/test/test_mp_secondary.c
+++ b/app/test/test_mp_secondary.c
@@ -15,6 +15,9 @@
 #include <string.h>
 #include <unistd.h>
 
+#include <rte_ethdev.h>
+#include <rte_bus_vdev.h>
+
 #ifdef RTE_EXEC_ENV_WINDOWS
 int
 test_mp_secondary(void)
@@ -204,6 +207,44 @@ run_object_creation_tests(void)
 	return 0;
 }
 
+static int
+run_ethdev_tests(void)
+{
+	char vdev_name[] = "net_null";
+	struct rte_eth_conf dev_conf = { 0 };
+	uint16_t port;
+	int ret;
+
+	printf("### Testing hotplug and ethdev start/stop\n");
+
+	/* use hotplug to make a null vdev */
+	ret = rte_eal_hotplug_add("vdev", vdev_name, "");
+	TEST_ASSERT(ret == 0, "Hotplug add of '%s' failed", vdev_name);
+
+	printf("# Checked hotlug_add OK\n");
+
+	ret = rte_eth_dev_get_port_by_name(vdev_name, &port);
+	TEST_ASSERT(ret == 0, "Lookup vdev '%s' failed", vdev_name);
+
+	ret = rte_eth_dev_configure(port, 1, 1, &dev_conf);
+	TEST_ASSERT(ret == 0, "Configure port %u failed", port);
+
+	ret = rte_eth_dev_start(port);
+	TEST_ASSERT(ret == 0, "Start port %u failed", port);
+
+	printf("# Checked rte_eth_dev_start\n");
+
+	ret = rte_eth_dev_stop(port);
+	TEST_ASSERT(ret == 0, "Stop port %u failed", port);
+
+	printf("# Checked rte_eth_dev_stop\n");
+
+	rte_eal_hotplug_remove("vdev", vdev_name);
+
+	printf("# Checked hotlug_remove OK\n");
+	return 0;
+}
+
 /* if called in a primary process, just spawns off a secondary process to
  * run validation tests - which brings us right back here again...
  * if called in a secondary process, this runs a series of API tests to check
@@ -212,15 +253,19 @@ run_object_creation_tests(void)
 int
 test_mp_secondary(void)
 {
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+	int ret;
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
 		return run_secondary_instances();
-	}
 
 	printf("IN SECONDARY PROCESS\n");
+	ret = run_ethdev_tests();
+	if (ret != 0)
+		return ret;
 
 	return run_object_creation_tests();
 }
 
 #endif /* !RTE_EXEC_ENV_WINDOWS */
 
-REGISTER_FAST_TEST(multiprocess_autotest, false, false, test_mp_secondary);
+REGISTER_FAST_TEST(multiprocess_autotest, true, true, test_mp_secondary);
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 03/12] net/ring: allow lockfree transmit if ring supports it
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 01/12] ethdev: allow start/stop from secondary process Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 02/12] test: add test for hotplug and secondary process operations Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 04/12] net/ring: add new devargs for dumpcap use Stephen Hemminger
                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Need to be able have multiple threads all using same
transmit queue when using SPAN.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 drivers/net/ring/rte_eth_ring.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index b1085bf390..966c64d9a5 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -177,6 +177,19 @@ eth_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 	return 0;
 }
 
+static bool
+eth_tx_queue_lockfree(const struct pmd_internals *internals)
+{
+	unsigned int i;
+
+	for (i = 0; i < internals->max_tx_queues; i++) {
+		const struct ring_queue *r = &internals->tx_ring_queues[i];
+
+		if (r->rng->flags & RING_F_SP_ENQ)
+			return false;
+	}
+	return true;
+}
 
 static int
 eth_dev_info(struct rte_eth_dev *dev,
@@ -189,6 +202,10 @@ eth_dev_info(struct rte_eth_dev *dev,
 	dev_info->max_rx_queues = (uint16_t)internals->max_rx_queues;
 	dev_info->rx_offload_capa = RTE_ETH_RX_OFFLOAD_SCATTER;
 	dev_info->tx_offload_capa = RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
+
+	if (eth_tx_queue_lockfree(internals))
+		dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_MT_LOCKFREE;
+
 	dev_info->max_tx_queues = (uint16_t)internals->max_tx_queues;
 	dev_info->min_rx_bufsize = 0;
 
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 04/12] net/ring: add new devargs for dumpcap use
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
                     ` (2 preceding siblings ...)
  2025-07-09 17:33   ` [RFC v2 03/12] net/ring: allow lockfree transmit if ring supports it Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 05/12] mbuf: add dynamic flag for use by port mirroring Stephen Hemminger
                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Need ability to allow process like dumpcap to make a ring ethdev
with a pre-existing ring. Do this via devargs so it can work
with hotplug. It looked like the API rte_eth_from_ring() would
work for this, but it doesn't do the right thing.

Add argument to ring PMD creation to force packets going
into ring to be timestamped.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 drivers/net/ring/rte_eth_ring.c | 74 ++++++++++++++++++++++++++++++++-
 1 file changed, 72 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index 966c64d9a5..0f97cd1262 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -20,12 +20,18 @@
 #define ETH_RING_ACTION_CREATE		"CREATE"
 #define ETH_RING_ACTION_ATTACH		"ATTACH"
 #define ETH_RING_ACTION_MAX_LEN		8 /* CREATE | ACTION */
+
 #define ETH_RING_INTERNAL_ARG		"internal"
 #define ETH_RING_INTERNAL_ARG_MAX_LEN	19 /* "0x..16chars..\0" */
 
+#define ETH_RING_RING_ARG		"ring"
+#define ETH_RING_TIMESTAMP_ARG		"timestamp"
+
 static const char *valid_arguments[] = {
 	ETH_RING_NUMA_NODE_ACTION_ARG,
 	ETH_RING_INTERNAL_ARG,
+	ETH_RING_RING_ARG,
+	ETH_RING_TIMESTAMP_ARG,
 	NULL
 };
 
@@ -38,6 +44,9 @@ struct ring_internal_args {
 	void *addr; /* self addr for sanity check */
 };
 
+static uint64_t timestamp_dynflag;
+static int timestamp_dynfield_offset = -1;
+
 enum dev_action {
 	DEV_CREATE,
 	DEV_ATTACH
@@ -46,6 +55,8 @@ enum dev_action {
 struct ring_queue {
 	struct rte_ring *rng;
 	uint16_t in_port;
+	uint8_t timestamp;
+
 	RTE_ATOMIC(uint64_t) rx_pkts;
 	RTE_ATOMIC(uint64_t) tx_pkts;
 };
@@ -53,6 +64,7 @@ struct ring_queue {
 struct pmd_internals {
 	unsigned int max_rx_queues;
 	unsigned int max_tx_queues;
+	uint8_t timestamp;
 
 	struct ring_queue rx_ring_queues[RTE_PMD_RING_MAX_RX_RINGS];
 	struct ring_queue tx_ring_queues[RTE_PMD_RING_MAX_TX_RINGS];
@@ -61,6 +73,7 @@ struct pmd_internals {
 	enum dev_action action;
 };
 
+
 static struct rte_eth_link pmd_link = {
 	.link_speed = RTE_ETH_SPEED_NUM_10G,
 	.link_duplex = RTE_ETH_LINK_FULL_DUPLEX,
@@ -96,8 +109,23 @@ eth_ring_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
 {
 	void **ptrs = (void *)&bufs[0];
 	struct ring_queue *r = q;
-	const uint16_t nb_tx = (uint16_t)rte_ring_enqueue_burst(r->rng,
-			ptrs, nb_bufs, NULL);
+	uint16_t nb_tx;
+
+	if (r->timestamp) {
+		unsigned int i;
+		uint64_t cycles = rte_get_tsc_cycles();
+
+		for (i = 0; i < nb_bufs; i++) {
+			struct rte_mbuf *m = bufs[i];
+			*RTE_MBUF_DYNFIELD(m, timestamp_dynfield_offset,
+					   rte_mbuf_timestamp_t *) = cycles;
+			m->ol_flags |= timestamp_dynflag;
+		}
+	}
+
+
+	nb_tx = (uint16_t)rte_ring_enqueue_burst(r->rng, ptrs, nb_bufs, NULL);
+
 	if (r->rng->flags & RING_F_SP_ENQ)
 		r->tx_pkts += nb_tx;
 	else
@@ -173,6 +201,7 @@ eth_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 {
 	struct pmd_internals *internals = dev->data->dev_private;
 
+	internals->tx_ring_queues[tx_queue_id].timestamp = internals->timestamp;
 	dev->data->tx_queues[tx_queue_id] = &internals->tx_ring_queues[tx_queue_id];
 	return 0;
 }
@@ -693,6 +722,20 @@ parse_internal_args(const char *key __rte_unused, const char *value,
 	return 0;
 }
 
+static int
+parse_ring_arg(const char *key __rte_unused, const char *value, void *data)
+{
+	struct rte_ring **rp = data;
+
+	*rp = rte_ring_lookup(value);
+	if (*rp == NULL) {
+		PMD_LOG(ERR, "ring '%s' not found", value);
+		return -1;
+	}
+
+	return 0;
+}
+
 static int
 rte_pmd_ring_probe(struct rte_vdev_device *dev)
 {
@@ -770,6 +813,18 @@ rte_pmd_ring_probe(struct rte_vdev_device *dev)
 				&eth_dev);
 			if (ret >= 0)
 				ret = 0;
+		} else if (rte_kvargs_count(kvlist, ETH_RING_RING_ARG) == 1) {
+			struct rte_ring *rxtx[1] = { };
+
+			ret = rte_kvargs_process(kvlist, ETH_RING_RING_ARG, parse_ring_arg, rxtx);
+			if (ret < 0)
+				goto out_free;
+
+			/* Note: rte_eth_from_ring() does not do what is expected here! */
+			ret = do_eth_dev_ring_create(name, dev, rxtx, 1, rxtx, 1,
+						     rte_socket_id(), DEV_ATTACH, &eth_dev);
+			if (ret < 0)
+				goto out_free;
 		} else {
 			ret = rte_kvargs_count(kvlist, ETH_RING_NUMA_NODE_ACTION_ARG);
 			info = rte_zmalloc("struct node_action_list",
@@ -806,6 +861,20 @@ rte_pmd_ring_probe(struct rte_vdev_device *dev)
 				}
 			}
 		}
+
+		if (rte_kvargs_count(kvlist, ETH_RING_TIMESTAMP_ARG) == 1) {
+			struct pmd_internals *internals = eth_dev->data->dev_private;
+
+			if (timestamp_dynfield_offset == -1) {
+				ret = rte_mbuf_dyn_rx_timestamp_register(&timestamp_dynfield_offset,
+									 &timestamp_dynflag);
+				if (ret < 0)
+					return ret;
+			}
+
+			internals->timestamp = 1;
+		}
+
 	}
 
 out_free:
@@ -843,4 +912,5 @@ static struct rte_vdev_driver pmd_ring_drv = {
 RTE_PMD_REGISTER_VDEV(net_ring, pmd_ring_drv);
 RTE_PMD_REGISTER_ALIAS(net_ring, eth_ring);
 RTE_PMD_REGISTER_PARAM_STRING(net_ring,
+	ETH_RING_RING_ARG "=<string> "
 	ETH_RING_NUMA_NODE_ACTION_ARG "=name:node:action(ATTACH|CREATE)");
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 05/12] mbuf: add dynamic flag for use by port mirroring
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
                     ` (3 preceding siblings ...)
  2025-07-09 17:33   ` [RFC v2 04/12] net/ring: add new devargs for dumpcap use Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 06/12] ethdev: add port mirroring feature Stephen Hemminger
                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Add names for new port mirror fields.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/mbuf/rte_mbuf_dyn.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/lib/mbuf/rte_mbuf_dyn.h b/lib/mbuf/rte_mbuf_dyn.h
index 865c90f579..72a31b22b3 100644
--- a/lib/mbuf/rte_mbuf_dyn.h
+++ b/lib/mbuf/rte_mbuf_dyn.h
@@ -327,6 +327,15 @@ int rte_mbuf_dyn_tx_timestamp_register(int *field_offset, uint64_t *tx_flag);
 #define RTE_MBUF_DYNFIELD_IP_REASSEMBLY_NAME "rte_dynfield_ip_reassembly"
 #define RTE_MBUF_DYNFLAG_IP_REASSEMBLY_INCOMPLETE_NAME "rte_dynflag_ip_reassembly_incomplete"
 
+/**
+ * The ethdev port mirror needs some extra packet information that
+ * includes information about the packet before it was copied to the mirror.
+ */
+#define RTE_MBUF_DYNFIELD_MIRROR_ORIGIN "rte_dynfield_mirror_origin"
+#define RTE_MBUF_DYNFLAG_MIRROR_ORIGIN  "rte_dynflag_mirror_origin"
+#define RTE_MBUF_DYNFLAG_MIRROR_INGRESS "rte_dynflag_mirror_ingress"
+#define RTE_MBUF_DYNFLAG_MIRROR_EGRESS "rte_dynflag_mirror_egress"
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 06/12] ethdev: add port mirroring feature
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
                     ` (4 preceding siblings ...)
  2025-07-09 17:33   ` [RFC v2 05/12] mbuf: add dynamic flag for use by port mirroring Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 07/12] pcapng: split packet copy from header insertion Stephen Hemminger
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Bruce Richardson, Thomas Monjalon,
	Ferruh Yigit, Andrew Rybchenko, Anatoly Burakov

This adds new feature port mirroring to the ethdev layer.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 config/rte_config.h              |   1 +
 lib/ethdev/ethdev_driver.h       |   6 +
 lib/ethdev/ethdev_private.c      |  58 +++++-
 lib/ethdev/ethdev_private.h      |   3 +
 lib/ethdev/ethdev_trace.h        |  17 ++
 lib/ethdev/ethdev_trace_points.c |   6 +
 lib/ethdev/meson.build           |   1 +
 lib/ethdev/rte_ethdev.c          |  17 +-
 lib/ethdev/rte_ethdev.h          |  23 ++-
 lib/ethdev/rte_ethdev_core.h     |   6 +-
 lib/ethdev/rte_mirror.c          | 344 +++++++++++++++++++++++++++++++
 lib/ethdev/rte_mirror.h          | 113 ++++++++++
 12 files changed, 577 insertions(+), 18 deletions(-)
 create mode 100644 lib/ethdev/rte_mirror.c
 create mode 100644 lib/ethdev/rte_mirror.h

diff --git a/config/rte_config.h b/config/rte_config.h
index 05344e2619..76eb2aa417 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -69,6 +69,7 @@
 #define RTE_MAX_QUEUES_PER_PORT 1024
 #define RTE_ETHDEV_QUEUE_STAT_CNTRS 16 /* max 256 */
 #define RTE_ETHDEV_RXTX_CALLBACKS 1
+#define RTE_ETHDEV_MIRROR 1
 #define RTE_MAX_MULTI_HOST_CTRLS 4
 
 /* cryptodev defines */
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 2b4d2ae9c3..586f9b7a2b 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -85,12 +85,18 @@ struct __rte_cache_aligned rte_eth_dev {
 	 * received packets before passing them to the user
 	 */
 	RTE_ATOMIC(struct rte_eth_rxtx_callback *) post_rx_burst_cbs[RTE_MAX_QUEUES_PER_PORT];
+
+	/** Receive mirrors */
+	RTE_ATOMIC(struct rte_eth_mirror *) rx_mirror;
 	/**
 	 * User-supplied functions called from tx_burst to pre-process
 	 * received packets before passing them to the driver for transmission
 	 */
 	RTE_ATOMIC(struct rte_eth_rxtx_callback *) pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT];
 
+	/** Transmit mirrors */
+	RTE_ATOMIC(struct rte_eth_mirror *) tx_mirror;
+
 	enum rte_eth_dev_state state; /**< Flag indicating the port state */
 	void *security_ctx; /**< Context for security ops */
 };
diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
index 184cf33f99..d7d99a09f2 100644
--- a/lib/ethdev/ethdev_private.c
+++ b/lib/ethdev/ethdev_private.c
@@ -280,6 +280,8 @@ eth_dev_fp_ops_setup(struct rte_eth_fp_ops *fpo,
 	fpo->tx_descriptor_status = dev->tx_descriptor_status;
 	fpo->recycle_tx_mbufs_reuse = dev->recycle_tx_mbufs_reuse;
 	fpo->recycle_rx_descriptors_refill = dev->recycle_rx_descriptors_refill;
+	fpo->rx_mirror = (struct rte_eth_mirror * __rte_atomic *)(uintptr_t)&dev->rx_mirror;
+	fpo->tx_mirror = (struct rte_eth_mirror * __rte_atomic *)(uintptr_t)&dev->tx_mirror;
 
 	fpo->rxq.data = dev->data->rx_queues;
 	fpo->rxq.clbk = (void * __rte_atomic *)(uintptr_t)dev->post_rx_burst_cbs;
@@ -481,17 +483,53 @@ eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
 }
 
 static int
-ethdev_handle_request(const struct ethdev_mp_request *req)
+ethdev_handle_request(const struct ethdev_mp_request *req, size_t len)
 {
+	len -= sizeof(*req);
+
 	switch (req->operation) {
 	case ETH_REQ_START:
+		if (len != 0)
+			return -EINVAL;
+
 		return rte_eth_dev_start(req->port_id);
 
 	case ETH_REQ_STOP:
+		if (len != 0)
+			return -EINVAL;
 		return rte_eth_dev_stop(req->port_id);
 
+	case ETH_REQ_RESET:
+		if (len != 0)
+			return -EINVAL;
+		return rte_eth_dev_reset(req->port_id);
+
+	case ETH_REQ_ADD_MIRROR:
+		if (len != sizeof(struct rte_eth_mirror_conf)) {
+			RTE_ETHDEV_LOG_LINE(ERR,
+				    "add mirror conf wrong size %zu", len);
+			return -EINVAL;
+		}
+
+		const struct rte_eth_mirror_conf *conf
+			= (const struct rte_eth_mirror_conf *) req->config;
+
+		return rte_eth_add_mirror(req->port_id, conf);
+
+	case ETH_REQ_REMOVE_MIRROR:
+		if (len != sizeof(uint16_t)) {
+			RTE_ETHDEV_LOG_LINE(ERR,
+				    "mirror remove wrong size %zu", len);
+			return -EINVAL;
+		}
+
+		uint16_t target = *(const uint16_t *) req->config;
+		return rte_eth_remove_mirror(req->port_id, target);
+
 	default:
-		return -EINVAL;
+		RTE_ETHDEV_LOG_LINE(ERR,
+			    "Unknown mp request operation %u", req->operation);
+		return -ENOTSUP;
 	}
 }
 
@@ -504,23 +542,25 @@ static_assert(sizeof(struct ethdev_mp_response) <= RTE_MP_MAX_PARAM_LEN,
 int
 ethdev_server(const struct rte_mp_msg *mp_msg, const void *peer)
 {
-	const struct ethdev_mp_request *req
-		= (const struct ethdev_mp_request *)mp_msg->param;
-
 	struct rte_mp_msg mp_resp = {
 		.name = ETHDEV_MP,
 	};
 	struct ethdev_mp_response *resp;
+	const struct ethdev_mp_request *req;
 
 	resp = (struct ethdev_mp_response *)mp_resp.param;
 	mp_resp.len_param = sizeof(*resp);
-	resp->res_op = req->operation;
 
 	/* recv client requests */
-	if (mp_msg->len_param != sizeof(*req))
+	if (mp_msg->len_param < (int)sizeof(*req)) {
+		RTE_ETHDEV_LOG_LINE(ERR, "invalid request from secondary");
 		resp->err_value = -EINVAL;
-	else
-		resp->err_value = ethdev_handle_request(req);
+	} else {
+		req = (const struct ethdev_mp_request *)mp_msg->param;
+		resp->res_op = req->operation;
+		resp->err_value = ethdev_handle_request(req, mp_msg->len_param);
+
+	}
 
 	return rte_mp_reply(&mp_resp, peer);
 }
diff --git a/lib/ethdev/ethdev_private.h b/lib/ethdev/ethdev_private.h
index f58f161871..d2fdc20057 100644
--- a/lib/ethdev/ethdev_private.h
+++ b/lib/ethdev/ethdev_private.h
@@ -85,6 +85,9 @@ int eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues);
 enum ethdev_mp_operation {
 	ETH_REQ_START,
 	ETH_REQ_STOP,
+	ETH_REQ_RESET,
+	ETH_REQ_ADD_MIRROR,
+	ETH_REQ_REMOVE_MIRROR,
 };
 
 struct ethdev_mp_request {
diff --git a/lib/ethdev/ethdev_trace.h b/lib/ethdev/ethdev_trace.h
index 482befc209..e137afcbf7 100644
--- a/lib/ethdev/ethdev_trace.h
+++ b/lib/ethdev/ethdev_trace.h
@@ -1035,6 +1035,23 @@ RTE_TRACE_POINT(
 	rte_trace_point_emit_int(ret);
 )
 
+RTE_TRACE_POINT(
+	rte_eth_trace_add_mirror,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id,
+			     const struct rte_eth_mirror_conf *conf, int ret),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_u16(conf->target);
+	rte_trace_point_emit_int(ret);
+)
+
+RTE_TRACE_POINT(
+	rte_eth_trace_remove_mirror,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t target_id, int ret),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_u16(target_id);
+	rte_trace_point_emit_int(ret);
+)
+
 RTE_TRACE_POINT(
 	rte_eth_trace_rx_queue_info_get,
 	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id,
diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
index 071c508327..fa1fd21809 100644
--- a/lib/ethdev/ethdev_trace_points.c
+++ b/lib/ethdev/ethdev_trace_points.c
@@ -389,6 +389,12 @@ RTE_TRACE_POINT_REGISTER(rte_eth_trace_remove_rx_callback,
 RTE_TRACE_POINT_REGISTER(rte_eth_trace_remove_tx_callback,
 	lib.ethdev.remove_tx_callback)
 
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_add_mirror,
+	lib.ethdev.add_mirror)
+
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_remove_mirror,
+	lib.ethdev.remove_mirror)
+
 RTE_TRACE_POINT_REGISTER(rte_eth_trace_rx_queue_info_get,
 	lib.ethdev.rx_queue_info_get)
 
diff --git a/lib/ethdev/meson.build b/lib/ethdev/meson.build
index f1d2586591..3672b6a35b 100644
--- a/lib/ethdev/meson.build
+++ b/lib/ethdev/meson.build
@@ -11,6 +11,7 @@ sources = files(
         'rte_ethdev_cman.c',
         'rte_ethdev_telemetry.c',
         'rte_flow.c',
+        'rte_mirror.c',
         'rte_mtr.c',
         'rte_tm.c',
         'sff_telemetry.c',
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 41363af2c3..6602a771bd 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -14,6 +14,8 @@
 #include <bus_driver.h>
 #include <eal_export.h>
 #include <rte_log.h>
+#include <rte_cycles.h>
+#include <rte_alarm.h>
 #include <rte_interrupts.h>
 #include <rte_kvargs.h>
 #include <rte_memcpy.h>
@@ -2041,13 +2043,16 @@ rte_eth_dev_reset(uint16_t port_id)
 	if (dev->dev_ops->dev_reset == NULL)
 		return -ENOTSUP;
 
-	ret = rte_eth_dev_stop(port_id);
-	if (ret != 0) {
-		RTE_ETHDEV_LOG_LINE(ERR,
-			"Failed to stop device (port %u) before reset: %s - ignore",
-			port_id, rte_strerror(-ret));
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		ret = rte_eth_dev_stop(port_id);
+		if (ret != 0)
+			RTE_ETHDEV_LOG_LINE(ERR,
+				    "Failed to stop device (port %u) before reset: %s - ignore",
+				    port_id, rte_strerror(-ret));
+		ret = eth_err(port_id, dev->dev_ops->dev_reset(dev));
+	} else {
+		ret = ethdev_request(port_id, ETH_REQ_STOP, NULL, 0);
 	}
-	ret = eth_err(port_id, dev->dev_ops->dev_reset(dev));
 
 	rte_ethdev_trace_reset(port_id, ret);
 
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index f9fb6ae549..4bd75e7afb 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -170,6 +170,7 @@
 
 #include "rte_ethdev_trace_fp.h"
 #include "rte_dev_info.h"
+#include "rte_mirror.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -1466,7 +1467,6 @@ enum rte_eth_tunnel_type {
 	RTE_ETH_TUNNEL_TYPE_ECPRI,
 	RTE_ETH_TUNNEL_TYPE_MAX,
 };
-
 #ifdef __cplusplus
 }
 #endif
@@ -6334,6 +6334,17 @@ rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
 
 	nb_rx = p->rx_pkt_burst(qd, rx_pkts, nb_pkts);
 
+#ifdef RTE_ETHDEV_MIRROR
+	if (p->rx_mirror) {
+		const struct rte_eth_mirror *mirror;
+
+		mirror = rte_atomic_load_explicit(p->rx_mirror, rte_memory_order_relaxed);
+		if (unlikely(mirror != NULL))
+			rte_eth_mirror_burst(port_id, queue_id, RTE_ETH_MIRROR_DIRECTION_INGRESS,
+					     rx_pkts, nb_rx, mirror);
+	}
+#endif
+
 #ifdef RTE_ETHDEV_RXTX_CALLBACKS
 	{
 		void *cb;
@@ -6692,6 +6703,16 @@ rte_eth_tx_burst(uint16_t port_id, uint16_t queue_id,
 	}
 #endif
 
+#ifdef RTE_ETHDEV_MIRROR
+	if (p->tx_mirror) {
+		const struct rte_eth_mirror *mirror;
+
+		mirror = rte_atomic_load_explicit(p->tx_mirror, rte_memory_order_relaxed);
+		if (unlikely(mirror != NULL))
+			rte_eth_mirror_burst(port_id, queue_id, RTE_ETH_MIRROR_DIRECTION_EGRESS,
+					     tx_pkts, nb_pkts, mirror);
+	}
+#endif
 	nb_pkts = p->tx_pkt_burst(qd, tx_pkts, nb_pkts);
 
 	rte_ethdev_trace_tx_burst(port_id, queue_id, (void **)tx_pkts, nb_pkts);
diff --git a/lib/ethdev/rte_ethdev_core.h b/lib/ethdev/rte_ethdev_core.h
index e55fb42996..2ef3b659f7 100644
--- a/lib/ethdev/rte_ethdev_core.h
+++ b/lib/ethdev/rte_ethdev_core.h
@@ -101,7 +101,8 @@ struct __rte_cache_aligned rte_eth_fp_ops {
 	eth_rx_descriptor_status_t rx_descriptor_status;
 	/** Refill Rx descriptors with the recycling mbufs. */
 	eth_recycle_rx_descriptors_refill_t recycle_rx_descriptors_refill;
-	uintptr_t reserved1[2];
+	uintptr_t reserved1;
+	RTE_ATOMIC(struct rte_eth_mirror *) *rx_mirror;
 	/**@}*/
 
 	/**@{*/
@@ -121,7 +122,8 @@ struct __rte_cache_aligned rte_eth_fp_ops {
 	eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
 	/** Get the number of used Tx descriptors. */
 	eth_tx_queue_count_t tx_queue_count;
-	uintptr_t reserved2[1];
+	RTE_ATOMIC(struct rte_eth_mirror *) *tx_mirror;
+	uintptr_t reserved2;
 	/**@}*/
 
 };
diff --git a/lib/ethdev/rte_mirror.c b/lib/ethdev/rte_mirror.c
new file mode 100644
index 0000000000..97318176b2
--- /dev/null
+++ b/lib/ethdev/rte_mirror.c
@@ -0,0 +1,344 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Stephen Hemminger <stephen@networkplumber.org>
+ */
+
+#include <errno.h>
+#include <inttypes.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+
+#include <rte_config.h>
+#include <rte_bitops.h>
+#include <rte_eal.h>
+#include <rte_log.h>
+#include <rte_common.h>
+#include <rte_cycles.h>
+#include <rte_alarm.h>
+#include <rte_ether.h>
+#include <rte_malloc.h>
+#include <rte_mbuf.h>
+#include <rte_spinlock.h>
+#include <rte_stdatomic.h>
+#include <eal_export.h>
+
+#include "rte_ethdev.h"
+#include "rte_mirror.h"
+#include "ethdev_driver.h"
+#include "ethdev_private.h"
+#include "ethdev_trace.h"
+
+/* Upper bound of packet bursts redirected */
+#define RTE_MIRROR_BURST_SIZE 64
+
+/* spinlock for setting up mirror ports */
+static rte_spinlock_t mirror_port_lock = RTE_SPINLOCK_INITIALIZER;
+
+/* dynamically assigned offload flag to indicate ingress vs egress */
+static uint64_t mirror_origin_flag;
+static int mirror_origin_offset = -1;
+static uint64_t mirror_ingress_flag;
+static uint64_t mirror_egress_flag;
+
+static uint64_t mbuf_timestamp_dynflag;
+static int mbuf_timestamp_offset = -1;
+
+/* register dynamic mbuf fields, done on first mirror creation */
+static int
+ethdev_dyn_mirror_register(void)
+{
+	const struct rte_mbuf_dynfield field_desc = {
+		.name = RTE_MBUF_DYNFIELD_MIRROR_ORIGIN,
+		.size = sizeof(rte_mbuf_origin_t),
+		.align = sizeof(rte_mbuf_origin_t),
+	};
+	struct rte_mbuf_dynflag flag_desc = {
+		.name = RTE_MBUF_DYNFLAG_MIRROR_ORIGIN,
+	};
+	int offset;
+
+	if (rte_mbuf_dyn_tx_timestamp_register(&mbuf_timestamp_offset, &mbuf_timestamp_dynflag) < 0) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Failed to register timestamp flag");
+		return -1;
+	}
+
+	offset = rte_mbuf_dynfield_register(&field_desc);
+	if (offset < 0) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Failed to register mbuf origin field");
+		return -1;
+	}
+	mirror_origin_offset = offset;
+
+	offset = rte_mbuf_dynflag_register(&flag_desc);
+	if (offset < 0) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Failed to register mbuf origin flag");
+		return -1;
+	}
+	mirror_origin_flag = RTE_BIT64(offset);
+
+	strlcpy(flag_desc.name, RTE_MBUF_DYNFLAG_MIRROR_INGRESS,
+		sizeof(flag_desc.name));
+	offset = rte_mbuf_dynflag_register(&flag_desc);
+	if (offset < 0) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Failed to register mbuf ingress flag");
+		return -1;
+	}
+	mirror_ingress_flag = RTE_BIT64(offset);
+
+	strlcpy(flag_desc.name, RTE_MBUF_DYNFLAG_MIRROR_EGRESS,
+		sizeof(flag_desc.name));
+	offset = rte_mbuf_dynflag_register(&flag_desc);
+	if (offset < 0) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Failed to register mbuf egress flag");
+		return -1;
+	}
+	mirror_egress_flag = RTE_BIT64(offset);
+
+	return 0;
+}
+
+/**
+ * Structure used to hold information mirror port mirrors for a
+ * queue on Rx and Tx.
+ */
+struct rte_eth_mirror {
+	RTE_ATOMIC(struct rte_eth_mirror *) next;
+	struct rte_eth_mirror_conf conf;
+};
+
+/* Add a new mirror entry to the list. */
+static int
+ethdev_insert_mirror(struct rte_eth_mirror **top,
+		   const struct rte_eth_mirror_conf *conf)
+{
+	struct rte_eth_mirror *mirror = rte_zmalloc(NULL, sizeof(*mirror), 0);
+	if (mirror == NULL)
+		return -ENOMEM;
+
+	mirror->conf = *conf;
+	mirror->next = *top;
+
+	rte_atomic_store_explicit(top, mirror, rte_memory_order_relaxed);
+	return 0;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_eth_add_mirror, 25.11)
+int
+rte_eth_add_mirror(uint16_t port_id, const struct rte_eth_mirror_conf *conf)
+{
+#ifndef RTE_ETHDEV_MIRROR
+	return -ENOTSUP;
+#endif
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	struct rte_eth_dev_info dev_info;
+
+	if (conf == NULL) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Missing configuration information");
+		return -EINVAL;
+	}
+
+	if (conf->direction == 0 ||
+	    conf->direction > (RTE_ETH_MIRROR_DIRECTION_INGRESS | RTE_ETH_MIRROR_DIRECTION_EGRESS)) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Invalid direction %#x", conf->direction);
+		return -EINVAL;
+	}
+
+	if (conf->snaplen < RTE_ETHER_HDR_LEN) {
+		RTE_ETHDEV_LOG_LINE(ERR, "invalid snap len");
+		return -EINVAL;
+	}
+
+	if (conf->mp == NULL) {
+		RTE_ETHDEV_LOG_LINE(ERR, "not a valid mempool");
+		return -EINVAL;
+	}
+
+	if (conf->flags & ~RTE_ETH_MIRROR_FLAG_MASK) {
+		RTE_ETHDEV_LOG_LINE(ERR, "unsupported flags");
+		return -EINVAL;
+	}
+
+	/* Checks that target exists */
+	int ret = rte_eth_dev_info_get(conf->target, &dev_info);
+	if (ret != 0)
+		return ret;
+
+	/* Loopback mirror could create packet storm */
+	if (conf->target == port_id) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Cannot mirror port to self");
+		return -EINVAL;
+	}
+
+	/* Because multiple directions and multiple queues will all going to the mirror port
+	 * need the transmit path to be lockfree.
+	 */
+	if (!(dev_info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_MT_LOCKFREE)) {
+		RTE_ETHDEV_LOG_LINE(ERR, "Mirror needs lockfree transmit");
+		return -ENOTSUP;
+	}
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* Register dynamic fields once */
+		if (mirror_origin_offset < 0) {
+			ret = ethdev_dyn_mirror_register();
+			if (ret < 0)
+				return ret;
+		}
+
+		rte_spinlock_lock(&mirror_port_lock);
+		ret = 0;
+		if (conf->direction & RTE_ETH_MIRROR_DIRECTION_INGRESS)
+			ret = ethdev_insert_mirror(&dev->rx_mirror, conf);
+		if (ret == 0 && (conf->direction & RTE_ETH_MIRROR_DIRECTION_EGRESS))
+			ret = ethdev_insert_mirror(&dev->tx_mirror, conf);
+		rte_spinlock_unlock(&mirror_port_lock);
+	} else {
+		/* in secondary, proxy to primary */
+		ret = ethdev_request(port_id, ETH_REQ_ADD_MIRROR, conf, sizeof(*conf));
+		if (ret != 0)
+			return ret;
+	}
+
+	rte_eth_trace_add_mirror(port_id, conf, ret);
+	return ret;
+}
+
+static bool
+ethdev_delete_mirror(struct rte_eth_mirror **top, uint16_t target_id)
+{
+	struct rte_eth_mirror *mirror;
+
+	while ((mirror = *top) != NULL) {
+		if (mirror->conf.target == target_id)
+			goto found;
+		top = &mirror->next;
+	}
+	/* not found in list */
+	return false;
+
+found:
+	/* unlink from list */
+	rte_atomic_store_explicit(top, mirror->next, rte_memory_order_relaxed);
+
+	/* Defer freeing the mirror until after one second
+	 * to allow for active threads that are using it.
+	 * Assumes no PMD takes more than one second to transmit a burst.
+	 * Alternative would be RCU, but RCU in DPDK is optional.
+	 */
+	rte_eal_alarm_set(US_PER_S, rte_free, mirror);
+	return true;
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_eth_remove_mirror, 25.11)
+int
+rte_eth_remove_mirror(uint16_t port_id, uint16_t target_id)
+{
+#ifndef RTE_ETHDEV_MIRROR
+	return -ENOTSUP;
+#endif
+	int ret = 0;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(target_id, -ENODEV);
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		bool found;
+
+		rte_spinlock_lock(&mirror_port_lock);
+		found = ethdev_delete_mirror(&dev->rx_mirror, target_id);
+		found |= ethdev_delete_mirror(&dev->tx_mirror, target_id);
+		rte_spinlock_unlock(&mirror_port_lock);
+		if (!found)
+			ret = -ENOENT; /* no mirror present */
+	} else {
+		ret = ethdev_request(port_id, ETH_REQ_REMOVE_MIRROR,
+				     &target_id, sizeof(target_id));
+	}
+
+	rte_eth_trace_remove_mirror(port_id, target_id, ret);
+	return ret;
+}
+
+static inline void
+eth_dev_mirror(uint16_t port_id, uint16_t queue_id, uint8_t direction,
+	     struct rte_mbuf **pkts, uint16_t nb_pkts,
+	     const struct rte_eth_mirror_conf *conf)
+{
+	struct rte_mbuf *tosend[RTE_MIRROR_BURST_SIZE];
+	unsigned int count = 0;
+	unsigned int i;
+
+	for (i = 0; i < nb_pkts; i++) {
+		struct rte_mbuf *m = pkts[i];
+		struct rte_mbuf *mc;
+
+		if (conf->flags & RTE_ETH_MIRROR_INDIRECT_FLAG) {
+			mc = rte_pktmbuf_alloc(conf->mp);
+			if (unlikely(mc == NULL))
+				continue;
+
+			/* Make both mbuf's point to the same data */
+			rte_pktmbuf_attach(mc, m);
+		} else {
+			mc = rte_pktmbuf_copy(m, conf->mp, 0, conf->snaplen);
+			/* TODO: dropped stats? */
+			if (unlikely(mc == NULL))
+				continue;
+		}
+
+		/* Put info about origin of the packet */
+		if (conf->flags & RTE_ETH_MIRROR_ORIGIN_FLAG) {
+			struct rte_mbuf_origin *origin
+				= RTE_MBUF_DYNFIELD(mc, mirror_origin_offset, rte_mbuf_origin_t *);
+			origin->original_len = m->pkt_len;
+			origin->port_id = port_id;
+			origin->queue_id = queue_id;
+			mc->ol_flags |= mirror_origin_flag;
+		}
+
+		/* Insert timestamp into packet */
+		if (conf->flags & RTE_ETH_MIRROR_TIMESTAMP_FLAG) {
+			*RTE_MBUF_DYNFIELD(m, mbuf_timestamp_offset, rte_mbuf_timestamp_t *)
+				= rte_get_tsc_cycles();
+			mc->ol_flags |= mbuf_timestamp_dynflag;
+		}
+
+		mc->ol_flags &= ~(mirror_ingress_flag | mirror_egress_flag);
+		if (direction & RTE_ETH_MIRROR_DIRECTION_INGRESS)
+			mc->ol_flags |= mirror_ingress_flag;
+		else if (direction & RTE_ETH_MIRROR_DIRECTION_EGRESS)
+			mc->ol_flags |= mirror_egress_flag;
+
+		tosend[count++] = mc;
+	}
+
+	uint16_t nsent = rte_eth_tx_burst(conf->target, 0, tosend, count);
+	if (unlikely(nsent < count)) {
+		uint16_t drop = count - nsent;
+
+		/* TODO: need some stats here? */
+		rte_pktmbuf_free_bulk(pkts + nsent, drop);
+	}
+}
+
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_eth_mirror_burst, 25.11)
+void
+rte_eth_mirror_burst(uint16_t port_id, uint16_t queue_id, uint8_t direction,
+		     struct rte_mbuf **pkts, uint16_t nb_pkts,
+		     const struct rte_eth_mirror *mirror)
+{
+	unsigned int i;
+
+	for (i = 0; i < nb_pkts; i += RTE_MIRROR_BURST_SIZE) {
+		uint16_t left  = nb_pkts - i;
+		uint16_t burst = RTE_MIN(left, RTE_MIRROR_BURST_SIZE);
+
+		eth_dev_mirror(port_id, queue_id, direction,
+			     pkts + i, burst, &mirror->conf);
+	}
+}
diff --git a/lib/ethdev/rte_mirror.h b/lib/ethdev/rte_mirror.h
new file mode 100644
index 0000000000..27a684b4ae
--- /dev/null
+++ b/lib/ethdev/rte_mirror.h
@@ -0,0 +1,113 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Stephen Hemminger <stephen@networkplumber.org>
+ */
+
+#ifndef RTE_MIRROR_H_
+#define RTE_MIRROR_H_
+
+#include <stdint.h>
+
+#include <rte_compat.h>
+#include <rte_mbuf.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * @file
+ * Ethdev port mirroring
+ *
+ * This interface provides the ability to duplicate packets to another port.
+ */
+
+/* Definitions for ethdev analyzer direction */
+#define RTE_ETH_MIRROR_DIRECTION_INGRESS 1
+#define RTE_ETH_MIRROR_DIRECTION_EGRESS 2
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice.
+ *
+ * This dynamic field is added to mbuf's when they are copied to
+ * the port mirror.
+ */
+typedef struct rte_mbuf_origin {
+	uint32_t original_len;	/**< Packet length before copy */
+	uint16_t port_id;	/**< Port where packet originated */
+	uint16_t queue_id;      /**< Queue used for Tx or Rx */
+} rte_mbuf_origin_t;
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice.
+ *
+ * Structure used to configure ethdev Switched Port Analyzer (MIRROR)
+ */
+struct rte_eth_mirror_conf {
+	struct rte_mempool *mp;	/**< Memory pool for copies, If NULL then cloned. */
+	uint32_t snaplen;	/**< Upper limit on number of bytes to copy */
+	uint32_t flags;		/**< bitmask of RTE_ETH_MIRROR_XXX_FLAG's */
+	uint16_t target;	/**< Destination port */
+	uint8_t direction;	/**< bitmask of RTE_ETH_MIRROR_DIRECTION_XXX */
+};
+
+#define RTE_ETH_MIRROR_TIMESTAMP_FLAG 1	/**< insert timestamp into mirrored packet */
+#define RTE_ETH_MIRROR_ORIGIN_FLAG 2	/**< insert rte_mbuf_origin into mirrored packet */
+#define RTE_ETH_MIRROR_INDIRECT_FLAG 4  /**< use rte_mbuf_attach rather than copy */
+
+#define RTE_ETH_MIRROR_FLAG_MASK 7
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Create a Switched Port Analyzer (MIRROR) instance.
+ *
+ * @param port_id
+ *   The port identifier of the source Ethernet device.
+ * @param conf
+ *   Settings for this MIRROR instance..
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+__rte_experimental
+int
+rte_eth_add_mirror(uint16_t port_id, const struct rte_eth_mirror_conf *conf);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Break port mirrorning.
+ * After this call no more packets will be sent the target port.
+ *
+ * @param port_id
+ *   The port identifier of the source Ethernet device.
+ * @param target_id
+ *   The identifier of the destination port.
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+__rte_experimental
+int rte_eth_remove_mirror(uint16_t port_id, uint16_t target_id);
+
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Helper routine for rte_eth_rx_burst() and rte_eth_tx_burst().
+ * Do not use directly.
+ */
+struct rte_eth_mirror;
+__rte_experimental
+void rte_eth_mirror_burst(uint16_t port_id, uint16_t quque_id, uint8_t dir,
+			  struct rte_mbuf **pkts, uint16_t nb_pkts,
+			  const struct rte_eth_mirror *mirror);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_MIRROR_H_ */
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 07/12] pcapng: split packet copy from header insertion
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
                     ` (5 preceding siblings ...)
  2025-07-09 17:33   ` [RFC v2 06/12] ethdev: add port mirroring feature Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 08/12] pcapng: make queue optional Stephen Hemminger
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan

In new model, the packet was already copied, only need
to wrap it in pcapng format.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/pcapng/rte_pcapng.c | 178 +++++++++++++++++++++-------------------
 lib/pcapng/rte_pcapng.h |  27 +++++-
 2 files changed, 120 insertions(+), 85 deletions(-)

diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 2a07b4c1f5..6db5d4da50 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -1,3 +1,4 @@
+
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2019 Microsoft Corporation
  */
@@ -432,8 +433,24 @@ pcapng_vlan_insert(struct rte_mbuf *m, uint16_t ether_type, uint16_t tci)
 	return 0;
 }
 
+/* pad the packet to 32 bit boundary */
+static inline int
+pcapng_mbuf_pad32(struct rte_mbuf *m)
+{
+	uint32_t pkt_len = rte_pktmbuf_pkt_len(m);
+	uint32_t padding = RTE_ALIGN(pkt_len, sizeof(uint32_t)) - pkt_len;
+
+	if (padding > 0) {
+		void *tail = rte_pktmbuf_append(m, padding);
+		if (tail == NULL)
+			return -1;
+		memset(tail, 0, padding);
+	}
+	return 0;
+}
+
 /*
- *   The mbufs created use the Pcapng standard enhanced packet  block.
+ *  The mbufs created use the Pcapng standard enhanced packet block.
  *
  *                         1                   2                   3
  *     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
@@ -468,71 +485,28 @@ pcapng_vlan_insert(struct rte_mbuf *m, uint16_t ether_type, uint16_t tci)
  *    |                      Block Total Length                       |
  *    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  */
-
-/* Make a copy of original mbuf with pcapng header and options */
-RTE_EXPORT_SYMBOL(rte_pcapng_copy)
-struct rte_mbuf *
-rte_pcapng_copy(uint16_t port_id, uint32_t queue,
-		const struct rte_mbuf *md,
-		struct rte_mempool *mp,
-		uint32_t length,
-		enum rte_pcapng_direction direction,
-		const char *comment)
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_pcapng_insert, 25.07)
+int
+rte_pcapng_insert(struct rte_mbuf *m, uint32_t queue,
+		  enum rte_pcapng_direction direction, uint32_t orig_len,
+		  uint64_t timestamp, const char *comment)
 {
 	struct pcapng_enhance_packet_block *epb;
-	uint32_t orig_len, pkt_len, padding, flags;
-	struct pcapng_option *opt;
-	uint64_t timestamp;
-	uint16_t optlen;
-	struct rte_mbuf *mc;
-	bool rss_hash;
-
-#ifdef RTE_LIBRTE_ETHDEV_DEBUG
-	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
-#endif
-	orig_len = rte_pktmbuf_pkt_len(md);
+	uint32_t pkt_len = rte_pktmbuf_pkt_len(m);
+	uint32_t flags;
 
-	/* Take snapshot of the data */
-	mc = rte_pktmbuf_copy(md, mp, 0, length);
-	if (unlikely(mc == NULL))
-		return NULL;
-
-	/* Expand any offloaded VLAN information */
-	if ((direction == RTE_PCAPNG_DIRECTION_IN &&
-	     (md->ol_flags & RTE_MBUF_F_RX_VLAN_STRIPPED)) ||
-	    (direction == RTE_PCAPNG_DIRECTION_OUT &&
-	     (md->ol_flags & RTE_MBUF_F_TX_VLAN))) {
-		if (pcapng_vlan_insert(mc, RTE_ETHER_TYPE_VLAN,
-				       md->vlan_tci) != 0)
-			goto fail;
-	}
+	if (unlikely(pcapng_mbuf_pad32(m) < 0))
+		return -1;
 
-	if ((direction == RTE_PCAPNG_DIRECTION_IN &&
-	     (md->ol_flags & RTE_MBUF_F_RX_QINQ_STRIPPED)) ||
-	    (direction == RTE_PCAPNG_DIRECTION_OUT &&
-	     (md->ol_flags & RTE_MBUF_F_TX_QINQ))) {
-		if (pcapng_vlan_insert(mc, RTE_ETHER_TYPE_QINQ,
-				       md->vlan_tci_outer) != 0)
-			goto fail;
-	}
+	uint16_t optlen = pcapng_optlen(sizeof(flags));
 
-	/* record HASH on incoming packets */
-	rss_hash = (direction == RTE_PCAPNG_DIRECTION_IN &&
-		    (md->ol_flags & RTE_MBUF_F_RX_RSS_HASH));
+	/* make queue optional? */
+	optlen += pcapng_optlen(sizeof(queue));
 
-	/* pad the packet to 32 bit boundary */
-	pkt_len = rte_pktmbuf_pkt_len(mc);
-	padding = RTE_ALIGN(pkt_len, sizeof(uint32_t)) - pkt_len;
-	if (padding > 0) {
-		void *tail = rte_pktmbuf_append(mc, padding);
+	/* does packet have valid RSS hash to include */
+	bool rss_hash = (direction == RTE_PCAPNG_DIRECTION_IN &&
+			 (m->ol_flags & RTE_MBUF_F_RX_RSS_HASH));
 
-		if (tail == NULL)
-			goto fail;
-		memset(tail, 0, padding);
-	}
-
-	optlen = pcapng_optlen(sizeof(flags));
-	optlen += pcapng_optlen(sizeof(queue));
 	if (rss_hash)
 		optlen += pcapng_optlen(sizeof(uint8_t) + sizeof(uint32_t));
 
@@ -540,10 +514,10 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		optlen += pcapng_optlen(strlen(comment));
 
 	/* reserve trailing options and block length */
-	opt = (struct pcapng_option *)
-		rte_pktmbuf_append(mc, optlen + sizeof(uint32_t));
+	struct pcapng_option *opt = (struct pcapng_option *)
+		rte_pktmbuf_append(m, optlen + sizeof(uint32_t));
 	if (unlikely(opt == NULL))
-		goto fail;
+		return -1;
 
 	switch (direction) {
 	case RTE_PCAPNG_DIRECTION_IN:
@@ -556,24 +530,20 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		flags = 0;
 	}
 
-	opt = pcapng_add_option(opt, PCAPNG_EPB_FLAGS,
-				&flags, sizeof(flags));
-
-	opt = pcapng_add_option(opt, PCAPNG_EPB_QUEUE,
-				&queue, sizeof(queue));
+	opt = pcapng_add_option(opt, PCAPNG_EPB_FLAGS, &flags, sizeof(flags));
+	opt = pcapng_add_option(opt, PCAPNG_EPB_QUEUE, &queue, sizeof(queue));
 
 	if (rss_hash) {
 		uint8_t hash_opt[5];
 
-		/* The algorithm could be something else if
-		 * using rte_flow_action_rss; but the current API does not
-		 * have a way for ethdev to report  this on a per-packet basis.
+		/* The algorithm could be something else but the current API does not
+		 * have a way for to record this on a per-packet basis
+		 * and the PCAPNG hash types don't match the DPDK types.
 		 */
 		hash_opt[0] = PCAPNG_HASH_TOEPLITZ;
 
-		memcpy(&hash_opt[1], &md->hash.rss, sizeof(uint32_t));
-		opt = pcapng_add_option(opt, PCAPNG_EPB_HASH,
-					&hash_opt, sizeof(hash_opt));
+		memcpy(&hash_opt[1], &m->hash.rss, sizeof(uint32_t));
+		opt = pcapng_add_option(opt, PCAPNG_EPB_HASH, &hash_opt, sizeof(hash_opt));
 	}
 
 	if (comment)
@@ -583,19 +553,14 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	/* Note: END_OPT necessary here. Wireshark doesn't do it. */
 
 	/* Add PCAPNG packet header */
-	epb = (struct pcapng_enhance_packet_block *)
-		rte_pktmbuf_prepend(mc, sizeof(*epb));
+	epb = (struct pcapng_enhance_packet_block *) rte_pktmbuf_prepend(m, sizeof(*epb));
 	if (unlikely(epb == NULL))
-		goto fail;
+		return -1;
 
 	epb->block_type = PCAPNG_ENHANCED_PACKET_BLOCK;
-	epb->block_length = rte_pktmbuf_pkt_len(mc);
-
-	/* Interface index is filled in later during write */
-	mc->port = port_id;
+	epb->block_length = rte_pktmbuf_pkt_len(m);
 
-	/* Put timestamp in cycles here - adjust in packet write */
-	timestamp = rte_get_tsc_cycles();
+	/* Put timestamp in cycles here - adjusted in packet write */
 	epb->timestamp_hi = timestamp >> 32;
 	epb->timestamp_lo = (uint32_t)timestamp;
 	epb->capture_length = pkt_len;
@@ -603,9 +568,56 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 
 	/* set trailer of block length */
 	*(uint32_t *)opt = epb->block_length;
+	return 0;
+}
+
+/* Make a copy of original mbuf with pcapng header and options */
+RTE_EXPORT_SYMBOL(rte_pcapng_copy)
+struct rte_mbuf *
+rte_pcapng_copy(uint16_t port_id, uint32_t queue,
+		const struct rte_mbuf *md,
+		struct rte_mempool *mp,
+		uint32_t length,
+		enum rte_pcapng_direction direction,
+		const char *comment)
+{
+	uint32_t orig_len = rte_pktmbuf_pkt_len(md);
+	struct rte_mbuf *mc;
 
-	return mc;
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
+#endif
+
+	/* Take snapshot of the data */
+	mc = rte_pktmbuf_copy(md, mp, 0, length);
+	if (unlikely(mc == NULL))
+		return NULL;
+
+	/* Expand any offloaded VLAN information */
+	if ((direction == RTE_PCAPNG_DIRECTION_IN &&
+	     (md->ol_flags & RTE_MBUF_F_RX_VLAN_STRIPPED)) ||
+	    (direction == RTE_PCAPNG_DIRECTION_OUT &&
+	     (md->ol_flags & RTE_MBUF_F_TX_VLAN))) {
+		if (pcapng_vlan_insert(mc, RTE_ETHER_TYPE_VLAN,
+				       md->vlan_tci) != 0)
+			goto fail;
+	}
+
+	if ((direction == RTE_PCAPNG_DIRECTION_IN &&
+	     (md->ol_flags & RTE_MBUF_F_RX_QINQ_STRIPPED)) ||
+	    (direction == RTE_PCAPNG_DIRECTION_OUT &&
+	     (md->ol_flags & RTE_MBUF_F_TX_QINQ))) {
+		if (pcapng_vlan_insert(mc, RTE_ETHER_TYPE_QINQ,
+				       md->vlan_tci_outer) != 0)
+			goto fail;
+	}
+
+	/* Interface index is filled in later during write */
+	mc->port = port_id;
 
+	if (likely(rte_pcapng_insert(mc, queue, direction, orig_len,
+				     rte_get_tsc_cycles(), comment) == 0))
+		return mc;
 fail:
 	rte_pktmbuf_free(mc);
 	return NULL;
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index 48f2b57564..4914ac9622 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -99,7 +99,7 @@ enum rte_pcapng_direction {
 };
 
 /**
- * Format an mbuf for writing to file.
+ * Make a copy of mbuf for writing to file.
  *
  * @param port_id
  *   The Ethernet port on which packet was received
@@ -117,7 +117,7 @@ enum rte_pcapng_direction {
  * @param direction
  *   The direction of the packer: receive, transmit or unknown.
  * @param comment
- *   Packet comment.
+ *   Packet comment (optional).
  *
  * @return
  *   - The pointer to the new mbuf formatted for pcapng_write
@@ -129,6 +129,29 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		uint32_t length,
 		enum rte_pcapng_direction direction, const char *comment);
 
+/**
+ * Format an mbuf for writing to file.
+ *
+ * @param m
+ *   The mbuf to modify.
+ * @param queue
+ *   The queue on the Ethernet port where packet was received
+ *   or is going to be transmitted.
+ * @param direction
+ *   The direction of the packer: receive, transmit or unknown.
+ * @param orig_len
+ *   The length of the original packet which maybe less than actual
+ *   packet if only a snapshot was captured.
+ * @param timestamp
+ *   The timestamp for packet in TSC cycles.
+ * @param comment
+ *   Packet comment (optional).
+ */
+__rte_experimental
+int
+rte_pcapng_insert(struct rte_mbuf *m, uint32_t queue,
+		  enum rte_pcapng_direction direction, uint32_t orig_len,
+		  uint64_t timestamp, const char *comment);
 
 /**
  * Determine optimum mbuf data size.
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 08/12] pcapng: make queue optional
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
                     ` (6 preceding siblings ...)
  2025-07-09 17:33   ` [RFC v2 07/12] pcapng: split packet copy from header insertion Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 09/12] test: add tests for ethdev mirror Stephen Hemminger
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan

The queue field is optional in pcapng received packet.
Use UINT16_MAX as flag value.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/pcapng/rte_pcapng.c | 7 ++++---
 lib/pcapng/rte_pcapng.h | 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 6db5d4da50..c27cb9b48e 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -500,8 +500,8 @@ rte_pcapng_insert(struct rte_mbuf *m, uint32_t queue,
 
 	uint16_t optlen = pcapng_optlen(sizeof(flags));
 
-	/* make queue optional? */
-	optlen += pcapng_optlen(sizeof(queue));
+	if (queue != UINT16_MAX)
+		optlen += pcapng_optlen(sizeof(queue));
 
 	/* does packet have valid RSS hash to include */
 	bool rss_hash = (direction == RTE_PCAPNG_DIRECTION_IN &&
@@ -531,7 +531,8 @@ rte_pcapng_insert(struct rte_mbuf *m, uint32_t queue,
 	}
 
 	opt = pcapng_add_option(opt, PCAPNG_EPB_FLAGS, &flags, sizeof(flags));
-	opt = pcapng_add_option(opt, PCAPNG_EPB_QUEUE, &queue, sizeof(queue));
+	if (queue != UINT16_MAX)
+		opt = pcapng_add_option(opt, PCAPNG_EPB_QUEUE, &queue, sizeof(queue));
 
 	if (rss_hash) {
 		uint8_t hash_opt[5];
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index 4914ac9622..48aaab365c 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -135,8 +135,8 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
  * @param m
  *   The mbuf to modify.
  * @param queue
- *   The queue on the Ethernet port where packet was received
- *   or is going to be transmitted.
+ *   The queue on the Ethernet port where packet was received or is going to be transmitted.
+ *   Optional: use UINT16_MAX if not specified.
  * @param direction
  *   The direction of the packer: receive, transmit or unknown.
  * @param orig_len
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 09/12] test: add tests for ethdev mirror
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
                     ` (7 preceding siblings ...)
  2025-07-09 17:33   ` [RFC v2 08/12] pcapng: make queue optional Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 10/12] app/testpmd: support for port mirroring Stephen Hemminger
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko

Simple API and packet mirroring standalone tests.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/test/meson.build          |   1 +
 app/test/test_ethdev_mirror.c | 285 ++++++++++++++++++++++++++++++++++
 2 files changed, 286 insertions(+)
 create mode 100644 app/test/test_ethdev_mirror.c

diff --git a/app/test/meson.build b/app/test/meson.build
index 7d38f51918..07bc00148d 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -73,6 +73,7 @@ source_file_deps = {
     'test_errno.c': [],
     'test_ethdev_api.c': ['ethdev'],
     'test_ethdev_link.c': ['ethdev'],
+    'test_ethdev_mirror.c': ['net_ring', 'ethdev', 'bus_vdev'],
     'test_event_crypto_adapter.c': ['cryptodev', 'eventdev', 'bus_vdev'],
     'test_event_dma_adapter.c': ['dmadev', 'eventdev', 'bus_vdev'],
     'test_event_eth_rx_adapter.c': ['ethdev', 'eventdev', 'bus_vdev'],
diff --git a/app/test/test_ethdev_mirror.c b/app/test/test_ethdev_mirror.c
new file mode 100644
index 0000000000..b5b89e977f
--- /dev/null
+++ b/app/test/test_ethdev_mirror.c
@@ -0,0 +1,285 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Stephen Hemminger <stephen@networkplumber.org>
+ */
+#include "test.h"
+
+#include <stdio.h>
+
+#include <rte_eth_ring.h>
+#include <rte_ethdev.h>
+#include <rte_bus_vdev.h>
+
+#define SOCKET0 0
+#define RING_SIZE 256
+#define BURST_SZ 64
+#define NB_MBUF 512
+
+static struct rte_mempool *mirror_pool;
+static const uint32_t pkt_len = 200;
+
+static struct rte_ring *rxtx_ring;
+static const char ring_dev[] = "net_ring0";
+static uint16_t ring_port;
+static const char null_dev[] = "net_null0";
+static uint16_t null_port;
+
+static int
+configure_port(uint16_t port, const char *name)
+{
+	struct rte_eth_conf eth_conf = { 0 };
+
+	if (rte_eth_dev_configure(port, 1, 1, &eth_conf) < 0) {
+		fprintf(stderr, "Configure failed for port %u: %s\n", port, name);
+		return -1;
+	}
+
+	/* only single queue */
+	if (rte_eth_tx_queue_setup(port, 0, RING_SIZE, SOCKET0, NULL) < 0) {
+		fprintf(stderr, "TX queue setup failed port %u: %s\n", port, name);
+		return -1;
+	}
+
+	if (rte_eth_rx_queue_setup(port, 0, RING_SIZE, SOCKET0, NULL, mirror_pool) < 0) {
+		fprintf(stderr, "RX queue setup failed port %u: %s\n", port, name);
+		return -1;
+	}
+
+	if (rte_eth_dev_start(port) < 0) {
+		fprintf(stderr, "Error starting port %u:%s\n", port, name);
+		return -1;
+	}
+
+	return 0;
+}
+
+static void
+test_cleanup(void)
+{
+	rte_ring_free(rxtx_ring);
+
+	rte_vdev_uninit("net_ring");
+	rte_vdev_uninit("net_null");
+
+	rte_mempool_free(mirror_pool);
+}
+
+/* Make two virtual devices ring and null */
+static int
+test_setup(void)
+{
+	char null_name[] = "net_null0";
+	int ret;
+
+	/* ring must support multiple enqueue for mirror to work */
+	rxtx_ring = rte_ring_create("R0", RING_SIZE, SOCKET0, RING_F_MP_RTS_ENQ | RING_F_SC_DEQ);
+	if (rxtx_ring == NULL) {
+		fprintf(stderr, "rte_ring_create R0 failed\n");
+		goto fail;
+	}
+	ret = rte_eth_from_rings(ring_dev, &rxtx_ring, 1, &rxtx_ring, 1, SOCKET0);
+	if (ret < 0) {
+		fprintf(stderr, "rte_eth_from_rings failed\n");
+		goto fail;
+	}
+	ring_port = ret;
+
+	/* Make a dummy null device to snoop on */
+	if (rte_vdev_init(null_dev, NULL) != 0) {
+		fprintf(stderr, "Failed to create vdev '%s'\n", null_dev);
+		goto fail;
+	}
+	if (rte_eth_dev_get_port_by_name(null_dev, &null_port) != 0) {
+		fprintf(stderr, "cannot find added vdev %s\n", null_name);
+		goto fail;
+	}
+
+	mirror_pool = rte_pktmbuf_pool_create("mirror_pool", NB_MBUF, 32,
+					      0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
+	if (mirror_pool == NULL) {
+		fprintf(stderr, "rte_pktmbuf_pool_create failed\n");
+		goto fail;
+	}
+
+	ret = configure_port(ring_port, ring_dev);
+	if (ret < 0)
+		goto fail;
+
+	ret = configure_port(null_port, null_dev);
+	if (ret < 0)
+		goto fail;
+
+	return 0;
+fail:
+	test_cleanup();
+	return -1;
+}
+
+/* Make sure mirror API checks args */
+static int32_t
+ethdev_mirror_api(void)
+{
+	struct rte_eth_mirror_conf conf = {
+		.mp = mirror_pool,
+		.direction = RTE_ETH_MIRROR_DIRECTION_EGRESS,
+	};
+	uint16_t invalid_port = RTE_MAX_ETHPORTS;
+	int ret;
+
+	conf.target = null_port;
+	ret = rte_eth_add_mirror(invalid_port, &conf);
+	TEST_ASSERT(ret != 0, "Created mirror from invalid port");
+
+	conf.target = invalid_port;
+	ret = rte_eth_add_mirror(null_port, &conf);
+	TEST_ASSERT(ret != 0, "Created mirror to invalid port");
+
+	conf.direction = 0;
+	conf.target = ring_port;
+	ret = rte_eth_add_mirror(null_port, &conf);
+	TEST_ASSERT(ret != 0, "Created mirror with invalid direction");
+
+	conf.direction = RTE_ETH_MIRROR_DIRECTION_INGRESS;
+	conf.target = ring_port;
+	ret = rte_eth_add_mirror(ring_port, &conf);
+	TEST_ASSERT(ret != 0, "Created mirror to self");
+
+	conf.target = null_port;
+	ret = rte_eth_add_mirror(ring_port, &conf);
+	TEST_ASSERT(ret == 0, "Could not create mirror from ring to null");
+
+	ret = rte_eth_add_mirror(ring_port, &conf);
+	TEST_ASSERT(ret != 0, "Able to create duplicate mirror");
+
+	ret = rte_eth_remove_mirror(ring_port, null_port);
+	TEST_ASSERT(ret == 0, "Unable to delete mirror");
+
+	ret = rte_eth_remove_mirror(ring_port, null_port);
+	TEST_ASSERT(ret != 0, "Able to delete port without mirror");
+
+	return 0;
+}
+
+static int
+init_mbuf(struct rte_mbuf *m, uint16_t id)
+{
+	struct {
+		struct rte_ether_hdr eth;
+		struct rte_ipv4_hdr ip;
+		struct rte_udp_hdr udp;
+	} hdrs = {
+		.eth = {
+			.dst_addr.addr_bytes = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+			.ether_type = rte_cpu_to_be_16(RTE_ETHER_TYPE_IPV4),
+		},
+		.ip = {
+			.version_ihl = RTE_IPV4_VHL_DEF,
+			.time_to_live = 1,
+			.next_proto_id = IPPROTO_UDP,
+			.src_addr = rte_cpu_to_be_32(RTE_IPV4_LOOPBACK),
+			.dst_addr = rte_cpu_to_be_32(RTE_IPV4_BROADCAST),
+		},
+		.udp = {
+			.dst_port = rte_cpu_to_be_16(9), /* Discard port */
+		},
+	};
+
+	rte_eth_random_addr(hdrs.eth.src_addr.addr_bytes);
+	uint16_t plen = pkt_len - sizeof(struct rte_ether_hdr);
+
+	hdrs.ip.packet_id = rte_cpu_to_be_16(id);
+	hdrs.ip.total_length = rte_cpu_to_be_16(plen);
+	hdrs.ip.hdr_checksum = rte_ipv4_cksum(&hdrs.ip);
+
+	plen -= sizeof(struct rte_ipv4_hdr);
+	hdrs.udp.src_port = rte_rand();
+	hdrs.udp.dgram_len = rte_cpu_to_be_16(plen);
+
+	void *data = rte_pktmbuf_append(m, pkt_len);
+	TEST_ASSERT(data != NULL, "could not add header");
+
+	memcpy(data, &hdrs, sizeof(hdrs));
+	return 0;
+}
+
+static int
+init_burst(struct rte_mbuf *pkts[], unsigned int n)
+{
+	for (unsigned int i = 0; i < n; i++) {
+		if (init_mbuf(pkts[i], i) < 0)
+			return -1;
+	}
+	return 0;
+}
+
+static int
+validate_burst(struct rte_mbuf *pkts[], unsigned int n)
+{
+	for (unsigned int i = 0; i < n; i++) {
+		struct rte_mbuf *m = pkts[i];
+
+		rte_mbuf_sanity_check(m, 1);
+		TEST_ASSERT(m->pkt_len != pkt_len,
+			    "mirror packet not right len");
+
+		const struct rte_ether_hdr *eh
+			= rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
+		TEST_ASSERT(rte_is_broadcast_ether_addr(&eh->dst_addr),
+			    "mirrored packet is not broadcast");
+
+	}
+	return 0;
+}
+
+static int32_t
+ethdev_mirror_packets(void)
+{
+	struct rte_eth_mirror_conf conf = {
+		.mp = mirror_pool,
+		.target = ring_port,
+		.direction = RTE_ETH_MIRROR_DIRECTION_EGRESS,
+	};
+	struct rte_mbuf *tx_pkts[BURST_SZ], *rx_pkts[BURST_SZ];
+	uint16_t nb_tx, nb_rx;
+	int ret;
+
+	ret = rte_pktmbuf_alloc_bulk(mirror_pool, tx_pkts, BURST_SZ);
+	TEST_ASSERT(ret == 0, "Could not allocate mbufs");
+
+	ret = init_burst(tx_pkts, BURST_SZ);
+	TEST_ASSERT(ret == 0, "Init mbufs failed");
+
+	ret = rte_eth_add_mirror(null_port, &conf);
+	TEST_ASSERT(ret == 0, "Could not create mirror from ring to null");
+
+	nb_tx = rte_eth_tx_burst(null_port, 0, tx_pkts, BURST_SZ);
+	TEST_ASSERT(nb_tx == BURST_SZ, "Only sent %u burst to null (vs %u)",
+		    nb_tx, BURST_SZ);
+
+	nb_rx = rte_eth_rx_burst(ring_port, 0, rx_pkts, BURST_SZ);
+	TEST_ASSERT(nb_rx == BURST_SZ, "Only received %u of %u packets",
+		    nb_rx, BURST_SZ);
+
+	validate_burst(rx_pkts, nb_rx);
+	rte_pktmbuf_free_bulk(rx_pkts, nb_rx);
+
+	return 0;
+}
+
+static struct unit_test_suite ethdev_mirror_suite = {
+	.suite_name = "port mirroring",
+	.setup = test_setup,
+	.teardown = test_cleanup,
+	.unit_test_cases = {
+		TEST_CASE(ethdev_mirror_api),
+		TEST_CASE(ethdev_mirror_packets),
+		TEST_CASES_END()
+	}
+};
+
+static int
+test_ethdev_mirror(void)
+{
+	return unit_test_suite_runner(&ethdev_mirror_suite);
+}
+
+REGISTER_FAST_TEST(ethdev_mirror, true, true, test_ethdev_mirror);
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 10/12] app/testpmd: support for port mirroring
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
                     ` (8 preceding siblings ...)
  2025-07-09 17:33   ` [RFC v2 09/12] test: add tests for ethdev mirror Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 11/12] app/dumpcap: use port mirror instead of pdump Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 12/12] pdump: mark API and application as deprecated Stephen Hemminger
  11 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Aman Singh

Add new commands to enable testing of port mirror functionality.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/test-pmd/cmdline.c                      |   1 +
 app/test-pmd/cmdline_mirror.c               | 129 ++++++++++++++++++++
 app/test-pmd/cmdline_mirror.h               |  12 ++
 app/test-pmd/meson.build                    |   1 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  15 +++
 5 files changed, 158 insertions(+)
 create mode 100644 app/test-pmd/cmdline_mirror.c
 create mode 100644 app/test-pmd/cmdline_mirror.h

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 7b4e27eddf..bc4b8b3b72 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -66,6 +66,7 @@
 #include "testpmd.h"
 #include "cmdline_cman.h"
 #include "cmdline_mtr.h"
+#include "cmdline_mirror.h"
 #include "cmdline_tm.h"
 #include "bpf_cmd.h"
 
diff --git a/app/test-pmd/cmdline_mirror.c b/app/test-pmd/cmdline_mirror.c
new file mode 100644
index 0000000000..2c645ffafe
--- /dev/null
+++ b/app/test-pmd/cmdline_mirror.c
@@ -0,0 +1,129 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Stephen Hemminger <stephen@networkplumber.org>
+ */
+
+#include <stdlib.h>
+
+#include <cmdline_parse.h>
+#include <cmdline_parse_num.h>
+#include <cmdline_parse_string.h>
+
+#include <rte_ethdev.h>
+
+#include "testpmd.h"
+#include "cmdline_mirror.h"
+
+/* *** Create MIRROR port Object *** */
+struct cmd_create_port_mirror_result {
+	cmdline_fixed_string_t create;
+	cmdline_fixed_string_t port;
+	cmdline_fixed_string_t mirror;
+	uint16_t port_id;
+	cmdline_fixed_string_t destination;
+	uint16_t target_id;
+};
+
+static cmdline_parse_token_string_t cmd_create_port_mirror_create =
+	TOKEN_STRING_INITIALIZER(struct cmd_create_port_mirror_result, create, "create");
+static cmdline_parse_token_string_t cmd_create_port_mirror_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_create_port_mirror_result, port, "port");
+static cmdline_parse_token_string_t cmd_create_port_mirror_mirror =
+	TOKEN_STRING_INITIALIZER(struct cmd_create_port_mirror_result, mirror, "mirror");
+static cmdline_parse_token_string_t cmd_create_port_mirror_destination =
+	TOKEN_STRING_INITIALIZER(struct cmd_create_port_mirror_result, destination, "destination");
+static cmdline_parse_token_num_t cmd_create_port_mirror_port_id =
+	TOKEN_NUM_INITIALIZER(struct cmd_create_port_mirror_result, port_id, RTE_UINT16);
+static cmdline_parse_token_num_t cmd_create_port_mirror_target_id =
+	TOKEN_NUM_INITIALIZER(struct cmd_create_port_mirror_result, target_id, RTE_UINT16);
+
+static void cmd_create_port_mirror_parsed(void *parsed_result,
+	__rte_unused struct cmdline *cl,
+	__rte_unused void *data)
+{
+	const struct cmd_create_port_mirror_result *res = parsed_result;
+	/* TODO these should be set by command */
+	struct rte_eth_mirror_conf mirror_conf = {
+		.direction = RTE_ETH_MIRROR_DIRECTION_INGRESS | RTE_ETH_MIRROR_DIRECTION_EGRESS
+	};
+	int ret;
+
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
+		return;
+
+	if (port_id_is_invalid(res->target_id, ENABLED_WARN))
+		return;
+
+	mirror_conf.target = res->target_id;
+
+	ret = rte_eth_add_mirror(res->port_id,  &mirror_conf);
+	if (ret != 0)
+		fprintf(stderr, "%s\n", rte_strerror(-ret));
+}
+
+cmdline_parse_inst_t cmd_create_port_mirror = {
+	.f = cmd_create_port_mirror_parsed,
+	.data = NULL,
+	.help_str = "create port mirror <port_id> destination <port_id>",
+	.tokens = {
+		(void *)&cmd_create_port_mirror_create,
+		(void *)&cmd_create_port_mirror_port,
+		(void *)&cmd_create_port_mirror_mirror,
+		(void *)&cmd_create_port_mirror_port_id,
+		(void *)&cmd_create_port_mirror_destination,
+		(void *)&cmd_create_port_mirror_target_id,
+		NULL
+	},
+};
+
+/* *** Delete Port Mirror Object *** */
+struct cmd_delete_port_mirror_result {
+	cmdline_fixed_string_t delete;
+	cmdline_fixed_string_t port;
+	cmdline_fixed_string_t mirror;
+	uint16_t port_id;
+	cmdline_fixed_string_t destination;
+	uint16_t target_id;
+};
+
+static void cmd_delete_port_mirror_parsed(void *parsed_result,
+	__rte_unused struct cmdline *cl,
+	__rte_unused void *data)
+{
+	const struct cmd_delete_port_mirror_result *res = parsed_result;
+	int ret;
+
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
+		return;
+
+	ret = rte_eth_remove_mirror(res->port_id, res->target_id);
+	if (ret != 0)
+		fprintf(stderr, "%s\n", rte_strerror(-ret));
+}
+
+static cmdline_parse_token_string_t cmd_delete_port_mirror_delete =
+	TOKEN_STRING_INITIALIZER(struct cmd_delete_port_mirror_result, delete, "delete");
+static cmdline_parse_token_string_t cmd_delete_port_mirror_port =
+	TOKEN_STRING_INITIALIZER(struct cmd_delete_port_mirror_result, port, "port");
+static cmdline_parse_token_string_t cmd_delete_port_mirror_mirror =
+	TOKEN_STRING_INITIALIZER(struct cmd_delete_port_mirror_result, mirror, "mirror");
+static cmdline_parse_token_num_t cmd_delete_port_mirror_port_id =
+	TOKEN_NUM_INITIALIZER(struct cmd_delete_port_mirror_result, port_id, RTE_UINT16);
+static cmdline_parse_token_string_t cmd_delete_port_mirror_destination =
+	TOKEN_STRING_INITIALIZER(struct cmd_create_port_mirror_result, destination, "destination");
+static cmdline_parse_token_num_t cmd_delete_port_mirror_target_id =
+	TOKEN_NUM_INITIALIZER(struct cmd_delete_port_mirror_result, target_id, RTE_UINT16);
+
+cmdline_parse_inst_t cmd_delete_port_mirror = {
+	.f = cmd_delete_port_mirror_parsed,
+	.data = NULL,
+	.help_str = "delete port mirror <port_id> destination <port_id>",
+	.tokens = {
+		(void *)&cmd_delete_port_mirror_delete,
+		(void *)&cmd_delete_port_mirror_port,
+		(void *)&cmd_delete_port_mirror_mirror,
+		(void *)&cmd_delete_port_mirror_port_id,
+		(void *)&cmd_delete_port_mirror_destination,
+		(void *)&cmd_delete_port_mirror_target_id,
+		NULL
+	},
+};
diff --git a/app/test-pmd/cmdline_mirror.h b/app/test-pmd/cmdline_mirror.h
new file mode 100644
index 0000000000..23ebe1ed75
--- /dev/null
+++ b/app/test-pmd/cmdline_mirror.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2025 Stephen Hemminger <stephen@networkplumber.org>
+ */
+
+#ifndef _CMDLINE_MIRROR_H_
+#define _CMDLINE_MIRROR_H_
+
+/* Port MIRROR commands */
+extern cmdline_parse_inst_t cmd_create_port_mirror;
+extern cmdline_parse_inst_t cmd_disable_port_mirror;
+
+#endif /* _CMDLINE_MIRROR_H_ */
diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 000548c261..e67665028f 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -8,6 +8,7 @@ sources = files(
         'cmdline.c',
         'cmdline_cman.c',
         'cmdline_flow.c',
+        'cmdline_mirror.c',
         'cmdline_mtr.c',
         'cmdline_tm.c',
         'cmd_flex_item.c',
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 6ad83ae50d..603468a70d 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -2570,6 +2570,21 @@ where:
 * ``clear``: Flag that indicates whether the statistics counters should
   be cleared (i.e. set to zero) immediately after they have been read or not.
 
+create mirror
+-------------
+
+Create a new MIRROR port for an ethernet device::
+
+   testpmd> create port mirror (port_id) destination (port_id)
+
+delete mirror
+-------------
+
+Delete an existing MIRROR port on ethernet device::
+
+   testpmd> delete port mirror (port_id)
+
+
 Traffic Management
 ------------------
 
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 11/12] app/dumpcap: use port mirror instead of pdump
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
                     ` (9 preceding siblings ...)
  2025-07-09 17:33   ` [RFC v2 10/12] app/testpmd: support for port mirroring Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  2025-07-09 17:33   ` [RFC v2 12/12] pdump: mark API and application as deprecated Stephen Hemminger
  11 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan

Use the new port mirror API instead of pdump.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/dumpcap/main.c      | 443 +++++++++++++++++++++++++++++++++-------
 app/dumpcap/meson.build |   2 +-
 2 files changed, 373 insertions(+), 72 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 3d3c0dbc66..700b1d46da 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -36,8 +36,6 @@
 #include <rte_mbuf.h>
 #include <rte_mempool.h>
 #include <rte_pcapng.h>
-#include <rte_pdump.h>
-#include <rte_ring.h>
 #include <rte_string_fns.h>
 #include <rte_thread.h>
 #include <rte_time.h>
@@ -92,16 +90,28 @@ struct capture_options {
 struct interface {
 	TAILQ_ENTRY(interface) next;
 	uint16_t port;
+	struct rte_bpf *filter;
+	uint64_t filteraccept;
 	struct capture_options opts;
-	struct rte_bpf_prm *bpf_prm;
 	char name[RTE_ETH_NAME_MAX_LEN];
 
+	struct rte_eth_stats start_stats;
 	const char *ifname;
 	const char *ifdescr;
 };
 
+static int timestamp_dynfield = -1;
+static uint64_t timestamp_dynflag;
+static int mirror_origin_dynfield = -1;
+static uint64_t mirror_origin_dynflag;
+static uint64_t mirror_ingress_dynflag;
+static uint64_t mirror_egress_dynflag;
+
+static bool filtering;
+
 TAILQ_HEAD(interface_list, interface);
 static struct interface_list interfaces = TAILQ_HEAD_INITIALIZER(interfaces);
+static struct interface *port2intf[RTE_MAX_ETHPORTS];
 
 /* Can do either pcap or pcapng format output */
 typedef union {
@@ -239,14 +249,16 @@ static void find_interfaces(void)
 
 	TAILQ_FOREACH(intf, &interfaces, next) {
 		/* if name is valid then just record port */
-		if (rte_eth_dev_get_port_by_name(intf->name, &intf->port) == 0)
-			continue;
+		if (rte_eth_dev_get_port_by_name(intf->name, &intf->port) != 0) {
+			/* maybe got passed port number string as name */
+			intf->port = get_uint(intf->name, "port_number", UINT16_MAX);
+			if (rte_eth_dev_get_name_by_port(intf->port, intf->name) < 0)
+				rte_exit(EXIT_FAILURE, "Invalid port number %u\n",
+					 intf->port);
+		}
 
-		/* maybe got passed port number string as name */
-		intf->port = get_uint(intf->name, "port_number", UINT16_MAX);
-		if (rte_eth_dev_get_name_by_port(intf->port, intf->name) < 0)
-			rte_exit(EXIT_FAILURE, "Invalid port number %u\n",
-				 intf->port);
+		if (rte_eth_stats_get(intf->port, &intf->start_stats) < 0)
+			rte_exit(EXIT_FAILURE, "Could not read stats for port %u\n", intf->port);
 	}
 }
 
@@ -266,6 +278,9 @@ static void set_default_interface(void)
 
 		intf = add_interface(name);
 		intf->port = p;
+
+		if (rte_eth_stats_get(intf->port, &intf->start_stats) < 0)
+			rte_exit(EXIT_FAILURE, "Could not read stats for default port %u\n", intf->port);
 		return;
 	}
 	rte_exit(EXIT_FAILURE, "No usable interfaces found\n");
@@ -295,10 +310,17 @@ static void compile_filters(void)
 		struct bpf_program bf;
 		pcap_t *pcap;
 
+		/* cache for filter packets */
+		port2intf[intf->port] = intf;
+
+		if (intf->opts.filter == NULL)
+			continue;
+
 		pcap = pcap_open_dead(DLT_EN10MB, intf->opts.snap_len);
 		if (!pcap)
 			rte_exit(EXIT_FAILURE, "can not open pcap\n");
 
+
 		if (pcap_compile(pcap, &bf, intf->opts.filter,
 				 1, PCAP_NETMASK_UNKNOWN) != 0) {
 			fprintf(stderr,
@@ -316,7 +338,7 @@ static void compile_filters(void)
 				 rte_strerror(rte_errno), rte_errno);
 
 		if (dump_bpf) {
-			printf("cBPF program (%u insns)\n", bf.bf_len);
+			printf("cops program (%u insns)\n", bf.bf_len);
 			bpf_dump(&bf, 1);
 			printf("\neBPF program (%u insns)\n",
 			       bpf_prm->nb_ins);
@@ -324,11 +346,18 @@ static void compile_filters(void)
 			exit(0);
 		}
 
-		intf->bpf_prm = bpf_prm;
+
+		intf->filter = rte_bpf_load(bpf_prm);
+		if (intf->filter == NULL)
+			rte_exit(EXIT_FAILURE,
+				 "BPF load failed '%s'\n%s(%d)\n",
+				 intf->name, rte_strerror(rte_errno), rte_errno);
 
 		/* Don't care about original program any more */
 		pcap_freecode(&bf);
 		pcap_close(pcap);
+
+		filtering = true;
 	}
 }
 
@@ -518,13 +547,12 @@ static void statistics_loop(void)
 }
 
 static void
-cleanup_pdump_resources(void)
+disable_mirror_ports(uint16_t mirror_port)
 {
 	struct interface *intf;
 
 	TAILQ_FOREACH(intf, &interfaces, next) {
-		rte_pdump_disable(intf->port,
-				  RTE_PDUMP_ALL_QUEUES, RTE_PDUMP_FLAG_RXTX);
+		rte_eth_remove_mirror(intf->port, mirror_port);
 		if (intf->opts.promisc_mode)
 			rte_eth_promiscuous_disable(intf->port);
 	}
@@ -552,7 +580,6 @@ enable_primary_monitor(void)
 {
 	int ret;
 
-	/* Once primary exits, so will pdump. */
 	ret = rte_eal_alarm_set(MONITOR_INTERVAL, monitor_primary, NULL);
 	if (ret < 0)
 		fprintf(stderr, "Fail to enable monitor:%d\n", ret);
@@ -571,23 +598,24 @@ disable_primary_monitor(void)
 static void
 report_packet_stats(dumpcap_out_t out)
 {
-	struct rte_pdump_stats pdump_stats;
 	struct interface *intf;
-	uint64_t ifrecv, ifdrop;
-	double percent;
 
 	fputc('\n', stderr);
+
 	TAILQ_FOREACH(intf, &interfaces, next) {
-		if (rte_pdump_stats(intf->port, &pdump_stats) < 0)
+		uint64_t ifrecv, ifdrop;
+		struct rte_eth_stats stats;
+		double percent;
+
+		if (rte_eth_stats_get(intf->port, &stats) < 0)
 			continue;
 
-		/* do what Wiretap does */
-		ifrecv = pdump_stats.accepted + pdump_stats.filtered;
-		ifdrop = pdump_stats.nombuf + pdump_stats.ringfull;
+		ifrecv = stats.ipackets - intf->start_stats.ipackets;
+		ifdrop = (stats.rx_nombuf - intf->start_stats.rx_nombuf)
+			+ (stats.imissed - intf->start_stats.imissed);
 
 		if (use_pcapng)
-			rte_pcapng_write_stats(out.pcapng, intf->port,
-					       ifrecv, ifdrop, NULL);
+			rte_pcapng_write_stats(out.pcapng, intf->port, ifrecv, ifdrop, NULL);
 
 		if (ifrecv == 0)
 			percent = 0;
@@ -668,12 +696,21 @@ static void dpdk_init(void)
 		rte_exit(EXIT_FAILURE, "Can not restore original CPU affinity\n");
 }
 
-/* Create packet ring shared between callbacks and process */
-static struct rte_ring *create_ring(void)
+/* Create ring port to redirect packet to.
+ * This could be much simpler if the ring PMD API (rte_eth_from_rings)
+ * worked from a secondary process, but it doesn't.
+ */
+static struct rte_ring *
+create_ring_dev(char *vdev_name, uint16_t *mirror_port)
 {
-	struct rte_ring *ring;
+	struct rte_eth_dev_owner owner = { 0 };
+	struct rte_eth_conf dev_conf = { 0 };
+	struct rte_ring *ring = NULL;
 	char ring_name[RTE_RING_NAMESIZE];
+	char vdev_args[128];
 	size_t size, log2;
+	uint16_t port;
+	int ret;
 
 	/* Find next power of 2 >= size. */
 	size = ring_size;
@@ -686,17 +723,60 @@ static struct rte_ring *create_ring(void)
 		ring_size = size;
 	}
 
-	/* Want one ring per invocation of program */
-	snprintf(ring_name, sizeof(ring_name),
-		 "dumpcap-%d", getpid());
+	ret = rte_eth_dev_owner_new(&owner.id);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "rte_eth_dev_owner_new failed: %s\n",
+			 strerror(-ret));
 
-	ring = rte_ring_create(ring_name, ring_size,
-			       rte_socket_id(), 0);
+	/* Give the vdev a unique name */
+	snprintf(ring_name, sizeof(ring_name), "dumpcap%"PRIu64, owner.id);
+	ring = rte_ring_create(ring_name, ring_size, rte_socket_id(),
+			       RING_F_MP_RTS_ENQ | RING_F_SC_DEQ);
 	if (ring == NULL)
 		rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
 			 rte_strerror(rte_errno));
 
+	strlcpy(owner.name, ring_name, sizeof(owner.name));
+
+	snprintf(vdev_name, RTE_DEV_NAME_MAX_LEN,
+		 "net_ring-dumpcap%"PRIu64, owner.id);
+	snprintf(vdev_args, sizeof(vdev_args),
+		 "ring=%s,timestamp", ring_name);
+
+	if (rte_eal_hotplug_add("vdev", vdev_name, vdev_args) < 0)
+		rte_exit(EXIT_FAILURE,
+			 "rte_eal_hotplug_add of %s failed:%s\n",
+			 vdev_name, rte_strerror(rte_errno));
+
+	ret = rte_eth_dev_get_port_by_name(vdev_name, &port);
+	if (ret != 0) {
+		fprintf(stderr, "Could not port for %s: %s\n",
+			vdev_name, strerror(-ret));
+		goto unplug;
+	}
+
+	ret = rte_eth_dev_owner_set(port, &owner);
+	if (ret != 0) {
+		fprintf(stderr, "Could not set owner for port for %u: %s\n",
+			port, strerror(-ret));
+		goto unplug;
+	}
+
+	ret = rte_eth_dev_configure(port, 1, 1, &dev_conf);
+	if (ret < 0) {
+		fprintf(stderr, "Could not configure port %u: %s\n",
+			port, strerror(-ret));
+		goto unplug;
+	}
+
+	*mirror_port = port;
 	return ring;
+
+unplug:
+	rte_eal_hotplug_remove("vdev", vdev_name);
+	rte_ring_free(ring);
+	rte_eal_cleanup();
+	exit(EXIT_FAILURE);
 }
 
 static struct rte_mempool *create_mempool(void)
@@ -796,7 +876,7 @@ static dumpcap_out_t create_output(void)
 					   version(), capture_comment);
 		if (ret.pcapng == NULL)
 			rte_exit(EXIT_FAILURE, "pcapng_fdopen failed: %s\n",
-				 strerror(rte_errno));
+				 rte_strerror(rte_errno));
 		free(os);
 
 		TAILQ_FOREACH(intf, &interfaces, next) {
@@ -823,37 +903,32 @@ static dumpcap_out_t create_output(void)
 	return ret;
 }
 
-static void enable_pdump(struct rte_ring *r, struct rte_mempool *mp)
+static int enable_mirror(uint16_t mirror_port, struct rte_mempool *mpool)
 {
 	struct interface *intf;
 	unsigned int count = 0;
-	uint32_t flags;
 	int ret;
 
-	flags = RTE_PDUMP_FLAG_RXTX;
-	if (use_pcapng)
-		flags |= RTE_PDUMP_FLAG_PCAPNG;
+	ret = rte_eth_dev_start(mirror_port);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Could not start mirror port %u: %s\n",
+			 mirror_port, strerror(-ret));
 
 	TAILQ_FOREACH(intf, &interfaces, next) {
-		ret = rte_pdump_enable_bpf(intf->port, RTE_PDUMP_ALL_QUEUES,
-					   flags, intf->opts.snap_len,
-					   r, mp, intf->bpf_prm);
+		struct rte_eth_mirror_conf conf = {
+			.mp = mpool,
+			.target = mirror_port,
+			.snaplen = intf->opts.snap_len,
+			.flags = RTE_ETH_MIRROR_ORIGIN_FLAG | RTE_ETH_MIRROR_TIMESTAMP_FLAG,
+			.direction = RTE_ETH_MIRROR_DIRECTION_INGRESS |
+				     RTE_ETH_MIRROR_DIRECTION_EGRESS,
+		};
+
+		ret = rte_eth_add_mirror(intf->port, &conf);
 		if (ret < 0) {
-			const struct interface *intf2;
-
-			/* unwind any previous enables */
-			TAILQ_FOREACH(intf2, &interfaces, next) {
-				if (intf == intf2)
-					break;
-				rte_pdump_disable(intf2->port,
-						  RTE_PDUMP_ALL_QUEUES, RTE_PDUMP_FLAG_RXTX);
-				if (intf2->opts.promisc_mode)
-					rte_eth_promiscuous_disable(intf2->port);
-			}
-			rte_exit(EXIT_FAILURE,
-				"Packet dump enable on %u:%s failed %s\n",
-				intf->port, intf->name,
-				rte_strerror(rte_errno));
+			fprintf(stderr, "Port mirror on %u:%s failed %s\n",
+				intf->port, intf->name, strerror(-ret));
+			return -1;
 		}
 
 		if (intf->opts.promisc_mode) {
@@ -868,6 +943,60 @@ static void enable_pdump(struct rte_ring *r, struct rte_mempool *mp)
 		}
 		++count;
 	}
+	return count;
+}
+
+static int setup_mbuf(void)
+{
+	int offset;
+
+	offset = rte_mbuf_dynfield_lookup(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+	if (offset < 0) {
+		fprintf(stderr, "Could not find timestamp dynamic field\n");
+		return -1;
+	}
+	timestamp_dynfield = offset;
+
+	offset = rte_mbuf_dynflag_lookup(RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+	if (offset < 0) {
+		fprintf(stderr, "Could not find timestamp flag\n");
+		return -1;
+	}
+	timestamp_dynflag = RTE_BIT64(offset);
+
+	offset = rte_mbuf_dynfield_lookup(RTE_MBUF_DYNFIELD_MIRROR_ORIGIN, NULL);
+	if (offset < 0) {
+		fprintf(stderr, "Could not find mirror origin dynamic field\n");
+		return -1;
+	}
+	mirror_origin_dynfield = offset;
+
+	offset = rte_mbuf_dynflag_lookup(RTE_MBUF_DYNFLAG_MIRROR_ORIGIN, NULL);
+	if (offset < 0) {
+		fprintf(stderr, "Could not find mirror origin dynamic flag\n");
+		return -1;
+	}
+	mirror_origin_dynflag = RTE_BIT64(offset);
+
+	offset = rte_mbuf_dynflag_lookup(RTE_MBUF_DYNFLAG_MIRROR_INGRESS, NULL);
+	if (offset < 0) {
+		fprintf(stderr, "Could not find mirror ingress flag\n");
+		return -1;
+	}
+	mirror_ingress_dynflag = RTE_BIT64(offset);
+
+	offset = rte_mbuf_dynflag_lookup(RTE_MBUF_DYNFLAG_MIRROR_EGRESS, NULL);
+	if (offset < 0) {
+		fprintf(stderr, "Could not find mirror egress flag\n");
+		return -1;
+	}
+	mirror_egress_dynflag = RTE_BIT64(offset);
+	return 0;
+}
+
+static void show_capturing(unsigned int count)
+{
+	struct interface *intf;
 
 	fputs("Capturing on ", stdout);
 	TAILQ_FOREACH(intf, &interfaces, next) {
@@ -898,6 +1027,47 @@ static void show_count(uint64_t count)
 	bt = fprintf(stderr, "%"PRIu64" ", count);
 }
 
+static ssize_t
+pcapng_write_packets(rte_pcapng_t *pcapng, struct rte_mbuf *pkts[], uint16_t n)
+{
+	struct rte_mbuf *towrite[BURST_SIZE];
+	unsigned int count = 0;
+
+	for (unsigned int i = 0; i < n; i++) {
+		struct rte_mbuf *m = pkts[i];
+		enum rte_pcapng_direction direction = RTE_PCAPNG_DIRECTION_UNKNOWN;
+
+		/* If no origin flag then why did we get this? */
+		if (unlikely(!(m->ol_flags & mirror_origin_dynflag))) {
+			fprintf(stderr, "Missing origin info in packet\n");
+			return -1;
+		}
+
+		const struct rte_mbuf_origin *origin
+			= RTE_MBUF_DYNFIELD(m, mirror_origin_dynfield, rte_mbuf_origin_t *);
+
+		if (m->ol_flags & mirror_ingress_dynflag)
+			direction = RTE_PCAPNG_DIRECTION_IN;
+		else if (m->ol_flags & mirror_egress_dynflag)
+			direction = RTE_PCAPNG_DIRECTION_OUT;
+
+		uint64_t timestamp;
+		if (m->ol_flags & timestamp_dynflag)
+			timestamp = *RTE_MBUF_DYNFIELD(m, timestamp_dynfield,
+						       rte_mbuf_timestamp_t *);
+		else
+			timestamp = rte_get_tsc_cycles();
+
+		if (rte_pcapng_insert(m, origin->queue_id, direction, origin->original_len,
+				      timestamp, NULL) < 0)
+			continue;  /* skip no headroom? */
+
+		towrite[count++] = m;
+	}
+
+	return rte_pcapng_write_packets(pcapng, towrite, count);
+}
+
 /* Write multiple packets in older pcap format */
 static ssize_t
 pcap_write_packets(pcap_dumper_t *dumper,
@@ -930,16 +1100,127 @@ pcap_write_packets(pcap_dumper_t *dumper,
 	return total;
 }
 
+/* Do packet filter on incoming packets.
+ * Some issues to note here:
+ *  - would be better to do before putting into ring
+ *    but that causes ethdev to have dependency on BPF.
+ *  - the filter is per-interface and there maybe more than one interface
+ *    combined into the ring. That means filter is executed once
+ *    per packet (not as burst).
+ */
+static unsigned int
+filter_packets(struct rte_mbuf *pkts[], unsigned int n)
+{
+	uint64_t results[BURST_SIZE];
+	uint32_t matches = 0;
+
+	for (unsigned int i = 0; i < n; i++) {
+		struct rte_mbuf *m = pkts[i];
+
+		/* all packets should have origin info, if not just skip */
+		if (unlikely(!(m->ol_flags & mirror_origin_dynflag)))
+			continue;
+
+		const struct rte_mbuf_origin *origin
+			= RTE_MBUF_DYNFIELD(m, mirror_origin_dynfield, rte_mbuf_origin_t *);
+		uint16_t port_id = origin->port_id;
+
+		if (unlikely(port_id >= RTE_MAX_ETHPORTS))
+			continue;
+
+		const struct interface *intf = port2intf[port_id];
+		if (intf == NULL || intf->filter == NULL)
+			continue;
+
+		if (rte_bpf_exec_burst(intf->filter, (void **)&m, &results[i], 1) == 1)
+			++matches;
+	}
+
+	if (matches == n)
+		return n;	/* no packets skipped */
+
+	/* need to shuffle out the skipped packets */
+	unsigned int count = 0;
+	struct rte_mbuf *towrite[BURST_SIZE];
+	for (unsigned int i = 0; i < n; i++) {
+		if (results[i])
+			towrite[count++] = pkts[i];
+		else
+			rte_pktmbuf_free(pkts[i]);
+	}
+	memcpy(pkts, towrite, count * sizeof(struct rte_mbuf *));
+	return count;
+}
+
+/* Make sure configuration of mbuf pool has enough headroom for both vlans */
+static_assert(RTE_PKTMBUF_HEADROOM > 2 * sizeof(struct rte_vlan_hdr),
+	"not enough mbuf headroom for vlan insertion");
+
+/*
+ * More general version of rte_vlan_insert()
+ * Note: mbufs are allocated by rte_eth_mirror_burst() from the
+ * pool that was passed when setting up mirror.
+ */
+static void
+vlan_insert(struct rte_mbuf *m, uint16_t ether_type, uint16_t tci)
+{
+	struct rte_ether_hdr *nh, tmph;
+	const struct rte_ether_hdr *oh;
+	struct rte_vlan_hdr *vh;
+
+	RTE_ASSERT(!RTE_MBUF_DIRECT(m));
+	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);
+
+	oh = rte_pktmbuf_read(m, 0, sizeof(*oh), &tmph);
+	RTE_ASSERT(oh != NULL);
+
+	/* overlay new header */
+	nh = (struct rte_ether_hdr *)
+		rte_pktmbuf_prepend(m, sizeof(struct rte_vlan_hdr));
+
+	RTE_ASSERT(nh != NULL);
+
+	memmove(nh, oh, 2 * RTE_ETHER_ADDR_LEN);
+	nh->ether_type = rte_cpu_to_be_16(ether_type);
+
+	vh = (struct rte_vlan_hdr *) (nh + 1);
+	vh->vlan_tci = rte_cpu_to_be_16(tci);
+}
+
+/* Filtering and pcap file format require that VLAN not be offloaded */
+static void
+remove_vlan_offload(struct rte_mbuf *pkts[], unsigned int n)
+{
+	unsigned int i;
+
+	for (i = 0; i < n; i++) {
+		struct rte_mbuf *m = pkts[i];
+
+		if (m->ol_flags & mirror_ingress_dynflag) {
+			if (m->ol_flags & RTE_MBUF_F_RX_VLAN_STRIPPED)
+				vlan_insert(m, RTE_ETHER_TYPE_VLAN, m->vlan_tci);
+			if (m->ol_flags & RTE_MBUF_F_RX_QINQ_STRIPPED)
+				vlan_insert(m, RTE_ETHER_TYPE_QINQ, m->vlan_tci_outer);
+		}
+		if (m->ol_flags & mirror_egress_dynflag) {
+			if (m->ol_flags & RTE_MBUF_F_TX_VLAN)
+				vlan_insert(m, RTE_ETHER_TYPE_VLAN, m->vlan_tci);
+			if (m->ol_flags & RTE_MBUF_F_TX_QINQ)
+				vlan_insert(m, RTE_ETHER_TYPE_QINQ, m->vlan_tci_outer);
+
+		}
+	}
+}
+
 /* Process all packets in ring and dump to capture file */
-static int process_ring(dumpcap_out_t out, struct rte_ring *r)
+static int process_ring(dumpcap_out_t out, struct rte_ring *ring)
 {
 	struct rte_mbuf *pkts[BURST_SIZE];
-	unsigned int avail, n;
+	unsigned int n, avail;
 	static unsigned int empty_count;
 	ssize_t written;
 
-	n = rte_ring_sc_dequeue_burst(r, (void **) pkts, BURST_SIZE,
-				      &avail);
+	n = rte_ring_sc_dequeue_burst(ring, (void **)pkts, BURST_SIZE, &avail);
 	if (n == 0) {
 		/* don't consume endless amounts of cpu if idle */
 		if (empty_count < SLEEP_THRESHOLD)
@@ -948,11 +1229,18 @@ static int process_ring(dumpcap_out_t out, struct rte_ring *r)
 			usleep(10);
 		return 0;
 	}
-
 	empty_count = (avail == 0);
 
+	remove_vlan_offload(pkts, n);
+
+	if (filtering) {
+		n = filter_packets(pkts, n);
+		if (n == 0)
+			return 0;
+	}
+
 	if (use_pcapng)
-		written = rte_pcapng_write_packets(out.pcapng, pkts, n);
+		written = pcapng_write_packets(out.pcapng, pkts, n);
 	else
 		written = pcap_write_packets(out.dumper, pkts, n);
 
@@ -971,15 +1259,18 @@ static int process_ring(dumpcap_out_t out, struct rte_ring *r)
 
 int main(int argc, char **argv)
 {
-	struct rte_ring *r;
 	struct rte_mempool *mp;
 	struct sigaction action = {
 		.sa_flags = SA_RESTART,
 		.sa_handler = signal_handler,
 	};
 	struct sigaction origaction;
+	struct rte_ring *ring = NULL;
+	char vdev_name[RTE_DEV_NAME_MAX_LEN];
+	uint16_t mirror_port = UINT16_MAX;
 	dumpcap_out_t out;
 	char *p;
+	int ret;
 
 	p = strrchr(argv[0], '/');
 	if (p == NULL)
@@ -1018,12 +1309,21 @@ int main(int argc, char **argv)
 		exit(0);
 	}
 
-	r = create_ring();
 	mp = create_mempool();
+	ring = create_ring_dev(vdev_name, &mirror_port);
 	out = create_output();
 
+	ret = enable_mirror(mirror_port, mp);
+	if (ret < 0)
+		goto cleanup;
+
+	ret = setup_mbuf();
+	if (ret < 0)
+		goto cleanup;
+
+	show_capturing(ret);
+
 	start_time = time(NULL);
-	enable_pdump(r, mp);
 
 	if (!quiet) {
 		fprintf(stderr, "Packets captured: ");
@@ -1031,7 +1331,7 @@ int main(int argc, char **argv)
 	}
 
 	while (!rte_atomic_load_explicit(&quit_signal, rte_memory_order_relaxed)) {
-		if (process_ring(out, r) < 0) {
+		if (process_ring(out, ring) < 0) {
 			fprintf(stderr, "pcapng file write failed; %s\n",
 				strerror(errno));
 			break;
@@ -1048,19 +1348,20 @@ int main(int argc, char **argv)
 			break;
 	}
 
-	disable_primary_monitor();
-
 	if (rte_eal_primary_proc_alive(NULL))
 		report_packet_stats(out);
 
+cleanup:
 	if (use_pcapng)
 		rte_pcapng_close(out.pcapng);
 	else
 		pcap_dump_close(out.dumper);
 
-	cleanup_pdump_resources();
+	disable_primary_monitor();
+	disable_mirror_ports(mirror_port);
 
-	rte_ring_free(r);
+	rte_eal_hotplug_remove("vdev", vdev_name);
+	rte_ring_free(ring);
 	rte_mempool_free(mp);
 
 	return rte_eal_cleanup() ? EXIT_FAILURE : 0;
diff --git a/app/dumpcap/meson.build b/app/dumpcap/meson.build
index 69c016c780..6212291d40 100644
--- a/app/dumpcap/meson.build
+++ b/app/dumpcap/meson.build
@@ -14,4 +14,4 @@ endif
 
 ext_deps += pcap_dep
 sources = files('main.c')
-deps += ['ethdev', 'pdump', 'pcapng', 'bpf']
+deps += ['net_ring', 'ethdev', 'pcapng', 'bpf']
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC v2 12/12] pdump: mark API and application as deprecated
  2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
                     ` (10 preceding siblings ...)
  2025-07-09 17:33   ` [RFC v2 11/12] app/dumpcap: use port mirror instead of pdump Stephen Hemminger
@ 2025-07-09 17:33   ` Stephen Hemminger
  11 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2025-07-09 17:33 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan, Aman Singh

When the pdump application is run print a warning.
Add release note about deprecation and mark the API's
as deprecated.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

pdump build deprecated
---
 app/pdump/main.c                     |  3 +++
 app/pdump/meson.build                |  1 +
 app/test-pmd/meson.build             |  1 +
 app/test/meson.build                 |  6 +++++-
 doc/guides/rel_notes/deprecation.rst |  3 +++
 lib/pdump/rte_pdump.h                | 12 +++++++++---
 6 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/app/pdump/main.c b/app/pdump/main.c
index fa85859703..a4e0f4e5ae 100644
--- a/app/pdump/main.c
+++ b/app/pdump/main.c
@@ -986,6 +986,9 @@ main(int argc, char **argv)
 	char mp_flag[] = "--proc-type=secondary";
 	char *argp[argc + 2];
 
+
+	fprintf(stderr, "%s: is deprecated use dpdk-dumpcap instead\n", argv[0]);
+
 	/* catch ctrl-c so we can cleanup on exit */
 	sigemptyset(&action.sa_mask);
 	sigaction(SIGTERM, &action, NULL);
diff --git a/app/pdump/meson.build b/app/pdump/meson.build
index a4c7dd44e7..1fe034fcb8 100644
--- a/app/pdump/meson.build
+++ b/app/pdump/meson.build
@@ -11,3 +11,4 @@ sources = files('main.c')
 deps += ['ethdev', 'kvargs', 'pdump']
 
 cflags += no_wvla_cflag
+cflags += '-Wno-deprecated-declarations'
diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index e67665028f..3ab8280ce7 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -60,6 +60,7 @@ if dpdk_conf.has('RTE_LIB_METRICS')
 endif
 if dpdk_conf.has('RTE_LIB_PDUMP')
     deps += 'pdump'
+    cflags += '-Wno-deprecated-declarations'
 endif
 if dpdk_conf.has('RTE_NET_BNXT')
     deps += 'net_bnxt'
diff --git a/app/test/meson.build b/app/test/meson.build
index 07bc00148d..45ece2179c 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -138,7 +138,6 @@ source_file_deps = {
     'test_net_ip6.c': ['net'],
     'test_pcapng.c': ['ethdev', 'net', 'pcapng', 'bus_vdev'],
     'test_pdcp.c': ['eventdev', 'pdcp', 'net', 'timer', 'security'],
-    'test_pdump.c': ['pdump'] + sample_packet_forward_deps,
     'test_per_lcore.c': [],
     'test_pflock.c': [],
     'test_pie.c': ['sched'],
@@ -217,6 +216,11 @@ source_file_ext_deps = {
     'test_pcapng.c': ['pcap'],
 }
 
+if dpdk_conf.has('RTE_LIB_PDUMP')
+    source_file_deps += { 'test_pdump.c': ['pdump'] + sample_packet_forward_deps, }
+    cflags += '-Wno-deprecated-declarations'
+endif
+
 def_lib = get_option('default_library')
 foreach f, f_deps : source_file_deps
     has_deps = true
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index e2d4125308..03a6b2712e 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -152,3 +152,6 @@ Deprecation Notices
 * bus/vmbus: Starting DPDK 25.11, all the vmbus API defined in
   ``drivers/bus/vmbus/rte_bus_vmbus.h`` will become internal to DPDK.
   Those API functions are used internally by DPDK core and netvsc PMD.
+
+* pdump: The pdump API and application have been deprecated and will
+  be removed in a later release.
diff --git a/lib/pdump/rte_pdump.h b/lib/pdump/rte_pdump.h
index 1e32d46097..bfea7c560d 100644
--- a/lib/pdump/rte_pdump.h
+++ b/lib/pdump/rte_pdump.h
@@ -39,6 +39,7 @@ enum {
  * @return
  *    0 on success, -1 on error
  */
+__rte_deprecated
 int
 rte_pdump_init(void);
 
@@ -50,6 +51,7 @@ rte_pdump_init(void);
  * @return
  *    0 on success, -1 on error
  */
+__rte_deprecated
 int
 rte_pdump_uninit(void);
 
@@ -75,7 +77,7 @@ rte_pdump_uninit(void);
  * @return
  *    0 on success, -1 on error, rte_errno is set accordingly.
  */
-
+__rte_deprecated
 int
 rte_pdump_enable(uint16_t port, uint16_t queue, uint32_t flags,
 		struct rte_ring *ring,
@@ -106,6 +108,7 @@ rte_pdump_enable(uint16_t port, uint16_t queue, uint32_t flags,
  * @return
  *    0 on success, -1 on error, rte_errno is set accordingly.
  */
+__rte_deprecated
 int
 rte_pdump_enable_bpf(uint16_t port_id, uint16_t queue,
 		     uint32_t flags, uint32_t snaplen,
@@ -129,7 +132,7 @@ rte_pdump_enable_bpf(uint16_t port_id, uint16_t queue,
  * @return
  *    0 on success, -1 on error, rte_errno is set accordingly.
  */
-
+__rte_deprecated
 int
 rte_pdump_disable(uint16_t port, uint16_t queue, uint32_t flags);
 
@@ -156,7 +159,7 @@ rte_pdump_disable(uint16_t port, uint16_t queue, uint32_t flags);
  * @return
  *    0 on success, -1 on error, rte_errno is set accordingly.
  */
-
+__rte_deprecated
 int
 rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue,
 				uint32_t flags,
@@ -189,6 +192,7 @@ rte_pdump_enable_by_deviceid(char *device_id, uint16_t queue,
  * @return
  *    0 on success, -1 on error, rte_errno is set accordingly.
  */
+__rte_deprecated
 int
 rte_pdump_enable_bpf_by_deviceid(const char *device_id, uint16_t queue,
 				 uint32_t flags, uint32_t snaplen,
@@ -215,6 +219,7 @@ rte_pdump_enable_bpf_by_deviceid(const char *device_id, uint16_t queue,
  * @return
  *    0 on success, -1 on error, rte_errno is set accordingly.
  */
+__rte_deprecated
 int
 rte_pdump_disable_by_deviceid(char *device_id, uint16_t queue,
 				uint32_t flags);
@@ -243,6 +248,7 @@ struct rte_pdump_stats {
  * @return
  *   Zero if successful. -1 on error and rte_errno is set.
  */
+__rte_deprecated
 int
 rte_pdump_stats(uint16_t port_id, struct rte_pdump_stats *stats);
 
-- 
2.47.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC v2 01/12] ethdev: allow start/stop from secondary process
  2025-07-09 17:33   ` [RFC v2 01/12] ethdev: allow start/stop from secondary process Stephen Hemminger
@ 2025-07-09 17:47     ` Khadem Ullah
  0 siblings, 0 replies; 14+ messages in thread
From: Khadem Ullah @ 2025-07-09 17:47 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Anatoly Burakov

[-- Attachment #1: Type: text/plain, Size: 12530 bytes --]

Does that means that the RSS of the secondary application can be also
applied to the primary application, or in case of multiple instances on the
same interface ?

Best,
Khadem

On Wed, Jul 9, 2025, 22:36 Stephen Hemminger <stephen@networkplumber.org>
wrote:

> Before this patch if secondary process called start/stop
> it would only impact the secondary process, the ethdev on the
> primary process was not started.
>
> With this patch, when start/stop is called from secondary,
> it calls the primary and does the operation there. The design
> is generic, and we can later add queue and other operations
> as needed.
>
> Bugzilla ID: 73
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  lib/ethdev/ethdev_driver.c  | 11 ++++-
>  lib/ethdev/ethdev_private.c | 93 +++++++++++++++++++++++++++++++++++++
>  lib/ethdev/ethdev_private.h | 26 +++++++++++
>  lib/ethdev/rte_ethdev.c     | 84 +++++++++++++++++++--------------
>  4 files changed, 177 insertions(+), 37 deletions(-)
>
> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
> index ec0c1e1176..66ba7dafc9 100644
> --- a/lib/ethdev/ethdev_driver.c
> +++ b/lib/ethdev/ethdev_driver.c
> @@ -114,6 +114,14 @@ rte_eth_dev_allocate(const char *name)
>                 goto unlock;
>         }
>
> +       if (eth_dev_shared_data->allocated_ports == 0 &&
> +           rte_mp_action_register(ETHDEV_MP, ethdev_server) &&
> +           rte_errno != ENOTSUP) {
> +               RTE_ETHDEV_LOG_LINE(ERR,
> +                        "Could not start %s service", ETHDEV_MP);
> +               goto unlock;
> +       }
> +
>         eth_dev = eth_dev_get(port_id);
>         eth_dev->flow_fp_ops = &rte_flow_fp_default_ops;
>         strlcpy(eth_dev->data->name, name, sizeof(eth_dev->data->name));
> @@ -282,7 +290,8 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
>                 memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
>                 eth_dev->data = NULL;
>
> -               eth_dev_shared_data->allocated_ports--;
> +               if (--eth_dev_shared_data->allocated_ports == 0)
> +                       rte_mp_action_unregister(ETHDEV_MP);
>                 eth_dev_shared_data_release();
>         }
>
> diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
> index 285d377d91..184cf33f99 100644
> --- a/lib/ethdev/ethdev_private.c
> +++ b/lib/ethdev/ethdev_private.c
> @@ -2,6 +2,8 @@
>   * Copyright(c) 2018 Gaëtan Rivet
>   */
>
> +#include <assert.h>
> +
>  #include <eal_export.h>
>  #include <rte_debug.h>
>
> @@ -477,3 +479,94 @@ eth_dev_tx_queue_config(struct rte_eth_dev *dev,
> uint16_t nb_queues)
>         dev->data->nb_tx_queues = nb_queues;
>         return 0;
>  }
> +
> +static int
> +ethdev_handle_request(const struct ethdev_mp_request *req)
> +{
> +       switch (req->operation) {
> +       case ETH_REQ_START:
> +               return rte_eth_dev_start(req->port_id);
> +
> +       case ETH_REQ_STOP:
> +               return rte_eth_dev_stop(req->port_id);
> +
> +       default:
> +               return -EINVAL;
> +       }
> +}
> +
> +static_assert(sizeof(struct ethdev_mp_request) <= RTE_MP_MAX_PARAM_LEN,
> +       "ethdev MP request bigger than available param space");
> +
> +static_assert(sizeof(struct ethdev_mp_response) <= RTE_MP_MAX_PARAM_LEN,
> +       "ethdev MP response bigger than available param space");
> +
> +int
> +ethdev_server(const struct rte_mp_msg *mp_msg, const void *peer)
> +{
> +       const struct ethdev_mp_request *req
> +               = (const struct ethdev_mp_request *)mp_msg->param;
> +
> +       struct rte_mp_msg mp_resp = {
> +               .name = ETHDEV_MP,
> +       };
> +       struct ethdev_mp_response *resp;
> +
> +       resp = (struct ethdev_mp_response *)mp_resp.param;
> +       mp_resp.len_param = sizeof(*resp);
> +       resp->res_op = req->operation;
> +
> +       /* recv client requests */
> +       if (mp_msg->len_param != sizeof(*req))
> +               resp->err_value = -EINVAL;
> +       else
> +               resp->err_value = ethdev_handle_request(req);
> +
> +       return rte_mp_reply(&mp_resp, peer);
> +}
> +
> +int
> +ethdev_request(uint16_t port_id, enum ethdev_mp_operation operation,
> +              const void *buf, size_t buf_len)
> +{
> +       struct rte_mp_msg mp_req = { };
> +       struct rte_mp_reply mp_reply;
> +       struct ethdev_mp_request *req;
> +       struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
> +       int ret;
> +
> +       if (sizeof(*req) + buf_len > RTE_MP_MAX_PARAM_LEN) {
> +               RTE_ETHDEV_LOG_LINE(ERR,
> +                                   "request %u port %u invalid len %zu",
> +                                   operation, port_id, buf_len);
> +               return -EINVAL;
> +       }
> +
> +       strlcpy(mp_req.name, ETHDEV_MP, RTE_MP_MAX_NAME_LEN);
> +       mp_req.len_param = sizeof(*req) + buf_len;
> +
> +       req = (struct ethdev_mp_request *)mp_req.param;
> +       req->operation = operation;
> +       req->port_id = port_id;
> +
> +       if (buf_len > 0)
> +               memcpy(req->config, buf, buf_len);
> +
> +       ret = rte_mp_request_sync(&mp_req, &mp_reply, &ts);
> +       if (ret == 0) {
> +               const struct rte_mp_msg *mp_rep = &mp_reply.msgs[0];
> +               const struct ethdev_mp_response *resp
> +                       = (const struct ethdev_mp_response *)mp_rep->param;
> +
> +               if (resp->err_value == 0)
> +                       ret = 0;
> +               else
> +                       rte_errno = -resp->err_value;
> +               free(mp_reply.msgs);
> +       }
> +
> +       if (ret < 0)
> +               RTE_ETHDEV_LOG_LINE(ERR,
> +                      "port %up ethdev op %u failed", port_id, operation);
> +       return ret;
> +}
> diff --git a/lib/ethdev/ethdev_private.h b/lib/ethdev/ethdev_private.h
> index b07b1b4c42..f58f161871 100644
> --- a/lib/ethdev/ethdev_private.h
> +++ b/lib/ethdev/ethdev_private.h
> @@ -79,4 +79,30 @@ void eth_dev_txq_release(struct rte_eth_dev *dev,
> uint16_t qid);
>  int eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues);
>  int eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues);
>
> +/* Used to allow start/stop from secondary */
> +#define ETHDEV_MP      "mp_ethdev"
> +
> +enum ethdev_mp_operation {
> +       ETH_REQ_START,
> +       ETH_REQ_STOP,
> +};
> +
> +struct ethdev_mp_request {
> +       uint16_t operation;
> +       uint16_t port_id;
> +       uint32_t reserved;
> +
> +       uint8_t config[]; /* operation specific */
> +};
> +
> +struct ethdev_mp_response {
> +       uint16_t res_op;
> +       uint16_t port_id;
> +       int32_t err_value;
> +};
> +
> +int ethdev_server(const struct rte_mp_msg *mp_msg, const void *peer);
> +int ethdev_request(uint16_t port_id, enum ethdev_mp_operation op,
> +                  const void *buf, size_t buf_len);
> +
>  #endif /* _ETH_PRIVATE_H_ */
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index dd7c00bc94..41363af2c3 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -1800,54 +1800,61 @@ rte_eth_dev_start(uint16_t port_id)
>
>         if (dev->data->dev_configured == 0) {
>                 RTE_ETHDEV_LOG_LINE(INFO,
> -                       "Device with port_id=%"PRIu16" is not configured.",
> -                       port_id);
> +                                   "Device with port_id=%"PRIu16" is not
> configured.",
> +                                   port_id);
>                 return -EINVAL;
>         }
>
>         if (dev->data->dev_started != 0) {
>                 RTE_ETHDEV_LOG_LINE(INFO,
> -                       "Device with port_id=%"PRIu16" already started",
> -                       port_id);
> +                                   "Device with port_id=%"PRIu16" already
> started",
> +                                   port_id);
>                 return 0;
>         }
>
> -       ret = rte_eth_dev_info_get(port_id, &dev_info);
> -       if (ret != 0)
> -               return ret;
> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +               ret = rte_eth_dev_info_get(port_id, &dev_info);
> +               if (ret != 0)
> +                       return ret;
>
> -       restore_flags = rte_eth_get_restore_flags(dev, RTE_ETH_START);
> +               restore_flags = rte_eth_get_restore_flags(dev,
> RTE_ETH_START);
>
> -       /* Lets restore MAC now if device does not support live change */
> -       if ((*dev_info.dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR) &&
> -           (restore_flags & RTE_ETH_RESTORE_MAC_ADDR))
> -               eth_dev_mac_restore(dev, &dev_info);
> +               /* Restore MAC now if device does not support live change
> */
> +               if ((*dev_info.dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR) &&
> +                   (restore_flags & RTE_ETH_RESTORE_MAC_ADDR))
> +                       eth_dev_mac_restore(dev, &dev_info);
>
> -       diag = dev->dev_ops->dev_start(dev);
> -       if (diag == 0)
> -               dev->data->dev_started = 1;
> -       else
> -               return eth_err(port_id, diag);
> +               diag = dev->dev_ops->dev_start(dev);
> +               if (diag == 0)
> +                       dev->data->dev_started = 1;
> +               else
> +                       return eth_err(port_id, diag);
>
> -       ret = eth_dev_config_restore(dev, &dev_info, restore_flags,
> port_id);
> -       if (ret != 0) {
> -               RTE_ETHDEV_LOG_LINE(ERR,
> -                       "Error during restoring configuration for device
> (port %u): %s",
> -                       port_id, rte_strerror(-ret));
> -               ret_stop = rte_eth_dev_stop(port_id);
> -               if (ret_stop != 0) {
> +               ret = eth_dev_config_restore(dev, &dev_info,
> restore_flags, port_id);
> +               if (ret != 0) {
>                         RTE_ETHDEV_LOG_LINE(ERR,
> -                               "Failed to stop device (port %u): %s",
> -                               port_id, rte_strerror(-ret_stop));
> -               }
> +                               "Error during restoring configuration for
> device (port %u): %s",
> +                                port_id, rte_strerror(-ret));
> +                       ret_stop = rte_eth_dev_stop(port_id);
> +                       if (ret_stop != 0) {
> +                               RTE_ETHDEV_LOG_LINE(ERR,
> +                                       "Failed to stop device (port %u):
> %s",
> +                                        port_id, rte_strerror(-ret_stop));
> +                       }
>
> -               return ret;
> -       }
> +                       return ret;
> +               }
>
> -       if (dev->data->dev_conf.intr_conf.lsc == 0) {
> -               if (dev->dev_ops->link_update == NULL)
> -                       return -ENOTSUP;
> -               dev->dev_ops->link_update(dev, 0);
> +               if (dev->data->dev_conf.intr_conf.lsc == 0) {
> +                       if (dev->dev_ops->link_update == NULL)
> +                               return -ENOTSUP;
> +                       dev->dev_ops->link_update(dev, 0);
> +               }
> +       } else {
> +               /* in secondary, proxy to primary */
> +               ret = ethdev_request(port_id, ETH_REQ_START, NULL, 0);
> +               if (ret != 0)
> +                       return ret;
>         }
>
>         /* expose selection of PMD fast-path functions */
> @@ -1880,9 +1887,14 @@ rte_eth_dev_stop(uint16_t port_id)
>         /* point fast-path functions to dummy ones */
>         eth_dev_fp_ops_reset(rte_eth_fp_ops + port_id);
>
> -       ret = dev->dev_ops->dev_stop(dev);
> -       if (ret == 0)
> -               dev->data->dev_started = 0;
> +       if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +               ret = dev->dev_ops->dev_stop(dev);
> +               if (ret == 0)
> +                       dev->data->dev_started = 0;
> +       } else {
> +               ret = ethdev_request(port_id, ETH_REQ_STOP, NULL, 0);
> +       }
> +
>         rte_ethdev_trace_stop(port_id, ret);
>
>         return ret;
> --
> 2.47.2
>
>

[-- Attachment #2: Type: text/html, Size: 15564 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-07-09 17:47 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <0250411234927.114568-1-stephen@networkplumber.org>
2025-07-09 17:33 ` [RFC v2 00/12] Packet capture using port mirroring Stephen Hemminger
2025-07-09 17:33   ` [RFC v2 01/12] ethdev: allow start/stop from secondary process Stephen Hemminger
2025-07-09 17:47     ` Khadem Ullah
2025-07-09 17:33   ` [RFC v2 02/12] test: add test for hotplug and secondary process operations Stephen Hemminger
2025-07-09 17:33   ` [RFC v2 03/12] net/ring: allow lockfree transmit if ring supports it Stephen Hemminger
2025-07-09 17:33   ` [RFC v2 04/12] net/ring: add new devargs for dumpcap use Stephen Hemminger
2025-07-09 17:33   ` [RFC v2 05/12] mbuf: add dynamic flag for use by port mirroring Stephen Hemminger
2025-07-09 17:33   ` [RFC v2 06/12] ethdev: add port mirroring feature Stephen Hemminger
2025-07-09 17:33   ` [RFC v2 07/12] pcapng: split packet copy from header insertion Stephen Hemminger
2025-07-09 17:33   ` [RFC v2 08/12] pcapng: make queue optional Stephen Hemminger
2025-07-09 17:33   ` [RFC v2 09/12] test: add tests for ethdev mirror Stephen Hemminger
2025-07-09 17:33   ` [RFC v2 10/12] app/testpmd: support for port mirroring Stephen Hemminger
2025-07-09 17:33   ` [RFC v2 11/12] app/dumpcap: use port mirror instead of pdump Stephen Hemminger
2025-07-09 17:33   ` [RFC v2 12/12] pdump: mark API and application as deprecated Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).