DPDK patches and discussions
* [PATCH 0/4] pcapng fixes
@ 2023-09-21  4:23 Stephen Hemminger
  2023-09-21  4:23 ` [PATCH 1/4] pdump: fix setting rte_errno on mp error Stephen Hemminger
                   ` (9 more replies)
  0 siblings, 10 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-09-21  4:23 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

There were a couple of reported bugs in dumpcap around
timestamps and multiple invocations. This patchset does
some refactoring to fix them in a simpler way.

Stephen Hemminger (4):
  pdump: fix setting rte_errno on mp error
  dumpcap: allow multiple invocations
  pcapng: change timestamp argument to write_stats
  pcapng: move timestamp calculation into pdump

 app/dumpcap/main.c      | 31 ++++++++--------
 app/test/test_pcapng.c  |  4 +--
 lib/pcapng/rte_pcapng.c | 79 ++++-------------------------------------
 lib/pcapng/rte_pcapng.h |  7 ++--
 lib/pdump/rte_pdump.c   | 61 +++++++++++++++++++++++++++----
 5 files changed, 83 insertions(+), 99 deletions(-)

-- 
2.39.2



* [PATCH 1/4] pdump: fix setting rte_errno on mp error
  2023-09-21  4:23 [PATCH 0/4] pcapng fixes Stephen Hemminger
@ 2023-09-21  4:23 ` Stephen Hemminger
  2023-09-21  4:23 ` [PATCH 2/4] dumpcap: allow multiple invocations Stephen Hemminger
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-09-21  4:23 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan, Jianfeng Tan

The response from MP server sets err_value to negative
on error. The convention for rte_errno is to use a positive
value on error. This makes errors like duplicate registration
show up with the correct error value.

Fixes: 660098d61f57 ("pdump: use generic multi-process channel")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/pdump/rte_pdump.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index 53cca1034d41..a70085bd0211 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -564,9 +564,10 @@ pdump_prepare_client_request(const char *device, uint16_t queue,
 	if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) == 0) {
 		mp_rep = &mp_reply.msgs[0];
 		resp = (struct pdump_response *)mp_rep->param;
-		rte_errno = resp->err_value;
-		if (!resp->err_value)
+		if (resp->err_value == 0)
 			ret = 0;
+		else
+			rte_errno = -resp->err_value;
 		free(mp_reply.msgs);
 	}
 
-- 
2.39.2
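The sign convention fixed by this patch can be illustrated with a small standalone sketch. It does not use DPDK: `sketch_errno`, `handle_reply()` and `last_error_string()` are hypothetical names standing in for `rte_errno` and the reply handling in `pdump_prepare_client_request()`. The only assumption carried over from the patch is that the reply holds 0 on success or a negative errno value on failure.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for rte_errno (not the real DPDK variable). */
static int sketch_errno;

/*
 * Hypothetical helper mirroring the fixed logic: err_value is 0 on
 * success or a negative errno value on failure.
 * Returns 0 on success, -1 on failure with sketch_errno set positive.
 */
static int
handle_reply(int err_value)
{
	/* The reply is expected to be zero or negative, never positive. */
	assert(err_value <= 0);

	if (err_value == 0)
		return 0;

	sketch_errno = -err_value;	/* store the positive errno value */
	return -1;
}

/* Describe the last error; strerror() expects a positive errno value. */
static const char *
last_error_string(void)
{
	return strerror(sketch_errno);
}
```

With this convention a caller can pass the stored value straight to a strerror-style function without negating it again, which is what the dumpcap error path in the following patch relies on.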



* [PATCH 2/4] dumpcap: allow multiple invocations
  2023-09-21  4:23 [PATCH 0/4] pcapng fixes Stephen Hemminger
  2023-09-21  4:23 ` [PATCH 1/4] pdump: fix setting rte_errno on mp error Stephen Hemminger
@ 2023-09-21  4:23 ` Stephen Hemminger
  2023-09-21  6:22   ` Morten Brørup
  2023-09-21  4:23 ` [PATCH 3/4] pcapng: change timestamp argument to write_stats Stephen Hemminger
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-09-21  4:23 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Isaac Boukris, Reshma Pattan

If dumpcap is run twice with each instance pointing at a different
interface, it would fail because of overlap in ring and pool names.
Fix by putting the process id in the name.

Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
Reported-by: Isaac Boukris <iboukris@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/dumpcap/main.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 64294bbfb3e6..37754fd06f4f 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -44,7 +44,6 @@
 #include <pcap/pcap.h>
 #include <pcap/bpf.h>
 
-#define RING_NAME "capture-ring"
 #define MONITOR_INTERVAL  (500 * 1000)
 #define MBUF_POOL_CACHE_SIZE 32
 #define BURST_SIZE 32
@@ -647,6 +646,7 @@ static void dpdk_init(void)
 static struct rte_ring *create_ring(void)
 {
 	struct rte_ring *ring;
+	char ring_name[RTE_RING_NAMESIZE];
 	size_t size, log2;
 
 	/* Find next power of 2 >= size. */
@@ -660,28 +660,28 @@ static struct rte_ring *create_ring(void)
 		ring_size = size;
 	}
 
-	ring = rte_ring_lookup(RING_NAME);
-	if (ring == NULL) {
-		ring = rte_ring_create(RING_NAME, ring_size,
-					rte_socket_id(), 0);
-		if (ring == NULL)
-			rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
-				 rte_strerror(rte_errno));
-	}
+	/* Want one ring per invocation of program */
+	snprintf(ring_name, sizeof(ring_name),
+		 "dumpcap-%u", getpid());
+
+	ring = rte_ring_create(ring_name, ring_size,
+			       rte_socket_id(), 0);
+	if (ring == NULL)
+		rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
+			 rte_strerror(rte_errno));
+
 	return ring;
 }
 
 static struct rte_mempool *create_mempool(void)
 {
 	const struct interface *intf;
-	static const char pool_name[] = "capture_mbufs";
+	char pool_name[RTE_MEMPOOL_NAMESIZE];
 	size_t num_mbufs = 2 * ring_size;
 	struct rte_mempool *mp;
 	uint32_t data_size = 128;
 
-	mp = rte_mempool_lookup(pool_name);
-	if (mp)
-		return mp;
+	snprintf(pool_name, sizeof(pool_name), "capture_%u", getpid());
 
 	/* Common pool so size mbuf for biggest snap length */
 	TAILQ_FOREACH(intf, &interfaces, next) {
@@ -826,7 +826,7 @@ static void enable_pdump(struct rte_ring *r, struct rte_mempool *mp)
 			rte_exit(EXIT_FAILURE,
 				"Packet dump enable on %u:%s failed %s\n",
 				intf->port, intf->name,
-				rte_strerror(-ret));
+				rte_strerror(rte_errno));
 		}
 
 		if (intf->opts.promisc_mode) {
-- 
2.39.2
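The per-process naming used in this patch can be sketched outside of DPDK as follows. `make_process_name()` and `SKETCH_NAMESIZE` are hypothetical; the real code formats directly into buffers sized by `RTE_RING_NAMESIZE` and `RTE_MEMPOOL_NAMESIZE`. The truncation check is an addition for illustration and is not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical size limit; the patch uses the DPDK name size macros. */
#define SKETCH_NAMESIZE 32

/*
 * Format "<prefix><pid>" into buf.
 * Returns 0 on success, -1 if formatting failed or the name did not fit.
 */
static int
make_process_name(char *buf, size_t len, const char *prefix)
{
	int n;

	assert(buf != NULL && prefix != NULL);

	/* pid_t is signed; cast explicitly to match the %u conversion. */
	n = snprintf(buf, len, "%s%u", prefix, (unsigned int)getpid());

	/* snprintf returns the untruncated length, so n >= len means truncation. */
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

Because each running process has a distinct pid, two concurrent invocations get distinct names, so neither needs to look up and reuse an object created by the other.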



* [PATCH 3/4] pcapng: change timestamp argument to write_stats
  2023-09-21  4:23 [PATCH 0/4] pcapng fixes Stephen Hemminger
  2023-09-21  4:23 ` [PATCH 1/4] pdump: fix setting rte_errno on mp error Stephen Hemminger
  2023-09-21  4:23 ` [PATCH 2/4] dumpcap: allow multiple invocations Stephen Hemminger
@ 2023-09-21  4:23 ` Stephen Hemminger
  2023-09-21  4:23 ` [PATCH 4/4] pcapng: move timestamp calculation into pdump Stephen Hemminger
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-09-21  4:23 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan

In order to clean up the management of the time base calculation,
a later patch will move the calculation from pcapng to the pdump
library. One of the changes necessary is to move the timestamp
calculation in the write_stats call from the pcapng library
into the caller. Since dumpcap already does this for other timestamps,
the change is rather small.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/dumpcap/main.c      | 3 ++-
 app/test/test_pcapng.c  | 4 ++--
 lib/pcapng/rte_pcapng.c | 8 +++-----
 lib/pcapng/rte_pcapng.h | 5 ++++-
 4 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 37754fd06f4f..8f6ab3396cef 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -577,6 +577,7 @@ report_packet_stats(dumpcap_out_t out)
 	struct rte_pdump_stats pdump_stats;
 	struct interface *intf;
 	uint64_t ifrecv, ifdrop;
+	uint64_t timestamp = create_timestamp();
 	double percent;
 
 	fputc('\n', stderr);
@@ -590,7 +591,7 @@ report_packet_stats(dumpcap_out_t out)
 
 		if (use_pcapng)
 			rte_pcapng_write_stats(out.pcapng, intf->port, NULL,
-					       start_time, end_time,
+					       timestamp, start_time, end_time,
 					       ifrecv, ifdrop);
 
 		if (ifrecv == 0)
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index b8429a02f160..55aa2cf93666 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -173,8 +173,8 @@ test_write_stats(void)
 	ssize_t len;
 
 	/* write a statistics block */
-	len = rte_pcapng_write_stats(pcapng, port_id,
-				     NULL, 0, 0,
+	len = rte_pcapng_write_stats(pcapng, port_id, NULL,
+				     0, 0, 0,
 				     NUM_PACKETS, 0);
 	if (len <= 0) {
 		fprintf(stderr, "Write of statistics failed\n");
diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 3c91fc77644a..ddce7bc87141 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -368,7 +368,7 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
  */
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
-		       const char *comment,
+		       const char *comment, uint64_t sample_time,
 		       uint64_t start_time, uint64_t end_time,
 		       uint64_t ifrecv, uint64_t ifdrop)
 {
@@ -376,7 +376,6 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	struct pcapng_option *opt;
 	uint32_t optlen, len;
 	uint8_t *buf;
-	uint64_t ns;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -425,9 +424,8 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	hdr->block_length = len;
 	hdr->interface_id = self->port_index[port_id];
 
-	ns = pcapng_tsc_to_ns(rte_get_tsc_cycles());
-	hdr->timestamp_hi = ns >> 32;
-	hdr->timestamp_lo = (uint32_t)ns;
+	hdr->timestamp_hi = sample_time >> 32;
+	hdr->timestamp_lo = (uint32_t)sample_time;
 
 	/* clone block_length after option */
 	memcpy(opt, &len, sizeof(uint32_t));
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index d93cc9f73ad5..1225ed5536ff 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -189,7 +189,9 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
  * @param port
  *  The Ethernet port to report stats on.
  * @param comment
- *   Optional comment to add to statistics.
+ *  Optional comment to add to statistics.
+ * @param timestamp
+ *  Time this statistic sample refers to in nanoseconds.
  * @param start_time
  *  The time when packet capture was started in nanoseconds.
  *  Optional: can be zero if not known.
@@ -209,6 +211,7 @@ __rte_experimental
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port,
 		       const char *comment,
+		       uint64_t timestamp,
 		       uint64_t start_time, uint64_t end_time,
 		       uint64_t ifrecv, uint64_t ifdrop);
 
-- 
2.39.2



* [PATCH 4/4] pcapng: move timestamp calculation into pdump
  2023-09-21  4:23 [PATCH 0/4] pcapng fixes Stephen Hemminger
                   ` (2 preceding siblings ...)
  2023-09-21  4:23 ` [PATCH 3/4] pcapng: change timestamp argument to write_stats Stephen Hemminger
@ 2023-09-21  4:23 ` Stephen Hemminger
  2023-10-02  8:15   ` David Marchand
  2023-10-05 23:06 ` [PATCH v2 0/4] dumpcap and pcapng fixes Stephen Hemminger
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-09-21  4:23 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan, Quentin Armitage

The computation of timestamp is more easily done in pdump
than pcapng. The initialization is easier and makes the pcapng
library have no global state.

It also makes it easier to add HW timestamp support later.

Simplify the computation of nanoseconds from TSC to a two-step
process which avoids numeric overflow issues. The previous
code was also not thread safe.

Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/pcapng/rte_pcapng.c | 71 ++---------------------------------------
 lib/pcapng/rte_pcapng.h |  2 +-
 lib/pdump/rte_pdump.c   | 56 +++++++++++++++++++++++++++++---
 3 files changed, 55 insertions(+), 74 deletions(-)

diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index ddce7bc87141..f6b3bd0ca718 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -25,7 +25,6 @@
 #include <rte_mbuf.h>
 #include <rte_os_shim.h>
 #include <rte_pcapng.h>
-#include <rte_reciprocal.h>
 #include <rte_time.h>
 
 #include "pcapng_proto.h"
@@ -43,15 +42,6 @@ struct rte_pcapng {
 	uint32_t port_index[RTE_MAX_ETHPORTS];
 };
 
-/* For converting TSC cycles to PCAPNG ns format */
-static struct pcapng_time {
-	uint64_t ns;
-	uint64_t cycles;
-	uint64_t tsc_hz;
-	struct rte_reciprocal_u64 tsc_hz_inverse;
-} pcapng_time;
-
-
 #ifdef RTE_EXEC_ENV_WINDOWS
 /*
  * Windows does not have writev() call.
@@ -102,58 +92,6 @@ static ssize_t writev(int fd, const struct iovec *iov, int iovcnt)
 #define if_indextoname(ifindex, ifname) NULL
 #endif
 
-static inline void
-pcapng_init(void)
-{
-	struct timespec ts;
-
-	pcapng_time.cycles = rte_get_tsc_cycles();
-	clock_gettime(CLOCK_REALTIME, &ts);
-	pcapng_time.cycles = (pcapng_time.cycles + rte_get_tsc_cycles()) / 2;
-	pcapng_time.ns = rte_timespec_to_ns(&ts);
-
-	pcapng_time.tsc_hz = rte_get_tsc_hz();
-	pcapng_time.tsc_hz_inverse = rte_reciprocal_value_u64(pcapng_time.tsc_hz);
-}
-
-/* PCAPNG timestamps are in nanoseconds */
-static uint64_t pcapng_tsc_to_ns(uint64_t cycles)
-{
-	uint64_t delta, secs;
-
-	if (!pcapng_time.tsc_hz)
-		pcapng_init();
-
-	/* In essence the calculation is:
-	 *   delta = (cycles - pcapng_time.cycles) * NSEC_PRE_SEC / rte_get_tsc_hz()
-	 * but this overflows within 4 to 8 seconds depending on TSC frequency.
-	 * Instead, if delta >= pcapng_time.tsc_hz:
-	 *   Increase pcapng_time.ns and pcapng_time.cycles by the number of
-	 *   whole seconds in delta and reduce delta accordingly.
-	 * delta will therefore always lie in the interval [0, pcapng_time.tsc_hz),
-	 * which will not overflow when multiplied by NSEC_PER_SEC provided the
-	 * TSC frequency < approx 18.4GHz.
-	 *
-	 * Currently all TSCs operate below 5GHz.
-	 */
-	delta = cycles - pcapng_time.cycles;
-	if (unlikely(delta >= pcapng_time.tsc_hz)) {
-		if (likely(delta < pcapng_time.tsc_hz * 2)) {
-			delta -= pcapng_time.tsc_hz;
-			pcapng_time.cycles += pcapng_time.tsc_hz;
-			pcapng_time.ns += NSEC_PER_SEC;
-		} else {
-			secs = rte_reciprocal_divide_u64(delta, &pcapng_time.tsc_hz_inverse);
-			delta -= secs * pcapng_time.tsc_hz;
-			pcapng_time.cycles += secs * pcapng_time.tsc_hz;
-			pcapng_time.ns += secs * NSEC_PER_SEC;
-		}
-	}
-
-	return pcapng_time.ns + rte_reciprocal_divide_u64(delta * NSEC_PER_SEC,
-							  &pcapng_time.tsc_hz_inverse);
-}
-
 /* length of option including padding */
 static uint16_t pcapng_optlen(uint16_t len)
 {
@@ -518,7 +456,7 @@ struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *md,
 		struct rte_mempool *mp,
-		uint32_t length, uint64_t cycles,
+		uint32_t length, uint64_t timestamp,
 		enum rte_pcapng_direction direction,
 		const char *comment)
 {
@@ -527,14 +465,11 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	struct pcapng_option *opt;
 	uint16_t optlen;
 	struct rte_mbuf *mc;
-	uint64_t ns;
 	bool rss_hash;
 
 #ifdef RTE_LIBRTE_ETHDEV_DEBUG
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
 #endif
-	ns = pcapng_tsc_to_ns(cycles);
-
 	orig_len = rte_pktmbuf_pkt_len(md);
 
 	/* Take snapshot of the data */
@@ -639,8 +574,8 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	/* Interface index is filled in later during write */
 	mc->port = port_id;
 
-	epb->timestamp_hi = ns >> 32;
-	epb->timestamp_lo = (uint32_t)ns;
+	epb->timestamp_hi = timestamp >> 32;
+	epb->timestamp_lo = (uint32_t)timestamp;
 	epb->capture_length = data_len;
 	epb->original_length = orig_len;
 
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index 1225ed5536ff..b9a9ee23ad1d 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -122,7 +122,7 @@ enum rte_pcapng_direction {
  *   The upper limit on bytes to copy.  Passing UINT32_MAX
  *   means all data (after offset).
  * @param timestamp
- *   The timestamp in TSC cycles.
+ *   The timestamp in nanoseconds since 1/1/1970.
  * @param direction
  *   The direction of the packer: receive, transmit or unknown.
  * @param comment
diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index a70085bd0211..384abf5e27ad 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -10,7 +10,9 @@
 #include <rte_log.h>
 #include <rte_memzone.h>
 #include <rte_errno.h>
+#include <rte_reciprocal.h>
 #include <rte_string_fns.h>
+#include <rte_time.h>
 #include <rte_pcapng.h>
 
 #include "rte_pdump.h"
@@ -78,6 +80,33 @@ static struct {
 	const struct rte_memzone *mz;
 } *pdump_stats;
 
+/* Time conversion values */
+static struct {
+	uint64_t offset_ns;	/* ns since 1/1/1970 when initialized */
+	uint64_t tsc_base;	/* TSC when initialized */
+	uint64_t tsc_hz;	/* copy of rte_tsc_hz() */
+	struct rte_reciprocal_u64 tsc_hz_inverse; /* inverse of tsc_hz */
+} pdump_time;
+
+/* Convert from TSC (CPU cycles) to nanoseconds */
+static uint64_t pdump_timestamp(void)
+{
+	uint64_t delta, secs, ns;
+
+	delta = rte_get_tsc_cycles() - pdump_time.tsc_base;
+
+	/* Avoid numeric wraparound by computing seconds first */
+	secs = rte_reciprocal_divide_u64(delta, &pdump_time.tsc_hz_inverse);
+
+	/* Remove the seconds portion */
+	delta -= secs * pdump_time.tsc_hz;
+	ns = rte_reciprocal_divide_u64(delta * NS_PER_S,
+				       &pdump_time.tsc_hz_inverse);
+
+	return secs * NS_PER_S + ns + pdump_time.offset_ns;
+}
+
+
 /* Create a clone of mbuf to be placed into ring. */
 static void
 pdump_copy(uint16_t port_id, uint16_t queue,
@@ -90,7 +119,7 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	int ring_enq;
 	uint16_t d_pkts = 0;
 	struct rte_mbuf *dup_bufs[nb_pkts];
-	uint64_t ts;
+	uint64_t timestamp = 0;
 	struct rte_ring *ring;
 	struct rte_mempool *mp;
 	struct rte_mbuf *p;
@@ -99,7 +128,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	if (cbs->filter)
 		rte_bpf_exec_burst(cbs->filter, (void **)pkts, rcs, nb_pkts);
 
-	ts = rte_get_tsc_cycles();
 	ring = cbs->ring;
 	mp = cbs->mp;
 	for (i = 0; i < nb_pkts; i++) {
@@ -119,12 +147,17 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 		 * If using pcapng then want to wrap packets
 		 * otherwise a simple copy.
 		 */
-		if (cbs->ver == V2)
+		if (cbs->ver == V2) {
+			/* calculate timestamp on first packet */
+			if (timestamp == 0)
+				timestamp = pdump_timestamp();
+
 			p = rte_pcapng_copy(port_id, queue,
 					    pkts[i], mp, cbs->snaplen,
-					    ts, direction, NULL);
-		else
+					    timestamp, direction, NULL);
+		} else {
 			p = rte_pktmbuf_copy(pkts[i], mp, 0, cbs->snaplen);
+		}
 
 		if (unlikely(p == NULL))
 			__atomic_fetch_add(&stats->nombuf, 1, __ATOMIC_RELAXED);
@@ -421,8 +454,21 @@ int
 rte_pdump_init(void)
 {
 	const struct rte_memzone *mz;
+	struct timespec ts;
+	uint64_t cycles;
 	int ret;
 
+	/* Compute time base offsets */
+	cycles = rte_get_tsc_cycles();
+	clock_gettime(CLOCK_REALTIME, &ts);
+
+	/* put initial TSC value in middle of clock_gettime() call */
+	pdump_time.tsc_base = (cycles + rte_get_tsc_cycles()) / 2;
+	pdump_time.offset_ns = rte_timespec_to_ns(&ts);
+
+	pdump_time.tsc_hz = rte_get_tsc_hz();
+	pdump_time.tsc_hz_inverse = rte_reciprocal_value_u64(pdump_time.tsc_hz);
+
 	mz = rte_memzone_reserve(MZ_RTE_PDUMP_STATS, sizeof(*pdump_stats),
 				 rte_socket_id(), 0);
 	if (mz == NULL) {
-- 
2.39.2
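The two-step conversion in `pdump_timestamp()` can be sketched without DPDK as follows. This is an illustration only: plain 64-bit division stands in for `rte_reciprocal_divide_u64()`, the epoch offset is left out, and `cycles_to_ns()`, `cycles_to_ns_naive()` and `SKETCH_NS_PER_S` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NS_PER_S 1000000000ULL

/*
 * Convert a cycle count to nanoseconds in two steps.
 * hz is the counter frequency in cycles per second.
 */
static uint64_t
cycles_to_ns(uint64_t delta, uint64_t hz)
{
	uint64_t secs, rem;

	assert(hz != 0);

	/* Step 1: whole seconds. */
	secs = delta / hz;

	/*
	 * Step 2: leftover cycles. rem is always < hz, so the product
	 * rem * 1e9 fits in 64 bits for frequencies below about 18.4 GHz.
	 */
	rem = delta - secs * hz;

	return secs * SKETCH_NS_PER_S + rem * SKETCH_NS_PER_S / hz;
}

/* Naive single-step form, shown only to illustrate the wraparound. */
static uint64_t
cycles_to_ns_naive(uint64_t delta, uint64_t hz)
{
	return delta * SKETCH_NS_PER_S / hz;
}
```

At 3 GHz, ten seconds is 3e10 cycles; multiplying that by 1e9 exceeds 2^64, so the naive form wraps, while the two-step form multiplies only the sub-second remainder.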



* RE: [PATCH 2/4] dumpcap: allow multiple invocations
  2023-09-21  4:23 ` [PATCH 2/4] dumpcap: allow multiple invocations Stephen Hemminger
@ 2023-09-21  6:22   ` Morten Brørup
  2023-09-21  7:10     ` Isaac Boukris
  2023-11-07  2:34     ` Stephen Hemminger
  0 siblings, 2 replies; 61+ messages in thread
From: Morten Brørup @ 2023-09-21  6:22 UTC (permalink / raw)
  To: Stephen Hemminger, dev; +Cc: Isaac Boukris, Reshma Pattan

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, 21 September 2023 06.24
> 
> If dumpcap is run twice with each instance pointing at a different
> interface, it would fail because of overlap in ring and pool names.
> Fix by putting the process id in the name.
> 
> Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
> Reported-by: Isaac Boukris <iboukris@gmail.com>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  app/dumpcap/main.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
> index 64294bbfb3e6..37754fd06f4f 100644
> --- a/app/dumpcap/main.c
> +++ b/app/dumpcap/main.c
> @@ -44,7 +44,6 @@
>  #include <pcap/pcap.h>
>  #include <pcap/bpf.h>
> 
> -#define RING_NAME "capture-ring"
>  #define MONITOR_INTERVAL  (500 * 1000)
>  #define MBUF_POOL_CACHE_SIZE 32
>  #define BURST_SIZE 32
> @@ -647,6 +646,7 @@ static void dpdk_init(void)
>  static struct rte_ring *create_ring(void)
>  {
>  	struct rte_ring *ring;
> +	char ring_name[RTE_RING_NAMESIZE];
>  	size_t size, log2;
> 
>  	/* Find next power of 2 >= size. */
> @@ -660,28 +660,28 @@ static struct rte_ring *create_ring(void)
>  		ring_size = size;
>  	}
> 
> -	ring = rte_ring_lookup(RING_NAME);
> -	if (ring == NULL) {
> -		ring = rte_ring_create(RING_NAME, ring_size,
> -					rte_socket_id(), 0);
> -		if (ring == NULL)
> -			rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
> -				 rte_strerror(rte_errno));
> -	}
> +	/* Want one ring per invocation of program */
> +	snprintf(ring_name, sizeof(ring_name),
> +		 "dumpcap-%u", getpid());

I'm not sure getpid() is available on Windows. How about:

#ifdef _WIN32
#include <processthreadsapi.h> // With the headers, not here.
"dumpcap-%lu", GetCurrentProcessId());
#else
"dumpcap-%u", getpid());
#endif

> +
> +	ring = rte_ring_create(ring_name, ring_size,
> +			       rte_socket_id(), 0);
> +	if (ring == NULL)
> +		rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
> +			 rte_strerror(rte_errno));
> +
>  	return ring;
>  }
> 
>  static struct rte_mempool *create_mempool(void)
>  {
>  	const struct interface *intf;
> -	static const char pool_name[] = "capture_mbufs";
> +	char pool_name[RTE_MEMPOOL_NAMESIZE];
>  	size_t num_mbufs = 2 * ring_size;
>  	struct rte_mempool *mp;
>  	uint32_t data_size = 128;
> 
> -	mp = rte_mempool_lookup(pool_name);
> -	if (mp)
> -		return mp;
> +	snprintf(pool_name, sizeof(pool_name), "capture_%u", getpid());

Same regarding getpid().

> 
>  	/* Common pool so size mbuf for biggest snap length */
>  	TAILQ_FOREACH(intf, &interfaces, next) {
> @@ -826,7 +826,7 @@ static void enable_pdump(struct rte_ring *r, struct
> rte_mempool *mp)
>  			rte_exit(EXIT_FAILURE,
>  				"Packet dump enable on %u:%s failed %s\n",
>  				intf->port, intf->name,
> -				rte_strerror(-ret));
> +				rte_strerror(rte_errno));
>  		}
> 
>  		if (intf->opts.promisc_mode) {
> --
> 2.39.2



* Re: [PATCH 2/4] dumpcap: allow multiple invocations
  2023-09-21  6:22   ` Morten Brørup
@ 2023-09-21  7:10     ` Isaac Boukris
  2023-11-07  2:34     ` Stephen Hemminger
  1 sibling, 0 replies; 61+ messages in thread
From: Isaac Boukris @ 2023-09-21  7:10 UTC (permalink / raw)
  To: Morten Brørup; +Cc: Stephen Hemminger, dev, Reshma Pattan

On Thu, Sep 21, 2023 at 9:22 AM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Thursday, 21 September 2023 06.24
> >
> > If dumpcap is run twice with each instance pointing at a different
> > interface, it would fail because of overlap in ring and pool names.
> > Fix by putting the process id in the name.
> >
> > Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
> > Reported-by: Isaac Boukris <iboukris@gmail.com>
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> >  app/dumpcap/main.c | 28 ++++++++++++++--------------
> >  1 file changed, 14 insertions(+), 14 deletions(-)
> >
> > diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
> > index 64294bbfb3e6..37754fd06f4f 100644
> > --- a/app/dumpcap/main.c
> > +++ b/app/dumpcap/main.c
> > @@ -44,7 +44,6 @@
> >  #include <pcap/pcap.h>
> >  #include <pcap/bpf.h>
> >
> > -#define RING_NAME "capture-ring"
> >  #define MONITOR_INTERVAL  (500 * 1000)
> >  #define MBUF_POOL_CACHE_SIZE 32
> >  #define BURST_SIZE 32
> > @@ -647,6 +646,7 @@ static void dpdk_init(void)
> >  static struct rte_ring *create_ring(void)
> >  {
> >       struct rte_ring *ring;
> > +     char ring_name[RTE_RING_NAMESIZE];
> >       size_t size, log2;
> >
> >       /* Find next power of 2 >= size. */
> > @@ -660,28 +660,28 @@ static struct rte_ring *create_ring(void)
> >               ring_size = size;
> >       }
> >
> > -     ring = rte_ring_lookup(RING_NAME);
> > -     if (ring == NULL) {
> > -             ring = rte_ring_create(RING_NAME, ring_size,
> > -                                     rte_socket_id(), 0);
> > -             if (ring == NULL)
> > -                     rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
> > -                              rte_strerror(rte_errno));
> > -     }
> > +     /* Want one ring per invocation of program */
> > +     snprintf(ring_name, sizeof(ring_name),
> > +              "dumpcap-%u", getpid());
>
> I'm not sure getpid() is available on Windows. How about:

I think the 'app/dumpcap/meson.build' file indicates no support for Windows.

Regards

> #ifdef _WIN32
> #include <processthreadsapi.h> // With the headers, not here.
> "dumpcap-%lu", GetCurrentProcessId());
> #else
> "dumpcap-%u", getpid());
> #endif
>
> > +
> > +     ring = rte_ring_create(ring_name, ring_size,
> > +                            rte_socket_id(), 0);
> > +     if (ring == NULL)
> > +             rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
> > +                      rte_strerror(rte_errno));
> > +
> >       return ring;
> >  }
> >
> >  static struct rte_mempool *create_mempool(void)
> >  {
> >       const struct interface *intf;
> > -     static const char pool_name[] = "capture_mbufs";
> > +     char pool_name[RTE_MEMPOOL_NAMESIZE];
> >       size_t num_mbufs = 2 * ring_size;
> >       struct rte_mempool *mp;
> >       uint32_t data_size = 128;
> >
> > -     mp = rte_mempool_lookup(pool_name);
> > -     if (mp)
> > -             return mp;
> > +     snprintf(pool_name, sizeof(pool_name), "capture_%u", getpid());
>
> Same regarding getpid().
>
> >
> >       /* Common pool so size mbuf for biggest snap length */
> >       TAILQ_FOREACH(intf, &interfaces, next) {
> > @@ -826,7 +826,7 @@ static void enable_pdump(struct rte_ring *r, struct
> > rte_mempool *mp)
> >                       rte_exit(EXIT_FAILURE,
> >                               "Packet dump enable on %u:%s failed %s\n",
> >                               intf->port, intf->name,
> > -                             rte_strerror(-ret));
> > +                             rte_strerror(rte_errno));
> >               }
> >
> >               if (intf->opts.promisc_mode) {
> > --
> > 2.39.2
>


* Re: [PATCH 4/4] pcapng: move timestamp calculation into pdump
  2023-09-21  4:23 ` [PATCH 4/4] pcapng: move timestamp calculation into pdump Stephen Hemminger
@ 2023-10-02  8:15   ` David Marchand
  2023-10-04 17:13     ` Stephen Hemminger
  0 siblings, 1 reply; 61+ messages in thread
From: David Marchand @ 2023-10-02  8:15 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Reshma Pattan, Quentin Armitage

Hello Stephen,

On Thu, Sep 21, 2023 at 6:24 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> The computation of timestamp is more easily done in pdump
> than pcapng. The initialization is easier and makes the pcapng
> library have no global state.
>
> It also makes it easier to add HW timestamp support later.
>
> Simplify the computation of nanoseconds from TSC to a two-step
> process which avoids numeric overflow issues. The previous
> code was also not thread safe.
>

Bugzilla ID: 1291 ?

This patch (and patch 3) updates some pcapng API, is it worth a RN update?

> Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")

Is it worth backporting?
I would say no, as some API update was needed to fix the issue.
But on the other hand, this is an experimental API, so I prefer to ask.


> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>


-- 
David Marchand



* Re: [PATCH 4/4] pcapng: move timestamp calculation into pdump
  2023-10-02  8:15   ` David Marchand
@ 2023-10-04 17:13     ` Stephen Hemminger
  2023-10-06  9:10       ` David Marchand
  0 siblings, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-10-04 17:13 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, Reshma Pattan, Quentin Armitage

On Mon, 2 Oct 2023 10:15:25 +0200
David Marchand <david.marchand@redhat.com> wrote:

> >  
> 
> Bugzilla ID: 1291 ?
> 
> This patch (and patch 3) updates some pcapng API, is it worth a RN update?
> 
> > Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")  
> 
> Is it worth backporting?
> I would say no, as some API update was needed to fix the issue.
> But on the other hand, this is an experimental API, so I prefer to ask.
> 
> 
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>  

Good question.
Is experimental API allowed to change in a stable release?


* [PATCH v2 0/4] dumpcap and pcapng fixes
  2023-09-21  4:23 [PATCH 0/4] pcapng fixes Stephen Hemminger
                   ` (3 preceding siblings ...)
  2023-09-21  4:23 ` [PATCH 4/4] pcapng: move timestamp calculation into pdump Stephen Hemminger
@ 2023-10-05 23:06 ` Stephen Hemminger
  2023-10-05 23:06   ` [PATCH v2 1/4] pdump: fix setting rte_errno on mp error Stephen Hemminger
                     ` (3 more replies)
  2023-11-08 18:35 ` [PATCH v3 0/5] dumpcap and pcapng fixes Stephen Hemminger
                   ` (4 subsequent siblings)
  9 siblings, 4 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-10-05 23:06 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

This version slightly modifies the pcapng API to fix
issues related to timestamping. The design choices are
to maximize performance in the primary process and to do
all the time adjustment in the secondary (dumpcap), since
dumpcap needs to make system calls anyway to write the result.

This patch set moves the calculation of the adjustment
into the pcapng portion that opens the output file.
All details of the timestamp format are contained inside
pcapng (data hiding).

Stephen Hemminger (4):
  pdump: fix setting rte_errno on mp error
  dumpcap: allow multiple invocations
  pcapng: modify timestamp calculation
  test: cleanups to pcapng test

 app/dumpcap/main.c      |  53 +++---
 app/test/meson.build    |   2 +-
 app/test/test_pcapng.c  | 378 +++++++++++++++++++++++++---------------
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 119 +++++--------
 lib/pcapng/rte_pcapng.h |  19 +-
 lib/pdump/rte_pdump.c   |   9 +-
 7 files changed, 318 insertions(+), 264 deletions(-)

-- 
2.39.2



* [PATCH v2 1/4] pdump: fix setting rte_errno on mp error
  2023-10-05 23:06 ` [PATCH v2 0/4] dumpcap and pcapng fixes Stephen Hemminger
@ 2023-10-05 23:06   ` Stephen Hemminger
  2023-10-05 23:06   ` [PATCH v2 2/4] dumpcap: allow multiple invocations Stephen Hemminger
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-10-05 23:06 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan, Jianfeng Tan

The response from MP server sets err_value to negative
on error. The convention for rte_errno is to use a positive
value on error. This makes errors like duplicate registration
show up with the correct error value.

Fixes: 660098d61f57 ("pdump: use generic multi-process channel")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/pdump/rte_pdump.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index 53cca1034d41..a70085bd0211 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -564,9 +564,10 @@ pdump_prepare_client_request(const char *device, uint16_t queue,
 	if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) == 0) {
 		mp_rep = &mp_reply.msgs[0];
 		resp = (struct pdump_response *)mp_rep->param;
-		rte_errno = resp->err_value;
-		if (!resp->err_value)
+		if (resp->err_value == 0)
 			ret = 0;
+		else
+			rte_errno = -resp->err_value;
 		free(mp_reply.msgs);
 	}
 
-- 
2.39.2



* [PATCH v2 2/4] dumpcap: allow multiple invocations
  2023-10-05 23:06 ` [PATCH v2 0/4] dumpcap and pcapng fixes Stephen Hemminger
  2023-10-05 23:06   ` [PATCH v2 1/4] pdump: fix setting rte_errno on mp error Stephen Hemminger
@ 2023-10-05 23:06   ` Stephen Hemminger
  2023-10-05 23:06   ` [PATCH v2 3/4] pcapng: modify timestamp calculation Stephen Hemminger
  2023-10-05 23:06   ` [PATCH v2 4/4] test: cleanups to pcapng test Stephen Hemminger
  3 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-10-05 23:06 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Isaac Boukris, Reshma Pattan

If dumpcap is run twice with each instance pointing at a different
interface, it would fail because of overlap in ring and pool names.
Fix by putting the process id in the names.
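
The naming idea can be sketched in isolation as follows. This is only an
illustration: NAME_SIZE and make_instance_name are hypothetical names used
here in place of RTE_RING_NAMESIZE and the inline snprintf() calls in the
patch, and the pid is passed in as a parameter so the behaviour is easy to
check.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed buffer size for illustration only. */
#define NAME_SIZE 32

/*
 * Build "<prefix>-<pid>" into buf so that each process gets its own
 * object name. Returns the result of snprintf().
 */
static int
make_instance_name(char *buf, size_t size, const char *prefix, pid_t pid)
{
	return snprintf(buf, size, "%s-%u", prefix, (unsigned int)pid);
}
```

Two processes calling make_instance_name(buf, sizeof(buf), "dumpcap",
getpid()) get different names, so neither needs to look up and reuse an
object created by the other.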

Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
Reported-by: Isaac Boukris <iboukris@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/dumpcap/main.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 64294bbfb3e6..37754fd06f4f 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -44,7 +44,6 @@
 #include <pcap/pcap.h>
 #include <pcap/bpf.h>
 
-#define RING_NAME "capture-ring"
 #define MONITOR_INTERVAL  (500 * 1000)
 #define MBUF_POOL_CACHE_SIZE 32
 #define BURST_SIZE 32
@@ -647,6 +646,7 @@ static void dpdk_init(void)
 static struct rte_ring *create_ring(void)
 {
 	struct rte_ring *ring;
+	char ring_name[RTE_RING_NAMESIZE];
 	size_t size, log2;
 
 	/* Find next power of 2 >= size. */
@@ -660,28 +660,28 @@ static struct rte_ring *create_ring(void)
 		ring_size = size;
 	}
 
-	ring = rte_ring_lookup(RING_NAME);
-	if (ring == NULL) {
-		ring = rte_ring_create(RING_NAME, ring_size,
-					rte_socket_id(), 0);
-		if (ring == NULL)
-			rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
-				 rte_strerror(rte_errno));
-	}
+	/* Want one ring per invocation of program */
+	snprintf(ring_name, sizeof(ring_name),
+		 "dumpcap-%u", getpid());
+
+	ring = rte_ring_create(ring_name, ring_size,
+			       rte_socket_id(), 0);
+	if (ring == NULL)
+		rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
+			 rte_strerror(rte_errno));
+
 	return ring;
 }
 
 static struct rte_mempool *create_mempool(void)
 {
 	const struct interface *intf;
-	static const char pool_name[] = "capture_mbufs";
+	char pool_name[RTE_MEMPOOL_NAMESIZE];
 	size_t num_mbufs = 2 * ring_size;
 	struct rte_mempool *mp;
 	uint32_t data_size = 128;
 
-	mp = rte_mempool_lookup(pool_name);
-	if (mp)
-		return mp;
+	snprintf(pool_name, sizeof(pool_name), "capture_%u", getpid());
 
 	/* Common pool so size mbuf for biggest snap length */
 	TAILQ_FOREACH(intf, &interfaces, next) {
@@ -826,7 +826,7 @@ static void enable_pdump(struct rte_ring *r, struct rte_mempool *mp)
 			rte_exit(EXIT_FAILURE,
 				"Packet dump enable on %u:%s failed %s\n",
 				intf->port, intf->name,
-				rte_strerror(-ret));
+				rte_strerror(rte_errno));
 		}
 
 		if (intf->opts.promisc_mode) {
-- 
2.39.2



* [PATCH v2 3/4] pcapng: modify timestamp calculation
  2023-10-05 23:06 ` [PATCH v2 0/4] dumpcap and pcapng fixes Stephen Hemminger
  2023-10-05 23:06   ` [PATCH v2 1/4] pdump: fix setting rte_errno on mp error Stephen Hemminger
  2023-10-05 23:06   ` [PATCH v2 2/4] dumpcap: allow multiple invocations Stephen Hemminger
@ 2023-10-05 23:06   ` Stephen Hemminger
  2023-10-05 23:06   ` [PATCH v2 4/4] test: cleanups to pcapng test Stephen Hemminger
  3 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-10-05 23:06 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Reshma Pattan, Jerin Jacob, Kiran Kumar K,
	Nithin Dabilpuram, Zhirun Yan, Quentin Armitage

The computation of the timestamp is best done in the part of
the pcapng library that runs in the secondary process.
The secondary process already makes a number of system
calls, so it is not performance sensitive.

Simplify the computation of nanoseconds from TSC to a two
step process which avoids numeric overflow issues. The previous
code was also not thread safe.
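
The two step conversion can be sketched on its own as below. This is a
simplified illustration: cycles_to_ns is a hypothetical helper that takes
the elapsed cycle count and the counter frequency as plain parameters,
whereas the patch reads the frequency from rte_get_tsc_hz() and adds a
per-file offset.

```c
#include <stdint.h>

#define NS_PER_S 1000000000ULL

/*
 * Convert an elapsed cycle count to nanoseconds for a counter
 * running at hz cycles per second.
 *
 * A direct (cycles * NS_PER_S) / hz wraps as soon as
 * cycles * NS_PER_S no longer fits in 64 bits. Taking out the whole
 * seconds first keeps the multiplied value below hz, which is safe
 * as long as hz * NS_PER_S fits in 64 bits.
 */
static uint64_t
cycles_to_ns(uint64_t cycles, uint64_t hz)
{
	uint64_t secs = cycles / hz;	/* whole seconds */
	uint64_t rem = cycles % hz;	/* leftover cycles, always < hz */

	return secs * NS_PER_S + (rem * NS_PER_S) / hz;
}
```

For example, with a 3 GHz counter, 31500000000 cycles is 10.5 seconds.
The direct multiplication wraps for that input, while the split form
does not.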

Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/dumpcap/main.c      |  25 +++------
 app/test/test_pcapng.c  |   4 +-
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 119 +++++++++++++++-------------------------
 lib/pcapng/rte_pcapng.h |  19 ++-----
 lib/pdump/rte_pdump.c   |   4 +-
 6 files changed, 61 insertions(+), 112 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 37754fd06f4f..764dac6c37c0 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -66,13 +66,13 @@ static bool print_stats;
 
 /* capture limit options */
 static struct {
-	uint64_t  duration;	/* nanoseconds */
+	time_t  duration;	/* seconds */
 	unsigned long packets;  /* number of packets in file */
 	size_t size;		/* file size (bytes) */
 } stop;
 
 /* Running state */
-static uint64_t start_time, end_time;
+static time_t start_time;
 static uint64_t packets_received;
 static size_t file_size;
 
@@ -197,7 +197,7 @@ static void auto_stop(char *opt)
 		if (*value == '\0' || *endp != '\0' || interval <= 0)
 			rte_exit(EXIT_FAILURE,
 				 "Invalid duration \"%s\"\n", value);
-		stop.duration = NSEC_PER_SEC * interval;
+		stop.duration = interval;
 	} else if (strcmp(opt, "filesize") == 0) {
 		stop.size = get_uint(value, "filesize", 0) * 1024;
 	} else if (strcmp(opt, "packets") == 0) {
@@ -511,15 +511,6 @@ static void statistics_loop(void)
 	}
 }
 
-/* Return the time since 1/1/1970 in nanoseconds */
-static uint64_t create_timestamp(void)
-{
-	struct timespec now;
-
-	clock_gettime(CLOCK_MONOTONIC, &now);
-	return rte_timespec_to_ns(&now);
-}
-
 static void
 cleanup_pdump_resources(void)
 {
@@ -589,9 +580,8 @@ report_packet_stats(dumpcap_out_t out)
 		ifdrop = pdump_stats.nombuf + pdump_stats.ringfull;
 
 		if (use_pcapng)
-			rte_pcapng_write_stats(out.pcapng, intf->port, NULL,
-					       start_time, end_time,
-					       ifrecv, ifdrop);
+			rte_pcapng_write_stats(out.pcapng, intf->port,
+					       ifrecv, ifdrop, NULL);
 
 		if (ifrecv == 0)
 			percent = 0;
@@ -983,7 +973,7 @@ int main(int argc, char **argv)
 	mp = create_mempool();
 	out = create_output();
 
-	start_time = create_timestamp();
+	start_time = time(NULL);
 	enable_pdump(r, mp);
 
 	if (!quiet) {
@@ -1005,11 +995,10 @@ int main(int argc, char **argv)
 			break;
 
 		if (stop.duration != 0 &&
-		    create_timestamp() - start_time > stop.duration)
+		    time(NULL) - start_time > stop.duration)
 			break;
 	}
 
-	end_time = create_timestamp();
 	disable_primary_monitor();
 
 	if (rte_eal_primary_proc_alive(NULL))
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index b8429a02f160..55aa2cf93666 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -173,8 +173,8 @@ test_write_stats(void)
 	ssize_t len;
 
 	/* write a statistics block */
-	len = rte_pcapng_write_stats(pcapng, port_id,
-				     NULL, 0, 0,
+	len = rte_pcapng_write_stats(pcapng, port_id, NULL,
+				     0, 0, 0,
 				     NUM_PACKETS, 0);
 	if (len <= 0) {
 		fprintf(stderr, "Write of statistics failed\n");
diff --git a/lib/graph/graph_pcap.c b/lib/graph/graph_pcap.c
index db722c375fa7..89525f1220ca 100644
--- a/lib/graph/graph_pcap.c
+++ b/lib/graph/graph_pcap.c
@@ -214,7 +214,7 @@ graph_pcap_dispatch(struct rte_graph *graph,
 		mbuf = (struct rte_mbuf *)objs[i];
 
 		mc = rte_pcapng_copy(mbuf->port, 0, mbuf, pkt_mp, mbuf->pkt_len,
-				     rte_get_tsc_cycles(), 0, buffer);
+				     0, buffer);
 		if (mc == NULL)
 			break;
 
diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 3c91fc77644a..13fd2b97fb80 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -36,22 +36,14 @@
 /* Format of the capture file handle */
 struct rte_pcapng {
 	int  outfd;		/* output file */
-
 	unsigned int ports;	/* number of interfaces added */
+	uint64_t offset_ns;	/* ns since 1/1/1970 when initialized */
+	uint64_t tsc_base;	/* TSC when started */
 
 	/* DPDK port id to interface index in file */
 	uint32_t port_index[RTE_MAX_ETHPORTS];
 };
 
-/* For converting TSC cycles to PCAPNG ns format */
-static struct pcapng_time {
-	uint64_t ns;
-	uint64_t cycles;
-	uint64_t tsc_hz;
-	struct rte_reciprocal_u64 tsc_hz_inverse;
-} pcapng_time;
-
-
 #ifdef RTE_EXEC_ENV_WINDOWS
 /*
  * Windows does not have writev() call.
@@ -102,56 +94,21 @@ static ssize_t writev(int fd, const struct iovec *iov, int iovcnt)
 #define if_indextoname(ifindex, ifname) NULL
 #endif
 
-static inline void
-pcapng_init(void)
+/* Convert from TSC (CPU cycles) to nanoseconds */
+static uint64_t
+pcapng_timestamp(const rte_pcapng_t *self, uint64_t cycles)
 {
-	struct timespec ts;
+	uint64_t delta, rem, secs, ns;
+	const uint64_t hz = rte_get_tsc_hz();
 
-	pcapng_time.cycles = rte_get_tsc_cycles();
-	clock_gettime(CLOCK_REALTIME, &ts);
-	pcapng_time.cycles = (pcapng_time.cycles + rte_get_tsc_cycles()) / 2;
-	pcapng_time.ns = rte_timespec_to_ns(&ts);
-
-	pcapng_time.tsc_hz = rte_get_tsc_hz();
-	pcapng_time.tsc_hz_inverse = rte_reciprocal_value_u64(pcapng_time.tsc_hz);
-}
+	delta = cycles - self->tsc_base;
 
-/* PCAPNG timestamps are in nanoseconds */
-static uint64_t pcapng_tsc_to_ns(uint64_t cycles)
-{
-	uint64_t delta, secs;
-
-	if (!pcapng_time.tsc_hz)
-		pcapng_init();
-
-	/* In essence the calculation is:
-	 *   delta = (cycles - pcapng_time.cycles) * NSEC_PRE_SEC / rte_get_tsc_hz()
-	 * but this overflows within 4 to 8 seconds depending on TSC frequency.
-	 * Instead, if delta >= pcapng_time.tsc_hz:
-	 *   Increase pcapng_time.ns and pcapng_time.cycles by the number of
-	 *   whole seconds in delta and reduce delta accordingly.
-	 * delta will therefore always lie in the interval [0, pcapng_time.tsc_hz),
-	 * which will not overflow when multiplied by NSEC_PER_SEC provided the
-	 * TSC frequency < approx 18.4GHz.
-	 *
-	 * Currently all TSCs operate below 5GHz.
-	 */
-	delta = cycles - pcapng_time.cycles;
-	if (unlikely(delta >= pcapng_time.tsc_hz)) {
-		if (likely(delta < pcapng_time.tsc_hz * 2)) {
-			delta -= pcapng_time.tsc_hz;
-			pcapng_time.cycles += pcapng_time.tsc_hz;
-			pcapng_time.ns += NSEC_PER_SEC;
-		} else {
-			secs = rte_reciprocal_divide_u64(delta, &pcapng_time.tsc_hz_inverse);
-			delta -= secs * pcapng_time.tsc_hz;
-			pcapng_time.cycles += secs * pcapng_time.tsc_hz;
-			pcapng_time.ns += secs * NSEC_PER_SEC;
-		}
-	}
+	/* Avoid numeric wraparound by computing seconds first */
+	secs = delta / hz;
+	rem = delta % hz;
+	ns = (rem * NS_PER_S) / hz;
 
-	return pcapng_time.ns + rte_reciprocal_divide_u64(delta * NSEC_PER_SEC,
-							  &pcapng_time.tsc_hz_inverse);
+	return secs * NS_PER_S + ns + self->offset_ns;
 }
 
 /* length of option including padding */
@@ -368,15 +325,15 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
  */
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop)
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment)
 {
 	struct pcapng_statistics *hdr;
 	struct pcapng_option *opt;
+	uint64_t start_time = self->offset_ns;
+	uint64_t sample_time;
 	uint32_t optlen, len;
 	uint8_t *buf;
-	uint64_t ns;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -386,10 +343,10 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 		optlen += pcapng_optlen(sizeof(ifrecv));
 	if (ifdrop != UINT64_MAX)
 		optlen += pcapng_optlen(sizeof(ifdrop));
+
 	if (start_time != 0)
 		optlen += pcapng_optlen(sizeof(start_time));
-	if (end_time != 0)
-		optlen += pcapng_optlen(sizeof(end_time));
+
 	if (comment)
 		optlen += pcapng_optlen(strlen(comment));
 	if (optlen != 0)
@@ -409,9 +366,6 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	if (start_time != 0)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_STARTTIME,
 					 &start_time, sizeof(start_time));
-	if (end_time != 0)
-		opt = pcapng_add_option(opt, PCAPNG_ISB_ENDTIME,
-					 &end_time, sizeof(end_time));
 	if (ifrecv != UINT64_MAX)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_IFRECV,
 				&ifrecv, sizeof(ifrecv));
@@ -425,9 +379,9 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	hdr->block_length = len;
 	hdr->interface_id = self->port_index[port_id];
 
-	ns = pcapng_tsc_to_ns(rte_get_tsc_cycles());
-	hdr->timestamp_hi = ns >> 32;
-	hdr->timestamp_lo = (uint32_t)ns;
+	sample_time = pcapng_timestamp(self, rte_get_tsc_cycles());
+	hdr->timestamp_hi = sample_time >> 32;
+	hdr->timestamp_lo = (uint32_t)sample_time;
 
 	/* clone block_length after option */
 	memcpy(opt, &len, sizeof(uint32_t));
@@ -520,23 +474,21 @@ struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *md,
 		struct rte_mempool *mp,
-		uint32_t length, uint64_t cycles,
+		uint32_t length,
 		enum rte_pcapng_direction direction,
 		const char *comment)
 {
 	struct pcapng_enhance_packet_block *epb;
 	uint32_t orig_len, data_len, padding, flags;
 	struct pcapng_option *opt;
+	uint64_t timestamp;
 	uint16_t optlen;
 	struct rte_mbuf *mc;
-	uint64_t ns;
 	bool rss_hash;
 
 #ifdef RTE_LIBRTE_ETHDEV_DEBUG
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
 #endif
-	ns = pcapng_tsc_to_ns(cycles);
-
 	orig_len = rte_pktmbuf_pkt_len(md);
 
 	/* Take snapshot of the data */
@@ -641,8 +593,10 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	/* Interface index is filled in later during write */
 	mc->port = port_id;
 
-	epb->timestamp_hi = ns >> 32;
-	epb->timestamp_lo = (uint32_t)ns;
+	/* Put timestamp in cycles here - adjust in packet write */
+	timestamp = rte_get_tsc_cycles();
+	epb->timestamp_hi = timestamp >> 32;
+	epb->timestamp_lo = (uint32_t)timestamp;
 	epb->capture_length = data_len;
 	epb->original_length = orig_len;
 
@@ -668,6 +622,7 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 	for (i = 0; i < nb_pkts; i++) {
 		struct rte_mbuf *m = pkts[i];
 		struct pcapng_enhance_packet_block *epb;
+		uint64_t cycles, timestamp;
 
 		/* sanity check that is really a pcapng mbuf */
 		epb = rte_pktmbuf_mtod(m, struct pcapng_enhance_packet_block *);
@@ -684,6 +639,13 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 			return -1;
 		}
 
+		/* adjust timestamp recorded in packet */
+		cycles = (uint64_t)epb->timestamp_hi << 32;
+		cycles += epb->timestamp_lo;
+		timestamp = pcapng_timestamp(self, cycles);
+		epb->timestamp_hi = timestamp >> 32;
+		epb->timestamp_lo = (uint32_t)timestamp;
+
 		/*
 		 * Handle case of highly fragmented and large burst size
 		 * Note: this assumes that max segments per mbuf < IOV_MAX
@@ -725,6 +687,8 @@ rte_pcapng_fdopen(int fd,
 {
 	unsigned int i;
 	rte_pcapng_t *self;
+	struct timespec ts;
+	uint64_t cycles;
 
 	self = malloc(sizeof(*self));
 	if (!self) {
@@ -734,6 +698,13 @@ rte_pcapng_fdopen(int fd,
 
 	self->outfd = fd;
 	self->ports = 0;
+
+	/* record start time in ns since 1/1/1970 */
+	cycles = rte_get_tsc_cycles();
+	clock_gettime(CLOCK_REALTIME, &ts);
+	self->tsc_base = (cycles + rte_get_tsc_cycles()) / 2;
+	self->offset_ns = rte_timespec_to_ns(&ts);
+
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++)
 		self->port_index[i] = UINT32_MAX;
 
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index d93cc9f73ad5..c40795c721de 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -121,8 +121,6 @@ enum rte_pcapng_direction {
  * @param length
  *   The upper limit on bytes to copy.  Passing UINT32_MAX
  *   means all data (after offset).
- * @param timestamp
- *   The timestamp in TSC cycles.
  * @param direction
  *   The direction of the packer: receive, transmit or unknown.
  * @param comment
@@ -136,7 +134,7 @@ __rte_experimental
 struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *m, struct rte_mempool *mp,
-		uint32_t length, uint64_t timestamp,
+		uint32_t length,
 		enum rte_pcapng_direction direction, const char *comment);
 
 
@@ -188,29 +186,22 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
  *  The handle to the packet capture file
  * @param port
  *  The Ethernet port to report stats on.
- * @param comment
- *   Optional comment to add to statistics.
- * @param start_time
- *  The time when packet capture was started in nanoseconds.
- *  Optional: can be zero if not known.
- * @param end_time
- *  The time when packet capture was stopped in nanoseconds.
- *  Optional: can be zero if not finished;
  * @param ifrecv
  *  The number of packets received by capture.
  *  Optional: use UINT64_MAX if not known.
  * @param ifdrop
  *  The number of packets missed by the capture process.
  *  Optional: use UINT64_MAX if not known.
+ * @param comment
+ *  Optional comment to add to statistics.
  * @return
  *  number of bytes written to file, -1 on failure to write file
  */
 __rte_experimental
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop);
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment);
 
 #ifdef __cplusplus
 }
diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index a70085bd0211..903f92839b8e 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -90,7 +90,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	int ring_enq;
 	uint16_t d_pkts = 0;
 	struct rte_mbuf *dup_bufs[nb_pkts];
-	uint64_t ts;
 	struct rte_ring *ring;
 	struct rte_mempool *mp;
 	struct rte_mbuf *p;
@@ -99,7 +98,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	if (cbs->filter)
 		rte_bpf_exec_burst(cbs->filter, (void **)pkts, rcs, nb_pkts);
 
-	ts = rte_get_tsc_cycles();
 	ring = cbs->ring;
 	mp = cbs->mp;
 	for (i = 0; i < nb_pkts; i++) {
@@ -122,7 +120,7 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 		if (cbs->ver == V2)
 			p = rte_pcapng_copy(port_id, queue,
 					    pkts[i], mp, cbs->snaplen,
-					    ts, direction, NULL);
+					    direction, NULL);
 		else
 			p = rte_pktmbuf_copy(pkts[i], mp, 0, cbs->snaplen);
 
-- 
2.39.2



* [PATCH v2 4/4] test: cleanups to pcapng test
  2023-10-05 23:06 ` [PATCH v2 0/4] dumpcap and pcapng fixes Stephen Hemminger
                     ` (2 preceding siblings ...)
  2023-10-05 23:06   ` [PATCH v2 3/4] pcapng: modify timestamp calculation Stephen Hemminger
@ 2023-10-05 23:06   ` Stephen Hemminger
  3 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-10-05 23:06 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan

Overhaul of the pcapng test:
  - promote it to be a fast test so it gets regularly run.
  - create null device and use it.
  - use UDP discard packets that are valid so that for debugging
    the resulting pcapng file can be looked at with wireshark.
  - do basic checks on resulting pcap file that lengths and
    timestamps are in range.
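
The timestamp range check mentioned in the last item can be sketched
without libpcap as follows. The helper names pcap_ts_to_ns and ts_in_range
are hypothetical. The sketch assumes the capture was opened in nanosecond
precision, in which case the tv_usec field of the packet header timestamp
carries nanoseconds rather than microseconds.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/time.h>

#define NS_PER_S 1000000000ULL

/*
 * Convert a packet header timestamp to nanoseconds, assuming the
 * file was opened in nanosecond precision so that tv_usec holds
 * nanoseconds.
 */
static uint64_t
pcap_ts_to_ns(const struct timeval *tv)
{
	return (uint64_t)tv->tv_sec * NS_PER_S + (uint64_t)tv->tv_usec;
}

/* True if ns lies within the closed interval [start_ns, end_ns]. */
static bool
ts_in_range(uint64_t ns, uint64_t start_ns, uint64_t end_ns)
{
	return ns >= start_ns && ns <= end_ns;
}
```

A test can record the wall clock time before writing and again while
reading back, then reject any packet whose converted timestamp falls
outside that window.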

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/test/meson.build   |   2 +-
 app/test/test_pcapng.c | 378 ++++++++++++++++++++++++++---------------
 2 files changed, 242 insertions(+), 138 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index bf9fc906128f..81d7c41a07cb 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -124,7 +124,7 @@ source_file_deps = {
     'test_meter.c': ['meter'],
     'test_metrics.c': ['metrics'],
     'test_mp_secondary.c': ['hash', 'lpm'],
-    'test_pcapng.c': ['ethdev', 'net', 'pcapng'],
+    'test_pcapng.c': ['ethdev', 'net', 'pcapng', 'bus_vdev'],
     'test_pdcp.c': ['eventdev', 'pdcp', 'net', 'timer', 'security'],
     'test_pdump.c': ['pdump'] + sample_packet_forward_deps,
     'test_per_lcore.c': [],
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index 55aa2cf93666..45223ef38240 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -6,25 +6,34 @@
 #include <stdlib.h>
 #include <unistd.h>
 
+#include <rte_bus_vdev.h>
 #include <rte_ethdev.h>
 #include <rte_ether.h>
+#include <rte_ip.h>
 #include <rte_mbuf.h>
 #include <rte_mempool.h>
 #include <rte_net.h>
 #include <rte_pcapng.h>
+#include <rte_random.h>
+#include <rte_reciprocal.h>
+#include <rte_time.h>
+#include <rte_udp.h>
 
 #include <pcap/pcap.h>
 
 #include "test.h"
 
-#define NUM_PACKETS    10
-#define DUMMY_MBUF_NUM 3
+#define PCAPNG_TEST_DEBUG 0
+
+#define TOTAL_PACKETS	4096
+#define MAX_BURST	64
+#define MAX_GAP_US	100000
+#define DUMMY_MBUF_NUM	3
 
-static rte_pcapng_t *pcapng;
 static struct rte_mempool *mp;
 static const uint32_t pkt_len = 200;
 static uint16_t port_id;
-static char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+static const char null_dev[] = "net_null0";
 
 /* first mbuf in the packet, should always be at offset 0 */
 struct dummy_mbuf {
@@ -61,6 +70,7 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 	struct {
 		struct rte_ether_hdr eth;
 		struct rte_ipv4_hdr ip;
+		struct rte_udp_hdr udp;
 	} pkt = {
 		.eth = {
 			.dst_addr.addr_bytes = "\xff\xff\xff\xff\xff\xff",
@@ -68,149 +78,226 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 		},
 		.ip = {
 			.version_ihl = RTE_IPV4_VHL_DEF,
-			.total_length = rte_cpu_to_be_16(plen),
-			.time_to_live = IPDEFTTL,
-			.next_proto_id = IPPROTO_RAW,
+			.time_to_live = 1,
+			.next_proto_id = IPPROTO_UDP,
 			.src_addr = rte_cpu_to_be_32(RTE_IPV4_LOOPBACK),
 			.dst_addr = rte_cpu_to_be_32(RTE_IPV4_BROADCAST),
-		}
+		},
+		.udp = {
+			.dst_port = rte_cpu_to_be_16(9), /* Discard port */
+		},
 	};
 
 	memset(dm, 0, sizeof(*dm));
 	dummy_mbuf_prep(&dm->mb[0], dm->buf[0], sizeof(dm->buf[0]), plen);
 
 	rte_eth_random_addr(pkt.eth.src_addr.addr_bytes);
-	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, RTE_MIN(sizeof(pkt), plen));
+	plen -= sizeof(struct rte_ether_hdr);
+
+	pkt.ip.total_length = rte_cpu_to_be_16(plen);
+	pkt.ip.hdr_checksum = rte_ipv4_cksum(&pkt.ip);
+
+	plen -= sizeof(struct rte_ipv4_hdr);
+	pkt.udp.src_port = rte_rand();
+	pkt.udp.dgram_len = rte_cpu_to_be_16(plen);
+
+	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, sizeof(pkt));
 }
 
-static int
-test_setup(void)
+/*
+ * Make a timestamp value as used by PCAPNG file format
+ * The library uses nanosecond time resolution so this is
+ * time elapsed since 1970-01-01 00:00:00 UTC.
+ *
+ * Use the same way of calculating as in pdump library.
+ */
+static struct {
+	uint64_t offset_ns;	/* ns since 1/1/1970 when initialized */
+	uint64_t tsc_base;	/* TSC when initialized */
+	uint64_t tsc_hz;	/* copy of rte_tsc_hz() */
+	struct rte_reciprocal_u64 tsc_hz_inverse; /* inverse of tsc_hz */
+} time_base;
+
+static void timestamp_init(void)
 {
-	int tmp_fd;
+	struct timespec ts;
+	uint64_t cycles;
 
-	port_id = rte_eth_find_next(0);
-	if (port_id >= RTE_MAX_ETHPORTS) {
-		fprintf(stderr, "No valid Ether port\n");
-		return -1;
-	}
+	/* Compute time base offsets */
+	cycles = rte_get_tsc_cycles();
+	clock_gettime(CLOCK_REALTIME, &ts);
 
-	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
-	if (tmp_fd == -1) {
-		perror("mkstemps() failure");
-		return -1;
-	}
-	printf("pcapng: output file %s\n", file_name);
+	/* put initial TSC value in middle of clock_gettime() call */
+	time_base.tsc_base = (cycles + rte_get_tsc_cycles()) / 2;
+	time_base.offset_ns = rte_timespec_to_ns(&ts);
 
-	/* open a test capture file */
-	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
-	if (pcapng == NULL) {
-		fprintf(stderr, "rte_pcapng_fdopen failed\n");
-		close(tmp_fd);
-		return -1;
-	}
+	time_base.tsc_hz = rte_get_tsc_hz();
+	time_base.tsc_hz_inverse = rte_reciprocal_value_u64(time_base.tsc_hz);
+}
 
-	/* Add interface to the file */
-	if (rte_pcapng_add_interface(pcapng, port_id,
-				     NULL, NULL, NULL) != 0) {
-		fprintf(stderr, "can not add port %u\n", port_id);
-		return -1;
+static int
+test_setup(void)
+{
+	port_id = rte_eth_dev_count_avail();
+
+	/* Make a dummy null device to snoop on */
+	if (rte_vdev_init(null_dev, NULL) != 0) {
+		fprintf(stderr, "Failed to create vdev '%s'\n", null_dev);
+		goto fail;
 	}
 
 	/* Make a pool for cloned packets */
-	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool", IOV_MAX + NUM_PACKETS,
-					    0, 0,
-					    rte_pcapng_mbuf_size(pkt_len),
+	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool",
+					    MAX_BURST, 0, 0,
+					    rte_pcapng_mbuf_size(pkt_len) + 128,
 					    SOCKET_ID_ANY, "ring_mp_sc");
 	if (mp == NULL) {
 		fprintf(stderr, "Cannot create mempool\n");
-		return -1;
+		goto fail;
 	}
+
+	timestamp_init();
 	return 0;
+
+fail:
+	rte_vdev_uninit(null_dev);
+	rte_mempool_free(mp);
+	return -1;
 }
 
 static int
-test_write_packets(void)
+fill_pcapng_file(rte_pcapng_t *pcapng, unsigned int num_packets)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[NUM_PACKETS] = { };
 	struct dummy_mbuf mbfs;
-	unsigned int i;
+	struct rte_mbuf *orig;
+	unsigned int burst_size;
+	unsigned int count;
 	ssize_t len;
 
 	/* make a dummy packet */
 	mbuf1_prepare(&mbfs, pkt_len);
-
-	/* clone them */
 	orig  = &mbfs.mb[0];
-	for (i = 0; i < NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
+	for (count = 0; count < num_packets; count += burst_size) {
+		struct rte_mbuf *clones[MAX_BURST];
+		unsigned int i;
+
+		/* put 1 .. MAX_BURST packets in one write call */
+		burst_size = rte_rand_max(MAX_BURST) + 1;
+		for (i = 0; i < burst_size; i++) {
+			struct rte_mbuf *mc;
+
+			mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
+					     RTE_PCAPNG_DIRECTION_IN,
+					     NULL);
+			if (mc == NULL) {
+				fprintf(stderr, "Cannot copy packet\n");
+				return -1;
+			}
+			clones[i] = mc;
+		}
+
+		/* write it to capture file */
+		len = rte_pcapng_write_packets(pcapng, clones, burst_size);
+		rte_pktmbuf_free_bulk(clones, burst_size);
+
+		if (len <= 0) {
+			fprintf(stderr, "Write of packets failed: %s\n",
+				rte_strerror(rte_errno));
 			return -1;
 		}
-		clones[i] = mc;
+
+		/* Leave a small gap between packets to test for time wrap */
+		usleep(rte_rand_max(MAX_GAP_US));
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, NUM_PACKETS);
+	return count;
+}
 
-	rte_pktmbuf_free_bulk(clones, NUM_PACKETS);
+static char *
+fmt_time(char *buf, size_t size, uint64_t ts_ns)
+{
+	time_t sec;
+	size_t len;
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
-	}
+	sec = ts_ns / NS_PER_S;
+	len = strftime(buf, size, "%X", localtime(&sec));
+	snprintf(buf + len, size - len, ".%09lu",
+		 (unsigned long)(ts_ns % NS_PER_S));
 
-	return 0;
+	return buf;
 }
 
-static int
-test_write_stats(void)
+/* Context for the pcap_loop callback */
+struct pkt_print_ctx {
+	pcap_t *pcap;
+	unsigned int count;
+};
+
+static void
+print_packet(uint64_t ts_ns, const struct rte_ether_hdr *eh, size_t len)
 {
-	ssize_t len;
+	char tbuf[128], src[64], dst[64];
 
-	/* write a statistics block */
-	len = rte_pcapng_write_stats(pcapng, port_id, NULL,
-				     0, 0, 0,
-				     NUM_PACKETS, 0);
-	if (len <= 0) {
-		fprintf(stderr, "Write of statistics failed\n");
-		return -1;
-	}
-	return 0;
+	fmt_time(tbuf, sizeof(tbuf), ts_ns);
+	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
+	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
+	printf("%s: %s -> %s type %x length %zu\n",
+	       tbuf, src, dst, rte_be_to_cpu_16(eh->ether_type), len);
 }
 
+/* Callback from pcap_loop used to validate packets in the file */
 static void
-pkt_print(u_char *user, const struct pcap_pkthdr *h,
-	  const u_char *bytes)
+parse_pcap_packet(u_char *user, const struct pcap_pkthdr *h,
+		  const u_char *bytes)
 {
-	unsigned int *countp = (unsigned int *)user;
+	struct pkt_print_ctx *ctx = (struct pkt_print_ctx *)user;
 	const struct rte_ether_hdr *eh;
-	struct tm *tm;
-	char tbuf[128], src[64], dst[64];
+	const struct rte_ipv4_hdr *ip;
+	struct timespec ts;
+	uint64_t ns, now;
 
-	tm = localtime(&h->ts.tv_sec);
-	if (tm == NULL) {
-		perror("localtime");
-		return;
+	eh = (const struct rte_ether_hdr *)bytes;
+	ip = (const struct rte_ipv4_hdr *)(eh + 1);
+
+	ctx->count += 1;
+
+	clock_gettime(CLOCK_REALTIME, &ts);
+	now = rte_timespec_to_ns(&ts);
+
+	/* The pcap library is misleading in reporting timestamp.
+	 * packet header struct gives timestamp as a timeval (ie. usec);
+	 * but the file is open in nanonsecond mode therefore
+	 * the timestamp is really in timespec (ie. nanoseconds).
+	 */
+	ns = h->ts.tv_sec * NS_PER_S + h->ts.tv_usec;
+	if (ns < time_base.offset_ns || ns > now) {
+		char tstart[128], tend[128];
+
+		fmt_time(tstart, sizeof(tstart), time_base.offset_ns);
+		fmt_time(tend, sizeof(tend), now);
+		fprintf(stderr, "Timestamp out of range [%s .. %s]\n",
+			tstart, tend);
+		goto error;
 	}
 
-	if (strftime(tbuf, sizeof(tbuf), "%X", tm) == 0) {
-		fprintf(stderr, "strftime returned 0!\n");
-		return;
+	if (!rte_is_broadcast_ether_addr(&eh->dst_addr)) {
+		fprintf(stderr, "Destination is not broadcast\n");
+		goto error;
 	}
 
-	eh = (const struct rte_ether_hdr *)bytes;
-	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
-	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
-	printf("%s.%06lu: %s -> %s type %x length %u\n",
-	       tbuf, (unsigned long)h->ts.tv_usec,
-	       src, dst, rte_be_to_cpu_16(eh->ether_type), h->len);
+	if (rte_ipv4_cksum(ip) != 0) {
+		fprintf(stderr, "Bad IPv4 checksum\n");
+		goto error;
+	}
+
+	return;		/* packet is normal */
 
-	*countp += 1;
+error:
+	print_packet(ns, eh, h->len);
+
+	/* Stop parsing at first error */
+	pcap_breakloop(ctx->pcap);
 }
 
 /*
@@ -219,78 +306,98 @@ pkt_print(u_char *user, const struct pcap_pkthdr *h,
  * but that creates an unwanted dependency.
  */
 static int
-test_validate(void)
+valid_pcapng_file(const char *file_name, unsigned int expected)
 {
 	char errbuf[PCAP_ERRBUF_SIZE];
-	unsigned int count = 0;
-	pcap_t *pcap;
+	struct pkt_print_ctx ctx;
 	int ret;
 
-	pcap = pcap_open_offline(file_name, errbuf);
-	if (pcap == NULL) {
+	ctx.count = 0;
+	ctx.pcap = pcap_open_offline_with_tstamp_precision(file_name,
+							   PCAP_TSTAMP_PRECISION_NANO,
+							   errbuf);
+	if (ctx.pcap == NULL) {
 		fprintf(stderr, "pcap_open_offline('%s') failed: %s\n",
 			file_name, errbuf);
 		return -1;
 	}
 
-	ret = pcap_loop(pcap, 0, pkt_print, (u_char *)&count);
-	if (ret == 0)
-		printf("Saw %u packets\n", count);
-	else
+	ret = pcap_loop(ctx.pcap, 0, parse_pcap_packet, (u_char *)&ctx);
+	if (ret != 0) {
 		fprintf(stderr, "pcap_dispatch: failed: %s\n",
-			pcap_geterr(pcap));
-	pcap_close(pcap);
+			pcap_geterr(ctx.pcap));
+	} else if (ctx.count != expected) {
+		printf("Only %u packets, expected %u\n",
+		       ctx.count, expected);
+		ret = -1;
+	}
+
+	pcap_close(ctx.pcap);
 
 	return ret;
 }
 
 static int
-test_write_over_limit_iov_max(void)
+test_write_packets(void)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[IOV_MAX + NUM_PACKETS] = { };
-	struct dummy_mbuf mbfs;
-	unsigned int i;
-	ssize_t len;
-
-	/* make a dummy packet */
-	mbuf1_prepare(&mbfs, pkt_len);
+	char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	static rte_pcapng_t *pcapng;
+	int ret, tmp_fd, count;
 
-	/* clone them */
-	orig  = &mbfs.mb[0];
-	for (i = 0; i < IOV_MAX + NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
+	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		goto fail;
+	}
+	printf("pcapng: output file %s\n", file_name);
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
-			return -1;
-		}
-		clones[i] = mc;
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		goto fail;
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, IOV_MAX + NUM_PACKETS);
+	/* Add interface to the file */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u\n", port_id);
+		goto fail;
+	}
 
-	rte_pktmbuf_free_bulk(clones, IOV_MAX + NUM_PACKETS);
+	count = fill_pcapng_file(pcapng, TOTAL_PACKETS);
+	if (count < 0)
+		goto fail;
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
+	/* write a statistics block */
+	ret = rte_pcapng_write_stats(pcapng, port_id,
+				     count, 0, "end of test");
+	if (ret <= 0) {
+		fprintf(stderr, "Write of statistics failed\n");
+		goto fail;
 	}
 
-	return 0;
+	rte_pcapng_close(pcapng);
+
+	ret = valid_pcapng_file(file_name, count);
+	/* if test fails want to investigate the file */
+	if (ret == 0)
+		unlink(file_name);
+
+	return ret;
+
+fail:
+	rte_pcapng_close(pcapng);
+	return -1;
 }
 
 static void
 test_cleanup(void)
 {
 	rte_mempool_free(mp);
-
-	if (pcapng)
-		rte_pcapng_close(pcapng);
-
+	rte_vdev_uninit(null_dev);
 }
 
 static struct
@@ -300,9 +407,6 @@ unit_test_suite test_pcapng_suite  = {
 	.suite_name = "Test Pcapng Unit Test Suite",
 	.unit_test_cases = {
 		TEST_CASE(test_write_packets),
-		TEST_CASE(test_write_stats),
-		TEST_CASE(test_validate),
-		TEST_CASE(test_write_over_limit_iov_max),
 		TEST_CASES_END()
 	}
 };
@@ -313,4 +417,4 @@ test_pcapng(void)
 	return unit_test_suite_runner(&test_pcapng_suite);
 }
 
-REGISTER_TEST_COMMAND(pcapng_autotest, test_pcapng);
+REGISTER_FAST_TEST(pcapng_autotest, true, true, test_pcapng);
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/4] pcapng: move timestamp calculation into pdump
  2023-10-04 17:13     ` Stephen Hemminger
@ 2023-10-06  9:10       ` David Marchand
  2023-10-06 14:59         ` Kevin Traynor
  0 siblings, 1 reply; 61+ messages in thread
From: David Marchand @ 2023-10-06  9:10 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, Reshma Pattan, Quentin Armitage, Thomas Monjalon, Kevin Traynor

On Wed, Oct 4, 2023 at 7:13 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Mon, 2 Oct 2023 10:15:25 +0200
> David Marchand <david.marchand@redhat.com> wrote:
>
> > >
> >
> > Bugzilla ID: 1291 ?
> >
> > This patch (and patch 3) updates some pcapng API, is it worth a RN update?
> >
> > > Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
> >
> > Is it worth backporting?
> > I would say no, as some API update was needed to fix the issue.
> > But on the other hand, this is an experimental API, so I prefer to ask.
> >
> >
> > > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>
> Good question.
> Is experimental API allowed to change in a stable release?

I don't think this is clearly described in our ABI policy.
An experimental API may be changed at any time, but nothing is said
wrt backports.

Breaking an API is always a pain, and for a LTS release it would
probably be badly accepted by users.

Cc: Kevin for his opinion.

We may need a clarification on this topic in the doc.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/4] pcapng: move timestamp calculation into pdump
  2023-10-06  9:10       ` David Marchand
@ 2023-10-06 14:59         ` Kevin Traynor
  0 siblings, 0 replies; 61+ messages in thread
From: Kevin Traynor @ 2023-10-06 14:59 UTC (permalink / raw)
  To: David Marchand, Stephen Hemminger
  Cc: dev, Reshma Pattan, Quentin Armitage, Thomas Monjalon,
	Luca Boccassi, Xueming(Steven) Li, Christian Ehrhardt

On 06/10/2023 10:10, David Marchand wrote:
> On Wed, Oct 4, 2023 at 7:13 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>>
>> On Mon, 2 Oct 2023 10:15:25 +0200
>> David Marchand <david.marchand@redhat.com> wrote:
>>
>>>>
>>>
>>> Bugzilla ID: 1291 ?
>>>
>>> This patch (and patch 3) updates some pcapng API, is it worth a RN update?
>>>
>>>> Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
>>>
>>> Is it worth backporting?
>>> I would say no, as some API update was needed to fix the issue.
>>> But on the other hand, this is an experimental API, so I prefer to ask.
>>>
>>>
>>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>>
>> Good question.
>> Is experimental API allowed to change in a stable release?
> 
> I don't think this is clearly described in our ABI policy.
> An experimental API may be changed at any time, but nothing is said
> wrt backports.
> 
> Breaking an API is always a pain, and for a LTS release it would
> probably be badly accepted by users.
> 

yes, I agree. IIRC, this arose sometime in the past with a branch that 
Luca was maintaining and I think the consensus among LTS maintainers was 
not to change experimental API on stable branches.

> Cc: Kevin for his opinion.
> 
> We may need a clarification on this topic in the doc.
> 
> 

Perhaps it's not a "rule" since experimental API comes with no
guarantee, but I can add something to the docs saying that it is a
guideline not to break experimental API on a stable branch.


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 2/4] dumpcap: allow multiple invocations
  2023-09-21  6:22   ` Morten Brørup
  2023-09-21  7:10     ` Isaac Boukris
@ 2023-11-07  2:34     ` Stephen Hemminger
  1 sibling, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-07  2:34 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev, Isaac Boukris, Reshma Pattan

On Thu, 21 Sep 2023 08:22:12 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:

> I'm not sure getpid() is available on Windows. How about:
> 
> #ifdef _WIN32
> #include <processthreadsapi.h> // With the headers, not here.
> "dumpcap-%lu", GetCurrentProcessId());
> #else
> "dumpcap-%u", getpid());
> #endif


Dumpcap doesn't support Windows because there are lots of things about
the pdump library that won't work on Windows.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v3 0/5] dumpcap and pcapng fixes
  2023-09-21  4:23 [PATCH 0/4] pcapng fixes Stephen Hemminger
                   ` (4 preceding siblings ...)
  2023-10-05 23:06 ` [PATCH v2 0/4] dumpcap and pcapng fixes Stephen Hemminger
@ 2023-11-08 18:35 ` Stephen Hemminger
  2023-11-08 18:35   ` [PATCH v3 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
                     ` (4 more replies)
  2023-11-09 17:34 ` [PATCH v4 0/5] dumpcap and pcapng fixes Stephen Hemminger
                   ` (3 subsequent siblings)
  9 siblings, 5 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-08 18:35 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

This has bugfixes and tests for dumpcap and pcapng.
It should be in 23.11 but seems to have been ignored.

It fixes issues related to timestamping. The design choices are
to maximize performance in the primary process and to do
all the time adjustment in the secondary (dumpcap), since
dumpcap needs to make system calls anyway to write the result.

This patch set moves the adjustment calculation
into the pcapng portion that opens the output file.
All details of the timestamp format are contained inside
pcapng (data hiding).

v3 - don't use alloca() since it can have VLA-type issues

Stephen Hemminger (5):
  pdump: fix setting rte_errno on mp error
  dumpcap: allow multiple invocations
  pcapng: modify timestamp calculation
  pcapng: avoid using alloca()
  test: cleanups to pcapng test

 app/dumpcap/main.c      |  53 ++---
 app/test/meson.build    |   2 +-
 app/test/test_pcapng.c  | 418 +++++++++++++++++++++++++++-------------
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 153 ++++++---------
 lib/pcapng/rte_pcapng.h |  19 +-
 lib/pdump/rte_pdump.c   |   9 +-
 7 files changed, 371 insertions(+), 285 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v3 1/5] pdump: fix setting rte_errno on mp error
  2023-11-08 18:35 ` [PATCH v3 0/5] dumpcap and pcapng fixes Stephen Hemminger
@ 2023-11-08 18:35   ` Stephen Hemminger
  2023-11-09  7:34     ` Morten Brørup
  2023-11-08 18:35   ` [PATCH v3 2/5] dumpcap: allow multiple invocations Stephen Hemminger
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-08 18:35 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

The response from MP server sets err_value to negative
on error. The convention for rte_errno is to use a positive
value on error. This makes errors like duplicate registration
show up with the correct error value.

Fixes: 660098d61f57 ("pdump: use generic multi-process channel")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/pdump/rte_pdump.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index 80b90c6f7d03..e94f49e21250 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -564,9 +564,10 @@ pdump_prepare_client_request(const char *device, uint16_t queue,
 	if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) == 0) {
 		mp_rep = &mp_reply.msgs[0];
 		resp = (struct pdump_response *)mp_rep->param;
-		rte_errno = resp->err_value;
-		if (!resp->err_value)
+		if (resp->err_value == 0)
 			ret = 0;
+		else
+			rte_errno = -resp->err_value;
 		free(mp_reply.msgs);
 	}
 
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v3 2/5] dumpcap: allow multiple invocations
  2023-11-08 18:35 ` [PATCH v3 0/5] dumpcap and pcapng fixes Stephen Hemminger
  2023-11-08 18:35   ` [PATCH v3 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
@ 2023-11-08 18:35   ` Stephen Hemminger
  2023-11-09  7:50     ` Morten Brørup
  2023-11-08 18:35   ` [PATCH v3 3/5] pcapng: modify timestamp calculation Stephen Hemminger
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-08 18:35 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Isaac Boukris

If dumpcap is run twice with each instance pointing at a different
interface, it would fail because of overlap in ring and pool names.
Fix by putting the process id in the name.
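
A minimal sketch of the naming idea, assuming a hypothetical helper
`make_instance_name()` that is not in the patch. It also checks for
truncation, which matters when the destination is a fixed-size name
buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Format "<prefix>-<pid>" into buf.
 * Returns 0 on success, -1 if the name would not fit in size bytes.
 */
static int
make_instance_name(char *buf, size_t size, const char *prefix, pid_t pid)
{
	int n;

	assert(buf != NULL && prefix != NULL);

	n = snprintf(buf, size, "%s-%ld", prefix, (long)pid);
	if (n < 0 || (size_t)n >= size)
		return -1;

	return 0;
}
```

A caller would typically pass getpid() as the last argument, so that
two concurrent invocations end up with distinct names.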

Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
Reported-by: Isaac Boukris <iboukris@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/dumpcap/main.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 64294bbfb3e6..37754fd06f4f 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -44,7 +44,6 @@
 #include <pcap/pcap.h>
 #include <pcap/bpf.h>
 
-#define RING_NAME "capture-ring"
 #define MONITOR_INTERVAL  (500 * 1000)
 #define MBUF_POOL_CACHE_SIZE 32
 #define BURST_SIZE 32
@@ -647,6 +646,7 @@ static void dpdk_init(void)
 static struct rte_ring *create_ring(void)
 {
 	struct rte_ring *ring;
+	char ring_name[RTE_RING_NAMESIZE];
 	size_t size, log2;
 
 	/* Find next power of 2 >= size. */
@@ -660,28 +660,28 @@ static struct rte_ring *create_ring(void)
 		ring_size = size;
 	}
 
-	ring = rte_ring_lookup(RING_NAME);
-	if (ring == NULL) {
-		ring = rte_ring_create(RING_NAME, ring_size,
-					rte_socket_id(), 0);
-		if (ring == NULL)
-			rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
-				 rte_strerror(rte_errno));
-	}
+	/* Want one ring per invocation of program */
+	snprintf(ring_name, sizeof(ring_name),
+		 "dumpcap-%u", getpid());
+
+	ring = rte_ring_create(ring_name, ring_size,
+			       rte_socket_id(), 0);
+	if (ring == NULL)
+		rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
+			 rte_strerror(rte_errno));
+
 	return ring;
 }
 
 static struct rte_mempool *create_mempool(void)
 {
 	const struct interface *intf;
-	static const char pool_name[] = "capture_mbufs";
+	char pool_name[RTE_MEMPOOL_NAMESIZE];
 	size_t num_mbufs = 2 * ring_size;
 	struct rte_mempool *mp;
 	uint32_t data_size = 128;
 
-	mp = rte_mempool_lookup(pool_name);
-	if (mp)
-		return mp;
+	snprintf(pool_name, sizeof(pool_name), "capture_%u", getpid());
 
 	/* Common pool so size mbuf for biggest snap length */
 	TAILQ_FOREACH(intf, &interfaces, next) {
@@ -826,7 +826,7 @@ static void enable_pdump(struct rte_ring *r, struct rte_mempool *mp)
 			rte_exit(EXIT_FAILURE,
 				"Packet dump enable on %u:%s failed %s\n",
 				intf->port, intf->name,
-				rte_strerror(-ret));
+				rte_strerror(rte_errno));
 		}
 
 		if (intf->opts.promisc_mode) {
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v3 3/5] pcapng: modify timestamp calculation
  2023-11-08 18:35 ` [PATCH v3 0/5] dumpcap and pcapng fixes Stephen Hemminger
  2023-11-08 18:35   ` [PATCH v3 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
  2023-11-08 18:35   ` [PATCH v3 2/5] dumpcap: allow multiple invocations Stephen Hemminger
@ 2023-11-08 18:35   ` Stephen Hemminger
  2023-11-09  7:57     ` Morten Brørup
  2023-11-08 18:35   ` [PATCH v3 4/5] pcapng: avoid using alloca() Stephen Hemminger
  2023-11-08 18:35   ` [PATCH v3 5/5] test: cleanups to pcapng test Stephen Hemminger
  4 siblings, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-08 18:35 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

The computation of the timestamp is best done in the part of
the pcapng library that runs in the secondary process.
The secondary process is already doing a bunch of system
calls, which makes it not performance sensitive.

Simplify the computation of nanoseconds from TSC to a two-step
process which avoids numeric overflow issues. The previous
code was also not thread safe.
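
The two-step conversion can be sketched in isolation as below. This is
an illustration only: `cycles_to_ns()` is a hypothetical name, and the
TSC frequency is passed in as a parameter rather than read from
rte_get_tsc_hz() so that the sketch builds without DPDK.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_S UINT64_C(1000000000)

/*
 * Convert a TSC cycle delta to nanoseconds.
 * Multiplying delta by NS_PER_S directly overflows 64 bits once delta
 * exceeds about 18.4e9 cycles, which is only a few seconds at GHz
 * rates. Splitting into whole seconds and a remainder avoids that.
 */
static uint64_t
cycles_to_ns(uint64_t delta, uint64_t hz)
{
	uint64_t secs, rem;

	assert(hz != 0);

	secs = delta / hz;
	rem = delta % hz;	/* rem < hz, so rem * NS_PER_S fits below ~18.4 GHz */

	return secs * NS_PER_S + (rem * NS_PER_S) / hz;
}
```

For example, at an assumed 3 GHz TSC a delta of 300000000000 cycles
(100 seconds) converts cleanly, whereas the single-multiply form would
already have wrapped.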

Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/dumpcap/main.c      |  25 +++------
 app/test/test_pcapng.c  |   4 +-
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 119 +++++++++++++++-------------------------
 lib/pcapng/rte_pcapng.h |  19 ++-----
 lib/pdump/rte_pdump.c   |   4 +-
 6 files changed, 61 insertions(+), 112 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 37754fd06f4f..764dac6c37c0 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -66,13 +66,13 @@ static bool print_stats;
 
 /* capture limit options */
 static struct {
-	uint64_t  duration;	/* nanoseconds */
+	time_t  duration;	/* seconds */
 	unsigned long packets;  /* number of packets in file */
 	size_t size;		/* file size (bytes) */
 } stop;
 
 /* Running state */
-static uint64_t start_time, end_time;
+static time_t start_time;
 static uint64_t packets_received;
 static size_t file_size;
 
@@ -197,7 +197,7 @@ static void auto_stop(char *opt)
 		if (*value == '\0' || *endp != '\0' || interval <= 0)
 			rte_exit(EXIT_FAILURE,
 				 "Invalid duration \"%s\"\n", value);
-		stop.duration = NSEC_PER_SEC * interval;
+		stop.duration = interval;
 	} else if (strcmp(opt, "filesize") == 0) {
 		stop.size = get_uint(value, "filesize", 0) * 1024;
 	} else if (strcmp(opt, "packets") == 0) {
@@ -511,15 +511,6 @@ static void statistics_loop(void)
 	}
 }
 
-/* Return the time since 1/1/1970 in nanoseconds */
-static uint64_t create_timestamp(void)
-{
-	struct timespec now;
-
-	clock_gettime(CLOCK_MONOTONIC, &now);
-	return rte_timespec_to_ns(&now);
-}
-
 static void
 cleanup_pdump_resources(void)
 {
@@ -589,9 +580,8 @@ report_packet_stats(dumpcap_out_t out)
 		ifdrop = pdump_stats.nombuf + pdump_stats.ringfull;
 
 		if (use_pcapng)
-			rte_pcapng_write_stats(out.pcapng, intf->port, NULL,
-					       start_time, end_time,
-					       ifrecv, ifdrop);
+			rte_pcapng_write_stats(out.pcapng, intf->port,
+					       ifrecv, ifdrop, NULL);
 
 		if (ifrecv == 0)
 			percent = 0;
@@ -983,7 +973,7 @@ int main(int argc, char **argv)
 	mp = create_mempool();
 	out = create_output();
 
-	start_time = create_timestamp();
+	start_time = time(NULL);
 	enable_pdump(r, mp);
 
 	if (!quiet) {
@@ -1005,11 +995,10 @@ int main(int argc, char **argv)
 			break;
 
 		if (stop.duration != 0 &&
-		    create_timestamp() - start_time > stop.duration)
+		    time(NULL) - start_time > stop.duration)
 			break;
 	}
 
-	end_time = create_timestamp();
 	disable_primary_monitor();
 
 	if (rte_eal_primary_proc_alive(NULL))
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index b8429a02f160..55aa2cf93666 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -173,8 +173,8 @@ test_write_stats(void)
 	ssize_t len;
 
 	/* write a statistics block */
-	len = rte_pcapng_write_stats(pcapng, port_id,
-				     NULL, 0, 0,
+	len = rte_pcapng_write_stats(pcapng, port_id, NULL,
+				     0, 0, 0,
 				     NUM_PACKETS, 0);
 	if (len <= 0) {
 		fprintf(stderr, "Write of statistics failed\n");
diff --git a/lib/graph/graph_pcap.c b/lib/graph/graph_pcap.c
index db722c375fa7..89525f1220ca 100644
--- a/lib/graph/graph_pcap.c
+++ b/lib/graph/graph_pcap.c
@@ -214,7 +214,7 @@ graph_pcap_dispatch(struct rte_graph *graph,
 		mbuf = (struct rte_mbuf *)objs[i];
 
 		mc = rte_pcapng_copy(mbuf->port, 0, mbuf, pkt_mp, mbuf->pkt_len,
-				     rte_get_tsc_cycles(), 0, buffer);
+				     0, buffer);
 		if (mc == NULL)
 			break;
 
diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 3c91fc77644a..13fd2b97fb80 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -36,22 +36,14 @@
 /* Format of the capture file handle */
 struct rte_pcapng {
 	int  outfd;		/* output file */
-
 	unsigned int ports;	/* number of interfaces added */
+	uint64_t offset_ns;	/* ns since 1/1/1970 when initialized */
+	uint64_t tsc_base;	/* TSC when started */
 
 	/* DPDK port id to interface index in file */
 	uint32_t port_index[RTE_MAX_ETHPORTS];
 };
 
-/* For converting TSC cycles to PCAPNG ns format */
-static struct pcapng_time {
-	uint64_t ns;
-	uint64_t cycles;
-	uint64_t tsc_hz;
-	struct rte_reciprocal_u64 tsc_hz_inverse;
-} pcapng_time;
-
-
 #ifdef RTE_EXEC_ENV_WINDOWS
 /*
  * Windows does not have writev() call.
@@ -102,56 +94,21 @@ static ssize_t writev(int fd, const struct iovec *iov, int iovcnt)
 #define if_indextoname(ifindex, ifname) NULL
 #endif
 
-static inline void
-pcapng_init(void)
+/* Convert from TSC (CPU cycles) to nanoseconds */
+static uint64_t
+pcapng_timestamp(const rte_pcapng_t *self, uint64_t cycles)
 {
-	struct timespec ts;
+	uint64_t delta, rem, secs, ns;
+	const uint64_t hz = rte_get_tsc_hz();
 
-	pcapng_time.cycles = rte_get_tsc_cycles();
-	clock_gettime(CLOCK_REALTIME, &ts);
-	pcapng_time.cycles = (pcapng_time.cycles + rte_get_tsc_cycles()) / 2;
-	pcapng_time.ns = rte_timespec_to_ns(&ts);
-
-	pcapng_time.tsc_hz = rte_get_tsc_hz();
-	pcapng_time.tsc_hz_inverse = rte_reciprocal_value_u64(pcapng_time.tsc_hz);
-}
+	delta = cycles - self->tsc_base;
 
-/* PCAPNG timestamps are in nanoseconds */
-static uint64_t pcapng_tsc_to_ns(uint64_t cycles)
-{
-	uint64_t delta, secs;
-
-	if (!pcapng_time.tsc_hz)
-		pcapng_init();
-
-	/* In essence the calculation is:
-	 *   delta = (cycles - pcapng_time.cycles) * NSEC_PRE_SEC / rte_get_tsc_hz()
-	 * but this overflows within 4 to 8 seconds depending on TSC frequency.
-	 * Instead, if delta >= pcapng_time.tsc_hz:
-	 *   Increase pcapng_time.ns and pcapng_time.cycles by the number of
-	 *   whole seconds in delta and reduce delta accordingly.
-	 * delta will therefore always lie in the interval [0, pcapng_time.tsc_hz),
-	 * which will not overflow when multiplied by NSEC_PER_SEC provided the
-	 * TSC frequency < approx 18.4GHz.
-	 *
-	 * Currently all TSCs operate below 5GHz.
-	 */
-	delta = cycles - pcapng_time.cycles;
-	if (unlikely(delta >= pcapng_time.tsc_hz)) {
-		if (likely(delta < pcapng_time.tsc_hz * 2)) {
-			delta -= pcapng_time.tsc_hz;
-			pcapng_time.cycles += pcapng_time.tsc_hz;
-			pcapng_time.ns += NSEC_PER_SEC;
-		} else {
-			secs = rte_reciprocal_divide_u64(delta, &pcapng_time.tsc_hz_inverse);
-			delta -= secs * pcapng_time.tsc_hz;
-			pcapng_time.cycles += secs * pcapng_time.tsc_hz;
-			pcapng_time.ns += secs * NSEC_PER_SEC;
-		}
-	}
+	/* Avoid numeric wraparound by computing seconds first */
+	secs = delta / hz;
+	rem = delta % hz;
+	ns = (rem * NS_PER_S) / hz;
 
-	return pcapng_time.ns + rte_reciprocal_divide_u64(delta * NSEC_PER_SEC,
-							  &pcapng_time.tsc_hz_inverse);
+	return secs * NS_PER_S + ns + self->offset_ns;
 }
 
 /* length of option including padding */
@@ -368,15 +325,15 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
  */
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop)
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment)
 {
 	struct pcapng_statistics *hdr;
 	struct pcapng_option *opt;
+	uint64_t start_time = self->offset_ns;
+	uint64_t sample_time;
 	uint32_t optlen, len;
 	uint8_t *buf;
-	uint64_t ns;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -386,10 +343,10 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 		optlen += pcapng_optlen(sizeof(ifrecv));
 	if (ifdrop != UINT64_MAX)
 		optlen += pcapng_optlen(sizeof(ifdrop));
+
 	if (start_time != 0)
 		optlen += pcapng_optlen(sizeof(start_time));
-	if (end_time != 0)
-		optlen += pcapng_optlen(sizeof(end_time));
+
 	if (comment)
 		optlen += pcapng_optlen(strlen(comment));
 	if (optlen != 0)
@@ -409,9 +366,6 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	if (start_time != 0)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_STARTTIME,
 					 &start_time, sizeof(start_time));
-	if (end_time != 0)
-		opt = pcapng_add_option(opt, PCAPNG_ISB_ENDTIME,
-					 &end_time, sizeof(end_time));
 	if (ifrecv != UINT64_MAX)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_IFRECV,
 				&ifrecv, sizeof(ifrecv));
@@ -425,9 +379,9 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	hdr->block_length = len;
 	hdr->interface_id = self->port_index[port_id];
 
-	ns = pcapng_tsc_to_ns(rte_get_tsc_cycles());
-	hdr->timestamp_hi = ns >> 32;
-	hdr->timestamp_lo = (uint32_t)ns;
+	sample_time = pcapng_timestamp(self, rte_get_tsc_cycles());
+	hdr->timestamp_hi = sample_time >> 32;
+	hdr->timestamp_lo = (uint32_t)sample_time;
 
 	/* clone block_length after option */
 	memcpy(opt, &len, sizeof(uint32_t));
@@ -520,23 +474,21 @@ struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *md,
 		struct rte_mempool *mp,
-		uint32_t length, uint64_t cycles,
+		uint32_t length,
 		enum rte_pcapng_direction direction,
 		const char *comment)
 {
 	struct pcapng_enhance_packet_block *epb;
 	uint32_t orig_len, data_len, padding, flags;
 	struct pcapng_option *opt;
+	uint64_t timestamp;
 	uint16_t optlen;
 	struct rte_mbuf *mc;
-	uint64_t ns;
 	bool rss_hash;
 
 #ifdef RTE_LIBRTE_ETHDEV_DEBUG
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
 #endif
-	ns = pcapng_tsc_to_ns(cycles);
-
 	orig_len = rte_pktmbuf_pkt_len(md);
 
 	/* Take snapshot of the data */
@@ -641,8 +593,10 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	/* Interface index is filled in later during write */
 	mc->port = port_id;
 
-	epb->timestamp_hi = ns >> 32;
-	epb->timestamp_lo = (uint32_t)ns;
+	/* Put timestamp in cycles here - adjust in packet write */
+	timestamp = rte_get_tsc_cycles();
+	epb->timestamp_hi = timestamp >> 32;
+	epb->timestamp_lo = (uint32_t)timestamp;
 	epb->capture_length = data_len;
 	epb->original_length = orig_len;
 
@@ -668,6 +622,7 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 	for (i = 0; i < nb_pkts; i++) {
 		struct rte_mbuf *m = pkts[i];
 		struct pcapng_enhance_packet_block *epb;
+		uint64_t cycles, timestamp;
 
 		/* sanity check that is really a pcapng mbuf */
 		epb = rte_pktmbuf_mtod(m, struct pcapng_enhance_packet_block *);
@@ -684,6 +639,13 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 			return -1;
 		}
 
+		/* adjust timestamp recorded in packet */
+		cycles = (uint64_t)epb->timestamp_hi << 32;
+		cycles += epb->timestamp_lo;
+		timestamp = pcapng_timestamp(self, cycles);
+		epb->timestamp_hi = timestamp >> 32;
+		epb->timestamp_lo = (uint32_t)timestamp;
+
 		/*
 		 * Handle case of highly fragmented and large burst size
 		 * Note: this assumes that max segments per mbuf < IOV_MAX
@@ -725,6 +687,8 @@ rte_pcapng_fdopen(int fd,
 {
 	unsigned int i;
 	rte_pcapng_t *self;
+	struct timespec ts;
+	uint64_t cycles;
 
 	self = malloc(sizeof(*self));
 	if (!self) {
@@ -734,6 +698,13 @@ rte_pcapng_fdopen(int fd,
 
 	self->outfd = fd;
 	self->ports = 0;
+
+	/* record start time in ns since 1/1/1970 */
+	cycles = rte_get_tsc_cycles();
+	clock_gettime(CLOCK_REALTIME, &ts);
+	self->tsc_base = (cycles + rte_get_tsc_cycles()) / 2;
+	self->offset_ns = rte_timespec_to_ns(&ts);
+
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++)
 		self->port_index[i] = UINT32_MAX;
 
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index d93cc9f73ad5..c40795c721de 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -121,8 +121,6 @@ enum rte_pcapng_direction {
  * @param length
  *   The upper limit on bytes to copy.  Passing UINT32_MAX
  *   means all data (after offset).
- * @param timestamp
- *   The timestamp in TSC cycles.
  * @param direction
  *   The direction of the packer: receive, transmit or unknown.
  * @param comment
@@ -136,7 +134,7 @@ __rte_experimental
 struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *m, struct rte_mempool *mp,
-		uint32_t length, uint64_t timestamp,
+		uint32_t length,
 		enum rte_pcapng_direction direction, const char *comment);
 
 
@@ -188,29 +186,22 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
  *  The handle to the packet capture file
  * @param port
  *  The Ethernet port to report stats on.
- * @param comment
- *   Optional comment to add to statistics.
- * @param start_time
- *  The time when packet capture was started in nanoseconds.
- *  Optional: can be zero if not known.
- * @param end_time
- *  The time when packet capture was stopped in nanoseconds.
- *  Optional: can be zero if not finished;
  * @param ifrecv
  *  The number of packets received by capture.
  *  Optional: use UINT64_MAX if not known.
  * @param ifdrop
  *  The number of packets missed by the capture process.
  *  Optional: use UINT64_MAX if not known.
+ * @param comment
+ *  Optional comment to add to statistics.
  * @return
  *  number of bytes written to file, -1 on failure to write file
  */
 __rte_experimental
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop);
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment);
 
 #ifdef __cplusplus
 }
diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index e94f49e21250..5a1ec14d7a18 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -90,7 +90,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	int ring_enq;
 	uint16_t d_pkts = 0;
 	struct rte_mbuf *dup_bufs[nb_pkts];
-	uint64_t ts;
 	struct rte_ring *ring;
 	struct rte_mempool *mp;
 	struct rte_mbuf *p;
@@ -99,7 +98,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	if (cbs->filter)
 		rte_bpf_exec_burst(cbs->filter, (void **)pkts, rcs, nb_pkts);
 
-	ts = rte_get_tsc_cycles();
 	ring = cbs->ring;
 	mp = cbs->mp;
 	for (i = 0; i < nb_pkts; i++) {
@@ -122,7 +120,7 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 		if (cbs->ver == V2)
 			p = rte_pcapng_copy(port_id, queue,
 					    pkts[i], mp, cbs->snaplen,
-					    ts, direction, NULL);
+					    direction, NULL);
 		else
 			p = rte_pktmbuf_copy(pkts[i], mp, 0, cbs->snaplen);
 
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v3 4/5] pcapng: avoid using alloca()
  2023-11-08 18:35 ` [PATCH v3 0/5] dumpcap and pcapng fixes Stephen Hemminger
                     ` (2 preceding siblings ...)
  2023-11-08 18:35   ` [PATCH v3 3/5] pcapng: modify timestamp calculation Stephen Hemminger
@ 2023-11-08 18:35   ` Stephen Hemminger
  2023-11-09  8:21     ` Morten Brørup
  2023-11-08 18:35   ` [PATCH v3 5/5] test: cleanups to pcapng test Stephen Hemminger
  4 siblings, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-08 18:35 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

The function alloca(), like VLAs, has problems if the caller
passes a large value. Instead, use a fixed-size buffer (4K),
which will be more than sufficient for the info-related blocks
in the file. Add bounds checks as well.
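
The pattern can be sketched roughly as follows. The helper
`write_comment_block()`, its simplified block layout (a 32-bit length
followed by the text) and the `BLOCK_BUF_SIZE` constant are all
assumptions made for illustration; the real pcapng blocks carry more
fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_BUF_SIZE 4096	/* hypothetical fixed limit for this sketch */

/*
 * Build a small block in a fixed-size stack buffer and write it to fd.
 * Returns the number of bytes written, or -1 if the block would not
 * fit or the write fails.
 */
static ssize_t
write_comment_block(int fd, const char *comment)
{
	uint8_t buf[BLOCK_BUF_SIZE];
	uint32_t len;
	size_t clen;

	assert(comment != NULL);

	clen = strlen(comment);
	if (clen > sizeof(buf) - sizeof(len))
		return -1;	/* reject instead of overrunning the stack */

	len = (uint32_t)(sizeof(len) + clen);
	memcpy(buf, &len, sizeof(len));
	memcpy(buf + sizeof(len), comment, clen);

	return write(fd, buf, len);
}
```

Unlike alloca(), an oversized request here fails with a defined error
return rather than depending on how much stack happens to be available.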

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 lib/pcapng/rte_pcapng.c | 34 +++++++++++++---------------------
 1 file changed, 13 insertions(+), 21 deletions(-)

diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 13fd2b97fb80..67f74d31aa32 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -140,9 +140,8 @@ pcapng_section_block(rte_pcapng_t *self,
 {
 	struct pcapng_section_header *hdr;
 	struct pcapng_option *opt;
-	void *buf;
+	uint8_t buf[BUFSIZ];
 	uint32_t len;
-	ssize_t cc;
 
 	len = sizeof(*hdr);
 	if (hw)
@@ -158,8 +157,7 @@ pcapng_section_block(rte_pcapng_t *self,
 	len += pcapng_optlen(0);
 	len += sizeof(uint32_t);
 
-	buf = calloc(1, len);
-	if (!buf)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_section_header *)buf;
@@ -193,10 +191,7 @@ pcapng_section_block(rte_pcapng_t *self,
 	/* clone block_length after option */
 	memcpy(opt, &hdr->block_length, sizeof(uint32_t));
 
-	cc = write(self->outfd, buf, len);
-	free(buf);
-
-	return cc;
+	return write(self->outfd, buf, len);
 }
 
 /* Write an interface block for a DPDK port */
@@ -213,7 +208,7 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 	struct pcapng_option *opt;
 	const uint8_t tsresol = 9;	/* nanosecond resolution */
 	uint32_t len;
-	void *buf;
+	uint8_t buf[BUFSIZ];
 	char ifname_buf[IF_NAMESIZE];
 	char ifhw[256];
 	uint64_t speed = 0;
@@ -267,8 +262,7 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 	len += pcapng_optlen(0);
 	len += sizeof(uint32_t);
 
-	buf = alloca(len);
-	if (!buf)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_interface_block *)buf;
@@ -296,17 +290,16 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 		opt = pcapng_add_option(opt, PCAPNG_IFB_HARDWARE,
 					 ifhw, strlen(ifhw));
 	if (filter) {
-		/* Encoding is that the first octet indicates string vs BPF */
 		size_t len;
-		char *buf;
 
 		len = strlen(filter) + 1;
-		buf = alloca(len);
-		*buf = '\0';
-		memcpy(buf + 1, filter, len);
+		opt->code = PCAPNG_IFB_FILTER;
+		opt->length = len;
+		/* Encoding is that the first octet indicates string vs BPF */
+		opt->data[0] = 0;
+		memcpy(opt->data + 1, filter, strlen(filter));
 
-		opt = pcapng_add_option(opt, PCAPNG_IFB_FILTER,
-					buf, len);
+		opt = (struct pcapng_option *)((uint8_t *)opt + pcapng_optlen(len));
 	}
 
 	opt = pcapng_add_option(opt, PCAPNG_OPT_END, NULL, 0);
@@ -333,7 +326,7 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	uint64_t start_time = self->offset_ns;
 	uint64_t sample_time;
 	uint32_t optlen, len;
-	uint8_t *buf;
+	uint8_t buf[BUFSIZ];
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -353,8 +346,7 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 		optlen += pcapng_optlen(0);
 
 	len = sizeof(*hdr) + optlen + sizeof(uint32_t);
-	buf = alloca(len);
-	if (buf == NULL)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_statistics *)buf;
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v3 5/5] test: cleanups to pcapng test
  2023-11-08 18:35 ` [PATCH v3 0/5] dumpcap and pcapng fixes Stephen Hemminger
                     ` (3 preceding siblings ...)
  2023-11-08 18:35   ` [PATCH v3 4/5] pcapng: avoid using alloca() Stephen Hemminger
@ 2023-11-08 18:35   ` Stephen Hemminger
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-08 18:35 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

Overhaul of the pcapng test:
  - promote it to be a fast test so it gets regularly run.
  - create null device and use it.
  - use UDP discard packets that are valid so that for debugging
    the resulting pcapng file can be looked at with wireshark.
  - do basic checks on resulting pcap file that lengths and
    timestamps are in range.
  - add test for interface options

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/test/meson.build   |   2 +-
 app/test/test_pcapng.c | 418 +++++++++++++++++++++++++++--------------
 2 files changed, 282 insertions(+), 138 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index 4183d66b0e9c..dcc93f4a43b4 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -128,7 +128,7 @@ source_file_deps = {
     'test_metrics.c': ['metrics'],
     'test_mp_secondary.c': ['hash', 'lpm'],
     'test_net_ether.c': ['net'],
-    'test_pcapng.c': ['ethdev', 'net', 'pcapng'],
+    'test_pcapng.c': ['ethdev', 'net', 'pcapng', 'bus_vdev'],
     'test_pdcp.c': ['eventdev', 'pdcp', 'net', 'timer', 'security'],
     'test_pdump.c': ['pdump'] + sample_packet_forward_deps,
     'test_per_lcore.c': [],
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index 55aa2cf93666..c973aa47d1f8 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -6,25 +6,34 @@
 #include <stdlib.h>
 #include <unistd.h>
 
+#include <rte_bus_vdev.h>
 #include <rte_ethdev.h>
 #include <rte_ether.h>
+#include <rte_ip.h>
 #include <rte_mbuf.h>
 #include <rte_mempool.h>
 #include <rte_net.h>
 #include <rte_pcapng.h>
+#include <rte_random.h>
+#include <rte_reciprocal.h>
+#include <rte_time.h>
+#include <rte_udp.h>
 
 #include <pcap/pcap.h>
 
 #include "test.h"
 
-#define NUM_PACKETS    10
-#define DUMMY_MBUF_NUM 3
+#define PCAPNG_TEST_DEBUG 0
+
+#define TOTAL_PACKETS	4096
+#define MAX_BURST	64
+#define MAX_GAP_US	100000
+#define DUMMY_MBUF_NUM	3
 
-static rte_pcapng_t *pcapng;
 static struct rte_mempool *mp;
 static const uint32_t pkt_len = 200;
 static uint16_t port_id;
-static char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+static const char null_dev[] = "net_null0";
 
 /* first mbuf in the packet, should always be at offset 0 */
 struct dummy_mbuf {
@@ -61,6 +70,7 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 	struct {
 		struct rte_ether_hdr eth;
 		struct rte_ipv4_hdr ip;
+		struct rte_udp_hdr udp;
 	} pkt = {
 		.eth = {
 			.dst_addr.addr_bytes = "\xff\xff\xff\xff\xff\xff",
@@ -68,149 +78,201 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 		},
 		.ip = {
 			.version_ihl = RTE_IPV4_VHL_DEF,
-			.total_length = rte_cpu_to_be_16(plen),
-			.time_to_live = IPDEFTTL,
-			.next_proto_id = IPPROTO_RAW,
+			.time_to_live = 1,
+			.next_proto_id = IPPROTO_UDP,
 			.src_addr = rte_cpu_to_be_32(RTE_IPV4_LOOPBACK),
 			.dst_addr = rte_cpu_to_be_32(RTE_IPV4_BROADCAST),
-		}
+		},
+		.udp = {
+			.dst_port = rte_cpu_to_be_16(9), /* Discard port */
+		},
 	};
 
 	memset(dm, 0, sizeof(*dm));
 	dummy_mbuf_prep(&dm->mb[0], dm->buf[0], sizeof(dm->buf[0]), plen);
 
 	rte_eth_random_addr(pkt.eth.src_addr.addr_bytes);
-	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, RTE_MIN(sizeof(pkt), plen));
+	plen -= sizeof(struct rte_ether_hdr);
+
+	pkt.ip.total_length = rte_cpu_to_be_16(plen);
+	pkt.ip.hdr_checksum = rte_ipv4_cksum(&pkt.ip);
+
+	plen -= sizeof(struct rte_ipv4_hdr);
+	pkt.udp.src_port = rte_rand();
+	pkt.udp.dgram_len = rte_cpu_to_be_16(plen);
+
+	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, sizeof(pkt));
 }
 
 static int
 test_setup(void)
 {
-	int tmp_fd;
-
-	port_id = rte_eth_find_next(0);
-	if (port_id >= RTE_MAX_ETHPORTS) {
-		fprintf(stderr, "No valid Ether port\n");
-		return -1;
-	}
+	port_id = rte_eth_dev_count_avail();
 
-	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
-	if (tmp_fd == -1) {
-		perror("mkstemps() failure");
-		return -1;
-	}
-	printf("pcapng: output file %s\n", file_name);
-
-	/* open a test capture file */
-	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
-	if (pcapng == NULL) {
-		fprintf(stderr, "rte_pcapng_fdopen failed\n");
-		close(tmp_fd);
-		return -1;
-	}
-
-	/* Add interface to the file */
-	if (rte_pcapng_add_interface(pcapng, port_id,
-				     NULL, NULL, NULL) != 0) {
-		fprintf(stderr, "can not add port %u\n", port_id);
-		return -1;
+	/* Make a dummy null device to snoop on */
+	if (rte_vdev_init(null_dev, NULL) != 0) {
+		fprintf(stderr, "Failed to create vdev '%s'\n", null_dev);
+		goto fail;
 	}
 
 	/* Make a pool for cloned packets */
-	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool", IOV_MAX + NUM_PACKETS,
-					    0, 0,
-					    rte_pcapng_mbuf_size(pkt_len),
+	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool",
+					    MAX_BURST, 0, 0,
+					    rte_pcapng_mbuf_size(pkt_len) + 128,
 					    SOCKET_ID_ANY, "ring_mp_sc");
 	if (mp == NULL) {
 		fprintf(stderr, "Cannot create mempool\n");
-		return -1;
+		goto fail;
 	}
+
 	return 0;
+
+fail:
+	rte_vdev_uninit(null_dev);
+	rte_mempool_free(mp);
+	return -1;
 }
 
 static int
-test_write_packets(void)
+fill_pcapng_file(rte_pcapng_t *pcapng, unsigned int num_packets)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[NUM_PACKETS] = { };
 	struct dummy_mbuf mbfs;
-	unsigned int i;
+	struct rte_mbuf *orig;
+	unsigned int burst_size;
+	unsigned int count;
 	ssize_t len;
 
 	/* make a dummy packet */
 	mbuf1_prepare(&mbfs, pkt_len);
-
-	/* clone them */
 	orig  = &mbfs.mb[0];
-	for (i = 0; i < NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
+	for (count = 0; count < num_packets; count += burst_size) {
+		struct rte_mbuf *clones[MAX_BURST];
+		unsigned int i;
+
+		/* put 1 .. MAX_BURST packets in one write call */
+		burst_size = rte_rand_max(MAX_BURST) + 1;
+		for (i = 0; i < burst_size; i++) {
+			struct rte_mbuf *mc;
+
+			mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
+					     RTE_PCAPNG_DIRECTION_IN,
+					     NULL);
+			if (mc == NULL) {
+				fprintf(stderr, "Cannot copy packet\n");
+				return -1;
+			}
+			clones[i] = mc;
+		}
+
+		/* write it to capture file */
+		len = rte_pcapng_write_packets(pcapng, clones, burst_size);
+		rte_pktmbuf_free_bulk(clones, burst_size);
+
+		if (len <= 0) {
+			fprintf(stderr, "Write of packets failed: %s\n",
+				rte_strerror(rte_errno));
 			return -1;
 		}
-		clones[i] = mc;
+
+		/* Leave a small gap between packets to test for time wrap */
+		usleep(rte_rand_max(MAX_GAP_US));
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, NUM_PACKETS);
+	return count;
+}
 
-	rte_pktmbuf_free_bulk(clones, NUM_PACKETS);
+static char *
+fmt_time(char *buf, size_t size, uint64_t ts_ns)
+{
+	time_t sec;
+	size_t len;
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
-	}
+	sec = ts_ns / NS_PER_S;
+	len = strftime(buf, size, "%X", localtime(&sec));
+	snprintf(buf + len, size - len, ".%09lu",
+		 (unsigned long)(ts_ns % NS_PER_S));
 
-	return 0;
+	return buf;
 }
 
-static int
-test_write_stats(void)
+/* Context for the pcap_loop callback */
+struct pkt_print_ctx {
+	pcap_t *pcap;
+	unsigned int count;
+	uint64_t start_ns;
+	uint64_t end_ns;
+};
+
+static void
+print_packet(uint64_t ts_ns, const struct rte_ether_hdr *eh, size_t len)
 {
-	ssize_t len;
+	char tbuf[128], src[64], dst[64];
 
-	/* write a statistics block */
-	len = rte_pcapng_write_stats(pcapng, port_id, NULL,
-				     0, 0, 0,
-				     NUM_PACKETS, 0);
-	if (len <= 0) {
-		fprintf(stderr, "Write of statistics failed\n");
-		return -1;
-	}
-	return 0;
+	fmt_time(tbuf, sizeof(tbuf), ts_ns);
+	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
+	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
+	printf("%s: %s -> %s type %x length %zu\n",
+	       tbuf, src, dst, rte_be_to_cpu_16(eh->ether_type), len);
 }
 
+/* Callback from pcap_loop used to validate packets in the file */
 static void
-pkt_print(u_char *user, const struct pcap_pkthdr *h,
-	  const u_char *bytes)
+parse_pcap_packet(u_char *user, const struct pcap_pkthdr *h,
+		  const u_char *bytes)
 {
-	unsigned int *countp = (unsigned int *)user;
+	struct pkt_print_ctx *ctx = (struct pkt_print_ctx *)user;
 	const struct rte_ether_hdr *eh;
-	struct tm *tm;
-	char tbuf[128], src[64], dst[64];
+	const struct rte_ipv4_hdr *ip;
+	uint64_t ns;
 
-	tm = localtime(&h->ts.tv_sec);
-	if (tm == NULL) {
-		perror("localtime");
-		return;
+	eh = (const struct rte_ether_hdr *)bytes;
+	ip = (const struct rte_ipv4_hdr *)(eh + 1);
+
+	ctx->count += 1;
+
+	/* The pcap library is misleading in reporting timestamp.
+	 * packet header struct gives timestamp as a timeval (ie. usec);
+	 * but the file is open in nanonsecond mode therefore
+	 * the timestamp is really in timespec (ie. nanoseconds).
+	 */
+	ns = h->ts.tv_sec * NS_PER_S + h->ts.tv_usec;
+	if (ns < ctx->start_ns || ns > ctx->end_ns) {
+		char tstart[128], tend[128];
+
+		fmt_time(tstart, sizeof(tstart), ctx->start_ns);
+		fmt_time(tend, sizeof(tend), ctx->end_ns);
+		fprintf(stderr, "Timestamp out of range [%s .. %s]\n",
+			tstart, tend);
+		goto error;
 	}
 
-	if (strftime(tbuf, sizeof(tbuf), "%X", tm) == 0) {
-		fprintf(stderr, "strftime returned 0!\n");
-		return;
+	if (!rte_is_broadcast_ether_addr(&eh->dst_addr)) {
+		fprintf(stderr, "Destination is not broadcast\n");
+		goto error;
 	}
 
-	eh = (const struct rte_ether_hdr *)bytes;
-	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
-	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
-	printf("%s.%06lu: %s -> %s type %x length %u\n",
-	       tbuf, (unsigned long)h->ts.tv_usec,
-	       src, dst, rte_be_to_cpu_16(eh->ether_type), h->len);
+	if (rte_ipv4_cksum(ip) != 0) {
+		fprintf(stderr, "Bad IPv4 checksum\n");
+		goto error;
+	}
+
+	return;		/* packet is normal */
+
+error:
+	print_packet(ns, eh, h->len);
+
+	/* Stop parsing at first error */
+	pcap_breakloop(ctx->pcap);
+}
 
-	*countp += 1;
+static uint64_t
+current_timestamp(void)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_REALTIME, &ts);
+	return rte_timespec_to_ns(&ts);
 }
 
 /*
@@ -219,78 +281,162 @@ pkt_print(u_char *user, const struct pcap_pkthdr *h,
  * but that creates an unwanted dependency.
  */
 static int
-test_validate(void)
+valid_pcapng_file(const char *file_name, uint64_t started, unsigned int expected)
 {
 	char errbuf[PCAP_ERRBUF_SIZE];
-	unsigned int count = 0;
-	pcap_t *pcap;
+	struct pkt_print_ctx ctx = { };
 	int ret;
 
-	pcap = pcap_open_offline(file_name, errbuf);
-	if (pcap == NULL) {
+	ctx.start_ns = started;
+	ctx.end_ns = current_timestamp();
+
+	ctx.pcap = pcap_open_offline_with_tstamp_precision(file_name,
+							   PCAP_TSTAMP_PRECISION_NANO,
+							   errbuf);
+	if (ctx.pcap == NULL) {
 		fprintf(stderr, "pcap_open_offline('%s') failed: %s\n",
 			file_name, errbuf);
 		return -1;
 	}
 
-	ret = pcap_loop(pcap, 0, pkt_print, (u_char *)&count);
-	if (ret == 0)
-		printf("Saw %u packets\n", count);
-	else
+	ret = pcap_loop(ctx.pcap, 0, parse_pcap_packet, (u_char *)&ctx);
+	if (ret != 0) {
 		fprintf(stderr, "pcap_dispatch: failed: %s\n",
-			pcap_geterr(pcap));
-	pcap_close(pcap);
+			pcap_geterr(ctx.pcap));
+	} else if (ctx.count != expected) {
+		printf("Only %u packets, expected %u\n",
+		       ctx.count, expected);
+		ret = -1;
+	}
+
+	pcap_close(ctx.pcap);
 
 	return ret;
 }
 
 static int
-test_write_over_limit_iov_max(void)
+test_add_interface(void)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[IOV_MAX + NUM_PACKETS] = { };
-	struct dummy_mbuf mbfs;
-	unsigned int i;
-	ssize_t len;
+	char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	static rte_pcapng_t *pcapng;
+	int ret, tmp_fd;
+	uint64_t now = current_timestamp();
 
-	/* make a dummy packet */
-	mbuf1_prepare(&mbfs, pkt_len);
+	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		goto fail;
+	}
+	printf("pcapng: output file %s\n", file_name);
 
-	/* clone them */
-	orig  = &mbfs.mb[0];
-	for (i = 0; i < IOV_MAX + NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_addif", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		goto fail;
+	}
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
-			return -1;
-		}
-		clones[i] = mc;
+	/* Add interface to the file */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u\n", port_id);
+		goto fail;
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, IOV_MAX + NUM_PACKETS);
+	/* Add interface with ifname and ifdescr */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       "myeth", "Some long description", NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u with ifname\n", port_id);
+		goto fail;
+	}
+
+	/* Add interface with filter */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, "tcp port 8080");
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u with filter\n", port_id);
+		goto fail;
+	}
 
-	rte_pktmbuf_free_bulk(clones, IOV_MAX + NUM_PACKETS);
+	rte_pcapng_close(pcapng);
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
+	ret = valid_pcapng_file(file_name, now, 0);
+	/* if test fails want to investigate the file */
+	if (ret == 0)
+		unlink(file_name);
+
+	return ret;
+
+fail:
+	rte_pcapng_close(pcapng);
+	return -1;
+}
+
+static int
+test_write_packets(void)
+{
+	char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	static rte_pcapng_t *pcapng;
+	int ret, tmp_fd, count;
+	uint64_t now = current_timestamp();
+
+	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		goto fail;
 	}
+	printf("pcapng: output file %s\n", file_name);
 
-	return 0;
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		goto fail;
+	}
+
+	/* Add interface to the file */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u\n", port_id);
+		goto fail;
+	}
+
+	count = fill_pcapng_file(pcapng, TOTAL_PACKETS);
+	if (count < 0)
+		goto fail;
+
+	/* write a statistics block */
+	ret = rte_pcapng_write_stats(pcapng, port_id,
+				     count, 0, "end of test");
+	if (ret <= 0) {
+		fprintf(stderr, "Write of statistics failed\n");
+		goto fail;
+	}
+
+	rte_pcapng_close(pcapng);
+
+	ret = valid_pcapng_file(file_name, now, count);
+	/* if test fails want to investigate the file */
+	if (ret == 0)
+		unlink(file_name);
+
+	return ret;
+
+fail:
+	rte_pcapng_close(pcapng);
+	return -1;
 }
 
 static void
 test_cleanup(void)
 {
 	rte_mempool_free(mp);
-
-	if (pcapng)
-		rte_pcapng_close(pcapng);
-
+	rte_vdev_uninit(null_dev);
 }
 
 static struct
@@ -299,10 +445,8 @@ unit_test_suite test_pcapng_suite  = {
 	.teardown = test_cleanup,
 	.suite_name = "Test Pcapng Unit Test Suite",
 	.unit_test_cases = {
+		TEST_CASE(test_add_interface),
 		TEST_CASE(test_write_packets),
-		TEST_CASE(test_write_stats),
-		TEST_CASE(test_validate),
-		TEST_CASE(test_write_over_limit_iov_max),
 		TEST_CASES_END()
 	}
 };
@@ -313,4 +457,4 @@ test_pcapng(void)
 	return unit_test_suite_runner(&test_pcapng_suite);
 }
 
-REGISTER_TEST_COMMAND(pcapng_autotest, test_pcapng);
+REGISTER_FAST_TEST(pcapng_autotest, true, true, test_pcapng);
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v3 1/5] pdump: fix setting rte_errno on mp error
  2023-11-08 18:35   ` [PATCH v3 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
@ 2023-11-09  7:34     ` Morten Brørup
  0 siblings, 0 replies; 61+ messages in thread
From: Morten Brørup @ 2023-11-09  7:34 UTC (permalink / raw)
  To: Stephen Hemminger, dev

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, 8 November 2023 19.36
> 
> The response from MP server sets err_value to negative
> on error. The convention for rte_errno is to use a positive
> value on error. This makes errors like duplicate registration
> show up with the correct error value.
> 
> Fixes: 660098d61f57 ("pdump: use generic multi-process channel")
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---

Acked-by:  Morten Brørup <mb@smartsharesystems.com>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v3 2/5] dumpcap: allow multiple invocations
  2023-11-08 18:35   ` [PATCH v3 2/5] dumpcap: allow multiple invocations Stephen Hemminger
@ 2023-11-09  7:50     ` Morten Brørup
  2023-11-09 15:40       ` Stephen Hemminger
  2023-11-09 17:16       ` Stephen Hemminger
  0 siblings, 2 replies; 61+ messages in thread
From: Morten Brørup @ 2023-11-09  7:50 UTC (permalink / raw)
  To: Stephen Hemminger, dev; +Cc: Isaac Boukris

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, 8 November 2023 19.36
> 
> If dumpcap is run twice with each instance pointing at a different
> interface, it would fail because of overlap in ring and pool names.
> Fix by putting the process id in the name.
> 
> Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
> Reported-by: Isaac Boukris <iboukris@gmail.com>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---

Minor detail: getpid() returns int, so prefer %d over %u.

[...]

>  			rte_exit(EXIT_FAILURE,
>  				"Packet dump enable on %u:%s failed %s\n",
>  				intf->port, intf->name,
> -				rte_strerror(-ret));
> +				rte_strerror(rte_errno));

This bugfix (the line above, not the patch itself) supports Tyler's proposal to standardize on returning -1 with rte_errno set on failure, instead of some functions returning -errno. Our dual convention for function return values will cause many bugs like this.

With %u or %d,

Reviewed-by: Morten Brørup <mb@smartsharesystems.com>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v3 3/5] pcapng: modify timestamp calculation
  2023-11-08 18:35   ` [PATCH v3 3/5] pcapng: modify timestamp calculation Stephen Hemminger
@ 2023-11-09  7:57     ` Morten Brørup
  0 siblings, 0 replies; 61+ messages in thread
From: Morten Brørup @ 2023-11-09  7:57 UTC (permalink / raw)
  To: Stephen Hemminger, dev

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, 8 November 2023 19.36
> 
> The computation of timestamp is best done in the part of
> pcapng library that is in secondary process.
> The secondary process is already doing a bunch of system
> calls which makes it not performance sensitive.
> 
> Simplify the computation of nanoseconds from TSC to a two
> step process which avoids numeric overflow issues. The previous
> code was not thread safe as well.
> 
> Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---

This changes the rte_pcapng lib API, but it is marked experimental, so should be allowed.

Acked-by: Morten Brørup <mb@smartsharesystems.com>
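
For readers following the "two step" remark in the commit message, here is a
minimal sketch of one way to convert a cycle count to nanoseconds without
overflowing 64 bits. The patch's own implementation is not shown in this
message, so the function name cycles_to_ns and the exact split are
assumptions for illustration only. The idea: a naive cycles * 1e9 / hz
overflows once cycles exceeds roughly 1.8e10, so whole seconds and the
sub-second remainder are scaled separately.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_S 1000000000ULL

/*
 * Hypothetical helper (not the rte_pcapng code): convert a cycle
 * count to nanoseconds given the counter frequency in Hz.
 */
static uint64_t
cycles_to_ns(uint64_t cycles, uint64_t hz)
{
	uint64_t secs, rem;

	assert(hz != 0);

	/* step 1: whole seconds, no multiplication of the large value */
	secs = cycles / hz;

	/* step 2: leftover cycles (always < hz) scaled to nanoseconds */
	rem = cycles % hz;

	return secs * NS_PER_S + (rem * NS_PER_S) / hz;
}
```

The remainder term stays within 64 bits as long as hz is below about
18 GHz, and the seconds term as long as the interval is shorter than
several centuries.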


^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v3 4/5] pcapng: avoid using alloca()
  2023-11-08 18:35   ` [PATCH v3 4/5] pcapng: avoid using alloca() Stephen Hemminger
@ 2023-11-09  8:21     ` Morten Brørup
  2023-11-09 15:44       ` Stephen Hemminger
  0 siblings, 1 reply; 61+ messages in thread
From: Morten Brørup @ 2023-11-09  8:21 UTC (permalink / raw)
  To: Stephen Hemminger, dev

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, 8 November 2023 19.36
> 
> The function alloca() like VLA's has problems if the caller
> passes a large value. Instead use a fixed size buffer (4K)
> which will be more than sufficient for the info related blocks
> in the file. Add bounds checks as well.
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---

I can't find the definition of BUFSIZ. Please make sure to add a comment to the definition of BUFSIZ mentioning - like in your patch description - that it will be more than sufficient for the info related blocks in the file.

More comments inline below, regarding existing bugs found while reviewing.


Assuming BUFSIZ has a comment describing the reason for its value,

Acked-by: Morten Brørup <mb@smartsharesystems.com>


>  lib/pcapng/rte_pcapng.c | 34 +++++++++++++---------------------
>  1 file changed, 13 insertions(+), 21 deletions(-)
> 
> diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
> index 13fd2b97fb80..67f74d31aa32 100644
> --- a/lib/pcapng/rte_pcapng.c
> +++ b/lib/pcapng/rte_pcapng.c
> @@ -140,9 +140,8 @@ pcapng_section_block(rte_pcapng_t *self,
>  {
>  	struct pcapng_section_header *hdr;
>  	struct pcapng_option *opt;
> -	void *buf;
> +	uint8_t buf[BUFSIZ];
>  	uint32_t len;
> -	ssize_t cc;
> 
>  	len = sizeof(*hdr);
>  	if (hw)
> @@ -158,8 +157,7 @@ pcapng_section_block(rte_pcapng_t *self,
>  	len += pcapng_optlen(0);
>  	len += sizeof(uint32_t);
> 
> -	buf = calloc(1, len);
> -	if (!buf)
> +	if (len > sizeof(buf))
>  		return -1;

Existing bug: rte_errno must be set before returning -1. This bug occurs multiple times in rte_pcapng.c, probably also in code you're not updating in this patch.

> 
>  	hdr = (struct pcapng_section_header *)buf;
> @@ -193,10 +191,7 @@ pcapng_section_block(rte_pcapng_t *self,
>  	/* clone block_length after option */
>  	memcpy(opt, &hdr->block_length, sizeof(uint32_t));
> 
> -	cc = write(self->outfd, buf, len);
> -	free(buf);
> -
> -	return cc;
> +	return write(self->outfd, buf, len);

Existing bug: if write() returns -1, errno must be stored in rte_errno before returning -1. This bug might also occur multiple times in rte_pcapng.c.
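
A minimal sketch of the pattern being asked for, assuming a per-thread
variable lib_errno as a stand-in for rte_errno so that it builds without
DPDK. The wrapper name checked_write is hypothetical and is not part of
rte_pcapng.c.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Stand-in for rte_errno (assumption: keeps the sketch free of DPDK) */
static _Thread_local int lib_errno;

/* Hypothetical wrapper: write() that records the failure reason */
static ssize_t
checked_write(int fd, const void *buf, size_t len)
{
	ssize_t cc;

	assert(buf != NULL);

	cc = write(fd, buf, len);
	if (cc < 0)
		lib_errno = errno;	/* save before a later call clobbers errno */

	return cc;
}
```

With this shape the caller can keep testing for a negative return value
and read the reason from the library's error variable afterwards.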


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3 2/5] dumpcap: allow multiple invocations
  2023-11-09  7:50     ` Morten Brørup
@ 2023-11-09 15:40       ` Stephen Hemminger
  2023-11-09 16:00         ` Morten Brørup
  2023-11-09 17:16       ` Stephen Hemminger
  1 sibling, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 15:40 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev, Isaac Boukris

On Thu, 9 Nov 2023 08:50:10 +0100
Morten Brørup <mb@smartsharesystems.com> wrote:

> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---  
> 
> Minor detail: getpid() returns int, so prefer %d over %u.

Let me check, per man page. getpid() returns pid_t.
The typedef chain leads to:
	pid_t -> __pid_t -> __PID_T_TYPE -> __S32_TYPE  -> int32 -> int
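
As a small sketch of what that means for the format string: pid_t is a
signed integer type, so the value can be cast to int and printed with %d.
The helper name format_pid_name and the "prefix_pid" layout are made up
for illustration and are not taken from the dumpcap patch.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from dumpcap): build a per-process name
 * such as "ring_1234" so that two instances do not collide.
 */
static int
format_pid_name(char *buf, size_t size, const char *prefix, pid_t pid)
{
	assert(buf != NULL && prefix != NULL);

	/* pid_t is signed; the cast makes the argument match %d exactly */
	return snprintf(buf, size, "%s_%d", prefix, (int)pid);
}
```

A caller would pass getpid() (from unistd.h) as the last argument.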



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3 4/5] pcapng: avoid using alloca()
  2023-11-09  8:21     ` Morten Brørup
@ 2023-11-09 15:44       ` Stephen Hemminger
  2023-11-09 16:25         ` Morten Brørup
  0 siblings, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 15:44 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev

On Thu, 9 Nov 2023 09:21:22 +0100
Morten Brørup <mb@smartsharesystems.com> wrote:

> I can't find the definition of BUFSIZ. Please make sure to add a comment to the definition of BUFSIZ mentioning - like in your patch description - that it will be more than sufficient for the info related blocks in the file.
> 
> More comments inline below, regarding existing bugs found while reviewing.
> 
> 
> Assuming BUFSIZ has a comment describing the reason for its value,
> 
> Acked-by: Morten Brørup <mb@smartsharesystems.com>

The constant BUFSIZ comes from stdio.h and is used in lots of places in
libraries. It is 8192 in current glibc and unlikely to be a problem.
I chose it because this is an on-stack buffer used before writing to a
file, which is similar to what stdio does.

The library does not use stdio because most of the I/O is writing packets,
which needs to be fast, and the overhead of an extra stdio buffer is harmful.
I am looking into using io_uring in a future version.
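
A rough sketch of the fixed-buffer pattern, for illustration only. The
block layout here ([length][payload][length]) is a simplification, and
write_small_block and the EMSGSIZE error code are assumptions rather than
what rte_pcapng.c does. The static_assert documents the only size the
C standard promises for BUFSIZ.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>	/* BUFSIZ */
#include <string.h>
#include <unistd.h>

/* The C standard only promises BUFSIZ >= 256; fail the build otherwise */
static_assert(BUFSIZ >= 256, "BUFSIZ smaller than the C standard minimum");

/*
 * Hypothetical, simplified block writer (not the rte_pcapng code):
 * the length field covers the whole block and is repeated at the end.
 */
static ssize_t
write_small_block(int fd, const void *payload, size_t payload_len)
{
	uint8_t buf[BUFSIZ];
	uint32_t total;

	/* bounds check takes the place of the old allocation failure check */
	if (payload_len > sizeof(buf) - 2 * sizeof(total)) {
		errno = EMSGSIZE;
		return -1;
	}

	total = (uint32_t)(payload_len + 2 * sizeof(total));
	memcpy(buf, &total, sizeof(total));
	memcpy(buf + sizeof(total), payload, payload_len);
	/* repeat the length after the payload */
	memcpy(buf + sizeof(total) + payload_len, &total, sizeof(total));

	return write(fd, buf, total);
}
```

The oversized case is rejected before any copy happens, so nothing is
written past the end of the stack buffer.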

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v3 2/5] dumpcap: allow multiple invocations
  2023-11-09 15:40       ` Stephen Hemminger
@ 2023-11-09 16:00         ` Morten Brørup
  0 siblings, 0 replies; 61+ messages in thread
From: Morten Brørup @ 2023-11-09 16:00 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Isaac Boukris

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, 9 November 2023 16.40
> 
> On Thu, 9 Nov 2023 08:50:10 +0100
> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> > > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > > ---
> >
> > Minor detail: getpid() returns int, so prefer %d over %u.
> 
> Let me check, per man page. getpid() returns pid_t.
> The typedef chain leads to:
> 	pid_t -> __pid_t -> __PID_T_TYPE -> __S32_TYPE  -> int32 -> int

Thank you for confirming. So %d is preferred over %u for getpid(). :-)


^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v3 4/5] pcapng: avoid using alloca()
  2023-11-09 15:44       ` Stephen Hemminger
@ 2023-11-09 16:25         ` Morten Brørup
  0 siblings, 0 replies; 61+ messages in thread
From: Morten Brørup @ 2023-11-09 16:25 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, 9 November 2023 16.45
> 
> On Thu, 9 Nov 2023 09:21:22 +0100
> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> > I can't find the definition of BUFSIZ. Please make sure to add a
> comment to the definition of BUFSIZ mentioning - like in your patch
> description - that it will be more than sufficient for the info related
> blocks in the file.
> >
> > More comments inline below, regarding existing bugs found while
> reviewing.
> >
> >
> > Assuming BUFSIZ has a comment describing the reason for its value,
> >
> > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> 
> The constant BUFSIZ comes from stdio.h and used lots of places in
> libraries.
> It is 8192 in current glibc and unlikely to be a problem.

OK, didn't know that. So I looked it up, trying to learn more about it.

I found two sources [1], [2] mentioning that BUFSIZ is guaranteed to be at least 256.

[1]: https://www.gnu.org/software/libc/manual/html_node/Controlling-Buffering.html#BUFSIZ
[2]: Page 234 in "The C Standard Library" by P.J. Plauger, ISBN: 0-13-131509-9, from 1992

If 256 suffices, then I am OK with using BUFSIZ.

I hope the authors of the other libraries using BUFSIZ don't assume more than the C standard promises about it.

> Chose it because this a on stack buffer used before writing to a file
> which
> is similar to what stdio does.
> 
> The library does not use stdio because most of the I/O is writing
> packets
> which needs to be fast and overhead of extra stdio buffer is harmful.
> Looking into using io_uring in a future version.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v3 2/5] dumpcap: allow multiple invocations
  2023-11-09  7:50     ` Morten Brørup
  2023-11-09 15:40       ` Stephen Hemminger
@ 2023-11-09 17:16       ` Stephen Hemminger
  2023-11-09 18:22         ` Morten Brørup
  1 sibling, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 17:16 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev, Isaac Boukris

On Thu, 9 Nov 2023 08:50:10 +0100
Morten Brørup <mb@smartsharesystems.com> wrote:

> >  			rte_exit(EXIT_FAILURE,
> >  				"Packet dump enable on %u:%s failed %s\n",
> >  				intf->port, intf->name,
> > -				rte_strerror(-ret));
> > +				rte_strerror(rte_errno));  
> 
> This bugfix (the line above, not the patch itself) supports Tyler's proposal to standardize on returning -1 with rte_errno set on failure, instead of some functions returning -errno. Our dual convention for function return values will cause many bugs like this.

The error case here is when rte_pdump_enable_bpf() fails.
This is the return value from pdump_enable in the pdump library.
The library does follow the rte_errno convention correctly, but the
error message was not reporting it correctly, which would lead to a
confusing error in the case where multiple invocations failed.

It is not possible to do multiple captures on the same interface, and it is
not worth modifying the library (it would require multiple copies and ref
counts) to handle this case.
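
To illustrate why the message came out wrong: when a function returns -1
and stores the reason separately, negating the return value gives 1, which
on Linux is EPERM, whatever the real cause was. The sketch below uses a
stand-in variable lib_errno instead of rte_errno, and the names
handle_response and failure_reason are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Stand-in for rte_errno (assumption: keeps the sketch free of DPDK) */
static _Thread_local int lib_errno;

/* Hypothetical: turn a negative response code into -1 plus a positive errno */
static int
handle_response(int err_value)
{
	assert(err_value <= 0);

	if (err_value == 0)
		return 0;

	lib_errno = -err_value;
	return -1;
}

/* Caller side: take the message from the saved error, not from ret */
static const char *
failure_reason(int ret)
{
	return ret < 0 ? strerror(lib_errno) : "success";
}
```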

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v4 0/5] dumpcap and pcapng fixes
  2023-09-21  4:23 [PATCH 0/4] pcapng fixes Stephen Hemminger
                   ` (5 preceding siblings ...)
  2023-11-08 18:35 ` [PATCH v3 0/5] dumpcap and pcapng fixes Stephen Hemminger
@ 2023-11-09 17:34 ` Stephen Hemminger
  2023-11-09 17:34   ` [PATCH v4 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
                     ` (4 more replies)
  2023-11-09 19:45 ` [PATCH v5 0/5] dumpcap and pcapng fixes Stephen Hemminger
                   ` (2 subsequent siblings)
  9 siblings, 5 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 17:34 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

This series has bugfixes and tests for dumpcap and pcapng.
It should be in 23.11!

It fixes issues related to timestamping. The design choices are
to maximize performance in the primary process and to do
all the time adjustment in the secondary (dumpcap), since
dumpcap needs to make system calls anyway to write the result.

This patch set moves the calculation of the adjustment
into the pcapng portion that opens the output file.
All details of the format of the timestamp are contained inside
pcapng (data hiding).

v4 - incorporate review feedback
v3 - don't use alloca() since can have VLA type issues

Stephen Hemminger (5):
  pdump: fix setting rte_errno on mp error
  dumpcap: allow multiple invocations
  pcapng: modify timestamp calculation
  pcapng: avoid using alloca()
  test: cleanups to pcapng test

 app/dumpcap/main.c      |  53 ++---
 app/test/meson.build    |   2 +-
 app/test/test_pcapng.c  | 418 +++++++++++++++++++++++++++-------------
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 156 ++++++---------
 lib/pcapng/rte_pcapng.h |  19 +-
 lib/pdump/rte_pdump.c   |   9 +-
 7 files changed, 374 insertions(+), 285 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v4 1/5] pdump: fix setting rte_errno on mp error
  2023-11-09 17:34 ` [PATCH v4 0/5] dumpcap and pcapng fixes Stephen Hemminger
@ 2023-11-09 17:34   ` Stephen Hemminger
  2023-11-09 17:34   ` [PATCH v4 2/5] dumpcap: allow multiple invocations Stephen Hemminger
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 17:34 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan, Jianfeng Tan

The response from MP server sets err_value to negative
on error. The convention for rte_errno is to use a positive
value on error. This makes errors like duplicate registration
show up with the correct error value.
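
A minimal sketch of the convention, assuming a cut-down reply struct
and a plain variable standing in for rte_errno (both hypothetical, so
the example builds without DPDK):

```c
#include <errno.h>

/* Hypothetical stand-in for the MP reply payload; only err_value matters. */
struct reply_sketch {
	int err_value;	/* 0 on success, negative errno value on failure */
};

/* Stand-in for rte_errno so the sketch builds without DPDK. */
static int errno_sketch;

/* Returns 0 on success, -1 on failure with errno_sketch set positive. */
static int
handle_reply(const struct reply_sketch *resp)
{
	if (resp->err_value == 0)
		return 0;

	errno_sketch = -resp->err_value;	/* -EEXIST becomes EEXIST */
	return -1;
}
```

With the sign flipped, passing the stored value to a strerror-style
function gives the expected message rather than an unknown error.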

Fixes: 660098d61f57 ("pdump: use generic multi-process channel")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/pdump/rte_pdump.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index 80b90c6f7d03..e94f49e21250 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -564,9 +564,10 @@ pdump_prepare_client_request(const char *device, uint16_t queue,
 	if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) == 0) {
 		mp_rep = &mp_reply.msgs[0];
 		resp = (struct pdump_response *)mp_rep->param;
-		rte_errno = resp->err_value;
-		if (!resp->err_value)
+		if (resp->err_value == 0)
 			ret = 0;
+		else
+			rte_errno = -resp->err_value;
 		free(mp_reply.msgs);
 	}
 
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v4 2/5] dumpcap: allow multiple invocations
  2023-11-09 17:34 ` [PATCH v4 0/5] dumpcap and pcapng fixes Stephen Hemminger
  2023-11-09 17:34   ` [PATCH v4 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
@ 2023-11-09 17:34   ` Stephen Hemminger
  2023-11-09 18:30     ` Morten Brørup
  2023-11-09 17:34   ` [PATCH v4 3/5] pcapng: modify timestamp calculation Stephen Hemminger
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 17:34 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Isaac Boukris, Reshma Pattan

If dumpcap is run twice with each instance pointing at a different
interface, it would fail because of overlap in ring and pool names.
Fix by putting process id in the name.
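
The naming idea can be sketched as below. The helper name and the
buffer size macro are assumptions for the example; the patch itself
sizes the buffers with RTE_RING_NAMESIZE and RTE_MEMPOOL_NAMESIZE:

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical size limit, used only by this sketch. */
#define NAME_SIZE_SKETCH 32

/* Build "<prefix>-<pid>" into buf; returns 0 on success, -1 if truncated. */
static int
make_instance_name(char *buf, size_t size, const char *prefix)
{
	int n = snprintf(buf, size, "%s-%d", prefix, (int)getpid());

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Two dumpcap processes running at the same time have different process
ids, so the names they generate this way do not collide.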

It is still not allowed to do multiple invocations on the same
interface because only one callback is allowed and only one copy
of mbuf is done. Dumpcap will fail with error in this case:

   pdump_prepare_client_request(): client request for pdump enable/disable failed
   EAL: Error - exiting with code: 1
     Cause: Packet dump enable on 0:net_null0 failed File exists

Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
Reported-by: Isaac Boukris <iboukris@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/dumpcap/main.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 64294bbfb3e6..74c754e272c5 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -44,7 +44,6 @@
 #include <pcap/pcap.h>
 #include <pcap/bpf.h>
 
-#define RING_NAME "capture-ring"
 #define MONITOR_INTERVAL  (500 * 1000)
 #define MBUF_POOL_CACHE_SIZE 32
 #define BURST_SIZE 32
@@ -647,6 +646,7 @@ static void dpdk_init(void)
 static struct rte_ring *create_ring(void)
 {
 	struct rte_ring *ring;
+	char ring_name[RTE_RING_NAMESIZE];
 	size_t size, log2;
 
 	/* Find next power of 2 >= size. */
@@ -660,28 +660,28 @@ static struct rte_ring *create_ring(void)
 		ring_size = size;
 	}
 
-	ring = rte_ring_lookup(RING_NAME);
-	if (ring == NULL) {
-		ring = rte_ring_create(RING_NAME, ring_size,
-					rte_socket_id(), 0);
-		if (ring == NULL)
-			rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
-				 rte_strerror(rte_errno));
-	}
+	/* Want one ring per invocation of program */
+	snprintf(ring_name, sizeof(ring_name),
+		 "dumpcap-%d", getpid());
+
+	ring = rte_ring_create(ring_name, ring_size,
+			       rte_socket_id(), 0);
+	if (ring == NULL)
+		rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
+			 rte_strerror(rte_errno));
+
 	return ring;
 }
 
 static struct rte_mempool *create_mempool(void)
 {
 	const struct interface *intf;
-	static const char pool_name[] = "capture_mbufs";
+	char pool_name[RTE_MEMPOOL_NAMESIZE];
 	size_t num_mbufs = 2 * ring_size;
 	struct rte_mempool *mp;
 	uint32_t data_size = 128;
 
-	mp = rte_mempool_lookup(pool_name);
-	if (mp)
-		return mp;
+	snprintf(pool_name, sizeof(pool_name), "capture_%u", getpid());
 
 	/* Common pool so size mbuf for biggest snap length */
 	TAILQ_FOREACH(intf, &interfaces, next) {
@@ -826,7 +826,7 @@ static void enable_pdump(struct rte_ring *r, struct rte_mempool *mp)
 			rte_exit(EXIT_FAILURE,
 				"Packet dump enable on %u:%s failed %s\n",
 				intf->port, intf->name,
-				rte_strerror(-ret));
+				rte_strerror(rte_errno));
 		}
 
 		if (intf->opts.promisc_mode) {
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v4 3/5] pcapng: modify timestamp calculation
  2023-11-09 17:34 ` [PATCH v4 0/5] dumpcap and pcapng fixes Stephen Hemminger
  2023-11-09 17:34   ` [PATCH v4 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
  2023-11-09 17:34   ` [PATCH v4 2/5] dumpcap: allow multiple invocations Stephen Hemminger
@ 2023-11-09 17:34   ` Stephen Hemminger
  2023-11-09 17:34   ` [PATCH v4 4/5] pcapng: avoid using alloca() Stephen Hemminger
  2023-11-09 17:34   ` [PATCH v4 5/5] test: cleanups to pcapng test Stephen Hemminger
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 17:34 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan,
	Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan,
	Quentin Armitage

The computation of the timestamp is best done in the part of
the pcapng library that runs in the secondary process.
The secondary process is already doing a bunch of system
calls, which makes it not performance sensitive.
This does change the rte_pcapng_copy()
and rte_pcapng_write_stats() experimental APIs.

Simplify the computation of nanoseconds from TSC to a two
step process which avoids numeric overflow issues. The previous
code was not thread safe as well.
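
The two-step conversion can be sketched as a standalone function. This
is an illustration only: the function name and macro are made up, and
the TSC frequency is passed in as a parameter instead of being read
with rte_get_tsc_hz():

```c
#include <stdint.h>

#define NS_PER_S_SKETCH 1000000000ULL

/*
 * Convert a TSC reading to nanoseconds since the epoch.
 * tsc_base and offset_ns mirror the fields added to struct rte_pcapng.
 */
static uint64_t
cycles_to_ns(uint64_t cycles, uint64_t tsc_base, uint64_t offset_ns,
	     uint64_t hz)
{
	uint64_t delta = cycles - tsc_base;
	uint64_t secs = delta / hz;	/* whole seconds */
	uint64_t rem = delta % hz;	/* leftover cycles, always < hz */

	/* rem * 1e9 fits in 64 bits while hz is below about 18.4 GHz */
	return secs * NS_PER_S_SKETCH + (rem * NS_PER_S_SKETCH) / hz + offset_ns;
}
```

Multiplying the whole delta by 1e9 first would wrap a 64-bit value
after a few seconds of cycles; splitting off the whole seconds keeps
the multiplication bounded by the TSC frequency.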

Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 app/dumpcap/main.c      |  25 +++------
 app/test/test_pcapng.c  |   4 +-
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 119 +++++++++++++++-------------------------
 lib/pcapng/rte_pcapng.h |  19 ++-----
 lib/pdump/rte_pdump.c   |   4 +-
 6 files changed, 61 insertions(+), 112 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 74c754e272c5..b5770875fab4 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -66,13 +66,13 @@ static bool print_stats;
 
 /* capture limit options */
 static struct {
-	uint64_t  duration;	/* nanoseconds */
+	time_t  duration;	/* seconds */
 	unsigned long packets;  /* number of packets in file */
 	size_t size;		/* file size (bytes) */
 } stop;
 
 /* Running state */
-static uint64_t start_time, end_time;
+static time_t start_time;
 static uint64_t packets_received;
 static size_t file_size;
 
@@ -197,7 +197,7 @@ static void auto_stop(char *opt)
 		if (*value == '\0' || *endp != '\0' || interval <= 0)
 			rte_exit(EXIT_FAILURE,
 				 "Invalid duration \"%s\"\n", value);
-		stop.duration = NSEC_PER_SEC * interval;
+		stop.duration = interval;
 	} else if (strcmp(opt, "filesize") == 0) {
 		stop.size = get_uint(value, "filesize", 0) * 1024;
 	} else if (strcmp(opt, "packets") == 0) {
@@ -511,15 +511,6 @@ static void statistics_loop(void)
 	}
 }
 
-/* Return the time since 1/1/1970 in nanoseconds */
-static uint64_t create_timestamp(void)
-{
-	struct timespec now;
-
-	clock_gettime(CLOCK_MONOTONIC, &now);
-	return rte_timespec_to_ns(&now);
-}
-
 static void
 cleanup_pdump_resources(void)
 {
@@ -589,9 +580,8 @@ report_packet_stats(dumpcap_out_t out)
 		ifdrop = pdump_stats.nombuf + pdump_stats.ringfull;
 
 		if (use_pcapng)
-			rte_pcapng_write_stats(out.pcapng, intf->port, NULL,
-					       start_time, end_time,
-					       ifrecv, ifdrop);
+			rte_pcapng_write_stats(out.pcapng, intf->port,
+					       ifrecv, ifdrop, NULL);
 
 		if (ifrecv == 0)
 			percent = 0;
@@ -983,7 +973,7 @@ int main(int argc, char **argv)
 	mp = create_mempool();
 	out = create_output();
 
-	start_time = create_timestamp();
+	start_time = time(NULL);
 	enable_pdump(r, mp);
 
 	if (!quiet) {
@@ -1005,11 +995,10 @@ int main(int argc, char **argv)
 			break;
 
 		if (stop.duration != 0 &&
-		    create_timestamp() - start_time > stop.duration)
+		    time(NULL) - start_time > stop.duration)
 			break;
 	}
 
-	end_time = create_timestamp();
 	disable_primary_monitor();
 
 	if (rte_eal_primary_proc_alive(NULL))
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index b8429a02f160..55aa2cf93666 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -173,8 +173,8 @@ test_write_stats(void)
 	ssize_t len;
 
 	/* write a statistics block */
-	len = rte_pcapng_write_stats(pcapng, port_id,
-				     NULL, 0, 0,
+	len = rte_pcapng_write_stats(pcapng, port_id, NULL,
+				     0, 0, 0,
 				     NUM_PACKETS, 0);
 	if (len <= 0) {
 		fprintf(stderr, "Write of statistics failed\n");
diff --git a/lib/graph/graph_pcap.c b/lib/graph/graph_pcap.c
index db722c375fa7..89525f1220ca 100644
--- a/lib/graph/graph_pcap.c
+++ b/lib/graph/graph_pcap.c
@@ -214,7 +214,7 @@ graph_pcap_dispatch(struct rte_graph *graph,
 		mbuf = (struct rte_mbuf *)objs[i];
 
 		mc = rte_pcapng_copy(mbuf->port, 0, mbuf, pkt_mp, mbuf->pkt_len,
-				     rte_get_tsc_cycles(), 0, buffer);
+				     0, buffer);
 		if (mc == NULL)
 			break;
 
diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 3c91fc77644a..13fd2b97fb80 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -36,22 +36,14 @@
 /* Format of the capture file handle */
 struct rte_pcapng {
 	int  outfd;		/* output file */
-
 	unsigned int ports;	/* number of interfaces added */
+	uint64_t offset_ns;	/* ns since 1/1/1970 when initialized */
+	uint64_t tsc_base;	/* TSC when started */
 
 	/* DPDK port id to interface index in file */
 	uint32_t port_index[RTE_MAX_ETHPORTS];
 };
 
-/* For converting TSC cycles to PCAPNG ns format */
-static struct pcapng_time {
-	uint64_t ns;
-	uint64_t cycles;
-	uint64_t tsc_hz;
-	struct rte_reciprocal_u64 tsc_hz_inverse;
-} pcapng_time;
-
-
 #ifdef RTE_EXEC_ENV_WINDOWS
 /*
  * Windows does not have writev() call.
@@ -102,56 +94,21 @@ static ssize_t writev(int fd, const struct iovec *iov, int iovcnt)
 #define if_indextoname(ifindex, ifname) NULL
 #endif
 
-static inline void
-pcapng_init(void)
+/* Convert from TSC (CPU cycles) to nanoseconds */
+static uint64_t
+pcapng_timestamp(const rte_pcapng_t *self, uint64_t cycles)
 {
-	struct timespec ts;
+	uint64_t delta, rem, secs, ns;
+	const uint64_t hz = rte_get_tsc_hz();
 
-	pcapng_time.cycles = rte_get_tsc_cycles();
-	clock_gettime(CLOCK_REALTIME, &ts);
-	pcapng_time.cycles = (pcapng_time.cycles + rte_get_tsc_cycles()) / 2;
-	pcapng_time.ns = rte_timespec_to_ns(&ts);
-
-	pcapng_time.tsc_hz = rte_get_tsc_hz();
-	pcapng_time.tsc_hz_inverse = rte_reciprocal_value_u64(pcapng_time.tsc_hz);
-}
+	delta = cycles - self->tsc_base;
 
-/* PCAPNG timestamps are in nanoseconds */
-static uint64_t pcapng_tsc_to_ns(uint64_t cycles)
-{
-	uint64_t delta, secs;
-
-	if (!pcapng_time.tsc_hz)
-		pcapng_init();
-
-	/* In essence the calculation is:
-	 *   delta = (cycles - pcapng_time.cycles) * NSEC_PRE_SEC / rte_get_tsc_hz()
-	 * but this overflows within 4 to 8 seconds depending on TSC frequency.
-	 * Instead, if delta >= pcapng_time.tsc_hz:
-	 *   Increase pcapng_time.ns and pcapng_time.cycles by the number of
-	 *   whole seconds in delta and reduce delta accordingly.
-	 * delta will therefore always lie in the interval [0, pcapng_time.tsc_hz),
-	 * which will not overflow when multiplied by NSEC_PER_SEC provided the
-	 * TSC frequency < approx 18.4GHz.
-	 *
-	 * Currently all TSCs operate below 5GHz.
-	 */
-	delta = cycles - pcapng_time.cycles;
-	if (unlikely(delta >= pcapng_time.tsc_hz)) {
-		if (likely(delta < pcapng_time.tsc_hz * 2)) {
-			delta -= pcapng_time.tsc_hz;
-			pcapng_time.cycles += pcapng_time.tsc_hz;
-			pcapng_time.ns += NSEC_PER_SEC;
-		} else {
-			secs = rte_reciprocal_divide_u64(delta, &pcapng_time.tsc_hz_inverse);
-			delta -= secs * pcapng_time.tsc_hz;
-			pcapng_time.cycles += secs * pcapng_time.tsc_hz;
-			pcapng_time.ns += secs * NSEC_PER_SEC;
-		}
-	}
+	/* Avoid numeric wraparound by computing seconds first */
+	secs = delta / hz;
+	rem = delta % hz;
+	ns = (rem * NS_PER_S) / hz;
 
-	return pcapng_time.ns + rte_reciprocal_divide_u64(delta * NSEC_PER_SEC,
-							  &pcapng_time.tsc_hz_inverse);
+	return secs * NS_PER_S + ns + self->offset_ns;
 }
 
 /* length of option including padding */
@@ -368,15 +325,15 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
  */
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop)
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment)
 {
 	struct pcapng_statistics *hdr;
 	struct pcapng_option *opt;
+	uint64_t start_time = self->offset_ns;
+	uint64_t sample_time;
 	uint32_t optlen, len;
 	uint8_t *buf;
-	uint64_t ns;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -386,10 +343,10 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 		optlen += pcapng_optlen(sizeof(ifrecv));
 	if (ifdrop != UINT64_MAX)
 		optlen += pcapng_optlen(sizeof(ifdrop));
+
 	if (start_time != 0)
 		optlen += pcapng_optlen(sizeof(start_time));
-	if (end_time != 0)
-		optlen += pcapng_optlen(sizeof(end_time));
+
 	if (comment)
 		optlen += pcapng_optlen(strlen(comment));
 	if (optlen != 0)
@@ -409,9 +366,6 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	if (start_time != 0)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_STARTTIME,
 					 &start_time, sizeof(start_time));
-	if (end_time != 0)
-		opt = pcapng_add_option(opt, PCAPNG_ISB_ENDTIME,
-					 &end_time, sizeof(end_time));
 	if (ifrecv != UINT64_MAX)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_IFRECV,
 				&ifrecv, sizeof(ifrecv));
@@ -425,9 +379,9 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	hdr->block_length = len;
 	hdr->interface_id = self->port_index[port_id];
 
-	ns = pcapng_tsc_to_ns(rte_get_tsc_cycles());
-	hdr->timestamp_hi = ns >> 32;
-	hdr->timestamp_lo = (uint32_t)ns;
+	sample_time = pcapng_timestamp(self, rte_get_tsc_cycles());
+	hdr->timestamp_hi = sample_time >> 32;
+	hdr->timestamp_lo = (uint32_t)sample_time;
 
 	/* clone block_length after option */
 	memcpy(opt, &len, sizeof(uint32_t));
@@ -520,23 +474,21 @@ struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *md,
 		struct rte_mempool *mp,
-		uint32_t length, uint64_t cycles,
+		uint32_t length,
 		enum rte_pcapng_direction direction,
 		const char *comment)
 {
 	struct pcapng_enhance_packet_block *epb;
 	uint32_t orig_len, data_len, padding, flags;
 	struct pcapng_option *opt;
+	uint64_t timestamp;
 	uint16_t optlen;
 	struct rte_mbuf *mc;
-	uint64_t ns;
 	bool rss_hash;
 
 #ifdef RTE_LIBRTE_ETHDEV_DEBUG
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
 #endif
-	ns = pcapng_tsc_to_ns(cycles);
-
 	orig_len = rte_pktmbuf_pkt_len(md);
 
 	/* Take snapshot of the data */
@@ -641,8 +593,10 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	/* Interface index is filled in later during write */
 	mc->port = port_id;
 
-	epb->timestamp_hi = ns >> 32;
-	epb->timestamp_lo = (uint32_t)ns;
+	/* Put timestamp in cycles here - adjust in packet write */
+	timestamp = rte_get_tsc_cycles();
+	epb->timestamp_hi = timestamp >> 32;
+	epb->timestamp_lo = (uint32_t)timestamp;
 	epb->capture_length = data_len;
 	epb->original_length = orig_len;
 
@@ -668,6 +622,7 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 	for (i = 0; i < nb_pkts; i++) {
 		struct rte_mbuf *m = pkts[i];
 		struct pcapng_enhance_packet_block *epb;
+		uint64_t cycles, timestamp;
 
 		/* sanity check that is really a pcapng mbuf */
 		epb = rte_pktmbuf_mtod(m, struct pcapng_enhance_packet_block *);
@@ -684,6 +639,13 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 			return -1;
 		}
 
+		/* adjust timestamp recorded in packet */
+		cycles = (uint64_t)epb->timestamp_hi << 32;
+		cycles += epb->timestamp_lo;
+		timestamp = pcapng_timestamp(self, cycles);
+		epb->timestamp_hi = timestamp >> 32;
+		epb->timestamp_lo = (uint32_t)timestamp;
+
 		/*
 		 * Handle case of highly fragmented and large burst size
 		 * Note: this assumes that max segments per mbuf < IOV_MAX
@@ -725,6 +687,8 @@ rte_pcapng_fdopen(int fd,
 {
 	unsigned int i;
 	rte_pcapng_t *self;
+	struct timespec ts;
+	uint64_t cycles;
 
 	self = malloc(sizeof(*self));
 	if (!self) {
@@ -734,6 +698,13 @@ rte_pcapng_fdopen(int fd,
 
 	self->outfd = fd;
 	self->ports = 0;
+
+	/* record start time in ns since 1/1/1970 */
+	cycles = rte_get_tsc_cycles();
+	clock_gettime(CLOCK_REALTIME, &ts);
+	self->tsc_base = (cycles + rte_get_tsc_cycles()) / 2;
+	self->offset_ns = rte_timespec_to_ns(&ts);
+
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++)
 		self->port_index[i] = UINT32_MAX;
 
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index d93cc9f73ad5..c40795c721de 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -121,8 +121,6 @@ enum rte_pcapng_direction {
  * @param length
  *   The upper limit on bytes to copy.  Passing UINT32_MAX
  *   means all data (after offset).
- * @param timestamp
- *   The timestamp in TSC cycles.
  * @param direction
  *   The direction of the packer: receive, transmit or unknown.
  * @param comment
@@ -136,7 +134,7 @@ __rte_experimental
 struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *m, struct rte_mempool *mp,
-		uint32_t length, uint64_t timestamp,
+		uint32_t length,
 		enum rte_pcapng_direction direction, const char *comment);
 
 
@@ -188,29 +186,22 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
  *  The handle to the packet capture file
  * @param port
  *  The Ethernet port to report stats on.
- * @param comment
- *   Optional comment to add to statistics.
- * @param start_time
- *  The time when packet capture was started in nanoseconds.
- *  Optional: can be zero if not known.
- * @param end_time
- *  The time when packet capture was stopped in nanoseconds.
- *  Optional: can be zero if not finished;
  * @param ifrecv
  *  The number of packets received by capture.
  *  Optional: use UINT64_MAX if not known.
  * @param ifdrop
  *  The number of packets missed by the capture process.
  *  Optional: use UINT64_MAX if not known.
+ * @param comment
+ *  Optional comment to add to statistics.
  * @return
  *  number of bytes written to file, -1 on failure to write file
  */
 __rte_experimental
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop);
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment);
 
 #ifdef __cplusplus
 }
diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index e94f49e21250..5a1ec14d7a18 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -90,7 +90,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	int ring_enq;
 	uint16_t d_pkts = 0;
 	struct rte_mbuf *dup_bufs[nb_pkts];
-	uint64_t ts;
 	struct rte_ring *ring;
 	struct rte_mempool *mp;
 	struct rte_mbuf *p;
@@ -99,7 +98,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	if (cbs->filter)
 		rte_bpf_exec_burst(cbs->filter, (void **)pkts, rcs, nb_pkts);
 
-	ts = rte_get_tsc_cycles();
 	ring = cbs->ring;
 	mp = cbs->mp;
 	for (i = 0; i < nb_pkts; i++) {
@@ -122,7 +120,7 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 		if (cbs->ver == V2)
 			p = rte_pcapng_copy(port_id, queue,
 					    pkts[i], mp, cbs->snaplen,
-					    ts, direction, NULL);
+					    direction, NULL);
 		else
 			p = rte_pktmbuf_copy(pkts[i], mp, 0, cbs->snaplen);
 
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v4 4/5] pcapng: avoid using alloca()
  2023-11-09 17:34 ` [PATCH v4 0/5] dumpcap and pcapng fixes Stephen Hemminger
                     ` (2 preceding siblings ...)
  2023-11-09 17:34   ` [PATCH v4 3/5] pcapng: modify timestamp calculation Stephen Hemminger
@ 2023-11-09 17:34   ` Stephen Hemminger
  2023-11-09 17:34   ` [PATCH v4 5/5] test: cleanups to pcapng test Stephen Hemminger
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 17:34 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan

The function alloca(), like VLAs, has problems if the caller
passes a large value. Instead use a fixed size buffer (2K)
which will be more than sufficient for the info related blocks
in the file. Add bounds checks as well.
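
The pattern, reduced to a sketch: the block layout here (length,
payload, repeated length) is simplified, and emit_block() and
count_sink() are hypothetical names, with count_sink() standing in
for write():

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define BLKSIZ_SKETCH 2048	/* same bound as PCAPNG_BLKSIZ in the patch */

/* Stand-in for write(): reports how many bytes it was handed. */
static ssize_t
count_sink(const void *data, size_t len)
{
	(void)data;
	return (ssize_t)len;
}

/* Build a block in a fixed stack buffer; -1 if the payload cannot fit. */
static ssize_t
emit_block(const void *payload, size_t payload_len,
	   ssize_t (*sink)(const void *, size_t))
{
	uint8_t buf[BLKSIZ_SKETCH];
	uint32_t len;

	/* reject oversized input before touching the buffer */
	if (payload_len > sizeof(buf) - 2 * sizeof(uint32_t))
		return -1;

	len = (uint32_t)(payload_len + 2 * sizeof(uint32_t));
	memcpy(buf, &len, sizeof(len));
	memcpy(buf + sizeof(len), payload, payload_len);
	memcpy(buf + sizeof(len) + payload_len, &len, sizeof(len));

	return sink(buf, len);
}
```

Unlike alloca(), the stack usage is known at compile time, and an
oversized request turns into an error return instead of a stack
overrun.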

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/pcapng/rte_pcapng.c | 37 ++++++++++++++++---------------------
 1 file changed, 16 insertions(+), 21 deletions(-)

diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 13fd2b97fb80..f74ec939a9f8 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -33,6 +33,9 @@
 /* conversion from DPDK speed to PCAPNG */
 #define PCAPNG_MBPS_SPEED 1000000ull
 
+/* upper bound for section, stats and interface blocks */
+#define PCAPNG_BLKSIZ	2048
+
 /* Format of the capture file handle */
 struct rte_pcapng {
 	int  outfd;		/* output file */
@@ -140,9 +143,8 @@ pcapng_section_block(rte_pcapng_t *self,
 {
 	struct pcapng_section_header *hdr;
 	struct pcapng_option *opt;
-	void *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 	uint32_t len;
-	ssize_t cc;
 
 	len = sizeof(*hdr);
 	if (hw)
@@ -158,8 +160,7 @@ pcapng_section_block(rte_pcapng_t *self,
 	len += pcapng_optlen(0);
 	len += sizeof(uint32_t);
 
-	buf = calloc(1, len);
-	if (!buf)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_section_header *)buf;
@@ -193,10 +194,7 @@ pcapng_section_block(rte_pcapng_t *self,
 	/* clone block_length after option */
 	memcpy(opt, &hdr->block_length, sizeof(uint32_t));
 
-	cc = write(self->outfd, buf, len);
-	free(buf);
-
-	return cc;
+	return write(self->outfd, buf, len);
 }
 
 /* Write an interface block for a DPDK port */
@@ -213,7 +211,7 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 	struct pcapng_option *opt;
 	const uint8_t tsresol = 9;	/* nanosecond resolution */
 	uint32_t len;
-	void *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 	char ifname_buf[IF_NAMESIZE];
 	char ifhw[256];
 	uint64_t speed = 0;
@@ -267,8 +265,7 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 	len += pcapng_optlen(0);
 	len += sizeof(uint32_t);
 
-	buf = alloca(len);
-	if (!buf)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_interface_block *)buf;
@@ -296,17 +293,16 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 		opt = pcapng_add_option(opt, PCAPNG_IFB_HARDWARE,
 					 ifhw, strlen(ifhw));
 	if (filter) {
-		/* Encoding is that the first octet indicates string vs BPF */
 		size_t len;
-		char *buf;
 
 		len = strlen(filter) + 1;
-		buf = alloca(len);
-		*buf = '\0';
-		memcpy(buf + 1, filter, len);
+		opt->code = PCAPNG_IFB_FILTER;
+		opt->length = len;
+		/* Encoding is that the first octet indicates string vs BPF */
+		opt->data[0] = 0;
+		memcpy(opt->data + 1, filter, strlen(filter));
 
-		opt = pcapng_add_option(opt, PCAPNG_IFB_FILTER,
-					buf, len);
+		opt = (struct pcapng_option *)((uint8_t *)opt + pcapng_optlen(len));
 	}
 
 	opt = pcapng_add_option(opt, PCAPNG_OPT_END, NULL, 0);
@@ -333,7 +329,7 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	uint64_t start_time = self->offset_ns;
 	uint64_t sample_time;
 	uint32_t optlen, len;
-	uint8_t *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -353,8 +349,7 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 		optlen += pcapng_optlen(0);
 
 	len = sizeof(*hdr) + optlen + sizeof(uint32_t);
-	buf = alloca(len);
-	if (buf == NULL)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_statistics *)buf;
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v4 5/5] test: cleanups to pcapng test
  2023-11-09 17:34 ` [PATCH v4 0/5] dumpcap and pcapng fixes Stephen Hemminger
                     ` (3 preceding siblings ...)
  2023-11-09 17:34   ` [PATCH v4 4/5] pcapng: avoid using alloca() Stephen Hemminger
@ 2023-11-09 17:34   ` Stephen Hemminger
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 17:34 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan

Overhaul of the pcapng test:
  - promote it to be a fast test so it gets regularly run.
  - create null device and use it.
  - use UDP discard packets that are valid so that for debugging
    the resulting pcapng file can be looked at with wireshark.
  - do basic checks on resulting pcap file that lengths and
    timestamps are in range.
  - add test for interface options
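
The timestamp range check can be sketched without libpcap as below.
The helper names are made up for the example. It relies on the
behaviour described in the test's comments: when the capture file is
opened in nanosecond mode, the tv_usec field of the packet header
carries nanoseconds.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/time.h>

#define NS_PER_S_SKETCH 1000000000ULL

/* Treat tv_usec as nanoseconds, as in a nanosecond-mode capture. */
static uint64_t
pcap_ts_to_ns(const struct timeval *ts)
{
	return (uint64_t)ts->tv_sec * NS_PER_S_SKETCH + (uint64_t)ts->tv_usec;
}

/* True if the packet time lies within [start_ns, end_ns]. */
static bool
ts_in_range(const struct timeval *ts, uint64_t start_ns, uint64_t end_ns)
{
	uint64_t ns = pcap_ts_to_ns(ts);

	return ns >= start_ns && ns <= end_ns;
}
```

The test records wall-clock times around the capture and expects every
packet read back from the file to fall between them.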

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/test/meson.build   |   2 +-
 app/test/test_pcapng.c | 418 +++++++++++++++++++++++++++--------------
 2 files changed, 282 insertions(+), 138 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index 4183d66b0e9c..dcc93f4a43b4 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -128,7 +128,7 @@ source_file_deps = {
     'test_metrics.c': ['metrics'],
     'test_mp_secondary.c': ['hash', 'lpm'],
     'test_net_ether.c': ['net'],
-    'test_pcapng.c': ['ethdev', 'net', 'pcapng'],
+    'test_pcapng.c': ['ethdev', 'net', 'pcapng', 'bus_vdev'],
     'test_pdcp.c': ['eventdev', 'pdcp', 'net', 'timer', 'security'],
     'test_pdump.c': ['pdump'] + sample_packet_forward_deps,
     'test_per_lcore.c': [],
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index 55aa2cf93666..c973aa47d1f8 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -6,25 +6,34 @@
 #include <stdlib.h>
 #include <unistd.h>
 
+#include <rte_bus_vdev.h>
 #include <rte_ethdev.h>
 #include <rte_ether.h>
+#include <rte_ip.h>
 #include <rte_mbuf.h>
 #include <rte_mempool.h>
 #include <rte_net.h>
 #include <rte_pcapng.h>
+#include <rte_random.h>
+#include <rte_reciprocal.h>
+#include <rte_time.h>
+#include <rte_udp.h>
 
 #include <pcap/pcap.h>
 
 #include "test.h"
 
-#define NUM_PACKETS    10
-#define DUMMY_MBUF_NUM 3
+#define PCAPNG_TEST_DEBUG 0
+
+#define TOTAL_PACKETS	4096
+#define MAX_BURST	64
+#define MAX_GAP_US	100000
+#define DUMMY_MBUF_NUM	3
 
-static rte_pcapng_t *pcapng;
 static struct rte_mempool *mp;
 static const uint32_t pkt_len = 200;
 static uint16_t port_id;
-static char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+static const char null_dev[] = "net_null0";
 
 /* first mbuf in the packet, should always be at offset 0 */
 struct dummy_mbuf {
@@ -61,6 +70,7 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 	struct {
 		struct rte_ether_hdr eth;
 		struct rte_ipv4_hdr ip;
+		struct rte_udp_hdr udp;
 	} pkt = {
 		.eth = {
 			.dst_addr.addr_bytes = "\xff\xff\xff\xff\xff\xff",
@@ -68,149 +78,201 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 		},
 		.ip = {
 			.version_ihl = RTE_IPV4_VHL_DEF,
-			.total_length = rte_cpu_to_be_16(plen),
-			.time_to_live = IPDEFTTL,
-			.next_proto_id = IPPROTO_RAW,
+			.time_to_live = 1,
+			.next_proto_id = IPPROTO_UDP,
 			.src_addr = rte_cpu_to_be_32(RTE_IPV4_LOOPBACK),
 			.dst_addr = rte_cpu_to_be_32(RTE_IPV4_BROADCAST),
-		}
+		},
+		.udp = {
+			.dst_port = rte_cpu_to_be_16(9), /* Discard port */
+		},
 	};
 
 	memset(dm, 0, sizeof(*dm));
 	dummy_mbuf_prep(&dm->mb[0], dm->buf[0], sizeof(dm->buf[0]), plen);
 
 	rte_eth_random_addr(pkt.eth.src_addr.addr_bytes);
-	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, RTE_MIN(sizeof(pkt), plen));
+	plen -= sizeof(struct rte_ether_hdr);
+
+	pkt.ip.total_length = rte_cpu_to_be_16(plen);
+	pkt.ip.hdr_checksum = rte_ipv4_cksum(&pkt.ip);
+
+	plen -= sizeof(struct rte_ipv4_hdr);
+	pkt.udp.src_port = rte_rand();
+	pkt.udp.dgram_len = rte_cpu_to_be_16(plen);
+
+	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, sizeof(pkt));
 }
 
 static int
 test_setup(void)
 {
-	int tmp_fd;
-
-	port_id = rte_eth_find_next(0);
-	if (port_id >= RTE_MAX_ETHPORTS) {
-		fprintf(stderr, "No valid Ether port\n");
-		return -1;
-	}
+	port_id = rte_eth_dev_count_avail();
 
-	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
-	if (tmp_fd == -1) {
-		perror("mkstemps() failure");
-		return -1;
-	}
-	printf("pcapng: output file %s\n", file_name);
-
-	/* open a test capture file */
-	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
-	if (pcapng == NULL) {
-		fprintf(stderr, "rte_pcapng_fdopen failed\n");
-		close(tmp_fd);
-		return -1;
-	}
-
-	/* Add interface to the file */
-	if (rte_pcapng_add_interface(pcapng, port_id,
-				     NULL, NULL, NULL) != 0) {
-		fprintf(stderr, "can not add port %u\n", port_id);
-		return -1;
+	/* Make a dummy null device to snoop on */
+	if (rte_vdev_init(null_dev, NULL) != 0) {
+		fprintf(stderr, "Failed to create vdev '%s'\n", null_dev);
+		goto fail;
 	}
 
 	/* Make a pool for cloned packets */
-	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool", IOV_MAX + NUM_PACKETS,
-					    0, 0,
-					    rte_pcapng_mbuf_size(pkt_len),
+	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool",
+					    MAX_BURST, 0, 0,
+					    rte_pcapng_mbuf_size(pkt_len) + 128,
 					    SOCKET_ID_ANY, "ring_mp_sc");
 	if (mp == NULL) {
 		fprintf(stderr, "Cannot create mempool\n");
-		return -1;
+		goto fail;
 	}
+
 	return 0;
+
+fail:
+	rte_vdev_uninit(null_dev);
+	rte_mempool_free(mp);
+	return -1;
 }
 
 static int
-test_write_packets(void)
+fill_pcapng_file(rte_pcapng_t *pcapng, unsigned int num_packets)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[NUM_PACKETS] = { };
 	struct dummy_mbuf mbfs;
-	unsigned int i;
+	struct rte_mbuf *orig;
+	unsigned int burst_size;
+	unsigned int count;
 	ssize_t len;
 
 	/* make a dummy packet */
 	mbuf1_prepare(&mbfs, pkt_len);
-
-	/* clone them */
 	orig  = &mbfs.mb[0];
-	for (i = 0; i < NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
+	for (count = 0; count < num_packets; count += burst_size) {
+		struct rte_mbuf *clones[MAX_BURST];
+		unsigned int i;
+
+		/* put 1 .. MAX_BURST packets in one write call */
+		burst_size = rte_rand_max(MAX_BURST) + 1;
+		for (i = 0; i < burst_size; i++) {
+			struct rte_mbuf *mc;
+
+			mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
+					     RTE_PCAPNG_DIRECTION_IN,
+					     NULL);
+			if (mc == NULL) {
+				fprintf(stderr, "Cannot copy packet\n");
+				return -1;
+			}
+			clones[i] = mc;
+		}
+
+		/* write it to capture file */
+		len = rte_pcapng_write_packets(pcapng, clones, burst_size);
+		rte_pktmbuf_free_bulk(clones, burst_size);
+
+		if (len <= 0) {
+			fprintf(stderr, "Write of packets failed: %s\n",
+				rte_strerror(rte_errno));
 			return -1;
 		}
-		clones[i] = mc;
+
+		/* Leave a small gap between packets to test for time wrap */
+		usleep(rte_rand_max(MAX_GAP_US));
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, NUM_PACKETS);
+	return count;
+}
 
-	rte_pktmbuf_free_bulk(clones, NUM_PACKETS);
+static char *
+fmt_time(char *buf, size_t size, uint64_t ts_ns)
+{
+	time_t sec;
+	size_t len;
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
-	}
+	sec = ts_ns / NS_PER_S;
+	len = strftime(buf, size, "%X", localtime(&sec));
+	snprintf(buf + len, size - len, ".%09lu",
+		 (unsigned long)(ts_ns % NS_PER_S));
 
-	return 0;
+	return buf;
 }
 
-static int
-test_write_stats(void)
+/* Context for the pcap_loop callback */
+struct pkt_print_ctx {
+	pcap_t *pcap;
+	unsigned int count;
+	uint64_t start_ns;
+	uint64_t end_ns;
+};
+
+static void
+print_packet(uint64_t ts_ns, const struct rte_ether_hdr *eh, size_t len)
 {
-	ssize_t len;
+	char tbuf[128], src[64], dst[64];
 
-	/* write a statistics block */
-	len = rte_pcapng_write_stats(pcapng, port_id, NULL,
-				     0, 0, 0,
-				     NUM_PACKETS, 0);
-	if (len <= 0) {
-		fprintf(stderr, "Write of statistics failed\n");
-		return -1;
-	}
-	return 0;
+	fmt_time(tbuf, sizeof(tbuf), ts_ns);
+	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
+	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
+	printf("%s: %s -> %s type %x length %zu\n",
+	       tbuf, src, dst, rte_be_to_cpu_16(eh->ether_type), len);
 }
 
+/* Callback from pcap_loop used to validate packets in the file */
 static void
-pkt_print(u_char *user, const struct pcap_pkthdr *h,
-	  const u_char *bytes)
+parse_pcap_packet(u_char *user, const struct pcap_pkthdr *h,
+		  const u_char *bytes)
 {
-	unsigned int *countp = (unsigned int *)user;
+	struct pkt_print_ctx *ctx = (struct pkt_print_ctx *)user;
 	const struct rte_ether_hdr *eh;
-	struct tm *tm;
-	char tbuf[128], src[64], dst[64];
+	const struct rte_ipv4_hdr *ip;
+	uint64_t ns;
 
-	tm = localtime(&h->ts.tv_sec);
-	if (tm == NULL) {
-		perror("localtime");
-		return;
+	eh = (const struct rte_ether_hdr *)bytes;
+	ip = (const struct rte_ipv4_hdr *)(eh + 1);
+
+	ctx->count += 1;
+
+	/* The pcap library is misleading in reporting timestamp.
+	 * packet header struct gives timestamp as a timeval (i.e. usec);
+	 * but the file is opened in nanosecond mode, therefore
+	 * the timestamp is really in timespec (i.e. nanoseconds).
+	 */
+	ns = h->ts.tv_sec * NS_PER_S + h->ts.tv_usec;
+	if (ns < ctx->start_ns || ns > ctx->end_ns) {
+		char tstart[128], tend[128];
+
+		fmt_time(tstart, sizeof(tstart), ctx->start_ns);
+		fmt_time(tend, sizeof(tend), ctx->end_ns);
+		fprintf(stderr, "Timestamp out of range [%s .. %s]\n",
+			tstart, tend);
+		goto error;
 	}
 
-	if (strftime(tbuf, sizeof(tbuf), "%X", tm) == 0) {
-		fprintf(stderr, "strftime returned 0!\n");
-		return;
+	if (!rte_is_broadcast_ether_addr(&eh->dst_addr)) {
+		fprintf(stderr, "Destination is not broadcast\n");
+		goto error;
 	}
 
-	eh = (const struct rte_ether_hdr *)bytes;
-	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
-	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
-	printf("%s.%06lu: %s -> %s type %x length %u\n",
-	       tbuf, (unsigned long)h->ts.tv_usec,
-	       src, dst, rte_be_to_cpu_16(eh->ether_type), h->len);
+	if (rte_ipv4_cksum(ip) != 0) {
+		fprintf(stderr, "Bad IPv4 checksum\n");
+		goto error;
+	}
+
+	return;		/* packet is normal */
+
+error:
+	print_packet(ns, eh, h->len);
+
+	/* Stop parsing at first error */
+	pcap_breakloop(ctx->pcap);
+}
 
-	*countp += 1;
+static uint64_t
+current_timestamp(void)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_REALTIME, &ts);
+	return rte_timespec_to_ns(&ts);
 }
 
 /*
@@ -219,78 +281,162 @@ pkt_print(u_char *user, const struct pcap_pkthdr *h,
  * but that creates an unwanted dependency.
  */
 static int
-test_validate(void)
+valid_pcapng_file(const char *file_name, uint64_t started, unsigned int expected)
 {
 	char errbuf[PCAP_ERRBUF_SIZE];
-	unsigned int count = 0;
-	pcap_t *pcap;
+	struct pkt_print_ctx ctx = { };
 	int ret;
 
-	pcap = pcap_open_offline(file_name, errbuf);
-	if (pcap == NULL) {
+	ctx.start_ns = started;
+	ctx.end_ns = current_timestamp();
+
+	ctx.pcap = pcap_open_offline_with_tstamp_precision(file_name,
+							   PCAP_TSTAMP_PRECISION_NANO,
+							   errbuf);
+	if (ctx.pcap == NULL) {
 		fprintf(stderr, "pcap_open_offline('%s') failed: %s\n",
 			file_name, errbuf);
 		return -1;
 	}
 
-	ret = pcap_loop(pcap, 0, pkt_print, (u_char *)&count);
-	if (ret == 0)
-		printf("Saw %u packets\n", count);
-	else
+	ret = pcap_loop(ctx.pcap, 0, parse_pcap_packet, (u_char *)&ctx);
+	if (ret != 0) {
 		fprintf(stderr, "pcap_dispatch: failed: %s\n",
-			pcap_geterr(pcap));
-	pcap_close(pcap);
+			pcap_geterr(ctx.pcap));
+	} else if (ctx.count != expected) {
+		printf("Only %u packets, expected %u\n",
+		       ctx.count, expected);
+		ret = -1;
+	}
+
+	pcap_close(ctx.pcap);
 
 	return ret;
 }
 
 static int
-test_write_over_limit_iov_max(void)
+test_add_interface(void)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[IOV_MAX + NUM_PACKETS] = { };
-	struct dummy_mbuf mbfs;
-	unsigned int i;
-	ssize_t len;
+	char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	static rte_pcapng_t *pcapng;
+	int ret, tmp_fd;
+	uint64_t now = current_timestamp();
 
-	/* make a dummy packet */
-	mbuf1_prepare(&mbfs, pkt_len);
+	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		goto fail;
+	}
+	printf("pcapng: output file %s\n", file_name);
 
-	/* clone them */
-	orig  = &mbfs.mb[0];
-	for (i = 0; i < IOV_MAX + NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_addif", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		goto fail;
+	}
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
-			return -1;
-		}
-		clones[i] = mc;
+	/* Add interface to the file */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u\n", port_id);
+		goto fail;
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, IOV_MAX + NUM_PACKETS);
+	/* Add interface with ifname and ifdescr */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       "myeth", "Some long description", NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u with ifname\n", port_id);
+		goto fail;
+	}
+
+	/* Add interface with filter */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, "tcp port 8080");
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u with filter\n", port_id);
+		goto fail;
+	}
 
-	rte_pktmbuf_free_bulk(clones, IOV_MAX + NUM_PACKETS);
+	rte_pcapng_close(pcapng);
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
+	ret = valid_pcapng_file(file_name, now, 0);
+	/* if test fails want to investigate the file */
+	if (ret == 0)
+		unlink(file_name);
+
+	return ret;
+
+fail:
+	rte_pcapng_close(pcapng);
+	return -1;
+}
+
+static int
+test_write_packets(void)
+{
+	char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	static rte_pcapng_t *pcapng;
+	int ret, tmp_fd, count;
+	uint64_t now = current_timestamp();
+
+	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		goto fail;
 	}
+	printf("pcapng: output file %s\n", file_name);
 
-	return 0;
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		goto fail;
+	}
+
+	/* Add interface to the file */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u\n", port_id);
+		goto fail;
+	}
+
+	count = fill_pcapng_file(pcapng, TOTAL_PACKETS);
+	if (count < 0)
+		goto fail;
+
+	/* write a statistics block */
+	ret = rte_pcapng_write_stats(pcapng, port_id,
+				     count, 0, "end of test");
+	if (ret <= 0) {
+		fprintf(stderr, "Write of statistics failed\n");
+		goto fail;
+	}
+
+	rte_pcapng_close(pcapng);
+
+	ret = valid_pcapng_file(file_name, now, count);
+	/* if test fails want to investigate the file */
+	if (ret == 0)
+		unlink(file_name);
+
+	return ret;
+
+fail:
+	rte_pcapng_close(pcapng);
+	return -1;
 }
 
 static void
 test_cleanup(void)
 {
 	rte_mempool_free(mp);
-
-	if (pcapng)
-		rte_pcapng_close(pcapng);
-
+	rte_vdev_uninit(null_dev);
 }
 
 static struct
@@ -299,10 +445,8 @@ unit_test_suite test_pcapng_suite  = {
 	.teardown = test_cleanup,
 	.suite_name = "Test Pcapng Unit Test Suite",
 	.unit_test_cases = {
+		TEST_CASE(test_add_interface),
 		TEST_CASE(test_write_packets),
-		TEST_CASE(test_write_stats),
-		TEST_CASE(test_validate),
-		TEST_CASE(test_write_over_limit_iov_max),
 		TEST_CASES_END()
 	}
 };
@@ -313,4 +457,4 @@ test_pcapng(void)
 	return unit_test_suite_runner(&test_pcapng_suite);
 }
 
-REGISTER_TEST_COMMAND(pcapng_autotest, test_pcapng);
+REGISTER_FAST_TEST(pcapng_autotest, true, true, test_pcapng);
-- 
2.39.2



* RE: [PATCH v3 2/5] dumpcap: allow multiple invocations
  2023-11-09 17:16       ` Stephen Hemminger
@ 2023-11-09 18:22         ` Morten Brørup
  0 siblings, 0 replies; 61+ messages in thread
From: Morten Brørup @ 2023-11-09 18:22 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Isaac Boukris

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, 9 November 2023 18.16
> 
> On Thu, 9 Nov 2023 08:50:10 +0100
> Morten Brørup <mb@smartsharesystems.com> wrote:
> 
> > >  			rte_exit(EXIT_FAILURE,
> > >  				"Packet dump enable on %u:%s failed %s\n",
> > >  				intf->port, intf->name,
> > > -				rte_strerror(-ret));
> > > +				rte_strerror(rte_errno));
> >
> > This bugfix (the line above, not the patch itself) supports Tyler's
> > proposal to standardize on returning -1 with rte_errno set on failure,
> > instead of some functions returning -errno. Our dual convention for
> > function return values will cause many bugs like this.
> 
> The error case here is when rte_pdump_enable_bpf() fails.
> This is the return value from pdump_enable in the pdump library.
> The library does follow the rte_errno convention correctly.

I'm sorry about being unclear in my comment about rte_errno conventions; it was not targeted at this library.

My comment was meant as general support for Tyler's suggestion, using this as an example of a bug that would not have been there if the return convention was always -1 with rte_errno.

With the dual return convention, it's amazing that you caught this bug.

> But the error message wasn't reported correctly, which would lead to a
> confusing error in the case where multiple invocations failed.
> 
> It is not possible to do multiple captures on the same interface, and it
> is not worth modifying the library (it would require multiple copies and
> ref counts) to handle this case.



* RE: [PATCH v4 2/5] dumpcap: allow multiple invocations
  2023-11-09 17:34   ` [PATCH v4 2/5] dumpcap: allow multiple invocations Stephen Hemminger
@ 2023-11-09 18:30     ` Morten Brørup
  0 siblings, 0 replies; 61+ messages in thread
From: Morten Brørup @ 2023-11-09 18:30 UTC (permalink / raw)
  To: Stephen Hemminger, dev; +Cc: Isaac Boukris, Reshma Pattan

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, 9 November 2023 18.34
> 
> If dumpcap is run twice with each instance pointing at a different
> interface, it would fail because of overlap in ring and pool names.
> Fix by putting the process id in the name.
> 
> It is still not allowed to do multiple invocations on the same
> interface because only one callback is allowed and only one copy
> of mbuf is done. Dumpcap will fail with error in this case:
> 
>    pdump_prepare_client_request(): client request for pdump enable/disable failed
>    EAL: Error - exiting with code: 1
>      Cause: Packet dump enable on 0:net_null0 failed File exists
> 
> Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
> Reported-by: Isaac Boukris <iboukris@gmail.com>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---

[...]

> +	snprintf(ring_name, sizeof(ring_name),
> +		 "dumpcap-%d", getpid());

Fixed - thank you.

[...]

> +	snprintf(pool_name, sizeof(pool_name), "capture_%u", getpid());

Should change from %u to %d here too. ;-)

Either way,

Reviewed-by: Morten Brørup <mb@smartsharesystems.com>



* [PATCH v5 0/5] dumpcap and pcapng fixes
  2023-09-21  4:23 [PATCH 0/4] pcapng fixes Stephen Hemminger
                   ` (6 preceding siblings ...)
  2023-11-09 17:34 ` [PATCH v4 0/5] dumpcap and pcapng fixes Stephen Hemminger
@ 2023-11-09 19:45 ` Stephen Hemminger
  2023-11-09 19:45   ` [PATCH v5 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
                     ` (4 more replies)
  2023-11-13 16:15 ` [PATCH v6 0/5] dumpcap and pcapng fixes Stephen Hemminger
  2023-11-17 16:35 ` [PATCH v7 0/5] dumpcap and pcapng fixes Stephen Hemminger
  9 siblings, 5 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 19:45 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

This series has bugfixes and tests for dumpcap and pcapng.
It should be in 23.11!

It fixes issues related to timestamping. The design choice is
to maximize performance in the primary process and to do
all the time adjustment in the secondary process (dumpcap), since
dumpcap needs to make system calls anyway to write the result.

This patch set moves the calculation of the adjustment
into the part of pcapng that opens the output file.
All details of the timestamp format are contained inside
pcapng (data hiding).

v5 - fix format of getpid in capture name
v4 - incorporate review feedback
v3 - don't use alloca() since can have VLA type issues

Stephen Hemminger (5):
  pdump: fix setting rte_errno on mp error
  dumpcap: allow multiple invocations
  pcapng: modify timestamp calculation
  pcapng: avoid using alloca()
  test: cleanups to pcapng test

 app/dumpcap/main.c      |  53 ++---
 app/test/meson.build    |   2 +-
 app/test/test_pcapng.c  | 418 +++++++++++++++++++++++++++-------------
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 156 ++++++---------
 lib/pcapng/rte_pcapng.h |  19 +-
 lib/pdump/rte_pdump.c   |   9 +-
 7 files changed, 374 insertions(+), 285 deletions(-)

-- 
2.39.2



* [PATCH v5 1/5] pdump: fix setting rte_errno on mp error
  2023-11-09 19:45 ` [PATCH v5 0/5] dumpcap and pcapng fixes Stephen Hemminger
@ 2023-11-09 19:45   ` Stephen Hemminger
  2023-11-09 19:45   ` [PATCH v5 2/5] dumpcap: allow multiple invocations Stephen Hemminger
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 19:45 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan, Jianfeng Tan

The response from MP server sets err_value to negative
on error. The convention for rte_errno is to use a positive
value on error. This makes errors like duplicate registration
show up with the correct error value.
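
The sign convention above can be sketched in isolation. This is only an
illustration, not the pdump code: struct response, last_error and
handle_response() are hypothetical stand-ins for the MP reply, rte_errno
and the client request path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the reply sent back by the MP server:
 * err_value is 0 on success or a negative errno value on failure. */
struct response {
	int err_value;
};

/* Hypothetical stand-in for rte_errno: always a positive errno value */
static int last_error;

/* Returns 0 on success, -1 on failure with last_error set (positive) */
static int
handle_response(const struct response *resp)
{
	assert(resp != NULL);

	if (resp->err_value == 0)
		return 0;

	/* flip the sign so strerror()-style lookups get a valid code */
	last_error = -resp->err_value;
	return -1;
}
```

With this shape a caller tests for -1 and then formats last_error, so a
duplicate registration reported as -EEXIST ends up printed as
"File exists" rather than an unknown error.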

Fixes: 660098d61f57 ("pdump: use generic multi-process channel")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/pdump/rte_pdump.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index 80b90c6f7d03..e94f49e21250 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -564,9 +564,10 @@ pdump_prepare_client_request(const char *device, uint16_t queue,
 	if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) == 0) {
 		mp_rep = &mp_reply.msgs[0];
 		resp = (struct pdump_response *)mp_rep->param;
-		rte_errno = resp->err_value;
-		if (!resp->err_value)
+		if (resp->err_value == 0)
 			ret = 0;
+		else
+			rte_errno = -resp->err_value;
 		free(mp_reply.msgs);
 	}
 
-- 
2.39.2



* [PATCH v5 2/5] dumpcap: allow multiple invocations
  2023-11-09 19:45 ` [PATCH v5 0/5] dumpcap and pcapng fixes Stephen Hemminger
  2023-11-09 19:45   ` [PATCH v5 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
@ 2023-11-09 19:45   ` Stephen Hemminger
  2023-11-09 20:09     ` Morten Brørup
  2023-11-09 19:45   ` [PATCH v5 3/5] pcapng: modify timestamp calculation Stephen Hemminger
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 19:45 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Isaac Boukris, Reshma Pattan

If dumpcap is run twice with each instance pointing at a different
interface, it would fail because of overlap in ring and pool names.
Fix by putting the process id in the name.
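
A rough sketch of the naming scheme, outside of DPDK:
make_instance_name() is a hypothetical helper, not a function in
dumpcap, and the truncation check is an addition for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: build "<prefix>-<pid>" into buf.
 * Returns 0 on success, -1 if the name did not fit in the buffer. */
static int
make_instance_name(char *buf, size_t size, const char *prefix, pid_t pid)
{
	int n;

	assert(buf != NULL && prefix != NULL);

	/* pid_t is a signed integer type, so it is printed with %d */
	n = snprintf(buf, size, "%s-%d", prefix, (int)pid);
	if (n < 0 || (size_t)n >= size)
		return -1;

	return 0;
}
```

A caller would pass getpid() as the last argument, so two instances
running at the same time get distinct names.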

It is still not allowed to do multiple invocations on the same
interface because only one callback is allowed and only one copy
of mbuf is done. Dumpcap will fail with error in this case:

   pdump_prepare_client_request(): client request for pdump enable/disable failed
   EAL: Error - exiting with code: 1
     Cause: Packet dump enable on 0:net_null0 failed File exists

Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
Reported-by: Isaac Boukris <iboukris@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/dumpcap/main.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 64294bbfb3e6..efc60372d718 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -44,7 +44,6 @@
 #include <pcap/pcap.h>
 #include <pcap/bpf.h>
 
-#define RING_NAME "capture-ring"
 #define MONITOR_INTERVAL  (500 * 1000)
 #define MBUF_POOL_CACHE_SIZE 32
 #define BURST_SIZE 32
@@ -647,6 +646,7 @@ static void dpdk_init(void)
 static struct rte_ring *create_ring(void)
 {
 	struct rte_ring *ring;
+	char ring_name[RTE_RING_NAMESIZE];
 	size_t size, log2;
 
 	/* Find next power of 2 >= size. */
@@ -660,28 +660,28 @@ static struct rte_ring *create_ring(void)
 		ring_size = size;
 	}
 
-	ring = rte_ring_lookup(RING_NAME);
-	if (ring == NULL) {
-		ring = rte_ring_create(RING_NAME, ring_size,
-					rte_socket_id(), 0);
-		if (ring == NULL)
-			rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
-				 rte_strerror(rte_errno));
-	}
+	/* Want one ring per invocation of program */
+	snprintf(ring_name, sizeof(ring_name),
+		 "dumpcap-%d", getpid());
+
+	ring = rte_ring_create(ring_name, ring_size,
+			       rte_socket_id(), 0);
+	if (ring == NULL)
+		rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
+			 rte_strerror(rte_errno));
+
 	return ring;
 }
 
 static struct rte_mempool *create_mempool(void)
 {
 	const struct interface *intf;
-	static const char pool_name[] = "capture_mbufs";
+	char pool_name[RTE_MEMPOOL_NAMESIZE];
 	size_t num_mbufs = 2 * ring_size;
 	struct rte_mempool *mp;
 	uint32_t data_size = 128;
 
-	mp = rte_mempool_lookup(pool_name);
-	if (mp)
-		return mp;
+	snprintf(pool_name, sizeof(pool_name), "capture_%d", getpid());
 
 	/* Common pool so size mbuf for biggest snap length */
 	TAILQ_FOREACH(intf, &interfaces, next) {
@@ -826,7 +826,7 @@ static void enable_pdump(struct rte_ring *r, struct rte_mempool *mp)
 			rte_exit(EXIT_FAILURE,
 				"Packet dump enable on %u:%s failed %s\n",
 				intf->port, intf->name,
-				rte_strerror(-ret));
+				rte_strerror(rte_errno));
 		}
 
 		if (intf->opts.promisc_mode) {
-- 
2.39.2



* [PATCH v5 3/5] pcapng: modify timestamp calculation
  2023-11-09 19:45 ` [PATCH v5 0/5] dumpcap and pcapng fixes Stephen Hemminger
  2023-11-09 19:45   ` [PATCH v5 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
  2023-11-09 19:45   ` [PATCH v5 2/5] dumpcap: allow multiple invocations Stephen Hemminger
@ 2023-11-09 19:45   ` Stephen Hemminger
  2023-11-12 14:22     ` Thomas Monjalon
  2023-11-09 19:45   ` [PATCH v5 4/5] pcapng: avoid using alloca() Stephen Hemminger
  2023-11-09 19:45   ` [PATCH v5 5/5] test: cleanups to pcapng test Stephen Hemminger
  4 siblings, 1 reply; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 19:45 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan,
	Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan,
	Quentin Armitage

The computation of the timestamp is best done in the part of
the pcapng library that runs in the secondary process.
The secondary process is already doing a bunch of system
calls, which makes it not performance sensitive.
This does change the experimental rte_pcapng_copy()
and rte_pcapng_write_stats() APIs.

Simplify the computation of nanoseconds from TSC to a
two-step process which avoids numeric overflow issues. The previous
code was also not thread safe.
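
The two-step conversion can be sketched on its own, assuming a TSC
frequency below roughly 18.4 GHz. cycles_to_ns() is a hypothetical
standalone helper that takes the frequency as a parameter; it is not
the library function.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_S 1000000000ULL

/* Hypothetical helper: convert a cycle delta to nanoseconds.
 * Multiplying delta by NS_PER_S directly would overflow 64 bits after
 * a few seconds, so split into whole seconds and a remainder first. */
static uint64_t
cycles_to_ns(uint64_t delta, uint64_t hz)
{
	uint64_t secs, rem;

	assert(hz != 0);

	secs = delta / hz;	/* whole seconds */
	rem = delta % hz;	/* leftover cycles, always < hz */

	/* rem * NS_PER_S fits in 64 bits as long as hz < ~18.4e9 */
	return secs * NS_PER_S + (rem * NS_PER_S) / hz;
}
```

For example, 4.5e9 cycles at 3 GHz comes out as 1.5e9 ns, and a delta
of 3e14 cycles (about 28 hours at 3 GHz) still converts exactly, where
the single multiply would have wrapped.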

Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 app/dumpcap/main.c      |  25 +++------
 app/test/test_pcapng.c  |   4 +-
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 119 +++++++++++++++-------------------------
 lib/pcapng/rte_pcapng.h |  19 ++-----
 lib/pdump/rte_pdump.c   |   4 +-
 6 files changed, 61 insertions(+), 112 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index efc60372d718..583bce80166c 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -66,13 +66,13 @@ static bool print_stats;
 
 /* capture limit options */
 static struct {
-	uint64_t  duration;	/* nanoseconds */
+	time_t  duration;	/* seconds */
 	unsigned long packets;  /* number of packets in file */
 	size_t size;		/* file size (bytes) */
 } stop;
 
 /* Running state */
-static uint64_t start_time, end_time;
+static time_t start_time;
 static uint64_t packets_received;
 static size_t file_size;
 
@@ -197,7 +197,7 @@ static void auto_stop(char *opt)
 		if (*value == '\0' || *endp != '\0' || interval <= 0)
 			rte_exit(EXIT_FAILURE,
 				 "Invalid duration \"%s\"\n", value);
-		stop.duration = NSEC_PER_SEC * interval;
+		stop.duration = interval;
 	} else if (strcmp(opt, "filesize") == 0) {
 		stop.size = get_uint(value, "filesize", 0) * 1024;
 	} else if (strcmp(opt, "packets") == 0) {
@@ -511,15 +511,6 @@ static void statistics_loop(void)
 	}
 }
 
-/* Return the time since 1/1/1970 in nanoseconds */
-static uint64_t create_timestamp(void)
-{
-	struct timespec now;
-
-	clock_gettime(CLOCK_MONOTONIC, &now);
-	return rte_timespec_to_ns(&now);
-}
-
 static void
 cleanup_pdump_resources(void)
 {
@@ -589,9 +580,8 @@ report_packet_stats(dumpcap_out_t out)
 		ifdrop = pdump_stats.nombuf + pdump_stats.ringfull;
 
 		if (use_pcapng)
-			rte_pcapng_write_stats(out.pcapng, intf->port, NULL,
-					       start_time, end_time,
-					       ifrecv, ifdrop);
+			rte_pcapng_write_stats(out.pcapng, intf->port,
+					       ifrecv, ifdrop, NULL);
 
 		if (ifrecv == 0)
 			percent = 0;
@@ -983,7 +973,7 @@ int main(int argc, char **argv)
 	mp = create_mempool();
 	out = create_output();
 
-	start_time = create_timestamp();
+	start_time = time(NULL);
 	enable_pdump(r, mp);
 
 	if (!quiet) {
@@ -1005,11 +995,10 @@ int main(int argc, char **argv)
 			break;
 
 		if (stop.duration != 0 &&
-		    create_timestamp() - start_time > stop.duration)
+		    time(NULL) - start_time > stop.duration)
 			break;
 	}
 
-	end_time = create_timestamp();
 	disable_primary_monitor();
 
 	if (rte_eal_primary_proc_alive(NULL))
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index b8429a02f160..55aa2cf93666 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -173,8 +173,8 @@ test_write_stats(void)
 	ssize_t len;
 
 	/* write a statistics block */
-	len = rte_pcapng_write_stats(pcapng, port_id,
-				     NULL, 0, 0,
+	len = rte_pcapng_write_stats(pcapng, port_id, NULL,
+				     0, 0, 0,
 				     NUM_PACKETS, 0);
 	if (len <= 0) {
 		fprintf(stderr, "Write of statistics failed\n");
diff --git a/lib/graph/graph_pcap.c b/lib/graph/graph_pcap.c
index db722c375fa7..89525f1220ca 100644
--- a/lib/graph/graph_pcap.c
+++ b/lib/graph/graph_pcap.c
@@ -214,7 +214,7 @@ graph_pcap_dispatch(struct rte_graph *graph,
 		mbuf = (struct rte_mbuf *)objs[i];
 
 		mc = rte_pcapng_copy(mbuf->port, 0, mbuf, pkt_mp, mbuf->pkt_len,
-				     rte_get_tsc_cycles(), 0, buffer);
+				     0, buffer);
 		if (mc == NULL)
 			break;
 
diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 3c91fc77644a..13fd2b97fb80 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -36,22 +36,14 @@
 /* Format of the capture file handle */
 struct rte_pcapng {
 	int  outfd;		/* output file */
-
 	unsigned int ports;	/* number of interfaces added */
+	uint64_t offset_ns;	/* ns since 1/1/1970 when initialized */
+	uint64_t tsc_base;	/* TSC when started */
 
 	/* DPDK port id to interface index in file */
 	uint32_t port_index[RTE_MAX_ETHPORTS];
 };
 
-/* For converting TSC cycles to PCAPNG ns format */
-static struct pcapng_time {
-	uint64_t ns;
-	uint64_t cycles;
-	uint64_t tsc_hz;
-	struct rte_reciprocal_u64 tsc_hz_inverse;
-} pcapng_time;
-
-
 #ifdef RTE_EXEC_ENV_WINDOWS
 /*
  * Windows does not have writev() call.
@@ -102,56 +94,21 @@ static ssize_t writev(int fd, const struct iovec *iov, int iovcnt)
 #define if_indextoname(ifindex, ifname) NULL
 #endif
 
-static inline void
-pcapng_init(void)
+/* Convert from TSC (CPU cycles) to nanoseconds */
+static uint64_t
+pcapng_timestamp(const rte_pcapng_t *self, uint64_t cycles)
 {
-	struct timespec ts;
+	uint64_t delta, rem, secs, ns;
+	const uint64_t hz = rte_get_tsc_hz();
 
-	pcapng_time.cycles = rte_get_tsc_cycles();
-	clock_gettime(CLOCK_REALTIME, &ts);
-	pcapng_time.cycles = (pcapng_time.cycles + rte_get_tsc_cycles()) / 2;
-	pcapng_time.ns = rte_timespec_to_ns(&ts);
-
-	pcapng_time.tsc_hz = rte_get_tsc_hz();
-	pcapng_time.tsc_hz_inverse = rte_reciprocal_value_u64(pcapng_time.tsc_hz);
-}
+	delta = cycles - self->tsc_base;
 
-/* PCAPNG timestamps are in nanoseconds */
-static uint64_t pcapng_tsc_to_ns(uint64_t cycles)
-{
-	uint64_t delta, secs;
-
-	if (!pcapng_time.tsc_hz)
-		pcapng_init();
-
-	/* In essence the calculation is:
-	 *   delta = (cycles - pcapng_time.cycles) * NSEC_PRE_SEC / rte_get_tsc_hz()
-	 * but this overflows within 4 to 8 seconds depending on TSC frequency.
-	 * Instead, if delta >= pcapng_time.tsc_hz:
-	 *   Increase pcapng_time.ns and pcapng_time.cycles by the number of
-	 *   whole seconds in delta and reduce delta accordingly.
-	 * delta will therefore always lie in the interval [0, pcapng_time.tsc_hz),
-	 * which will not overflow when multiplied by NSEC_PER_SEC provided the
-	 * TSC frequency < approx 18.4GHz.
-	 *
-	 * Currently all TSCs operate below 5GHz.
-	 */
-	delta = cycles - pcapng_time.cycles;
-	if (unlikely(delta >= pcapng_time.tsc_hz)) {
-		if (likely(delta < pcapng_time.tsc_hz * 2)) {
-			delta -= pcapng_time.tsc_hz;
-			pcapng_time.cycles += pcapng_time.tsc_hz;
-			pcapng_time.ns += NSEC_PER_SEC;
-		} else {
-			secs = rte_reciprocal_divide_u64(delta, &pcapng_time.tsc_hz_inverse);
-			delta -= secs * pcapng_time.tsc_hz;
-			pcapng_time.cycles += secs * pcapng_time.tsc_hz;
-			pcapng_time.ns += secs * NSEC_PER_SEC;
-		}
-	}
+	/* Avoid numeric wraparound by computing seconds first */
+	secs = delta / hz;
+	rem = delta % hz;
+	ns = (rem * NS_PER_S) / hz;
 
-	return pcapng_time.ns + rte_reciprocal_divide_u64(delta * NSEC_PER_SEC,
-							  &pcapng_time.tsc_hz_inverse);
+	return secs * NS_PER_S + ns + self->offset_ns;
 }
 
 /* length of option including padding */
@@ -368,15 +325,15 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
  */
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop)
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment)
 {
 	struct pcapng_statistics *hdr;
 	struct pcapng_option *opt;
+	uint64_t start_time = self->offset_ns;
+	uint64_t sample_time;
 	uint32_t optlen, len;
 	uint8_t *buf;
-	uint64_t ns;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -386,10 +343,10 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 		optlen += pcapng_optlen(sizeof(ifrecv));
 	if (ifdrop != UINT64_MAX)
 		optlen += pcapng_optlen(sizeof(ifdrop));
+
 	if (start_time != 0)
 		optlen += pcapng_optlen(sizeof(start_time));
-	if (end_time != 0)
-		optlen += pcapng_optlen(sizeof(end_time));
+
 	if (comment)
 		optlen += pcapng_optlen(strlen(comment));
 	if (optlen != 0)
@@ -409,9 +366,6 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	if (start_time != 0)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_STARTTIME,
 					 &start_time, sizeof(start_time));
-	if (end_time != 0)
-		opt = pcapng_add_option(opt, PCAPNG_ISB_ENDTIME,
-					 &end_time, sizeof(end_time));
 	if (ifrecv != UINT64_MAX)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_IFRECV,
 				&ifrecv, sizeof(ifrecv));
@@ -425,9 +379,9 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	hdr->block_length = len;
 	hdr->interface_id = self->port_index[port_id];
 
-	ns = pcapng_tsc_to_ns(rte_get_tsc_cycles());
-	hdr->timestamp_hi = ns >> 32;
-	hdr->timestamp_lo = (uint32_t)ns;
+	sample_time = pcapng_timestamp(self, rte_get_tsc_cycles());
+	hdr->timestamp_hi = sample_time >> 32;
+	hdr->timestamp_lo = (uint32_t)sample_time;
 
 	/* clone block_length after option */
 	memcpy(opt, &len, sizeof(uint32_t));
@@ -520,23 +474,21 @@ struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *md,
 		struct rte_mempool *mp,
-		uint32_t length, uint64_t cycles,
+		uint32_t length,
 		enum rte_pcapng_direction direction,
 		const char *comment)
 {
 	struct pcapng_enhance_packet_block *epb;
 	uint32_t orig_len, data_len, padding, flags;
 	struct pcapng_option *opt;
+	uint64_t timestamp;
 	uint16_t optlen;
 	struct rte_mbuf *mc;
-	uint64_t ns;
 	bool rss_hash;
 
 #ifdef RTE_LIBRTE_ETHDEV_DEBUG
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
 #endif
-	ns = pcapng_tsc_to_ns(cycles);
-
 	orig_len = rte_pktmbuf_pkt_len(md);
 
 	/* Take snapshot of the data */
@@ -641,8 +593,10 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	/* Interface index is filled in later during write */
 	mc->port = port_id;
 
-	epb->timestamp_hi = ns >> 32;
-	epb->timestamp_lo = (uint32_t)ns;
+	/* Put timestamp in cycles here - adjust in packet write */
+	timestamp = rte_get_tsc_cycles();
+	epb->timestamp_hi = timestamp >> 32;
+	epb->timestamp_lo = (uint32_t)timestamp;
 	epb->capture_length = data_len;
 	epb->original_length = orig_len;
 
@@ -668,6 +622,7 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 	for (i = 0; i < nb_pkts; i++) {
 		struct rte_mbuf *m = pkts[i];
 		struct pcapng_enhance_packet_block *epb;
+		uint64_t cycles, timestamp;
 
 		/* sanity check that is really a pcapng mbuf */
 		epb = rte_pktmbuf_mtod(m, struct pcapng_enhance_packet_block *);
@@ -684,6 +639,13 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 			return -1;
 		}
 
+		/* adjust timestamp recorded in packet */
+		cycles = (uint64_t)epb->timestamp_hi << 32;
+		cycles += epb->timestamp_lo;
+		timestamp = pcapng_timestamp(self, cycles);
+		epb->timestamp_hi = timestamp >> 32;
+		epb->timestamp_lo = (uint32_t)timestamp;
+
 		/*
 		 * Handle case of highly fragmented and large burst size
 		 * Note: this assumes that max segments per mbuf < IOV_MAX
@@ -725,6 +687,8 @@ rte_pcapng_fdopen(int fd,
 {
 	unsigned int i;
 	rte_pcapng_t *self;
+	struct timespec ts;
+	uint64_t cycles;
 
 	self = malloc(sizeof(*self));
 	if (!self) {
@@ -734,6 +698,13 @@ rte_pcapng_fdopen(int fd,
 
 	self->outfd = fd;
 	self->ports = 0;
+
+	/* record start time in ns since 1/1/1970 */
+	cycles = rte_get_tsc_cycles();
+	clock_gettime(CLOCK_REALTIME, &ts);
+	self->tsc_base = (cycles + rte_get_tsc_cycles()) / 2;
+	self->offset_ns = rte_timespec_to_ns(&ts);
+
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++)
 		self->port_index[i] = UINT32_MAX;
 
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index d93cc9f73ad5..c40795c721de 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -121,8 +121,6 @@ enum rte_pcapng_direction {
  * @param length
  *   The upper limit on bytes to copy.  Passing UINT32_MAX
  *   means all data (after offset).
- * @param timestamp
- *   The timestamp in TSC cycles.
  * @param direction
  *   The direction of the packer: receive, transmit or unknown.
  * @param comment
@@ -136,7 +134,7 @@ __rte_experimental
 struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *m, struct rte_mempool *mp,
-		uint32_t length, uint64_t timestamp,
+		uint32_t length,
 		enum rte_pcapng_direction direction, const char *comment);
 
 
@@ -188,29 +186,22 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
  *  The handle to the packet capture file
  * @param port
  *  The Ethernet port to report stats on.
- * @param comment
- *   Optional comment to add to statistics.
- * @param start_time
- *  The time when packet capture was started in nanoseconds.
- *  Optional: can be zero if not known.
- * @param end_time
- *  The time when packet capture was stopped in nanoseconds.
- *  Optional: can be zero if not finished;
  * @param ifrecv
  *  The number of packets received by capture.
  *  Optional: use UINT64_MAX if not known.
  * @param ifdrop
  *  The number of packets missed by the capture process.
  *  Optional: use UINT64_MAX if not known.
+ * @param comment
+ *  Optional comment to add to statistics.
  * @return
  *  number of bytes written to file, -1 on failure to write file
  */
 __rte_experimental
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop);
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment);
 
 #ifdef __cplusplus
 }
diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index e94f49e21250..5a1ec14d7a18 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -90,7 +90,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	int ring_enq;
 	uint16_t d_pkts = 0;
 	struct rte_mbuf *dup_bufs[nb_pkts];
-	uint64_t ts;
 	struct rte_ring *ring;
 	struct rte_mempool *mp;
 	struct rte_mbuf *p;
@@ -99,7 +98,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	if (cbs->filter)
 		rte_bpf_exec_burst(cbs->filter, (void **)pkts, rcs, nb_pkts);
 
-	ts = rte_get_tsc_cycles();
 	ring = cbs->ring;
 	mp = cbs->mp;
 	for (i = 0; i < nb_pkts; i++) {
@@ -122,7 +120,7 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 		if (cbs->ver == V2)
 			p = rte_pcapng_copy(port_id, queue,
 					    pkts[i], mp, cbs->snaplen,
-					    ts, direction, NULL);
+					    direction, NULL);
 		else
 			p = rte_pktmbuf_copy(pkts[i], mp, 0, cbs->snaplen);
 
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v5 4/5] pcapng: avoid using alloca()
  2023-11-09 19:45 ` [PATCH v5 0/5] dumpcap and pcapng fixes Stephen Hemminger
                     ` (2 preceding siblings ...)
  2023-11-09 19:45   ` [PATCH v5 3/5] pcapng: modify timestamp calculation Stephen Hemminger
@ 2023-11-09 19:45   ` Stephen Hemminger
  2023-11-09 19:45   ` [PATCH v5 5/5] test: cleanups to pcapng test Stephen Hemminger
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 19:45 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan

The function alloca(), like VLAs, has problems if the caller
passes a large value. Instead, use a fixed-size buffer (2K),
which is more than sufficient for the info-related blocks
in the file. Add bounds checks as well.
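
As a rough illustration of the pattern described above (not the code
from this patch), the sketch below builds a length-delimited block in a
fixed-size buffer and rejects anything that would not fit. The names
BLK_MAX and build_block() are hypothetical and exist only for this
example.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical upper bound, mirroring the 2K limit described above */
#define BLK_MAX 2048

/*
 * Hypothetical helper: build a block made of a 32-bit length, the
 * comment bytes, and a trailing copy of the length, in a buffer of
 * BLK_MAX bytes supplied by the caller.
 * Returns the block length, or -1 if the block would not fit.
 */
static ssize_t
build_block(uint8_t buf[BLK_MAX], const char *comment)
{
	size_t clen = strlen(comment);
	size_t len = sizeof(uint32_t) + clen + sizeof(uint32_t);
	uint32_t blen;

	/* Bounds check replaces the old "did the allocation fail" test */
	if (len > BLK_MAX)
		return -1;

	blen = (uint32_t)len;
	memcpy(buf, &blen, sizeof(blen));
	memcpy(buf + sizeof(blen), comment, clen);
	memcpy(buf + sizeof(blen) + clen, &blen, sizeof(blen));
	return (ssize_t)len;
}
```

A caller would declare `uint8_t buf[BLK_MAX];` on its own stack, so the
worst-case stack usage is known at compile time regardless of the
length of the input string.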

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/pcapng/rte_pcapng.c | 37 ++++++++++++++++---------------------
 1 file changed, 16 insertions(+), 21 deletions(-)

diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 13fd2b97fb80..f74ec939a9f8 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -33,6 +33,9 @@
 /* conversion from DPDK speed to PCAPNG */
 #define PCAPNG_MBPS_SPEED 1000000ull
 
+/* upper bound for section, stats and interface blocks */
+#define PCAPNG_BLKSIZ	2048
+
 /* Format of the capture file handle */
 struct rte_pcapng {
 	int  outfd;		/* output file */
@@ -140,9 +143,8 @@ pcapng_section_block(rte_pcapng_t *self,
 {
 	struct pcapng_section_header *hdr;
 	struct pcapng_option *opt;
-	void *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 	uint32_t len;
-	ssize_t cc;
 
 	len = sizeof(*hdr);
 	if (hw)
@@ -158,8 +160,7 @@ pcapng_section_block(rte_pcapng_t *self,
 	len += pcapng_optlen(0);
 	len += sizeof(uint32_t);
 
-	buf = calloc(1, len);
-	if (!buf)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_section_header *)buf;
@@ -193,10 +194,7 @@ pcapng_section_block(rte_pcapng_t *self,
 	/* clone block_length after option */
 	memcpy(opt, &hdr->block_length, sizeof(uint32_t));
 
-	cc = write(self->outfd, buf, len);
-	free(buf);
-
-	return cc;
+	return write(self->outfd, buf, len);
 }
 
 /* Write an interface block for a DPDK port */
@@ -213,7 +211,7 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 	struct pcapng_option *opt;
 	const uint8_t tsresol = 9;	/* nanosecond resolution */
 	uint32_t len;
-	void *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 	char ifname_buf[IF_NAMESIZE];
 	char ifhw[256];
 	uint64_t speed = 0;
@@ -267,8 +265,7 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 	len += pcapng_optlen(0);
 	len += sizeof(uint32_t);
 
-	buf = alloca(len);
-	if (!buf)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_interface_block *)buf;
@@ -296,17 +293,16 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 		opt = pcapng_add_option(opt, PCAPNG_IFB_HARDWARE,
 					 ifhw, strlen(ifhw));
 	if (filter) {
-		/* Encoding is that the first octet indicates string vs BPF */
 		size_t len;
-		char *buf;
 
 		len = strlen(filter) + 1;
-		buf = alloca(len);
-		*buf = '\0';
-		memcpy(buf + 1, filter, len);
+		opt->code = PCAPNG_IFB_FILTER;
+		opt->length = len;
+		/* Encoding is that the first octet indicates string vs BPF */
+		opt->data[0] = 0;
+		memcpy(opt->data + 1, filter, strlen(filter));
 
-		opt = pcapng_add_option(opt, PCAPNG_IFB_FILTER,
-					buf, len);
+		opt = (struct pcapng_option *)((uint8_t *)opt + pcapng_optlen(len));
 	}
 
 	opt = pcapng_add_option(opt, PCAPNG_OPT_END, NULL, 0);
@@ -333,7 +329,7 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	uint64_t start_time = self->offset_ns;
 	uint64_t sample_time;
 	uint32_t optlen, len;
-	uint8_t *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -353,8 +349,7 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 		optlen += pcapng_optlen(0);
 
 	len = sizeof(*hdr) + optlen + sizeof(uint32_t);
-	buf = alloca(len);
-	if (buf == NULL)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_statistics *)buf;
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v5 5/5] test: cleanups to pcapng test
  2023-11-09 19:45 ` [PATCH v5 0/5] dumpcap and pcapng fixes Stephen Hemminger
                     ` (3 preceding siblings ...)
  2023-11-09 19:45   ` [PATCH v5 4/5] pcapng: avoid using alloca() Stephen Hemminger
@ 2023-11-09 19:45   ` Stephen Hemminger
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-09 19:45 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan

Overhaul of the pcapng test:
  - promote it to be a fast test so it gets regularly run.
  - create null device and use it.
  - use UDP discard packets that are valid so that for debugging
    the resulting pcapng file can be looked at with wireshark.
  - do basic checks on resulting pcap file that lengths and
    timestamps are in range.
  - add test for interface options
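
The timestamp range check in the list above can be sketched roughly as
follows. This assumes a capture file opened in nanosecond precision
mode, where the tv_usec field of the pcap header's timeval actually
holds nanoseconds. The helper names tv_to_ns() and ts_in_range() are
hypothetical and are not part of the test in this patch.

```c
#include <stdint.h>
#include <sys/time.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: in nanosecond precision mode the tv_usec
 * field carries nanoseconds, so no scaling by 1000 is applied.
 * The cast widens tv_sec before multiplying to avoid overflow
 * where time_t is 32 bits.
 */
static uint64_t
tv_to_ns(const struct timeval *tv)
{
	return (uint64_t)tv->tv_sec * NSEC_PER_SEC + (uint64_t)tv->tv_usec;
}

/* Hypothetical helper: 1 if start <= ns <= end, otherwise 0 */
static int
ts_in_range(uint64_t ns, uint64_t start_ns, uint64_t end_ns)
{
	return ns >= start_ns && ns <= end_ns;
}
```

The window would typically be bracketed by one wall-clock reading taken
before the file is opened and another taken after it is closed.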

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/test/meson.build   |   2 +-
 app/test/test_pcapng.c | 418 +++++++++++++++++++++++++++--------------
 2 files changed, 282 insertions(+), 138 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index 4183d66b0e9c..dcc93f4a43b4 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -128,7 +128,7 @@ source_file_deps = {
     'test_metrics.c': ['metrics'],
     'test_mp_secondary.c': ['hash', 'lpm'],
     'test_net_ether.c': ['net'],
-    'test_pcapng.c': ['ethdev', 'net', 'pcapng'],
+    'test_pcapng.c': ['ethdev', 'net', 'pcapng', 'bus_vdev'],
     'test_pdcp.c': ['eventdev', 'pdcp', 'net', 'timer', 'security'],
     'test_pdump.c': ['pdump'] + sample_packet_forward_deps,
     'test_per_lcore.c': [],
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index 55aa2cf93666..c973aa47d1f8 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -6,25 +6,34 @@
 #include <stdlib.h>
 #include <unistd.h>
 
+#include <rte_bus_vdev.h>
 #include <rte_ethdev.h>
 #include <rte_ether.h>
+#include <rte_ip.h>
 #include <rte_mbuf.h>
 #include <rte_mempool.h>
 #include <rte_net.h>
 #include <rte_pcapng.h>
+#include <rte_random.h>
+#include <rte_reciprocal.h>
+#include <rte_time.h>
+#include <rte_udp.h>
 
 #include <pcap/pcap.h>
 
 #include "test.h"
 
-#define NUM_PACKETS    10
-#define DUMMY_MBUF_NUM 3
+#define PCAPNG_TEST_DEBUG 0
+
+#define TOTAL_PACKETS	4096
+#define MAX_BURST	64
+#define MAX_GAP_US	100000
+#define DUMMY_MBUF_NUM	3
 
-static rte_pcapng_t *pcapng;
 static struct rte_mempool *mp;
 static const uint32_t pkt_len = 200;
 static uint16_t port_id;
-static char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+static const char null_dev[] = "net_null0";
 
 /* first mbuf in the packet, should always be at offset 0 */
 struct dummy_mbuf {
@@ -61,6 +70,7 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 	struct {
 		struct rte_ether_hdr eth;
 		struct rte_ipv4_hdr ip;
+		struct rte_udp_hdr udp;
 	} pkt = {
 		.eth = {
 			.dst_addr.addr_bytes = "\xff\xff\xff\xff\xff\xff",
@@ -68,149 +78,201 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 		},
 		.ip = {
 			.version_ihl = RTE_IPV4_VHL_DEF,
-			.total_length = rte_cpu_to_be_16(plen),
-			.time_to_live = IPDEFTTL,
-			.next_proto_id = IPPROTO_RAW,
+			.time_to_live = 1,
+			.next_proto_id = IPPROTO_UDP,
 			.src_addr = rte_cpu_to_be_32(RTE_IPV4_LOOPBACK),
 			.dst_addr = rte_cpu_to_be_32(RTE_IPV4_BROADCAST),
-		}
+		},
+		.udp = {
+			.dst_port = rte_cpu_to_be_16(9), /* Discard port */
+		},
 	};
 
 	memset(dm, 0, sizeof(*dm));
 	dummy_mbuf_prep(&dm->mb[0], dm->buf[0], sizeof(dm->buf[0]), plen);
 
 	rte_eth_random_addr(pkt.eth.src_addr.addr_bytes);
-	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, RTE_MIN(sizeof(pkt), plen));
+	plen -= sizeof(struct rte_ether_hdr);
+
+	pkt.ip.total_length = rte_cpu_to_be_16(plen);
+	pkt.ip.hdr_checksum = rte_ipv4_cksum(&pkt.ip);
+
+	plen -= sizeof(struct rte_ipv4_hdr);
+	pkt.udp.src_port = rte_rand();
+	pkt.udp.dgram_len = rte_cpu_to_be_16(plen);
+
+	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, sizeof(pkt));
 }
 
 static int
 test_setup(void)
 {
-	int tmp_fd;
-
-	port_id = rte_eth_find_next(0);
-	if (port_id >= RTE_MAX_ETHPORTS) {
-		fprintf(stderr, "No valid Ether port\n");
-		return -1;
-	}
+	port_id = rte_eth_dev_count_avail();
 
-	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
-	if (tmp_fd == -1) {
-		perror("mkstemps() failure");
-		return -1;
-	}
-	printf("pcapng: output file %s\n", file_name);
-
-	/* open a test capture file */
-	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
-	if (pcapng == NULL) {
-		fprintf(stderr, "rte_pcapng_fdopen failed\n");
-		close(tmp_fd);
-		return -1;
-	}
-
-	/* Add interface to the file */
-	if (rte_pcapng_add_interface(pcapng, port_id,
-				     NULL, NULL, NULL) != 0) {
-		fprintf(stderr, "can not add port %u\n", port_id);
-		return -1;
+	/* Make a dummy null device to snoop on */
+	if (rte_vdev_init(null_dev, NULL) != 0) {
+		fprintf(stderr, "Failed to create vdev '%s'\n", null_dev);
+		goto fail;
 	}
 
 	/* Make a pool for cloned packets */
-	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool", IOV_MAX + NUM_PACKETS,
-					    0, 0,
-					    rte_pcapng_mbuf_size(pkt_len),
+	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool",
+					    MAX_BURST, 0, 0,
+					    rte_pcapng_mbuf_size(pkt_len) + 128,
 					    SOCKET_ID_ANY, "ring_mp_sc");
 	if (mp == NULL) {
 		fprintf(stderr, "Cannot create mempool\n");
-		return -1;
+		goto fail;
 	}
+
 	return 0;
+
+fail:
+	rte_vdev_uninit(null_dev);
+	rte_mempool_free(mp);
+	return -1;
 }
 
 static int
-test_write_packets(void)
+fill_pcapng_file(rte_pcapng_t *pcapng, unsigned int num_packets)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[NUM_PACKETS] = { };
 	struct dummy_mbuf mbfs;
-	unsigned int i;
+	struct rte_mbuf *orig;
+	unsigned int burst_size;
+	unsigned int count;
 	ssize_t len;
 
 	/* make a dummy packet */
 	mbuf1_prepare(&mbfs, pkt_len);
-
-	/* clone them */
 	orig  = &mbfs.mb[0];
-	for (i = 0; i < NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
+	for (count = 0; count < num_packets; count += burst_size) {
+		struct rte_mbuf *clones[MAX_BURST];
+		unsigned int i;
+
+		/* put 1 .. MAX_BURST packets in one write call */
+		burst_size = rte_rand_max(MAX_BURST) + 1;
+		for (i = 0; i < burst_size; i++) {
+			struct rte_mbuf *mc;
+
+			mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
+					     RTE_PCAPNG_DIRECTION_IN,
+					     NULL);
+			if (mc == NULL) {
+				fprintf(stderr, "Cannot copy packet\n");
+				return -1;
+			}
+			clones[i] = mc;
+		}
+
+		/* write it to capture file */
+		len = rte_pcapng_write_packets(pcapng, clones, burst_size);
+		rte_pktmbuf_free_bulk(clones, burst_size);
+
+		if (len <= 0) {
+			fprintf(stderr, "Write of packets failed: %s\n",
+				rte_strerror(rte_errno));
 			return -1;
 		}
-		clones[i] = mc;
+
+		/* Leave a small gap between packets to test for time wrap */
+		usleep(rte_rand_max(MAX_GAP_US));
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, NUM_PACKETS);
+	return count;
+}
 
-	rte_pktmbuf_free_bulk(clones, NUM_PACKETS);
+static char *
+fmt_time(char *buf, size_t size, uint64_t ts_ns)
+{
+	time_t sec;
+	size_t len;
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
-	}
+	sec = ts_ns / NS_PER_S;
+	len = strftime(buf, size, "%X", localtime(&sec));
+	snprintf(buf + len, size - len, ".%09lu",
+		 (unsigned long)(ts_ns % NS_PER_S));
 
-	return 0;
+	return buf;
 }
 
-static int
-test_write_stats(void)
+/* Context for the pcap_loop callback */
+struct pkt_print_ctx {
+	pcap_t *pcap;
+	unsigned int count;
+	uint64_t start_ns;
+	uint64_t end_ns;
+};
+
+static void
+print_packet(uint64_t ts_ns, const struct rte_ether_hdr *eh, size_t len)
 {
-	ssize_t len;
+	char tbuf[128], src[64], dst[64];
 
-	/* write a statistics block */
-	len = rte_pcapng_write_stats(pcapng, port_id, NULL,
-				     0, 0, 0,
-				     NUM_PACKETS, 0);
-	if (len <= 0) {
-		fprintf(stderr, "Write of statistics failed\n");
-		return -1;
-	}
-	return 0;
+	fmt_time(tbuf, sizeof(tbuf), ts_ns);
+	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
+	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
+	printf("%s: %s -> %s type %x length %zu\n",
+	       tbuf, src, dst, rte_be_to_cpu_16(eh->ether_type), len);
 }
 
+/* Callback from pcap_loop used to validate packets in the file */
 static void
-pkt_print(u_char *user, const struct pcap_pkthdr *h,
-	  const u_char *bytes)
+parse_pcap_packet(u_char *user, const struct pcap_pkthdr *h,
+		  const u_char *bytes)
 {
-	unsigned int *countp = (unsigned int *)user;
+	struct pkt_print_ctx *ctx = (struct pkt_print_ctx *)user;
 	const struct rte_ether_hdr *eh;
-	struct tm *tm;
-	char tbuf[128], src[64], dst[64];
+	const struct rte_ipv4_hdr *ip;
+	uint64_t ns;
 
-	tm = localtime(&h->ts.tv_sec);
-	if (tm == NULL) {
-		perror("localtime");
-		return;
+	eh = (const struct rte_ether_hdr *)bytes;
+	ip = (const struct rte_ipv4_hdr *)(eh + 1);
+
+	ctx->count += 1;
+
+	/* The pcap library is misleading in reporting timestamp.
+	 * packet header struct gives timestamp as a timeval (ie. usec);
+	 * but the file is open in nanosecond mode therefore
+	 * the timestamp is really in timespec (ie. nanoseconds).
+	 */
+	ns = h->ts.tv_sec * NS_PER_S + h->ts.tv_usec;
+	if (ns < ctx->start_ns || ns > ctx->end_ns) {
+		char tstart[128], tend[128];
+
+		fmt_time(tstart, sizeof(tstart), ctx->start_ns);
+		fmt_time(tend, sizeof(tend), ctx->end_ns);
+		fprintf(stderr, "Timestamp out of range [%s .. %s]\n",
+			tstart, tend);
+		goto error;
 	}
 
-	if (strftime(tbuf, sizeof(tbuf), "%X", tm) == 0) {
-		fprintf(stderr, "strftime returned 0!\n");
-		return;
+	if (!rte_is_broadcast_ether_addr(&eh->dst_addr)) {
+		fprintf(stderr, "Destination is not broadcast\n");
+		goto error;
 	}
 
-	eh = (const struct rte_ether_hdr *)bytes;
-	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
-	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
-	printf("%s.%06lu: %s -> %s type %x length %u\n",
-	       tbuf, (unsigned long)h->ts.tv_usec,
-	       src, dst, rte_be_to_cpu_16(eh->ether_type), h->len);
+	if (rte_ipv4_cksum(ip) != 0) {
+		fprintf(stderr, "Bad IPv4 checksum\n");
+		goto error;
+	}
+
+	return;		/* packet is normal */
+
+error:
+	print_packet(ns, eh, h->len);
+
+	/* Stop parsing at first error */
+	pcap_breakloop(ctx->pcap);
+}
 
-	*countp += 1;
+static uint64_t
+current_timestamp(void)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_REALTIME, &ts);
+	return rte_timespec_to_ns(&ts);
 }
 
 /*
@@ -219,78 +281,162 @@ pkt_print(u_char *user, const struct pcap_pkthdr *h,
  * but that creates an unwanted dependency.
  */
 static int
-test_validate(void)
+valid_pcapng_file(const char *file_name, uint64_t started, unsigned int expected)
 {
 	char errbuf[PCAP_ERRBUF_SIZE];
-	unsigned int count = 0;
-	pcap_t *pcap;
+	struct pkt_print_ctx ctx = { };
 	int ret;
 
-	pcap = pcap_open_offline(file_name, errbuf);
-	if (pcap == NULL) {
+	ctx.start_ns = started;
+	ctx.end_ns = current_timestamp();
+
+	ctx.pcap = pcap_open_offline_with_tstamp_precision(file_name,
+							   PCAP_TSTAMP_PRECISION_NANO,
+							   errbuf);
+	if (ctx.pcap == NULL) {
 		fprintf(stderr, "pcap_open_offline('%s') failed: %s\n",
 			file_name, errbuf);
 		return -1;
 	}
 
-	ret = pcap_loop(pcap, 0, pkt_print, (u_char *)&count);
-	if (ret == 0)
-		printf("Saw %u packets\n", count);
-	else
+	ret = pcap_loop(ctx.pcap, 0, parse_pcap_packet, (u_char *)&ctx);
+	if (ret != 0) {
 		fprintf(stderr, "pcap_dispatch: failed: %s\n",
-			pcap_geterr(pcap));
-	pcap_close(pcap);
+			pcap_geterr(ctx.pcap));
+	} else if (ctx.count != expected) {
+		printf("Only %u packets, expected %u\n",
+		       ctx.count, expected);
+		ret = -1;
+	}
+
+	pcap_close(ctx.pcap);
 
 	return ret;
 }
 
 static int
-test_write_over_limit_iov_max(void)
+test_add_interface(void)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[IOV_MAX + NUM_PACKETS] = { };
-	struct dummy_mbuf mbfs;
-	unsigned int i;
-	ssize_t len;
+	char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	static rte_pcapng_t *pcapng;
+	int ret, tmp_fd;
+	uint64_t now = current_timestamp();
 
-	/* make a dummy packet */
-	mbuf1_prepare(&mbfs, pkt_len);
+	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		goto fail;
+	}
+	printf("pcapng: output file %s\n", file_name);
 
-	/* clone them */
-	orig  = &mbfs.mb[0];
-	for (i = 0; i < IOV_MAX + NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_addif", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		goto fail;
+	}
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
-			return -1;
-		}
-		clones[i] = mc;
+	/* Add interface to the file */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u\n", port_id);
+		goto fail;
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, IOV_MAX + NUM_PACKETS);
+	/* Add interface with ifname and ifdescr */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       "myeth", "Some long description", NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u with ifname\n", port_id);
+		goto fail;
+	}
+
+	/* Add interface with filter */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, "tcp port 8080");
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u with filter\n", port_id);
+		goto fail;
+	}
 
-	rte_pktmbuf_free_bulk(clones, IOV_MAX + NUM_PACKETS);
+	rte_pcapng_close(pcapng);
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
+	ret = valid_pcapng_file(file_name, now, 0);
+	/* if test fails want to investigate the file */
+	if (ret == 0)
+		unlink(file_name);
+
+	return ret;
+
+fail:
+	rte_pcapng_close(pcapng);
+	return -1;
+}
+
+static int
+test_write_packets(void)
+{
+	char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	static rte_pcapng_t *pcapng;
+	int ret, tmp_fd, count;
+	uint64_t now = current_timestamp();
+
+	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		goto fail;
 	}
+	printf("pcapng: output file %s\n", file_name);
 
-	return 0;
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		goto fail;
+	}
+
+	/* Add interface to the file */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u\n", port_id);
+		goto fail;
+	}
+
+	count = fill_pcapng_file(pcapng, TOTAL_PACKETS);
+	if (count < 0)
+		goto fail;
+
+	/* write a statistics block */
+	ret = rte_pcapng_write_stats(pcapng, port_id,
+				     count, 0, "end of test");
+	if (ret <= 0) {
+		fprintf(stderr, "Write of statistics failed\n");
+		goto fail;
+	}
+
+	rte_pcapng_close(pcapng);
+
+	ret = valid_pcapng_file(file_name, now, count);
+	/* if test fails want to investigate the file */
+	if (ret == 0)
+		unlink(file_name);
+
+	return ret;
+
+fail:
+	rte_pcapng_close(pcapng);
+	return -1;
 }
 
 static void
 test_cleanup(void)
 {
 	rte_mempool_free(mp);
-
-	if (pcapng)
-		rte_pcapng_close(pcapng);
-
+	rte_vdev_uninit(null_dev);
 }
 
 static struct
@@ -299,10 +445,8 @@ unit_test_suite test_pcapng_suite  = {
 	.teardown = test_cleanup,
 	.suite_name = "Test Pcapng Unit Test Suite",
 	.unit_test_cases = {
+		TEST_CASE(test_add_interface),
 		TEST_CASE(test_write_packets),
-		TEST_CASE(test_write_stats),
-		TEST_CASE(test_validate),
-		TEST_CASE(test_write_over_limit_iov_max),
 		TEST_CASES_END()
 	}
 };
@@ -313,4 +457,4 @@ test_pcapng(void)
 	return unit_test_suite_runner(&test_pcapng_suite);
 }
 
-REGISTER_TEST_COMMAND(pcapng_autotest, test_pcapng);
+REGISTER_FAST_TEST(pcapng_autotest, true, true, test_pcapng);
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH v5 2/5] dumpcap: allow multiple invocations
  2023-11-09 19:45   ` [PATCH v5 2/5] dumpcap: allow multiple invocations Stephen Hemminger
@ 2023-11-09 20:09     ` Morten Brørup
  0 siblings, 0 replies; 61+ messages in thread
From: Morten Brørup @ 2023-11-09 20:09 UTC (permalink / raw)
  To: Stephen Hemminger, dev; +Cc: Isaac Boukris, Reshma Pattan

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, 9 November 2023 20.46
> 
> If dumpcap is run twice with each instance pointing at a different
> interface, it would fail because of overlap in ring and pool names.
> Fix by putting process id in the name.
> 
> It is still not allowed to do multiple invocations on the same
> interface because only one callback is allowed and only one copy
> of mbuf is done. Dumpcap will fail with error in this case:
> 
>    pdump_prepare_client_request(): client request for pdump
> enable/disable failed
>    EAL: Error - exiting with code: 1
>      Cause: Packet dump enable on 0:net_null0 failed File exists
> 
> Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
> Reported-by: Isaac Boukris <iboukris@gmail.com>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---

Reviewed-by: Morten Brørup <mb@smartsharesystems.com>


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 3/5] pcapng: modify timestamp calculation
  2023-11-09 19:45   ` [PATCH v5 3/5] pcapng: modify timestamp calculation Stephen Hemminger
@ 2023-11-12 14:22     ` Thomas Monjalon
  0 siblings, 0 replies; 61+ messages in thread
From: Thomas Monjalon @ 2023-11-12 14:22 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, Morten Brørup, Reshma Pattan, Jerin Jacob,
	Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan, Quentin Armitage,
	Stephen Hemminger

09/11/2023 20:45, Stephen Hemminger:
> The computation of timestamp is best done in the part of
> pcapng library that is in secondary process.
> The secondary process is already doing a bunch of system
> calls which makes it not performance sensitive.
> This does change the rte_pcapng_copy()
> and rte_pcapng_write_stats() experimental API's.
> 
> Simplify the computation of nanoseconds from TSC to a two
> step process which avoids numeric overflow issues. The previous
> code was not thread safe as well.
> 
> Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>

It does not compile:

app/test/test_pcapng.c:148:22: error:
too many arguments to function 'rte_pcapng_copy'

Please make sure it compiles after each patch.
Thank you




^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v6 0/5] dumpcap and pcapng fixes
  2023-09-21  4:23 [PATCH 0/4] pcapng fixes Stephen Hemminger
                   ` (7 preceding siblings ...)
  2023-11-09 19:45 ` [PATCH v5 0/5] dumpcap and pcapng fixes Stephen Hemminger
@ 2023-11-13 16:15 ` Stephen Hemminger
  2023-11-13 16:15   ` [PATCH v6 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
                     ` (4 more replies)
  2023-11-17 16:35 ` [PATCH v7 0/5] dumpcap and pcapng fixes Stephen Hemminger
  9 siblings, 5 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-13 16:15 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

This series fixes issues related to timestamping. The design goal is
to maximize performance in the primary process and to do
all the time adjustment in the secondary (dumpcap), since
dumpcap needs to make system calls anyway to write the result.

This patch set moves the calculation of the adjustment
into the pcapng code that opens the output file.
All details of the timestamp format are contained inside
pcapng (data hiding).
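
To make the idea concrete, here is a minimal sketch of converting a raw
TSC cycle count into nanoseconds since the epoch on the writer side. It
assumes a base pair (TSC value and wall-clock nanoseconds) recorded
when the file is opened, plus a known TSC frequency. The struct
clock_base and the function cycles_to_ns() are hypothetical names for
this example, and the arithmetic is a plain divide and remainder split
rather than the exact method used in the series.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical per-file clock state captured when the file is opened */
struct clock_base {
	uint64_t tsc_base;	/* TSC reading at open time */
	uint64_t offset_ns;	/* wall clock at open time, ns since 1970 */
	uint64_t tsc_hz;	/* TSC frequency in cycles per second */
};

/*
 * Convert a TSC reading to nanoseconds since the epoch.
 * Splitting the delta into whole seconds and a remainder keeps the
 * intermediate product rem * NSEC_PER_SEC within 64 bits, as long as
 * tsc_hz stays below roughly 18 GHz.
 */
static uint64_t
cycles_to_ns(const struct clock_base *cb, uint64_t cycles)
{
	uint64_t delta = cycles - cb->tsc_base;
	uint64_t secs = delta / cb->tsc_hz;
	uint64_t rem = delta % cb->tsc_hz;

	return cb->offset_ns + secs * NSEC_PER_SEC
		+ rem * NSEC_PER_SEC / cb->tsc_hz;
}
```

With this split, the capture path only has to store the raw cycle
count, and the two divisions happen in the process that writes the file.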

v6 - make sure all steps compile
v5 - fix format of getpid in capture name
v4 - incorporate review feedback
v3 - don't use alloca() since can have VLA type issues

Stephen Hemminger (5):
  pdump: fix setting rte_errno on mp error
  dumpcap: allow multiple invocations
  pcapng: modify timestamp calculation
  pcapng: avoid using alloca()
  test: cleanups to pcapng test

 app/dumpcap/main.c      |  53 ++---
 app/test/meson.build    |   2 +-
 app/test/test_pcapng.c  | 417 +++++++++++++++++++++++++++-------------
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 156 ++++++---------
 lib/pcapng/rte_pcapng.h |  19 +-
 lib/pdump/rte_pdump.c   |   9 +-
 7 files changed, 373 insertions(+), 285 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v6 1/5] pdump: fix setting rte_errno on mp error
  2023-11-13 16:15 ` [PATCH v6 0/5] dumpcap and pcapng fixes Stephen Hemminger
@ 2023-11-13 16:15   ` Stephen Hemminger
  2023-11-13 16:15   ` [PATCH v6 2/5] dumpcap: allow multiple invocations Stephen Hemminger
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-13 16:15 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan, Jianfeng Tan

The response from MP server sets err_value to negative
on error. The convention for rte_errno is to use a positive
value on error. This makes errors like duplicate registration
show up with the correct error value.
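
The sign convention can be illustrated with a small standalone sketch.
Here last_error stands in for rte_errno and handle_response() is a
hypothetical helper, so this shows only the convention and is not the
pdump code itself.

```c
#include <errno.h>

/* Stand-in for rte_errno in this standalone example */
static int last_error;

/*
 * Hypothetical helper: the peer reports 0 on success or a negative
 * errno value on failure. Store the positive errno for the caller
 * and return -1, leaving last_error untouched on success.
 */
static int
handle_response(int err_value)
{
	if (err_value == 0)
		return 0;

	last_error = -err_value;
	return -1;
}
```

With the value stored as a positive number, passing it to strerror()
yields the expected message, for example "File exists" for EEXIST on
Linux.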

Fixes: 660098d61f57 ("pdump: use generic multi-process channel")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/pdump/rte_pdump.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index 80b90c6f7d03..e94f49e21250 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -564,9 +564,10 @@ pdump_prepare_client_request(const char *device, uint16_t queue,
 	if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) == 0) {
 		mp_rep = &mp_reply.msgs[0];
 		resp = (struct pdump_response *)mp_rep->param;
-		rte_errno = resp->err_value;
-		if (!resp->err_value)
+		if (resp->err_value == 0)
 			ret = 0;
+		else
+			rte_errno = -resp->err_value;
 		free(mp_reply.msgs);
 	}
 
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v6 2/5] dumpcap: allow multiple invocations
  2023-11-13 16:15 ` [PATCH v6 0/5] dumpcap and pcapng fixes Stephen Hemminger
  2023-11-13 16:15   ` [PATCH v6 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
@ 2023-11-13 16:15   ` Stephen Hemminger
  2023-11-13 16:15   ` [PATCH v6 3/5] pcapng: modify timestamp calculation Stephen Hemminger
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-13 16:15 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Isaac Boukris, Reshma Pattan

If dumpcap is run twice with each instance pointing at a different
interface, it would fail because of overlap in ring and pool names.
Fix by putting process id in the name.
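
A minimal sketch of the naming scheme follows, with a truncation check
added for illustration. The helper make_ring_name() is hypothetical.
The "dumpcap-%d" format matches the one used in the diff for this
patch, and the pid is passed as a parameter only to keep the example
easy to test.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: format a per-process object name.
 * Returns 0 on success, or -1 if snprintf() failed or the name
 * would have been truncated.
 */
static int
make_ring_name(char *buf, size_t size, pid_t pid)
{
	int n = snprintf(buf, size, "dumpcap-%d", (int)pid);

	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```

A caller would pass getpid() as the last argument. Two concurrent
instances then get distinct names because their process ids differ.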

It is still not allowed to do multiple invocations on the same
interface because only one callback is allowed and only one copy
of mbuf is done. Dumpcap will fail with error in this case:

   pdump_prepare_client_request(): client request for pdump enable/disable failed
   EAL: Error - exiting with code: 1
     Cause: Packet dump enable on 0:net_null0 failed File exists

Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
Reported-by: Isaac Boukris <iboukris@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/dumpcap/main.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 4f581bd341d8..d05dddac0071 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -44,7 +44,6 @@
 #include <pcap/pcap.h>
 #include <pcap/bpf.h>
 
-#define RING_NAME "capture-ring"
 #define MONITOR_INTERVAL  (500 * 1000)
 #define MBUF_POOL_CACHE_SIZE 32
 #define BURST_SIZE 32
@@ -647,6 +646,7 @@ static void dpdk_init(void)
 static struct rte_ring *create_ring(void)
 {
 	struct rte_ring *ring;
+	char ring_name[RTE_RING_NAMESIZE];
 	size_t size, log2;
 
 	/* Find next power of 2 >= size. */
@@ -660,28 +660,28 @@ static struct rte_ring *create_ring(void)
 		ring_size = size;
 	}
 
-	ring = rte_ring_lookup(RING_NAME);
-	if (ring == NULL) {
-		ring = rte_ring_create(RING_NAME, ring_size,
-					rte_socket_id(), 0);
-		if (ring == NULL)
-			rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
-				 rte_strerror(rte_errno));
-	}
+	/* Want one ring per invocation of program */
+	snprintf(ring_name, sizeof(ring_name),
+		 "dumpcap-%d", getpid());
+
+	ring = rte_ring_create(ring_name, ring_size,
+			       rte_socket_id(), 0);
+	if (ring == NULL)
+		rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
+			 rte_strerror(rte_errno));
+
 	return ring;
 }
 
 static struct rte_mempool *create_mempool(void)
 {
 	const struct interface *intf;
-	static const char pool_name[] = "capture_mbufs";
+	char pool_name[RTE_MEMPOOL_NAMESIZE];
 	size_t num_mbufs = 2 * ring_size;
 	struct rte_mempool *mp;
 	uint32_t data_size = 128;
 
-	mp = rte_mempool_lookup(pool_name);
-	if (mp)
-		return mp;
+	snprintf(pool_name, sizeof(pool_name), "capture_%d", getpid());
 
 	/* Common pool so size mbuf for biggest snap length */
 	TAILQ_FOREACH(intf, &interfaces, next) {
@@ -826,7 +826,7 @@ static void enable_pdump(struct rte_ring *r, struct rte_mempool *mp)
 			rte_exit(EXIT_FAILURE,
 				"Packet dump enable on %u:%s failed %s\n",
 				intf->port, intf->name,
-				rte_strerror(-ret));
+				rte_strerror(rte_errno));
 		}
 
 		if (intf->opts.promisc_mode) {
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v6 3/5] pcapng: modify timestamp calculation
  2023-11-13 16:15 ` [PATCH v6 0/5] dumpcap and pcapng fixes Stephen Hemminger
  2023-11-13 16:15   ` [PATCH v6 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
  2023-11-13 16:15   ` [PATCH v6 2/5] dumpcap: allow multiple invocations Stephen Hemminger
@ 2023-11-13 16:15   ` Stephen Hemminger
  2023-11-13 16:15   ` [PATCH v6 4/5] pcapng: avoid using alloca() Stephen Hemminger
  2023-11-13 16:15   ` [PATCH v6 5/5] test: cleanups to pcapng test Stephen Hemminger
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-13 16:15 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan,
	Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan,
	Quentin Armitage

The computation of the timestamp is best done in the part of the
pcapng library that runs in the secondary process.
The secondary process is already doing a bunch of system
calls, so it is not performance sensitive.
This does change the experimental APIs rte_pcapng_copy()
and rte_pcapng_write_stats().

Simplify the computation of nanoseconds from TSC to a two
step process which avoids numeric overflow issues. The previous
code was also not thread safe.
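
The two step conversion can be sketched in isolation as below. This is a
standalone illustration, not the library code: cycles_to_ns() and
NS_PER_SEC are names chosen for the example, and the caller is assumed
to pass the cycle delta since the base TSC value and the TSC frequency
in Hz.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Convert a cycle count to nanoseconds without overflowing 64 bits.
 * Multiplying delta by NS_PER_SEC directly would wrap after a few
 * seconds, so split delta into whole seconds and a remainder first.
 * The remainder is always < hz, so rem * NS_PER_SEC fits in 64 bits
 * as long as hz is below roughly 18.4 GHz.
 */
static uint64_t
cycles_to_ns(uint64_t delta, uint64_t hz)
{
	uint64_t secs, rem;

	assert(hz != 0);
	secs = delta / hz;	/* step 1: whole seconds */
	rem = delta % hz;	/* step 2: leftover cycles */

	return secs * NS_PER_SEC + (rem * NS_PER_SEC) / hz;
}
```

Because the function keeps no state between calls, it has none of the
shared static variables that made the earlier approach unsafe across
threads.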

Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 app/dumpcap/main.c      |  25 +++------
 app/test/test_pcapng.c  |   7 +--
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 119 +++++++++++++++-------------------------
 lib/pcapng/rte_pcapng.h |  19 ++-----
 lib/pdump/rte_pdump.c   |   4 +-
 6 files changed, 62 insertions(+), 114 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index d05dddac0071..fc28e2d7027a 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -66,13 +66,13 @@ static bool print_stats;
 
 /* capture limit options */
 static struct {
-	uint64_t  duration;	/* nanoseconds */
+	time_t  duration;	/* seconds */
 	unsigned long packets;  /* number of packets in file */
 	size_t size;		/* file size (bytes) */
 } stop;
 
 /* Running state */
-static uint64_t start_time, end_time;
+static time_t start_time;
 static uint64_t packets_received;
 static size_t file_size;
 
@@ -197,7 +197,7 @@ static void auto_stop(char *opt)
 		if (*value == '\0' || *endp != '\0' || interval <= 0)
 			rte_exit(EXIT_FAILURE,
 				 "Invalid duration \"%s\"\n", value);
-		stop.duration = NSEC_PER_SEC * interval;
+		stop.duration = interval;
 	} else if (strcmp(opt, "filesize") == 0) {
 		stop.size = get_uint(value, "filesize", 0) * 1024;
 	} else if (strcmp(opt, "packets") == 0) {
@@ -511,15 +511,6 @@ static void statistics_loop(void)
 	}
 }
 
-/* Return the time since 1/1/1970 in nanoseconds */
-static uint64_t create_timestamp(void)
-{
-	struct timespec now;
-
-	clock_gettime(CLOCK_MONOTONIC, &now);
-	return rte_timespec_to_ns(&now);
-}
-
 static void
 cleanup_pdump_resources(void)
 {
@@ -589,9 +580,8 @@ report_packet_stats(dumpcap_out_t out)
 		ifdrop = pdump_stats.nombuf + pdump_stats.ringfull;
 
 		if (use_pcapng)
-			rte_pcapng_write_stats(out.pcapng, intf->port, NULL,
-					       start_time, end_time,
-					       ifrecv, ifdrop);
+			rte_pcapng_write_stats(out.pcapng, intf->port,
+					       ifrecv, ifdrop, NULL);
 
 		if (ifrecv == 0)
 			percent = 0;
@@ -983,7 +973,7 @@ int main(int argc, char **argv)
 	mp = create_mempool();
 	out = create_output();
 
-	start_time = create_timestamp();
+	start_time = time(NULL);
 	enable_pdump(r, mp);
 
 	if (!quiet) {
@@ -1005,11 +995,10 @@ int main(int argc, char **argv)
 			break;
 
 		if (stop.duration != 0 &&
-		    create_timestamp() - start_time > stop.duration)
+		    time(NULL) - start_time > stop.duration)
 			break;
 	}
 
-	end_time = create_timestamp();
 	disable_primary_monitor();
 
 	if (rte_eal_primary_proc_alive(NULL))
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index b8429a02f160..21131dfa0c5e 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -146,7 +146,7 @@ test_write_packets(void)
 		struct rte_mbuf *mc;
 
 		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
+				     RTE_PCAPNG_DIRECTION_UNKNOWN, NULL);
 		if (mc == NULL) {
 			fprintf(stderr, "Cannot copy packet\n");
 			return -1;
@@ -174,8 +174,7 @@ test_write_stats(void)
 
 	/* write a statistics block */
 	len = rte_pcapng_write_stats(pcapng, port_id,
-				     NULL, 0, 0,
-				     NUM_PACKETS, 0);
+				     UINT64_MAX, UINT64_MAX, NULL);
 	if (len <= 0) {
 		fprintf(stderr, "Write of statistics failed\n");
 		return -1;
@@ -262,7 +261,7 @@ test_write_over_limit_iov_max(void)
 		struct rte_mbuf *mc;
 
 		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
+				     RTE_PCAPNG_DIRECTION_UNKNOWN, NULL);
 		if (mc == NULL) {
 			fprintf(stderr, "Cannot copy packet\n");
 			return -1;
diff --git a/lib/graph/graph_pcap.c b/lib/graph/graph_pcap.c
index db722c375fa7..89525f1220ca 100644
--- a/lib/graph/graph_pcap.c
+++ b/lib/graph/graph_pcap.c
@@ -214,7 +214,7 @@ graph_pcap_dispatch(struct rte_graph *graph,
 		mbuf = (struct rte_mbuf *)objs[i];
 
 		mc = rte_pcapng_copy(mbuf->port, 0, mbuf, pkt_mp, mbuf->pkt_len,
-				     rte_get_tsc_cycles(), 0, buffer);
+				     0, buffer);
 		if (mc == NULL)
 			break;
 
diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 3c91fc77644a..13fd2b97fb80 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -36,22 +36,14 @@
 /* Format of the capture file handle */
 struct rte_pcapng {
 	int  outfd;		/* output file */
-
 	unsigned int ports;	/* number of interfaces added */
+	uint64_t offset_ns;	/* ns since 1/1/1970 when initialized */
+	uint64_t tsc_base;	/* TSC when started */
 
 	/* DPDK port id to interface index in file */
 	uint32_t port_index[RTE_MAX_ETHPORTS];
 };
 
-/* For converting TSC cycles to PCAPNG ns format */
-static struct pcapng_time {
-	uint64_t ns;
-	uint64_t cycles;
-	uint64_t tsc_hz;
-	struct rte_reciprocal_u64 tsc_hz_inverse;
-} pcapng_time;
-
-
 #ifdef RTE_EXEC_ENV_WINDOWS
 /*
  * Windows does not have writev() call.
@@ -102,56 +94,21 @@ static ssize_t writev(int fd, const struct iovec *iov, int iovcnt)
 #define if_indextoname(ifindex, ifname) NULL
 #endif
 
-static inline void
-pcapng_init(void)
+/* Convert from TSC (CPU cycles) to nanoseconds */
+static uint64_t
+pcapng_timestamp(const rte_pcapng_t *self, uint64_t cycles)
 {
-	struct timespec ts;
+	uint64_t delta, rem, secs, ns;
+	const uint64_t hz = rte_get_tsc_hz();
 
-	pcapng_time.cycles = rte_get_tsc_cycles();
-	clock_gettime(CLOCK_REALTIME, &ts);
-	pcapng_time.cycles = (pcapng_time.cycles + rte_get_tsc_cycles()) / 2;
-	pcapng_time.ns = rte_timespec_to_ns(&ts);
-
-	pcapng_time.tsc_hz = rte_get_tsc_hz();
-	pcapng_time.tsc_hz_inverse = rte_reciprocal_value_u64(pcapng_time.tsc_hz);
-}
+	delta = cycles - self->tsc_base;
 
-/* PCAPNG timestamps are in nanoseconds */
-static uint64_t pcapng_tsc_to_ns(uint64_t cycles)
-{
-	uint64_t delta, secs;
-
-	if (!pcapng_time.tsc_hz)
-		pcapng_init();
-
-	/* In essence the calculation is:
-	 *   delta = (cycles - pcapng_time.cycles) * NSEC_PRE_SEC / rte_get_tsc_hz()
-	 * but this overflows within 4 to 8 seconds depending on TSC frequency.
-	 * Instead, if delta >= pcapng_time.tsc_hz:
-	 *   Increase pcapng_time.ns and pcapng_time.cycles by the number of
-	 *   whole seconds in delta and reduce delta accordingly.
-	 * delta will therefore always lie in the interval [0, pcapng_time.tsc_hz),
-	 * which will not overflow when multiplied by NSEC_PER_SEC provided the
-	 * TSC frequency < approx 18.4GHz.
-	 *
-	 * Currently all TSCs operate below 5GHz.
-	 */
-	delta = cycles - pcapng_time.cycles;
-	if (unlikely(delta >= pcapng_time.tsc_hz)) {
-		if (likely(delta < pcapng_time.tsc_hz * 2)) {
-			delta -= pcapng_time.tsc_hz;
-			pcapng_time.cycles += pcapng_time.tsc_hz;
-			pcapng_time.ns += NSEC_PER_SEC;
-		} else {
-			secs = rte_reciprocal_divide_u64(delta, &pcapng_time.tsc_hz_inverse);
-			delta -= secs * pcapng_time.tsc_hz;
-			pcapng_time.cycles += secs * pcapng_time.tsc_hz;
-			pcapng_time.ns += secs * NSEC_PER_SEC;
-		}
-	}
+	/* Avoid numeric wraparound by computing seconds first */
+	secs = delta / hz;
+	rem = delta % hz;
+	ns = (rem * NS_PER_S) / hz;
 
-	return pcapng_time.ns + rte_reciprocal_divide_u64(delta * NSEC_PER_SEC,
-							  &pcapng_time.tsc_hz_inverse);
+	return secs * NS_PER_S + ns + self->offset_ns;
 }
 
 /* length of option including padding */
@@ -368,15 +325,15 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
  */
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop)
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment)
 {
 	struct pcapng_statistics *hdr;
 	struct pcapng_option *opt;
+	uint64_t start_time = self->offset_ns;
+	uint64_t sample_time;
 	uint32_t optlen, len;
 	uint8_t *buf;
-	uint64_t ns;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -386,10 +343,10 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 		optlen += pcapng_optlen(sizeof(ifrecv));
 	if (ifdrop != UINT64_MAX)
 		optlen += pcapng_optlen(sizeof(ifdrop));
+
 	if (start_time != 0)
 		optlen += pcapng_optlen(sizeof(start_time));
-	if (end_time != 0)
-		optlen += pcapng_optlen(sizeof(end_time));
+
 	if (comment)
 		optlen += pcapng_optlen(strlen(comment));
 	if (optlen != 0)
@@ -409,9 +366,6 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	if (start_time != 0)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_STARTTIME,
 					 &start_time, sizeof(start_time));
-	if (end_time != 0)
-		opt = pcapng_add_option(opt, PCAPNG_ISB_ENDTIME,
-					 &end_time, sizeof(end_time));
 	if (ifrecv != UINT64_MAX)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_IFRECV,
 				&ifrecv, sizeof(ifrecv));
@@ -425,9 +379,9 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	hdr->block_length = len;
 	hdr->interface_id = self->port_index[port_id];
 
-	ns = pcapng_tsc_to_ns(rte_get_tsc_cycles());
-	hdr->timestamp_hi = ns >> 32;
-	hdr->timestamp_lo = (uint32_t)ns;
+	sample_time = pcapng_timestamp(self, rte_get_tsc_cycles());
+	hdr->timestamp_hi = sample_time >> 32;
+	hdr->timestamp_lo = (uint32_t)sample_time;
 
 	/* clone block_length after option */
 	memcpy(opt, &len, sizeof(uint32_t));
@@ -520,23 +474,21 @@ struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *md,
 		struct rte_mempool *mp,
-		uint32_t length, uint64_t cycles,
+		uint32_t length,
 		enum rte_pcapng_direction direction,
 		const char *comment)
 {
 	struct pcapng_enhance_packet_block *epb;
 	uint32_t orig_len, data_len, padding, flags;
 	struct pcapng_option *opt;
+	uint64_t timestamp;
 	uint16_t optlen;
 	struct rte_mbuf *mc;
-	uint64_t ns;
 	bool rss_hash;
 
 #ifdef RTE_LIBRTE_ETHDEV_DEBUG
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
 #endif
-	ns = pcapng_tsc_to_ns(cycles);
-
 	orig_len = rte_pktmbuf_pkt_len(md);
 
 	/* Take snapshot of the data */
@@ -641,8 +593,10 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	/* Interface index is filled in later during write */
 	mc->port = port_id;
 
-	epb->timestamp_hi = ns >> 32;
-	epb->timestamp_lo = (uint32_t)ns;
+	/* Put timestamp in cycles here - adjust in packet write */
+	timestamp = rte_get_tsc_cycles();
+	epb->timestamp_hi = timestamp >> 32;
+	epb->timestamp_lo = (uint32_t)timestamp;
 	epb->capture_length = data_len;
 	epb->original_length = orig_len;
 
@@ -668,6 +622,7 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 	for (i = 0; i < nb_pkts; i++) {
 		struct rte_mbuf *m = pkts[i];
 		struct pcapng_enhance_packet_block *epb;
+		uint64_t cycles, timestamp;
 
 		/* sanity check that is really a pcapng mbuf */
 		epb = rte_pktmbuf_mtod(m, struct pcapng_enhance_packet_block *);
@@ -684,6 +639,13 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 			return -1;
 		}
 
+		/* adjust timestamp recorded in packet */
+		cycles = (uint64_t)epb->timestamp_hi << 32;
+		cycles += epb->timestamp_lo;
+		timestamp = pcapng_timestamp(self, cycles);
+		epb->timestamp_hi = timestamp >> 32;
+		epb->timestamp_lo = (uint32_t)timestamp;
+
 		/*
 		 * Handle case of highly fragmented and large burst size
 		 * Note: this assumes that max segments per mbuf < IOV_MAX
@@ -725,6 +687,8 @@ rte_pcapng_fdopen(int fd,
 {
 	unsigned int i;
 	rte_pcapng_t *self;
+	struct timespec ts;
+	uint64_t cycles;
 
 	self = malloc(sizeof(*self));
 	if (!self) {
@@ -734,6 +698,13 @@ rte_pcapng_fdopen(int fd,
 
 	self->outfd = fd;
 	self->ports = 0;
+
+	/* record start time in ns since 1/1/1970 */
+	cycles = rte_get_tsc_cycles();
+	clock_gettime(CLOCK_REALTIME, &ts);
+	self->tsc_base = (cycles + rte_get_tsc_cycles()) / 2;
+	self->offset_ns = rte_timespec_to_ns(&ts);
+
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++)
 		self->port_index[i] = UINT32_MAX;
 
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index d93cc9f73ad5..c40795c721de 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -121,8 +121,6 @@ enum rte_pcapng_direction {
  * @param length
  *   The upper limit on bytes to copy.  Passing UINT32_MAX
  *   means all data (after offset).
- * @param timestamp
- *   The timestamp in TSC cycles.
  * @param direction
  *   The direction of the packer: receive, transmit or unknown.
  * @param comment
@@ -136,7 +134,7 @@ __rte_experimental
 struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *m, struct rte_mempool *mp,
-		uint32_t length, uint64_t timestamp,
+		uint32_t length,
 		enum rte_pcapng_direction direction, const char *comment);
 
 
@@ -188,29 +186,22 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
  *  The handle to the packet capture file
  * @param port
  *  The Ethernet port to report stats on.
- * @param comment
- *   Optional comment to add to statistics.
- * @param start_time
- *  The time when packet capture was started in nanoseconds.
- *  Optional: can be zero if not known.
- * @param end_time
- *  The time when packet capture was stopped in nanoseconds.
- *  Optional: can be zero if not finished;
  * @param ifrecv
  *  The number of packets received by capture.
  *  Optional: use UINT64_MAX if not known.
  * @param ifdrop
  *  The number of packets missed by the capture process.
  *  Optional: use UINT64_MAX if not known.
+ * @param comment
+ *  Optional comment to add to statistics.
  * @return
  *  number of bytes written to file, -1 on failure to write file
  */
 __rte_experimental
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop);
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment);
 
 #ifdef __cplusplus
 }
diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index e94f49e21250..5a1ec14d7a18 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -90,7 +90,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	int ring_enq;
 	uint16_t d_pkts = 0;
 	struct rte_mbuf *dup_bufs[nb_pkts];
-	uint64_t ts;
 	struct rte_ring *ring;
 	struct rte_mempool *mp;
 	struct rte_mbuf *p;
@@ -99,7 +98,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	if (cbs->filter)
 		rte_bpf_exec_burst(cbs->filter, (void **)pkts, rcs, nb_pkts);
 
-	ts = rte_get_tsc_cycles();
 	ring = cbs->ring;
 	mp = cbs->mp;
 	for (i = 0; i < nb_pkts; i++) {
@@ -122,7 +120,7 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 		if (cbs->ver == V2)
 			p = rte_pcapng_copy(port_id, queue,
 					    pkts[i], mp, cbs->snaplen,
-					    ts, direction, NULL);
+					    direction, NULL);
 		else
 			p = rte_pktmbuf_copy(pkts[i], mp, 0, cbs->snaplen);
 
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v6 4/5] pcapng: avoid using alloca()
  2023-11-13 16:15 ` [PATCH v6 0/5] dumpcap and pcapng fixes Stephen Hemminger
                     ` (2 preceding siblings ...)
  2023-11-13 16:15   ` [PATCH v6 3/5] pcapng: modify timestamp calculation Stephen Hemminger
@ 2023-11-13 16:15   ` Stephen Hemminger
  2023-11-13 16:15   ` [PATCH v6 5/5] test: cleanups to pcapng test Stephen Hemminger
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-13 16:15 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan

The function alloca(), like VLAs, has problems if the caller
passes a large value. Instead use a fixed size buffer (2K),
which will be more than sufficient for the info related blocks
in the file. Add bounds checks as well.
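
The pattern is roughly the following. This is a simplified sketch, not
the pcapng block layout: emit_block() and BLKSIZ are hypothetical, and
the "block" here is just a 32-bit length followed by the comment bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLKSIZ 2048	/* hypothetical upper bound for one block */

/*
 * Format a small block into a fixed-size stack buffer and write it.
 * Return the number of bytes written, or -1 if the block would not fit
 * or the write fails. No alloca() or heap allocation is needed.
 */
static ssize_t
emit_block(int fd, const char *comment)
{
	uint8_t buf[BLKSIZ];
	size_t clen, len;
	uint32_t len32;

	assert(comment != NULL);
	clen = strlen(comment);
	len = sizeof(uint32_t) + clen;

	/* Bounds check replaces the unbounded stack allocation */
	if (len > sizeof(buf))
		return -1;

	len32 = (uint32_t)len;
	memcpy(buf, &len32, sizeof(len32));
	memcpy(buf + sizeof(len32), comment, clen);

	return write(fd, buf, len);
}
```

An oversized comment now produces a clean error return rather than an
arbitrarily large stack allocation.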

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/pcapng/rte_pcapng.c | 37 ++++++++++++++++---------------------
 1 file changed, 16 insertions(+), 21 deletions(-)

diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 13fd2b97fb80..f74ec939a9f8 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -33,6 +33,9 @@
 /* conversion from DPDK speed to PCAPNG */
 #define PCAPNG_MBPS_SPEED 1000000ull
 
+/* upper bound for section, stats and interface blocks */
+#define PCAPNG_BLKSIZ	2048
+
 /* Format of the capture file handle */
 struct rte_pcapng {
 	int  outfd;		/* output file */
@@ -140,9 +143,8 @@ pcapng_section_block(rte_pcapng_t *self,
 {
 	struct pcapng_section_header *hdr;
 	struct pcapng_option *opt;
-	void *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 	uint32_t len;
-	ssize_t cc;
 
 	len = sizeof(*hdr);
 	if (hw)
@@ -158,8 +160,7 @@ pcapng_section_block(rte_pcapng_t *self,
 	len += pcapng_optlen(0);
 	len += sizeof(uint32_t);
 
-	buf = calloc(1, len);
-	if (!buf)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_section_header *)buf;
@@ -193,10 +194,7 @@ pcapng_section_block(rte_pcapng_t *self,
 	/* clone block_length after option */
 	memcpy(opt, &hdr->block_length, sizeof(uint32_t));
 
-	cc = write(self->outfd, buf, len);
-	free(buf);
-
-	return cc;
+	return write(self->outfd, buf, len);
 }
 
 /* Write an interface block for a DPDK port */
@@ -213,7 +211,7 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 	struct pcapng_option *opt;
 	const uint8_t tsresol = 9;	/* nanosecond resolution */
 	uint32_t len;
-	void *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 	char ifname_buf[IF_NAMESIZE];
 	char ifhw[256];
 	uint64_t speed = 0;
@@ -267,8 +265,7 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 	len += pcapng_optlen(0);
 	len += sizeof(uint32_t);
 
-	buf = alloca(len);
-	if (!buf)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_interface_block *)buf;
@@ -296,17 +293,16 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 		opt = pcapng_add_option(opt, PCAPNG_IFB_HARDWARE,
 					 ifhw, strlen(ifhw));
 	if (filter) {
-		/* Encoding is that the first octet indicates string vs BPF */
 		size_t len;
-		char *buf;
 
 		len = strlen(filter) + 1;
-		buf = alloca(len);
-		*buf = '\0';
-		memcpy(buf + 1, filter, len);
+		opt->code = PCAPNG_IFB_FILTER;
+		opt->length = len;
+		/* Encoding is that the first octet indicates string vs BPF */
+		opt->data[0] = 0;
+		memcpy(opt->data + 1, filter, strlen(filter));
 
-		opt = pcapng_add_option(opt, PCAPNG_IFB_FILTER,
-					buf, len);
+		opt = (struct pcapng_option *)((uint8_t *)opt + pcapng_optlen(len));
 	}
 
 	opt = pcapng_add_option(opt, PCAPNG_OPT_END, NULL, 0);
@@ -333,7 +329,7 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	uint64_t start_time = self->offset_ns;
 	uint64_t sample_time;
 	uint32_t optlen, len;
-	uint8_t *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -353,8 +349,7 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 		optlen += pcapng_optlen(0);
 
 	len = sizeof(*hdr) + optlen + sizeof(uint32_t);
-	buf = alloca(len);
-	if (buf == NULL)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_statistics *)buf;
-- 
2.39.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v6 5/5] test: cleanups to pcapng test
  2023-11-13 16:15 ` [PATCH v6 0/5] dumpcap and pcapng fixes Stephen Hemminger
                     ` (3 preceding siblings ...)
  2023-11-13 16:15   ` [PATCH v6 4/5] pcapng: avoid using alloca() Stephen Hemminger
@ 2023-11-13 16:15   ` Stephen Hemminger
  4 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-13 16:15 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan

Overhaul of the pcapng test:
  - promote it to be a fast test so it gets regularly run.
  - create null device and use it.
  - use UDP discard packets that are valid so that for debugging
    the resulting pcapng file can be looked at with wireshark.
  - do basic checks on resulting pcap file that lengths and
    timestamps are in range.
  - add test for interface options

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/test/meson.build   |   2 +-
 app/test/test_pcapng.c | 416 +++++++++++++++++++++++++++--------------
 2 files changed, 281 insertions(+), 137 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index 4183d66b0e9c..dcc93f4a43b4 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -128,7 +128,7 @@ source_file_deps = {
     'test_metrics.c': ['metrics'],
     'test_mp_secondary.c': ['hash', 'lpm'],
     'test_net_ether.c': ['net'],
-    'test_pcapng.c': ['ethdev', 'net', 'pcapng'],
+    'test_pcapng.c': ['ethdev', 'net', 'pcapng', 'bus_vdev'],
     'test_pdcp.c': ['eventdev', 'pdcp', 'net', 'timer', 'security'],
     'test_pdump.c': ['pdump'] + sample_packet_forward_deps,
     'test_per_lcore.c': [],
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index 21131dfa0c5e..89535efad096 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -6,25 +6,34 @@
 #include <stdlib.h>
 #include <unistd.h>
 
+#include <rte_bus_vdev.h>
 #include <rte_ethdev.h>
 #include <rte_ether.h>
+#include <rte_ip.h>
 #include <rte_mbuf.h>
 #include <rte_mempool.h>
 #include <rte_net.h>
 #include <rte_pcapng.h>
+#include <rte_random.h>
+#include <rte_reciprocal.h>
+#include <rte_time.h>
+#include <rte_udp.h>
 
 #include <pcap/pcap.h>
 
 #include "test.h"
 
-#define NUM_PACKETS    10
-#define DUMMY_MBUF_NUM 3
+#define PCAPNG_TEST_DEBUG 0
+
+#define TOTAL_PACKETS	4096
+#define MAX_BURST	64
+#define MAX_GAP_US	100000
+#define DUMMY_MBUF_NUM	3
 
-static rte_pcapng_t *pcapng;
 static struct rte_mempool *mp;
 static const uint32_t pkt_len = 200;
 static uint16_t port_id;
-static char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+static const char null_dev[] = "net_null0";
 
 /* first mbuf in the packet, should always be at offset 0 */
 struct dummy_mbuf {
@@ -61,6 +70,7 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 	struct {
 		struct rte_ether_hdr eth;
 		struct rte_ipv4_hdr ip;
+		struct rte_udp_hdr udp;
 	} pkt = {
 		.eth = {
 			.dst_addr.addr_bytes = "\xff\xff\xff\xff\xff\xff",
@@ -68,148 +78,200 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 		},
 		.ip = {
 			.version_ihl = RTE_IPV4_VHL_DEF,
-			.total_length = rte_cpu_to_be_16(plen),
-			.time_to_live = IPDEFTTL,
-			.next_proto_id = IPPROTO_RAW,
+			.time_to_live = 1,
+			.next_proto_id = IPPROTO_UDP,
 			.src_addr = rte_cpu_to_be_32(RTE_IPV4_LOOPBACK),
 			.dst_addr = rte_cpu_to_be_32(RTE_IPV4_BROADCAST),
-		}
+		},
+		.udp = {
+			.dst_port = rte_cpu_to_be_16(9), /* Discard port */
+		},
 	};
 
 	memset(dm, 0, sizeof(*dm));
 	dummy_mbuf_prep(&dm->mb[0], dm->buf[0], sizeof(dm->buf[0]), plen);
 
 	rte_eth_random_addr(pkt.eth.src_addr.addr_bytes);
-	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, RTE_MIN(sizeof(pkt), plen));
+	plen -= sizeof(struct rte_ether_hdr);
+
+	pkt.ip.total_length = rte_cpu_to_be_16(plen);
+	pkt.ip.hdr_checksum = rte_ipv4_cksum(&pkt.ip);
+
+	plen -= sizeof(struct rte_ipv4_hdr);
+	pkt.udp.src_port = rte_rand();
+	pkt.udp.dgram_len = rte_cpu_to_be_16(plen);
+
+	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, sizeof(pkt));
 }
 
 static int
 test_setup(void)
 {
-	int tmp_fd;
-
-	port_id = rte_eth_find_next(0);
-	if (port_id >= RTE_MAX_ETHPORTS) {
-		fprintf(stderr, "No valid Ether port\n");
-		return -1;
-	}
+	port_id = rte_eth_dev_count_avail();
 
-	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
-	if (tmp_fd == -1) {
-		perror("mkstemps() failure");
-		return -1;
-	}
-	printf("pcapng: output file %s\n", file_name);
-
-	/* open a test capture file */
-	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
-	if (pcapng == NULL) {
-		fprintf(stderr, "rte_pcapng_fdopen failed\n");
-		close(tmp_fd);
-		return -1;
-	}
-
-	/* Add interface to the file */
-	if (rte_pcapng_add_interface(pcapng, port_id,
-				     NULL, NULL, NULL) != 0) {
-		fprintf(stderr, "can not add port %u\n", port_id);
-		return -1;
+	/* Make a dummy null device to snoop on */
+	if (rte_vdev_init(null_dev, NULL) != 0) {
+		fprintf(stderr, "Failed to create vdev '%s'\n", null_dev);
+		goto fail;
 	}
 
 	/* Make a pool for cloned packets */
-	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool", IOV_MAX + NUM_PACKETS,
-					    0, 0,
-					    rte_pcapng_mbuf_size(pkt_len),
+	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool",
+					    MAX_BURST, 0, 0,
+					    rte_pcapng_mbuf_size(pkt_len) + 128,
 					    SOCKET_ID_ANY, "ring_mp_sc");
 	if (mp == NULL) {
 		fprintf(stderr, "Cannot create mempool\n");
-		return -1;
+		goto fail;
 	}
+
 	return 0;
+
+fail:
+	rte_vdev_uninit(null_dev);
+	rte_mempool_free(mp);
+	return -1;
 }
 
 static int
-test_write_packets(void)
+fill_pcapng_file(rte_pcapng_t *pcapng, unsigned int num_packets)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[NUM_PACKETS] = { };
 	struct dummy_mbuf mbfs;
-	unsigned int i;
+	struct rte_mbuf *orig;
+	unsigned int burst_size;
+	unsigned int count;
 	ssize_t len;
 
 	/* make a dummy packet */
 	mbuf1_prepare(&mbfs, pkt_len);
-
-	/* clone them */
 	orig  = &mbfs.mb[0];
-	for (i = 0; i < NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				     RTE_PCAPNG_DIRECTION_UNKNOWN, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
+	for (count = 0; count < num_packets; count += burst_size) {
+		struct rte_mbuf *clones[MAX_BURST];
+		unsigned int i;
+
+		/* put 1 .. MAX_BURST packets in one write call */
+		burst_size = rte_rand_max(MAX_BURST) + 1;
+		for (i = 0; i < burst_size; i++) {
+			struct rte_mbuf *mc;
+
+			mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
+					     RTE_PCAPNG_DIRECTION_IN, NULL);
+			if (mc == NULL) {
+				fprintf(stderr, "Cannot copy packet\n");
+				return -1;
+			}
+			clones[i] = mc;
+		}
+
+		/* write it to capture file */
+		len = rte_pcapng_write_packets(pcapng, clones, burst_size);
+		rte_pktmbuf_free_bulk(clones, burst_size);
+
+		if (len <= 0) {
+			fprintf(stderr, "Write of packets failed: %s\n",
+				rte_strerror(rte_errno));
 			return -1;
 		}
-		clones[i] = mc;
+
+		/* Leave a small gap between packets to test for time wrap */
+		usleep(rte_rand_max(MAX_GAP_US));
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, NUM_PACKETS);
+	return count;
+}
 
-	rte_pktmbuf_free_bulk(clones, NUM_PACKETS);
+static char *
+fmt_time(char *buf, size_t size, uint64_t ts_ns)
+{
+	time_t sec;
+	size_t len;
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
-	}
+	sec = ts_ns / NS_PER_S;
+	len = strftime(buf, size, "%X", localtime(&sec));
+	snprintf(buf + len, size - len, ".%09lu",
+		 (unsigned long)(ts_ns % NS_PER_S));
 
-	return 0;
+	return buf;
 }
 
-static int
-test_write_stats(void)
+/* Context for the pcap_loop callback */
+struct pkt_print_ctx {
+	pcap_t *pcap;
+	unsigned int count;
+	uint64_t start_ns;
+	uint64_t end_ns;
+};
+
+static void
+print_packet(uint64_t ts_ns, const struct rte_ether_hdr *eh, size_t len)
 {
-	ssize_t len;
+	char tbuf[128], src[64], dst[64];
 
-	/* write a statistics block */
-	len = rte_pcapng_write_stats(pcapng, port_id,
-				     UINT64_MAX, UINT64_MAX, NULL);
-	if (len <= 0) {
-		fprintf(stderr, "Write of statistics failed\n");
-		return -1;
-	}
-	return 0;
+	fmt_time(tbuf, sizeof(tbuf), ts_ns);
+	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
+	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
+	printf("%s: %s -> %s type %x length %zu\n",
+	       tbuf, src, dst, rte_be_to_cpu_16(eh->ether_type), len);
 }
 
+/* Callback from pcap_loop used to validate packets in the file */
 static void
-pkt_print(u_char *user, const struct pcap_pkthdr *h,
-	  const u_char *bytes)
+parse_pcap_packet(u_char *user, const struct pcap_pkthdr *h,
+		  const u_char *bytes)
 {
-	unsigned int *countp = (unsigned int *)user;
+	struct pkt_print_ctx *ctx = (struct pkt_print_ctx *)user;
 	const struct rte_ether_hdr *eh;
-	struct tm *tm;
-	char tbuf[128], src[64], dst[64];
+	const struct rte_ipv4_hdr *ip;
+	uint64_t ns;
 
-	tm = localtime(&h->ts.tv_sec);
-	if (tm == NULL) {
-		perror("localtime");
-		return;
+	eh = (const struct rte_ether_hdr *)bytes;
+	ip = (const struct rte_ipv4_hdr *)(eh + 1);
+
+	ctx->count += 1;
+
+	/* The pcap library is misleading in reporting timestamp.
+	 * packet header struct gives timestamp as a timeval (ie. usec);
+	 * but the file is open in nanosecond mode therefore
+	 * the timestamp is really in timespec (ie. nanoseconds).
+	 */
+	ns = h->ts.tv_sec * NS_PER_S + h->ts.tv_usec;
+	if (ns < ctx->start_ns || ns > ctx->end_ns) {
+		char tstart[128], tend[128];
+
+		fmt_time(tstart, sizeof(tstart), ctx->start_ns);
+		fmt_time(tend, sizeof(tend), ctx->end_ns);
+		fprintf(stderr, "Timestamp out of range [%s .. %s]\n",
+			tstart, tend);
+		goto error;
 	}
 
-	if (strftime(tbuf, sizeof(tbuf), "%X", tm) == 0) {
-		fprintf(stderr, "strftime returned 0!\n");
-		return;
+	if (!rte_is_broadcast_ether_addr(&eh->dst_addr)) {
+		fprintf(stderr, "Destination is not broadcast\n");
+		goto error;
 	}
 
-	eh = (const struct rte_ether_hdr *)bytes;
-	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
-	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
-	printf("%s.%06lu: %s -> %s type %x length %u\n",
-	       tbuf, (unsigned long)h->ts.tv_usec,
-	       src, dst, rte_be_to_cpu_16(eh->ether_type), h->len);
+	if (rte_ipv4_cksum(ip) != 0) {
+		fprintf(stderr, "Bad IPv4 checksum\n");
+		goto error;
+	}
+
+	return;		/* packet is normal */
+
+error:
+	print_packet(ns, eh, h->len);
+
+	/* Stop parsing at first error */
+	pcap_breakloop(ctx->pcap);
+}
 
-	*countp += 1;
+static uint64_t
+current_timestamp(void)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_REALTIME, &ts);
+	return rte_timespec_to_ns(&ts);
 }
 
 /*
@@ -218,78 +280,162 @@ pkt_print(u_char *user, const struct pcap_pkthdr *h,
  * but that creates an unwanted dependency.
  */
 static int
-test_validate(void)
+valid_pcapng_file(const char *file_name, uint64_t started, unsigned int expected)
 {
 	char errbuf[PCAP_ERRBUF_SIZE];
-	unsigned int count = 0;
-	pcap_t *pcap;
+	struct pkt_print_ctx ctx = { };
 	int ret;
 
-	pcap = pcap_open_offline(file_name, errbuf);
-	if (pcap == NULL) {
+	ctx.start_ns = started;
+	ctx.end_ns = current_timestamp();
+
+	ctx.pcap = pcap_open_offline_with_tstamp_precision(file_name,
+							   PCAP_TSTAMP_PRECISION_NANO,
+							   errbuf);
+	if (ctx.pcap == NULL) {
 		fprintf(stderr, "pcap_open_offline('%s') failed: %s\n",
 			file_name, errbuf);
 		return -1;
 	}
 
-	ret = pcap_loop(pcap, 0, pkt_print, (u_char *)&count);
-	if (ret == 0)
-		printf("Saw %u packets\n", count);
-	else
+	ret = pcap_loop(ctx.pcap, 0, parse_pcap_packet, (u_char *)&ctx);
+	if (ret != 0) {
 		fprintf(stderr, "pcap_dispatch: failed: %s\n",
-			pcap_geterr(pcap));
-	pcap_close(pcap);
+			pcap_geterr(ctx.pcap));
+	} else if (ctx.count != expected) {
+		printf("Only %u packets, expected %u\n",
+		       ctx.count, expected);
+		ret = -1;
+	}
+
+	pcap_close(ctx.pcap);
 
 	return ret;
 }
 
 static int
-test_write_over_limit_iov_max(void)
+test_add_interface(void)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[IOV_MAX + NUM_PACKETS] = { };
-	struct dummy_mbuf mbfs;
-	unsigned int i;
-	ssize_t len;
+	char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	static rte_pcapng_t *pcapng;
+	int ret, tmp_fd;
+	uint64_t now = current_timestamp();
 
-	/* make a dummy packet */
-	mbuf1_prepare(&mbfs, pkt_len);
+	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		goto fail;
+	}
+	printf("pcapng: output file %s\n", file_name);
 
-	/* clone them */
-	orig  = &mbfs.mb[0];
-	for (i = 0; i < IOV_MAX + NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_addif", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		goto fail;
+	}
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				     RTE_PCAPNG_DIRECTION_UNKNOWN, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
-			return -1;
-		}
-		clones[i] = mc;
+	/* Add interface to the file */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u\n", port_id);
+		goto fail;
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, IOV_MAX + NUM_PACKETS);
+	/* Add interface with ifname and ifdescr */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       "myeth", "Some long description", NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u with ifname\n", port_id);
+		goto fail;
+	}
+
+	/* Add interface with filter */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, "tcp port 8080");
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u with filter\n", port_id);
+		goto fail;
+	}
 
-	rte_pktmbuf_free_bulk(clones, IOV_MAX + NUM_PACKETS);
+	rte_pcapng_close(pcapng);
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
+	ret = valid_pcapng_file(file_name, now, 0);
+	/* if test fails want to investigate the file */
+	if (ret == 0)
+		unlink(file_name);
+
+	return ret;
+
+fail:
+	rte_pcapng_close(pcapng);
+	return -1;
+}
+
+static int
+test_write_packets(void)
+{
+	char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	static rte_pcapng_t *pcapng;
+	int ret, tmp_fd, count;
+	uint64_t now = current_timestamp();
+
+	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		goto fail;
 	}
+	printf("pcapng: output file %s\n", file_name);
 
-	return 0;
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		goto fail;
+	}
+
+	/* Add interface to the file */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u\n", port_id);
+		goto fail;
+	}
+
+	count = fill_pcapng_file(pcapng, TOTAL_PACKETS);
+	if (count < 0)
+		goto fail;
+
+	/* write a statistics block */
+	ret = rte_pcapng_write_stats(pcapng, port_id,
+				     count, 0, "end of test");
+	if (ret <= 0) {
+		fprintf(stderr, "Write of statistics failed\n");
+		goto fail;
+	}
+
+	rte_pcapng_close(pcapng);
+
+	ret = valid_pcapng_file(file_name, now, count);
+	/* if test fails want to investigate the file */
+	if (ret == 0)
+		unlink(file_name);
+
+	return ret;
+
+fail:
+	rte_pcapng_close(pcapng);
+	return -1;
 }
 
 static void
 test_cleanup(void)
 {
 	rte_mempool_free(mp);
-
-	if (pcapng)
-		rte_pcapng_close(pcapng);
-
+	rte_vdev_uninit(null_dev);
 }
 
 static struct
@@ -298,10 +444,8 @@ unit_test_suite test_pcapng_suite  = {
 	.teardown = test_cleanup,
 	.suite_name = "Test Pcapng Unit Test Suite",
 	.unit_test_cases = {
+		TEST_CASE(test_add_interface),
 		TEST_CASE(test_write_packets),
-		TEST_CASE(test_write_stats),
-		TEST_CASE(test_validate),
-		TEST_CASE(test_write_over_limit_iov_max),
 		TEST_CASES_END()
 	}
 };
@@ -312,4 +456,4 @@ test_pcapng(void)
 	return unit_test_suite_runner(&test_pcapng_suite);
 }
 
-REGISTER_TEST_COMMAND(pcapng_autotest, test_pcapng);
+REGISTER_FAST_TEST(pcapng_autotest, true, true, test_pcapng);
-- 
2.39.2



* [PATCH v7 0/5] dumpcap and pcapng fixes
  2023-09-21  4:23 [PATCH 0/4] pcapng fixes Stephen Hemminger
                   ` (8 preceding siblings ...)
  2023-11-13 16:15 ` [PATCH v6 0/5] dumpcap and pcapng fixes Stephen Hemminger
@ 2023-11-17 16:35 ` Stephen Hemminger
  2023-11-17 16:35   ` [PATCH v7 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
                     ` (5 more replies)
  9 siblings, 6 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-17 16:35 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger

This series fixes issues related to timestamping. The design choice is
to maximize performance in the primary process and to do
all the time adjustment in the secondary (dumpcap), since
dumpcap needs to make system calls anyway to write the result.

This patch set moves the calculation of the adjustment
into the pcapng portion that opens the output file.
All details of the timestamp format are contained inside
pcapng (data hiding).
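
As an illustration of capturing the time base once when the output file
is opened, here is a minimal sketch. It is not the code from this series:
read_counter() is a hypothetical stand-in for rte_get_tsc_cycles() built
on CLOCK_MONOTONIC so that the example runs without DPDK, and
struct time_base is an invented name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NS_PER_S 1000000000ULL

/* Hypothetical container for the values sampled at open time */
struct time_base {
	uint64_t counter_base;	/* free-running counter at calibration */
	uint64_t offset_ns;	/* wall clock, ns since 1/1/1970 */
};

/* Stand-in for rte_get_tsc_cycles(): any monotonic counter will do here */
static uint64_t
read_counter(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * NS_PER_S + (uint64_t)ts.tv_nsec;
}

/* Sample the counter on both sides of the wall clock read */
static void
time_base_init(struct time_base *tb)
{
	struct timespec ts;
	uint64_t before, after;

	assert(tb != NULL);

	before = read_counter();
	clock_gettime(CLOCK_REALTIME, &ts);
	after = read_counter();

	/* midpoint approximates the instant the wall clock was sampled */
	tb->counter_base = before + (after - before) / 2;
	tb->offset_ns = (uint64_t)ts.tv_sec * NS_PER_S + (uint64_t)ts.tv_nsec;
}
```

Every later timestamp can then be expressed as offset_ns plus the time
elapsed since counter_base, so the packet path only has to read the counter.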

v7 - no change, rebased because CI reported some apply failures
v6 - make sure all steps compile
v5 - fix format of getpid in capture name
v4 - incorporate review feedback
v3 - don't use alloca() since it can have VLA type issues

Stephen Hemminger (5):
  pdump: fix setting rte_errno on mp error
  dumpcap: allow multiple invocations
  pcapng: modify timestamp calculation
  pcapng: avoid using alloca()
  test: cleanups to pcapng test

 app/dumpcap/main.c      |  53 ++---
 app/test/meson.build    |   2 +-
 app/test/test_pcapng.c  | 417 +++++++++++++++++++++++++++-------------
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 156 ++++++---------
 lib/pcapng/rte_pcapng.h |  19 +-
 lib/pdump/rte_pdump.c   |   9 +-
 7 files changed, 373 insertions(+), 285 deletions(-)

-- 
2.42.0



* [PATCH v7 1/5] pdump: fix setting rte_errno on mp error
  2023-11-17 16:35 ` [PATCH v7 0/5] dumpcap and pcapng fixes Stephen Hemminger
@ 2023-11-17 16:35   ` Stephen Hemminger
  2023-11-17 16:35   ` [PATCH v7 2/5] dumpcap: allow multiple invocations Stephen Hemminger
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-17 16:35 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan, Jianfeng Tan

The response from MP server sets err_value to negative
on error. The convention for rte_errno is to use a positive
value on error. This makes errors like duplicate registration
show up with the correct error value.

Fixes: 660098d61f57 ("pdump: use generic multi-process channel")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/pdump/rte_pdump.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
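
The sign convention described above can be sketched in isolation as
follows. This is only an illustration: struct reply, last_errno and
handle_reply() are hypothetical stand-ins for the MP response, rte_errno
and the request helper, and are not DPDK names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the response carried over the MP channel */
struct reply {
	int err_value;	/* 0 on success, negative errno value on failure */
};

/* Hypothetical stand-in for rte_errno */
static int last_errno;

/* Return 0 on success; on failure return -1 and store a positive code */
static int
handle_reply(const struct reply *resp)
{
	assert(resp != NULL);

	if (resp->err_value == 0)
		return 0;

	/* flip the sign: -EEXIST from the server becomes EEXIST */
	last_errno = -resp->err_value;
	return -1;
}
```

With the positive value stored, a caller can pass it straight to a
strerror()-style function to get a readable message for cases such as
duplicate registration.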

diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index 80b90c6f7d03..e94f49e21250 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -564,9 +564,10 @@ pdump_prepare_client_request(const char *device, uint16_t queue,
 	if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) == 0) {
 		mp_rep = &mp_reply.msgs[0];
 		resp = (struct pdump_response *)mp_rep->param;
-		rte_errno = resp->err_value;
-		if (!resp->err_value)
+		if (resp->err_value == 0)
 			ret = 0;
+		else
+			rte_errno = -resp->err_value;
 		free(mp_reply.msgs);
 	}
 
-- 
2.42.0



* [PATCH v7 2/5] dumpcap: allow multiple invocations
  2023-11-17 16:35 ` [PATCH v7 0/5] dumpcap and pcapng fixes Stephen Hemminger
  2023-11-17 16:35   ` [PATCH v7 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
@ 2023-11-17 16:35   ` Stephen Hemminger
  2023-11-17 16:35   ` [PATCH v7 3/5] pcapng: modify timestamp calculation Stephen Hemminger
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-17 16:35 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Isaac Boukris, Reshma Pattan

If dumpcap is run twice with each instance pointing at a different
interface, it would fail because of overlap in ring and pool names.
Fix by putting the process id in the name.

It is still not allowed to do multiple invocations on the same
interface because only one callback is allowed and only one copy
of the mbuf is made. Dumpcap will fail with an error in this case:

   pdump_prepare_client_request(): client request for pdump enable/disable failed
   EAL: Error - exiting with code: 1
     Cause: Packet dump enable on 0:net_null0 failed File exists

Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application")
Reported-by: Isaac Boukris <iboukris@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/dumpcap/main.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)
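
The naming scheme can be sketched on its own as below. The helper
make_instance_name() and the NAME_SIZE limit are hypothetical and only
stand in for the snprintf() calls and the RTE_*_NAMESIZE constants used
in the patch; the truncation check is an addition for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical limit standing in for RTE_RING_NAMESIZE */
#define NAME_SIZE 32

/* Format "<prefix>-<pid>" into buf; return 0, or -1 if it was truncated */
static int
make_instance_name(char *buf, size_t len, const char *prefix, pid_t pid)
{
	int n;

	assert(buf != NULL && prefix != NULL);

	n = snprintf(buf, len, "%s-%d", prefix, (int)pid);
	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}
```

Two processes calling make_instance_name(name, sizeof(name), "dumpcap",
getpid()) get different names, so creating the ring no longer collides
with one created by another running instance.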

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index 4f581bd341d8..d05dddac0071 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -44,7 +44,6 @@
 #include <pcap/pcap.h>
 #include <pcap/bpf.h>
 
-#define RING_NAME "capture-ring"
 #define MONITOR_INTERVAL  (500 * 1000)
 #define MBUF_POOL_CACHE_SIZE 32
 #define BURST_SIZE 32
@@ -647,6 +646,7 @@ static void dpdk_init(void)
 static struct rte_ring *create_ring(void)
 {
 	struct rte_ring *ring;
+	char ring_name[RTE_RING_NAMESIZE];
 	size_t size, log2;
 
 	/* Find next power of 2 >= size. */
@@ -660,28 +660,28 @@ static struct rte_ring *create_ring(void)
 		ring_size = size;
 	}
 
-	ring = rte_ring_lookup(RING_NAME);
-	if (ring == NULL) {
-		ring = rte_ring_create(RING_NAME, ring_size,
-					rte_socket_id(), 0);
-		if (ring == NULL)
-			rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
-				 rte_strerror(rte_errno));
-	}
+	/* Want one ring per invocation of program */
+	snprintf(ring_name, sizeof(ring_name),
+		 "dumpcap-%d", getpid());
+
+	ring = rte_ring_create(ring_name, ring_size,
+			       rte_socket_id(), 0);
+	if (ring == NULL)
+		rte_exit(EXIT_FAILURE, "Could not create ring :%s\n",
+			 rte_strerror(rte_errno));
+
 	return ring;
 }
 
 static struct rte_mempool *create_mempool(void)
 {
 	const struct interface *intf;
-	static const char pool_name[] = "capture_mbufs";
+	char pool_name[RTE_MEMPOOL_NAMESIZE];
 	size_t num_mbufs = 2 * ring_size;
 	struct rte_mempool *mp;
 	uint32_t data_size = 128;
 
-	mp = rte_mempool_lookup(pool_name);
-	if (mp)
-		return mp;
+	snprintf(pool_name, sizeof(pool_name), "capture_%d", getpid());
 
 	/* Common pool so size mbuf for biggest snap length */
 	TAILQ_FOREACH(intf, &interfaces, next) {
@@ -826,7 +826,7 @@ static void enable_pdump(struct rte_ring *r, struct rte_mempool *mp)
 			rte_exit(EXIT_FAILURE,
 				"Packet dump enable on %u:%s failed %s\n",
 				intf->port, intf->name,
-				rte_strerror(-ret));
+				rte_strerror(rte_errno));
 		}
 
 		if (intf->opts.promisc_mode) {
-- 
2.42.0



* [PATCH v7 3/5] pcapng: modify timestamp calculation
  2023-11-17 16:35 ` [PATCH v7 0/5] dumpcap and pcapng fixes Stephen Hemminger
  2023-11-17 16:35   ` [PATCH v7 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
  2023-11-17 16:35   ` [PATCH v7 2/5] dumpcap: allow multiple invocations Stephen Hemminger
@ 2023-11-17 16:35   ` Stephen Hemminger
  2023-11-17 16:35   ` [PATCH v7 4/5] pcapng: avoid using alloca() Stephen Hemminger
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-17 16:35 UTC (permalink / raw)
  To: dev
  Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan,
	Jerin Jacob, Kiran Kumar K, Nithin Dabilpuram, Zhirun Yan,
	Quentin Armitage

The computation of the timestamp is best done in the part of
the pcapng library that runs in the secondary process.
The secondary process is already doing a bunch of system
calls, which makes it not performance sensitive.
This does change the rte_pcapng_copy()
and rte_pcapng_write_stats() experimental APIs.

Simplify the computation of nanoseconds from TSC to a two
step process which avoids numeric overflow issues. The previous
code was also not thread safe.

Fixes: c882eb544842 ("pcapng: fix timestamp wrapping in output files")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 app/dumpcap/main.c      |  25 +++------
 app/test/test_pcapng.c  |   7 +--
 lib/graph/graph_pcap.c  |   2 +-
 lib/pcapng/rte_pcapng.c | 119 +++++++++++++++-------------------------
 lib/pcapng/rte_pcapng.h |  19 ++-----
 lib/pdump/rte_pdump.c   |   4 +-
 6 files changed, 62 insertions(+), 114 deletions(-)
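
The two step conversion can be tried outside DPDK with the sketch below.
It mirrors the arithmetic of pcapng_timestamp() but takes the base, the
offset and the TSC frequency as plain arguments. The function names are
made up for the example, and naive_tsc_to_ns() is included only to show
the wraparound that the two step form avoids.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_S 1000000000ULL

/*
 * Convert a cycle count to nanoseconds since 1/1/1970.
 * tsc_base and offset_ns were sampled together at start,
 * hz is the counter frequency.
 */
static uint64_t
tsc_to_ns(uint64_t cycles, uint64_t tsc_base, uint64_t offset_ns, uint64_t hz)
{
	uint64_t delta, secs, rem;

	assert(hz != 0);

	delta = cycles - tsc_base;
	secs = delta / hz;	/* whole seconds */
	rem = delta % hz;	/* leftover cycles, always < hz */

	/* rem * NS_PER_S fits in 64 bits as long as hz < approx 18.4GHz */
	return secs * NS_PER_S + (rem * NS_PER_S) / hz + offset_ns;
}

/* Single step version: delta * NS_PER_S wraps once delta > 2^64 / 10^9 */
static uint64_t
naive_tsc_to_ns(uint64_t cycles, uint64_t tsc_base, uint64_t offset_ns,
		uint64_t hz)
{
	assert(hz != 0);

	return ((cycles - tsc_base) * NS_PER_S) / hz + offset_ns;
}
```

At 3GHz the product in the single step version wraps after roughly six
seconds of capture, while the two step version keeps working because the
multiplication only ever sees the sub-second remainder.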

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index d05dddac0071..fc28e2d7027a 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -66,13 +66,13 @@ static bool print_stats;
 
 /* capture limit options */
 static struct {
-	uint64_t  duration;	/* nanoseconds */
+	time_t  duration;	/* seconds */
 	unsigned long packets;  /* number of packets in file */
 	size_t size;		/* file size (bytes) */
 } stop;
 
 /* Running state */
-static uint64_t start_time, end_time;
+static time_t start_time;
 static uint64_t packets_received;
 static size_t file_size;
 
@@ -197,7 +197,7 @@ static void auto_stop(char *opt)
 		if (*value == '\0' || *endp != '\0' || interval <= 0)
 			rte_exit(EXIT_FAILURE,
 				 "Invalid duration \"%s\"\n", value);
-		stop.duration = NSEC_PER_SEC * interval;
+		stop.duration = interval;
 	} else if (strcmp(opt, "filesize") == 0) {
 		stop.size = get_uint(value, "filesize", 0) * 1024;
 	} else if (strcmp(opt, "packets") == 0) {
@@ -511,15 +511,6 @@ static void statistics_loop(void)
 	}
 }
 
-/* Return the time since 1/1/1970 in nanoseconds */
-static uint64_t create_timestamp(void)
-{
-	struct timespec now;
-
-	clock_gettime(CLOCK_MONOTONIC, &now);
-	return rte_timespec_to_ns(&now);
-}
-
 static void
 cleanup_pdump_resources(void)
 {
@@ -589,9 +580,8 @@ report_packet_stats(dumpcap_out_t out)
 		ifdrop = pdump_stats.nombuf + pdump_stats.ringfull;
 
 		if (use_pcapng)
-			rte_pcapng_write_stats(out.pcapng, intf->port, NULL,
-					       start_time, end_time,
-					       ifrecv, ifdrop);
+			rte_pcapng_write_stats(out.pcapng, intf->port,
+					       ifrecv, ifdrop, NULL);
 
 		if (ifrecv == 0)
 			percent = 0;
@@ -983,7 +973,7 @@ int main(int argc, char **argv)
 	mp = create_mempool();
 	out = create_output();
 
-	start_time = create_timestamp();
+	start_time = time(NULL);
 	enable_pdump(r, mp);
 
 	if (!quiet) {
@@ -1005,11 +995,10 @@ int main(int argc, char **argv)
 			break;
 
 		if (stop.duration != 0 &&
-		    create_timestamp() - start_time > stop.duration)
+		    time(NULL) - start_time > stop.duration)
 			break;
 	}
 
-	end_time = create_timestamp();
 	disable_primary_monitor();
 
 	if (rte_eal_primary_proc_alive(NULL))
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index b8429a02f160..21131dfa0c5e 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -146,7 +146,7 @@ test_write_packets(void)
 		struct rte_mbuf *mc;
 
 		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
+				     RTE_PCAPNG_DIRECTION_UNKNOWN, NULL);
 		if (mc == NULL) {
 			fprintf(stderr, "Cannot copy packet\n");
 			return -1;
@@ -174,8 +174,7 @@ test_write_stats(void)
 
 	/* write a statistics block */
 	len = rte_pcapng_write_stats(pcapng, port_id,
-				     NULL, 0, 0,
-				     NUM_PACKETS, 0);
+				     UINT64_MAX, UINT64_MAX, NULL);
 	if (len <= 0) {
 		fprintf(stderr, "Write of statistics failed\n");
 		return -1;
@@ -262,7 +261,7 @@ test_write_over_limit_iov_max(void)
 		struct rte_mbuf *mc;
 
 		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				rte_get_tsc_cycles(), 0, NULL);
+				     RTE_PCAPNG_DIRECTION_UNKNOWN, NULL);
 		if (mc == NULL) {
 			fprintf(stderr, "Cannot copy packet\n");
 			return -1;
diff --git a/lib/graph/graph_pcap.c b/lib/graph/graph_pcap.c
index db722c375fa7..89525f1220ca 100644
--- a/lib/graph/graph_pcap.c
+++ b/lib/graph/graph_pcap.c
@@ -214,7 +214,7 @@ graph_pcap_dispatch(struct rte_graph *graph,
 		mbuf = (struct rte_mbuf *)objs[i];
 
 		mc = rte_pcapng_copy(mbuf->port, 0, mbuf, pkt_mp, mbuf->pkt_len,
-				     rte_get_tsc_cycles(), 0, buffer);
+				     0, buffer);
 		if (mc == NULL)
 			break;
 
diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 3c91fc77644a..13fd2b97fb80 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -36,22 +36,14 @@
 /* Format of the capture file handle */
 struct rte_pcapng {
 	int  outfd;		/* output file */
-
 	unsigned int ports;	/* number of interfaces added */
+	uint64_t offset_ns;	/* ns since 1/1/1970 when initialized */
+	uint64_t tsc_base;	/* TSC when started */
 
 	/* DPDK port id to interface index in file */
 	uint32_t port_index[RTE_MAX_ETHPORTS];
 };
 
-/* For converting TSC cycles to PCAPNG ns format */
-static struct pcapng_time {
-	uint64_t ns;
-	uint64_t cycles;
-	uint64_t tsc_hz;
-	struct rte_reciprocal_u64 tsc_hz_inverse;
-} pcapng_time;
-
-
 #ifdef RTE_EXEC_ENV_WINDOWS
 /*
  * Windows does not have writev() call.
@@ -102,56 +94,21 @@ static ssize_t writev(int fd, const struct iovec *iov, int iovcnt)
 #define if_indextoname(ifindex, ifname) NULL
 #endif
 
-static inline void
-pcapng_init(void)
+/* Convert from TSC (CPU cycles) to nanoseconds */
+static uint64_t
+pcapng_timestamp(const rte_pcapng_t *self, uint64_t cycles)
 {
-	struct timespec ts;
+	uint64_t delta, rem, secs, ns;
+	const uint64_t hz = rte_get_tsc_hz();
 
-	pcapng_time.cycles = rte_get_tsc_cycles();
-	clock_gettime(CLOCK_REALTIME, &ts);
-	pcapng_time.cycles = (pcapng_time.cycles + rte_get_tsc_cycles()) / 2;
-	pcapng_time.ns = rte_timespec_to_ns(&ts);
-
-	pcapng_time.tsc_hz = rte_get_tsc_hz();
-	pcapng_time.tsc_hz_inverse = rte_reciprocal_value_u64(pcapng_time.tsc_hz);
-}
+	delta = cycles - self->tsc_base;
 
-/* PCAPNG timestamps are in nanoseconds */
-static uint64_t pcapng_tsc_to_ns(uint64_t cycles)
-{
-	uint64_t delta, secs;
-
-	if (!pcapng_time.tsc_hz)
-		pcapng_init();
-
-	/* In essence the calculation is:
-	 *   delta = (cycles - pcapng_time.cycles) * NSEC_PRE_SEC / rte_get_tsc_hz()
-	 * but this overflows within 4 to 8 seconds depending on TSC frequency.
-	 * Instead, if delta >= pcapng_time.tsc_hz:
-	 *   Increase pcapng_time.ns and pcapng_time.cycles by the number of
-	 *   whole seconds in delta and reduce delta accordingly.
-	 * delta will therefore always lie in the interval [0, pcapng_time.tsc_hz),
-	 * which will not overflow when multiplied by NSEC_PER_SEC provided the
-	 * TSC frequency < approx 18.4GHz.
-	 *
-	 * Currently all TSCs operate below 5GHz.
-	 */
-	delta = cycles - pcapng_time.cycles;
-	if (unlikely(delta >= pcapng_time.tsc_hz)) {
-		if (likely(delta < pcapng_time.tsc_hz * 2)) {
-			delta -= pcapng_time.tsc_hz;
-			pcapng_time.cycles += pcapng_time.tsc_hz;
-			pcapng_time.ns += NSEC_PER_SEC;
-		} else {
-			secs = rte_reciprocal_divide_u64(delta, &pcapng_time.tsc_hz_inverse);
-			delta -= secs * pcapng_time.tsc_hz;
-			pcapng_time.cycles += secs * pcapng_time.tsc_hz;
-			pcapng_time.ns += secs * NSEC_PER_SEC;
-		}
-	}
+	/* Avoid numeric wraparound by computing seconds first */
+	secs = delta / hz;
+	rem = delta % hz;
+	ns = (rem * NS_PER_S) / hz;
 
-	return pcapng_time.ns + rte_reciprocal_divide_u64(delta * NSEC_PER_SEC,
-							  &pcapng_time.tsc_hz_inverse);
+	return secs * NS_PER_S + ns + self->offset_ns;
 }
 
 /* length of option including padding */
@@ -368,15 +325,15 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
  */
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop)
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment)
 {
 	struct pcapng_statistics *hdr;
 	struct pcapng_option *opt;
+	uint64_t start_time = self->offset_ns;
+	uint64_t sample_time;
 	uint32_t optlen, len;
 	uint8_t *buf;
-	uint64_t ns;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -386,10 +343,10 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 		optlen += pcapng_optlen(sizeof(ifrecv));
 	if (ifdrop != UINT64_MAX)
 		optlen += pcapng_optlen(sizeof(ifdrop));
+
 	if (start_time != 0)
 		optlen += pcapng_optlen(sizeof(start_time));
-	if (end_time != 0)
-		optlen += pcapng_optlen(sizeof(end_time));
+
 	if (comment)
 		optlen += pcapng_optlen(strlen(comment));
 	if (optlen != 0)
@@ -409,9 +366,6 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	if (start_time != 0)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_STARTTIME,
 					 &start_time, sizeof(start_time));
-	if (end_time != 0)
-		opt = pcapng_add_option(opt, PCAPNG_ISB_ENDTIME,
-					 &end_time, sizeof(end_time));
 	if (ifrecv != UINT64_MAX)
 		opt = pcapng_add_option(opt, PCAPNG_ISB_IFRECV,
 				&ifrecv, sizeof(ifrecv));
@@ -425,9 +379,9 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	hdr->block_length = len;
 	hdr->interface_id = self->port_index[port_id];
 
-	ns = pcapng_tsc_to_ns(rte_get_tsc_cycles());
-	hdr->timestamp_hi = ns >> 32;
-	hdr->timestamp_lo = (uint32_t)ns;
+	sample_time = pcapng_timestamp(self, rte_get_tsc_cycles());
+	hdr->timestamp_hi = sample_time >> 32;
+	hdr->timestamp_lo = (uint32_t)sample_time;
 
 	/* clone block_length after option */
 	memcpy(opt, &len, sizeof(uint32_t));
@@ -520,23 +474,21 @@ struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *md,
 		struct rte_mempool *mp,
-		uint32_t length, uint64_t cycles,
+		uint32_t length,
 		enum rte_pcapng_direction direction,
 		const char *comment)
 {
 	struct pcapng_enhance_packet_block *epb;
 	uint32_t orig_len, data_len, padding, flags;
 	struct pcapng_option *opt;
+	uint64_t timestamp;
 	uint16_t optlen;
 	struct rte_mbuf *mc;
-	uint64_t ns;
 	bool rss_hash;
 
 #ifdef RTE_LIBRTE_ETHDEV_DEBUG
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
 #endif
-	ns = pcapng_tsc_to_ns(cycles);
-
 	orig_len = rte_pktmbuf_pkt_len(md);
 
 	/* Take snapshot of the data */
@@ -641,8 +593,10 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	/* Interface index is filled in later during write */
 	mc->port = port_id;
 
-	epb->timestamp_hi = ns >> 32;
-	epb->timestamp_lo = (uint32_t)ns;
+	/* Put timestamp in cycles here - adjust in packet write */
+	timestamp = rte_get_tsc_cycles();
+	epb->timestamp_hi = timestamp >> 32;
+	epb->timestamp_lo = (uint32_t)timestamp;
 	epb->capture_length = data_len;
 	epb->original_length = orig_len;
 
@@ -668,6 +622,7 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 	for (i = 0; i < nb_pkts; i++) {
 		struct rte_mbuf *m = pkts[i];
 		struct pcapng_enhance_packet_block *epb;
+		uint64_t cycles, timestamp;
 
 		/* sanity check that is really a pcapng mbuf */
 		epb = rte_pktmbuf_mtod(m, struct pcapng_enhance_packet_block *);
@@ -684,6 +639,13 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 			return -1;
 		}
 
+		/* adjust timestamp recorded in packet */
+		cycles = (uint64_t)epb->timestamp_hi << 32;
+		cycles += epb->timestamp_lo;
+		timestamp = pcapng_timestamp(self, cycles);
+		epb->timestamp_hi = timestamp >> 32;
+		epb->timestamp_lo = (uint32_t)timestamp;
+
 		/*
 		 * Handle case of highly fragmented and large burst size
 		 * Note: this assumes that max segments per mbuf < IOV_MAX
@@ -725,6 +687,8 @@ rte_pcapng_fdopen(int fd,
 {
 	unsigned int i;
 	rte_pcapng_t *self;
+	struct timespec ts;
+	uint64_t cycles;
 
 	self = malloc(sizeof(*self));
 	if (!self) {
@@ -734,6 +698,13 @@ rte_pcapng_fdopen(int fd,
 
 	self->outfd = fd;
 	self->ports = 0;
+
+	/* record start time in ns since 1/1/1970 */
+	cycles = rte_get_tsc_cycles();
+	clock_gettime(CLOCK_REALTIME, &ts);
+	self->tsc_base = (cycles + rte_get_tsc_cycles()) / 2;
+	self->offset_ns = rte_timespec_to_ns(&ts);
+
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++)
 		self->port_index[i] = UINT32_MAX;
 
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index 03d658aab209..48f2b5756430 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -114,8 +114,6 @@ enum rte_pcapng_direction {
  * @param length
  *   The upper limit on bytes to copy.  Passing UINT32_MAX
  *   means all data (after offset).
- * @param timestamp
- *   The timestamp in TSC cycles.
  * @param direction
  *   The direction of the packer: receive, transmit or unknown.
  * @param comment
@@ -128,7 +126,7 @@ enum rte_pcapng_direction {
 struct rte_mbuf *
 rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *m, struct rte_mempool *mp,
-		uint32_t length, uint64_t timestamp,
+		uint32_t length,
 		enum rte_pcapng_direction direction, const char *comment);
 
 
@@ -178,28 +176,21 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
  *  The handle to the packet capture file
  * @param port
  *  The Ethernet port to report stats on.
- * @param comment
- *   Optional comment to add to statistics.
- * @param start_time
- *  The time when packet capture was started in nanoseconds.
- *  Optional: can be zero if not known.
- * @param end_time
- *  The time when packet capture was stopped in nanoseconds.
- *  Optional: can be zero if not finished;
  * @param ifrecv
  *  The number of packets received by capture.
  *  Optional: use UINT64_MAX if not known.
  * @param ifdrop
  *  The number of packets missed by the capture process.
  *  Optional: use UINT64_MAX if not known.
+ * @param comment
+ *  Optional comment to add to statistics.
  * @return
  *  number of bytes written to file, -1 on failure to write file
  */
 ssize_t
 rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port,
-		       const char *comment,
-		       uint64_t start_time, uint64_t end_time,
-		       uint64_t ifrecv, uint64_t ifdrop);
+		       uint64_t ifrecv, uint64_t ifdrop,
+		       const char *comment);
 
 #ifdef __cplusplus
 }
diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index e94f49e21250..5a1ec14d7a18 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -90,7 +90,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	int ring_enq;
 	uint16_t d_pkts = 0;
 	struct rte_mbuf *dup_bufs[nb_pkts];
-	uint64_t ts;
 	struct rte_ring *ring;
 	struct rte_mempool *mp;
 	struct rte_mbuf *p;
@@ -99,7 +98,6 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 	if (cbs->filter)
 		rte_bpf_exec_burst(cbs->filter, (void **)pkts, rcs, nb_pkts);
 
-	ts = rte_get_tsc_cycles();
 	ring = cbs->ring;
 	mp = cbs->mp;
 	for (i = 0; i < nb_pkts; i++) {
@@ -122,7 +120,7 @@ pdump_copy(uint16_t port_id, uint16_t queue,
 		if (cbs->ver == V2)
 			p = rte_pcapng_copy(port_id, queue,
 					    pkts[i], mp, cbs->snaplen,
-					    ts, direction, NULL);
+					    direction, NULL);
 		else
 			p = rte_pktmbuf_copy(pkts[i], mp, 0, cbs->snaplen);
 
-- 
2.42.0



* [PATCH v7 4/5] pcapng: avoid using alloca()
  2023-11-17 16:35 ` [PATCH v7 0/5] dumpcap and pcapng fixes Stephen Hemminger
                     ` (2 preceding siblings ...)
  2023-11-17 16:35   ` [PATCH v7 3/5] pcapng: modify timestamp calculation Stephen Hemminger
@ 2023-11-17 16:35   ` Stephen Hemminger
  2023-11-17 16:35   ` [PATCH v7 5/5] test: cleanups to pcapng test Stephen Hemminger
  2023-11-22 22:42   ` [PATCH v7 0/5] dumpcap and pcapng fixes Thomas Monjalon
  5 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-17 16:35 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Morten Brørup, Reshma Pattan

The function alloca(), like VLAs, has problems if the caller
passes a large value. Instead use a fixed size buffer (2K)
which will be more than sufficient for the info related blocks
in the file. Add bounds checks as well.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
---
 lib/pcapng/rte_pcapng.c | 37 ++++++++++++++++---------------------
 1 file changed, 16 insertions(+), 21 deletions(-)
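
The pattern of a fixed buffer plus a bounds check can be sketched with
the filter option as the example. encode_filter() and BLKSIZ are
hypothetical names for this illustration; only the encoding itself (a
leading zero octet marking a string rather than BPF, followed by the
text without its terminating NUL) is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/* Same bound as PCAPNG_BLKSIZ in the patch */
#define BLKSIZ 2048

/*
 * Encode a filter string into out.
 * Return the number of bytes used, or -1 if it does not fit.
 */
static ssize_t
encode_filter(uint8_t *out, size_t out_size, const char *filter)
{
	size_t len;

	assert(out != NULL && filter != NULL);

	len = strlen(filter) + 1;	/* marker octet + text */
	if (len > out_size)
		return -1;	/* reject instead of growing the stack */

	out[0] = 0;		/* first octet: string, not BPF */
	memcpy(out + 1, filter, len - 1);

	return (ssize_t)len;
}
```

An oversized input now produces an error return that the caller can
handle, where alloca() would have silently consumed stack space.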

diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index 13fd2b97fb80..f74ec939a9f8 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -33,6 +33,9 @@
 /* conversion from DPDK speed to PCAPNG */
 #define PCAPNG_MBPS_SPEED 1000000ull
 
+/* upper bound for section, stats and interface blocks */
+#define PCAPNG_BLKSIZ	2048
+
 /* Format of the capture file handle */
 struct rte_pcapng {
 	int  outfd;		/* output file */
@@ -140,9 +143,8 @@ pcapng_section_block(rte_pcapng_t *self,
 {
 	struct pcapng_section_header *hdr;
 	struct pcapng_option *opt;
-	void *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 	uint32_t len;
-	ssize_t cc;
 
 	len = sizeof(*hdr);
 	if (hw)
@@ -158,8 +160,7 @@ pcapng_section_block(rte_pcapng_t *self,
 	len += pcapng_optlen(0);
 	len += sizeof(uint32_t);
 
-	buf = calloc(1, len);
-	if (!buf)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_section_header *)buf;
@@ -193,10 +194,7 @@ pcapng_section_block(rte_pcapng_t *self,
 	/* clone block_length after option */
 	memcpy(opt, &hdr->block_length, sizeof(uint32_t));
 
-	cc = write(self->outfd, buf, len);
-	free(buf);
-
-	return cc;
+	return write(self->outfd, buf, len);
 }
 
 /* Write an interface block for a DPDK port */
@@ -213,7 +211,7 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 	struct pcapng_option *opt;
 	const uint8_t tsresol = 9;	/* nanosecond resolution */
 	uint32_t len;
-	void *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 	char ifname_buf[IF_NAMESIZE];
 	char ifhw[256];
 	uint64_t speed = 0;
@@ -267,8 +265,7 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 	len += pcapng_optlen(0);
 	len += sizeof(uint32_t);
 
-	buf = alloca(len);
-	if (!buf)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_interface_block *)buf;
@@ -296,17 +293,16 @@ rte_pcapng_add_interface(rte_pcapng_t *self, uint16_t port,
 		opt = pcapng_add_option(opt, PCAPNG_IFB_HARDWARE,
 					 ifhw, strlen(ifhw));
 	if (filter) {
-		/* Encoding is that the first octet indicates string vs BPF */
 		size_t len;
-		char *buf;
 
 		len = strlen(filter) + 1;
-		buf = alloca(len);
-		*buf = '\0';
-		memcpy(buf + 1, filter, len);
+		opt->code = PCAPNG_IFB_FILTER;
+		opt->length = len;
+		/* Encoding is that the first octet indicates string vs BPF */
+		opt->data[0] = 0;
+		memcpy(opt->data + 1, filter, strlen(filter));
 
-		opt = pcapng_add_option(opt, PCAPNG_IFB_FILTER,
-					buf, len);
+		opt = (struct pcapng_option *)((uint8_t *)opt + pcapng_optlen(len));
 	}
 
 	opt = pcapng_add_option(opt, PCAPNG_OPT_END, NULL, 0);
@@ -333,7 +329,7 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 	uint64_t start_time = self->offset_ns;
 	uint64_t sample_time;
 	uint32_t optlen, len;
-	uint8_t *buf;
+	uint8_t buf[PCAPNG_BLKSIZ];
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
@@ -353,8 +349,7 @@ rte_pcapng_write_stats(rte_pcapng_t *self, uint16_t port_id,
 		optlen += pcapng_optlen(0);
 
 	len = sizeof(*hdr) + optlen + sizeof(uint32_t);
-	buf = alloca(len);
-	if (buf == NULL)
+	if (len > sizeof(buf))
 		return -1;
 
 	hdr = (struct pcapng_statistics *)buf;
-- 
2.42.0
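
For readers following along: the patch above drops alloca()/calloc() in favor of a fixed-size stack buffer (PCAPNG_BLKSIZ) and turns the old "allocation failed" test into a "block too large" test. Below is a minimal standalone sketch of that pattern. It is not the rte_pcapng code: SKETCH_BLKSIZ, struct sketch_block_hdr, sketch_build_block() and sketch_demo() are hypothetical names, and the block layout is simplified to a header, a padded payload and a trailing copy of the length.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical upper bound, playing the role PCAPNG_BLKSIZ has in the patch */
#define SKETCH_BLKSIZ 2048

/* Hypothetical, simplified block header: type followed by total length */
struct sketch_block_hdr {
	uint32_t block_type;
	uint32_t block_length;
};

/*
 * Build a block around 'payload' in a caller-supplied buffer.
 * Returns the number of bytes used, or -1 if the block would not fit.
 */
static int
sketch_build_block(uint8_t *out, size_t out_size,
		   uint32_t type, const void *payload, uint32_t payload_len)
{
	struct sketch_block_hdr hdr;
	uint32_t padded, len;

	assert(out != NULL);

	/* reject oversized payloads before the arithmetic below can wrap */
	if (payload_len > SKETCH_BLKSIZ)
		return -1;

	/* pad the payload to a 32-bit boundary */
	padded = (payload_len + 3u) & ~3u;
	len = sizeof(hdr) + padded + sizeof(uint32_t);

	/* this size check takes the place of the old allocation check */
	if (len > out_size)
		return -1;

	memset(out, 0, len);
	hdr.block_type = type;
	hdr.block_length = len;
	memcpy(out, &hdr, sizeof(hdr));
	if (payload_len > 0)
		memcpy(out + sizeof(hdr), payload, payload_len);

	/* clone the block length after the payload */
	memcpy(out + sizeof(hdr) + padded, &len, sizeof(len));

	return (int)len;
}

/* Example caller: the buffer lives on the stack with a compile-time size */
static int
sketch_demo(const char *text)
{
	uint8_t buf[SKETCH_BLKSIZ];

	return sketch_build_block(buf, sizeof(buf), 1, text,
				  (uint32_t)strlen(text));
}
```

With this shape the worst-case stack usage is known at compile time, and an oversized option string shows up as an ordinary error return rather than as unbounded stack growth.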



* [PATCH v7 5/5] test: cleanups to pcapng test
  2023-11-17 16:35 ` [PATCH v7 0/5] dumpcap and pcapng fixes Stephen Hemminger
                     ` (3 preceding siblings ...)
  2023-11-17 16:35   ` [PATCH v7 4/5] pcapng: avoid using alloca() Stephen Hemminger
@ 2023-11-17 16:35   ` Stephen Hemminger
  2023-11-22 22:42   ` [PATCH v7 0/5] dumpcap and pcapng fixes Thomas Monjalon
  5 siblings, 0 replies; 61+ messages in thread
From: Stephen Hemminger @ 2023-11-17 16:35 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Reshma Pattan

Overhaul of the pcapng test:
  - promote it to be a fast test so it gets regularly run.
  - create null device and use it.
  - use UDP discard packets that are valid so that for debugging
    the resulting pcapng file can be looked at with wireshark.
  - do basic checks on resulting pcap file that lengths and
    timestamps are in range.
  - add test for interface options

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 app/test/meson.build   |   2 +-
 app/test/test_pcapng.c | 416 +++++++++++++++++++++++++++--------------
 2 files changed, 281 insertions(+), 137 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index 4183d66b0e9c..dcc93f4a43b4 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -128,7 +128,7 @@ source_file_deps = {
     'test_metrics.c': ['metrics'],
     'test_mp_secondary.c': ['hash', 'lpm'],
     'test_net_ether.c': ['net'],
-    'test_pcapng.c': ['ethdev', 'net', 'pcapng'],
+    'test_pcapng.c': ['ethdev', 'net', 'pcapng', 'bus_vdev'],
     'test_pdcp.c': ['eventdev', 'pdcp', 'net', 'timer', 'security'],
     'test_pdump.c': ['pdump'] + sample_packet_forward_deps,
     'test_per_lcore.c': [],
diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index 21131dfa0c5e..89535efad096 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -6,25 +6,34 @@
 #include <stdlib.h>
 #include <unistd.h>
 
+#include <rte_bus_vdev.h>
 #include <rte_ethdev.h>
 #include <rte_ether.h>
+#include <rte_ip.h>
 #include <rte_mbuf.h>
 #include <rte_mempool.h>
 #include <rte_net.h>
 #include <rte_pcapng.h>
+#include <rte_random.h>
+#include <rte_reciprocal.h>
+#include <rte_time.h>
+#include <rte_udp.h>
 
 #include <pcap/pcap.h>
 
 #include "test.h"
 
-#define NUM_PACKETS    10
-#define DUMMY_MBUF_NUM 3
+#define PCAPNG_TEST_DEBUG 0
+
+#define TOTAL_PACKETS	4096
+#define MAX_BURST	64
+#define MAX_GAP_US	100000
+#define DUMMY_MBUF_NUM	3
 
-static rte_pcapng_t *pcapng;
 static struct rte_mempool *mp;
 static const uint32_t pkt_len = 200;
 static uint16_t port_id;
-static char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+static const char null_dev[] = "net_null0";
 
 /* first mbuf in the packet, should always be at offset 0 */
 struct dummy_mbuf {
@@ -61,6 +70,7 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 	struct {
 		struct rte_ether_hdr eth;
 		struct rte_ipv4_hdr ip;
+		struct rte_udp_hdr udp;
 	} pkt = {
 		.eth = {
 			.dst_addr.addr_bytes = "\xff\xff\xff\xff\xff\xff",
@@ -68,148 +78,200 @@ mbuf1_prepare(struct dummy_mbuf *dm, uint32_t plen)
 		},
 		.ip = {
 			.version_ihl = RTE_IPV4_VHL_DEF,
-			.total_length = rte_cpu_to_be_16(plen),
-			.time_to_live = IPDEFTTL,
-			.next_proto_id = IPPROTO_RAW,
+			.time_to_live = 1,
+			.next_proto_id = IPPROTO_UDP,
 			.src_addr = rte_cpu_to_be_32(RTE_IPV4_LOOPBACK),
 			.dst_addr = rte_cpu_to_be_32(RTE_IPV4_BROADCAST),
-		}
+		},
+		.udp = {
+			.dst_port = rte_cpu_to_be_16(9), /* Discard port */
+		},
 	};
 
 	memset(dm, 0, sizeof(*dm));
 	dummy_mbuf_prep(&dm->mb[0], dm->buf[0], sizeof(dm->buf[0]), plen);
 
 	rte_eth_random_addr(pkt.eth.src_addr.addr_bytes);
-	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, RTE_MIN(sizeof(pkt), plen));
+	plen -= sizeof(struct rte_ether_hdr);
+
+	pkt.ip.total_length = rte_cpu_to_be_16(plen);
+	pkt.ip.hdr_checksum = rte_ipv4_cksum(&pkt.ip);
+
+	plen -= sizeof(struct rte_ipv4_hdr);
+	pkt.udp.src_port = rte_rand();
+	pkt.udp.dgram_len = rte_cpu_to_be_16(plen);
+
+	memcpy(rte_pktmbuf_mtod(dm->mb, void *), &pkt, sizeof(pkt));
 }
 
 static int
 test_setup(void)
 {
-	int tmp_fd;
-
-	port_id = rte_eth_find_next(0);
-	if (port_id >= RTE_MAX_ETHPORTS) {
-		fprintf(stderr, "No valid Ether port\n");
-		return -1;
-	}
+	port_id = rte_eth_dev_count_avail();
 
-	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
-	if (tmp_fd == -1) {
-		perror("mkstemps() failure");
-		return -1;
-	}
-	printf("pcapng: output file %s\n", file_name);
-
-	/* open a test capture file */
-	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
-	if (pcapng == NULL) {
-		fprintf(stderr, "rte_pcapng_fdopen failed\n");
-		close(tmp_fd);
-		return -1;
-	}
-
-	/* Add interface to the file */
-	if (rte_pcapng_add_interface(pcapng, port_id,
-				     NULL, NULL, NULL) != 0) {
-		fprintf(stderr, "can not add port %u\n", port_id);
-		return -1;
+	/* Make a dummy null device to snoop on */
+	if (rte_vdev_init(null_dev, NULL) != 0) {
+		fprintf(stderr, "Failed to create vdev '%s'\n", null_dev);
+		goto fail;
 	}
 
 	/* Make a pool for cloned packets */
-	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool", IOV_MAX + NUM_PACKETS,
-					    0, 0,
-					    rte_pcapng_mbuf_size(pkt_len),
+	mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool",
+					    MAX_BURST, 0, 0,
+					    rte_pcapng_mbuf_size(pkt_len) + 128,
 					    SOCKET_ID_ANY, "ring_mp_sc");
 	if (mp == NULL) {
 		fprintf(stderr, "Cannot create mempool\n");
-		return -1;
+		goto fail;
 	}
+
 	return 0;
+
+fail:
+	rte_vdev_uninit(null_dev);
+	rte_mempool_free(mp);
+	return -1;
 }
 
 static int
-test_write_packets(void)
+fill_pcapng_file(rte_pcapng_t *pcapng, unsigned int num_packets)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[NUM_PACKETS] = { };
 	struct dummy_mbuf mbfs;
-	unsigned int i;
+	struct rte_mbuf *orig;
+	unsigned int burst_size;
+	unsigned int count;
 	ssize_t len;
 
 	/* make a dummy packet */
 	mbuf1_prepare(&mbfs, pkt_len);
-
-	/* clone them */
 	orig  = &mbfs.mb[0];
-	for (i = 0; i < NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				     RTE_PCAPNG_DIRECTION_UNKNOWN, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
+	for (count = 0; count < num_packets; count += burst_size) {
+		struct rte_mbuf *clones[MAX_BURST];
+		unsigned int i;
+
+		/* put 1 .. MAX_BURST packets in one write call */
+		burst_size = rte_rand_max(MAX_BURST) + 1;
+		for (i = 0; i < burst_size; i++) {
+			struct rte_mbuf *mc;
+
+			mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
+					     RTE_PCAPNG_DIRECTION_IN, NULL);
+			if (mc == NULL) {
+				fprintf(stderr, "Cannot copy packet\n");
+				return -1;
+			}
+			clones[i] = mc;
+		}
+
+		/* write it to capture file */
+		len = rte_pcapng_write_packets(pcapng, clones, burst_size);
+		rte_pktmbuf_free_bulk(clones, burst_size);
+
+		if (len <= 0) {
+			fprintf(stderr, "Write of packets failed: %s\n",
+				rte_strerror(rte_errno));
 			return -1;
 		}
-		clones[i] = mc;
+
+		/* Leave a small gap between packets to test for time wrap */
+		usleep(rte_rand_max(MAX_GAP_US));
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, NUM_PACKETS);
+	return count;
+}
 
-	rte_pktmbuf_free_bulk(clones, NUM_PACKETS);
+static char *
+fmt_time(char *buf, size_t size, uint64_t ts_ns)
+{
+	time_t sec;
+	size_t len;
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
-	}
+	sec = ts_ns / NS_PER_S;
+	len = strftime(buf, size, "%X", localtime(&sec));
+	snprintf(buf + len, size - len, ".%09lu",
+		 (unsigned long)(ts_ns % NS_PER_S));
 
-	return 0;
+	return buf;
 }
 
-static int
-test_write_stats(void)
+/* Context for the pcap_loop callback */
+struct pkt_print_ctx {
+	pcap_t *pcap;
+	unsigned int count;
+	uint64_t start_ns;
+	uint64_t end_ns;
+};
+
+static void
+print_packet(uint64_t ts_ns, const struct rte_ether_hdr *eh, size_t len)
 {
-	ssize_t len;
+	char tbuf[128], src[64], dst[64];
 
-	/* write a statistics block */
-	len = rte_pcapng_write_stats(pcapng, port_id,
-				     UINT64_MAX, UINT64_MAX, NULL);
-	if (len <= 0) {
-		fprintf(stderr, "Write of statistics failed\n");
-		return -1;
-	}
-	return 0;
+	fmt_time(tbuf, sizeof(tbuf), ts_ns);
+	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
+	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
+	printf("%s: %s -> %s type %x length %zu\n",
+	       tbuf, src, dst, rte_be_to_cpu_16(eh->ether_type), len);
 }
 
+/* Callback from pcap_loop used to validate packets in the file */
 static void
-pkt_print(u_char *user, const struct pcap_pkthdr *h,
-	  const u_char *bytes)
+parse_pcap_packet(u_char *user, const struct pcap_pkthdr *h,
+		  const u_char *bytes)
 {
-	unsigned int *countp = (unsigned int *)user;
+	struct pkt_print_ctx *ctx = (struct pkt_print_ctx *)user;
 	const struct rte_ether_hdr *eh;
-	struct tm *tm;
-	char tbuf[128], src[64], dst[64];
+	const struct rte_ipv4_hdr *ip;
+	uint64_t ns;
 
-	tm = localtime(&h->ts.tv_sec);
-	if (tm == NULL) {
-		perror("localtime");
-		return;
+	eh = (const struct rte_ether_hdr *)bytes;
+	ip = (const struct rte_ipv4_hdr *)(eh + 1);
+
+	ctx->count += 1;
+
+	/* The pcap library is misleading in reporting timestamp.
+	 * packet header struct gives timestamp as a timeval (ie. usec);
+	 * but the file is open in nanosecond mode therefore
+	 * the timestamp is really in timespec (ie. nanoseconds).
+	 */
+	ns = h->ts.tv_sec * NS_PER_S + h->ts.tv_usec;
+	if (ns < ctx->start_ns || ns > ctx->end_ns) {
+		char tstart[128], tend[128];
+
+		fmt_time(tstart, sizeof(tstart), ctx->start_ns);
+		fmt_time(tend, sizeof(tend), ctx->end_ns);
+		fprintf(stderr, "Timestamp out of range [%s .. %s]\n",
+			tstart, tend);
+		goto error;
 	}
 
-	if (strftime(tbuf, sizeof(tbuf), "%X", tm) == 0) {
-		fprintf(stderr, "strftime returned 0!\n");
-		return;
+	if (!rte_is_broadcast_ether_addr(&eh->dst_addr)) {
+		fprintf(stderr, "Destination is not broadcast\n");
+		goto error;
 	}
 
-	eh = (const struct rte_ether_hdr *)bytes;
-	rte_ether_format_addr(dst, sizeof(dst), &eh->dst_addr);
-	rte_ether_format_addr(src, sizeof(src), &eh->src_addr);
-	printf("%s.%06lu: %s -> %s type %x length %u\n",
-	       tbuf, (unsigned long)h->ts.tv_usec,
-	       src, dst, rte_be_to_cpu_16(eh->ether_type), h->len);
+	if (rte_ipv4_cksum(ip) != 0) {
+		fprintf(stderr, "Bad IPv4 checksum\n");
+		goto error;
+	}
+
+	return;		/* packet is normal */
+
+error:
+	print_packet(ns, eh, h->len);
+
+	/* Stop parsing at first error */
+	pcap_breakloop(ctx->pcap);
+}
 
-	*countp += 1;
+static uint64_t
+current_timestamp(void)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_REALTIME, &ts);
+	return rte_timespec_to_ns(&ts);
 }
 
 /*
@@ -218,78 +280,162 @@ pkt_print(u_char *user, const struct pcap_pkthdr *h,
  * but that creates an unwanted dependency.
  */
 static int
-test_validate(void)
+valid_pcapng_file(const char *file_name, uint64_t started, unsigned int expected)
 {
 	char errbuf[PCAP_ERRBUF_SIZE];
-	unsigned int count = 0;
-	pcap_t *pcap;
+	struct pkt_print_ctx ctx = { };
 	int ret;
 
-	pcap = pcap_open_offline(file_name, errbuf);
-	if (pcap == NULL) {
+	ctx.start_ns = started;
+	ctx.end_ns = current_timestamp();
+
+	ctx.pcap = pcap_open_offline_with_tstamp_precision(file_name,
+							   PCAP_TSTAMP_PRECISION_NANO,
+							   errbuf);
+	if (ctx.pcap == NULL) {
 		fprintf(stderr, "pcap_open_offline('%s') failed: %s\n",
 			file_name, errbuf);
 		return -1;
 	}
 
-	ret = pcap_loop(pcap, 0, pkt_print, (u_char *)&count);
-	if (ret == 0)
-		printf("Saw %u packets\n", count);
-	else
+	ret = pcap_loop(ctx.pcap, 0, parse_pcap_packet, (u_char *)&ctx);
+	if (ret != 0) {
 		fprintf(stderr, "pcap_dispatch: failed: %s\n",
-			pcap_geterr(pcap));
-	pcap_close(pcap);
+			pcap_geterr(ctx.pcap));
+	} else if (ctx.count != expected) {
+		printf("Only %u packets, expected %u\n",
+		       ctx.count, expected);
+		ret = -1;
+	}
+
+	pcap_close(ctx.pcap);
 
 	return ret;
 }
 
 static int
-test_write_over_limit_iov_max(void)
+test_add_interface(void)
 {
-	struct rte_mbuf *orig;
-	struct rte_mbuf *clones[IOV_MAX + NUM_PACKETS] = { };
-	struct dummy_mbuf mbfs;
-	unsigned int i;
-	ssize_t len;
+	char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	static rte_pcapng_t *pcapng;
+	int ret, tmp_fd;
+	uint64_t now = current_timestamp();
 
-	/* make a dummy packet */
-	mbuf1_prepare(&mbfs, pkt_len);
+	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		goto fail;
+	}
+	printf("pcapng: output file %s\n", file_name);
 
-	/* clone them */
-	orig  = &mbfs.mb[0];
-	for (i = 0; i < IOV_MAX + NUM_PACKETS; i++) {
-		struct rte_mbuf *mc;
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_addif", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		goto fail;
+	}
 
-		mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len,
-				     RTE_PCAPNG_DIRECTION_UNKNOWN, NULL);
-		if (mc == NULL) {
-			fprintf(stderr, "Cannot copy packet\n");
-			return -1;
-		}
-		clones[i] = mc;
+	/* Add interface to the file */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u\n", port_id);
+		goto fail;
 	}
 
-	/* write it to capture file */
-	len = rte_pcapng_write_packets(pcapng, clones, IOV_MAX + NUM_PACKETS);
+	/* Add interface with ifname and ifdescr */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       "myeth", "Some long description", NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u with ifname\n", port_id);
+		goto fail;
+	}
+
+	/* Add interface with filter */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, "tcp port 8080");
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u with filter\n", port_id);
+		goto fail;
+	}
 
-	rte_pktmbuf_free_bulk(clones, IOV_MAX + NUM_PACKETS);
+	rte_pcapng_close(pcapng);
 
-	if (len <= 0) {
-		fprintf(stderr, "Write of packets failed\n");
-		return -1;
+	ret = valid_pcapng_file(file_name, now, 0);
+	/* if test fails want to investigate the file */
+	if (ret == 0)
+		unlink(file_name);
+
+	return ret;
+
+fail:
+	rte_pcapng_close(pcapng);
+	return -1;
+}
+
+static int
+test_write_packets(void)
+{
+	char file_name[] = "/tmp/pcapng_test_XXXXXX.pcapng";
+	static rte_pcapng_t *pcapng;
+	int ret, tmp_fd, count;
+	uint64_t now = current_timestamp();
+
+	tmp_fd = mkstemps(file_name, strlen(".pcapng"));
+	if (tmp_fd == -1) {
+		perror("mkstemps() failure");
+		goto fail;
 	}
+	printf("pcapng: output file %s\n", file_name);
 
-	return 0;
+	/* open a test capture file */
+	pcapng = rte_pcapng_fdopen(tmp_fd, NULL, NULL, "pcapng_test", NULL);
+	if (pcapng == NULL) {
+		fprintf(stderr, "rte_pcapng_fdopen failed\n");
+		close(tmp_fd);
+		goto fail;
+	}
+
+	/* Add interface to the file */
+	ret = rte_pcapng_add_interface(pcapng, port_id,
+				       NULL, NULL, NULL);
+	if (ret < 0) {
+		fprintf(stderr, "can not add port %u\n", port_id);
+		goto fail;
+	}
+
+	count = fill_pcapng_file(pcapng, TOTAL_PACKETS);
+	if (count < 0)
+		goto fail;
+
+	/* write a statistics block */
+	ret = rte_pcapng_write_stats(pcapng, port_id,
+				     count, 0, "end of test");
+	if (ret <= 0) {
+		fprintf(stderr, "Write of statistics failed\n");
+		goto fail;
+	}
+
+	rte_pcapng_close(pcapng);
+
+	ret = valid_pcapng_file(file_name, now, count);
+	/* if test fails want to investigate the file */
+	if (ret == 0)
+		unlink(file_name);
+
+	return ret;
+
+fail:
+	rte_pcapng_close(pcapng);
+	return -1;
 }
 
 static void
 test_cleanup(void)
 {
 	rte_mempool_free(mp);
-
-	if (pcapng)
-		rte_pcapng_close(pcapng);
-
+	rte_vdev_uninit(null_dev);
 }
 
 static struct
@@ -298,10 +444,8 @@ unit_test_suite test_pcapng_suite  = {
 	.teardown = test_cleanup,
 	.suite_name = "Test Pcapng Unit Test Suite",
 	.unit_test_cases = {
+		TEST_CASE(test_add_interface),
 		TEST_CASE(test_write_packets),
-		TEST_CASE(test_write_stats),
-		TEST_CASE(test_validate),
-		TEST_CASE(test_write_over_limit_iov_max),
 		TEST_CASES_END()
 	}
 };
@@ -312,4 +456,4 @@ test_pcapng(void)
 	return unit_test_suite_runner(&test_pcapng_suite);
 }
 
-REGISTER_TEST_COMMAND(pcapng_autotest, test_pcapng);
+REGISTER_FAST_TEST(pcapng_autotest, true, true, test_pcapng);
-- 
2.42.0



* Re: [PATCH v7 0/5] dumpcap and pcapng fixes
  2023-11-17 16:35 ` [PATCH v7 0/5] dumpcap and pcapng fixes Stephen Hemminger
                     ` (4 preceding siblings ...)
  2023-11-17 16:35   ` [PATCH v7 5/5] test: cleanups to pcapng test Stephen Hemminger
@ 2023-11-22 22:42   ` Thomas Monjalon
  5 siblings, 0 replies; 61+ messages in thread
From: Thomas Monjalon @ 2023-11-22 22:42 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

> Stephen Hemminger (5):
>   pdump: fix setting rte_errno on mp error
>   dumpcap: allow multiple invocations
>   pcapng: modify timestamp calculation
>   pcapng: avoid using alloca()
>   test: cleanups to pcapng test

Applied with a note about the API change in release notes.




end of thread: 2023-11-22 22:42 UTC

Thread overview: 61+ messages
2023-09-21  4:23 [PATCH 0/4] pcapng fixes Stephen Hemminger
2023-09-21  4:23 ` [PATCH 1/4] pdump: fix setting rte_errno on mp error Stephen Hemminger
2023-09-21  4:23 ` [PATCH 2/4] dumpcap: allow multiple invocations Stephen Hemminger
2023-09-21  6:22   ` Morten Brørup
2023-09-21  7:10     ` Isaac Boukris
2023-11-07  2:34     ` Stephen Hemminger
2023-09-21  4:23 ` [PATCH 3/4] pcapng: change timestamp argument to write_stats Stephen Hemminger
2023-09-21  4:23 ` [PATCH 4/4] pcapng: move timestamp calculation into pdump Stephen Hemminger
2023-10-02  8:15   ` David Marchand
2023-10-04 17:13     ` Stephen Hemminger
2023-10-06  9:10       ` David Marchand
2023-10-06 14:59         ` Kevin Traynor
2023-10-05 23:06 ` [PATCH v2 0/4] dumpcap and pcapng fixes Stephen Hemminger
2023-10-05 23:06   ` [PATCH v2 1/4] pdump: fix setting rte_errno on mp error Stephen Hemminger
2023-10-05 23:06   ` [PATCH v2 2/4] dumpcap: allow multiple invocations Stephen Hemminger
2023-10-05 23:06   ` [PATCH v2 3/4] pcapng: modify timestamp calculation Stephen Hemminger
2023-10-05 23:06   ` [PATCH v2 4/4] test: cleanups to pcapng test Stephen Hemminger
2023-11-08 18:35 ` [PATCH v3 0/5] dumpcap and pcapng fixes Stephen Hemminger
2023-11-08 18:35   ` [PATCH v3 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
2023-11-09  7:34     ` Morten Brørup
2023-11-08 18:35   ` [PATCH v3 2/5] dumpcap: allow multiple invocations Stephen Hemminger
2023-11-09  7:50     ` Morten Brørup
2023-11-09 15:40       ` Stephen Hemminger
2023-11-09 16:00         ` Morten Brørup
2023-11-09 17:16       ` Stephen Hemminger
2023-11-09 18:22         ` Morten Brørup
2023-11-08 18:35   ` [PATCH v3 3/5] pcapng: modify timestamp calculation Stephen Hemminger
2023-11-09  7:57     ` Morten Brørup
2023-11-08 18:35   ` [PATCH v3 4/5] pcapng: avoid using alloca() Stephen Hemminger
2023-11-09  8:21     ` Morten Brørup
2023-11-09 15:44       ` Stephen Hemminger
2023-11-09 16:25         ` Morten Brørup
2023-11-08 18:35   ` [PATCH v3 5/5] test: cleanups to pcapng test Stephen Hemminger
2023-11-09 17:34 ` [PATCH v4 0/5] dumpcap and pcapng fixes Stephen Hemminger
2023-11-09 17:34   ` [PATCH v4 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
2023-11-09 17:34   ` [PATCH v4 2/5] dumpcap: allow multiple invocations Stephen Hemminger
2023-11-09 18:30     ` Morten Brørup
2023-11-09 17:34   ` [PATCH v4 3/5] pcapng: modify timestamp calculation Stephen Hemminger
2023-11-09 17:34   ` [PATCH v4 4/5] pcapng: avoid using alloca() Stephen Hemminger
2023-11-09 17:34   ` [PATCH v4 5/5] test: cleanups to pcapng test Stephen Hemminger
2023-11-09 19:45 ` [PATCH v5 0/5] dumpcap and pcapng fixes Stephen Hemminger
2023-11-09 19:45   ` [PATCH v5 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
2023-11-09 19:45   ` [PATCH v5 2/5] dumpcap: allow multiple invocations Stephen Hemminger
2023-11-09 20:09     ` Morten Brørup
2023-11-09 19:45   ` [PATCH v5 3/5] pcapng: modify timestamp calculation Stephen Hemminger
2023-11-12 14:22     ` Thomas Monjalon
2023-11-09 19:45   ` [PATCH v5 4/5] pcapng: avoid using alloca() Stephen Hemminger
2023-11-09 19:45   ` [PATCH v5 5/5] test: cleanups to pcapng test Stephen Hemminger
2023-11-13 16:15 ` [PATCH v6 0/5] dumpcap and pcapng fixes Stephen Hemminger
2023-11-13 16:15   ` [PATCH v6 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
2023-11-13 16:15   ` [PATCH v6 2/5] dumpcap: allow multiple invocations Stephen Hemminger
2023-11-13 16:15   ` [PATCH v6 3/5] pcapng: modify timestamp calculation Stephen Hemminger
2023-11-13 16:15   ` [PATCH v6 4/5] pcapng: avoid using alloca() Stephen Hemminger
2023-11-13 16:15   ` [PATCH v6 5/5] test: cleanups to pcapng test Stephen Hemminger
2023-11-17 16:35 ` [PATCH v7 0/5] dumpcap and pcapng fixes Stephen Hemminger
2023-11-17 16:35   ` [PATCH v7 1/5] pdump: fix setting rte_errno on mp error Stephen Hemminger
2023-11-17 16:35   ` [PATCH v7 2/5] dumpcap: allow multiple invocations Stephen Hemminger
2023-11-17 16:35   ` [PATCH v7 3/5] pcapng: modify timestamp calculation Stephen Hemminger
2023-11-17 16:35   ` [PATCH v7 4/5] pcapng: avoid using alloca() Stephen Hemminger
2023-11-17 16:35   ` [PATCH v7 5/5] test: cleanups to pcapng test Stephen Hemminger
2023-11-22 22:42   ` [PATCH v7 0/5] dumpcap and pcapng fixes Thomas Monjalon
