DPDK patches and discussions
* [dpdk-dev] [PATCH 00/15] remove mbuf timestamp
@ 2020-10-29  9:27 Thomas Monjalon
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 01/15] eventdev: remove software Rx timestamp Thomas Monjalon
                   ` (18 more replies)
  0 siblings, 19 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The mbuf field timestamp was announced to be removed for three reasons:
  - a dynamic field already exists, used for Tx only
  - this field always used 8 bytes even if unneeded
  - this field is in the first half (cacheline) of mbuf

After this series the dynamic field timestamp is used for both Rx and Tx
with separate dynamic flags to distinguish when the value is meaningful
without resetting the field during forwarding.
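
To illustrate the point about separate flags, here is a minimal,
self-contained C sketch. It is a toy model, not DPDK code: struct toy_mbuf,
the toy_* helpers, the byte offset and the two flag bit positions are all
assumptions made up for this example. In DPDK the offset and the flag bits
are obtained at runtime through the dynamic mbuf registration and lookup
functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for an mbuf: NOT the real struct rte_mbuf layout. */
struct toy_mbuf {
	uint64_t ol_flags;    /* offload flags, one bit per flag */
	uint8_t dyn_area[16]; /* space handed out to dynamic fields */
};

/* Values a registration step would return; fixed here for illustration. */
static const int toy_ts_offset = 8; /* byte offset inside dyn_area */
static const uint64_t toy_rx_ts_flag = UINT64_C(1) << 40;
static const uint64_t toy_tx_ts_flag = UINT64_C(1) << 41;

/* Read the shared 64-bit timestamp slot. */
static inline uint64_t
toy_ts_get(const struct toy_mbuf *m)
{
	uint64_t ts;

	memcpy(&ts, m->dyn_area + toy_ts_offset, sizeof(ts));
	return ts;
}

/* Write the shared 64-bit timestamp slot. */
static inline void
toy_ts_set(struct toy_mbuf *m, uint64_t ts)
{
	assert((size_t)toy_ts_offset + sizeof(ts) <= sizeof(m->dyn_area));
	memcpy(m->dyn_area + toy_ts_offset, &ts, sizeof(ts));
}

/* Rx path: the driver fills the field and marks it valid for Rx only. */
static inline void
toy_rx_stamp(struct toy_mbuf *m, uint64_t hw_time)
{
	toy_ts_set(m, hw_time);
	m->ol_flags |= toy_rx_ts_flag;
}

/* Tx path: only the Tx flag requests scheduled sending. */
static inline bool
toy_tx_wants_scheduling(const struct toy_mbuf *m)
{
	return (m->ol_flags & toy_tx_ts_flag) != 0;
}
```

In this model a packet stamped on Rx and forwarded unchanged is not
scheduled on Tx, because only the Rx flag is set. The forwarding path does
not need to clear the field.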

As a consequence, 8 bytes can be re-allocated to dynamic fields,
and bonus...
the mempool pointer can be promoted to the first half of mbuf!

This mbuf layout change is important to allow adding more features
(consuming more dynamic fields) during the next year
and can bring some performance improvement.


Thomas Monjalon (15):
  eventdev: remove software Rx timestamp
  mbuf: add Rx timestamp dynamic flag
  ethdev: register mbuf field and flags for timestamp
  latency: switch timestamp to dynamic mbuf field
  net/ark: switch timestamp to dynamic mbuf field
  net/dpaa2: switch timestamp to dynamic mbuf field
  net/mlx5: fix dynamic mbuf offset lookup check
  net/mlx5: switch timestamp to dynamic mbuf field
  net/nfb: switch timestamp to dynamic mbuf field
  net/octeontx2: switch timestamp to dynamic mbuf field
  net/pcap: switch timestamp to dynamic mbuf field
  app/testpmd: switch timestamp to dynamic mbuf field
  examples/rxtx_callbacks: switch timestamp to dynamic field
  mbuf: remove deprecated timestamp field
  mbuf: move pool pointer in hotter first half

 app/test-pmd/config.c                         | 38 ----------
 app/test-pmd/util.c                           | 39 ++++++++++-
 app/test/test_mbuf.c                          |  1 -
 doc/guides/nics/mlx5.rst                      |  5 +-
 .../prog_guide/event_ethernet_rx_adapter.rst  |  6 +-
 doc/guides/rel_notes/deprecation.rst          |  6 --
 doc/guides/rel_notes/release_20_11.rst        |  4 ++
 drivers/net/ark/ark_ethdev.c                  | 23 ++++++
 drivers/net/ark/ark_ethdev_rx.c               | 10 ++-
 drivers/net/dpaa2/dpaa2_ethdev.c              | 20 ++++++
 drivers/net/dpaa2/dpaa2_ethdev.h              |  2 +
 drivers/net/dpaa2/dpaa2_rxtx.c                | 25 +++++--
 drivers/net/mlx5/mlx5_rxq.c                   | 36 ++++++++++
 drivers/net/mlx5/mlx5_rxtx.c                  |  8 +--
 drivers/net/mlx5/mlx5_rxtx.h                  | 19 +++++
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h      | 41 +++++------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h         | 43 ++++++------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h          | 35 +++++-----
 drivers/net/mlx5/mlx5_trigger.c               |  2 +-
 drivers/net/mlx5/mlx5_txq.c                   |  2 +-
 drivers/net/nfb/nfb_rx.c                      | 23 +++++-
 drivers/net/nfb/nfb_rx.h                      | 18 +++--
 drivers/net/octeontx2/otx2_ethdev.c           | 33 +++++++++
 drivers/net/octeontx2/otx2_rx.h               | 19 ++++-
 drivers/net/octeontx2/version.map             |  7 ++
 drivers/net/pcap/rte_eth_pcap.c               | 29 +++++++-
 examples/rxtx_callbacks/main.c                | 12 +++-
 lib/librte_ethdev/rte_ethdev.c                | 70 +++++++++++++++++++
 lib/librte_ethdev/rte_ethdev.h                | 13 +++-
 .../rte_event_eth_rx_adapter.c                | 11 ---
 .../rte_event_eth_rx_adapter.h                |  6 +-
 lib/librte_kni/rte_kni_common.h               |  3 +-
 lib/librte_latencystats/rte_latencystats.c    | 48 +++++++++++--
 lib/librte_mbuf/rte_mbuf.c                    |  2 -
 lib/librte_mbuf/rte_mbuf.h                    |  1 -
 lib/librte_mbuf/rte_mbuf_core.h               | 15 +---
 lib/librte_mbuf/rte_mbuf_dyn.h                | 11 +--
 37 files changed, 502 insertions(+), 184 deletions(-)

-- 
2.28.0



* [dpdk-dev] [PATCH 01/15] eventdev: remove software Rx timestamp
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 02/15] mbuf: add Rx timestamp dynamic flag Thomas Monjalon
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Nikhil Rao

This is a revert of commit 569758758dcd ("eventdev: add Rx timestamp").
If the Rx timestamp is not configured on the ethdev port,
there is no reason to set one.
Also, the accuracy of the timestamp was bad because it was set at a late stage.
In any case, there is no trace of any usage of this timestamp.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 doc/guides/prog_guide/event_ethernet_rx_adapter.rst |  6 +-----
 lib/librte_eventdev/rte_event_eth_rx_adapter.c      | 11 -----------
 lib/librte_eventdev/rte_event_eth_rx_adapter.h      |  6 +-----
 3 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
index 236f43f455..cb44ce0e47 100644
--- a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
+++ b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
@@ -12,11 +12,7 @@ be supported in hardware or require a software thread to receive packets from
 the ethdev port using ethdev poll mode APIs and enqueue these as events to the
 event device using the eventdev API. Both transfer mechanisms may be present on
 the same platform depending on the particular combination of the ethdev and
-the event device. For SW based packet transfer, if the mbuf does not have a
-timestamp set, the adapter adds a timestamp to the mbuf using
-rte_get_tsc_cycles(), this provides a more accurate timestamp as compared to
-if the application were to set the timestamp since it avoids event device
-schedule latency.
+the event device.
 
 The Event Ethernet Rx Adapter library is intended for the application code to
 configure both transfer mechanisms using a common API. A capability API allows
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.c b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
index f0000d1ede..3c73046551 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.c
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
@@ -763,7 +763,6 @@ rxa_buffer_mbufs(struct rte_event_eth_rx_adapter *rx_adapter,
 	uint32_t rss_mask;
 	uint32_t rss;
 	int do_rss;
-	uint64_t ts;
 	uint16_t nb_cb;
 	uint16_t dropped;
 
@@ -771,16 +770,6 @@ rxa_buffer_mbufs(struct rte_event_eth_rx_adapter *rx_adapter,
 	rss_mask = ~(((m->ol_flags & PKT_RX_RSS_HASH) != 0) - 1);
 	do_rss = !rss_mask && !eth_rx_queue_info->flow_id_mask;
 
-	if ((m->ol_flags & PKT_RX_TIMESTAMP) == 0) {
-		ts = rte_get_tsc_cycles();
-		for (i = 0; i < num; i++) {
-			m = mbufs[i];
-
-			m->timestamp = ts;
-			m->ol_flags |= PKT_RX_TIMESTAMP;
-		}
-	}
-
 	for (i = 0; i < num; i++) {
 		m = mbufs[i];
 
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.h b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
index 2dd259c279..21bb1e54c8 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.h
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
@@ -21,11 +21,7 @@
  *
  * The adapter uses a EAL service core function for SW based packet transfer
  * and uses the eventdev PMD functions to configure HW based packet transfer
- * between the ethernet device and the event device. For SW based packet
- * transfer, if the mbuf does not have a timestamp set, the adapter adds a
- * timestamp to the mbuf using rte_get_tsc_cycles(), this provides a more
- * accurate timestamp as compared to if the application were to set the time
- * stamp since it avoids event device schedule latency.
+ * between the ethernet device and the event device.
  *
  * The ethernet Rx event adapter's functions are:
  *  - rte_event_eth_rx_adapter_create_ext()
-- 
2.28.0



* [dpdk-dev] [PATCH 02/15] mbuf: add Rx timestamp dynamic flag
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 01/15] eventdev: remove software Rx timestamp Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29  9:58   ` Andrew Rybchenko
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 03/15] ethdev: register mbuf field and flags for timestamp Thomas Monjalon
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

There is already a dynamic field for timestamp,
used only for Tx scheduling, thanks to the dedicated flag.
The same field can be used for Rx timestamp filled by drivers.
The only missing part to get rid of the static timestamp field
was to declare a new dynamic flag for Rx usage.

After migrating all Rx timestamp usages, it will be possible
to remove the deprecated timestamp field.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_mbuf/rte_mbuf_dyn.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 0ebac88b83..5fb85c0610 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -258,14 +258,14 @@ void rte_mbuf_dyn_dump(FILE *out);
  * timestamp. The dynamic Tx timestamp flag tells whether the field contains
  * actual timestamp value for the packets being sent, this value can be
  * used by PMD to schedule packet sending.
- *
- * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
- * and obsoleting, the dedicated Rx timestamp flag is supposed to be
- * introduced and the shared dynamic timestamp field will be used
- * to handle the timestamps on receiving datapath as well.
  */
 #define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
 
+/**
+ * Indicate that the timestamp field in the mbuf was filled by the driver.
+ */
+#define RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME "rte_dynflag_rx_timestamp"
+
 /**
  * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
  * packet being sent it tries to synchronize the time of packet appearing
-- 
2.28.0



* [dpdk-dev] [PATCH 03/15] ethdev: register mbuf field and flags for timestamp
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 01/15] eventdev: remove software Rx timestamp Thomas Monjalon
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 02/15] mbuf: add Rx timestamp dynamic flag Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29 10:08   ` Andrew Rybchenko
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 04/15] latency: switch timestamp to dynamic mbuf field Thomas Monjalon
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Matan Azrad, Shahaf Shuler

During port configure or queue setup, the offload flags
DEV_RX_OFFLOAD_TIMESTAMP and DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP
trigger the registration of the related mbuf field and flags.

Previously, the Tx timestamp field and flag were registered in testpmd,
as described in the mlx5 guide.
For the general usage of Rx and Tx timestamps,
managing registrations inside ethdev is simpler and properly documented.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
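
Note: the register-once logic described above can be modelled with a small
self-contained C sketch. This is a toy, not the ethdev code: the registry,
the toy_* names and the two offload bit values are hypothetical and only
mimic the shape of the helper added by this patch. It assumes, as the patch
does, that registering an existing name again is harmless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_RX_OFFLOAD_TIMESTAMP  UINT64_C(0x1) /* hypothetical bit */
#define TOY_TX_OFFLOAD_SEND_ON_TS UINT64_C(0x2) /* hypothetical bit */
#define TOY_MAX_NAMES 8

/* Toy registry: a name registered twice keeps its first index. */
static const char *toy_names[TOY_MAX_NAMES];
static int toy_nb_names;

static int
toy_register(const char *name)
{
	int i;

	assert(name != NULL);
	for (i = 0; i < toy_nb_names; i++)
		if (strcmp(toy_names[i], name) == 0)
			return i; /* already registered */
	if (toy_nb_names == TOY_MAX_NAMES)
		return -1; /* registry full */
	toy_names[toy_nb_names] = name;
	return toy_nb_names++;
}

/* Meant to be called from configure and from each queue setup. */
static int
toy_timestamp_register(uint64_t rx_offloads, uint64_t tx_offloads)
{
	static bool done_rx, done_tx;
	bool todo_rx = (rx_offloads & TOY_RX_OFFLOAD_TIMESTAMP) != 0 &&
			!done_rx;
	bool todo_tx = (tx_offloads & TOY_TX_OFFLOAD_SEND_ON_TS) != 0 &&
			!done_tx;

	/* The field is shared by Rx and Tx, so it is requested by either. */
	if ((todo_rx || todo_tx) && toy_register("timestamp_field") < 0)
		return -1;
	if (todo_rx) {
		if (toy_register("rx_timestamp_flag") < 0)
			return -1;
		done_rx = true;
	}
	if (todo_tx) {
		if (toy_register("tx_timestamp_flag") < 0)
			return -1;
		done_tx = true;
	}
	return 0;
}
```

Calling toy_timestamp_register() repeatedly with the same offloads leaves
the registry unchanged after the first call, which is why the helper can be
invoked from several setup paths.
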
 app/test-pmd/config.c          | 38 ------------------
 doc/guides/nics/mlx5.rst       |  5 +--
 lib/librte_ethdev/rte_ethdev.c | 70 ++++++++++++++++++++++++++++++++++
 lib/librte_ethdev/rte_ethdev.h |  9 ++++-
 lib/librte_mbuf/rte_mbuf_dyn.h |  1 +
 5 files changed, 81 insertions(+), 42 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 1668ae3238..9a2baf16fe 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -3955,44 +3955,6 @@ show_tx_pkt_times(void)
 void
 set_tx_pkt_times(unsigned int *tx_times)
 {
-	uint16_t port_id;
-	int offload_found = 0;
-	int offset;
-	int flag;
-
-	static const struct rte_mbuf_dynfield desc_offs = {
-		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
-		.size = sizeof(uint64_t),
-		.align = __alignof__(uint64_t),
-	};
-	static const struct rte_mbuf_dynflag desc_flag = {
-		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
-	};
-
-	RTE_ETH_FOREACH_DEV(port_id) {
-		struct rte_eth_dev_info dev_info = { 0 };
-		int ret;
-
-		ret = rte_eth_dev_info_get(port_id, &dev_info);
-		if (ret == 0 && dev_info.tx_offload_capa &
-				DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) {
-			offload_found = 1;
-			break;
-		}
-	}
-	if (!offload_found) {
-		printf("No device supporting Tx timestamp scheduling found, "
-		       "dynamic flag and field not registered\n");
-		return;
-	}
-	offset = rte_mbuf_dynfield_register(&desc_offs);
-	if (offset < 0 && rte_errno != EEXIST)
-		printf("Dynamic timestamp field registration error: %d",
-		       rte_errno);
-	flag = rte_mbuf_dynflag_register(&desc_flag);
-	if (flag < 0 && rte_errno != EEXIST)
-		printf("Dynamic timestamp flag registration error: %d",
-		       rte_errno);
 	tx_pkt_times_inter = tx_times[0];
 	tx_pkt_times_intra = tx_times[1];
 }
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index afa65a1379..fa8b13dd1b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -237,9 +237,8 @@ Limitations
   ``txq_inline_max`` and ``txq_inline_mpw`` devargs keys.
 
 - To provide the packet send scheduling on mbuf timestamps the ``tx_pp``
-  parameter should be specified, RTE_MBUF_DYNFIELD_TIMESTAMP_NAME and
-  RTE_MBUF_DYNFLAG_TIMESTAMP_NAME should be registered by application.
-  When PMD sees the RTE_MBUF_DYNFLAG_TIMESTAMP_NAME set on the packet
+  parameter should be specified.
+  When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME set on the packet
   being sent it tries to synchronize the time of packet appearing on
   the wire with the specified packet timestamp. It the specified one
   is in the past it should be ignored, if one is in the distant future
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index b12bb3854d..7c9aadb461 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -31,6 +31,7 @@
 #include <rte_mempool.h>
 #include <rte_malloc.h>
 #include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include <rte_errno.h>
 #include <rte_spinlock.h>
 #include <rte_string_fns.h>
@@ -1232,6 +1233,59 @@ eth_dev_check_lro_pkt_size(uint16_t port_id, uint32_t config_size,
 	return ret;
 }
 
+static inline int
+eth_dev_timestamp_mbuf_register(uint64_t rx_offloads, uint64_t tx_offloads)
+{
+	static const struct rte_mbuf_dynfield field_desc = {
+		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
+		.size = sizeof(rte_mbuf_timestamp_t),
+		.align = __alignof__(rte_mbuf_timestamp_t),
+	};
+	static const struct rte_mbuf_dynflag rx_flag_desc = {
+		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
+	};
+	static const struct rte_mbuf_dynflag tx_flag_desc = {
+		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
+	};
+	static bool done_rx, done_tx;
+	bool todo_rx, todo_tx;
+	int offset;
+
+	todo_rx = (rx_offloads & DEV_RX_OFFLOAD_TIMESTAMP) != 0
+		&& !done_rx;
+	todo_tx = (tx_offloads & DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) != 0
+		&& !done_tx;
+
+	if (todo_rx || todo_tx) {
+		offset = rte_mbuf_dynfield_register(&field_desc);
+		if (offset < 0) {
+			RTE_ETHDEV_LOG(ERR,
+					"Failed to register mbuf field for timestamp\n");
+			return -rte_errno;
+		}
+	}
+	if (todo_rx) {
+		offset = rte_mbuf_dynflag_register(&rx_flag_desc);
+		if (offset < 0) {
+			RTE_ETHDEV_LOG(ERR,
+					"Failed to register mbuf flag for Rx timestamp\n");
+			return -rte_errno;
+		}
+		done_rx = true;
+	}
+	if (todo_tx) {
+		offset = rte_mbuf_dynflag_register(&tx_flag_desc);
+		if (offset < 0) {
+			RTE_ETHDEV_LOG(ERR,
+					"Failed to register mbuf flag for Tx timestamp\n");
+			return -rte_errno;
+		}
+		done_tx = true;
+	}
+
+	return 0;
+}
+
 /*
  * Validate offloads that are requested through rte_eth_dev_configure against
  * the offloads successfully set by the ethernet device.
@@ -1481,6 +1535,12 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		goto rollback;
 	}
 
+	/* Register mbuf field and flags for timestamp offloads if enabled. */
+	ret = eth_dev_timestamp_mbuf_register(dev_conf->rxmode.offloads,
+			dev_conf->txmode.offloads);
+	if (ret != 0)
+		goto rollback;
+
 	/*
 	 * Setup new number of RX/TX queues and reconfigure device.
 	 */
@@ -2088,6 +2148,11 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 			return ret;
 	}
 
+	/* Register mbuf field and flag for Rx timestamp offload if enabled. */
+	ret = eth_dev_timestamp_mbuf_register(local_conf.offloads, 0);
+	if (ret != 0)
+		return ret;
+
 	ret = (*dev->dev_ops->rx_queue_setup)(dev, rx_queue_id, nb_rx_desc,
 					      socket_id, &local_conf, mp);
 	if (!ret) {
@@ -2268,6 +2333,11 @@ rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
 		return -EINVAL;
 	}
 
+	/* Register mbuf field and flag for Tx timestamp offload if enabled. */
+	ret = eth_dev_timestamp_mbuf_register(0, local_conf.offloads);
+	if (ret != 0)
+		return ret;
+
 	rte_ethdev_trace_txq_setup(port_id, tx_queue_id, nb_tx_desc, tx_conf);
 	return eth_err(port_id, (*dev->dev_ops->tx_queue_setup)(dev,
 		       tx_queue_id, nb_tx_desc, socket_id, &local_conf));
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index ba997f16ce..3be0050592 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1344,6 +1344,9 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_VLAN_EXTEND	0x00000400
 #define DEV_RX_OFFLOAD_JUMBO_FRAME	0x00000800
 #define DEV_RX_OFFLOAD_SCATTER		0x00002000
+/**
+ * The mbuf field and flag are registered when the offload is configured.
+ */
 #define DEV_RX_OFFLOAD_TIMESTAMP	0x00004000
 #define DEV_RX_OFFLOAD_SECURITY         0x00008000
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
@@ -1408,7 +1411,11 @@ struct rte_eth_conf {
 #define DEV_TX_OFFLOAD_IP_TNL_TSO       0x00080000
 /** Device supports outer UDP checksum */
 #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
-/** Device supports send on timestamp */
+/**
+ * Device sends on time read from RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+ * if RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME is set in ol_flags.
+ * The mbuf field and flag are registered when the offload is configured.
+ */
 #define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
 /*
  * If new Tx offload capabilities are defined, they also must be
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 5fb85c0610..d4d8f66f77 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -260,6 +260,7 @@ void rte_mbuf_dyn_dump(FILE *out);
  * used by PMD to schedule packet sending.
  */
 #define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+typedef uint64_t rte_mbuf_timestamp_t;
 
 /**
  * Indicate that the timestamp field in the mbuf was filled by the driver.
-- 
2.28.0



* [dpdk-dev] [PATCH 04/15] latency: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (2 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 03/15] ethdev: register mbuf field and flags for timestamp Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29 10:13   ` Andrew Rybchenko
  2020-10-29 14:20   ` Pattan, Reshma
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 05/15] net/ark: " Thomas Monjalon
                   ` (14 subsequent siblings)
  18 siblings, 2 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Reshma Pattan

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced with the dynamic one.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
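
Note: a simplified, self-contained C model of the stamp-then-measure flow
touched by this patch. It is only a sketch: struct toy_pkt, the toy_*
functions and the flag bit are assumptions for illustration, the timestamp
is a plain struct member instead of a dynamic field, and the sampling
interval check of the real callback is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy packet: the timestamp member stands in for the dynamic field. */
struct toy_pkt {
	uint64_t ol_flags;
	uint64_t timestamp;
};

/* Hypothetical flag bit; the real one comes from flag registration. */
static const uint64_t toy_ts_flag = UINT64_C(1) << 40;

/* Rx side: stamp only packets that do not carry a timestamp yet. */
static void
toy_add_time_stamp(struct toy_pkt *p, uint64_t now)
{
	if ((p->ol_flags & toy_ts_flag) == 0) {
		p->timestamp = now;
		p->ol_flags |= toy_ts_flag;
	}
}

/* Tx side: collect the latency of every flagged packet, return the count. */
static size_t
toy_calc_latency(const struct toy_pkt *pkts, size_t n, uint64_t now,
		uint64_t *latency)
{
	size_t i, cnt = 0;

	assert(latency != NULL);
	for (i = 0; i < n; i++)
		if (pkts[i].ol_flags & toy_ts_flag)
			latency[cnt++] = now - pkts[i].timestamp;
	return cnt;
}
```

Packets without the flag are skipped on the Tx side, so the flag is what
tells the consumer whether the field content is meaningful.
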
 lib/librte_latencystats/rte_latencystats.c | 48 +++++++++++++++++++---
 1 file changed, 43 insertions(+), 5 deletions(-)

diff --git a/lib/librte_latencystats/rte_latencystats.c b/lib/librte_latencystats/rte_latencystats.c
index ba2fff3bcb..a21f6239d9 100644
--- a/lib/librte_latencystats/rte_latencystats.c
+++ b/lib/librte_latencystats/rte_latencystats.c
@@ -8,7 +8,9 @@
 #include <math.h>
 
 #include <rte_string_fns.h>
+#include <rte_bitops.h>
 #include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include <rte_log.h>
 #include <rte_cycles.h>
 #include <rte_ethdev.h>
@@ -31,6 +33,16 @@ latencystat_cycles_per_ns(void)
 /* Macros for printing using RTE_LOG */
 #define RTE_LOGTYPE_LATENCY_STATS RTE_LOGTYPE_USER1
 
+static uint64_t timestamp_dynflag;
+static int timestamp_dynfield_offset = -1;
+
+static inline rte_mbuf_timestamp_t *
+timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static const char *MZ_RTE_LATENCY_STATS = "rte_latencystats";
 static int latency_stats_index;
 static uint64_t samp_intvl;
@@ -128,10 +140,10 @@ add_time_stamps(uint16_t pid __rte_unused,
 		diff_tsc = now - prev_tsc;
 		timer_tsc += diff_tsc;
 
-		if ((pkts[i]->ol_flags & PKT_RX_TIMESTAMP) == 0
+		if ((pkts[i]->ol_flags & timestamp_dynflag) == 0
 				&& (timer_tsc >= samp_intvl)) {
-			pkts[i]->timestamp = now;
-			pkts[i]->ol_flags |= PKT_RX_TIMESTAMP;
+			*timestamp_dynfield(pkts[i]) = now;
+			pkts[i]->ol_flags |= timestamp_dynflag;
 			timer_tsc = 0;
 		}
 		prev_tsc = now;
@@ -161,8 +173,8 @@ calc_latency(uint16_t pid __rte_unused,
 
 	now = rte_rdtsc();
 	for (i = 0; i < nb_pkts; i++) {
-		if (pkts[i]->ol_flags & PKT_RX_TIMESTAMP)
-			latency[cnt++] = now - pkts[i]->timestamp;
+		if (pkts[i]->ol_flags & timestamp_dynflag)
+			latency[cnt++] = now - *timestamp_dynfield(pkts[i]);
 	}
 
 	rte_spinlock_lock(&glob_stats->lock);
@@ -204,6 +216,14 @@ int
 rte_latencystats_init(uint64_t app_samp_intvl,
 		rte_latency_stats_flow_type_fn user_cb)
 {
+	static const struct rte_mbuf_dynfield timestamp_dynfield_desc = {
+		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
+		.size = sizeof(rte_mbuf_timestamp_t),
+		.align = __alignof__(rte_mbuf_timestamp_t),
+	};
+	static const struct rte_mbuf_dynflag timestamp_dynflag_desc = {
+		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
+	};
 	unsigned int i;
 	uint16_t pid;
 	uint16_t qid;
@@ -211,6 +231,7 @@ rte_latencystats_init(uint64_t app_samp_intvl,
 	const char *ptr_strings[NUM_LATENCY_STATS] = {0};
 	const struct rte_memzone *mz = NULL;
 	const unsigned int flags = 0;
+	int timestamp_dynflag_offset;
 	int ret;
 
 	if (rte_memzone_lookup(MZ_RTE_LATENCY_STATS))
@@ -241,6 +262,23 @@ rte_latencystats_init(uint64_t app_samp_intvl,
 		return -1;
 	}
 
+	/* Register mbuf field and flag for Rx timestamp */
+	timestamp_dynfield_offset =
+			rte_mbuf_dynfield_register(&timestamp_dynfield_desc);
+	if (timestamp_dynfield_offset < 0) {
+		RTE_LOG(ERR, LATENCY_STATS,
+				"Cannot register mbuf field for timestamp\n");
+		return -rte_errno;
+	}
+	timestamp_dynflag_offset =
+			rte_mbuf_dynflag_register(&timestamp_dynflag_desc);
+	if (timestamp_dynflag_offset < 0) {
+		RTE_LOG(ERR, LATENCY_STATS,
+				"Cannot register mbuf flag for Rx timestamp\n");
+		return -rte_errno;
+	}
+	timestamp_dynflag = RTE_BIT64(timestamp_dynflag_offset);
+
 	/** Register Rx/Tx callbacks */
 	RTE_ETH_FOREACH_DEV(pid) {
 		struct rte_eth_dev_info dev_info;
-- 
2.28.0



* [dpdk-dev] [PATCH 05/15] net/ark: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (3 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 04/15] latency: switch timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 06/15] net/dpaa2: " Thomas Monjalon
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Shepard Siegel, Ed Czeck,
	John Miller

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related dynamic mbuf flag is now set, although it was missing previously.

The timestamp is set if configured for at least one device.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/ark/ark_ethdev.c    | 23 +++++++++++++++++++++++
 drivers/net/ark/ark_ethdev_rx.c | 10 +++++++++-
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
index fa343999a1..629f825019 100644
--- a/drivers/net/ark/ark_ethdev.c
+++ b/drivers/net/ark/ark_ethdev.c
@@ -9,6 +9,7 @@
 #include <rte_bus_pci.h>
 #include <rte_ethdev_pci.h>
 #include <rte_kvargs.h>
+#include <rte_bitops.h>
 
 #include "rte_pmd_ark.h"
 #include "ark_global.h"
@@ -79,6 +80,8 @@ static int  eth_ark_set_mtu(struct rte_eth_dev *dev, uint16_t size);
 #define ARK_TX_MAX_QUEUE (4096 * 4)
 #define ARK_TX_MIN_QUEUE (256)
 
+uint64_t ark_timestamp_rx_dynflag;
+int ark_timestamp_dynfield_offset = -1;
 int rte_pmd_ark_rx_userdata_dynfield_offset = -1;
 int rte_pmd_ark_tx_userdata_dynfield_offset = -1;
 
@@ -552,6 +555,24 @@ static int
 eth_ark_dev_configure(struct rte_eth_dev *dev)
 {
 	struct ark_adapter *ark = dev->data->dev_private;
+	int ark_timestamp_rx_dynflag_offset;
+
+	if (dev->data->dev_conf.rxmode.offloads & DEV_RX_OFFLOAD_TIMESTAMP) {
+		ark_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+		if (ark_timestamp_dynfield_offset < 0) {
+			ARK_PMD_LOG(ERR, "Failed to lookup timestamp field\n");
+			return -rte_errno;
+		}
+		ark_timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+		if (ark_timestamp_rx_dynflag_offset < 0) {
+			ARK_PMD_LOG(ERR, "Failed to lookup Rx timestamp flag\n");
+			return -rte_errno;
+		}
+		ark_timestamp_rx_dynflag =
+				RTE_BIT64(ark_timestamp_rx_dynflag_offset);
+	}
 
 	eth_ark_dev_set_link_up(dev);
 	if (ark->user_ext.dev_configure)
@@ -782,6 +803,8 @@ eth_ark_dev_info_get(struct rte_eth_dev *dev,
 				ETH_LINK_SPEED_50G |
 				ETH_LINK_SPEED_100G);
 
+	dev_info->rx_offload_capa = DEV_RX_OFFLOAD_TIMESTAMP;
+
 	return 0;
 }
 
diff --git a/drivers/net/ark/ark_ethdev_rx.c b/drivers/net/ark/ark_ethdev_rx.c
index 825b4791be..dda796da0d 100644
--- a/drivers/net/ark/ark_ethdev_rx.c
+++ b/drivers/net/ark/ark_ethdev_rx.c
@@ -15,6 +15,9 @@
 #define ARK_RX_META_OFFSET (RTE_PKTMBUF_HEADROOM - ARK_RX_META_SIZE)
 #define ARK_RX_MAX_NOCHAIN (RTE_MBUF_DEFAULT_DATAROOM)
 
+extern uint64_t ark_timestamp_rx_dynflag;
+extern int ark_timestamp_dynfield_offset;
+
 /* Forward declarations */
 struct ark_rx_queue;
 struct ark_rx_meta;
@@ -272,7 +275,12 @@ eth_ark_recv_pkts(void *rx_queue,
 		mbuf->port = meta->port;
 		mbuf->pkt_len = meta->pkt_len;
 		mbuf->data_len = meta->pkt_len;
-		mbuf->timestamp = meta->timestamp;
+		/* set timestamp if enabled at least on one device */
+		if (ark_timestamp_rx_dynflag > 0) {
+			*RTE_MBUF_DYNFIELD(mbuf, ark_timestamp_dynfield_offset,
+				rte_mbuf_timestamp_t *) = meta->timestamp;
+			mbuf->ol_flags |= ark_timestamp_rx_dynflag;
+		}
 		rte_pmd_ark_mbuf_rx_userdata_set(mbuf, meta->user_data);
 
 		if (ARK_DEBUG_CORE) {	/* debug sanity checks */
-- 
2.28.0



* [dpdk-dev] [PATCH 06/15] net/dpaa2: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (4 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 05/15] net/ark: " Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 07/15] net/mlx5: fix dynamic mbuf offset lookup check Thomas Monjalon
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Hemant Agrawal,
	Sachin Saxena

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/dpaa2/dpaa2_ethdev.c | 20 ++++++++++++++++++++
 drivers/net/dpaa2/dpaa2_ethdev.h |  2 ++
 drivers/net/dpaa2/dpaa2_rxtx.c   | 25 ++++++++++++++++++-------
 3 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 04e60c56f2..ff368174bd 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -8,6 +8,7 @@
 #include <time.h>
 #include <net/if.h>
 
+#include <rte_bitops.h>
 #include <rte_mbuf.h>
 #include <rte_ethdev_driver.h>
 #include <rte_malloc.h>
@@ -65,6 +66,8 @@ static uint64_t dev_tx_offloads_nodis =
 
 /* enable timestamp in mbuf */
 bool dpaa2_enable_ts[RTE_MAX_ETHPORTS];
+uint64_t dpaa2_timestamp_rx_dynflag;
+int dpaa2_timestamp_dynfield_offset = -1;
 
 struct rte_dpaa2_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -505,6 +508,7 @@ dpaa2_eth_dev_configure(struct rte_eth_dev *dev)
 	struct rte_eth_conf *eth_conf = &dev->data->dev_conf;
 	uint64_t rx_offloads = eth_conf->rxmode.offloads;
 	uint64_t tx_offloads = eth_conf->txmode.offloads;
+	int timestamp_rx_dynflag_offset;
 	int rx_l3_csum_offload = false;
 	int rx_l4_csum_offload = false;
 	int tx_l3_csum_offload = false;
@@ -587,7 +591,23 @@ dpaa2_eth_dev_configure(struct rte_eth_dev *dev)
 #if !defined(RTE_LIBRTE_IEEE1588)
 	if (rx_offloads & DEV_RX_OFFLOAD_TIMESTAMP)
 #endif
+	{
+		dpaa2_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+		if (dpaa2_timestamp_dynfield_offset < 0) {
+			DPAA2_PMD_ERR("Error to lookup timestamp field");
+			return -rte_errno;
+		}
+		timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+		if (timestamp_rx_dynflag_offset < 0) {
+			DPAA2_PMD_ERR("Error to lookup Rx timestamp flag");
+			return -rte_errno;
+		}
+		dpaa2_timestamp_rx_dynflag =
+				RTE_BIT64(timestamp_rx_dynflag_offset);
 		dpaa2_enable_ts[dev->data->port_id] = true;
+	}
 
 	if (tx_offloads & DEV_TX_OFFLOAD_IPV4_CKSUM)
 		tx_l3_csum_offload = true;
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.h b/drivers/net/dpaa2/dpaa2_ethdev.h
index 94cf253827..8d82f74684 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.h
+++ b/drivers/net/dpaa2/dpaa2_ethdev.h
@@ -92,6 +92,8 @@
 
 /* enable timestamp in mbuf*/
 extern bool dpaa2_enable_ts[];
+extern uint64_t dpaa2_timestamp_rx_dynflag;
+extern int dpaa2_timestamp_dynfield_offset;
 
 #define DPAA2_QOS_TABLE_RECONFIGURE	1
 #define DPAA2_FS_TABLE_RECONFIGURE	2
diff --git a/drivers/net/dpaa2/dpaa2_rxtx.c b/drivers/net/dpaa2/dpaa2_rxtx.c
index 4dd1d5f578..cef70bfabe 100644
--- a/drivers/net/dpaa2/dpaa2_rxtx.c
+++ b/drivers/net/dpaa2/dpaa2_rxtx.c
@@ -31,6 +31,13 @@ dpaa2_dev_rx_parse_slow(struct rte_mbuf *mbuf,
 
 static void enable_tx_tstamp(struct qbman_fd *fd) __rte_unused;
 
+static inline rte_mbuf_timestamp_t *
+dpaa2_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		dpaa2_timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 #define DPAA2_MBUF_TO_CONTIG_FD(_mbuf, _fd, _bpid)  do { \
 	DPAA2_SET_FD_ADDR(_fd, DPAA2_MBUF_VADDR_TO_IOVA(_mbuf)); \
 	DPAA2_SET_FD_LEN(_fd, _mbuf->data_len); \
@@ -109,9 +116,10 @@ dpaa2_dev_rx_parse_new(struct rte_mbuf *m, const struct qbman_fd *fd,
 	m->ol_flags |= PKT_RX_RSS_HASH;
 
 	if (dpaa2_enable_ts[m->port]) {
-		m->timestamp = annotation->word2;
-		m->ol_flags |= PKT_RX_TIMESTAMP;
-		DPAA2_PMD_DP_DEBUG("pkt timestamp:0x%" PRIx64 "", m->timestamp);
+		*dpaa2_timestamp_dynfield(m) = annotation->word2;
+		m->ol_flags |= dpaa2_timestamp_rx_dynflag;
+		DPAA2_PMD_DP_DEBUG("pkt timestamp:0x%" PRIx64 "",
+				*dpaa2_timestamp_dynfield(m));
 	}
 
 	DPAA2_PMD_DP_DEBUG("HW frc = 0x%x\t packet type =0x%x "
@@ -223,9 +231,12 @@ dpaa2_dev_rx_parse(struct rte_mbuf *mbuf, void *hw_annot_addr)
 	else if (BIT_ISSET_AT_POS(annotation->word8, DPAA2_ETH_FAS_L4CE))
 		mbuf->ol_flags |= PKT_RX_L4_CKSUM_BAD;
 
-	mbuf->ol_flags |= PKT_RX_TIMESTAMP;
-	mbuf->timestamp = annotation->word2;
-	DPAA2_PMD_DP_DEBUG("pkt timestamp: 0x%" PRIx64 "", mbuf->timestamp);
+	if (dpaa2_enable_ts[mbuf->port]) {
+		*dpaa2_timestamp_dynfield(mbuf) = annotation->word2;
+		mbuf->ol_flags |= dpaa2_timestamp_rx_dynflag;
+		DPAA2_PMD_DP_DEBUG("pkt timestamp: 0x%" PRIx64 "",
+				*dpaa2_timestamp_dynfield(mbuf));
+	}
 
 	/* Check detailed parsing requirement */
 	if (annotation->word3 & 0x7FFFFC3FFFF)
@@ -629,7 +640,7 @@ dpaa2_dev_prefetch_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		else
 			bufs[num_rx] = eth_fd_to_mbuf(fd, eth_data->port_id);
 #if defined(RTE_LIBRTE_IEEE1588)
-		priv->rx_timestamp = bufs[num_rx]->timestamp;
+		priv->rx_timestamp = *dpaa2_timestamp_dynfield(bufs[num_rx]);
 #endif
 
 		if (eth_data->dev_conf.rxmode.offloads &
-- 
2.28.0



* [dpdk-dev] [PATCH 07/15] net/mlx5: fix dynamic mbuf offset lookup check
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (5 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 06/15] net/dpaa2: " Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 08/15] net/mlx5: switch timestamp to dynamic mbuf field Thomas Monjalon
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, stable, Matan Azrad,
	Shahaf Shuler, Ori Kam

The functions rte_mbuf_dynfield_lookup() and rte_mbuf_dynflag_lookup()
can return an offset starting with 0 or a negative error code.

In reality the first offsets are probably reserved forever,
but for the sake of strict API compliance,
the checks which considered 0 as an error are fixed.

Fixes: efa79e68c8cd ("net/mlx5: support fine grain dynamic flag")
Fixes: 3172c471b86f ("net/mlx5: prepare Tx queue structures to support timestamp")
Fixes: 0febfcce3693 ("net/mlx5: prepare Tx to support scheduling")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/mlx5/mlx5_rxtx.c    | 4 ++--
 drivers/net/mlx5/mlx5_trigger.c | 2 +-
 drivers/net/mlx5/mlx5_txq.c     | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)
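
Note for readers of the archive: the effect of the corrected comparison
can be reproduced outside DPDK with a small stand-alone sketch. The names
demo_dynflag_lookup(), demo_mask_old() and demo_mask_fixed() are
hypothetical and only imitate the documented contract of the lookup
functions (a bit index starting at 0, or a negative value on error);
they are not part of DPDK or of the mlx5 driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for rte_mbuf_dynflag_lookup(): returns the bit
 * index of a registered flag (0 is a valid index) or -1 if unknown.
 */
static int
demo_dynflag_lookup(const char *name)
{
	static const char *const registered[] = { "demo_flag_a", "demo_flag_b" };
	int i;

	for (i = 0; i < 2; i++)
		if (strcmp(registered[i], name) == 0)
			return i;
	return -1;
}

/* Old style check: a flag registered at bit 0 is wrongly ignored. */
static uint64_t
demo_mask_old(const char *name)
{
	int nbit = demo_dynflag_lookup(name);

	return nbit > 0 ? UINT64_C(1) << nbit : 0;
}

/* Fixed check: only a negative return value means "not found". */
static uint64_t
demo_mask_fixed(const char *name)
{
	int nbit = demo_dynflag_lookup(name);

	assert(nbit < 64); /* a flag bit must fit in a 64-bit mask */
	return nbit >= 0 ? UINT64_C(1) << nbit : 0;
}
```

With this stand-in, a flag registered at index 0 yields an empty mask in
the old variant and the mask 0x1 in the fixed one; every other index
behaves the same in both.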

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index b530ff421f..e86468b67a 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -5661,9 +5661,9 @@ mlx5_select_tx_function(struct rte_eth_dev *dev)
 	}
 	if (tx_offloads & DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP &&
 	    rte_mbuf_dynflag_lookup
-			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) > 0 &&
+			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) >= 0 &&
 	    rte_mbuf_dynfield_lookup
-			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) > 0) {
+			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) >= 0) {
 		/* Offload configured, dynamic entities registered. */
 		olx |= MLX5_TXOFF_CONFIG_TXPP;
 	}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 7735f022a3..917b433c4a 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -302,7 +302,7 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	DRV_LOG(DEBUG, "port %u starting device", dev->data->port_id);
 	fine_inline = rte_mbuf_dynflag_lookup
 		(RTE_PMD_MLX5_FINE_GRANULARITY_INLINE, NULL);
-	if (fine_inline > 0)
+	if (fine_inline >= 0)
 		rte_net_mlx5_dynf_inline_mask = 1UL << fine_inline;
 	else
 		rte_net_mlx5_dynf_inline_mask = 0;
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index af84f5f72b..8ed2bcff7b 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -1305,7 +1305,7 @@ mlx5_txq_dynf_timestamp_set(struct rte_eth_dev *dev)
 				(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL);
 	off = rte_mbuf_dynfield_lookup
 				(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
-	if (nbit > 0 && off >= 0 && sh->txpp.refcnt)
+	if (nbit >= 0 && off >= 0 && sh->txpp.refcnt)
 		mask = 1ULL << nbit;
 	for (i = 0; i != priv->txqs_n; ++i) {
 		data = (*priv->txqs)[i];
-- 
2.28.0



* [dpdk-dev] [PATCH 08/15] net/mlx5: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (6 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 07/15] net/mlx5: fix dynamic mbuf offset lookup check Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 09/15] net/nfb: " Thomas Monjalon
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Matan Azrad,
	Shahaf Shuler, David Christensen, Ruifeng Wang,
	Konstantin Ananyev

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/mlx5/mlx5_rxq.c              | 36 ++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.c             |  4 +--
 drivers/net/mlx5/mlx5_rxtx.h             | 19 +++++++++++
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 41 +++++++++++-----------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 43 ++++++++++++------------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     | 35 +++++++++----------
 6 files changed, 118 insertions(+), 60 deletions(-)
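
Note for readers of the archive: the pattern used throughout this series
(store a byte offset and a flag mask once at setup, then write the
timestamp through the offset and set the flag per packet) can be sketched
without DPDK as below. struct demo_mbuf, DEMO_DYNFIELD() and the demo_*
functions are hypothetical simplifications; the real struct rte_mbuf
layout and the values returned by the lookup functions differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified buffer; not the real struct rte_mbuf layout. */
struct demo_mbuf {
	uint64_t ol_flags;    /* offload flags, dynamic flags included */
	uint64_t dynfield[4]; /* area shared by dynamic fields */
};

typedef uint64_t demo_timestamp_t;

/* Same idea as RTE_MBUF_DYNFIELD(): typed pointer at a byte offset. */
#define DEMO_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))

static int demo_ts_offset = -1;  /* set once at configuration time */
static uint64_t demo_ts_rx_flag; /* 0 until the flag bit is known */

/* Setup step: remember where the field lives and which bit flags it. */
static void
demo_ts_setup(int field_offset, int flag_bit)
{
	assert(field_offset >= 0 && flag_bit >= 0 && flag_bit < 64);
	demo_ts_offset = field_offset;
	demo_ts_rx_flag = UINT64_C(1) << flag_bit;
}

/* Rx path: write the value and mark it as meaningful. */
static void
demo_rx_set_timestamp(struct demo_mbuf *m, demo_timestamp_t ts)
{
	*DEMO_DYNFIELD(m, demo_ts_offset, demo_timestamp_t *) = ts;
	m->ol_flags |= demo_ts_rx_flag;
}

/* Consumer: read the field only if the Rx flag says it is valid. */
static int
demo_rx_get_timestamp(const struct demo_mbuf *m, demo_timestamp_t *ts)
{
	if ((m->ol_flags & demo_ts_rx_flag) == 0)
		return -1;
	*ts = *DEMO_DYNFIELD(m, demo_ts_offset, const demo_timestamp_t *);
	return 0;
}
```

Checking the flag before reading matters because the field area is shared:
without the flag, a consumer cannot tell a real timestamp from stale data.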

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index f1d8373079..877aa24a18 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1310,6 +1310,39 @@ mlx5_max_lro_msg_size_adjust(struct rte_eth_dev *dev, uint16_t idx,
 		priv->max_lro_msg_size * MLX5_LRO_SEG_CHUNK_SIZE);
 }
 
+/**
+ * Lookup mbuf field and flag for Rx timestamp if offload requested.
+ *
+ * @param rxq_data
+ *   Datapath struct where field offset and flag mask are stored.
+ *
+ * @return
+ *   0 on success or offload disabled, negative errno otherwise.
+ */
+static int
+mlx5_rx_timestamp_setup(struct mlx5_rxq_data *rxq_data)
+{
+	int timestamp_rx_dynflag_offset;
+
+	rxq_data->timestamp_rx_flag = 0;
+	if (rxq_data->hw_timestamp == 0)
+		return 0;
+	rxq_data->timestamp_offset = rte_mbuf_dynfield_lookup(
+			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+	if (rxq_data->timestamp_offset < 0) {
+		DRV_LOG(ERR, "Cannot lookup timestamp field\n");
+		return -rte_errno;
+	}
+	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+	if (timestamp_rx_dynflag_offset < 0) {
+		DRV_LOG(ERR, "Cannot lookup Rx timestamp flag\n");
+		return -rte_errno;
+	}
+	rxq_data->timestamp_rx_flag = RTE_BIT64(timestamp_rx_dynflag_offset);
+	return 0;
+}
+
 /**
  * Create a DPDK Rx queue.
  *
@@ -1492,7 +1525,10 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	mlx5_max_lro_msg_size_adjust(dev, idx, max_lro_size);
 	/* Toggle RX checksum offload if hardware supports it. */
 	tmpl->rxq.csum = !!(offloads & DEV_RX_OFFLOAD_CHECKSUM);
+	/* Configure Rx timestamp. */
 	tmpl->rxq.hw_timestamp = !!(offloads & DEV_RX_OFFLOAD_TIMESTAMP);
+	if (mlx5_rx_timestamp_setup(&tmpl->rxq) != 0)
+		goto error;
 	/* Configure VLAN stripping. */
 	tmpl->rxq.vlan_strip = !!(offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
 	/* By default, FCS (CRC) is stripped by hardware. */
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index e86468b67a..b577aab00b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1287,8 +1287,8 @@ rxq_cq_to_mbuf(struct mlx5_rxq_data *rxq, struct rte_mbuf *pkt,
 
 		if (rxq->rt_timestamp)
 			ts = mlx5_txpp_convert_rx_ts(rxq->sh, ts);
-		pkt->timestamp = ts;
-		pkt->ol_flags |= PKT_RX_TIMESTAMP;
+		mlx5_timestamp_set(pkt, rxq->timestamp_offset, ts);
+		pkt->ol_flags |= rxq->timestamp_rx_flag;
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 674296ee98..e9eca36b40 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -151,6 +151,8 @@ struct mlx5_rxq_data {
 	/* CQ (UAR) access lock required for 32bit implementations */
 #endif
 	uint32_t tunnel; /* Tunnel information. */
+	int timestamp_offset; /* Dynamic mbuf field for timestamp. */
+	uint64_t timestamp_rx_flag; /* Dynamic mbuf flag for timestamp. */
 	uint64_t flow_meta_mask;
 	int32_t flow_meta_offset;
 } __rte_cache_aligned;
@@ -681,4 +683,21 @@ mlx5_txpp_convert_tx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t mts)
 	return ci;
 }
 
+/**
+ * Set timestamp in mbuf dynamic field.
+ *
+ * @param mbuf
+ *   Structure to write into.
+ * @param offset
+ *   Dynamic field offset in mbuf structure.
+ * @param timestamp
+ *   Value to write.
+ */
+static __rte_always_inline void
+mlx5_timestamp_set(struct rte_mbuf *mbuf, int offset,
+		rte_mbuf_timestamp_t timestamp)
+{
+	*RTE_MBUF_DYNFIELD(mbuf, offset, rte_mbuf_timestamp_t *) = timestamp;
+}
+
 #endif /* RTE_PMD_MLX5_RXTX_H_ */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 6bf0c9b540..171d7bb0f8 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -330,13 +330,13 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	vector unsigned char ol_flags = (vector unsigned char)
 		(vector unsigned int){
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP};
+				rxq->hw_timestamp * rxq->timestamp_rx_flag};
 	vector unsigned char cv_flags;
 	const vector unsigned char zero = (vector unsigned char){0};
 	const vector unsigned char ptype_mask =
@@ -1025,31 +1025,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		/* D.5 fill in mbuf - rearm_data and packet_type. */
 		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
 
 				ts = rte_be_to_cpu_64(cq[pos].timestamp);
-				pkts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
-				pkts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
-				pkts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
-				pkts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				pkts[pos]->timestamp = rte_be_to_cpu_64
-						(cq[pos].timestamp);
-				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p1].timestamp);
-				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p2].timestamp);
-				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p3].timestamp);
+				mlx5_timestamp_set(pkts[pos], offset,
+					rte_be_to_cpu_64(cq[pos].timestamp));
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					rte_be_to_cpu_64(cq[pos + p1].timestamp));
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					rte_be_to_cpu_64(cq[pos + p2].timestamp));
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					rte_be_to_cpu_64(cq[pos + p3].timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index d122dad4fe..436b247ade 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -271,7 +271,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	uint32x4_t pinfo, cv_flags;
 	uint32x4_t ol_flags =
 		vdupq_n_u32(rxq->rss_hash * PKT_RX_RSS_HASH |
-			    rxq->hw_timestamp * PKT_RX_TIMESTAMP);
+			    rxq->hw_timestamp * rxq->timestamp_rx_flag);
 	const uint32x4_t ptype_ol_mask = { 0x106, 0x106, 0x106, 0x106 };
 	const uint8x16_t cv_flag_sel = {
 		0,
@@ -697,6 +697,7 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		rxq_cq_to_ptype_oflags_v(rxq, ptype_info, flow_tag,
 					 opcode, &elts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
@@ -704,36 +705,36 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 				ts = rte_be_to_cpu_64
 					(container_of(p0, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p1, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p2, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p3, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				elts[pos]->timestamp = rte_be_to_cpu_64
-					(container_of(p0, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 1]->timestamp = rte_be_to_cpu_64
-					(container_of(p1, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 2]->timestamp = rte_be_to_cpu_64
-					(container_of(p2, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 3]->timestamp = rte_be_to_cpu_64
-					(container_of(p3, struct mlx5_cqe,
-						      pkt_info)->timestamp);
+				mlx5_timestamp_set(elts[pos], offset,
+					rte_be_to_cpu_64(container_of(p0,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 1], offset,
+					rte_be_to_cpu_64(container_of(p1,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 2], offset,
+					rte_be_to_cpu_64(container_of(p2,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 3], offset,
+					rte_be_to_cpu_64(container_of(p3,
+					struct mlx5_cqe, pkt_info)->timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 0bbcbeefff..ae4439efc7 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -251,7 +251,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	__m128i pinfo0, pinfo1;
 	__m128i pinfo, ptype;
 	__m128i ol_flags = _mm_set1_epi32(rxq->rss_hash * PKT_RX_RSS_HASH |
-					  rxq->hw_timestamp * PKT_RX_TIMESTAMP);
+					  rxq->hw_timestamp * rxq->timestamp_rx_flag);
 	__m128i cv_flags;
 	const __m128i zero = _mm_setzero_si128();
 	const __m128i ptype_mask =
@@ -656,31 +656,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		/* D.5 fill in mbuf - rearm_data and packet_type. */
 		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
 
 				ts = rte_be_to_cpu_64(cq[pos].timestamp);
-				pkts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
-				pkts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
-				pkts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
-				pkts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				pkts[pos]->timestamp = rte_be_to_cpu_64
-						(cq[pos].timestamp);
-				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p1].timestamp);
-				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p2].timestamp);
-				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p3].timestamp);
+				mlx5_timestamp_set(pkts[pos], offset,
+					rte_be_to_cpu_64(cq[pos].timestamp));
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					rte_be_to_cpu_64(cq[pos + p1].timestamp));
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					rte_be_to_cpu_64(cq[pos + p2].timestamp));
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					rte_be_to_cpu_64(cq[pos + p3].timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
-- 
2.28.0



* [dpdk-dev] [PATCH 09/15] net/nfb: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (7 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 08/15] net/mlx5: switch timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 10/15] net/octeontx2: " Thomas Monjalon
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Martin Spinler

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/nfb/nfb_rx.c | 23 ++++++++++++++++++++++-
 drivers/net/nfb/nfb_rx.h | 18 ++++++++++++++----
 2 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/drivers/net/nfb/nfb_rx.c b/drivers/net/nfb/nfb_rx.c
index d97179f818..1f90357522 100644
--- a/drivers/net/nfb/nfb_rx.c
+++ b/drivers/net/nfb/nfb_rx.c
@@ -5,10 +5,14 @@
  */
 
 #include <rte_kvargs.h>
+#include <rte_bitops.h>
 
 #include "nfb_rx.h"
 #include "nfb.h"
 
+uint64_t nfb_timestamp_rx_dynflag;
+int nfb_timestamp_dynfield_offset = -1;
+
 static int
 timestamp_check_handler(__rte_unused const char *key,
 	const char *value, __rte_unused void *opaque)
@@ -23,6 +27,7 @@ timestamp_check_handler(__rte_unused const char *key,
 static int
 nfb_check_timestamp(struct rte_devargs *devargs)
 {
+	int timestamp_rx_dynflag_offset;
 	struct rte_kvargs *kvlist;
 
 	if (devargs == NULL)
@@ -38,6 +43,7 @@ nfb_check_timestamp(struct rte_devargs *devargs)
 	}
 	/* Timestamps are enabled when there is
 	 * key-value pair: enable_timestamp=1
+	 * TODO: timestamp should be enabled with DEV_RX_OFFLOAD_TIMESTAMP
 	 */
 	if (rte_kvargs_process(kvlist, TIMESTAMP_ARG,
 		timestamp_check_handler, NULL) < 0) {
@@ -46,6 +52,21 @@ nfb_check_timestamp(struct rte_devargs *devargs)
 	}
 	rte_kvargs_free(kvlist);
 
+	nfb_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+	if (nfb_timestamp_dynfield_offset < 0) {
+		RTE_LOG(ERR, PMD, "Cannot lookup timestamp field\n");
+		return -rte_errno;
+	}
+	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+	if (timestamp_rx_dynflag_offset < 0) {
+		RTE_LOG(ERR, PMD, "Cannot lookup Rx timestamp flag\n");
+		return -rte_errno;
+	}
+	nfb_timestamp_rx_dynflag =
+		RTE_BIT64(timestamp_rx_dynflag_offset);
+
 	return 1;
 }
 
@@ -125,7 +146,7 @@ nfb_eth_rx_queue_setup(struct rte_eth_dev *dev,
 	else
 		rte_free(rxq);
 
-	if (nfb_check_timestamp(dev->device->devargs))
+	if (nfb_check_timestamp(dev->device->devargs) > 0)
 		rxq->flags |= NFB_TIMESTAMP_FLAG;
 
 	return ret;
diff --git a/drivers/net/nfb/nfb_rx.h b/drivers/net/nfb/nfb_rx.h
index cf3899b2fb..e548226e0f 100644
--- a/drivers/net/nfb/nfb_rx.h
+++ b/drivers/net/nfb/nfb_rx.h
@@ -15,6 +15,16 @@
 
 #define NFB_TIMESTAMP_FLAG (1 << 0)
 
+extern uint64_t nfb_timestamp_rx_dynflag;
+extern int nfb_timestamp_dynfield_offset;
+
+static inline rte_mbuf_timestamp_t *
+nfb_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		nfb_timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 struct ndp_rx_queue {
 	struct nfb_device *nfb;	     /* nfb dev structure */
 	struct ndp_queue *queue;     /* rx queue */
@@ -191,15 +201,15 @@ nfb_eth_ndp_rx(void *queue,
 
 			if (timestamping_enabled) {
 				/* nanoseconds */
-				mbuf->timestamp =
+				*nfb_timestamp_dynfield(mbuf) =
 					rte_le_to_cpu_32(*((uint32_t *)
 					(packets[i].header + 4)));
-				mbuf->timestamp <<= 32;
+				*nfb_timestamp_dynfield(mbuf) <<= 32;
 				/* seconds */
-				mbuf->timestamp |=
+				*nfb_timestamp_dynfield(mbuf) |=
 					rte_le_to_cpu_32(*((uint32_t *)
 					(packets[i].header + 8)));
-				mbuf->ol_flags |= PKT_RX_TIMESTAMP;
+				mbuf->ol_flags |= nfb_timestamp_rx_dynflag;
 			}
 
 			bufs[num_rx++] = mbuf;
-- 
2.28.0



* [dpdk-dev] [PATCH 10/15] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (8 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 09/15] net/nfb: " Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29 11:02   ` Andrew Rybchenko
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 11/15] net/pcap: " Thomas Monjalon
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Nithin Dabilpuram,
	Kiran Kumar K, Ray Kinsella, Neil Horman

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/octeontx2/otx2_ethdev.c | 33 +++++++++++++++++++++++++++++
 drivers/net/octeontx2/otx2_rx.h     | 19 ++++++++++++++---
 drivers/net/octeontx2/version.map   |  7 ++++++
 3 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
index cfb733a4b5..ad95219438 100644
--- a/drivers/net/octeontx2/otx2_ethdev.c
+++ b/drivers/net/octeontx2/otx2_ethdev.c
@@ -4,6 +4,7 @@
 
 #include <inttypes.h>
 
+#include <rte_bitops.h>
 #include <rte_ethdev_pci.h>
 #include <rte_io.h>
 #include <rte_malloc.h>
@@ -14,6 +15,35 @@
 #include "otx2_ethdev.h"
 #include "otx2_ethdev_sec.h"
 
+uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
+int rte_pmd_octeontx2_timestamp_dynfield_offset = -1;
+
+static int
+otx2_rx_timestamp_setup(uint16_t flags)
+{
+	int timestamp_rx_dynflag_offset;
+
+	if ((flags & NIX_RX_OFFLOAD_TSTAMP_F) == 0)
+		return 0;
+
+	rte_pmd_octeontx2_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+	if (rte_pmd_octeontx2_timestamp_dynfield_offset < 0) {
+		otx2_err("Failed to lookup timestamp field");
+		return -rte_errno;
+	}
+	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+	if (timestamp_rx_dynflag_offset < 0) {
+		otx2_err("Failed to lookup Rx timestamp flag");
+		return -rte_errno;
+	}
+	rte_pmd_octeontx2_timestamp_rx_dynflag =
+			RTE_BIT64(timestamp_rx_dynflag_offset);
+
+	return 0;
+}
+
 static inline uint64_t
 nix_get_rx_offload_capa(struct otx2_eth_dev *dev)
 {
@@ -1874,6 +1904,9 @@ otx2_nix_configure(struct rte_eth_dev *eth_dev)
 	dev->tx_offload_flags |= nix_tx_offload_flags(eth_dev);
 	dev->rss_info.rss_grps = NIX_RSS_GRPS;
 
+	if (otx2_rx_timestamp_setup(dev->rx_offload_flags) != 0)
+		goto fail_offloads;
+
 	nb_rxq = RTE_MAX(data->nb_rx_queues, 1);
 	nb_txq = RTE_MAX(data->nb_tx_queues, 1);
 
diff --git a/drivers/net/octeontx2/otx2_rx.h b/drivers/net/octeontx2/otx2_rx.h
index 61a5c436dd..6981edce82 100644
--- a/drivers/net/octeontx2/otx2_rx.h
+++ b/drivers/net/octeontx2/otx2_rx.h
@@ -63,6 +63,18 @@ union mbuf_initializer {
 	uint64_t value;
 };
 
+/* variables are exported because this file is included in other drivers */
+extern uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
+extern int rte_pmd_octeontx2_timestamp_dynfield_offset;
+
+static inline rte_mbuf_timestamp_t *
+otx2_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		rte_pmd_octeontx2_timestamp_dynfield_offset,
+		rte_mbuf_timestamp_t *);
+}
+
 static __rte_always_inline void
 otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
 			struct otx2_timesync_info *tstamp, const uint16_t flag,
@@ -77,15 +89,16 @@ otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
 		/* Reading the rx timestamp inserted by CGX, viz at
 		 * starting of the packet data.
 		 */
-		mbuf->timestamp = rte_be_to_cpu_64(*tstamp_ptr);
+		*otx2_timestamp_dynfield(mbuf) = rte_be_to_cpu_64(*tstamp_ptr);
 		/* PKT_RX_IEEE1588_TMST flag needs to be set only in case
 		 * PTP packets are received.
 		 */
 		if (mbuf->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC) {
-			tstamp->rx_tstamp = mbuf->timestamp;
+			tstamp->rx_tstamp = *otx2_timestamp_dynfield(mbuf);
 			tstamp->rx_ready = 1;
 			mbuf->ol_flags |= PKT_RX_IEEE1588_PTP |
-				PKT_RX_IEEE1588_TMST | PKT_RX_TIMESTAMP;
+				PKT_RX_IEEE1588_TMST |
+				rte_pmd_octeontx2_timestamp_rx_dynflag;
 		}
 	}
 }
diff --git a/drivers/net/octeontx2/version.map b/drivers/net/octeontx2/version.map
index 4a76d1d52d..d4f4784bcd 100644
--- a/drivers/net/octeontx2/version.map
+++ b/drivers/net/octeontx2/version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+INTERNAL {
+	global:
+
+	rte_pmd_octeontx2_timestamp_dynfield_offset;
+	rte_pmd_octeontx2_timestamp_rx_dynflag;
+};
-- 
2.28.0



* [dpdk-dev] [PATCH 11/15] net/pcap: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (9 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 10/15] net/octeontx2: " Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 12/15] app/testpmd: " Thomas Monjalon
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/pcap/rte_eth_pcap.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 34e82317b1..b4b7a1839b 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -18,6 +18,7 @@
 
 #include <pcap.h>
 
+#include <rte_bitops.h>
 #include <rte_cycles.h>
 #include <rte_ethdev_driver.h>
 #include <rte_ethdev_vdev.h>
@@ -51,6 +52,9 @@ static uint64_t start_cycles;
 static uint64_t hz;
 static uint8_t iface_idx;
 
+static uint64_t timestamp_rx_dynflag;
+static int timestamp_dynfield_offset = -1;
+
 struct queue_stat {
 	volatile unsigned long pkts;
 	volatile unsigned long bytes;
@@ -265,9 +269,11 @@ eth_pcap_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		}
 
 		mbuf->pkt_len = (uint16_t)header.caplen;
-		mbuf->timestamp = (uint64_t)header.ts.tv_sec * 1000000
-							+ header.ts.tv_usec;
-		mbuf->ol_flags |= PKT_RX_TIMESTAMP;
+		*RTE_MBUF_DYNFIELD(mbuf, timestamp_dynfield_offset,
+			rte_mbuf_timestamp_t *) =
+				(uint64_t)header.ts.tv_sec * 1000000 +
+				header.ts.tv_usec;
+		mbuf->ol_flags |= timestamp_rx_dynflag;
 		mbuf->port = pcap_q->port_id;
 		bufs[num_rx] = mbuf;
 		num_rx++;
@@ -656,6 +662,23 @@ eth_dev_stop(struct rte_eth_dev *dev)
 static int
 eth_dev_configure(struct rte_eth_dev *dev __rte_unused)
 {
+	int timestamp_rx_dynflag_offset;
+
+	timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+	if (timestamp_dynfield_offset < 0) {
+		PMD_LOG(ERR, "Failed to lookup timestamp field");
+		return -rte_errno;
+	}
+	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+	if (timestamp_rx_dynflag_offset < 0) {
+		PMD_LOG(ERR, "Failed to lookup Rx timestamp flag");
+		return -rte_errno;
+	}
+	timestamp_rx_dynflag =
+		RTE_BIT64(timestamp_rx_dynflag_offset);
+
 	return 0;
 }
 
-- 
2.28.0



* [dpdk-dev] [PATCH 12/15] app/testpmd: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (10 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 11/15] net/pcap: " Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29 10:20   ` Andrew Rybchenko
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 13/15] examples/rxtx_callbacks: switch timestamp to dynamic field Thomas Monjalon
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 app/test-pmd/util.c | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 781a813759..eebb5166ad 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -5,6 +5,7 @@
 
 #include <stdio.h>
 
+#include <rte_bitops.h>
 #include <rte_net.h>
 #include <rte_mbuf.h>
 #include <rte_ether.h>
@@ -22,6 +23,40 @@ print_ether_addr(const char *what, const struct rte_ether_addr *eth_addr)
 	printf("%s%s", what, buf);
 }
 
+static inline bool
+is_timestamp_enabled(const struct rte_mbuf *mbuf)
+{
+	static uint64_t timestamp_rx_dynflag;
+
+	int timestamp_rx_dynflag_offset;
+
+	if (timestamp_rx_dynflag == 0) {
+		timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+		if (timestamp_rx_dynflag_offset < 0)
+			return false;
+		timestamp_rx_dynflag = RTE_BIT64(timestamp_rx_dynflag_offset);
+	}
+
+	return (mbuf->ol_flags & timestamp_rx_dynflag) != 0;
+}
+
+static inline rte_mbuf_timestamp_t
+get_timestamp(const struct rte_mbuf *mbuf)
+{
+	static int timestamp_dynfield_offset = -1;
+
+	if (timestamp_dynfield_offset < 0) {
+		timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+		if (timestamp_dynfield_offset < 0)
+			return 0;
+	}
+
+	return *RTE_MBUF_DYNFIELD(mbuf,
+			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static inline void
 dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
 	      uint16_t nb_pkts, int is_rx)
@@ -107,8 +142,8 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
 				printf("hash=0x%x ID=0x%x ",
 				       mb->hash.fdir.hash, mb->hash.fdir.id);
 		}
-		if (ol_flags & PKT_RX_TIMESTAMP)
-			printf(" - timestamp %"PRIu64" ", mb->timestamp);
+		if (is_timestamp_enabled(mb))
+			printf(" - timestamp %"PRIu64" ", get_timestamp(mb));
 		if (ol_flags & PKT_RX_QINQ)
 			printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x",
 			       mb->vlan_tci, mb->vlan_tci_outer);
-- 
2.28.0



* [dpdk-dev] [PATCH 13/15] examples/rxtx_callbacks: switch timestamp to dynamic field
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (11 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 12/15] app/testpmd: " Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29 10:21   ` Andrew Rybchenko
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 14/15] mbuf: remove deprecated timestamp field Thomas Monjalon
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, John McNamara

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 examples/rxtx_callbacks/main.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
index b9a98ceddc..4798e0962c 100644
--- a/examples/rxtx_callbacks/main.c
+++ b/examples/rxtx_callbacks/main.c
@@ -19,6 +19,10 @@
 #define MBUF_CACHE_SIZE 250
 #define BURST_SIZE 32
 
+static int hwts_dynfield_offset = -1;
+#define HWTS_FIELD(mbuf) (*RTE_MBUF_DYNFIELD(mbuf, \
+		hwts_dynfield_offset, rte_mbuf_timestamp_t *))
+
 typedef uint64_t tsc_t;
 static int tsc_dynfield_offset = -1;
 #define TSC_FIELD(mbuf) (*RTE_MBUF_DYNFIELD(mbuf, \
@@ -73,7 +77,7 @@ calc_latency(uint16_t port, uint16_t qidx __rte_unused,
 	for (i = 0; i < nb_pkts; i++) {
 		cycles += now - TSC_FIELD(pkts[i]);
 		if (hw_timestamping)
-			queue_ticks += ticks - pkts[i]->timestamp;
+			queue_ticks += ticks - HWTS_FIELD(pkts[i]);
 	}
 
 	latency_numbers.total_cycles += cycles;
@@ -137,6 +141,12 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
 			return -1;
 		}
 		port_conf.rxmode.offloads |= DEV_RX_OFFLOAD_TIMESTAMP;
+		hwts_dynfield_offset = rte_mbuf_dynfield_lookup(
+				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+		if (hwts_dynfield_offset < 0) {
+			printf("ERROR: Failed to lookup timestamp field\n");
+			return -rte_errno;
+		}
 	}
 
 	retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
-- 
2.28.0



* [dpdk-dev] [PATCH 14/15] mbuf: remove deprecated timestamp field
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (12 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 13/15] examples/rxtx_callbacks: switch timestamp to dynamic field Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29 10:23   ` Andrew Rybchenko
  2020-10-29 14:48   ` Kinsella, Ray
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half Thomas Monjalon
                   ` (4 subsequent siblings)
  18 siblings, 2 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ray Kinsella, Neil Horman

As announced in the deprecation note, the field timestamp
is removed to allow giving more space to the dynamic fields.
The related offload flag PKT_RX_TIMESTAMP is also removed.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 app/test/test_mbuf.c                   |  1 -
 doc/guides/rel_notes/deprecation.rst   |  1 -
 doc/guides/rel_notes/release_20_11.rst |  4 ++++
 lib/librte_ethdev/rte_ethdev.h         |  4 +++-
 lib/librte_mbuf/rte_mbuf.c             |  2 --
 lib/librte_mbuf/rte_mbuf.h             |  1 -
 lib/librte_mbuf/rte_mbuf_core.h        | 12 +-----------
 7 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 80d1850da9..85c150d843 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1621,7 +1621,6 @@ test_get_rx_ol_flag_name(void)
 		VAL_NAME(PKT_RX_FDIR_FLX),
 		VAL_NAME(PKT_RX_QINQ_STRIPPED),
 		VAL_NAME(PKT_RX_LRO),
-		VAL_NAME(PKT_RX_TIMESTAMP),
 		VAL_NAME(PKT_RX_SEC_OFFLOAD),
 		VAL_NAME(PKT_RX_SEC_OFFLOAD_FAILED),
 		VAL_NAME(PKT_RX_OUTER_L4_CKSUM_BAD),
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 0f6f1df12a..72dbb25b83 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -86,7 +86,6 @@ Deprecation Notices
   `this presentation <https://www.youtube.com/watch?v=Ttl6MlhmzWY>`_.
   The following static fields will be moved as dynamic:
 
-  - ``timestamp``
   - ``seqn``
 
   As a consequence, the layout of the ``struct rte_mbuf`` will be re-arranged,
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 3cec526b6a..deb99d6d98 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -429,6 +429,10 @@ API Changes
 * mbuf: Removed the unioned fields ``userdata`` and ``udata64``
   from the structure ``rte_mbuf``. It is replaced with dynamic fields.
 
+* mbuf: Removed the field ``timestamp`` from the structure ``rte_mbuf``.
+  It is replaced with the dynamic field RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+  which was previously used only for Tx.
+
 * pci: Removed the ``rte_kernel_driver`` enum defined in rte_dev.h and
   replaced with a private enum in the PCI subsystem.
 
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 3be0050592..619cbe521e 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1345,6 +1345,8 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_JUMBO_FRAME	0x00000800
 #define DEV_RX_OFFLOAD_SCATTER		0x00002000
 /**
+ * Timestamp is set by the driver in RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+ * and RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME is set in ol_flags.
  * The mbuf field and flag are registered when the offload is configured.
  */
 #define DEV_RX_OFFLOAD_TIMESTAMP	0x00004000
@@ -4654,7 +4656,7 @@ int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
  * rte_eth_read_clock(port, base_clock);
  *
  * Then, convert the raw mbuf timestamp with:
- * base_time_sec + (double)(mbuf->timestamp - base_clock) / freq;
+ * base_time_sec + (double)(*timestamp_dynfield(mbuf) - base_clock) / freq;
  *
  * This simple example will not provide a very good accuracy. One must
  * at least measure multiple times the frequency and do a regression.
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8a456e5e64..09d93e6899 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -764,7 +764,6 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
 	case PKT_RX_QINQ_STRIPPED: return "PKT_RX_QINQ_STRIPPED";
 	case PKT_RX_QINQ: return "PKT_RX_QINQ";
 	case PKT_RX_LRO: return "PKT_RX_LRO";
-	case PKT_RX_TIMESTAMP: return "PKT_RX_TIMESTAMP";
 	case PKT_RX_SEC_OFFLOAD: return "PKT_RX_SEC_OFFLOAD";
 	case PKT_RX_SEC_OFFLOAD_FAILED: return "PKT_RX_SEC_OFFLOAD_FAILED";
 	case PKT_RX_OUTER_L4_CKSUM_BAD: return "PKT_RX_OUTER_L4_CKSUM_BAD";
@@ -808,7 +807,6 @@ rte_get_rx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
 		{ PKT_RX_FDIR_FLX, PKT_RX_FDIR_FLX, NULL },
 		{ PKT_RX_QINQ_STRIPPED, PKT_RX_QINQ_STRIPPED, NULL },
 		{ PKT_RX_LRO, PKT_RX_LRO, NULL },
-		{ PKT_RX_TIMESTAMP, PKT_RX_TIMESTAMP, NULL },
 		{ PKT_RX_SEC_OFFLOAD, PKT_RX_SEC_OFFLOAD, NULL },
 		{ PKT_RX_SEC_OFFLOAD_FAILED, PKT_RX_SEC_OFFLOAD_FAILED, NULL },
 		{ PKT_RX_QINQ, PKT_RX_QINQ, NULL },
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index a1414ed7cd..6774c6281b 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1108,7 +1108,6 @@ __rte_pktmbuf_copy_hdr(struct rte_mbuf *mdst, const struct rte_mbuf *msrc)
 	mdst->tx_offload = msrc->tx_offload;
 	mdst->hash = msrc->hash;
 	mdst->packet_type = msrc->packet_type;
-	mdst->timestamp = msrc->timestamp;
 	rte_mbuf_dynfield_copy(mdst, msrc);
 }
 
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index a65eaaf692..52ca1c842f 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -149,11 +149,6 @@ extern "C" {
  */
 #define PKT_RX_LRO           (1ULL << 16)
 
-/**
- * Indicate that the timestamp field in the mbuf is valid.
- */
-#define PKT_RX_TIMESTAMP     (1ULL << 17)
-
 /**
  * Indicate that security offload processing was applied on the RX packet.
  */
@@ -589,12 +584,7 @@ struct rte_mbuf {
 
 	uint16_t buf_len;         /**< Length of segment buffer. */
 
-	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
-	 * are not normalized but are always the same for a given port.
-	 * Some devices allow to query rte_eth_read_clock that will return the
-	 * current device timestamp.
-	 */
-	uint64_t timestamp;
+	uint64_t unused;
 
 	/* second cache line - fields only used in slow path or on TX */
 	RTE_MARKER cacheline1 __rte_cache_min_aligned;
-- 
2.28.0



* [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (13 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 14/15] mbuf: remove deprecated timestamp field Thomas Monjalon
@ 2020-10-29  9:27 ` Thomas Monjalon
  2020-10-29 10:50   ` Andrew Rybchenko
  2020-10-29 14:42   ` [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half Kinsella, Ray
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                   ` (3 subsequent siblings)
  18 siblings, 2 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29  9:27 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ray Kinsella, Neil Horman

The mempool pointer in the mbuf struct is moved
from the second to the first half.
It should increase performance on most systems with a 64-byte cache line,
i.e. where the mbuf is split across two cache lines.
On such systems, the first half (also called the first cache line) is hotter
than the second one, where the pool pointer was.

Moving this field gives more space to dynfield1.

This is what the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
 0    void *                            buf_addr;         /*   0 +  8 */
 1    rte_iova_t                        buf_iova;         /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
 2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
 3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
 4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
 5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
 5.5  uint64_t             union        hash;             /*  44 +  8 */
 6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
 7    struct rte_mempool *              pool;             /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
 8    struct rte_mbuf *                 next;             /*  64 +  8 */
 9    uint64_t             union        tx_offload;       /*  72 +  8 */
10    uint16_t                          priv_size;        /*  80 +  2 */
      uint16_t                          timesync;         /*  82 +  2 */
      uint32_t                          seqn;             /*  84 +  4 */
11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
12    uint64_t                          dynfield1[4];     /*  96 + 32 */
16    /* --- END                                             128      */
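
The offsets in this table can be cross-checked with a stand-alone mock.
This is only a sketch: struct mock_mbuf is a hypothetical stand-in, not
the real struct rte_mbuf. Unions are flattened to plain integers and
pointers are modeled as uint64_t so the numbers match a 64-bit target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock of the layout listed above, NOT the real rte_mbuf. */
struct mock_mbuf {
	uint64_t buf_addr;       /*   0: void * on a 64-bit target */
	uint64_t buf_iova;       /*   8 */
	uint16_t data_off;       /*  16 */
	uint16_t refcnt;         /*  18 */
	uint16_t nb_segs;        /*  20 */
	uint16_t port;           /*  22 */
	uint64_t ol_flags;       /*  24 */
	uint32_t packet_type;    /*  32 */
	uint32_t pkt_len;        /*  36 */
	uint16_t data_len;       /*  40 */
	uint16_t vlan_tci;       /*  42 */
	uint32_t hash[2];        /*  44: 8 bytes, modeled with 4-byte alignment */
	uint16_t vlan_tci_outer; /*  52 */
	uint16_t buf_len;        /*  54 */
	uint64_t pool;           /*  56: last word of the first 64 bytes */
	uint64_t next;           /*  64: start of the second cache line */
	uint64_t tx_offload;     /*  72 */
	uint16_t priv_size;      /*  80 */
	uint16_t timesync;       /*  82 */
	uint32_t seqn;           /*  84 */
	uint64_t shinfo;         /*  88 */
	uint64_t dynfield1[4];   /*  96 */
};

/* Return 1 when the pool word lies entirely within the first 64 bytes. */
static int
mock_pool_in_first_half(void)
{
	/* The mock is expected to span exactly two 64-byte cache lines. */
	assert(sizeof(struct mock_mbuf) == 128);
	return offsetof(struct mock_mbuf, pool) + sizeof(uint64_t) <= 64;
}
```

With pool at byte 56, the first 64 bytes are fully used and the four
dynfield1 words start at byte 96.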

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 doc/guides/rel_notes/deprecation.rst | 5 -----
 lib/librte_kni/rte_kni_common.h      | 3 ++-
 lib/librte_mbuf/rte_mbuf_core.h      | 5 ++---
 3 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 72dbb25b83..07ca1dcbb2 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -88,11 +88,6 @@ Deprecation Notices
 
   - ``seqn``
 
-  As a consequence, the layout of the ``struct rte_mbuf`` will be re-arranged,
-  avoiding impact on vectorized implementation of the driver datapaths,
-  while evaluating performance gains of a better use of the first cache line.
-
-
 * ethdev: the legacy filter API, including
   ``rte_eth_dev_filter_supported()``, ``rte_eth_dev_filter_ctrl()`` as well
   as filter types MACVLAN, ETHERTYPE, FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR,
diff --git a/lib/librte_kni/rte_kni_common.h b/lib/librte_kni/rte_kni_common.h
index 36d66e2ffa..ffb3182731 100644
--- a/lib/librte_kni/rte_kni_common.h
+++ b/lib/librte_kni/rte_kni_common.h
@@ -84,10 +84,11 @@ struct rte_kni_mbuf {
 	char pad2[4];
 	uint32_t pkt_len;       /**< Total pkt len: sum of all segment data_len. */
 	uint16_t data_len;      /**< Amount of data in segment buffer. */
+	char pad3[14];
+	void *pool;
 
 	/* fields on second cache line */
 	__attribute__((__aligned__(RTE_CACHE_LINE_MIN_SIZE)))
-	void *pool;
 	void *next;             /**< Physical address of next mbuf in kernel. */
 };
 
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 52ca1c842f..ee185fa32b 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -584,12 +584,11 @@ struct rte_mbuf {
 
 	uint16_t buf_len;         /**< Length of segment buffer. */
 
-	uint64_t unused;
+	struct rte_mempool *pool; /**< Pool from which mbuf was allocated. */
 
 	/* second cache line - fields only used in slow path or on TX */
 	RTE_MARKER cacheline1 __rte_cache_min_aligned;
 
-	struct rte_mempool *pool; /**< Pool from which mbuf was allocated. */
 	struct rte_mbuf *next;    /**< Next segment of scattered packet. */
 
 	/* fields to support TX offloads */
@@ -646,7 +645,7 @@ struct rte_mbuf {
 	 */
 	struct rte_mbuf_ext_shared_info *shinfo;
 
-	uint64_t dynfield1[3]; /**< Reserved for dynamic fields. */
+	uint64_t dynfield1[4]; /**< Reserved for dynamic fields. */
 } __rte_cache_aligned;
 
 /**
-- 
2.28.0



* Re: [dpdk-dev] [PATCH 02/15] mbuf: add Rx timestamp dynamic flag
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 02/15] mbuf: add Rx timestamp dynamic flag Thomas Monjalon
@ 2020-10-29  9:58   ` Andrew Rybchenko
  2020-10-29 18:19     ` Ajit Khaparde
  0 siblings, 1 reply; 170+ messages in thread
From: Andrew Rybchenko @ 2020-10-29  9:58 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo

On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> There is already a dynamic field for timestamp,
> used only for Tx scheduling, thanks to the dedicated flag.
> The same field can be used for Rx timestamp filled by drivers.
> The only missing part to get rid of the static timestamp field
> was to declare a new dynamic flag for Rx usage.
> 
> After migrating all Rx timestamp usages, it will be possible
> to remove the deprecated timestamp field.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>



* Re: [dpdk-dev] [PATCH 03/15] ethdev: register mbuf field and flags for timestamp
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 03/15] ethdev: register mbuf field and flags for timestamp Thomas Monjalon
@ 2020-10-29 10:08   ` Andrew Rybchenko
  2020-10-29 10:12     ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Andrew Rybchenko @ 2020-10-29 10:08 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing, Bernard Iremonger,
	Matan Azrad, Shahaf Shuler

On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> During port configure or queue setup, the offload flags
> DEV_RX_OFFLOAD_TIMESTAMP and DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP
> trigger the registration of the related mbuf field and flags.
> 
> Previously, the Tx timestamp field and flag were registered in testpmd,
> as described in mlx5 guide.
> For the general usage of Rx and Tx timestamps,
> managing registrations inside ethdev is simpler and properly documented.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

A small note below, other than that

Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index b12bb3854d..7c9aadb461 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c

[snip]

> @@ -1232,6 +1233,59 @@ eth_dev_check_lro_pkt_size(uint16_t port_id, uint32_t config_size,
>  	return ret;
>  }
>  
> +static inline int
> +eth_dev_timestamp_mbuf_register(uint64_t rx_offloads, uint64_t tx_offloads)
> +{
> +	static const struct rte_mbuf_dynfield field_desc = {
> +		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
> +		.size = sizeof(rte_mbuf_timestamp_t),
> +		.align = __alignof__(rte_mbuf_timestamp_t),
> +	};
> +	static const struct rte_mbuf_dynflag rx_flag_desc = {
> +		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
> +	};
> +	static const struct rte_mbuf_dynflag tx_flag_desc = {
> +		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
> +	};
> +	static bool done_rx, done_tx;

I think we don't need these static flags. We can just repeat
the registration request and it will simply look up and return
the same offset/flag bit as before.
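
The idempotent behaviour this comment relies on can be illustrated with
a toy registry. This is a sketch only: mock_dynfield_register() and its
table are hypothetical stand-ins, not the librte_mbuf implementation,
and the starting offset of 96 is an assumption borrowed from the
dynfield1 position shown in patch 15.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_MAX_FIELDS 4
#define MOCK_NAME_LEN 64

struct mock_field {
	char name[MOCK_NAME_LEN];
	size_t size;
	int offset;
};

static struct mock_field mock_fields[MOCK_MAX_FIELDS];
static int mock_nb_fields;
static int mock_next_offset = 96; /* assumed start of the dynamic area */

/*
 * Hypothetical stand-in for a register call: return the offset of the
 * named field, allocating it on first use. A repeated request with the
 * same name and size returns the offset handed out the first time.
 */
static int
mock_dynfield_register(const char *name, size_t size)
{
	struct mock_field *f;
	int i;

	if (name == NULL || size == 0 || strlen(name) >= MOCK_NAME_LEN)
		return -1;

	for (i = 0; i < mock_nb_fields; i++) {
		f = &mock_fields[i];
		if (strcmp(f->name, name) == 0)
			return f->size == size ? f->offset : -1;
	}

	if (mock_nb_fields == MOCK_MAX_FIELDS)
		return -1; /* table full */

	f = &mock_fields[mock_nb_fields++];
	assert(mock_nb_fields <= MOCK_MAX_FIELDS);
	strcpy(f->name, name); /* length checked above */
	f->size = size;
	f->offset = mock_next_offset;
	mock_next_offset += (int)size;
	return f->offset;
}
```

In this model a caller needs no "done" boolean: calling the register
function again costs one table scan and yields the same offset.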

[snip]


* Re: [dpdk-dev] [PATCH 03/15] ethdev: register mbuf field and flags for timestamp
  2020-10-29 10:08   ` Andrew Rybchenko
@ 2020-10-29 10:12     ` Thomas Monjalon
  2020-10-29 10:33       ` Andrew Rybchenko
  0 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29 10:12 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Matan Azrad, Shahaf Shuler

29/10/2020 11:08, Andrew Rybchenko:
> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > During port configure or queue setup, the offload flags
> > DEV_RX_OFFLOAD_TIMESTAMP and DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP
> > trigger the registration of the related mbuf field and flags.
> > 
> > Previously, the Tx timestamp field and flag were registered in testpmd,
> > as described in mlx5 guide.
> > For the general usage of Rx and Tx timestamps,
> > managing registrations inside ethdev is simpler and properly documented.
> > 
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> 
> A small note below, other than that
> 
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> 
> > +static inline int
> > +eth_dev_timestamp_mbuf_register(uint64_t rx_offloads, uint64_t tx_offloads)
> > +{
> > +	static const struct rte_mbuf_dynfield field_desc = {
> > +		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
> > +		.size = sizeof(rte_mbuf_timestamp_t),
> > +		.align = __alignof__(rte_mbuf_timestamp_t),
> > +	};
> > +	static const struct rte_mbuf_dynflag rx_flag_desc = {
> > +		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
> > +	};
> > +	static const struct rte_mbuf_dynflag tx_flag_desc = {
> > +		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
> > +	};
> > +	static bool done_rx, done_tx;
> 
> > I think we don't need these static flags. We can just repeat
> > the registration request and it will simply look up and return
> > the same offset/flag bit as before.

Absolutely.
I did it as a small optimization in control path.

I hesitated. Given it is only 2 booleans,
do you prefer with or without or no opinion?




* Re: [dpdk-dev] [PATCH 04/15] latency: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 04/15] latency: switch timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-10-29 10:13   ` Andrew Rybchenko
  2020-10-29 10:40     ` Thomas Monjalon
  2020-10-29 14:20   ` Pattan, Reshma
  1 sibling, 1 reply; 170+ messages in thread
From: Andrew Rybchenko @ 2020-10-29 10:13 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo, Reshma Pattan

On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> The mbuf timestamp is moved to a dynamic field
> in order to allow removal of the deprecated static field.
> The related mbuf flag is also replaced with the dynamic one.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

[snip]

> diff --git a/lib/librte_latencystats/rte_latencystats.c b/lib/librte_latencystats/rte_latencystats.c
> index ba2fff3bcb..a21f6239d9 100644
> --- a/lib/librte_latencystats/rte_latencystats.c
> +++ b/lib/librte_latencystats/rte_latencystats.c

[snip]

> @@ -204,6 +216,14 @@ int
>  rte_latencystats_init(uint64_t app_samp_intvl,
>  		rte_latency_stats_flow_type_fn user_cb)
>  {
> +	static const struct rte_mbuf_dynfield timestamp_dynfield_desc = {
> +		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
> +		.size = sizeof(rte_mbuf_timestamp_t),
> +		.align = __alignof__(rte_mbuf_timestamp_t),
> +	};
> +	static const struct rte_mbuf_dynflag timestamp_dynflag_desc = {
> +		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
> +	};

I dislike the duplication. If we can't just look up by name,
as is done after ethdev configure (I guess we can't), maybe
ethdev should provide an API to register?



* Re: [dpdk-dev] [PATCH 12/15] app/testpmd: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 12/15] app/testpmd: " Thomas Monjalon
@ 2020-10-29 10:20   ` Andrew Rybchenko
  2020-10-29 10:43     ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Andrew Rybchenko @ 2020-10-29 10:20 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing, Bernard Iremonger

On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> The mbuf timestamp is moved to a dynamic field
> in order to allow removal of the deprecated static field.
> The related mbuf flag is also replaced.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  app/test-pmd/util.c | 39 +++++++++++++++++++++++++++++++++++++--
>  1 file changed, 37 insertions(+), 2 deletions(-)
> 
> diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
> index 781a813759..eebb5166ad 100644
> --- a/app/test-pmd/util.c
> +++ b/app/test-pmd/util.c
> @@ -5,6 +5,7 @@
>  
>  #include <stdio.h>
>  
> +#include <rte_bitops.h>
>  #include <rte_net.h>
>  #include <rte_mbuf.h>
>  #include <rte_ether.h>
> @@ -22,6 +23,40 @@ print_ether_addr(const char *what, const struct rte_ether_addr *eth_addr)
>  	printf("%s%s", what, buf);
>  }
>  
> +static inline bool
> +is_timestamp_enabled(const struct rte_mbuf *mbuf)
> +{
> +	static uint64_t timestamp_rx_dynflag;
> +
> +	int timestamp_rx_dynflag_offset;
> +
> +	if (timestamp_rx_dynflag == 0) {
> +		timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
> +				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);

If the flag is not registered, it will try to look it up on every
call. I'm not sure that is good.

> +		if (timestamp_rx_dynflag_offset < 0)
> +			return false;
> +		timestamp_rx_dynflag = RTE_BIT64(timestamp_rx_dynflag_offset);
> +	}
> +
> +	return (mbuf->ol_flags & timestamp_rx_dynflag) != 0;
> +}
> +
> +static inline rte_mbuf_timestamp_t
> +get_timestamp(const struct rte_mbuf *mbuf)
> +{
> +	static int timestamp_dynfield_offset = -1;
> +
> +	if (timestamp_dynfield_offset < 0) {
> +		timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
> +				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);

same here

> +		if (timestamp_dynfield_offset < 0)
> +			return 0;
> +	}
> +
> +	return *RTE_MBUF_DYNFIELD(mbuf,
> +			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
> +}
> +
>  static inline void
>  dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
>  	      uint16_t nb_pkts, int is_rx)
> @@ -107,8 +142,8 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
>  				printf("hash=0x%x ID=0x%x ",
>  				       mb->hash.fdir.hash, mb->hash.fdir.id);
>  		}
> -		if (ol_flags & PKT_RX_TIMESTAMP)
> -			printf(" - timestamp %"PRIu64" ", mb->timestamp);
> +		if (is_timestamp_enabled(mb))
> +			printf(" - timestamp %"PRIu64" ", get_timestamp(mb));
>  		if (ol_flags & PKT_RX_QINQ)
>  			printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x",
>  			       mb->vlan_tci, mb->vlan_tci_outer);
> 
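
One way to address the repeated-lookup concern raised above is to
remember a failed lookup as well as a successful one. The following is
a sketch under stated assumptions: stub_flag_lookup() is a hypothetical
stand-in for the by-name flag lookup, and struct flag_cache is not part
of testpmd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stub standing in for a by-name flag lookup. */
static int stub_flag_bit = -1; /* -1 means "not registered" */
static unsigned int stub_lookup_calls;

static int
stub_flag_lookup(void)
{
	stub_lookup_calls++;
	return stub_flag_bit;
}

enum lookup_state { LOOKUP_PENDING = 0, LOOKUP_FOUND, LOOKUP_MISSING };

struct flag_cache {
	enum lookup_state state; /* zero-initialized means LOOKUP_PENDING */
	uint64_t mask;
};

/* Test ol_flags against the cached mask; the lookup runs at most once. */
static bool
cached_flag_is_set(struct flag_cache *cache, uint64_t ol_flags)
{
	assert(cache != NULL);

	if (cache->state == LOOKUP_PENDING) {
		int bit = stub_flag_lookup();

		if (bit < 0 || bit > 63) {
			cache->state = LOOKUP_MISSING;
		} else {
			cache->mask = UINT64_C(1) << bit;
			cache->state = LOOKUP_FOUND;
		}
	}
	return cache->state == LOOKUP_FOUND && (ol_flags & cache->mask) != 0;
}
```

The trade-off is that a flag registered after the first miss is never
noticed, so this only fits when registration is known to happen before
the first packet is dumped.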



* Re: [dpdk-dev] [PATCH 13/15] examples/rxtx_callbacks: switch timestamp to dynamic field
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 13/15] examples/rxtx_callbacks: switch timestamp to dynamic field Thomas Monjalon
@ 2020-10-29 10:21   ` Andrew Rybchenko
  2020-10-29 10:44     ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Andrew Rybchenko @ 2020-10-29 10:21 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo, John McNamara

On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> The mbuf timestamp is moved to a dynamic field
> in order to allow removal of the deprecated static field.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  examples/rxtx_callbacks/main.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
> index b9a98ceddc..4798e0962c 100644
> --- a/examples/rxtx_callbacks/main.c
> +++ b/examples/rxtx_callbacks/main.c
> @@ -19,6 +19,10 @@
>  #define MBUF_CACHE_SIZE 250
>  #define BURST_SIZE 32
>  
> +static int hwts_dynfield_offset = -1;
> +#define HWTS_FIELD(mbuf) (*RTE_MBUF_DYNFIELD(mbuf, \
> +		hwts_dynfield_offset, rte_mbuf_timestamp_t *))
> +

Why does the approach differ here? Macro vs. inline function.
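
For comparison, the two accessor styles can be put side by side on a
mock structure. This is a sketch only: struct mock_mbuf, MOCK_TS_FIELD()
and mock_ts_get() are hypothetical names, and the fixed offset of 16 is
an assumption made so the example is self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mock: 16 bytes of fixed fields, then a dynamic area. */
struct mock_mbuf {
	uint64_t fixed[2];
	uint64_t dynfield1[4];
};

typedef uint64_t mock_timestamp_t;

/* Assumed to be set once at init time; 16 is the first dynamic slot. */
static int mock_ts_offset = 16;

/* Macro style: expands to an lvalue, so it can be read and assigned. */
#define MOCK_TS_FIELD(m) \
	(*(mock_timestamp_t *)((uintptr_t)(m) + (uintptr_t)mock_ts_offset))

/* Inline-function style: type-checked argument, read-only access. */
static inline mock_timestamp_t
mock_ts_get(const struct mock_mbuf *m)
{
	assert(mock_ts_offset >= 0);
	return *(const mock_timestamp_t *)
		((uintptr_t)m + (uintptr_t)mock_ts_offset);
}
```

The macro suits code that both writes and reads the field, while the
inline getter suits read-only users such as a packet dump.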


* Re: [dpdk-dev] [PATCH 14/15] mbuf: remove deprecated timestamp field
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 14/15] mbuf: remove deprecated timestamp field Thomas Monjalon
@ 2020-10-29 10:23   ` Andrew Rybchenko
  2020-10-29 18:18     ` Ajit Khaparde
  2020-10-29 14:48   ` Kinsella, Ray
  1 sibling, 1 reply; 170+ messages in thread
From: Andrew Rybchenko @ 2020-10-29 10:23 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo, Ray Kinsella, Neil Horman

On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> As announced in the deprecation note, the field timestamp
> is removed to allow giving more space to the dynamic fields.
> The related offload flag PKT_RX_TIMESTAMP is also removed.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>


* Re: [dpdk-dev] [PATCH 03/15] ethdev: register mbuf field and flags for timestamp
  2020-10-29 10:12     ` Thomas Monjalon
@ 2020-10-29 10:33       ` Andrew Rybchenko
  2020-10-29 10:46         ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Andrew Rybchenko @ 2020-10-29 10:33 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Matan Azrad, Shahaf Shuler

On 10/29/20 1:12 PM, Thomas Monjalon wrote:
> 29/10/2020 11:08, Andrew Rybchenko:
>> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
>>> During port configure or queue setup, the offload flags
>>> DEV_RX_OFFLOAD_TIMESTAMP and DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP
>>> trigger the registration of the related mbuf field and flags.
>>>
>>> Previously, the Tx timestamp field and flag were registered in testpmd,
>>> as described in mlx5 guide.
>>> For the general usage of Rx and Tx timestamps,
>>> managing registrations inside ethdev is simpler and properly documented.
>>>
>>> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
>>
>> A small note below, other than that
>>
>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>
>>> +static inline int
>>> +eth_dev_timestamp_mbuf_register(uint64_t rx_offloads, uint64_t tx_offloads)
>>> +{
>>> +	static const struct rte_mbuf_dynfield field_desc = {
>>> +		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
>>> +		.size = sizeof(rte_mbuf_timestamp_t),
>>> +		.align = __alignof__(rte_mbuf_timestamp_t),
>>> +	};
>>> +	static const struct rte_mbuf_dynflag rx_flag_desc = {
>>> +		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
>>> +	};
>>> +	static const struct rte_mbuf_dynflag tx_flag_desc = {
>>> +		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
>>> +	};
>>> +	static bool done_rx, done_tx;
>>
>> I think we don't need these static flags. We can just repeat
>> the registration request and it will simply look up and return
>> the same offset/flag bit as before.
> 
> Absolutely.
> I did it as a small optimization in control path.
> 
> I hesitated. Given it is only 2 booleans,
> do you prefer with or without or no opinion?
> 
I'd prefer it without them. It is always better to avoid
static variables if possible.
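
To illustrate the point about repeated registration, here is a minimal
sketch with a mock name-keyed registry. It is not the DPDK code:
demo_field_register() and every other demo_* name are hypothetical, and
the sketch only assumes what is said above, namely that a repeated
request returns the same offset as before.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a name-keyed registry; not the DPDK API. */
#define DEMO_MAX_FIELDS 8
#define DEMO_NAME_LEN 32

struct demo_field {
	char name[DEMO_NAME_LEN];
	int offset;
};

static struct demo_field demo_fields[DEMO_MAX_FIELDS];
static int demo_nb_fields;
static int demo_next_offset = 96; /* arbitrary first free byte */

/* Return the offset of 'name', allocating it only on the first request. */
static int
demo_field_register(const char *name, int size)
{
	int i;

	for (i = 0; i < demo_nb_fields; i++) {
		if (strcmp(demo_fields[i].name, name) == 0)
			return demo_fields[i].offset; /* already registered */
	}
	if (demo_nb_fields == DEMO_MAX_FIELDS || strlen(name) >= DEMO_NAME_LEN)
		return -1; /* registry full or name too long */

	strcpy(demo_fields[demo_nb_fields].name, name);
	demo_fields[demo_nb_fields].offset = demo_next_offset;
	demo_next_offset += size;
	return demo_fields[demo_nb_fields++].offset;
}
```

With such a registry, the caller needs no "done" flag of its own: it can
call the register function on every configure and gets the same offset.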

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 04/15] latency: switch timestamp to dynamic mbuf field
  2020-10-29 10:13   ` Andrew Rybchenko
@ 2020-10-29 10:40     ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29 10:40 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, jerinj, viacheslavo, Reshma Pattan

29/10/2020 11:13, Andrew Rybchenko:
> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > The mbuf timestamp is moved to a dynamic field
> > in order to allow removal of the deprecated static field.
> > The related mbuf flag is also replaced with the dynamic one.
> > 
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> 
> [snip]
> 
> > diff --git a/lib/librte_latencystats/rte_latencystats.c b/lib/librte_latencystats/rte_latencystats.c
> > index ba2fff3bcb..a21f6239d9 100644
> > --- a/lib/librte_latencystats/rte_latencystats.c
> > +++ b/lib/librte_latencystats/rte_latencystats.c
> 
> [snip]
> 
> > @@ -204,6 +216,14 @@ int
> >  rte_latencystats_init(uint64_t app_samp_intvl,
> >  		rte_latency_stats_flow_type_fn user_cb)
> >  {
> > +	static const struct rte_mbuf_dynfield timestamp_dynfield_desc = {
> > +		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
> > +		.size = sizeof(rte_mbuf_timestamp_t),
> > +		.align = __alignof__(rte_mbuf_timestamp_t),
> > +	};
> > +	static const struct rte_mbuf_dynflag timestamp_dynflag_desc = {
> > +		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
> > +	};
> 
> I dislike the duplication. If we can't just lookup by name
> which is done after ethdev configure (I guess so), may be
> ethdev should provide an API to register?

That's because it is a separate library.
We don't know whether the timestamp feature is already enabled.
We have the port_id, so we could do something.
But the current behaviour is to use the timestamp even if it is disabled
at ethdev level. And I don't want to change the behaviour.

Maybe the right solution is to register a separate field for this lib.
Anyway, the time base is not the same.



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 12/15] app/testpmd: switch timestamp to dynamic mbuf field
  2020-10-29 10:20   ` Andrew Rybchenko
@ 2020-10-29 10:43     ` Thomas Monjalon
  2020-10-29 10:52       ` Andrew Rybchenko
  0 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29 10:43 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger

29/10/2020 11:20, Andrew Rybchenko:
> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > The mbuf timestamp is moved to a dynamic field
> > in order to allow removal of the deprecated static field.
> > The related mbuf flag is also replaced.
> > 
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
> > --- a/app/test-pmd/util.c
> > +++ b/app/test-pmd/util.c
> > +static inline bool
> > +is_timestamp_enabled(const struct rte_mbuf *mbuf)
> > +{
> > +	static uint64_t timestamp_rx_dynflag;
> > +
> > +	int timestamp_rx_dynflag_offset;
> > +
> > +	if (timestamp_rx_dynflag == 0) {
> > +		timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
> > +				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
> 
> If the flag is not registered, it will try to lookup on every
> call. I'm not sure that it is good.

I don't see the problem.
It is a dump in a test application.
The idea is to have a fresh dump reflecting whatever was updated recently.
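
As a sketch of the behaviour discussed here (lookup retried on every call
until the flag exists, then cached), assuming a mock lookup function:
demo_flag_lookup(), demo_registered_bit and demo_is_timestamp_enabled()
are hypothetical names, not testpmd or DPDK symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical registry state: bit index of the flag, or -1 if absent. */
static int demo_registered_bit = -1;
static unsigned int demo_lookup_calls;

/* Mock lookup: counts calls so the retry behaviour is observable. */
static int
demo_flag_lookup(void)
{
	demo_lookup_calls++;
	return demo_registered_bit;
}

static bool
demo_is_timestamp_enabled(uint64_t ol_flags)
{
	static uint64_t flag_mask; /* 0 means "not resolved yet" */

	if (flag_mask == 0) {
		int bit = demo_flag_lookup();

		if (bit < 0)
			return false; /* not registered: retry on next call */
		flag_mask = UINT64_C(1) << bit;
	}
	return (ol_flags & flag_mask) != 0;
}
```

The cost of the repeated lookup is paid only while the flag is missing;
once it is found, the mask is cached and no further lookup happens.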




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 13/15] examples/rxtx_callbacks: switch timestamp to dynamic field
  2020-10-29 10:21   ` Andrew Rybchenko
@ 2020-10-29 10:44     ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29 10:44 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, jerinj, viacheslavo, John McNamara

29/10/2020 11:21, Andrew Rybchenko:
> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > The mbuf timestamp is moved to a dynamic field
> > in order to allow removal of the deprecated static field.
> > 
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
> >  examples/rxtx_callbacks/main.c | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
> > index b9a98ceddc..4798e0962c 100644
> > --- a/examples/rxtx_callbacks/main.c
> > +++ b/examples/rxtx_callbacks/main.c
> > @@ -19,6 +19,10 @@
> >  #define MBUF_CACHE_SIZE 250
> >  #define BURST_SIZE 32
> >  
> > +static int hwts_dynfield_offset = -1;
> > +#define HWTS_FIELD(mbuf) (*RTE_MBUF_DYNFIELD(mbuf, \
> > +		hwts_dynfield_offset, rte_mbuf_timestamp_t *))
> > +
> 
> Why does the approach differ here? Macro vs inline function.

Because it is a self-contained file,
and there is already a macro for another field.

If you really want a function, I could do it for both fields.
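
For comparison, a minimal sketch of the two accessor styles on a mock
buffer. struct demo_mbuf, DEMO_TS_FIELD() and demo_ts_field() are
hypothetical; the pointer arithmetic only mimics the usual
"base address plus offset" pattern of a dynamic field accessor.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature buffer; only the layout idea matters here. */
struct demo_mbuf {
	uint64_t hdr;
	uint64_t dyn[4];
};

typedef uint64_t demo_timestamp_t;

/* Offset resolved at init time in real code; fixed here for the demo. */
static int demo_ts_offset = offsetof(struct demo_mbuf, dyn) + 8;

/* Macro form: expands to an lvalue, like HWTS_FIELD() in the patch. */
#define DEMO_TS_FIELD(m) \
	(*(demo_timestamp_t *)((uintptr_t)(m) + demo_ts_offset))

/* Function form: returns a typed pointer and checks the argument type. */
static inline demo_timestamp_t *
demo_ts_field(struct demo_mbuf *m)
{
	return (demo_timestamp_t *)((uintptr_t)m + demo_ts_offset);
}
```

Both forms address the same bytes; the function form mainly adds type
checking of the mbuf argument.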



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 03/15] ethdev: register mbuf field and flags for timestamp
  2020-10-29 10:33       ` Andrew Rybchenko
@ 2020-10-29 10:46         ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29 10:46 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Matan Azrad, Shahaf Shuler

29/10/2020 11:33, Andrew Rybchenko:
> On 10/29/20 1:12 PM, Thomas Monjalon wrote:
> > 29/10/2020 11:08, Andrew Rybchenko:
> >> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> >>> During port configure or queue setup, the offload flags
> >>> DEV_RX_OFFLOAD_TIMESTAMP and DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP
> >>> trigger the registration of the related mbuf field and flags.
> >>>
> >>> Previously, the Tx timestamp field and flag were registered in testpmd,
> >>> as described in mlx5 guide.
> >>> For the general usage of Rx and Tx timestamps,
> >>> managing registrations inside ethdev is simpler and properly documented.
> >>>
> >>> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> >>
> >> A small note below, other than that
> >>
> >> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >>
> >>> +static inline int
> >>> +eth_dev_timestamp_mbuf_register(uint64_t rx_offloads, uint64_t tx_offloads)
> >>> +{
> >>> +	static const struct rte_mbuf_dynfield field_desc = {
> >>> +		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
> >>> +		.size = sizeof(rte_mbuf_timestamp_t),
> >>> +		.align = __alignof__(rte_mbuf_timestamp_t),
> >>> +	};
> >>> +	static const struct rte_mbuf_dynflag rx_flag_desc = {
> >>> +		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
> >>> +	};
> >>> +	static const struct rte_mbuf_dynflag tx_flag_desc = {
> >>> +		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
> >>> +	};
> >>> +	static bool done_rx, done_tx;
> >>
> >> I think we don't need these static flags. We can just repeat
> >> the registration request and it will simply look up and return
> >> the same offset/flag bit as before.
> > 
> > Absolutely.
> > I did it as a small optimization in control path.
> > 
> > I hesitated. Given it is only 2 booleans,
> > do you prefer with or without or no opinion?
> > 
> I'd prefer without it. It is always better without
> static variables if possible.

I liked the naming of variables "todo" and "done"
but I will do what is preferred.
If nobody objects, I will remove this small (useless) optimization.




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half Thomas Monjalon
@ 2020-10-29 10:50   ` Andrew Rybchenko
  2020-10-29 10:56     ` Thomas Monjalon
  2020-10-29 14:42   ` [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half Kinsella, Ray
  1 sibling, 1 reply; 170+ messages in thread
From: Andrew Rybchenko @ 2020-10-29 10:50 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo, Ray Kinsella, Neil Horman

On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> The mempool pointer in the mbuf struct is moved
> from the second to the first half.
> It should increase performance on most systems having 64-byte cache line,
> i.e. mbuf is split in two cache lines.
> On such system, the first half (also called first cache line) is hotter
> than the second one where the pool pointer was.
> 
> Moving this field gives more space to dynfield1.
> 
> This is how the mbuf layout looks like (pahole-style):
> 
> word  type                              name                byte  size
>  0    void *                            buf_addr;         /*   0 +  8 */
>  1    rte_iova_t                        buf_iova          /*   8 +  8 */
>       /* --- RTE_MARKER64               rearm_data;                   */
>  2    uint16_t                          data_off;         /*  16 +  2 */
>       uint16_t                          refcnt;           /*  18 +  2 */
>       uint16_t                          nb_segs;          /*  20 +  2 */
>       uint16_t                          port;             /*  22 +  2 */
>  3    uint64_t                          ol_flags;         /*  24 +  8 */
>       /* --- RTE_MARKER                 rx_descriptor_fields1;        */
>  4    uint32_t             union        packet_type;      /*  32 +  4 */
>       uint32_t                          pkt_len;          /*  36 +  4 */
>  5    uint16_t                          data_len;         /*  40 +  2 */
>       uint16_t                          vlan_tci;         /*  42 +  2 */
>  5.5  uint64_t             union        hash;             /*  44 +  8 */
>  6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
>       uint16_t                          buf_len;          /*  54 +  2 */
>  7    struct rte_mempool *              pool;             /*  56 +  8 */
>       /* --- RTE_MARKER                 cacheline1;                   */
>  8    struct rte_mbuf *                 next;             /*  64 +  8 */
>  9    uint64_t             union        tx_offload;       /*  72 +  8 */
> 10    uint16_t                          priv_size;        /*  80 +  2 */
>       uint16_t                          timesync;         /*  82 +  2 */
>       uint32_t                          seqn;             /*  84 +  4 */
> 11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
> 12    uint64_t                          dynfield1[4];     /*  96 + 32 */
> 16    /* --- END                                             128      */
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

I'd like to understand why pool is chosen instead of, for
example, the next pointer.

Pool is used for housekeeping, when the driver refills the Rx ring
or frees completed Tx mbufs. Free thresholds try to avoid doing
that on every Rx/Tx burst (if possible).

Next is used for multi-segment Tx and scattered (and buffer
split) Rx. IMHO the key question here is whether we consider
these use cases common and a priority to optimize. If yes, I'd
vote to have next on the first cacheline.

I'm not sure. Just trying to hear a bit more about it.
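
To make the cache line split concrete, here is a sketch that rebuilds the
quoted layout as a mock struct and computes which 64-byte line each field
falls in. This is not struct rte_mbuf: pointers are modeled as uint64_t
and unions as plain integers (both assumptions), and all demo_* names
are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64

/* Mock of the layout in the quoted table; offsets in bytes. */
struct demo_mbuf_layout {
	uint64_t buf_addr;       /*  0 */
	uint64_t buf_iova;       /*  8 */
	uint16_t data_off;       /* 16 */
	uint16_t refcnt;         /* 18 */
	uint16_t nb_segs;        /* 20 */
	uint16_t port;           /* 22 */
	uint64_t ol_flags;       /* 24 */
	uint32_t packet_type;    /* 32 */
	uint32_t pkt_len;        /* 36 */
	uint16_t data_len;       /* 40 */
	uint16_t vlan_tci;       /* 42 */
	uint32_t hash[2];        /* 44: 8 bytes, 4-byte aligned */
	uint16_t vlan_tci_outer; /* 52 */
	uint16_t buf_len;        /* 54 */
	uint64_t pool;           /* 56: last word of the first line */
	uint64_t next;           /* 64: first word of the second line */
	uint64_t tx_offload;     /* 72 */
	uint16_t priv_size;      /* 80 */
	uint16_t timesync;       /* 82 */
	uint32_t seqn;           /* 84 */
	uint64_t shinfo;         /* 88 */
	uint64_t dynfield1[4];   /* 96 */
};

/* Index of the cache line that holds the byte at 'offset'. */
static inline size_t
demo_cacheline_of(size_t offset)
{
	return offset / DEMO_CACHE_LINE;
}
```

In this layout only one 8-byte slot is freed in the first line, so
promoting next instead of pool would mean swapping the two fields.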

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 12/15] app/testpmd: switch timestamp to dynamic mbuf field
  2020-10-29 10:43     ` Thomas Monjalon
@ 2020-10-29 10:52       ` Andrew Rybchenko
  0 siblings, 0 replies; 170+ messages in thread
From: Andrew Rybchenko @ 2020-10-29 10:52 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger

On 10/29/20 1:43 PM, Thomas Monjalon wrote:
> 29/10/2020 11:20, Andrew Rybchenko:
>> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
>>> The mbuf timestamp is moved to a dynamic field
>>> in order to allow removal of the deprecated static field.
>>> The related mbuf flag is also replaced.
>>>
>>> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
>>> ---
>>> --- a/app/test-pmd/util.c
>>> +++ b/app/test-pmd/util.c
>>> +static inline bool
>>> +is_timestamp_enabled(const struct rte_mbuf *mbuf)
>>> +{
>>> +	static uint64_t timestamp_rx_dynflag;
>>> +
>>> +	int timestamp_rx_dynflag_offset;
>>> +
>>> +	if (timestamp_rx_dynflag == 0) {
>>> +		timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
>>> +				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
>>
>> If the flag is not registered, it will try to lookup on every
>> call. I'm not sure that it is good.
> 
> I don't see the problem.
> It is a dump in a test application.
> The idea is to have a fresh dump whatever was updated recently.

OK, makes sense.



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-10-29 10:50   ` Andrew Rybchenko
@ 2020-10-29 10:56     ` Thomas Monjalon
  2020-10-29 14:15       ` Ananyev, Konstantin
  0 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29 10:56 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, jerinj, viacheslavo, ajit.khaparde,
	konstantin.ananyev, honnappa.nagarahalli, maxime.coquelin,
	stephen, hemant.agrawal

29/10/2020 11:50, Andrew Rybchenko:
> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > The mempool pointer in the mbuf struct is moved
> > from the second to the first half.
> > It should increase performance on most systems having 64-byte cache line,
> > i.e. mbuf is split in two cache lines.
> > On such system, the first half (also called first cache line) is hotter
> > than the second one where the pool pointer was.
> > 
> > Moving this field gives more space to dynfield1.
> > 
> > This is how the mbuf layout looks like (pahole-style):
> > 
> > word  type                              name                byte  size
> >  0    void *                            buf_addr;         /*   0 +  8 */
> >  1    rte_iova_t                        buf_iova          /*   8 +  8 */
> >       /* --- RTE_MARKER64               rearm_data;                   */
> >  2    uint16_t                          data_off;         /*  16 +  2 */
> >       uint16_t                          refcnt;           /*  18 +  2 */
> >       uint16_t                          nb_segs;          /*  20 +  2 */
> >       uint16_t                          port;             /*  22 +  2 */
> >  3    uint64_t                          ol_flags;         /*  24 +  8 */
> >       /* --- RTE_MARKER                 rx_descriptor_fields1;        */
> >  4    uint32_t             union        packet_type;      /*  32 +  4 */
> >       uint32_t                          pkt_len;          /*  36 +  4 */
> >  5    uint16_t                          data_len;         /*  40 +  2 */
> >       uint16_t                          vlan_tci;         /*  42 +  2 */
> >  5.5  uint64_t             union        hash;             /*  44 +  8 */
> >  6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
> >       uint16_t                          buf_len;          /*  54 +  2 */
> >  7    struct rte_mempool *              pool;             /*  56 +  8 */
> >       /* --- RTE_MARKER                 cacheline1;                   */
> >  8    struct rte_mbuf *                 next;             /*  64 +  8 */
> >  9    uint64_t             union        tx_offload;       /*  72 +  8 */
> > 10    uint16_t                          priv_size;        /*  80 +  2 */
> >       uint16_t                          timesync;         /*  82 +  2 */
> >       uint32_t                          seqn;             /*  84 +  4 */
> > 11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
> > 12    uint64_t                          dynfield1[4];     /*  96 + 32 */
> > 16    /* --- END                                             128      */
> > 
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> 
> I'd like to understand why pool is chosen instead of, for
> example, next pointer.
> 
> Pool is used on housekeeping when driver refills Rx ring or
> free completed Tx mbufs. Free thresholds try to avoid it on
> every Rx/Tx burst (if possible).
> 
> Next is used for multi-segment Tx and scattered (and buffer
> split) Rx. IMHO the key question here is we consider these
> use cases as common and priority to optimize. If yes, I'd
> vote to have next on the first cacheline.
> 
> I'm not sure. Just trying to hear a bit more about it.

That's a good question.
Clearly pool and next are good options.
The best would be to have some benchmarks.
If one use case shows no benefit, the decision is easier.

If you prefer, we can leave this last patch for -rc3.



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 10/15] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 10/15] net/octeontx2: " Thomas Monjalon
@ 2020-10-29 11:02   ` Andrew Rybchenko
  2020-10-29 11:34     ` Thomas Monjalon
  2020-10-29 11:52     ` Slava Ovsiienko
  0 siblings, 2 replies; 170+ messages in thread
From: Andrew Rybchenko @ 2020-10-29 11:02 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo, Nithin Dabilpuram, Kiran Kumar K,
	Ray Kinsella, Neil Horman

On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> The mbuf timestamp is moved to a dynamic field
> in order to allow removal of the deprecated static field.
> The related mbuf flag is also replaced.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  drivers/net/octeontx2/otx2_ethdev.c | 33 +++++++++++++++++++++++++++++
>  drivers/net/octeontx2/otx2_rx.h     | 19 ++++++++++++++---
>  drivers/net/octeontx2/version.map   |  7 ++++++
>  3 files changed, 56 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
> index cfb733a4b5..ad95219438 100644
> --- a/drivers/net/octeontx2/otx2_ethdev.c
> +++ b/drivers/net/octeontx2/otx2_ethdev.c
> @@ -4,6 +4,7 @@
>  
>  #include <inttypes.h>
>  
> +#include <rte_bitops.h>
>  #include <rte_ethdev_pci.h>
>  #include <rte_io.h>
>  #include <rte_malloc.h>
> @@ -14,6 +15,35 @@
>  #include "otx2_ethdev.h"
>  #include "otx2_ethdev_sec.h"
>  
> +uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
> +int rte_pmd_octeontx2_timestamp_dynfield_offset = -1;
> +
> +static int
> +otx2_rx_timestamp_setup(uint16_t flags)
> +{
> +	int timestamp_rx_dynflag_offset;
> +
> +	if ((flags & NIX_RX_OFFLOAD_TSTAMP_F) == 0)
> +		return 0;
> +
> +	rte_pmd_octeontx2_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
> +			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
> +	if (rte_pmd_octeontx2_timestamp_dynfield_offset < 0) {
> +		otx2_err("Failed to lookup timestamp field");
> +		return -rte_errno;
> +	}
> +	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
> +			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
> +	if (timestamp_rx_dynflag_offset < 0) {
> +		otx2_err("Failed to lookup Rx timestamp flag");
> +		return -rte_errno;
> +	}
> +	rte_pmd_octeontx2_timestamp_rx_dynflag =
> +			RTE_BIT64(timestamp_rx_dynflag_offset);
> +
> +	return 0;
> +}
> +
>  static inline uint64_t
>  nix_get_rx_offload_capa(struct otx2_eth_dev *dev)
>  {
> @@ -1874,6 +1904,9 @@ otx2_nix_configure(struct rte_eth_dev *eth_dev)
>  	dev->tx_offload_flags |= nix_tx_offload_flags(eth_dev);
>  	dev->rss_info.rss_grps = NIX_RSS_GRPS;
>  
> +	if (otx2_rx_timestamp_setup(dev->rx_offload_flags) != 0)
> +		goto fail_offloads;
> +
>  	nb_rxq = RTE_MAX(data->nb_rx_queues, 1);
>  	nb_txq = RTE_MAX(data->nb_tx_queues, 1);
>  
> diff --git a/drivers/net/octeontx2/otx2_rx.h b/drivers/net/octeontx2/otx2_rx.h
> index 61a5c436dd..6981edce82 100644
> --- a/drivers/net/octeontx2/otx2_rx.h
> +++ b/drivers/net/octeontx2/otx2_rx.h
> @@ -63,6 +63,18 @@ union mbuf_initializer {
>  	uint64_t value;
>  };
>  
> +/* variables are exported because this file is included in other drivers */
> +extern uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
> +extern int rte_pmd_octeontx2_timestamp_dynfield_offset;
> +
> +static inline rte_mbuf_timestamp_t *
> +otx2_timestamp_dynfield(struct rte_mbuf *mbuf)
> +{
> +	return RTE_MBUF_DYNFIELD(mbuf,
> +		rte_pmd_octeontx2_timestamp_dynfield_offset,
> +		rte_mbuf_timestamp_t *);
> +}
> +

Maybe ethdev should provide the inline function?


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 10/15] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-10-29 11:02   ` Andrew Rybchenko
@ 2020-10-29 11:34     ` Thomas Monjalon
  2020-10-29 11:37       ` Andrew Rybchenko
  2020-10-29 11:52     ` Slava Ovsiienko
  1 sibling, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29 11:34 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, jerinj, viacheslavo, Nithin Dabilpuram,
	Kiran Kumar K, Ray Kinsella, Neil Horman

29/10/2020 12:02, Andrew Rybchenko:
> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > +/* variables are exported because this file is included in other drivers */
> > +extern uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
> > +extern int rte_pmd_octeontx2_timestamp_dynfield_offset;
> > +
> > +static inline rte_mbuf_timestamp_t *
> > +otx2_timestamp_dynfield(struct rte_mbuf *mbuf)
> > +{
> > +	return RTE_MBUF_DYNFIELD(mbuf,
> > +		rte_pmd_octeontx2_timestamp_dynfield_offset,
> > +		rte_mbuf_timestamp_t *);
> > +}
> > +
> 
> May be ethdev should provide the inline function?

It would mean exporting the offsets.

And actually I think this field should not be restricted to ethdev.



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 10/15] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-10-29 11:34     ` Thomas Monjalon
@ 2020-10-29 11:37       ` Andrew Rybchenko
  0 siblings, 0 replies; 170+ messages in thread
From: Andrew Rybchenko @ 2020-10-29 11:37 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, jerinj, viacheslavo, Nithin Dabilpuram,
	Kiran Kumar K, Ray Kinsella, Neil Horman

On 10/29/20 2:34 PM, Thomas Monjalon wrote:
> 29/10/2020 12:02, Andrew Rybchenko:
>> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
>>> +/* variables are exported because this file is included in other drivers */
>>> +extern uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
>>> +extern int rte_pmd_octeontx2_timestamp_dynfield_offset;
>>> +
>>> +static inline rte_mbuf_timestamp_t *
>>> +otx2_timestamp_dynfield(struct rte_mbuf *mbuf)
>>> +{
>>> +	return RTE_MBUF_DYNFIELD(mbuf,
>>> +		rte_pmd_octeontx2_timestamp_dynfield_offset,
>>> +		rte_mbuf_timestamp_t *);
>>> +}
>>> +
>>
>> May be ethdev should provide the inline function?
> 
> It would mean exporting the offsets.
> 
> And actually I think this field should not be restricted to ethdev.
> 

I see, OK.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 10/15] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-10-29 11:02   ` Andrew Rybchenko
  2020-10-29 11:34     ` Thomas Monjalon
@ 2020-10-29 11:52     ` Slava Ovsiienko
  2020-10-30 12:41       ` Jerin Jacob
  1 sibling, 1 reply; 170+ messages in thread
From: Slava Ovsiienko @ 2020-10-29 11:52 UTC (permalink / raw)
  To: Andrew Rybchenko, NBU-Contact-Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, Nithin Dabilpuram, Kiran Kumar K, Ray Kinsella,
	Neil Horman

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Thursday, October 29, 2020 13:02
> To: NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org
> Cc: ferruh.yigit@intel.com; david.marchand@redhat.com;
> bruce.richardson@intel.com; olivier.matz@6wind.com; jerinj@marvell.com;
> Slava Ovsiienko <viacheslavo@nvidia.com>; Nithin Dabilpuram
> <ndabilpuram@marvell.com>; Kiran Kumar K <kirankumark@marvell.com>;
> Ray Kinsella <mdr@ashroe.eu>; Neil Horman <nhorman@tuxdriver.com>
> Subject: Re: [PATCH 10/15] net/octeontx2: switch timestamp to dynamic mbuf
> field
> 
> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > The mbuf timestamp is moved to a dynamic field in order to allow
> > removal of the deprecated static field.
> > The related mbuf flag is also replaced.
> >
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
> >  drivers/net/octeontx2/otx2_ethdev.c | 33 +++++++++++++++++++++++++++++
> >  drivers/net/octeontx2/otx2_rx.h     | 19 ++++++++++++++---
> >  drivers/net/octeontx2/version.map   |  7 ++++++
> >  3 files changed, 56 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
> > index cfb733a4b5..ad95219438 100644
> > --- a/drivers/net/octeontx2/otx2_ethdev.c
> > +++ b/drivers/net/octeontx2/otx2_ethdev.c
> > @@ -4,6 +4,7 @@
> >
> >  #include <inttypes.h>
> >
> > +#include <rte_bitops.h>
> >  #include <rte_ethdev_pci.h>
> >  #include <rte_io.h>
> >  #include <rte_malloc.h>
> > @@ -14,6 +15,35 @@
> >  #include "otx2_ethdev.h"
> >  #include "otx2_ethdev_sec.h"
> >
> > +uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
> > +int rte_pmd_octeontx2_timestamp_dynfield_offset = -1;
> > +
> > +static int
> > +otx2_rx_timestamp_setup(uint16_t flags)
> > +{
> > +	int timestamp_rx_dynflag_offset;
> > +
> > +	if ((flags & NIX_RX_OFFLOAD_TSTAMP_F) == 0)
> > +		return 0;
> > +
> > +	rte_pmd_octeontx2_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
> > +			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
> > +	if (rte_pmd_octeontx2_timestamp_dynfield_offset < 0) {
> > +		otx2_err("Failed to lookup timestamp field");
> > +		return -rte_errno;
> > +	}
> > +	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
> > +			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
> > +	if (timestamp_rx_dynflag_offset < 0) {
> > +		otx2_err("Failed to lookup Rx timestamp flag");
> > +		return -rte_errno;
> > +	}
> > +	rte_pmd_octeontx2_timestamp_rx_dynflag =
> > +			RTE_BIT64(timestamp_rx_dynflag_offset);
> > +
> > +	return 0;
> > +}
> > +
> >  static inline uint64_t
> >  nix_get_rx_offload_capa(struct otx2_eth_dev *dev)
> >  {
> > @@ -1874,6 +1904,9 @@ otx2_nix_configure(struct rte_eth_dev *eth_dev)
> >  	dev->tx_offload_flags |= nix_tx_offload_flags(eth_dev);
> >  	dev->rss_info.rss_grps = NIX_RSS_GRPS;
> >
> > +	if (otx2_rx_timestamp_setup(dev->rx_offload_flags) != 0)
> > +		goto fail_offloads;
> > +
> >  	nb_rxq = RTE_MAX(data->nb_rx_queues, 1);
> >  	nb_txq = RTE_MAX(data->nb_tx_queues, 1);
> >
> > diff --git a/drivers/net/octeontx2/otx2_rx.h b/drivers/net/octeontx2/otx2_rx.h
> > index 61a5c436dd..6981edce82 100644
> > --- a/drivers/net/octeontx2/otx2_rx.h
> > +++ b/drivers/net/octeontx2/otx2_rx.h
> > @@ -63,6 +63,18 @@ union mbuf_initializer {
> >  	uint64_t value;
> >  };
> >
> > +/* variables are exported because this file is included in other drivers */
> > +extern uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
> > +extern int rte_pmd_octeontx2_timestamp_dynfield_offset;
> > +
> > +static inline rte_mbuf_timestamp_t *
> > +otx2_timestamp_dynfield(struct rte_mbuf *mbuf)
> > +{
> > +	return RTE_MBUF_DYNFIELD(mbuf,
> > +		rte_pmd_octeontx2_timestamp_dynfield_offset,
> > +		rte_mbuf_timestamp_t *);
> > +}
> > +
> 
> May be ethdev should provide the inline function?

Just my five cents: exporting the offset (making it global) might have
a side effect on performance.
The offset might be located in memory that shares a cache line with other variables.
If those variables are writable and updated frequently, we might get cache contention.
I'd prefer to keep all dynamic offsets in the PMD and fully control their
memory allocation attributes. Hence, an exported inline function is possible,
but its practical use in, say, the datapath is questionable.

With best regards, Slava
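
One possible way to act on this concern, sketched under assumptions: the
PMD copies the global offset and flag into its own per-queue structure at
setup time and keeps them on a cache line that the datapath never writes.
struct demo_rxq and demo_rxq_setup() are hypothetical and do not come
from any driver; a 64-byte cache line is assumed.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical per-queue datapath context, not a real driver structure. */
struct demo_rxq {
	/* Read-mostly fields, written once at queue setup. */
	alignas(DEMO_CACHE_LINE) int ts_offset;
	uint64_t ts_flag;
	/* Frequently written counters start on their own cache line. */
	alignas(DEMO_CACHE_LINE) uint64_t rx_pkts;
};

/* Copy the globally resolved values into the PMD-owned context. */
static void
demo_rxq_setup(struct demo_rxq *q, int global_offset, uint64_t global_flag)
{
	q->ts_offset = global_offset;
	q->ts_flag = global_flag;
	q->rx_pkts = 0;
}
```

The datapath then reads q->ts_offset only, so its placement no longer
depends on where the linker puts the exported global.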


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-10-29 10:56     ` Thomas Monjalon
@ 2020-10-29 14:15       ` Ananyev, Konstantin
  2020-10-29 18:45         ` Ajit Khaparde
  0 siblings, 1 reply; 170+ messages in thread
From: Ananyev, Konstantin @ 2020-10-29 14:15 UTC (permalink / raw)
  To: Thomas Monjalon, Andrew Rybchenko
  Cc: dev, Yigit, Ferruh, david.marchand, Richardson, Bruce,
	olivier.matz, jerinj, viacheslavo, ajit.khaparde,
	honnappa.nagarahalli, maxime.coquelin, stephen, hemant.agrawal



> 
> 29/10/2020 11:50, Andrew Rybchenko:
> > On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > > The mempool pointer in the mbuf struct is moved
> > > from the second to the first half.
> > > It should increase performance on most systems having 64-byte cache line,
> > > i.e. mbuf is split in two cache lines.
> > > On such system, the first half (also called first cache line) is hotter
> > > than the second one where the pool pointer was.
> > >
> > > Moving this field gives more space to dynfield1.
> > >
> > > This is how the mbuf layout looks like (pahole-style):
> > >
> > > word  type                              name                byte  size
> > >  0    void *                            buf_addr;         /*   0 +  8 */
> > >  1    rte_iova_t                        buf_iova          /*   8 +  8 */
> > >       /* --- RTE_MARKER64               rearm_data;                   */
> > >  2    uint16_t                          data_off;         /*  16 +  2 */
> > >       uint16_t                          refcnt;           /*  18 +  2 */
> > >       uint16_t                          nb_segs;          /*  20 +  2 */
> > >       uint16_t                          port;             /*  22 +  2 */
> > >  3    uint64_t                          ol_flags;         /*  24 +  8 */
> > >       /* --- RTE_MARKER                 rx_descriptor_fields1;        */
> > >  4    uint32_t             union        packet_type;      /*  32 +  4 */
> > >       uint32_t                          pkt_len;          /*  36 +  4 */
> > >  5    uint16_t                          data_len;         /*  40 +  2 */
> > >       uint16_t                          vlan_tci;         /*  42 +  2 */
> > >  5.5  uint64_t             union        hash;             /*  44 +  8 */
> > >  6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
> > >       uint16_t                          buf_len;          /*  54 +  2 */
> > >  7    struct rte_mempool *              pool;             /*  56 +  8 */
> > >       /* --- RTE_MARKER                 cacheline1;                   */
> > >  8    struct rte_mbuf *                 next;             /*  64 +  8 */
> > >  9    uint64_t             union        tx_offload;       /*  72 +  8 */
> > > 10    uint16_t                          priv_size;        /*  80 +  2 */
> > >       uint16_t                          timesync;         /*  82 +  2 */
> > >       uint32_t                          seqn;             /*  84 +  4 */
> > > 11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
> > > 12    uint64_t                          dynfield1[4];     /*  96 + 32 */
> > > 16    /* --- END                                             128      */
> > >
> > > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> >
> > I'd like to understand why pool is chosen instead of, for
> > example, next pointer.
> >
> > Pool is used on housekeeping when driver refills Rx ring or
> > free completed Tx mbufs. Free thresholds try to avoid it on
> > every Rx/Tx burst (if possible).
> >
> > Next is used for multi-segment Tx and scattered (and buffer
> > split) Rx. IMHO the key question here is we consider these
> > use cases as common and priority to optimize. If yes, I'd
> > vote to have next on the first cacheline.

Between these two I also would probably lean towards *next*
(after all _free_ also has to access/update next).
As another alternative to consider: tx_offload.
It is also used quite widely. 

> >
> > I'm not sure. Just trying to hear a bit more about it.
> .
> That's a good question.
> Clearly pool and next are good options.
> The best would be to have some benchmarks.
> If one use case shows no benefit, the decision is easier.
> 
> If you prefer, we can leave this last patch for -rc3.
> 


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 04/15] latency: switch timestamp to dynamic mbuf field
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 04/15] latency: switch timestamp to dynamic mbuf field Thomas Monjalon
  2020-10-29 10:13   ` Andrew Rybchenko
@ 2020-10-29 14:20   ` Pattan, Reshma
  2020-10-29 16:15     ` Thomas Monjalon
  1 sibling, 1 reply; 170+ messages in thread
From: Pattan, Reshma @ 2020-10-29 14:20 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: Yigit, Ferruh, david.marchand, Richardson, Bruce, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo



> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>


<snip>

> 	rte_mbuf_dynflag_register(&timestamp_dynflag_desc);
> +	if (timestamp_dynflag_offset < 0) {
> +		RTE_LOG(ERR, LATENCY_STATS,
> +				"Cannot register mbuf field for timestamp\n");

Field->flag, i.e. field should be changed to flag?

<snip>


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half Thomas Monjalon
  2020-10-29 10:50   ` Andrew Rybchenko
@ 2020-10-29 14:42   ` Kinsella, Ray
  1 sibling, 0 replies; 170+ messages in thread
From: Kinsella, Ray @ 2020-10-29 14:42 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Neil Horman



On 29/10/2020 09:27, Thomas Monjalon wrote:
> The mempool pointer in the mbuf struct is moved
> from the second to the first half.
> It should increase performance on most systems having 64-byte cache line,
> i.e. mbuf is split in two cache lines.
> On such system, the first half (also called first cache line) is hotter
> than the second one where the pool pointer was.
> 
> Moving this field gives more space to dynfield1.
> 
> This is how the mbuf layout looks like (pahole-style):
> 
> word  type                              name                byte  size
>  0    void *                            buf_addr;         /*   0 +  8 */
>  1    rte_iova_t                        buf_iova          /*   8 +  8 */
>       /* --- RTE_MARKER64               rearm_data;                   */
>  2    uint16_t                          data_off;         /*  16 +  2 */
>       uint16_t                          refcnt;           /*  18 +  2 */
>       uint16_t                          nb_segs;          /*  20 +  2 */
>       uint16_t                          port;             /*  22 +  2 */
>  3    uint64_t                          ol_flags;         /*  24 +  8 */
>       /* --- RTE_MARKER                 rx_descriptor_fields1;        */
>  4    uint32_t             union        packet_type;      /*  32 +  4 */
>       uint32_t                          pkt_len;          /*  36 +  4 */
>  5    uint16_t                          data_len;         /*  40 +  2 */
>       uint16_t                          vlan_tci;         /*  42 +  2 */
>  5.5  uint64_t             union        hash;             /*  44 +  8 */
>  6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
>       uint16_t                          buf_len;          /*  54 +  2 */
>  7    struct rte_mempool *              pool;             /*  56 +  8 */
>       /* --- RTE_MARKER                 cacheline1;                   */
>  8    struct rte_mbuf *                 next;             /*  64 +  8 */
>  9    uint64_t             union        tx_offload;       /*  72 +  8 */
> 10    uint16_t                          priv_size;        /*  80 +  2 */
>       uint16_t                          timesync;         /*  82 +  2 */
>       uint32_t                          seqn;             /*  84 +  4 */
> 11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
> 12    uint64_t                          dynfield1[4];     /*  96 + 32 */
> 16    /* --- END                                             128      */
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  doc/guides/rel_notes/deprecation.rst | 5 -----
>  lib/librte_kni/rte_kni_common.h      | 3 ++-
>  lib/librte_mbuf/rte_mbuf_core.h      | 5 ++---
>  3 files changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 72dbb25b83..07ca1dcbb2 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -88,11 +88,6 @@ Deprecation Notices
>  
>    - ``seqn``
>  
> -  As a consequence, the layout of the ``struct rte_mbuf`` will be re-arranged,
> -  avoiding impact on vectorized implementation of the driver datapaths,
> -  while evaluating performance gains of a better use of the first cache line.
> -
> -
>  * ethdev: the legacy filter API, including
>    ``rte_eth_dev_filter_supported()``, ``rte_eth_dev_filter_ctrl()`` as well
>    as filter types MACVLAN, ETHERTYPE, FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR,
> diff --git a/lib/librte_kni/rte_kni_common.h b/lib/librte_kni/rte_kni_common.h
> index 36d66e2ffa..ffb3182731 100644
> --- a/lib/librte_kni/rte_kni_common.h
> +++ b/lib/librte_kni/rte_kni_common.h
> @@ -84,10 +84,11 @@ struct rte_kni_mbuf {
>  	char pad2[4];
>  	uint32_t pkt_len;       /**< Total pkt len: sum of all segment data_len. */
>  	uint16_t data_len;      /**< Amount of data in segment buffer. */
> +	char pad3[14];
> +	void *pool;
>  
>  	/* fields on second cache line */
>  	__attribute__((__aligned__(RTE_CACHE_LINE_MIN_SIZE)))
> -	void *pool;
>  	void *next;             /**< Physical address of next mbuf in kernel. */
>  };
>  
> diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> index 52ca1c842f..ee185fa32b 100644
> --- a/lib/librte_mbuf/rte_mbuf_core.h
> +++ b/lib/librte_mbuf/rte_mbuf_core.h
> @@ -584,12 +584,11 @@ struct rte_mbuf {
>  
>  	uint16_t buf_len;         /**< Length of segment buffer. */
>  
> -	uint64_t unused;
> +	struct rte_mempool *pool; /**< Pool from which mbuf was allocated. */
>  
>  	/* second cache line - fields only used in slow path or on TX */
>  	RTE_MARKER cacheline1 __rte_cache_min_aligned;
>  
> -	struct rte_mempool *pool; /**< Pool from which mbuf was allocated. */
>  	struct rte_mbuf *next;    /**< Next segment of scattered packet. */
>  
>  	/* fields to support TX offloads */
> @@ -646,7 +645,7 @@ struct rte_mbuf {
>  	 */
>  	struct rte_mbuf_ext_shared_info *shinfo;
>  
> -	uint64_t dynfield1[3]; /**< Reserved for dynamic fields. */
> +	uint64_t dynfield1[4]; /**< Reserved for dynamic fields. */
>  } __rte_cache_aligned;
>  
>  /**
> 

I will let others chime in on the merits of the cache-line positioning of the
mempool pointer.

From the ABI PoV, the deprecation notice has been observed, and since mbuf
affects everything, doing this outside of an ABI breakage window is impossible,
so it is now or never.

Ray K



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 14/15] mbuf: remove deprecated timestamp field
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 14/15] mbuf: remove deprecated timestamp field Thomas Monjalon
  2020-10-29 10:23   ` Andrew Rybchenko
@ 2020-10-29 14:48   ` Kinsella, Ray
  1 sibling, 0 replies; 170+ messages in thread
From: Kinsella, Ray @ 2020-10-29 14:48 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Neil Horman



On 29/10/2020 09:27, Thomas Monjalon wrote:
> As announced in the deprecation note, the field timestamp
> is removed to allow giving more space to the dynamic fields.
> The related offload flag PKT_RX_TIMESTAMP is also removed.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  app/test/test_mbuf.c                   |  1 -
>  doc/guides/rel_notes/deprecation.rst   |  1 -
>  doc/guides/rel_notes/release_20_11.rst |  4 ++++
>  lib/librte_ethdev/rte_ethdev.h         |  4 +++-
>  lib/librte_mbuf/rte_mbuf.c             |  2 --
>  lib/librte_mbuf/rte_mbuf.h             |  1 -
>  lib/librte_mbuf/rte_mbuf_core.h        | 12 +-----------
>  7 files changed, 8 insertions(+), 17 deletions(-)
> 
> diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
> index 80d1850da9..85c150d843 100644
> --- a/app/test/test_mbuf.c
> +++ b/app/test/test_mbuf.c
> @@ -1621,7 +1621,6 @@ test_get_rx_ol_flag_name(void)
>  		VAL_NAME(PKT_RX_FDIR_FLX),
>  		VAL_NAME(PKT_RX_QINQ_STRIPPED),
>  		VAL_NAME(PKT_RX_LRO),
> -		VAL_NAME(PKT_RX_TIMESTAMP),
>  		VAL_NAME(PKT_RX_SEC_OFFLOAD),
>  		VAL_NAME(PKT_RX_SEC_OFFLOAD_FAILED),
>  		VAL_NAME(PKT_RX_OUTER_L4_CKSUM_BAD),
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 0f6f1df12a..72dbb25b83 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -86,7 +86,6 @@ Deprecation Notices
>    `this presentation <https://www.youtube.com/watch?v=Ttl6MlhmzWY>`_.
>    The following static fields will be moved as dynamic:
>  
> -  - ``timestamp``
>    - ``seqn``
>  
>    As a consequence, the layout of the ``struct rte_mbuf`` will be re-arranged,
> diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
> index 3cec526b6a..deb99d6d98 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -429,6 +429,10 @@ API Changes
>  * mbuf: Removed the unioned fields ``userdata`` and ``udata64``
>    from the structure ``rte_mbuf``. It is replaced with dynamic fields.
>  
> +* mbuf: Removed the field ``timestamp`` from the structure ``rte_mbuf``.
> +  It is replaced with the dynamic field RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
> +  which was previously used only for Tx.
> +
>  * pci: Removed the ``rte_kernel_driver`` enum defined in rte_dev.h and
>    replaced with a private enum in the PCI subsystem.
>  
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index 3be0050592..619cbe521e 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1345,6 +1345,8 @@ struct rte_eth_conf {
>  #define DEV_RX_OFFLOAD_JUMBO_FRAME	0x00000800
>  #define DEV_RX_OFFLOAD_SCATTER		0x00002000
>  /**
> + * Timestamp is set by the driver in RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
> + * and RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME is set in ol_flags.
>   * The mbuf field and flag are registered when the offload is configured.
>   */
>  #define DEV_RX_OFFLOAD_TIMESTAMP	0x00004000
> @@ -4654,7 +4656,7 @@ int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
>   * rte_eth_read_clock(port, base_clock);
>   *
>   * Then, convert the raw mbuf timestamp with:
> - * base_time_sec + (double)(mbuf->timestamp - base_clock) / freq;
> + * base_time_sec + (double)(*timestamp_dynfield(mbuf) - base_clock) / freq;
>   *
>   * This simple example will not provide a very good accuracy. One must
>   * at least measure multiple times the frequency and do a regression.
> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> index 8a456e5e64..09d93e6899 100644
> --- a/lib/librte_mbuf/rte_mbuf.c
> +++ b/lib/librte_mbuf/rte_mbuf.c
> @@ -764,7 +764,6 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
>  	case PKT_RX_QINQ_STRIPPED: return "PKT_RX_QINQ_STRIPPED";
>  	case PKT_RX_QINQ: return "PKT_RX_QINQ";
>  	case PKT_RX_LRO: return "PKT_RX_LRO";
> -	case PKT_RX_TIMESTAMP: return "PKT_RX_TIMESTAMP";
>  	case PKT_RX_SEC_OFFLOAD: return "PKT_RX_SEC_OFFLOAD";
>  	case PKT_RX_SEC_OFFLOAD_FAILED: return "PKT_RX_SEC_OFFLOAD_FAILED";
>  	case PKT_RX_OUTER_L4_CKSUM_BAD: return "PKT_RX_OUTER_L4_CKSUM_BAD";
> @@ -808,7 +807,6 @@ rte_get_rx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
>  		{ PKT_RX_FDIR_FLX, PKT_RX_FDIR_FLX, NULL },
>  		{ PKT_RX_QINQ_STRIPPED, PKT_RX_QINQ_STRIPPED, NULL },
>  		{ PKT_RX_LRO, PKT_RX_LRO, NULL },
> -		{ PKT_RX_TIMESTAMP, PKT_RX_TIMESTAMP, NULL },
>  		{ PKT_RX_SEC_OFFLOAD, PKT_RX_SEC_OFFLOAD, NULL },
>  		{ PKT_RX_SEC_OFFLOAD_FAILED, PKT_RX_SEC_OFFLOAD_FAILED, NULL },
>  		{ PKT_RX_QINQ, PKT_RX_QINQ, NULL },
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index a1414ed7cd..6774c6281b 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -1108,7 +1108,6 @@ __rte_pktmbuf_copy_hdr(struct rte_mbuf *mdst, const struct rte_mbuf *msrc)
>  	mdst->tx_offload = msrc->tx_offload;
>  	mdst->hash = msrc->hash;
>  	mdst->packet_type = msrc->packet_type;
> -	mdst->timestamp = msrc->timestamp;
>  	rte_mbuf_dynfield_copy(mdst, msrc);
>  }
>  
> diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> index a65eaaf692..52ca1c842f 100644
> --- a/lib/librte_mbuf/rte_mbuf_core.h
> +++ b/lib/librte_mbuf/rte_mbuf_core.h
> @@ -149,11 +149,6 @@ extern "C" {
>   */
>  #define PKT_RX_LRO           (1ULL << 16)
>  
> -/**
> - * Indicate that the timestamp field in the mbuf is valid.
> - */
> -#define PKT_RX_TIMESTAMP     (1ULL << 17)
> -
>  /**
>   * Indicate that security offload processing was applied on the RX packet.
>   */
> @@ -589,12 +584,7 @@ struct rte_mbuf {
>  
>  	uint16_t buf_len;         /**< Length of segment buffer. */
>  
> -	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
> -	 * are not normalized but are always the same for a given port.
> -	 * Some devices allow to query rte_eth_read_clock that will return the
> -	 * current device timestamp.
> -	 */
> -	uint64_t timestamp;
> +	uint64_t unused;
>  
>  	/* second cache line - fields only used in slow path or on TX */
>  	RTE_MARKER cacheline1 __rte_cache_min_aligned;


Acked-by: Ray Kinsella <mdr@ashroe.eu>
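
For reference, the conversion quoted in the rte_ethdev.h comment above can be
written out as a small helper. This is only a sketch: demo_ts_to_seconds and
its parameter names are made up for illustration, raw_ts stands for the value
read from the timestamp dynamic field, and it assumes raw_ts >= base_clock.
As the header comment itself notes, a single frequency measurement will not
give very good accuracy.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK API): convert a raw device timestamp
 * into seconds, following the formula in the rte_ethdev.h comment:
 *   base_time_sec + (double)(raw_ts - base_clock) / freq
 * Assumes raw_ts >= base_clock and freq_hz > 0.
 */
static double
demo_ts_to_seconds(uint64_t raw_ts, uint64_t base_clock,
		double base_time_sec, double freq_hz)
{
	/* Number of device clock ticks elapsed since the reference read. */
	uint64_t ticks = raw_ts - base_clock;

	return base_time_sec + (double)ticks / freq_hz;
}
```

With a 1 kHz clock, a reference read of 500 taken at 10.0 s and a raw
timestamp of 1500, the helper returns 11.0 s.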

 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 04/15] latency: switch timestamp to dynamic mbuf field
  2020-10-29 14:20   ` Pattan, Reshma
@ 2020-10-29 16:15     ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-29 16:15 UTC (permalink / raw)
  To: Pattan, Reshma
  Cc: dev, Yigit, Ferruh, david.marchand, Richardson, Bruce,
	olivier.matz, andrew.rybchenko, jerinj, viacheslavo

29/10/2020 15:20, Pattan, Reshma:
> 
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> 
> 
> <snip>
> 
> > 	rte_mbuf_dynflag_register(&timestamp_dynflag_desc);
> > +	if (timestamp_dynflag_offset < 0) {
> > +		RTE_LOG(ERR, LATENCY_STATS,
> > +				"Cannot register mbuf field for timestamp\n");
> 
> Field->flag, i.e. field should be changed to flag?

Yes good catch, thanks!




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 14/15] mbuf: remove deprecated timestamp field
  2020-10-29 10:23   ` Andrew Rybchenko
@ 2020-10-29 18:18     ` Ajit Khaparde
  0 siblings, 0 replies; 170+ messages in thread
From: Ajit Khaparde @ 2020-10-29 18:18 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: Thomas Monjalon, dpdk-dev, Ferruh Yigit, David Marchand,
	Bruce Richardson, Olivier Matz, Jerin Jacob Kollanukkaran,
	Slava Ovsiienko, Ray Kinsella, Neil Horman

On Thu, Oct 29, 2020 at 3:23 AM Andrew Rybchenko
<andrew.rybchenko@oktetlabs.ru> wrote:
>
> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > As announced in the deprecation note, the field timestamp
> > is removed to allow giving more space to the dynamic fields.
> > The related offload flag PKT_RX_TIMESTAMP is also removed.
> >
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
>
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 02/15] mbuf: add Rx timestamp dynamic flag
  2020-10-29  9:58   ` Andrew Rybchenko
@ 2020-10-29 18:19     ` Ajit Khaparde
  0 siblings, 0 replies; 170+ messages in thread
From: Ajit Khaparde @ 2020-10-29 18:19 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: Thomas Monjalon, dpdk-dev, Ferruh Yigit, David Marchand,
	Bruce Richardson, Olivier Matz, Jerin Jacob Kollanukkaran,
	Slava Ovsiienko

On Thu, Oct 29, 2020 at 2:58 AM Andrew Rybchenko
<andrew.rybchenko@oktetlabs.ru> wrote:
>
> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > There is already a dynamic field for timestamp,
> > used only for Tx scheduling, thanks to the dedicated flag.
> > The same field can be used for Rx timestamp filled by drivers.
> > The only missing part to get rid of the static timestamp field
> > was to declare a new dynamic flag for Rx usage.
> >
> > After migrating all Rx timestamp usages, it will be possible
> > to remove the deprecated timestamp field.
> >
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
>
> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-10-29 14:15       ` Ananyev, Konstantin
@ 2020-10-29 18:45         ` Ajit Khaparde
  2020-10-31 18:20           ` [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half Morten Brørup
  0 siblings, 1 reply; 170+ messages in thread
From: Ajit Khaparde @ 2020-10-29 18:45 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Thomas Monjalon, Andrew Rybchenko, dev, Yigit, Ferruh,
	david.marchand, Richardson, Bruce, olivier.matz, jerinj,
	viacheslavo, honnappa.nagarahalli, maxime.coquelin, stephen,
	hemant.agrawal

On Thu, Oct 29, 2020 at 7:15 AM Ananyev, Konstantin
<konstantin.ananyev@intel.com> wrote:
>
>
>
> >
> > 29/10/2020 11:50, Andrew Rybchenko:
> > > On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > > > The mempool pointer in the mbuf struct is moved
> > > > from the second to the first half.
> > > > It should increase performance on most systems having 64-byte cache line,
> > > > i.e. mbuf is split in two cache lines.
> > > > On such system, the first half (also called first cache line) is hotter
> > > > than the second one where the pool pointer was.
> > > >
> > > > Moving this field gives more space to dynfield1.
> > > >
> > > > This is how the mbuf layout looks like (pahole-style):
> > > >
> > > > word  type                              name                byte  size
> > > >  0    void *                            buf_addr;         /*   0 +  8 */
> > > >  1    rte_iova_t                        buf_iova          /*   8 +  8 */
> > > >       /* --- RTE_MARKER64               rearm_data;                   */
> > > >  2    uint16_t                          data_off;         /*  16 +  2 */
> > > >       uint16_t                          refcnt;           /*  18 +  2 */
> > > >       uint16_t                          nb_segs;          /*  20 +  2 */
> > > >       uint16_t                          port;             /*  22 +  2 */
> > > >  3    uint64_t                          ol_flags;         /*  24 +  8 */
> > > >       /* --- RTE_MARKER                 rx_descriptor_fields1;        */
> > > >  4    uint32_t             union        packet_type;      /*  32 +  4 */
> > > >       uint32_t                          pkt_len;          /*  36 +  4 */
> > > >  5    uint16_t                          data_len;         /*  40 +  2 */
> > > >       uint16_t                          vlan_tci;         /*  42 +  2 */
> > > >  5.5  uint64_t             union        hash;             /*  44 +  8 */
> > > >  6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
> > > >       uint16_t                          buf_len;          /*  54 +  2 */
> > > >  7    struct rte_mempool *              pool;             /*  56 +  8 */
> > > >       /* --- RTE_MARKER                 cacheline1;                   */
> > > >  8    struct rte_mbuf *                 next;             /*  64 +  8 */
> > > >  9    uint64_t             union        tx_offload;       /*  72 +  8 */
> > > > 10    uint16_t                          priv_size;        /*  80 +  2 */
> > > >       uint16_t                          timesync;         /*  82 +  2 */
> > > >       uint32_t                          seqn;             /*  84 +  4 */
> > > > 11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
> > > > 12    uint64_t                          dynfield1[4];     /*  96 + 32 */
> > > > 16    /* --- END                                             128      */
> > > >
> > > > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > >
> > > I'd like to understand why pool is chosen instead of, for
> > > example, next pointer.
> > >
> > > Pool is used on housekeeping when driver refills Rx ring or
> > > free completed Tx mbufs. Free thresholds try to avoid it on
> > > every Rx/Tx burst (if possible).
> > >
> > > Next is used for multi-segment Tx and scattered (and buffer
> > > split) Rx. IMHO the key question here is we consider these
> > > use cases as common and priority to optimize. If yes, I'd
> > > vote to have next on the first cacheline.
>
> Between these two I also would probably lean towards *next*
> (after all _free_ also has to access/update next).
+1

> As another alternative to consider: tx_offload.
> It is also used quite widely.
>
> > >
> > > I'm not sure. Just trying to hear a bit more about it.
> > .
> > That's a good question.
> > Clearly pool and next are good options.
> > The best would be to have some benchmarks.
> > If one use case shows no benefit, the decision is easier.
> >
> > If you prefer, we can leave this last patch for -rc3.
> >
>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 10/15] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-10-29 11:52     ` Slava Ovsiienko
@ 2020-10-30 12:41       ` Jerin Jacob
  2020-11-01 16:12         ` Thomas Monjalon
  2020-11-01 20:00         ` Andrew Rybchenko
  0 siblings, 2 replies; 170+ messages in thread
From: Jerin Jacob @ 2020-10-30 12:41 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: Andrew Rybchenko, NBU-Contact-Thomas Monjalon, dev, ferruh.yigit,
	david.marchand, bruce.richardson, olivier.matz, jerinj,
	Nithin Dabilpuram, Kiran Kumar K, Ray Kinsella, Neil Horman

On Thu, Oct 29, 2020 at 5:22 PM Slava Ovsiienko <viacheslavo@nvidia.com> wrote:
>
> Just five cents -  exporting the offset (making it global) might have side effect impacting the performance.

I agree with Slava. The offset value should be stored in the PMD structure.
IMO, we can have an ethdev API to get the offset; the PMD would call it in
the slow path and store the result in its fast-path structures, so that the
fast path only reads PMD-local data.
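
A minimal sketch of that idea, assuming nothing about the real driver: the
struct and function names below (demo_mbuf, demo_rxq,
demo_rxq_timestamp_setup, demo_rx_set_timestamp) are hypothetical stand-ins,
and the offset and flag bit are passed in as plain integers rather than
looked up through the mbuf dynamic field API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf: only the parts used here. */
struct demo_mbuf {
	uint64_t ol_flags;
	uint64_t dynfield1[4];
};

/* Hypothetical per-queue fast-path structure owned by the PMD. */
struct demo_rxq {
	uint64_t ts_rx_flag; /* ol_flags bit meaning "Rx timestamp is valid" */
	int ts_offset;       /* byte offset of the timestamp in the mbuf, -1 if unset */
};

/* Slow path: store values obtained once at configure time in the queue. */
static int
demo_rxq_timestamp_setup(struct demo_rxq *rxq, int field_offset, int flag_bit)
{
	if (field_offset < 0 || flag_bit < 0 || flag_bit > 63)
		return -1;
	rxq->ts_offset = field_offset;
	rxq->ts_rx_flag = UINT64_C(1) << flag_bit;
	return 0;
}

/* Fast path: touches only queue-local state, no exported globals. */
static inline void
demo_rx_set_timestamp(const struct demo_rxq *rxq, struct demo_mbuf *m,
		uint64_t ts)
{
	assert(rxq->ts_offset >= 0); /* debug-only sanity check */
	*(uint64_t *)((uintptr_t)m + (size_t)rxq->ts_offset) = ts;
	m->ol_flags |= rxq->ts_rx_flag;
}
```

Because the offset and flag live next to the other per-queue Rx state, the
PMD controls which cache line they share, which is the concern Slava raised
about exported globals.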


> Offset might be located in some memory sharing the cacheline with some other variables.
> If these variables are writable and are being updated frequently - we might get the cache contention.
> I'd prefer to keep all dynamic offsets In the PMD and entirely control memory allocation
> attributes for these ones. Hence, exporting is OK, but practical usage in datapath is questionable.
>
> With best regards, Slava
>
> > -----Original Message-----
> > From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> > Sent: Thursday, October 29, 2020 13:02
> > To: NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org
> > Cc: ferruh.yigit@intel.com; david.marchand@redhat.com;
> > bruce.richardson@intel.com; olivier.matz@6wind.com; jerinj@marvell.com;
> > Slava Ovsiienko <viacheslavo@nvidia.com>; Nithin Dabilpuram
> > <ndabilpuram@marvell.com>; Kiran Kumar K <kirankumark@marvell.com>;
> > Ray Kinsella <mdr@ashroe.eu>; Neil Horman <nhorman@tuxdriver.com>
> > Subject: Re: [PATCH 10/15] net/octeontx2: switch timestamp to dynamic mbuf
> > field
> >
> > On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > > The mbuf timestamp is moved to a dynamic field in order to allow
> > > removal of the deprecated static field.
> > > The related mbuf flag is also replaced.
> > >
> > > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > > ---
> > >  drivers/net/octeontx2/otx2_ethdev.c | 33
> > +++++++++++++++++++++++++++++
> > >  drivers/net/octeontx2/otx2_rx.h     | 19 ++++++++++++++---
> > >  drivers/net/octeontx2/version.map   |  7 ++++++
> > >  3 files changed, 56 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/net/octeontx2/otx2_ethdev.c
> > > b/drivers/net/octeontx2/otx2_ethdev.c
> > > index cfb733a4b5..ad95219438 100644
> > > --- a/drivers/net/octeontx2/otx2_ethdev.c
> > > +++ b/drivers/net/octeontx2/otx2_ethdev.c
> > > @@ -4,6 +4,7 @@
> > >
> > >  #include <inttypes.h>
> > >
> > > +#include <rte_bitops.h>
> > >  #include <rte_ethdev_pci.h>
> > >  #include <rte_io.h>
> > >  #include <rte_malloc.h>
> > > @@ -14,6 +15,35 @@
> > >  #include "otx2_ethdev.h"
> > >  #include "otx2_ethdev_sec.h"
> > >
> > > +uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
> > > +int rte_pmd_octeontx2_timestamp_dynfield_offset = -1;
> > > +
> > > +static int
> > > +otx2_rx_timestamp_setup(uint16_t flags) {
> > > +   int timestamp_rx_dynflag_offset;
> > > +
> > > +   if ((flags & NIX_RX_OFFLOAD_TSTAMP_F) == 0)
> > > +           return 0;
> > > +
> > > +   rte_pmd_octeontx2_timestamp_dynfield_offset =
> > rte_mbuf_dynfield_lookup(
> > > +                   RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
> > > +   if (rte_pmd_octeontx2_timestamp_dynfield_offset < 0) {
> > > +           otx2_err("Failed to lookup timestamp field");
> > > +           return -rte_errno;
> > > +   }
> > > +   timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
> > > +                   RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
> > > +   if (timestamp_rx_dynflag_offset < 0) {
> > > +           otx2_err("Failed to lookup Rx timestamp flag");
> > > +           return -rte_errno;
> > > +   }
> > > +   rte_pmd_octeontx2_timestamp_rx_dynflag =
> > > +                   RTE_BIT64(timestamp_rx_dynflag_offset);
> > > +
> > > +   return 0;
> > > +}
> > > +
> > >  static inline uint64_t
> > >  nix_get_rx_offload_capa(struct otx2_eth_dev *dev)  { @@ -1874,6
> > > +1904,9 @@ otx2_nix_configure(struct rte_eth_dev *eth_dev)
> > >     dev->tx_offload_flags |= nix_tx_offload_flags(eth_dev);
> > >     dev->rss_info.rss_grps = NIX_RSS_GRPS;
> > >
> > > +   if (otx2_rx_timestamp_setup(dev->rx_offload_flags) != 0)
> > > +           goto fail_offloads;
> > > +
> > >     nb_rxq = RTE_MAX(data->nb_rx_queues, 1);
> > >     nb_txq = RTE_MAX(data->nb_tx_queues, 1);
> > >
> > > diff --git a/drivers/net/octeontx2/otx2_rx.h
> > > b/drivers/net/octeontx2/otx2_rx.h index 61a5c436dd..6981edce82 100644
> > > --- a/drivers/net/octeontx2/otx2_rx.h
> > > +++ b/drivers/net/octeontx2/otx2_rx.h
> > > @@ -63,6 +63,18 @@ union mbuf_initializer {
> > >     uint64_t value;
> > >  };
> > >
> > > +/* variables are exported because this file is included in other
> > > +drivers */ extern uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
> > > +extern int rte_pmd_octeontx2_timestamp_dynfield_offset;
> > > +
> > > +static inline rte_mbuf_timestamp_t *
> > > +otx2_timestamp_dynfield(struct rte_mbuf *mbuf) {
> > > +   return RTE_MBUF_DYNFIELD(mbuf,
> > > +           rte_pmd_octeontx2_timestamp_dynfield_offset,
> > > +           rte_mbuf_timestamp_t *);
> > > +}
> > > +
> >
> > May be ethdev should provide the inline function?
> Just five cents -  exporting the offset (making it global) might have side effect impacting the performance.
> Offset might be located in some memory sharing the cacheline with some other variables.
> If these variables are writable and are being updated frequently - we might get the cache contention.
> I'd prefer to keep all dynamic offsets In the PMD and entirely control memory allocation
> attributes for these ones. Hence, exporting/inline function is possible,
> but practical usage, say,  in datapath, is questionable.
>
> With best regards, Slava
>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-10-29 18:45         ` Ajit Khaparde
@ 2020-10-31 18:20           ` Morten Brørup
  2020-10-31 20:40             ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Morten Brørup @ 2020-10-31 18:20 UTC (permalink / raw)
  To: Ajit Khaparde, Ananyev, Konstantin, Thomas Monjalon
  Cc: Andrew Rybchenko, dev, Yigit, Ferruh, david.marchand, Richardson,
	Bruce, olivier.matz, jerinj, viacheslavo, honnappa.nagarahalli,
	maxime.coquelin, stephen, hemant.agrawal

Thomas,

Adding my thoughts to the already detailed feedback on this important patch...

The first cache line is not inherently "hotter" than the second. The hotness depends on their usage.

The mbuf cacheline1 marker has the following comment:
/* second cache line - fields only used in slow path or on TX */

In other words, the second cache line is intended not to be touched in fast path RX.

I do not think this is true anymore. Not even with simple non-scattered RX. And regression testing probably didn't catch this, because the tests perform TX after RX, so the cache miss moved from TX to RX and became a cache hit in TX instead. (I may be wrong about this claim, but it's not important for the discussion.)

I think the right question for this patch is: Can we achieve this - not using the second cache line for fast path RX - again by putting the right fields in the first cache line?

Probably not in all cases, but perhaps for some...

Consider the application scenarios.

When a packet is received, one of three things happens to it:
1. It is immediately transmitted on one or more ports.
2. It is immediately discarded, e.g. by a firewall rule.
3. It is put in some sort of queue, e.g. a ring for the next pipeline stage, or in a QoS queue.

1. If the packet is immediately transmitted, the m->tx_offload field in the second cache line will be touched by the application and TX function anyway, so we don't need to optimize the mbuf layout for this scenario.

2. The second scenario touches m->pool no matter how it is implemented. The application can avoid touching m->next by using rte_mbuf_raw_free(), knowing that the mbuf came directly from RX and thus no other fields have been touched. In this scenario, we want m->pool in the first cache line.

3. Now, let's consider the third scenario, where RX is followed by enqueue into a ring. If the application does nothing but put the packet into a ring, we don't need to move anything into the first cache line. But applications usually do more... So it is application specific what would be good to move to the first cache line:

A. If the application does not use segmented mbufs, and performs analysis and preparation for transmission in the initial pipeline stages, and only the last pipeline stage performs TX, we could move m->tx_offload to the first cache line, which would keep the second cache line cold until the actual TX happens in the last pipeline stage - maybe even after the packet has waited in a QoS queue for a long time, and its cache lines have gone cold.

B. If the application uses segmented mbufs on RX, it might make sense moving m->next to the first cache line. (We don't use segmented mbufs, so I'm not sure about this.)


However, reality perhaps beats theory:

Looking at the E1000 PMD, it seems like even its non-scattered RX function, eth_igb_recv_pkts(), sets m->next. If it only kept its own free pool pre-initialized instead... I haven't investigated other PMDs, except briefly looking at the mlx5 PMD, and it seems like it doesn't touch m->next in RX.

I haven't looked deeper into how m->pool is being used by RX in PMDs, but I suppose that it isn't touched in RX.

<rant on>
If only we had a performance test where RX was not immediately followed by TX, but the packets were passed through a large queue in-between, so RX cache misses were not free of charge because they transform TX cache misses into cache hits instead...
<rant off>

Whatever you choose, I am sure that most applications will find it more useful than the timestamp. :-)


Med venlig hilsen / kind regards
- Morten Brørup


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-10-31 18:20           ` [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half Morten Brørup
@ 2020-10-31 20:40             ` Thomas Monjalon
  2020-11-01  9:12               ` Morten Brørup
  0 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-10-31 20:40 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Ajit Khaparde, Ananyev, Konstantin, Andrew Rybchenko, dev, Yigit,
	Ferruh, david.marchand, Richardson, Bruce, olivier.matz, jerinj,
	viacheslavo, honnappa.nagarahalli, maxime.coquelin, stephen,
	hemant.agrawal, viacheslavo

Thanks for the thoughts Morten.
I believe we need benchmarks of different scenarios with different drivers.






^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-10-31 20:40             ` Thomas Monjalon
@ 2020-11-01  9:12               ` Morten Brørup
  2020-11-01 16:21                 ` Thomas Monjalon
  2020-11-01 16:38                 ` Thomas Monjalon
  0 siblings, 2 replies; 170+ messages in thread
From: Morten Brørup @ 2020-11-01  9:12 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ajit Khaparde, Ananyev, Konstantin, Andrew Rybchenko, dev, Yigit,
	Ferruh, david.marchand, Richardson, Bruce, olivier.matz, jerinj,
	viacheslavo, honnappa.nagarahalli, maxime.coquelin, stephen,
	hemant.agrawal, viacheslavo, Matan Azrad, Shahaf Shuler

> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Saturday, October 31, 2020 9:41 PM
> 
> Thanks for the thoughts Morten.
> I believe we need benchmarks of different scenarios with different drivers.
>

If we are only allowed to modify the mbuf structure this one more time, we should look forward, not backwards!

If we move m->tx_offload to the first cache line, applications using simple, non-scattered packet mbufs would never even need to touch the second cache line, except for freeing the mbuf (which needs to read m->pool).

And this leads to my next suggestion...

One thing has always puzzled me: Why do we use 64 bits to indicate which memory pool an mbuf belongs to? The portid only uses 16 bits and an indirection index. Why don't we use the same kind of indirection index for mbuf pools?

I can easily imagine using one mbuf pool (or perhaps a few pools) per CPU socket (or per physical memory bus closest to an attached NIC), but not more than 256 mbuf memory pools in total. So, let's introduce an mbufpoolid like the portid, and cut this mbuf field down from 64 to 8 bits.

If we also cut down m->pkt_len from 32 to 24 bits, we can get the 8 bit mbuf pool index into the first cache line at no additional cost.

In other words: This would free up another 64 bit field in the mbuf structure!
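
As a rough sketch of the idea, an 8-bit pool index can share one 32-bit word with a 24-bit packet length and be resolved through a small table. All names here (toy_pool, toy_pool_table, toy_pool_register and so on) are hypothetical and only illustrate the indirection; nothing like this exists in the mbuf API discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_POOLS   256        /* 8-bit index */
#define TOY_PKT_LEN_MAX 0xFFFFFFu  /* 24-bit length, just under 16 MiB */

/* Hypothetical pool object; stands in for a real mempool. */
struct toy_pool {
	const char *name;
};

/* Indirection table, in the same spirit as a port id lookup. */
static struct toy_pool *toy_pool_table[TOY_MAX_POOLS];
static unsigned int toy_pool_count;

/* Register a pool and return its 8-bit id, or -1 if the table is full. */
static int toy_pool_register(struct toy_pool *pool)
{
	if (toy_pool_count >= TOY_MAX_POOLS)
		return -1;
	toy_pool_table[toy_pool_count] = pool;
	return (int)toy_pool_count++;
}

/* Pack a 24-bit packet length and an 8-bit pool id into one word. */
static inline uint32_t toy_len_pool_pack(uint32_t pkt_len, uint8_t pool_id)
{
	return (pkt_len & TOY_PKT_LEN_MAX) | ((uint32_t)pool_id << 24);
}

static inline uint32_t toy_pkt_len(uint32_t word)
{
	return word & TOY_PKT_LEN_MAX;
}

/* Resolve the pool pointer: one extra load from the table. */
static inline struct toy_pool *toy_pool_of(uint32_t word)
{
	return toy_pool_table[word >> 24];
}
```

The cost is the extra table load in toy_pool_of(); the table itself is read-mostly and small (256 pointers), so it is likely to stay cached, but that would need measuring.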


And even though the m->next pointer for scattered packets resides in the second cache line, the libraries and applications know that m->next is NULL when m->nb_segs is 1. This proves that my suggestion would make touching the second cache line unnecessary (in simple cases), even for re-initializing the mbuf.
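
A minimal sketch of that shortcut, using a hypothetical two-line toy_chain_mbuf rather than the real mbuf: the helper consults nb_segs in the first line and only dereferences next, in the second line, for multi-segment packets. It is only meant for the head mbuf of a packet.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: nb_segs in line 0, next in line 1. */
struct toy_chain_mbuf {
	alignas(64) uint16_t nb_segs;            /* first cache line */
	alignas(64) struct toy_chain_mbuf *next; /* second cache line */
};

/*
 * Return the segment following the head of a packet.
 * For single-segment packets the answer is known from nb_segs alone,
 * so the second cache line is not read at all.
 */
static inline struct toy_chain_mbuf *
toy_head_next(const struct toy_chain_mbuf *head)
{
	return head->nb_segs == 1 ? NULL : head->next;
}
```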


And now I will proceed out on a tangent with two more independent thoughts, so feel free to ignore.

Consider a multi CPU socket system with one mbuf pool per CPU socket, where the NICs attached to each CPU socket use an RX mbuf pool with RAM on the same CPU socket. I would imagine that (re-)initializing these mbufs could be faster if performed only on a CPU on the same socket. If this is the case, mbufs should be re-initialized as part of the RX preparation at ingress, not as part of the mbuf free at egress.

Perhaps some microarchitectures are faster to compare nb_segs==0 than nb_segs==1. If so, nb_segs could be redefined to mean number of additional segments, rather than number of segments.


PS: I have added two more mlx5 maintainers to the discussion; they might have qualified opinions about how PMDs could benefit from this.


Med venlig hilsen / kind regards
- Morten Brørup




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 10/15] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-10-30 12:41       ` Jerin Jacob
@ 2020-11-01 16:12         ` Thomas Monjalon
  2020-11-01 20:00         ` Andrew Rybchenko
  1 sibling, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 16:12 UTC (permalink / raw)
  To: Slava Ovsiienko, Andrew Rybchenko, jerinj
  Cc: dev, dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, Nithin Dabilpuram, Kiran Kumar K, Ray Kinsella,
	Neil Horman, Jerin Jacob

30/10/2020 13:41, Jerin Jacob:
> On Thu, Oct 29, 2020 at 5:22 PM Slava Ovsiienko <viacheslavo@nvidia.com> wrote:
> > From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >
> > Just five cents -  exporting the offset (making it global) might have side effect impacting the performance.
> 
> I agree with Slava. The offset value should be stored in the PMD structure.
> IMO, We can have an ethdev API to get the offset and store it in PMD's
> fastpath structures in the slow path
> to use in fastpath.
> 
> > Offset might be located in some memory sharing the cacheline with some other variables.
> > If these variables are writable and are being updated frequently - we might get the cache contention.
> > I'd prefer to keep all dynamic offsets In the PMD and entirely control memory allocation
> > attributes for these ones. Hence, exporting is OK, but practical usage in datapath is questionable.

Yes this is a major design point:
the field offsets are preferably stored in a hot cache line
which depends on the driver, library or application context.
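
A minimal sketch of that design point, with hypothetical names throughout (toy_rxq, toy_pkt, TOY_DYNFIELD): the offset and flag are looked up once in the slow path and copied into the per-queue structure that the Rx burst function already has in cache. TOY_DYNFIELD only imitates the pointer arithmetic of RTE_MBUF_DYNFIELD; the lookup itself is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mbuf stand-in with some dynamic field space. */
struct toy_pkt {
	uint64_t ol_flags;
	uint64_t dynfield[4];
};

/* Hypothetical per-queue fast-path data of a driver. */
struct toy_rxq {
	uint16_t rx_tail;  /* already touched on every burst */
	int ts_offset;     /* cached dynamic field offset, -1 if disabled */
	uint64_t ts_flag;  /* cached dynamic flag mask */
};

/* Same pointer arithmetic idea as RTE_MBUF_DYNFIELD(). */
#define TOY_DYNFIELD(m, offset, type) \
	((type)((uintptr_t)(m) + (offset)))

/* Slow path: copy the registered offset and flag next to the hot data. */
static void toy_rxq_setup(struct toy_rxq *q, int offset, uint64_t flag)
{
	q->rx_tail = 0;
	q->ts_offset = offset;
	q->ts_flag = flag;
}

/* Fast path: no global variable is read here, only the queue struct. */
static inline void
toy_rx_set_timestamp(const struct toy_rxq *q, struct toy_pkt *m, uint64_t ts)
{
	if (q->ts_offset < 0)
		return;
	*TOY_DYNFIELD(m, q->ts_offset, uint64_t *) = ts;
	m->ol_flags |= q->ts_flag;
}
```

A library or application would keep its own copy in whatever structure is hot in its context, which is why a single exported global is not ideal for the datapath.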




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-01  9:12               ` Morten Brørup
@ 2020-11-01 16:21                 ` Thomas Monjalon
  2020-11-01 16:38                 ` Thomas Monjalon
  1 sibling, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 16:21 UTC (permalink / raw)
  To: Morten Brørup
  Cc: dev, Ajit Khaparde, Ananyev, Konstantin, Andrew Rybchenko, dev,
	Yigit, Ferruh, david.marchand, Richardson, Bruce, olivier.matz,
	jerinj, viacheslavo, honnappa.nagarahalli, maxime.coquelin,
	stephen, hemant.agrawal, viacheslavo, Matan Azrad, Shahaf Shuler,
	hemant.agrawal

That's very interesting food for thought.
I hope we will have a good community discussion on this list
during this week to make some decisions.






^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-01  9:12               ` Morten Brørup
  2020-11-01 16:21                 ` Thomas Monjalon
@ 2020-11-01 16:38                 ` Thomas Monjalon
  2020-11-01 20:59                   ` Morten Brørup
  1 sibling, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 16:38 UTC (permalink / raw)
  To: Morten Brørup
  Cc: dev, Ajit Khaparde, Ananyev, Konstantin, Andrew Rybchenko, dev,
	Yigit, Ferruh, david.marchand, Richardson, Bruce, olivier.matz,
	jerinj, viacheslavo, honnappa.nagarahalli, maxime.coquelin,
	stephen, hemant.agrawal, viacheslavo, Matan Azrad, Shahaf Shuler

01/11/2020 10:12, Morten Brørup:
> One thing has always puzzled me:
> Why do we use 64 bits to indicate which memory pool
> an mbuf belongs to?
> The portid only uses 16 bits and an indirection index.
> Why don't we use the same kind of indirection index for mbuf pools?

I wonder what the cost of indirection would be. Probably negligible.
I think it is a good proposal...
... for next year, after a deprecation notice.

> I can easily imagine using one mbuf pool (or perhaps a few pools)
> per CPU socket (or per physical memory bus closest to an attached NIC),
> but not more than 256 mbuf memory pools in total.
> So, let's introduce an mbufpoolid like the portid,
> and cut this mbuf field down from 64 to 8 bits.
> 
> If we also cut down m->pkt_len from 32 to 24 bits,

Who is using packets larger than 64k? Are 16 bits enough?

> we can get the 8 bit mbuf pool index into the first cache line
> at no additional cost.

I like the idea.
It means we don't need to move the pool pointer now,
i.e. it does not have to replace the timestamp field.

> In other words: This would free up another 64 bit field in the mbuf structure!

That would be great!


> And even though the m->next pointer for scattered packets resides
> in the second cache line, the libraries and application knows
> that m->next is NULL when m->nb_segs is 1.
> This proves that my suggestion would make touching
> the second cache line unnecessary (in simple cases),
> even for re-initializing the mbuf.

So you think the "next" pointer should stay in the second half of mbuf?

I feel you would like to move the Tx offloads into the first half
to improve performance of very simple apps.
I am thinking the opposite: we could have some dynamic fields space
in the first half to improve performance of complex Rx.
Note: we can add a flag hint for field registration in this first half.


> And now I will proceed out on a tangent with two more
> independent thoughts, so feel free to ignore.
> 
> Consider a multi CPU socket system with one mbuf pool
> per CPU socket, the NICs attached to each CPU socket
> use an RX mbuf pool with RAM on the same CPU socket.
> I would imagine that (re-)initializing these mbufs could be faster
> if performed only on a CPU on the same socket.
> If this is the case, mbufs should be re-initialized
> as part of the RX preparation at ingress,
> not as part of the mbuf free at egress.
> 
> Perhaps some microarchitectures are faster to compare
> nb_segs==0 than nb_segs==1.
> If so, nb_segs could be redefined to mean number of
> additional segments, rather than number of segments.




^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (14 preceding siblings ...)
  2020-10-29  9:27 ` [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half Thomas Monjalon
@ 2020-11-01 18:06 ` Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 01/14] eventdev: remove software Rx timestamp Thomas Monjalon
                     ` (14 more replies)
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                   ` (2 subsequent siblings)
  18 siblings, 15 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The mbuf field timestamp was announced to be removed for three reasons:
  - a dynamic field already exists, used for Tx only
  - this field always used 8 bytes even if unneeded
  - this field is in the first half (cacheline) of mbuf

After this series, the dynamic field timestamp is used for both Rx and Tx
with separate dynamic flags to distinguish when the value is meaningful
without resetting the field during forwarding.
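
As an illustration only, the sketch below shows why two flags allow one shared field: a consumer trusts the value only when the flag for its direction is set, so a forwarded packet carrying an Rx timestamp is not mistaken for one with a Tx timestamp. The struct, the fixed flag bits and the helper names are hypothetical; in the series the field offset and flag bits are obtained by registration, not hardcoded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mbuf stand-in; dynfield[0] plays the timestamp field. */
struct toy_ts_mbuf {
	uint64_t ol_flags;
	uint64_t dynfield[2];
};

/* Fixed bits for illustration; real flag bits are assigned dynamically. */
#define TOY_RX_TS_FLAG (UINT64_C(1) << 0)
#define TOY_TX_TS_FLAG (UINT64_C(1) << 1)

/* Read the timestamp only if it was written by the Rx side. */
static inline bool
toy_rx_timestamp_get(const struct toy_ts_mbuf *m, uint64_t *ts)
{
	if ((m->ol_flags & TOY_RX_TS_FLAG) == 0)
		return false;
	*ts = m->dynfield[0];
	return true;
}

/* Read the timestamp only if the application set it for Tx. */
static inline bool
toy_tx_timestamp_get(const struct toy_ts_mbuf *m, uint64_t *ts)
{
	if ((m->ol_flags & TOY_TX_TS_FLAG) == 0)
		return false;
	*ts = m->dynfield[0];
	return true;
}
```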

As a consequence, 8 bytes can be re-allocated to dynamic fields
in the first half of mbuf structure.
The mbuf layout remains open to further changes.

This mbuf layout change is important to allow adding more features
(consuming more dynamic fields) during the next year,
and can allow performance improvements with new usages in the first half.


Thomas Monjalon (14):
  eventdev: remove software Rx timestamp
  mbuf: add Rx timestamp dynamic flag
  ethdev: register mbuf field and flags for timestamp
  latency: switch timestamp to dynamic mbuf field
  net/ark: switch timestamp to dynamic mbuf field
  net/dpaa2: switch timestamp to dynamic mbuf field
  net/mlx5: fix dynamic mbuf offset lookup check
  net/mlx5: switch timestamp to dynamic mbuf field
  net/nfb: switch timestamp to dynamic mbuf field
  net/octeontx2: switch timestamp to dynamic mbuf field
  net/pcap: switch timestamp to dynamic mbuf field
  app/testpmd: switch timestamp to dynamic mbuf field
  examples/rxtx_callbacks: switch timestamp to dynamic field
  mbuf: remove deprecated timestamp field

 app/test-pmd/config.c                         | 38 -----------
 app/test-pmd/util.c                           | 39 ++++++++++-
 app/test/test_mbuf.c                          |  1 -
 doc/guides/nics/mlx5.rst                      |  5 +-
 .../prog_guide/event_ethernet_rx_adapter.rst  |  6 +-
 doc/guides/rel_notes/deprecation.rst          |  4 --
 doc/guides/rel_notes/release_20_11.rst        |  4 ++
 drivers/net/ark/ark_ethdev.c                  | 23 +++++++
 drivers/net/ark/ark_ethdev_rx.c               | 10 ++-
 drivers/net/dpaa2/dpaa2_ethdev.c              | 20 ++++++
 drivers/net/dpaa2/dpaa2_ethdev.h              |  2 +
 drivers/net/dpaa2/dpaa2_rxtx.c                | 25 +++++--
 drivers/net/mlx5/mlx5_rxq.c                   | 36 ++++++++++
 drivers/net/mlx5/mlx5_rxtx.c                  |  8 +--
 drivers/net/mlx5/mlx5_rxtx.h                  | 19 ++++++
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h      | 41 ++++++------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h         | 43 ++++++------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h          | 35 +++++-----
 drivers/net/mlx5/mlx5_trigger.c               |  2 +-
 drivers/net/mlx5/mlx5_txq.c                   |  2 +-
 drivers/net/nfb/nfb_rx.c                      | 23 ++++++-
 drivers/net/nfb/nfb_rx.h                      | 18 +++--
 drivers/net/octeontx2/otx2_ethdev.c           | 33 ++++++++++
 drivers/net/octeontx2/otx2_rx.h               | 19 +++++-
 drivers/net/octeontx2/version.map             |  7 ++
 drivers/net/pcap/rte_eth_pcap.c               | 29 ++++++++-
 examples/rxtx_callbacks/main.c                | 17 ++++-
 lib/librte_ethdev/rte_ethdev.c                | 65 +++++++++++++++++++
 lib/librte_ethdev/rte_ethdev.h                | 13 +++-
 .../rte_event_eth_rx_adapter.c                | 11 ----
 .../rte_event_eth_rx_adapter.h                |  6 +-
 lib/librte_latencystats/rte_latencystats.c    | 48 ++++++++++++--
 lib/librte_mbuf/rte_mbuf.c                    |  2 -
 lib/librte_mbuf/rte_mbuf.h                    |  2 +-
 lib/librte_mbuf/rte_mbuf_core.h               | 12 +---
 lib/librte_mbuf/rte_mbuf_dyn.c                |  1 +
 lib/librte_mbuf/rte_mbuf_dyn.h                | 11 ++--
 37 files changed, 501 insertions(+), 179 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v2 01/14] eventdev: remove software Rx timestamp
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 02/14] mbuf: add Rx timestamp dynamic flag Thomas Monjalon
                     ` (13 subsequent siblings)
  14 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Nikhil Rao

This is a revert of commit 569758758dcd ("eventdev: add Rx timestamp").
If the Rx timestamp is not configured on the ethdev port,
there is no reason to set one.
Also, the accuracy of the timestamp was poor
because it was set at a late stage.
Anyway, there is no trace of any usage of this timestamp.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 doc/guides/prog_guide/event_ethernet_rx_adapter.rst |  6 +-----
 lib/librte_eventdev/rte_event_eth_rx_adapter.c      | 11 -----------
 lib/librte_eventdev/rte_event_eth_rx_adapter.h      |  6 +-----
 3 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
index 236f43f455..cb44ce0e47 100644
--- a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
+++ b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
@@ -12,11 +12,7 @@ be supported in hardware or require a software thread to receive packets from
 the ethdev port using ethdev poll mode APIs and enqueue these as events to the
 event device using the eventdev API. Both transfer mechanisms may be present on
 the same platform depending on the particular combination of the ethdev and
-the event device. For SW based packet transfer, if the mbuf does not have a
-timestamp set, the adapter adds a timestamp to the mbuf using
-rte_get_tsc_cycles(), this provides a more accurate timestamp as compared to
-if the application were to set the timestamp since it avoids event device
-schedule latency.
+the event device.
 
 The Event Ethernet Rx Adapter library is intended for the application code to
 configure both transfer mechanisms using a common API. A capability API allows
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.c b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
index f0000d1ede..3c73046551 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.c
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
@@ -763,7 +763,6 @@ rxa_buffer_mbufs(struct rte_event_eth_rx_adapter *rx_adapter,
 	uint32_t rss_mask;
 	uint32_t rss;
 	int do_rss;
-	uint64_t ts;
 	uint16_t nb_cb;
 	uint16_t dropped;
 
@@ -771,16 +770,6 @@ rxa_buffer_mbufs(struct rte_event_eth_rx_adapter *rx_adapter,
 	rss_mask = ~(((m->ol_flags & PKT_RX_RSS_HASH) != 0) - 1);
 	do_rss = !rss_mask && !eth_rx_queue_info->flow_id_mask;
 
-	if ((m->ol_flags & PKT_RX_TIMESTAMP) == 0) {
-		ts = rte_get_tsc_cycles();
-		for (i = 0; i < num; i++) {
-			m = mbufs[i];
-
-			m->timestamp = ts;
-			m->ol_flags |= PKT_RX_TIMESTAMP;
-		}
-	}
-
 	for (i = 0; i < num; i++) {
 		m = mbufs[i];
 
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.h b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
index 2dd259c279..21bb1e54c8 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.h
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
@@ -21,11 +21,7 @@
  *
  * The adapter uses a EAL service core function for SW based packet transfer
  * and uses the eventdev PMD functions to configure HW based packet transfer
- * between the ethernet device and the event device. For SW based packet
- * transfer, if the mbuf does not have a timestamp set, the adapter adds a
- * timestamp to the mbuf using rte_get_tsc_cycles(), this provides a more
- * accurate timestamp as compared to if the application were to set the time
- * stamp since it avoids event device schedule latency.
+ * between the ethernet device and the event device.
  *
  * The ethernet Rx event adapter's functions are:
  *  - rte_event_eth_rx_adapter_create_ext()
-- 
2.28.0



* [dpdk-dev] [PATCH v2 02/14] mbuf: add Rx timestamp dynamic flag
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 01/14] eventdev: remove software Rx timestamp Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 03/14] ethdev: register mbuf field and flags for timestamp Thomas Monjalon
                     ` (12 subsequent siblings)
  14 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ajit Khaparde

There is already a dynamic field for timestamp,
used only for Tx scheduling, thanks to the dedicated flag.
The same field can be used for Rx timestamp filled by drivers.
The only missing part to get rid of the static timestamp field
was to declare a new dynamic flag for Rx usage.

After migrating all Rx timestamp usages, it will be possible
to remove the deprecated timestamp field.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
---
 lib/librte_mbuf/rte_mbuf_dyn.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 0ebac88b83..5fb85c0610 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -258,14 +258,14 @@ void rte_mbuf_dyn_dump(FILE *out);
  * timestamp. The dynamic Tx timestamp flag tells whether the field contains
  * actual timestamp value for the packets being sent, this value can be
  * used by PMD to schedule packet sending.
- *
- * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
- * and obsoleting, the dedicated Rx timestamp flag is supposed to be
- * introduced and the shared dynamic timestamp field will be used
- * to handle the timestamps on receiving datapath as well.
  */
 #define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
 
+/**
+ * Indicate that the timestamp field in the mbuf was filled by the driver.
+ */
+#define RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME "rte_dynflag_rx_timestamp"
+
 /**
  * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
  * packet being sent it tries to synchronize the time of packet appearing
-- 
2.28.0



* [dpdk-dev] [PATCH v2 03/14] ethdev: register mbuf field and flags for timestamp
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 01/14] eventdev: remove software Rx timestamp Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 02/14] mbuf: add Rx timestamp dynamic flag Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-02 15:39     ` Olivier Matz
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 04/14] latency: switch timestamp to dynamic mbuf field Thomas Monjalon
                     ` (11 subsequent siblings)
  14 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Matan Azrad, Shahaf Shuler

During port configure or queue setup, the offload flags
DEV_RX_OFFLOAD_TIMESTAMP and DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP
trigger the registration of the related mbuf field and flags.

Previously, the Tx timestamp field and flag were registered in testpmd,
as described in the mlx5 guide.
For the general usage of Rx and Tx timestamps,
managing registrations inside ethdev is simpler and properly documented.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 app/test-pmd/config.c          | 38 --------------------
 doc/guides/nics/mlx5.rst       |  5 ++-
 lib/librte_ethdev/rte_ethdev.c | 65 ++++++++++++++++++++++++++++++++++
 lib/librte_ethdev/rte_ethdev.h |  9 ++++-
 lib/librte_mbuf/rte_mbuf_dyn.h |  1 +
 5 files changed, 76 insertions(+), 42 deletions(-)
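
Note on why the helper can be called from rte_eth_dev_configure() and
again from every queue setup: rte_mbuf_dynfield_register() returns the
existing offset when the same name and parameters are registered again,
and fails only when the parameters conflict. The toy registry below is
a minimal sketch of that behaviour. It is not the DPDK implementation:
toy_dynfield_register(), its table and its sizes are hypothetical, and
alignment handling is left out.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_FIELDS 8
#define TOY_AREA_SIZE 64	/* bytes available for dynamic fields */

/* Hypothetical registry entry, standing in for the real shared table. */
struct toy_field {
	char name[32];
	size_t size;
	size_t offset;
};

static struct toy_field toy_fields[TOY_MAX_FIELDS];
static size_t toy_nb_fields;
static size_t toy_next_offset;

/*
 * Register a field and return its offset (which may be 0).
 * Registering the same name with the same size again returns the
 * same offset. A conflicting size or a full table returns -1.
 */
int
toy_dynfield_register(const char *name, size_t size)
{
	struct toy_field *f;
	size_t i;

	if (strlen(name) >= sizeof(toy_fields[0].name))
		return -1;
	for (i = 0; i < toy_nb_fields; i++) {
		f = &toy_fields[i];
		if (strcmp(f->name, name) != 0)
			continue;
		/* Same name: succeed only if the parameters match. */
		return f->size == size ? (int)f->offset : -1;
	}
	if (toy_nb_fields == TOY_MAX_FIELDS ||
			size > TOY_AREA_SIZE - toy_next_offset)
		return -1;
	f = &toy_fields[toy_nb_fields++];
	strcpy(f->name, name);
	f->size = size;
	f->offset = toy_next_offset;
	toy_next_offset += size;
	return (int)f->offset;
}
```

With such semantics, repeated calls for the same port or queue are
harmless, so no "already registered" bookkeeping is needed in ethdev.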

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 1668ae3238..9a2baf16fe 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -3955,44 +3955,6 @@ show_tx_pkt_times(void)
 void
 set_tx_pkt_times(unsigned int *tx_times)
 {
-	uint16_t port_id;
-	int offload_found = 0;
-	int offset;
-	int flag;
-
-	static const struct rte_mbuf_dynfield desc_offs = {
-		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
-		.size = sizeof(uint64_t),
-		.align = __alignof__(uint64_t),
-	};
-	static const struct rte_mbuf_dynflag desc_flag = {
-		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
-	};
-
-	RTE_ETH_FOREACH_DEV(port_id) {
-		struct rte_eth_dev_info dev_info = { 0 };
-		int ret;
-
-		ret = rte_eth_dev_info_get(port_id, &dev_info);
-		if (ret == 0 && dev_info.tx_offload_capa &
-				DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) {
-			offload_found = 1;
-			break;
-		}
-	}
-	if (!offload_found) {
-		printf("No device supporting Tx timestamp scheduling found, "
-		       "dynamic flag and field not registered\n");
-		return;
-	}
-	offset = rte_mbuf_dynfield_register(&desc_offs);
-	if (offset < 0 && rte_errno != EEXIST)
-		printf("Dynamic timestamp field registration error: %d",
-		       rte_errno);
-	flag = rte_mbuf_dynflag_register(&desc_flag);
-	if (flag < 0 && rte_errno != EEXIST)
-		printf("Dynamic timestamp flag registration error: %d",
-		       rte_errno);
 	tx_pkt_times_inter = tx_times[0];
 	tx_pkt_times_intra = tx_times[1];
 }
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index afa65a1379..fa8b13dd1b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -237,9 +237,8 @@ Limitations
   ``txq_inline_max`` and ``txq_inline_mpw`` devargs keys.
 
 - To provide the packet send scheduling on mbuf timestamps the ``tx_pp``
-  parameter should be specified, RTE_MBUF_DYNFIELD_TIMESTAMP_NAME and
-  RTE_MBUF_DYNFLAG_TIMESTAMP_NAME should be registered by application.
-  When PMD sees the RTE_MBUF_DYNFLAG_TIMESTAMP_NAME set on the packet
+  parameter should be specified.
+  When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME set on the packet
   being sent it tries to synchronize the time of packet appearing on
   the wire with the specified packet timestamp. It the specified one
   is in the past it should be ignored, if one is in the distant future
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index b12bb3854d..eafff23910 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -31,6 +31,7 @@
 #include <rte_mempool.h>
 #include <rte_malloc.h>
 #include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include <rte_errno.h>
 #include <rte_spinlock.h>
 #include <rte_string_fns.h>
@@ -1232,6 +1233,54 @@ eth_dev_check_lro_pkt_size(uint16_t port_id, uint32_t config_size,
 	return ret;
 }
 
+static inline int
+eth_dev_timestamp_mbuf_register(uint64_t rx_offloads, uint64_t tx_offloads)
+{
+	static const struct rte_mbuf_dynfield field_desc = {
+		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
+		.size = sizeof(rte_mbuf_timestamp_t),
+		.align = __alignof__(rte_mbuf_timestamp_t),
+	};
+	static const struct rte_mbuf_dynflag rx_flag_desc = {
+		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
+	};
+	static const struct rte_mbuf_dynflag tx_flag_desc = {
+		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
+	};
+	bool todo_rx, todo_tx;
+	int offset;
+
+	todo_rx = (rx_offloads & DEV_RX_OFFLOAD_TIMESTAMP) != 0;
+	todo_tx = (tx_offloads & DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) != 0;
+
+	if (todo_rx || todo_tx) {
+		offset = rte_mbuf_dynfield_register(&field_desc);
+		if (offset < 0) {
+			RTE_ETHDEV_LOG(ERR,
+					"Failed to register mbuf field for timestamp\n");
+			return -rte_errno;
+		}
+	}
+	if (todo_rx) {
+		offset = rte_mbuf_dynflag_register(&rx_flag_desc);
+		if (offset < 0) {
+			RTE_ETHDEV_LOG(ERR,
+					"Failed to register mbuf flag for Rx timestamp\n");
+			return -rte_errno;
+		}
+	}
+	if (todo_tx) {
+		offset = rte_mbuf_dynflag_register(&tx_flag_desc);
+		if (offset < 0) {
+			RTE_ETHDEV_LOG(ERR,
+					"Failed to register mbuf flag for Tx timestamp\n");
+			return -rte_errno;
+		}
+	}
+
+	return 0;
+}
+
 /*
  * Validate offloads that are requested through rte_eth_dev_configure against
  * the offloads successfully set by the ethernet device.
@@ -1481,6 +1530,12 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		goto rollback;
 	}
 
+	/* Register mbuf field and flags for timestamp offloads if enabled. */
+	ret = eth_dev_timestamp_mbuf_register(dev_conf->rxmode.offloads,
+			dev_conf->txmode.offloads);
+	if (ret != 0)
+		goto rollback;
+
 	/*
 	 * Setup new number of RX/TX queues and reconfigure device.
 	 */
@@ -2088,6 +2143,11 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 			return ret;
 	}
 
+	/* Register mbuf field and flag for Rx timestamp offload if enabled. */
+	ret = eth_dev_timestamp_mbuf_register(local_conf.offloads, 0);
+	if (ret != 0)
+		return ret;
+
 	ret = (*dev->dev_ops->rx_queue_setup)(dev, rx_queue_id, nb_rx_desc,
 					      socket_id, &local_conf, mp);
 	if (!ret) {
@@ -2268,6 +2328,11 @@ rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
 		return -EINVAL;
 	}
 
+	/* Register mbuf field and flag for Tx timestamp offload if enabled. */
+	ret = eth_dev_timestamp_mbuf_register(0, local_conf.offloads);
+	if (ret != 0)
+		return ret;
+
 	rte_ethdev_trace_txq_setup(port_id, tx_queue_id, nb_tx_desc, tx_conf);
 	return eth_err(port_id, (*dev->dev_ops->tx_queue_setup)(dev,
 		       tx_queue_id, nb_tx_desc, socket_id, &local_conf));
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index ba997f16ce..3be0050592 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1344,6 +1344,9 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_VLAN_EXTEND	0x00000400
 #define DEV_RX_OFFLOAD_JUMBO_FRAME	0x00000800
 #define DEV_RX_OFFLOAD_SCATTER		0x00002000
+/**
+ * The mbuf field and flag are registered when the offload is configured.
+ */
 #define DEV_RX_OFFLOAD_TIMESTAMP	0x00004000
 #define DEV_RX_OFFLOAD_SECURITY         0x00008000
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
@@ -1408,7 +1411,11 @@ struct rte_eth_conf {
 #define DEV_TX_OFFLOAD_IP_TNL_TSO       0x00080000
 /** Device supports outer UDP checksum */
 #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
-/** Device supports send on timestamp */
+/**
+ * Device sends on time read from RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+ * if RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME is set in ol_flags.
+ * The mbuf field and flag are registered when the offload is configured.
+ */
 #define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
 /*
  * If new Tx offload capabilities are defined, they also must be
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 5fb85c0610..d4d8f66f77 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -260,6 +260,7 @@ void rte_mbuf_dyn_dump(FILE *out);
  * used by PMD to schedule packet sending.
  */
 #define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+typedef uint64_t rte_mbuf_timestamp_t;
 
 /**
  * Indicate that the timestamp field in the mbuf was filled by the driver.
-- 
2.28.0



* [dpdk-dev] [PATCH v2 04/14] latency: switch timestamp to dynamic mbuf field
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (2 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 03/14] ethdev: register mbuf field and flags for timestamp Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 05/14] net/ark: " Thomas Monjalon
                     ` (10 subsequent siblings)
  14 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Reshma Pattan

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced with the dynamic one.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_latencystats/rte_latencystats.c | 48 +++++++++++++++++++---
 1 file changed, 43 insertions(+), 5 deletions(-)
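
Note on the access pattern used in this patch and in the driver
patches that follow: the field is reached by adding a byte offset to
the mbuf address, and the flag is a mask built from a bit number. The
sketch below shows the same pattern on a reduced structure so that it
compiles without DPDK. Everything prefixed with toy_ or TOY_ is
hypothetical, and the offset and bit number are fixed here only for
illustration; in the patch they come from the registration calls at
init time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much reduced stand-in for struct rte_mbuf. */
struct toy_mbuf {
	uint64_t ol_flags;
	uint64_t dynfield[4];	/* space handed out to dynamic fields */
};

typedef uint64_t toy_timestamp_t;

/* Same idea as RTE_MBUF_DYNFIELD(): base address plus byte offset. */
#define TOY_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))

/* Fixed values for the sketch; real code gets them from registration. */
static const int toy_ts_offset = (int)offsetof(struct toy_mbuf, dynfield);
static const uint64_t toy_ts_flag = UINT64_C(1) << 3;

/* Store a timestamp and mark it as valid. */
void
toy_set_rx_timestamp(struct toy_mbuf *m, toy_timestamp_t ts)
{
	*TOY_DYNFIELD(m, toy_ts_offset, toy_timestamp_t *) = ts;
	m->ol_flags |= toy_ts_flag;
}

/* Return 1 and copy the timestamp if the flag is set, 0 otherwise. */
int
toy_get_rx_timestamp(const struct toy_mbuf *m, toy_timestamp_t *ts)
{
	if ((m->ol_flags & toy_ts_flag) == 0)
		return 0;
	*ts = *TOY_DYNFIELD(m, toy_ts_offset, const toy_timestamp_t *);
	return 1;
}
```

The flag test comes first on purpose: without the flag, the bytes at
the field offset are not guaranteed to hold a timestamp.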

diff --git a/lib/librte_latencystats/rte_latencystats.c b/lib/librte_latencystats/rte_latencystats.c
index ba2fff3bcb..f355cc7ed9 100644
--- a/lib/librte_latencystats/rte_latencystats.c
+++ b/lib/librte_latencystats/rte_latencystats.c
@@ -8,7 +8,9 @@
 #include <math.h>
 
 #include <rte_string_fns.h>
+#include <rte_bitops.h>
 #include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include <rte_log.h>
 #include <rte_cycles.h>
 #include <rte_ethdev.h>
@@ -31,6 +33,16 @@ latencystat_cycles_per_ns(void)
 /* Macros for printing using RTE_LOG */
 #define RTE_LOGTYPE_LATENCY_STATS RTE_LOGTYPE_USER1
 
+static uint64_t timestamp_dynflag;
+static int timestamp_dynfield_offset = -1;
+
+static inline rte_mbuf_timestamp_t *
+timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static const char *MZ_RTE_LATENCY_STATS = "rte_latencystats";
 static int latency_stats_index;
 static uint64_t samp_intvl;
@@ -128,10 +140,10 @@ add_time_stamps(uint16_t pid __rte_unused,
 		diff_tsc = now - prev_tsc;
 		timer_tsc += diff_tsc;
 
-		if ((pkts[i]->ol_flags & PKT_RX_TIMESTAMP) == 0
+		if ((pkts[i]->ol_flags & timestamp_dynflag) == 0
 				&& (timer_tsc >= samp_intvl)) {
-			pkts[i]->timestamp = now;
-			pkts[i]->ol_flags |= PKT_RX_TIMESTAMP;
+			*timestamp_dynfield(pkts[i]) = now;
+			pkts[i]->ol_flags |= timestamp_dynflag;
 			timer_tsc = 0;
 		}
 		prev_tsc = now;
@@ -161,8 +173,8 @@ calc_latency(uint16_t pid __rte_unused,
 
 	now = rte_rdtsc();
 	for (i = 0; i < nb_pkts; i++) {
-		if (pkts[i]->ol_flags & PKT_RX_TIMESTAMP)
-			latency[cnt++] = now - pkts[i]->timestamp;
+		if (pkts[i]->ol_flags & timestamp_dynflag)
+			latency[cnt++] = now - *timestamp_dynfield(pkts[i]);
 	}
 
 	rte_spinlock_lock(&glob_stats->lock);
@@ -204,6 +216,14 @@ int
 rte_latencystats_init(uint64_t app_samp_intvl,
 		rte_latency_stats_flow_type_fn user_cb)
 {
+	static const struct rte_mbuf_dynfield timestamp_dynfield_desc = {
+		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
+		.size = sizeof(rte_mbuf_timestamp_t),
+		.align = __alignof__(rte_mbuf_timestamp_t),
+	};
+	static const struct rte_mbuf_dynflag timestamp_dynflag_desc = {
+		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
+	};
 	unsigned int i;
 	uint16_t pid;
 	uint16_t qid;
@@ -211,6 +231,7 @@ rte_latencystats_init(uint64_t app_samp_intvl,
 	const char *ptr_strings[NUM_LATENCY_STATS] = {0};
 	const struct rte_memzone *mz = NULL;
 	const unsigned int flags = 0;
+	int timestamp_dynflag_offset;
 	int ret;
 
 	if (rte_memzone_lookup(MZ_RTE_LATENCY_STATS))
@@ -241,6 +262,23 @@ rte_latencystats_init(uint64_t app_samp_intvl,
 		return -1;
 	}
 
+	/* Register mbuf field and flag for Rx timestamp */
+	timestamp_dynfield_offset =
+			rte_mbuf_dynfield_register(&timestamp_dynfield_desc);
+	if (timestamp_dynfield_offset < 0) {
+		RTE_LOG(ERR, LATENCY_STATS,
+				"Cannot register mbuf field for timestamp\n");
+		return -rte_errno;
+	}
+	timestamp_dynflag_offset =
+			rte_mbuf_dynflag_register(&timestamp_dynflag_desc);
+	if (timestamp_dynflag_offset < 0) {
+		RTE_LOG(ERR, LATENCY_STATS,
+				"Cannot register mbuf flag for timestamp\n");
+		return -rte_errno;
+	}
+	timestamp_dynflag = RTE_BIT64(timestamp_dynflag_offset);
+
 	/** Register Rx/Tx callbacks */
 	RTE_ETH_FOREACH_DEV(pid) {
 		struct rte_eth_dev_info dev_info;
-- 
2.28.0



* [dpdk-dev] [PATCH v2 05/14] net/ark: switch timestamp to dynamic mbuf field
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (3 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 04/14] latency: switch timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-02 15:32     ` Olivier Matz
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 06/14] net/dpaa2: " Thomas Monjalon
                     ` (9 subsequent siblings)
  14 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Shepard Siegel, Ed Czeck,
	John Miller

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related dynamic mbuf flag is now set; the equivalent flag was missing previously.

The timestamp is set if configured for at least one device.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/ark/ark_ethdev.c    | 23 +++++++++++++++++++++++
 drivers/net/ark/ark_ethdev_rx.c | 10 +++++++++-
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
index fa343999a1..629f825019 100644
--- a/drivers/net/ark/ark_ethdev.c
+++ b/drivers/net/ark/ark_ethdev.c
@@ -9,6 +9,7 @@
 #include <rte_bus_pci.h>
 #include <rte_ethdev_pci.h>
 #include <rte_kvargs.h>
+#include <rte_bitops.h>
 
 #include "rte_pmd_ark.h"
 #include "ark_global.h"
@@ -79,6 +80,8 @@ static int  eth_ark_set_mtu(struct rte_eth_dev *dev, uint16_t size);
 #define ARK_TX_MAX_QUEUE (4096 * 4)
 #define ARK_TX_MIN_QUEUE (256)
 
+uint64_t ark_timestamp_rx_dynflag;
+int ark_timestamp_dynfield_offset = -1;
 int rte_pmd_ark_rx_userdata_dynfield_offset = -1;
 int rte_pmd_ark_tx_userdata_dynfield_offset = -1;
 
@@ -552,6 +555,24 @@ static int
 eth_ark_dev_configure(struct rte_eth_dev *dev)
 {
 	struct ark_adapter *ark = dev->data->dev_private;
+	int ark_timestamp_rx_dynflag_offset;
+
+	if (dev->data->dev_conf.rxmode.offloads & DEV_RX_OFFLOAD_TIMESTAMP) {
+		ark_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+		if (ark_timestamp_dynfield_offset < 0) {
+			ARK_PMD_LOG(ERR, "Failed to lookup timestamp field\n");
+			return -rte_errno;
+		}
+		ark_timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+		if (ark_timestamp_rx_dynflag_offset < 0) {
+			ARK_PMD_LOG(ERR, "Failed to lookup Rx timestamp flag\n");
+			return -rte_errno;
+		}
+		ark_timestamp_rx_dynflag =
+				RTE_BIT64(ark_timestamp_rx_dynflag_offset);
+	}
 
 	eth_ark_dev_set_link_up(dev);
 	if (ark->user_ext.dev_configure)
@@ -782,6 +803,8 @@ eth_ark_dev_info_get(struct rte_eth_dev *dev,
 				ETH_LINK_SPEED_50G |
 				ETH_LINK_SPEED_100G);
 
+	dev_info->rx_offload_capa = DEV_RX_OFFLOAD_TIMESTAMP;
+
 	return 0;
 }
 
diff --git a/drivers/net/ark/ark_ethdev_rx.c b/drivers/net/ark/ark_ethdev_rx.c
index c24cc00e2f..e45adf959d 100644
--- a/drivers/net/ark/ark_ethdev_rx.c
+++ b/drivers/net/ark/ark_ethdev_rx.c
@@ -15,6 +15,9 @@
 #define ARK_RX_META_OFFSET (RTE_PKTMBUF_HEADROOM - ARK_RX_META_SIZE)
 #define ARK_RX_MAX_NOCHAIN (RTE_MBUF_DEFAULT_DATAROOM)
 
+extern uint64_t ark_timestamp_rx_dynflag;
+extern int ark_timestamp_dynfield_offset;
+
 /* Forward declarations */
 struct ark_rx_queue;
 struct ark_rx_meta;
@@ -272,7 +275,12 @@ eth_ark_recv_pkts(void *rx_queue,
 		mbuf->port = meta->port;
 		mbuf->pkt_len = meta->pkt_len;
 		mbuf->data_len = meta->pkt_len;
-		mbuf->timestamp = meta->timestamp;
+		/* set timestamp if enabled at least on one device */
+		if (ark_timestamp_rx_dynflag > 0) {
+			*RTE_MBUF_DYNFIELD(mbuf, ark_timestamp_dynfield_offset,
+				rte_mbuf_timestamp_t *) = meta->timestamp;
+			mbuf->ol_flags |= ark_timestamp_rx_dynflag;
+		}
 		rte_pmd_ark_mbuf_rx_userdata_set(mbuf, meta->user_data);
 
 		if (ARK_DEBUG_CORE) {	/* debug sanity checks */
-- 
2.28.0



* [dpdk-dev] [PATCH v2 06/14] net/dpaa2: switch timestamp to dynamic mbuf field
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (4 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 05/14] net/ark: " Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 07/14] net/mlx5: fix dynamic mbuf offset lookup check Thomas Monjalon
                     ` (8 subsequent siblings)
  14 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Hemant Agrawal,
	Sachin Saxena

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/dpaa2/dpaa2_ethdev.c | 20 ++++++++++++++++++++
 drivers/net/dpaa2/dpaa2_ethdev.h |  2 ++
 drivers/net/dpaa2/dpaa2_rxtx.c   | 25 ++++++++++++++++++-------
 3 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 04e60c56f2..ff368174bd 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -8,6 +8,7 @@
 #include <time.h>
 #include <net/if.h>
 
+#include <rte_bitops.h>
 #include <rte_mbuf.h>
 #include <rte_ethdev_driver.h>
 #include <rte_malloc.h>
@@ -65,6 +66,8 @@ static uint64_t dev_tx_offloads_nodis =
 
 /* enable timestamp in mbuf */
 bool dpaa2_enable_ts[RTE_MAX_ETHPORTS];
+uint64_t dpaa2_timestamp_rx_dynflag;
+int dpaa2_timestamp_dynfield_offset = -1;
 
 struct rte_dpaa2_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -505,6 +508,7 @@ dpaa2_eth_dev_configure(struct rte_eth_dev *dev)
 	struct rte_eth_conf *eth_conf = &dev->data->dev_conf;
 	uint64_t rx_offloads = eth_conf->rxmode.offloads;
 	uint64_t tx_offloads = eth_conf->txmode.offloads;
+	int timestamp_rx_dynflag_offset;
 	int rx_l3_csum_offload = false;
 	int rx_l4_csum_offload = false;
 	int tx_l3_csum_offload = false;
@@ -587,7 +591,23 @@ dpaa2_eth_dev_configure(struct rte_eth_dev *dev)
 #if !defined(RTE_LIBRTE_IEEE1588)
 	if (rx_offloads & DEV_RX_OFFLOAD_TIMESTAMP)
 #endif
+	{
+		dpaa2_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+		if (dpaa2_timestamp_dynfield_offset < 0) {
+			DPAA2_PMD_ERR("Error to lookup timestamp field");
+			return -rte_errno;
+		}
+		timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+		if (timestamp_rx_dynflag_offset < 0) {
+			DPAA2_PMD_ERR("Error to lookup Rx timestamp flag");
+			return -rte_errno;
+		}
+		dpaa2_timestamp_rx_dynflag =
+				RTE_BIT64(timestamp_rx_dynflag_offset);
 		dpaa2_enable_ts[dev->data->port_id] = true;
+	}
 
 	if (tx_offloads & DEV_TX_OFFLOAD_IPV4_CKSUM)
 		tx_l3_csum_offload = true;
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.h b/drivers/net/dpaa2/dpaa2_ethdev.h
index 94cf253827..8d82f74684 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.h
+++ b/drivers/net/dpaa2/dpaa2_ethdev.h
@@ -92,6 +92,8 @@
 
 /* enable timestamp in mbuf*/
 extern bool dpaa2_enable_ts[];
+extern uint64_t dpaa2_timestamp_rx_dynflag;
+extern int dpaa2_timestamp_dynfield_offset;
 
 #define DPAA2_QOS_TABLE_RECONFIGURE	1
 #define DPAA2_FS_TABLE_RECONFIGURE	2
diff --git a/drivers/net/dpaa2/dpaa2_rxtx.c b/drivers/net/dpaa2/dpaa2_rxtx.c
index 6201de4606..9cca6d16c3 100644
--- a/drivers/net/dpaa2/dpaa2_rxtx.c
+++ b/drivers/net/dpaa2/dpaa2_rxtx.c
@@ -31,6 +31,13 @@ dpaa2_dev_rx_parse_slow(struct rte_mbuf *mbuf,
 
 static void enable_tx_tstamp(struct qbman_fd *fd) __rte_unused;
 
+static inline rte_mbuf_timestamp_t *
+dpaa2_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		dpaa2_timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 #define DPAA2_MBUF_TO_CONTIG_FD(_mbuf, _fd, _bpid)  do { \
 	DPAA2_SET_FD_ADDR(_fd, DPAA2_MBUF_VADDR_TO_IOVA(_mbuf)); \
 	DPAA2_SET_FD_LEN(_fd, _mbuf->data_len); \
@@ -109,9 +116,10 @@ dpaa2_dev_rx_parse_new(struct rte_mbuf *m, const struct qbman_fd *fd,
 	m->ol_flags |= PKT_RX_RSS_HASH;
 
 	if (dpaa2_enable_ts[m->port]) {
-		m->timestamp = annotation->word2;
-		m->ol_flags |= PKT_RX_TIMESTAMP;
-		DPAA2_PMD_DP_DEBUG("pkt timestamp:0x%" PRIx64 "", m->timestamp);
+		*dpaa2_timestamp_dynfield(m) = annotation->word2;
+		m->ol_flags |= dpaa2_timestamp_rx_dynflag;
+		DPAA2_PMD_DP_DEBUG("pkt timestamp:0x%" PRIx64 "",
+				*dpaa2_timestamp_dynfield(m));
 	}
 
 	DPAA2_PMD_DP_DEBUG("HW frc = 0x%x\t packet type =0x%x "
@@ -223,9 +231,12 @@ dpaa2_dev_rx_parse(struct rte_mbuf *mbuf, void *hw_annot_addr)
 	else if (BIT_ISSET_AT_POS(annotation->word8, DPAA2_ETH_FAS_L4CE))
 		mbuf->ol_flags |= PKT_RX_L4_CKSUM_BAD;
 
-	mbuf->ol_flags |= PKT_RX_TIMESTAMP;
-	mbuf->timestamp = annotation->word2;
-	DPAA2_PMD_DP_DEBUG("pkt timestamp: 0x%" PRIx64 "", mbuf->timestamp);
+	if (dpaa2_enable_ts[mbuf->port]) {
+		*dpaa2_timestamp_dynfield(mbuf) = annotation->word2;
+		mbuf->ol_flags |= dpaa2_timestamp_rx_dynflag;
+		DPAA2_PMD_DP_DEBUG("pkt timestamp: 0x%" PRIx64 "",
+				*dpaa2_timestamp_dynfield(mbuf));
+	}
 
 	/* Check detailed parsing requirement */
 	if (annotation->word3 & 0x7FFFFC3FFFF)
@@ -629,7 +640,7 @@ dpaa2_dev_prefetch_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		else
 			bufs[num_rx] = eth_fd_to_mbuf(fd, eth_data->port_id);
 #if defined(RTE_LIBRTE_IEEE1588)
-		priv->rx_timestamp = bufs[num_rx]->timestamp;
+		priv->rx_timestamp = *dpaa2_timestamp_dynfield(bufs[num_rx]);
 #endif
 
 		if (eth_data->dev_conf.rxmode.offloads &
-- 
2.28.0



* [dpdk-dev] [PATCH v2 07/14] net/mlx5: fix dynamic mbuf offset lookup check
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (5 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 06/14] net/dpaa2: " Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 08/14] net/mlx5: switch timestamp to dynamic mbuf field Thomas Monjalon
                     ` (7 subsequent siblings)
  14 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, stable, Matan Azrad,
	Shahaf Shuler, Ori Kam

The functions rte_mbuf_dynfield_lookup() and rte_mbuf_dynflag_lookup()
can return an offset starting with 0 or a negative error code.

In reality the first offsets are probably reserved forever,
but for the sake of strict API compliance,
the checks which considered 0 as an error are fixed.

Fixes: efa79e68c8cd ("net/mlx5: support fine grain dynamic flag")
Fixes: 3172c471b86f ("net/mlx5: prepare Tx queue structures to support timestamp")
Fixes: 0febfcce3693 ("net/mlx5: prepare Tx to support scheduling")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/mlx5/mlx5_rxtx.c    | 4 ++--
 drivers/net/mlx5/mlx5_trigger.c | 2 +-
 drivers/net/mlx5/mlx5_txq.c     | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)
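
Note illustrating the check being fixed: a lookup that returns a bit
number or a negative error must be tested with ">= 0", since bit 0 is
a valid result. The sketch below is self-contained and does not use
the DPDK API; toy_dynflag_lookup() and the flag names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical lookup: returns the bit number, or -1 if unknown. */
int
toy_dynflag_lookup(const char *name)
{
	static const char *const names[] = { "toy_flag_a", "toy_flag_b" };
	int i;

	for (i = 0; i < 2; i++) {
		if (strcmp(names[i], name) == 0)
			return i;
	}
	return -1;
}

/* Turn a lookup result into a flag mask; 0 means "not registered". */
uint64_t
toy_flag_mask(int bitnum)
{
	/* ">= 0" and not "> 0": bit number 0 is a valid flag. */
	if (bitnum >= 0)
		return UINT64_C(1) << bitnum;
	return 0;
}
```

With a "> 0" test, "toy_flag_a" (bit 0) would be treated as missing
and its mask would silently be 0.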

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index b530ff421f..e86468b67a 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -5661,9 +5661,9 @@ mlx5_select_tx_function(struct rte_eth_dev *dev)
 	}
 	if (tx_offloads & DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP &&
 	    rte_mbuf_dynflag_lookup
-			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) > 0 &&
+			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) >= 0 &&
 	    rte_mbuf_dynfield_lookup
-			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) > 0) {
+			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) >= 0) {
 		/* Offload configured, dynamic entities registered. */
 		olx |= MLX5_TXOFF_CONFIG_TXPP;
 	}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 7735f022a3..917b433c4a 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -302,7 +302,7 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	DRV_LOG(DEBUG, "port %u starting device", dev->data->port_id);
 	fine_inline = rte_mbuf_dynflag_lookup
 		(RTE_PMD_MLX5_FINE_GRANULARITY_INLINE, NULL);
-	if (fine_inline > 0)
+	if (fine_inline >= 0)
 		rte_net_mlx5_dynf_inline_mask = 1UL << fine_inline;
 	else
 		rte_net_mlx5_dynf_inline_mask = 0;
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index af84f5f72b..8ed2bcff7b 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -1305,7 +1305,7 @@ mlx5_txq_dynf_timestamp_set(struct rte_eth_dev *dev)
 				(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL);
 	off = rte_mbuf_dynfield_lookup
 				(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
-	if (nbit > 0 && off >= 0 && sh->txpp.refcnt)
+	if (nbit >= 0 && off >= 0 && sh->txpp.refcnt)
 		mask = 1ULL << nbit;
 	for (i = 0; i != priv->txqs_n; ++i) {
 		data = (*priv->txqs)[i];
-- 
2.28.0



* [dpdk-dev] [PATCH v2 08/14] net/mlx5: switch timestamp to dynamic mbuf field
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (6 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 07/14] net/mlx5: fix dynamic mbuf offset lookup check Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-02  5:08     ` Ruifeng Wang
  2020-11-02 23:20     ` David Christensen
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 09/14] net/nfb: " Thomas Monjalon
                     ` (6 subsequent siblings)
  14 siblings, 2 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Matan Azrad,
	Shahaf Shuler, David Christensen, Ruifeng Wang,
	Konstantin Ananyev

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
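
Note for readers: the datapath pattern in this patch is to resolve the
field offset and the flag mask once at queue setup, then write through
the cached offset per packet. The sketch below illustrates that pattern
without DPDK headers. struct demo_mbuf, DEMO_DYNFIELD() and the demo_*
helpers are hypothetical miniatures, not the real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of an mbuf: one flags word plus a small area
 * reserved for dynamic fields. */
struct demo_mbuf {
	uint64_t ol_flags;
	uint64_t dynfield[2];
};

/* Same idea as a dynamic field accessor: typed pointer at a byte offset. */
#define DEMO_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))

/* Per-queue values resolved once at setup time. */
struct demo_rxq {
	int timestamp_offset;       /* byte offset of the timestamp field */
	uint64_t timestamp_rx_flag; /* flag mask, 0 when offload is disabled */
};

static inline void
demo_timestamp_set(struct demo_mbuf *m, int offset, uint64_t ts)
{
	assert(offset >= 0);
	*DEMO_DYNFIELD(m, offset, uint64_t *) = ts;
}

/* Per-packet work: one store through the cached offset, one OR of the mask. */
static inline void
demo_rx_fill(const struct demo_rxq *rxq, struct demo_mbuf *m, uint64_t ts)
{
	demo_timestamp_set(m, rxq->timestamp_offset, ts);
	m->ol_flags |= rxq->timestamp_rx_flag;
}
```

Because the mask is 0 when the offload is disabled, the OR in the
datapath needs no extra branch.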
 drivers/net/mlx5/mlx5_rxq.c              | 36 ++++++++++++++++++++
 drivers/net/mlx5/mlx5_rxtx.c             |  4 +--
 drivers/net/mlx5/mlx5_rxtx.h             | 19 +++++++++++
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 41 +++++++++++-----------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 43 ++++++++++++------------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     | 35 +++++++++----------
 6 files changed, 118 insertions(+), 60 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index f1d8373079..877aa24a18 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1310,6 +1310,39 @@ mlx5_max_lro_msg_size_adjust(struct rte_eth_dev *dev, uint16_t idx,
 		priv->max_lro_msg_size * MLX5_LRO_SEG_CHUNK_SIZE);
 }
 
+/**
+ * Lookup mbuf field and flag for Rx timestamp if offload requested.
+ *
+ * @param rxq_data
+ *   Datapath struct where field offset and flag mask are stored.
+ *
+ * @return
+ *   0 on success or offload disabled, negative errno otherwise.
+ */
+static int
+mlx5_rx_timestamp_setup(struct mlx5_rxq_data *rxq_data)
+{
+	int timestamp_rx_dynflag_offset;
+
+	rxq_data->timestamp_rx_flag = 0;
+	if (rxq_data->hw_timestamp == 0)
+		return 0;
+	rxq_data->timestamp_offset = rte_mbuf_dynfield_lookup(
+			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+	if (rxq_data->timestamp_offset < 0) {
+		DRV_LOG(ERR, "Cannot lookup timestamp field\n");
+		return -rte_errno;
+	}
+	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+	if (timestamp_rx_dynflag_offset < 0) {
+		DRV_LOG(ERR, "Cannot lookup Rx timestamp flag\n");
+		return -rte_errno;
+	}
+	rxq_data->timestamp_rx_flag = RTE_BIT64(timestamp_rx_dynflag_offset);
+	return 0;
+}
+
 /**
  * Create a DPDK Rx queue.
  *
@@ -1492,7 +1525,10 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	mlx5_max_lro_msg_size_adjust(dev, idx, max_lro_size);
 	/* Toggle RX checksum offload if hardware supports it. */
 	tmpl->rxq.csum = !!(offloads & DEV_RX_OFFLOAD_CHECKSUM);
+	/* Configure Rx timestamp. */
 	tmpl->rxq.hw_timestamp = !!(offloads & DEV_RX_OFFLOAD_TIMESTAMP);
+	if (mlx5_rx_timestamp_setup(&tmpl->rxq) != 0)
+		goto error;
 	/* Configure VLAN stripping. */
 	tmpl->rxq.vlan_strip = !!(offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
 	/* By default, FCS (CRC) is stripped by hardware. */
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index e86468b67a..b577aab00b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1287,8 +1287,8 @@ rxq_cq_to_mbuf(struct mlx5_rxq_data *rxq, struct rte_mbuf *pkt,
 
 		if (rxq->rt_timestamp)
 			ts = mlx5_txpp_convert_rx_ts(rxq->sh, ts);
-		pkt->timestamp = ts;
-		pkt->ol_flags |= PKT_RX_TIMESTAMP;
+		mlx5_timestamp_set(pkt, rxq->timestamp_offset, ts);
+		pkt->ol_flags |= rxq->timestamp_rx_flag;
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 674296ee98..e9eca36b40 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -151,6 +151,8 @@ struct mlx5_rxq_data {
 	/* CQ (UAR) access lock required for 32bit implementations */
 #endif
 	uint32_t tunnel; /* Tunnel information. */
+	int timestamp_offset; /* Dynamic mbuf field for timestamp. */
+	uint64_t timestamp_rx_flag; /* Dynamic mbuf flag for timestamp. */
 	uint64_t flow_meta_mask;
 	int32_t flow_meta_offset;
 } __rte_cache_aligned;
@@ -681,4 +683,21 @@ mlx5_txpp_convert_tx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t mts)
 	return ci;
 }
 
+/**
+ * Set timestamp in mbuf dynamic field.
+ *
+ * @param mbuf
+ *   Structure to write into.
+ * @param offset
+ *   Dynamic field offset in mbuf structure.
+ * @param timestamp
+ *   Value to write.
+ */
+static __rte_always_inline void
+mlx5_timestamp_set(struct rte_mbuf *mbuf, int offset,
+		rte_mbuf_timestamp_t timestamp)
+{
+	*RTE_MBUF_DYNFIELD(mbuf, offset, rte_mbuf_timestamp_t *) = timestamp;
+}
+
 #endif /* RTE_PMD_MLX5_RXTX_H_ */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 6bf0c9b540..171d7bb0f8 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -330,13 +330,13 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	vector unsigned char ol_flags = (vector unsigned char)
 		(vector unsigned int){
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP};
+				rxq->hw_timestamp * rxq->timestamp_rx_flag};
 	vector unsigned char cv_flags;
 	const vector unsigned char zero = (vector unsigned char){0};
 	const vector unsigned char ptype_mask =
@@ -1025,31 +1025,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		/* D.5 fill in mbuf - rearm_data and packet_type. */
 		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
 
 				ts = rte_be_to_cpu_64(cq[pos].timestamp);
-				pkts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
-				pkts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
-				pkts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
-				pkts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				pkts[pos]->timestamp = rte_be_to_cpu_64
-						(cq[pos].timestamp);
-				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p1].timestamp);
-				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p2].timestamp);
-				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p3].timestamp);
+				mlx5_timestamp_set(pkts[pos], offset,
+					rte_be_to_cpu_64(cq[pos].timestamp));
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					rte_be_to_cpu_64(cq[pos + p1].timestamp));
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					rte_be_to_cpu_64(cq[pos + p2].timestamp));
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					rte_be_to_cpu_64(cq[pos + p3].timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index d122dad4fe..436b247ade 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -271,7 +271,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	uint32x4_t pinfo, cv_flags;
 	uint32x4_t ol_flags =
 		vdupq_n_u32(rxq->rss_hash * PKT_RX_RSS_HASH |
-			    rxq->hw_timestamp * PKT_RX_TIMESTAMP);
+			    rxq->hw_timestamp * rxq->timestamp_rx_flag);
 	const uint32x4_t ptype_ol_mask = { 0x106, 0x106, 0x106, 0x106 };
 	const uint8x16_t cv_flag_sel = {
 		0,
@@ -697,6 +697,7 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		rxq_cq_to_ptype_oflags_v(rxq, ptype_info, flow_tag,
 					 opcode, &elts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
@@ -704,36 +705,36 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 				ts = rte_be_to_cpu_64
 					(container_of(p0, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p1, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p2, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p3, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				elts[pos]->timestamp = rte_be_to_cpu_64
-					(container_of(p0, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 1]->timestamp = rte_be_to_cpu_64
-					(container_of(p1, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 2]->timestamp = rte_be_to_cpu_64
-					(container_of(p2, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 3]->timestamp = rte_be_to_cpu_64
-					(container_of(p3, struct mlx5_cqe,
-						      pkt_info)->timestamp);
+				mlx5_timestamp_set(elts[pos], offset,
+					rte_be_to_cpu_64(container_of(p0,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 1], offset,
+					rte_be_to_cpu_64(container_of(p1,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 2], offset,
+					rte_be_to_cpu_64(container_of(p2,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 3], offset,
+					rte_be_to_cpu_64(container_of(p3,
+					struct mlx5_cqe, pkt_info)->timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 0bbcbeefff..ae4439efc7 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -251,7 +251,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	__m128i pinfo0, pinfo1;
 	__m128i pinfo, ptype;
 	__m128i ol_flags = _mm_set1_epi32(rxq->rss_hash * PKT_RX_RSS_HASH |
-					  rxq->hw_timestamp * PKT_RX_TIMESTAMP);
+					  rxq->hw_timestamp * rxq->timestamp_rx_flag);
 	__m128i cv_flags;
 	const __m128i zero = _mm_setzero_si128();
 	const __m128i ptype_mask =
@@ -656,31 +656,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		/* D.5 fill in mbuf - rearm_data and packet_type. */
 		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
 
 				ts = rte_be_to_cpu_64(cq[pos].timestamp);
-				pkts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
-				pkts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
-				pkts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
-				pkts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				pkts[pos]->timestamp = rte_be_to_cpu_64
-						(cq[pos].timestamp);
-				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p1].timestamp);
-				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p2].timestamp);
-				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p3].timestamp);
+				mlx5_timestamp_set(pkts[pos], offset,
+					rte_be_to_cpu_64(cq[pos].timestamp));
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					rte_be_to_cpu_64(cq[pos + p1].timestamp));
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					rte_be_to_cpu_64(cq[pos + p2].timestamp));
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					rte_be_to_cpu_64(cq[pos + p3].timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
-- 
2.28.0



* [dpdk-dev] [PATCH v2 09/14] net/nfb: switch timestamp to dynamic mbuf field
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (7 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 08/14] net/mlx5: switch timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 10/14] net/octeontx2: " Thomas Monjalon
                     ` (5 subsequent siblings)
  14 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Martin Spinler

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
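
Note for readers: with this patch nfb_check_timestamp() can return a
negative errno in addition to 0 and 1, which is why the caller now tests
"> 0" instead of plain truthiness. The sketch below shows the difference
with hypothetical helpers; the parameters are assumptions standing in
for the devargs parsing and the mbuf lookups.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical tri-state check:
 *   1              timestamps requested and lookups succeeded
 *   0              timestamps not requested
 *   negative errno lookup failure */
static int
demo_check_timestamp(bool requested, bool field_registered)
{
	if (!requested)
		return 0;
	if (!field_registered)
		return -ENOENT;
	return 1;
}

static bool
demo_timestamp_enabled(bool requested, bool field_registered)
{
	int ret = demo_check_timestamp(requested, field_registered);

	assert(ret <= 1); /* documented range: negative errno, 0 or 1 */
	/* "if (ret)" would treat -ENOENT as enabled; "ret > 0" does not. */
	return ret > 0;
}
```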
 drivers/net/nfb/nfb_rx.c | 23 ++++++++++++++++++++++-
 drivers/net/nfb/nfb_rx.h | 18 ++++++++++++++----
 2 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/drivers/net/nfb/nfb_rx.c b/drivers/net/nfb/nfb_rx.c
index d97179f818..1f90357522 100644
--- a/drivers/net/nfb/nfb_rx.c
+++ b/drivers/net/nfb/nfb_rx.c
@@ -5,10 +5,14 @@
  */
 
 #include <rte_kvargs.h>
+#include <rte_bitops.h>
 
 #include "nfb_rx.h"
 #include "nfb.h"
 
+uint64_t nfb_timestamp_rx_dynflag;
+int nfb_timestamp_dynfield_offset = -1;
+
 static int
 timestamp_check_handler(__rte_unused const char *key,
 	const char *value, __rte_unused void *opaque)
@@ -23,6 +27,7 @@ timestamp_check_handler(__rte_unused const char *key,
 static int
 nfb_check_timestamp(struct rte_devargs *devargs)
 {
+	int timestamp_rx_dynflag_offset;
 	struct rte_kvargs *kvlist;
 
 	if (devargs == NULL)
@@ -38,6 +43,7 @@ nfb_check_timestamp(struct rte_devargs *devargs)
 	}
 	/* Timestamps are enabled when there is
 	 * key-value pair: enable_timestamp=1
+	 * TODO: timestamp should be enabled with DEV_RX_OFFLOAD_TIMESTAMP
 	 */
 	if (rte_kvargs_process(kvlist, TIMESTAMP_ARG,
 		timestamp_check_handler, NULL) < 0) {
@@ -46,6 +52,21 @@ nfb_check_timestamp(struct rte_devargs *devargs)
 	}
 	rte_kvargs_free(kvlist);
 
+	nfb_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+	if (nfb_timestamp_dynfield_offset < 0) {
+		RTE_LOG(ERR, PMD, "Cannot lookup timestamp field\n");
+		return -rte_errno;
+	}
+	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+	if (timestamp_rx_dynflag_offset < 0) {
+		RTE_LOG(ERR, PMD, "Cannot lookup Rx timestamp flag\n");
+		return -rte_errno;
+	}
+	nfb_timestamp_rx_dynflag =
+		RTE_BIT64(timestamp_rx_dynflag_offset);
+
 	return 1;
 }
 
@@ -125,7 +146,7 @@ nfb_eth_rx_queue_setup(struct rte_eth_dev *dev,
 	else
 		rte_free(rxq);
 
-	if (nfb_check_timestamp(dev->device->devargs))
+	if (nfb_check_timestamp(dev->device->devargs) > 0)
 		rxq->flags |= NFB_TIMESTAMP_FLAG;
 
 	return ret;
diff --git a/drivers/net/nfb/nfb_rx.h b/drivers/net/nfb/nfb_rx.h
index cf3899b2fb..e548226e0f 100644
--- a/drivers/net/nfb/nfb_rx.h
+++ b/drivers/net/nfb/nfb_rx.h
@@ -15,6 +15,16 @@
 
 #define NFB_TIMESTAMP_FLAG (1 << 0)
 
+extern uint64_t nfb_timestamp_rx_dynflag;
+extern int nfb_timestamp_dynfield_offset;
+
+static inline rte_mbuf_timestamp_t *
+nfb_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		nfb_timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 struct ndp_rx_queue {
 	struct nfb_device *nfb;	     /* nfb dev structure */
 	struct ndp_queue *queue;     /* rx queue */
@@ -191,15 +201,15 @@ nfb_eth_ndp_rx(void *queue,
 
 			if (timestamping_enabled) {
 				/* nanoseconds */
-				mbuf->timestamp =
+				*nfb_timestamp_dynfield(mbuf) =
 					rte_le_to_cpu_32(*((uint32_t *)
 					(packets[i].header + 4)));
-				mbuf->timestamp <<= 32;
+				*nfb_timestamp_dynfield(mbuf) <<= 32;
 				/* seconds */
-				mbuf->timestamp |=
+				*nfb_timestamp_dynfield(mbuf) |=
 					rte_le_to_cpu_32(*((uint32_t *)
 					(packets[i].header + 8)));
-				mbuf->ol_flags |= PKT_RX_TIMESTAMP;
+				mbuf->ol_flags |= nfb_timestamp_rx_dynflag;
 			}
 
 			bufs[num_rx++] = mbuf;
-- 
2.28.0



* [dpdk-dev] [PATCH v2 10/14] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (8 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 09/14] net/nfb: " Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-01 18:28     ` Jerin Jacob
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 11/14] net/pcap: " Thomas Monjalon
                     ` (4 subsequent siblings)
  14 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Nithin Dabilpuram,
	Kiran Kumar K, Ray Kinsella, Neil Horman

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/octeontx2/otx2_ethdev.c | 33 +++++++++++++++++++++++++++++
 drivers/net/octeontx2/otx2_rx.h     | 19 ++++++++++++++---
 drivers/net/octeontx2/version.map   |  7 ++++++
 3 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
index cfb733a4b5..ad95219438 100644
--- a/drivers/net/octeontx2/otx2_ethdev.c
+++ b/drivers/net/octeontx2/otx2_ethdev.c
@@ -4,6 +4,7 @@
 
 #include <inttypes.h>
 
+#include <rte_bitops.h>
 #include <rte_ethdev_pci.h>
 #include <rte_io.h>
 #include <rte_malloc.h>
@@ -14,6 +15,35 @@
 #include "otx2_ethdev.h"
 #include "otx2_ethdev_sec.h"
 
+uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
+int rte_pmd_octeontx2_timestamp_dynfield_offset = -1;
+
+static int
+otx2_rx_timestamp_setup(uint16_t flags)
+{
+	int timestamp_rx_dynflag_offset;
+
+	if ((flags & NIX_RX_OFFLOAD_TSTAMP_F) == 0)
+		return 0;
+
+	rte_pmd_octeontx2_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+	if (rte_pmd_octeontx2_timestamp_dynfield_offset < 0) {
+		otx2_err("Failed to lookup timestamp field");
+		return -rte_errno;
+	}
+	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+	if (timestamp_rx_dynflag_offset < 0) {
+		otx2_err("Failed to lookup Rx timestamp flag");
+		return -rte_errno;
+	}
+	rte_pmd_octeontx2_timestamp_rx_dynflag =
+			RTE_BIT64(timestamp_rx_dynflag_offset);
+
+	return 0;
+}
+
 static inline uint64_t
 nix_get_rx_offload_capa(struct otx2_eth_dev *dev)
 {
@@ -1874,6 +1904,9 @@ otx2_nix_configure(struct rte_eth_dev *eth_dev)
 	dev->tx_offload_flags |= nix_tx_offload_flags(eth_dev);
 	dev->rss_info.rss_grps = NIX_RSS_GRPS;
 
+	if (otx2_rx_timestamp_setup(dev->rx_offload_flags) != 0)
+		goto fail_offloads;
+
 	nb_rxq = RTE_MAX(data->nb_rx_queues, 1);
 	nb_txq = RTE_MAX(data->nb_tx_queues, 1);
 
diff --git a/drivers/net/octeontx2/otx2_rx.h b/drivers/net/octeontx2/otx2_rx.h
index 61a5c436dd..6981edce82 100644
--- a/drivers/net/octeontx2/otx2_rx.h
+++ b/drivers/net/octeontx2/otx2_rx.h
@@ -63,6 +63,18 @@ union mbuf_initializer {
 	uint64_t value;
 };
 
+/* variables are exported because this file is included in other drivers */
+extern uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
+extern int rte_pmd_octeontx2_timestamp_dynfield_offset;
+
+static inline rte_mbuf_timestamp_t *
+otx2_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		rte_pmd_octeontx2_timestamp_dynfield_offset,
+		rte_mbuf_timestamp_t *);
+}
+
 static __rte_always_inline void
 otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
 			struct otx2_timesync_info *tstamp, const uint16_t flag,
@@ -77,15 +89,16 @@ otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
 		/* Reading the rx timestamp inserted by CGX, viz at
 		 * starting of the packet data.
 		 */
-		mbuf->timestamp = rte_be_to_cpu_64(*tstamp_ptr);
+		*otx2_timestamp_dynfield(mbuf) = rte_be_to_cpu_64(*tstamp_ptr);
 		/* PKT_RX_IEEE1588_TMST flag needs to be set only in case
 		 * PTP packets are received.
 		 */
 		if (mbuf->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC) {
-			tstamp->rx_tstamp = mbuf->timestamp;
+			tstamp->rx_tstamp = *otx2_timestamp_dynfield(mbuf);
 			tstamp->rx_ready = 1;
 			mbuf->ol_flags |= PKT_RX_IEEE1588_PTP |
-				PKT_RX_IEEE1588_TMST | PKT_RX_TIMESTAMP;
+				PKT_RX_IEEE1588_TMST |
+				rte_pmd_octeontx2_timestamp_rx_dynflag;
 		}
 	}
 }
diff --git a/drivers/net/octeontx2/version.map b/drivers/net/octeontx2/version.map
index 4a76d1d52d..d4f4784bcd 100644
--- a/drivers/net/octeontx2/version.map
+++ b/drivers/net/octeontx2/version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+INTERNAL {
+	global:
+
+	rte_pmd_octeontx2_timestamp_dynfield_offset;
+	rte_pmd_octeontx2_timestamp_rx_dynflag;
+};
-- 
2.28.0



* [dpdk-dev] [PATCH v2 11/14] net/pcap: switch timestamp to dynamic mbuf field
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (9 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 10/14] net/octeontx2: " Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 12/14] app/testpmd: " Thomas Monjalon
                     ` (3 subsequent siblings)
  14 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
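
Note for readers: the value stored in the dynamic field is unchanged by
this patch; it is still the capture time in microseconds, computed from
the seconds and microseconds of the pcap header. A minimal standalone
sketch of that arithmetic follows; demo_timeval_to_us() is a
hypothetical helper name, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/* Convert a struct timeval to a single microsecond count. */
static uint64_t
demo_timeval_to_us(const struct timeval *tv)
{
	assert(tv->tv_usec >= 0 && tv->tv_usec < 1000000);
	/* Cast first so the multiplication is done in 64 bits. */
	return (uint64_t)tv->tv_sec * 1000000 + (uint64_t)tv->tv_usec;
}
```

The cast before the multiplication matters on platforms where tv_sec is
a 32-bit type.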
 drivers/net/pcap/rte_eth_pcap.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 34e82317b1..b4b7a1839b 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -18,6 +18,7 @@
 
 #include <pcap.h>
 
+#include <rte_bitops.h>
 #include <rte_cycles.h>
 #include <rte_ethdev_driver.h>
 #include <rte_ethdev_vdev.h>
@@ -51,6 +52,9 @@ static uint64_t start_cycles;
 static uint64_t hz;
 static uint8_t iface_idx;
 
+static uint64_t timestamp_rx_dynflag;
+static int timestamp_dynfield_offset = -1;
+
 struct queue_stat {
 	volatile unsigned long pkts;
 	volatile unsigned long bytes;
@@ -265,9 +269,11 @@ eth_pcap_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		}
 
 		mbuf->pkt_len = (uint16_t)header.caplen;
-		mbuf->timestamp = (uint64_t)header.ts.tv_sec * 1000000
-							+ header.ts.tv_usec;
-		mbuf->ol_flags |= PKT_RX_TIMESTAMP;
+		*RTE_MBUF_DYNFIELD(mbuf, timestamp_dynfield_offset,
+			rte_mbuf_timestamp_t *) =
+				(uint64_t)header.ts.tv_sec * 1000000 +
+				header.ts.tv_usec;
+		mbuf->ol_flags |= timestamp_rx_dynflag;
 		mbuf->port = pcap_q->port_id;
 		bufs[num_rx] = mbuf;
 		num_rx++;
@@ -656,6 +662,23 @@ eth_dev_stop(struct rte_eth_dev *dev)
 static int
 eth_dev_configure(struct rte_eth_dev *dev __rte_unused)
 {
+	int timestamp_rx_dynflag_offset;
+
+	timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+	if (timestamp_dynfield_offset < 0) {
+		PMD_LOG(ERR, "Failed to lookup timestamp field");
+		return -rte_errno;
+	}
+	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+	if (timestamp_rx_dynflag_offset < 0) {
+		PMD_LOG(ERR, "Failed to lookup Rx timestamp flag");
+		return -rte_errno;
+	}
+	timestamp_rx_dynflag =
+		RTE_BIT64(timestamp_rx_dynflag_offset);
+
 	return 0;
 }
 
-- 
2.28.0



* [dpdk-dev] [PATCH v2 12/14] app/testpmd: switch timestamp to dynamic mbuf field
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (10 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 11/14] net/pcap: " Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 13/14] examples/rxtx_callbacks: switch timestamp to dynamic field Thomas Monjalon
                     ` (2 subsequent siblings)
  14 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
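
Note for readers: the helpers added here resolve the flag and the field
lazily, on first use, and cache the result in a function-local static.
A failed lookup is not cached, so it is retried on the next call. The
sketch below illustrates that behaviour with a hypothetical mock lookup
(demo_lazy_lookup) in place of the real mbuf API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static int demo_lookup_calls;        /* counts calls to the mock lookup */
static int demo_registered_bit = -1; /* -1 until the flag is "registered" */

/* Hypothetical mock lookup: bit number on success, -1 otherwise. */
static int
demo_lazy_lookup(void)
{
	demo_lookup_calls++;
	return demo_registered_bit;
}

static bool
demo_is_timestamp_enabled(uint64_t ol_flags)
{
	static uint64_t flag; /* 0 means "not resolved yet" */

	if (flag == 0) {
		int bit = demo_lazy_lookup();

		if (bit < 0)
			return false; /* nothing cached, retry on next call */
		assert(bit < 64);
		flag = UINT64_C(1) << bit;
	}
	return (ol_flags & flag) != 0;
}
```

Once the lookup succeeds, later calls only test the cached mask.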
 app/test-pmd/util.c | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 781a813759..eebb5166ad 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -5,6 +5,7 @@
 
 #include <stdio.h>
 
+#include <rte_bitops.h>
 #include <rte_net.h>
 #include <rte_mbuf.h>
 #include <rte_ether.h>
@@ -22,6 +23,40 @@ print_ether_addr(const char *what, const struct rte_ether_addr *eth_addr)
 	printf("%s%s", what, buf);
 }
 
+static inline bool
+is_timestamp_enabled(const struct rte_mbuf *mbuf)
+{
+	static uint64_t timestamp_rx_dynflag;
+
+	int timestamp_rx_dynflag_offset;
+
+	if (timestamp_rx_dynflag == 0) {
+		timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+		if (timestamp_rx_dynflag_offset < 0)
+			return false;
+		timestamp_rx_dynflag = RTE_BIT64(timestamp_rx_dynflag_offset);
+	}
+
+	return (mbuf->ol_flags & timestamp_rx_dynflag) != 0;
+}
+
+static inline rte_mbuf_timestamp_t
+get_timestamp(const struct rte_mbuf *mbuf)
+{
+	static int timestamp_dynfield_offset = -1;
+
+	if (timestamp_dynfield_offset < 0) {
+		timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+		if (timestamp_dynfield_offset < 0)
+			return 0;
+	}
+
+	return *RTE_MBUF_DYNFIELD(mbuf,
+			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static inline void
 dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
 	      uint16_t nb_pkts, int is_rx)
@@ -107,8 +142,8 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
 				printf("hash=0x%x ID=0x%x ",
 				       mb->hash.fdir.hash, mb->hash.fdir.id);
 		}
-		if (ol_flags & PKT_RX_TIMESTAMP)
-			printf(" - timestamp %"PRIu64" ", mb->timestamp);
+		if (is_timestamp_enabled(mb))
+			printf(" - timestamp %"PRIu64" ", get_timestamp(mb));
 		if (ol_flags & PKT_RX_QINQ)
 			printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x",
 			       mb->vlan_tci, mb->vlan_tci_outer);
-- 
2.28.0



* [dpdk-dev] [PATCH v2 13/14] examples/rxtx_callbacks: switch timestamp to dynamic field
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (11 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 12/14] app/testpmd: " Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 14/14] mbuf: remove deprecated timestamp field Thomas Monjalon
  2020-11-01 18:08   ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
  14 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, John McNamara

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 examples/rxtx_callbacks/main.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
index 1a8e7d47d9..b1869a5ce8 100644
--- a/examples/rxtx_callbacks/main.c
+++ b/examples/rxtx_callbacks/main.c
@@ -19,6 +19,15 @@
 #define MBUF_CACHE_SIZE 250
 #define BURST_SIZE 32
 
+static int hwts_dynfield_offset = -1;
+
+static inline rte_mbuf_timestamp_t *
+hwts_field(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+			hwts_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 typedef uint64_t tsc_t;
 static int tsc_dynfield_offset = -1;
 
@@ -77,7 +86,7 @@ calc_latency(uint16_t port, uint16_t qidx __rte_unused,
 	for (i = 0; i < nb_pkts; i++) {
 		cycles += now - *tsc_field(pkts[i]);
 		if (hw_timestamping)
-			queue_ticks += ticks - pkts[i]->timestamp;
+			queue_ticks += ticks - *hwts_field(pkts[i]);
 	}
 
 	latency_numbers.total_cycles += cycles;
@@ -141,6 +150,12 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
 			return -1;
 		}
 		port_conf.rxmode.offloads |= DEV_RX_OFFLOAD_TIMESTAMP;
+		hwts_dynfield_offset = rte_mbuf_dynfield_lookup(
+				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+		if (hwts_dynfield_offset < 0) {
+			printf("ERROR: Failed to lookup timestamp field\n");
+			return -rte_errno;
+		}
 	}
 
 	retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
-- 
2.28.0



* [dpdk-dev] [PATCH v2 14/14] mbuf: remove deprecated timestamp field
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (12 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 13/14] examples/rxtx_callbacks: switch timestamp to dynamic field Thomas Monjalon
@ 2020-11-01 18:06   ` Thomas Monjalon
  2020-11-02 15:41     ` Olivier Matz
  2020-11-01 18:08   ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
  14 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:06 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ajit Khaparde,
	Ray Kinsella, Neil Horman

As announced in the deprecation note, the field timestamp
is removed to give more space to the dynamic fields.
The related offload flag PKT_RX_TIMESTAMP is also removed.

This is what the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
 0    void *                            buf_addr;         /*   0 +  8 */
 1    rte_iova_t                        buf_iova;         /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
 2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
 3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
 4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
 5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
 5.5  uint64_t             union        hash;             /*  44 +  8 */
 6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
 7    uint64_t                          dynfield0[1];     /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
 8    struct rte_mempool *              pool;             /*  64 +  8 */
 9    struct rte_mbuf *                 next;             /*  72 +  8 */
10    uint64_t             union        tx_offload;       /*  80 +  8 */
11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
12    uint16_t                          priv_size;        /*  96 +  2 */
      uint16_t                          timesync;         /*  98 +  2 */
12.5  uint32_t                          dynfield1[7];     /* 100 + 28 */
16    /* --- END                                             128      */

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
---
 app/test/test_mbuf.c                   |  1 -
 doc/guides/rel_notes/deprecation.rst   |  4 ----
 doc/guides/rel_notes/release_20_11.rst |  4 ++++
 lib/librte_ethdev/rte_ethdev.h         |  4 +++-
 lib/librte_mbuf/rte_mbuf.c             |  2 --
 lib/librte_mbuf/rte_mbuf.h             |  2 +-
 lib/librte_mbuf/rte_mbuf_core.h        | 12 +-----------
 lib/librte_mbuf/rte_mbuf_dyn.c         |  1 +
 8 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 3a13cf4e1f..a40f7d4883 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1621,7 +1621,6 @@ test_get_rx_ol_flag_name(void)
 		VAL_NAME(PKT_RX_FDIR_FLX),
 		VAL_NAME(PKT_RX_QINQ_STRIPPED),
 		VAL_NAME(PKT_RX_LRO),
-		VAL_NAME(PKT_RX_TIMESTAMP),
 		VAL_NAME(PKT_RX_SEC_OFFLOAD),
 		VAL_NAME(PKT_RX_SEC_OFFLOAD_FAILED),
 		VAL_NAME(PKT_RX_OUTER_L4_CKSUM_BAD),
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index fe3fd3956c..22aecf0bab 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -84,10 +84,6 @@ Deprecation Notices
 * mbuf: Some fields will be converted to dynamic API in DPDK 20.11
   in order to reserve more space for the dynamic fields, as explained in
   `this presentation <https://www.youtube.com/watch?v=Ttl6MlhmzWY>`_.
-  The following static fields will be moved as dynamic:
-
-  - ``timestamp``
-
   As a consequence, the layout of the ``struct rte_mbuf`` will be re-arranged,
   avoiding impact on vectorized implementation of the driver datapaths,
   while evaluating performance gains of a better use of the first cache line.
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 88b9086390..6455022169 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -448,6 +448,10 @@ API Changes
 * mbuf: Removed the field ``seqn`` from the structure ``rte_mbuf``.
   It is replaced with dynamic fields.
 
+* mbuf: Removed the field ``timestamp`` from the structure ``rte_mbuf``.
+  It is replaced with the dynamic field RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+  which was previously used only for Tx.
+
 * pci: Removed the ``rte_kernel_driver`` enum defined in rte_dev.h and
   replaced with a private enum in the PCI subsystem.
 
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 3be0050592..619cbe521e 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1345,6 +1345,8 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_JUMBO_FRAME	0x00000800
 #define DEV_RX_OFFLOAD_SCATTER		0x00002000
 /**
+ * Timestamp is set by the driver in RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+ * and RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME is set in ol_flags.
  * The mbuf field and flag are registered when the offload is configured.
  */
 #define DEV_RX_OFFLOAD_TIMESTAMP	0x00004000
@@ -4654,7 +4656,7 @@ int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
  * rte_eth_read_clock(port, base_clock);
  *
  * Then, convert the raw mbuf timestamp with:
- * base_time_sec + (double)(mbuf->timestamp - base_clock) / freq;
+ * base_time_sec + (double)(*timestamp_dynfield(mbuf) - base_clock) / freq;
  *
  * This simple example will not provide a very good accuracy. One must
  * at least measure multiple times the frequency and do a regression.
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8a456e5e64..09d93e6899 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -764,7 +764,6 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
 	case PKT_RX_QINQ_STRIPPED: return "PKT_RX_QINQ_STRIPPED";
 	case PKT_RX_QINQ: return "PKT_RX_QINQ";
 	case PKT_RX_LRO: return "PKT_RX_LRO";
-	case PKT_RX_TIMESTAMP: return "PKT_RX_TIMESTAMP";
 	case PKT_RX_SEC_OFFLOAD: return "PKT_RX_SEC_OFFLOAD";
 	case PKT_RX_SEC_OFFLOAD_FAILED: return "PKT_RX_SEC_OFFLOAD_FAILED";
 	case PKT_RX_OUTER_L4_CKSUM_BAD: return "PKT_RX_OUTER_L4_CKSUM_BAD";
@@ -808,7 +807,6 @@ rte_get_rx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
 		{ PKT_RX_FDIR_FLX, PKT_RX_FDIR_FLX, NULL },
 		{ PKT_RX_QINQ_STRIPPED, PKT_RX_QINQ_STRIPPED, NULL },
 		{ PKT_RX_LRO, PKT_RX_LRO, NULL },
-		{ PKT_RX_TIMESTAMP, PKT_RX_TIMESTAMP, NULL },
 		{ PKT_RX_SEC_OFFLOAD, PKT_RX_SEC_OFFLOAD, NULL },
 		{ PKT_RX_SEC_OFFLOAD_FAILED, PKT_RX_SEC_OFFLOAD_FAILED, NULL },
 		{ PKT_RX_QINQ, PKT_RX_QINQ, NULL },
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index a1414ed7cd..17e0b205c0 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1095,6 +1095,7 @@ rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
 static inline void
 rte_mbuf_dynfield_copy(struct rte_mbuf *mdst, const struct rte_mbuf *msrc)
 {
+	memcpy(&mdst->dynfield0, msrc->dynfield0, sizeof(mdst->dynfield0));
 	memcpy(&mdst->dynfield1, msrc->dynfield1, sizeof(mdst->dynfield1));
 }
 
@@ -1108,7 +1109,6 @@ __rte_pktmbuf_copy_hdr(struct rte_mbuf *mdst, const struct rte_mbuf *msrc)
 	mdst->tx_offload = msrc->tx_offload;
 	mdst->hash = msrc->hash;
 	mdst->packet_type = msrc->packet_type;
-	mdst->timestamp = msrc->timestamp;
 	rte_mbuf_dynfield_copy(mdst, msrc);
 }
 
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 3fb5abda3c..f6720a8729 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -149,11 +149,6 @@ extern "C" {
  */
 #define PKT_RX_LRO           (1ULL << 16)
 
-/**
- * Indicate that the timestamp field in the mbuf is valid.
- */
-#define PKT_RX_TIMESTAMP     (1ULL << 17)
-
 /**
  * Indicate that security offload processing was applied on the RX packet.
  */
@@ -589,12 +584,7 @@ struct rte_mbuf {
 
 	uint16_t buf_len;         /**< Length of segment buffer. */
 
-	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
-	 * are not normalized but are always the same for a given port.
-	 * Some devices allow to query rte_eth_read_clock that will return the
-	 * current device timestamp.
-	 */
-	uint64_t timestamp;
+	uint64_t dynfield0[1]; /**< Reserved for dynamic fields. */
 
 	/* second cache line - fields only used in slow path or on TX */
 	RTE_MARKER cacheline1 __rte_cache_min_aligned;
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
index 538a43f695..1d835f9ddd 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.c
+++ b/lib/librte_mbuf/rte_mbuf_dyn.c
@@ -124,6 +124,7 @@ init_shared_mem(void)
 		 * rte_mbuf_dynfield_copy().
 		 */
 		memset(shm, 0, sizeof(*shm));
+		mark_free(dynfield0);
 		mark_free(dynfield1);
 
 		/* init free_flags */
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
                     ` (13 preceding siblings ...)
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 14/14] mbuf: remove deprecated timestamp field Thomas Monjalon
@ 2020-11-01 18:08   ` Thomas Monjalon
  14 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-01 18:08 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

01/11/2020 19:06, Thomas Monjalon:
> The mbuf field timestamp was announced to be removed for three reasons:
>   - a dynamic field already exist, used for Tx only
>   - this field always used 8 bytes even if unneeded
>   - this field is in the first half (cacheline) of mbuf
> 
> After this series, the dynamic field timestamp is used for both Rx and Tx
> with separate dynamic flags to distinguish when the value is meaningful
> without resetting the field during forwarding.
> 
> As a consequence, 8 bytes can be re-allocated to dynamic fields
> in the first half of mbuf structure.
> It is still open to change more the mbuf layout.
> 
> This mbuf layout change is important to allow adding more features
> (consuming more dynamic fields) during the next year,
> and can allow performance improvements with new usages in the first half.

The changelog was missing:

v2:
- remove optimization to register only once in ethdev
- fix error message in latencystats
- convert rxtx_callbacks macro to inline function
- increase dynamic fields space
- do not move pool field




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/14] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 10/14] net/octeontx2: " Thomas Monjalon
@ 2020-11-01 18:28     ` Jerin Jacob
  2020-11-02  9:38       ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Jerin Jacob @ 2020-11-01 18:28 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dpdk-dev, Ferruh Yigit, David Marchand, Richardson, Bruce,
	Olivier Matz, Andrew Rybchenko, Jerin Jacob,
	Viacheslav Ovsiienko, Nithin Dabilpuram, Kiran Kumar K,
	Ray Kinsella, Neil Horman

On Sun, Nov 1, 2020 at 11:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> The mbuf timestamp is moved to a dynamic field
> in order to allow removal of the deprecated static field.
> The related mbuf flag is also replaced.
>
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  drivers/net/octeontx2/otx2_ethdev.c | 33 +++++++++++++++++++++++++++++
>  drivers/net/octeontx2/otx2_rx.h     | 19 ++++++++++++++---
>  drivers/net/octeontx2/version.map   |  7 ++++++
>  3 files changed, 56 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
> index cfb733a4b5..ad95219438 100644
> --- a/drivers/net/octeontx2/otx2_ethdev.c
> +++ b/drivers/net/octeontx2/otx2_ethdev.c
> @@ -4,6 +4,7 @@
>
>  #include <inttypes.h>
>
> +#include <rte_bitops.h>
>  #include <rte_ethdev_pci.h>
>  #include <rte_io.h>
>  #include <rte_malloc.h>
> @@ -14,6 +15,35 @@
>  #include "otx2_ethdev.h"
>  #include "otx2_ethdev_sec.h"
>
> +uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
> +int rte_pmd_octeontx2_timestamp_dynfield_offset = -1;


Instead of global variables, please move the offset into struct
otx2_timesync_info, which is already used in the fast path and is
also accessible in the slow path.

> +
> +static int
> +otx2_rx_timestamp_setup(uint16_t flags)
> +{
> +       int timestamp_rx_dynflag_offset;
> +
> +       if ((flags & NIX_RX_OFFLOAD_TSTAMP_F) == 0)
> +               return 0;
> +
> +       rte_pmd_octeontx2_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
> +                       RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
> +       if (rte_pmd_octeontx2_timestamp_dynfield_offset < 0) {
> +               otx2_err("Failed to lookup timestamp field");
> +               return -rte_errno;
> +       }
> +       timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
> +                       RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
> +       if (timestamp_rx_dynflag_offset < 0) {
> +               otx2_err("Failed to lookup Rx timestamp flag");
> +               return -rte_errno;
> +       }
> +       rte_pmd_octeontx2_timestamp_rx_dynflag =
> +                       RTE_BIT64(timestamp_rx_dynflag_offset);
> +
> +       return 0;
> +}
> +
>  static inline uint64_t
>  nix_get_rx_offload_capa(struct otx2_eth_dev *dev)
>  {
> @@ -1874,6 +1904,9 @@ otx2_nix_configure(struct rte_eth_dev *eth_dev)
>         dev->tx_offload_flags |= nix_tx_offload_flags(eth_dev);
>         dev->rss_info.rss_grps = NIX_RSS_GRPS;
>
> +       if (otx2_rx_timestamp_setup(dev->rx_offload_flags) != 0)
> +               goto fail_offloads;
> +
>         nb_rxq = RTE_MAX(data->nb_rx_queues, 1);
>         nb_txq = RTE_MAX(data->nb_tx_queues, 1);
>
> diff --git a/drivers/net/octeontx2/otx2_rx.h b/drivers/net/octeontx2/otx2_rx.h
> index 61a5c436dd..6981edce82 100644
> --- a/drivers/net/octeontx2/otx2_rx.h
> +++ b/drivers/net/octeontx2/otx2_rx.h
> @@ -63,6 +63,18 @@ union mbuf_initializer {
>         uint64_t value;
>  };
>
> +/* variables are exported because this file is included in other drivers */
> +extern uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
> +extern int rte_pmd_octeontx2_timestamp_dynfield_offset;
> +
> +static inline rte_mbuf_timestamp_t *
> +otx2_timestamp_dynfield(struct rte_mbuf *mbuf)

Please take the offset from struct otx2_timesync_info *tstamp. It is
already available in otx2_nix_mbuf_to_tstamp(). See below.

> +{
> +       return RTE_MBUF_DYNFIELD(mbuf,
> +               rte_pmd_octeontx2_timestamp_dynfield_offset,
> +               rte_mbuf_timestamp_t *);
> +}
> +
>  static __rte_always_inline void
>  otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
>                         struct otx2_timesync_info *tstamp, const uint16_t flag,
> @@ -77,15 +89,16 @@ otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
>                 /* Reading the rx timestamp inserted by CGX, viz at
>                  * starting of the packet data.
>                  */
> -               mbuf->timestamp = rte_be_to_cpu_64(*tstamp_ptr);
> +               *otx2_timestamp_dynfield(mbuf) = rte_be_to_cpu_64(*tstamp_ptr);
>                 /* PKT_RX_IEEE1588_TMST flag needs to be set only in case
>                  * PTP packets are received.
>                  */
>                 if (mbuf->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC) {
> -                       tstamp->rx_tstamp = mbuf->timestamp;
> +                       tstamp->rx_tstamp = *otx2_timestamp_dynfield(mbuf);
>                         tstamp->rx_ready = 1;
>                         mbuf->ol_flags |= PKT_RX_IEEE1588_PTP |
> -                               PKT_RX_IEEE1588_TMST | PKT_RX_TIMESTAMP;
> +                               PKT_RX_IEEE1588_TMST |
> +                               rte_pmd_octeontx2_timestamp_rx_dynflag;
>                 }
>         }
>  }
> diff --git a/drivers/net/octeontx2/version.map b/drivers/net/octeontx2/version.map
> index 4a76d1d52d..d4f4784bcd 100644
> --- a/drivers/net/octeontx2/version.map
> +++ b/drivers/net/octeontx2/version.map
> @@ -1,3 +1,10 @@
>  DPDK_21 {
>         local: *;
>  };
> +
> +INTERNAL {
> +       global:
> +
> +       rte_pmd_octeontx2_timestamp_dynfield_offset;
> +       rte_pmd_octeontx2_timestamp_rx_dynflag;

No need to export these variables if the offset is part of struct otx2_timesync_info.


> +};
> --
> 2.28.0
>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 10/15] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-10-30 12:41       ` Jerin Jacob
  2020-11-01 16:12         ` Thomas Monjalon
@ 2020-11-01 20:00         ` Andrew Rybchenko
  1 sibling, 0 replies; 170+ messages in thread
From: Andrew Rybchenko @ 2020-11-01 20:00 UTC (permalink / raw)
  To: Jerin Jacob, Slava Ovsiienko
  Cc: NBU-Contact-Thomas Monjalon, dev, ferruh.yigit, david.marchand,
	bruce.richardson, olivier.matz, jerinj, Nithin Dabilpuram,
	Kiran Kumar K, Ray Kinsella, Neil Horman

On 10/30/20 3:41 PM, Jerin Jacob wrote:
> On Thu, Oct 29, 2020 at 5:22 PM Slava Ovsiienko <viacheslavo@nvidia.com> wrote:
>> Just five cents -  exporting the offset (making it global) might have side effect impacting the performance.
> I agree with Slava. The offset value should be stored in the PMD structure.
> IMO, We can have an ethdev API to get the offset and store it in PMD's
> fastpath structures in the slow path
> to use in fastpath.

I don't mind. My main goal is to raise the topic and, maybe, add a few
more comments in the code to help future maintainers
understand it. It is not a trivial topic, and any help to code readers in
the comments would be very useful.
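
For readers following the thread, the idea quoted below (keep the offset in
a PMD datapath structure instead of a global) can be sketched roughly as
follows. This is only a standalone illustration: struct fake_mbuf,
struct fake_rxq and the helper names are hypothetical stand-ins, not DPDK or
octeontx2 definitions, and the pointer arithmetic merely imitates what
RTE_MBUF_DYNFIELD does with a byte offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf (not the real layout). */
struct fake_mbuf {
	uint64_t ol_flags;
	uint64_t dynfield[4]; /* space handed out to dynamic fields */
};

/*
 * Hypothetical per-queue datapath structure. The offset and flag are
 * looked up once in the slow path and stored here, so the fast path
 * reads them from memory the PMD controls instead of a global variable.
 */
struct fake_rxq {
	int ts_offset;    /* byte offset of the timestamp inside the mbuf */
	uint64_t ts_flag; /* ol_flags bit meaning "timestamp is valid" */
};

/* Same idea as RTE_MBUF_DYNFIELD(): mbuf base address plus byte offset. */
static inline uint64_t *
fake_rxq_ts_field(const struct fake_rxq *q, struct fake_mbuf *m)
{
	return (uint64_t *)((uintptr_t)m + (uintptr_t)q->ts_offset);
}

/* Fast-path helper: store the timestamp and mark it valid. */
static inline void
fake_rxq_set_ts(const struct fake_rxq *q, struct fake_mbuf *m, uint64_t ts)
{
	*fake_rxq_ts_field(q, m) = ts;
	m->ol_flags |= q->ts_flag;
}
```

In a real driver the two values would come from rte_mbuf_dynfield_lookup()
and rte_mbuf_dynflag_lookup() at configure time, as the patch already does;
only the storage location changes.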

>> Offset might be located in some memory sharing the cacheline with some other variables.
>> If these variables are writable and are being updated frequently - we might get the cache contention.
>> I'd prefer to keep all dynamic offsets In the PMD and entirely control memory allocation
>> attributes for these ones. Hence, exporting is OK, but practical usage in datapath is questionable.
>>
>> With best regards, Slava
>>
>>> -----Original Message-----
>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>> Sent: Thursday, October 29, 2020 13:02
>>> To: NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org
>>> Cc: ferruh.yigit@intel.com; david.marchand@redhat.com;
>>> bruce.richardson@intel.com; olivier.matz@6wind.com; jerinj@marvell.com;
>>> Slava Ovsiienko <viacheslavo@nvidia.com>; Nithin Dabilpuram
>>> <ndabilpuram@marvell.com>; Kiran Kumar K <kirankumark@marvell.com>;
>>> Ray Kinsella <mdr@ashroe.eu>; Neil Horman <nhorman@tuxdriver.com>
>>> Subject: Re: [PATCH 10/15] net/octeontx2: switch timestamp to dynamic mbuf
>>> field
>>>
>>> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
>>>> The mbuf timestamp is moved to a dynamic field in order to allow
>>>> removal of the deprecated static field.
>>>> The related mbuf flag is also replaced.
>>>>
>>>> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
>>>> ---
>>>>   drivers/net/octeontx2/otx2_ethdev.c | 33
>>> +++++++++++++++++++++++++++++
>>>>   drivers/net/octeontx2/otx2_rx.h     | 19 ++++++++++++++---
>>>>   drivers/net/octeontx2/version.map   |  7 ++++++
>>>>   3 files changed, 56 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/net/octeontx2/otx2_ethdev.c
>>>> b/drivers/net/octeontx2/otx2_ethdev.c
>>>> index cfb733a4b5..ad95219438 100644
>>>> --- a/drivers/net/octeontx2/otx2_ethdev.c
>>>> +++ b/drivers/net/octeontx2/otx2_ethdev.c
>>>> @@ -4,6 +4,7 @@
>>>>
>>>>   #include <inttypes.h>
>>>>
>>>> +#include <rte_bitops.h>
>>>>   #include <rte_ethdev_pci.h>
>>>>   #include <rte_io.h>
>>>>   #include <rte_malloc.h>
>>>> @@ -14,6 +15,35 @@
>>>>   #include "otx2_ethdev.h"
>>>>   #include "otx2_ethdev_sec.h"
>>>>
>>>> +uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
>>>> +int rte_pmd_octeontx2_timestamp_dynfield_offset = -1;
>>>> +
>>>> +static int
>>>> +otx2_rx_timestamp_setup(uint16_t flags) {
>>>> +   int timestamp_rx_dynflag_offset;
>>>> +
>>>> +   if ((flags & NIX_RX_OFFLOAD_TSTAMP_F) == 0)
>>>> +           return 0;
>>>> +
>>>> +   rte_pmd_octeontx2_timestamp_dynfield_offset =
>>> rte_mbuf_dynfield_lookup(
>>>> +                   RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
>>>> +   if (rte_pmd_octeontx2_timestamp_dynfield_offset < 0) {
>>>> +           otx2_err("Failed to lookup timestamp field");
>>>> +           return -rte_errno;
>>>> +   }
>>>> +   timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
>>>> +                   RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
>>>> +   if (timestamp_rx_dynflag_offset < 0) {
>>>> +           otx2_err("Failed to lookup Rx timestamp flag");
>>>> +           return -rte_errno;
>>>> +   }
>>>> +   rte_pmd_octeontx2_timestamp_rx_dynflag =
>>>> +                   RTE_BIT64(timestamp_rx_dynflag_offset);
>>>> +
>>>> +   return 0;
>>>> +}
>>>> +
>>>>   static inline uint64_t
>>>>   nix_get_rx_offload_capa(struct otx2_eth_dev *dev)  { @@ -1874,6
>>>> +1904,9 @@ otx2_nix_configure(struct rte_eth_dev *eth_dev)
>>>>      dev->tx_offload_flags |= nix_tx_offload_flags(eth_dev);
>>>>      dev->rss_info.rss_grps = NIX_RSS_GRPS;
>>>>
>>>> +   if (otx2_rx_timestamp_setup(dev->rx_offload_flags) != 0)
>>>> +           goto fail_offloads;
>>>> +
>>>>      nb_rxq = RTE_MAX(data->nb_rx_queues, 1);
>>>>      nb_txq = RTE_MAX(data->nb_tx_queues, 1);
>>>>
>>>> diff --git a/drivers/net/octeontx2/otx2_rx.h
>>>> b/drivers/net/octeontx2/otx2_rx.h index 61a5c436dd..6981edce82 100644
>>>> --- a/drivers/net/octeontx2/otx2_rx.h
>>>> +++ b/drivers/net/octeontx2/otx2_rx.h
>>>> @@ -63,6 +63,18 @@ union mbuf_initializer {
>>>>      uint64_t value;
>>>>   };
>>>>
>>>> +/* variables are exported because this file is included in other
>>>> +drivers */ extern uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
>>>> +extern int rte_pmd_octeontx2_timestamp_dynfield_offset;
>>>> +
>>>> +static inline rte_mbuf_timestamp_t *
>>>> +otx2_timestamp_dynfield(struct rte_mbuf *mbuf) {
>>>> +   return RTE_MBUF_DYNFIELD(mbuf,
>>>> +           rte_pmd_octeontx2_timestamp_dynfield_offset,
>>>> +           rte_mbuf_timestamp_t *);
>>>> +}
>>>> +
>>> May be ethdev should provide the inline function?
>> Just five cents -  exporting the offset (making it global) might have side effect impacting the performance.
>> Offset might be located in some memory sharing the cacheline with some other variables.
>> If these variables are writable and are being updated frequently - we might get the cache contention.
>> I'd prefer to keep all dynamic offsets In the PMD and entirely control memory allocation
>> attributes for these ones. Hence, exporting/inline function is possible,
>> but practical usage, say,  in datapath, is questionable.
>>
>> With best regards, Slava
>>


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-01 16:38                 ` Thomas Monjalon
@ 2020-11-01 20:59                   ` Morten Brørup
  2020-11-02 15:58                     ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Morten Brørup @ 2020-11-01 20:59 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Ajit Khaparde, Ananyev, Konstantin, Andrew Rybchenko, dev,
	Yigit, Ferruh, david.marchand, Richardson, Bruce, olivier.matz,
	jerinj, viacheslavo, honnappa.nagarahalli, maxime.coquelin,
	stephen, hemant.agrawal, viacheslavo, Matan Azrad, Shahaf Shuler

> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Sunday, November 1, 2020 5:38 PM
> 
> 01/11/2020 10:12, Morten Brørup:
> > One thing has always puzzled me:
> > Why do we use 64 bits to indicate which memory pool
> > an mbuf belongs to?
> > The portid only uses 16 bits and an indirection index.
> > Why don't we use the same kind of indirection index for mbuf pools?
> 
> I wonder what would be the cost of indirection. Probably neglectible.

Probably. The portid does it, and that indirection is heavily used everywhere.

The size of the mbuf memory pool indirection array should be compile-time configurable, like the size of the portid indirection array.

And for reference, the indirection array will fit into one cache line if we default to 8 mbuf pools, thus supporting an 8 CPU socket system with one mbuf pool per CPU socket, or a 4 CPU socket system with two mbuf pools per CPU socket.
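
A minimal sketch of such an indirection table, assuming 64-byte cache lines. The names MAX_MBUF_POOLS, struct pool_table and pool_from_id() are hypothetical and only illustrate the sizing argument; nothing like this exists in DPDK today.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumption: 64-byte cache lines */
#define MAX_MBUF_POOLS  8  /* hypothetical compile-time limit */

struct mempool; /* opaque stand-in for struct rte_mempool */

/* Hypothetical table mapping a small pool id to the pool pointer. */
struct pool_table {
	struct mempool *pools[MAX_MBUF_POOLS];
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* 8 pointers of 8 bytes each fill exactly one 64-byte cache line. */
_Static_assert(sizeof(struct pool_table) == CACHE_LINE_SIZE,
	       "pool table must fit in one cache line");

/* Resolve a pool id; returns NULL for an out-of-range id. */
static inline struct mempool *
pool_from_id(const struct pool_table *t, uint8_t id)
{
	if (id >= MAX_MBUF_POOLS)
		return NULL;
	return t->pools[id];
}
```

With this shape the mbuf would only carry the 8-bit id, and every lookup touches a single read-mostly cache line.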

(And as a side note: Our application is optimized for single-socket systems, and we only use one mbuf pool. I guess many applications were developed without carefully optimizing for multi-socket systems, and also just use one mbuf pool. In these cases, the mbuf structure doesn't really need a pool field. But it is still there, and the DPDK libraries use it, so we didn't bother removing it.)

> I think it is a good proposal...
> ... for next year, after a deprecation notice.
> 
> > I can easily imagine using one mbuf pool (or perhaps a few pools)
> > per CPU socket (or per physical memory bus closest to an attached NIC),
> > but not more than 256 mbuf memory pools in total.
> > So, let's introduce an mbufpoolid like the portid,
> > and cut this mbuf field down from 64 to 8 bits.
> >
> > If we also cut down m->pkt_len from 32 to 24 bits,
> 
> Who is using packets larger than 64k? Are 16 bits enough?

I personally consider 64k a reasonable packet size limit. Exotic applications with even larger packets would have to live with this constraint. But let's see if there are any objections. For reference, 64k corresponds to ca. 44 Ethernet (1500 byte) packets.

(The limit could be 65535 bytes, to avoid translation of the value 0 into 65536 bytes.)

This modification would go nicely hand in hand with the mbuf pool indirection modification.

... after yet another round of ABI stability discussions, deprecation notices, and so on. :-)

> 
> > we can get the 8 bit mbuf pool index into the first cache line
> > at no additional cost.
> 
> I like the idea.
> It means we don't need to move the pool pointer now,
> i.e. it does not have to replace the timestamp field.

Agreed! Don't move m->pool to the first cache line; it is not used for RX.

> 
> > In other words: This would free up another 64 bit field in the mbuf
> structure!
> 
> That would be great!
> 
> 
> > And even though the m->next pointer for scattered packets resides
> > in the second cache line, the libraries and application knows
> > that m->next is NULL when m->nb_segs is 1.
> > This proves that my suggestion would make touching
> > the second cache line unnecessary (in simple cases),
> > even for re-initializing the mbuf.
> 
> So you think the "next" pointer should stay in the second half of mbuf?
> 
> I feel you would like to move the Tx offloads in the first half
> to improve performance of very simple apps.

"Very simple apps" sounds like a minority of apps. I would rather say "very simple packet handling scenarios", e.g. forwarding of normal size non-segmented packets. I would guess that the vast majority of packets handled by DPDK applications actually match this scenario. So I'm proposing to optimize for what I think is the most common scenario.

If segmented packets are common, then m->next could be moved to the first cache line. But it will only improve the pure RX steps of the pipeline. When preparing the packet for TX, m->tx_offload will need to be set, and the second cache line comes into play. So I'm wondering how big the benefit of having m->next in the first cache line really is - assuming that m->nb_segs will be checked before accessing m->next.
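
To illustrate the nb_segs check, here is a rough standalone sketch. struct pkt is a hypothetical two-cache-line stand-in for the mbuf (64-byte lines assumed), not the real rte_mbuf layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mbuf-like structure split over two 64-byte cache lines. */
struct pkt {
	/* first cache line */
	uint16_t nb_segs;  /* number of segments in the chain */
	uint16_t data_len; /* bytes in this segment */
	/* second cache line */
	struct pkt *next __attribute__((aligned(64)));
};

/*
 * Sum the data length of a packet. For the common single-segment case
 * the function returns before reading m->next, so the second cache
 * line of the mbuf is never loaded.
 */
static inline uint32_t
pkt_total_len(const struct pkt *m)
{
	uint32_t len = m->data_len;

	if (m->nb_segs == 1)
		return len;
	for (m = m->next; m != NULL; m = m->next)
		len += m->data_len;
	return len;
}
```

Only the multi-segment path pays for the second cache line, which is the trade-off described above.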

> I am thinking the opposite: we could have some dynamic fields space
> in the first half to improve performance of complex Rx.
> Note: we can add a flag hint for field registration in this first half.
> 

I have had the same thoughts. However, I would prefer being able to forward ordinary packets without using the second mbuf cache line at all (although only in specific scenarios like my example above).

Furthermore, the application can abuse the 64 bit m->tx_offload field for private purposes until it is time to prepare the packet for TX and pass it on to the driver. This hack somewhat resembles a dynamic field in the first cache line, and will not be possible if the m->pool or m->next field is moved there.


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v2 08/14] net/mlx5: switch timestamp to dynamic mbuf field
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 08/14] net/mlx5: switch timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-11-02  5:08     ` Ruifeng Wang
  2020-11-02 23:20     ` David Christensen
  1 sibling, 0 replies; 170+ messages in thread
From: Ruifeng Wang @ 2020-11-02  5:08 UTC (permalink / raw)
  To: thomas, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Matan Azrad,
	Shahaf Shuler, David Christensen, Konstantin Ananyev, nd


> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Monday, November 2, 2020 2:06 AM
> To: dev@dpdk.org
> Cc: ferruh.yigit@intel.com; david.marchand@redhat.com;
> bruce.richardson@intel.com; olivier.matz@6wind.com;
> andrew.rybchenko@oktetlabs.ru; jerinj@marvell.com;
> viacheslavo@nvidia.com; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; David Christensen <drc@linux.vnet.ibm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; Konstantin Ananyev
> <konstantin.ananyev@intel.com>
> Subject: [PATCH v2 08/14] net/mlx5: switch timestamp to dynamic mbuf field
> 
> The mbuf timestamp is moved to a dynamic field in order to allow removal of
> the deprecated static field.
> The related mbuf flag is also replaced.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  drivers/net/mlx5/mlx5_rxq.c              | 36 ++++++++++++++++++++
>  drivers/net/mlx5/mlx5_rxtx.c             |  4 +--
>  drivers/net/mlx5/mlx5_rxtx.h             | 19 +++++++++++
>  drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 41 +++++++++++-----------
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 43 ++++++++++++------------
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h     | 35 +++++++++----------
>  6 files changed, 118 insertions(+), 60 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index f1d8373079..877aa24a18 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -1310,6 +1310,39 @@ mlx5_max_lro_msg_size_adjust(struct
> rte_eth_dev *dev, uint16_t idx,
>  		priv->max_lro_msg_size * MLX5_LRO_SEG_CHUNK_SIZE);  }
> 
> +/**
> + * Lookup mbuf field and flag for Rx timestamp if offload requested.
> + *
> + * @param rxq_data
> + *   Datapath struct where field offset and flag mask are stored.
> + *
> + * @return
> + *   0 on success or offload disabled, negative errno otherwise.
> + */
> +static int
> +mlx5_rx_timestamp_setup(struct mlx5_rxq_data *rxq_data)
> +{
> +	int timestamp_rx_dynflag_offset;
> +
> +	rxq_data->timestamp_rx_flag = 0;
> +	if (rxq_data->hw_timestamp == 0)
> +		return 0;
> +	rxq_data->timestamp_offset = rte_mbuf_dynfield_lookup(
> +			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
> +	if (rxq_data->timestamp_offset < 0) {
> +		DRV_LOG(ERR, "Cannot lookup timestamp field\n");
> +		return -rte_errno;
> +	}
> +	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
> +			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
> +	if (timestamp_rx_dynflag_offset < 0) {
> +		DRV_LOG(ERR, "Cannot lookup Rx timestamp flag\n");
> +		return -rte_errno;
> +	}
> +	rxq_data->timestamp_rx_flag = RTE_BIT64(timestamp_rx_dynflag_offset);
> +	return 0;
> +}
> +
>  /**
>   * Create a DPDK Rx queue.
>   *
> @@ -1492,7 +1525,10 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>  	mlx5_max_lro_msg_size_adjust(dev, idx, max_lro_size);
>  	/* Toggle RX checksum offload if hardware supports it. */
>  	tmpl->rxq.csum = !!(offloads & DEV_RX_OFFLOAD_CHECKSUM);
> +	/* Configure Rx timestamp. */
>  	tmpl->rxq.hw_timestamp = !!(offloads & DEV_RX_OFFLOAD_TIMESTAMP);
> +	if (mlx5_rx_timestamp_setup(&tmpl->rxq) != 0)
> +		goto error;
>  	/* Configure VLAN stripping. */
>  	tmpl->rxq.vlan_strip = !!(offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
>  	/* By default, FCS (CRC) is stripped by hardware. */
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index e86468b67a..b577aab00b 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -1287,8 +1287,8 @@ rxq_cq_to_mbuf(struct mlx5_rxq_data *rxq, struct rte_mbuf *pkt,
> 
>  		if (rxq->rt_timestamp)
>  			ts = mlx5_txpp_convert_rx_ts(rxq->sh, ts);
> -		pkt->timestamp = ts;
> -		pkt->ol_flags |= PKT_RX_TIMESTAMP;
> +		mlx5_timestamp_set(pkt, rxq->timestamp_offset, ts);
> +		pkt->ol_flags |= rxq->timestamp_rx_flag;
>  	}
>  }
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> index 674296ee98..e9eca36b40 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.h
> +++ b/drivers/net/mlx5/mlx5_rxtx.h
> @@ -151,6 +151,8 @@ struct mlx5_rxq_data {
>  	/* CQ (UAR) access lock required for 32bit implementations */
>  #endif
>  	uint32_t tunnel; /* Tunnel information. */
> +	int timestamp_offset; /* Dynamic mbuf field for timestamp. */
> +	uint64_t timestamp_rx_flag; /* Dynamic mbuf flag for timestamp. */
>  	uint64_t flow_meta_mask;
>  	int32_t flow_meta_offset;
>  } __rte_cache_aligned;
> @@ -681,4 +683,21 @@ mlx5_txpp_convert_tx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t mts)
>  	return ci;
>  }
> 
> +/**
> + * Set timestamp in mbuf dynamic field.
> + *
> + * @param mbuf
> + *   Structure to write into.
> + * @param offset
> + *   Dynamic field offset in mbuf structure.
> + * @param timestamp
> + *   Value to write.
> + */
> +static __rte_always_inline void
> +mlx5_timestamp_set(struct rte_mbuf *mbuf, int offset,
> +		rte_mbuf_timestamp_t timestamp)
> +{
> +	*RTE_MBUF_DYNFIELD(mbuf, offset, rte_mbuf_timestamp_t *) = timestamp;
> +}
> +
>  #endif /* RTE_PMD_MLX5_RXTX_H_ */
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> index 6bf0c9b540..171d7bb0f8 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> @@ -330,13 +330,13 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
>  	vector unsigned char ol_flags = (vector unsigned char)
>  		(vector unsigned int){
>  			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag,
>  			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag,
>  			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag,
>  			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP};
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag};
>  	vector unsigned char cv_flags;
>  	const vector unsigned char zero = (vector unsigned char){0};
>  	const vector unsigned char ptype_mask =
> @@ -1025,31 +1025,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
>  		/* D.5 fill in mbuf - rearm_data and packet_type. */
>  		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
>  		if (rxq->hw_timestamp) {
> +			int offset = rxq->timestamp_offset;
>  			if (rxq->rt_timestamp) {
>  				struct mlx5_dev_ctx_shared *sh = rxq->sh;
>  				uint64_t ts;
> 
>  				ts = rte_be_to_cpu_64(cq[pos].timestamp);
> -				pkts[pos]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
> -				pkts[pos + 1]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 1], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
> -				pkts[pos + 2]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 2], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
> -				pkts[pos + 3]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 3], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  			} else {
> -				pkts[pos]->timestamp = rte_be_to_cpu_64
> -						(cq[pos].timestamp);
> -				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p1].timestamp);
> -				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p2].timestamp);
> -				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p3].timestamp);
> +				mlx5_timestamp_set(pkts[pos], offset,
> +					rte_be_to_cpu_64(cq[pos].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 1], offset,
> +					rte_be_to_cpu_64(cq[pos + p1].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 2], offset,
> +					rte_be_to_cpu_64(cq[pos + p2].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 3], offset,
> +					rte_be_to_cpu_64(cq[pos + p3].timestamp));
>  			}
>  		}
>  		if (rxq->dynf_meta) {
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> index d122dad4fe..436b247ade 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> @@ -271,7 +271,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
>  	uint32x4_t pinfo, cv_flags;
>  	uint32x4_t ol_flags =
>  		vdupq_n_u32(rxq->rss_hash * PKT_RX_RSS_HASH |
> -			    rxq->hw_timestamp * PKT_RX_TIMESTAMP);
> +			    rxq->hw_timestamp * rxq->timestamp_rx_flag);
>  	const uint32x4_t ptype_ol_mask = { 0x106, 0x106, 0x106, 0x106 };
>  	const uint8x16_t cv_flag_sel = {
>  		0,
> @@ -697,6 +697,7 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
>  		rxq_cq_to_ptype_oflags_v(rxq, ptype_info, flow_tag,
>  					 opcode, &elts[pos]);
>  		if (rxq->hw_timestamp) {
> +			int offset = rxq->timestamp_offset;
>  			if (rxq->rt_timestamp) {
>  				struct mlx5_dev_ctx_shared *sh = rxq->sh;
>  				uint64_t ts;
> @@ -704,36 +705,36 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
>  				ts = rte_be_to_cpu_64
>  					(container_of(p0, struct mlx5_cqe,
>  						      pkt_info)->timestamp);
> -				elts[pos]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(elts[pos], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64
>  					(container_of(p1, struct mlx5_cqe,
>  						      pkt_info)->timestamp);
> -				elts[pos + 1]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(elts[pos + 1], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64
>  					(container_of(p2, struct mlx5_cqe,
>  						      pkt_info)->timestamp);
> -				elts[pos + 2]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(elts[pos + 2], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64
>  					(container_of(p3, struct mlx5_cqe,
>  						      pkt_info)->timestamp);
> -				elts[pos + 3]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(elts[pos + 3], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  			} else {
> -				elts[pos]->timestamp = rte_be_to_cpu_64
> -					(container_of(p0, struct mlx5_cqe,
> -						      pkt_info)->timestamp);
> -				elts[pos + 1]->timestamp = rte_be_to_cpu_64
> -					(container_of(p1, struct mlx5_cqe,
> -						      pkt_info)->timestamp);
> -				elts[pos + 2]->timestamp = rte_be_to_cpu_64
> -					(container_of(p2, struct mlx5_cqe,
> -						      pkt_info)->timestamp);
> -				elts[pos + 3]->timestamp = rte_be_to_cpu_64
> -					(container_of(p3, struct mlx5_cqe,
> -						      pkt_info)->timestamp);
> +				mlx5_timestamp_set(elts[pos], offset,
> +					rte_be_to_cpu_64(container_of(p0,
> +					struct mlx5_cqe, pkt_info)->timestamp));
> +				mlx5_timestamp_set(elts[pos + 1], offset,
> +					rte_be_to_cpu_64(container_of(p1,
> +					struct mlx5_cqe, pkt_info)->timestamp));
> +				mlx5_timestamp_set(elts[pos + 2], offset,
> +					rte_be_to_cpu_64(container_of(p2,
> +					struct mlx5_cqe, pkt_info)->timestamp));
> +				mlx5_timestamp_set(elts[pos + 3], offset,
> +					rte_be_to_cpu_64(container_of(p3,
> +					struct mlx5_cqe, pkt_info)->timestamp));
>  			}
>  		}
>  		if (rxq->dynf_meta) {
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> index 0bbcbeefff..ae4439efc7 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> @@ -251,7 +251,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
>  	__m128i pinfo0, pinfo1;
>  	__m128i pinfo, ptype;
>  	__m128i ol_flags = _mm_set1_epi32(rxq->rss_hash * PKT_RX_RSS_HASH |
> -					  rxq->hw_timestamp * PKT_RX_TIMESTAMP);
> +					  rxq->hw_timestamp * rxq->timestamp_rx_flag);
>  	__m128i cv_flags;
>  	const __m128i zero = _mm_setzero_si128();
>  	const __m128i ptype_mask =
> @@ -656,31 +656,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
>  		/* D.5 fill in mbuf - rearm_data and packet_type. */
>  		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
>  		if (rxq->hw_timestamp) {
> +			int offset = rxq->timestamp_offset;
>  			if (rxq->rt_timestamp) {
>  				struct mlx5_dev_ctx_shared *sh = rxq->sh;
>  				uint64_t ts;
> 
>  				ts = rte_be_to_cpu_64(cq[pos].timestamp);
> -				pkts[pos]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
> -				pkts[pos + 1]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 1], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
> -				pkts[pos + 2]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 2], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
> -				pkts[pos + 3]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 3], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  			} else {
> -				pkts[pos]->timestamp = rte_be_to_cpu_64
> -						(cq[pos].timestamp);
> -				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p1].timestamp);
> -				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p2].timestamp);
> -				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p3].timestamp);
> +				mlx5_timestamp_set(pkts[pos], offset,
> +					rte_be_to_cpu_64(cq[pos].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 1], offset,
> +					rte_be_to_cpu_64(cq[pos + p1].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 2], offset,
> +					rte_be_to_cpu_64(cq[pos + p2].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 3], offset,
> +					rte_be_to_cpu_64(cq[pos + p3].timestamp));
>  			}
>  		}
>  		if (rxq->dynf_meta) {
> --
> 2.28.0

Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
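
For readers following the patch above, here is a minimal, self-contained sketch of the access pattern it switches to: the timestamp is written through a byte offset into the mbuf, and a separate flag bit in ol_flags marks the value as valid. It does not use DPDK; struct toy_mbuf, TOY_DYNFIELD and the toy_timestamp_* helpers are hypothetical names, and the macro only mirrors the pointer arithmetic of RTE_MBUF_DYNFIELD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_timestamp_t;

/* Hypothetical, heavily reduced mbuf: one flags word and a dynamic area. */
struct toy_mbuf {
	uint64_t ol_flags;
	uint64_t dynfield[2];
};

/* Typed pointer to the field located `offset` bytes from the mbuf start. */
#define TOY_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m) + (offset)))

/* Store the timestamp and mark it valid with the dynamic flag mask. */
static inline void
toy_timestamp_set(struct toy_mbuf *m, int offset, uint64_t flag,
		toy_timestamp_t ts)
{
	*TOY_DYNFIELD(m, offset, toy_timestamp_t *) = ts;
	m->ol_flags |= flag;
}

/* Read the timestamp only if the flag says it is meaningful. */
static inline int
toy_timestamp_get(const struct toy_mbuf *m, int offset, uint64_t flag,
		toy_timestamp_t *ts)
{
	if ((m->ol_flags & flag) == 0)
		return -1;
	*ts = *TOY_DYNFIELD(m, offset, const toy_timestamp_t *);
	return 0;
}
```

In the driver, the offset and the flag mask are looked up once at queue setup and cached in the queue data, so the per-packet cost is one store and one OR.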

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/14] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-11-01 18:28     ` Jerin Jacob
@ 2020-11-02  9:38       ` Thomas Monjalon
  2020-11-02 11:01         ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-02  9:38 UTC (permalink / raw)
  To: Jerin Jacob, Jerin Jacob
  Cc: dev, Ferruh Yigit, David Marchand, Richardson, Bruce,
	Olivier Matz, Andrew Rybchenko, Viacheslav Ovsiienko,
	Nithin Dabilpuram, Kiran Kumar K, Ray Kinsella, Neil Horman

01/11/2020 19:28, Jerin Jacob:
> On Sun, Nov 1, 2020 at 11:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > The mbuf timestamp is moved to a dynamic field
> > in order to allow removal of the deprecated static field.
> > The related mbuf flag is also replaced.
> >
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
[...]
> > +uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
> > +int rte_pmd_octeontx2_timestamp_dynfield_offset = -1;
> 
> Instead of the global variable, please move the offset to struct
> otx2_timesync_info(which is used in fastpath) and accessible in slow
> path.

This structure is initialized in otx2_nix_dev_start().
otx2_rx_timestamp_setup() is called earlier in otx2_nix_configure().
One of the two has to change. Which one?

> > +static int
> > +otx2_rx_timestamp_setup(uint16_t flags)
> > +{
> > +       int timestamp_rx_dynflag_offset;
> > +
> > +       if ((flags & NIX_RX_OFFLOAD_TSTAMP_F) == 0)
> > +               return 0;
> > +
> > +       rte_pmd_octeontx2_timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
> > +                       RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
> > +       if (rte_pmd_octeontx2_timestamp_dynfield_offset < 0) {
> > +               otx2_err("Failed to lookup timestamp field");
> > +               return -rte_errno;
> > +       }
> > +       timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
> > +                       RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
> > +       if (timestamp_rx_dynflag_offset < 0) {
> > +               otx2_err("Failed to lookup Rx timestamp flag");
> > +               return -rte_errno;
> > +       }
> > +       rte_pmd_octeontx2_timestamp_rx_dynflag =
> > +                       RTE_BIT64(timestamp_rx_dynflag_offset);
> > +
> > +       return 0;
> > +}
> > @@ -1874,6 +1904,9 @@ otx2_nix_configure(struct rte_eth_dev *eth_dev)
> >         dev->tx_offload_flags |= nix_tx_offload_flags(eth_dev);
> >         dev->rss_info.rss_grps = NIX_RSS_GRPS;
> >
> > +       if (otx2_rx_timestamp_setup(dev->rx_offload_flags) != 0)
> > +               goto fail_offloads;
> > +
> >         nb_rxq = RTE_MAX(data->nb_rx_queues, 1);
> >         nb_txq = RTE_MAX(data->nb_tx_queues, 1);




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/14] net/octeontx2: switch timestamp to dynamic mbuf field
  2020-11-02  9:38       ` Thomas Monjalon
@ 2020-11-02 11:01         ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-02 11:01 UTC (permalink / raw)
  To: Jerin Jacob, Jerin Jacob
  Cc: dev, Ferruh Yigit, David Marchand, Richardson, Bruce,
	Olivier Matz, Andrew Rybchenko, Viacheslav Ovsiienko,
	Nithin Dabilpuram, Kiran Kumar K, Ray Kinsella, Neil Horman

02/11/2020 10:38, Thomas Monjalon:
> 01/11/2020 19:28, Jerin Jacob:
> > On Sun, Nov 1, 2020 at 11:40 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > The mbuf timestamp is moved to a dynamic field
> > > in order to allow removal of the deprecated static field.
> > > The related mbuf flag is also replaced.
> > >
> > > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > > ---
> [...]
> > > +uint64_t rte_pmd_octeontx2_timestamp_rx_dynflag;
> > > +int rte_pmd_octeontx2_timestamp_dynfield_offset = -1;
> > 
> > Instead of the global variable, please move the offset to struct
> > otx2_timesync_info(which is used in fastpath) and accessible in slow
> > path.
> 
> This structure is initialized in otx2_nix_dev_start().
> otx2_rx_timestamp_setup() is called earlier in otx2_nix_configure().
> One of the two has to change. Which one?

I see that the timestamp config can be changed in otx2_nix_dev_start(),
so it looks like I have no other choice but to move the timestamp setup
to the "start" stage anyway.

Will be done in v3, which I will send later today.



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v2 05/14] net/ark: switch timestamp to dynamic mbuf field
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 05/14] net/ark: " Thomas Monjalon
@ 2020-11-02 15:32     ` Olivier Matz
  2020-11-02 16:10       ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Olivier Matz @ 2020-11-02 15:32 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	andrew.rybchenko, jerinj, viacheslavo, Shepard Siegel, Ed Czeck,
	John Miller

On Sun, Nov 01, 2020 at 07:06:17PM +0100, Thomas Monjalon wrote:
> The mbuf timestamp is moved to a dynamic field
> in order to allow removal of the deprecated static field.
> The related dynamic mbuf flag is set, although it was missing previously.
> 
> The timestamp is set if configured for at least one device.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  drivers/net/ark/ark_ethdev.c    | 23 +++++++++++++++++++++++
>  drivers/net/ark/ark_ethdev_rx.c | 10 +++++++++-
>  2 files changed, 32 insertions(+), 1 deletion(-)

<...>

> --- a/drivers/net/ark/ark_ethdev_rx.c
> +++ b/drivers/net/ark/ark_ethdev_rx.c
> @@ -15,6 +15,9 @@
>  #define ARK_RX_META_OFFSET (RTE_PKTMBUF_HEADROOM - ARK_RX_META_SIZE)
>  #define ARK_RX_MAX_NOCHAIN (RTE_MBUF_DEFAULT_DATAROOM)
>  
> +extern uint64_t ark_timestamp_rx_dynflag;
> +extern int ark_timestamp_dynfield_offset;
> +

Wouldn't it be better in a .h ?
Maybe ark_ethdev_rx.h

>  /* Forward declarations */
>  struct ark_rx_queue;
>  struct ark_rx_meta;
> @@ -272,7 +275,12 @@ eth_ark_recv_pkts(void *rx_queue,
>  		mbuf->port = meta->port;
>  		mbuf->pkt_len = meta->pkt_len;
>  		mbuf->data_len = meta->pkt_len;
> -		mbuf->timestamp = meta->timestamp;
> +		/* set timestamp if enabled at least on one device */
> +		if (ark_timestamp_rx_dynflag > 0) {
> +			*RTE_MBUF_DYNFIELD(mbuf, ark_timestamp_dynfield_offset,
> +				rte_mbuf_timestamp_t *) = meta->timestamp;
> +			mbuf->ol_flags |= ark_timestamp_rx_dynflag;
> +		}
>  		rte_pmd_ark_mbuf_rx_userdata_set(mbuf, meta->user_data);
>  
>  		if (ARK_DEBUG_CORE) {	/* debug sanity checks */
> -- 
> 2.28.0
> 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v2 03/14] ethdev: register mbuf field and flags for timestamp
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 03/14] ethdev: register mbuf field and flags for timestamp Thomas Monjalon
@ 2020-11-02 15:39     ` Olivier Matz
  2020-11-02 16:52       ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Olivier Matz @ 2020-11-02 15:39 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Matan Azrad, Shahaf Shuler

On Sun, Nov 01, 2020 at 07:06:15PM +0100, Thomas Monjalon wrote:
> During port configure or queue setup, the offload flags
> DEV_RX_OFFLOAD_TIMESTAMP and DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP
> trigger the registration of the related mbuf field and flags.
> 
> Previously, the Tx timestamp field and flag were registered in testpmd,
> as described in mlx5 guide.
> For the general usage of Rx and Tx timestamps,
> managing registrations inside ethdev is simpler and properly documented.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

<...>

> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -31,6 +31,7 @@
>  #include <rte_mempool.h>
>  #include <rte_malloc.h>
>  #include <rte_mbuf.h>
> +#include <rte_mbuf_dyn.h>
>  #include <rte_errno.h>
>  #include <rte_spinlock.h>
>  #include <rte_string_fns.h>
> @@ -1232,6 +1233,54 @@ eth_dev_check_lro_pkt_size(uint16_t port_id, uint32_t config_size,
>  	return ret;
>  }
>  
> +static inline int
> +eth_dev_timestamp_mbuf_register(uint64_t rx_offloads, uint64_t tx_offloads)
> +{
> +	static const struct rte_mbuf_dynfield field_desc = {
> +		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
> +		.size = sizeof(rte_mbuf_timestamp_t),
> +		.align = __alignof__(rte_mbuf_timestamp_t),
> +	};
> +	static const struct rte_mbuf_dynflag rx_flag_desc = {
> +		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
> +	};
> +	static const struct rte_mbuf_dynflag tx_flag_desc = {
> +		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
> +	};
> +	bool todo_rx, todo_tx;
> +	int offset;
> +
> +	todo_rx = (rx_offloads & DEV_RX_OFFLOAD_TIMESTAMP) != 0;
> +	todo_tx = (tx_offloads & DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) != 0;
> +
> +	if (todo_rx || todo_tx) {
> +		offset = rte_mbuf_dynfield_register(&field_desc);
> +		if (offset < 0) {
> +			RTE_ETHDEV_LOG(ERR,
> +					"Failed to register mbuf field for timestamp\n");
> +			return -rte_errno;
> +		}
> +	}
> +	if (todo_rx) {
> +		offset = rte_mbuf_dynflag_register(&rx_flag_desc);
> +		if (offset < 0) {
> +			RTE_ETHDEV_LOG(ERR,
> +					"Failed to register mbuf flag for Rx timestamp\n");
> +			return -rte_errno;
> +		}
> +	}
> +	if (todo_tx) {
> +		offset = rte_mbuf_dynflag_register(&tx_flag_desc);
> +		if (offset < 0) {
> +			RTE_ETHDEV_LOG(ERR,
> +					"Failed to register mbuf flag for Tx timestamp\n");
> +			return -rte_errno;
> +		}
> +	}
> +
> +	return 0;
> +}

The code that registers the dynamic fields and flags for timestamp is
more or less duplicated several times in the patchset. As discussed
privately, it would make sense to have helpers to register them in one
operation, without the need to redeclare the structures:

  int rte_mbuf_dyn_rx_timestamp_register(int *offset, uint64_t *rx_bitnum)
  int rte_mbuf_dyn_tx_timestamp_register(int *offset, uint64_t *tx_bitnum)


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v2 14/14] mbuf: remove deprecated timestamp field
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 14/14] mbuf: remove deprecated timestamp field Thomas Monjalon
@ 2020-11-02 15:41     ` Olivier Matz
  2020-11-02 15:47       ` David Marchand
  0 siblings, 1 reply; 170+ messages in thread
From: Olivier Matz @ 2020-11-02 15:41 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	andrew.rybchenko, jerinj, viacheslavo, Ajit Khaparde,
	Ray Kinsella, Neil Horman

On Sun, Nov 01, 2020 at 07:06:26PM +0100, Thomas Monjalon wrote:
> As announced in the deprecation note, the field timestamp
> is removed to give more space to the dynamic fields.
> The related offload flag PKT_RX_TIMESTAMP is also removed.
> 
> This is what the mbuf layout looks like (pahole-style):
> 
> word  type                              name                byte  size
>  0    void *                            buf_addr;         /*   0 +  8 */
>  1    rte_iova_t                        buf_iova          /*   8 +  8 */
>       /* --- RTE_MARKER64               rearm_data;                   */
>  2    uint16_t                          data_off;         /*  16 +  2 */
>       uint16_t                          refcnt;           /*  18 +  2 */
>       uint16_t                          nb_segs;          /*  20 +  2 */
>       uint16_t                          port;             /*  22 +  2 */
>  3    uint64_t                          ol_flags;         /*  24 +  8 */
>       /* --- RTE_MARKER                 rx_descriptor_fields1;        */
>  4    uint32_t             union        packet_type;      /*  32 +  4 */
>       uint32_t                          pkt_len;          /*  36 +  4 */
>  5    uint16_t                          data_len;         /*  40 +  2 */
>       uint16_t                          vlan_tci;         /*  42 +  2 */
>  5.5  uint64_t             union        hash;             /*  44 +  8 */
>  6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
>       uint16_t                          buf_len;          /*  54 +  2 */
>  7    uint64_t                          dynfield0[1];     /*  56 +  8 */
>       /* --- RTE_MARKER                 cacheline1;                   */
>  8    struct rte_mempool *              pool;             /*  64 +  8 */
>  9    struct rte_mbuf *                 next;             /*  72 +  8 */
> 10    uint64_t             union        tx_offload;       /*  80 +  8 */
> 11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
> 12    uint16_t                          priv_size;        /*  96 +  2 */
>       uint16_t                          timesync;         /*  98 +  2 */
> 12.5  uint32_t                          dynfield1[7];     /* 100 + 28 */
> 16    /* --- END                                             128      */
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Acked-by: Ray Kinsella <mdr@ashroe.eu>

<...>

> --- a/lib/librte_mbuf/rte_mbuf_core.h
> +++ b/lib/librte_mbuf/rte_mbuf_core.h
> @@ -149,11 +149,6 @@ extern "C" {
>   */
>  #define PKT_RX_LRO           (1ULL << 16)
>  
> -/**
> - * Indicate that the timestamp field in the mbuf is valid.
> - */
> -#define PKT_RX_TIMESTAMP     (1ULL << 17)
> -
>  /**
>   * Indicate that security offload processing was applied on the RX packet.
>   */

nit: can we keep a comment here to highlight that flag 17 is unused?

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v2 14/14] mbuf: remove deprecated timestamp field
  2020-11-02 15:41     ` Olivier Matz
@ 2020-11-02 15:47       ` David Marchand
  2020-11-02 15:49         ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: David Marchand @ 2020-11-02 15:47 UTC (permalink / raw)
  To: Thomas Monjalon, Olivier Matz
  Cc: dev, Yigit, Ferruh, Bruce Richardson, Andrew Rybchenko,
	Jerin Jacob Kollanukkaran, Slava Ovsiienko, Ajit Khaparde,
	Ray Kinsella, Neil Horman

On Mon, Nov 2, 2020 at 4:41 PM Olivier Matz <olivier.matz@6wind.com> wrote:
> > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > @@ -149,11 +149,6 @@ extern "C" {
> >   */
> >  #define PKT_RX_LRO           (1ULL << 16)
> >
> > -/**
> > - * Indicate that the timestamp field in the mbuf is valid.
> > - */
> > -#define PKT_RX_TIMESTAMP     (1ULL << 17)
> > -
> >  /**
> >   * Indicate that security offload processing was applied on the RX packet.
> >   */
>
> nit: can we keep a comment here to highlight that flag 17 is unused?

What about marking it free in free_flags?


-- 
David Marchand


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v2 14/14] mbuf: remove deprecated timestamp field
  2020-11-02 15:47       ` David Marchand
@ 2020-11-02 15:49         ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-02 15:49 UTC (permalink / raw)
  To: Olivier Matz, David Marchand
  Cc: dev, Yigit, Ferruh, Bruce Richardson, Andrew Rybchenko,
	Jerin Jacob Kollanukkaran, Slava Ovsiienko, Ajit Khaparde,
	Ray Kinsella, Neil Horman

02/11/2020 16:47, David Marchand:
> On Mon, Nov 2, 2020 at 4:41 PM Olivier Matz <olivier.matz@6wind.com> wrote:
> > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > @@ -149,11 +149,6 @@ extern "C" {
> > >   */
> > >  #define PKT_RX_LRO           (1ULL << 16)
> > >
> > > -/**
> > > - * Indicate that the timestamp field in the mbuf is valid.
> > > - */
> > > -#define PKT_RX_TIMESTAMP     (1ULL << 17)
> > > -
> > >  /**
> > >   * Indicate that security offload processing was applied on the RX packet.
> > >   */
> >
> > nit: can we keep a comment here to highlight that flag 17 is unused?
> 
> What about marking it free in free_flags?

Overkill for a single bit. We will probably use it in the future.
I will add a comment as suggested by Olivier.



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-01 20:59                   ` Morten Brørup
@ 2020-11-02 15:58                     ` Thomas Monjalon
  2020-11-03 12:10                       ` Morten Brørup
  0 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-02 15:58 UTC (permalink / raw)
  To: dev, techboard
  Cc: Ajit Khaparde, Ananyev, Konstantin, Andrew Rybchenko, dev, Yigit,
	Ferruh, david.marchand, Richardson, Bruce, olivier.matz, jerinj,
	viacheslavo, honnappa.nagarahalli, maxime.coquelin, stephen,
	hemant.agrawal, viacheslavo, Matan Azrad, Shahaf Shuler,
	Morten Brørup

+Cc techboard

We need benchmark numbers in order to make a decision.
Please all, prepare some arguments and numbers so we can discuss
the mbuf layout in the next techboard meeting.


01/11/2020 21:59, Morten Brørup:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Sunday, November 1, 2020 5:38 PM
> > 
> > 01/11/2020 10:12, Morten Brørup:
> > > One thing has always puzzled me:
> > > Why do we use 64 bits to indicate which memory pool
> > > an mbuf belongs to?
> > > The portid only uses 16 bits and an indirection index.
> > > Why don't we use the same kind of indirection index for mbuf pools?
> > 
> > I wonder what would be the cost of indirection. Probably negligible.
> 
> Probably. The portid does it, and that indirection is heavily used everywhere.
> 
> The size of mbuf memory pool indirection array should be compile time configurable, like the size of the portid indirection array.
> 
> And for reference, the indirection array will fit into one cache line if we default to 8 mbuf pools, thus supporting an 8 CPU socket system with one mbuf pool per CPU socket, or a 4 CPU socket system with two mbuf pools per CPU socket.
> 
> (And as a side note: Our application is optimized for single-socket systems, and we only use one mbuf pool. I guess many applications were developed without carefully optimizing for multi-socket systems, and also just use one mbuf pool. In these cases, the mbuf structure doesn't really need a pool field. But it is still there, and the DPDK libraries use it, so we didn't bother removing it.)
> 
> > I think it is a good proposal...
> > ... for next year, after a deprecation notice.
> > 
> > > I can easily imagine using one mbuf pool (or perhaps a few pools)
> > > per CPU socket (or per physical memory bus closest to an attached NIC),
> > > but not more than 256 mbuf memory pools in total.
> > > So, let's introduce an mbufpoolid like the portid,
> > > and cut this mbuf field down from 64 to 8 bits.

We will need to measure the perf of the solution.
There is a chance that the cost will be too high.
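
As a starting point for such a measurement, the indirection being discussed could be sketched as follows. This is a toy under stated assumptions, not DPDK code: struct toy_pool, struct toy_mbuf, the table and the helpers are hypothetical, and the limit of 8 pools is the default suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_LINE_SIZE 64
#define TOY_MAX_POOLS 8 /* compile-time limit, like the port id array */

struct toy_pool {
	const char *name;
};

/* Indirection array: pool index -> pool pointer. */
static struct toy_pool *toy_pool_table[TOY_MAX_POOLS];

/* With 8-byte pointers, 8 entries fill exactly one 64-byte cache line. */
_Static_assert(sizeof(toy_pool_table) <= TOY_CACHE_LINE_SIZE,
		"pool table must fit in one cache line");

/* The mbuf keeps an 8-bit index instead of a 64-bit pointer. */
struct toy_mbuf {
	uint8_t pool_id;
};

/* Claim the first free slot for a pool; -1 if the table is full. */
static int
toy_pool_id_alloc(struct toy_pool *pool)
{
	int i;

	for (i = 0; i < TOY_MAX_POOLS; i++) {
		if (toy_pool_table[i] == NULL) {
			toy_pool_table[i] = pool;
			return i;
		}
	}
	return -1;
}

/* One extra load compared with a pointer stored in the mbuf itself. */
static struct toy_pool *
toy_mbuf_pool(const struct toy_mbuf *m)
{
	if (m->pool_id >= TOY_MAX_POOLS)
		return NULL;
	return toy_pool_table[m->pool_id];
}
```

The cost to benchmark is the extra dependent load in toy_mbuf_pool() on the mbuf free path; the sketch shows the mechanism only and says nothing about its speed.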


> > > If we also cut down m->pkt_len from 32 to 24 bits,
> > 
> > Who is using packets larger than 64k? Are 16 bits enough?
> 
> I personally consider 64k a reasonable packet size limit. Exotic applications with even larger packets would have to live with this constraint. But let's see if there are any objections. For reference, 64k corresponds to ca. 44 Ethernet (1500 byte) packets.
> 
> (The limit could be 65535 bytes, to avoid translation of the value 0 into 65536 bytes.)
> 
> This modification would go nicely hand in hand with the mbuf pool indirection modification.
> 
> ... after yet another round of ABI stability discussions, deprecation notices, and so on. :-)

After more thought, I'm afraid 64k is too small in some cases.
And 24-bit manipulation would probably break performance.
I'm afraid we are stuck with 32-bit length.
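
To make the trade-off concrete, here is a small self-contained sketch of what sharing one 32-bit word between an 8-bit pool index and a 24-bit packet length would involve. The layout and the toy_* names are assumptions for illustration only. Every read or update of the length needs a mask, and a 16-bit length would silently truncate anything above 65535 bytes.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_LEN_BITS 24
#define TOY_LEN_MASK ((UINT32_C(1) << TOY_LEN_BITS) - 1)

/* Pack an 8-bit pool index (high byte) with a 24-bit packet length. */
static inline uint32_t
toy_pack(uint8_t pool_id, uint32_t pkt_len)
{
	return ((uint32_t)pool_id << TOY_LEN_BITS) | (pkt_len & TOY_LEN_MASK);
}

/* Reading the length costs a mask instead of a plain 32-bit load. */
static inline uint32_t
toy_pkt_len(uint32_t word)
{
	return word & TOY_LEN_MASK;
}

static inline uint8_t
toy_pool_id(uint32_t word)
{
	return (uint8_t)(word >> TOY_LEN_BITS);
}

/* Updating the length is a read-modify-write to preserve the pool index. */
static inline uint32_t
toy_set_pkt_len(uint32_t word, uint32_t pkt_len)
{
	return (word & ~TOY_LEN_MASK) | (pkt_len & TOY_LEN_MASK);
}
```

For example, a 70000-byte packet fits in 24 bits, but stored in a uint16_t it would read back as 4464 (70000 - 65536). Whether the extra mask and read-modify-write are measurable in the datapath is exactly what would need benchmarking.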


> > > we can get the 8 bit mbuf pool index into the first cache line
> > > at no additional cost.
> > 
> > I like the idea.
> > It means we don't need to move the pool pointer now,
> > i.e. it does not have to replace the timestamp field.
> 
> Agreed! Don't move m->pool to the first cache line; it is not used for RX.
> 
> > 
> > > In other words: This would free up another 64 bit field in the mbuf
> > structure!
> > 
> > That would be great!
> > 
> > 
> > > And even though the m->next pointer for scattered packets resides
> > > in the second cache line, the libraries and application know
> > > that m->next is NULL when m->nb_segs is 1.
> > > This proves that my suggestion would make touching
> > > the second cache line unnecessary (in simple cases),
> > > even for re-initializing the mbuf.
> > 
> > So you think the "next" pointer should stay in the second half of mbuf?
> > 
> > I feel you would like to move the Tx offloads in the first half
> > to improve performance of very simple apps.
> 
> "Very simple apps" sounds like a minority of apps. I would rather say "very simple packet handling scenarios", e.g. forwarding of normal size non-segmented packets. I would guess that the vast majority of packets handled by DPDK applications actually match this scenario. So I'm proposing to optimize for what I think is the most common scenario.
> 
> If segmented packets are common, then m->next could be moved to the first cache line. But it will only improve the pure RX steps of the pipeline. When preparing the packet for TX, m->tx_offloads will need to be set, and the second cache line comes into play. So I'm wondering how big the benefit of having m->next in the first cache line really is - assuming that m->nb_segs will be checked before accessing m->next.
> 
> > I am thinking the opposite: we could have some dynamic fields space
> > in the first half to improve performance of complex Rx.
> > Note: we can add a flag hint for field registration in this first half.
> > 
> 
> I have had the same thoughts. However, I would prefer being able to forward ordinary packets without using the second mbuf cache line at all (although only in specific scenarios like my example above).
> 
> Furthermore, the application can abuse the 64 bit m->tx_offload field for private purposes until it is time to prepare the packet for TX and pass it on to the driver. This hack somewhat resembles a dynamic field in the first cache line, and will not be possible if the m->pool or m->next field is moved there.





* Re: [dpdk-dev] [PATCH v2 05/14] net/ark: switch timestamp to dynamic mbuf field
  2020-11-02 15:32     ` Olivier Matz
@ 2020-11-02 16:10       ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-02 16:10 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	andrew.rybchenko, jerinj, viacheslavo, Shepard Siegel, Ed Czeck,
	John Miller

02/11/2020 16:32, Olivier Matz:
> On Sun, Nov 01, 2020 at 07:06:17PM +0100, Thomas Monjalon wrote:
> > --- a/drivers/net/ark/ark_ethdev_rx.c
> > +++ b/drivers/net/ark/ark_ethdev_rx.c
> > @@ -15,6 +15,9 @@
> >  #define ARK_RX_META_OFFSET (RTE_PKTMBUF_HEADROOM - ARK_RX_META_SIZE)
> >  #define ARK_RX_MAX_NOCHAIN (RTE_MBUF_DEFAULT_DATAROOM)
> >  
> > +extern uint64_t ark_timestamp_rx_dynflag;
> > +extern int ark_timestamp_dynfield_offset;
> > +
> 
> Wouldn't it be better in a .h ?
> Maybe ark_ethdev_rx.h

Yes, it would allow type checking at compile time.





* Re: [dpdk-dev] [PATCH v2 03/14] ethdev: register mbuf field and flags for timestamp
  2020-11-02 15:39     ` Olivier Matz
@ 2020-11-02 16:52       ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-02 16:52 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Matan Azrad, Shahaf Shuler

02/11/2020 16:39, Olivier Matz:
> On Sun, Nov 01, 2020 at 07:06:15PM +0100, Thomas Monjalon wrote:
> > During port configure or queue setup, the offload flags
> > DEV_RX_OFFLOAD_TIMESTAMP and DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP
> > trigger the registration of the related mbuf field and flags.
> > 
> > Previously, the Tx timestamp field and flag were registered in testpmd,
> > as described in mlx5 guide.
> > For the general usage of Rx and Tx timestamps,
> > managing registrations inside ethdev is simpler and properly documented.
> > 
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> 
> <...>
> 
> > --- a/lib/librte_ethdev/rte_ethdev.c
> > +++ b/lib/librte_ethdev/rte_ethdev.c
> > +static inline int
> > +eth_dev_timestamp_mbuf_register(uint64_t rx_offloads, uint64_t tx_offloads)
> > +{
> > +	static const struct rte_mbuf_dynfield field_desc = {
> > +		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
> > +		.size = sizeof(rte_mbuf_timestamp_t),
> > +		.align = __alignof__(rte_mbuf_timestamp_t),
> > +	};
> > +	static const struct rte_mbuf_dynflag rx_flag_desc = {
> > +		.name = RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME,
> > +	};
> > +	static const struct rte_mbuf_dynflag tx_flag_desc = {
> > +		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
> > +	};
> > +	bool todo_rx, todo_tx;
> > +	int offset;
> > +
> > +	todo_rx = (rx_offloads & DEV_RX_OFFLOAD_TIMESTAMP) != 0;
> > +	todo_tx = (tx_offloads & DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) != 0;
> > +
> > +	if (todo_rx || todo_tx) {
> > +		offset = rte_mbuf_dynfield_register(&field_desc);
> > +		if (offset < 0) {
> > +			RTE_ETHDEV_LOG(ERR,
> > +					"Failed to register mbuf field for timestamp\n");
> > +			return -rte_errno;
> > +		}
> > +	}
> > +	if (todo_rx) {
> > +		offset = rte_mbuf_dynflag_register(&rx_flag_desc);
> > +		if (offset < 0) {
> > +			RTE_ETHDEV_LOG(ERR,
> > +					"Failed to register mbuf flag for Rx timestamp\n");
> > +			return -rte_errno;
> > +		}
> > +	}
> > +	if (todo_tx) {
> > +		offset = rte_mbuf_dynflag_register(&tx_flag_desc);
> > +		if (offset < 0) {
> > +			RTE_ETHDEV_LOG(ERR,
> > +					"Failed to register mbuf flag for Tx timestamp\n");
> > +			return -rte_errno;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> 
> The code that registers the dynamic fields and flags for timestamp is
> more or less duplicated several times in the patchset. As discussed
> privately, it would make sense to have helpers to register them in one
> operation, without the need to redeclare the structures:
> 
>   int rte_mbuf_dyn_rx_timestamp_register(int *offset, uint64_t *rx_bitnum)
>   int rte_mbuf_dyn_tx_timestamp_register(int *offset, uint64_t *tx_bitnum)

Yes, we can have a function in rte_mbuf_dyn.h/c which is called
to register the field and flags from drivers or apps.
Then there is no need to register at ethdev level.
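
A minimal sketch of the idea, as a self-contained toy model rather than
the real rte_mbuf_dyn implementation: registration by name is
idempotent, so every driver or app calling the helper gets the same
offset and flag. All toy_* names, the 16-byte dynamic area and the first
free flag bit (23) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_DYN_AREA_SIZE 16   /* bytes for dynamic fields (assumed) */
#define TOY_MAX_ENTRIES 8      /* registry capacity (assumed) */
#define TOY_FIRST_FREE_FLAG 23 /* first free flag bit (assumed) */

struct toy_entry {
	const char *name;
	int value; /* byte offset for a field, bit number for a flag */
};

static struct toy_entry toy_fields[TOY_MAX_ENTRIES];
static struct toy_entry toy_flags[TOY_MAX_ENTRIES];
static int toy_nb_fields, toy_nb_flags;
static int toy_next_offset;
static int toy_next_bit = TOY_FIRST_FREE_FLAG;

/* Return the value recorded for name, or -1 if it is unknown. */
static int
toy_lookup(const struct toy_entry *tab, int n, const char *name)
{
	for (int i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return tab[i].value;
	return -1;
}

/* Register a named field; registering the same name again is a no-op. */
int
toy_dynfield_register(const char *name, int size)
{
	assert(name != NULL);
	int off = toy_lookup(toy_fields, toy_nb_fields, name);
	if (off >= 0)
		return off;
	if (toy_nb_fields == TOY_MAX_ENTRIES ||
			toy_next_offset + size > TOY_DYN_AREA_SIZE)
		return -1;
	off = toy_next_offset;
	toy_fields[toy_nb_fields++] = (struct toy_entry){ name, off };
	toy_next_offset += size;
	return off;
}

/* Register a named flag and return its bit number. */
int
toy_dynflag_register(const char *name)
{
	assert(name != NULL);
	int bit = toy_lookup(toy_flags, toy_nb_flags, name);
	if (bit >= 0)
		return bit;
	if (toy_nb_flags == TOY_MAX_ENTRIES || toy_next_bit > 63)
		return -1;
	bit = toy_next_bit++;
	toy_flags[toy_nb_flags++] = (struct toy_entry){ name, bit };
	return bit;
}

/* One call registers both the field and the Rx flag. */
int
toy_rx_timestamp_register(int *field_offset, uint64_t *rx_flag)
{
	int off = toy_dynfield_register("toy_dynfield_timestamp",
			(int)sizeof(uint64_t));
	if (off < 0)
		return -1;
	int bit = toy_dynflag_register("toy_dynflag_rx_timestamp");
	if (bit < 0)
		return -1;
	if (field_offset != NULL)
		*field_offset = off;
	if (rx_flag != NULL)
		*rx_flag = UINT64_C(1) << bit;
	return 0;
}
```

Because the second and later calls only look up the existing entries,
each caller can register on its own without coordination at ethdev level.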




* Re: [dpdk-dev] [PATCH v2 08/14] net/mlx5: switch timestamp to dynamic mbuf field
  2020-11-01 18:06   ` [dpdk-dev] [PATCH v2 08/14] net/mlx5: switch timestamp to dynamic mbuf field Thomas Monjalon
  2020-11-02  5:08     ` Ruifeng Wang
@ 2020-11-02 23:20     ` David Christensen
  1 sibling, 0 replies; 170+ messages in thread
From: David Christensen @ 2020-11-02 23:20 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Matan Azrad,
	Shahaf Shuler, Ruifeng Wang, Konstantin Ananyev



On 11/1/20 10:06 AM, Thomas Monjalon wrote:
> The mbuf timestamp is moved to a dynamic field
> in order to allow removal of the deprecated static field.
> The related mbuf flag is also replaced.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>   drivers/net/mlx5/mlx5_rxq.c              | 36 ++++++++++++++++++++
>   drivers/net/mlx5/mlx5_rxtx.c             |  4 +--
>   drivers/net/mlx5/mlx5_rxtx.h             | 19 +++++++++++
>   drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 41 +++++++++++-----------
>   drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 43 ++++++++++++------------
>   drivers/net/mlx5/mlx5_rxtx_vec_sse.h     | 35 +++++++++----------
>   6 files changed, 118 insertions(+), 60 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index f1d8373079..877aa24a18 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -1310,6 +1310,39 @@ mlx5_max_lro_msg_size_adjust(struct rte_eth_dev *dev, uint16_t idx,
>   		priv->max_lro_msg_size * MLX5_LRO_SEG_CHUNK_SIZE);
>   }
> 
> +/**
> + * Lookup mbuf field and flag for Rx timestamp if offload requested.
> + *
> + * @param rxq_data
> + *   Datapath struct where field offset and flag mask are stored.
> + *
> + * @return
> + *   0 on success or offload disabled, negative errno otherwise.
> + */
> +static int
> +mlx5_rx_timestamp_setup(struct mlx5_rxq_data *rxq_data)
> +{
> +	int timestamp_rx_dynflag_offset;
> +
> +	rxq_data->timestamp_rx_flag = 0;
> +	if (rxq_data->hw_timestamp == 0)
> +		return 0;
> +	rxq_data->timestamp_offset = rte_mbuf_dynfield_lookup(
> +			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
> +	if (rxq_data->timestamp_offset < 0) {
> +		DRV_LOG(ERR, "Cannot lookup timestamp field\n");
> +		return -rte_errno;
> +	}
> +	timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
> +			RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
> +	if (timestamp_rx_dynflag_offset < 0) {
> +		DRV_LOG(ERR, "Cannot lookup Rx timestamp flag\n");
> +		return -rte_errno;
> +	}
> +	rxq_data->timestamp_rx_flag = RTE_BIT64(timestamp_rx_dynflag_offset);
> +	return 0;
> +}
> +
>   /**
>    * Create a DPDK Rx queue.
>    *
> @@ -1492,7 +1525,10 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>   	mlx5_max_lro_msg_size_adjust(dev, idx, max_lro_size);
>   	/* Toggle RX checksum offload if hardware supports it. */
>   	tmpl->rxq.csum = !!(offloads & DEV_RX_OFFLOAD_CHECKSUM);
> +	/* Configure Rx timestamp. */
>   	tmpl->rxq.hw_timestamp = !!(offloads & DEV_RX_OFFLOAD_TIMESTAMP);
> +	if (mlx5_rx_timestamp_setup(&tmpl->rxq) != 0)
> +		goto error;
>   	/* Configure VLAN stripping. */
>   	tmpl->rxq.vlan_strip = !!(offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
>   	/* By default, FCS (CRC) is stripped by hardware. */
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index e86468b67a..b577aab00b 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -1287,8 +1287,8 @@ rxq_cq_to_mbuf(struct mlx5_rxq_data *rxq, struct rte_mbuf *pkt,
> 
>   		if (rxq->rt_timestamp)
>   			ts = mlx5_txpp_convert_rx_ts(rxq->sh, ts);
> -		pkt->timestamp = ts;
> -		pkt->ol_flags |= PKT_RX_TIMESTAMP;
> +		mlx5_timestamp_set(pkt, rxq->timestamp_offset, ts);
> +		pkt->ol_flags |= rxq->timestamp_rx_flag;
>   	}
>   }
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> index 674296ee98..e9eca36b40 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.h
> +++ b/drivers/net/mlx5/mlx5_rxtx.h
> @@ -151,6 +151,8 @@ struct mlx5_rxq_data {
>   	/* CQ (UAR) access lock required for 32bit implementations */
>   #endif
>   	uint32_t tunnel; /* Tunnel information. */
> +	int timestamp_offset; /* Dynamic mbuf field for timestamp. */
> +	uint64_t timestamp_rx_flag; /* Dynamic mbuf flag for timestamp. */
>   	uint64_t flow_meta_mask;
>   	int32_t flow_meta_offset;
>   } __rte_cache_aligned;
> @@ -681,4 +683,21 @@ mlx5_txpp_convert_tx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t mts)
>   	return ci;
>   }
> 
> +/**
> + * Set timestamp in mbuf dynamic field.
> + *
> + * @param mbuf
> + *   Structure to write into.
> + * @param offset
> + *   Dynamic field offset in mbuf structure.
> + * @param timestamp
> + *   Value to write.
> + */
> +static __rte_always_inline void
> +mlx5_timestamp_set(struct rte_mbuf *mbuf, int offset,
> +		rte_mbuf_timestamp_t timestamp)
> +{
> +	*RTE_MBUF_DYNFIELD(mbuf, offset, rte_mbuf_timestamp_t *) = timestamp;
> +}
> +
>   #endif /* RTE_PMD_MLX5_RXTX_H_ */
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> index 6bf0c9b540..171d7bb0f8 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> @@ -330,13 +330,13 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
>   	vector unsigned char ol_flags = (vector unsigned char)
>   		(vector unsigned int){
>   			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag,
>   			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag,
>   			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag,
>   			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP};
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag};
>   	vector unsigned char cv_flags;
>   	const vector unsigned char zero = (vector unsigned char){0};
>   	const vector unsigned char ptype_mask =
> @@ -1025,31 +1025,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
>   		/* D.5 fill in mbuf - rearm_data and packet_type. */
>   		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
>   		if (rxq->hw_timestamp) {
> +			int offset = rxq->timestamp_offset;
>   			if (rxq->rt_timestamp) {
>   				struct mlx5_dev_ctx_shared *sh = rxq->sh;
>   				uint64_t ts;
> 
>   				ts = rte_be_to_cpu_64(cq[pos].timestamp);
> -				pkts[pos]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>   				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
> -				pkts[pos + 1]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 1], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>   				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
> -				pkts[pos + 2]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 2], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>   				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
> -				pkts[pos + 3]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 3], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>   			} else {
> -				pkts[pos]->timestamp = rte_be_to_cpu_64
> -						(cq[pos].timestamp);
> -				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p1].timestamp);
> -				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p2].timestamp);
> -				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p3].timestamp);
> +				mlx5_timestamp_set(pkts[pos], offset,
> +					rte_be_to_cpu_64(cq[pos].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 1], offset,
> +					rte_be_to_cpu_64(cq[pos + p1].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 2], offset,
> +					rte_be_to_cpu_64(cq[pos + p2].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 3], offset,
> +					rte_be_to_cpu_64(cq[pos + p3].timestamp));
>   			}
>   		}
>   		if (rxq->dynf_meta) {

Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>


* [dpdk-dev] [PATCH v3 00/16] remove mbuf timestamp
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (15 preceding siblings ...)
  2020-11-01 18:06 ` [dpdk-dev] [PATCH v2 00/14] remove mbuf timestamp Thomas Monjalon
@ 2020-11-03  0:13 ` Thomas Monjalon
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 01/16] eventdev: remove software Rx timestamp Thomas Monjalon
                     ` (16 more replies)
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
  18 siblings, 17 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:13 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The mbuf field timestamp was announced to be removed for three reasons:
  - a dynamic field already exists, used for Tx only
  - this field always used 8 bytes even if unneeded
  - this field is in the first half (cacheline) of mbuf

After this series, the dynamic field timestamp is used for both Rx and Tx
with separate dynamic flags to distinguish when the value is meaningful
without resetting the field during forwarding.

As a consequence, 8 bytes can be re-allocated to dynamic fields
in the first half of mbuf structure.
Further changes to the mbuf layout are still open.

This mbuf layout change is important to allow adding more features
(consuming more dynamic fields) during the next year,
and can allow performance improvements with new usages in the first half.
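
To illustrate how one shared field works with two separate flags, here
is a minimal self-contained sketch. It is a toy model: the toy_mbuf
layout, the fixed field offset and the two flag bits are assumptions
chosen for the example, whereas real code obtains the offset and flags
at registration time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_TS_OFFSET 0                    /* assumed field offset */
#define TOY_RX_TS_FLAG (UINT64_C(1) << 23) /* assumed Rx flag bit */
#define TOY_TX_TS_FLAG (UINT64_C(1) << 24) /* assumed Tx flag bit */

/* Hypothetical mbuf: offload flags plus a small dynamic area. */
struct toy_mbuf {
	uint64_t ol_flags;
	uint8_t dyn[16];
};

/* Read the shared timestamp field. */
static inline uint64_t
toy_ts_get(const struct toy_mbuf *m)
{
	uint64_t ts;

	memcpy(&ts, m->dyn + TOY_TS_OFFSET, sizeof(ts));
	return ts;
}

/* Write the shared timestamp field. */
static inline void
toy_ts_set(struct toy_mbuf *m, uint64_t ts)
{
	memcpy(m->dyn + TOY_TS_OFFSET, &ts, sizeof(ts));
}

/* Rx driver: store the timestamp and mark it as valid for Rx. */
void
toy_rx_stamp(struct toy_mbuf *m, uint64_t ts)
{
	assert(m != NULL);
	toy_ts_set(m, ts);
	m->ol_flags |= TOY_RX_TS_FLAG;
}

/* Application: request a scheduled send at the given time. */
void
toy_tx_schedule(struct toy_mbuf *m, uint64_t when)
{
	assert(m != NULL);
	toy_ts_set(m, when);
	m->ol_flags |= TOY_TX_TS_FLAG;
}

/* Tx driver: only the Tx flag says the field holds a send time. */
int
toy_tx_is_scheduled(const struct toy_mbuf *m)
{
	return (m->ol_flags & TOY_TX_TS_FLAG) != 0;
}
```

A packet that is received and forwarded as-is keeps its Rx timestamp in
the field, but the Tx driver ignores it because the Tx flag was never
set, so the forwarding path does not have to clear the field.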


v3:
- move ark variables declaration in a .h file
- improve cache locality for octeontx2
- add comments about cache locality in commit logs
- add comment for unused flag offset 17
- add timestamp register functions
- replace lookup with register in drivers and apps
- remove register in ethdev

v2:
- remove optimization to register only once in ethdev
- fix error message in latencystats
- convert rxtx_callbacks macro to inline function
- increase dynamic fields space
- do not move pool field


Thomas Monjalon (16):
  eventdev: remove software Rx timestamp
  mbuf: add Rx timestamp flag and helpers
  latency: switch Rx timestamp to dynamic mbuf field
  net/ark: switch Rx timestamp to dynamic mbuf field
  net/dpaa2: switch Rx timestamp to dynamic mbuf field
  net/mlx5: fix dynamic mbuf offset lookup check
  net/mlx5: switch Rx timestamp to dynamic mbuf field
  net/nfb: switch Rx timestamp to dynamic mbuf field
  net/octeontx2: switch Rx timestamp to dynamic mbuf field
  net/pcap: switch Rx timestamp to dynamic mbuf field
  app/testpmd: switch Rx timestamp to dynamic mbuf field
  examples/rxtx_callbacks: switch timestamp to dynamic field
  ethdev: add doxygen comment for Rx timestamp API
  mbuf: remove deprecated timestamp field
  mbuf: add Tx timestamp registration helper
  ethdev: include mbuf registration in Tx timestamp API

 app/test-pmd/config.c                         | 38 -------------
 app/test-pmd/util.c                           | 39 ++++++++++++-
 app/test/test_mbuf.c                          |  1 -
 doc/guides/nics/mlx5.rst                      |  5 +-
 .../prog_guide/event_ethernet_rx_adapter.rst  |  6 +-
 doc/guides/rel_notes/deprecation.rst          |  4 --
 doc/guides/rel_notes/release_20_11.rst        |  4 ++
 drivers/net/ark/ark_ethdev.c                  | 16 ++++++
 drivers/net/ark/ark_ethdev_rx.c               |  7 ++-
 drivers/net/ark/ark_ethdev_rx.h               |  2 +
 drivers/net/dpaa2/dpaa2_ethdev.c              | 11 ++++
 drivers/net/dpaa2/dpaa2_ethdev.h              |  2 +
 drivers/net/dpaa2/dpaa2_rxtx.c                | 25 ++++++---
 drivers/net/mlx5/mlx5_ethdev.c                |  8 ++-
 drivers/net/mlx5/mlx5_rxq.c                   |  8 +++
 drivers/net/mlx5/mlx5_rxtx.c                  |  8 +--
 drivers/net/mlx5/mlx5_rxtx.h                  | 19 +++++++
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h      | 41 +++++++-------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h         | 43 ++++++++-------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h          | 35 ++++++------
 drivers/net/mlx5/mlx5_trigger.c               |  2 +-
 drivers/net/mlx5/mlx5_txq.c                   |  2 +-
 drivers/net/nfb/nfb_rx.c                      | 15 ++++-
 drivers/net/nfb/nfb_rx.h                      | 18 ++++--
 drivers/net/octeontx2/otx2_ethdev.c           | 10 ++++
 drivers/net/octeontx2/otx2_rx.h               | 19 ++++++-
 drivers/net/pcap/rte_eth_pcap.c               | 20 ++++++-
 examples/rxtx_callbacks/main.c                | 16 +++++-
 lib/librte_ethdev/rte_ethdev.h                | 13 ++++-
 .../rte_event_eth_rx_adapter.c                | 11 ----
 .../rte_event_eth_rx_adapter.h                |  6 +-
 lib/librte_latencystats/rte_latencystats.c    | 30 ++++++++--
 lib/librte_mbuf/rte_mbuf.c                    |  2 -
 lib/librte_mbuf/rte_mbuf.h                    |  2 +-
 lib/librte_mbuf/rte_mbuf_core.h               | 12 +---
 lib/librte_mbuf/rte_mbuf_dyn.c                | 51 +++++++++++++++++
 lib/librte_mbuf/rte_mbuf_dyn.h                | 55 +++++++++++++++++--
 lib/librte_mbuf/version.map                   |  2 +
 38 files changed, 429 insertions(+), 179 deletions(-)

-- 
2.28.0



* [dpdk-dev] [PATCH v3 01/16] eventdev: remove software Rx timestamp
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
@ 2020-11-03  0:13   ` Thomas Monjalon
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 02/16] mbuf: add Rx timestamp flag and helpers Thomas Monjalon
                     ` (15 subsequent siblings)
  16 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:13 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Nikhil Rao

This is a revert of commit 569758758dcd ("eventdev: add Rx timestamp").
If the Rx timestamp is not configured on the ethdev port,
there is no reason to set one.
Also, the accuracy of the timestamp was poor because it was set at a late stage.
Anyway, there is no trace of any usage of this timestamp.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 doc/guides/prog_guide/event_ethernet_rx_adapter.rst |  6 +-----
 lib/librte_eventdev/rte_event_eth_rx_adapter.c      | 11 -----------
 lib/librte_eventdev/rte_event_eth_rx_adapter.h      |  6 +-----
 3 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
index 236f43f455..cb44ce0e47 100644
--- a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
+++ b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
@@ -12,11 +12,7 @@ be supported in hardware or require a software thread to receive packets from
 the ethdev port using ethdev poll mode APIs and enqueue these as events to the
 event device using the eventdev API. Both transfer mechanisms may be present on
 the same platform depending on the particular combination of the ethdev and
-the event device. For SW based packet transfer, if the mbuf does not have a
-timestamp set, the adapter adds a timestamp to the mbuf using
-rte_get_tsc_cycles(), this provides a more accurate timestamp as compared to
-if the application were to set the timestamp since it avoids event device
-schedule latency.
+the event device.
 
 The Event Ethernet Rx Adapter library is intended for the application code to
 configure both transfer mechanisms using a common API. A capability API allows
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.c b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
index f0000d1ede..3c73046551 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.c
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
@@ -763,7 +763,6 @@ rxa_buffer_mbufs(struct rte_event_eth_rx_adapter *rx_adapter,
 	uint32_t rss_mask;
 	uint32_t rss;
 	int do_rss;
-	uint64_t ts;
 	uint16_t nb_cb;
 	uint16_t dropped;
 
@@ -771,16 +770,6 @@ rxa_buffer_mbufs(struct rte_event_eth_rx_adapter *rx_adapter,
 	rss_mask = ~(((m->ol_flags & PKT_RX_RSS_HASH) != 0) - 1);
 	do_rss = !rss_mask && !eth_rx_queue_info->flow_id_mask;
 
-	if ((m->ol_flags & PKT_RX_TIMESTAMP) == 0) {
-		ts = rte_get_tsc_cycles();
-		for (i = 0; i < num; i++) {
-			m = mbufs[i];
-
-			m->timestamp = ts;
-			m->ol_flags |= PKT_RX_TIMESTAMP;
-		}
-	}
-
 	for (i = 0; i < num; i++) {
 		m = mbufs[i];
 
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.h b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
index 2dd259c279..21bb1e54c8 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.h
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
@@ -21,11 +21,7 @@
  *
  * The adapter uses a EAL service core function for SW based packet transfer
  * and uses the eventdev PMD functions to configure HW based packet transfer
- * between the ethernet device and the event device. For SW based packet
- * transfer, if the mbuf does not have a timestamp set, the adapter adds a
- * timestamp to the mbuf using rte_get_tsc_cycles(), this provides a more
- * accurate timestamp as compared to if the application were to set the time
- * stamp since it avoids event device schedule latency.
+ * between the ethernet device and the event device.
  *
  * The ethernet Rx event adapter's functions are:
  *  - rte_event_eth_rx_adapter_create_ext()
-- 
2.28.0



* [dpdk-dev] [PATCH v3 02/16] mbuf: add Rx timestamp flag and helpers
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 01/16] eventdev: remove software Rx timestamp Thomas Monjalon
@ 2020-11-03  0:13   ` Thomas Monjalon
  2020-11-03  9:33     ` Olivier Matz
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 03/16] latency: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
                     ` (14 subsequent siblings)
  16 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:13 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ray Kinsella, Neil Horman

There is already a dynamic field for timestamp,
used only for Tx scheduling with the dedicated Tx offload flag.
The same field can be used for Rx timestamp filled by drivers.

A new dynamic flag is defined for Rx usage.
A new function wraps the registration of both field and Rx flag.
The type rte_mbuf_timestamp_t is defined for the API users.

After migrating all Rx timestamp usages, it will be possible
to remove the deprecated timestamp field.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_mbuf/rte_mbuf_dyn.c | 43 ++++++++++++++++++++++++++++++++++
 lib/librte_mbuf/rte_mbuf_dyn.h | 33 ++++++++++++++++++++++----
 lib/librte_mbuf/version.map    |  1 +
 3 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
index 538a43f695..e279b23aea 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.c
+++ b/lib/librte_mbuf/rte_mbuf_dyn.c
@@ -13,6 +13,7 @@
 #include <rte_errno.h>
 #include <rte_malloc.h>
 #include <rte_string_fns.h>
+#include <rte_bitops.h>
 #include <rte_mbuf.h>
 #include <rte_mbuf_dyn.h>
 
@@ -569,3 +570,45 @@ void rte_mbuf_dyn_dump(FILE *out)
 
 	rte_mcfg_tailq_write_unlock();
 }
+
+static int
+rte_mbuf_dyn_timestamp_register(int *field_offset, uint64_t *flag,
+		const char *direction, const char *flag_name)
+{
+	static const struct rte_mbuf_dynfield field_desc = {
+		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
+		.size = sizeof(rte_mbuf_timestamp_t),
+		.align = __alignof__(rte_mbuf_timestamp_t),
+	};
+	struct rte_mbuf_dynflag flag_desc = {};
+	int offset;
+
+	offset = rte_mbuf_dynfield_register(&field_desc);
+	if (offset < 0) {
+		RTE_LOG(ERR, MBUF,
+			"Failed to register mbuf field for timestamp\n");
+		return -1;
+	}
+	if (field_offset != NULL)
+		*field_offset = offset;
+
+	strlcpy(flag_desc.name, flag_name, sizeof flag_desc.name);
+	offset = rte_mbuf_dynflag_register(&flag_desc);
+	if (offset < 0) {
+		RTE_LOG(ERR, MBUF,
+			"Failed to register mbuf flag for %s timestamp\n",
+			direction);
+		return -1;
+	}
+	if (flag != NULL)
+		*flag = RTE_BIT64(offset);
+
+	return 0;
+}
+
+int
+rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag)
+{
+	return rte_mbuf_dyn_timestamp_register(field_offset, rx_flag,
+			"Rx", RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME);
+}
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 0ebac88b83..2e729ddaca 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -258,13 +258,36 @@ void rte_mbuf_dyn_dump(FILE *out);
  * timestamp. The dynamic Tx timestamp flag tells whether the field contains
  * actual timestamp value for the packets being sent, this value can be
  * used by PMD to schedule packet sending.
- *
- * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
- * and obsoleting, the dedicated Rx timestamp flag is supposed to be
- * introduced and the shared dynamic timestamp field will be used
- * to handle the timestamps on receiving datapath as well.
  */
 #define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+typedef uint64_t rte_mbuf_timestamp_t;
+
+/**
+ * Indicate that the timestamp field in the mbuf was filled by the driver.
+ */
+#define RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME "rte_dynflag_rx_timestamp"
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Register dynamic mbuf field and flag for Rx timestamp.
+ *
+ * @param field_offset
+ *   Pointer to the offset of the registered mbuf field, can be NULL.
+ *   The same field is shared for Rx and Tx timestamp.
+ * @param rx_flag
+ *   Pointer to the mask of the registered offload flag, can be NULL.
+ * @return
+ *   0 on success, -1 otherwise.
+ *   Possible values for rte_errno:
+ *   - EEXIST: already registered with different parameters.
+ *   - EPERM: called from a secondary process.
+ *   - ENOENT: no more field or flag available.
+ *   - ENOMEM: allocation failure.
+ */
+__rte_experimental
+int rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag);
 
 /**
  * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
diff --git a/lib/librte_mbuf/version.map b/lib/librte_mbuf/version.map
index a011aaead3..0b66668bff 100644
--- a/lib/librte_mbuf/version.map
+++ b/lib/librte_mbuf/version.map
@@ -42,6 +42,7 @@ EXPERIMENTAL {
 	rte_mbuf_dynflag_register;
 	rte_mbuf_dynflag_register_bitnum;
 	rte_mbuf_dyn_dump;
+	rte_mbuf_dyn_rx_timestamp_register;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
 	rte_pktmbuf_pool_create_extbuf;
-- 
2.28.0



* [dpdk-dev] [PATCH v3 03/16] latency: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 01/16] eventdev: remove software Rx timestamp Thomas Monjalon
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 02/16] mbuf: add Rx timestamp flag and helpers Thomas Monjalon
@ 2020-11-03  0:13   ` Thomas Monjalon
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 04/16] net/ark: " Thomas Monjalon
                     ` (13 subsequent siblings)
  16 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:13 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Reshma Pattan

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced with the dynamic one.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_latencystats/rte_latencystats.c | 30 ++++++++++++++++++----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/lib/librte_latencystats/rte_latencystats.c b/lib/librte_latencystats/rte_latencystats.c
index ba2fff3bcb..ab8db7a139 100644
--- a/lib/librte_latencystats/rte_latencystats.c
+++ b/lib/librte_latencystats/rte_latencystats.c
@@ -9,6 +9,7 @@
 
 #include <rte_string_fns.h>
 #include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include <rte_log.h>
 #include <rte_cycles.h>
 #include <rte_ethdev.h>
@@ -31,6 +32,16 @@ latencystat_cycles_per_ns(void)
 /* Macros for printing using RTE_LOG */
 #define RTE_LOGTYPE_LATENCY_STATS RTE_LOGTYPE_USER1
 
+static uint64_t timestamp_dynflag;
+static int timestamp_dynfield_offset = -1;
+
+static inline rte_mbuf_timestamp_t *
+timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static const char *MZ_RTE_LATENCY_STATS = "rte_latencystats";
 static int latency_stats_index;
 static uint64_t samp_intvl;
@@ -128,10 +139,10 @@ add_time_stamps(uint16_t pid __rte_unused,
 		diff_tsc = now - prev_tsc;
 		timer_tsc += diff_tsc;
 
-		if ((pkts[i]->ol_flags & PKT_RX_TIMESTAMP) == 0
+		if ((pkts[i]->ol_flags & timestamp_dynflag) == 0
 				&& (timer_tsc >= samp_intvl)) {
-			pkts[i]->timestamp = now;
-			pkts[i]->ol_flags |= PKT_RX_TIMESTAMP;
+			*timestamp_dynfield(pkts[i]) = now;
+			pkts[i]->ol_flags |= timestamp_dynflag;
 			timer_tsc = 0;
 		}
 		prev_tsc = now;
@@ -161,8 +172,8 @@ calc_latency(uint16_t pid __rte_unused,
 
 	now = rte_rdtsc();
 	for (i = 0; i < nb_pkts; i++) {
-		if (pkts[i]->ol_flags & PKT_RX_TIMESTAMP)
-			latency[cnt++] = now - pkts[i]->timestamp;
+		if (pkts[i]->ol_flags & timestamp_dynflag)
+			latency[cnt++] = now - *timestamp_dynfield(pkts[i]);
 	}
 
 	rte_spinlock_lock(&glob_stats->lock);
@@ -241,6 +252,15 @@ rte_latencystats_init(uint64_t app_samp_intvl,
 		return -1;
 	}
 
+	/* Register mbuf field and flag for Rx timestamp */
+	ret = rte_mbuf_dyn_rx_timestamp_register(&timestamp_dynfield_offset,
+			&timestamp_dynflag);
+	if (ret != 0) {
+		RTE_LOG(ERR, LATENCY_STATS,
+			"Cannot register mbuf field/flag for timestamp\n");
+		return -rte_errno;
+	}
+
 	/** Register Rx/Tx callbacks */
 	RTE_ETH_FOREACH_DEV(pid) {
 		struct rte_eth_dev_info dev_info;
-- 
2.28.0



* [dpdk-dev] [PATCH v3 04/16] net/ark: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (2 preceding siblings ...)
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 03/16] latency: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-11-03  0:13   ` Thomas Monjalon
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 05/16] net/dpaa2: " Thomas Monjalon
                     ` (12 subsequent siblings)
  16 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:13 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Shepard Siegel, Ed Czeck,
	John Miller

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related dynamic mbuf flag is now set, although the flag was missing previously.

The timestamp is set if configured for at least one device.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/ark/ark_ethdev.c    | 16 ++++++++++++++++
 drivers/net/ark/ark_ethdev_rx.c |  7 ++++++-
 drivers/net/ark/ark_ethdev_rx.h |  2 ++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
index fa343999a1..a34dcc5291 100644
--- a/drivers/net/ark/ark_ethdev.c
+++ b/drivers/net/ark/ark_ethdev.c
@@ -79,6 +79,8 @@ static int  eth_ark_set_mtu(struct rte_eth_dev *dev, uint16_t size);
 #define ARK_TX_MAX_QUEUE (4096 * 4)
 #define ARK_TX_MIN_QUEUE (256)
 
+uint64_t ark_timestamp_rx_dynflag;
+int ark_timestamp_dynfield_offset = -1;
 int rte_pmd_ark_rx_userdata_dynfield_offset = -1;
 int rte_pmd_ark_tx_userdata_dynfield_offset = -1;
 
@@ -552,6 +554,18 @@ static int
 eth_ark_dev_configure(struct rte_eth_dev *dev)
 {
 	struct ark_adapter *ark = dev->data->dev_private;
+	int ret;
+
+	if (dev->data->dev_conf.rxmode.offloads & DEV_RX_OFFLOAD_TIMESTAMP) {
+		ret = rte_mbuf_dyn_rx_timestamp_register(
+				&ark_timestamp_dynfield_offset,
+				&ark_timestamp_rx_dynflag);
+		if (ret != 0) {
+			ARK_PMD_LOG(ERR,
+				"Failed to register Rx timestamp field/flag\n");
+			return -rte_errno;
+		}
+	}
 
 	eth_ark_dev_set_link_up(dev);
 	if (ark->user_ext.dev_configure)
@@ -782,6 +796,8 @@ eth_ark_dev_info_get(struct rte_eth_dev *dev,
 				ETH_LINK_SPEED_50G |
 				ETH_LINK_SPEED_100G);
 
+	dev_info->rx_offload_capa = DEV_RX_OFFLOAD_TIMESTAMP;
+
 	return 0;
 }
 
diff --git a/drivers/net/ark/ark_ethdev_rx.c b/drivers/net/ark/ark_ethdev_rx.c
index c24cc00e2f..d29d3db783 100644
--- a/drivers/net/ark/ark_ethdev_rx.c
+++ b/drivers/net/ark/ark_ethdev_rx.c
@@ -272,7 +272,12 @@ eth_ark_recv_pkts(void *rx_queue,
 		mbuf->port = meta->port;
 		mbuf->pkt_len = meta->pkt_len;
 		mbuf->data_len = meta->pkt_len;
-		mbuf->timestamp = meta->timestamp;
+		/* set timestamp if enabled at least on one device */
+		if (ark_timestamp_rx_dynflag > 0) {
+			*RTE_MBUF_DYNFIELD(mbuf, ark_timestamp_dynfield_offset,
+				rte_mbuf_timestamp_t *) = meta->timestamp;
+			mbuf->ol_flags |= ark_timestamp_rx_dynflag;
+		}
 		rte_pmd_ark_mbuf_rx_userdata_set(mbuf, meta->user_data);
 
 		if (ARK_DEBUG_CORE) {	/* debug sanity checks */
diff --git a/drivers/net/ark/ark_ethdev_rx.h b/drivers/net/ark/ark_ethdev_rx.h
index 0fdd29b1ab..001fa9bdfa 100644
--- a/drivers/net/ark/ark_ethdev_rx.h
+++ b/drivers/net/ark/ark_ethdev_rx.h
@@ -11,6 +11,8 @@
 #include <rte_mempool.h>
 #include <rte_ethdev_driver.h>
 
+extern uint64_t ark_timestamp_rx_dynflag;
+extern int ark_timestamp_dynfield_offset;
 
 int eth_ark_dev_rx_queue_setup(struct rte_eth_dev *dev,
 			       uint16_t queue_idx,
-- 
2.28.0



* [dpdk-dev] [PATCH v3 05/16] net/dpaa2: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (3 preceding siblings ...)
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 04/16] net/ark: " Thomas Monjalon
@ 2020-11-03  0:13   ` Thomas Monjalon
  2020-11-03  9:18     ` Hemant Agrawal
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 06/16] net/mlx5: fix dynamic mbuf offset lookup check Thomas Monjalon
                     ` (11 subsequent siblings)
  16 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:13 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Hemant Agrawal,
	Sachin Saxena

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/dpaa2/dpaa2_ethdev.c | 11 +++++++++++
 drivers/net/dpaa2/dpaa2_ethdev.h |  2 ++
 drivers/net/dpaa2/dpaa2_rxtx.c   | 25 ++++++++++++++++++-------
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 04e60c56f2..3b0c7717b6 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -65,6 +65,8 @@ static uint64_t dev_tx_offloads_nodis =
 
 /* enable timestamp in mbuf */
 bool dpaa2_enable_ts[RTE_MAX_ETHPORTS];
+uint64_t dpaa2_timestamp_rx_dynflag;
+int dpaa2_timestamp_dynfield_offset = -1;
 
 struct rte_dpaa2_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -587,7 +589,16 @@ dpaa2_eth_dev_configure(struct rte_eth_dev *dev)
 #if !defined(RTE_LIBRTE_IEEE1588)
 	if (rx_offloads & DEV_RX_OFFLOAD_TIMESTAMP)
 #endif
+	{
+		ret = rte_mbuf_dyn_rx_timestamp_register(
+				&dpaa2_timestamp_dynfield_offset,
+				&dpaa2_timestamp_rx_dynflag);
+		if (ret != 0) {
+			DPAA2_PMD_ERR("Error to register timestamp field/flag");
+			return -rte_errno;
+		}
 		dpaa2_enable_ts[dev->data->port_id] = true;
+	}
 
 	if (tx_offloads & DEV_TX_OFFLOAD_IPV4_CKSUM)
 		tx_l3_csum_offload = true;
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.h b/drivers/net/dpaa2/dpaa2_ethdev.h
index 94cf253827..8d82f74684 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.h
+++ b/drivers/net/dpaa2/dpaa2_ethdev.h
@@ -92,6 +92,8 @@
 
 /* enable timestamp in mbuf*/
 extern bool dpaa2_enable_ts[];
+extern uint64_t dpaa2_timestamp_rx_dynflag;
+extern int dpaa2_timestamp_dynfield_offset;
 
 #define DPAA2_QOS_TABLE_RECONFIGURE	1
 #define DPAA2_FS_TABLE_RECONFIGURE	2
diff --git a/drivers/net/dpaa2/dpaa2_rxtx.c b/drivers/net/dpaa2/dpaa2_rxtx.c
index 6201de4606..9cca6d16c3 100644
--- a/drivers/net/dpaa2/dpaa2_rxtx.c
+++ b/drivers/net/dpaa2/dpaa2_rxtx.c
@@ -31,6 +31,13 @@ dpaa2_dev_rx_parse_slow(struct rte_mbuf *mbuf,
 
 static void enable_tx_tstamp(struct qbman_fd *fd) __rte_unused;
 
+static inline rte_mbuf_timestamp_t *
+dpaa2_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		dpaa2_timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 #define DPAA2_MBUF_TO_CONTIG_FD(_mbuf, _fd, _bpid)  do { \
 	DPAA2_SET_FD_ADDR(_fd, DPAA2_MBUF_VADDR_TO_IOVA(_mbuf)); \
 	DPAA2_SET_FD_LEN(_fd, _mbuf->data_len); \
@@ -109,9 +116,10 @@ dpaa2_dev_rx_parse_new(struct rte_mbuf *m, const struct qbman_fd *fd,
 	m->ol_flags |= PKT_RX_RSS_HASH;
 
 	if (dpaa2_enable_ts[m->port]) {
-		m->timestamp = annotation->word2;
-		m->ol_flags |= PKT_RX_TIMESTAMP;
-		DPAA2_PMD_DP_DEBUG("pkt timestamp:0x%" PRIx64 "", m->timestamp);
+		*dpaa2_timestamp_dynfield(m) = annotation->word2;
+		m->ol_flags |= dpaa2_timestamp_rx_dynflag;
+		DPAA2_PMD_DP_DEBUG("pkt timestamp:0x%" PRIx64 "",
+				*dpaa2_timestamp_dynfield(m));
 	}
 
 	DPAA2_PMD_DP_DEBUG("HW frc = 0x%x\t packet type =0x%x "
@@ -223,9 +231,12 @@ dpaa2_dev_rx_parse(struct rte_mbuf *mbuf, void *hw_annot_addr)
 	else if (BIT_ISSET_AT_POS(annotation->word8, DPAA2_ETH_FAS_L4CE))
 		mbuf->ol_flags |= PKT_RX_L4_CKSUM_BAD;
 
-	mbuf->ol_flags |= PKT_RX_TIMESTAMP;
-	mbuf->timestamp = annotation->word2;
-	DPAA2_PMD_DP_DEBUG("pkt timestamp: 0x%" PRIx64 "", mbuf->timestamp);
+	if (dpaa2_enable_ts[mbuf->port]) {
+		*dpaa2_timestamp_dynfield(mbuf) = annotation->word2;
+		mbuf->ol_flags |= dpaa2_timestamp_rx_dynflag;
+		DPAA2_PMD_DP_DEBUG("pkt timestamp: 0x%" PRIx64 "",
+				*dpaa2_timestamp_dynfield(mbuf));
+	}
 
 	/* Check detailed parsing requirement */
 	if (annotation->word3 & 0x7FFFFC3FFFF)
@@ -629,7 +640,7 @@ dpaa2_dev_prefetch_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		else
 			bufs[num_rx] = eth_fd_to_mbuf(fd, eth_data->port_id);
 #if defined(RTE_LIBRTE_IEEE1588)
-		priv->rx_timestamp = bufs[num_rx]->timestamp;
+		priv->rx_timestamp = *dpaa2_timestamp_dynfield(bufs[num_rx]);
 #endif
 
 		if (eth_data->dev_conf.rxmode.offloads &
-- 
2.28.0



* [dpdk-dev] [PATCH v3 06/16] net/mlx5: fix dynamic mbuf offset lookup check
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (4 preceding siblings ...)
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 05/16] net/dpaa2: " Thomas Monjalon
@ 2020-11-03  0:13   ` Thomas Monjalon
  2020-11-03  8:12     ` Slava Ovsiienko
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
                     ` (10 subsequent siblings)
  16 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:13 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, stable, Matan Azrad,
	Shahaf Shuler, Ori Kam

The functions rte_mbuf_dynfield_lookup() and rte_mbuf_dynflag_lookup()
can return an offset (or bit number) starting at 0, or a negative value on error.

In reality the first offsets are probably reserved forever,
but for the sake of strict API compliance,
the checks which considered 0 as an error are fixed.
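
To make the difference concrete, here is a small standalone sketch with a
mock lookup function. The registry, the flag names and the demo_* helpers are
assumptions for illustration only; they are not part of DPDK or of the mlx5
driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical registry standing in for the mbuf dynamic flag table. */
static const char *demo_flags[] = { "demo_first_flag", "demo_tx_timestamp" };

/* Return the 0-based bit number, or a negative value if not found. */
static int
demo_dynflag_lookup(const char *name)
{
	assert(name != NULL);
	for (int i = 0; i < 2; i++)
		if (strcmp(demo_flags[i], name) == 0)
			return i;
	return -1;
}

/* Old-style check: wrongly treats bit 0 as "not registered". */
static uint64_t
demo_mask_buggy(const char *name)
{
	int bit = demo_dynflag_lookup(name);

	return bit > 0 ? UINT64_C(1) << bit : 0;
}

/* Fixed check: only negative values mean failure. */
static uint64_t
demo_mask_fixed(const char *name)
{
	int bit = demo_dynflag_lookup(name);

	return bit >= 0 ? UINT64_C(1) << bit : 0;
}
```

With this mock, a flag registered at bit 0 gets an empty mask from the old
check and the mask 0x1 from the fixed one; both agree for every other bit
and for a missing name.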

Fixes: efa79e68c8cd ("net/mlx5: support fine grain dynamic flag")
Fixes: 3172c471b86f ("net/mlx5: prepare Tx queue structures to support timestamp")
Fixes: 0febfcce3693 ("net/mlx5: prepare Tx to support scheduling")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/mlx5/mlx5_rxtx.c    | 4 ++--
 drivers/net/mlx5/mlx5_trigger.c | 2 +-
 drivers/net/mlx5/mlx5_txq.c     | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index b530ff421f..e86468b67a 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -5661,9 +5661,9 @@ mlx5_select_tx_function(struct rte_eth_dev *dev)
 	}
 	if (tx_offloads & DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP &&
 	    rte_mbuf_dynflag_lookup
-			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) > 0 &&
+			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) >= 0 &&
 	    rte_mbuf_dynfield_lookup
-			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) > 0) {
+			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) >= 0) {
 		/* Offload configured, dynamic entities registered. */
 		olx |= MLX5_TXOFF_CONFIG_TXPP;
 	}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 7735f022a3..917b433c4a 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -302,7 +302,7 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	DRV_LOG(DEBUG, "port %u starting device", dev->data->port_id);
 	fine_inline = rte_mbuf_dynflag_lookup
 		(RTE_PMD_MLX5_FINE_GRANULARITY_INLINE, NULL);
-	if (fine_inline > 0)
+	if (fine_inline >= 0)
 		rte_net_mlx5_dynf_inline_mask = 1UL << fine_inline;
 	else
 		rte_net_mlx5_dynf_inline_mask = 0;
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index af84f5f72b..8ed2bcff7b 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -1305,7 +1305,7 @@ mlx5_txq_dynf_timestamp_set(struct rte_eth_dev *dev)
 				(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL);
 	off = rte_mbuf_dynfield_lookup
 				(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
-	if (nbit > 0 && off >= 0 && sh->txpp.refcnt)
+	if (nbit >= 0 && off >= 0 && sh->txpp.refcnt)
 		mask = 1ULL << nbit;
 	for (i = 0; i != priv->txqs_n; ++i) {
 		data = (*priv->txqs)[i];
-- 
2.28.0



* [dpdk-dev] [PATCH v3 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (5 preceding siblings ...)
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 06/16] net/mlx5: fix dynamic mbuf offset lookup check Thomas Monjalon
@ 2020-11-03  0:13   ` Thomas Monjalon
  2020-11-03  8:12     ` Slava Ovsiienko
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 08/16] net/nfb: " Thomas Monjalon
                     ` (9 subsequent siblings)
  16 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:13 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ruifeng Wang,
	David Christensen, Matan Azrad, Shahaf Shuler,
	Konstantin Ananyev

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

The dynamic offset and flag are stored in struct mlx5_rxq_data
to favor cache locality.
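
A simplified sketch of that idea, using mock structures rather than the real
struct mlx5_rxq_data and struct rte_mbuf (all demo_* names are hypothetical):
the Rx path reads the offset and the flag from the queue structure it already
touches, and the 0/1 hw_timestamp bit selects the flag without a branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_timestamp_t;

struct demo_mbuf {
	uint64_t ol_flags;    /* offload flags */
	uint64_t dynfield[2]; /* area shared by dynamic fields */
};

/* Per-queue copy of the registration result, next to other hot fields. */
struct demo_rxq {
	unsigned int hw_timestamp:1; /* timestamp offload enabled */
	int timestamp_offset;        /* dynamic field offset */
	uint64_t timestamp_rx_flag;  /* dynamic flag mask, 0 if unused */
};

/* Enable timestamping with a flag mask chosen by the caller. */
static void
demo_rxq_enable_timestamp(struct demo_rxq *rxq, uint64_t flag)
{
	rxq->hw_timestamp = 1;
	rxq->timestamp_offset = (int)offsetof(struct demo_mbuf, dynfield);
	rxq->timestamp_rx_flag = flag;
}

static inline void
demo_timestamp_set(struct demo_mbuf *m, int offset, demo_timestamp_t ts)
{
	*(demo_timestamp_t *)((uintptr_t)m + offset) = ts;
}

static void
demo_rx_fill(const struct demo_rxq *rxq, struct demo_mbuf *m, uint64_t hw_ts)
{
	/* hw_timestamp is 0 or 1, so the product is either 0 or the flag. */
	m->ol_flags |= rxq->hw_timestamp * rxq->timestamp_rx_flag;
	if (rxq->hw_timestamp) {
		assert(rxq->timestamp_offset >= 0);
		demo_timestamp_set(m, rxq->timestamp_offset, hw_ts);
	}
}
```

Keeping both values in the queue structure avoids loading separate global
variables on every burst; whether that is measurable depends on the workload.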

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
---
 drivers/net/mlx5/mlx5_rxq.c              |  8 +++++
 drivers/net/mlx5/mlx5_rxtx.c             |  4 +--
 drivers/net/mlx5/mlx5_rxtx.h             | 19 +++++++++++
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 41 +++++++++++-----------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 43 ++++++++++++------------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     | 35 +++++++++----------
 6 files changed, 90 insertions(+), 60 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index f1d8373079..52519910ee 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1492,7 +1492,15 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	mlx5_max_lro_msg_size_adjust(dev, idx, max_lro_size);
 	/* Toggle RX checksum offload if hardware supports it. */
 	tmpl->rxq.csum = !!(offloads & DEV_RX_OFFLOAD_CHECKSUM);
+	/* Configure Rx timestamp. */
 	tmpl->rxq.hw_timestamp = !!(offloads & DEV_RX_OFFLOAD_TIMESTAMP);
+	tmpl->rxq.timestamp_rx_flag = 0;
+	if (tmpl->rxq.hw_timestamp && rte_mbuf_dyn_rx_timestamp_register(
+			&tmpl->rxq.timestamp_offset,
+			&tmpl->rxq.timestamp_rx_flag) != 0) {
+		DRV_LOG(ERR, "Cannot register Rx timestamp field/flag");
+		goto error;
+	}
 	/* Configure VLAN stripping. */
 	tmpl->rxq.vlan_strip = !!(offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
 	/* By default, FCS (CRC) is stripped by hardware. */
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index e86468b67a..b577aab00b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1287,8 +1287,8 @@ rxq_cq_to_mbuf(struct mlx5_rxq_data *rxq, struct rte_mbuf *pkt,
 
 		if (rxq->rt_timestamp)
 			ts = mlx5_txpp_convert_rx_ts(rxq->sh, ts);
-		pkt->timestamp = ts;
-		pkt->ol_flags |= PKT_RX_TIMESTAMP;
+		mlx5_timestamp_set(pkt, rxq->timestamp_offset, ts);
+		pkt->ol_flags |= rxq->timestamp_rx_flag;
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 674296ee98..e9eca36b40 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -151,6 +151,8 @@ struct mlx5_rxq_data {
 	/* CQ (UAR) access lock required for 32bit implementations */
 #endif
 	uint32_t tunnel; /* Tunnel information. */
+	int timestamp_offset; /* Dynamic mbuf field for timestamp. */
+	uint64_t timestamp_rx_flag; /* Dynamic mbuf flag for timestamp. */
 	uint64_t flow_meta_mask;
 	int32_t flow_meta_offset;
 } __rte_cache_aligned;
@@ -681,4 +683,21 @@ mlx5_txpp_convert_tx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t mts)
 	return ci;
 }
 
+/**
+ * Set timestamp in mbuf dynamic field.
+ *
+ * @param mbuf
+ *   Structure to write into.
+ * @param offset
+ *   Dynamic field offset in mbuf structure.
+ * @param timestamp
+ *   Value to write.
+ */
+static __rte_always_inline void
+mlx5_timestamp_set(struct rte_mbuf *mbuf, int offset,
+		rte_mbuf_timestamp_t timestamp)
+{
+	*RTE_MBUF_DYNFIELD(mbuf, offset, rte_mbuf_timestamp_t *) = timestamp;
+}
+
 #endif /* RTE_PMD_MLX5_RXTX_H_ */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 6bf0c9b540..171d7bb0f8 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -330,13 +330,13 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	vector unsigned char ol_flags = (vector unsigned char)
 		(vector unsigned int){
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP};
+				rxq->hw_timestamp * rxq->timestamp_rx_flag};
 	vector unsigned char cv_flags;
 	const vector unsigned char zero = (vector unsigned char){0};
 	const vector unsigned char ptype_mask =
@@ -1025,31 +1025,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		/* D.5 fill in mbuf - rearm_data and packet_type. */
 		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
 
 				ts = rte_be_to_cpu_64(cq[pos].timestamp);
-				pkts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
-				pkts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
-				pkts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
-				pkts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				pkts[pos]->timestamp = rte_be_to_cpu_64
-						(cq[pos].timestamp);
-				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p1].timestamp);
-				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p2].timestamp);
-				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p3].timestamp);
+				mlx5_timestamp_set(pkts[pos], offset,
+					rte_be_to_cpu_64(cq[pos].timestamp));
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					rte_be_to_cpu_64(cq[pos + p1].timestamp));
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					rte_be_to_cpu_64(cq[pos + p2].timestamp));
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					rte_be_to_cpu_64(cq[pos + p3].timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index d122dad4fe..436b247ade 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -271,7 +271,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	uint32x4_t pinfo, cv_flags;
 	uint32x4_t ol_flags =
 		vdupq_n_u32(rxq->rss_hash * PKT_RX_RSS_HASH |
-			    rxq->hw_timestamp * PKT_RX_TIMESTAMP);
+			    rxq->hw_timestamp * rxq->timestamp_rx_flag);
 	const uint32x4_t ptype_ol_mask = { 0x106, 0x106, 0x106, 0x106 };
 	const uint8x16_t cv_flag_sel = {
 		0,
@@ -697,6 +697,7 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		rxq_cq_to_ptype_oflags_v(rxq, ptype_info, flow_tag,
 					 opcode, &elts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
@@ -704,36 +705,36 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 				ts = rte_be_to_cpu_64
 					(container_of(p0, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p1, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p2, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p3, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				elts[pos]->timestamp = rte_be_to_cpu_64
-					(container_of(p0, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 1]->timestamp = rte_be_to_cpu_64
-					(container_of(p1, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 2]->timestamp = rte_be_to_cpu_64
-					(container_of(p2, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 3]->timestamp = rte_be_to_cpu_64
-					(container_of(p3, struct mlx5_cqe,
-						      pkt_info)->timestamp);
+				mlx5_timestamp_set(elts[pos], offset,
+					rte_be_to_cpu_64(container_of(p0,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 1], offset,
+					rte_be_to_cpu_64(container_of(p1,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 2], offset,
+					rte_be_to_cpu_64(container_of(p2,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 3], offset,
+					rte_be_to_cpu_64(container_of(p3,
+					struct mlx5_cqe, pkt_info)->timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 0bbcbeefff..ae4439efc7 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -251,7 +251,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	__m128i pinfo0, pinfo1;
 	__m128i pinfo, ptype;
 	__m128i ol_flags = _mm_set1_epi32(rxq->rss_hash * PKT_RX_RSS_HASH |
-					  rxq->hw_timestamp * PKT_RX_TIMESTAMP);
+					  rxq->hw_timestamp * rxq->timestamp_rx_flag);
 	__m128i cv_flags;
 	const __m128i zero = _mm_setzero_si128();
 	const __m128i ptype_mask =
@@ -656,31 +656,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		/* D.5 fill in mbuf - rearm_data and packet_type. */
 		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
 
 				ts = rte_be_to_cpu_64(cq[pos].timestamp);
-				pkts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
-				pkts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
-				pkts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
-				pkts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				pkts[pos]->timestamp = rte_be_to_cpu_64
-						(cq[pos].timestamp);
-				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p1].timestamp);
-				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p2].timestamp);
-				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p3].timestamp);
+				mlx5_timestamp_set(pkts[pos], offset,
+					rte_be_to_cpu_64(cq[pos].timestamp));
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					rte_be_to_cpu_64(cq[pos + p1].timestamp));
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					rte_be_to_cpu_64(cq[pos + p2].timestamp));
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					rte_be_to_cpu_64(cq[pos + p3].timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
-- 
2.28.0



* [dpdk-dev] [PATCH v3 08/16] net/nfb: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (6 preceding siblings ...)
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-11-03  0:13   ` Thomas Monjalon
  2020-11-03 10:20     ` Olivier Matz
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 09/16] net/octeontx2: " Thomas Monjalon
                     ` (8 subsequent siblings)
  16 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:13 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Martin Spinler

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/nfb/nfb_rx.c | 15 ++++++++++++++-
 drivers/net/nfb/nfb_rx.h | 18 ++++++++++++++----
 2 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/drivers/net/nfb/nfb_rx.c b/drivers/net/nfb/nfb_rx.c
index d97179f818..d6d4ba9663 100644
--- a/drivers/net/nfb/nfb_rx.c
+++ b/drivers/net/nfb/nfb_rx.c
@@ -9,6 +9,9 @@
 #include "nfb_rx.h"
 #include "nfb.h"
 
+uint64_t nfb_timestamp_rx_dynflag;
+int nfb_timestamp_dynfield_offset = -1;
+
 static int
 timestamp_check_handler(__rte_unused const char *key,
 	const char *value, __rte_unused void *opaque)
@@ -24,6 +27,7 @@ static int
 nfb_check_timestamp(struct rte_devargs *devargs)
 {
 	struct rte_kvargs *kvlist;
+	int ret;
 
 	if (devargs == NULL)
 		return 0;
@@ -38,6 +42,7 @@ nfb_check_timestamp(struct rte_devargs *devargs)
 	}
 	/* Timestamps are enabled when there is
 	 * key-value pair: enable_timestamp=1
+	 * TODO: timestamp should be enabled with DEV_RX_OFFLOAD_TIMESTAMP
 	 */
 	if (rte_kvargs_process(kvlist, TIMESTAMP_ARG,
 		timestamp_check_handler, NULL) < 0) {
@@ -46,6 +51,14 @@ nfb_check_timestamp(struct rte_devargs *devargs)
 	}
 	rte_kvargs_free(kvlist);
 
+	ret = rte_mbuf_dyn_rx_timestamp_register(
+			&nfb_timestamp_dynfield_offset,
+			&nfb_timestamp_rx_dynflag);
+	if (ret != 0) {
+		RTE_LOG(ERR, PMD, "Cannot register Rx timestamp field/flag\n");
+		return -rte_errno;
+	}
+
 	return 1;
 }
 
@@ -125,7 +138,7 @@ nfb_eth_rx_queue_setup(struct rte_eth_dev *dev,
 	else
 		rte_free(rxq);
 
-	if (nfb_check_timestamp(dev->device->devargs))
+	if (nfb_check_timestamp(dev->device->devargs) > 0)
 		rxq->flags |= NFB_TIMESTAMP_FLAG;
 
 	return ret;
diff --git a/drivers/net/nfb/nfb_rx.h b/drivers/net/nfb/nfb_rx.h
index cf3899b2fb..e548226e0f 100644
--- a/drivers/net/nfb/nfb_rx.h
+++ b/drivers/net/nfb/nfb_rx.h
@@ -15,6 +15,16 @@
 
 #define NFB_TIMESTAMP_FLAG (1 << 0)
 
+extern uint64_t nfb_timestamp_rx_dynflag;
+extern int nfb_timestamp_dynfield_offset;
+
+static inline rte_mbuf_timestamp_t *
+nfb_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		nfb_timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 struct ndp_rx_queue {
 	struct nfb_device *nfb;	     /* nfb dev structure */
 	struct ndp_queue *queue;     /* rx queue */
@@ -191,15 +201,15 @@ nfb_eth_ndp_rx(void *queue,
 
 			if (timestamping_enabled) {
 				/* nanoseconds */
-				mbuf->timestamp =
+				*nfb_timestamp_dynfield(mbuf) =
 					rte_le_to_cpu_32(*((uint32_t *)
 					(packets[i].header + 4)));
-				mbuf->timestamp <<= 32;
+				*nfb_timestamp_dynfield(mbuf) <<= 32;
 				/* seconds */
-				mbuf->timestamp |=
+				*nfb_timestamp_dynfield(mbuf) |=
 					rte_le_to_cpu_32(*((uint32_t *)
 					(packets[i].header + 8)));
-				mbuf->ol_flags |= PKT_RX_TIMESTAMP;
+				mbuf->ol_flags |= nfb_timestamp_rx_dynflag;
 			}
 
 			bufs[num_rx++] = mbuf;
-- 
2.28.0



* [dpdk-dev] [PATCH v3 09/16] net/octeontx2: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (7 preceding siblings ...)
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 08/16] net/nfb: " Thomas Monjalon
@ 2020-11-03  0:14   ` Thomas Monjalon
  2020-11-03 10:52     ` Harman Kalra
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 10/16] net/pcap: " Thomas Monjalon
                     ` (7 subsequent siblings)
  16 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:14 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Nithin Dabilpuram,
	Kiran Kumar K

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

The dynamic offset and flag are stored in struct otx2_timesync_info
to favor cache locality.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/octeontx2/otx2_ethdev.c | 10 ++++++++++
 drivers/net/octeontx2/otx2_rx.h     | 19 ++++++++++++++++---
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
index cfb733a4b5..f6962be9b2 100644
--- a/drivers/net/octeontx2/otx2_ethdev.c
+++ b/drivers/net/octeontx2/otx2_ethdev.c
@@ -2219,6 +2219,16 @@ otx2_nix_dev_start(struct rte_eth_dev *eth_dev)
 	else
 		otx2_nix_timesync_disable(eth_dev);
 
+	if (dev->rx_offload_flags & NIX_RX_OFFLOAD_TSTAMP_F) {
+		rc = rte_mbuf_dyn_rx_timestamp_register(
+				&dev->tstamp.tstamp_dynfield_offset,
+				&dev->tstamp.rx_tstamp_dynflag);
+		if (rc != 0) {
+			otx2_err("Failed to register Rx timestamp field/flag");
+			return -rte_errno;
+		}
+	}
+
 	/* Update VF about data off shifted by 8 bytes if PTP already
 	 * enabled in PF owning this VF
 	 */
diff --git a/drivers/net/octeontx2/otx2_rx.h b/drivers/net/octeontx2/otx2_rx.h
index 61a5c436dd..926f614a4e 100644
--- a/drivers/net/octeontx2/otx2_rx.h
+++ b/drivers/net/octeontx2/otx2_rx.h
@@ -49,6 +49,8 @@ struct otx2_timesync_info {
 	uint64_t	rx_tstamp;
 	rte_iova_t	tx_tstamp_iova;
 	uint64_t	*tx_tstamp;
+	uint64_t	rx_tstamp_dynflag;
+	int		tstamp_dynfield_offset;
 	uint8_t		tx_ready;
 	uint8_t		rx_ready;
 } __rte_cache_aligned;
@@ -63,6 +65,14 @@ union mbuf_initializer {
 	uint64_t value;
 };
 
+static inline rte_mbuf_timestamp_t *
+otx2_timestamp_dynfield(struct rte_mbuf *mbuf,
+		struct otx2_timesync_info *info)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		info->tstamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static __rte_always_inline void
 otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
 			struct otx2_timesync_info *tstamp, const uint16_t flag,
@@ -77,15 +87,18 @@ otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
 		/* Reading the rx timestamp inserted by CGX, viz at
 		 * starting of the packet data.
 		 */
-		mbuf->timestamp = rte_be_to_cpu_64(*tstamp_ptr);
+		*otx2_timestamp_dynfield(mbuf, tstamp) =
+				rte_be_to_cpu_64(*tstamp_ptr);
 		/* PKT_RX_IEEE1588_TMST flag needs to be set only in case
 		 * PTP packets are received.
 		 */
 		if (mbuf->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC) {
-			tstamp->rx_tstamp = mbuf->timestamp;
+			tstamp->rx_tstamp =
+					*otx2_timestamp_dynfield(mbuf, tstamp);
 			tstamp->rx_ready = 1;
 			mbuf->ol_flags |= PKT_RX_IEEE1588_PTP |
-				PKT_RX_IEEE1588_TMST | PKT_RX_TIMESTAMP;
+				PKT_RX_IEEE1588_TMST |
+				tstamp->rx_tstamp_dynflag;
 		}
 	}
 }
-- 
2.28.0



* [dpdk-dev] [PATCH v3 10/16] net/pcap: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (8 preceding siblings ...)
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 09/16] net/octeontx2: " Thomas Monjalon
@ 2020-11-03  0:14   ` Thomas Monjalon
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 11/16] app/testpmd: " Thomas Monjalon
                     ` (6 subsequent siblings)
  16 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:14 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.
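
The value written to the dynamic field is unchanged: the pcap capture
time converted to microseconds. As a rough illustration only, the
conversion could be isolated as below; demo_pcap_ts_to_usec is a
hypothetical helper, not something this patch adds, and it takes plain
integers instead of a struct timeval to stay self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a (seconds, microseconds) capture time into a single
 * microsecond count, mirroring the expression used in eth_pcap_rx().
 * The cast to uint64_t happens before the multiplication so that
 * the product cannot overflow a narrower type.
 */
static inline uint64_t
demo_pcap_ts_to_usec(int64_t tv_sec, int64_t tv_usec)
{
	assert(tv_sec >= 0);
	assert(tv_usec >= 0 && tv_usec < 1000000);
	return (uint64_t)tv_sec * 1000000 + (uint64_t)tv_usec;
}
```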

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 drivers/net/pcap/rte_eth_pcap.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 34e82317b1..4e6d49370e 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -51,6 +51,9 @@ static uint64_t start_cycles;
 static uint64_t hz;
 static uint8_t iface_idx;
 
+static uint64_t timestamp_rx_dynflag;
+static int timestamp_dynfield_offset = -1;
+
 struct queue_stat {
 	volatile unsigned long pkts;
 	volatile unsigned long bytes;
@@ -265,9 +268,11 @@ eth_pcap_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		}
 
 		mbuf->pkt_len = (uint16_t)header.caplen;
-		mbuf->timestamp = (uint64_t)header.ts.tv_sec * 1000000
-							+ header.ts.tv_usec;
-		mbuf->ol_flags |= PKT_RX_TIMESTAMP;
+		*RTE_MBUF_DYNFIELD(mbuf, timestamp_dynfield_offset,
+			rte_mbuf_timestamp_t *) =
+				(uint64_t)header.ts.tv_sec * 1000000 +
+				header.ts.tv_usec;
+		mbuf->ol_flags |= timestamp_rx_dynflag;
 		mbuf->port = pcap_q->port_id;
 		bufs[num_rx] = mbuf;
 		num_rx++;
@@ -656,6 +661,15 @@ eth_dev_stop(struct rte_eth_dev *dev)
 static int
 eth_dev_configure(struct rte_eth_dev *dev __rte_unused)
 {
+	int ret;
+
+	ret = rte_mbuf_dyn_rx_timestamp_register(&timestamp_dynfield_offset,
+			&timestamp_rx_dynflag);
+	if (ret != 0) {
+		PMD_LOG(ERR, "Failed to register Rx timestamp field/flag");
+		return -rte_errno;
+	}
+
 	return 0;
 }
 
-- 
2.28.0



* [dpdk-dev] [PATCH v3 11/16] app/testpmd: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (9 preceding siblings ...)
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 10/16] net/pcap: " Thomas Monjalon
@ 2020-11-03  0:14   ` Thomas Monjalon
  2020-11-03 10:23     ` Olivier Matz
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 12/16] examples/rxtx_callbacks: switch timestamp to dynamic field Thomas Monjalon
                     ` (5 subsequent siblings)
  16 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:14 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.
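
Since testpmd does not register the flag itself, it looks it up lazily
and caches the result. A minimal sketch of that caching logic, without
DPDK, could look like this; demo_dynflag_lookup() and
demo_registered_bitnum are hypothetical stand-ins for
rte_mbuf_dynflag_lookup() and the registration state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical registry state: -1 means "flag not registered yet". */
static int demo_registered_bitnum = -1;
static int demo_lookup_calls;

/* Hypothetical stand-in for rte_mbuf_dynflag_lookup(). */
static int
demo_dynflag_lookup(void)
{
	demo_lookup_calls++;
	return demo_registered_bitnum;
}

/* Return true if the cached dynamic flag is set in ol_flags.
 * The lookup is retried on each call until it succeeds,
 * then the resulting mask is reused.
 */
static bool
demo_is_timestamp_enabled(uint64_t ol_flags)
{
	static uint64_t cached_flag;

	if (cached_flag == 0) {
		int bitnum = demo_dynflag_lookup();

		if (bitnum < 0)
			return false;
		assert(bitnum < 64);
		cached_flag = UINT64_C(1) << bitnum;
	}
	return (ol_flags & cached_flag) != 0;
}
```

Using 0 as the "not looked up" marker works because a valid flag mask
always has exactly one bit set.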

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 app/test-pmd/util.c | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 781a813759..eebb5166ad 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -5,6 +5,7 @@
 
 #include <stdio.h>
 
+#include <rte_bitops.h>
 #include <rte_net.h>
 #include <rte_mbuf.h>
 #include <rte_ether.h>
@@ -22,6 +23,40 @@ print_ether_addr(const char *what, const struct rte_ether_addr *eth_addr)
 	printf("%s%s", what, buf);
 }
 
+static inline bool
+is_timestamp_enabled(const struct rte_mbuf *mbuf)
+{
+	static uint64_t timestamp_rx_dynflag;
+
+	int timestamp_rx_dynflag_offset;
+
+	if (timestamp_rx_dynflag == 0) {
+		timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+		if (timestamp_rx_dynflag_offset < 0)
+			return false;
+		timestamp_rx_dynflag = RTE_BIT64(timestamp_rx_dynflag_offset);
+	}
+
+	return (mbuf->ol_flags & timestamp_rx_dynflag) != 0;
+}
+
+static inline rte_mbuf_timestamp_t
+get_timestamp(const struct rte_mbuf *mbuf)
+{
+	static int timestamp_dynfield_offset = -1;
+
+	if (timestamp_dynfield_offset < 0) {
+		timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+		if (timestamp_dynfield_offset < 0)
+			return 0;
+	}
+
+	return *RTE_MBUF_DYNFIELD(mbuf,
+			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static inline void
 dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
 	      uint16_t nb_pkts, int is_rx)
@@ -107,8 +142,8 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
 				printf("hash=0x%x ID=0x%x ",
 				       mb->hash.fdir.hash, mb->hash.fdir.id);
 		}
-		if (ol_flags & PKT_RX_TIMESTAMP)
-			printf(" - timestamp %"PRIu64" ", mb->timestamp);
+		if (is_timestamp_enabled(mb))
+			printf(" - timestamp %"PRIu64" ", get_timestamp(mb));
 		if (ol_flags & PKT_RX_QINQ)
 			printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x",
 			       mb->vlan_tci, mb->vlan_tci_outer);
-- 
2.28.0



* [dpdk-dev] [PATCH v3 12/16] examples/rxtx_callbacks: switch timestamp to dynamic field
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (10 preceding siblings ...)
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 11/16] app/testpmd: " Thomas Monjalon
@ 2020-11-03  0:14   ` Thomas Monjalon
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 13/16] ethdev: add doxygen comment for Rx timestamp API Thomas Monjalon
                     ` (4 subsequent siblings)
  16 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:14 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, John McNamara

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 examples/rxtx_callbacks/main.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
index 1a8e7d47d9..35c6c39807 100644
--- a/examples/rxtx_callbacks/main.c
+++ b/examples/rxtx_callbacks/main.c
@@ -19,6 +19,15 @@
 #define MBUF_CACHE_SIZE 250
 #define BURST_SIZE 32
 
+static int hwts_dynfield_offset = -1;
+
+static inline rte_mbuf_timestamp_t *
+hwts_field(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+			hwts_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 typedef uint64_t tsc_t;
 static int tsc_dynfield_offset = -1;
 
@@ -77,7 +86,7 @@ calc_latency(uint16_t port, uint16_t qidx __rte_unused,
 	for (i = 0; i < nb_pkts; i++) {
 		cycles += now - *tsc_field(pkts[i]);
 		if (hw_timestamping)
-			queue_ticks += ticks - pkts[i]->timestamp;
+			queue_ticks += ticks - *hwts_field(pkts[i]);
 	}
 
 	latency_numbers.total_cycles += cycles;
@@ -141,6 +150,11 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
 			return -1;
 		}
 		port_conf.rxmode.offloads |= DEV_RX_OFFLOAD_TIMESTAMP;
+		rte_mbuf_dyn_rx_timestamp_register(&hwts_dynfield_offset, NULL);
+		if (hwts_dynfield_offset < 0) {
+			printf("ERROR: Failed to register timestamp field\n");
+			return -rte_errno;
+		}
 	}
 
 	retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
-- 
2.28.0



* [dpdk-dev] [PATCH v3 13/16] ethdev: add doxygen comment for Rx timestamp API
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (11 preceding siblings ...)
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 12/16] examples/rxtx_callbacks: switch timestamp to dynamic field Thomas Monjalon
@ 2020-11-03  0:14   ` Thomas Monjalon
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 14/16] mbuf: remove deprecated timestamp field Thomas Monjalon
                     ` (3 subsequent siblings)
  16 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:14 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The offload flag DEV_RX_OFFLOAD_TIMESTAMP had no documentation.
After switching to dynamic mbuf flag and field,
it becomes even more important to make the feature behaviour explicit.

A doxygen comment for the timesync API was mentioning
the deprecated timestamp field, so it is also updated.
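
The conversion described in that comment can be sketched as a small
helper. This is only an illustration of the formula from the doxygen
text; demo_timestamp_to_sec and its parameter names are hypothetical,
and the caller is assumed to have obtained base_clock, base_time_sec
and freq_hz beforehand (for instance with rte_eth_read_clock).

```c
#include <assert.h>
#include <stdint.h>

/* Convert a raw device timestamp into seconds:
 *   base_time_sec + (ts - base_clock) / freq_hz
 * The subtraction is done on unsigned values as in the doxygen example,
 * so ts is assumed to be taken at or after base_clock.
 */
static inline double
demo_timestamp_to_sec(uint64_t ts, uint64_t base_clock,
		double base_time_sec, double freq_hz)
{
	assert(freq_hz > 0.0);
	assert(ts >= base_clock);
	return base_time_sec + (double)(ts - base_clock) / freq_hz;
}
```

As the comment itself warns, a single frequency measurement gives poor
accuracy; the helper does nothing to improve that.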

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_ethdev/rte_ethdev.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index ba997f16ce..1fc5f662fa 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1344,6 +1344,11 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_VLAN_EXTEND	0x00000400
 #define DEV_RX_OFFLOAD_JUMBO_FRAME	0x00000800
 #define DEV_RX_OFFLOAD_SCATTER		0x00002000
+/**
+ * Timestamp is set by the driver in RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+ * and RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME is set in ol_flags.
+ * The mbuf field and flag are registered when the offload is configured.
+ */
 #define DEV_RX_OFFLOAD_TIMESTAMP	0x00004000
 #define DEV_RX_OFFLOAD_SECURITY         0x00008000
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
@@ -4647,7 +4652,7 @@ int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
  * rte_eth_read_clock(port, base_clock);
  *
  * Then, convert the raw mbuf timestamp with:
- * base_time_sec + (double)(mbuf->timestamp - base_clock) / freq;
+ * base_time_sec + (double)(*timestamp_dynfield(mbuf) - base_clock) / freq;
  *
  * This simple example will not provide a very good accuracy. One must
  * at least measure multiple times the frequency and do a regression.
-- 
2.28.0



* [dpdk-dev] [PATCH v3 14/16] mbuf: remove deprecated timestamp field
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (12 preceding siblings ...)
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 13/16] ethdev: add doxygen comment for Rx timestamp API Thomas Monjalon
@ 2020-11-03  0:14   ` Thomas Monjalon
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 15/16] mbuf: add Tx timestamp registration helper Thomas Monjalon
                     ` (2 subsequent siblings)
  16 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:14 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ajit Khaparde,
	Ray Kinsella, Neil Horman

As announced in the deprecation note, the field timestamp
is removed to give more space to the dynamic fields.
The related offload flag PKT_RX_TIMESTAMP is also removed.

This is what the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
 0    void *                            buf_addr;         /*   0 +  8 */
 1    rte_iova_t                        buf_iova;         /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
 2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
 3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
 4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
 5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
 5.5  uint64_t             union        hash;             /*  44 +  8 */
 6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
 7    uint64_t                          dynfield0[1];     /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
 8    struct rte_mempool *              pool;             /*  64 +  8 */
 9    struct rte_mbuf *                 next;             /*  72 +  8 */
10    uint64_t             union        tx_offload;       /*  80 +  8 */
11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
12    uint16_t                          priv_size;        /*  96 +  2 */
      uint16_t                          timesync;         /*  98 +  2 */
12.5  uint32_t                          dynfield1[7];     /* 100 + 28 */
16    /* --- END                                             128      */
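
The offsets above can be cross-checked with a standalone mock of the
layout. The struct below is not struct rte_mbuf: it is a hypothetical
replica assuming 64-bit pointers and a 64-bit rte_iova_t (both modelled
as uint64_t), with the unions flattened to same-sized integer members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical replica of the layout table, one member per row. */
struct demo_mbuf_layout {
	uint64_t buf_addr;		/*   0 +  8 */
	uint64_t buf_iova;		/*   8 +  8 */
	uint16_t data_off;		/*  16 +  2 */
	uint16_t refcnt;		/*  18 +  2 */
	uint16_t nb_segs;		/*  20 +  2 */
	uint16_t port;			/*  22 +  2 */
	uint64_t ol_flags;		/*  24 +  8 */
	uint32_t packet_type;		/*  32 +  4 */
	uint32_t pkt_len;		/*  36 +  4 */
	uint16_t data_len;		/*  40 +  2 */
	uint16_t vlan_tci;		/*  42 +  2 */
	uint32_t hash[2];		/*  44 +  8, 4-byte aligned */
	uint16_t vlan_tci_outer;	/*  52 +  2 */
	uint16_t buf_len;		/*  54 +  2 */
	uint64_t dynfield0[1];		/*  56 +  8 */
	uint64_t pool;			/*  64 +  8 */
	uint64_t next;			/*  72 +  8 */
	uint64_t tx_offload;		/*  80 +  8 */
	uint64_t shinfo;		/*  88 +  8 */
	uint16_t priv_size;		/*  96 +  2 */
	uint16_t timesync;		/*  98 +  2 */
	uint32_t dynfield1[7];		/* 100 + 28 */
};

/* Abort if the mock drifts from the offsets listed in the table. */
static void
demo_check_layout(void)
{
	assert(offsetof(struct demo_mbuf_layout, ol_flags) == 24);
	assert(offsetof(struct demo_mbuf_layout, hash) == 44);
	assert(offsetof(struct demo_mbuf_layout, dynfield0) == 56);
	assert(offsetof(struct demo_mbuf_layout, pool) == 64);
	assert(offsetof(struct demo_mbuf_layout, dynfield1) == 100);
	assert(sizeof(struct demo_mbuf_layout) == 128);
}

/* Total bytes available to dynamic fields in this layout. */
static size_t
demo_dynfield_bytes(void)
{
	return sizeof(((struct demo_mbuf_layout *)0)->dynfield0) +
		sizeof(((struct demo_mbuf_layout *)0)->dynfield1);
}
```

With this layout the dynamic fields get 8 bytes in the first cache line
plus 28 bytes in the second one.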

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
---
 app/test/test_mbuf.c                   |  1 -
 doc/guides/rel_notes/deprecation.rst   |  4 ----
 doc/guides/rel_notes/release_20_11.rst |  4 ++++
 lib/librte_mbuf/rte_mbuf.c             |  2 --
 lib/librte_mbuf/rte_mbuf.h             |  2 +-
 lib/librte_mbuf/rte_mbuf_core.h        | 12 ++----------
 lib/librte_mbuf/rte_mbuf_dyn.c         |  1 +
 7 files changed, 8 insertions(+), 18 deletions(-)

diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 3a13cf4e1f..a40f7d4883 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1621,7 +1621,6 @@ test_get_rx_ol_flag_name(void)
 		VAL_NAME(PKT_RX_FDIR_FLX),
 		VAL_NAME(PKT_RX_QINQ_STRIPPED),
 		VAL_NAME(PKT_RX_LRO),
-		VAL_NAME(PKT_RX_TIMESTAMP),
 		VAL_NAME(PKT_RX_SEC_OFFLOAD),
 		VAL_NAME(PKT_RX_SEC_OFFLOAD_FAILED),
 		VAL_NAME(PKT_RX_OUTER_L4_CKSUM_BAD),
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index fe3fd3956c..22aecf0bab 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -84,10 +84,6 @@ Deprecation Notices
 * mbuf: Some fields will be converted to dynamic API in DPDK 20.11
   in order to reserve more space for the dynamic fields, as explained in
   `this presentation <https://www.youtube.com/watch?v=Ttl6MlhmzWY>`_.
-  The following static fields will be moved as dynamic:
-
-  - ``timestamp``
-
   As a consequence, the layout of the ``struct rte_mbuf`` will be re-arranged,
   avoiding impact on vectorized implementation of the driver datapaths,
   while evaluating performance gains of a better use of the first cache line.
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 88b9086390..6455022169 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -448,6 +448,10 @@ API Changes
 * mbuf: Removed the field ``seqn`` from the structure ``rte_mbuf``.
   It is replaced with dynamic fields.
 
+* mbuf: Removed the field ``timestamp`` from the structure ``rte_mbuf``.
+  It is replaced with the dynamic field RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+  which was previously used only for Tx.
+
 * pci: Removed the ``rte_kernel_driver`` enum defined in rte_dev.h and
   replaced with a private enum in the PCI subsystem.
 
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8a456e5e64..09d93e6899 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -764,7 +764,6 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
 	case PKT_RX_QINQ_STRIPPED: return "PKT_RX_QINQ_STRIPPED";
 	case PKT_RX_QINQ: return "PKT_RX_QINQ";
 	case PKT_RX_LRO: return "PKT_RX_LRO";
-	case PKT_RX_TIMESTAMP: return "PKT_RX_TIMESTAMP";
 	case PKT_RX_SEC_OFFLOAD: return "PKT_RX_SEC_OFFLOAD";
 	case PKT_RX_SEC_OFFLOAD_FAILED: return "PKT_RX_SEC_OFFLOAD_FAILED";
 	case PKT_RX_OUTER_L4_CKSUM_BAD: return "PKT_RX_OUTER_L4_CKSUM_BAD";
@@ -808,7 +807,6 @@ rte_get_rx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
 		{ PKT_RX_FDIR_FLX, PKT_RX_FDIR_FLX, NULL },
 		{ PKT_RX_QINQ_STRIPPED, PKT_RX_QINQ_STRIPPED, NULL },
 		{ PKT_RX_LRO, PKT_RX_LRO, NULL },
-		{ PKT_RX_TIMESTAMP, PKT_RX_TIMESTAMP, NULL },
 		{ PKT_RX_SEC_OFFLOAD, PKT_RX_SEC_OFFLOAD, NULL },
 		{ PKT_RX_SEC_OFFLOAD_FAILED, PKT_RX_SEC_OFFLOAD_FAILED, NULL },
 		{ PKT_RX_QINQ, PKT_RX_QINQ, NULL },
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index a1414ed7cd..17e0b205c0 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1095,6 +1095,7 @@ rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
 static inline void
 rte_mbuf_dynfield_copy(struct rte_mbuf *mdst, const struct rte_mbuf *msrc)
 {
+	memcpy(&mdst->dynfield0, msrc->dynfield0, sizeof(mdst->dynfield0));
 	memcpy(&mdst->dynfield1, msrc->dynfield1, sizeof(mdst->dynfield1));
 }
 
@@ -1108,7 +1109,6 @@ __rte_pktmbuf_copy_hdr(struct rte_mbuf *mdst, const struct rte_mbuf *msrc)
 	mdst->tx_offload = msrc->tx_offload;
 	mdst->hash = msrc->hash;
 	mdst->packet_type = msrc->packet_type;
-	mdst->timestamp = msrc->timestamp;
 	rte_mbuf_dynfield_copy(mdst, msrc);
 }
 
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 3fb5abda3c..38e24a580d 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -149,10 +149,7 @@ extern "C" {
  */
 #define PKT_RX_LRO           (1ULL << 16)
 
-/**
- * Indicate that the timestamp field in the mbuf is valid.
- */
-#define PKT_RX_TIMESTAMP     (1ULL << 17)
+/* There is no flag defined at offset 17. It is free for any future use. */
 
 /**
  * Indicate that security offload processing was applied on the RX packet.
@@ -589,12 +586,7 @@ struct rte_mbuf {
 
 	uint16_t buf_len;         /**< Length of segment buffer. */
 
-	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
-	 * are not normalized but are always the same for a given port.
-	 * Some devices allow to query rte_eth_read_clock that will return the
-	 * current device timestamp.
-	 */
-	uint64_t timestamp;
+	uint64_t dynfield0[1]; /**< Reserved for dynamic fields. */
 
 	/* second cache line - fields only used in slow path or on TX */
 	RTE_MARKER cacheline1 __rte_cache_min_aligned;
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
index e279b23aea..8168271951 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.c
+++ b/lib/librte_mbuf/rte_mbuf_dyn.c
@@ -125,6 +125,7 @@ init_shared_mem(void)
 		 * rte_mbuf_dynfield_copy().
 		 */
 		memset(shm, 0, sizeof(*shm));
+		mark_free(dynfield0);
 		mark_free(dynfield1);
 
 		/* init free_flags */
-- 
2.28.0



* [dpdk-dev] [PATCH v3 15/16] mbuf: add Tx timestamp registration helper
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (13 preceding siblings ...)
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 14/16] mbuf: remove deprecated timestamp field Thomas Monjalon
@ 2020-11-03  0:14   ` Thomas Monjalon
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 16/16] ethdev: include mbuf registration in Tx timestamp API Thomas Monjalon
  2020-11-03  9:00   ` [dpdk-dev] [PATCH v3 00/16] remove mbuf timestamp David Marchand
  16 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:14 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ray Kinsella, Neil Horman

The function rte_mbuf_dyn_tx_timestamp_register()
can be used to register the required field and flag.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 lib/librte_mbuf/rte_mbuf_dyn.c |  7 +++++++
 lib/librte_mbuf/rte_mbuf_dyn.h | 22 ++++++++++++++++++++++
 lib/librte_mbuf/version.map    |  1 +
 3 files changed, 30 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
index 8168271951..ba083853e4 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.c
+++ b/lib/librte_mbuf/rte_mbuf_dyn.c
@@ -613,3 +613,10 @@ rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag)
 	return rte_mbuf_dyn_timestamp_register(field_offset, rx_flag,
 			"Rx", RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME);
 }
+
+int
+rte_mbuf_dyn_tx_timestamp_register(int *field_offset, uint64_t *tx_flag)
+{
+	return rte_mbuf_dyn_timestamp_register(field_offset, tx_flag,
+			"Tx", RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME);
+}
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 2e729ddaca..d88e7bacc5 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -304,4 +304,26 @@ int rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag);
  */
 #define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Register dynamic mbuf field and flag for Tx timestamp.
+ *
+ * @param field_offset
+ *   Pointer to the offset of the registered mbuf field, can be NULL.
+ *   The same field is shared for Rx and Tx timestamp.
+ * @param tx_flag
+ *   Pointer to the mask of the registered offload flag, can be NULL.
+ * @return
+ *   0 on success, -1 otherwise.
+ *   Possible values for rte_errno:
+ *   - EEXIST: already registered with different parameters.
+ *   - EPERM: called from a secondary process.
+ *   - ENOENT: no more field or flag available.
+ *   - ENOMEM: allocation failure.
+ */
+__rte_experimental
+int rte_mbuf_dyn_tx_timestamp_register(int *field_offset, uint64_t *tx_flag);
+
 #endif
diff --git a/lib/librte_mbuf/version.map b/lib/librte_mbuf/version.map
index 0b66668bff..b7d98e7eb1 100644
--- a/lib/librte_mbuf/version.map
+++ b/lib/librte_mbuf/version.map
@@ -43,6 +43,7 @@ EXPERIMENTAL {
 	rte_mbuf_dynflag_register_bitnum;
 	rte_mbuf_dyn_dump;
 	rte_mbuf_dyn_rx_timestamp_register;
+	rte_mbuf_dyn_tx_timestamp_register;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
 	rte_pktmbuf_pool_create_extbuf;
-- 
2.28.0



* [dpdk-dev] [PATCH v3 16/16] ethdev: include mbuf registration in Tx timestamp API
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (14 preceding siblings ...)
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 15/16] mbuf: add Tx timestamp registration helper Thomas Monjalon
@ 2020-11-03  0:14   ` Thomas Monjalon
  2020-11-03  7:54     ` Slava Ovsiienko
  2020-11-03  9:00   ` [dpdk-dev] [PATCH v3 00/16] remove mbuf timestamp David Marchand
  16 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  0:14 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Matan Azrad, Shahaf Shuler

Previously, the Tx timestamp field and flag were registered in testpmd,
as described in mlx5 guide.
For consistency between Rx and Tx timestamps,
managing mbuf registrations inside the driver, as properly documented,
is a simpler expectation.

The only driver to support this feature (mlx5) is updated
as well as the testpmd application.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 app/test-pmd/config.c          | 38 ----------------------------------
 doc/guides/nics/mlx5.rst       |  5 ++---
 drivers/net/mlx5/mlx5_ethdev.c |  8 ++++++-
 lib/librte_ethdev/rte_ethdev.h |  6 +++++-
 4 files changed, 14 insertions(+), 43 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 1668ae3238..9a2baf16fe 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -3955,44 +3955,6 @@ show_tx_pkt_times(void)
 void
 set_tx_pkt_times(unsigned int *tx_times)
 {
-	uint16_t port_id;
-	int offload_found = 0;
-	int offset;
-	int flag;
-
-	static const struct rte_mbuf_dynfield desc_offs = {
-		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
-		.size = sizeof(uint64_t),
-		.align = __alignof__(uint64_t),
-	};
-	static const struct rte_mbuf_dynflag desc_flag = {
-		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
-	};
-
-	RTE_ETH_FOREACH_DEV(port_id) {
-		struct rte_eth_dev_info dev_info = { 0 };
-		int ret;
-
-		ret = rte_eth_dev_info_get(port_id, &dev_info);
-		if (ret == 0 && dev_info.tx_offload_capa &
-				DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) {
-			offload_found = 1;
-			break;
-		}
-	}
-	if (!offload_found) {
-		printf("No device supporting Tx timestamp scheduling found, "
-		       "dynamic flag and field not registered\n");
-		return;
-	}
-	offset = rte_mbuf_dynfield_register(&desc_offs);
-	if (offset < 0 && rte_errno != EEXIST)
-		printf("Dynamic timestamp field registration error: %d",
-		       rte_errno);
-	flag = rte_mbuf_dynflag_register(&desc_flag);
-	if (flag < 0 && rte_errno != EEXIST)
-		printf("Dynamic timestamp flag registration error: %d",
-		       rte_errno);
 	tx_pkt_times_inter = tx_times[0];
 	tx_pkt_times_intra = tx_times[1];
 }
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index afa65a1379..fa8b13dd1b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -237,9 +237,8 @@ Limitations
   ``txq_inline_max`` and ``txq_inline_mpw`` devargs keys.
 
 - To provide the packet send scheduling on mbuf timestamps the ``tx_pp``
-  parameter should be specified, RTE_MBUF_DYNFIELD_TIMESTAMP_NAME and
-  RTE_MBUF_DYNFLAG_TIMESTAMP_NAME should be registered by application.
-  When PMD sees the RTE_MBUF_DYNFLAG_TIMESTAMP_NAME set on the packet
+  parameter should be specified.
+  When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME set on the packet
   being sent it tries to synchronize the time of packet appearing on
   the wire with the specified packet timestamp. It the specified one
   is in the past it should be ignored, if one is in the distant future
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 7631f644b2..76ef02664f 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -88,7 +88,13 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
 
 	if (dev->data->dev_conf.rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG)
 		dev->data->dev_conf.rxmode.offloads |= DEV_RX_OFFLOAD_RSS_HASH;
-
+	if ((dev->data->dev_conf.txmode.offloads &
+			DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) &&
+			rte_mbuf_dyn_tx_timestamp_register(NULL, NULL) != 0) {
+		DRV_LOG(ERR, "port %u cannot register Tx timestamp field/flag",
+			dev->data->port_id);
+		return -rte_errno;
+	}
 	memcpy(priv->rss_conf.rss_key,
 	       use_app_rss_key ?
 	       dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key :
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 1fc5f662fa..619cbe521e 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1413,7 +1413,11 @@ struct rte_eth_conf {
 #define DEV_TX_OFFLOAD_IP_TNL_TSO       0x00080000
 /** Device supports outer UDP checksum */
 #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
-/** Device supports send on timestamp */
+/**
+ * Device sends on time read from RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+ * if RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME is set in ol_flags.
+ * The mbuf field and flag are registered when the offload is configured.
+ */
 #define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
 /*
  * If new Tx offload capabilities are defined, they also must be
-- 
2.28.0



* Re: [dpdk-dev] [PATCH v3 16/16] ethdev: include mbuf registration in Tx timestamp API
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 16/16] ethdev: include mbuf registration in Tx timestamp API Thomas Monjalon
@ 2020-11-03  7:54     ` Slava Ovsiienko
  0 siblings, 0 replies; 170+ messages in thread
From: Slava Ovsiienko @ 2020-11-03  7:54 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Matan Azrad, Shahaf Shuler

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Tuesday, November 3, 2020 2:14
> To: dev@dpdk.org
> Cc: ferruh.yigit@intel.com; david.marchand@redhat.com;
> bruce.richardson@intel.com; olivier.matz@6wind.com;
> andrew.rybchenko@oktetlabs.ru; jerinj@marvell.com; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Wenzhuo Lu <wenzhuo.lu@intel.com>; Beilei Xing
> <beilei.xing@intel.com>; Bernard Iremonger <bernard.iremonger@intel.com>;
> Matan Azrad <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>
> Subject: [PATCH v3 16/16] ethdev: include mbuf registration in Tx timestamp
> API
> 
> Previously, the Tx timestamp field and flag were registered in testpmd, as
> described in mlx5 guide.
> For consistency between Rx and Tx timestamps, managing mbuf registrations
> inside the driver, as properly documented, is a simpler expectation.
> 
> The only driver to support this feature (mlx5) is updated as well as the testpmd
> application.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>


* Re: [dpdk-dev] [PATCH v3 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-11-03  8:12     ` Slava Ovsiienko
  0 siblings, 0 replies; 170+ messages in thread
From: Slava Ovsiienko @ 2020-11-03  8:12 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, Ruifeng Wang, David Christensen,
	Matan Azrad, Shahaf Shuler, Konstantin Ananyev

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Tuesday, November 3, 2020 2:14
> To: dev@dpdk.org
> Cc: ferruh.yigit@intel.com; david.marchand@redhat.com;
> bruce.richardson@intel.com; olivier.matz@6wind.com;
> andrew.rybchenko@oktetlabs.ru; jerinj@marvell.com; Slava Ovsiienko
> <viacheslavo@nvidia.com>; Ruifeng Wang <ruifeng.wang@arm.com>; David
> Christensen <drc@linux.vnet.ibm.com>; Matan Azrad <matan@nvidia.com>;
> Shahaf Shuler <shahafs@nvidia.com>; Konstantin Ananyev
> <konstantin.ananyev@intel.com>
> Subject: [PATCH v3 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf
> field
> 
> The mbuf timestamp is moved to a dynamic field in order to allow removal of
> the deprecated static field.
> The related mbuf flag is also replaced.
> 
> The dynamic offset and flag are stored in struct mlx5_rxq_data to favor cache
> locality.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

> ---
>  drivers/net/mlx5/mlx5_rxq.c              |  8 +++++
>  drivers/net/mlx5/mlx5_rxtx.c             |  4 +--
>  drivers/net/mlx5/mlx5_rxtx.h             | 19 +++++++++++
>  drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 41 +++++++++++-----------
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 43 ++++++++++++------------
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h     | 35 +++++++++----------
>  6 files changed, 90 insertions(+), 60 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index f1d8373079..52519910ee 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -1492,7 +1492,15 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
>  	mlx5_max_lro_msg_size_adjust(dev, idx, max_lro_size);
>  	/* Toggle RX checksum offload if hardware supports it. */
>  	tmpl->rxq.csum = !!(offloads & DEV_RX_OFFLOAD_CHECKSUM);
> +	/* Configure Rx timestamp. */
>  	tmpl->rxq.hw_timestamp = !!(offloads & DEV_RX_OFFLOAD_TIMESTAMP);
> +	tmpl->rxq.timestamp_rx_flag = 0;
> +	if (tmpl->rxq.hw_timestamp && rte_mbuf_dyn_rx_timestamp_register(
> +			&tmpl->rxq.timestamp_offset,
> +			&tmpl->rxq.timestamp_rx_flag) != 0) {
> +		DRV_LOG(ERR, "Cannot register Rx timestamp field/flag");
> +		goto error;
> +	}
>  	/* Configure VLAN stripping. */
>  	tmpl->rxq.vlan_strip = !!(offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
>  	/* By default, FCS (CRC) is stripped by hardware. */
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index e86468b67a..b577aab00b 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -1287,8 +1287,8 @@ rxq_cq_to_mbuf(struct mlx5_rxq_data *rxq, struct rte_mbuf *pkt,
> 
>  		if (rxq->rt_timestamp)
>  			ts = mlx5_txpp_convert_rx_ts(rxq->sh, ts);
> -		pkt->timestamp = ts;
> -		pkt->ol_flags |= PKT_RX_TIMESTAMP;
> +		mlx5_timestamp_set(pkt, rxq->timestamp_offset, ts);
> +		pkt->ol_flags |= rxq->timestamp_rx_flag;
>  	}
>  }
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> index 674296ee98..e9eca36b40 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.h
> +++ b/drivers/net/mlx5/mlx5_rxtx.h
> @@ -151,6 +151,8 @@ struct mlx5_rxq_data {
>  	/* CQ (UAR) access lock required for 32bit implementations */
>  #endif
>  	uint32_t tunnel; /* Tunnel information. */
> +	int timestamp_offset; /* Dynamic mbuf field for timestamp. */
> +	uint64_t timestamp_rx_flag; /* Dynamic mbuf flag for timestamp. */
>  	uint64_t flow_meta_mask;
>  	int32_t flow_meta_offset;
>  } __rte_cache_aligned;
> @@ -681,4 +683,21 @@ mlx5_txpp_convert_tx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t mts)
>  	return ci;
>  }
> 
> +/**
> + * Set timestamp in mbuf dynamic field.
> + *
> + * @param mbuf
> + *   Structure to write into.
> + * @param offset
> + *   Dynamic field offset in mbuf structure.
> + * @param timestamp
> + *   Value to write.
> + */
> +static __rte_always_inline void
> +mlx5_timestamp_set(struct rte_mbuf *mbuf, int offset,
> +		rte_mbuf_timestamp_t timestamp)
> +{
> +	*RTE_MBUF_DYNFIELD(mbuf, offset, rte_mbuf_timestamp_t *) = timestamp;
> +}
> +
>  #endif /* RTE_PMD_MLX5_RXTX_H_ */
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> index 6bf0c9b540..171d7bb0f8 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> @@ -330,13 +330,13 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
>  	vector unsigned char ol_flags = (vector unsigned char)
>  		(vector unsigned int){
>  			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag,
>  			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag,
>  			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag,
>  			rxq->rss_hash * PKT_RX_RSS_HASH |
> -				rxq->hw_timestamp * PKT_RX_TIMESTAMP};
> +				rxq->hw_timestamp * rxq->timestamp_rx_flag};
>  	vector unsigned char cv_flags;
>  	const vector unsigned char zero = (vector unsigned char){0};
>  	const vector unsigned char ptype_mask =
> @@ -1025,31 +1025,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
>  		/* D.5 fill in mbuf - rearm_data and packet_type. */
>  		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
>  		if (rxq->hw_timestamp) {
> +			int offset = rxq->timestamp_offset;
>  			if (rxq->rt_timestamp) {
>  				struct mlx5_dev_ctx_shared *sh = rxq->sh;
>  				uint64_t ts;
> 
>  				ts = rte_be_to_cpu_64(cq[pos].timestamp);
> -				pkts[pos]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
> -				pkts[pos + 1]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 1], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
> -				pkts[pos + 2]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 2], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
> -				pkts[pos + 3]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 3], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  			} else {
> -				pkts[pos]->timestamp = rte_be_to_cpu_64
> -						(cq[pos].timestamp);
> -				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p1].timestamp);
> -				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p2].timestamp);
> -				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p3].timestamp);
> +				mlx5_timestamp_set(pkts[pos], offset,
> +					rte_be_to_cpu_64(cq[pos].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 1], offset,
> +					rte_be_to_cpu_64(cq[pos + p1].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 2], offset,
> +					rte_be_to_cpu_64(cq[pos + p2].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 3], offset,
> +					rte_be_to_cpu_64(cq[pos + p3].timestamp));
>  			}
>  		}
>  		if (rxq->dynf_meta) {
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> index d122dad4fe..436b247ade 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> @@ -271,7 +271,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
>  	uint32x4_t pinfo, cv_flags;
>  	uint32x4_t ol_flags =
>  		vdupq_n_u32(rxq->rss_hash * PKT_RX_RSS_HASH |
> -			    rxq->hw_timestamp * PKT_RX_TIMESTAMP);
> +			    rxq->hw_timestamp * rxq->timestamp_rx_flag);
>  	const uint32x4_t ptype_ol_mask = { 0x106, 0x106, 0x106, 0x106 };
>  	const uint8x16_t cv_flag_sel = {
>  		0,
> @@ -697,6 +697,7 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
>  		rxq_cq_to_ptype_oflags_v(rxq, ptype_info, flow_tag,
>  					 opcode, &elts[pos]);
>  		if (rxq->hw_timestamp) {
> +			int offset = rxq->timestamp_offset;
>  			if (rxq->rt_timestamp) {
>  				struct mlx5_dev_ctx_shared *sh = rxq->sh;
>  				uint64_t ts;
> @@ -704,36 +705,36 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
>  				ts = rte_be_to_cpu_64
>  					(container_of(p0, struct mlx5_cqe,
>  						      pkt_info)->timestamp);
> -				elts[pos]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(elts[pos], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64
>  					(container_of(p1, struct mlx5_cqe,
>  						      pkt_info)->timestamp);
> -				elts[pos + 1]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(elts[pos + 1], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64
>  					(container_of(p2, struct mlx5_cqe,
>  						      pkt_info)->timestamp);
> -				elts[pos + 2]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(elts[pos + 2], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64
>  					(container_of(p3, struct mlx5_cqe,
>  						      pkt_info)->timestamp);
> -				elts[pos + 3]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(elts[pos + 3], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  			} else {
> -				elts[pos]->timestamp = rte_be_to_cpu_64
> -					(container_of(p0, struct mlx5_cqe,
> -						      pkt_info)->timestamp);
> -				elts[pos + 1]->timestamp = rte_be_to_cpu_64
> -					(container_of(p1, struct mlx5_cqe,
> -						      pkt_info)->timestamp);
> -				elts[pos + 2]->timestamp = rte_be_to_cpu_64
> -					(container_of(p2, struct mlx5_cqe,
> -						      pkt_info)->timestamp);
> -				elts[pos + 3]->timestamp = rte_be_to_cpu_64
> -					(container_of(p3, struct mlx5_cqe,
> -						      pkt_info)->timestamp);
> +				mlx5_timestamp_set(elts[pos], offset,
> +					rte_be_to_cpu_64(container_of(p0,
> +					struct mlx5_cqe, pkt_info)->timestamp));
> +				mlx5_timestamp_set(elts[pos + 1], offset,
> +					rte_be_to_cpu_64(container_of(p1,
> +					struct mlx5_cqe, pkt_info)->timestamp));
> +				mlx5_timestamp_set(elts[pos + 2], offset,
> +					rte_be_to_cpu_64(container_of(p2,
> +					struct mlx5_cqe, pkt_info)->timestamp));
> +				mlx5_timestamp_set(elts[pos + 3], offset,
> +					rte_be_to_cpu_64(container_of(p3,
> +					struct mlx5_cqe, pkt_info)->timestamp));
>  			}
>  		}
>  		if (rxq->dynf_meta) {
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> index 0bbcbeefff..ae4439efc7 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> @@ -251,7 +251,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
>  	__m128i pinfo0, pinfo1;
>  	__m128i pinfo, ptype;
>  	__m128i ol_flags = _mm_set1_epi32(rxq->rss_hash * PKT_RX_RSS_HASH |
> -					  rxq->hw_timestamp * PKT_RX_TIMESTAMP);
> +					  rxq->hw_timestamp * rxq->timestamp_rx_flag);
>  	__m128i cv_flags;
>  	const __m128i zero = _mm_setzero_si128();
>  	const __m128i ptype_mask =
> @@ -656,31 +656,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
>  		/* D.5 fill in mbuf - rearm_data and packet_type. */
>  		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
>  		if (rxq->hw_timestamp) {
> +			int offset = rxq->timestamp_offset;
>  			if (rxq->rt_timestamp) {
>  				struct mlx5_dev_ctx_shared *sh = rxq->sh;
>  				uint64_t ts;
> 
>  				ts = rte_be_to_cpu_64(cq[pos].timestamp);
> -				pkts[pos]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
> -				pkts[pos + 1]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 1], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
> -				pkts[pos + 2]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 2], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
> -				pkts[pos + 3]->timestamp =
> -					mlx5_txpp_convert_rx_ts(sh, ts);
> +				mlx5_timestamp_set(pkts[pos + 3], offset,
> +					mlx5_txpp_convert_rx_ts(sh, ts));
>  			} else {
> -				pkts[pos]->timestamp = rte_be_to_cpu_64
> -						(cq[pos].timestamp);
> -				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p1].timestamp);
> -				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p2].timestamp);
> -				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
> -						(cq[pos + p3].timestamp);
> +				mlx5_timestamp_set(pkts[pos], offset,
> +					rte_be_to_cpu_64(cq[pos].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 1], offset,
> +					rte_be_to_cpu_64(cq[pos + p1].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 2], offset,
> +					rte_be_to_cpu_64(cq[pos + p2].timestamp));
> +				mlx5_timestamp_set(pkts[pos + 3], offset,
> +					rte_be_to_cpu_64(cq[pos + p3].timestamp));
>  			}
>  		}
>  		if (rxq->dynf_meta) {
> --
> 2.28.0
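
For readers following the mlx5 change: the quoted helper writes the
timestamp through a byte offset obtained at registration time, not
through a named struct member. A minimal standalone sketch of that idea
follows. It does not use DPDK; demo_mbuf, DEMO_DYNFIELD and the
demo_timestamp_* helpers are hypothetical stand-ins modelled on the
RTE_MBUF_DYNFIELD usage in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf: a fixed member plus a
 * scratch area where "dynamic" fields are placed at run time. */
struct demo_mbuf {
	uint64_t ol_flags;
	uint64_t dyn_area[4];
};

typedef uint64_t demo_timestamp_t;

/* Same idea as RTE_MBUF_DYNFIELD in the patch: base address plus byte
 * offset, cast to the requested pointer type. */
#define DEMO_DYNFIELD(m, offset, type) \
	((type)((uintptr_t)(m) + (offset)))

/* Write the timestamp into the field located at the given offset. */
static inline void
demo_timestamp_set(struct demo_mbuf *m, int offset, demo_timestamp_t ts)
{
	*DEMO_DYNFIELD(m, offset, demo_timestamp_t *) = ts;
}

/* Read the timestamp back from the same offset. */
static inline demo_timestamp_t
demo_timestamp_get(const struct demo_mbuf *m, int offset)
{
	return *DEMO_DYNFIELD(m, offset, const demo_timestamp_t *);
}
```

The offset is assumed to be suitably aligned and inside dyn_area; in the
patch that guarantee comes from the registration call, which is why the
driver stores the returned offset in the Rx queue structure.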


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v3 06/16] net/mlx5: fix dynamic mbuf offset lookup check
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 06/16] net/mlx5: fix dynamic mbuf offset lookup check Thomas Monjalon
@ 2020-11-03  8:12     ` Slava Ovsiienko
  0 siblings, 0 replies; 170+ messages in thread
From: Slava Ovsiienko @ 2020-11-03  8:12 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, stable, Matan Azrad, Shahaf Shuler,
	Ori Kam

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Tuesday, November 3, 2020 2:14
> To: dev@dpdk.org
> Cc: ferruh.yigit@intel.com; david.marchand@redhat.com;
> bruce.richardson@intel.com; olivier.matz@6wind.com;
> andrew.rybchenko@oktetlabs.ru; jerinj@marvell.com; Slava Ovsiienko
> <viacheslavo@nvidia.com>; stable@dpdk.org; Matan Azrad
> <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>; Ori Kam
> <orika@mellanox.com>
> Subject: [PATCH v3 06/16] net/mlx5: fix dynamic mbuf offset lookup check
> 
> The functions rte_mbuf_dynfield_lookup() and rte_mbuf_dynflag_lookup() can
> return an offset starting with 0 or a negative error code.
> 
> In reality the first offsets are probably reserved forever, but for the sake of
> strict API compliance, the checks which considered 0 as an error are fixed.
> 
> Fixes: efa79e68c8cd ("net/mlx5: support fine grain dynamic flag")
> Fixes: 3172c471b86f ("net/mlx5: prepare Tx queue structures to support
> timestamp")
> Fixes: 0febfcce3693 ("net/mlx5: prepare Tx to support scheduling")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

> ---
>  drivers/net/mlx5/mlx5_rxtx.c    | 4 ++--
>  drivers/net/mlx5/mlx5_trigger.c | 2 +-
>  drivers/net/mlx5/mlx5_txq.c     | 2 +-
>  3 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index b530ff421f..e86468b67a 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -5661,9 +5661,9 @@ mlx5_select_tx_function(struct rte_eth_dev *dev)
>  	}
>  	if (tx_offloads & DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP &&
>  	    rte_mbuf_dynflag_lookup
> -			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) > 0 &&
> +			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) >= 0 &&
>  	    rte_mbuf_dynfield_lookup
> -			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) > 0) {
> +			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) >= 0) {
>  		/* Offload configured, dynamic entities registered. */
>  		olx |= MLX5_TXOFF_CONFIG_TXPP;
>  	}
> diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
> index 7735f022a3..917b433c4a 100644
> --- a/drivers/net/mlx5/mlx5_trigger.c
> +++ b/drivers/net/mlx5/mlx5_trigger.c
> @@ -302,7 +302,7 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>  	DRV_LOG(DEBUG, "port %u starting device", dev->data->port_id);
>  	fine_inline = rte_mbuf_dynflag_lookup
>  		(RTE_PMD_MLX5_FINE_GRANULARITY_INLINE, NULL);
> -	if (fine_inline > 0)
> +	if (fine_inline >= 0)
>  		rte_net_mlx5_dynf_inline_mask = 1UL << fine_inline;
>  	else
>  		rte_net_mlx5_dynf_inline_mask = 0;
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index af84f5f72b..8ed2bcff7b 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -1305,7 +1305,7 @@ mlx5_txq_dynf_timestamp_set(struct rte_eth_dev *dev)
>  				(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL);
>  	off = rte_mbuf_dynfield_lookup
>  				(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
> -	if (nbit > 0 && off >= 0 && sh->txpp.refcnt)
> +	if (nbit >= 0 && off >= 0 && sh->txpp.refcnt)
>  		mask = 1ULL << nbit;
>  	for (i = 0; i != priv->txqs_n; ++i) {
>  		data = (*priv->txqs)[i];
> --
> 2.28.0
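
To illustrate why the comparison matters: when a lookup returns a bit
number starting at 0 and a negative value on failure, a "> 0" check
silently drops the first flag. The sketch below is standalone and does
not call DPDK; demo_flag_lookup and the two mask helpers are
hypothetical names that only mimic the return convention described in
the commit message.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical registry: the table index is the flag's bit number. */
static const char *const demo_flags[] = { "demo_first", "demo_second" };

/* Return the bit number (>= 0) of a registered flag, or -1 if the
 * name is unknown. */
static int
demo_flag_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(demo_flags) / sizeof(demo_flags[0]); i++)
		if (strcmp(demo_flags[i], name) == 0)
			return (int)i;
	return -1;
}

/* Old-style check: bit number 0 is mistaken for an error. */
static uint64_t
demo_mask_buggy(const char *name)
{
	int nbit = demo_flag_lookup(name);

	return nbit > 0 ? UINT64_C(1) << nbit : 0;
}

/* Fixed check: only negative values are errors. */
static uint64_t
demo_mask_fixed(const char *name)
{
	int nbit = demo_flag_lookup(name);

	return nbit >= 0 ? UINT64_C(1) << nbit : 0;
}
```

With this table, demo_mask_buggy("demo_first") yields an empty mask
while demo_mask_fixed("demo_first") yields bit 0, which is the
difference the three hunks above correct.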


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/16] remove mbuf timestamp
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
                     ` (15 preceding siblings ...)
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 16/16] ethdev: include mbuf registration in Tx timestamp API Thomas Monjalon
@ 2020-11-03  9:00   ` David Marchand
  16 siblings, 0 replies; 170+ messages in thread
From: David Marchand @ 2020-11-03  9:00 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Yigit, Ferruh, Bruce Richardson, Olivier Matz,
	Andrew Rybchenko, Jerin Jacob Kollanukkaran, Slava Ovsiienko

On Tue, Nov 3, 2020 at 1:14 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> The mbuf field timestamp was announced to be removed for three reasons:
>   - a dynamic field already exist, used for Tx only
>   - this field always used 8 bytes even if unneeded
>   - this field is in the first half (cacheline) of mbuf
>
> After this series, the dynamic field timestamp is used for both Rx and Tx
> with separate dynamic flags to distinguish when the value is meaningful
> without resetting the field during forwarding.
>
> As a consequence, 8 bytes can be re-allocated to dynamic fields
> in the first half of mbuf structure.
> It is still open to change more the mbuf layout.
>
> This mbuf layout change is important to allow adding more features
> (consuming more dynamic fields) during the next year,
> and can allow performance improvements with new usages in the first half.

For the series:
Acked-by: David Marchand <david.marchand@redhat.com>


-- 
David Marchand


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/16] net/dpaa2: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 05/16] net/dpaa2: " Thomas Monjalon
@ 2020-11-03  9:18     ` Hemant Agrawal
  0 siblings, 0 replies; 170+ messages in thread
From: Hemant Agrawal @ 2020-11-03  9:18 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Sachin Saxena (OSS)

Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v3 02/16] mbuf: add Rx timestamp flag and helpers
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 02/16] mbuf: add Rx timestamp flag and helpers Thomas Monjalon
@ 2020-11-03  9:33     ` Olivier Matz
  2020-11-03  9:59       ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Olivier Matz @ 2020-11-03  9:33 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	andrew.rybchenko, jerinj, viacheslavo, Ray Kinsella, Neil Horman

Hi Thomas,

On Tue, Nov 03, 2020 at 01:13:53AM +0100, Thomas Monjalon wrote:
> There is already a dynamic field for timestamp,
> used only for Tx scheduling with the dedicated Tx offload flag.
> The same field can be used for Rx timestamp filled by drivers.
> 
> A new dynamic flag is defined for Rx usage.
> A new function wraps the registration of both field and Rx flag.
> The type rte_mbuf_timestamp_t is defined for the API users.
> 
> After migrating all Rx timestamp usages, it will be possible
> to remove the deprecated timestamp field.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  lib/librte_mbuf/rte_mbuf_dyn.c | 43 ++++++++++++++++++++++++++++++++++
>  lib/librte_mbuf/rte_mbuf_dyn.h | 33 ++++++++++++++++++++++----
>  lib/librte_mbuf/version.map    |  1 +
>  3 files changed, 72 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
> index 538a43f695..e279b23aea 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.c
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.c
> @@ -13,6 +13,7 @@
>  #include <rte_errno.h>
>  #include <rte_malloc.h>
>  #include <rte_string_fns.h>
> +#include <rte_bitops.h>
>  #include <rte_mbuf.h>
>  #include <rte_mbuf_dyn.h>
>  
> @@ -569,3 +570,45 @@ void rte_mbuf_dyn_dump(FILE *out)
>  
>  	rte_mcfg_tailq_write_unlock();
>  }
> +
> +static int
> +rte_mbuf_dyn_timestamp_register(int *field_offset, uint64_t *flag,
> +		const char *direction, const char *flag_name)
> +{
> +	static const struct rte_mbuf_dynfield field_desc = {
> +		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
> +		.size = sizeof(rte_mbuf_timestamp_t),
> +		.align = __alignof__(rte_mbuf_timestamp_t),
> +	};
> +	struct rte_mbuf_dynflag flag_desc;
> +	int offset;
> +
> +	offset = rte_mbuf_dynfield_register(&field_desc);
> +	if (offset < 0) {
> +		RTE_LOG(ERR, MBUF,
> +			"Failed to register mbuf field for timestamp\n");
> +		return -1;
> +	}
> +	if (field_offset != NULL)
> +		*field_offset = offset;
> +
> +	strlcpy(flag_desc.name, flag_name, sizeof flag_desc.name);

The rest of the flag_desc structure (the "flags" field) is not
initialized to 0: strlcpy() only writes the name.

I suggest doing it at declaration:

	struct rte_mbuf_dynflag flag_desc = { 0 };
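
A small standalone sketch of the point, assuming a descriptor with a
name array followed by a flags word (demo_dynflag and demo_fill_desc
are hypothetical names, not the DPDK definitions). Initializing with
{ 0 } at declaration zeroes every member, so copying the name afterwards
cannot leave stack garbage in the other fields.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical descriptor: a name plus a flags word. */
struct demo_dynflag {
	char name[64];
	unsigned int flags;
};

/* Build a descriptor for flag_name and copy it to *out. */
static void
demo_fill_desc(struct demo_dynflag *out, const char *flag_name)
{
	/* { 0 } zero-initializes all members, including flags. */
	struct demo_dynflag desc = { 0 };

	/* Only the name is written explicitly; snprintf truncates and
	 * always NUL-terminates within the given size. */
	snprintf(desc.name, sizeof(desc.name), "%s", flag_name);
	*out = desc;
}
```

Without the initializer, desc.flags would hold an indeterminate value
when the descriptor is handed to the registration function.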


> +	offset = rte_mbuf_dynflag_register(&flag_desc);
> +	if (offset < 0) {
> +		RTE_LOG(ERR, MBUF,
> +			"Failed to register mbuf flag for %s timestamp\n",
> +			direction);
> +		return -1;
> +	}
> +	if (flag != NULL)
> +		*flag = RTE_BIT64(offset);
> +
> +	return 0;
> +}
> +
> +int
> +rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag)
> +{
> +	return rte_mbuf_dyn_timestamp_register(field_offset, rx_flag,
> +			"Rx", RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME);
> +}
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 0ebac88b83..2e729ddaca 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -258,13 +258,36 @@ void rte_mbuf_dyn_dump(FILE *out);
>   * timestamp. The dynamic Tx timestamp flag tells whether the field contains
>   * actual timestamp value for the packets being sent, this value can be
>   * used by PMD to schedule packet sending.
> - *
> - * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> - * and obsoleting, the dedicated Rx timestamp flag is supposed to be
> - * introduced and the shared dynamic timestamp field will be used
> - * to handle the timestamps on receiving datapath as well.
>   */
>  #define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
> +typedef uint64_t rte_mbuf_timestamp_t;
> +
> +/**
> + * Indicate that the timestamp field in the mbuf was filled by the driver.
> + */
> +#define RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME "rte_dynflag_rx_timestamp"
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Register dynamic mbuf field and flag for Rx timestamp.
> + *
> + * @param field_offset
> + *   Pointer to the offset of the registered mbuf field, can be NULL.
> + *   The same field is shared for Rx and Tx timestamp.
> + * @param rx_flag
> + *   Pointer to the mask of the registered offload flag, can be NULL.
> + * @return
> + *   0 on success, -1 otherwise.
> + *   Possible values for rte_errno:
> + *   - EEXIST: already registered with different parameters.
> + *   - EPERM: called from a secondary process.
> + *   - ENOENT: no more field or flag available.
> + *   - ENOMEM: allocation failure.
> + */
> +__rte_experimental
> +int rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag);
>  
>  /**
>   * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
> diff --git a/lib/librte_mbuf/version.map b/lib/librte_mbuf/version.map
> index a011aaead3..0b66668bff 100644
> --- a/lib/librte_mbuf/version.map
> +++ b/lib/librte_mbuf/version.map
> @@ -42,6 +42,7 @@ EXPERIMENTAL {
>  	rte_mbuf_dynflag_register;
>  	rte_mbuf_dynflag_register_bitnum;
>  	rte_mbuf_dyn_dump;
> +	rte_mbuf_dyn_rx_timestamp_register;
>  	rte_pktmbuf_copy;
>  	rte_pktmbuf_free_bulk;
>  	rte_pktmbuf_pool_create_extbuf;
> -- 
> 2.28.0
> 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v3 02/16] mbuf: add Rx timestamp flag and helpers
  2020-11-03  9:33     ` Olivier Matz
@ 2020-11-03  9:59       ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03  9:59 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	andrew.rybchenko, jerinj, viacheslavo, Ray Kinsella, Neil Horman

03/11/2020 10:33, Olivier Matz:
> On Tue, Nov 03, 2020 at 01:13:53AM +0100, Thomas Monjalon wrote:
> > +static int
> > +rte_mbuf_dyn_timestamp_register(int *field_offset, uint64_t *flag,
> > +		const char *direction, const char *flag_name)
> > +{
> > +	static const struct rte_mbuf_dynfield field_desc = {
> > +		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
> > +		.size = sizeof(rte_mbuf_timestamp_t),
> > +		.align = __alignof__(rte_mbuf_timestamp_t),
> > +	};
> > +	struct rte_mbuf_dynflag flag_desc;
> > +	int offset;
> > +
> > +	offset = rte_mbuf_dynfield_register(&field_desc);
> > +	if (offset < 0) {
> > +		RTE_LOG(ERR, MBUF,
> > +			"Failed to register mbuf field for timestamp\n");
> > +		return -1;
> > +	}
> > +	if (field_offset != NULL)
> > +		*field_offset = offset;
> > +
> > +	strlcpy(flag_desc.name, flag_name, sizeof flag_desc.name);
> 
> The rest of the flag_desc structure (the "flags" field) is not
> initialized to 0: strlcpy() only writes the name.
> 
> I suggest doing it at declaration:
> 
> 	struct rte_mbuf_dynflag flag_desc = { 0 };

Yes, I forgot. Thanks for catching it.




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v3 08/16] net/nfb: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:13   ` [dpdk-dev] [PATCH v3 08/16] net/nfb: " Thomas Monjalon
@ 2020-11-03 10:20     ` Olivier Matz
  0 siblings, 0 replies; 170+ messages in thread
From: Olivier Matz @ 2020-11-03 10:20 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	andrew.rybchenko, jerinj, viacheslavo, Martin Spinler

On Tue, Nov 03, 2020 at 01:13:59AM +0100, Thomas Monjalon wrote:
> The mbuf timestamp is moved to a dynamic field
> in order to allow removal of the deprecated static field.
> The related mbuf flag is also replaced.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  drivers/net/nfb/nfb_rx.c | 15 ++++++++++++++-
>  drivers/net/nfb/nfb_rx.h | 18 ++++++++++++++----
>  2 files changed, 28 insertions(+), 5 deletions(-)

<...>

> index cf3899b2fb..e548226e0f 100644
> --- a/drivers/net/nfb/nfb_rx.h
> +++ b/drivers/net/nfb/nfb_rx.h
> @@ -15,6 +15,16 @@
>  
>  #define NFB_TIMESTAMP_FLAG (1 << 0)
>  
> +extern uint64_t nfb_timestamp_rx_dynflag;
> +extern int nfb_timestamp_dynfield_offset;
> +
> +static inline rte_mbuf_timestamp_t *
> +nfb_timestamp_dynfield(struct rte_mbuf *mbuf)
> +{
> +	return RTE_MBUF_DYNFIELD(mbuf,
> +		nfb_timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
> +}
> +
>  struct ndp_rx_queue {
>  	struct nfb_device *nfb;	     /* nfb dev structure */
>  	struct ndp_queue *queue;     /* rx queue */
> @@ -191,15 +201,15 @@ nfb_eth_ndp_rx(void *queue,
>  
>  			if (timestamping_enabled) {
>  				/* nanoseconds */
> -				mbuf->timestamp =
> +				*nfb_timestamp_dynfield(mbuf) =
>  					rte_le_to_cpu_32(*((uint32_t *)
>  					(packets[i].header + 4)));
> -				mbuf->timestamp <<= 32;
> +				*nfb_timestamp_dynfield(mbuf) <<= 32;
>  				/* seconds */
> -				mbuf->timestamp |=
> +				*nfb_timestamp_dynfield(mbuf) |=
>  					rte_le_to_cpu_32(*((uint32_t *)
>  					(packets[i].header + 8)));
> -				mbuf->ol_flags |= PKT_RX_TIMESTAMP;
> +				mbuf->ol_flags |= nfb_timestamp_rx_dynflag;
>  			}
>  
>  			bufs[num_rx++] = mbuf;

I think it would be better with a local variable.
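
One possible reading of this suggestion, as a standalone sketch: build
the 64-bit value in a local variable and store it to the field once,
instead of dereferencing the dynamic field three times. The names below
are hypothetical, the byte-order conversion done in the patch is left
out, and the two 32-bit inputs simply stand for the two header words
read by the driver.

```c
#include <stdint.h>

/* Combine two 32-bit words: the first lands in the upper half, the
 * second in the lower half, mirroring the shift-then-or sequence in
 * the quoted code. */
static inline uint64_t
demo_build_timestamp(uint32_t first_word, uint32_t second_word)
{
	uint64_t ts;

	ts = first_word;
	ts <<= 32;
	ts |= second_word;
	return ts;
}

/* Single store to the destination field. */
static inline void
demo_store_timestamp(uint64_t *field, uint32_t first_word,
		uint32_t second_word)
{
	*field = demo_build_timestamp(first_word, second_word);
}
```

In the driver, the destination would be the pointer returned by
nfb_timestamp_dynfield(mbuf), computed once per packet.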

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v3 11/16] app/testpmd: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 11/16] app/testpmd: " Thomas Monjalon
@ 2020-11-03 10:23     ` Olivier Matz
  0 siblings, 0 replies; 170+ messages in thread
From: Olivier Matz @ 2020-11-03 10:23 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger

On Tue, Nov 03, 2020 at 01:14:02AM +0100, Thomas Monjalon wrote:
> The mbuf timestamp is moved to a dynamic field
> in order to allow removal of the deprecated static field.
> The related mbuf flag is also replaced.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  app/test-pmd/util.c | 39 +++++++++++++++++++++++++++++++++++++--
>  1 file changed, 37 insertions(+), 2 deletions(-)
> 
> diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
> index 781a813759..eebb5166ad 100644
> --- a/app/test-pmd/util.c
> +++ b/app/test-pmd/util.c
> @@ -5,6 +5,7 @@
>  
>  #include <stdio.h>
>  
> +#include <rte_bitops.h>
>  #include <rte_net.h>
>  #include <rte_mbuf.h>
>  #include <rte_ether.h>
> @@ -22,6 +23,40 @@ print_ether_addr(const char *what, const struct rte_ether_addr *eth_addr)
>  	printf("%s%s", what, buf);
>  }
>  
> +static inline bool
> +is_timestamp_enabled(const struct rte_mbuf *mbuf)
> +{
> +	static uint64_t timestamp_rx_dynflag;
> +
> +	int timestamp_rx_dynflag_offset;

unneeded blank line

> +
> +	if (timestamp_rx_dynflag == 0) {
> +		timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
> +				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
> +		if (timestamp_rx_dynflag_offset < 0)
> +			return false;
> +		timestamp_rx_dynflag = RTE_BIT64(timestamp_rx_dynflag_offset);
> +	}
> +
> +	return (mbuf->ol_flags & timestamp_rx_dynflag) != 0;
> +}
> +
> +static inline rte_mbuf_timestamp_t
> +get_timestamp(const struct rte_mbuf *mbuf)
> +{
> +	static int timestamp_dynfield_offset = -1;
> +
> +	if (timestamp_dynfield_offset < 0) {
> +		timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
> +				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
> +		if (timestamp_dynfield_offset < 0)
> +			return 0;
> +	}
> +
> +	return *RTE_MBUF_DYNFIELD(mbuf,
> +			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
> +}
> +
>  static inline void
>  dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
>  	      uint16_t nb_pkts, int is_rx)
> @@ -107,8 +142,8 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
>  				printf("hash=0x%x ID=0x%x ",
>  				       mb->hash.fdir.hash, mb->hash.fdir.id);
>  		}
> -		if (ol_flags & PKT_RX_TIMESTAMP)
> -			printf(" - timestamp %"PRIu64" ", mb->timestamp);
> +		if (is_timestamp_enabled(mb))
> +			printf(" - timestamp %"PRIu64" ", get_timestamp(mb));
>  		if (ol_flags & PKT_RX_QINQ)
>  			printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x",
>  			       mb->vlan_tci, mb->vlan_tci_outer);
> -- 
> 2.28.0
> 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v3 09/16] net/octeontx2: switch Rx timestamp to dynamic mbuf field
  2020-11-03  0:14   ` [dpdk-dev] [PATCH v3 09/16] net/octeontx2: " Thomas Monjalon
@ 2020-11-03 10:52     ` Harman Kalra
  2020-11-03 11:22       ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Harman Kalra @ 2020-11-03 10:52 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, andrew.rybchenko, jerinj, viacheslavo,
	Nithin Dabilpuram, Kiran Kumar K

On Tue, Nov 03, 2020 at 01:14:00AM +0100, Thomas Monjalon wrote:
> The mbuf timestamp is moved to a dynamic field
> in order to allow removal of the deprecated static field.
> The related mbuf flag is also replaced.
> 
> The dynamic offset and flag are stored in struct otx2_timesync_info
> to favor cache locality.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Hi Thomas,

   With the following changes, ptpclient and testpmd (ieee1588 mode) are
   crashing for us. I am debugging the issue and will update soon.
  ------------------
   Steps to reproduce:
   1. Testpmd:
      ./dpdk-testpmd -c 0xffff01 -n 4 -w 0002:05:00.0 -- -i
      --port-topology=loop
      testpmd> set fwd ieee1588
      testpmd> set port 0 ptype_mask 0xf
      testpmd> start

      I am sending ptp packets using scapy from the peer:
      >>> p = Ether(src='98:03:9b:67:b0:d0', dst='FA:62:0C:27:AD:BC',
		      >>> type=35063)/Raw(load='\x00\x02')
      >>> sendp (p, iface="p5p2")

      I am observing seg fault even for 1 packet.

    2. ./ptpclient -l 1 -n 4 -w 0002:05:00.0 -- -p 0xf    -- on board
       ptp4l -E -2 -H -i p5p2  -m -q -p /dev/ptp4     ... on peer

Thanks
Harman

> ---
>  drivers/net/octeontx2/otx2_ethdev.c | 10 ++++++++++
>  drivers/net/octeontx2/otx2_rx.h     | 19 ++++++++++++++++---
>  2 files changed, 26 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
> index cfb733a4b5..f6962be9b2 100644
> --- a/drivers/net/octeontx2/otx2_ethdev.c
> +++ b/drivers/net/octeontx2/otx2_ethdev.c
> @@ -2219,6 +2219,16 @@ otx2_nix_dev_start(struct rte_eth_dev *eth_dev)
>  	else
>  		otx2_nix_timesync_disable(eth_dev);
>  
> +	if (dev->rx_offload_flags & NIX_RX_OFFLOAD_TSTAMP_F) {
> +		rc = rte_mbuf_dyn_rx_timestamp_register(
> +				&dev->tstamp.tstamp_dynfield_offset,
> +				&dev->tstamp.rx_tstamp_dynflag);
> +		if (rc != 0) {
> +			otx2_err("Failed to register Rx timestamp field/flag");
> +			return -rte_errno;
> +		}
> +	}
> +
>  	/* Update VF about data off shifted by 8 bytes if PTP already
>  	 * enabled in PF owning this VF
>  	 */
> diff --git a/drivers/net/octeontx2/otx2_rx.h b/drivers/net/octeontx2/otx2_rx.h
> index 61a5c436dd..926f614a4e 100644
> --- a/drivers/net/octeontx2/otx2_rx.h
> +++ b/drivers/net/octeontx2/otx2_rx.h
> @@ -49,6 +49,8 @@ struct otx2_timesync_info {
>  	uint64_t	rx_tstamp;
>  	rte_iova_t	tx_tstamp_iova;
>  	uint64_t	*tx_tstamp;
> +	uint64_t	rx_tstamp_dynflag;
> +	int		tstamp_dynfield_offset;
>  	uint8_t		tx_ready;
>  	uint8_t		rx_ready;
>  } __rte_cache_aligned;
> @@ -63,6 +65,14 @@ union mbuf_initializer {
>  	uint64_t value;
>  };
>  
> +static inline rte_mbuf_timestamp_t *
> +otx2_timestamp_dynfield(struct rte_mbuf *mbuf,
> +		struct otx2_timesync_info *info)
> +{
> +	return RTE_MBUF_DYNFIELD(mbuf,
> +		info->tstamp_dynfield_offset, rte_mbuf_timestamp_t *);
> +}
> +
>  static __rte_always_inline void
>  otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
>  			struct otx2_timesync_info *tstamp, const uint16_t flag,
> @@ -77,15 +87,18 @@ otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
>  		/* Reading the rx timestamp inserted by CGX, viz at
>  		 * starting of the packet data.
>  		 */
> -		mbuf->timestamp = rte_be_to_cpu_64(*tstamp_ptr);
> +		*otx2_timestamp_dynfield(mbuf, tstamp) =
> +				rte_be_to_cpu_64(*tstamp_ptr);
>  		/* PKT_RX_IEEE1588_TMST flag needs to be set only in case
>  		 * PTP packets are received.
>  		 */
>  		if (mbuf->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC) {
> -			tstamp->rx_tstamp = mbuf->timestamp;
> +			tstamp->rx_tstamp =
> +					*otx2_timestamp_dynfield(mbuf, tstamp);
>  			tstamp->rx_ready = 1;
>  			mbuf->ol_flags |= PKT_RX_IEEE1588_PTP |
> -				PKT_RX_IEEE1588_TMST | PKT_RX_TIMESTAMP;
> +				PKT_RX_IEEE1588_TMST |
> +				tstamp->rx_tstamp_dynflag;
>  		}
>  	}
>  }
> -- 
> 2.28.0
> 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v3 09/16] net/octeontx2: switch Rx timestamp to dynamic mbuf field
  2020-11-03 10:52     ` Harman Kalra
@ 2020-11-03 11:22       ` Thomas Monjalon
  2020-11-03 12:21         ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 11:22 UTC (permalink / raw)
  To: Harman Kalra
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, andrew.rybchenko, jerinj, viacheslavo,
	Nithin Dabilpuram, Kiran Kumar K

03/11/2020 11:52, Harman Kalra:
>    With the following changes, ptpclient and testpmd(ieee1588 mode) is
>    crashing for us. I am debugging the issue and will update soon.
>   ------------------
>    Steps to reproduce:
>    1. Testpmd:
>       ./dpdk-testpmd -c 0xffff01 -n 4 -w 0002:05:00.0 -- -i
>       --port-topology=loop
>       testpmd> set fwd ieee1588
>       testpmd> set port 0 ptype_mask 0xf
>       testpmd> start
> 
>       I am sending ptp packets using scapy from the peer:
>       >>> p = Ether(src='98:03:9b:67:b0:d0', dst='FA:62:0C:27:AD:BC',
> 		      >>> type=35063)/Raw(load='\x00\x02')
>       >>> sendp (p, iface="p5p2")
> 
>       I am observing seg fault even for 1 packet.

Where is the crash? Could you provide a backtrace?
Is the field properly registered?




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-02 15:58                     ` Thomas Monjalon
@ 2020-11-03 12:10                       ` Morten Brørup
  2020-11-03 12:25                         ` Bruce Richardson
  2020-11-03 14:02                         ` Slava Ovsiienko
  0 siblings, 2 replies; 170+ messages in thread
From: Morten Brørup @ 2020-11-03 12:10 UTC (permalink / raw)
  To: Thomas Monjalon, dev, techboard
  Cc: Ajit Khaparde, Ananyev, Konstantin, Andrew Rybchenko, dev, Yigit,
	Ferruh, david.marchand, Richardson, Bruce, olivier.matz, jerinj,
	viacheslavo, honnappa.nagarahalli, maxime.coquelin, stephen,
	hemant.agrawal, viacheslavo, Matan Azrad, Shahaf Shuler

> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Monday, November 2, 2020 4:58 PM
> 
> +Cc techboard
> 
> We need benchmark numbers in order to take a decision.
> Please all, prepare some arguments and numbers so we can discuss
> the mbuf layout in the next techboard meeting.

I propose that the techboard considers this from two angles:

1. Long term goals and their relative priority. I.e. what can be
achieved with wide-ranging modifications, requiring yet another ABI
break and due notices.

2. Short term goals, i.e. what can be achieved for this release.


My suggestions follow...

1. Regarding long term goals:

I have argued that simple forwarding of non-segmented packets using
only the first mbuf cache line can be achieved by making three
modifications:

a) Move m->tx_offload to the first cache line.
b) Use an 8 bit pktmbuf mempool index in the first cache line,
   instead of the 64 bit m->pool pointer in the second cache line.
c) Do not access m->next when we know that it is NULL.
   We can use m->nb_segs == 1 or some other invariant as the gate.
   It can be implemented by adding an m->next accessor function:
   struct rte_mbuf * rte_mbuf_next(struct rte_mbuf * m)
   {
       return m->nb_segs == 1 ? NULL : m->next;
   }

Regarding the priority of this goal, I guess that simple forwarding
of non-segmented packets is probably the path taken by the majority
of packets handled by DPDK.
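
To illustrate modifications b) and c), here is a minimal standalone
sketch. It does not use the real struct rte_mbuf or struct rte_mempool;
every name starting with sketch_ or SKETCH_ is hypothetical, and the
table size of 8 is only an assumed default.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_POOLS 8	/* hypothetical compile-time limit */

/* Stand-in for struct rte_mempool. */
struct sketch_pool {
	int id;
};

/* Stand-in for struct rte_mbuf, reduced to the fields discussed. */
struct sketch_mbuf {
	/* fields assumed to live in the first cache line */
	uint16_t nb_segs;
	uint8_t pool_idx;	/* 8 bit index instead of a 64 bit pointer */
	uint64_t tx_offload;
	/* field assumed to stay in the second cache line */
	struct sketch_mbuf *next;
};

/* Indirection table, similar in spirit to the port id table. */
static struct sketch_pool *sketch_pools[SKETCH_MAX_POOLS];

/* Resolve the pool through the index (modification b). */
static inline struct sketch_pool *
sketch_mbuf_pool(const struct sketch_mbuf *m)
{
	return sketch_pools[m->pool_idx];
}

/* Only read m->next when the packet is segmented (modification c). */
static inline struct sketch_mbuf *
sketch_mbuf_next(const struct sketch_mbuf *m)
{
	return m->nb_segs == 1 ? NULL : m->next;
}
```

With such accessors, freeing or forwarding a non-segmented packet would
only read fields that are assumed to sit in the first cache line; the
cost is one extra load from the pool table.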


An alternative goal could be:
Do not touch the second cache line during RX.
A comment in the mbuf structure says so, but it is not true anymore.

(I guess that regression testing didn't catch this because the tests
perform TX immediately after RX, so the cache miss just moves from
the TX to the RX part of the test application.)


2. Regarding short term goals:

The current DPDK source code looks to me like m->next is the most
frequently accessed field in the second cache line, so it makes sense
to move it to the first cache line, rather than m->pool.
Benchmarking may help here.

If we - without breaking the ABI - can introduce a gate to avoid
accessing m->next when we know that it is NULL, we should keep it in
the second cache line.

In this case, I would prefer to move m->tx_offload to the first cache
line, thereby providing a field available for application use, until
the application prepares the packet for transmission.


> 
> 
> 01/11/2020 21:59, Morten Brørup:
> > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > Sent: Sunday, November 1, 2020 5:38 PM
> > >
> > > 01/11/2020 10:12, Morten Brørup:
> > > > One thing has always puzzled me:
> > > > Why do we use 64 bits to indicate which memory pool
> > > > an mbuf belongs to?
> > > > The portid only uses 16 bits and an indirection index.
> > > > Why don't we use the same kind of indirection index for mbuf
> pools?
> > >
> > > I wonder what would be the cost of indirection. Probably
> negligible.
> >
> > Probably. The portid does it, and that indirection is heavily used
> everywhere.
> >
> > The size of mbuf memory pool indirection array should be compile time
> configurable, like the size of the portid indirection array.
> >
> > And for reference, the indirection array will fit into one cache line
> if we default to 8 mbuf pools, thus supporting an 8 CPU socket system
> with one mbuf pool per CPU socket, or a 4 CPU socket system with two
> mbuf pools per CPU socket.
> >
> > (And as a side note: Our application is optimized for single-socket
> systems, and we only use one mbuf pool. I guess many applications were
> developed without carefully optimizing for multi-socket systems, and
> also just use one mbuf pool. In these cases, the mbuf structure doesn't
> really need a pool field. But it is still there, and the DPDK libraries
> use it, so we didn't bother removing it.)
> >
> > > I think it is a good proposal...
> > > ... for next year, after a deprecation notice.
> > >
> > > > I can easily imagine using one mbuf pool (or perhaps a few pools)
> > > > per CPU socket (or per physical memory bus closest to an attached
> NIC),
> > > > but not more than 256 mbuf memory pools in total.
> > > > So, let's introduce an mbufpoolid like the portid,
> > > > and cut this mbuf field down from 64 to 8 bits.
> 
> We will need to measure the perf of the solution.
> There is a chance for the cost to be too high.
> 
> 
> > > > If we also cut down m->pkt_len from 32 to 24 bits,
> > >
> > > Who is using packets larger than 64k? Are 16 bits enough?
> >
> > I personally consider 64k a reasonable packet size limit. Exotic
> applications with even larger packets would have to live with this
> constraint. But let's see if there are any objections. For reference,
> 64k corresponds to ca. 44 Ethernet (1500 byte) packets.
> >
> > (The limit could be 65535 bytes, to avoid translation of the value 0
> into 65536 bytes.)
> >
> > This modification would go nicely hand in hand with the mbuf pool
> indirection modification.
> >
> > ... after yet another round of ABI stability discussions,
> deprecation notices, and so on. :-)
> 
> After more thoughts, I'm afraid 64k is too small in some cases.
> And 24-bit manipulation would probably break performance.
> I'm afraid we are stuck with 32-bit length.

Yes, 24 bit manipulation would probably break performance.

Perhaps a solution exists with 16 bits (least significant bits) for
the common cases, and 8 bits more (most significant bits) for the less
common cases. Just thinking out loud here...
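
A rough sketch of what I mean, as standalone C. The struct and the
function names are hypothetical and nothing like this exists in
rte_mbuf today; it only shows the 16 + 8 bit split.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical split of a 24 bit packet length. */
struct sketch_len {
	uint16_t pkt_len_lo;	/* least significant 16 bits, common case */
	uint8_t pkt_len_hi;	/* most significant 8 bits, usually zero */
};

/* Store a length; the caller must keep len below 2^24. */
static inline void
sketch_set_pkt_len(struct sketch_len *l, uint32_t len)
{
	l->pkt_len_lo = (uint16_t)(len & 0xffff);
	l->pkt_len_hi = (uint8_t)(len >> 16);
}

/* Rebuild the full length from both parts. */
static inline uint32_t
sketch_get_pkt_len(const struct sketch_len *l)
{
	return ((uint32_t)l->pkt_len_hi << 16) | l->pkt_len_lo;
}
```

Code that knows its packets are below 64k could read pkt_len_lo alone
with a plain 16 bit load; only the uncommon path would combine both
parts. Whether this is cheaper than a 32 bit field would have to be
measured.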

> 
> > > > we can get the 8 bit mbuf pool index into the first cache line
> > > > at no additional cost.
> > >
> > > I like the idea.
> > > It means we don't need to move the pool pointer now,
> > > i.e. it does not have to replace the timestamp field.
> >
> > Agreed! Don't move m->pool to the first cache line; it is not used
> for RX.
> >
> > >
> > > > In other words: This would free up another 64 bit field in the
> mbuf
> > > structure!
> > >
> > > That would be great!
> > >
> > >
> > > > And even though the m->next pointer for scattered packets resides
> > > > in the second cache line, the libraries and the application know
> > > > that m->next is NULL when m->nb_segs is 1.
> > > > This proves that my suggestion would make touching
> > > > the second cache line unnecessary (in simple cases),
> > > > even for re-initializing the mbuf.
> > >
> > > So you think the "next" pointer should stay in the second half of
> mbuf?
> > >
> > > I feel you would like to move the Tx offloads in the first half
> > > to improve performance of very simple apps.
> >
> > "Very simple apps" sounds like a minority of apps. I would rather say
> "very simple packet handling scenarios", e.g. forwarding of normal size
> non-segmented packets. I would guess that the vast majority of packets
> handled by DPDK applications actually match this scenario. So I'm
> proposing to optimize for what I think is the most common scenario.
> >
> > If segmented packets are common, then m->next could be moved to the
> first cache line. But it will only improve the pure RX steps of the
> pipeline. When preparing the packet for TX, m->tx_offload will need to
> be set, and the second cache line comes into play. So I'm wondering how
> big the benefit of having m->next in the first cache line really is -
> assuming that m->nb_segs will be checked before accessing m->next.
> >
> > > I am thinking the opposite: we could have some dynamic fields space
> > > in the first half to improve performance of complex Rx.
> > > Note: we can add a flag hint for field registration in this first
> half.
> > >
> >
> > I have had the same thoughts. However, I would prefer being able to
> forward ordinary packets without using the second mbuf cache line at
> all (although only in specific scenarios like my example above).
> >
> > Furthermore, the application can abuse the 64 bit m->tx_offload field
> for private purposes until it is time to prepare the packet for TX and
> pass it on to the driver. This hack somewhat resembles a dynamic field
> in the first cache line, and will not be possible if the m->pool or
> m->next field is moved there.
> 
> 
> 


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v3 09/16] net/octeontx2: switch Rx timestamp to dynamic mbuf field
  2020-11-03 11:22       ` Thomas Monjalon
@ 2020-11-03 12:21         ` Thomas Monjalon
  2020-11-03 14:23           ` [dpdk-dev] [EXT] " Harman Kalra
  0 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: Harman Kalra
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, andrew.rybchenko, jerinj, viacheslavo,
	Nithin Dabilpuram, Kiran Kumar K

03/11/2020 12:22, Thomas Monjalon:
> 03/11/2020 11:52, Harman Kalra:
> >    With the following changes, ptpclient and testpmd(ieee1588 mode) is
> >    crashing for us. I am debugging the issue and will update soon.
> >   ------------------
> >    Steps to reproduce:
> >    1. Testpmd:
> >       ./dpdk-testpmd -c 0xffff01 -n 4 -w 0002:05:00.0 -- -i
> >       --port-topology=loop
> >       testpmd> set fwd ieee1588
> >       testpmd> set port 0 ptype_mask 0xf
> >       testpmd> start
> > 
> >       I am sending ptp packets using scapy from the peer:
> >       >>> p = Ether(src='98:03:9b:67:b0:d0', dst='FA:62:0C:27:AD:BC',
> > 		      >>> type=35063)/Raw(load='\x00\x02')
> >       >>> sendp (p, iface="p5p2")
> > 
> >       I am observing seg fault even for 1 packet.
> 
> Where is the crash? Could you provide a backtrace?
> Is the field well registered?

Sorry Harman, without any more explanation, we must move forward.
I am going to send a v4 without any change for octeontx2.
It should be merged today for -rc2.



^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v4 00/16] remove mbuf timestamp
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (16 preceding siblings ...)
  2020-11-03  0:13 ` [dpdk-dev] [PATCH v3 00/16] " Thomas Monjalon
@ 2020-11-03 12:21 ` Thomas Monjalon
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 01/16] eventdev: remove software Rx timestamp Thomas Monjalon
                     ` (15 more replies)
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
  18 siblings, 16 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The mbuf field timestamp was announced to be removed for three reasons:
  - a dynamic field already exists, used for Tx only
  - this field always used 8 bytes even if unneeded
  - this field is in the first half (cacheline) of mbuf

After this series, the dynamic field timestamp is used for both Rx and Tx
with separate dynamic flags to distinguish when the value is meaningful
without resetting the field during forwarding.
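
The idea of one shared field with two separate flags can be modelled
with the standalone sketch below. It is not DPDK code: the struct, the
fixed offset and the flag bits are all hypothetical. In the series the
offset and the Rx flag are obtained at runtime from
rte_mbuf_dyn_rx_timestamp_register().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits; real ones are assigned at registration. */
#define SKETCH_RX_TS_FLAG (UINT64_C(1) << 20)
#define SKETCH_TX_TS_FLAG (UINT64_C(1) << 21)

/* Stand-in for an mbuf with a small dynamic area. */
struct sketch_pkt {
	uint64_t ol_flags;
	uint8_t dyn_area[16];
};

/* Hypothetical offset of the shared timestamp in dyn_area. */
static const int sketch_ts_offset = 8;

static inline uint64_t
sketch_ts_read(const struct sketch_pkt *p)
{
	uint64_t ts;

	memcpy(&ts, p->dyn_area + sketch_ts_offset, sizeof(ts));
	return ts;
}

/* Rx side: fill the shared field and mark it valid for Rx only. */
static inline void
sketch_rx_stamp(struct sketch_pkt *p, uint64_t ts)
{
	memcpy(p->dyn_area + sketch_ts_offset, &ts, sizeof(ts));
	p->ol_flags |= SKETCH_RX_TS_FLAG;
}

static inline bool
sketch_rx_ts_valid(const struct sketch_pkt *p)
{
	return (p->ol_flags & SKETCH_RX_TS_FLAG) != 0;
}

/* Tx side: the field is used for scheduling only if the Tx flag is set. */
static inline bool
sketch_tx_ts_requested(const struct sketch_pkt *p)
{
	return (p->ol_flags & SKETCH_TX_TS_FLAG) != 0;
}
```

A packet stamped on Rx and forwarded unchanged still carries the value
in the field, but since the Tx flag is not set, the Tx side does not
interpret it as a scheduling time, so the field does not need to be
cleared.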

As a consequence, 8 bytes can be re-allocated to dynamic fields
in the first half of mbuf structure.
The mbuf layout is still open to further changes.

This mbuf layout change is important to allow adding more features
(consuming more dynamic fields) during the next year,
and can allow performance improvements with new usages in the first half.


v4:
- use local variable in nfb
- fix flag initialization
- remove useless blank line

v3:
- move ark variables declaration in a .h file
- improve cache locality for octeontx2
- add comments about cache locality in commit logs
- add comment for unused flag offset 17
- add timestamp register functions
- replace lookup with register in drivers and apps
- remove register in ethdev

v2:
- remove optimization to register only once in ethdev
- fix error message in latencystats
- convert rxtx_callbacks macro to inline function
- increase dynamic fields space
- do not move pool field


Thomas Monjalon (16):
  eventdev: remove software Rx timestamp
  mbuf: add Rx timestamp flag and helpers
  latency: switch Rx timestamp to dynamic mbuf field
  net/ark: switch Rx timestamp to dynamic mbuf field
  net/dpaa2: switch Rx timestamp to dynamic mbuf field
  net/mlx5: fix dynamic mbuf offset lookup check
  net/mlx5: switch Rx timestamp to dynamic mbuf field
  net/nfb: switch Rx timestamp to dynamic mbuf field
  net/octeontx2: switch Rx timestamp to dynamic mbuf field
  net/pcap: switch Rx timestamp to dynamic mbuf field
  app/testpmd: switch Rx timestamp to dynamic mbuf field
  examples/rxtx_callbacks: switch timestamp to dynamic field
  ethdev: add doxygen comment for Rx timestamp API
  mbuf: remove deprecated timestamp field
  mbuf: add Tx timestamp registration helper
  ethdev: include mbuf registration in Tx timestamp API

 app/test-pmd/config.c                         | 38 -------------
 app/test-pmd/util.c                           | 38 ++++++++++++-
 app/test/test_mbuf.c                          |  1 -
 doc/guides/nics/mlx5.rst                      |  5 +-
 .../prog_guide/event_ethernet_rx_adapter.rst  |  6 +-
 doc/guides/rel_notes/deprecation.rst          |  4 --
 doc/guides/rel_notes/release_20_11.rst        |  4 ++
 drivers/net/ark/ark_ethdev.c                  | 16 ++++++
 drivers/net/ark/ark_ethdev_rx.c               |  7 ++-
 drivers/net/ark/ark_ethdev_rx.h               |  2 +
 drivers/net/dpaa2/dpaa2_ethdev.c              | 11 ++++
 drivers/net/dpaa2/dpaa2_ethdev.h              |  2 +
 drivers/net/dpaa2/dpaa2_rxtx.c                | 25 ++++++---
 drivers/net/mlx5/mlx5_ethdev.c                |  8 ++-
 drivers/net/mlx5/mlx5_rxq.c                   |  8 +++
 drivers/net/mlx5/mlx5_rxtx.c                  |  8 +--
 drivers/net/mlx5/mlx5_rxtx.h                  | 19 +++++++
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h      | 41 +++++++-------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h         | 43 ++++++++-------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h          | 35 ++++++------
 drivers/net/mlx5/mlx5_trigger.c               |  2 +-
 drivers/net/mlx5/mlx5_txq.c                   |  2 +-
 drivers/net/nfb/nfb_rx.c                      | 15 ++++-
 drivers/net/nfb/nfb_rx.h                      | 21 +++++--
 drivers/net/octeontx2/otx2_ethdev.c           | 10 ++++
 drivers/net/octeontx2/otx2_rx.h               | 19 ++++++-
 drivers/net/pcap/rte_eth_pcap.c               | 20 ++++++-
 examples/rxtx_callbacks/main.c                | 16 +++++-
 lib/librte_ethdev/rte_ethdev.h                | 13 ++++-
 .../rte_event_eth_rx_adapter.c                | 11 ----
 .../rte_event_eth_rx_adapter.h                |  6 +-
 lib/librte_latencystats/rte_latencystats.c    | 30 ++++++++--
 lib/librte_mbuf/rte_mbuf.c                    |  2 -
 lib/librte_mbuf/rte_mbuf.h                    |  2 +-
 lib/librte_mbuf/rte_mbuf_core.h               | 12 +---
 lib/librte_mbuf/rte_mbuf_dyn.c                | 51 +++++++++++++++++
 lib/librte_mbuf/rte_mbuf_dyn.h                | 55 +++++++++++++++++--
 lib/librte_mbuf/version.map                   |  2 +
 38 files changed, 431 insertions(+), 179 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v4 01/16] eventdev: remove software Rx timestamp
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
@ 2020-11-03 12:21   ` Thomas Monjalon
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 02/16] mbuf: add Rx timestamp flag and helpers Thomas Monjalon
                     ` (14 subsequent siblings)
  15 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Nikhil Rao

This is a revert of commit 569758758dcd ("eventdev: add Rx timestamp").
If the Rx timestamp is not configured on the ethdev port,
there is no reason to set one.
Also, the accuracy of the timestamp was poor because it was set at a late stage.
Anyway, there is no trace of any usage of this timestamp.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 doc/guides/prog_guide/event_ethernet_rx_adapter.rst |  6 +-----
 lib/librte_eventdev/rte_event_eth_rx_adapter.c      | 11 -----------
 lib/librte_eventdev/rte_event_eth_rx_adapter.h      |  6 +-----
 3 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
index 236f43f455..cb44ce0e47 100644
--- a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
+++ b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
@@ -12,11 +12,7 @@ be supported in hardware or require a software thread to receive packets from
 the ethdev port using ethdev poll mode APIs and enqueue these as events to the
 event device using the eventdev API. Both transfer mechanisms may be present on
 the same platform depending on the particular combination of the ethdev and
-the event device. For SW based packet transfer, if the mbuf does not have a
-timestamp set, the adapter adds a timestamp to the mbuf using
-rte_get_tsc_cycles(), this provides a more accurate timestamp as compared to
-if the application were to set the timestamp since it avoids event device
-schedule latency.
+the event device.
 
 The Event Ethernet Rx Adapter library is intended for the application code to
 configure both transfer mechanisms using a common API. A capability API allows
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.c b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
index f0000d1ede..3c73046551 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.c
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
@@ -763,7 +763,6 @@ rxa_buffer_mbufs(struct rte_event_eth_rx_adapter *rx_adapter,
 	uint32_t rss_mask;
 	uint32_t rss;
 	int do_rss;
-	uint64_t ts;
 	uint16_t nb_cb;
 	uint16_t dropped;
 
@@ -771,16 +770,6 @@ rxa_buffer_mbufs(struct rte_event_eth_rx_adapter *rx_adapter,
 	rss_mask = ~(((m->ol_flags & PKT_RX_RSS_HASH) != 0) - 1);
 	do_rss = !rss_mask && !eth_rx_queue_info->flow_id_mask;
 
-	if ((m->ol_flags & PKT_RX_TIMESTAMP) == 0) {
-		ts = rte_get_tsc_cycles();
-		for (i = 0; i < num; i++) {
-			m = mbufs[i];
-
-			m->timestamp = ts;
-			m->ol_flags |= PKT_RX_TIMESTAMP;
-		}
-	}
-
 	for (i = 0; i < num; i++) {
 		m = mbufs[i];
 
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.h b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
index 2dd259c279..21bb1e54c8 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.h
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
@@ -21,11 +21,7 @@
  *
  * The adapter uses a EAL service core function for SW based packet transfer
  * and uses the eventdev PMD functions to configure HW based packet transfer
- * between the ethernet device and the event device. For SW based packet
- * transfer, if the mbuf does not have a timestamp set, the adapter adds a
- * timestamp to the mbuf using rte_get_tsc_cycles(), this provides a more
- * accurate timestamp as compared to if the application were to set the time
- * stamp since it avoids event device schedule latency.
+ * between the ethernet device and the event device.
  *
  * The ethernet Rx event adapter's functions are:
  *  - rte_event_eth_rx_adapter_create_ext()
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v4 02/16] mbuf: add Rx timestamp flag and helpers
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 01/16] eventdev: remove software Rx timestamp Thomas Monjalon
@ 2020-11-03 12:21   ` Thomas Monjalon
  2020-11-03 12:34     ` Andrew Rybchenko
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 03/16] latency: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
                     ` (13 subsequent siblings)
  15 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ray Kinsella, Neil Horman

There is already a dynamic field for timestamp,
used only for Tx scheduling with the dedicated Tx offload flag.
The same field can be used for Rx timestamp filled by drivers.

A new dynamic flag is defined for Rx usage.
A new function wraps the registration of both field and Rx flag.
The type rte_mbuf_timestamp_t is defined for the API users.

After migrating all Rx timestamp usages, it will be possible
to remove the deprecated timestamp field.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 lib/librte_mbuf/rte_mbuf_dyn.c | 43 ++++++++++++++++++++++++++++++++++
 lib/librte_mbuf/rte_mbuf_dyn.h | 33 ++++++++++++++++++++++----
 lib/librte_mbuf/version.map    |  1 +
 3 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
index 538a43f695..5b608a27d7 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.c
+++ b/lib/librte_mbuf/rte_mbuf_dyn.c
@@ -13,6 +13,7 @@
 #include <rte_errno.h>
 #include <rte_malloc.h>
 #include <rte_string_fns.h>
+#include <rte_bitops.h>
 #include <rte_mbuf.h>
 #include <rte_mbuf_dyn.h>
 
@@ -569,3 +570,45 @@ void rte_mbuf_dyn_dump(FILE *out)
 
 	rte_mcfg_tailq_write_unlock();
 }
+
+static int
+rte_mbuf_dyn_timestamp_register(int *field_offset, uint64_t *flag,
+		const char *direction, const char *flag_name)
+{
+	static const struct rte_mbuf_dynfield field_desc = {
+		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
+		.size = sizeof(rte_mbuf_timestamp_t),
+		.align = __alignof__(rte_mbuf_timestamp_t),
+	};
+	struct rte_mbuf_dynflag flag_desc = { 0 };
+	int offset;
+
+	offset = rte_mbuf_dynfield_register(&field_desc);
+	if (offset < 0) {
+		RTE_LOG(ERR, MBUF,
+			"Failed to register mbuf field for timestamp\n");
+		return -1;
+	}
+	if (field_offset != NULL)
+		*field_offset = offset;
+
+	strlcpy(flag_desc.name, flag_name, sizeof(flag_desc.name));
+	offset = rte_mbuf_dynflag_register(&flag_desc);
+	if (offset < 0) {
+		RTE_LOG(ERR, MBUF,
+			"Failed to register mbuf flag for %s timestamp\n",
+			direction);
+		return -1;
+	}
+	if (flag != NULL)
+		*flag = RTE_BIT64(offset);
+
+	return 0;
+}
+
+int
+rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag)
+{
+	return rte_mbuf_dyn_timestamp_register(field_offset, rx_flag,
+			"Rx", RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME);
+}
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 0ebac88b83..2e729ddaca 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -258,13 +258,36 @@ void rte_mbuf_dyn_dump(FILE *out);
  * timestamp. The dynamic Tx timestamp flag tells whether the field contains
  * actual timestamp value for the packets being sent, this value can be
  * used by PMD to schedule packet sending.
- *
- * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
- * and obsoleting, the dedicated Rx timestamp flag is supposed to be
- * introduced and the shared dynamic timestamp field will be used
- * to handle the timestamps on receiving datapath as well.
  */
 #define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+typedef uint64_t rte_mbuf_timestamp_t;
+
+/**
+ * Indicate that the timestamp field in the mbuf was filled by the driver.
+ */
+#define RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME "rte_dynflag_rx_timestamp"
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Register dynamic mbuf field and flag for Rx timestamp.
+ *
+ * @param field_offset
+ *   Pointer to the offset of the registered mbuf field, can be NULL.
+ *   The same field is shared for Rx and Tx timestamp.
+ * @param rx_flag
+ *   Pointer to the mask of the registered offload flag, can be NULL.
+ * @return
+ *   0 on success, -1 otherwise.
+ *   Possible values for rte_errno:
+ *   - EEXIST: already registered with different parameters.
+ *   - EPERM: called from a secondary process.
+ *   - ENOENT: no more field or flag available.
+ *   - ENOMEM: allocation failure.
+ */
+__rte_experimental
+int rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag);
 
 /**
  * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
diff --git a/lib/librte_mbuf/version.map b/lib/librte_mbuf/version.map
index a011aaead3..0b66668bff 100644
--- a/lib/librte_mbuf/version.map
+++ b/lib/librte_mbuf/version.map
@@ -42,6 +42,7 @@ EXPERIMENTAL {
 	rte_mbuf_dynflag_register;
 	rte_mbuf_dynflag_register_bitnum;
 	rte_mbuf_dyn_dump;
+	rte_mbuf_dyn_rx_timestamp_register;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
 	rte_pktmbuf_pool_create_extbuf;
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v4 03/16] latency: switch Rx timestamp to dynamic mbuf field
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 01/16] eventdev: remove software Rx timestamp Thomas Monjalon
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 02/16] mbuf: add Rx timestamp flag and helpers Thomas Monjalon
@ 2020-11-03 12:21   ` Thomas Monjalon
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 04/16] net/ark: " Thomas Monjalon
                     ` (12 subsequent siblings)
  15 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Reshma Pattan

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced with the dynamic one.
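
For illustration only, the access pattern described above can be sketched
without DPDK: a field located by a runtime byte offset, guarded by a
runtime flag mask. The names fake_mbuf, ts_field, stamp and latency_of
are hypothetical stand-ins, not DPDK APIs, and the offset and bit chosen
here are arbitrary. In the patch itself the offset and mask come from
rte_mbuf_dyn_rx_timestamp_register() and the access goes through
RTE_MBUF_DYNFIELD().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf: flags plus a dynamic area. */
struct fake_mbuf {
	uint64_t ol_flags;
	uint64_t dyn_area[4]; /* 8-byte aligned, so a uint64_t field fits */
};

/* Values a real caller would obtain at registration time (assumed here). */
static int ts_offset = offsetof(struct fake_mbuf, dyn_area) + 8;
static uint64_t ts_flag = UINT64_C(1) << 23;

/* Same idea as a dynamic field accessor: base address plus byte offset. */
static inline uint64_t *
ts_field(struct fake_mbuf *m)
{
	return (uint64_t *)((uintptr_t)m + ts_offset);
}

/* Stamp the packet only if it does not carry a timestamp yet. */
static void
stamp(struct fake_mbuf *m, uint64_t now)
{
	if ((m->ol_flags & ts_flag) == 0) {
		*ts_field(m) = now;
		m->ol_flags |= ts_flag;
	}
}

/* Return 1 and store the latency if the flag marks the field as valid. */
static int
latency_of(struct fake_mbuf *m, uint64_t now, uint64_t *latency)
{
	if ((m->ol_flags & ts_flag) == 0)
		return 0;
	*latency = now - *ts_field(m);
	return 1;
}
```

The flag check matters because the field content is undefined unless
the flag is set; reading the field unconditionally would not be safe.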

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 lib/librte_latencystats/rte_latencystats.c | 30 ++++++++++++++++++----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/lib/librte_latencystats/rte_latencystats.c b/lib/librte_latencystats/rte_latencystats.c
index ba2fff3bcb..ab8db7a139 100644
--- a/lib/librte_latencystats/rte_latencystats.c
+++ b/lib/librte_latencystats/rte_latencystats.c
@@ -9,6 +9,7 @@
 
 #include <rte_string_fns.h>
 #include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include <rte_log.h>
 #include <rte_cycles.h>
 #include <rte_ethdev.h>
@@ -31,6 +32,16 @@ latencystat_cycles_per_ns(void)
 /* Macros for printing using RTE_LOG */
 #define RTE_LOGTYPE_LATENCY_STATS RTE_LOGTYPE_USER1
 
+static uint64_t timestamp_dynflag;
+static int timestamp_dynfield_offset = -1;
+
+static inline rte_mbuf_timestamp_t *
+timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static const char *MZ_RTE_LATENCY_STATS = "rte_latencystats";
 static int latency_stats_index;
 static uint64_t samp_intvl;
@@ -128,10 +139,10 @@ add_time_stamps(uint16_t pid __rte_unused,
 		diff_tsc = now - prev_tsc;
 		timer_tsc += diff_tsc;
 
-		if ((pkts[i]->ol_flags & PKT_RX_TIMESTAMP) == 0
+		if ((pkts[i]->ol_flags & timestamp_dynflag) == 0
 				&& (timer_tsc >= samp_intvl)) {
-			pkts[i]->timestamp = now;
-			pkts[i]->ol_flags |= PKT_RX_TIMESTAMP;
+			*timestamp_dynfield(pkts[i]) = now;
+			pkts[i]->ol_flags |= timestamp_dynflag;
 			timer_tsc = 0;
 		}
 		prev_tsc = now;
@@ -161,8 +172,8 @@ calc_latency(uint16_t pid __rte_unused,
 
 	now = rte_rdtsc();
 	for (i = 0; i < nb_pkts; i++) {
-		if (pkts[i]->ol_flags & PKT_RX_TIMESTAMP)
-			latency[cnt++] = now - pkts[i]->timestamp;
+		if (pkts[i]->ol_flags & timestamp_dynflag)
+			latency[cnt++] = now - *timestamp_dynfield(pkts[i]);
 	}
 
 	rte_spinlock_lock(&glob_stats->lock);
@@ -241,6 +252,15 @@ rte_latencystats_init(uint64_t app_samp_intvl,
 		return -1;
 	}
 
+	/* Register mbuf field and flag for Rx timestamp */
+	ret = rte_mbuf_dyn_rx_timestamp_register(&timestamp_dynfield_offset,
+			&timestamp_dynflag);
+	if (ret != 0) {
+		RTE_LOG(ERR, LATENCY_STATS,
+			"Cannot register mbuf field/flag for timestamp\n");
+		return -rte_errno;
+	}
+
 	/** Register Rx/Tx callbacks */
 	RTE_ETH_FOREACH_DEV(pid) {
 		struct rte_eth_dev_info dev_info;
-- 
2.28.0



* [dpdk-dev] [PATCH v4 04/16] net/ark: switch Rx timestamp to dynamic mbuf field
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (2 preceding siblings ...)
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 03/16] latency: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-11-03 12:21   ` Thomas Monjalon
  2020-11-03 12:37     ` Andrew Rybchenko
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 05/16] net/dpaa2: " Thomas Monjalon
                     ` (11 subsequent siblings)
  15 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Shepard Siegel, Ed Czeck,
	John Miller

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related dynamic mbuf flag is now set, although it was missing previously.

The timestamp is set on received packets if the Rx timestamp offload
is configured for at least one device.
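
As a rough sketch of that behaviour, assuming a process-wide flag that
stays 0 until some port registers it: the Rx path only needs to test the
flag. All names below (fake_register, configure_port, receive,
OFFLOAD_TIMESTAMP, struct pkt) and the bit positions are hypothetical
and only model the logic of the driver change.

```c
#include <stdint.h>

#define OFFLOAD_TIMESTAMP (UINT64_C(1) << 14) /* hypothetical offload bit */

static uint64_t rx_ts_flag;   /* stays 0 until some port registers */
static int rx_ts_offset = -1; /* byte offset inside the dynamic area */

/* Hypothetical registration: always hands out the same slot and bit. */
static int
fake_register(int *offset, uint64_t *flag)
{
	*offset = 0;
	*flag = UINT64_C(1) << 17;
	return 0;
}

/* Register only when the application asked for Rx timestamps. */
static int
configure_port(uint64_t rx_offloads)
{
	if (rx_offloads & OFFLOAD_TIMESTAMP)
		return fake_register(&rx_ts_offset, &rx_ts_flag);
	return 0;
}

struct pkt {
	uint64_t ol_flags;
	uint64_t dyn[1]; /* stand-in for the dynamic field area */
};

/* Rx path: write the timestamp only once the flag has been registered. */
static void
receive(struct pkt *p, uint64_t hw_ts)
{
	if (rx_ts_flag != 0) {
		p->dyn[rx_ts_offset / 8] = hw_ts;
		p->ol_flags |= rx_ts_flag;
	}
}
```

Because the flag and offset are global in this model, packets from
every port are stamped once any single port has enabled the offload.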

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/ark/ark_ethdev.c    | 16 ++++++++++++++++
 drivers/net/ark/ark_ethdev_rx.c |  7 ++++++-
 drivers/net/ark/ark_ethdev_rx.h |  2 ++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
index fa343999a1..a34dcc5291 100644
--- a/drivers/net/ark/ark_ethdev.c
+++ b/drivers/net/ark/ark_ethdev.c
@@ -79,6 +79,8 @@ static int  eth_ark_set_mtu(struct rte_eth_dev *dev, uint16_t size);
 #define ARK_TX_MAX_QUEUE (4096 * 4)
 #define ARK_TX_MIN_QUEUE (256)
 
+uint64_t ark_timestamp_rx_dynflag;
+int ark_timestamp_dynfield_offset = -1;
 int rte_pmd_ark_rx_userdata_dynfield_offset = -1;
 int rte_pmd_ark_tx_userdata_dynfield_offset = -1;
 
@@ -552,6 +554,18 @@ static int
 eth_ark_dev_configure(struct rte_eth_dev *dev)
 {
 	struct ark_adapter *ark = dev->data->dev_private;
+	int ret;
+
+	if (dev->data->dev_conf.rxmode.offloads & DEV_RX_OFFLOAD_TIMESTAMP) {
+		ret = rte_mbuf_dyn_rx_timestamp_register(
+				&ark_timestamp_dynfield_offset,
+				&ark_timestamp_rx_dynflag);
+		if (ret != 0) {
+			ARK_PMD_LOG(ERR,
+				"Failed to register Rx timestamp field/flag\n");
+			return -rte_errno;
+		}
+	}
 
 	eth_ark_dev_set_link_up(dev);
 	if (ark->user_ext.dev_configure)
@@ -782,6 +796,8 @@ eth_ark_dev_info_get(struct rte_eth_dev *dev,
 				ETH_LINK_SPEED_50G |
 				ETH_LINK_SPEED_100G);
 
+	dev_info->rx_offload_capa = DEV_RX_OFFLOAD_TIMESTAMP;
+
 	return 0;
 }
 
diff --git a/drivers/net/ark/ark_ethdev_rx.c b/drivers/net/ark/ark_ethdev_rx.c
index c24cc00e2f..d29d3db783 100644
--- a/drivers/net/ark/ark_ethdev_rx.c
+++ b/drivers/net/ark/ark_ethdev_rx.c
@@ -272,7 +272,12 @@ eth_ark_recv_pkts(void *rx_queue,
 		mbuf->port = meta->port;
 		mbuf->pkt_len = meta->pkt_len;
 		mbuf->data_len = meta->pkt_len;
-		mbuf->timestamp = meta->timestamp;
+		/* set timestamp if enabled at least on one device */
+		if (ark_timestamp_rx_dynflag > 0) {
+			*RTE_MBUF_DYNFIELD(mbuf, ark_timestamp_dynfield_offset,
+				rte_mbuf_timestamp_t *) = meta->timestamp;
+			mbuf->ol_flags |= ark_timestamp_rx_dynflag;
+		}
 		rte_pmd_ark_mbuf_rx_userdata_set(mbuf, meta->user_data);
 
 		if (ARK_DEBUG_CORE) {	/* debug sanity checks */
diff --git a/drivers/net/ark/ark_ethdev_rx.h b/drivers/net/ark/ark_ethdev_rx.h
index 0fdd29b1ab..001fa9bdfa 100644
--- a/drivers/net/ark/ark_ethdev_rx.h
+++ b/drivers/net/ark/ark_ethdev_rx.h
@@ -11,6 +11,8 @@
 #include <rte_mempool.h>
 #include <rte_ethdev_driver.h>
 
+extern uint64_t ark_timestamp_rx_dynflag;
+extern int ark_timestamp_dynfield_offset;
 
 int eth_ark_dev_rx_queue_setup(struct rte_eth_dev *dev,
 			       uint16_t queue_idx,
-- 
2.28.0



* [dpdk-dev] [PATCH v4 05/16] net/dpaa2: switch Rx timestamp to dynamic mbuf field
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (3 preceding siblings ...)
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 04/16] net/ark: " Thomas Monjalon
@ 2020-11-03 12:21   ` Thomas Monjalon
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 06/16] net/mlx5: fix dynamic mbuf offset lookup check Thomas Monjalon
                     ` (10 subsequent siblings)
  15 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Hemant Agrawal,
	Sachin Saxena

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/dpaa2/dpaa2_ethdev.c | 11 +++++++++++
 drivers/net/dpaa2/dpaa2_ethdev.h |  2 ++
 drivers/net/dpaa2/dpaa2_rxtx.c   | 25 ++++++++++++++++++-------
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 04e60c56f2..3b0c7717b6 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -65,6 +65,8 @@ static uint64_t dev_tx_offloads_nodis =
 
 /* enable timestamp in mbuf */
 bool dpaa2_enable_ts[RTE_MAX_ETHPORTS];
+uint64_t dpaa2_timestamp_rx_dynflag;
+int dpaa2_timestamp_dynfield_offset = -1;
 
 struct rte_dpaa2_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -587,7 +589,16 @@ dpaa2_eth_dev_configure(struct rte_eth_dev *dev)
 #if !defined(RTE_LIBRTE_IEEE1588)
 	if (rx_offloads & DEV_RX_OFFLOAD_TIMESTAMP)
 #endif
+	{
+		ret = rte_mbuf_dyn_rx_timestamp_register(
+				&dpaa2_timestamp_dynfield_offset,
+				&dpaa2_timestamp_rx_dynflag);
+		if (ret != 0) {
+			DPAA2_PMD_ERR("Failed to register timestamp field/flag");
+			return -rte_errno;
+		}
 		dpaa2_enable_ts[dev->data->port_id] = true;
+	}
 
 	if (tx_offloads & DEV_TX_OFFLOAD_IPV4_CKSUM)
 		tx_l3_csum_offload = true;
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.h b/drivers/net/dpaa2/dpaa2_ethdev.h
index 94cf253827..8d82f74684 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.h
+++ b/drivers/net/dpaa2/dpaa2_ethdev.h
@@ -92,6 +92,8 @@
 
 /* enable timestamp in mbuf*/
 extern bool dpaa2_enable_ts[];
+extern uint64_t dpaa2_timestamp_rx_dynflag;
+extern int dpaa2_timestamp_dynfield_offset;
 
 #define DPAA2_QOS_TABLE_RECONFIGURE	1
 #define DPAA2_FS_TABLE_RECONFIGURE	2
diff --git a/drivers/net/dpaa2/dpaa2_rxtx.c b/drivers/net/dpaa2/dpaa2_rxtx.c
index 6201de4606..9cca6d16c3 100644
--- a/drivers/net/dpaa2/dpaa2_rxtx.c
+++ b/drivers/net/dpaa2/dpaa2_rxtx.c
@@ -31,6 +31,13 @@ dpaa2_dev_rx_parse_slow(struct rte_mbuf *mbuf,
 
 static void enable_tx_tstamp(struct qbman_fd *fd) __rte_unused;
 
+static inline rte_mbuf_timestamp_t *
+dpaa2_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		dpaa2_timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 #define DPAA2_MBUF_TO_CONTIG_FD(_mbuf, _fd, _bpid)  do { \
 	DPAA2_SET_FD_ADDR(_fd, DPAA2_MBUF_VADDR_TO_IOVA(_mbuf)); \
 	DPAA2_SET_FD_LEN(_fd, _mbuf->data_len); \
@@ -109,9 +116,10 @@ dpaa2_dev_rx_parse_new(struct rte_mbuf *m, const struct qbman_fd *fd,
 	m->ol_flags |= PKT_RX_RSS_HASH;
 
 	if (dpaa2_enable_ts[m->port]) {
-		m->timestamp = annotation->word2;
-		m->ol_flags |= PKT_RX_TIMESTAMP;
-		DPAA2_PMD_DP_DEBUG("pkt timestamp:0x%" PRIx64 "", m->timestamp);
+		*dpaa2_timestamp_dynfield(m) = annotation->word2;
+		m->ol_flags |= dpaa2_timestamp_rx_dynflag;
+		DPAA2_PMD_DP_DEBUG("pkt timestamp:0x%" PRIx64 "",
+				*dpaa2_timestamp_dynfield(m));
 	}
 
 	DPAA2_PMD_DP_DEBUG("HW frc = 0x%x\t packet type =0x%x "
@@ -223,9 +231,12 @@ dpaa2_dev_rx_parse(struct rte_mbuf *mbuf, void *hw_annot_addr)
 	else if (BIT_ISSET_AT_POS(annotation->word8, DPAA2_ETH_FAS_L4CE))
 		mbuf->ol_flags |= PKT_RX_L4_CKSUM_BAD;
 
-	mbuf->ol_flags |= PKT_RX_TIMESTAMP;
-	mbuf->timestamp = annotation->word2;
-	DPAA2_PMD_DP_DEBUG("pkt timestamp: 0x%" PRIx64 "", mbuf->timestamp);
+	if (dpaa2_enable_ts[mbuf->port]) {
+		*dpaa2_timestamp_dynfield(mbuf) = annotation->word2;
+		mbuf->ol_flags |= dpaa2_timestamp_rx_dynflag;
+		DPAA2_PMD_DP_DEBUG("pkt timestamp: 0x%" PRIx64 "",
+				*dpaa2_timestamp_dynfield(mbuf));
+	}
 
 	/* Check detailed parsing requirement */
 	if (annotation->word3 & 0x7FFFFC3FFFF)
@@ -629,7 +640,7 @@ dpaa2_dev_prefetch_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		else
 			bufs[num_rx] = eth_fd_to_mbuf(fd, eth_data->port_id);
 #if defined(RTE_LIBRTE_IEEE1588)
-		priv->rx_timestamp = bufs[num_rx]->timestamp;
+		priv->rx_timestamp = *dpaa2_timestamp_dynfield(bufs[num_rx]);
 #endif
 
 		if (eth_data->dev_conf.rxmode.offloads &
-- 
2.28.0



* [dpdk-dev] [PATCH v4 06/16] net/mlx5: fix dynamic mbuf offset lookup check
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (4 preceding siblings ...)
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 05/16] net/dpaa2: " Thomas Monjalon
@ 2020-11-03 12:21   ` Thomas Monjalon
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
                     ` (9 subsequent siblings)
  15 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, stable, Matan Azrad,
	Shahaf Shuler, Ori Kam

The functions rte_mbuf_dynfield_lookup() and rte_mbuf_dynflag_lookup()
return either an offset or a bit number starting at 0,
or a negative error code.

In reality the first offsets are probably reserved forever,
but for the sake of strict API compliance,
the checks which considered 0 as an error are fixed.
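
A minimal sketch of the difference, using a hypothetical
fake_flag_lookup() in place of the real lookup functions (it only
models the "non-negative value or negative error" contract):

```c
#include <stdint.h>

/* Hypothetical lookup: returns a bit number >= 0, or a negative error. */
static int
fake_flag_lookup(int registered, int bitnum)
{
	return registered ? bitnum : -1;
}

/* Old check: wrongly treats bit number 0 as "not found". */
static uint64_t
mask_buggy(int nbit)
{
	return nbit > 0 ? UINT64_C(1) << nbit : 0;
}

/* Fixed check: only negative values mean "not found". */
static uint64_t
mask_fixed(int nbit)
{
	return nbit >= 0 ? UINT64_C(1) << nbit : 0;
}
```

With a flag registered at bit 0, mask_buggy() yields an empty mask
while mask_fixed() yields 1; for any other bit both agree.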

Fixes: efa79e68c8cd ("net/mlx5: support fine grain dynamic flag")
Fixes: 3172c471b86f ("net/mlx5: prepare Tx queue structures to support timestamp")
Fixes: 0febfcce3693 ("net/mlx5: prepare Tx to support scheduling")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/mlx5/mlx5_rxtx.c    | 4 ++--
 drivers/net/mlx5/mlx5_trigger.c | 2 +-
 drivers/net/mlx5/mlx5_txq.c     | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index b530ff421f..e86468b67a 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -5661,9 +5661,9 @@ mlx5_select_tx_function(struct rte_eth_dev *dev)
 	}
 	if (tx_offloads & DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP &&
 	    rte_mbuf_dynflag_lookup
-			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) > 0 &&
+			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) >= 0 &&
 	    rte_mbuf_dynfield_lookup
-			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) > 0) {
+			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) >= 0) {
 		/* Offload configured, dynamic entities registered. */
 		olx |= MLX5_TXOFF_CONFIG_TXPP;
 	}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 7735f022a3..917b433c4a 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -302,7 +302,7 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	DRV_LOG(DEBUG, "port %u starting device", dev->data->port_id);
 	fine_inline = rte_mbuf_dynflag_lookup
 		(RTE_PMD_MLX5_FINE_GRANULARITY_INLINE, NULL);
-	if (fine_inline > 0)
+	if (fine_inline >= 0)
 		rte_net_mlx5_dynf_inline_mask = 1UL << fine_inline;
 	else
 		rte_net_mlx5_dynf_inline_mask = 0;
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index af84f5f72b..8ed2bcff7b 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -1305,7 +1305,7 @@ mlx5_txq_dynf_timestamp_set(struct rte_eth_dev *dev)
 				(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL);
 	off = rte_mbuf_dynfield_lookup
 				(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
-	if (nbit > 0 && off >= 0 && sh->txpp.refcnt)
+	if (nbit >= 0 && off >= 0 && sh->txpp.refcnt)
 		mask = 1ULL << nbit;
 	for (i = 0; i != priv->txqs_n; ++i) {
 		data = (*priv->txqs)[i];
-- 
2.28.0



* [dpdk-dev] [PATCH v4 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf field
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (5 preceding siblings ...)
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 06/16] net/mlx5: fix dynamic mbuf offset lookup check Thomas Monjalon
@ 2020-11-03 12:21   ` Thomas Monjalon
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 08/16] net/nfb: " Thomas Monjalon
                     ` (8 subsequent siblings)
  15 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ruifeng Wang,
	David Christensen, Matan Azrad, Shahaf Shuler,
	Konstantin Ananyev

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

The dynamic offset and flag are stored in struct mlx5_rxq_data
to favor cache locality.
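
A simplified model of that layout, with hypothetical names (fake_rxq,
base_ol_flags, RSS_HASH_FLAG) and arbitrary bit positions: the
datapath reads the offset and flag from the queue structure it already
has in cache, and a 0/1 enable bit multiplied by the flag selects it
without a branch.

```c
#include <stdint.h>

#define RSS_HASH_FLAG (UINT64_C(1) << 1) /* hypothetical flag bit */

/* Hypothetical per-queue data: everything the Rx burst needs together. */
struct fake_rxq {
	unsigned int rss_hash:1;     /* 1 if RSS hash is reported */
	unsigned int hw_timestamp:1; /* 1 if Rx timestamp is enabled */
	int timestamp_offset;        /* dynamic field offset */
	uint64_t timestamp_rx_flag;  /* dynamic flag mask, 0 if unregistered */
};

/* Multiplying by a 0/1 bit keeps or drops each flag without branching. */
static uint64_t
base_ol_flags(const struct fake_rxq *rxq)
{
	return rxq->rss_hash * RSS_HASH_FLAG |
	       rxq->hw_timestamp * rxq->timestamp_rx_flag;
}
```

Keeping the values per queue avoids touching a separate global
variable for every burst, at the cost of a few bytes per queue.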

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/mlx5/mlx5_rxq.c              |  8 +++++
 drivers/net/mlx5/mlx5_rxtx.c             |  4 +--
 drivers/net/mlx5/mlx5_rxtx.h             | 19 +++++++++++
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 41 +++++++++++-----------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 43 ++++++++++++------------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     | 35 +++++++++----------
 6 files changed, 90 insertions(+), 60 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index f1d8373079..52519910ee 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1492,7 +1492,15 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	mlx5_max_lro_msg_size_adjust(dev, idx, max_lro_size);
 	/* Toggle RX checksum offload if hardware supports it. */
 	tmpl->rxq.csum = !!(offloads & DEV_RX_OFFLOAD_CHECKSUM);
+	/* Configure Rx timestamp. */
 	tmpl->rxq.hw_timestamp = !!(offloads & DEV_RX_OFFLOAD_TIMESTAMP);
+	tmpl->rxq.timestamp_rx_flag = 0;
+	if (tmpl->rxq.hw_timestamp && rte_mbuf_dyn_rx_timestamp_register(
+			&tmpl->rxq.timestamp_offset,
+			&tmpl->rxq.timestamp_rx_flag) != 0) {
+		DRV_LOG(ERR, "Cannot register Rx timestamp field/flag");
+		goto error;
+	}
 	/* Configure VLAN stripping. */
 	tmpl->rxq.vlan_strip = !!(offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
 	/* By default, FCS (CRC) is stripped by hardware. */
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index e86468b67a..b577aab00b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1287,8 +1287,8 @@ rxq_cq_to_mbuf(struct mlx5_rxq_data *rxq, struct rte_mbuf *pkt,
 
 		if (rxq->rt_timestamp)
 			ts = mlx5_txpp_convert_rx_ts(rxq->sh, ts);
-		pkt->timestamp = ts;
-		pkt->ol_flags |= PKT_RX_TIMESTAMP;
+		mlx5_timestamp_set(pkt, rxq->timestamp_offset, ts);
+		pkt->ol_flags |= rxq->timestamp_rx_flag;
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 674296ee98..e9eca36b40 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -151,6 +151,8 @@ struct mlx5_rxq_data {
 	/* CQ (UAR) access lock required for 32bit implementations */
 #endif
 	uint32_t tunnel; /* Tunnel information. */
+	int timestamp_offset; /* Dynamic mbuf field for timestamp. */
+	uint64_t timestamp_rx_flag; /* Dynamic mbuf flag for timestamp. */
 	uint64_t flow_meta_mask;
 	int32_t flow_meta_offset;
 } __rte_cache_aligned;
@@ -681,4 +683,21 @@ mlx5_txpp_convert_tx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t mts)
 	return ci;
 }
 
+/**
+ * Set timestamp in mbuf dynamic field.
+ *
+ * @param mbuf
+ *   Structure to write into.
+ * @param offset
+ *   Dynamic field offset in mbuf structure.
+ * @param timestamp
+ *   Value to write.
+ */
+static __rte_always_inline void
+mlx5_timestamp_set(struct rte_mbuf *mbuf, int offset,
+		rte_mbuf_timestamp_t timestamp)
+{
+	*RTE_MBUF_DYNFIELD(mbuf, offset, rte_mbuf_timestamp_t *) = timestamp;
+}
+
 #endif /* RTE_PMD_MLX5_RXTX_H_ */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 6bf0c9b540..171d7bb0f8 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -330,13 +330,13 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	vector unsigned char ol_flags = (vector unsigned char)
 		(vector unsigned int){
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP};
+				rxq->hw_timestamp * rxq->timestamp_rx_flag};
 	vector unsigned char cv_flags;
 	const vector unsigned char zero = (vector unsigned char){0};
 	const vector unsigned char ptype_mask =
@@ -1025,31 +1025,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		/* D.5 fill in mbuf - rearm_data and packet_type. */
 		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
 
 				ts = rte_be_to_cpu_64(cq[pos].timestamp);
-				pkts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
-				pkts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
-				pkts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
-				pkts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				pkts[pos]->timestamp = rte_be_to_cpu_64
-						(cq[pos].timestamp);
-				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p1].timestamp);
-				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p2].timestamp);
-				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p3].timestamp);
+				mlx5_timestamp_set(pkts[pos], offset,
+					rte_be_to_cpu_64(cq[pos].timestamp));
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					rte_be_to_cpu_64(cq[pos + p1].timestamp));
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					rte_be_to_cpu_64(cq[pos + p2].timestamp));
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					rte_be_to_cpu_64(cq[pos + p3].timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index d122dad4fe..436b247ade 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -271,7 +271,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	uint32x4_t pinfo, cv_flags;
 	uint32x4_t ol_flags =
 		vdupq_n_u32(rxq->rss_hash * PKT_RX_RSS_HASH |
-			    rxq->hw_timestamp * PKT_RX_TIMESTAMP);
+			    rxq->hw_timestamp * rxq->timestamp_rx_flag);
 	const uint32x4_t ptype_ol_mask = { 0x106, 0x106, 0x106, 0x106 };
 	const uint8x16_t cv_flag_sel = {
 		0,
@@ -697,6 +697,7 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		rxq_cq_to_ptype_oflags_v(rxq, ptype_info, flow_tag,
 					 opcode, &elts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
@@ -704,36 +705,36 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 				ts = rte_be_to_cpu_64
 					(container_of(p0, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p1, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p2, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p3, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				elts[pos]->timestamp = rte_be_to_cpu_64
-					(container_of(p0, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 1]->timestamp = rte_be_to_cpu_64
-					(container_of(p1, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 2]->timestamp = rte_be_to_cpu_64
-					(container_of(p2, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 3]->timestamp = rte_be_to_cpu_64
-					(container_of(p3, struct mlx5_cqe,
-						      pkt_info)->timestamp);
+				mlx5_timestamp_set(elts[pos], offset,
+					rte_be_to_cpu_64(container_of(p0,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 1], offset,
+					rte_be_to_cpu_64(container_of(p1,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 2], offset,
+					rte_be_to_cpu_64(container_of(p2,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 3], offset,
+					rte_be_to_cpu_64(container_of(p3,
+					struct mlx5_cqe, pkt_info)->timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 0bbcbeefff..ae4439efc7 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -251,7 +251,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	__m128i pinfo0, pinfo1;
 	__m128i pinfo, ptype;
 	__m128i ol_flags = _mm_set1_epi32(rxq->rss_hash * PKT_RX_RSS_HASH |
-					  rxq->hw_timestamp * PKT_RX_TIMESTAMP);
+					  rxq->hw_timestamp * rxq->timestamp_rx_flag);
 	__m128i cv_flags;
 	const __m128i zero = _mm_setzero_si128();
 	const __m128i ptype_mask =
@@ -656,31 +656,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		/* D.5 fill in mbuf - rearm_data and packet_type. */
 		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
 
 				ts = rte_be_to_cpu_64(cq[pos].timestamp);
-				pkts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
-				pkts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
-				pkts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
-				pkts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				pkts[pos]->timestamp = rte_be_to_cpu_64
-						(cq[pos].timestamp);
-				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p1].timestamp);
-				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p2].timestamp);
-				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p3].timestamp);
+				mlx5_timestamp_set(pkts[pos], offset,
+					rte_be_to_cpu_64(cq[pos].timestamp));
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					rte_be_to_cpu_64(cq[pos + p1].timestamp));
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					rte_be_to_cpu_64(cq[pos + p2].timestamp));
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					rte_be_to_cpu_64(cq[pos + p3].timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
-- 
2.28.0



* [dpdk-dev] [PATCH v4 08/16] net/nfb: switch Rx timestamp to dynamic mbuf field
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (6 preceding siblings ...)
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-11-03 12:21   ` Thomas Monjalon
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 09/16] net/octeontx2: " Thomas Monjalon
                     ` (7 subsequent siblings)
  15 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Martin Spinler

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/nfb/nfb_rx.c | 15 ++++++++++++++-
 drivers/net/nfb/nfb_rx.h | 21 +++++++++++++++++----
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/drivers/net/nfb/nfb_rx.c b/drivers/net/nfb/nfb_rx.c
index d97179f818..d6d4ba9663 100644
--- a/drivers/net/nfb/nfb_rx.c
+++ b/drivers/net/nfb/nfb_rx.c
@@ -9,6 +9,9 @@
 #include "nfb_rx.h"
 #include "nfb.h"
 
+uint64_t nfb_timestamp_rx_dynflag;
+int nfb_timestamp_dynfield_offset = -1;
+
 static int
 timestamp_check_handler(__rte_unused const char *key,
 	const char *value, __rte_unused void *opaque)
@@ -24,6 +27,7 @@ static int
 nfb_check_timestamp(struct rte_devargs *devargs)
 {
 	struct rte_kvargs *kvlist;
+	int ret;
 
 	if (devargs == NULL)
 		return 0;
@@ -38,6 +42,7 @@ nfb_check_timestamp(struct rte_devargs *devargs)
 	}
 	/* Timestamps are enabled when there is
 	 * key-value pair: enable_timestamp=1
+	 * TODO: timestamp should be enabled with DEV_RX_OFFLOAD_TIMESTAMP
 	 */
 	if (rte_kvargs_process(kvlist, TIMESTAMP_ARG,
 		timestamp_check_handler, NULL) < 0) {
@@ -46,6 +51,14 @@ nfb_check_timestamp(struct rte_devargs *devargs)
 	}
 	rte_kvargs_free(kvlist);
 
+	ret = rte_mbuf_dyn_rx_timestamp_register(
+			&nfb_timestamp_dynfield_offset,
+			&nfb_timestamp_rx_dynflag);
+	if (ret != 0) {
+		RTE_LOG(ERR, PMD, "Cannot register Rx timestamp field/flag\n");
+		return -rte_errno;
+	}
+
 	return 1;
 }
 
@@ -125,7 +138,7 @@ nfb_eth_rx_queue_setup(struct rte_eth_dev *dev,
 	else
 		rte_free(rxq);
 
-	if (nfb_check_timestamp(dev->device->devargs))
+	if (nfb_check_timestamp(dev->device->devargs) > 0)
 		rxq->flags |= NFB_TIMESTAMP_FLAG;
 
 	return ret;
diff --git a/drivers/net/nfb/nfb_rx.h b/drivers/net/nfb/nfb_rx.h
index cf3899b2fb..27a2888a75 100644
--- a/drivers/net/nfb/nfb_rx.h
+++ b/drivers/net/nfb/nfb_rx.h
@@ -15,6 +15,16 @@
 
 #define NFB_TIMESTAMP_FLAG (1 << 0)
 
+extern uint64_t nfb_timestamp_rx_dynflag;
+extern int nfb_timestamp_dynfield_offset;
+
+static inline rte_mbuf_timestamp_t *
+nfb_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		nfb_timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 struct ndp_rx_queue {
 	struct nfb_device *nfb;	     /* nfb dev structure */
 	struct ndp_queue *queue;     /* rx queue */
@@ -190,16 +200,19 @@ nfb_eth_ndp_rx(void *queue,
 			mbuf->ol_flags = 0;
 
 			if (timestamping_enabled) {
+				rte_mbuf_timestamp_t timestamp;
+
 				/* nanoseconds */
-				mbuf->timestamp =
+				timestamp =
 					rte_le_to_cpu_32(*((uint32_t *)
 					(packets[i].header + 4)));
-				mbuf->timestamp <<= 32;
+				timestamp <<= 32;
 				/* seconds */
-				mbuf->timestamp |=
+				timestamp |=
 					rte_le_to_cpu_32(*((uint32_t *)
 					(packets[i].header + 8)));
-				mbuf->ol_flags |= PKT_RX_TIMESTAMP;
+				*nfb_timestamp_dynfield(mbuf) = timestamp;
+				mbuf->ol_flags |= nfb_timestamp_rx_dynflag;
 			}
 
 			bufs[num_rx++] = mbuf;
-- 
2.28.0
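
For readers following the Rx path in the patch above, the way the 64-bit
value is assembled before being stored in the dynamic field can be sketched
in isolation. This is only an illustration: load_le32() and
nfb_compose_timestamp() are hypothetical names, not part of the driver, and
the byte-wise load stands in for rte_le_to_cpu_32() on a raw pointer. As in
the patch comments, the word at header + 4 (nanoseconds) ends up in the upper
32 bits and the word at header + 8 (seconds) in the lower 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value byte by byte (no alignment needed). */
static uint32_t
load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
		((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper mirroring the Rx loop: the word at header + 4 goes
 * to the upper half, the word at header + 8 to the lower half.
 */
static uint64_t
nfb_compose_timestamp(const uint8_t *header)
{
	uint64_t timestamp;

	assert(header != NULL);
	timestamp = load_le32(header + 4);
	timestamp <<= 32;
	timestamp |= load_le32(header + 8);
	return timestamp;
}
```

Accumulating into a local variable first, as the patch does, means the
dynamic field is written once instead of three times.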



* [dpdk-dev] [PATCH v4 09/16] net/octeontx2: switch Rx timestamp to dynamic mbuf field
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (7 preceding siblings ...)
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 08/16] net/nfb: " Thomas Monjalon
@ 2020-11-03 12:21   ` Thomas Monjalon
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 10/16] net/pcap: " Thomas Monjalon
                     ` (6 subsequent siblings)
  15 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Nithin Dabilpuram,
	Kiran Kumar K

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

The dynamic offset and flag are stored in struct otx2_timesync_info
to favor cache locality.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/octeontx2/otx2_ethdev.c | 10 ++++++++++
 drivers/net/octeontx2/otx2_rx.h     | 19 ++++++++++++++++---
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
index cfb733a4b5..f6962be9b2 100644
--- a/drivers/net/octeontx2/otx2_ethdev.c
+++ b/drivers/net/octeontx2/otx2_ethdev.c
@@ -2219,6 +2219,16 @@ otx2_nix_dev_start(struct rte_eth_dev *eth_dev)
 	else
 		otx2_nix_timesync_disable(eth_dev);
 
+	if (dev->rx_offload_flags & NIX_RX_OFFLOAD_TSTAMP_F) {
+		rc = rte_mbuf_dyn_rx_timestamp_register(
+				&dev->tstamp.tstamp_dynfield_offset,
+				&dev->tstamp.rx_tstamp_dynflag);
+		if (rc != 0) {
+			otx2_err("Failed to register Rx timestamp field/flag");
+			return -rte_errno;
+		}
+	}
+
 	/* Update VF about data off shifted by 8 bytes if PTP already
 	 * enabled in PF owning this VF
 	 */
diff --git a/drivers/net/octeontx2/otx2_rx.h b/drivers/net/octeontx2/otx2_rx.h
index 61a5c436dd..926f614a4e 100644
--- a/drivers/net/octeontx2/otx2_rx.h
+++ b/drivers/net/octeontx2/otx2_rx.h
@@ -49,6 +49,8 @@ struct otx2_timesync_info {
 	uint64_t	rx_tstamp;
 	rte_iova_t	tx_tstamp_iova;
 	uint64_t	*tx_tstamp;
+	uint64_t	rx_tstamp_dynflag;
+	int		tstamp_dynfield_offset;
 	uint8_t		tx_ready;
 	uint8_t		rx_ready;
 } __rte_cache_aligned;
@@ -63,6 +65,14 @@ union mbuf_initializer {
 	uint64_t value;
 };
 
+static inline rte_mbuf_timestamp_t *
+otx2_timestamp_dynfield(struct rte_mbuf *mbuf,
+		struct otx2_timesync_info *info)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		info->tstamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static __rte_always_inline void
 otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
 			struct otx2_timesync_info *tstamp, const uint16_t flag,
@@ -77,15 +87,18 @@ otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
 		/* Reading the rx timestamp inserted by CGX, viz at
 		 * starting of the packet data.
 		 */
-		mbuf->timestamp = rte_be_to_cpu_64(*tstamp_ptr);
+		*otx2_timestamp_dynfield(mbuf, tstamp) =
+				rte_be_to_cpu_64(*tstamp_ptr);
 		/* PKT_RX_IEEE1588_TMST flag needs to be set only in case
 		 * PTP packets are received.
 		 */
 		if (mbuf->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC) {
-			tstamp->rx_tstamp = mbuf->timestamp;
+			tstamp->rx_tstamp =
+					*otx2_timestamp_dynfield(mbuf, tstamp);
 			tstamp->rx_ready = 1;
 			mbuf->ol_flags |= PKT_RX_IEEE1588_PTP |
-				PKT_RX_IEEE1588_TMST | PKT_RX_TIMESTAMP;
+				PKT_RX_IEEE1588_TMST |
+				tstamp->rx_tstamp_dynflag;
 		}
 	}
 }
-- 
2.28.0



* [dpdk-dev] [PATCH v4 10/16] net/pcap: switch Rx timestamp to dynamic mbuf field
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (8 preceding siblings ...)
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 09/16] net/octeontx2: " Thomas Monjalon
@ 2020-11-03 12:21   ` Thomas Monjalon
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 11/16] app/testpmd: " Thomas Monjalon
                     ` (5 subsequent siblings)
  15 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:21 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/pcap/rte_eth_pcap.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 34e82317b1..4e6d49370e 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -51,6 +51,9 @@ static uint64_t start_cycles;
 static uint64_t hz;
 static uint8_t iface_idx;
 
+static uint64_t timestamp_rx_dynflag;
+static int timestamp_dynfield_offset = -1;
+
 struct queue_stat {
 	volatile unsigned long pkts;
 	volatile unsigned long bytes;
@@ -265,9 +268,11 @@ eth_pcap_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		}
 
 		mbuf->pkt_len = (uint16_t)header.caplen;
-		mbuf->timestamp = (uint64_t)header.ts.tv_sec * 1000000
-							+ header.ts.tv_usec;
-		mbuf->ol_flags |= PKT_RX_TIMESTAMP;
+		*RTE_MBUF_DYNFIELD(mbuf, timestamp_dynfield_offset,
+			rte_mbuf_timestamp_t *) =
+				(uint64_t)header.ts.tv_sec * 1000000 +
+				header.ts.tv_usec;
+		mbuf->ol_flags |= timestamp_rx_dynflag;
 		mbuf->port = pcap_q->port_id;
 		bufs[num_rx] = mbuf;
 		num_rx++;
@@ -656,6 +661,15 @@ eth_dev_stop(struct rte_eth_dev *dev)
 static int
 eth_dev_configure(struct rte_eth_dev *dev __rte_unused)
 {
+	int ret;
+
+	ret = rte_mbuf_dyn_rx_timestamp_register(&timestamp_dynfield_offset,
+			&timestamp_rx_dynflag);
+	if (ret != 0) {
+		PMD_LOG(ERR, "Failed to register Rx timestamp field/flag");
+		return -rte_errno;
+	}
+
 	return 0;
 }
 
-- 
2.28.0



* [dpdk-dev] [PATCH v4 11/16] app/testpmd: switch Rx timestamp to dynamic mbuf field
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (9 preceding siblings ...)
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 10/16] net/pcap: " Thomas Monjalon
@ 2020-11-03 12:22   ` Thomas Monjalon
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 12/16] examples/rxtx_callbacks: switch timestamp to dynamic field Thomas Monjalon
                     ` (4 subsequent siblings)
  15 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:22 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 app/test-pmd/util.c | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 781a813759..649bf8f53a 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -5,6 +5,7 @@
 
 #include <stdio.h>
 
+#include <rte_bitops.h>
 #include <rte_net.h>
 #include <rte_mbuf.h>
 #include <rte_ether.h>
@@ -22,6 +23,39 @@ print_ether_addr(const char *what, const struct rte_ether_addr *eth_addr)
 	printf("%s%s", what, buf);
 }
 
+static inline bool
+is_timestamp_enabled(const struct rte_mbuf *mbuf)
+{
+	static uint64_t timestamp_rx_dynflag;
+	int timestamp_rx_dynflag_offset;
+
+	if (timestamp_rx_dynflag == 0) {
+		timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+		if (timestamp_rx_dynflag_offset < 0)
+			return false;
+		timestamp_rx_dynflag = RTE_BIT64(timestamp_rx_dynflag_offset);
+	}
+
+	return (mbuf->ol_flags & timestamp_rx_dynflag) != 0;
+}
+
+static inline rte_mbuf_timestamp_t
+get_timestamp(const struct rte_mbuf *mbuf)
+{
+	static int timestamp_dynfield_offset = -1;
+
+	if (timestamp_dynfield_offset < 0) {
+		timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+		if (timestamp_dynfield_offset < 0)
+			return 0;
+	}
+
+	return *RTE_MBUF_DYNFIELD(mbuf,
+			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static inline void
 dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
 	      uint16_t nb_pkts, int is_rx)
@@ -107,8 +141,8 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
 				printf("hash=0x%x ID=0x%x ",
 				       mb->hash.fdir.hash, mb->hash.fdir.id);
 		}
-		if (ol_flags & PKT_RX_TIMESTAMP)
-			printf(" - timestamp %"PRIu64" ", mb->timestamp);
+		if (is_timestamp_enabled(mb))
+			printf(" - timestamp %"PRIu64" ", get_timestamp(mb));
 		if (ol_flags & PKT_RX_QINQ)
 			printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x",
 			       mb->vlan_tci, mb->vlan_tci_outer);
-- 
2.28.0



* [dpdk-dev] [PATCH v4 12/16] examples/rxtx_callbacks: switch timestamp to dynamic field
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (10 preceding siblings ...)
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 11/16] app/testpmd: " Thomas Monjalon
@ 2020-11-03 12:22   ` Thomas Monjalon
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 13/16] ethdev: add doxygen comment for Rx timestamp API Thomas Monjalon
                     ` (3 subsequent siblings)
  15 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:22 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, John McNamara

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 examples/rxtx_callbacks/main.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
index 1a8e7d47d9..35c6c39807 100644
--- a/examples/rxtx_callbacks/main.c
+++ b/examples/rxtx_callbacks/main.c
@@ -19,6 +19,15 @@
 #define MBUF_CACHE_SIZE 250
 #define BURST_SIZE 32
 
+static int hwts_dynfield_offset = -1;
+
+static inline rte_mbuf_timestamp_t *
+hwts_field(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+			hwts_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 typedef uint64_t tsc_t;
 static int tsc_dynfield_offset = -1;
 
@@ -77,7 +86,7 @@ calc_latency(uint16_t port, uint16_t qidx __rte_unused,
 	for (i = 0; i < nb_pkts; i++) {
 		cycles += now - *tsc_field(pkts[i]);
 		if (hw_timestamping)
-			queue_ticks += ticks - pkts[i]->timestamp;
+			queue_ticks += ticks - *hwts_field(pkts[i]);
 	}
 
 	latency_numbers.total_cycles += cycles;
@@ -141,6 +150,11 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
 			return -1;
 		}
 		port_conf.rxmode.offloads |= DEV_RX_OFFLOAD_TIMESTAMP;
+		rte_mbuf_dyn_rx_timestamp_register(&hwts_dynfield_offset, NULL);
+		if (hwts_dynfield_offset < 0) {
+			printf("ERROR: Failed to register timestamp field\n");
+			return -rte_errno;
+		}
 	}
 
 	retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
-- 
2.28.0



* [dpdk-dev] [PATCH v4 13/16] ethdev: add doxygen comment for Rx timestamp API
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (11 preceding siblings ...)
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 12/16] examples/rxtx_callbacks: switch timestamp to dynamic field Thomas Monjalon
@ 2020-11-03 12:22   ` Thomas Monjalon
  2020-11-03 12:40     ` Andrew Rybchenko
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 14/16] mbuf: remove deprecated timestamp field Thomas Monjalon
                     ` (2 subsequent siblings)
  15 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:22 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The offload flag DEV_RX_OFFLOAD_TIMESTAMP had no documentation.
After switching to dynamic mbuf flag and field,
it becomes even more important to make the feature behaviour explicit.

A doxygen comment for the timesync API mentioned
the deprecated timestamp field, so it is also updated.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 lib/librte_ethdev/rte_ethdev.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index ba997f16ce..1fc5f662fa 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1344,6 +1344,11 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_VLAN_EXTEND	0x00000400
 #define DEV_RX_OFFLOAD_JUMBO_FRAME	0x00000800
 #define DEV_RX_OFFLOAD_SCATTER		0x00002000
+/**
+ * Timestamp is set by the driver in RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+ * and RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME is set in ol_flags.
+ * The mbuf field and flag are registered when the offload is configured.
+ */
 #define DEV_RX_OFFLOAD_TIMESTAMP	0x00004000
 #define DEV_RX_OFFLOAD_SECURITY         0x00008000
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
@@ -4647,7 +4652,7 @@ int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
  * rte_eth_read_clock(port, base_clock);
  *
  * Then, convert the raw mbuf timestamp with:
- * base_time_sec + (double)(mbuf->timestamp - base_clock) / freq;
+ * base_time_sec + (double)(*timestamp_dynfield(mbuf) - base_clock) / freq;
  *
  * This simple example will not provide a very good accuracy. One must
  * at least measure multiple times the frequency and do a regression.
-- 
2.28.0



* [dpdk-dev] [PATCH v4 14/16] mbuf: remove deprecated timestamp field
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (12 preceding siblings ...)
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 13/16] ethdev: add doxygen comment for Rx timestamp API Thomas Monjalon
@ 2020-11-03 12:22   ` Thomas Monjalon
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 15/16] mbuf: add Tx timestamp registration helper Thomas Monjalon
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 16/16] ethdev: include mbuf registration in Tx timestamp API Thomas Monjalon
  15 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:22 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ajit Khaparde,
	Ray Kinsella, Neil Horman

As announced in the deprecation note, the field timestamp
is removed to give more space to the dynamic fields.
The related offload flag PKT_RX_TIMESTAMP is also removed.

This is what the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
 0    void *                            buf_addr;         /*   0 +  8 */
 1    rte_iova_t                        buf_iova;         /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
 2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
 3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
 4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
 5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
 5.5  uint64_t             union        hash;             /*  44 +  8 */
 6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
 7    uint64_t                          dynfield0[1];     /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
 8    struct rte_mempool *              pool;             /*  64 +  8 */
 9    struct rte_mbuf *                 next;             /*  72 +  8 */
10    uint64_t             union        tx_offload;       /*  80 +  8 */
11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
12    uint16_t                          priv_size;        /*  96 +  2 */
      uint16_t                          timesync;         /*  98 +  2 */
12.5  uint32_t                          dynfield1[7];     /* 100 + 28 */
16    /* --- END                                             128      */
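
The offsets in the table can be cross-checked with a stand-alone mock.
This is not the real struct rte_mbuf: unions are flattened, markers are left
out, and pointers are modeled as uint64_t (an assumption matching a 64-bit
target) so that the arithmetic does not depend on the build host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the layout above; pointers modeled as uint64_t. */
struct mock_mbuf_layout {
	uint64_t buf_addr;       /*   0 +  8 */
	uint64_t buf_iova;       /*   8 +  8 */
	uint16_t data_off;       /*  16 +  2 */
	uint16_t refcnt;         /*  18 +  2 */
	uint16_t nb_segs;        /*  20 +  2 */
	uint16_t port;           /*  22 +  2 */
	uint64_t ol_flags;       /*  24 +  8 */
	uint32_t packet_type;    /*  32 +  4 */
	uint32_t pkt_len;        /*  36 +  4 */
	uint16_t data_len;       /*  40 +  2 */
	uint16_t vlan_tci;       /*  42 +  2 */
	uint32_t hash[2];        /*  44 +  8, 4-byte aligned */
	uint16_t vlan_tci_outer; /*  52 +  2 */
	uint16_t buf_len;        /*  54 +  2 */
	uint64_t dynfield0[1];   /*  56 +  8 */
	uint64_t pool;           /*  64 +  8, second cache line starts */
	uint64_t next;           /*  72 +  8 */
	uint64_t tx_offload;     /*  80 +  8 */
	uint64_t shinfo;         /*  88 +  8 */
	uint16_t priv_size;      /*  96 +  2 */
	uint16_t timesync;       /*  98 +  2 */
	uint32_t dynfield1[7];   /* 100 + 28 */
};

/* dynfield0 fills the last 8 bytes of the first 64-byte cache line. */
_Static_assert(offsetof(struct mock_mbuf_layout, dynfield0) == 56,
		"dynfield0 offset");
_Static_assert(offsetof(struct mock_mbuf_layout, pool) == 64,
		"pool offset");
_Static_assert(sizeof(struct mock_mbuf_layout) == 128, "total size");
```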

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 app/test/test_mbuf.c                   |  1 -
 doc/guides/rel_notes/deprecation.rst   |  4 ----
 doc/guides/rel_notes/release_20_11.rst |  4 ++++
 lib/librte_mbuf/rte_mbuf.c             |  2 --
 lib/librte_mbuf/rte_mbuf.h             |  2 +-
 lib/librte_mbuf/rte_mbuf_core.h        | 12 ++----------
 lib/librte_mbuf/rte_mbuf_dyn.c         |  1 +
 7 files changed, 8 insertions(+), 18 deletions(-)

diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 3a13cf4e1f..a40f7d4883 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1621,7 +1621,6 @@ test_get_rx_ol_flag_name(void)
 		VAL_NAME(PKT_RX_FDIR_FLX),
 		VAL_NAME(PKT_RX_QINQ_STRIPPED),
 		VAL_NAME(PKT_RX_LRO),
-		VAL_NAME(PKT_RX_TIMESTAMP),
 		VAL_NAME(PKT_RX_SEC_OFFLOAD),
 		VAL_NAME(PKT_RX_SEC_OFFLOAD_FAILED),
 		VAL_NAME(PKT_RX_OUTER_L4_CKSUM_BAD),
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index fe3fd3956c..22aecf0bab 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -84,10 +84,6 @@ Deprecation Notices
 * mbuf: Some fields will be converted to dynamic API in DPDK 20.11
   in order to reserve more space for the dynamic fields, as explained in
   `this presentation <https://www.youtube.com/watch?v=Ttl6MlhmzWY>`_.
-  The following static fields will be moved as dynamic:
-
-  - ``timestamp``
-
   As a consequence, the layout of the ``struct rte_mbuf`` will be re-arranged,
   avoiding impact on vectorized implementation of the driver datapaths,
   while evaluating performance gains of a better use of the first cache line.
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index f1a6925678..7c8246d1b3 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -458,6 +458,10 @@ API Changes
 * mbuf: Removed the field ``seqn`` from the structure ``rte_mbuf``.
   It is replaced with dynamic fields.
 
+* mbuf: Removed the field ``timestamp`` from the structure ``rte_mbuf``.
+  It is replaced with the dynamic field RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+  which was previously used only for Tx.
+
 * pci: Removed the ``rte_kernel_driver`` enum defined in rte_dev.h and
   replaced with a private enum in the PCI subsystem.
 
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8a456e5e64..09d93e6899 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -764,7 +764,6 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
 	case PKT_RX_QINQ_STRIPPED: return "PKT_RX_QINQ_STRIPPED";
 	case PKT_RX_QINQ: return "PKT_RX_QINQ";
 	case PKT_RX_LRO: return "PKT_RX_LRO";
-	case PKT_RX_TIMESTAMP: return "PKT_RX_TIMESTAMP";
 	case PKT_RX_SEC_OFFLOAD: return "PKT_RX_SEC_OFFLOAD";
 	case PKT_RX_SEC_OFFLOAD_FAILED: return "PKT_RX_SEC_OFFLOAD_FAILED";
 	case PKT_RX_OUTER_L4_CKSUM_BAD: return "PKT_RX_OUTER_L4_CKSUM_BAD";
@@ -808,7 +807,6 @@ rte_get_rx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
 		{ PKT_RX_FDIR_FLX, PKT_RX_FDIR_FLX, NULL },
 		{ PKT_RX_QINQ_STRIPPED, PKT_RX_QINQ_STRIPPED, NULL },
 		{ PKT_RX_LRO, PKT_RX_LRO, NULL },
-		{ PKT_RX_TIMESTAMP, PKT_RX_TIMESTAMP, NULL },
 		{ PKT_RX_SEC_OFFLOAD, PKT_RX_SEC_OFFLOAD, NULL },
 		{ PKT_RX_SEC_OFFLOAD_FAILED, PKT_RX_SEC_OFFLOAD_FAILED, NULL },
 		{ PKT_RX_QINQ, PKT_RX_QINQ, NULL },
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index a1414ed7cd..17e0b205c0 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1095,6 +1095,7 @@ rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
 static inline void
 rte_mbuf_dynfield_copy(struct rte_mbuf *mdst, const struct rte_mbuf *msrc)
 {
+	memcpy(&mdst->dynfield0, msrc->dynfield0, sizeof(mdst->dynfield0));
 	memcpy(&mdst->dynfield1, msrc->dynfield1, sizeof(mdst->dynfield1));
 }
 
@@ -1108,7 +1109,6 @@ __rte_pktmbuf_copy_hdr(struct rte_mbuf *mdst, const struct rte_mbuf *msrc)
 	mdst->tx_offload = msrc->tx_offload;
 	mdst->hash = msrc->hash;
 	mdst->packet_type = msrc->packet_type;
-	mdst->timestamp = msrc->timestamp;
 	rte_mbuf_dynfield_copy(mdst, msrc);
 }
 
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 3fb5abda3c..38e24a580d 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -149,10 +149,7 @@ extern "C" {
  */
 #define PKT_RX_LRO           (1ULL << 16)
 
-/**
- * Indicate that the timestamp field in the mbuf is valid.
- */
-#define PKT_RX_TIMESTAMP     (1ULL << 17)
+/* There is no flag defined at offset 17. It is free for any future use. */
 
 /**
  * Indicate that security offload processing was applied on the RX packet.
@@ -589,12 +586,7 @@ struct rte_mbuf {
 
 	uint16_t buf_len;         /**< Length of segment buffer. */
 
-	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
-	 * are not normalized but are always the same for a given port.
-	 * Some devices allow to query rte_eth_read_clock that will return the
-	 * current device timestamp.
-	 */
-	uint64_t timestamp;
+	uint64_t dynfield0[1]; /**< Reserved for dynamic fields. */
 
 	/* second cache line - fields only used in slow path or on TX */
 	RTE_MARKER cacheline1 __rte_cache_min_aligned;
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
index 5b608a27d7..4f50da09f3 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.c
+++ b/lib/librte_mbuf/rte_mbuf_dyn.c
@@ -125,6 +125,7 @@ init_shared_mem(void)
 		 * rte_mbuf_dynfield_copy().
 		 */
 		memset(shm, 0, sizeof(*shm));
+		mark_free(dynfield0);
 		mark_free(dynfield1);
 
 		/* init free_flags */
-- 
2.28.0



* [dpdk-dev] [PATCH v4 15/16] mbuf: add Tx timestamp registration helper
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (13 preceding siblings ...)
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 14/16] mbuf: remove deprecated timestamp field Thomas Monjalon
@ 2020-11-03 12:22   ` Thomas Monjalon
  2020-11-03 12:42     ` Andrew Rybchenko
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 16/16] ethdev: include mbuf registration in Tx timestamp API Thomas Monjalon
  15 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:22 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ray Kinsella, Neil Horman

The function rte_mbuf_dyn_tx_timestamp_register()
can be used to register the required field and flag.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 lib/librte_mbuf/rte_mbuf_dyn.c |  7 +++++++
 lib/librte_mbuf/rte_mbuf_dyn.h | 22 ++++++++++++++++++++++
 lib/librte_mbuf/version.map    |  1 +
 3 files changed, 30 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
index 4f50da09f3..101b5bd95f 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.c
+++ b/lib/librte_mbuf/rte_mbuf_dyn.c
@@ -613,3 +613,10 @@ rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag)
 	return rte_mbuf_dyn_timestamp_register(field_offset, rx_flag,
 			"Rx", RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME);
 }
+
+int
+rte_mbuf_dyn_tx_timestamp_register(int *field_offset, uint64_t *tx_flag)
+{
+	return rte_mbuf_dyn_timestamp_register(field_offset, tx_flag,
+			"Tx", RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME);
+}
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 2e729ddaca..d88e7bacc5 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -304,4 +304,26 @@ int rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag);
  */
 #define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Register dynamic mbuf field and flag for Tx timestamp.
+ *
+ * @param field_offset
+ *   Pointer to the offset of the registered mbuf field, can be NULL.
+ *   The same field is shared for Rx and Tx timestamp.
+ * @param tx_flag
+ *   Pointer to the mask of the registered offload flag, can be NULL.
+ * @return
+ *   0 on success, -1 otherwise.
+ *   Possible values for rte_errno:
+ *   - EEXIST: already registered with different parameters.
+ *   - EPERM: called from a secondary process.
+ *   - ENOENT: no more field or flag available.
+ *   - ENOMEM: allocation failure.
+ */
+__rte_experimental
+int rte_mbuf_dyn_tx_timestamp_register(int *field_offset, uint64_t *tx_flag);
+
 #endif
diff --git a/lib/librte_mbuf/version.map b/lib/librte_mbuf/version.map
index 0b66668bff..b7d98e7eb1 100644
--- a/lib/librte_mbuf/version.map
+++ b/lib/librte_mbuf/version.map
@@ -43,6 +43,7 @@ EXPERIMENTAL {
 	rte_mbuf_dynflag_register_bitnum;
 	rte_mbuf_dyn_dump;
 	rte_mbuf_dyn_rx_timestamp_register;
+	rte_mbuf_dyn_tx_timestamp_register;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
 	rte_pktmbuf_pool_create_extbuf;
-- 
2.28.0



* [dpdk-dev] [PATCH v4 16/16] ethdev: include mbuf registration in Tx timestamp API
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
                     ` (14 preceding siblings ...)
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 15/16] mbuf: add Tx timestamp registration helper Thomas Monjalon
@ 2020-11-03 12:22   ` Thomas Monjalon
  2020-11-03 12:45     ` Andrew Rybchenko
  15 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 12:22 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Matan Azrad, Shahaf Shuler

Previously, the Tx timestamp field and flag were registered in testpmd,
as described in the mlx5 guide.
For consistency between Rx and Tx timestamps,
managing mbuf registrations inside the driver, as properly documented,
is a simpler expectation.

The only driver to support this feature (mlx5) is updated
as well as the testpmd application.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 app/test-pmd/config.c          | 38 ----------------------------------
 doc/guides/nics/mlx5.rst       |  5 ++---
 drivers/net/mlx5/mlx5_ethdev.c |  8 ++++++-
 lib/librte_ethdev/rte_ethdev.h |  6 +++++-
 4 files changed, 14 insertions(+), 43 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 1668ae3238..9a2baf16fe 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -3955,44 +3955,6 @@ show_tx_pkt_times(void)
 void
 set_tx_pkt_times(unsigned int *tx_times)
 {
-	uint16_t port_id;
-	int offload_found = 0;
-	int offset;
-	int flag;
-
-	static const struct rte_mbuf_dynfield desc_offs = {
-		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
-		.size = sizeof(uint64_t),
-		.align = __alignof__(uint64_t),
-	};
-	static const struct rte_mbuf_dynflag desc_flag = {
-		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
-	};
-
-	RTE_ETH_FOREACH_DEV(port_id) {
-		struct rte_eth_dev_info dev_info = { 0 };
-		int ret;
-
-		ret = rte_eth_dev_info_get(port_id, &dev_info);
-		if (ret == 0 && dev_info.tx_offload_capa &
-				DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) {
-			offload_found = 1;
-			break;
-		}
-	}
-	if (!offload_found) {
-		printf("No device supporting Tx timestamp scheduling found, "
-		       "dynamic flag and field not registered\n");
-		return;
-	}
-	offset = rte_mbuf_dynfield_register(&desc_offs);
-	if (offset < 0 && rte_errno != EEXIST)
-		printf("Dynamic timestamp field registration error: %d",
-		       rte_errno);
-	flag = rte_mbuf_dynflag_register(&desc_flag);
-	if (flag < 0 && rte_errno != EEXIST)
-		printf("Dynamic timestamp flag registration error: %d",
-		       rte_errno);
 	tx_pkt_times_inter = tx_times[0];
 	tx_pkt_times_intra = tx_times[1];
 }
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index afa65a1379..fa8b13dd1b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -237,9 +237,8 @@ Limitations
   ``txq_inline_max`` and ``txq_inline_mpw`` devargs keys.
 
 - To provide the packet send scheduling on mbuf timestamps the ``tx_pp``
-  parameter should be specified, RTE_MBUF_DYNFIELD_TIMESTAMP_NAME and
-  RTE_MBUF_DYNFLAG_TIMESTAMP_NAME should be registered by application.
-  When PMD sees the RTE_MBUF_DYNFLAG_TIMESTAMP_NAME set on the packet
+  parameter should be specified.
+  When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME set on the packet
   being sent it tries to synchronize the time of packet appearing on
   the wire with the specified packet timestamp. It the specified one
   is in the past it should be ignored, if one is in the distant future
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 7631f644b2..76ef02664f 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -88,7 +88,13 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
 
 	if (dev->data->dev_conf.rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG)
 		dev->data->dev_conf.rxmode.offloads |= DEV_RX_OFFLOAD_RSS_HASH;
-
+	if ((dev->data->dev_conf.txmode.offloads &
+			DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) &&
+			rte_mbuf_dyn_tx_timestamp_register(NULL, NULL) != 0) {
+		DRV_LOG(ERR, "port %u cannot register Tx timestamp field/flag",
+			dev->data->port_id);
+		return -rte_errno;
+	}
 	memcpy(priv->rss_conf.rss_key,
 	       use_app_rss_key ?
 	       dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key :
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 1fc5f662fa..619cbe521e 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1413,7 +1413,11 @@ struct rte_eth_conf {
 #define DEV_TX_OFFLOAD_IP_TNL_TSO       0x00080000
 /** Device supports outer UDP checksum */
 #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
-/** Device supports send on timestamp */
+/**
+ * Device sends on time read from RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+ * if RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME is set in ol_flags.
+ * The mbuf field and flag are registered when the offload is configured.
+ */
 #define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
 /*
  * If new Tx offload capabilities are defined, they also must be
-- 
2.28.0
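
The field-plus-flag contract described in the rte_ethdev.h comment above can be sketched with a toy model. This is not the DPDK API: `toy_mbuf`, `toy_ts_reg` and both helpers are hypothetical names, and the offset and flag bit stand in for whatever the registration performed at configure time would return. The point shown is only that the timestamp is meaningful when, and only when, the Tx flag is set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for an mbuf: a flags word plus a small dynamic area. */
struct toy_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[16];
};

/* Hypothetical result of registering the timestamp field and Tx flag. */
struct toy_ts_reg {
	int field_offset;  /* byte offset of the timestamp inside dyn_area */
	uint64_t tx_flag;  /* ol_flags bit meaning "Tx timestamp is valid" */
};

/* Application side: store the send time and mark it as meaningful. */
static void
toy_set_tx_timestamp(struct toy_mbuf *m, const struct toy_ts_reg *reg,
		uint64_t ts)
{
	assert(reg->field_offset >= 0);
	assert((size_t)reg->field_offset + sizeof(ts) <= sizeof(m->dyn_area));
	memcpy(&m->dyn_area[reg->field_offset], &ts, sizeof(ts));
	m->ol_flags |= reg->tx_flag;
}

/* Driver side: return 1 and fill *ts only if the flag is set, else 0. */
static int
toy_get_tx_timestamp(const struct toy_mbuf *m, const struct toy_ts_reg *reg,
		uint64_t *ts)
{
	if ((m->ol_flags & reg->tx_flag) == 0)
		return 0;
	memcpy(ts, &m->dyn_area[reg->field_offset], sizeof(*ts));
	return 1;
}
```

A forwarded packet whose flag is clear is sent without scheduling in this model, even if stale bytes remain in the dynamic area, which is why the field does not need to be reset during forwarding.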



* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-03 12:10                       ` Morten Brørup
@ 2020-11-03 12:25                         ` Bruce Richardson
  2020-11-03 13:46                           ` Morten Brørup
  2020-11-03 14:02                         ` Slava Ovsiienko
  1 sibling, 1 reply; 170+ messages in thread
From: Bruce Richardson @ 2020-11-03 12:25 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Thomas Monjalon, dev, techboard, Ajit Khaparde, Ananyev,
	Konstantin, Andrew Rybchenko, Yigit, Ferruh, david.marchand,
	olivier.matz, jerinj, viacheslavo, honnappa.nagarahalli,
	maxime.coquelin, stephen, hemant.agrawal, Matan Azrad,
	Shahaf Shuler

On Tue, Nov 03, 2020 at 01:10:05PM +0100, Morten Brørup wrote:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Monday, November 2, 2020 4:58 PM
> > 
> > +Cc techboard
> > 
> > We need benchmark numbers in order to take a decision.
> > Please all, prepare some arguments and numbers so we can discuss
> > the mbuf layout in the next techboard meeting.
> 
> I propose that the techboard considers this from two angles:
> 
> 1. Long term goals and their relative priority. I.e. what can be
> achieved with wide-ranging modifications, requiring yet another ABI
> break and due notices.
> 
> 2. Short term goals, i.e. what can be achieved for this release.
> 
> 
> My suggestions follow...
> 
> 1. Regarding long term goals:
> 
> I have argued that simple forwarding of non-segmented packets using
> only the first mbuf cache line can be achieved by making three
> modifications:
> 
> a) Move m->tx_offload to the first cache line.
> b) Use an 8 bit pktmbuf mempool index in the first cache line,
>    instead of the 64 bit m->pool pointer in the second cache line.
> c) Do not access m->next when we know that it is NULL.
>    We can use m->nb_segs == 1 or some other invariant as the gate.
>    It can be implemented by adding an m->next accessor function:
>    struct rte_mbuf * rte_mbuf_next(struct rte_mbuf * m)
>    {
>        return m->nb_segs == 1 ? NULL : m->next;
>    }
> 
> Regarding the priority of this goal, I guess that simple forwarding
> of non-segmented packets is probably the path taken by the majority
> of packets handled by DPDK.
> 
> 
> An alternative goal could be:
> Do not touch the second cache line during RX.
> A comment in the mbuf structure says so, but it is not true anymore.
>

The comment should be true for non-scattered RX, I believe. I'm not aware
of any use of the second cacheline in the fast-path RX of many drivers. Am I
missing something that has changed recently here?

/Bruce
 


* Re: [dpdk-dev] [PATCH v4 02/16] mbuf: add Rx timestamp flag and helpers
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 02/16] mbuf: add Rx timestamp flag and helpers Thomas Monjalon
@ 2020-11-03 12:34     ` Andrew Rybchenko
  0 siblings, 0 replies; 170+ messages in thread
From: Andrew Rybchenko @ 2020-11-03 12:34 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo, Ray Kinsella, Neil Horman

On 11/3/20 3:21 PM, Thomas Monjalon wrote:
> There is already a dynamic field for timestamp,
> used only for Tx scheduling with the dedicated Tx offload flag.
> The same field can be used for Rx timestamp filled by drivers.
> 
> A new dynamic flag is defined for Rx usage.
> A new function wraps the registration of both field and Rx flag.
> The type rte_mbuf_timestamp_t is defined for the API users.
> 
> After migrating all Rx timestamp usages, it will be possible
> to remove the deprecated timestamp field.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: David Marchand <david.marchand@redhat.com>

Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>



* Re: [dpdk-dev] [PATCH v4 04/16] net/ark: switch Rx timestamp to dynamic mbuf field
  2020-11-03 12:21   ` [dpdk-dev] [PATCH v4 04/16] net/ark: " Thomas Monjalon
@ 2020-11-03 12:37     ` Andrew Rybchenko
  2020-11-03 13:08       ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Andrew Rybchenko @ 2020-11-03 12:37 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo, Shepard Siegel, Ed Czeck, John Miller

On 11/3/20 3:21 PM, Thomas Monjalon wrote:
> The mbuf timestamp is moved to a dynamic field
> in order to allow removal of the deprecated static field.
> The related dynamic mbuf flag is set, although was missing previously.
> 
> The timestamp is set if configured for at least one device.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: David Marchand <david.marchand@redhat.com>

Just one minor comment below

> ---
>  drivers/net/ark/ark_ethdev.c    | 16 ++++++++++++++++
>  drivers/net/ark/ark_ethdev_rx.c |  7 ++++++-
>  drivers/net/ark/ark_ethdev_rx.h |  2 ++
>  3 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
> index fa343999a1..a34dcc5291 100644
> --- a/drivers/net/ark/ark_ethdev.c
> +++ b/drivers/net/ark/ark_ethdev.c
> @@ -79,6 +79,8 @@ static int  eth_ark_set_mtu(struct rte_eth_dev *dev, uint16_t size);
>  #define ARK_TX_MAX_QUEUE (4096 * 4)
>  #define ARK_TX_MIN_QUEUE (256)
>  
> +uint64_t ark_timestamp_rx_dynflag;
> +int ark_timestamp_dynfield_offset = -1;
>  int rte_pmd_ark_rx_userdata_dynfield_offset = -1;
>  int rte_pmd_ark_tx_userdata_dynfield_offset = -1;

It is a bit confusing that the names above differ so much yet are
put in the same block without empty lines in between.
I guess the reason is export/no-export. Maybe it would
be useful to highlight it in a comment.


* Re: [dpdk-dev] [PATCH v4 13/16] ethdev: add doxygen comment for Rx timestamp API
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 13/16] ethdev: add doxygen comment for Rx timestamp API Thomas Monjalon
@ 2020-11-03 12:40     ` Andrew Rybchenko
  0 siblings, 0 replies; 170+ messages in thread
From: Andrew Rybchenko @ 2020-11-03 12:40 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo

On 11/3/20 3:22 PM, Thomas Monjalon wrote:
> The offload flag DEV_RX_OFFLOAD_TIMESTAMP had no documentation.
> After switching to dynamic mbuf flag and field,
> it becomes even more important to explicit the feature behaviour.
> 
> A doxygen comment for the timesync API was mentioning
> the deprecated timestamp field, so it is also updated.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: David Marchand <david.marchand@redhat.com>

Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>


* Re: [dpdk-dev] [PATCH v4 15/16] mbuf: add Tx timestamp registration helper
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 15/16] mbuf: add Tx timestamp registration helper Thomas Monjalon
@ 2020-11-03 12:42     ` Andrew Rybchenko
  0 siblings, 0 replies; 170+ messages in thread
From: Andrew Rybchenko @ 2020-11-03 12:42 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo, Ray Kinsella, Neil Horman

On 11/3/20 3:22 PM, Thomas Monjalon wrote:
> The function rte_mbuf_dyn_tx_timestamp_register()
> can be used to register the required field and flag.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: David Marchand <david.marchand@redhat.com>

Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>



* Re: [dpdk-dev] [PATCH v4 16/16] ethdev: include mbuf registration in Tx timestamp API
  2020-11-03 12:22   ` [dpdk-dev] [PATCH v4 16/16] ethdev: include mbuf registration in Tx timestamp API Thomas Monjalon
@ 2020-11-03 12:45     ` Andrew Rybchenko
  0 siblings, 0 replies; 170+ messages in thread
From: Andrew Rybchenko @ 2020-11-03 12:45 UTC (permalink / raw)
  To: Thomas Monjalon, dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing, Bernard Iremonger,
	Matan Azrad, Shahaf Shuler

On 11/3/20 3:22 PM, Thomas Monjalon wrote:
> Previously, the Tx timestamp field and flag were registered in testpmd,
> as described in mlx5 guide.
> For consistency between Rx and Tx timestamps,
> managing mbuf registrations inside the driver, as properly documented,
> is a simpler expectation.
> 
> The only driver to support this feature (mlx5) is updated
> as well as the testpmd application.
> 
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> Acked-by: David Marchand <david.marchand@redhat.com>

Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>


* Re: [dpdk-dev] [PATCH v4 04/16] net/ark: switch Rx timestamp to dynamic mbuf field
  2020-11-03 12:37     ` Andrew Rybchenko
@ 2020-11-03 13:08       ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 13:08 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, jerinj, viacheslavo, Shepard Siegel, Ed Czeck,
	John Miller

03/11/2020 13:37, Andrew Rybchenko:
> On 11/3/20 3:21 PM, Thomas Monjalon wrote:
> > --- a/drivers/net/ark/ark_ethdev.c
> > +++ b/drivers/net/ark/ark_ethdev.c
> > +uint64_t ark_timestamp_rx_dynflag;
> > +int ark_timestamp_dynfield_offset = -1;
> >  int rte_pmd_ark_rx_userdata_dynfield_offset = -1;
> >  int rte_pmd_ark_tx_userdata_dynfield_offset = -1;
> 
> It is a bit confusing that the names above differ so much yet are
> put in the same block without empty lines in between.
> I guess the reason is export/no-export. Maybe it would
> be useful to highlight it in a comment.

I can add a blank line.
Not sure it is worth an out of scope comment to explain
that "rte_pmd_" prefixed variables are for the PMD-specific API.





* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-03 12:25                         ` Bruce Richardson
@ 2020-11-03 13:46                           ` Morten Brørup
  2020-11-03 13:50                             ` Bruce Richardson
  0 siblings, 1 reply; 170+ messages in thread
From: Morten Brørup @ 2020-11-03 13:46 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Thomas Monjalon, dev, techboard, Ajit Khaparde, Ananyev,
	Konstantin, Andrew Rybchenko, Yigit, Ferruh, david.marchand,
	olivier.matz, jerinj, viacheslavo, honnappa.nagarahalli,
	maxime.coquelin, stephen, hemant.agrawal, Matan Azrad,
	Shahaf Shuler

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Tuesday, November 3, 2020 1:26 PM
> 
> On Tue, Nov 03, 2020 at 01:10:05PM +0100, Morten Brørup wrote:
> > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > Sent: Monday, November 2, 2020 4:58 PM
> > >
> > > +Cc techboard
> > >
> > > We need benchmark numbers in order to take a decision.
> > > Please all, prepare some arguments and numbers so we can discuss
> > > the mbuf layout in the next techboard meeting.
> >
> > I propose that the techboard considers this from two angles:
> >
> > 1. Long term goals and their relative priority. I.e. what can be
> > achieved with wide-ranging modifications, requiring yet another ABI
> > break and due notices.
> >
> > 2. Short term goals, i.e. what can be achieved for this release.
> >
> >
> > My suggestions follow...
> >
> > 1. Regarding long term goals:
> >
> > I have argued that simple forwarding of non-segmented packets using
> > only the first mbuf cache line can be achieved by making three
> > modifications:
> >
> > a) Move m->tx_offload to the first cache line.
> > b) Use an 8 bit pktmbuf mempool index in the first cache line,
> >    instead of the 64 bit m->pool pointer in the second cache line.
> > c) Do not access m->next when we know that it is NULL.
> >    We can use m->nb_segs == 1 or some other invariant as the gate.
> >    It can be implemented by adding an m->next accessor function:
> >    struct rte_mbuf * rte_mbuf_next(struct rte_mbuf * m)
> >    {
> >        return m->nb_segs == 1 ? NULL : m->next;
> >    }
> >
> > Regarding the priority of this goal, I guess that simple forwarding
> > of non-segmented packets is probably the path taken by the majority
> > of packets handled by DPDK.
> >
> >
> > An alternative goal could be:
> > Do not touch the second cache line during RX.
> > A comment in the mbuf structure says so, but it is not true anymore.
> >
> 
> The comment should be true for non-scattered RX, I believe.

You are correct.

My suggestion was unclear: Extend this remark to include segmented packets.

This could be a priority if the techboard considers RX segmented packets more important than my suggestion for single cache line forwarding of non-segmented packets.


> I'm not aware of any use of second cacheline for the fast-path RXs for many drivers.
> Am I missing something that has changed recently here?

Check out eth_igb_recv_pkts() in the E1000 driver: rxm->next = NULL;
Or pmd_rx_burst() in the TAP driver: new_tail->next = seg->next;

Perhaps the documentation should describe best practices for implementing RX and TX functions in drivers, including allocating/freeing mbufs. Or an example dummy Ethernet driver could do it.



* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-03 13:46                           ` Morten Brørup
@ 2020-11-03 13:50                             ` Bruce Richardson
  2020-11-03 14:03                               ` Morten Brørup
  0 siblings, 1 reply; 170+ messages in thread
From: Bruce Richardson @ 2020-11-03 13:50 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Thomas Monjalon, dev, techboard, Ajit Khaparde, Ananyev,
	Konstantin, Andrew Rybchenko, Yigit, Ferruh, david.marchand,
	olivier.matz, jerinj, viacheslavo, honnappa.nagarahalli,
	maxime.coquelin, stephen, hemant.agrawal, Matan Azrad,
	Shahaf Shuler

On Tue, Nov 03, 2020 at 02:46:17PM +0100, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Tuesday, November 3, 2020 1:26 PM
> > 
> > On Tue, Nov 03, 2020 at 01:10:05PM +0100, Morten Brørup wrote:
> > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > Sent: Monday, November 2, 2020 4:58 PM
> > > >
> > > > +Cc techboard
> > > >
> > > > We need benchmark numbers in order to take a decision.
> > > > Please all, prepare some arguments and numbers so we can discuss
> > > > the mbuf layout in the next techboard meeting.
> > >
> > > I propose that the techboard considers this from two angles:
> > >
> > > 1. Long term goals and their relative priority. I.e. what can be
> > > achieved with wide-ranging modifications, requiring yet another ABI
> > > break and due notices.
> > >
> > > 2. Short term goals, i.e. what can be achieved for this release.
> > >
> > >
> > > My suggestions follow...
> > >
> > > 1. Regarding long term goals:
> > >
> > > I have argued that simple forwarding of non-segmented packets using
> > > only the first mbuf cache line can be achieved by making three
> > > modifications:
> > >
> > > a) Move m->tx_offload to the first cache line.
> > > b) Use an 8 bit pktmbuf mempool index in the first cache line,
> > >    instead of the 64 bit m->pool pointer in the second cache line.
> > > c) Do not access m->next when we know that it is NULL.
> > >    We can use m->nb_segs == 1 or some other invariant as the gate.
> > >    It can be implemented by adding an m->next accessor function:
> > >    struct rte_mbuf * rte_mbuf_next(struct rte_mbuf * m)
> > >    {
> > >        return m->nb_segs == 1 ? NULL : m->next;
> > >    }
> > >
> > > Regarding the priority of this goal, I guess that simple forwarding
> > > of non-segmented packets is probably the path taken by the majority
> > > of packets handled by DPDK.
> > >
> > >
> > > An alternative goal could be:
> > > Do not touch the second cache line during RX.
> > > A comment in the mbuf structure says so, but it is not true anymore.
> > >
> > 
> > The comment should be true for non-scattered RX, I believe.
> 
> You are correct.
> 
> My suggestion was unclear: Extend this remark to include segmented packets.
> 
> This could be a priority if the techboard considers RX segmented packets more important than my suggestion for single cache line forwarding of non-segmented packets.
> 
> 
> > I'm not aware of any use of second cacheline for the fast-path RXs for many drivers.
> > Am I missing something that has changed recently here?
> 
> Check out eth_igb_recv_pkts() in the E1000 driver: rxm->next = NULL;
> Or pmd_rx_burst() in the TAP driver: new_tail->next = seg->next;
> 
> Perhaps the documentation should describe best practices for implementing RX and TX functions in drivers, including allocating/freeing mbufs. Or an example dummy Ethernet driver could do it.
> 

Yes, perhaps I should be clearer about the "fast-path", because I was
thinking of the optimized RX/TX paths for those nics at 10G and above.
Probably the documentation should indeed have an update clarifying things a
bit, since using only the first cacheline is possible but not mandatory for
simple RX.

/Bruce


* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-03 12:10                       ` Morten Brørup
  2020-11-03 12:25                         ` Bruce Richardson
@ 2020-11-03 14:02                         ` Slava Ovsiienko
  2020-11-03 15:03                           ` Morten Brørup
  1 sibling, 1 reply; 170+ messages in thread
From: Slava Ovsiienko @ 2020-11-03 14:02 UTC (permalink / raw)
  To: Morten Brørup, NBU-Contact-Thomas Monjalon, dev, techboard
  Cc: Ajit Khaparde, Ananyev, Konstantin, Andrew Rybchenko, dev, Yigit,
	Ferruh, david.marchand, Richardson, Bruce, olivier.matz, jerinj,
	honnappa.nagarahalli, maxime.coquelin, stephen, hemant.agrawal,
	Matan Azrad, Shahaf Shuler

Hi, Morten

> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Tuesday, November 3, 2020 14:10
> To: NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org;
> techboard@dpdk.org
> Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>; dev@dpdk.org; Yigit, Ferruh
> <ferruh.yigit@intel.com>; david.marchand@redhat.com; Richardson, Bruce
> <bruce.richardson@intel.com>; olivier.matz@6wind.com; jerinj@marvell.com;
> Slava Ovsiienko <viacheslavo@nvidia.com>; honnappa.nagarahalli@arm.com;
> maxime.coquelin@redhat.com; stephen@networkplumber.org;
> hemant.agrawal@nxp.com; Slava Ovsiienko <viacheslavo@nvidia.com>; Matan
> Azrad <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>
> Subject: RE: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first
> half
> 
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Monday, November 2, 2020 4:58 PM
> >
> > +Cc techboard
> >
> > We need benchmark numbers in order to take a decision.
> > Please all, prepare some arguments and numbers so we can discuss the
> > mbuf layout in the next techboard meeting.
> 
> I propose that the techboard considers this from two angles:
> 
> 1. Long term goals and their relative priority. I.e. what can be achieved with
> wide-ranging modifications, requiring yet another ABI break and due notices.
> 
> 2. Short term goals, i.e. what can be achieved for this release.
> 
> 
> My suggestions follow...
> 
> 1. Regarding long term goals:
> 
> I have argued that simple forwarding of non-segmented packets using only the
> first mbuf cache line can be achieved by making three
> modifications:
> 
> a) Move m->tx_offload to the first cache line.
Not all PMDs use this field on Tx. HW might support the checksum offloads
directly, not requiring these fields at all. 


> b) Use an 8 bit pktmbuf mempool index in the first cache line,
>    instead of the 64 bit m->pool pointer in the second cache line.
256 mempools look like enough to me. Regarding the indirect access to the pool
(via some table) - it might have some performance impact. For example, the
mlx5 PMD strongly relies on the pool field for allocating mbufs in the Rx datapath.
We're going to update that (oh, we found a point to optimize), but for now it does.
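
For reference, the indirection being discussed could look roughly like the sketch below. It is a toy model under stated assumptions, not DPDK code: `toy_pool`, `toy_pool_table`, `toy_pool_register()` and `toy_mbuf_pool()` are hypothetical names, and the table size of 256 simply matches the 8 bit index from the proposal. The extra table load in `toy_mbuf_pool()` is the cost that would need to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_POOLS 256  /* everything an 8 bit index can address */

/* Toy stand-in for a mempool. */
struct toy_pool {
	const char *name;
};

/* Global index -> pool pointer table, similar in spirit to port ids. */
static struct toy_pool *toy_pool_table[TOY_MAX_POOLS];
static unsigned int toy_pool_count;

/* Register a pool; returns its index, or -1 when the table is full. */
static int
toy_pool_register(struct toy_pool *p)
{
	assert(p != NULL);
	if (toy_pool_count >= TOY_MAX_POOLS)
		return -1;
	toy_pool_table[toy_pool_count] = p;
	return (int)toy_pool_count++;
}

/* Toy mbuf carrying 1 byte instead of an 8 byte pool pointer. */
struct toy_mbuf {
	uint8_t pool_id;
};

/* Resolve the pool of an mbuf through the table (one extra load). */
static struct toy_pool *
toy_mbuf_pool(const struct toy_mbuf *m)
{
	return toy_pool_table[m->pool_id];
}
```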

> c) Do not access m->next when we know that it is NULL.
>    We can use m->nb_segs == 1 or some other invariant as the gate.
>    It can be implemented by adding an m->next accessor function:
>    struct rte_mbuf * rte_mbuf_next(struct rte_mbuf * m)
>    {
>        return m->nb_segs == 1 ? NULL : m->next;
>    }

Sorry, not sure about this. IIRC, nb_segs is valid in the first segment/mbuf only.
If we have 4 segments in the pkt, we see nb_segs=4 in the first one and nb_segs=1
in the others. The next field is NULL in the last mbuf only. Am I wrong or missing something?
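
To make the concern concrete, here is a small self-contained sketch. It uses a toy struct, not the real rte_mbuf, and assumes, as described above, that non-head segments carry nb_segs == 1. `toy_mbuf_next()` mirrors the accessor quoted above; the other names are hypothetical helpers for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy mbuf with only the two fields that matter here. */
struct toy_mbuf {
	struct toy_mbuf *next;
	uint16_t nb_segs;
};

/* The gated accessor as quoted above, applied to the toy struct. */
static struct toy_mbuf *
toy_mbuf_next(struct toy_mbuf *m)
{
	return m->nb_segs == 1 ? NULL : m->next;
}

/* Link n segments: the head gets nb_segs = n, all others nb_segs = 1. */
static void
toy_chain_init(struct toy_mbuf *segs, uint16_t n)
{
	assert(n >= 1);
	for (uint16_t i = 0; i < n; i++) {
		segs[i].next = (i + 1 < n) ? &segs[i + 1] : NULL;
		segs[i].nb_segs = 1;
	}
	segs[0].nb_segs = n;
}

/* Count segments by following the raw next pointer. */
static unsigned int
toy_count_raw(struct toy_mbuf *m)
{
	unsigned int count = 0;

	for (; m != NULL; m = m->next)
		count++;
	return count;
}

/* Count segments by following the gated accessor at every step. */
static unsigned int
toy_count_gated(struct toy_mbuf *m)
{
	unsigned int count = 0;

	for (; m != NULL; m = toy_mbuf_next(m))
		count++;
	return count;
}
```

With a 4-segment chain built by `toy_chain_init()`, the raw walk visits all four segments, while the gated walk stops after the second one, because the second segment already reports nb_segs == 1. Under this assumption the accessor would be safe on the head mbuf only.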

> Regarding the priority of this goal, I guess that simple forwarding of non-
> segmented packets is probably the path taken by the majority of packets
> handled by DPDK.
> 
> An alternative goal could be:
> Do not touch the second cache line during RX.
> A comment in the mbuf structure says so, but it is not true anymore.
> 
> (I guess that regression testing didn't catch this because the tests perform TX
> immediately after RX, so the cache miss just moves from the TX to the RX part
> of the test application.)
> 
> 
> 2. Regarding short term goals:
> 
> The current DPDK source code looks to me like m->next is the most frequently
> accessed field in the second cache line, so it makes sense moving this to the
> first cache line, rather than m->pool.
> Benchmarking may help here.

Moreover, segmented packets are supposed to be large, which implies a
relatively low packet rate, so the gain from moving next to the 1st cache
line might be negligible. Just compare 148Mpps of 64B pkts and 4Mpps of
3000B pkts over a 100Gbps link. We are currently benchmarking and have not
yet managed to find a difference. The benefit can't be expressed as an Mpps delta;
we should measure CPU clocks, but the Rx queue is almost always empty, so we
spin in empty loops. So, if there is a boost, it is extremely hard to catch.
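
The two rates above can be reproduced with a short calculation. This is a sketch that assumes standard Ethernet framing overhead of 20 bytes per frame on the wire (8 bytes preamble/SFD plus 12 bytes inter-frame gap) and that the quoted sizes already include the FCS; `line_rate_mpps()` is a hypothetical helper name.

```c
#include <assert.h>

/* Assumed per-frame wire overhead: 8 B preamble/SFD + 12 B inter-frame gap. */
#define WIRE_OVERHEAD_BYTES 20.0

/* Theoretical maximum packet rate in Mpps for a link speed and frame size. */
static double
line_rate_mpps(double link_gbps, double frame_bytes)
{
	assert(link_gbps > 0.0 && frame_bytes > 0.0);
	double bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8.0;

	return link_gbps * 1e9 / bits_per_frame / 1e6;
}
```

`line_rate_mpps(100.0, 64.0)` gives about 148.8 and `line_rate_mpps(100.0, 3000.0)` about 4.1, so the per-packet CPU budget is roughly 36 times larger for the 3000B case.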

With best regards, Slava

>
> 
> If we - without breaking the ABI - can introduce a gate to avoid accessing
> m->next when we know that it is NULL, we should keep it in the second cache
> line.
> 
> In this case, I would prefer to move m->tx_offload to the first cache line,
> thereby providing a field available for application use, until the application
> prepares the packet for transmission.
> 
> 
> >
> >
> > 01/11/2020 21:59, Morten Brørup:
> > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > Sent: Sunday, November 1, 2020 5:38 PM
> > > >
> > > > 01/11/2020 10:12, Morten Brørup:
> > > > > One thing has always puzzled me:
> > > > > Why do we use 64 bits to indicate which memory pool an mbuf
> > > > > belongs to?
> > > > > The portid only uses 16 bits and an indirection index.
> > > > > Why don't we use the same kind of indirection index for mbuf
> > pools?
> > > >
> > > > I wonder what would be the cost of indirection. Probably negligible.
> > >
> > > Probably. The portid does it, and that indirection is heavily used
> > everywhere.
> > >
> > > The size of mbuf memory pool indirection array should be compile
> > > time
> > configurable, like the size of the portid indirection array.
> > >
> > > And for reference, the indirection array will fit into one cache
> > > line
> > if we default to 8 mbuf pools, thus supporting an 8 CPU socket system
> > with one mbuf pool per CPU socket, or a 4 CPU socket system with two
> > mbuf pools per CPU socket.
> > >
> > > (And as a side note: Our application is optimized for single-socket
> > systems, and we only use one mbuf pool. I guess many applications were
> > developed without carefully optimizing for multi-socket systems, and
> > also just use one mbuf pool. In these cases, the mbuf structure
> > doesn't really need a pool field. But it is still there, and the DPDK
> > libraries use it, so we didn't bother removing it.)
> > >
> > > > I think it is a good proposal...
> > > > ... for next year, after a deprecation notice.
> > > >
> > > > > I can easily imagine using one mbuf pool (or perhaps a few
> > > > > pools) per CPU socket (or per physical memory bus closest to an
> > > > > attached
> > NIC),
> > > > > but not more than 256 mbuf memory pools in total.
> > > > > So, let's introduce an mbufpoolid like the portid, and cut this
> > > > > mbuf field down from 64 to 8 bits.
> >
> > We will need to measure the perf of the solution.
> > There is a chance for the cost to be too high.
> >
> >
> > > > > If we also cut down m->pkt_len from 32 to 24 bits,
> > > >
> > > > Who is using packets larger than 64k? Are 16 bits enough?
> > >
> > > I personally consider 64k a reasonable packet size limit. Exotic
> > applications with even larger packets would have to live with this
> > constraint. But let's see if there are any objections. For reference,
> > 64k corresponds to ca. 44 Ethernet (1500 byte) packets.
> > >
> > > (The limit could be 65535 bytes, to avoid translation of the value 0
> > into 65536 bytes.)
> > >
> > > This modification would go nicely hand in hand with the mbuf pool
> > indirection modification.
> > >
> > > ... after yet another round of ABI stability discussions,
> > > deprecation notices, and so on. :-)
> >
> > After more thoughts, I'm afraid 64k is too small in some cases.
> > And 24-bit manipulation would probably break performance.
> > I'm afraid we are stuck with 32-bit length.
> 
> Yes, 24 bit manipulation would probably break performance.
> 
> Perhaps a solution exists with 16 bits (least significant bits) for the common
> cases, and 8 bits more (most significant bits) for the less common cases. Just
> thinking out loud here...
> 
> >
> > > > > we can get the 8 bit mbuf pool index into the first cache line
> > > > > at no additional cost.
> > > >
> > > > I like the idea.
> > > > It means we don't need to move the pool pointer now, i.e. it does
> > > > not have to replace the timestamp field.
> > >
> > > Agreed! Don't move m->pool to the first cache line; it is not used
> > for RX.
> > >
> > > >
> > > > > In other words: This would free up another 64 bit field in the
> > mbuf
> > > > structure!
> > > >
> > > > That would be great!
> > > >
> > > >
> > > > > And even though the m->next pointer for scattered packets
> > > > > resides in the second cache line, the libraries and application
> > > > > knows that m->next is NULL when m->nb_segs is 1.
> > > > > This proves that my suggestion would make touching the second
> > > > > cache line unnecessary (in simple cases), even for
> > > > > re-initializing the mbuf.
> > > >
> > > > So you think the "next" pointer should stay in the second half of
> > mbuf?
> > > >
> > > > I feel you would like to move the Tx offloads in the first half to
> > > > improve performance of very simple apps.
> > >
> > > "Very simple apps" sounds like a minority of apps. I would rather
> > > say
> > "very simple packet handling scenarios", e.g. forwarding of normal
> > size non-segmented packets. I would guess that the vast majority of
> > packets handled by DPDK applications actually match this scenario. So
> > I'm proposing to optimize for what I think is the most common scenario.
> > >
> > > If segmented packets are common, then m->next could be moved to the
> > first cache line. But it will only improve the pure RX steps of the
> > pipeline. When preparing the packet for TX, m->tx_offloads will need
> > to be set, and the second cache line comes into play. So I'm wondering
> > how big the benefit of having m->next in the first cache line really
> > is - assuming that m->nb_segs will be checked before accessing m->next.
> > >
> > > > I am thinking the opposite: we could have some dynamic fields
> > > > space in the first half to improve performance of complex Rx.
> > > > Note: we can add a flag hint for field registration in this first
> > half.
> > > >
> > >
> > > I have had the same thoughts. However, I would prefer being able to
> > forward ordinary packets without using the second mbuf cache line at
> > all (although only in specific scenarios like my example above).
> > >
> > > Furthermore, the application can abuse the 64 bit m->tx_offload field
> > > for private purposes until it is time to prepare the packet for TX and
> > > pass it on to the driver. This hack somewhat resembles a dynamic field
> > > in the first cache line, and will not be possible if the m->pool or
> > > m->next field is moved there.
> >
> >
> >



* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-03 13:50                             ` Bruce Richardson
@ 2020-11-03 14:03                               ` Morten Brørup
  0 siblings, 0 replies; 170+ messages in thread
From: Morten Brørup @ 2020-11-03 14:03 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Thomas Monjalon, dev, techboard, Ajit Khaparde, Ananyev,
	Konstantin, Andrew Rybchenko, Yigit, Ferruh, david.marchand,
	olivier.matz, jerinj, viacheslavo, honnappa.nagarahalli,
	maxime.coquelin, stephen, hemant.agrawal, Matan Azrad,
	Shahaf Shuler

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Tuesday, November 3, 2020 2:50 PM
> 
> On Tue, Nov 03, 2020 at 02:46:17PM +0100, Morten Brørup wrote:
> > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > Sent: Tuesday, November 3, 2020 1:26 PM
> > >
> > > On Tue, Nov 03, 2020 at 01:10:05PM +0100, Morten Brørup wrote:
> > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > Sent: Monday, November 2, 2020 4:58 PM
> > > > >
> > > > > +Cc techboard
> > > > >
> > > > > We need benchmark numbers in order to take a decision.
> > > > > Please all, prepare some arguments and numbers so we can discuss
> > > > > the mbuf layout in the next techboard meeting.
> > > >
> > > > I propose that the techboard considers this from two angles:
> > > >
> > > > 1. Long term goals and their relative priority. I.e. what can be
> > > > achieved with wide-ranging modifications, requiring yet another ABI
> > > > break and due notices.
> > > >
> > > > 2. Short term goals, i.e. what can be achieved for this release.
> > > >
> > > >
> > > > My suggestions follow...
> > > >
> > > > 1. Regarding long term goals:
> > > >
> > > > I have argued that simple forwarding of non-segmented packets using
> > > > only the first mbuf cache line can be achieved by making three
> > > > modifications:
> > > >
> > > > a) Move m->tx_offload to the first cache line.
> > > > b) Use an 8 bit pktmbuf mempool index in the first cache line,
> > > >    instead of the 64 bit m->pool pointer in the second cache line.
> > > > c) Do not access m->next when we know that it is NULL.
> > > >    We can use m->nb_segs == 1 or some other invariant as the gate.
> > > >    It can be implemented by adding an m->next accessor function:
> > > >    struct rte_mbuf * rte_mbuf_next(struct rte_mbuf * m)
> > > >    {
> > > >        return m->nb_segs == 1 ? NULL : m->next;
> > > >    }
> > > >
> > > > Regarding the priority of this goal, I guess that simple
> forwarding
> > > > of non-segmented packets is probably the path taken by the
> majority
> > > > of packets handled by DPDK.
> > > >
> > > >
> > > > An alternative goal could be:
> > > > Do not touch the second cache line during RX.
> > > > A comment in the mbuf structure says so, but it is not true
> anymore.
> > > >
> > >
> > > The comment should be true for non-scattered RX, I believe.
> >
> > You are correct.
> >
> > My suggestion was unclear: Extend this remark to include segmented
> packets.
> >
> > This could be a priority if the techboard considers RX segmented
> packets more important than my suggestion for single cache line
> forwarding of non-segmented packets.
> >
> >
> > > I'm not aware of any use of second cacheline for the fast-path RXs
> for many drivers.
> > > Am I missing something that has changed recently here?
> >
> > Check out eth_igb_recv_pkts() in the E1000 driver: rxm->next = NULL;
> > Or pmd_rx_burst() in the TAP driver: new_tail->next = seg->next;
> >
> > Perhaps the documentation should describe best practices for
> implementing RX and TX functions in drivers, including
> allocating/freeing mbufs. Or an example dummy Ethernet driver could do
> it.
> >
> 
> Yes, perhaps I should be clearer about the "fast-path", because I was
> thinking of the optimized RX/TX paths for those nics at 10G and above.
> Probably the documentation should indeed have an update clarifying
> things a
> bit, since using only the first cacheline is possible but not mandatory
> for
> simple RX.

I sometimes look at the source code of the simple drivers for reference, as they are easier to understand than the advanced vector drivers.
I suppose new PMD developers also would. :-)

Anyway, it is probably a good idea to add a clarifying note to the documentation, thus reflecting reality.

Just make sure that it says that the second cache line is supposed to be untouched by RX of high performance drivers, so application developers still consider it cold.


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp
  2020-10-29  9:27 [dpdk-dev] [PATCH 00/15] remove mbuf timestamp Thomas Monjalon
                   ` (17 preceding siblings ...)
  2020-11-03 12:21 ` [dpdk-dev] [PATCH v4 " Thomas Monjalon
@ 2020-11-03 14:09 ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 01/16] eventdev: remove software Rx timestamp Thomas Monjalon
                     ` (17 more replies)
  18 siblings, 18 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The mbuf field timestamp was announced to be removed for three reasons:
  - a dynamic field already exists, used for Tx only
  - this field always used 8 bytes even if unneeded
  - this field is in the first half (cacheline) of mbuf

After this series, the dynamic field timestamp is used for both Rx and Tx
with separate dynamic flags to distinguish when the value is meaningful
without resetting the field during forwarding.

As a consequence, 8 bytes can be re-allocated to dynamic fields
in the first half of mbuf structure.
The mbuf layout is still open to further changes.

This mbuf layout change is important to allow adding more features
(consuming more dynamic fields) during the next year,
and can allow performance improvements with new usages in the first half.


v5:
- add a blank line between different kinds of ARK variables
- move registration after octeontx2 VF config
- register also in otx2_nix_timesync_enable

v4:
- use local variable in nfb
- fix flag initialization
- remove useless blank line

v3:
- move ark variable declarations to a .h file
- improve cache locality for octeontx2
- add comments about cache locality in commit logs
- add comment for unused flag offset 17
- add timestamp register functions
- replace lookup with register in drivers and apps
- remove register in ethdev

v2:
- remove optimization to register only once in ethdev
- fix error message in latencystats
- convert rxtx_callbacks macro to inline function
- increase dynamic fields space
- do not move pool field


Thomas Monjalon (16):
  eventdev: remove software Rx timestamp
  mbuf: add Rx timestamp flag and helpers
  latency: switch Rx timestamp to dynamic mbuf field
  net/ark: switch Rx timestamp to dynamic mbuf field
  net/dpaa2: switch Rx timestamp to dynamic mbuf field
  net/mlx5: fix dynamic mbuf offset lookup check
  net/mlx5: switch Rx timestamp to dynamic mbuf field
  net/nfb: switch Rx timestamp to dynamic mbuf field
  net/octeontx2: switch Rx timestamp to dynamic mbuf field
  net/pcap: switch Rx timestamp to dynamic mbuf field
  app/testpmd: switch Rx timestamp to dynamic mbuf field
  examples/rxtx_callbacks: switch timestamp to dynamic field
  ethdev: add doxygen comment for Rx timestamp API
  mbuf: remove deprecated timestamp field
  mbuf: add Tx timestamp registration helper
  ethdev: include mbuf registration in Tx timestamp API

 app/test-pmd/config.c                         | 38 -------------
 app/test-pmd/util.c                           | 38 ++++++++++++-
 app/test/test_mbuf.c                          |  1 -
 doc/guides/nics/mlx5.rst                      |  5 +-
 .../prog_guide/event_ethernet_rx_adapter.rst  |  6 +-
 doc/guides/rel_notes/deprecation.rst          |  4 --
 doc/guides/rel_notes/release_20_11.rst        |  4 ++
 drivers/net/ark/ark_ethdev.c                  | 17 ++++++
 drivers/net/ark/ark_ethdev_rx.c               |  7 ++-
 drivers/net/ark/ark_ethdev_rx.h               |  2 +
 drivers/net/dpaa2/dpaa2_ethdev.c              | 11 ++++
 drivers/net/dpaa2/dpaa2_ethdev.h              |  2 +
 drivers/net/dpaa2/dpaa2_rxtx.c                | 25 ++++++---
 drivers/net/mlx5/mlx5_ethdev.c                |  8 ++-
 drivers/net/mlx5/mlx5_rxq.c                   |  8 +++
 drivers/net/mlx5/mlx5_rxtx.c                  |  8 +--
 drivers/net/mlx5/mlx5_rxtx.h                  | 19 +++++++
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h      | 41 +++++++-------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h         | 43 ++++++++-------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h          | 35 ++++++------
 drivers/net/mlx5/mlx5_trigger.c               |  2 +-
 drivers/net/mlx5/mlx5_txq.c                   |  2 +-
 drivers/net/nfb/nfb_rx.c                      | 15 ++++-
 drivers/net/nfb/nfb_rx.h                      | 21 +++++--
 drivers/net/octeontx2/otx2_ethdev.c           | 10 ++++
 drivers/net/octeontx2/otx2_ptp.c              |  8 +++
 drivers/net/octeontx2/otx2_rx.h               | 19 ++++++-
 drivers/net/pcap/rte_eth_pcap.c               | 20 ++++++-
 examples/rxtx_callbacks/main.c                | 16 +++++-
 lib/librte_ethdev/rte_ethdev.h                | 14 ++++-
 .../rte_event_eth_rx_adapter.c                | 11 ----
 .../rte_event_eth_rx_adapter.h                |  6 +-
 lib/librte_latencystats/rte_latencystats.c    | 30 ++++++++--
 lib/librte_mbuf/rte_mbuf.c                    |  2 -
 lib/librte_mbuf/rte_mbuf.h                    |  2 +-
 lib/librte_mbuf/rte_mbuf_core.h               | 12 +---
 lib/librte_mbuf/rte_mbuf_dyn.c                | 51 +++++++++++++++++
 lib/librte_mbuf/rte_mbuf_dyn.h                | 55 +++++++++++++++++--
 lib/librte_mbuf/version.map                   |  2 +
 39 files changed, 440 insertions(+), 180 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 01/16] eventdev: remove software Rx timestamp
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 02/16] mbuf: add Rx timestamp flag and helpers Thomas Monjalon
                     ` (16 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Nikhil Rao

This is a revert of commit 569758758dcd ("eventdev: add Rx timestamp").
If the Rx timestamp is not configured on the ethdev port,
there is no reason to set one.
Also, the accuracy of the timestamp was poor because it was set at a late stage.
Anyway, there is no trace of any usage of this timestamp.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 doc/guides/prog_guide/event_ethernet_rx_adapter.rst |  6 +-----
 lib/librte_eventdev/rte_event_eth_rx_adapter.c      | 11 -----------
 lib/librte_eventdev/rte_event_eth_rx_adapter.h      |  6 +-----
 3 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
index 236f43f455..cb44ce0e47 100644
--- a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
+++ b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
@@ -12,11 +12,7 @@ be supported in hardware or require a software thread to receive packets from
 the ethdev port using ethdev poll mode APIs and enqueue these as events to the
 event device using the eventdev API. Both transfer mechanisms may be present on
 the same platform depending on the particular combination of the ethdev and
-the event device. For SW based packet transfer, if the mbuf does not have a
-timestamp set, the adapter adds a timestamp to the mbuf using
-rte_get_tsc_cycles(), this provides a more accurate timestamp as compared to
-if the application were to set the timestamp since it avoids event device
-schedule latency.
+the event device.
 
 The Event Ethernet Rx Adapter library is intended for the application code to
 configure both transfer mechanisms using a common API. A capability API allows
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.c b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
index f0000d1ede..3c73046551 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.c
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
@@ -763,7 +763,6 @@ rxa_buffer_mbufs(struct rte_event_eth_rx_adapter *rx_adapter,
 	uint32_t rss_mask;
 	uint32_t rss;
 	int do_rss;
-	uint64_t ts;
 	uint16_t nb_cb;
 	uint16_t dropped;
 
@@ -771,16 +770,6 @@ rxa_buffer_mbufs(struct rte_event_eth_rx_adapter *rx_adapter,
 	rss_mask = ~(((m->ol_flags & PKT_RX_RSS_HASH) != 0) - 1);
 	do_rss = !rss_mask && !eth_rx_queue_info->flow_id_mask;
 
-	if ((m->ol_flags & PKT_RX_TIMESTAMP) == 0) {
-		ts = rte_get_tsc_cycles();
-		for (i = 0; i < num; i++) {
-			m = mbufs[i];
-
-			m->timestamp = ts;
-			m->ol_flags |= PKT_RX_TIMESTAMP;
-		}
-	}
-
 	for (i = 0; i < num; i++) {
 		m = mbufs[i];
 
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.h b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
index 2dd259c279..21bb1e54c8 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.h
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
@@ -21,11 +21,7 @@
  *
  * The adapter uses a EAL service core function for SW based packet transfer
  * and uses the eventdev PMD functions to configure HW based packet transfer
- * between the ethernet device and the event device. For SW based packet
- * transfer, if the mbuf does not have a timestamp set, the adapter adds a
- * timestamp to the mbuf using rte_get_tsc_cycles(), this provides a more
- * accurate timestamp as compared to if the application were to set the time
- * stamp since it avoids event device schedule latency.
+ * between the ethernet device and the event device.
  *
  * The ethernet Rx event adapter's functions are:
  *  - rte_event_eth_rx_adapter_create_ext()
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 02/16] mbuf: add Rx timestamp flag and helpers
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 01/16] eventdev: remove software Rx timestamp Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 03/16] latency: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
                     ` (15 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ray Kinsella, Neil Horman

There is already a dynamic field for timestamp,
used only for Tx scheduling with the dedicated Tx offload flag.
The same field can be used for Rx timestamp filled by drivers.

A new dynamic flag is defined for Rx usage.
A new function wraps the registration of both field and Rx flag.
The type rte_mbuf_timestamp_t is defined for the API users.

After migrating all Rx timestamp usages, it will be possible
to remove the deprecated timestamp field.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 lib/librte_mbuf/rte_mbuf_dyn.c | 43 ++++++++++++++++++++++++++++++++++
 lib/librte_mbuf/rte_mbuf_dyn.h | 33 ++++++++++++++++++++++----
 lib/librte_mbuf/version.map    |  1 +
 3 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
index 538a43f695..5b608a27d7 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.c
+++ b/lib/librte_mbuf/rte_mbuf_dyn.c
@@ -13,6 +13,7 @@
 #include <rte_errno.h>
 #include <rte_malloc.h>
 #include <rte_string_fns.h>
+#include <rte_bitops.h>
 #include <rte_mbuf.h>
 #include <rte_mbuf_dyn.h>
 
@@ -569,3 +570,45 @@ void rte_mbuf_dyn_dump(FILE *out)
 
 	rte_mcfg_tailq_write_unlock();
 }
+
+static int
+rte_mbuf_dyn_timestamp_register(int *field_offset, uint64_t *flag,
+		const char *direction, const char *flag_name)
+{
+	static const struct rte_mbuf_dynfield field_desc = {
+		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
+		.size = sizeof(rte_mbuf_timestamp_t),
+		.align = __alignof__(rte_mbuf_timestamp_t),
+	};
+	struct rte_mbuf_dynflag flag_desc = { 0 };
+	int offset;
+
+	offset = rte_mbuf_dynfield_register(&field_desc);
+	if (offset < 0) {
+		RTE_LOG(ERR, MBUF,
+			"Failed to register mbuf field for timestamp\n");
+		return -1;
+	}
+	if (field_offset != NULL)
+		*field_offset = offset;
+
+	strlcpy(flag_desc.name, flag_name, sizeof(flag_desc.name));
+	offset = rte_mbuf_dynflag_register(&flag_desc);
+	if (offset < 0) {
+		RTE_LOG(ERR, MBUF,
+			"Failed to register mbuf flag for %s timestamp\n",
+			direction);
+		return -1;
+	}
+	if (flag != NULL)
+		*flag = RTE_BIT64(offset);
+
+	return 0;
+}
+
+int
+rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag)
+{
+	return rte_mbuf_dyn_timestamp_register(field_offset, rx_flag,
+			"Rx", RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME);
+}
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 0ebac88b83..2e729ddaca 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -258,13 +258,36 @@ void rte_mbuf_dyn_dump(FILE *out);
  * timestamp. The dynamic Tx timestamp flag tells whether the field contains
  * actual timestamp value for the packets being sent, this value can be
  * used by PMD to schedule packet sending.
- *
- * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
- * and obsoleting, the dedicated Rx timestamp flag is supposed to be
- * introduced and the shared dynamic timestamp field will be used
- * to handle the timestamps on receiving datapath as well.
  */
 #define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+typedef uint64_t rte_mbuf_timestamp_t;
+
+/**
+ * Indicate that the timestamp field in the mbuf was filled by the driver.
+ */
+#define RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME "rte_dynflag_rx_timestamp"
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Register dynamic mbuf field and flag for Rx timestamp.
+ *
+ * @param field_offset
+ *   Pointer to the offset of the registered mbuf field, can be NULL.
+ *   The same field is shared for Rx and Tx timestamp.
+ * @param rx_flag
+ *   Pointer to the mask of the registered offload flag, can be NULL.
+ * @return
+ *   0 on success, -1 otherwise.
+ *   Possible values for rte_errno:
+ *   - EEXIST: already registered with different parameters.
+ *   - EPERM: called from a secondary process.
+ *   - ENOENT: no more field or flag available.
+ *   - ENOMEM: allocation failure.
+ */
+__rte_experimental
+int rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag);
 
 /**
  * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
diff --git a/lib/librte_mbuf/version.map b/lib/librte_mbuf/version.map
index a011aaead3..0b66668bff 100644
--- a/lib/librte_mbuf/version.map
+++ b/lib/librte_mbuf/version.map
@@ -42,6 +42,7 @@ EXPERIMENTAL {
 	rte_mbuf_dynflag_register;
 	rte_mbuf_dynflag_register_bitnum;
 	rte_mbuf_dyn_dump;
+	rte_mbuf_dyn_rx_timestamp_register;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
 	rte_pktmbuf_pool_create_extbuf;
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 03/16] latency: switch Rx timestamp to dynamic mbuf field
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 01/16] eventdev: remove software Rx timestamp Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 02/16] mbuf: add Rx timestamp flag and helpers Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 04/16] net/ark: " Thomas Monjalon
                     ` (14 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Reshma Pattan

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced with the dynamic one.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 lib/librte_latencystats/rte_latencystats.c | 30 ++++++++++++++++++----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/lib/librte_latencystats/rte_latencystats.c b/lib/librte_latencystats/rte_latencystats.c
index ba2fff3bcb..ab8db7a139 100644
--- a/lib/librte_latencystats/rte_latencystats.c
+++ b/lib/librte_latencystats/rte_latencystats.c
@@ -9,6 +9,7 @@
 
 #include <rte_string_fns.h>
 #include <rte_mbuf.h>
+#include <rte_mbuf_dyn.h>
 #include <rte_log.h>
 #include <rte_cycles.h>
 #include <rte_ethdev.h>
@@ -31,6 +32,16 @@ latencystat_cycles_per_ns(void)
 /* Macros for printing using RTE_LOG */
 #define RTE_LOGTYPE_LATENCY_STATS RTE_LOGTYPE_USER1
 
+static uint64_t timestamp_dynflag;
+static int timestamp_dynfield_offset = -1;
+
+static inline rte_mbuf_timestamp_t *
+timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static const char *MZ_RTE_LATENCY_STATS = "rte_latencystats";
 static int latency_stats_index;
 static uint64_t samp_intvl;
@@ -128,10 +139,10 @@ add_time_stamps(uint16_t pid __rte_unused,
 		diff_tsc = now - prev_tsc;
 		timer_tsc += diff_tsc;
 
-		if ((pkts[i]->ol_flags & PKT_RX_TIMESTAMP) == 0
+		if ((pkts[i]->ol_flags & timestamp_dynflag) == 0
 				&& (timer_tsc >= samp_intvl)) {
-			pkts[i]->timestamp = now;
-			pkts[i]->ol_flags |= PKT_RX_TIMESTAMP;
+			*timestamp_dynfield(pkts[i]) = now;
+			pkts[i]->ol_flags |= timestamp_dynflag;
 			timer_tsc = 0;
 		}
 		prev_tsc = now;
@@ -161,8 +172,8 @@ calc_latency(uint16_t pid __rte_unused,
 
 	now = rte_rdtsc();
 	for (i = 0; i < nb_pkts; i++) {
-		if (pkts[i]->ol_flags & PKT_RX_TIMESTAMP)
-			latency[cnt++] = now - pkts[i]->timestamp;
+		if (pkts[i]->ol_flags & timestamp_dynflag)
+			latency[cnt++] = now - *timestamp_dynfield(pkts[i]);
 	}
 
 	rte_spinlock_lock(&glob_stats->lock);
@@ -241,6 +252,15 @@ rte_latencystats_init(uint64_t app_samp_intvl,
 		return -1;
 	}
 
+	/* Register mbuf field and flag for Rx timestamp */
+	ret = rte_mbuf_dyn_rx_timestamp_register(&timestamp_dynfield_offset,
+			&timestamp_dynflag);
+	if (ret != 0) {
+		RTE_LOG(ERR, LATENCY_STATS,
+			"Cannot register mbuf field/flag for timestamp\n");
+		return -rte_errno;
+	}
+
 	/** Register Rx/Tx callbacks */
 	RTE_ETH_FOREACH_DEV(pid) {
 		struct rte_eth_dev_info dev_info;
-- 
2.28.0
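
As a rough illustration of the calc_latency() change above, the sketch below
collects latencies only for packets whose timestamp flag is set. It is
self-contained and simplified: toy_pkt, TOY_TS_FLAG and both functions are
hypothetical names, the flag is a compile-time constant rather than a
registered dynamic flag, and the sampling interval logic of
add_time_stamps() is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packet descriptor with a flags word and a timestamp slot. */
struct toy_pkt {
	uint64_t ol_flags;
	uint64_t timestamp;
};

/* Assumption: a fixed flag bit; the library registers it at run time. */
#define TOY_TS_FLAG (UINT64_C(1) << 23)

/* Rx side: stamp a packet unless it already carries a timestamp. */
static void
toy_stamp(struct toy_pkt *p, uint64_t now)
{
	if ((p->ol_flags & TOY_TS_FLAG) == 0) {
		p->timestamp = now;
		p->ol_flags |= TOY_TS_FLAG;
	}
}

/* Tx side: write one latency per stamped packet and return how many
 * were written. Unstamped packets are skipped. */
static uint16_t
toy_calc_latency(const struct toy_pkt *pkts, uint16_t nb_pkts,
		uint64_t now, uint64_t *latency)
{
	uint16_t i, cnt = 0;

	for (i = 0; i < nb_pkts; i++) {
		if (pkts[i].ol_flags & TOY_TS_FLAG)
			latency[cnt++] = now - pkts[i].timestamp;
	}
	return cnt;
}
```

The latency array passed by the caller needs room for nb_pkts entries, since
in the worst case every packet is stamped.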


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 04/16] net/ark: switch Rx timestamp to dynamic mbuf field
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (2 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 03/16] latency: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 05/16] net/dpaa2: " Thomas Monjalon
                     ` (13 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Shepard Siegel, Ed Czeck,
	John Miller

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related dynamic mbuf flag is now set, although it was missing previously.

The timestamp is set if configured for at least one device.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/ark/ark_ethdev.c    | 17 +++++++++++++++++
 drivers/net/ark/ark_ethdev_rx.c |  7 ++++++-
 drivers/net/ark/ark_ethdev_rx.h |  2 ++
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
index fa343999a1..a658993512 100644
--- a/drivers/net/ark/ark_ethdev.c
+++ b/drivers/net/ark/ark_ethdev.c
@@ -79,6 +79,9 @@ static int  eth_ark_set_mtu(struct rte_eth_dev *dev, uint16_t size);
 #define ARK_TX_MAX_QUEUE (4096 * 4)
 #define ARK_TX_MIN_QUEUE (256)
 
+uint64_t ark_timestamp_rx_dynflag;
+int ark_timestamp_dynfield_offset = -1;
+
 int rte_pmd_ark_rx_userdata_dynfield_offset = -1;
 int rte_pmd_ark_tx_userdata_dynfield_offset = -1;
 
@@ -552,6 +555,18 @@ static int
 eth_ark_dev_configure(struct rte_eth_dev *dev)
 {
 	struct ark_adapter *ark = dev->data->dev_private;
+	int ret;
+
+	if (dev->data->dev_conf.rxmode.offloads & DEV_RX_OFFLOAD_TIMESTAMP) {
+		ret = rte_mbuf_dyn_rx_timestamp_register(
+				&ark_timestamp_dynfield_offset,
+				&ark_timestamp_rx_dynflag);
+		if (ret != 0) {
+			ARK_PMD_LOG(ERR,
+				"Failed to register Rx timestamp field/flag\n");
+			return -rte_errno;
+		}
+	}
 
 	eth_ark_dev_set_link_up(dev);
 	if (ark->user_ext.dev_configure)
@@ -782,6 +797,8 @@ eth_ark_dev_info_get(struct rte_eth_dev *dev,
 				ETH_LINK_SPEED_50G |
 				ETH_LINK_SPEED_100G);
 
+	dev_info->rx_offload_capa = DEV_RX_OFFLOAD_TIMESTAMP;
+
 	return 0;
 }
 
diff --git a/drivers/net/ark/ark_ethdev_rx.c b/drivers/net/ark/ark_ethdev_rx.c
index c24cc00e2f..d29d3db783 100644
--- a/drivers/net/ark/ark_ethdev_rx.c
+++ b/drivers/net/ark/ark_ethdev_rx.c
@@ -272,7 +272,12 @@ eth_ark_recv_pkts(void *rx_queue,
 		mbuf->port = meta->port;
 		mbuf->pkt_len = meta->pkt_len;
 		mbuf->data_len = meta->pkt_len;
-		mbuf->timestamp = meta->timestamp;
+		/* set timestamp if enabled at least on one device */
+		if (ark_timestamp_rx_dynflag > 0) {
+			*RTE_MBUF_DYNFIELD(mbuf, ark_timestamp_dynfield_offset,
+				rte_mbuf_timestamp_t *) = meta->timestamp;
+			mbuf->ol_flags |= ark_timestamp_rx_dynflag;
+		}
 		rte_pmd_ark_mbuf_rx_userdata_set(mbuf, meta->user_data);
 
 		if (ARK_DEBUG_CORE) {	/* debug sanity checks */
diff --git a/drivers/net/ark/ark_ethdev_rx.h b/drivers/net/ark/ark_ethdev_rx.h
index 0fdd29b1ab..001fa9bdfa 100644
--- a/drivers/net/ark/ark_ethdev_rx.h
+++ b/drivers/net/ark/ark_ethdev_rx.h
@@ -11,6 +11,8 @@
 #include <rte_mempool.h>
 #include <rte_ethdev_driver.h>
 
+extern uint64_t ark_timestamp_rx_dynflag;
+extern int ark_timestamp_dynfield_offset;
 
 int eth_ark_dev_rx_queue_setup(struct rte_eth_dev *dev,
 			       uint16_t queue_idx,
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 05/16] net/dpaa2: switch Rx timestamp to dynamic mbuf field
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (3 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 04/16] net/ark: " Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 06/16] net/mlx5: fix dynamic mbuf offset lookup check Thomas Monjalon
                     ` (12 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Hemant Agrawal,
	Sachin Saxena

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/dpaa2/dpaa2_ethdev.c | 11 +++++++++++
 drivers/net/dpaa2/dpaa2_ethdev.h |  2 ++
 drivers/net/dpaa2/dpaa2_rxtx.c   | 25 ++++++++++++++++++-------
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 04e60c56f2..3b0c7717b6 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -65,6 +65,8 @@ static uint64_t dev_tx_offloads_nodis =
 
 /* enable timestamp in mbuf */
 bool dpaa2_enable_ts[RTE_MAX_ETHPORTS];
+uint64_t dpaa2_timestamp_rx_dynflag;
+int dpaa2_timestamp_dynfield_offset = -1;
 
 struct rte_dpaa2_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -587,7 +589,16 @@ dpaa2_eth_dev_configure(struct rte_eth_dev *dev)
 #if !defined(RTE_LIBRTE_IEEE1588)
 	if (rx_offloads & DEV_RX_OFFLOAD_TIMESTAMP)
 #endif
+	{
+		ret = rte_mbuf_dyn_rx_timestamp_register(
+				&dpaa2_timestamp_dynfield_offset,
+				&dpaa2_timestamp_rx_dynflag);
+		if (ret != 0) {
+			DPAA2_PMD_ERR("Error to register timestamp field/flag");
+			return -rte_errno;
+		}
 		dpaa2_enable_ts[dev->data->port_id] = true;
+	}
 
 	if (tx_offloads & DEV_TX_OFFLOAD_IPV4_CKSUM)
 		tx_l3_csum_offload = true;
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.h b/drivers/net/dpaa2/dpaa2_ethdev.h
index 94cf253827..8d82f74684 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.h
+++ b/drivers/net/dpaa2/dpaa2_ethdev.h
@@ -92,6 +92,8 @@
 
 /* enable timestamp in mbuf*/
 extern bool dpaa2_enable_ts[];
+extern uint64_t dpaa2_timestamp_rx_dynflag;
+extern int dpaa2_timestamp_dynfield_offset;
 
 #define DPAA2_QOS_TABLE_RECONFIGURE	1
 #define DPAA2_FS_TABLE_RECONFIGURE	2
diff --git a/drivers/net/dpaa2/dpaa2_rxtx.c b/drivers/net/dpaa2/dpaa2_rxtx.c
index 6201de4606..9cca6d16c3 100644
--- a/drivers/net/dpaa2/dpaa2_rxtx.c
+++ b/drivers/net/dpaa2/dpaa2_rxtx.c
@@ -31,6 +31,13 @@ dpaa2_dev_rx_parse_slow(struct rte_mbuf *mbuf,
 
 static void enable_tx_tstamp(struct qbman_fd *fd) __rte_unused;
 
+static inline rte_mbuf_timestamp_t *
+dpaa2_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		dpaa2_timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 #define DPAA2_MBUF_TO_CONTIG_FD(_mbuf, _fd, _bpid)  do { \
 	DPAA2_SET_FD_ADDR(_fd, DPAA2_MBUF_VADDR_TO_IOVA(_mbuf)); \
 	DPAA2_SET_FD_LEN(_fd, _mbuf->data_len); \
@@ -109,9 +116,10 @@ dpaa2_dev_rx_parse_new(struct rte_mbuf *m, const struct qbman_fd *fd,
 	m->ol_flags |= PKT_RX_RSS_HASH;
 
 	if (dpaa2_enable_ts[m->port]) {
-		m->timestamp = annotation->word2;
-		m->ol_flags |= PKT_RX_TIMESTAMP;
-		DPAA2_PMD_DP_DEBUG("pkt timestamp:0x%" PRIx64 "", m->timestamp);
+		*dpaa2_timestamp_dynfield(m) = annotation->word2;
+		m->ol_flags |= dpaa2_timestamp_rx_dynflag;
+		DPAA2_PMD_DP_DEBUG("pkt timestamp:0x%" PRIx64 "",
+				*dpaa2_timestamp_dynfield(m));
 	}
 
 	DPAA2_PMD_DP_DEBUG("HW frc = 0x%x\t packet type =0x%x "
@@ -223,9 +231,12 @@ dpaa2_dev_rx_parse(struct rte_mbuf *mbuf, void *hw_annot_addr)
 	else if (BIT_ISSET_AT_POS(annotation->word8, DPAA2_ETH_FAS_L4CE))
 		mbuf->ol_flags |= PKT_RX_L4_CKSUM_BAD;
 
-	mbuf->ol_flags |= PKT_RX_TIMESTAMP;
-	mbuf->timestamp = annotation->word2;
-	DPAA2_PMD_DP_DEBUG("pkt timestamp: 0x%" PRIx64 "", mbuf->timestamp);
+	if (dpaa2_enable_ts[mbuf->port]) {
+		*dpaa2_timestamp_dynfield(mbuf) = annotation->word2;
+		mbuf->ol_flags |= dpaa2_timestamp_rx_dynflag;
+		DPAA2_PMD_DP_DEBUG("pkt timestamp: 0x%" PRIx64 "",
+				*dpaa2_timestamp_dynfield(mbuf));
+	}
 
 	/* Check detailed parsing requirement */
 	if (annotation->word3 & 0x7FFFFC3FFFF)
@@ -629,7 +640,7 @@ dpaa2_dev_prefetch_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		else
 			bufs[num_rx] = eth_fd_to_mbuf(fd, eth_data->port_id);
 #if defined(RTE_LIBRTE_IEEE1588)
-		priv->rx_timestamp = bufs[num_rx]->timestamp;
+		priv->rx_timestamp = *dpaa2_timestamp_dynfield(bufs[num_rx]);
 #endif
 
 		if (eth_data->dev_conf.rxmode.offloads &
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 06/16] net/mlx5: fix dynamic mbuf offset lookup check
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (4 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 05/16] net/dpaa2: " Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
                     ` (11 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, stable, Matan Azrad,
	Shahaf Shuler, Ori Kam

The functions rte_mbuf_dynfield_lookup() and rte_mbuf_dynflag_lookup()
can return an offset starting with 0 or a negative error code.

In reality the first offsets are probably reserved forever,
but for the sake of strict API compliance,
the checks which considered 0 as an error are fixed.

Fixes: efa79e68c8cd ("net/mlx5: support fine grain dynamic flag")
Fixes: 3172c471b86f ("net/mlx5: prepare Tx queue structures to support timestamp")
Fixes: 0febfcce3693 ("net/mlx5: prepare Tx to support scheduling")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/mlx5/mlx5_rxtx.c    | 4 ++--
 drivers/net/mlx5/mlx5_trigger.c | 2 +-
 drivers/net/mlx5/mlx5_txq.c     | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index b530ff421f..e86468b67a 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -5661,9 +5661,9 @@ mlx5_select_tx_function(struct rte_eth_dev *dev)
 	}
 	if (tx_offloads & DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP &&
 	    rte_mbuf_dynflag_lookup
-			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) > 0 &&
+			(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) >= 0 &&
 	    rte_mbuf_dynfield_lookup
-			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) > 0) {
+			(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) >= 0) {
 		/* Offload configured, dynamic entities registered. */
 		olx |= MLX5_TXOFF_CONFIG_TXPP;
 	}
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 7735f022a3..917b433c4a 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -302,7 +302,7 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 	DRV_LOG(DEBUG, "port %u starting device", dev->data->port_id);
 	fine_inline = rte_mbuf_dynflag_lookup
 		(RTE_PMD_MLX5_FINE_GRANULARITY_INLINE, NULL);
-	if (fine_inline > 0)
+	if (fine_inline >= 0)
 		rte_net_mlx5_dynf_inline_mask = 1UL << fine_inline;
 	else
 		rte_net_mlx5_dynf_inline_mask = 0;
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index af84f5f72b..8ed2bcff7b 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -1305,7 +1305,7 @@ mlx5_txq_dynf_timestamp_set(struct rte_eth_dev *dev)
 				(RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL);
 	off = rte_mbuf_dynfield_lookup
 				(RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
-	if (nbit > 0 && off >= 0 && sh->txpp.refcnt)
+	if (nbit >= 0 && off >= 0 && sh->txpp.refcnt)
 		mask = 1ULL << nbit;
 	for (i = 0; i != priv->txqs_n; ++i) {
 		data = (*priv->txqs)[i];
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf field
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (5 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 06/16] net/mlx5: fix dynamic mbuf offset lookup check Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 08/16] net/nfb: " Thomas Monjalon
                     ` (10 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ruifeng Wang,
	David Christensen, Matan Azrad, Shahaf Shuler,
	Konstantin Ananyev

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

The dynamic offset and flag are stored in struct mlx5_rxq_data
to favor cache locality.
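
A rough sketch of the idea, assuming a toy mbuf layout: the offset and
flag obtained at registration time are cached next to the other per-queue
data, and the datapath only reads them. The toy_* names are hypothetical,
and the offset here is relative to a small dyn_area array rather than to
the start of a real struct rte_mbuf.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature mbuf: ol_flags plus raw storage for dynamic fields. */
struct toy_mbuf {
	uint64_t ol_flags;
	uint8_t dyn_area[32];
};

/* Hypothetical per-queue data caching what registration returned. */
struct toy_rxq {
	int timestamp_offset;       /* byte offset inside dyn_area */
	uint64_t timestamp_rx_flag; /* 0 when timestamping is disabled */
};

/* Write the timestamp at the cached offset and set the cached flag. */
static void
toy_timestamp_set(struct toy_mbuf *m, const struct toy_rxq *rxq, uint64_t ts)
{
	assert(rxq->timestamp_offset >= 0 &&
	       (size_t)rxq->timestamp_offset + sizeof(ts) <=
	       sizeof(m->dyn_area));
	memcpy(&m->dyn_area[rxq->timestamp_offset], &ts, sizeof(ts));
	m->ol_flags |= rxq->timestamp_rx_flag;
}

/* Read the timestamp back from the cached offset. */
static uint64_t
toy_timestamp_get(const struct toy_mbuf *m, const struct toy_rxq *rxq)
{
	uint64_t ts;

	memcpy(&ts, &m->dyn_area[rxq->timestamp_offset], sizeof(ts));
	return ts;
}
```

Keeping both values in the queue structure means the receive loop does not
touch any global variable or repeat a lookup per packet.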

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/mlx5/mlx5_rxq.c              |  8 +++++
 drivers/net/mlx5/mlx5_rxtx.c             |  4 +--
 drivers/net/mlx5/mlx5_rxtx.h             | 19 +++++++++++
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 41 +++++++++++-----------
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 43 ++++++++++++------------
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h     | 35 +++++++++----------
 6 files changed, 90 insertions(+), 60 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index f1d8373079..52519910ee 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1492,7 +1492,15 @@ mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
 	mlx5_max_lro_msg_size_adjust(dev, idx, max_lro_size);
 	/* Toggle RX checksum offload if hardware supports it. */
 	tmpl->rxq.csum = !!(offloads & DEV_RX_OFFLOAD_CHECKSUM);
+	/* Configure Rx timestamp. */
 	tmpl->rxq.hw_timestamp = !!(offloads & DEV_RX_OFFLOAD_TIMESTAMP);
+	tmpl->rxq.timestamp_rx_flag = 0;
+	if (tmpl->rxq.hw_timestamp && rte_mbuf_dyn_rx_timestamp_register(
+			&tmpl->rxq.timestamp_offset,
+			&tmpl->rxq.timestamp_rx_flag) != 0) {
+		DRV_LOG(ERR, "Cannot register Rx timestamp field/flag");
+		goto error;
+	}
 	/* Configure VLAN stripping. */
 	tmpl->rxq.vlan_strip = !!(offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
 	/* By default, FCS (CRC) is stripped by hardware. */
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index e86468b67a..b577aab00b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -1287,8 +1287,8 @@ rxq_cq_to_mbuf(struct mlx5_rxq_data *rxq, struct rte_mbuf *pkt,
 
 		if (rxq->rt_timestamp)
 			ts = mlx5_txpp_convert_rx_ts(rxq->sh, ts);
-		pkt->timestamp = ts;
-		pkt->ol_flags |= PKT_RX_TIMESTAMP;
+		mlx5_timestamp_set(pkt, rxq->timestamp_offset, ts);
+		pkt->ol_flags |= rxq->timestamp_rx_flag;
 	}
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 674296ee98..e9eca36b40 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -151,6 +151,8 @@ struct mlx5_rxq_data {
 	/* CQ (UAR) access lock required for 32bit implementations */
 #endif
 	uint32_t tunnel; /* Tunnel information. */
+	int timestamp_offset; /* Dynamic mbuf field for timestamp. */
+	uint64_t timestamp_rx_flag; /* Dynamic mbuf flag for timestamp. */
 	uint64_t flow_meta_mask;
 	int32_t flow_meta_offset;
 } __rte_cache_aligned;
@@ -681,4 +683,21 @@ mlx5_txpp_convert_tx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t mts)
 	return ci;
 }
 
+/**
+ * Set timestamp in mbuf dynamic field.
+ *
+ * @param mbuf
+ *   Structure to write into.
+ * @param offset
+ *   Dynamic field offset in mbuf structure.
+ * @param timestamp
+ *   Value to write.
+ */
+static __rte_always_inline void
+mlx5_timestamp_set(struct rte_mbuf *mbuf, int offset,
+		rte_mbuf_timestamp_t timestamp)
+{
+	*RTE_MBUF_DYNFIELD(mbuf, offset, rte_mbuf_timestamp_t *) = timestamp;
+}
+
 #endif /* RTE_PMD_MLX5_RXTX_H_ */
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 6bf0c9b540..171d7bb0f8 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -330,13 +330,13 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	vector unsigned char ol_flags = (vector unsigned char)
 		(vector unsigned int){
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP,
+				rxq->hw_timestamp * rxq->timestamp_rx_flag,
 			rxq->rss_hash * PKT_RX_RSS_HASH |
-				rxq->hw_timestamp * PKT_RX_TIMESTAMP};
+				rxq->hw_timestamp * rxq->timestamp_rx_flag};
 	vector unsigned char cv_flags;
 	const vector unsigned char zero = (vector unsigned char){0};
 	const vector unsigned char ptype_mask =
@@ -1025,31 +1025,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		/* D.5 fill in mbuf - rearm_data and packet_type. */
 		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
 
 				ts = rte_be_to_cpu_64(cq[pos].timestamp);
-				pkts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
-				pkts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
-				pkts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
-				pkts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				pkts[pos]->timestamp = rte_be_to_cpu_64
-						(cq[pos].timestamp);
-				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p1].timestamp);
-				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p2].timestamp);
-				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p3].timestamp);
+				mlx5_timestamp_set(pkts[pos], offset,
+					rte_be_to_cpu_64(cq[pos].timestamp));
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					rte_be_to_cpu_64(cq[pos + p1].timestamp));
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					rte_be_to_cpu_64(cq[pos + p2].timestamp));
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					rte_be_to_cpu_64(cq[pos + p3].timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index d122dad4fe..436b247ade 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -271,7 +271,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq,
 	uint32x4_t pinfo, cv_flags;
 	uint32x4_t ol_flags =
 		vdupq_n_u32(rxq->rss_hash * PKT_RX_RSS_HASH |
-			    rxq->hw_timestamp * PKT_RX_TIMESTAMP);
+			    rxq->hw_timestamp * rxq->timestamp_rx_flag);
 	const uint32x4_t ptype_ol_mask = { 0x106, 0x106, 0x106, 0x106 };
 	const uint8x16_t cv_flag_sel = {
 		0,
@@ -697,6 +697,7 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		rxq_cq_to_ptype_oflags_v(rxq, ptype_info, flow_tag,
 					 opcode, &elts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
@@ -704,36 +705,36 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 				ts = rte_be_to_cpu_64
 					(container_of(p0, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p1, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p2, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64
 					(container_of(p3, struct mlx5_cqe,
 						      pkt_info)->timestamp);
-				elts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(elts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				elts[pos]->timestamp = rte_be_to_cpu_64
-					(container_of(p0, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 1]->timestamp = rte_be_to_cpu_64
-					(container_of(p1, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 2]->timestamp = rte_be_to_cpu_64
-					(container_of(p2, struct mlx5_cqe,
-						      pkt_info)->timestamp);
-				elts[pos + 3]->timestamp = rte_be_to_cpu_64
-					(container_of(p3, struct mlx5_cqe,
-						      pkt_info)->timestamp);
+				mlx5_timestamp_set(elts[pos], offset,
+					rte_be_to_cpu_64(container_of(p0,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 1], offset,
+					rte_be_to_cpu_64(container_of(p1,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 2], offset,
+					rte_be_to_cpu_64(container_of(p2,
+					struct mlx5_cqe, pkt_info)->timestamp));
+				mlx5_timestamp_set(elts[pos + 3], offset,
+					rte_be_to_cpu_64(container_of(p3,
+					struct mlx5_cqe, pkt_info)->timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 0bbcbeefff..ae4439efc7 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -251,7 +251,7 @@ rxq_cq_to_ptype_oflags_v(struct mlx5_rxq_data *rxq, __m128i cqes[4],
 	__m128i pinfo0, pinfo1;
 	__m128i pinfo, ptype;
 	__m128i ol_flags = _mm_set1_epi32(rxq->rss_hash * PKT_RX_RSS_HASH |
-					  rxq->hw_timestamp * PKT_RX_TIMESTAMP);
+					  rxq->hw_timestamp * rxq->timestamp_rx_flag);
 	__m128i cv_flags;
 	const __m128i zero = _mm_setzero_si128();
 	const __m128i ptype_mask =
@@ -656,31 +656,32 @@ rxq_burst_v(struct mlx5_rxq_data *rxq, struct rte_mbuf **pkts, uint16_t pkts_n,
 		/* D.5 fill in mbuf - rearm_data and packet_type. */
 		rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
 		if (rxq->hw_timestamp) {
+			int offset = rxq->timestamp_offset;
 			if (rxq->rt_timestamp) {
 				struct mlx5_dev_ctx_shared *sh = rxq->sh;
 				uint64_t ts;
 
 				ts = rte_be_to_cpu_64(cq[pos].timestamp);
-				pkts[pos]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p1].timestamp);
-				pkts[pos + 1]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p2].timestamp);
-				pkts[pos + 2]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 				ts = rte_be_to_cpu_64(cq[pos + p3].timestamp);
-				pkts[pos + 3]->timestamp =
-					mlx5_txpp_convert_rx_ts(sh, ts);
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					mlx5_txpp_convert_rx_ts(sh, ts));
 			} else {
-				pkts[pos]->timestamp = rte_be_to_cpu_64
-						(cq[pos].timestamp);
-				pkts[pos + 1]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p1].timestamp);
-				pkts[pos + 2]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p2].timestamp);
-				pkts[pos + 3]->timestamp = rte_be_to_cpu_64
-						(cq[pos + p3].timestamp);
+				mlx5_timestamp_set(pkts[pos], offset,
+					rte_be_to_cpu_64(cq[pos].timestamp));
+				mlx5_timestamp_set(pkts[pos + 1], offset,
+					rte_be_to_cpu_64(cq[pos + p1].timestamp));
+				mlx5_timestamp_set(pkts[pos + 2], offset,
+					rte_be_to_cpu_64(cq[pos + p2].timestamp));
+				mlx5_timestamp_set(pkts[pos + 3], offset,
+					rte_be_to_cpu_64(cq[pos + p3].timestamp));
 			}
 		}
 		if (rxq->dynf_meta) {
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 08/16] net/nfb: switch Rx timestamp to dynamic mbuf field
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (6 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 07/16] net/mlx5: switch Rx timestamp to dynamic mbuf field Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 09/16] net/octeontx2: " Thomas Monjalon
                     ` (9 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Martin Spinler

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.
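
In the nfb_rx.c diff, the caller's test of nfb_check_timestamp() changes
from a truthiness test to "> 0", since the function can now also return a
negative value when registration fails. A minimal sketch of why that
matters, using hypothetical toy_* helpers rather than the driver code:

```c
#include <assert.h>

#define TOY_TIMESTAMP_FLAG (1u << 0)

/* Truthiness test: any non-zero result, including a negative error,
 * enables timestamping. */
static unsigned int
toy_queue_flags_truthy(int check_result)
{
	unsigned int flags = 0;

	if (check_result)
		flags |= TOY_TIMESTAMP_FLAG;
	return flags;
}

/* Positive test: only an explicit "enabled" result (1) sets the flag;
 * 0 (disabled) and negative errors leave it clear. */
static unsigned int
toy_queue_flags_positive(int check_result)
{
	unsigned int flags = 0;

	assert(check_result <= 1);
	if (check_result > 0)
		flags |= TOY_TIMESTAMP_FLAG;
	return flags;
}
```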

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/nfb/nfb_rx.c | 15 ++++++++++++++-
 drivers/net/nfb/nfb_rx.h | 21 +++++++++++++++++----
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/drivers/net/nfb/nfb_rx.c b/drivers/net/nfb/nfb_rx.c
index d97179f818..d6d4ba9663 100644
--- a/drivers/net/nfb/nfb_rx.c
+++ b/drivers/net/nfb/nfb_rx.c
@@ -9,6 +9,9 @@
 #include "nfb_rx.h"
 #include "nfb.h"
 
+uint64_t nfb_timestamp_rx_dynflag;
+int nfb_timestamp_dynfield_offset = -1;
+
 static int
 timestamp_check_handler(__rte_unused const char *key,
 	const char *value, __rte_unused void *opaque)
@@ -24,6 +27,7 @@ static int
 nfb_check_timestamp(struct rte_devargs *devargs)
 {
 	struct rte_kvargs *kvlist;
+	int ret;
 
 	if (devargs == NULL)
 		return 0;
@@ -38,6 +42,7 @@ nfb_check_timestamp(struct rte_devargs *devargs)
 	}
 	/* Timestamps are enabled when there is
 	 * key-value pair: enable_timestamp=1
+	 * TODO: timestamp should be enabled with DEV_RX_OFFLOAD_TIMESTAMP
 	 */
 	if (rte_kvargs_process(kvlist, TIMESTAMP_ARG,
 		timestamp_check_handler, NULL) < 0) {
@@ -46,6 +51,14 @@ nfb_check_timestamp(struct rte_devargs *devargs)
 	}
 	rte_kvargs_free(kvlist);
 
+	ret = rte_mbuf_dyn_rx_timestamp_register(
+			&nfb_timestamp_dynfield_offset,
+			&nfb_timestamp_rx_dynflag);
+	if (ret != 0) {
+		RTE_LOG(ERR, PMD, "Cannot register Rx timestamp field/flag\n");
+		return -rte_errno;
+	}
+
 	return 1;
 }
 
@@ -125,7 +138,7 @@ nfb_eth_rx_queue_setup(struct rte_eth_dev *dev,
 	else
 		rte_free(rxq);
 
-	if (nfb_check_timestamp(dev->device->devargs))
+	if (nfb_check_timestamp(dev->device->devargs) > 0)
 		rxq->flags |= NFB_TIMESTAMP_FLAG;
 
 	return ret;
diff --git a/drivers/net/nfb/nfb_rx.h b/drivers/net/nfb/nfb_rx.h
index cf3899b2fb..27a2888a75 100644
--- a/drivers/net/nfb/nfb_rx.h
+++ b/drivers/net/nfb/nfb_rx.h
@@ -15,6 +15,16 @@
 
 #define NFB_TIMESTAMP_FLAG (1 << 0)
 
+extern uint64_t nfb_timestamp_rx_dynflag;
+extern int nfb_timestamp_dynfield_offset;
+
+static inline rte_mbuf_timestamp_t *
+nfb_timestamp_dynfield(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		nfb_timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 struct ndp_rx_queue {
 	struct nfb_device *nfb;	     /* nfb dev structure */
 	struct ndp_queue *queue;     /* rx queue */
@@ -190,16 +200,19 @@ nfb_eth_ndp_rx(void *queue,
 			mbuf->ol_flags = 0;
 
 			if (timestamping_enabled) {
+				rte_mbuf_timestamp_t timestamp;
+
 				/* nanoseconds */
-				mbuf->timestamp =
+				timestamp =
 					rte_le_to_cpu_32(*((uint32_t *)
 					(packets[i].header + 4)));
-				mbuf->timestamp <<= 32;
+				timestamp <<= 32;
 				/* seconds */
-				mbuf->timestamp |=
+				timestamp |=
 					rte_le_to_cpu_32(*((uint32_t *)
 					(packets[i].header + 8)));
-				mbuf->ol_flags |= PKT_RX_TIMESTAMP;
+				*nfb_timestamp_dynfield(mbuf) = timestamp;
+				mbuf->ol_flags |= nfb_timestamp_rx_dynflag;
 			}
 
 			bufs[num_rx++] = mbuf;
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 09/16] net/octeontx2: switch Rx timestamp to dynamic mbuf field
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (7 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 08/16] net/nfb: " Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 10/16] net/pcap: " Thomas Monjalon
                     ` (8 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Nithin Dabilpuram,
	Kiran Kumar K

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.

The registration of field and flag is done in both
otx2_nix_dev_start() and otx2_nix_timesync_enable().

The dynamic offset and flag are stored in struct otx2_timesync_info
to favor cache locality.
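
Registering from two call sites presumably relies on registration being
idempotent: asking again for an already registered name yields the same
offset. The following toy registry sketches that property only; the
toy_* names, the fixed table size and the 8-byte stride are assumptions
made for illustration, not the mbuf library implementation.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_FIELDS 4

struct toy_field {
	char name[32];
	int offset;
};

static struct toy_field toy_registry[TOY_MAX_FIELDS];
static int toy_nb_fields;

/* Return the offset of `name`, registering it on first use.
 * Returns -1 when the registry is full. */
static int
toy_dynfield_register(const char *name)
{
	int i;

	assert(name != NULL && strlen(name) < sizeof(toy_registry[0].name));
	for (i = 0; i < toy_nb_fields; i++)
		if (strcmp(toy_registry[i].name, name) == 0)
			return toy_registry[i].offset;
	if (toy_nb_fields == TOY_MAX_FIELDS)
		return -1;
	strcpy(toy_registry[toy_nb_fields].name, name);
	toy_registry[toy_nb_fields].offset = toy_nb_fields * 8;
	return toy_registry[toy_nb_fields++].offset;
}
```

Whichever of the two paths runs first creates the entry, and the other
one simply gets the same offset back.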

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/octeontx2/otx2_ethdev.c | 10 ++++++++++
 drivers/net/octeontx2/otx2_ptp.c    |  8 ++++++++
 drivers/net/octeontx2/otx2_rx.h     | 19 ++++++++++++++++---
 3 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
index cfb733a4b5..6cebbe677d 100644
--- a/drivers/net/octeontx2/otx2_ethdev.c
+++ b/drivers/net/octeontx2/otx2_ethdev.c
@@ -2225,6 +2225,16 @@ otx2_nix_dev_start(struct rte_eth_dev *eth_dev)
 	if (otx2_ethdev_is_ptp_en(dev) && otx2_dev_is_vf(dev))
 		otx2_nix_ptp_enable_vf(eth_dev);
 
+	if (dev->rx_offload_flags & NIX_RX_OFFLOAD_TSTAMP_F) {
+		rc = rte_mbuf_dyn_rx_timestamp_register(
+				&dev->tstamp.tstamp_dynfield_offset,
+				&dev->tstamp.rx_tstamp_dynflag);
+		if (rc != 0) {
+			otx2_err("Failed to register Rx timestamp field/flag");
+			return -rte_errno;
+		}
+	}
+
 	rc = npc_rx_enable(dev);
 	if (rc) {
 		otx2_err("Failed to enable NPC rx %d", rc);
diff --git a/drivers/net/octeontx2/otx2_ptp.c b/drivers/net/octeontx2/otx2_ptp.c
index ae5a2b7cd1..b8ef4c181d 100644
--- a/drivers/net/octeontx2/otx2_ptp.c
+++ b/drivers/net/octeontx2/otx2_ptp.c
@@ -239,6 +239,14 @@ otx2_nix_timesync_enable(struct rte_eth_dev *eth_dev)
 	dev->tstamp.tx_tstamp_iova = ts->iova;
 	dev->tstamp.tx_tstamp = ts->addr;
 
+	rc = rte_mbuf_dyn_rx_timestamp_register(
+			&dev->tstamp.tstamp_dynfield_offset,
+			&dev->tstamp.rx_tstamp_dynflag);
+	if (rc != 0) {
+		otx2_err("Failed to register Rx timestamp field/flag");
+		return -rte_errno;
+	}
+
 	/* System time should be already on by default */
 	nix_start_timecounters(eth_dev);
 
diff --git a/drivers/net/octeontx2/otx2_rx.h b/drivers/net/octeontx2/otx2_rx.h
index 61a5c436dd..926f614a4e 100644
--- a/drivers/net/octeontx2/otx2_rx.h
+++ b/drivers/net/octeontx2/otx2_rx.h
@@ -49,6 +49,8 @@ struct otx2_timesync_info {
 	uint64_t	rx_tstamp;
 	rte_iova_t	tx_tstamp_iova;
 	uint64_t	*tx_tstamp;
+	uint64_t	rx_tstamp_dynflag;
+	int		tstamp_dynfield_offset;
 	uint8_t		tx_ready;
 	uint8_t		rx_ready;
 } __rte_cache_aligned;
@@ -63,6 +65,14 @@ union mbuf_initializer {
 	uint64_t value;
 };
 
+static inline rte_mbuf_timestamp_t *
+otx2_timestamp_dynfield(struct rte_mbuf *mbuf,
+		struct otx2_timesync_info *info)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+		info->tstamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static __rte_always_inline void
 otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
 			struct otx2_timesync_info *tstamp, const uint16_t flag,
@@ -77,15 +87,18 @@ otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
 		/* Reading the rx timestamp inserted by CGX, viz at
 		 * starting of the packet data.
 		 */
-		mbuf->timestamp = rte_be_to_cpu_64(*tstamp_ptr);
+		*otx2_timestamp_dynfield(mbuf, tstamp) =
+				rte_be_to_cpu_64(*tstamp_ptr);
 		/* PKT_RX_IEEE1588_TMST flag needs to be set only in case
 		 * PTP packets are received.
 		 */
 		if (mbuf->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC) {
-			tstamp->rx_tstamp = mbuf->timestamp;
+			tstamp->rx_tstamp =
+					*otx2_timestamp_dynfield(mbuf, tstamp);
 			tstamp->rx_ready = 1;
 			mbuf->ol_flags |= PKT_RX_IEEE1588_PTP |
-				PKT_RX_IEEE1588_TMST | PKT_RX_TIMESTAMP;
+				PKT_RX_IEEE1588_TMST |
+				tstamp->rx_tstamp_dynflag;
 		}
 	}
 }
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 10/16] net/pcap: switch Rx timestamp to dynamic mbuf field
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (8 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 09/16] net/octeontx2: " Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 11/16] app/testpmd: " Thomas Monjalon
                     ` (7 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.
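
The value stored in the field is unchanged by this patch: the capture
time is flattened into a single microsecond count, as in the
"tv_sec * 1000000 + tv_usec" expression of the diff. A small sketch of
that arithmetic, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a seconds/microseconds pair (as in struct timeval) into one
 * 64-bit microsecond count. The cast happens before the multiplication
 * so that the product is computed in 64 bits. */
static uint64_t
toy_timeval_to_usec(long tv_sec, long tv_usec)
{
	assert(tv_sec >= 0 && tv_usec >= 0 && tv_usec < 1000000);
	return (uint64_t)tv_sec * 1000000 + (uint64_t)tv_usec;
}
```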

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 drivers/net/pcap/rte_eth_pcap.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 34e82317b1..4e6d49370e 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -51,6 +51,9 @@ static uint64_t start_cycles;
 static uint64_t hz;
 static uint8_t iface_idx;
 
+static uint64_t timestamp_rx_dynflag;
+static int timestamp_dynfield_offset = -1;
+
 struct queue_stat {
 	volatile unsigned long pkts;
 	volatile unsigned long bytes;
@@ -265,9 +268,11 @@ eth_pcap_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		}
 
 		mbuf->pkt_len = (uint16_t)header.caplen;
-		mbuf->timestamp = (uint64_t)header.ts.tv_sec * 1000000
-							+ header.ts.tv_usec;
-		mbuf->ol_flags |= PKT_RX_TIMESTAMP;
+		*RTE_MBUF_DYNFIELD(mbuf, timestamp_dynfield_offset,
+			rte_mbuf_timestamp_t *) =
+				(uint64_t)header.ts.tv_sec * 1000000 +
+				header.ts.tv_usec;
+		mbuf->ol_flags |= timestamp_rx_dynflag;
 		mbuf->port = pcap_q->port_id;
 		bufs[num_rx] = mbuf;
 		num_rx++;
@@ -656,6 +661,15 @@ eth_dev_stop(struct rte_eth_dev *dev)
 static int
 eth_dev_configure(struct rte_eth_dev *dev __rte_unused)
 {
+	int ret;
+
+	ret = rte_mbuf_dyn_rx_timestamp_register(&timestamp_dynfield_offset,
+			&timestamp_rx_dynflag);
+	if (ret != 0) {
+		PMD_LOG(ERR, "Failed to register Rx timestamp field/flag");
+		return -rte_errno;
+	}
+
 	return 0;
 }
 
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 11/16] app/testpmd: switch Rx timestamp to dynamic mbuf field
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (9 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 10/16] net/pcap: " Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 12/16] examples/rxtx_callbacks: switch timestamp to dynamic field Thomas Monjalon
                     ` (6 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
The related mbuf flag is also replaced.
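
The helpers added in util.c resolve the flag and field lazily and cache
the result in a static variable. The caching pattern can be sketched on
its own as follows; toy_dynflag_lookup() and the counter are hypothetical
and only stand in for the real lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static int toy_lookup_calls;        /* counts how often the slow path runs */
static int toy_registered_bit = -1; /* -1 until the flag is "registered" */

/* Hypothetical lookup: returns the bit number or -1 if not registered. */
static int
toy_dynflag_lookup(void)
{
	toy_lookup_calls++;
	return toy_registered_bit;
}

static bool
toy_is_timestamp_enabled(uint64_t ol_flags)
{
	static uint64_t cached_flag; /* 0 means "not resolved yet" */

	if (cached_flag == 0) {
		int bit = toy_dynflag_lookup();

		if (bit < 0)
			return false; /* retry on the next call */
		assert(bit < 64);
		cached_flag = UINT64_C(1) << bit;
	}
	return (ol_flags & cached_flag) != 0;
}
```

Using 0 as the "unresolved" marker is safe because a resolved mask always
has exactly one bit set. Until registration happens, every call pays for
one lookup; afterwards the lookup is skipped.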

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 app/test-pmd/util.c | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 781a813759..649bf8f53a 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -5,6 +5,7 @@
 
 #include <stdio.h>
 
+#include <rte_bitops.h>
 #include <rte_net.h>
 #include <rte_mbuf.h>
 #include <rte_ether.h>
@@ -22,6 +23,39 @@ print_ether_addr(const char *what, const struct rte_ether_addr *eth_addr)
 	printf("%s%s", what, buf);
 }
 
+static inline bool
+is_timestamp_enabled(const struct rte_mbuf *mbuf)
+{
+	static uint64_t timestamp_rx_dynflag;
+	int timestamp_rx_dynflag_offset;
+
+	if (timestamp_rx_dynflag == 0) {
+		timestamp_rx_dynflag_offset = rte_mbuf_dynflag_lookup(
+				RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME, NULL);
+		if (timestamp_rx_dynflag_offset < 0)
+			return false;
+		timestamp_rx_dynflag = RTE_BIT64(timestamp_rx_dynflag_offset);
+	}
+
+	return (mbuf->ol_flags & timestamp_rx_dynflag) != 0;
+}
+
+static inline rte_mbuf_timestamp_t
+get_timestamp(const struct rte_mbuf *mbuf)
+{
+	static int timestamp_dynfield_offset = -1;
+
+	if (timestamp_dynfield_offset < 0) {
+		timestamp_dynfield_offset = rte_mbuf_dynfield_lookup(
+				RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+		if (timestamp_dynfield_offset < 0)
+			return 0;
+	}
+
+	return *RTE_MBUF_DYNFIELD(mbuf,
+			timestamp_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 static inline void
 dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
 	      uint16_t nb_pkts, int is_rx)
@@ -107,8 +141,8 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, struct rte_mbuf *pkts[],
 				printf("hash=0x%x ID=0x%x ",
 				       mb->hash.fdir.hash, mb->hash.fdir.id);
 		}
-		if (ol_flags & PKT_RX_TIMESTAMP)
-			printf(" - timestamp %"PRIu64" ", mb->timestamp);
+		if (is_timestamp_enabled(mb))
+			printf(" - timestamp %"PRIu64" ", get_timestamp(mb));
 		if (ol_flags & PKT_RX_QINQ)
 			printf(" - QinQ VLAN tci=0x%x, VLAN tci outer=0x%x",
 			       mb->vlan_tci, mb->vlan_tci_outer);
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 12/16] examples/rxtx_callbacks: switch timestamp to dynamic field
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (10 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 11/16] app/testpmd: " Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 13/16] ethdev: add doxygen comment for Rx timestamp API Thomas Monjalon
                     ` (5 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, John McNamara

The mbuf timestamp is moved to a dynamic field
in order to allow removal of the deprecated static field.
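
The latency computation itself keeps the same shape: for each packet of
a burst, the hardware timestamp is subtracted from the current device
clock and the differences are accumulated. A stand-alone sketch, with a
hypothetical helper working on a plain array of timestamps instead of
mbufs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of (now - per-packet hardware timestamp) over a burst.
 * Unsigned subtraction keeps each difference correct across a single
 * wrap of the 64-bit counter. */
static uint64_t
toy_sum_queue_ticks(uint64_t now, const uint64_t *hw_ts, size_t nb_pkts)
{
	uint64_t total = 0;
	size_t i;

	assert(hw_ts != NULL || nb_pkts == 0);
	for (i = 0; i < nb_pkts; i++)
		total += now - hw_ts[i];
	return total;
}
```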

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 examples/rxtx_callbacks/main.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
index 1a8e7d47d9..35c6c39807 100644
--- a/examples/rxtx_callbacks/main.c
+++ b/examples/rxtx_callbacks/main.c
@@ -19,6 +19,15 @@
 #define MBUF_CACHE_SIZE 250
 #define BURST_SIZE 32
 
+static int hwts_dynfield_offset = -1;
+
+static inline rte_mbuf_timestamp_t *
+hwts_field(struct rte_mbuf *mbuf)
+{
+	return RTE_MBUF_DYNFIELD(mbuf,
+			hwts_dynfield_offset, rte_mbuf_timestamp_t *);
+}
+
 typedef uint64_t tsc_t;
 static int tsc_dynfield_offset = -1;
 
@@ -77,7 +86,7 @@ calc_latency(uint16_t port, uint16_t qidx __rte_unused,
 	for (i = 0; i < nb_pkts; i++) {
 		cycles += now - *tsc_field(pkts[i]);
 		if (hw_timestamping)
-			queue_ticks += ticks - pkts[i]->timestamp;
+			queue_ticks += ticks - *hwts_field(pkts[i]);
 	}
 
 	latency_numbers.total_cycles += cycles;
@@ -141,6 +150,11 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
 			return -1;
 		}
 		port_conf.rxmode.offloads |= DEV_RX_OFFLOAD_TIMESTAMP;
+		rte_mbuf_dyn_rx_timestamp_register(&hwts_dynfield_offset, NULL);
+		if (hwts_dynfield_offset < 0) {
+			printf("ERROR: Failed to register timestamp field\n");
+			return -rte_errno;
+		}
 	}
 
 	retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 13/16] ethdev: add doxygen comment for Rx timestamp API
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (11 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 12/16] examples/rxtx_callbacks: switch timestamp to dynamic field Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 19:07     ` Ajit Khaparde
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 14/16] mbuf: remove deprecated timestamp field Thomas Monjalon
                     ` (4 subsequent siblings)
  17 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo

The offload flag DEV_RX_OFFLOAD_TIMESTAMP had no documentation.
After switching to dynamic mbuf flag and field,
it becomes even more important to make the feature behaviour explicit.

A doxygen comment for the timesync API mentioned
the deprecated timestamp field, so it is also updated.
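
The conversion formula quoted in that comment can be written as a small
function. This is only a sketch of the arithmetic: the helper name is
hypothetical, and base_clock, base_time_sec and freq are assumed to have
been measured beforehand as the surrounding doxygen text describes.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a raw device timestamp into seconds, given a reference point
 * (base_clock ticks correspond to base_time_sec) and the clock frequency. */
static double
toy_timestamp_to_seconds(uint64_t raw_ts, uint64_t base_clock,
		double base_time_sec, double freq_hz)
{
	assert(freq_hz > 0.0 && raw_ts >= base_clock);
	return base_time_sec + (double)(raw_ts - base_clock) / freq_hz;
}
```

For example, a timestamp 2000 ticks after the reference on a 1000 Hz clock
lands 2 seconds after base_time_sec.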

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 lib/librte_ethdev/rte_ethdev.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index e341a08817..4988054cb2 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1344,6 +1344,11 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_VLAN_EXTEND	0x00000400
 #define DEV_RX_OFFLOAD_JUMBO_FRAME	0x00000800
 #define DEV_RX_OFFLOAD_SCATTER		0x00002000
+/**
+ * Timestamp is set by the driver in RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+ * and RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME is set in ol_flags.
+ * The mbuf field and flag are registered when the offload is configured.
+ */
 #define DEV_RX_OFFLOAD_TIMESTAMP	0x00004000
 #define DEV_RX_OFFLOAD_SECURITY         0x00008000
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
@@ -4646,7 +4651,7 @@ int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
  * rte_eth_read_clock(port, base_clock);
  *
  * Then, convert the raw mbuf timestamp with:
- * base_time_sec + (double)(mbuf->timestamp - base_clock) / freq;
+ * base_time_sec + (double)(*timestamp_dynfield(mbuf) - base_clock) / freq;
  *
  * This simple example will not provide a very good accuracy. One must
  * at least measure multiple times the frequency and do a regression.
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 14/16] mbuf: remove deprecated timestamp field
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (12 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 13/16] ethdev: add doxygen comment for Rx timestamp API Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 15/16] mbuf: add Tx timestamp registration helper Thomas Monjalon
                     ` (3 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ajit Khaparde,
	Ray Kinsella, Neil Horman

As announced in the deprecation note, the field timestamp
is removed to give more space to the dynamic fields.
The related offload flag PKT_RX_TIMESTAMP is also removed.

This is what the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
 0    void *                            buf_addr;         /*   0 +  8 */
 1    rte_iova_t                        buf_iova;         /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
 2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
 3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
 4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
 5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
 5.5  uint64_t             union        hash;             /*  44 +  8 */
 6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
 7    uint64_t                          dynfield0[1];     /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
 8    struct rte_mempool *              pool;             /*  64 +  8 */
 9    struct rte_mbuf *                 next;             /*  72 +  8 */
10    uint64_t             union        tx_offload;       /*  80 +  8 */
11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
12    uint16_t                          priv_size;        /*  96 +  2 */
      uint16_t                          timesync;         /*  98 +  2 */
12.5  uint32_t                          dynfield1[7];     /* 100 + 28 */
16    /* --- END                                             128      */
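
The offsets in this table can be cross-checked with a simplified stand-in
structure. The sketch below is not the real struct rte_mbuf: struct mock_mbuf
is hypothetical, unions are flattened to plain members of the same size, the
markers and cache-line alignment attribute are left out, and 64-bit pointers
are assumed. The hash member is modelled as two 32-bit words because the table
places it at byte 44, which is only 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened model of the layout listed in the table. */
struct mock_mbuf {
	void *buf_addr;          /*   0 +  8 */
	uint64_t buf_iova;       /*   8 +  8 */
	uint16_t data_off;       /*  16 +  2 */
	uint16_t refcnt;         /*  18 +  2 */
	uint16_t nb_segs;        /*  20 +  2 */
	uint16_t port;           /*  22 +  2 */
	uint64_t ol_flags;       /*  24 +  8 */
	uint32_t packet_type;    /*  32 +  4 */
	uint32_t pkt_len;        /*  36 +  4 */
	uint16_t data_len;       /*  40 +  2 */
	uint16_t vlan_tci;       /*  42 +  2 */
	uint32_t hash[2];        /*  44 +  8, 4-byte aligned */
	uint16_t vlan_tci_outer; /*  52 +  2 */
	uint16_t buf_len;        /*  54 +  2 */
	uint64_t dynfield0[1];   /*  56 +  8, last word of first half */
	void *pool;              /*  64 +  8, start of second half */
	void *next;              /*  72 +  8 */
	uint64_t tx_offload;     /*  80 +  8 */
	void *shinfo;            /*  88 +  8 */
	uint16_t priv_size;      /*  96 +  2 */
	uint16_t timesync;       /*  98 +  2 */
	uint32_t dynfield1[7];   /* 100 + 28 */
};

/* Compile-time checks against the byte offsets given in the table. */
static_assert(sizeof(void *) == 8, "sketch assumes 64-bit pointers");
static_assert(offsetof(struct mock_mbuf, ol_flags) == 24, "ol_flags");
static_assert(offsetof(struct mock_mbuf, hash) == 44, "hash");
static_assert(offsetof(struct mock_mbuf, dynfield0) == 56, "dynfield0");
static_assert(offsetof(struct mock_mbuf, pool) == 64, "pool");
static_assert(offsetof(struct mock_mbuf, dynfield1) == 100, "dynfield1");
static_assert(sizeof(struct mock_mbuf) == 128, "two 64-byte halves");

/* Total bytes reserved for dynamic fields across both areas. */
static size_t
mock_dynfield_bytes(void)
{
	return sizeof(((struct mock_mbuf *)0)->dynfield0) +
		sizeof(((struct mock_mbuf *)0)->dynfield1);
}
```

If the model matches the table, the file compiles and mock_dynfield_bytes()
returns the sum of both dynamic field areas, that is 8 + 28 bytes.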

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: David Marchand <david.marchand@redhat.com>
---
 app/test/test_mbuf.c                   |  1 -
 doc/guides/rel_notes/deprecation.rst   |  4 ----
 doc/guides/rel_notes/release_20_11.rst |  4 ++++
 lib/librte_mbuf/rte_mbuf.c             |  2 --
 lib/librte_mbuf/rte_mbuf.h             |  2 +-
 lib/librte_mbuf/rte_mbuf_core.h        | 12 ++----------
 lib/librte_mbuf/rte_mbuf_dyn.c         |  1 +
 7 files changed, 8 insertions(+), 18 deletions(-)

diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 3a13cf4e1f..a40f7d4883 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1621,7 +1621,6 @@ test_get_rx_ol_flag_name(void)
 		VAL_NAME(PKT_RX_FDIR_FLX),
 		VAL_NAME(PKT_RX_QINQ_STRIPPED),
 		VAL_NAME(PKT_RX_LRO),
-		VAL_NAME(PKT_RX_TIMESTAMP),
 		VAL_NAME(PKT_RX_SEC_OFFLOAD),
 		VAL_NAME(PKT_RX_SEC_OFFLOAD_FAILED),
 		VAL_NAME(PKT_RX_OUTER_L4_CKSUM_BAD),
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index fe3fd3956c..22aecf0bab 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -84,10 +84,6 @@ Deprecation Notices
 * mbuf: Some fields will be converted to dynamic API in DPDK 20.11
   in order to reserve more space for the dynamic fields, as explained in
   `this presentation <https://www.youtube.com/watch?v=Ttl6MlhmzWY>`_.
-  The following static fields will be moved as dynamic:
-
-  - ``timestamp``
-
   As a consequence, the layout of the ``struct rte_mbuf`` will be re-arranged,
   avoiding impact on vectorized implementation of the driver datapaths,
   while evaluating performance gains of a better use of the first cache line.
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index f1a6925678..7c8246d1b3 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -458,6 +458,10 @@ API Changes
 * mbuf: Removed the field ``seqn`` from the structure ``rte_mbuf``.
   It is replaced with dynamic fields.
 
+* mbuf: Removed the field ``timestamp`` from the structure ``rte_mbuf``.
+  It is replaced with the dynamic field RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+  which was previously used only for Tx.
+
 * pci: Removed the ``rte_kernel_driver`` enum defined in rte_dev.h and
   replaced with a private enum in the PCI subsystem.
 
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index 8a456e5e64..09d93e6899 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -764,7 +764,6 @@ const char *rte_get_rx_ol_flag_name(uint64_t mask)
 	case PKT_RX_QINQ_STRIPPED: return "PKT_RX_QINQ_STRIPPED";
 	case PKT_RX_QINQ: return "PKT_RX_QINQ";
 	case PKT_RX_LRO: return "PKT_RX_LRO";
-	case PKT_RX_TIMESTAMP: return "PKT_RX_TIMESTAMP";
 	case PKT_RX_SEC_OFFLOAD: return "PKT_RX_SEC_OFFLOAD";
 	case PKT_RX_SEC_OFFLOAD_FAILED: return "PKT_RX_SEC_OFFLOAD_FAILED";
 	case PKT_RX_OUTER_L4_CKSUM_BAD: return "PKT_RX_OUTER_L4_CKSUM_BAD";
@@ -808,7 +807,6 @@ rte_get_rx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
 		{ PKT_RX_FDIR_FLX, PKT_RX_FDIR_FLX, NULL },
 		{ PKT_RX_QINQ_STRIPPED, PKT_RX_QINQ_STRIPPED, NULL },
 		{ PKT_RX_LRO, PKT_RX_LRO, NULL },
-		{ PKT_RX_TIMESTAMP, PKT_RX_TIMESTAMP, NULL },
 		{ PKT_RX_SEC_OFFLOAD, PKT_RX_SEC_OFFLOAD, NULL },
 		{ PKT_RX_SEC_OFFLOAD_FAILED, PKT_RX_SEC_OFFLOAD_FAILED, NULL },
 		{ PKT_RX_QINQ, PKT_RX_QINQ, NULL },
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index a1414ed7cd..17e0b205c0 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1095,6 +1095,7 @@ rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
 static inline void
 rte_mbuf_dynfield_copy(struct rte_mbuf *mdst, const struct rte_mbuf *msrc)
 {
+	memcpy(&mdst->dynfield0, msrc->dynfield0, sizeof(mdst->dynfield0));
 	memcpy(&mdst->dynfield1, msrc->dynfield1, sizeof(mdst->dynfield1));
 }
 
@@ -1108,7 +1109,6 @@ __rte_pktmbuf_copy_hdr(struct rte_mbuf *mdst, const struct rte_mbuf *msrc)
 	mdst->tx_offload = msrc->tx_offload;
 	mdst->hash = msrc->hash;
 	mdst->packet_type = msrc->packet_type;
-	mdst->timestamp = msrc->timestamp;
 	rte_mbuf_dynfield_copy(mdst, msrc);
 }
 
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 3fb5abda3c..38e24a580d 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -149,10 +149,7 @@ extern "C" {
  */
 #define PKT_RX_LRO           (1ULL << 16)
 
-/**
- * Indicate that the timestamp field in the mbuf is valid.
- */
-#define PKT_RX_TIMESTAMP     (1ULL << 17)
+/* There is no flag defined at offset 17. It is free for any future use. */
 
 /**
  * Indicate that security offload processing was applied on the RX packet.
@@ -589,12 +586,7 @@ struct rte_mbuf {
 
 	uint16_t buf_len;         /**< Length of segment buffer. */
 
-	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
-	 * are not normalized but are always the same for a given port.
-	 * Some devices allow to query rte_eth_read_clock that will return the
-	 * current device timestamp.
-	 */
-	uint64_t timestamp;
+	uint64_t dynfield0[1]; /**< Reserved for dynamic fields. */
 
 	/* second cache line - fields only used in slow path or on TX */
 	RTE_MARKER cacheline1 __rte_cache_min_aligned;
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
index 5b608a27d7..4f50da09f3 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.c
+++ b/lib/librte_mbuf/rte_mbuf_dyn.c
@@ -125,6 +125,7 @@ init_shared_mem(void)
 		 * rte_mbuf_dynfield_copy().
 		 */
 		memset(shm, 0, sizeof(*shm));
+		mark_free(dynfield0);
 		mark_free(dynfield1);
 
 		/* init free_flags */
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 15/16] mbuf: add Tx timestamp registration helper
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (13 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 14/16] mbuf: remove deprecated timestamp field Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 16/16] ethdev: include mbuf registration in Tx timestamp API Thomas Monjalon
                     ` (2 subsequent siblings)
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Ray Kinsella, Neil Horman

The function rte_mbuf_dyn_tx_timestamp_register()
can be used to register the required field and flag.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 lib/librte_mbuf/rte_mbuf_dyn.c |  7 +++++++
 lib/librte_mbuf/rte_mbuf_dyn.h | 22 ++++++++++++++++++++++
 lib/librte_mbuf/version.map    |  1 +
 3 files changed, 30 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf_dyn.c b/lib/librte_mbuf/rte_mbuf_dyn.c
index 4f50da09f3..101b5bd95f 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.c
+++ b/lib/librte_mbuf/rte_mbuf_dyn.c
@@ -613,3 +613,10 @@ rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag)
 	return rte_mbuf_dyn_timestamp_register(field_offset, rx_flag,
 			"Rx", RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME);
 }
+
+int
+rte_mbuf_dyn_tx_timestamp_register(int *field_offset, uint64_t *tx_flag)
+{
+	return rte_mbuf_dyn_timestamp_register(field_offset, tx_flag,
+			"Tx", RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME);
+}
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 2e729ddaca..d88e7bacc5 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -304,4 +304,26 @@ int rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag);
  */
 #define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Register dynamic mbuf field and flag for Tx timestamp.
+ *
+ * @param field_offset
+ *   Pointer to the offset of the registered mbuf field, can be NULL.
+ *   The same field is shared for Rx and Tx timestamp.
+ * @param tx_flag
+ *   Pointer to the mask of the registered offload flag, can be NULL.
+ * @return
+ *   0 on success, -1 otherwise.
+ *   Possible values for rte_errno:
+ *   - EEXIST: already registered with different parameters.
+ *   - EPERM: called from a secondary process.
+ *   - ENOENT: no more field or flag available.
+ *   - ENOMEM: allocation failure.
+ */
+__rte_experimental
+int rte_mbuf_dyn_tx_timestamp_register(int *field_offset, uint64_t *tx_flag);
+
 #endif
diff --git a/lib/librte_mbuf/version.map b/lib/librte_mbuf/version.map
index 0b66668bff..b7d98e7eb1 100644
--- a/lib/librte_mbuf/version.map
+++ b/lib/librte_mbuf/version.map
@@ -43,6 +43,7 @@ EXPERIMENTAL {
 	rte_mbuf_dynflag_register_bitnum;
 	rte_mbuf_dyn_dump;
 	rte_mbuf_dyn_rx_timestamp_register;
+	rte_mbuf_dyn_tx_timestamp_register;
 	rte_pktmbuf_copy;
 	rte_pktmbuf_free_bulk;
 	rte_pktmbuf_pool_create_extbuf;
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [dpdk-dev] [PATCH v5 16/16] ethdev: include mbuf registration in Tx timestamp API
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (14 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 15/16] mbuf: add Tx timestamp registration helper Thomas Monjalon
@ 2020-11-03 14:09   ` Thomas Monjalon
  2020-11-03 14:17   ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Olivier Matz
  2020-11-03 16:08   ` Stephen Hemminger
  17 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:09 UTC (permalink / raw)
  To: dev
  Cc: ferruh.yigit, david.marchand, bruce.richardson, olivier.matz,
	andrew.rybchenko, jerinj, viacheslavo, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Matan Azrad, Shahaf Shuler

Previously, the Tx timestamp field and flag were registered in testpmd,
as described in mlx5 guide.
For consistency between Rx and Tx timestamps,
managing mbuf registrations inside the driver, as properly documented,
is a simpler expectation.

The only driver to support this feature (mlx5) is updated
as well as the testpmd application.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
---
 app/test-pmd/config.c          | 38 ----------------------------------
 doc/guides/nics/mlx5.rst       |  5 ++---
 drivers/net/mlx5/mlx5_ethdev.c |  8 ++++++-
 lib/librte_ethdev/rte_ethdev.h |  7 +++++--
 4 files changed, 14 insertions(+), 44 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 1668ae3238..9a2baf16fe 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -3955,44 +3955,6 @@ show_tx_pkt_times(void)
 void
 set_tx_pkt_times(unsigned int *tx_times)
 {
-	uint16_t port_id;
-	int offload_found = 0;
-	int offset;
-	int flag;
-
-	static const struct rte_mbuf_dynfield desc_offs = {
-		.name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
-		.size = sizeof(uint64_t),
-		.align = __alignof__(uint64_t),
-	};
-	static const struct rte_mbuf_dynflag desc_flag = {
-		.name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
-	};
-
-	RTE_ETH_FOREACH_DEV(port_id) {
-		struct rte_eth_dev_info dev_info = { 0 };
-		int ret;
-
-		ret = rte_eth_dev_info_get(port_id, &dev_info);
-		if (ret == 0 && dev_info.tx_offload_capa &
-				DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) {
-			offload_found = 1;
-			break;
-		}
-	}
-	if (!offload_found) {
-		printf("No device supporting Tx timestamp scheduling found, "
-		       "dynamic flag and field not registered\n");
-		return;
-	}
-	offset = rte_mbuf_dynfield_register(&desc_offs);
-	if (offset < 0 && rte_errno != EEXIST)
-		printf("Dynamic timestamp field registration error: %d",
-		       rte_errno);
-	flag = rte_mbuf_dynflag_register(&desc_flag);
-	if (flag < 0 && rte_errno != EEXIST)
-		printf("Dynamic timestamp flag registration error: %d",
-		       rte_errno);
 	tx_pkt_times_inter = tx_times[0];
 	tx_pkt_times_intra = tx_times[1];
 }
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index afa65a1379..fa8b13dd1b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -237,9 +237,8 @@ Limitations
   ``txq_inline_max`` and ``txq_inline_mpw`` devargs keys.
 
 - To provide the packet send scheduling on mbuf timestamps the ``tx_pp``
-  parameter should be specified, RTE_MBUF_DYNFIELD_TIMESTAMP_NAME and
-  RTE_MBUF_DYNFLAG_TIMESTAMP_NAME should be registered by application.
-  When PMD sees the RTE_MBUF_DYNFLAG_TIMESTAMP_NAME set on the packet
+  parameter should be specified.
+  When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME set on the packet
   being sent it tries to synchronize the time of packet appearing on
   the wire with the specified packet timestamp. It the specified one
   is in the past it should be ignored, if one is in the distant future
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 7631f644b2..76ef02664f 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -88,7 +88,13 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
 
 	if (dev->data->dev_conf.rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG)
 		dev->data->dev_conf.rxmode.offloads |= DEV_RX_OFFLOAD_RSS_HASH;
-
+	if ((dev->data->dev_conf.txmode.offloads &
+			DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP) &&
+			rte_mbuf_dyn_tx_timestamp_register(NULL, NULL) != 0) {
+		DRV_LOG(ERR, "port %u cannot register Tx timestamp field/flag",
+			dev->data->port_id);
+		return -rte_errno;
+	}
 	memcpy(priv->rss_conf.rss_key,
 	       use_app_rss_key ?
 	       dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key :
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 4988054cb2..f689550745 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1413,8 +1413,11 @@ struct rte_eth_conf {
 #define DEV_TX_OFFLOAD_IP_TNL_TSO       0x00080000
 /** Device supports outer UDP checksum */
 #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM  0x00100000
-
-/** Device supports send on timestamp */
+/**
+ * Device sends on time read from RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
+ * if RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME is set in ol_flags.
+ * The mbuf field and flag are registered when the offload is configured.
+ */
 #define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
 
 
-- 
2.28.0


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (15 preceding siblings ...)
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 16/16] ethdev: include mbuf registration in Tx timestamp API Thomas Monjalon
@ 2020-11-03 14:17   ` Olivier Matz
  2020-11-03 14:44     ` Thomas Monjalon
  2020-11-03 16:08   ` Stephen Hemminger
  17 siblings, 1 reply; 170+ messages in thread
From: Olivier Matz @ 2020-11-03 14:17 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	andrew.rybchenko, jerinj, viacheslavo

On Tue, Nov 03, 2020 at 03:09:15PM +0100, Thomas Monjalon wrote:
> The mbuf field timestamp was announced to be removed for three reasons:
>   - a dynamic field already exist, used for Tx only
>   - this field always used 8 bytes even if unneeded
>   - this field is in the first half (cacheline) of mbuf
> 
> After this series, the dynamic field timestamp is used for both Rx and Tx
> with separate dynamic flags to distinguish when the value is meaningful
> without resetting the field during forwarding.
> 
> As a consequence, 8 bytes can be re-allocated to dynamic fields
> in the first half of mbuf structure.
> It is still open to change more the mbuf layout.
> 
> This mbuf layout change is important to allow adding more features
> (consuming more dynamic fields) during the next year,
> and can allow performance improvements with new usages in the first half.
> 
> 
> v5:
> - add a blank line between different kind of ARK variables
> - move registration after octeontx2 VF config
> - register also in otx2_nix_timesync_enable
> 
> v4:
> - use local variable in nfb
> - fix flag initialization
> - remove useless blank line
> 
> v3:
> - move ark variables declaration in a .h file 
> - improve cache locality for octeontx2
> - add comments about cache locality in commit logs
> - add comment for unused flag offset 17
> - add timestamp register functions
> - replace lookup with register in drivers and apps
> - remove register in ethdev
> 
> v2:
> - remove optimization to register only once in ethdev
> - fix error message in latencystats
> - convert rxtx_callbacks macro to inline function
> - increase dynamic fields space
> - do not move pool field
> 
> 
> Thomas Monjalon (16):
>   eventdev: remove software Rx timestamp
>   mbuf: add Rx timestamp flag and helpers
>   latency: switch Rx timestamp to dynamic mbuf field
>   net/ark: switch Rx timestamp to dynamic mbuf field
>   net/dpaa2: switch Rx timestamp to dynamic mbuf field
>   net/mlx5: fix dynamic mbuf offset lookup check
>   net/mlx5: switch Rx timestamp to dynamic mbuf field
>   net/nfb: switch Rx timestamp to dynamic mbuf field
>   net/octeontx2: switch Rx timestamp to dynamic mbuf field
>   net/pcap: switch Rx timestamp to dynamic mbuf field
>   app/testpmd: switch Rx timestamp to dynamic mbuf field
>   examples/rxtx_callbacks: switch timestamp to dynamic field
>   ethdev: add doxygen comment for Rx timestamp API
>   mbuf: remove deprecated timestamp field
>   mbuf: add Tx timestamp registration helper
>   ethdev: include mbuf registration in Tx timestamp API
> 
>  app/test-pmd/config.c                         | 38 -------------
>  app/test-pmd/util.c                           | 38 ++++++++++++-
>  app/test/test_mbuf.c                          |  1 -
>  doc/guides/nics/mlx5.rst                      |  5 +-
>  .../prog_guide/event_ethernet_rx_adapter.rst  |  6 +-
>  doc/guides/rel_notes/deprecation.rst          |  4 --
>  doc/guides/rel_notes/release_20_11.rst        |  4 ++
>  drivers/net/ark/ark_ethdev.c                  | 17 ++++++
>  drivers/net/ark/ark_ethdev_rx.c               |  7 ++-
>  drivers/net/ark/ark_ethdev_rx.h               |  2 +
>  drivers/net/dpaa2/dpaa2_ethdev.c              | 11 ++++
>  drivers/net/dpaa2/dpaa2_ethdev.h              |  2 +
>  drivers/net/dpaa2/dpaa2_rxtx.c                | 25 ++++++---
>  drivers/net/mlx5/mlx5_ethdev.c                |  8 ++-
>  drivers/net/mlx5/mlx5_rxq.c                   |  8 +++
>  drivers/net/mlx5/mlx5_rxtx.c                  |  8 +--
>  drivers/net/mlx5/mlx5_rxtx.h                  | 19 +++++++
>  drivers/net/mlx5/mlx5_rxtx_vec_altivec.h      | 41 +++++++-------
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h         | 43 ++++++++-------
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h          | 35 ++++++------
>  drivers/net/mlx5/mlx5_trigger.c               |  2 +-
>  drivers/net/mlx5/mlx5_txq.c                   |  2 +-
>  drivers/net/nfb/nfb_rx.c                      | 15 ++++-
>  drivers/net/nfb/nfb_rx.h                      | 21 +++++--
>  drivers/net/octeontx2/otx2_ethdev.c           | 10 ++++
>  drivers/net/octeontx2/otx2_ptp.c              |  8 +++
>  drivers/net/octeontx2/otx2_rx.h               | 19 ++++++-
>  drivers/net/pcap/rte_eth_pcap.c               | 20 ++++++-
>  examples/rxtx_callbacks/main.c                | 16 +++++-
>  lib/librte_ethdev/rte_ethdev.h                | 14 ++++-
>  .../rte_event_eth_rx_adapter.c                | 11 ----
>  .../rte_event_eth_rx_adapter.h                |  6 +-
>  lib/librte_latencystats/rte_latencystats.c    | 30 ++++++++--
>  lib/librte_mbuf/rte_mbuf.c                    |  2 -
>  lib/librte_mbuf/rte_mbuf.h                    |  2 +-
>  lib/librte_mbuf/rte_mbuf_core.h               | 12 +---
>  lib/librte_mbuf/rte_mbuf_dyn.c                | 51 +++++++++++++++++
>  lib/librte_mbuf/rte_mbuf_dyn.h                | 55 +++++++++++++++++--
>  lib/librte_mbuf/version.map                   |  2 +
>  39 files changed, 440 insertions(+), 180 deletions(-)
> 
> -- 
> 2.28.0
> 

For the series:
Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [EXT] Re: [PATCH v3 09/16] net/octeontx2: switch Rx timestamp to dynamic mbuf field
  2020-11-03 12:21         ` Thomas Monjalon
@ 2020-11-03 14:23           ` Harman Kalra
  0 siblings, 0 replies; 170+ messages in thread
From: Harman Kalra @ 2020-11-03 14:23 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, andrew.rybchenko, jerinj, viacheslavo,
	Nithin Dabilpuram, Kiran Kumar K

On Tue, Nov 03, 2020 at 01:21:01PM +0100, Thomas Monjalon wrote:
> External Email
> 
> ----------------------------------------------------------------------
> 03/11/2020 12:22, Thomas Monjalon:
> > 03/11/2020 11:52, Harman Kalra:
> > >    With the following changes, ptpclient and testpmd(ieee1588 mode) is
> > >    crashing for us. I am debugging the issue and will update soon.
> > >   ------------------
> > >    Steps to reproduce:
> > >    1. Testpmd:
> > >       ./dpdk-testpmd -c 0xffff01 -n 4 -w 0002:05:00.0 -- -i
> > >       --port-topology=loop
> > >       testpmd> set fwd ieee1588
> > >       testpmd> set port 0 ptype_mask 0xf
> > >       testpmd> start
> > > 
> > >       I am sending ptp packets using scapy from the peer:
> > >       >>> p = Ether(src='98:03:9b:67:b0:d0', dst='FA:62:0C:27:AD:BC',
> > > 		      >>> type=35063)/Raw(load='\x00\x02')
> > >       >>> sendp (p, iface="p5p2")
> > > 
> > >       I am observing seg fault even for 1 packet.
> > 
> > Where is the crash? Could you provide a backtrace?
> > Is the field well registered?
> 
> Sorry Harman, without any more explanation, we must move forward.
> I am going to send a v4 without any change for octeontx2.
> It should be merged today for -rc2.
> 

Hi Thomas,

   I have fixed the issue and sent a patch on top of your changes,
   kindly apply it:
   https://patches.dpdk.org/patch/83611/

   Once you squash both patches, I will abandon my patch.
   
Thanks
Harman

> 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp
  2020-11-03 14:17   ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Olivier Matz
@ 2020-11-03 14:44     ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 14:44 UTC (permalink / raw)
  To: david.marchand, andrew.rybchenko, Olivier Matz
  Cc: dev, ferruh.yigit, bruce.richardson, jerinj, viacheslavo

03/11/2020 15:17, Olivier Matz:
> On Tue, Nov 03, 2020 at 03:09:15PM +0100, Thomas Monjalon wrote:
> > The mbuf field timestamp was announced to be removed for three reasons:
> >   - a dynamic field already exist, used for Tx only
> >   - this field always used 8 bytes even if unneeded
> >   - this field is in the first half (cacheline) of mbuf
> > 
> > After this series, the dynamic field timestamp is used for both Rx and Tx
> > with separate dynamic flags to distinguish when the value is meaningful
> > without resetting the field during forwarding.
> > 
> > As a consequence, 8 bytes can be re-allocated to dynamic fields
> > in the first half of mbuf structure.
> > It is still open to change more the mbuf layout.
> > 
> > This mbuf layout change is important to allow adding more features
> > (consuming more dynamic fields) during the next year,
> > and can allow performance improvements with new usages in the first half.
[...]
> >  39 files changed, 440 insertions(+), 180 deletions(-)
> 
> For the series:
> Acked-by: Olivier Matz <olivier.matz@6wind.com>

Applied, thanks for the help Olivier, David and Andrew.

Next step: decide whether we keep "free space" in the first half
for dynamic fields or move another field from the second half.



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-03 14:02                         ` Slava Ovsiienko
@ 2020-11-03 15:03                           ` Morten Brørup
  2020-11-04 15:00                             ` Olivier Matz
  0 siblings, 1 reply; 170+ messages in thread
From: Morten Brørup @ 2020-11-03 15:03 UTC (permalink / raw)
  To: Slava Ovsiienko, NBU-Contact-Thomas Monjalon, dev, techboard
  Cc: Ajit Khaparde, Ananyev, Konstantin, Andrew Rybchenko, dev, Yigit,
	Ferruh, david.marchand, Richardson, Bruce, olivier.matz, jerinj,
	honnappa.nagarahalli, maxime.coquelin, stephen, hemant.agrawal,
	Matan Azrad, Shahaf Shuler

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Slava Ovsiienko
> Sent: Tuesday, November 3, 2020 3:03 PM
> 
> Hi, Morten
> 
> > From: Morten Brørup <mb@smartsharesystems.com>
> > Sent: Tuesday, November 3, 2020 14:10
> >
> > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > Sent: Monday, November 2, 2020 4:58 PM
> > >
> > > +Cc techboard
> > >
> > > We need benchmark numbers in order to take a decision.
> > > Please all, prepare some arguments and numbers so we can discuss
> the
> > > mbuf layout in the next techboard meeting.
> >
> > I propose that the techboard considers this from two angels:
> >
> > 1. Long term goals and their relative priority. I.e. what can be
> achieved with
> > wide-ranging modifications, requiring yet another ABI break and due
> notices.
> >
> > 2. Short term goals, i.e. what can be achieved for this release.
> >
> >
> > My suggestions follow...
> >
> > 1. Regarding long term goals:
> >
> > I have argued that simple forwarding of non-segmented packets using
> only the
> > first mbuf cache line can be achieved by making three
> > modifications:
> >
> > a) Move m->tx_offload to the first cache line.
> Not all PMDs use this field on Tx. HW might support the checksum
> offloads
> directly, not requiring these fields at all.
> 
> 
> > b) Use an 8 bit pktmbuf mempool index in the first cache line,
> >    instead of the 64 bit m->pool pointer in the second cache line.
> 256 mpool looks enough, as for me. Regarding the indirect access to the
> pool
> (via some table) - it might introduce some performance impact.

It might, but I hope that it is negligible, so the benefits outweigh the disadvantages.

It would have to be measured, though.

And m->pool is only used for free()'ing (and detach()'ing) mbufs.

> For example,
> mlx5 PMD strongly relies on pool field for allocating mbufs in Rx
> datapath.
> We're going to update (o-o, we found point to optimize), but for now it
> does.

Without looking at the source code, I don't think the PMD is using m->pool in the RX datapath; I think it is using a pool dedicated to a receive queue used for RX descriptors in the PMD (i.e. driver->queue->pool).

> 
> > c) Do not access m->next when we know that it is NULL.
> >    We can use m->nb_segs == 1 or some other invariant as the gate.
> >    It can be implemented by adding an m->next accessor function:
> >    struct rte_mbuf * rte_mbuf_next(struct rte_mbuf * m)
> >    {
> >        return m->nb_segs == 1 ? NULL : m->next;
> >    }
> 
> Sorry, not sure about this. IIRC, nb_segs is valid in the first
> segment/mbuf  only.
> If we have the 4 segments in the pkt we see nb_seg=4 in the first one,
> and the nb_seg=1
> in the others. The next field is NULL in the last mbuf only. Am I wrong
> and miss something ?

You are correct.

This would have to be updated too, either by increasing m->nb_segs in the following segments, or by splitting up the relevant functions into functions for working on first segments (incl. non-segmented packets) and functions for working on following segments of segmented packets.

> 
> > Regarding the priority of this goal, I guess that simple forwarding
> of non-
> > segmented packets is probably the path taken by the majority of
> packets
> > handled by DPDK.
> >
> > An alternative goal could be:
> > Do not touch the second cache line during RX.
> > A comment in the mbuf structure says so, but it is not true anymore.
> >
> > (I guess that regression testing didn't catch this because the tests
> perform TX
> > immediately after RX, so the cache miss just moves from the TX to the
> RX part
> > of the test application.)
> >
> >
> > 2. Regarding short term goals:
> >
> > The current DPDK source code looks to me like m->next is the most
> frequently
> > accessed field in the second cache line, so it makes sense moving
> this to the
> > first cache line, rather than m->pool.
> > Benchmarking may help here.
> 
> Moreover, for the segmented packets the packet size is supposed to be
> large,
> and it imposes the relatively low packet rate, so probably optimization
> of
> moving next to the 1st cache line might be negligible at all. Just
> compare 148Mpps of
> 64B pkts and 4Mpps of 3000B pkts over 100Gbps link. Currently we are on
> benchmarking
> and did not succeed yet on difference finding. The benefit can't be
> expressed in mpps delta,
> we should measure CPU clocks, but Rx queue is almost always empty - we
> have an empty
> loops. So, if we have the boost - it is extremely hard to catch one.

Very good point regarding the value of such an optimization, Slava!
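
For reference, the packet rates quoted above can be reproduced with
simple arithmetic. The sketch below is only an illustration: it assumes
Ethernet framing with 20 bytes of per-packet wire overhead (7-byte
preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap) and
that the quoted sizes include the CRC. The function name max_pps() is
made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-packet wire overhead: preamble (7) + SFD (1) + IFG (12). */
#define WIRE_OVERHEAD_BYTES 20u

/*
 * Upper bound on packets per second for a link of link_bps bits/s
 * carrying back-to-back frames of frame_bytes each.
 */
static double
max_pps(double link_bps, uint32_t frame_bytes)
{
	assert(frame_bytes > 0);
	return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8.0);
}
```

Under these assumptions, max_pps(100e9, 64) is about 148.8 Mpps and
max_pps(100e9, 3000) is about 4.1 Mpps, i.e. roughly a factor of 36
fewer mbuf chains to handle per second in the segmented case.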

And when free()'ing packets, both m->next and m->pool are touched.

So perhaps the free()/detach() functions in the mbuf library can be modified to handle first segments (and non-segmented packets) and following segments differently, so accessing m->next can be avoided for non-segmented packets. Then m->pool should be moved to the first cache line.


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp
  2020-11-03 14:09 ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Thomas Monjalon
                     ` (16 preceding siblings ...)
  2020-11-03 14:17   ` [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp Olivier Matz
@ 2020-11-03 16:08   ` Stephen Hemminger
  2020-11-03 16:20     ` Thomas Monjalon
  17 siblings, 1 reply; 170+ messages in thread
From: Stephen Hemminger @ 2020-11-03 16:08 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, andrew.rybchenko, jerinj, viacheslavo

On Tue,  3 Nov 2020 15:09:15 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:

> The mbuf field timestamp was announced to be removed for three reasons:
>   - a dynamic field already exist, used for Tx only
>   - this field always used 8 bytes even if unneeded
>   - this field is in the first half (cacheline) of mbuf
> 
> After this series, the dynamic field timestamp is used for both Rx and Tx
> with separate dynamic flags to distinguish when the value is meaningful
> without resetting the field during forwarding.

There should be a place in documentation which describes all the
dynamic fields and their meaning.  For example, which drivers/features
set the field and the exact meaning.  Is the timestamp in HW units,
UTC units, or TSC ticks?

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp
  2020-11-03 16:08   ` Stephen Hemminger
@ 2020-11-03 16:20     ` Thomas Monjalon
  2020-11-03 17:42       ` Stephen Hemminger
  0 siblings, 1 reply; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 16:20 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, andrew.rybchenko, jerinj, viacheslavo

03/11/2020 17:08, Stephen Hemminger:
> On Tue,  3 Nov 2020 15:09:15 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > The mbuf field timestamp was announced to be removed for three reasons:
> >   - a dynamic field already exist, used for Tx only
> >   - this field always used 8 bytes even if unneeded
> >   - this field is in the first half (cacheline) of mbuf
> > 
> > After this series, the dynamic field timestamp is used for both Rx and Tx
> > with separate dynamic flags to distinguish when the value is meaningful
> > without resetting the field during forwarding.
> 
> There should be a place in documentation which describes all the
> dynamic fields and their meaning.  For example, which drivers/features
> set the field and the exact meaning.

A dynamic field can be registered by anyone, including the apps.
So you will never get a full list.
The meaning of each field should be defined in its context
(driver, lib or app).

> Is the timestamp in HW units, UTC units, or TSC ticks?

The timestamp unit is driver-specific.
It is explained in ethdev API:
http://doc.dpdk.org/api/rte__ethdev_8h.html#a4346bf07a0d302c9ba4fe06baffd3196



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp
  2020-11-03 16:20     ` Thomas Monjalon
@ 2020-11-03 17:42       ` Stephen Hemminger
  2020-11-03 17:55         ` Thomas Monjalon
  0 siblings, 1 reply; 170+ messages in thread
From: Stephen Hemminger @ 2020-11-03 17:42 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, andrew.rybchenko, jerinj, viacheslavo

On Tue, 03 Nov 2020 17:20:20 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:

> 03/11/2020 17:08, Stephen Hemminger:
> > On Tue,  3 Nov 2020 15:09:15 +0100
> > Thomas Monjalon <thomas@monjalon.net> wrote:
> >   
> > > The mbuf field timestamp was announced to be removed for three reasons:
> > >   - a dynamic field already exist, used for Tx only
> > >   - this field always used 8 bytes even if unneeded
> > >   - this field is in the first half (cacheline) of mbuf
> > > 
> > > After this series, the dynamic field timestamp is used for both Rx and Tx
> > > with separate dynamic flags to distinguish when the value is meaningful
> > > without resetting the field during forwarding.  
> > 
> > There should be a place in documentation which describes all the
> > dynamic fields and their meaning.  For example, which drivers/features
> > set the field and the exact meaning.  
> 
> A dynamic field can be registered by anyone, including the apps.
> So you will never get a full list.
> The meaning of each field should be defined in its context
> (driver, lib or app).
> 
> > Is the timestamp in HW units, UTC units, or TSC ticks?  
> 
> The timestamp unit is driver-specific.
> It is explained in ethdev API:
> http://doc.dpdk.org/api/rte__ethdev_8h.html#a4346bf07a0d302c9ba4fe06baffd3196


Are there any conventions we should use in this area?
Could there be overlapping usage between subsystems?

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/16] remove mbuf timestamp
  2020-11-03 17:42       ` Stephen Hemminger
@ 2020-11-03 17:55         ` Thomas Monjalon
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Monjalon @ 2020-11-03 17:55 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, ferruh.yigit, david.marchand, bruce.richardson,
	olivier.matz, andrew.rybchenko, jerinj, viacheslavo

03/11/2020 18:42, Stephen Hemminger:
> On Tue, 03 Nov 2020 17:20:20 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
> > 03/11/2020 17:08, Stephen Hemminger:
> > > On Tue,  3 Nov 2020 15:09:15 +0100
> > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > >   
> > > > The mbuf field timestamp was announced to be removed for three reasons:
> > > >   - a dynamic field already exist, used for Tx only
> > > >   - this field always used 8 bytes even if unneeded
> > > >   - this field is in the first half (cacheline) of mbuf
> > > > 
> > > > After this series, the dynamic field timestamp is used for both Rx and Tx
> > > > with separate dynamic flags to distinguish when the value is meaningful
> > > > without resetting the field during forwarding.  
> > > 
> > > There should be a place in documentation which describes all the
> > > dynamic fields and their meaning.  For example, which drivers/features
> > > set the field and the exact meaning.  
> > 
> > A dynamic field can be registered by anyone, including the apps.
> > So you will never get a full list.
> > The meaning of each field should be defined in its context
> > (driver, lib or app).
> > 
> > > Is the timestamp in HW units, UTC units, or TSC ticks?  
> > 
> > The timestamp unit is driver-specific.
> > It is explained in ethdev API:
> > http://doc.dpdk.org/api/rte__ethdev_8h.html#a4346bf07a0d302c9ba4fe06baffd3196
> 
> 
> Are there any conventions we should use in this area?
> Could there be overlapping usage between subsystems?

The name of the field should be prefixed with the right context
to avoid overlapping of different usages.
It is documented here:
	http://doc.dpdk.org/api/rte__mbuf__dyn_8h.html
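
To illustrate why the prefix matters, here is a toy model of a
name-keyed registry. It is not the rte_mbuf_dyn implementation: the
toy_* names, the fixed 8-byte slots and the example field names are all
made up, and the toy matches on the name only (as I understand it, the
real registration also compares the other field parameters).

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_FIELDS 8
#define TOY_NAME_LEN   64

struct toy_dynfield {
	char name[TOY_NAME_LEN];
	int offset;
};

static struct toy_dynfield toy_fields[TOY_MAX_FIELDS];
static int toy_nb_fields;

/*
 * Register a field by name. An already known name returns the existing
 * offset (the field is shared); a new name gets a new 8-byte slot.
 * Returns -1 if the table is full or the name is too long.
 */
static int
toy_dynfield_register(const char *name)
{
	int i;

	for (i = 0; i < toy_nb_fields; i++)
		if (strcmp(toy_fields[i].name, name) == 0)
			return toy_fields[i].offset;

	if (toy_nb_fields == TOY_MAX_FIELDS || strlen(name) >= TOY_NAME_LEN)
		return -1;

	strcpy(toy_fields[toy_nb_fields].name, name);
	toy_fields[toy_nb_fields].offset = toy_nb_fields * 8;
	return toy_fields[toy_nb_fields++].offset;
}
```

In this model, two components registering "myapp_dynfield_seqno" and
"mydrv_dynfield_seqno" get distinct slots, while two components that
both picked an unprefixed name such as "seqno" would silently share one.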




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH v5 13/16] ethdev: add doxygen comment for Rx timestamp API
  2020-11-03 14:09   ` [dpdk-dev] [PATCH v5 13/16] ethdev: add doxygen comment for Rx timestamp API Thomas Monjalon
@ 2020-11-03 19:07     ` Ajit Khaparde
  0 siblings, 0 replies; 170+ messages in thread
From: Ajit Khaparde @ 2020-11-03 19:07 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dpdk-dev, Ferruh Yigit, David Marchand, Bruce Richardson,
	Olivier Matz, Andrew Rybchenko, Jerin Jacob Kollanukkaran,
	Slava Ovsiienko

On Tue, Nov 3, 2020 at 6:15 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> The offload flag DEV_RX_OFFLOAD_TIMESTAMP had no documentation.
> After switching to dynamic mbuf flag and field,
> it becomes even more important to explicit the feature behaviour.
>
> A doxygen comment for the timesync API was mentioning
> the deprecated timestamp field, so it is also updated.
>
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: David Marchand <david.marchand@redhat.com>
> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

> ---
>  lib/librte_ethdev/rte_ethdev.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index e341a08817..4988054cb2 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1344,6 +1344,11 @@ struct rte_eth_conf {
>  #define DEV_RX_OFFLOAD_VLAN_EXTEND     0x00000400
>  #define DEV_RX_OFFLOAD_JUMBO_FRAME     0x00000800
>  #define DEV_RX_OFFLOAD_SCATTER         0x00002000
> +/**
> + * Timestamp is set by the driver in RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
> + * and RTE_MBUF_DYNFLAG_RX_TIMESTAMP_NAME is set in ol_flags.
> + * The mbuf field and flag are registered when the offload is configured.
> + */
>  #define DEV_RX_OFFLOAD_TIMESTAMP       0x00004000
>  #define DEV_RX_OFFLOAD_SECURITY         0x00008000
>  #define DEV_RX_OFFLOAD_KEEP_CRC                0x00010000
> @@ -4646,7 +4651,7 @@ int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
>   * rte_eth_read_clock(port, base_clock);
>   *
>   * Then, convert the raw mbuf timestamp with:
> - * base_time_sec + (double)(mbuf->timestamp - base_clock) / freq;
> + * base_time_sec + (double)(*timestamp_dynfield(mbuf) - base_clock) / freq;
>   *
>   * This simple example will not provide a very good accuracy. One must
>   * at least measure multiple times the frequency and do a regression.
> --
> 2.28.0
>
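
The conversion formula in the doxygen comment above can be written out
as a small helper. This is a sketch only: the function and parameter
names are made up, and it assumes the raw timestamp was sampled after
base_clock and that freq is the device clock frequency in Hz, measured
by the application as the comment describes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a raw device timestamp to seconds:
 *   base_time_sec + (raw - base_clock) / freq
 * Assumes raw >= base_clock (no counter wrap handling in this sketch).
 */
static double
raw_timestamp_to_sec(double base_time_sec, uint64_t base_clock,
		     uint64_t freq_hz, uint64_t raw)
{
	assert(freq_hz != 0);
	return base_time_sec + (double)(raw - base_clock) / (double)freq_hz;
}
```

For example, with a 1 GHz clock, a raw value 500000000 ticks after
base_clock maps to base_time_sec + 0.5 seconds.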

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-03 15:03                           ` Morten Brørup
@ 2020-11-04 15:00                             ` Olivier Matz
  2020-11-05  0:25                               ` Ananyev, Konstantin
  0 siblings, 1 reply; 170+ messages in thread
From: Olivier Matz @ 2020-11-04 15:00 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Slava Ovsiienko, NBU-Contact-Thomas Monjalon, dev, techboard,
	Ajit Khaparde, Ananyev, Konstantin, Andrew Rybchenko, Yigit,
	Ferruh, david.marchand, Richardson, Bruce, jerinj,
	honnappa.nagarahalli, maxime.coquelin, stephen, hemant.agrawal,
	Matan Azrad, Shahaf Shuler

Hi,

On Tue, Nov 03, 2020 at 04:03:46PM +0100, Morten Brørup wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Slava Ovsiienko
> > Sent: Tuesday, November 3, 2020 3:03 PM
> > 
> > Hi, Morten
> > 
> > > From: Morten Brørup <mb@smartsharesystems.com>
> > > Sent: Tuesday, November 3, 2020 14:10
> > >
> > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > Sent: Monday, November 2, 2020 4:58 PM
> > > >
> > > > +Cc techboard
> > > >
> > > > We need benchmark numbers in order to take a decision.
> > > > Please all, prepare some arguments and numbers so we can discuss
> > the
> > > > mbuf layout in the next techboard meeting.

I did some quick tests, and it appears to me that just moving the pool
pointer to the first cache line does not have a significant impact.

However, I agree with Morten that there is some room for optimization
around m->pool: I did a hack in the ixgbe driver to assume there is only
one mbuf pool. This greatly simplifies the freeing of mbufs in Tx, because
we don't have to group them in bulks that share the same pool (see
ixgbe_tx_free_bufs()). The impact of this hack is quite good: about +5%
on a real-life forwarding use case.

It may be possible to store the pool in the sw ring to avoid a later
access to m->pool. A pool index, as suggested by Morten, would also
help to reduce the room used in the sw ring in this case. But this is a
bit off-topic :)
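
A minimal sketch of the pool index idea, only to make the indirection
concrete. Everything here is hypothetical (the toy_* types, the global
table and its name); it is not existing mbuf or mempool code, and the
cost of the extra table lookup would still have to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_mempool {
	unsigned int free_count;   /* objects returned to this pool */
};

/* Hypothetical mbuf carrying an 8-bit pool index instead of a pointer. */
struct toy_mbuf {
	uint8_t pool_idx;
};

/* Hypothetical global table indexed by pool_idx (at most 256 pools). */
static struct toy_mempool *toy_pool_table[UINT8_MAX + 1];

/* Resolve the pool through the table: one extra load per lookup. */
static struct toy_mempool *
toy_mbuf_pool(const struct toy_mbuf *m)
{
	return toy_pool_table[m->pool_idx];
}

/* Return the mbuf to its pool (modelled as a counter increment). */
static void
toy_mbuf_free(struct toy_mbuf *m)
{
	struct toy_mempool *mp = toy_mbuf_pool(m);

	assert(mp != NULL);
	mp->free_count++;
}
```

The same 8-bit index could be what gets stored per entry in the sw ring,
instead of a 64-bit pointer.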



> > > I propose that the techboard considers this from two angles:
> > >
> > > 1. Long term goals and their relative priority. I.e. what can be
> > achieved with
> > > wide-ranging modifications, requiring yet another ABI break and due
> > notices.
> > >
> > > 2. Short term goals, i.e. what can be achieved for this release.
> > >
> > >
> > > My suggestions follow...
> > >
> > > 1. Regarding long term goals:
> > >
> > > I have argued that simple forwarding of non-segmented packets using
> > only the
> > > first mbuf cache line can be achieved by making three
> > > modifications:
> > >
> > > a) Move m->tx_offload to the first cache line.
> > Not all PMDs use this field on Tx. HW might support the checksum
> > offloads
> > directly, not requiring these fields at all.

To me, a driver should use m->tx_offload, because the application
specifies the offset where the checksum has to be done, in case the hw
is not able to recognize the protocol.

> > > b) Use an 8 bit pktmbuf mempool index in the first cache line,
> > >    instead of the 64 bit m->pool pointer in the second cache line.
> > 256 mpool looks enough, as for me. Regarding the indirect access to the
> > pool
> > (via some table) - it might introduce some performance impact.
> 
> It might, but I hope that it is negligible, so the benefits outweigh the disadvantages.
> 
> It would have to be measured, though.
> 
> And m->pool is only used for free()'ing (and detach()'ing) mbufs.
> 
> > For example,
> > mlx5 PMD strongly relies on pool field for allocating mbufs in Rx
> > datapath.
> > We're going to update (o-o, we found point to optimize), but for now it
> > does.
> 
> Without looking at the source code, I don't think the PMD is using m->pool in the RX datapath, I think it is using a pool dedicated to a receive queue used for RX descriptors in the PMD (i.e. driver->queue->pool).
> 
> > 
> > > c) Do not access m->next when we know that it is NULL.
> > >    We can use m->nb_segs == 1 or some other invariant as the gate.
> > >    It can be implemented by adding an m->next accessor function:
> > >    struct rte_mbuf * rte_mbuf_next(struct rte_mbuf * m)
> > >    {
> > >        return m->nb_segs == 1 ? NULL : m->next;
> > >    }
> > 
> > Sorry, not sure about this. IIRC, nb_segs is valid in the first
> > segment/mbuf  only.
> > If we have the 4 segments in the pkt we see nb_seg=4 in the first one,
> > and the nb_seg=1
> > in the others. The next field is NULL in the last mbuf only. Am I wrong
> > and miss something ?
> 
> You are correct.
> 
> This would have to be updated too. Either by increasing m->nb_seg in the following segments, or by splitting up relevant functions into functions for working on first segments (incl. non-segmented packets), and functions for working on following segments of segmented packets.

Instead of maintaining a valid nb_segs, a HAS_NEXT flag would be easier
to implement. However, it means that an accessor must be used instead
of any direct m->next access.
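
Roughly what I have in mind, as a sketch on a toy structure. The flag
name, its bit position and the toy_* helpers are hypothetical; nothing
like this exists in the mbuf API today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit stored in a first-cache-line field. */
#define TOY_MBUF_F_HAS_NEXT (UINT64_C(1) << 0)

struct toy_mbuf {
	uint64_t ol_flags;       /* stands in for a first-cache-line field */
	struct toy_mbuf *next;   /* stands in for a second-cache-line field */
};

/* Read accessor: m->next is only loaded when the flag says it is set. */
static inline struct toy_mbuf *
toy_mbuf_next(const struct toy_mbuf *m)
{
	return (m->ol_flags & TOY_MBUF_F_HAS_NEXT) ? m->next : NULL;
}

/* Write accessor: keeps the flag and the pointer consistent. */
static inline void
toy_mbuf_set_next(struct toy_mbuf *m, struct toy_mbuf *n)
{
	m->next = n;
	if (n != NULL)
		m->ol_flags |= TOY_MBUF_F_HAS_NEXT;
	else
		m->ol_flags &= ~TOY_MBUF_F_HAS_NEXT;
}
```

Unlike nb_segs, the flag is meaningful in every segment of a chain, so
the same accessor works for first and following segments.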

> > > Regarding the priority of this goal, I guess that simple forwarding
> > of non-
> > > segmented packets is probably the path taken by the majority of
> > packets
> > > handled by DPDK.
> > >
> > > An alternative goal could be:
> > > Do not touch the second cache line during RX.
> > > A comment in the mbuf structure says so, but it is not true anymore.
> > >
> > > (I guess that regression testing didn't catch this because the tests
> > perform TX
> > > immediately after RX, so the cache miss just moves from the TX to the
> > RX part
> > > of the test application.)
> > >
> > >
> > > 2. Regarding short term goals:
> > >
> > > The current DPDK source code looks to me like m->next is the most
> > frequently
> > > accessed field in the second cache line, so it makes sense moving
> > this to the
> > > first cache line, rather than m->pool.
> > > Benchmarking may help here.
> > 
> > Moreover, for the segmented packets the packet size is supposed to be
> > large,
> > and it imposes the relatively low packet rate, so probably optimization
> > of
> > moving next to the 1st cache line might be negligible at all. Just
> > compare 148Mpps of
> > 64B pkts and 4Mpps of 3000B pkts over 100Gbps link. Currently we are on
> > benchmarking
> > and did not succeed yet on difference finding. The benefit can't be
> > expressed in mpps delta,
> > we should measure CPU clocks, but Rx queue is almost always empty - we
> > have an empty
> > loops. So, if we have the boost - it is extremely hard to catch one.
> 
> Very good point regarding the value of such an optimization, Slava!
> 
> And when free()'ing packets, both m->next and m->pool are touched.
> 
> So perhaps the free()/detach() functions in the mbuf library can be modified to handle first segments (and non-segmented packets) and following segments differently, so accessing m->next can be avoided for non-segmented packets. Then m->pool should be moved to the first cache line.
> 

I also think that moving m->pool without doing something else about
m->next is probably useless. And it's too late for 20.11 to make
additional changes, so I suggest postponing the field move to 21.11,
once we have a clearer view of possible optimizations.

Olivier

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-04 15:00                             ` Olivier Matz
@ 2020-11-05  0:25                               ` Ananyev, Konstantin
  2020-11-05  9:04                                 ` Morten Brørup
  2020-11-05  9:35                                 ` Morten Brørup
  0 siblings, 2 replies; 170+ messages in thread
From: Ananyev, Konstantin @ 2020-11-05  0:25 UTC (permalink / raw)
  To: Olivier Matz, Morten Brørup
  Cc: Slava Ovsiienko, NBU-Contact-Thomas Monjalon, dev, techboard,
	Ajit Khaparde, Andrew Rybchenko, Yigit, Ferruh, david.marchand,
	Richardson, Bruce, jerinj, honnappa.nagarahalli, maxime.coquelin,
	stephen, hemant.agrawal, Matan Azrad, Shahaf Shuler



> 
> Hi,
> 
> On Tue, Nov 03, 2020 at 04:03:46PM +0100, Morten Brørup wrote:
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Slava Ovsiienko
> > > Sent: Tuesday, November 3, 2020 3:03 PM
> > >
> > > Hi, Morten
> > >
> > > > From: Morten Brørup <mb@smartsharesystems.com>
> > > > Sent: Tuesday, November 3, 2020 14:10
> > > >
> > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > Sent: Monday, November 2, 2020 4:58 PM
> > > > >
> > > > > +Cc techboard
> > > > >
> > > > > We need benchmark numbers in order to take a decision.
> > > > > Please all, prepare some arguments and numbers so we can discuss
> > > the
> > > > > mbuf layout in the next techboard meeting.
> 
> I did some quick tests, and it appears to me that just moving the pool
> pointer to the first cache line has not a significant impact.

Hmm, as I remember, Thomas mentioned an improvement of 5% or more
with that change. Though I suppose a lot depends on the actual test case.
It would be good to know when it helps and when it doesn't.

> 
> However, I agree with Morten that there is some room for optimization
> around m->pool: I did a hack in the ixgbe driver to assume there is only
> one mbuf pool. This simplifies a lot the freeing of mbufs in Tx, because
> we don't have to group them in bulks that shares the same pool (see
> ixgbe_tx_free_bufs()). The impact of this hack is quite good: +~5% on a
> real-life forwarding use case.

I think we already have such an optimization available within DPDK:
#define DEV_TX_OFFLOAD_MBUF_FAST_FREE   0x00010000
/**< Device supports optimization for fast release of mbufs.
 *   When set application must guarantee that per-queue all mbufs comes from
 *   the same mempool and has refcnt = 1.
 */

Seems over-optimistic to me, but many PMDs do support it.
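
To make the difference concrete, here is a toy model of the two Tx free
paths. It is not taken from any PMD: the toy_* types and functions are
made up, the bulk put is modelled as a counter, and the refcnt = 1
requirement is ignored.

```c
#include <assert.h>
#include <stddef.h>

struct toy_mempool {
	unsigned int bulk_puts;   /* number of bulk put calls */
	unsigned int objs;        /* number of objects returned */
};

struct toy_mbuf {
	struct toy_mempool *pool;
};

static void
toy_mempool_put_bulk(struct toy_mempool *mp, unsigned int n)
{
	mp->bulk_puts++;
	mp->objs += n;
}

/* General path: read m->pool of every mbuf, flush a run on pool change. */
static void
toy_tx_free_grouped(struct toy_mbuf **pkts, unsigned int n)
{
	struct toy_mempool *cur = NULL;
	unsigned int i, run = 0;

	for (i = 0; i < n; i++) {
		if (pkts[i]->pool != cur) {
			if (run > 0)
				toy_mempool_put_bulk(cur, run);
			cur = pkts[i]->pool;
			run = 0;
		}
		run++;
	}
	if (run > 0)
		toy_mempool_put_bulk(cur, run);
}

/* Fast path: the caller guarantees one pool per queue; m->pool is unread. */
static void
toy_tx_free_fast(struct toy_mempool *queue_pool, struct toy_mbuf **pkts,
		 unsigned int n)
{
	(void)pkts;
	if (n > 0)
		toy_mempool_put_bulk(queue_pool, n);
}
```

The fast path never dereferences the mbufs at all, which is where the
gain comes from, but it is only correct if the application really keeps
the single-pool guarantee.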

> 
> It is maybe possible to store the pool in the sw ring to avoid a later
> access to m->pool. Having a pool index as suggested by Morten would also
> help to reduce used room in sw ring in this case. But this is a bit
> off-topic :)
> 
> 
> 
> > > > I propose that the techboard considers this from two angles:
> > > >
> > > > 1. Long term goals and their relative priority. I.e. what can be
> > > achieved with
> > > > wide-ranging modifications, requiring yet another ABI break and due
> > > notices.
> > > >
> > > > 2. Short term goals, i.e. what can be achieved for this release.
> > > >
> > > >
> > > > My suggestions follow...
> > > >
> > > > 1. Regarding long term goals:
> > > >
> > > > I have argued that simple forwarding of non-segmented packets using
> > > only the
> > > > first mbuf cache line can be achieved by making three
> > > > modifications:
> > > >
> > > > a) Move m->tx_offload to the first cache line.
> > > Not all PMDs use this field on Tx. HW might support the checksum
> > > offloads
> > > directly, not requiring these fields at all.
> 
> To me, a driver should use m->tx_offload, because the application
> specifies the offset where the checksum has to be done, in case the hw
> is not able to recognize the protocol.
> 
> > > > b) Use an 8 bit pktmbuf mempool index in the first cache line,
> > > >    instead of the 64 bit m->pool pointer in the second cache line.
> > > 256 mpool looks enough, as for me. Regarding the indirect access to the
> > > pool
> > > (via some table) - it might introduce some performance impact.
> >
> > It might, but I hope that it is negligible, so the benefits outweigh the disadvantages.
> >
> > It would have to be measured, though.
> >
> > And m->pool is only used for free()'ing (and detach()'ing) mbufs.
> >
> > > For example,
> > > mlx5 PMD strongly relies on pool field for allocating mbufs in Rx
> > > datapath.
> > > We're going to update (o-o, we found point to optimize), but for now it
> > > does.
> >
> > Without looking at the source code, I don't think the PMD is using m->pool in the RX datapath, I think it is using a pool dedicated to a
> receive queue used for RX descriptors in the PMD (i.e. driver->queue->pool).
> >
> > >
> > > > c) Do not access m->next when we know that it is NULL.
> > > >    We can use m->nb_segs == 1 or some other invariant as the gate.
> > > >    It can be implemented by adding an m->next accessor function:
> > > >    struct rte_mbuf * rte_mbuf_next(struct rte_mbuf * m)
> > > >    {
> > > >        return m->nb_segs == 1 ? NULL : m->next;
> > > >    }
> > >
> > > Sorry, not sure about this. IIRC, nb_segs is valid in the first
> > > segment/mbuf  only.
> > > If we have the 4 segments in the pkt we see nb_seg=4 in the first one,
> > > and the nb_seg=1
> > > in the others. The next field is NULL in the last mbuf only. Am I wrong
> > > and miss something ?
> >
> > You are correct.
> >
> > This would have to be updated too. Either by increasing m->nb_seg in the following segments, or by splitting up relevant functions into
> functions for working on first segments (incl. non-segmented packets), and functions for working on following segments of segmented
> packets.
> 
> Instead of maintaining a valid nb_segs, a HAS_NEXT flag would be easier
> to implement. However it means that an accessor needs to be used instead
> of any m->next access.
> 
> > > > Regarding the priority of this goal, I guess that simple forwarding
> > > of non-
> > > > segmented packets is probably the path taken by the majority of
> > > packets
> > > > handled by DPDK.
> > > >
> > > > An alternative goal could be:
> > > > Do not touch the second cache line during RX.
> > > > A comment in the mbuf structure says so, but it is not true anymore.
> > > >
> > > > (I guess that regression testing didn't catch this because the tests
> > > perform TX
> > > > immediately after RX, so the cache miss just moves from the TX to the
> > > RX part
> > > > of the test application.)
> > > >
> > > >
> > > > 2. Regarding short term goals:
> > > >
> > > > The current DPDK source code looks to me like m->next is the most
> > > frequently
> > > > accessed field in the second cache line, so it makes sense moving
> > > this to the
> > > > first cache line, rather than m->pool.
> > > > Benchmarking may help here.
> > >
> > > Moreover, for the segmented packets the packet size is supposed to be
> > > large,
> > > and it imposes the relatively low packet rate, so probably optimization
> > > of
> > > moving next to the 1st cache line might be negligible at all. Just
> > > compare 148Mpps of
> > > 64B pkts and 4Mpps of 3000B pkts over 100Gbps link. Currently we are on
> > > benchmarking
> > > and did not succeed yet on difference finding. The benefit can't be
> > > expressed in mpps delta,
> > > we should measure CPU clocks, but Rx queue is almost always empty - we
> > > have an empty
> > > loops. So, if we have the boost - it is extremely hard to catch one.
> >
> > Very good point regarding the value of such an optimization, Slava!
> >
> > And when free()'ing packets, both m->next and m->pool are touched.
> >
> > So perhaps the free()/detach() functions in the mbuf library can be modified to handle first segments (and non-segmented packets) and
> following segments differently, so accessing m->next can be avoided for non-segmented packets. Then m->pool should be moved to the
> first cache line.
> >
> 
> I also think that Moving m->pool without doing something else about
> m->next is probably useless. And it's too late for 20.11 to do
> additionnal changes, so I suggest to postpone the field move to 21.11,
> once we have a clearer view of possible optimizations.
> 
> Olivier

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-05  0:25                               ` Ananyev, Konstantin
@ 2020-11-05  9:04                                 ` Morten Brørup
  2020-11-05  9:35                                 ` Morten Brørup
  1 sibling, 0 replies; 170+ messages in thread
From: Morten Brørup @ 2020-11-05  9:04 UTC (permalink / raw)
  To: Ananyev, Konstantin, Olivier Matz
  Cc: Slava Ovsiienko, NBU-Contact-Thomas Monjalon, dev, techboard,
	Ajit Khaparde, Andrew Rybchenko, Yigit, Ferruh, david.marchand,
	Richardson, Bruce, jerinj, honnappa.nagarahalli, maxime.coquelin,
	stephen, hemant.agrawal, Matan Azrad, Shahaf Shuler

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev,
> Konstantin
> Sent: Thursday, November 5, 2020 1:26 AM
> 
> >
> > Hi,
> >
> > On Tue, Nov 03, 2020 at 04:03:46PM +0100, Morten Brørup wrote:
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Slava
> Ovsiienko
> > > > Sent: Tuesday, November 3, 2020 3:03 PM
> > > >
> > > > Hi, Morten
> > > >
> > > > > From: Morten Brørup <mb@smartsharesystems.com>
> > > > > Sent: Tuesday, November 3, 2020 14:10
> > > > >
> > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > > Sent: Monday, November 2, 2020 4:58 PM
> > > > > >
> > > > > > +Cc techboard
> > > > > >
> > > > > > We need benchmark numbers in order to take a decision.
> > > > > > Please all, prepare some arguments and numbers so we can
> discuss
> > > > the
> > > > > > mbuf layout in the next techboard meeting.
> >
> > I did some quick tests, and it appears to me that just moving the
> pool
> > pointer to the first cache line has not a significant impact.
> 
> Hmm, as I remember Thomas mentioned about 5%+ improvement
> with that change. Though I suppose a lot depends from actual test-case.
> Would be good to know when it does help and when it doesn't.
> 
> >
> > However, I agree with Morten that there is some room for optimization
> > around m->pool: I did a hack in the ixgbe driver to assume there is
> only
> > one mbuf pool. This simplifies a lot the freeing of mbufs in Tx,
> because
> > we don't have to group them in bulks that shares the same pool (see
> > ixgbe_tx_free_bufs()). The impact of this hack is quite good: +~5% on
> a
> > real-life forwarding use case.
> 
> I think we already have such optimization ability within DPDK:
> #define DEV_TX_OFFLOAD_MBUF_FAST_FREE   0x00010000
> /**< Device supports optimization for fast release of mbufs.
>  *   When set application must guarantee that per-queue all mbufs comes
> from
>  *   the same mempool and has refcnt = 1.
>  */
> 
> Seems over-optimistic to me, but many PMDs do support it.

Looking at a few drivers that use this flag: the Intel drivers use rte_mempool_put(m->pool), and thus still read the second cache line. Only ThunderX seems to take advantage of the optimization and use rte_mempool_put_bulk(q->pool).

I would rather see a generic optimization of free()'ing non-segmented packets in the mbuf library, where free() and free_seg() take advantage of knowing whether they are working on the first segment or not - like the is_header indicator in many of the mbuf check functions - and, when working on the first segment, gate access to m->next by m->nb_segs > 1.

Concept:

static inline void
rte_pktmbuf_free(struct rte_mbuf *m)
{
    struct rte_mbuf *m_next;

    /* NOTE: Sanity check of header has moved to __rte_pktmbuf_prefree_seg(). */

    if (m != NULL) {
        if (m->nb_segs == 1) {
            __rte_pktmbuf_free_seg(m, 1);
        } else {
            m_next = m->next;
            __rte_pktmbuf_free_seg(m, 1);
            m = m_next;
            while (m != NULL) {
                m_next = m->next;
                __rte_pktmbuf_free_seg(m, 0);
                m = m_next;
            }
        }
    }
}

static __rte_always_inline void
__rte_pktmbuf_free_seg(struct rte_mbuf *m, int is_header)
{
    m = __rte_pktmbuf_prefree_seg(m, is_header);
    if (likely(m != NULL))
        rte_mbuf_raw_free(m);
}

static __rte_always_inline struct rte_mbuf *
__rte_pktmbuf_prefree_seg(struct rte_mbuf *m, int is_header)
{
    __rte_mbuf_sanity_check(m, is_header);

    if (likely(rte_mbuf_refcnt_read(m) == 1)) {

        if (!RTE_MBUF_DIRECT(m)) {
            rte_pktmbuf_detach(m);
            if (RTE_MBUF_HAS_EXTBUF(m) &&
                RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
                __rte_pktmbuf_pinned_extbuf_decref(m))
                return NULL;
        }

        if (is_header && m->nb_segs == 1)
            return m; /* NOTE: Avoid touching (writing to) the second cache line. */

        if (m->next != NULL) {
            m->next = NULL;
            m->nb_segs = 1;
        }

        return m;

    } else if (__rte_mbuf_refcnt_update(m, -1) == 0) {

        if (!RTE_MBUF_DIRECT(m)) {
            rte_pktmbuf_detach(m);
            if (RTE_MBUF_HAS_EXTBUF(m) &&
                RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
                __rte_pktmbuf_pinned_extbuf_decref(m))
                return NULL;
        }

        if (is_header && m->nb_segs == 1) {
            /* NOTE: Avoid touching (writing to) the second cache line. */
            rte_mbuf_refcnt_set(m, 1);
            return m;
        }

        if (m->next != NULL) {
            m->next = NULL;
            m->nb_segs = 1;
        }
        rte_mbuf_refcnt_set(m, 1);

        return m;
    }
    return NULL;
}

Furthermore, if m->pool is moved to the first cache line, this concept might provide an additional performance improvement, because rte_mempool_put() in rte_mbuf_raw_free() would not have to touch the second cache line either.
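
As a rough illustration of what "first cache line" means here, the following sketch computes which 64-byte line a field falls into. The two toy layouts and the TOY_* macros are hypothetical and far simpler than the real struct rte_mbuf; a 64-byte cache line and a cache-line-aligned struct are assumed:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_LINE_SIZE 64

/* Index of the cache line holding a field, assuming the struct itself
 * starts on a cache-line boundary. */
#define TOY_CACHE_LINE_OF(type, field) \
    (offsetof(type, field) / TOY_CACHE_LINE_SIZE)

/* Toy layout with the pool pointer behind 64 bytes of hot fields. */
struct toy_mbuf_before {
    unsigned char hot[64];
    void *pool;             /* cache line 1 */
    void *next;
};

/* Toy layout with 8 hot bytes freed up, so the pool pointer moves forward. */
struct toy_mbuf_after {
    unsigned char hot[56];
    void *pool;             /* cache line 0 */
    void *next;
};

/* Compile-time guard: the build fails if pool leaves the first line. */
static_assert(TOY_CACHE_LINE_OF(struct toy_mbuf_after, pool) == 0,
              "pool must stay in the first cache line");
```

A compile-time check like this is one way to keep such a layout property from regressing silently.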



* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-05  0:25                               ` Ananyev, Konstantin
  2020-11-05  9:04                                 ` Morten Brørup
@ 2020-11-05  9:35                                 ` Morten Brørup
  2020-11-05 10:29                                   ` Bruce Richardson
  1 sibling, 1 reply; 170+ messages in thread
From: Morten Brørup @ 2020-11-05  9:35 UTC (permalink / raw)
  To: Ananyev, Konstantin, Olivier Matz
  Cc: Slava Ovsiienko, NBU-Contact-Thomas Monjalon, dev, techboard,
	Ajit Khaparde, Andrew Rybchenko, Yigit, Ferruh, david.marchand,
	Richardson, Bruce, jerinj, honnappa.nagarahalli, maxime.coquelin,
	stephen, hemant.agrawal, Matan Azrad, Shahaf Shuler

There is a simple alternative for applications with a single mbuf pool to avoid accessing m->pool.

We could add a global variable pointing to the single mbuf pool.

It would be NULL by default.

It would be set by rte_pktmbuf_pool_create() on the first invocation, and reset back to NULL on any following invocation. (There would need to be a counter too, to prevent setting it again on the third invocation.)

All functions accessing m->pool would use the global mbuf pool pointer if set, and otherwise use the m->pool pointer, like this:

- rte_mempool_put(m->pool, m);
+ rte_mempool_put(global_mbuf_pool ? global_mbuf_pool : m->pool, m);

This optimization can be implemented without ABI breakage:

Since m->pool is initialized as always, functions that are not modified to use the global_mbuf_pool will simply continue using m->pool, not knowing that a global mbuf pool exists.
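
A minimal sketch of the mechanism, using toy stand-ins rather than the real DPDK types. Only global_mbuf_pool is named above; the counter and all toy_* names are hypothetical, and thread safety during pool creation is ignored:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; "puts" just counts objects returned to the pool. */
struct toy_pool {
    unsigned int puts;
};

struct toy_mbuf {
    struct toy_pool *pool;  /* still initialized as always */
};

static struct toy_pool *global_mbuf_pool;   /* NULL by default */
static unsigned int toy_pool_create_count;  /* hypothetical counter */

/* Would be called from pool creation: publish the first pool and
 * withdraw it for good as soon as a second pool exists. */
static void
toy_register_pool(struct toy_pool *p)
{
    assert(p != NULL);
    toy_pool_create_count++;
    global_mbuf_pool = (toy_pool_create_count == 1) ? p : NULL;
}

/* Use the global pointer if set; otherwise fall back to m->pool. */
static struct toy_pool *
toy_pool_of(const struct toy_mbuf *m)
{
    return global_mbuf_pool ? global_mbuf_pool : m->pool;
}

static void
toy_raw_free(struct toy_mbuf *m)
{
    toy_pool_of(m)->puts++;
}
```

With one pool registered, toy_raw_free() never reads m->pool. After a second registration the global pointer stays NULL, including on the third and later calls, and every free falls back to m->pool.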


Med venlig hilsen / kind regards
- Morten Brørup


* Re: [dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
  2020-11-05  9:35                                 ` Morten Brørup
@ 2020-11-05 10:29                                   ` Bruce Richardson
  0 siblings, 0 replies; 170+ messages in thread
From: Bruce Richardson @ 2020-11-05 10:29 UTC (