DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API
@ 2019-03-27  6:19 Tom Barbette
  2019-03-27  6:19 ` Tom Barbette
                   ` (4 more replies)
  0 siblings, 5 replies; 32+ messages in thread
From: Tom Barbette @ 2019-03-27  6:19 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Tom Barbette

Some NICs allow to timestamp packets, but do not support the full
PTP synchronization process. Hence, the value set in the mbuf
timestamp field is only the raw value of an internal clock.

To make sense of this value, one at least needs to be able to query
the current hardware clock value. As with the TSC, from there
a frequency can be derieved by querying multiple time the current value of the
internal clock with some known delay between the queries (example
provided in the API doc).

This patch series adds support for MLX5.

An example app is provided in the rxtx_callback application.
It has been updated to display, on top of the software latency
in cycles, the total latency since the packet was received in hardware.
The API is used to compute a delta in the TX callback. The raw amount of
ticks is converted to cycles using a variation of the technique describe above.

Aside from offloading timestamping, which relieve the
software from a few operations, this allows to get much more precision
when studying the source of the latency in a system.
Eg. in our 100G, CX5 setup the rxtx callback application shows
SW latency is around 74 cycles (TSC is 3.2Ghz), but the latency
including NIC processing, PCIe, and queuing is around 196 cycles.

One may think at first this API is overlapping with te_eth_timesync_read_time.
rte_eth_timesync_read_time is clearly identified as part of a set of functions
to use PTP synchronization.
The device raw clock is not "sync" in any way. More importantly, the returned
value is not a timeval, but an amount of ticks. We could have a cast-based
solution, but on top of being an ugly solution, some people seeing the timeval
type of rte_eth_timesync_read_time could use it blindly.

Change in v2:
  - Rebase on current master

Tom Barbette (3):
  rte_ethdev: Add API function to read dev clock
  mlx5: Implement support for read_clock
  rxtx_callbacks: Add support for HW timestamp

 doc/guides/nics/features.rst                |  1 +
 doc/guides/sample_app_ug/rxtx_callbacks.rst |  9 ++-
 drivers/net/mlx5/mlx5.c                     |  1 +
 drivers/net/mlx5/mlx5.h                     |  1 +
 drivers/net/mlx5/mlx5_ethdev.c              | 29 +++++++
 drivers/net/mlx5/mlx5_glue.c                |  8 ++
 drivers/net/mlx5/mlx5_glue.h                |  2 +
 examples/rxtx_callbacks/Makefile            |  2 +
 examples/rxtx_callbacks/main.c              | 86 ++++++++++++++++++++-
 examples/rxtx_callbacks/meson.build         |  1 +
 lib/librte_ethdev/rte_ethdev.c              | 13 ++++
 lib/librte_ethdev/rte_ethdev.h              | 44 +++++++++++
 lib/librte_ethdev/rte_ethdev_core.h         |  6 ++
 lib/librte_ethdev/rte_ethdev_version.map    |  1 +
 lib/librte_mbuf/rte_mbuf.h                  |  2 +
 15 files changed, 201 insertions(+), 5 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API
  2019-03-27  6:19 [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Tom Barbette
@ 2019-03-27  6:19 ` Tom Barbette
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 1/3] rte_ethdev: Add API function to read dev clock Tom Barbette
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 32+ messages in thread
From: Tom Barbette @ 2019-03-27  6:19 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Tom Barbette

Some NICs allow to timestamp packets, but do not support the full
PTP synchronization process. Hence, the value set in the mbuf
timestamp field is only the raw value of an internal clock.

To make sense of this value, one at least needs to be able to query
the current hardware clock value. As with the TSC, from there
a frequency can be derieved by querying multiple time the current value of the
internal clock with some known delay between the queries (example
provided in the API doc).

This patch series adds support for MLX5.

An example app is provided in the rxtx_callback application.
It has been updated to display, on top of the software latency
in cycles, the total latency since the packet was received in hardware.
The API is used to compute a delta in the TX callback. The raw amount of
ticks is converted to cycles using a variation of the technique describe above.

Aside from offloading timestamping, which relieve the
software from a few operations, this allows to get much more precision
when studying the source of the latency in a system.
Eg. in our 100G, CX5 setup the rxtx callback application shows
SW latency is around 74 cycles (TSC is 3.2Ghz), but the latency
including NIC processing, PCIe, and queuing is around 196 cycles.

One may think at first this API is overlapping with te_eth_timesync_read_time.
rte_eth_timesync_read_time is clearly identified as part of a set of functions
to use PTP synchronization.
The device raw clock is not "sync" in any way. More importantly, the returned
value is not a timeval, but an amount of ticks. We could have a cast-based
solution, but on top of being an ugly solution, some people seeing the timeval
type of rte_eth_timesync_read_time could use it blindly.

Change in v2:
  - Rebase on current master

Tom Barbette (3):
  rte_ethdev: Add API function to read dev clock
  mlx5: Implement support for read_clock
  rxtx_callbacks: Add support for HW timestamp

 doc/guides/nics/features.rst                |  1 +
 doc/guides/sample_app_ug/rxtx_callbacks.rst |  9 ++-
 drivers/net/mlx5/mlx5.c                     |  1 +
 drivers/net/mlx5/mlx5.h                     |  1 +
 drivers/net/mlx5/mlx5_ethdev.c              | 29 +++++++
 drivers/net/mlx5/mlx5_glue.c                |  8 ++
 drivers/net/mlx5/mlx5_glue.h                |  2 +
 examples/rxtx_callbacks/Makefile            |  2 +
 examples/rxtx_callbacks/main.c              | 86 ++++++++++++++++++++-
 examples/rxtx_callbacks/meson.build         |  1 +
 lib/librte_ethdev/rte_ethdev.c              | 13 ++++
 lib/librte_ethdev/rte_ethdev.h              | 44 +++++++++++
 lib/librte_ethdev/rte_ethdev_core.h         |  6 ++
 lib/librte_ethdev/rte_ethdev_version.map    |  1 +
 lib/librte_mbuf/rte_mbuf.h                  |  2 +
 15 files changed, 201 insertions(+), 5 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [dpdk-dev] [PATCH v2 1/3] rte_ethdev: Add API function to read dev clock
  2019-03-27  6:19 [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Tom Barbette
  2019-03-27  6:19 ` Tom Barbette
@ 2019-03-27  6:19 ` Tom Barbette
  2019-03-27  6:19   ` Tom Barbette
  2019-04-02 17:46   ` Ferruh Yigit
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock Tom Barbette
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 32+ messages in thread
From: Tom Barbette @ 2019-03-27  6:19 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Tom Barbette

Add rte_eth_read_clock to read the raw clock of a devide.

The main use is to get the device clock conversion co-efficients to be
able to translate the raw clock of the timestamp field of the pkt mbuf
to a local synced time value.

This function was missing to allow users to convert the RX timestamp field
to real time without the complexity of the rte_timesync* facility. One can
derivate the clock frequency by calling twice read_clock and then keep a
common time base.

Signed-off-by: Tom Barbette <barbette@kth.se>
---
 doc/guides/nics/features.rst             |  1 +
 lib/librte_ethdev/rte_ethdev.c           | 13 +++++++
 lib/librte_ethdev/rte_ethdev.h           | 44 ++++++++++++++++++++++++
 lib/librte_ethdev/rte_ethdev_core.h      |  6 ++++
 lib/librte_ethdev/rte_ethdev_version.map |  1 +
 lib/librte_mbuf/rte_mbuf.h               |  2 ++
 6 files changed, 67 insertions(+)

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index c5bf32222..1e6adfb0b 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -602,6 +602,7 @@ Supports Timestamp.
 * **[provides] mbuf**: ``mbuf.ol_flags:PKT_RX_TIMESTAMP``.
 * **[provides] mbuf**: ``mbuf.timestamp``.
 * **[provides] rte_eth_dev_info**: ``rx_offload_capa,rx_queue_offload_capa: DEV_RX_OFFLOAD_TIMESTAMP``.
+* **[implements] eth_dev_ops**: ``read_clock``.
 
 .. _nic_features_macsec_offload:
 
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 85c179496..7de0f35b5 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -4127,6 +4127,19 @@ rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *timestamp)
 								timestamp));
 }
 
+int
+rte_eth_read_clock(uint16_t port_id, uint64_t *timestamp)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->read_clock, -ENOTSUP);
+	return eth_err(port_id, (*dev->dev_ops->read_clock)(dev,
+								timestamp));
+}
+
 int
 rte_eth_dev_get_reg_info(uint16_t port_id, struct rte_dev_reg_info *info)
 {
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index a3c864a13..e254da8b1 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -3642,6 +3642,50 @@ int rte_eth_timesync_read_time(uint16_t port_id, struct timespec *time);
  */
 int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Read the current clock counter of an Ethernet device
+ *
+ * This returns the current raw clock value of an Ethernet device.
+ * The value returned here is from the same clock than the one
+ * filling timestamp field of RX packets. Therefore it can be used
+ * to compute a precise conversion of the device clock to the real time.
+ *
+ * E.g, a simple heuristic to derivate the frequency would be:
+ * uint64_t start, end;
+ * rte_eth_read_clock(port, start);
+ * rte_delay_ms(100);
+ * rte_eth_read_clock(port, end);
+ * double freq = (end - start) * 10;
+ *
+ * Compute a common reference with:
+ * uint64_t base_time_sec = current_time();
+ * uint64_t base_clock;
+ * rte_eth_read_clock(port, base_clock);
+ *
+ * Then, convert the raw mbuf timestamp with:
+ * base_time_sec + (double)(mbuf->timestamp - base_clock) / freq;
+ *
+ * This simple example will not provide a very good accuracy. One must
+ * at least measure multiple times the frequency and do a regression.
+ * To avoid deviation from the system time, the common reference can
+ * be repeated from time to time. The integer division can also be
+ * converted by a multiplication and a shift for better performance.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param time
+ *   Pointer to the uint64_t that holds the raw clock value.
+ *
+ * @return
+ *   - 0: Success.
+ *   - -ENODEV: The port ID is invalid.
+ *   - -ENOTSUP: The function is not supported by the Ethernet driver.
+ */
+int __rte_experimental rte_eth_read_clock(uint16_t port_id, uint64_t *time);
+
 /**
  * Config l2 tunnel ether type of an Ethernet device for filtering specific
  * tunnel packets by ether type.
diff --git a/lib/librte_ethdev/rte_ethdev_core.h b/lib/librte_ethdev/rte_ethdev_core.h
index 8f03f83f6..86806b3eb 100644
--- a/lib/librte_ethdev/rte_ethdev_core.h
+++ b/lib/librte_ethdev/rte_ethdev_core.h
@@ -322,6 +322,10 @@ typedef int (*eth_timesync_write_time)(struct rte_eth_dev *dev,
 				       const struct timespec *timestamp);
 /**< @internal Function used to get time from the device clock */
 
+typedef int (*eth_read_clock)(struct rte_eth_dev *dev,
+				      uint64_t *timestamp);
+/**< @internal Function used to get the current value of the device clock. */
+
 typedef int (*eth_get_reg_t)(struct rte_eth_dev *dev,
 				struct rte_dev_reg_info *info);
 /**< @internal Retrieve registers  */
@@ -496,6 +500,8 @@ struct eth_dev_ops {
 	eth_timesync_read_time     timesync_read_time; /** Get the device clock time. */
 	eth_timesync_write_time    timesync_write_time; /** Set the device clock time. */
 
+	eth_read_clock             read_clock;
+
 	eth_xstats_get_by_id_t     xstats_get_by_id;
 	/**< Get extended device statistic values by ID. */
 	eth_xstats_get_names_by_id_t xstats_get_names_by_id;
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index 92ac3de25..12d6c3c1d 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -249,6 +249,7 @@ EXPERIMENTAL {
 	rte_eth_switch_domain_free;
 	rte_flow_conv;
 	rte_flow_expand_rss;
+	rte_eth_read_clock;
 	rte_mtr_capabilities_get;
 	rte_mtr_create;
 	rte_mtr_destroy;
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index d961ccaf6..dd55b942c 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -619,6 +619,8 @@ struct rte_mbuf {
 
 	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
 	 * are not normalized but are always the same for a given port.
+	 * Some devices allow to query rte_eth_read_clock that will return the
+	 * current device timestamp.
 	 */
 	uint64_t timestamp;
 
-- 
2.17.1

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [dpdk-dev] [PATCH v2 1/3] rte_ethdev: Add API function to read dev clock
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 1/3] rte_ethdev: Add API function to read dev clock Tom Barbette
@ 2019-03-27  6:19   ` Tom Barbette
  2019-04-02 17:46   ` Ferruh Yigit
  1 sibling, 0 replies; 32+ messages in thread
From: Tom Barbette @ 2019-03-27  6:19 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Tom Barbette

Add rte_eth_read_clock to read the raw clock of a devide.

The main use is to get the device clock conversion co-efficients to be
able to translate the raw clock of the timestamp field of the pkt mbuf
to a local synced time value.

This function was missing to allow users to convert the RX timestamp field
to real time without the complexity of the rte_timesync* facility. One can
derivate the clock frequency by calling twice read_clock and then keep a
common time base.

Signed-off-by: Tom Barbette <barbette@kth.se>
---
 doc/guides/nics/features.rst             |  1 +
 lib/librte_ethdev/rte_ethdev.c           | 13 +++++++
 lib/librte_ethdev/rte_ethdev.h           | 44 ++++++++++++++++++++++++
 lib/librte_ethdev/rte_ethdev_core.h      |  6 ++++
 lib/librte_ethdev/rte_ethdev_version.map |  1 +
 lib/librte_mbuf/rte_mbuf.h               |  2 ++
 6 files changed, 67 insertions(+)

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index c5bf32222..1e6adfb0b 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -602,6 +602,7 @@ Supports Timestamp.
 * **[provides] mbuf**: ``mbuf.ol_flags:PKT_RX_TIMESTAMP``.
 * **[provides] mbuf**: ``mbuf.timestamp``.
 * **[provides] rte_eth_dev_info**: ``rx_offload_capa,rx_queue_offload_capa: DEV_RX_OFFLOAD_TIMESTAMP``.
+* **[implements] eth_dev_ops**: ``read_clock``.
 
 .. _nic_features_macsec_offload:
 
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 85c179496..7de0f35b5 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -4127,6 +4127,19 @@ rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *timestamp)
 								timestamp));
 }
 
+int
+rte_eth_read_clock(uint16_t port_id, uint64_t *timestamp)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->read_clock, -ENOTSUP);
+	return eth_err(port_id, (*dev->dev_ops->read_clock)(dev,
+								timestamp));
+}
+
 int
 rte_eth_dev_get_reg_info(uint16_t port_id, struct rte_dev_reg_info *info)
 {
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index a3c864a13..e254da8b1 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -3642,6 +3642,50 @@ int rte_eth_timesync_read_time(uint16_t port_id, struct timespec *time);
  */
 int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Read the current clock counter of an Ethernet device
+ *
+ * This returns the current raw clock value of an Ethernet device.
+ * The value returned here is from the same clock than the one
+ * filling timestamp field of RX packets. Therefore it can be used
+ * to compute a precise conversion of the device clock to the real time.
+ *
+ * E.g, a simple heuristic to derivate the frequency would be:
+ * uint64_t start, end;
+ * rte_eth_read_clock(port, start);
+ * rte_delay_ms(100);
+ * rte_eth_read_clock(port, end);
+ * double freq = (end - start) * 10;
+ *
+ * Compute a common reference with:
+ * uint64_t base_time_sec = current_time();
+ * uint64_t base_clock;
+ * rte_eth_read_clock(port, base_clock);
+ *
+ * Then, convert the raw mbuf timestamp with:
+ * base_time_sec + (double)(mbuf->timestamp - base_clock) / freq;
+ *
+ * This simple example will not provide a very good accuracy. One must
+ * at least measure multiple times the frequency and do a regression.
+ * To avoid deviation from the system time, the common reference can
+ * be repeated from time to time. The integer division can also be
+ * converted by a multiplication and a shift for better performance.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param time
+ *   Pointer to the uint64_t that holds the raw clock value.
+ *
+ * @return
+ *   - 0: Success.
+ *   - -ENODEV: The port ID is invalid.
+ *   - -ENOTSUP: The function is not supported by the Ethernet driver.
+ */
+int __rte_experimental rte_eth_read_clock(uint16_t port_id, uint64_t *time);
+
 /**
  * Config l2 tunnel ether type of an Ethernet device for filtering specific
  * tunnel packets by ether type.
diff --git a/lib/librte_ethdev/rte_ethdev_core.h b/lib/librte_ethdev/rte_ethdev_core.h
index 8f03f83f6..86806b3eb 100644
--- a/lib/librte_ethdev/rte_ethdev_core.h
+++ b/lib/librte_ethdev/rte_ethdev_core.h
@@ -322,6 +322,10 @@ typedef int (*eth_timesync_write_time)(struct rte_eth_dev *dev,
 				       const struct timespec *timestamp);
 /**< @internal Function used to get time from the device clock */
 
+typedef int (*eth_read_clock)(struct rte_eth_dev *dev,
+				      uint64_t *timestamp);
+/**< @internal Function used to get the current value of the device clock. */
+
 typedef int (*eth_get_reg_t)(struct rte_eth_dev *dev,
 				struct rte_dev_reg_info *info);
 /**< @internal Retrieve registers  */
@@ -496,6 +500,8 @@ struct eth_dev_ops {
 	eth_timesync_read_time     timesync_read_time; /** Get the device clock time. */
 	eth_timesync_write_time    timesync_write_time; /** Set the device clock time. */
 
+	eth_read_clock             read_clock;
+
 	eth_xstats_get_by_id_t     xstats_get_by_id;
 	/**< Get extended device statistic values by ID. */
 	eth_xstats_get_names_by_id_t xstats_get_names_by_id;
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index 92ac3de25..12d6c3c1d 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -249,6 +249,7 @@ EXPERIMENTAL {
 	rte_eth_switch_domain_free;
 	rte_flow_conv;
 	rte_flow_expand_rss;
+	rte_eth_read_clock;
 	rte_mtr_capabilities_get;
 	rte_mtr_create;
 	rte_mtr_destroy;
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index d961ccaf6..dd55b942c 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -619,6 +619,8 @@ struct rte_mbuf {
 
 	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
 	 * are not normalized but are always the same for a given port.
+	 * Some devices allow to query rte_eth_read_clock that will return the
+	 * current device timestamp.
 	 */
 	uint64_t timestamp;
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock
  2019-03-27  6:19 [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Tom Barbette
  2019-03-27  6:19 ` Tom Barbette
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 1/3] rte_ethdev: Add API function to read dev clock Tom Barbette
@ 2019-03-27  6:19 ` Tom Barbette
  2019-03-27  6:19   ` Tom Barbette
                     ` (3 more replies)
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 3/3] rxtx_callbacks: Add support for HW timestamp Tom Barbette
  2019-03-27 14:41 ` [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Stephen Hemminger
  4 siblings, 4 replies; 32+ messages in thread
From: Tom Barbette @ 2019-03-27  6:19 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Tom Barbette

Signed-off-by: Tom Barbette <barbette@kth.se>
---
 drivers/net/mlx5/mlx5.c        |  1 +
 drivers/net/mlx5/mlx5.h        |  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 29 +++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_glue.c   |  8 ++++++++
 drivers/net/mlx5/mlx5_glue.h   |  2 ++
 5 files changed, 41 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index ae4b71695..7091ff4bb 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -376,6 +376,7 @@ const struct eth_dev_ops mlx5_dev_ops = {
 	.xstats_get_names = mlx5_xstats_get_names,
 	.fw_version_get = mlx5_fw_version_get,
 	.dev_infos_get = mlx5_dev_infos_get,
+	.read_clock = mlx5_read_clock,
 	.dev_supported_ptypes_get = mlx5_dev_supported_ptypes_get,
 	.vlan_filter_set = mlx5_vlan_filter_set,
 	.rx_queue_setup = mlx5_rx_queue_setup,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 538445367..88394f391 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -275,6 +275,7 @@ int mlx5_set_flags(struct rte_eth_dev *dev, unsigned int keep,
 		   unsigned int flags);
 int mlx5_dev_configure(struct rte_eth_dev *dev);
 void mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info);
+int mlx5_read_clock(struct rte_eth_dev *dev, uint64_t *time);
 int mlx5_fw_version_get(struct rte_eth_dev *dev, char *fw_ver, size_t fw_size);
 const uint32_t *mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev);
 int mlx5_link_update(struct rte_eth_dev *dev, int wait_to_complete);
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index f84f7cf69..52262ee44 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -557,6 +557,35 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 	}
 }
 
+/**
+ * Get device current raw clock counter
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param[out] time
+ *   Current raw clock counter of the device.
+ *
+ * @return
+ *   0 if the clock has correctly been read
+ *   The value of errno in case of error
+ */
+int
+mlx5_read_clock(struct rte_eth_dev *dev, uint64_t *clock)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct ibv_values_ex values;
+	int err = 0;
+
+	values.comp_mask = IBV_VALUES_MASK_RAW_CLOCK;
+	err = mlx5_glue->query_rt_values_ex(priv->ctx, &values);
+	if (err != 0) {
+		DRV_LOG(WARNING, "Could not query the clock !");
+		return err;
+	}
+	*clock = values.raw_clock.tv_nsec;
+	return 0;
+}
+
 /**
  * Get firmware version of a device.
  *
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index c817d86c5..c1786c5e9 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -86,6 +86,13 @@ mlx5_glue_query_device_ex(struct ibv_context *context,
 	return ibv_query_device_ex(context, input, attr);
 }
 
+static int
+mlx5_glue_query_rt_values_ex(struct ibv_context *context,
+			  struct ibv_values_ex *values)
+{
+	return ibv_query_rt_values_ex(context, values);
+}
+
 static int
 mlx5_glue_query_port(struct ibv_context *context, uint8_t port_num,
 		     struct ibv_port_attr *port_attr)
@@ -603,6 +610,7 @@ const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
 	.close_device = mlx5_glue_close_device,
 	.query_device = mlx5_glue_query_device,
 	.query_device_ex = mlx5_glue_query_device_ex,
+	.query_rt_values_ex = mlx5_glue_query_rt_values_ex,
 	.query_port = mlx5_glue_query_port,
 	.create_comp_channel = mlx5_glue_create_comp_channel,
 	.destroy_comp_channel = mlx5_glue_destroy_comp_channel,
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index b11896062..fd651cc03 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -74,6 +74,8 @@ struct mlx5_glue {
 	int (*query_device_ex)(struct ibv_context *context,
 			       const struct ibv_query_device_ex_input *input,
 			       struct ibv_device_attr_ex *attr);
+	int (*query_rt_values_ex)(struct ibv_context *context,
+			       struct ibv_values_ex *values);
 	int (*query_port)(struct ibv_context *context, uint8_t port_num,
 			  struct ibv_port_attr *port_attr);
 	struct ibv_comp_channel *(*create_comp_channel)
-- 
2.17.1

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock Tom Barbette
@ 2019-03-27  6:19   ` Tom Barbette
  2019-04-02 17:47   ` Ferruh Yigit
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 32+ messages in thread
From: Tom Barbette @ 2019-03-27  6:19 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Tom Barbette

Signed-off-by: Tom Barbette <barbette@kth.se>
---
 drivers/net/mlx5/mlx5.c        |  1 +
 drivers/net/mlx5/mlx5.h        |  1 +
 drivers/net/mlx5/mlx5_ethdev.c | 29 +++++++++++++++++++++++++++++
 drivers/net/mlx5/mlx5_glue.c   |  8 ++++++++
 drivers/net/mlx5/mlx5_glue.h   |  2 ++
 5 files changed, 41 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index ae4b71695..7091ff4bb 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -376,6 +376,7 @@ const struct eth_dev_ops mlx5_dev_ops = {
 	.xstats_get_names = mlx5_xstats_get_names,
 	.fw_version_get = mlx5_fw_version_get,
 	.dev_infos_get = mlx5_dev_infos_get,
+	.read_clock = mlx5_read_clock,
 	.dev_supported_ptypes_get = mlx5_dev_supported_ptypes_get,
 	.vlan_filter_set = mlx5_vlan_filter_set,
 	.rx_queue_setup = mlx5_rx_queue_setup,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 538445367..88394f391 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -275,6 +275,7 @@ int mlx5_set_flags(struct rte_eth_dev *dev, unsigned int keep,
 		   unsigned int flags);
 int mlx5_dev_configure(struct rte_eth_dev *dev);
 void mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info);
+int mlx5_read_clock(struct rte_eth_dev *dev, uint64_t *time);
 int mlx5_fw_version_get(struct rte_eth_dev *dev, char *fw_ver, size_t fw_size);
 const uint32_t *mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev);
 int mlx5_link_update(struct rte_eth_dev *dev, int wait_to_complete);
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index f84f7cf69..52262ee44 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -557,6 +557,35 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 	}
 }
 
+/**
+ * Get device current raw clock counter
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param[out] time
+ *   Current raw clock counter of the device.
+ *
+ * @return
+ *   0 if the clock has correctly been read
+ *   The value of errno in case of error
+ */
+int
+mlx5_read_clock(struct rte_eth_dev *dev, uint64_t *clock)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct ibv_values_ex values;
+	int err = 0;
+
+	values.comp_mask = IBV_VALUES_MASK_RAW_CLOCK;
+	err = mlx5_glue->query_rt_values_ex(priv->ctx, &values);
+	if (err != 0) {
+		DRV_LOG(WARNING, "Could not query the clock !");
+		return err;
+	}
+	*clock = values.raw_clock.tv_nsec;
+	return 0;
+}
+
 /**
  * Get firmware version of a device.
  *
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index c817d86c5..c1786c5e9 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -86,6 +86,13 @@ mlx5_glue_query_device_ex(struct ibv_context *context,
 	return ibv_query_device_ex(context, input, attr);
 }
 
+static int
+mlx5_glue_query_rt_values_ex(struct ibv_context *context,
+			  struct ibv_values_ex *values)
+{
+	return ibv_query_rt_values_ex(context, values);
+}
+
 static int
 mlx5_glue_query_port(struct ibv_context *context, uint8_t port_num,
 		     struct ibv_port_attr *port_attr)
@@ -603,6 +610,7 @@ const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
 	.close_device = mlx5_glue_close_device,
 	.query_device = mlx5_glue_query_device,
 	.query_device_ex = mlx5_glue_query_device_ex,
+	.query_rt_values_ex = mlx5_glue_query_rt_values_ex,
 	.query_port = mlx5_glue_query_port,
 	.create_comp_channel = mlx5_glue_create_comp_channel,
 	.destroy_comp_channel = mlx5_glue_destroy_comp_channel,
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index b11896062..fd651cc03 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -74,6 +74,8 @@ struct mlx5_glue {
 	int (*query_device_ex)(struct ibv_context *context,
 			       const struct ibv_query_device_ex_input *input,
 			       struct ibv_device_attr_ex *attr);
+	int (*query_rt_values_ex)(struct ibv_context *context,
+			       struct ibv_values_ex *values);
 	int (*query_port)(struct ibv_context *context, uint8_t port_num,
 			  struct ibv_port_attr *port_attr);
 	struct ibv_comp_channel *(*create_comp_channel)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [dpdk-dev] [PATCH v2 3/3] rxtx_callbacks: Add support for HW timestamp
  2019-03-27  6:19 [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Tom Barbette
                   ` (2 preceding siblings ...)
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock Tom Barbette
@ 2019-03-27  6:19 ` Tom Barbette
  2019-03-27  6:19   ` Tom Barbette
  2019-04-02 18:22   ` Ferruh Yigit
  2019-03-27 14:41 ` [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Stephen Hemminger
  4 siblings, 2 replies; 32+ messages in thread
From: Tom Barbette @ 2019-03-27  6:19 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Tom Barbette

Use rxtx callback to demonstrate a way to use rte_eth_read_clock to
convert the hardware timestamps to an amount of cycles.

This allows to get the amount of time the packet spent since its entry
in the device. While the regular latency only shows the latency from
when it entered the software stack.

Signed-off-by: Tom Barbette <barbette@kth.se>
---
 doc/guides/sample_app_ug/rxtx_callbacks.rst |  9 ++-
 examples/rxtx_callbacks/Makefile            |  2 +
 examples/rxtx_callbacks/main.c              | 86 ++++++++++++++++++++-
 examples/rxtx_callbacks/meson.build         |  1 +
 4 files changed, 93 insertions(+), 5 deletions(-)

diff --git a/doc/guides/sample_app_ug/rxtx_callbacks.rst b/doc/guides/sample_app_ug/rxtx_callbacks.rst
index 81463d28d..6b0c64461 100644
--- a/doc/guides/sample_app_ug/rxtx_callbacks.rst
+++ b/doc/guides/sample_app_ug/rxtx_callbacks.rst
@@ -13,6 +13,10 @@ In the sample application a user defined callback is applied to all received
 packets to add a timestamp. A separate callback is applied to all packets
 prior to transmission to calculate the elapsed time, in CPU cycles.
 
+If hardware timestamping is supported by the NIC, the sample application will
+also display the average latency since the packet was timestamped in hardware,
+on top of the latency since the packet was received and processed by the RX
+callback.
 
 Compiling the Application
 -------------------------
@@ -36,7 +40,10 @@ To run the example in a ``linux`` environment:
 
 .. code-block:: console
 
-    ./build/rxtx_callbacks -l 1 -n 4
+    ./build/rxtx_callbacks -l 1 -n 4 -- [-t]
+
+Use -t to enable hardware timestamping. If not supported by the NIC, an error
+will be displayed.
 
 Refer to *DPDK Getting Started Guide* for general information on running
 applications and the Environment Abstraction Layer (EAL) options.
diff --git a/examples/rxtx_callbacks/Makefile b/examples/rxtx_callbacks/Makefile
index 5154089de..f27a61906 100644
--- a/examples/rxtx_callbacks/Makefile
+++ b/examples/rxtx_callbacks/Makefile
@@ -50,6 +50,8 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 CFLAGS += $(WERROR_FLAGS)
 
+
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
 ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
index 2058be627..7000464e1 100644
--- a/examples/rxtx_callbacks/main.c
+++ b/examples/rxtx_callbacks/main.c
@@ -10,6 +10,8 @@
 #include <rte_lcore.h>
 #include <rte_mbuf.h>
 
+#include <getopt.h>
+
 #define RX_RING_SIZE 1024
 #define TX_RING_SIZE 1024
 
@@ -17,6 +19,9 @@
 #define MBUF_CACHE_SIZE 250
 #define BURST_SIZE 32
 
+static const char usage[] =
+	"%s EAL_ARGS -- [-t]\n";
+
 static const struct rte_eth_conf port_conf_default = {
 	.rxmode = {
 		.max_rx_pkt_len = ETHER_MAX_LEN,
@@ -25,9 +30,14 @@ static const struct rte_eth_conf port_conf_default = {
 
 static struct {
 	uint64_t total_cycles;
+	uint64_t total_queue_cycles;
 	uint64_t total_pkts;
 } latency_numbers;
 
+int hw_timestamping;
+
+#define TICKS_PER_CYCLE_SHIFT 16
+uint64_t ticks_per_cycle_mult;
 
 static uint16_t
 add_timestamps(uint16_t port __rte_unused, uint16_t qidx __rte_unused,
@@ -43,22 +53,42 @@ add_timestamps(uint16_t port __rte_unused, uint16_t qidx __rte_unused,
 }
 
 static uint16_t
-calc_latency(uint16_t port __rte_unused, uint16_t qidx __rte_unused,
+calc_latency(uint16_t port, uint16_t qidx __rte_unused,
 		struct rte_mbuf **pkts, uint16_t nb_pkts, void *_ __rte_unused)
 {
 	uint64_t cycles = 0;
+	uint64_t queue_ticks = 0;
 	uint64_t now = rte_rdtsc();
+	uint64_t ticks;
 	unsigned i;
 
-	for (i = 0; i < nb_pkts; i++)
+	if (hw_timestamping)
+		rte_eth_read_clock(port, &ticks);
+
+	for (i = 0; i < nb_pkts; i++) {
 		cycles += now - pkts[i]->udata64;
+		if (hw_timestamping)
+			queue_ticks += ticks - pkts[i]->timestamp;
+	}
+
 	latency_numbers.total_cycles += cycles;
+	if (hw_timestamping)
+		latency_numbers.total_queue_cycles += (queue_ticks
+			* ticks_per_cycle_mult) >> TICKS_PER_CYCLE_SHIFT;
+
 	latency_numbers.total_pkts += nb_pkts;
 
 	if (latency_numbers.total_pkts > (100 * 1000 * 1000ULL)) {
 		printf("Latency = %"PRIu64" cycles\n",
 		latency_numbers.total_cycles / latency_numbers.total_pkts);
-		latency_numbers.total_cycles = latency_numbers.total_pkts = 0;
+		if (hw_timestamping) {
+			printf("Latency from HW = %"PRIu64" cycles\n",
+			   latency_numbers.total_queue_cycles
+			   / latency_numbers.total_pkts);
+		}
+		latency_numbers.total_cycles = 0;
+		latency_numbers.total_queue_cycles = 0;
+		latency_numbers.total_pkts = 0;
 	}
 	return nb_pkts;
 }
@@ -77,6 +107,7 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
 	int retval;
 	uint16_t q;
 	struct rte_eth_dev_info dev_info;
+	struct rte_eth_rxconf rxconf;
 	struct rte_eth_txconf txconf;
 
 	if (!rte_eth_dev_is_valid_port(port))
@@ -95,9 +126,20 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
 	if (retval != 0)
 		return retval;
 
+	rxconf = dev_info.default_rxconf;
+
+	if (hw_timestamping) {
+		if (!(dev_info.rx_offload_capa & DEV_RX_OFFLOAD_TIMESTAMP)) {
+			printf("\nERROR: Port %u does not support hardware timestamping\n"
+					, port);
+			return -1;
+		}
+		rxconf.offloads |= DEV_RX_OFFLOAD_TIMESTAMP;
+	}
+
 	for (q = 0; q < rx_rings; q++) {
 		retval = rte_eth_rx_queue_setup(port, q, nb_rxd,
-				rte_eth_dev_socket_id(port), NULL, mbuf_pool);
+			rte_eth_dev_socket_id(port), &rxconf, mbuf_pool);
 		if (retval < 0)
 			return retval;
 	}
@@ -115,6 +157,25 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
 	if (retval < 0)
 		return retval;
 
+	if (hw_timestamping && ticks_per_cycle_mult  == 0) {
+		uint64_t cycles_base = rte_rdtsc();
+		uint64_t ticks_base;
+		rte_eth_read_clock(port, &ticks_base);
+		rte_delay_ms(100);
+		uint64_t cycles = rte_rdtsc();
+		uint64_t ticks;
+		rte_eth_read_clock(port, &ticks);
+		uint64_t c_freq = cycles - cycles_base;
+		uint64_t t_freq = ticks - ticks_base;
+		double freq_mult = (double)c_freq / t_freq;
+		printf("TSC Freq ~= %lu\nHW Freq ~= %lu\nRatio : %f\n",
+				c_freq * 10, t_freq * 10, freq_mult);
+		/* TSC will be faster than internal ticks so freq_mult is > 0
+		 * We convert the multiplication to an integer shift & mult
+		 */
+		ticks_per_cycle_mult = (1 << TICKS_PER_CYCLE_SHIFT) / freq_mult;
+	}
+
 	struct ether_addr addr;
 
 	rte_eth_macaddr_get(port, &addr);
@@ -177,6 +238,11 @@ main(int argc, char *argv[])
 	struct rte_mempool *mbuf_pool;
 	uint16_t nb_ports;
 	uint16_t portid;
+	struct option lgopts[] = {
+		{ NULL,  0, 0, 0 }
+	};
+	int opt, option_index;
+
 
 	/* init EAL */
 	int ret = rte_eal_init(argc, argv);
@@ -186,6 +252,18 @@ main(int argc, char *argv[])
 	argc -= ret;
 	argv += ret;
 
+	while ((opt = getopt_long(argc, argv, "t", lgopts, &option_index))
+			!= EOF)
+		switch (opt) {
+		case 't':
+			hw_timestamping = 1;
+			break;
+		default:
+			printf(usage, argv[0]);
+			return -1;
+		}
+	optind = 1; /* reset getopt lib */
+
 	nb_ports = rte_eth_dev_count_avail();
 	if (nb_ports < 2 || (nb_ports & 1))
 		rte_exit(EXIT_FAILURE, "Error: number of ports must be even\n");
diff --git a/examples/rxtx_callbacks/meson.build b/examples/rxtx_callbacks/meson.build
index c34e11e36..2b0a25036 100644
--- a/examples/rxtx_callbacks/meson.build
+++ b/examples/rxtx_callbacks/meson.build
@@ -6,6 +6,7 @@
 # To build this example as a standalone application with an already-installed
 # DPDK instance, use 'make'
 
+allow_experimental_apis = true
 sources = files(
 	'main.c'
 )
-- 
2.17.1

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [dpdk-dev] [PATCH v2 3/3] rxtx_callbacks: Add support for HW timestamp
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 3/3] rxtx_callbacks: Add support for HW timestamp Tom Barbette
@ 2019-03-27  6:19   ` Tom Barbette
  2019-04-02 18:22   ` Ferruh Yigit
  1 sibling, 0 replies; 32+ messages in thread
From: Tom Barbette @ 2019-03-27  6:19 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Tom Barbette

Use rxtx callback to demonstrate a way to use rte_eth_read_clock to
convert the hardware timestamps to an amount of cycles.

This allows to get the amount of time the packet spent since its entry
in the device. While the regular latency only shows the latency from
when it entered the software stack.

Signed-off-by: Tom Barbette <barbette@kth.se>
---
 doc/guides/sample_app_ug/rxtx_callbacks.rst |  9 ++-
 examples/rxtx_callbacks/Makefile            |  2 +
 examples/rxtx_callbacks/main.c              | 86 ++++++++++++++++++++-
 examples/rxtx_callbacks/meson.build         |  1 +
 4 files changed, 93 insertions(+), 5 deletions(-)

diff --git a/doc/guides/sample_app_ug/rxtx_callbacks.rst b/doc/guides/sample_app_ug/rxtx_callbacks.rst
index 81463d28d..6b0c64461 100644
--- a/doc/guides/sample_app_ug/rxtx_callbacks.rst
+++ b/doc/guides/sample_app_ug/rxtx_callbacks.rst
@@ -13,6 +13,10 @@ In the sample application a user defined callback is applied to all received
 packets to add a timestamp. A separate callback is applied to all packets
 prior to transmission to calculate the elapsed time, in CPU cycles.
 
+If hardware timestamping is supported by the NIC, the sample application will
+also display the average latency since the packet was timestamped in hardware,
+on top of the latency since the packet was received and processed by the RX
+callback.
 
 Compiling the Application
 -------------------------
@@ -36,7 +40,10 @@ To run the example in a ``linux`` environment:
 
 .. code-block:: console
 
-    ./build/rxtx_callbacks -l 1 -n 4
+    ./build/rxtx_callbacks -l 1 -n 4 -- [-t]
+
+Use -t to enable hardware timestamping. If not supported by the NIC, an error
+will be displayed.
 
 Refer to *DPDK Getting Started Guide* for general information on running
 applications and the Environment Abstraction Layer (EAL) options.
diff --git a/examples/rxtx_callbacks/Makefile b/examples/rxtx_callbacks/Makefile
index 5154089de..f27a61906 100644
--- a/examples/rxtx_callbacks/Makefile
+++ b/examples/rxtx_callbacks/Makefile
@@ -50,6 +50,8 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 CFLAGS += $(WERROR_FLAGS)
 
+
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
 ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
index 2058be627..7000464e1 100644
--- a/examples/rxtx_callbacks/main.c
+++ b/examples/rxtx_callbacks/main.c
@@ -10,6 +10,8 @@
 #include <rte_lcore.h>
 #include <rte_mbuf.h>
 
+#include <getopt.h>
+
 #define RX_RING_SIZE 1024
 #define TX_RING_SIZE 1024
 
@@ -17,6 +19,9 @@
 #define MBUF_CACHE_SIZE 250
 #define BURST_SIZE 32
 
+static const char usage[] =
+	"%s EAL_ARGS -- [-t]\n";
+
 static const struct rte_eth_conf port_conf_default = {
 	.rxmode = {
 		.max_rx_pkt_len = ETHER_MAX_LEN,
@@ -25,9 +30,14 @@ static const struct rte_eth_conf port_conf_default = {
 
 static struct {
 	uint64_t total_cycles;
+	uint64_t total_queue_cycles;
 	uint64_t total_pkts;
 } latency_numbers;
 
+int hw_timestamping;
+
+#define TICKS_PER_CYCLE_SHIFT 16
+uint64_t ticks_per_cycle_mult;
 
 static uint16_t
 add_timestamps(uint16_t port __rte_unused, uint16_t qidx __rte_unused,
@@ -43,22 +53,42 @@ add_timestamps(uint16_t port __rte_unused, uint16_t qidx __rte_unused,
 }
 
 static uint16_t
-calc_latency(uint16_t port __rte_unused, uint16_t qidx __rte_unused,
+calc_latency(uint16_t port, uint16_t qidx __rte_unused,
 		struct rte_mbuf **pkts, uint16_t nb_pkts, void *_ __rte_unused)
 {
 	uint64_t cycles = 0;
+	uint64_t queue_ticks = 0;
 	uint64_t now = rte_rdtsc();
+	uint64_t ticks;
 	unsigned i;
 
-	for (i = 0; i < nb_pkts; i++)
+	if (hw_timestamping)
+		rte_eth_read_clock(port, &ticks);
+
+	for (i = 0; i < nb_pkts; i++) {
 		cycles += now - pkts[i]->udata64;
+		if (hw_timestamping)
+			queue_ticks += ticks - pkts[i]->timestamp;
+	}
+
 	latency_numbers.total_cycles += cycles;
+	if (hw_timestamping)
+		latency_numbers.total_queue_cycles += (queue_ticks
+			* ticks_per_cycle_mult) >> TICKS_PER_CYCLE_SHIFT;
+
 	latency_numbers.total_pkts += nb_pkts;
 
 	if (latency_numbers.total_pkts > (100 * 1000 * 1000ULL)) {
 		printf("Latency = %"PRIu64" cycles\n",
 		latency_numbers.total_cycles / latency_numbers.total_pkts);
-		latency_numbers.total_cycles = latency_numbers.total_pkts = 0;
+		if (hw_timestamping) {
+			printf("Latency from HW = %"PRIu64" cycles\n",
+			   latency_numbers.total_queue_cycles
+			   / latency_numbers.total_pkts);
+		}
+		latency_numbers.total_cycles = 0;
+		latency_numbers.total_queue_cycles = 0;
+		latency_numbers.total_pkts = 0;
 	}
 	return nb_pkts;
 }
@@ -77,6 +107,7 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
 	int retval;
 	uint16_t q;
 	struct rte_eth_dev_info dev_info;
+	struct rte_eth_rxconf rxconf;
 	struct rte_eth_txconf txconf;
 
 	if (!rte_eth_dev_is_valid_port(port))
@@ -95,9 +126,20 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
 	if (retval != 0)
 		return retval;
 
+	rxconf = dev_info.default_rxconf;
+
+	if (hw_timestamping) {
+		if (!(dev_info.rx_offload_capa & DEV_RX_OFFLOAD_TIMESTAMP)) {
+			printf("\nERROR: Port %u does not support hardware timestamping\n"
+					, port);
+			return -1;
+		}
+		rxconf.offloads |= DEV_RX_OFFLOAD_TIMESTAMP;
+	}
+
 	for (q = 0; q < rx_rings; q++) {
 		retval = rte_eth_rx_queue_setup(port, q, nb_rxd,
-				rte_eth_dev_socket_id(port), NULL, mbuf_pool);
+			rte_eth_dev_socket_id(port), &rxconf, mbuf_pool);
 		if (retval < 0)
 			return retval;
 	}
@@ -115,6 +157,25 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
 	if (retval < 0)
 		return retval;
 
+	if (hw_timestamping && ticks_per_cycle_mult  == 0) {
+		uint64_t cycles_base = rte_rdtsc();
+		uint64_t ticks_base;
+		rte_eth_read_clock(port, &ticks_base);
+		rte_delay_ms(100);
+		uint64_t cycles = rte_rdtsc();
+		uint64_t ticks;
+		rte_eth_read_clock(port, &ticks);
+		uint64_t c_freq = cycles - cycles_base;
+		uint64_t t_freq = ticks - ticks_base;
+		double freq_mult = (double)c_freq / t_freq;
+		printf("TSC Freq ~= %lu\nHW Freq ~= %lu\nRatio : %f\n",
+				c_freq * 10, t_freq * 10, freq_mult);
+		/* TSC will be faster than internal ticks so freq_mult is > 0
+		 * We convert the multiplication to an integer shift & mult
+		 */
+		ticks_per_cycle_mult = (1 << TICKS_PER_CYCLE_SHIFT) / freq_mult;
+	}
+
 	struct ether_addr addr;
 
 	rte_eth_macaddr_get(port, &addr);
@@ -177,6 +238,11 @@ main(int argc, char *argv[])
 	struct rte_mempool *mbuf_pool;
 	uint16_t nb_ports;
 	uint16_t portid;
+	struct option lgopts[] = {
+		{ NULL,  0, 0, 0 }
+	};
+	int opt, option_index;
+
 
 	/* init EAL */
 	int ret = rte_eal_init(argc, argv);
@@ -186,6 +252,18 @@ main(int argc, char *argv[])
 	argc -= ret;
 	argv += ret;
 
+	while ((opt = getopt_long(argc, argv, "t", lgopts, &option_index))
+			!= EOF)
+		switch (opt) {
+		case 't':
+			hw_timestamping = 1;
+			break;
+		default:
+			printf(usage, argv[0]);
+			return -1;
+		}
+	optind = 1; /* reset getopt lib */
+
 	nb_ports = rte_eth_dev_count_avail();
 	if (nb_ports < 2 || (nb_ports & 1))
 		rte_exit(EXIT_FAILURE, "Error: number of ports must be even\n");
diff --git a/examples/rxtx_callbacks/meson.build b/examples/rxtx_callbacks/meson.build
index c34e11e36..2b0a25036 100644
--- a/examples/rxtx_callbacks/meson.build
+++ b/examples/rxtx_callbacks/meson.build
@@ -6,6 +6,7 @@
 # To build this example as a standalone application with an already-installed
 # DPDK instance, use 'make'
 
+allow_experimental_apis = true
 sources = files(
 	'main.c'
 )
-- 
2.17.1


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API
  2019-03-27  6:19 [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Tom Barbette
                   ` (3 preceding siblings ...)
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 3/3] rxtx_callbacks: Add support for HW timestamp Tom Barbette
@ 2019-03-27 14:41 ` Stephen Hemminger
  2019-03-27 14:41   ` Stephen Hemminger
                     ` (2 more replies)
  4 siblings, 3 replies; 32+ messages in thread
From: Stephen Hemminger @ 2019-03-27 14:41 UTC (permalink / raw)
  To: Tom Barbette
  Cc: dev, bruce.richardson, john.mcnamara, Thomas Monjalon,
	Ferruh Yigit, Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

On Wed, 27 Mar 2019 07:19:32 +0100
Tom Barbette <barbette@kth.se> wrote:

> Some NICs allow to timestamp packets, but do not support the full
> PTP synchronization process. Hence, the value set in the mbuf
> timestamp field is only the raw value of an internal clock.
> 
> To make sense of this value, one at least needs to be able to query
> the current hardware clock value. As with the TSC, from there
> a frequency can be derieved by querying multiple time the current value of the
> internal clock with some known delay between the queries (example
> provided in the API doc).
> 
> This patch series adds support for MLX5.
> 
> An example app is provided in the rxtx_callback application.
> It has been updated to display, on top of the software latency
> in cycles, the total latency since the packet was received in hardware.
> The API is used to compute a delta in the TX callback. The raw amount of
> ticks is converted to cycles using a variation of the technique describe above.
> 
> Aside from offloading timestamping, which relieve the
> software from a few operations, this allows to get much more precision
> when studying the source of the latency in a system.
> Eg. in our 100G, CX5 setup the rxtx callback application shows
> SW latency is around 74 cycles (TSC is 3.2Ghz), but the latency
> including NIC processing, PCIe, and queuing is around 196 cycles.
> 
> One may think at first this API is overlapping with te_eth_timesync_read_time.
> rte_eth_timesync_read_time is clearly identified as part of a set of functions
> to use PTP synchronization.
> The device raw clock is not "sync" in any way. More importantly, the returned
> value is not a timeval, but an amount of ticks. We could have a cast-based
> solution, but on top of being an ugly solution, some people seeing the timeval
> type of rte_eth_timesync_read_time could use it blindly.
> 
> Change in v2:
>   - Rebase on current master
> 
> Tom Barbette (3):
>   rte_ethdev: Add API function to read dev clock
>   mlx5: Implement support for read_clock
>   rxtx_callbacks: Add support for HW timestamp
> 
>  doc/guides/nics/features.rst                |  1 +
>  doc/guides/sample_app_ug/rxtx_callbacks.rst |  9 ++-
>  drivers/net/mlx5/mlx5.c                     |  1 +
>  drivers/net/mlx5/mlx5.h                     |  1 +
>  drivers/net/mlx5/mlx5_ethdev.c              | 29 +++++++
>  drivers/net/mlx5/mlx5_glue.c                |  8 ++
>  drivers/net/mlx5/mlx5_glue.h                |  2 +
>  examples/rxtx_callbacks/Makefile            |  2 +
>  examples/rxtx_callbacks/main.c              | 86 ++++++++++++++++++++-
>  examples/rxtx_callbacks/meson.build         |  1 +
>  lib/librte_ethdev/rte_ethdev.c              | 13 ++++
>  lib/librte_ethdev/rte_ethdev.h              | 44 +++++++++++
>  lib/librte_ethdev/rte_ethdev_core.h         |  6 ++
>  lib/librte_ethdev/rte_ethdev_version.map    |  1 +
>  lib/librte_mbuf/rte_mbuf.h                  |  2 +
>  15 files changed, 201 insertions(+), 5 deletions(-)


I like this approach but would like to see the same API supported
on multiple devices.

The current timestamp API is a mess because not all devices behave the
same way. Trying to write an application that uses timestamping is therefore
very difficult.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API
  2019-03-27 14:41 ` [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Stephen Hemminger
@ 2019-03-27 14:41   ` Stephen Hemminger
  2019-03-27 14:48   ` Thomas Monjalon
  2019-03-27 14:54   ` Wiles, Keith
  2 siblings, 0 replies; 32+ messages in thread
From: Stephen Hemminger @ 2019-03-27 14:41 UTC (permalink / raw)
  To: Tom Barbette
  Cc: dev, bruce.richardson, john.mcnamara, Thomas Monjalon,
	Ferruh Yigit, Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

On Wed, 27 Mar 2019 07:19:32 +0100
Tom Barbette <barbette@kth.se> wrote:

> Some NICs allow to timestamp packets, but do not support the full
> PTP synchronization process. Hence, the value set in the mbuf
> timestamp field is only the raw value of an internal clock.
> 
> To make sense of this value, one at least needs to be able to query
> the current hardware clock value. As with the TSC, from there
> a frequency can be derieved by querying multiple time the current value of the
> internal clock with some known delay between the queries (example
> provided in the API doc).
> 
> This patch series adds support for MLX5.
> 
> An example app is provided in the rxtx_callback application.
> It has been updated to display, on top of the software latency
> in cycles, the total latency since the packet was received in hardware.
> The API is used to compute a delta in the TX callback. The raw amount of
> ticks is converted to cycles using a variation of the technique describe above.
> 
> Aside from offloading timestamping, which relieve the
> software from a few operations, this allows to get much more precision
> when studying the source of the latency in a system.
> Eg. in our 100G, CX5 setup the rxtx callback application shows
> SW latency is around 74 cycles (TSC is 3.2Ghz), but the latency
> including NIC processing, PCIe, and queuing is around 196 cycles.
> 
> One may think at first this API is overlapping with te_eth_timesync_read_time.
> rte_eth_timesync_read_time is clearly identified as part of a set of functions
> to use PTP synchronization.
> The device raw clock is not "sync" in any way. More importantly, the returned
> value is not a timeval, but an amount of ticks. We could have a cast-based
> solution, but on top of being an ugly solution, some people seeing the timeval
> type of rte_eth_timesync_read_time could use it blindly.
> 
> Change in v2:
>   - Rebase on current master
> 
> Tom Barbette (3):
>   rte_ethdev: Add API function to read dev clock
>   mlx5: Implement support for read_clock
>   rxtx_callbacks: Add support for HW timestamp
> 
>  doc/guides/nics/features.rst                |  1 +
>  doc/guides/sample_app_ug/rxtx_callbacks.rst |  9 ++-
>  drivers/net/mlx5/mlx5.c                     |  1 +
>  drivers/net/mlx5/mlx5.h                     |  1 +
>  drivers/net/mlx5/mlx5_ethdev.c              | 29 +++++++
>  drivers/net/mlx5/mlx5_glue.c                |  8 ++
>  drivers/net/mlx5/mlx5_glue.h                |  2 +
>  examples/rxtx_callbacks/Makefile            |  2 +
>  examples/rxtx_callbacks/main.c              | 86 ++++++++++++++++++++-
>  examples/rxtx_callbacks/meson.build         |  1 +
>  lib/librte_ethdev/rte_ethdev.c              | 13 ++++
>  lib/librte_ethdev/rte_ethdev.h              | 44 +++++++++++
>  lib/librte_ethdev/rte_ethdev_core.h         |  6 ++
>  lib/librte_ethdev/rte_ethdev_version.map    |  1 +
>  lib/librte_mbuf/rte_mbuf.h                  |  2 +
>  15 files changed, 201 insertions(+), 5 deletions(-)


I like this approach but would like to see the same API supported
on multiple devices.

The current timestamp API is a mess because not all devices behave the
same way. Trying to write an application that uses timestamping is therefore
very difficult.




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API
  2019-03-27 14:41 ` [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Stephen Hemminger
  2019-03-27 14:41   ` Stephen Hemminger
@ 2019-03-27 14:48   ` Thomas Monjalon
  2019-03-27 14:48     ` Thomas Monjalon
  2019-03-27 14:54   ` Wiles, Keith
  2 siblings, 1 reply; 32+ messages in thread
From: Thomas Monjalon @ 2019-03-27 14:48 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Tom Barbette, dev, bruce.richardson, john.mcnamara, Ferruh Yigit,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

27/03/2019 15:41, Stephen Hemminger:
> On Wed, 27 Mar 2019 07:19:32 +0100
> Tom Barbette <barbette@kth.se> wrote:
> 
> > Some NICs allow to timestamp packets, but do not support the full
> > PTP synchronization process. Hence, the value set in the mbuf
> > timestamp field is only the raw value of an internal clock.
> > 
> > To make sense of this value, one at least needs to be able to query
> > the current hardware clock value. As with the TSC, from there
> > a frequency can be derieved by querying multiple time the current value of the
> > internal clock with some known delay between the queries (example
> > provided in the API doc).
> > 
> > This patch series adds support for MLX5.
> > 
> > An example app is provided in the rxtx_callback application.
> > It has been updated to display, on top of the software latency
> > in cycles, the total latency since the packet was received in hardware.
> > The API is used to compute a delta in the TX callback. The raw amount of
> > ticks is converted to cycles using a variation of the technique describe above.
> > 
> > Aside from offloading timestamping, which relieve the
> > software from a few operations, this allows to get much more precision
> > when studying the source of the latency in a system.
> > Eg. in our 100G, CX5 setup the rxtx callback application shows
> > SW latency is around 74 cycles (TSC is 3.2Ghz), but the latency
> > including NIC processing, PCIe, and queuing is around 196 cycles.
> > 
> > One may think at first this API is overlapping with te_eth_timesync_read_time.
> > rte_eth_timesync_read_time is clearly identified as part of a set of functions
> > to use PTP synchronization.
> > The device raw clock is not "sync" in any way. More importantly, the returned
> > value is not a timeval, but an amount of ticks. We could have a cast-based
> > solution, but on top of being an ugly solution, some people seeing the timeval
> > type of rte_eth_timesync_read_time could use it blindly.
> > 
> > Change in v2:
> >   - Rebase on current master
> > 
> > Tom Barbette (3):
> >   rte_ethdev: Add API function to read dev clock
> >   mlx5: Implement support for read_clock
> >   rxtx_callbacks: Add support for HW timestamp
> 
> I like this approach but would like to see the same API supported
> on multiple devices.
> 
> The current timestamp API is a mess because not all devices behave the
> same way. Trying to write an application that uses timestamping is therefore
> very difficult.

So what do you suggest?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API
  2019-03-27 14:48   ` Thomas Monjalon
@ 2019-03-27 14:48     ` Thomas Monjalon
  0 siblings, 0 replies; 32+ messages in thread
From: Thomas Monjalon @ 2019-03-27 14:48 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Tom Barbette, dev, bruce.richardson, john.mcnamara, Ferruh Yigit,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

27/03/2019 15:41, Stephen Hemminger:
> On Wed, 27 Mar 2019 07:19:32 +0100
> Tom Barbette <barbette@kth.se> wrote:
> 
> > Some NICs allow to timestamp packets, but do not support the full
> > PTP synchronization process. Hence, the value set in the mbuf
> > timestamp field is only the raw value of an internal clock.
> > 
> > To make sense of this value, one at least needs to be able to query
> > the current hardware clock value. As with the TSC, from there
> > a frequency can be derieved by querying multiple time the current value of the
> > internal clock with some known delay between the queries (example
> > provided in the API doc).
> > 
> > This patch series adds support for MLX5.
> > 
> > An example app is provided in the rxtx_callback application.
> > It has been updated to display, on top of the software latency
> > in cycles, the total latency since the packet was received in hardware.
> > The API is used to compute a delta in the TX callback. The raw amount of
> > ticks is converted to cycles using a variation of the technique describe above.
> > 
> > Aside from offloading timestamping, which relieve the
> > software from a few operations, this allows to get much more precision
> > when studying the source of the latency in a system.
> > Eg. in our 100G, CX5 setup the rxtx callback application shows
> > SW latency is around 74 cycles (TSC is 3.2Ghz), but the latency
> > including NIC processing, PCIe, and queuing is around 196 cycles.
> > 
> > One may think at first this API is overlapping with te_eth_timesync_read_time.
> > rte_eth_timesync_read_time is clearly identified as part of a set of functions
> > to use PTP synchronization.
> > The device raw clock is not "sync" in any way. More importantly, the returned
> > value is not a timeval, but an amount of ticks. We could have a cast-based
> > solution, but on top of being an ugly solution, some people seeing the timeval
> > type of rte_eth_timesync_read_time could use it blindly.
> > 
> > Change in v2:
> >   - Rebase on current master
> > 
> > Tom Barbette (3):
> >   rte_ethdev: Add API function to read dev clock
> >   mlx5: Implement support for read_clock
> >   rxtx_callbacks: Add support for HW timestamp
> 
> I like this approach but would like to see the same API supported
> on multiple devices.
> 
> The current timestamp API is a mess because not all devices behave the
> same way. Trying to write an application that uses timestamping is therefore
> very difficult.

So what do you suggest?



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API
  2019-03-27 14:41 ` [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Stephen Hemminger
  2019-03-27 14:41   ` Stephen Hemminger
  2019-03-27 14:48   ` Thomas Monjalon
@ 2019-03-27 14:54   ` Wiles, Keith
  2019-03-27 14:54     ` Wiles, Keith
  2019-03-27 16:08     ` Tom Barbette
  2 siblings, 2 replies; 32+ messages in thread
From: Wiles, Keith @ 2019-03-27 14:54 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Tom Barbette, dpdk-dev, Richardson, Bruce, Mcnamara, John,
	Thomas Monjalon, Yigit, Ferruh, Andrew Rybchenko, Shahaf Shuler,
	Yongseok Koh



> On Mar 27, 2019, at 9:41 AM, Stephen Hemminger <stephen@networkplumber.org> wrote:
> 
> On Wed, 27 Mar 2019 07:19:32 +0100
> Tom Barbette <barbette@kth.se> wrote:
> 
>> Some NICs allow to timestamp packets, but do not support the full
>> PTP synchronization process. Hence, the value set in the mbuf
>> timestamp field is only the raw value of an internal clock.
>> 
>> To make sense of this value, one at least needs to be able to query
>> the current hardware clock value. As with the TSC, from there
>> a frequency can be derieved by querying multiple time the current value of the
>> internal clock with some known delay between the queries (example
>> provided in the API doc).
>> 
>> This patch series adds support for MLX5.
>> 
>> An example app is provided in the rxtx_callback application.
>> It has been updated to display, on top of the software latency
>> in cycles, the total latency since the packet was received in hardware.
>> The API is used to compute a delta in the TX callback. The raw amount of
>> ticks is converted to cycles using a variation of the technique describe above.
>> 
>> Aside from offloading timestamping, which relieve the
>> software from a few operations, this allows to get much more precision
>> when studying the source of the latency in a system.
>> Eg. in our 100G, CX5 setup the rxtx callback application shows
>> SW latency is around 74 cycles (TSC is 3.2Ghz), but the latency
>> including NIC processing, PCIe, and queuing is around 196 cycles.
>> 
>> One may think at first this API is overlapping with te_eth_timesync_read_time.
>> rte_eth_timesync_read_time is clearly identified as part of a set of functions
>> to use PTP synchronization.
>> The device raw clock is not "sync" in any way. More importantly, the returned
>> value is not a timeval, but an amount of ticks. We could have a cast-based
>> solution, but on top of being an ugly solution, some people seeing the timeval
>> type of rte_eth_timesync_read_time could use it blindly.
>> 
>> Change in v2:
>>  - Rebase on current master
>> 
>> Tom Barbette (3):
>>  rte_ethdev: Add API function to read dev clock
>>  mlx5: Implement support for read_clock
>>  rxtx_callbacks: Add support for HW timestamp
>> 
>> doc/guides/nics/features.rst                |  1 +
>> doc/guides/sample_app_ug/rxtx_callbacks.rst |  9 ++-
>> drivers/net/mlx5/mlx5.c                     |  1 +
>> drivers/net/mlx5/mlx5.h                     |  1 +
>> drivers/net/mlx5/mlx5_ethdev.c              | 29 +++++++
>> drivers/net/mlx5/mlx5_glue.c                |  8 ++
>> drivers/net/mlx5/mlx5_glue.h                |  2 +
>> examples/rxtx_callbacks/Makefile            |  2 +
>> examples/rxtx_callbacks/main.c              | 86 ++++++++++++++++++++-
>> examples/rxtx_callbacks/meson.build         |  1 +
>> lib/librte_ethdev/rte_ethdev.c              | 13 ++++
>> lib/librte_ethdev/rte_ethdev.h              | 44 +++++++++++
>> lib/librte_ethdev/rte_ethdev_core.h         |  6 ++
>> lib/librte_ethdev/rte_ethdev_version.map    |  1 +
>> lib/librte_mbuf/rte_mbuf.h                  |  2 +
>> 15 files changed, 201 insertions(+), 5 deletions(-)
> 
> 
> I like this approach but would like to see the same API supported
> on multiple devices.
> 
> The current timestamp API is a mess because not all devices behave the
> same way. Trying to write an application that uses timestamping is therefore
> very difficult.

Another question is this an optional API for a PMD? I assume it is.

I know the API rte_eht_read_clock() is attempting to read the NIC for this timestamp, but if the PMD does not support this request can we just default to the rte_rdtsc() as a return value?

Regards,
Keith

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API
  2019-03-27 14:54   ` Wiles, Keith
@ 2019-03-27 14:54     ` Wiles, Keith
  2019-03-27 16:08     ` Tom Barbette
  1 sibling, 0 replies; 32+ messages in thread
From: Wiles, Keith @ 2019-03-27 14:54 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Tom Barbette, dpdk-dev, Richardson, Bruce, Mcnamara, John,
	Thomas Monjalon, Yigit, Ferruh, Andrew Rybchenko, Shahaf Shuler,
	Yongseok Koh



> On Mar 27, 2019, at 9:41 AM, Stephen Hemminger <stephen@networkplumber.org> wrote:
> 
> On Wed, 27 Mar 2019 07:19:32 +0100
> Tom Barbette <barbette@kth.se> wrote:
> 
>> Some NICs allow to timestamp packets, but do not support the full
>> PTP synchronization process. Hence, the value set in the mbuf
>> timestamp field is only the raw value of an internal clock.
>> 
>> To make sense of this value, one at least needs to be able to query
>> the current hardware clock value. As with the TSC, from there
>> a frequency can be derieved by querying multiple time the current value of the
>> internal clock with some known delay between the queries (example
>> provided in the API doc).
>> 
>> This patch series adds support for MLX5.
>> 
>> An example app is provided in the rxtx_callback application.
>> It has been updated to display, on top of the software latency
>> in cycles, the total latency since the packet was received in hardware.
>> The API is used to compute a delta in the TX callback. The raw amount of
>> ticks is converted to cycles using a variation of the technique describe above.
>> 
>> Aside from offloading timestamping, which relieve the
>> software from a few operations, this allows to get much more precision
>> when studying the source of the latency in a system.
>> Eg. in our 100G, CX5 setup the rxtx callback application shows
>> SW latency is around 74 cycles (TSC is 3.2Ghz), but the latency
>> including NIC processing, PCIe, and queuing is around 196 cycles.
>> 
>> One may think at first this API is overlapping with te_eth_timesync_read_time.
>> rte_eth_timesync_read_time is clearly identified as part of a set of functions
>> to use PTP synchronization.
>> The device raw clock is not "sync" in any way. More importantly, the returned
>> value is not a timeval, but an amount of ticks. We could have a cast-based
>> solution, but on top of being an ugly solution, some people seeing the timeval
>> type of rte_eth_timesync_read_time could use it blindly.
>> 
>> Change in v2:
>>  - Rebase on current master
>> 
>> Tom Barbette (3):
>>  rte_ethdev: Add API function to read dev clock
>>  mlx5: Implement support for read_clock
>>  rxtx_callbacks: Add support for HW timestamp
>> 
>> doc/guides/nics/features.rst                |  1 +
>> doc/guides/sample_app_ug/rxtx_callbacks.rst |  9 ++-
>> drivers/net/mlx5/mlx5.c                     |  1 +
>> drivers/net/mlx5/mlx5.h                     |  1 +
>> drivers/net/mlx5/mlx5_ethdev.c              | 29 +++++++
>> drivers/net/mlx5/mlx5_glue.c                |  8 ++
>> drivers/net/mlx5/mlx5_glue.h                |  2 +
>> examples/rxtx_callbacks/Makefile            |  2 +
>> examples/rxtx_callbacks/main.c              | 86 ++++++++++++++++++++-
>> examples/rxtx_callbacks/meson.build         |  1 +
>> lib/librte_ethdev/rte_ethdev.c              | 13 ++++
>> lib/librte_ethdev/rte_ethdev.h              | 44 +++++++++++
>> lib/librte_ethdev/rte_ethdev_core.h         |  6 ++
>> lib/librte_ethdev/rte_ethdev_version.map    |  1 +
>> lib/librte_mbuf/rte_mbuf.h                  |  2 +
>> 15 files changed, 201 insertions(+), 5 deletions(-)
> 
> 
> I like this approach but would like to see the same API supported
> on multiple devices.
> 
> The current timestamp API is a mess because not all devices behave the
> same way. Trying to write an application that uses timestamping is therefore
> very difficult.

Another question is this an optional API for a PMD? I assume it is.

I know the API rte_eht_read_clock() is attempting to read the NIC for this timestamp, but if the PMD does not support this request can we just default to the rte_rdtsc() as a return value?

Regards,
Keith


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API
  2019-03-27 14:54   ` Wiles, Keith
  2019-03-27 14:54     ` Wiles, Keith
@ 2019-03-27 16:08     ` Tom Barbette
  2019-03-27 16:08       ` Tom Barbette
  1 sibling, 1 reply; 32+ messages in thread
From: Tom Barbette @ 2019-03-27 16:08 UTC (permalink / raw)
  To: Wiles, Keith, Stephen Hemminger
  Cc: dpdk-dev, Richardson, Bruce, Mcnamara, John, Thomas Monjalon,
	Yigit, Ferruh, Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

On 2019-03-27 15:54, Wiles, Keith wrote:
> I know the API rte_eht_read_clock() is attempting to read the NIC for this timestamp, but if the PMD does not support this request can we just default to the rte_rdtsc() as a return value?
I would not advise that, because the goal of the function is to have 
something that is from the same unit than the hardware timestamp given 
in the mbufs.

 > The current timestamp API is a mess because not all devices behave the
 > same way. Trying to write an application that uses timestamping is
 > therefore
 > very difficult.

This is different than timesync, no other devices implement hardware 
timestamping. For me, it's a different feature.

Cheers,
Tom

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API
  2019-03-27 16:08     ` Tom Barbette
@ 2019-03-27 16:08       ` Tom Barbette
  0 siblings, 0 replies; 32+ messages in thread
From: Tom Barbette @ 2019-03-27 16:08 UTC (permalink / raw)
  To: Wiles, Keith, Stephen Hemminger
  Cc: dpdk-dev, Richardson, Bruce, Mcnamara, John, Thomas Monjalon,
	Yigit, Ferruh, Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

On 2019-03-27 15:54, Wiles, Keith wrote:
> I know the API rte_eht_read_clock() is attempting to read the NIC for this timestamp, but if the PMD does not support this request can we just default to the rte_rdtsc() as a return value?
I would not advise that, because the goal of the function is to have 
something that is from the same unit than the hardware timestamp given 
in the mbufs.

 > The current timestamp API is a mess because not all devices behave the
 > same way. Trying to write an application that uses timestamping is
 > therefore
 > very difficult.

This is different than timesync, no other devices implement hardware 
timestamping. For me, it's a different feature.

Cheers,
Tom


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/3] rte_ethdev: Add API function to read dev clock
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 1/3] rte_ethdev: Add API function to read dev clock Tom Barbette
  2019-03-27  6:19   ` Tom Barbette
@ 2019-04-02 17:46   ` Ferruh Yigit
  2019-04-02 17:46     ` Ferruh Yigit
  2019-04-02 19:24     ` Tom Barbette
  1 sibling, 2 replies; 32+ messages in thread
From: Ferruh Yigit @ 2019-04-02 17:46 UTC (permalink / raw)
  To: Tom Barbette, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Olivier MATZ

On 3/27/2019 6:19 AM, Tom Barbette wrote:
> Add rte_eth_read_clock to read the raw clock of a devide.
> 
> The main use is to get the device clock conversion co-efficients to be
> able to translate the raw clock of the timestamp field of the pkt mbuf
> to a local synced time value.
> 
> This function was missing to allow users to convert the RX timestamp field
> to real time without the complexity of the rte_timesync* facility. One can
> derivate the clock frequency by calling twice read_clock and then keep a
> common time base.
> 
> Signed-off-by: Tom Barbette <barbette@kth.se>
> ---
>  doc/guides/nics/features.rst             |  1 +
>  lib/librte_ethdev/rte_ethdev.c           | 13 +++++++
>  lib/librte_ethdev/rte_ethdev.h           | 44 ++++++++++++++++++++++++
>  lib/librte_ethdev/rte_ethdev_core.h      |  6 ++++
>  lib/librte_ethdev/rte_ethdev_version.map |  1 +
>  lib/librte_mbuf/rte_mbuf.h               |  2 ++
>  6 files changed, 67 insertions(+)
> 
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index c5bf32222..1e6adfb0b 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -602,6 +602,7 @@ Supports Timestamp.
>  * **[provides] mbuf**: ``mbuf.ol_flags:PKT_RX_TIMESTAMP``.
>  * **[provides] mbuf**: ``mbuf.timestamp``.
>  * **[provides] rte_eth_dev_info**: ``rx_offload_capa,rx_queue_offload_capa: DEV_RX_OFFLOAD_TIMESTAMP``.
> +* **[implements] eth_dev_ops**: ``read_clock``.

This means for a PMD to claim 'timestamp' support, it should implement the
'read_clock' dev_ops, is it really the case?
Should we say 'related' instead of 'implements' ?

>  
>  .. _nic_features_macsec_offload:
>  
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 85c179496..7de0f35b5 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -4127,6 +4127,19 @@ rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *timestamp)
>  								timestamp));
>  }
>  
> +int
> +rte_eth_read_clock(uint16_t port_id, uint64_t *timestamp)
> +{
> +	struct rte_eth_dev *dev;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	dev = &rte_eth_devices[port_id];
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->read_clock, -ENOTSUP);
> +	return eth_err(port_id, (*dev->dev_ops->read_clock)(dev,
> +								timestamp));

Please fix the syntax.

> +}
> +
>  int
>  rte_eth_dev_get_reg_info(uint16_t port_id, struct rte_dev_reg_info *info)
>  {
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index a3c864a13..e254da8b1 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -3642,6 +3642,50 @@ int rte_eth_timesync_read_time(uint16_t port_id, struct timespec *time);
>   */
>  int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
>  
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Read the current clock counter of an Ethernet device
> + *
> + * This returns the current raw clock value of an Ethernet device.
> + * The value returned here is from the same clock than the one
> + * filling timestamp field of RX packets. Therefore it can be used
> + * to compute a precise conversion of the device clock to the real time.
> + *
> + * E.g, a simple heuristic to derivate the frequency would be:
> + * uint64_t start, end;
> + * rte_eth_read_clock(port, start);
> + * rte_delay_ms(100);
> + * rte_eth_read_clock(port, end);
> + * double freq = (end - start) * 10;
> + *
> + * Compute a common reference with:
> + * uint64_t base_time_sec = current_time();
> + * uint64_t base_clock;
> + * rte_eth_read_clock(port, base_clock);
> + *
> + * Then, convert the raw mbuf timestamp with:
> + * base_time_sec + (double)(mbuf->timestamp - base_clock) / freq;
> + *
> + * This simple example will not provide a very good accuracy. One must
> + * at least measure multiple times the frequency and do a regression.
> + * To avoid deviation from the system time, the common reference can
> + * be repeated from time to time. The integer division can also be
> + * converted by a multiplication and a shift for better performance.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param time
> + *   Pointer to the uint64_t that holds the raw clock value.
> + *
> + * @return
> + *   - 0: Success.
> + *   - -ENODEV: The port ID is invalid.
> + *   - -ENOTSUP: The function is not supported by the Ethernet driver.

Can PMD return a fail?

> + */
> +int __rte_experimental rte_eth_read_clock(uint16_t port_id, uint64_t *time);

s/time/timestamp to match implementation param name, (reminder of updating
doxygen when this updated)

And I can see both syntax in use currently, but I think specially with
__rte_experimental tag, it looks cleaner to me as following:

int __rte_experimental
rte_eth_read_clock(uint16_t port_id, uint64_t *timestamp);

> +
>  /**
>   * Config l2 tunnel ether type of an Ethernet device for filtering specific
>   * tunnel packets by ether type.
> diff --git a/lib/librte_ethdev/rte_ethdev_core.h b/lib/librte_ethdev/rte_ethdev_core.h
> index 8f03f83f6..86806b3eb 100644
> --- a/lib/librte_ethdev/rte_ethdev_core.h
> +++ b/lib/librte_ethdev/rte_ethdev_core.h
> @@ -322,6 +322,10 @@ typedef int (*eth_timesync_write_time)(struct rte_eth_dev *dev,
>  				       const struct timespec *timestamp);
>  /**< @internal Function used to get time from the device clock */
>  
> +typedef int (*eth_read_clock)(struct rte_eth_dev *dev,
> +				      uint64_t *timestamp);
> +/**< @internal Function used to get the current value of the device clock. */
> +
>  typedef int (*eth_get_reg_t)(struct rte_eth_dev *dev,
>  				struct rte_dev_reg_info *info);
>  /**< @internal Retrieve registers  */
> @@ -496,6 +500,8 @@ struct eth_dev_ops {
>  	eth_timesync_read_time     timesync_read_time; /** Get the device clock time. */
>  	eth_timesync_write_time    timesync_write_time; /** Set the device clock time. */
>  
> +	eth_read_clock             read_clock;
> +

Isn't this ABI breakage?

I guess it can be OK since we already increasing the ethdev ABIVER this version,
any comments?

>  	eth_xstats_get_by_id_t     xstats_get_by_id;
>  	/**< Get extended device statistic values by ID. */
>  	eth_xstats_get_names_by_id_t xstats_get_names_by_id;
> diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
> index 92ac3de25..12d6c3c1d 100644
> --- a/lib/librte_ethdev/rte_ethdev_version.map
> +++ b/lib/librte_ethdev/rte_ethdev_version.map
> @@ -249,6 +249,7 @@ EXPERIMENTAL {
>  	rte_eth_switch_domain_free;
>  	rte_flow_conv;
>  	rte_flow_expand_rss;
> +	rte_eth_read_clock;

Can you please put this alphabetically sorted?

>  	rte_mtr_capabilities_get;
>  	rte_mtr_create;
>  	rte_mtr_destroy;
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index d961ccaf6..dd55b942c 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -619,6 +619,8 @@ struct rte_mbuf {
>  
>  	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
>  	 * are not normalized but are always the same for a given port.
> +	 * Some devices allow to query rte_eth_read_clock that will return the
> +	 * current device timestamp.

cc'ed Olivier.

>  	 */
>  	uint64_t timestamp;
>  
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/3] rte_ethdev: Add API function to read dev clock
  2019-04-02 17:46   ` Ferruh Yigit
@ 2019-04-02 17:46     ` Ferruh Yigit
  2019-04-02 19:24     ` Tom Barbette
  1 sibling, 0 replies; 32+ messages in thread
From: Ferruh Yigit @ 2019-04-02 17:46 UTC (permalink / raw)
  To: Tom Barbette, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Olivier MATZ

On 3/27/2019 6:19 AM, Tom Barbette wrote:
> Add rte_eth_read_clock to read the raw clock of a devide.
> 
> The main use is to get the device clock conversion co-efficients to be
> able to translate the raw clock of the timestamp field of the pkt mbuf
> to a local synced time value.
> 
> This function was missing to allow users to convert the RX timestamp field
> to real time without the complexity of the rte_timesync* facility. One can
> derivate the clock frequency by calling twice read_clock and then keep a
> common time base.
> 
> Signed-off-by: Tom Barbette <barbette@kth.se>
> ---
>  doc/guides/nics/features.rst             |  1 +
>  lib/librte_ethdev/rte_ethdev.c           | 13 +++++++
>  lib/librte_ethdev/rte_ethdev.h           | 44 ++++++++++++++++++++++++
>  lib/librte_ethdev/rte_ethdev_core.h      |  6 ++++
>  lib/librte_ethdev/rte_ethdev_version.map |  1 +
>  lib/librte_mbuf/rte_mbuf.h               |  2 ++
>  6 files changed, 67 insertions(+)
> 
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index c5bf32222..1e6adfb0b 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -602,6 +602,7 @@ Supports Timestamp.
>  * **[provides] mbuf**: ``mbuf.ol_flags:PKT_RX_TIMESTAMP``.
>  * **[provides] mbuf**: ``mbuf.timestamp``.
>  * **[provides] rte_eth_dev_info**: ``rx_offload_capa,rx_queue_offload_capa: DEV_RX_OFFLOAD_TIMESTAMP``.
> +* **[implements] eth_dev_ops**: ``read_clock``.

This means for a PMD to claim 'timestamp' support, it should implement the
'read_clock' dev_ops, is it really the case?
Should we say 'related' instead of 'implements' ?

>  
>  .. _nic_features_macsec_offload:
>  
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 85c179496..7de0f35b5 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -4127,6 +4127,19 @@ rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *timestamp)
>  								timestamp));
>  }
>  
> +int
> +rte_eth_read_clock(uint16_t port_id, uint64_t *timestamp)
> +{
> +	struct rte_eth_dev *dev;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	dev = &rte_eth_devices[port_id];
> +
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->read_clock, -ENOTSUP);
> +	return eth_err(port_id, (*dev->dev_ops->read_clock)(dev,
> +								timestamp));

Please fix the syntax.

> +}
> +
>  int
>  rte_eth_dev_get_reg_info(uint16_t port_id, struct rte_dev_reg_info *info)
>  {
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index a3c864a13..e254da8b1 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -3642,6 +3642,50 @@ int rte_eth_timesync_read_time(uint16_t port_id, struct timespec *time);
>   */
>  int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
>  
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Read the current clock counter of an Ethernet device
> + *
> + * This returns the current raw clock value of an Ethernet device.
> + * The value returned here is from the same clock than the one
> + * filling timestamp field of RX packets. Therefore it can be used
> + * to compute a precise conversion of the device clock to the real time.
> + *
> + * E.g, a simple heuristic to derivate the frequency would be:
> + * uint64_t start, end;
> + * rte_eth_read_clock(port, start);
> + * rte_delay_ms(100);
> + * rte_eth_read_clock(port, end);
> + * double freq = (end - start) * 10;
> + *
> + * Compute a common reference with:
> + * uint64_t base_time_sec = current_time();
> + * uint64_t base_clock;
> + * rte_eth_read_clock(port, base_clock);
> + *
> + * Then, convert the raw mbuf timestamp with:
> + * base_time_sec + (double)(mbuf->timestamp - base_clock) / freq;
> + *
> + * This simple example will not provide a very good accuracy. One must
> + * at least measure multiple times the frequency and do a regression.
> + * To avoid deviation from the system time, the common reference can
> + * be repeated from time to time. The integer division can also be
> + * converted by a multiplication and a shift for better performance.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param time
> + *   Pointer to the uint64_t that holds the raw clock value.
> + *
> + * @return
> + *   - 0: Success.
> + *   - -ENODEV: The port ID is invalid.
> + *   - -ENOTSUP: The function is not supported by the Ethernet driver.

Can PMD return a fail?

> + */
> +int __rte_experimental rte_eth_read_clock(uint16_t port_id, uint64_t *time);

s/time/timestamp to match implementation param name, (reminder of updating
doxygen when this updated)

And I can see both syntax in use currently, but I think specially with
__rte_experimental tag, it looks cleaner to me as following:

int __rte_experimental
rte_eth_read_clock(uint16_t port_id, uint64_t *timestamp);

> +
>  /**
>   * Config l2 tunnel ether type of an Ethernet device for filtering specific
>   * tunnel packets by ether type.
> diff --git a/lib/librte_ethdev/rte_ethdev_core.h b/lib/librte_ethdev/rte_ethdev_core.h
> index 8f03f83f6..86806b3eb 100644
> --- a/lib/librte_ethdev/rte_ethdev_core.h
> +++ b/lib/librte_ethdev/rte_ethdev_core.h
> @@ -322,6 +322,10 @@ typedef int (*eth_timesync_write_time)(struct rte_eth_dev *dev,
>  				       const struct timespec *timestamp);
>  /**< @internal Function used to get time from the device clock */
>  
> +typedef int (*eth_read_clock)(struct rte_eth_dev *dev,
> +				      uint64_t *timestamp);
> +/**< @internal Function used to get the current value of the device clock. */
> +
>  typedef int (*eth_get_reg_t)(struct rte_eth_dev *dev,
>  				struct rte_dev_reg_info *info);
>  /**< @internal Retrieve registers  */
> @@ -496,6 +500,8 @@ struct eth_dev_ops {
>  	eth_timesync_read_time     timesync_read_time; /** Get the device clock time. */
>  	eth_timesync_write_time    timesync_write_time; /** Set the device clock time. */
>  
> +	eth_read_clock             read_clock;
> +

Isn't this ABI breakage?

I guess it can be OK since we already increasing the ethdev ABIVER this version,
any comments?

>  	eth_xstats_get_by_id_t     xstats_get_by_id;
>  	/**< Get extended device statistic values by ID. */
>  	eth_xstats_get_names_by_id_t xstats_get_names_by_id;
> diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
> index 92ac3de25..12d6c3c1d 100644
> --- a/lib/librte_ethdev/rte_ethdev_version.map
> +++ b/lib/librte_ethdev/rte_ethdev_version.map
> @@ -249,6 +249,7 @@ EXPERIMENTAL {
>  	rte_eth_switch_domain_free;
>  	rte_flow_conv;
>  	rte_flow_expand_rss;
> +	rte_eth_read_clock;

Can you please put this alphabetically sorted?

>  	rte_mtr_capabilities_get;
>  	rte_mtr_create;
>  	rte_mtr_destroy;
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index d961ccaf6..dd55b942c 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -619,6 +619,8 @@ struct rte_mbuf {
>  
>  	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
>  	 * are not normalized but are always the same for a given port.
> +	 * Some devices allow to query rte_eth_read_clock that will return the
> +	 * current device timestamp.

cc'ed Olivier.

>  	 */
>  	uint64_t timestamp;
>  
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock Tom Barbette
  2019-03-27  6:19   ` Tom Barbette
@ 2019-04-02 17:47   ` Ferruh Yigit
  2019-04-02 17:47     ` Ferruh Yigit
  2019-04-02 18:26   ` Ferruh Yigit
  2019-04-03  5:28   ` Shahaf Shuler
  3 siblings, 1 reply; 32+ messages in thread
From: Ferruh Yigit @ 2019-04-02 17:47 UTC (permalink / raw)
  To: Shahaf Shuler
  Cc: Tom Barbette, dev, bruce.richardson, john.mcnamara,
	Thomas Monjalon, Andrew Rybchenko, Yongseok Koh

On 3/27/2019 6:19 AM, Tom Barbette wrote:
> Signed-off-by: Tom Barbette <barbette@kth.se>


Hi Shahaf,

Reminder of this patch waiting for review.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock
  2019-04-02 17:47   ` Ferruh Yigit
@ 2019-04-02 17:47     ` Ferruh Yigit
  0 siblings, 0 replies; 32+ messages in thread
From: Ferruh Yigit @ 2019-04-02 17:47 UTC (permalink / raw)
  To: Shahaf Shuler
  Cc: Tom Barbette, dev, bruce.richardson, john.mcnamara,
	Thomas Monjalon, Andrew Rybchenko, Yongseok Koh

On 3/27/2019 6:19 AM, Tom Barbette wrote:
> Signed-off-by: Tom Barbette <barbette@kth.se>


Hi Shahaf,

Reminder of this patch waiting for review.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] rxtx_callbacks: Add support for HW timestamp
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 3/3] rxtx_callbacks: Add support for HW timestamp Tom Barbette
  2019-03-27  6:19   ` Tom Barbette
@ 2019-04-02 18:22   ` Ferruh Yigit
  2019-04-02 18:22     ` Ferruh Yigit
  2019-04-02 19:39     ` Tom Barbette
  1 sibling, 2 replies; 32+ messages in thread
From: Ferruh Yigit @ 2019-04-02 18:22 UTC (permalink / raw)
  To: Tom Barbette, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

On 3/27/2019 6:19 AM, Tom Barbette wrote:
> Use rxtx callback to demonstrate a way to use rte_eth_read_clock to
> convert the hardware timestamps to an amount of cycles.
> 
> This allows to get the amount of time the packet spent since its entry
> in the device. While the regular latency only shows the latency from
> when it entered the software stack.
> 
> Signed-off-by: Tom Barbette <barbette@kth.se>
<...>

> @@ -50,6 +50,8 @@ include $(RTE_SDK)/mk/rte.vars.mk
>  
>  CFLAGS += $(WERROR_FLAGS)
>  
> +
> +CFLAGS += -DALLOW_EXPERIMENTAL_API

Can you please add the experimental API as a comment.

>  # workaround for a gcc bug with noreturn attribute
>  # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
>  ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
> diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
> index 2058be627..7000464e1 100644
> --- a/examples/rxtx_callbacks/main.c
> +++ b/examples/rxtx_callbacks/main.c
> @@ -10,6 +10,8 @@
>  #include <rte_lcore.h>
>  #include <rte_mbuf.h>
>  
> +#include <getopt.h>

libc includes are already together above, can you move this above

> +
>  #define RX_RING_SIZE 1024
>  #define TX_RING_SIZE 1024
>  
> @@ -17,6 +19,9 @@
>  #define MBUF_CACHE_SIZE 250
>  #define BURST_SIZE 32
>  
> +static const char usage[] =
> +	"%s EAL_ARGS -- [-t]\n";
> +
>  static const struct rte_eth_conf port_conf_default = {
>  	.rxmode = {
>  		.max_rx_pkt_len = ETHER_MAX_LEN,
> @@ -25,9 +30,14 @@ static const struct rte_eth_conf port_conf_default = {
>  
>  static struct {
>  	uint64_t total_cycles;
> +	uint64_t total_queue_cycles;
>  	uint64_t total_pkts;
>  } latency_numbers;
>  
> +int hw_timestamping;
> +
> +#define TICKS_PER_CYCLE_SHIFT 16
> +uint64_t ticks_per_cycle_mult;

I am aware this is single .c file application, but as a good practice can you
please make above global variables static.

<...>

> @@ -95,9 +126,20 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
>  	if (retval != 0)
>  		return retval;
>  
> +	rxconf = dev_info.default_rxconf;
> +
> +	if (hw_timestamping) {
> +		if (!(dev_info.rx_offload_capa & DEV_RX_OFFLOAD_TIMESTAMP)) {
> +			printf("\nERROR: Port %u does not support hardware timestamping\n"
> +					, port);
> +			return -1;
> +		}
> +		rxconf.offloads |= DEV_RX_OFFLOAD_TIMESTAMP;
> +	}

Same comment as ethdev one, above code assume if driver announces
'DEV_RX_OFFLOAD_TIMESTAMP' capability, it have to implement 'read_clock'
dev_ops, should it be the case?

Write now only mlx implements it so this is not a problem at all, but I don't
know if all PMDs supports timestamping packets must implement 'read_clock'.

<...>

> @@ -6,6 +6,7 @@
>  # To build this example as a standalone application with an already-installed
>  # DPDK instance, use 'make'
>  
> +allow_experimental_apis = true

Can you please add the experimental API as a comment.

>  sources = files(
>  	'main.c'
>  )
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] rxtx_callbacks: Add support for HW timestamp
  2019-04-02 18:22   ` Ferruh Yigit
@ 2019-04-02 18:22     ` Ferruh Yigit
  2019-04-02 19:39     ` Tom Barbette
  1 sibling, 0 replies; 32+ messages in thread
From: Ferruh Yigit @ 2019-04-02 18:22 UTC (permalink / raw)
  To: Tom Barbette, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

On 3/27/2019 6:19 AM, Tom Barbette wrote:
> Use rxtx callback to demonstrate a way to use rte_eth_read_clock to
> convert the hardware timestamps to an amount of cycles.
> 
> This allows to get the amount of time the packet spent since its entry
> in the device. While the regular latency only shows the latency from
> when it entered the software stack.
> 
> Signed-off-by: Tom Barbette <barbette@kth.se>
<...>

> @@ -50,6 +50,8 @@ include $(RTE_SDK)/mk/rte.vars.mk
>  
>  CFLAGS += $(WERROR_FLAGS)
>  
> +
> +CFLAGS += -DALLOW_EXPERIMENTAL_API

Can you please add the experimental API as a comment.

>  # workaround for a gcc bug with noreturn attribute
>  # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
>  ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
> diff --git a/examples/rxtx_callbacks/main.c b/examples/rxtx_callbacks/main.c
> index 2058be627..7000464e1 100644
> --- a/examples/rxtx_callbacks/main.c
> +++ b/examples/rxtx_callbacks/main.c
> @@ -10,6 +10,8 @@
>  #include <rte_lcore.h>
>  #include <rte_mbuf.h>
>  
> +#include <getopt.h>

libc includes are already together above, can you move this above

> +
>  #define RX_RING_SIZE 1024
>  #define TX_RING_SIZE 1024
>  
> @@ -17,6 +19,9 @@
>  #define MBUF_CACHE_SIZE 250
>  #define BURST_SIZE 32
>  
> +static const char usage[] =
> +	"%s EAL_ARGS -- [-t]\n";
> +
>  static const struct rte_eth_conf port_conf_default = {
>  	.rxmode = {
>  		.max_rx_pkt_len = ETHER_MAX_LEN,
> @@ -25,9 +30,14 @@ static const struct rte_eth_conf port_conf_default = {
>  
>  static struct {
>  	uint64_t total_cycles;
> +	uint64_t total_queue_cycles;
>  	uint64_t total_pkts;
>  } latency_numbers;
>  
> +int hw_timestamping;
> +
> +#define TICKS_PER_CYCLE_SHIFT 16
> +uint64_t ticks_per_cycle_mult;

I am aware this is single .c file application, but as a good practice can you
please make above global variables static.

<...>

> @@ -95,9 +126,20 @@ port_init(uint16_t port, struct rte_mempool *mbuf_pool)
>  	if (retval != 0)
>  		return retval;
>  
> +	rxconf = dev_info.default_rxconf;
> +
> +	if (hw_timestamping) {
> +		if (!(dev_info.rx_offload_capa & DEV_RX_OFFLOAD_TIMESTAMP)) {
> +			printf("\nERROR: Port %u does not support hardware timestamping\n"
> +					, port);
> +			return -1;
> +		}
> +		rxconf.offloads |= DEV_RX_OFFLOAD_TIMESTAMP;
> +	}

Same comment as ethdev one, above code assume if driver announces
'DEV_RX_OFFLOAD_TIMESTAMP' capability, it have to implement 'read_clock'
dev_ops, should it be the case?

Write now only mlx implements it so this is not a problem at all, but I don't
know if all PMDs supports timestamping packets must implement 'read_clock'.

<...>

> @@ -6,6 +6,7 @@
>  # To build this example as a standalone application with an already-installed
>  # DPDK instance, use 'make'
>  
> +allow_experimental_apis = true

Can you please add the experimental API as a comment.

>  sources = files(
>  	'main.c'
>  )
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock Tom Barbette
  2019-03-27  6:19   ` Tom Barbette
  2019-04-02 17:47   ` Ferruh Yigit
@ 2019-04-02 18:26   ` Ferruh Yigit
  2019-04-02 18:26     ` Ferruh Yigit
  2019-04-02 19:05     ` Tom Barbette
  2019-04-03  5:28   ` Shahaf Shuler
  3 siblings, 2 replies; 32+ messages in thread
From: Ferruh Yigit @ 2019-04-02 18:26 UTC (permalink / raw)
  To: Tom Barbette, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

On 3/27/2019 6:19 AM, Tom Barbette wrote:
> +int
> +mlx5_read_clock(struct rte_eth_dev *dev, uint64_t *clock)
> +{
> + struct mlx5_priv *priv = dev->data->dev_private;
> + struct ibv_values_ex values;
> + int err = 0;
> +
> + values.comp_mask = IBV_VALUES_MASK_RAW_CLOCK;
> + err = mlx5_glue->query_rt_values_ex(priv->ctx, &values);

error: no member named 'ctx' in 'struct mlx5_priv'

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock
  2019-04-02 18:26   ` Ferruh Yigit
@ 2019-04-02 18:26     ` Ferruh Yigit
  2019-04-02 19:05     ` Tom Barbette
  1 sibling, 0 replies; 32+ messages in thread
From: Ferruh Yigit @ 2019-04-02 18:26 UTC (permalink / raw)
  To: Tom Barbette, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

On 3/27/2019 6:19 AM, Tom Barbette wrote:
> +int
> +mlx5_read_clock(struct rte_eth_dev *dev, uint64_t *clock)
> +{
> + struct mlx5_priv *priv = dev->data->dev_private;
> + struct ibv_values_ex values;
> + int err = 0;
> +
> + values.comp_mask = IBV_VALUES_MASK_RAW_CLOCK;
> + err = mlx5_glue->query_rt_values_ex(priv->ctx, &values);

error: no member named 'ctx' in 'struct mlx5_priv'

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock
  2019-04-02 18:26   ` Ferruh Yigit
  2019-04-02 18:26     ` Ferruh Yigit
@ 2019-04-02 19:05     ` Tom Barbette
  2019-04-02 19:05       ` Tom Barbette
  1 sibling, 1 reply; 32+ messages in thread
From: Tom Barbette @ 2019-04-02 19:05 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

Le 02/04/2019 à 20:26, Ferruh Yigit a écrit :
> 
> error: no member named 'ctx' in 'struct mlx5_priv'
> 

Yes indeed, master changed since my last rebase. This is fixed in v3 
(I'll address other comments in related patches before sending v3).

I already addressed Shahaf's comments for this patch in prior discussions.

Thanks,

Tom

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock
  2019-04-02 19:05     ` Tom Barbette
@ 2019-04-02 19:05       ` Tom Barbette
  0 siblings, 0 replies; 32+ messages in thread
From: Tom Barbette @ 2019-04-02 19:05 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

Le 02/04/2019 à 20:26, Ferruh Yigit a écrit :
> 
> error: no member named 'ctx' in 'struct mlx5_priv'
> 

Yes indeed, master changed since my last rebase. This is fixed in v3 
(I'll address other comments in related patches before sending v3).

I already addressed Shahaf's comments for this patch in prior discussions.

Thanks,

Tom

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/3] rte_ethdev: Add API function to read dev clock
  2019-04-02 17:46   ` Ferruh Yigit
  2019-04-02 17:46     ` Ferruh Yigit
@ 2019-04-02 19:24     ` Tom Barbette
  2019-04-02 19:24       ` Tom Barbette
  1 sibling, 1 reply; 32+ messages in thread
From: Tom Barbette @ 2019-04-02 19:24 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Olivier MATZ

Le 02/04/2019 à 19:46, Ferruh Yigit a écrit :

>> +* **[implements] eth_dev_ops**: ``read_clock``.
> 
> This means for a PMD to claim 'timestamp' support, it should implement the
> 'read_clock' dev_ops, is it really the case?
> Should we say 'related' instead of 'implements' ?

Ok for me. I guess this will depend on how another device vendor would 
implement the feature.

>> +	return eth_err(port_id, (*dev->dev_ops->read_clock)(dev,
>> +								timestamp)); > Please fix the syntax.

Just one more tab for the second line, right? Multiple functions just 
before have the same spacing. When in Rome...

> Can PMD return a fail?

MLX5's implementation may return any errno value, which are not defined 
in the ibv API it is using. Not sure how I should address that in the 
comment?

Thanks for the review! I directly addressed other comments in v3.

Tom

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/3] rte_ethdev: Add API function to read dev clock
  2019-04-02 19:24     ` Tom Barbette
@ 2019-04-02 19:24       ` Tom Barbette
  0 siblings, 0 replies; 32+ messages in thread
From: Tom Barbette @ 2019-04-02 19:24 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh, Olivier MATZ

Le 02/04/2019 à 19:46, Ferruh Yigit a écrit :

>> +* **[implements] eth_dev_ops**: ``read_clock``.
> 
> This means for a PMD to claim 'timestamp' support, it should implement the
> 'read_clock' dev_ops, is it really the case?
> Should we say 'related' instead of 'implements' ?

Ok for me. I guess this will depend on how another device vendor would 
implement the feature.

>> +	return eth_err(port_id, (*dev->dev_ops->read_clock)(dev,
>> +								timestamp)); > Please fix the syntax.

Just one more tab for the second line, right? Multiple functions just 
before have the same spacing. When in Rome...

> Can PMD return a fail?

MLX5's implementation may return any errno value, which are not defined 
in the ibv API it is using. Not sure how I should address that in the 
comment?

Thanks for the review! I directly addressed other comments in v3.

Tom

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] rxtx_callbacks: Add support for HW timestamp
  2019-04-02 18:22   ` Ferruh Yigit
  2019-04-02 18:22     ` Ferruh Yigit
@ 2019-04-02 19:39     ` Tom Barbette
  2019-04-02 19:39       ` Tom Barbette
  1 sibling, 1 reply; 32+ messages in thread
From: Tom Barbette @ 2019-04-02 19:39 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

Le 02/04/2019 à 20:22, Ferruh Yigit a écrit :
> Same comment as ethdev one, above code assume if driver announces
> 'DEV_RX_OFFLOAD_TIMESTAMP' capability, it have to implement 'read_clock'
> dev_ops, should it be the case?
> 
> Write now only mlx implements it so this is not a problem at all, but I don't
> know if all PMDs supports timestamping packets must implement 'read_clock'.

I changed the init code to fail if rte_eth_read_clock did not work 
during the initialization of the base clock.

Other comments are applied.

Thanks!

Tom

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/3] rxtx_callbacks: Add support for HW timestamp
  2019-04-02 19:39     ` Tom Barbette
@ 2019-04-02 19:39       ` Tom Barbette
  0 siblings, 0 replies; 32+ messages in thread
From: Tom Barbette @ 2019-04-02 19:39 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon,
	Andrew Rybchenko, Shahaf Shuler, Yongseok Koh

Le 02/04/2019 à 20:22, Ferruh Yigit a écrit :
> Same comment as ethdev one, above code assume if driver announces
> 'DEV_RX_OFFLOAD_TIMESTAMP' capability, it have to implement 'read_clock'
> dev_ops, should it be the case?
> 
> Write now only mlx implements it so this is not a problem at all, but I don't
> know if all PMDs supports timestamping packets must implement 'read_clock'.

I changed the init code to fail if rte_eth_read_clock did not work 
during the initialization of the base clock.

Other comments are applied.

Thanks!

Tom

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock
  2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock Tom Barbette
                     ` (2 preceding siblings ...)
  2019-04-02 18:26   ` Ferruh Yigit
@ 2019-04-03  5:28   ` Shahaf Shuler
  2019-04-03  5:28     ` Shahaf Shuler
  3 siblings, 1 reply; 32+ messages in thread
From: Shahaf Shuler @ 2019-04-03  5:28 UTC (permalink / raw)
  To: Tom Barbette, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Yongseok Koh

Wednesday, March 27, 2019 8:20 AM, Tom Barbette:
> Subject: [PATCH v2 2/3] mlx5: Implement support for read_clock
> 
> Signed-off-by: Tom Barbette <barbette@kth.se>

Apart from the compilation issue I am OK w/ this patch.

You can add my acked-by on v3.

> ---
>  drivers/net/mlx5/mlx5.c        |  1 +
>  drivers/net/mlx5/mlx5.h        |  1 +
>  drivers/net/mlx5/mlx5_ethdev.c | 29 +++++++++++++++++++++++++++++
>  drivers/net/mlx5/mlx5_glue.c   |  8 ++++++++
>  drivers/net/mlx5/mlx5_glue.h   |  2 ++
>  5 files changed, 41 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> ae4b71695..7091ff4bb 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -376,6 +376,7 @@ const struct eth_dev_ops mlx5_dev_ops = {
>  	.xstats_get_names = mlx5_xstats_get_names,
>  	.fw_version_get = mlx5_fw_version_get,
>  	.dev_infos_get = mlx5_dev_infos_get,
> +	.read_clock = mlx5_read_clock,
>  	.dev_supported_ptypes_get = mlx5_dev_supported_ptypes_get,
>  	.vlan_filter_set = mlx5_vlan_filter_set,
>  	.rx_queue_setup = mlx5_rx_queue_setup, diff --git
> a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> 538445367..88394f391 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -275,6 +275,7 @@ int mlx5_set_flags(struct rte_eth_dev *dev, unsigned
> int keep,
>  		   unsigned int flags);
>  int mlx5_dev_configure(struct rte_eth_dev *dev);  void
> mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info
> *info);
> +int mlx5_read_clock(struct rte_eth_dev *dev, uint64_t *time);
>  int mlx5_fw_version_get(struct rte_eth_dev *dev, char *fw_ver, size_t
> fw_size);  const uint32_t *mlx5_dev_supported_ptypes_get(struct
> rte_eth_dev *dev);  int mlx5_link_update(struct rte_eth_dev *dev, int
> wait_to_complete); diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> b/drivers/net/mlx5/mlx5_ethdev.c index f84f7cf69..52262ee44 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -557,6 +557,35 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev,
> struct rte_eth_dev_info *info)
>  	}
>  }
> 
> +/**
> + * Get device current raw clock counter
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param[out] time
> + *   Current raw clock counter of the device.
> + *
> + * @return
> + *   0 if the clock has correctly been read
> + *   The value of errno in case of error
> + */
> +int
> +mlx5_read_clock(struct rte_eth_dev *dev, uint64_t *clock) {
> +	struct mlx5_priv *priv = dev->data->dev_private;
> +	struct ibv_values_ex values;
> +	int err = 0;
> +
> +	values.comp_mask = IBV_VALUES_MASK_RAW_CLOCK;
> +	err = mlx5_glue->query_rt_values_ex(priv->ctx, &values);
> +	if (err != 0) {
> +		DRV_LOG(WARNING, "Could not query the clock !");
> +		return err;
> +	}
> +	*clock = values.raw_clock.tv_nsec;
> +	return 0;
> +}
> +
>  /**
>   * Get firmware version of a device.
>   *
> diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
> index c817d86c5..c1786c5e9 100644
> --- a/drivers/net/mlx5/mlx5_glue.c
> +++ b/drivers/net/mlx5/mlx5_glue.c
> @@ -86,6 +86,13 @@ mlx5_glue_query_device_ex(struct ibv_context
> *context,
>  	return ibv_query_device_ex(context, input, attr);  }
> 
> +static int
> +mlx5_glue_query_rt_values_ex(struct ibv_context *context,
> +			  struct ibv_values_ex *values)
> +{
> +	return ibv_query_rt_values_ex(context, values); }
> +
>  static int
>  mlx5_glue_query_port(struct ibv_context *context, uint8_t port_num,
>  		     struct ibv_port_attr *port_attr) @@ -603,6 +610,7 @@
> const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
>  	.close_device = mlx5_glue_close_device,
>  	.query_device = mlx5_glue_query_device,
>  	.query_device_ex = mlx5_glue_query_device_ex,
> +	.query_rt_values_ex = mlx5_glue_query_rt_values_ex,
>  	.query_port = mlx5_glue_query_port,
>  	.create_comp_channel = mlx5_glue_create_comp_channel,
>  	.destroy_comp_channel = mlx5_glue_destroy_comp_channel, diff --
> git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h index
> b11896062..fd651cc03 100644
> --- a/drivers/net/mlx5/mlx5_glue.h
> +++ b/drivers/net/mlx5/mlx5_glue.h
> @@ -74,6 +74,8 @@ struct mlx5_glue {
>  	int (*query_device_ex)(struct ibv_context *context,
>  			       const struct ibv_query_device_ex_input *input,
>  			       struct ibv_device_attr_ex *attr);
> +	int (*query_rt_values_ex)(struct ibv_context *context,
> +			       struct ibv_values_ex *values);
>  	int (*query_port)(struct ibv_context *context, uint8_t port_num,
>  			  struct ibv_port_attr *port_attr);
>  	struct ibv_comp_channel *(*create_comp_channel)
> --
> 2.17.1

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock
  2019-04-03  5:28   ` Shahaf Shuler
@ 2019-04-03  5:28     ` Shahaf Shuler
  0 siblings, 0 replies; 32+ messages in thread
From: Shahaf Shuler @ 2019-04-03  5:28 UTC (permalink / raw)
  To: Tom Barbette, dev
  Cc: bruce.richardson, john.mcnamara, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Yongseok Koh

Wednesday, March 27, 2019 8:20 AM, Tom Barbette:
> Subject: [PATCH v2 2/3] mlx5: Implement support for read_clock
> 
> Signed-off-by: Tom Barbette <barbette@kth.se>

Apart from the compilation issue I am OK w/ this patch.

You can add my acked-by on v3.

> ---
>  drivers/net/mlx5/mlx5.c        |  1 +
>  drivers/net/mlx5/mlx5.h        |  1 +
>  drivers/net/mlx5/mlx5_ethdev.c | 29 +++++++++++++++++++++++++++++
>  drivers/net/mlx5/mlx5_glue.c   |  8 ++++++++
>  drivers/net/mlx5/mlx5_glue.h   |  2 ++
>  5 files changed, 41 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> ae4b71695..7091ff4bb 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -376,6 +376,7 @@ const struct eth_dev_ops mlx5_dev_ops = {
>  	.xstats_get_names = mlx5_xstats_get_names,
>  	.fw_version_get = mlx5_fw_version_get,
>  	.dev_infos_get = mlx5_dev_infos_get,
> +	.read_clock = mlx5_read_clock,
>  	.dev_supported_ptypes_get = mlx5_dev_supported_ptypes_get,
>  	.vlan_filter_set = mlx5_vlan_filter_set,
>  	.rx_queue_setup = mlx5_rx_queue_setup, diff --git
> a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> 538445367..88394f391 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -275,6 +275,7 @@ int mlx5_set_flags(struct rte_eth_dev *dev, unsigned
> int keep,
>  		   unsigned int flags);
>  int mlx5_dev_configure(struct rte_eth_dev *dev);  void
> mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info
> *info);
> +int mlx5_read_clock(struct rte_eth_dev *dev, uint64_t *time);
>  int mlx5_fw_version_get(struct rte_eth_dev *dev, char *fw_ver, size_t
> fw_size);  const uint32_t *mlx5_dev_supported_ptypes_get(struct
> rte_eth_dev *dev);  int mlx5_link_update(struct rte_eth_dev *dev, int
> wait_to_complete); diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> b/drivers/net/mlx5/mlx5_ethdev.c index f84f7cf69..52262ee44 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -557,6 +557,35 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev,
> struct rte_eth_dev_info *info)
>  	}
>  }
> 
> +/**
> + * Get device current raw clock counter
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param[out] time
> + *   Current raw clock counter of the device.
> + *
> + * @return
> + *   0 if the clock has correctly been read
> + *   The value of errno in case of error
> + */
> +int
> +mlx5_read_clock(struct rte_eth_dev *dev, uint64_t *clock) {
> +	struct mlx5_priv *priv = dev->data->dev_private;
> +	struct ibv_values_ex values;
> +	int err = 0;
> +
> +	values.comp_mask = IBV_VALUES_MASK_RAW_CLOCK;
> +	err = mlx5_glue->query_rt_values_ex(priv->ctx, &values);
> +	if (err != 0) {
> +		DRV_LOG(WARNING, "Could not query the clock !");
> +		return err;
> +	}
> +	*clock = values.raw_clock.tv_nsec;
> +	return 0;
> +}
> +
>  /**
>   * Get firmware version of a device.
>   *
> diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
> index c817d86c5..c1786c5e9 100644
> --- a/drivers/net/mlx5/mlx5_glue.c
> +++ b/drivers/net/mlx5/mlx5_glue.c
> @@ -86,6 +86,13 @@ mlx5_glue_query_device_ex(struct ibv_context
> *context,
>  	return ibv_query_device_ex(context, input, attr);  }
> 
> +static int
> +mlx5_glue_query_rt_values_ex(struct ibv_context *context,
> +			  struct ibv_values_ex *values)
> +{
> +	return ibv_query_rt_values_ex(context, values); }
> +
>  static int
>  mlx5_glue_query_port(struct ibv_context *context, uint8_t port_num,
>  		     struct ibv_port_attr *port_attr) @@ -603,6 +610,7 @@
> const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
>  	.close_device = mlx5_glue_close_device,
>  	.query_device = mlx5_glue_query_device,
>  	.query_device_ex = mlx5_glue_query_device_ex,
> +	.query_rt_values_ex = mlx5_glue_query_rt_values_ex,
>  	.query_port = mlx5_glue_query_port,
>  	.create_comp_channel = mlx5_glue_create_comp_channel,
>  	.destroy_comp_channel = mlx5_glue_destroy_comp_channel, diff --
> git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h index
> b11896062..fd651cc03 100644
> --- a/drivers/net/mlx5/mlx5_glue.h
> +++ b/drivers/net/mlx5/mlx5_glue.h
> @@ -74,6 +74,8 @@ struct mlx5_glue {
>  	int (*query_device_ex)(struct ibv_context *context,
>  			       const struct ibv_query_device_ex_input *input,
>  			       struct ibv_device_attr_ex *attr);
> +	int (*query_rt_values_ex)(struct ibv_context *context,
> +			       struct ibv_values_ex *values);
>  	int (*query_port)(struct ibv_context *context, uint8_t port_num,
>  			  struct ibv_port_attr *port_attr);
>  	struct ibv_comp_channel *(*create_comp_channel)
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2019-04-03  5:28 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-27  6:19 [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Tom Barbette
2019-03-27  6:19 ` Tom Barbette
2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 1/3] rte_ethdev: Add API function to read dev clock Tom Barbette
2019-03-27  6:19   ` Tom Barbette
2019-04-02 17:46   ` Ferruh Yigit
2019-04-02 17:46     ` Ferruh Yigit
2019-04-02 19:24     ` Tom Barbette
2019-04-02 19:24       ` Tom Barbette
2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 2/3] mlx5: Implement support for read_clock Tom Barbette
2019-03-27  6:19   ` Tom Barbette
2019-04-02 17:47   ` Ferruh Yigit
2019-04-02 17:47     ` Ferruh Yigit
2019-04-02 18:26   ` Ferruh Yigit
2019-04-02 18:26     ` Ferruh Yigit
2019-04-02 19:05     ` Tom Barbette
2019-04-02 19:05       ` Tom Barbette
2019-04-03  5:28   ` Shahaf Shuler
2019-04-03  5:28     ` Shahaf Shuler
2019-03-27  6:19 ` [dpdk-dev] [PATCH v2 3/3] rxtx_callbacks: Add support for HW timestamp Tom Barbette
2019-03-27  6:19   ` Tom Barbette
2019-04-02 18:22   ` Ferruh Yigit
2019-04-02 18:22     ` Ferruh Yigit
2019-04-02 19:39     ` Tom Barbette
2019-04-02 19:39       ` Tom Barbette
2019-03-27 14:41 ` [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API Stephen Hemminger
2019-03-27 14:41   ` Stephen Hemminger
2019-03-27 14:48   ` Thomas Monjalon
2019-03-27 14:48     ` Thomas Monjalon
2019-03-27 14:54   ` Wiles, Keith
2019-03-27 14:54     ` Wiles, Keith
2019-03-27 16:08     ` Tom Barbette
2019-03-27 16:08       ` Tom Barbette

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).