DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [RFC PATCH 0/3] mlx5 to PCAP capture with hardware timestamps
@ 2020-06-25 19:01 Vivien Didelot
  2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 1/3] net/mlx5: add timestamp-to-ns converter from libibverbs Vivien Didelot
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Vivien Didelot @ 2020-06-25 19:01 UTC (permalink / raw)
  To: dev; +Cc: Vivien Didelot, Olivier Matz, Ferruh Yigit, Patrick Keroulas

This series allows to capture packets from a Mellanox ConnectX-5 interface
into a PCAP file with support for hardware timestamps. Since libibverbs
already provides timestamp conversion to nanoseconds for mlx5, the necessary
glue is added before writing the driver's timestamps into the PCAP headers.

Patrick Keroulas (2):
  net/mlx5: add timestamp-to-ns converter from libibverbs
  ethdev: add API to convert raw timestamps to nsec

Vivien Didelot (1):
  net/pcap: support hardware Tx timestamps

 doc/guides/rel_notes/release_20_08.rst   |  1 +
 drivers/common/mlx5/linux/mlx5_glue.c    | 16 +++++++++++++
 drivers/common/mlx5/linux/mlx5_glue.h    |  4 ++++
 drivers/net/mlx5/linux/mlx5_ethdev_os.c  | 30 ++++++++++++++++++++++++
 drivers/net/mlx5/linux/mlx5_os.c         |  1 +
 drivers/net/mlx5/mlx5.h                  |  1 +
 drivers/net/pcap/rte_eth_pcap.c          | 30 ++++++++++++++----------
 lib/librte_ethdev/rte_ethdev.c           | 12 ++++++++++
 lib/librte_ethdev/rte_ethdev.h           | 17 ++++++++++++++
 lib/librte_ethdev/rte_ethdev_core.h      |  5 ++++
 lib/librte_ethdev/rte_ethdev_version.map |  2 ++
 lib/librte_mbuf/rte_mbuf_core.h          |  3 ++-
 lib/librte_pdump/rte_pdump.c             | 14 ++++++++++-
 13 files changed, 121 insertions(+), 15 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-dev] [RFC PATCH 1/3] net/mlx5: add timestamp-to-ns converter from libibverbs
  2020-06-25 19:01 [dpdk-dev] [RFC PATCH 0/3] mlx5 to PCAP capture with hardware timestamps Vivien Didelot
@ 2020-06-25 19:01 ` Vivien Didelot
  2020-06-26  6:41   ` Olivier Matz
  2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 2/3] ethdev: add API to convert raw timestamps to nsec Vivien Didelot
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Vivien Didelot @ 2020-06-25 19:01 UTC (permalink / raw)
  To: dev; +Cc: Patrick Keroulas, Olivier Matz, Ferruh Yigit

From: Patrick Keroulas <patrick.keroulas@radio-canada.ca>

While some devices update their own clock info to provide current time,
mlx5dv part of libibverbs already handles this and also converts any
raw counter cycle to nanoseconds.

Signed-off-by: Patrick Keroulas <patrick.keroulas@radio-canada.ca>
---
 drivers/common/mlx5/linux/mlx5_glue.c   | 16 +++++++++++++
 drivers/common/mlx5/linux/mlx5_glue.h   |  4 ++++
 drivers/net/mlx5/linux/mlx5_ethdev_os.c | 30 +++++++++++++++++++++++++
 drivers/net/mlx5/linux/mlx5_os.c        |  1 +
 drivers/net/mlx5/mlx5.h                 |  1 +
 5 files changed, 52 insertions(+)

diff --git a/drivers/common/mlx5/linux/mlx5_glue.c b/drivers/common/mlx5/linux/mlx5_glue.c
index c91ee33bb..cac24015b 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.c
+++ b/drivers/common/mlx5/linux/mlx5_glue.c
@@ -80,6 +80,20 @@ mlx5_glue_query_rt_values_ex(struct ibv_context *context,
 	return ibv_query_rt_values_ex(context, values);
 }
 
+static int
+mlx5_glue_get_clock_info(struct ibv_context *context,
+			  struct mlx5dv_clock_info *clock_info)
+{
+	return mlx5dv_get_clock_info(context, clock_info);
+}
+
+static uint64_t
+mlx5_glue_mlx5dv_ts_to_ns(struct mlx5dv_clock_info *clock_info,
+			  uint64_t device_timestamp)
+{
+	return mlx5dv_ts_to_ns(clock_info, device_timestamp);
+}
+
 static int
 mlx5_glue_query_port(struct ibv_context *context, uint8_t port_num,
 		     struct ibv_port_attr *port_attr)
@@ -1207,6 +1221,8 @@ const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue) {
 	.query_device = mlx5_glue_query_device,
 	.query_device_ex = mlx5_glue_query_device_ex,
 	.query_rt_values_ex = mlx5_glue_query_rt_values_ex,
+	.get_clock_info = mlx5_glue_get_clock_info,
+	.convert_ts_to_ns = mlx5_glue_mlx5dv_ts_to_ns,
 	.query_port = mlx5_glue_query_port,
 	.create_comp_channel = mlx5_glue_create_comp_channel,
 	.destroy_comp_channel = mlx5_glue_destroy_comp_channel,
diff --git a/drivers/common/mlx5/linux/mlx5_glue.h b/drivers/common/mlx5/linux/mlx5_glue.h
index 5d238a40a..8d05e7398 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.h
+++ b/drivers/common/mlx5/linux/mlx5_glue.h
@@ -123,6 +123,10 @@ struct mlx5_glue {
 			  struct ibv_port_attr *port_attr);
 	struct ibv_comp_channel *(*create_comp_channel)
 		(struct ibv_context *context);
+	int (*get_clock_info)(struct ibv_context *context,
+			       struct mlx5dv_clock_info *clock_info);
+	uint64_t (*convert_ts_to_ns)(struct mlx5dv_clock_info *clock_info,
+			  uint64_t device_timestamp);
 	int (*destroy_comp_channel)(struct ibv_comp_channel *channel);
 	struct ibv_cq *(*create_cq)(struct ibv_context *context, int cqe,
 				    void *cq_context,
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index ab47cb531..86bec111b 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -355,6 +355,36 @@ mlx5_set_flags(struct rte_eth_dev *dev, unsigned int keep, unsigned int flags)
 	return mlx5_ifreq(dev, SIOCSIFFLAGS, &request);
 }
 
+/**
+ * Convert raw clock counter to nanoseconds
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param[in&out] timestamp
+ *   Pointer to the timestamp to be converted.
+ *
+ * @return
+ *   0 if the clock has correctly been read
+ *   The value of errno in case of error
+ */
+int
+mlx5_convert_ts_to_ns(struct rte_eth_dev *dev, uint64_t *timestamp)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct ibv_context *ctx = priv->sh->ctx;
+	struct mlx5dv_clock_info clock_info;
+
+	int err = mlx5_glue->get_clock_info(ctx, &clock_info);
+	if (err != 0) {
+		DRV_LOG(WARNING, "Could not get the clock info!");
+		return err;
+	}
+
+	*timestamp = mlx5_glue->convert_ts_to_ns(&clock_info, *timestamp);
+
+	return err;
+}
+
 /**
  * Get device current raw clock counter
  *
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 3792371c3..914e58f52 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -2408,6 +2408,7 @@ const struct eth_dev_ops mlx5_os_dev_sec_ops = {
 	.xstats_get_names = mlx5_xstats_get_names,
 	.fw_version_get = mlx5_fw_version_get,
 	.dev_infos_get = mlx5_dev_infos_get,
+	.convert_ts_to_ns = mlx5_convert_ts_to_ns,
 	.rx_descriptor_status = mlx5_rx_descriptor_status,
 	.tx_descriptor_status = mlx5_tx_descriptor_status,
 	.rxq_info_get = mlx5_rxq_info_get,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 5bd5acd9d..dc469f7d9 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -738,6 +738,7 @@ int mlx5_get_mtu(struct rte_eth_dev *dev, uint16_t *mtu);
 int mlx5_set_flags(struct rte_eth_dev *dev, unsigned int keep,
 		   unsigned int flags);
 int mlx5_set_mtu(struct rte_eth_dev *dev, uint16_t mtu);
+int mlx5_convert_ts_to_ns(struct rte_eth_dev *dev, uint64_t *timestamp);
 int mlx5_read_clock(struct rte_eth_dev *dev, uint64_t *clock);
 int mlx5_link_update(struct rte_eth_dev *dev, int wait_to_complete);
 int mlx5_force_link_status_change(struct rte_eth_dev *dev, int status);
-- 
2.27.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-dev] [RFC PATCH 2/3] ethdev: add API to convert raw timestamps to nsec
  2020-06-25 19:01 [dpdk-dev] [RFC PATCH 0/3] mlx5 to PCAP capture with hardware timestamps Vivien Didelot
  2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 1/3] net/mlx5: add timestamp-to-ns converter from libibverbs Vivien Didelot
@ 2020-06-25 19:01 ` Vivien Didelot
  2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 3/3] net/pcap: support hardware Tx timestamps Vivien Didelot
  2020-07-08 14:34 ` [dpdk-dev] [RFC PATCH 0/3] mlx5 to PCAP capture with hardwaretimestamps Morten Brørup
  3 siblings, 0 replies; 12+ messages in thread
From: Vivien Didelot @ 2020-06-25 19:01 UTC (permalink / raw)
  To: dev; +Cc: Patrick Keroulas, Olivier Matz, Ferruh Yigit

From: Patrick Keroulas <patrick.keroulas@radio-canada.ca>

Existing ethdev functions can read/write time from/to device but
they're all related to timesync and none of them can translate a raw
counter in real time unit which is useful in a pdump application.

A new API is required because the conversion is derived from dev
clock info.

Signed-off-by: Patrick Keroulas <patrick.keroulas@radio-canada.ca>
---
 lib/librte_ethdev/rte_ethdev.c           | 12 ++++++++++++
 lib/librte_ethdev/rte_ethdev.h           | 17 +++++++++++++++++
 lib/librte_ethdev/rte_ethdev_core.h      |  5 +++++
 lib/librte_ethdev/rte_ethdev_version.map |  2 ++
 lib/librte_mbuf/rte_mbuf_core.h          |  3 ++-
 lib/librte_pdump/rte_pdump.c             | 14 +++++++++++++-
 6 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 8e10a6fc3..822fa6d5a 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -4810,6 +4810,18 @@ rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *timestamp)
 								timestamp));
 }
 
+int
+rte_eth_convert_ts_to_ns(uint16_t port_id, uint64_t *timestamp)
+{
+	struct rte_eth_dev *dev;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->convert_ts_to_ns, -ENOTSUP);
+	return eth_err(port_id, (*dev->dev_ops->convert_ts_to_ns)(dev, timestamp));
+}
+
 int
 rte_eth_read_clock(uint16_t port_id, uint64_t *clock)
 {
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index a49242bcd..2d4d0bc7d 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -4103,6 +4103,23 @@ int rte_eth_timesync_read_time(uint16_t port_id, struct timespec *time);
  */
 int rte_eth_timesync_write_time(uint16_t port_id, const struct timespec *time);
 
+/**
+ * Convert a raw clock counter to nanoseconds from device clock
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param[in&out] timestamp
+ *   Pointer to the timestamp to be converted.
+ *
+ * @return
+ *   - 0: Success.
+ *   - -ENODEV: The port ID is invalid.
+ *   - -ENOTSUP: The function is not supported by the Ethernet driver.
+ */
+__rte_experimental
+int
+rte_eth_convert_ts_to_ns(uint16_t port_id, uint64_t *timestamp);
+
 /**
  * @warning
  * @b EXPERIMENTAL: this API may change without prior notice.
diff --git a/lib/librte_ethdev/rte_ethdev_core.h b/lib/librte_ethdev/rte_ethdev_core.h
index 32407dd41..255b41b67 100644
--- a/lib/librte_ethdev/rte_ethdev_core.h
+++ b/lib/librte_ethdev/rte_ethdev_core.h
@@ -464,6 +464,10 @@ typedef int (*eth_timesync_write_time)(struct rte_eth_dev *dev,
 				       const struct timespec *timestamp);
 /**< @internal Function used to get time from the device clock */
 
+typedef int (*eth_convert_ts_to_ns)(struct rte_eth_dev *dev,
+				      uint64_t *timestamp);
+/**< @internal Function used to convert timestamp from device clock */
+
 typedef int (*eth_read_clock)(struct rte_eth_dev *dev,
 				      uint64_t *timestamp);
 /**< @internal Function used to get the current value of the device clock. */
@@ -730,6 +734,7 @@ struct eth_dev_ops {
 	eth_timesync_read_time     timesync_read_time; /** Get the device clock time. */
 	eth_timesync_write_time    timesync_write_time; /** Set the device clock time. */
 
+	eth_convert_ts_to_ns       convert_ts_to_ns;
 	eth_read_clock             read_clock;
 
 	eth_xstats_get_by_id_t     xstats_get_by_id;
diff --git a/lib/librte_ethdev/rte_ethdev_version.map b/lib/librte_ethdev/rte_ethdev_version.map
index 715505604..754c05630 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -241,4 +241,6 @@ EXPERIMENTAL {
 	__rte_ethdev_trace_rx_burst;
 	__rte_ethdev_trace_tx_burst;
 	rte_flow_get_aged_flows;
+
+	rte_eth_convert_ts_to_ns;
 };
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 16600f171..470616fb2 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -595,7 +595,8 @@ struct rte_mbuf {
 	/** Valid if PKT_RX_TIMESTAMP is set. The unit and time reference
 	 * are not normalized but are always the same for a given port.
 	 * Some devices allow to query rte_eth_read_clock that will return the
-	 * current device timestamp.
+	 * current device timestamp or rte_eth_ts_to_ns that will convert raw
+	 * counter to nanoseconds.
 	 */
 	uint64_t timestamp;
 
diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
index f96709f95..03d9ba484 100644
--- a/lib/librte_pdump/rte_pdump.c
+++ b/lib/librte_pdump/rte_pdump.c
@@ -100,12 +100,24 @@ pdump_copy(struct rte_mbuf **pkts, uint16_t nb_pkts, void *user_params)
 	}
 }
 
+static inline void
+pdump_ts_to_ns(uint16_t port_id, struct rte_mbuf **pkts, uint16_t nb_pkts)
+{
+	unsigned int i;
+
+	for (i = 0; i < nb_pkts; i++) {
+		if (pkts[i]->ol_flags & PKT_RX_TIMESTAMP)
+			rte_eth_convert_ts_to_ns(port_id, &pkts[i]->timestamp);
+	}
+}
+
 static uint16_t
-pdump_rx(uint16_t port __rte_unused, uint16_t qidx __rte_unused,
+pdump_rx(uint16_t port_id, uint16_t qidx __rte_unused,
 	struct rte_mbuf **pkts, uint16_t nb_pkts,
 	uint16_t max_pkts __rte_unused,
 	void *user_params)
 {
+	pdump_ts_to_ns(port_id, pkts, nb_pkts);
 	pdump_copy(pkts, nb_pkts, user_params);
 	return nb_pkts;
 }
-- 
2.27.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-dev] [RFC PATCH 3/3] net/pcap: support hardware Tx timestamps
  2020-06-25 19:01 [dpdk-dev] [RFC PATCH 0/3] mlx5 to PCAP capture with hardware timestamps Vivien Didelot
  2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 1/3] net/mlx5: add timestamp-to-ns converter from libibverbs Vivien Didelot
  2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 2/3] ethdev: add API to convert raw timestamps to nsec Vivien Didelot
@ 2020-06-25 19:01 ` Vivien Didelot
  2020-06-26  6:48   ` Olivier Matz
  2020-06-30 14:59   ` Stephen Hemminger
  2020-07-08 14:34 ` [dpdk-dev] [RFC PATCH 0/3] mlx5 to PCAP capture with hardwaretimestamps Morten Brørup
  3 siblings, 2 replies; 12+ messages in thread
From: Vivien Didelot @ 2020-06-25 19:01 UTC (permalink / raw)
  To: dev; +Cc: Vivien Didelot, Olivier Matz, Ferruh Yigit, Patrick Keroulas

When hardware timestamping is enabled on Rx path, system time should
no longer be used to calculate the timestamps when dumping packets.

Instead, use the value stored by the driver in mbuf->timestamp
and assume it is already converted to nanoseconds (otherwise the
application may edit the packet headers itself afterwards).

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Patrick Keroulas <patrick.keroulas@radio-canada.ca>
---
 doc/guides/rel_notes/release_20_08.rst |  1 +
 drivers/net/pcap/rte_eth_pcap.c        | 30 +++++++++++++++-----------
 2 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
index a67015519..cd1ca987f 100644
--- a/doc/guides/rel_notes/release_20_08.rst
+++ b/doc/guides/rel_notes/release_20_08.rst
@@ -61,6 +61,7 @@ New Features
   Updated PCAP driver with new features and improvements, including:
 
   * Support software Tx nanosecond timestamps precision.
+  * Support hardware Tx timestamps.
 
 * **Updated Mellanox mlx5 driver.**
 
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 13a3d0ac7..3d80b699b 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -290,19 +290,23 @@ eth_null_rx(void *queue __rte_unused,
 #define NSEC_PER_SEC	1000000000L
 
 static inline void
-calculate_timestamp(struct timeval *ts) {
-	uint64_t cycles;
-	struct timeval cur_time;
+calculate_timestamp(const struct rte_mbuf *mbuf, struct timeval *ts) {
+	if (mbuf->ol_flags & PKT_RX_TIMESTAMP) {
+		ts->tv_sec = mbuf->timestamp / NSEC_PER_SEC;
+		ts->tv_usec = mbuf->timestamp % NSEC_PER_SEC;
+	} else {
+		uint64_t cycles = rte_get_timer_cycles() - start_cycles;
+		struct timeval cur_time = {
+			.tv_sec = cycles / hz,
+			.tv_usec = (cycles % hz) * NSEC_PER_SEC / hz,
+		};
 
-	cycles = rte_get_timer_cycles() - start_cycles;
-	cur_time.tv_sec = cycles / hz;
-	cur_time.tv_usec = (cycles % hz) * NSEC_PER_SEC / hz;
-
-	ts->tv_sec = start_time.tv_sec + cur_time.tv_sec;
-	ts->tv_usec = start_time.tv_usec + cur_time.tv_usec;
-	if (ts->tv_usec >= NSEC_PER_SEC) {
-		ts->tv_usec -= NSEC_PER_SEC;
-		ts->tv_sec += 1;
+		ts->tv_sec = start_time.tv_sec + cur_time.tv_sec;
+		ts->tv_usec = start_time.tv_usec + cur_time.tv_usec;
+		if (ts->tv_usec >= NSEC_PER_SEC) {
+			ts->tv_usec -= NSEC_PER_SEC;
+			ts->tv_sec += 1;
+		}
 	}
 }
 
@@ -339,7 +343,7 @@ eth_pcap_tx_dumper(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 			caplen = sizeof(temp_data);
 		}
 
-		calculate_timestamp(&header.ts);
+		calculate_timestamp(mbuf, &header.ts);
 		header.len = len;
 		header.caplen = caplen;
 		/* rte_pktmbuf_read() returns a pointer to the data directly
-- 
2.27.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/3] net/mlx5: add timestamp-to-ns converter from libibverbs
  2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 1/3] net/mlx5: add timestamp-to-ns converter from libibverbs Vivien Didelot
@ 2020-06-26  6:41   ` Olivier Matz
  2020-07-07 15:23     ` PATRICK KEROULAS
  0 siblings, 1 reply; 12+ messages in thread
From: Olivier Matz @ 2020-06-26  6:41 UTC (permalink / raw)
  To: Vivien Didelot; +Cc: dev, Patrick Keroulas, Ferruh Yigit

Hi Vivien,

On Thu, Jun 25, 2020 at 03:01:17PM -0400, Vivien Didelot wrote:
> From: Patrick Keroulas <patrick.keroulas@radio-canada.ca>
> 
> While some devices update their own clock info to provide current time,
> mlx5dv part of libibverbs already handles this and also converts any
> raw counter cycle to nanoseconds.
> 
> Signed-off-by: Patrick Keroulas <patrick.keroulas@radio-canada.ca>
> ---
>  drivers/common/mlx5/linux/mlx5_glue.c   | 16 +++++++++++++
>  drivers/common/mlx5/linux/mlx5_glue.h   |  4 ++++
>  drivers/net/mlx5/linux/mlx5_ethdev_os.c | 30 +++++++++++++++++++++++++
>  drivers/net/mlx5/linux/mlx5_os.c        |  1 +
>  drivers/net/mlx5/mlx5.h                 |  1 +
>  5 files changed, 52 insertions(+)
> 
> diff --git a/drivers/common/mlx5/linux/mlx5_glue.c b/drivers/common/mlx5/linux/mlx5_glue.c
> index c91ee33bb..cac24015b 100644
> --- a/drivers/common/mlx5/linux/mlx5_glue.c
> +++ b/drivers/common/mlx5/linux/mlx5_glue.c
> @@ -80,6 +80,20 @@ mlx5_glue_query_rt_values_ex(struct ibv_context *context,
>  	return ibv_query_rt_values_ex(context, values);
>  }
>  
> +static int
> +mlx5_glue_get_clock_info(struct ibv_context *context,
> +			  struct mlx5dv_clock_info *clock_info)
> +{
> +	return mlx5dv_get_clock_info(context, clock_info);
> +}
> +
> +static uint64_t
> +mlx5_glue_mlx5dv_ts_to_ns(struct mlx5dv_clock_info *clock_info,
> +			  uint64_t device_timestamp)
> +{
> +	return mlx5dv_ts_to_ns(clock_info, device_timestamp);
> +}
> +
>  static int
>  mlx5_glue_query_port(struct ibv_context *context, uint8_t port_num,
>  		     struct ibv_port_attr *port_attr)
> @@ -1207,6 +1221,8 @@ const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue) {
>  	.query_device = mlx5_glue_query_device,
>  	.query_device_ex = mlx5_glue_query_device_ex,
>  	.query_rt_values_ex = mlx5_glue_query_rt_values_ex,
> +	.get_clock_info = mlx5_glue_get_clock_info,
> +	.convert_ts_to_ns = mlx5_glue_mlx5dv_ts_to_ns,
>  	.query_port = mlx5_glue_query_port,
>  	.create_comp_channel = mlx5_glue_create_comp_channel,
>  	.destroy_comp_channel = mlx5_glue_destroy_comp_channel,
> diff --git a/drivers/common/mlx5/linux/mlx5_glue.h b/drivers/common/mlx5/linux/mlx5_glue.h
> index 5d238a40a..8d05e7398 100644
> --- a/drivers/common/mlx5/linux/mlx5_glue.h
> +++ b/drivers/common/mlx5/linux/mlx5_glue.h
> @@ -123,6 +123,10 @@ struct mlx5_glue {
>  			  struct ibv_port_attr *port_attr);
>  	struct ibv_comp_channel *(*create_comp_channel)
>  		(struct ibv_context *context);
> +	int (*get_clock_info)(struct ibv_context *context,
> +			       struct mlx5dv_clock_info *clock_info);
> +	uint64_t (*convert_ts_to_ns)(struct mlx5dv_clock_info *clock_info,
> +			  uint64_t device_timestamp);
>  	int (*destroy_comp_channel)(struct ibv_comp_channel *channel);
>  	struct ibv_cq *(*create_cq)(struct ibv_context *context, int cqe,
>  				    void *cq_context,
> diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
> index ab47cb531..86bec111b 100644
> --- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
> +++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
> @@ -355,6 +355,36 @@ mlx5_set_flags(struct rte_eth_dev *dev, unsigned int keep, unsigned int flags)
>  	return mlx5_ifreq(dev, SIOCSIFFLAGS, &request);
>  }
>  
> +/**
> + * Convert raw clock counter to nanoseconds
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param[in&out] timestamp
> + *   Pointer to the timestamp to be converted.
> + *
> + * @return
> + *   0 if the clock has correctly been read
> + *   The value of errno in case of error
> + */
> +int
> +mlx5_convert_ts_to_ns(struct rte_eth_dev *dev, uint64_t *timestamp)
> +{
> +	struct mlx5_priv *priv = dev->data->dev_private;
> +	struct ibv_context *ctx = priv->sh->ctx;
> +	struct mlx5dv_clock_info clock_info;
> +
> +	int err = mlx5_glue->get_clock_info(ctx, &clock_info);
> +	if (err != 0) {
> +		DRV_LOG(WARNING, "Could not get the clock info!");
> +		return err;
> +	}
> +
> +	*timestamp = mlx5_glue->convert_ts_to_ns(&clock_info, *timestamp);
> +
> +	return err;
> +}

Should mlx5dv_get_clock_info() be called for each convesion? In the man
page [1], we can read this:

       If the clock_info becomes too old then time conversion will return
       wrong conversion results.  The user must ensure that
       mlx5dv_get_clock_info(3) is called at least once every
       max_clock_info_update_nsec as returned by the mlx5dv_query_device(3)
       function.

I understand this as mlx5dv_get_clock_info() being slower than
mlx5dv_ts_to_ns(), and should not be called for each packet.

I suppose that when rdma-core library does a conversion to nanosec, it
takes in account the fact that system and device are not synchronized
(are they?).

[1] https://man7.org/linux/man-pages/man3/mlx5dv_get_clock_info.3.html

Olivier

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 3/3] net/pcap: support hardware Tx timestamps
  2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 3/3] net/pcap: support hardware Tx timestamps Vivien Didelot
@ 2020-06-26  6:48   ` Olivier Matz
  2020-07-06 18:36     ` PATRICK KEROULAS
  2020-06-30 14:59   ` Stephen Hemminger
  1 sibling, 1 reply; 12+ messages in thread
From: Olivier Matz @ 2020-06-26  6:48 UTC (permalink / raw)
  To: Vivien Didelot; +Cc: dev, Ferruh Yigit, Patrick Keroulas

On Thu, Jun 25, 2020 at 03:01:19PM -0400, Vivien Didelot wrote:
> When hardware timestamping is enabled on Rx path, system time should
> no longer be used to calculate the timestamps when dumping packets.
> 
> Instead, use the value stored by the driver in mbuf->timestamp
> and assume it is already converted to nanoseconds (otherwise the
> application may edit the packet headers itself afterwards).
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
> Signed-off-by: Patrick Keroulas <patrick.keroulas@radio-canada.ca>
> ---
>  doc/guides/rel_notes/release_20_08.rst |  1 +
>  drivers/net/pcap/rte_eth_pcap.c        | 30 +++++++++++++++-----------
>  2 files changed, 18 insertions(+), 13 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
> index a67015519..cd1ca987f 100644
> --- a/doc/guides/rel_notes/release_20_08.rst
> +++ b/doc/guides/rel_notes/release_20_08.rst
> @@ -61,6 +61,7 @@ New Features
>    Updated PCAP driver with new features and improvements, including:
>  
>    * Support software Tx nanosecond timestamps precision.
> +  * Support hardware Tx timestamps.
>  
>  * **Updated Mellanox mlx5 driver.**
>  
> diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
> index 13a3d0ac7..3d80b699b 100644
> --- a/drivers/net/pcap/rte_eth_pcap.c
> +++ b/drivers/net/pcap/rte_eth_pcap.c
> @@ -290,19 +290,23 @@ eth_null_rx(void *queue __rte_unused,
>  #define NSEC_PER_SEC	1000000000L
>  
>  static inline void
> -calculate_timestamp(struct timeval *ts) {
> -	uint64_t cycles;
> -	struct timeval cur_time;
> +calculate_timestamp(const struct rte_mbuf *mbuf, struct timeval *ts) {
> +	if (mbuf->ol_flags & PKT_RX_TIMESTAMP) {
> +		ts->tv_sec = mbuf->timestamp / NSEC_PER_SEC;
> +		ts->tv_usec = mbuf->timestamp % NSEC_PER_SEC;
> +	} else {
> +		uint64_t cycles = rte_get_timer_cycles() - start_cycles;
> +		struct timeval cur_time = {
> +			.tv_sec = cycles / hz,
> +			.tv_usec = (cycles % hz) * NSEC_PER_SEC / hz,
> +		};
>  
> -	cycles = rte_get_timer_cycles() - start_cycles;
> -	cur_time.tv_sec = cycles / hz;
> -	cur_time.tv_usec = (cycles % hz) * NSEC_PER_SEC / hz;
> -
> -	ts->tv_sec = start_time.tv_sec + cur_time.tv_sec;
> -	ts->tv_usec = start_time.tv_usec + cur_time.tv_usec;
> -	if (ts->tv_usec >= NSEC_PER_SEC) {
> -		ts->tv_usec -= NSEC_PER_SEC;
> -		ts->tv_sec += 1;
> +		ts->tv_sec = start_time.tv_sec + cur_time.tv_sec;
> +		ts->tv_usec = start_time.tv_usec + cur_time.tv_usec;
> +		if (ts->tv_usec >= NSEC_PER_SEC) {
> +			ts->tv_usec -= NSEC_PER_SEC;
> +			ts->tv_sec += 1;
> +		}
>  	}
>  }
>  

I don't get why you expect that timestamp to be in nanoseconds.
The conversion is done in librte_pdump (in the previous patch),
but it won't work if the library is not used, right?

Out of curiosity, can you explain your motivation for using the hardware
timestamp? Is it faster? More accurate? (knowing it timestamps the Rx
operation, not the Tx)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 3/3] net/pcap: support hardware Tx timestamps
  2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 3/3] net/pcap: support hardware Tx timestamps Vivien Didelot
  2020-06-26  6:48   ` Olivier Matz
@ 2020-06-30 14:59   ` Stephen Hemminger
  1 sibling, 0 replies; 12+ messages in thread
From: Stephen Hemminger @ 2020-06-30 14:59 UTC (permalink / raw)
  To: Vivien Didelot; +Cc: dev, Olivier Matz, Ferruh Yigit, Patrick Keroulas

On Thu, 25 Jun 2020 15:01:19 -0400
Vivien Didelot <vivien.didelot@gmail.com> wrote:

> +		struct timeval cur_time = {
> +			.tv_sec = cycles / hz,
> +			.tv_usec = (cycles % hz) * NSEC_PER_SEC / hz,

This is hot path, use rte_reciprocal_divide rather than slow divide instruction.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 3/3] net/pcap: support hardware Tx timestamps
  2020-06-26  6:48   ` Olivier Matz
@ 2020-07-06 18:36     ` PATRICK KEROULAS
  2020-07-07 14:47       ` Olivier Matz
  0 siblings, 1 reply; 12+ messages in thread
From: PATRICK KEROULAS @ 2020-07-06 18:36 UTC (permalink / raw)
  To: Olivier Matz; +Cc: Vivien Didelot, dev, Ferruh Yigit

On Fri, Jun 26, 2020 at 2:48 AM Olivier Matz <olivier.matz@6wind.com> wrote:
> I don't get why you expect that timestamp to be in nanoseconds.
> The conversion is done in librte_pdump (in the previous patch),
> but it won't work if the library is not used, right?
>

You're right Olivier. I took advantage of the definition of mbuf->timestamp
as being "not normalized". The proposed conversion needs NIC clock info
which can't be accessed from the secondary process. Do you have a better
suggestion? Should I set a user flag in mbuf for nanoseconds?

> Out of curiosity, can you explain your motivation for using the hardware
> timestamp? Is it faster? More accurate? (knowing it timestamps the Rx
> operation, not the Tx)

Accuracy is our requirement in the broadcast industry (and probably
in finance as well) where core systems are very time sensitive. Our
application uses the NIC mostly as a receiver for UDP stream monitoring
in order to measure the network propagation delay and jitter. Using SW Rx
timestamps completely breaks this requirement.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 3/3] net/pcap: support hardware Tx timestamps
  2020-07-06 18:36     ` PATRICK KEROULAS
@ 2020-07-07 14:47       ` Olivier Matz
  2020-07-10 19:23         ` PATRICK KEROULAS
  0 siblings, 1 reply; 12+ messages in thread
From: Olivier Matz @ 2020-07-07 14:47 UTC (permalink / raw)
  To: PATRICK KEROULAS; +Cc: Vivien Didelot, dev, Ferruh Yigit, Slava Ovsiienko

Hi Patrick,

On Mon, Jul 06, 2020 at 02:36:30PM -0400, PATRICK KEROULAS wrote:
> On Fri, Jun 26, 2020 at 2:48 AM Olivier Matz <olivier.matz@6wind.com> wrote:
> > I don't get why you expect that timestamp to be in nanoseconds.
> > The conversion is done in librte_pdump (in the previous patch),
> > but it won't work if the library is not used, right?
> >
> 
> You're right Olivier. I took advantage of the definition of mbuf->timestamp
> as being "not normalized". The proposed conversion needs NIC clock info
> which can't be accessed from the secondary process. Do you have a better
> suggestion? Should I set a user flag in mbuf for nanoseconds?
>
> > Out of curiosity, can you explain your motivation for using the hardware
> > timestamp? Is it faster? More accurate? (knowing it timestamps the Rx
> > operation, not the Tx)
> 
> Accuracy is our requirement in the broadcast industry (and probably
> in finance as well) where core systems are very time sensitive. Our
> application uses the NIC mostly as a receiver for UDP stream monitoring
> in order to measure the network propagation delay and jitter. Using SW Rx
> timestamps completely breaks this requirement.

OK, your main motivation for hardware timestamping is the accuracy, and
your application logs the Rx timestamp into the pcap when using the PMD.

For the pmd pcap part, the dynamic Tx timestamp flag that is being
introduced by Slava [1] may fit your needs: ie. when the flag is set, it
uses the provided timestamp which should be in nanosecs.

[1] https://patchwork.dpdk.org/patch/73427/

For the conversion from hardware Rx timestamp into Tx timestamp (in
nanosec), could it be done by your application? Early in the Rx path, if
a packet has a Rx timestamp flag, do the conversion to nsecs, and set
the timestamp field and the Tx timestamp flag.

Regards,
Olivier

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/3] net/mlx5: add timestamp-to-ns converter from libibverbs
  2020-06-26  6:41   ` Olivier Matz
@ 2020-07-07 15:23     ` PATRICK KEROULAS
  0 siblings, 0 replies; 12+ messages in thread
From: PATRICK KEROULAS @ 2020-07-07 15:23 UTC (permalink / raw)
  To: Olivier Matz; +Cc: Vivien Didelot, dev, Ferruh Yigit

On Fri, Jun 26, 2020 at 2:41 AM Olivier Matz <olivier.matz@6wind.com> wrote:
> Should mlx5dv_get_clock_info() be called for each convesion? In the man
> page [1], we can read this:
>
>        If the clock_info becomes too old then time conversion will return
>        wrong conversion results.  The user must ensure that
>        mlx5dv_get_clock_info(3) is called at least once every
>        max_clock_info_update_nsec as returned by the mlx5dv_query_device(3)
>        function.
>
> I understand this as mlx5dv_get_clock_info() being slower than
> mlx5dv_ts_to_ns(), and should not be called for each packet.

Right. I plan to use rte_timer (like in drivers/net/ena/) to query clock info
periodically.

>
> I suppose that when rdma-core library does a conversion to nanosec, it
> takes in account the fact that system and device are not synchronized
> (are they?).

PTP daemon like linuxptp is in charge of syncing both device and system
clocks with external master.

Thanks.
PK

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 0/3] mlx5 to PCAP capture with hardwaretimestamps
  2020-06-25 19:01 [dpdk-dev] [RFC PATCH 0/3] mlx5 to PCAP capture with hardware timestamps Vivien Didelot
                   ` (2 preceding siblings ...)
  2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 3/3] net/pcap: support hardware Tx timestamps Vivien Didelot
@ 2020-07-08 14:34 ` Morten Brørup
  3 siblings, 0 replies; 12+ messages in thread
From: Morten Brørup @ 2020-07-08 14:34 UTC (permalink / raw)
  To: Vivien Didelot, dev; +Cc: Olivier Matz, Ferruh Yigit, Patrick Keroulas

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vivien Didelot
> Sent: Thursday, June 25, 2020 9:01 PM
> 
> This series allows to capture packets from a Mellanox ConnectX-5
> interface
> into a PCAP file with support for hardware timestamps. Since libibverbs
> already provides timestamp conversion to nanoseconds for mlx5, the
> necessary
> glue is added before writing the driver's timestamps into the PCAP
> headers.

Which clock source is the nanosecond timestamps output from this function using? CLOCK_REALTIME, the NIC clock, the CPU clock, or something else?

If it is the NIC clock, I guess a simple scaling factor could be used instead, for performance reasons. The driver could expose the scaling factor (or for even better resolution: its clock frequency), e.g. through the capabilities API, so the application can cache it and convert to nanoseconds by multiplying a scaling factor and adding an offset. The application would have to handle clock drift between the NIC clock and CLOCK_REALTIME (or some other preferred reference clock source) anyway.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 3/3] net/pcap: support hardware Tx timestamps
  2020-07-07 14:47       ` Olivier Matz
@ 2020-07-10 19:23         ` PATRICK KEROULAS
  0 siblings, 0 replies; 12+ messages in thread
From: PATRICK KEROULAS @ 2020-07-10 19:23 UTC (permalink / raw)
  To: Olivier Matz; +Cc: Vivien Didelot, dev, Ferruh Yigit, Slava Ovsiienko

Hello Olivier and Slava,

On Tue, Jul 7, 2020 at 10:47 AM Olivier Matz <olivier.matz@6wind.com> wrote:
> For the pmd pcap part, the dynamic Tx timestamp flag that is being
> introduced by Slava [1] may fit your needs: ie. when the flag is set, it
> uses the provided timestamp which should be in nanosecs.
>
> [1] https://patchwork.dpdk.org/patch/73427/
>
> For the conversion from hardware Rx timestamp into Tx timestamp (in
> nanosec), could it be done by your application? Early in the Rx path, if
> a packet has a Rx timestamp flag, do the conversion to nsecs, and set
> the timestamp field and the Tx timestamp flag.

Just to be clear, we use testpmd and dpdk-pdump rather than developing
our own app. The application I mentioned earlier is a pcap analyzer and
has nothing to do with dpdk.

Could you help me identify a location where to set this dyn Tx field and flag?
My current Rx data path includes drivers/net/mlx5/mlx5_rxtx_vec_sse.h
Is it a good candidate?

For the unit conversion, I finally followed the recommendations of using
start_time & clock_freq rather than libibverbs. The time accuracy looks
similarly good, based on pcap analysis.
The conversion is performed in rte_pdump, is that acceptable?

Thanks,

PK

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-07-10 19:24 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-25 19:01 [dpdk-dev] [RFC PATCH 0/3] mlx5 to PCAP capture with hardware timestamps Vivien Didelot
2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 1/3] net/mlx5: add timestamp-to-ns converter from libibverbs Vivien Didelot
2020-06-26  6:41   ` Olivier Matz
2020-07-07 15:23     ` PATRICK KEROULAS
2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 2/3] ethdev: add API to convert raw timestamps to nsec Vivien Didelot
2020-06-25 19:01 ` [dpdk-dev] [RFC PATCH 3/3] net/pcap: support hardware Tx timestamps Vivien Didelot
2020-06-26  6:48   ` Olivier Matz
2020-07-06 18:36     ` PATRICK KEROULAS
2020-07-07 14:47       ` Olivier Matz
2020-07-10 19:23         ` PATRICK KEROULAS
2020-06-30 14:59   ` Stephen Hemminger
2020-07-08 14:34 ` [dpdk-dev] [RFC PATCH 0/3] mlx5 to PCAP capture with hardwaretimestamps Morten Brørup

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).