DPDK patches and discussions
* [PATCH 1/1] doc: add mlx5 xstats send scheduling counters description
@ 2024-10-28 14:27 Viacheslav Ovsiienko
  2024-10-28 15:57 ` Stephen Hemminger
  0 siblings, 1 reply; 2+ messages in thread
From: Viacheslav Ovsiienko @ 2024-10-28 14:27 UTC
  To: dev; +Cc: rasland, matan, suanmingm

The mlx5 PMD provides the send-on-time scheduling capability.
To check the operating status of this feature, xstats counters
are provided. This patch adds the counter descriptions and
explains how to interpret the counter values at runtime.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst | 48 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index f82e2d75de..8d1a1311d4 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -2655,3 +2655,51 @@ Destroy GENEVE TLV parser for specific port::
 
 This command doesn't destroy the global list,
 For releasing options, ``flush`` command should be used.
+
+
+Extended statistics counters
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Send scheduling related xstats counters
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The mlx5 PMD provides a set of ``tx_pp`` counters for debugging and diagnosing send packet
+scheduling. These counters apply only if the port was probed with the ``tx_pp`` devarg,
+and they reflect the status of the PMD scheduling infrastructure based on Clock and Rearm Queues.
+This infrastructure provides the send scheduling capability on ConnectX-6 Dx NICs as a temporary
+workaround and should not be engaged on newer hardware.
+
+- ``tx_pp_missed_interrupt_errors`` - the Rearm Queue interrupt was not serviced in time. EAL handles
+  interrupts in a dedicated thread, which may have been busy with other time-consuming actions.
+
+- ``tx_pp_rearm_queue_errors`` - hardware errors occurred on the Rearm Queue, usually caused by
+  interrupts not being serviced in time.
+
+- ``tx_pp_clock_queue_errors`` - hardware errors occurred on the Clock Queue, usually indicating
+  configuration, internal NIC hardware, or firmware issues.
+
+- ``tx_pp_timestamp_past_errors`` - the application tried to send packet(s) with a timestamp in the past.
+  This counter helps to check and debug application code; it does not indicate a PMD malfunction.
+
+- ``tx_pp_timestamp_future_errors`` - the application tried to send packet(s) with a timestamp
+  too far in the future, beyond the hardware scheduling capabilities.
+  This counter helps to check and debug application code; it does not indicate a PMD malfunction.
+
+- ``tx_pp_jitter`` - the estimated jitter of the internal NIC real-time clock between two
+  neighbouring Clock Queue completions, in nanoseconds. Significant jitter may indicate clock
+  synchronization issues (for example, a system PTP agent adjusting the NIC clock inappropriately).
+
+- ``tx_pp_wander`` - the long-term stability of the internal NIC real-time clock: the wander
+  over 2^24 completions, in nanoseconds. Significant wander may indicate clock synchronization issues.
+
+- ``tx_pp_sync_lost`` - the general operating indicator. A non-zero value means the driver lost
+  Clock Queue synchronization and scheduling does not operate correctly. The port must be restarted
+  to restore correct scheduling.
+
+The following counter is useful for checking and debugging application code. It does not
+indicate driver or hardware malfunctions and also applies to newer hardware (with direct
+on-time scheduling capabilities - ConnectX-7 and above):
+
+- ``tx_pp_timestamp_order_errors`` - the application tried to send packet(s) with timestamps not in
+  strictly ascending order. Because the PMD does not reorder packets in the hardware queues, a
+  timestamp order violation causes packets to be sent at the wrong moments.
-- 
2.34.1
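
For readers who want to act on these counters programmatically, the following is a minimal sketch of how the descriptions above could be turned into a health check. It is not part of the patch. The `xstat_sample` struct, the `tx_pp_health` enum, and `tx_pp_classify()` are hypothetical names introduced for illustration; in a real application the name/value pairs would be filled from `rte_eth_xstats_get_names()` and `rte_eth_xstats_get()`. The sketch assumes the error counters are zero in a healthy run and deliberately ignores `tx_pp_jitter` and `tx_pp_wander`, since the text gives no threshold for "significant".

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name/value pair; a real application would fill these
 * from rte_eth_xstats_get_names() and rte_eth_xstats_get(). */
struct xstat_sample {
	const char *name;
	uint64_t value;
};

/* Hypothetical severity levels, ordered from least to most severe. */
enum tx_pp_health {
	TX_PP_OK,          /* no error counter is non-zero */
	TX_PP_APP_ISSUE,   /* only application timestamp errors seen */
	TX_PP_INFRA_ISSUE, /* interrupt, Rearm Queue or Clock Queue errors */
	TX_PP_SYNC_LOST,   /* scheduling is broken, port restart required */
};

/* Return 1 if name matches one of the n strings in set. */
static int
name_in(const char *name, const char *const *set, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(name, set[i]) == 0)
			return 1;
	return 0;
}

/* Map a set of xstats samples to the most severe condition found.
 * Jitter and wander are measurements rather than error counts and
 * are not classified here. */
static enum tx_pp_health
tx_pp_classify(const struct xstat_sample *s, size_t n)
{
	static const char *const infra[] = {
		"tx_pp_missed_interrupt_errors",
		"tx_pp_rearm_queue_errors",
		"tx_pp_clock_queue_errors",
	};
	static const char *const app[] = {
		"tx_pp_timestamp_past_errors",
		"tx_pp_timestamp_future_errors",
		"tx_pp_timestamp_order_errors",
	};
	enum tx_pp_health h = TX_PP_OK;

	for (size_t i = 0; i < n; i++) {
		if (s[i].value == 0)
			continue;
		if (strcmp(s[i].name, "tx_pp_sync_lost") == 0)
			return TX_PP_SYNC_LOST;
		if (name_in(s[i].name, infra, 3))
			h = TX_PP_INFRA_ISSUE;
		else if (h == TX_PP_OK && name_in(s[i].name, app, 3))
			h = TX_PP_APP_ISSUE;
	}
	return h;
}
```

A monitoring loop could call `tx_pp_classify()` periodically and restart the port only on `TX_PP_SYNC_LOST`, treating `TX_PP_APP_ISSUE` as a hint to review the application's timestamp generation.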


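Since `tx_pp_timestamp_order_errors` points at application behaviour, an application could validate a burst before handing it to the PMD. The sketch below is an assumption-laden illustration, not the PMD's own check: `first_order_violation()` is a hypothetical helper, it assumes every packet in the burst carries a scheduling timestamp, and it takes "strictly ascending" literally, so equal timestamps are reported as violations.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scan the scheduling timestamps of a burst, in
 * send order, and return the index of the first timestamp that is not
 * strictly greater than its predecessor. Returns n if the whole burst
 * is strictly ascending (including empty and single-packet bursts). */
static size_t
first_order_violation(const uint64_t *ts, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (ts[i] <= ts[i - 1])
			return i;
	return n;
}
```

Running such a check in a debug build before the Tx burst call would let the application locate the offending packet directly, instead of inferring it from a counter increment afterwards.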

* Re: [PATCH 1/1] doc: add mlx5 xstats send scheduling counters description
  2024-10-28 14:27 [PATCH 1/1] doc: add mlx5 xstats send scheduling counters description Viacheslav Ovsiienko
@ 2024-10-28 15:57 ` Stephen Hemminger
  0 siblings, 0 replies; 2+ messages in thread
From: Stephen Hemminger @ 2024-10-28 15:57 UTC
  To: Viacheslav Ovsiienko; +Cc: dev, rasland, matan, suanmingm

On Mon, 28 Oct 2024 16:27:41 +0200
Viacheslav Ovsiienko <viacheslavo@nvidia.com> wrote:

> The mlx5 PMD provides the send-on-time scheduling capability.
> To check the operating status of this feature, xstats counters
> are provided. This patch adds the counter descriptions and
> explains how to interpret the counter values at runtime.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---
>  doc/guides/nics/mlx5.rst | 48 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 48 insertions(+)
> 
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index f82e2d75de..8d1a1311d4 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -2655,3 +2655,51 @@ Destroy GENEVE TLV parser for specific port::
>  
>  This command doesn't destroy the global list,
>  For releasing options, ``flush`` command should be used.
> +
> +
> +Extended statistics counters
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Send scheduling related xstats counters
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +The mlx5 PMD provides a set of ``tx_pp`` counters for debugging and diagnosing send packet
> +scheduling. These counters apply only if the port was probed with the ``tx_pp`` devarg,
> +and they reflect the status of the PMD scheduling infrastructure based on Clock and Rearm Queues.
> +This infrastructure provides the send scheduling capability on ConnectX-6 Dx NICs as a temporary
> +workaround and should not be engaged on newer hardware.
> +
> +- ``tx_pp_missed_interrupt_errors`` - the Rearm Queue interrupt was not serviced in time. EAL handles
> +  interrupts in a dedicated thread, which may have been busy with other time-consuming actions.
> +
> +- ``tx_pp_rearm_queue_errors`` - hardware errors occurred on the Rearm Queue, usually caused by
> +  interrupts not being serviced in time.
> +
> +- ``tx_pp_clock_queue_errors`` - hardware errors occurred on the Clock Queue, usually indicating
> +  configuration, internal NIC hardware, or firmware issues.
> +
> +- ``tx_pp_timestamp_past_errors`` - the application tried to send packet(s) with a timestamp in the past.
> +  This counter helps to check and debug application code; it does not indicate a PMD malfunction.
> +
> +- ``tx_pp_timestamp_future_errors`` - the application tried to send packet(s) with a timestamp
> +  too far in the future, beyond the hardware scheduling capabilities.
> +  This counter helps to check and debug application code; it does not indicate a PMD malfunction.
> +
> +- ``tx_pp_jitter`` - the estimated jitter of the internal NIC real-time clock between two
> +  neighbouring Clock Queue completions, in nanoseconds. Significant jitter may indicate clock
> +  synchronization issues (for example, a system PTP agent adjusting the NIC clock inappropriately).
> +
> +- ``tx_pp_wander`` - the long-term stability of the internal NIC real-time clock: the wander
> +  over 2^24 completions, in nanoseconds. Significant wander may indicate clock synchronization issues.
> +
> +- ``tx_pp_sync_lost`` - the general operating indicator. A non-zero value means the driver lost
> +  Clock Queue synchronization and scheduling does not operate correctly. The port must be restarted
> +  to restore correct scheduling.
> +
> +The following counter is useful for checking and debugging application code. It does not
> +indicate driver or hardware malfunctions and also applies to newer hardware (with direct
> +on-time scheduling capabilities - ConnectX-7 and above):
> +
> +- ``tx_pp_timestamp_order_errors`` - the application tried to send packet(s) with timestamps not in
> +  strictly ascending order. Because the PMD does not reorder packets in the hardware queues, a
> +  timestamp order violation causes packets to be sent at the wrong moments.

Lots of grammar and spelling errors and overly wordy.
Please spend some time cleaning up the wording, find a writer or AI tool to help.
