From: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
To: <dev@dpdk.org>
Cc: <rasland@nvidia.com>, <matan@nvidia.com>, <suanmingm@nvidia.com>,
<stephen@networkplumber.org>
Subject: [PATCH v2] doc: add mlx5 xstats send scheduling counters description
Date: Thu, 31 Oct 2024 10:04:38 +0200
Message-ID: <20241031080438.1701634-1-viacheslavo@nvidia.com>
In-Reply-To: <20241028085708.0060bc6f@hermes.local>
The mlx5 PMD provides a send scheduling (send-on-time) capability.
Extended statistics counters are provided to check the operating
status of this feature. This patch adds descriptions of the counters
and explains how to interpret their values at runtime.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
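A minimal sketch of how an application might act on the counters described
below, assuming the counter names and values have already been fetched with
rte_eth_xstats_get_names() and rte_eth_xstats_get(). The enum, the table and
both function names are hypothetical and are not part of the PMD or of this
patch; the grouping only mirrors the descriptions in the added text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical grouping of the tx_pp counters, following the doc text. */
enum tx_pp_class {
	TX_PP_NOT_TX_PP,     /* not one of the counters listed below */
	TX_PP_INFRA_ERROR,   /* interrupt, Rearm Queue or Clock Queue errors */
	TX_PP_APP_ERROR,     /* bad timestamps supplied by the application */
	TX_PP_CLOCK_QUALITY, /* jitter and wander, in nanoseconds */
	TX_PP_SYNC_LOST,     /* non-zero value: the port must be restarted */
};

struct tx_pp_name_class {
	const char *name;
	enum tx_pp_class cls;
};

static const struct tx_pp_name_class tx_pp_table[] = {
	{ "tx_pp_missed_interrupt_errors", TX_PP_INFRA_ERROR },
	{ "tx_pp_rearm_queue_errors",      TX_PP_INFRA_ERROR },
	{ "tx_pp_clock_queue_errors",      TX_PP_INFRA_ERROR },
	{ "tx_pp_timestamp_past_errors",   TX_PP_APP_ERROR },
	{ "tx_pp_timestamp_future_errors", TX_PP_APP_ERROR },
	{ "tx_pp_timestamp_order_errors",  TX_PP_APP_ERROR },
	{ "tx_pp_jitter",                  TX_PP_CLOCK_QUALITY },
	{ "tx_pp_wander",                  TX_PP_CLOCK_QUALITY },
	{ "tx_pp_sync_lost",               TX_PP_SYNC_LOST },
};

/* Map an xstats counter name to its group by exact name match. */
enum tx_pp_class
tx_pp_classify(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(tx_pp_table) / sizeof(tx_pp_table[0]); i++)
		if (strcmp(name, tx_pp_table[i].name) == 0)
			return tx_pp_table[i].cls;
	return TX_PP_NOT_TX_PP;
}

/* Return 1 if tx_pp_sync_lost is present with a non-zero value. */
int
tx_pp_restart_needed(const char *const names[], const uint64_t values[],
		     size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tx_pp_classify(names[i]) == TX_PP_SYNC_LOST &&
		    values[i] != 0)
			return 1;
	return 0;
}
```

A caller would pass the parallel name and value arrays it built from the
xstats query and restart the port when tx_pp_restart_needed() returns 1.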
doc/guides/nics/mlx5.rst | 59 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index b5522d50c5..5db4aeda1b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -2662,3 +2662,62 @@ Destroy GENEVE TLV parser for specific port::
This command doesn't destroy the global list,
For releasing options, ``flush`` command should be used.
+
+
+Extended Statistics Counters
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Send Scheduling Extended Statistics Counters
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The mlx5 PMD provides a comprehensive set of counters designed for debugging
+and diagnostics related to packet scheduling during transmission. These counters
+are applicable only if the port was configured with the ``tx_pp`` devarg and
+reflect the status of the PMD scheduling infrastructure based on Clock and
+Rearm Queues, used as a workaround on ConnectX-6 Dx NICs.
+
+- ``tx_pp_missed_interrupt_errors`` - indicates that the Rearm Queue interrupt
+ was not serviced on time. The EAL manages interrupts in a dedicated thread,
+ and it is possible that other time-consuming actions were being processed
+ concurrently.
+
+- ``tx_pp_rearm_queue_errors`` - signifies hardware errors that occurred
+ on the Rearm Queue, typically caused by delays in servicing interrupts.
+
+- ``tx_pp_clock_queue_errors`` - reflects hardware errors on the Clock Queue,
+ which usually indicate configuration issues or problems with the internal NIC
+ hardware or firmware.
+
+- ``tx_pp_timestamp_past_errors`` - tracks attempts by the application to send
+ packets with timestamps set in the past. It is useful for debugging application
+ code and does not indicate a malfunction of the PMD.
+
+- ``tx_pp_timestamp_future_errors`` - records attempts by the application to send
+ packets with timestamps set too far into the future, exceeding the hardware's
+ scheduling capabilities. Like the previous counter, it aids in application
+ debugging without suggesting a PMD malfunction.
+
+- ``tx_pp_jitter`` - an estimate of the internal NIC real-time clock jitter
+ between two consecutive Clock Queue completions, expressed in nanoseconds.
+ Significant jitter may signal potential clock synchronization issues,
+ possibly due to inappropriate adjustments made by a system PTP
+ (Precision Time Protocol) agent.
+
+- ``tx_pp_wander`` - indicates the long-term stability of the internal NIC
+ real-time clock over 2^24 completions, measured in nanoseconds. Significant
+ wander may also suggest clock synchronization problems.
+
+- ``tx_pp_sync_lost`` - a general operational indicator; a non-zero value
+ indicates that the driver has lost synchronization with the Clock Queue,
+ resulting in improper scheduling operations. To restore correct scheduling
+ functionality, it is necessary to restart the port.
+
+The following counter is particularly valuable for verifying and debugging
+application code. It does not indicate driver or hardware malfunctions and
+is applicable to newer hardware with direct on-time scheduling capabilities
+(such as ConnectX-7 and above):
+
+- ``tx_pp_timestamp_order_errors`` - indicates attempts by the application
+ to send packets with timestamps that are not in strictly ascending order.
+ Since the PMD does not reorder packets within hardware queues, violations
+ of timestamp order can lead to packets being sent at incorrect times.
--
2.34.1