* [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds
@ 2020-05-19 18:20 PATRICK KEROULAS
  2020-05-21 15:33 ` Thomas Monjalon
  0 siblings, 1 reply; 14+ messages in thread
From: PATRICK KEROULAS @ 2020-05-19 18:20 UTC (permalink / raw)
  To: dev

Hello,

I'm trying to build an accurate capture device based on a Mellanox
ConnectX-5 with the following requirements:
- capture every incoming packet with hardware timestamps
- output: pcap with timestamps in nanoseconds

My problem is that the packets forwarded to `dpdk-pdump` carry raw
timestamps from the NIC clock.

The mlx5 part of libibverbs includes a ts-to-ns converter which takes the
instantaneous clock info. It's unused in DPDK so far. I've tested it in the
device/port init routine and the result looks reliable. Since this approach
looks very simple compared to the time sync mechanism, I'm trying to
integrate it.

The conversion should occur in the primary process (testpmd), I suppose.
1) The needed clock info derives from the ethernet device. Is it possible to
   access that struct from an rx callback?
2) How can the nanosecond value be attached to the mbuf so that `pdump`
   catches it? (workaround: copy `mbuf->udata64` in forwarded packets.)
3) Any other idea?

Regards,
Patrick

^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds 2020-05-19 18:20 [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds PATRICK KEROULAS @ 2020-05-21 15:33 ` Thomas Monjalon 2020-05-21 19:57 ` PATRICK KEROULAS 0 siblings, 1 reply; 14+ messages in thread From: Thomas Monjalon @ 2020-05-21 15:33 UTC (permalink / raw) To: PATRICK KEROULAS; +Cc: dev 19/05/2020 20:20, PATRICK KEROULAS: > Hello, > > I'm trying to build an accurate capture device based on Mellanox > Connect-X5 with following requirements: > - capture every incoming packets with hardware timestamps > - output: pcap with timestamps in nanoseconds > My problem is that the packets forwarded to `dpdk-pdump` carry raw > timestamps from NIC clock. Please could you describe how you use dpdk-pdump? Is it using the mlx5 PMD or pcap PMD? > mlx5 part of libibverbs includes a ts-to-ns converter which takes the > instantaneous clock info. It's unused in dpdk so far. I've tested it in the > device/port init routine and the result looks reliable. Since this approach > looks very simple, compared to the time sync mechanism, I'm trying to > integrate. > > The conversion should occur in the primary process (testpmd) I suppose. > 1) The needed clock info derives from ethernet device. Is it possible to > access that struct from a rx callback? > 2) how to attach the nanosecond to mbuf so that `pdump` catches it? > (workaround: copy `mbuf->udata64` in forwarded packets.) > 3) any other idea? The timestamp is carried in mbuf. Then the conversion must be done by the ethdev caller (application or any other upper layer). ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds
  2020-05-21 15:33 ` Thomas Monjalon
@ 2020-05-21 19:57   ` PATRICK KEROULAS
  2020-05-21 20:09     ` Thomas Monjalon
  0 siblings, 1 reply; 14+ messages in thread
From: PATRICK KEROULAS @ 2020-05-21 19:57 UTC
  To: Thomas Monjalon, dev; +Cc: Vivien Didelot

> > I'm trying to build an accurate capture device based on Mellanox
> > Connect-X5 with following requirements:
> > - capture every incoming packets with hardware timestamps
> > - output: pcap with timestamps in nanoseconds
> > My problem is that the packets forwarded to `dpdk-pdump` carry raw
> > timestamps from NIC clock.
>
> Please could you describe how you use dpdk-pdump?
> Is it using the mlx5 PMD or pcap PMD?

We're actually using both:
# Rx, receive from NIC
CONFIG_RTE_LIBRTE_MLX5_PMD=y
# Tx, output to pcap file
CONFIG_RTE_LIBRTE_PMD_PCAP=y

$ sudo ./build/app/testpmd -w 0000:01:00.0 -w 0000:01:00.1 -n4 -- --enable-rx-timestamp
$ dpdk-pdump -- --pdump 'port=0,queue=*,rx-dev=capture.pcap'

We've sent this placeholder with the intention of starting the discussion.
https://github.com/DPDK/dpdk/commit/bd371e1ba5bfc5b7092d712a01bbc28799fd58bc.patch
https://github.com/DPDK/dpdk/commit/e6f5c731c4ab27ab80b229af98c9b3d257e13843.patch

> > mlx5 part of libibverbs includes a ts-to-ns converter which takes the
> > instantaneous clock info. It's unused in dpdk so far. I've tested it in the
> > device/port init routine and the result looks reliable. Since this approach
> > looks very simple, compared to the time sync mechanism, I'm trying to
> > integrate.
> >
> > The conversion should occur in the primary process (testpmd) I suppose.
> > 1) The needed clock info derives from ethernet device. Is it possible to
> > access that struct from a rx callback?
> > 2) how to attach the nanosecond to mbuf so that `pdump` catches it?
> > (workaround: copy `mbuf->udata64` in forwarded packets.)
> > 3) any other idea?
>
> The timestamp is carried in mbuf.
> Then the conversion must be done by the ethdev caller (application or
> any other upper layer).

What if the converter function needs a clock_info?
https://github.com/linux-rdma/rdma-core/blob/7af01c79e00555207dee6132d72e7bfc1bb5485e/providers/mlx5/mlx5dv.h#L1201
I'm afraid this info may change by the time the converter is called
by the upper layer.

Thanks,
PK
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds 2020-05-21 19:57 ` PATRICK KEROULAS @ 2020-05-21 20:09 ` Thomas Monjalon 2020-05-22 18:43 ` PATRICK KEROULAS 0 siblings, 1 reply; 14+ messages in thread From: Thomas Monjalon @ 2020-05-21 20:09 UTC (permalink / raw) To: PATRICK KEROULAS; +Cc: dev, Vivien Didelot, shahafs, rasland, matan 21/05/2020 21:57, PATRICK KEROULAS: > > > I'm trying to build an accurate capture device based on Mellanox > > > Connect-X5 with following requirements: > > > - capture every incoming packets with hardware timestamps > > > - output: pcap with timestamps in nanoseconds > > > My problem is that the packets forwarded to `dpdk-pdump` carry raw > > > timestamps from NIC clock. > > > > Please could you describe how you use dpdk-pdump? > > Is it using the mlx5 PMD or pcap PMD? > > We're actually using both: > # Rx, receive from NIC > CONFIG_RTE_LIBRTE_MLX5_PMD=y > # Tx, output to pcap file > CONFIG_RTE_LIBRTE_PMD_PCAP=y > > $ sudo ./build/app/testpmd -w 0000:01:00.0 -w 0000:01:00.1 -n4 -- > --enable-rx-timestamp > $ dpdk-pdump -- --pdump 'port=0,queue=*,rx-dev=capture.pcap' OK thanks > We've sent this placeholder with the intention to start the discussion. > https://github.com/DPDK/dpdk/commit/bd371e1ba5bfc5b7092d712a01bbc28799fd58bc.patch > https://github.com/DPDK/dpdk/commit/e6f5c731c4ab27ab80b229af98c9b3d257e13843.patch We don't use GitHub. It is just a mirror. Thanks for having started the discussion in the mailing list, it is more efficient :) > > > mlx5 part of libibverbs includes a ts-to-ns converter which takes the > > > instantaneous clock info. It's unused in dpdk so far. I've tested it in the > > > device/port init routine and the result looks reliable. Since this approach > > > looks very simple, compared to the time sync mechanism, I'm trying to > > > integrate. > > > > > > The conversion should occur in the primary process (testpmd) I suppose. > > > 1) The needed clock info derives from ethernet device. 
Is it possible to > > > access that struct from a rx callback? > > > 2) how to attach the nanosecond to mbuf so that `pdump` catches it? > > > (workaround: copy `mbuf->udata64` in forwarded packets.) > > > 3) any other idea? > > > > The timestamp is carried in mbuf. > > Then the conversion must be done by the ethdev caller (application or > > any other upper layer). > > What if the converter function needs a clock_info? > https://github.com/linux-rdma/rdma-core/blob/7af01c79e00555207dee6132d72e7bfc1bb5485e/providers/mlx5/mlx5dv.h#L1201 > I'm affraid this info may change by the time the converter is called > by upper layer. Indeed, the clock in the device is not an atomic one :) We need to adjust the time conversion continuously. I am not an expert of time synchronization, so I add more people Cc who could help for having a precise timestamp. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds 2020-05-21 20:09 ` Thomas Monjalon @ 2020-05-22 18:43 ` PATRICK KEROULAS 2020-05-26 7:44 ` Tom Barbette 2020-05-26 16:00 ` Slava Ovsiienko 0 siblings, 2 replies; 14+ messages in thread From: PATRICK KEROULAS @ 2020-05-22 18:43 UTC (permalink / raw) To: Thomas Monjalon; +Cc: dev, Vivien Didelot, shahafs, rasland, matan > > > > mlx5 part of libibverbs includes a ts-to-ns converter which takes the > > > > instantaneous clock info. It's unused in dpdk so far. I've tested it in the > > > > device/port init routine and the result looks reliable. Since this approach > > > > looks very simple, compared to the time sync mechanism, I'm trying to > > > > integrate. > > > > > > > > The conversion should occur in the primary process (testpmd) I suppose. > > > > 1) The needed clock info derives from ethernet device. Is it possible to > > > > access that struct from a rx callback? > > > > 2) how to attach the nanosecond to mbuf so that `pdump` catches it? > > > > (workaround: copy `mbuf->udata64` in forwarded packets.) > > > > 3) any other idea? > > > > > > The timestamp is carried in mbuf. > > > Then the conversion must be done by the ethdev caller (application or > > > any other upper layer). > > > > What if the converter function needs a clock_info? > > https://github.com/linux-rdma/rdma-core/blob/7af01c79e00555207dee6132d72e7bfc1bb5485e/providers/mlx5/mlx5dv.h#L1201 > > I'm affraid this info may change by the time the converter is called > > by upper layer. > > Indeed, the clock in the device is not an atomic one :) > We need to adjust the time conversion continuously. > I am not an expert of time synchronization, so I add more people Cc > who could help for having a precise timestamp. Thanks Thomas. Not sure this is a synchronization issue. We have dedicated processes (linuxptp) to keep both NIC and sys clocks in sync with an external clock. It is "just" a matter of unit conversion. 
If it has to be performed in dpdk-pdump, I would need some help to retrieve mlx5_clock_info from inside a secondary process. Looking at mlx5_read_clock(), this info is extracted from ibv_context which looks reachable in a primary process only (segfault, if I try in pdump). ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds 2020-05-22 18:43 ` PATRICK KEROULAS @ 2020-05-26 7:44 ` Tom Barbette 2020-05-29 20:46 ` N. Benes 2020-05-26 16:00 ` Slava Ovsiienko 1 sibling, 1 reply; 14+ messages in thread From: Tom Barbette @ 2020-05-26 7:44 UTC (permalink / raw) To: PATRICK KEROULAS, Thomas Monjalon Cc: dev, Vivien Didelot, shahafs, rasland, matan Le 22/05/2020 à 20:43, PATRICK KEROULAS a écrit : >>>>> mlx5 part of libibverbs includes a ts-to-ns converter which takes the >>>>> instantaneous clock info. It's unused in dpdk so far. I've tested it in the >>>>> device/port init routine and the result looks reliable. Since this approach >>>>> looks very simple, compared to the time sync mechanism, I'm trying to >>>>> integrate. >>>>> >>>>> The conversion should occur in the primary process (testpmd) I suppose. >>>>> 1) The needed clock info derives from ethernet device. Is it possible to >>>>> access that struct from a rx callback? >>>>> 2) how to attach the nanosecond to mbuf so that `pdump` catches it? >>>>> (workaround: copy `mbuf->udata64` in forwarded packets.) >>>>> 3) any other idea? >>>> The timestamp is carried in mbuf. >>>> Then the conversion must be done by the ethdev caller (application or >>>> any other upper layer). >>> What if the converter function needs a clock_info? >>> https://github.com/linux-rdma/rdma-core/blob/7af01c79e00555207dee6132d72e7bfc1bb5485e/providers/mlx5/mlx5dv.h#L1201 >>> I'm affraid this info may change by the time the converter is called >>> by upper layer. >> Indeed, the clock in the device is not an atomic one :) >> We need to adjust the time conversion continuously. >> I am not an expert of time synchronization, so I add more people Cc >> who could help for having a precise timestamp. > Thanks Thomas. > Not sure this is a synchronization issue. We have dedicated processes > (linuxptp) to keep both NIC and sys clocks in sync with an external clock. > It is "just" a matter of unit conversion. 
>
> If it has to be performed in dpdk-pdump, I would need some help to
> retrieve mlx5_clock_info from inside a secondary process. Looking at
> mlx5_read_clock(), this info is extracted from ibv_context which looks
> reachable in a primary process only (segfault, if I try in pdump).

I don't know about the integrated ts-to-ns, but we implemented a
translation mechanism that mimics what NTP does in Linux to translate a
given clock (TSC at first) to wall time. You'll find more info in
chapter 3.4.1 of https://orbi.uliege.be/bitstream/2268/226257/1/thesis.pdf
This is an often forgotten matter, as we saw in real switches that the
time spent in time-related VDSO calls is enormous.

We wanted to do a very precise capture too, so we made that clock able to
synchronize itself with the ConnectX-5 internal clock as a base instead of
TSC. FYI the clock in the CX5 is running at 800MHz, so pure nanosecond
resolution is impossible, but it is close enough. It is for that purpose
that I proposed the rte_eth_read_clock() patch in DPDK. We need to be able
to read the current clock (like the rdtsc instruction for TSC) to compute
the frequency.

The "converter" code is there:
https://github.com/tbarbette/fastclick/blob/master/elements/userlevel/tscclock.cc
The source is configurable (TSC, rte_eth_read_clock, GPS Meinberg clock, ...);
the DPDK one is there:
https://github.com/tbarbette/fastclick/blob/2ab021283b82d0b980551480c505ec8dced98e0a/elements/userlevel/dpdkdevclock.cc#L27

One important thing is that the conversion factor must be changed from
time to time to correct the drift. That is the reason why we can't just
push a bunch of code to DPDK (and it's probably not as simple as using the
ts-to-ns in mlx5): you must have a timer, and use RCU to "atomically"
update a struct larger than 64 bits. Most of that is available now in
DPDK, but there will always be some setup (RCU barrier, timer init, ...).

In the end it's not hard science... It worked like a charm to do a campus
trace capture on 100G hardware.
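The calibration step Tom describes, reading the device clock at two instants to compute its frequency, can be sketched as follows. This is only a minimal illustration and not the fastclick code: `struct clock_sample` and `estimate_freq_hz()` are hypothetical names, and the two samples are assumed to have been taken by the caller (for instance a counter value from rte_eth_read_clock() paired with a host wall-clock reading). A real implementation would also smooth successive estimates and publish them safely to readers, as Tom notes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of readings taken as close together as possible. */
struct clock_sample {
	uint64_t dev_ticks; /* raw device counter value */
	uint64_t wall_ns;   /* host wall-clock time, in nanoseconds */
};

/*
 * Estimate the device counter frequency (in Hz) from two samples.
 * Assumes the counter did not wrap between the two samples.
 */
static double
estimate_freq_hz(const struct clock_sample *first,
		 const struct clock_sample *second)
{
	assert(second->wall_ns > first->wall_ns);
	assert(second->dev_ticks >= first->dev_ticks);

	double dticks = (double)(second->dev_ticks - first->dev_ticks);
	double dns = (double)(second->wall_ns - first->wall_ns);

	/* ticks per nanosecond, scaled to ticks per second */
	return dticks * 1e9 / dns;
}
```

The longer the interval between the two samples, the less the read latency of each sample weighs on the estimate.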
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds 2020-05-26 7:44 ` Tom Barbette @ 2020-05-29 20:46 ` N. Benes 0 siblings, 0 replies; 14+ messages in thread From: N. Benes @ 2020-05-29 20:46 UTC (permalink / raw) To: dev Hi everyone, Tom Barbette: > > Le 22/05/2020 à 20:43, PATRICK KEROULAS a écrit : >>>>>> mlx5 part of libibverbs includes a ts-to-ns converter which takes the >>>>>> instantaneous clock info. It's unused in dpdk so far. I've tested >>>>>> it in the >>>>>> device/port init routine and the result looks reliable. Since this >>>>>> approach >>>>>> looks very simple, compared to the time sync mechanism, I'm trying to >>>>>> integrate. >>>>>> >>>>>> The conversion should occur in the primary process (testpmd) I >>>>>> suppose. >>>>>> 1) The needed clock info derives from ethernet device. Is it >>>>>> possible to >>>>>> access that struct from a rx callback? >>>>>> 2) how to attach the nanosecond to mbuf so that `pdump` catches it? >>>>>> (workaround: copy `mbuf->udata64` in forwarded packets.) >>>>>> 3) any other idea? >>>>> The timestamp is carried in mbuf. >>>>> Then the conversion must be done by the ethdev caller (application or >>>>> any other upper layer). >>>> What if the converter function needs a clock_info? >>>> https://github.com/linux-rdma/rdma-core/blob/7af01c79e00555207dee6132d72e7bfc1bb5485e/providers/mlx5/mlx5dv.h#L1201 >>>> >>>> I'm affraid this info may change by the time the converter is called >>>> by upper layer. >>> Indeed, the clock in the device is not an atomic one :) >>> We need to adjust the time conversion continuously. >>> I am not an expert of time synchronization, so I add more people Cc >>> who could help for having a precise timestamp. >> Thanks Thomas. >> Not sure this is a synchronization issue. We have dedicated processes >> (linuxptp) to keep both NIC and sys clocks in sync with an external >> clock. >> It is "just" a matter of unit conversion. 
>> >> If it has to be performed in dpdk-pdump, I would need some help to >> retrieve mlx5_clock_info from inside a secondary process. Looking at >> mlx5_read_clock(), this info is extracted from ibv_context which looks >> reachable in a primary process only (segfault, if I try in pdump). The normal phc2sys can not only synchronise NIC -> system but also sys -> NIC and (I believe it does but have not tried) NIC1 -> NIC2. If I understand your proposal correctly, you want to use a free running NIC counter and calibrate out the drift afterwards. It may be easier to adapt phc2sys to use a NIC through DPDK and sync the NIC's timewheel/VCO in a proven/reliable manner (e.g. low pass filtering excursions). Then you could directly use the NIC counter value. > I don't know about the integrated ts-to-ns, but we implemented a > translation mechanism that mimics what NTP does in Linux to translate a > given clock (TSC at first) to a wall time. You'll find more info at > https://orbi.uliege.be/bitstream/2268/226257/1/thesis.pdf chapter > 3.4.1. This is an often forgotten matter, as we saw in real switches > that the time spent in time-related VDSO is enormous. Do you have measurements of vDSO clock_gettime and how much is "enormous" to you? To my knowledge, clock_gettime via vDSO on Linux only takes a few nanoseconds in the average case. However, it can go up to ~10 or even ~50 microseconds every few (~10) seconds, depending on the number of CPUs (for example single vs. dual socket, though my hardware for this test is quite old, Dell R210-II, R610). Presumably this is when the kernel locks the struct in VVAR to update the TSC drift compensation parameters. 
Linux clock_gettime implementation is here (different versions): https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/x86/vdso/vclock_gettime.c?h=linux-3.10.y#n193 https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/x86/entry/vdso/vclock_gettime.c?h=linux-4.19.y#n241 https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/lib/vdso/gettimeofday.c#n98 I use busy waiting on clock_gettime in a packet generator application (so far 10 GbE only) to pace jumbo frames according to a spec (simulating the traffic pattern of a to-be-developed hardware with FPGA), and COTS sniffer hardware with absolute timestamping to verify my generator's performance. I can observe the above 10-50 us artefacts and sufficiently good/low (for my needs) average execution time of clock_gettime. The only sad thing is that TAI clock does not go through vDSO and therefore I cannot use it. > We wanted to do a very precise capture too, se we made that clock able > to synchronize itself with the ConnectX 5 internal clock as a base > instead of TSC. FYI the clock in CX5 si running at 800MHz, so pure > nanosecond is impossible, but close enough. It is for that purpose that > I proposed the rte_eth_read_clock() patch in DPDK. We need to be able to > read the current clock (like rdtsc() instruction for TSC) to compute the > frequency. Doesn't this mean that you need to wait for the PCIe op from the NIC? Is this really faster than a rdtsc, memory/cache read, integer multiplication and shift? Cheers, nicolas ^ permalink raw reply [flat|nested] 14+ messages in thread
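The "integer multiplication and shift" Nicolas mentions is a fixed-point scaling of a cycle count. A minimal sketch of the idea is given below; `struct cyc2ns`, `cyc2ns_init()` and `cyc2ns_convert()` are hypothetical names chosen for illustration, not kernel or DPDK APIs, and the sketch ignores the periodic re-tuning of the multiplier that drift compensation would require.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-point scale: ns = (cycles * mult) >> shift. */
struct cyc2ns {
	uint32_t mult;
	uint32_t shift;
};

/* Derive the multiplier for a counter running at freq_hz. */
static struct cyc2ns
cyc2ns_init(uint64_t freq_hz, uint32_t shift)
{
	assert(freq_hz != 0 && shift < 32);

	/* mult = round(1e9 * 2^shift / freq_hz) */
	uint64_t mult = ((1000000000ULL << shift) + freq_hz / 2) / freq_hz;
	assert(mult <= UINT32_MAX);

	struct cyc2ns c = { (uint32_t)mult, shift };
	return c;
}

/* Valid only while cycles * mult still fits in 64 bits. */
static uint64_t
cyc2ns_convert(const struct cyc2ns *c, uint64_t cycles)
{
	return (cycles * c->mult) >> c->shift;
}
```

A larger shift gives a finer multiplier but shortens the cycle span that can be converted before the 64-bit product overflows, which is one reason such conversions are usually applied to a delta from a recent reference point.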
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds
  2020-05-22 18:43 ` PATRICK KEROULAS
  2020-05-26  7:44   ` Tom Barbette
@ 2020-05-26 16:00   ` Slava Ovsiienko
  2020-05-29 20:56     ` PATRICK KEROULAS
  2020-06-02 19:18     ` PATRICK KEROULAS
  1 sibling, 2 replies; 14+ messages in thread
From: Slava Ovsiienko @ 2020-05-26 16:00 UTC
  To: PATRICK KEROULAS, Thomas Monjalon
  Cc: dev, Vivien Didelot, Shahaf Shuler, Raslan Darawsheh, Matan Azrad

Hi, Patrick

The ConnectX HW timestamp is the captured value of an internal 64-bit
counter running at the frequency reported in the device_frequency_khz field
of struct mlx5_ifc_cmd_hca_cap_bits{}. This structure is queried in the
mlx5_devx_cmd_query_hca_attr() routine. So, with a known frequency, it is
possible to convert timestamp ticks to the desired units.

With best regards,
Slava

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of PATRICK KEROULAS
> Sent: Friday, May 22, 2020 21:43
> To: Thomas Monjalon <thomas@monjalon.net>
> Cc: dev@dpdk.org; Vivien Didelot <vivien.didelot@gmail.com>; Shahaf
> Shuler <shahafs@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Matan Azrad <matan@mellanox.com>
> Subject: Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to
> nanoseconds
>
> > > > > mlx5 part of libibverbs includes a ts-to-ns converter which
> > > > > takes the instantaneous clock info. It's unused in dpdk so far.
> > > > > I've tested it in the device/port init routine and the result
> > > > > looks reliable. Since this approach looks very simple, compared
> > > > > to the time sync mechanism, I'm trying to integrate.
> > > > >
> > > > > The conversion should occur in the primary process (testpmd) I suppose.
> > > > > 1) The needed clock info derives from ethernet device. Is it possible to
> > > > > access that struct from a rx callback?
> > > > > 2) how to attach the nanosecond to mbuf so that `pdump` catches it?
> > > > > (workaround: copy `mbuf->udata64` in forwarded packets.)
> > > > > 3) any other idea?
> > > >
> > > > The timestamp is carried in mbuf.
> > > > Then the conversion must be done by the ethdev caller (application
> > > > or any other upper layer).
> > >
> > > What if the converter function needs a clock_info?
> > > https://github.com/linux-rdma/rdma-core/blob/7af01c79e00555207dee6132d72e7bfc1bb5485e/providers/mlx5/mlx5dv.h#L1201
> > > I'm affraid this info may change by the time the converter is called
> > > by upper layer.
> >
> > Indeed, the clock in the device is not an atomic one :) We need to
> > adjust the time conversion continuously.
> > I am not an expert of time synchronization, so I add more people Cc
> > who could help for having a precise timestamp.
>
> Thanks Thomas.
> Not sure this is a synchronization issue. We have dedicated processes
> (linuxptp) to keep both NIC and sys clocks in sync with an external clock.
> It is "just" a matter of unit conversion.
>
> If it has to be performed in dpdk-pdump, I would need some help to retrieve
> mlx5_clock_info from inside a secondary process. Looking at
> mlx5_read_clock(), this info is extracted from ibv_context which looks
> reachable in a primary process only (segfault, if I try in pdump).
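Slava's suggestion, scaling raw ticks by the frequency reported in device_frequency_khz, amounts to a unit conversion along the following lines. This is a sketch under assumptions: `ticks_to_ns()` is a hypothetical helper, the frequency is assumed to have been obtained by the caller, and the result is a duration since the counter's zero point, not a wall-clock time.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a raw tick count to nanoseconds, given the counter frequency
 * in kHz. Whole seconds and the sub-second remainder are converted
 * separately so the intermediate products stay within 64 bits.
 */
static uint64_t
ticks_to_ns(uint64_t ticks, uint32_t freq_khz)
{
	/* Upper bound keeps rem * 1e9 below 2^64 (illustrative limit). */
	assert(freq_khz != 0 && freq_khz <= 10000000);

	uint64_t hz = (uint64_t)freq_khz * 1000ULL;
	uint64_t sec = ticks / hz;
	uint64_t rem = ticks % hz;

	return sec * 1000000000ULL + rem * 1000000000ULL / hz;
}
```

For example, with an illustrative frequency of 78125 kHz, `ticks_to_ns(78125000, 78125)` yields exactly one second's worth of nanoseconds.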
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds
  2020-05-26 16:00 ` Slava Ovsiienko
@ 2020-05-29 20:56   ` PATRICK KEROULAS
  2020-05-31 19:47     ` Slava Ovsiienko
  2020-06-02 19:18   ` PATRICK KEROULAS
  1 sibling, 1 reply; 14+ messages in thread
From: PATRICK KEROULAS @ 2020-05-29 20:56 UTC
  To: Slava Ovsiienko
  Cc: dev, Vivien Didelot, Shahaf Shuler, Raslan Darawsheh, Matan Azrad

On Tue, May 26, 2020 at 12:00 PM Slava Ovsiienko <viacheslavo@mellanox.com> wrote:
> Hi, Patrick
>
> ConnectX HW timestamp is the captured value of internal 64-bit counter running at the frequency,
> reported in the device_frequency_khz field of struct mlx5_ifc_cmd_hca_cap_bits{}.
> This structure is queried in mlx5_devx_cmd_query_hca_attr() routine.
> So, with known frequency it is possible to recalculate timestamp ticks to desired units.

Hello Slava,

Assuming that the NIC clock is already synced thanks to a PTP client,
does the 64-bit counter give an absolute time value (0 => 1 January 1970
00:00:00)? Or do I need to calculate a time duration from the process
start time?

I just want to validate the path from the mlx5 eth dev (Rx) to the pcap
eth dev (Tx):
- query the oscillator frequency at the mlx5_eth_dev init step
  (mlx5_devx_cmd_query_hca_attr())
- store the freq with the other hca_attr, carried by the dev config, which
  should be shared with the secondary process
- in eth_pcap_tx_dumper(), retrieve the freq from the dev given by
  mbuf->port
- convert every incoming mbuf->timestamp using this freq, whose
  variation should be negligible over the capture duration

Last question: what is your opinion about this other method?
https://github.com/linux-rdma/rdma-core/blob/7af01c79e00555207dee6132d72e7bfc1bb5485e/providers/mlx5/mlx5dv.h#L1201

Thanks a lot!
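The "other method" Patrick links converts a raw timestamp relative to a clock-info snapshot. Without reproducing the rdma-core code, the general shape of such a snapshot-relative conversion might look like the sketch below. Everything in it is hypothetical: `struct clock_snapshot`, its fields and `snapshot_ts_to_ns()` are illustrative stand-ins, not the definitions from mlx5dv.h. It also shows why the freshness of the snapshot matters: the result is extrapolated from the snapshot's reference point.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the device clock state (illustrative only). */
struct clock_snapshot {
	uint64_t nsec;        /* nanoseconds at the time of the snapshot */
	uint64_t last_cycles; /* raw counter at the time of the snapshot */
	uint64_t mask;        /* valid bits of the raw counter */
	uint32_t mult;        /* fixed-point multiplier */
	uint32_t shift;       /* fixed-point shift */
};

/* Convert a raw timestamp by extrapolating from the snapshot. */
static uint64_t
snapshot_ts_to_ns(const struct clock_snapshot *s, uint64_t ts)
{
	assert(s->shift < 64);

	/* Forward distance from the snapshot, modulo the counter width. */
	uint64_t delta = (ts - s->last_cycles) & s->mask;

	if (delta > s->mask / 2) {
		/* More than half a period ahead: treat ts as older than
		 * the snapshot and step backwards instead. */
		delta = (s->last_cycles - ts) & s->mask;
		return s->nsec - ((delta * s->mult) >> s->shift);
	}
	return s->nsec + ((delta * s->mult) >> s->shift);
}
```

In this sketch a packet stamped shortly before the snapshot was refreshed is still converted correctly through the backward branch, as long as it is less than half a counter period old.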
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds
  2020-05-29 20:56 ` PATRICK KEROULAS
@ 2020-05-31 19:47   ` Slava Ovsiienko
  0 siblings, 0 replies; 14+ messages in thread
From: Slava Ovsiienko @ 2020-05-31 19:47 UTC
  To: PATRICK KEROULAS
  Cc: dev, Vivien Didelot, Shahaf Shuler, Raslan Darawsheh, Matan Azrad

Hi, Patrick

Please see below.

>From: PATRICK KEROULAS <patrick.keroulas@radio-canada.ca>
>Sent: Friday, May 29, 2020 23:56
>To: Slava Ovsiienko <viacheslavo@mellanox.com>
>Cc: dev@dpdk.org; Vivien Didelot <vivien.didelot@gmail.com>; Shahaf Shuler <shahafs@mellanox.com>; Raslan Darawsheh <rasland@mellanox.com>; Matan Azrad <matan@mellanox.com>
>Subject: Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds
>
>On Tue, May 26, 2020 at 12:00 PM Slava Ovsiienko <viacheslavo@mellanox.com> wrote:
>> Hi, Patrick
>>
>> ConnectX HW timestamp is the captured value of internal 64-bit counter running at the frequency,
>> reported in the device_frequency_khz field of struct mlx5_ifc_cmd_hca_cap_bits{}.
>> This structure is queried in mlx5_devx_cmd_query_hca_attr() routine.
>> So, with known frequency it is possible to recalculate timestamp ticks to desired units.
>
>Hello Slava,
>
>Assuming that the NIC clock is already synced thanks to a PTP client,
>does the bit counter give an absolute time value (0 => 1 January 1970
>00:00:00)? Or do I need to calculate a time duration from the process
>start time?

[SO] I would not make any assumption about the internal clock phase and its
relation to time (UTC?). I suppose that reading the initial value of the
clock counter and calculating the actual time at app start is a valid
approach.

>I just want to validate the path from mlx5 eth dev(Rx) to eth pcap (Tx) :
>- query the oscillator frequency at the mlx5_eth_dev init step
>  (mlx5_devx_cmd_query_hca_attr())
>- store the freq with other hca_attr, carried by dev config which should
>  be shared with the secondary process
>- in eth_pcap_tx_dumper(), retrieve the freq from the dev given by
>  mbuf->port
>- convert all the incoming mbuf->timestamp using this freq whose
>  variation should be negligible over the capture duration
>

Should work OK, in my opinion.

>Last question: what is your opinion about this other method?
>https://github.com/linux-rdma/rdma-core/blob/7af01c79e00555207dee6132d72e7bfc1bb5485e/providers/mlx5/mlx5dv.h#L1201
>
>Thanks a lot!

This code checks the counter periodically to track the counter wraparound
and also handles the conversion of older timestamps (captured before the
clock base update). If you have a stream of packets with monotonically
increasing timestamps, you could track this counter wrap in your own code
(save the last timestamp conversion result and counter value).

With best regards,
Slava
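The two suggestions in this reply, anchoring the counter to wall time at application start and tracking wraparound over a monotonically increasing packet stream, can be combined in a small tracker. The sketch below is only one possible reading of that advice: `struct ts_tracker` and both functions are hypothetical, the counter width is passed in as a `mask` parameter because the thread does not specify it, and the frequency is assumed constant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-port conversion state. */
struct ts_tracker {
	uint64_t mask;       /* valid bits of the raw counter (assumed) */
	uint64_t freq_hz;    /* counter frequency */
	uint64_t base_ns;    /* wall-clock time captured at start */
	uint64_t last_ticks; /* raw counter value seen most recently */
	uint64_t elapsed;    /* wrap-corrected ticks since start */
};

/* Anchor the counter: start_ticks was read at wall time start_ns. */
static void
ts_tracker_init(struct ts_tracker *t, uint64_t mask, uint64_t freq_hz,
		uint64_t start_ticks, uint64_t start_ns)
{
	assert(freq_hz != 0 && freq_hz <= 10000000000ULL);
	t->mask = mask;
	t->freq_hz = freq_hz;
	t->base_ns = start_ns;
	t->last_ticks = start_ticks & mask;
	t->elapsed = 0;
}

/* Convert the next timestamp of a monotonically increasing stream. */
static uint64_t
ts_tracker_convert(struct ts_tracker *t, uint64_t raw_ticks)
{
	/* Masked subtraction yields the forward distance across a wrap. */
	uint64_t delta = (raw_ticks - t->last_ticks) & t->mask;

	t->elapsed += delta;
	t->last_ticks = raw_ticks & t->mask;

	uint64_t sec = t->elapsed / t->freq_hz;
	uint64_t rem = t->elapsed % t->freq_hz;

	return t->base_ns + sec * 1000000000ULL +
	       rem * 1000000000ULL / t->freq_hz;
}
```

The tracker is only correct if timestamps arrive in order and no gap between two packets exceeds one full counter period; out-of-order packets across queues would need the kind of backward handling Slava describes for the linked code.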
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds
  2020-05-26 16:00 ` Slava Ovsiienko
  2020-05-29 20:56   ` PATRICK KEROULAS
@ 2020-06-02 19:18   ` PATRICK KEROULAS
  2020-06-03  7:48     ` Slava Ovsiienko
  1 sibling, 1 reply; 14+ messages in thread
From: PATRICK KEROULAS @ 2020-06-02 19:18 UTC
  To: Slava Ovsiienko
  Cc: dev, Vivien Didelot, Shahaf Shuler, Raslan Darawsheh, Matan Azrad

On Tue, May 26, 2020 at 12:00 PM Slava Ovsiienko <viacheslavo@mellanox.com> wrote:
> ConnectX HW timestamp is the captured value of internal 64-bit counter running at the frequency,
> reported in the device_frequency_khz field of struct mlx5_ifc_cmd_hca_cap_bits{}.
> This structure is queried in mlx5_devx_cmd_query_hca_attr() routine.

I can't query this because "DevX is NOT supported".
As a matter of fact, mlx5dv_open_device() returns NULL.
I'm not sure whether this limitation comes from the HW/firmware
configuration or from the device capability.
* rdma-core, libibverbs-dev: 28.0
* NIC Part Number: MCX516A-CDA_Ax
* ConnectX-5 Ex EN
* FW: 16.25.1020
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds
  2020-06-02 19:18 ` PATRICK KEROULAS
@ 2020-06-03  7:48   ` Slava Ovsiienko
  2020-06-05  0:09     ` PATRICK KEROULAS
  0 siblings, 1 reply; 14+ messages in thread
From: Slava Ovsiienko @ 2020-06-03 7:48 UTC
  To: PATRICK KEROULAS
  Cc: dev, Vivien Didelot, Shahaf Shuler, Raslan Darawsheh, Matan Azrad

> -----Original Message-----
> From: PATRICK KEROULAS <patrick.keroulas@radio-canada.ca>
> Sent: Tuesday, June 2, 2020 22:18
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Vivien Didelot <vivien.didelot@gmail.com>; Shahaf
> Shuler <shahafs@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Matan Azrad <matan@mellanox.com>
> Subject: Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to
> nanoseconds
>
> On Tue, May 26, 2020 at 12:00 PM Slava Ovsiienko
> <viacheslavo@mellanox.com> wrote:
> > ConnectX HW timestamp is the captured value of internal 64-bit counter
> > running at the frequency, reported in the device_frequency_khz field of
> > struct mlx5_ifc_cmd_hca_cap_bits{}.
> > This structure is queried in mlx5_devx_cmd_query_hca_attr() routine.
>
> I can't query this because "DevX is NOT supported".
> As a matter of fact, mlx5dv_open_device() returns NULL.
> Not sure if this limitation comes from HW/firmware config or capability.
> * rdma-code, libibverbs-dev: 28.0
> * NIC Part Number: MCX516A-CDA_Ax
> * ConnectX-5 Ex EN
> * FW: 16.25.1020

It looks like outdated firmware. Please:
- update the firmware to at least 16.27.2008 (GA). I would recommend
  installing OFED, which updates the FW
- make sure the UCTX_EN option in the FW configuration is set to "true"

With best regards,
Slava
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds
  2020-06-03  7:48 ` Slava Ovsiienko
@ 2020-06-05  0:09   ` PATRICK KEROULAS
  2020-06-05 16:30     ` Slava Ovsiienko
  0 siblings, 1 reply; 14+ messages in thread
From: PATRICK KEROULAS @ 2020-06-05 0:09 UTC
  To: Slava Ovsiienko
  Cc: dev, Vivien Didelot, Shahaf Shuler, Raslan Darawsheh, Matan Azrad

On Wed, Jun 3, 2020 at 3:48 AM Slava Ovsiienko <viacheslavo@mellanox.com> wrote:
>
> > From: PATRICK KEROULAS <patrick.keroulas@radio-canada.ca>
> > * rdma-code, libibverbs-dev: 28.0
> > * NIC Part Number: MCX516A-CDA_Ax
> > * ConnectX-5 Ex EN
> > * FW: 16.25.1020
>
> It looks like outdated firmware, please:
> - update the firmware - at least 16.27.2008 is GA. I would recommend to install OFED - it updates the FW
> - make sure the UCTX_EN option in FW configuration is set to "true"

Hello Slava,

I managed to query device_frequency_khz by simply setting UCTX_EN=1,
converted mbuf->timestamp to nanoseconds and wrote a pcap. However, the
accuracy is quite disappointing compared to libvma or even SW timestamps.

The freq value looks constant (= 78125 kHz). Correct me if I'm wrong, but a
PTP client is supposed to continuously adjust some kind of VCO on the NIC.
Even manually setting a crazy value through the /dev/ptp interface doesn't
affect device_frequency_khz. Could you please clarify?

This leads me back to mlx5dv_clock_info->nsec. If this is a valid method, I
think the only missing piece is to access it from the secondary process,
which implies sharing ibv_context.

Best Regards,
PK
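As a worked figure for the value Patrick reports: if 78125 kHz is indeed the rate of the timestamp counter (an assumption, since the thread does not confirm which clock this field describes), one tick lasts

```latex
T_{\text{tick}} = \frac{1}{f}
               = \frac{1}{78125 \times 10^{3}\,\mathrm{Hz}}
               = 1.28 \times 10^{-8}\,\mathrm{s}
               = 12.8\,\mathrm{ns}
```

so a conversion based on this frequency alone could not resolve arrival times more finely than 12.8 ns. This bounds the granularity only; it says nothing about offset or drift relative to wall time.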
* Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to nanoseconds
  2020-06-05  0:09 ` PATRICK KEROULAS
@ 2020-06-05 16:30   ` Slava Ovsiienko
  0 siblings, 0 replies; 14+ messages in thread
From: Slava Ovsiienko @ 2020-06-05 16:30 UTC
  To: PATRICK KEROULAS
  Cc: dev, Vivien Didelot, Shahaf Shuler, Raslan Darawsheh, Matan Azrad

> -----Original Message-----
> From: PATRICK KEROULAS <patrick.keroulas@radio-canada.ca>
> Sent: Friday, June 5, 2020 3:10
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Vivien Didelot <vivien.didelot@gmail.com>; Shahaf
> Shuler <shahafs@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Matan Azrad <matan@mellanox.com>
> Subject: Re: [dpdk-dev] mlx5 & pdump: convert HW timestamps to
> nanoseconds
>
> On Wed, Jun 3, 2020 at 3:48 AM Slava Ovsiienko
> <viacheslavo@mellanox.com> wrote:
> >
> > > From: PATRICK KEROULAS <patrick.keroulas@radio-canada.ca>
> > > * rdma-code, libibverbs-dev: 28.0
> > > * NIC Part Number: MCX516A-CDA_Ax
> > > * ConnectX-5 Ex EN
> > > * FW: 16.25.1020
> >
> > It looks like outdated firmware, please:
> > - update the firmware - at least 16.27.2008 is GA. I would recommend
> > to install OFED - it updates the FW
> > - make sure the UCTX_EN option in FW configuration is set to "true"
>
> Hello Slava,
>
> I managed to query device_frequency_khz by simply setting UCTX_EN=1,
> convert the mbuf->timestamp to nsec and write a pcap. However, the
> accuracy is quite disappointing, compared to libvma or even SW TS.
>
> The freq value looks constant (=78125kHz). Correct me if I'm wrong, a ptp
> client is supposed to continuously adjust some kind of VCO on the NIC. And

AFAIK, that is not the case for ConnectX-5: there is no clock adjustment,
just a free-running counter. ConnectX-6 Dx will provide an option for
adjustable nanosecond UTC timestamps.

> even setting a crazy value through /dev/ptp interface manually doesn't affect
> device_frequency_khz. Please could you clarify?
>
> This leads me back to mlx5dv_clock_info->nsec. If this is a valid method, I
> think the only missing piece is to access it from the secondary process, which
> implies to share ibv_context.
>
> Best Regards,
>
> PK