I convinced myself that a viable way to measure the time between a request packet and its response packet is to take the difference between two Intel rdtsc readings.

The restrictions to valid use include:
The time difference is OK provided,
I think rdtsc does all this. But then I read [1]:
So I am not so sure. 
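
For concreteness, the rdtsc approach I have in mind looks roughly like the sketch below (C, GCC/Clang on x86-64). The helper names `tsc_hz_estimate` and `tsc_delta_ns` are my own placeholders, not DPDK APIs. It assumes an invariant TSC, and that both reads happen on the same core.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <x86intrin.h>   /* __rdtsc(); GCC/Clang on x86-64 */

/* Hypothetical helper: estimate TSC ticks per second by comparing the
 * counter against CLOCK_MONOTONIC across a short sleep. */
static double tsc_hz_estimate(void)
{
    struct timespec t0, t1;
    struct timespec pause = { 0, 50 * 1000 * 1000 };   /* 50 ms */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    uint64_t c0 = __rdtsc();
    nanosleep(&pause, NULL);
    uint64_t c1 = __rdtsc();
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    assert(secs > 0.0);
    return (double)(c1 - c0) / secs;
}

/* Hypothetical helper: convert a TSC tick difference to nanoseconds.
 * Only meaningful under the assumptions stated above (invariant TSC,
 * both reads taken on the same core). */
static double tsc_delta_ns(uint64_t start, uint64_t end, double tsc_hz)
{
    return (double)(end - start) * 1e9 / tsc_hz;
}
```

The idea would be to call `tsc_hz_estimate()` once at startup, read `__rdtsc()` just before sending the request and again when the response is polled, and pass the two readings to `tsc_delta_ns()`.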

Of course, Mellanox NICs can report timestamps themselves. Is it actually possible to get hardware NIC timestamps for every packet sent and received without overburdening the NIC? From what I can see for my card (ConnectX-4 Lx), the resolution is nanoseconds, so I am tempted to skip rdtsc and just use NIC timestamps.

What is the usual practice in DPDK programming when one needs RTTs?

[1] https://stackoverflow.com/questions/42189976/calculate-system-time-using-rdtsc