From: Olivier Matz <olivier.matz@6wind.com>
To: Victor Huertas <vhuertas@gmail.com>
Cc: James Huang <jamsphon@gmail.com>,
cristian.dumitrescu@intel.com, dev <dev@dpdk.org>
Subject: Re: [dpdk-dev] Fwd: Fwd: high latency detected in IP pipeline example
Date: Wed, 19 Feb 2020 11:53:28 +0100 [thread overview]
Message-ID: <20200219105328.GF22738@platinum> (raw)
In-Reply-To: <CAGxG5cjuNxkrB4W=jfb85q_E1yAogiaA1dnnJD1b06uyy9b+rQ@mail.gmail.com>
Hi Victor,
I have no experience with ip_pipeline. I can at least say that this
latency is much higher that what you should get.
My initial thought was that you were using several pthreads bound to the
same core, but from what I read in your first mail, this is not the
case.
Do you have a simple way to reproduce your issue with the original
example app?
Olivier
On Wed, Feb 19, 2020 at 11:37:21AM +0100, Victor Huertas wrote:
> Hi ,
>
> I put some maintainers as destination that could provide some extra
> information on this issue.
> I hope they can shed some light on this.
>
> Regards
>
> El mié., 19 feb. 2020 a las 9:29, Victor Huertas (<vhuertas@gmail.com>)
> escribió:
>
> > OK James,
> > Thanks for sharing your own experience.
> > What I would need right now is to know from maintainers if this latency
> > behaviour is something inherent in DPDK in the particular case we are
> > talking about. Furthermore, I would also appreciate it if some maintainer
> > could tell us if there is some workaround or special configuration that
> > completely mitigate this latency. I guess that there is one mitigation
> > mechanism, which is the approach that the new ip_pipeline app example
> > exposes: if two or more pipelines are in the same core the "connection"
> > between them is not a software queue but a "direct table connection".
> >
> > This proposed approach has a big impact on my application and I would like
> > to know if there is other mitigation approach taking into account the "old"
> > version of ip_pipeline example.
> >
> > Thanks for your attention
> >
> >
> > El mar., 18 feb. 2020 a las 23:09, James Huang (<jamsphon@gmail.com>)
> > escribió:
> >
> >> No. I didn't notice the RTT bouncing symptoms.
> >> In high throughput scenario, if multiple pipelines runs in a single cpu
> >> core, it does increase the latency.
> >>
> >>
> >> Regards,
> >> James Huang
> >>
> >>
> >> On Tue, Feb 18, 2020 at 1:50 AM Victor Huertas <vhuertas@gmail.com>
> >> wrote:
> >>
> >>> Dear James,
> >>>
> >>> I have done two different tests with the following configuration:
> >>> [PIPELINE 0 MASTER core =0]
> >>> [PIPELINE 1 core=1] --- SWQ1--->[PIPELINE 2 core=1] -----SWQ2---->
> >>> [PIPELINE 3 core=1]
> >>>
> >>> The first test (sending a single ping to cross all the pipelines to
> >>> measure RTT) has been done by setting the burst_write to 32 in SWQ1 and
> >>> SWQ2. NOTE: All the times we use rte_ring_enqueue_burst in the pipelines 1
> >>> and 2 we set the number of packets to write to 1.
> >>>
> >>> The result of this first test is as shown subsquently:
> >>> 64 bytes from 192.168.0.101: icmp_seq=343 ttl=63 time=59.8 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=344 ttl=63 time=59.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=345 ttl=63 time=59.2 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=346 ttl=63 time=59.0 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=347 ttl=63 time=59.0 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=348 ttl=63 time=59.2 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=349 ttl=63 time=59.3 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=350 ttl=63 time=59.1 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=351 ttl=63 time=58.9 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=352 ttl=63 time=58.5 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=353 ttl=63 time=58.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=354 ttl=63 time=58.0 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=355 ttl=63 time=58.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=356 ttl=63 time=57.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=357 ttl=63 time=56.9 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=358 ttl=63 time=57.2 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=359 ttl=63 time=57.5 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=360 ttl=63 time=57.3 ms
> >>>
> >>> As you can see, the RTT is quite high and the range of values is more or
> >>> less stable.
> >>>
> >>> The second test is the same as the first one but setting burst_write to
> >>> 1 for all SWQs. The result is this one:
> >>>
> >>> 64 bytes from 192.168.0.101: icmp_seq=131 ttl=63 time=10.6 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=132 ttl=63 time=10.6 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=133 ttl=63 time=10.5 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=134 ttl=63 time=10.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=135 ttl=63 time=10.8 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=136 ttl=63 time=10.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=137 ttl=63 time=10.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=138 ttl=63 time=10.5 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=139 ttl=63 time=10.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=140 ttl=63 time=10.2 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=141 ttl=63 time=10.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=142 ttl=63 time=10.9 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=143 ttl=63 time=11.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=144 ttl=63 time=11.3 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=145 ttl=63 time=11.5 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=146 ttl=63 time=11.6 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=147 ttl=63 time=11.0 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=148 ttl=63 time=11.3 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=149 ttl=63 time=12.0 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=150 ttl=63 time=12.6 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=151 ttl=63 time=12.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=152 ttl=63 time=12.3 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=153 ttl=63 time=12.8 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=154 ttl=63 time=12.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=155 ttl=63 time=12.8 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=156 ttl=63 time=12.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=157 ttl=63 time=12.6 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=158 ttl=63 time=12.9 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=159 ttl=63 time=13.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=160 ttl=63 time=13.8 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=161 ttl=63 time=13.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=162 ttl=63 time=13.3 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=163 ttl=63 time=13.3 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=164 ttl=63 time=13.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=165 ttl=63 time=13.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=166 ttl=63 time=13.8 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=167 ttl=63 time=14.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=168 ttl=63 time=14.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=169 ttl=63 time=14.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=170 ttl=63 time=14.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=171 ttl=63 time=14.6 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=172 ttl=63 time=14.6 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=173 ttl=63 time=14.5 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=174 ttl=63 time=14.5 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=175 ttl=63 time=15.1 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=176 ttl=63 time=15.6 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=177 ttl=63 time=16.0 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=178 ttl=63 time=16.9 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=179 ttl=63 time=17.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=180 ttl=63 time=17.6 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=181 ttl=63 time=17.9 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=182 ttl=63 time=17.9 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=183 ttl=63 time=18.5 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=184 ttl=63 time=18.9 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=185 ttl=63 time=19.8 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=186 ttl=63 time=19.8 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=187 ttl=63 time=10.7 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=188 ttl=63 time=10.5 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=189 ttl=63 time=10.4 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=190 ttl=63 time=10.3 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=191 ttl=63 time=10.5 ms
> >>> 64 bytes from 192.168.0.101: icmp_seq=192 ttl=63 time=10.7 ms
> >>> As you mentioned, the delay has decreased a lot but it is still
> >>> considerably high (in a normal router this delay is less than 1 ms).
> >>> A second strange behaviour is seen in the evolution of the RTT detected.
> >>> It begins in 10 ms and goes increasing little by litttle to reach a peak of
> >>> 20 ms aprox and then it suddely comes back to 10 ms again to increase again
> >>> till 20 ms.
> >>>
> >>> Is this the behaviour you have in your case when the burst_write is set
> >>> to 1?
> >>>
> >>> Regards,
> >>>
> >>> El mar., 18 feb. 2020 a las 8:18, James Huang (<jamsphon@gmail.com>)
> >>> escribió:
> >>>
> >>>> No. We didn't see noticable throughput difference in our test.
> >>>>
> >>>> On Mon., Feb. 17, 2020, 11:04 p.m. Victor Huertas <vhuertas@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Thanks James for your quick answer.
> >>>>> I guess that this configuration modification implies that the packets
> >>>>> must be written one by one in the sw ring. Did you notice loose of
> >>>>> performance (in throughput) in your aplicación because of that?
> >>>>>
> >>>>> Regards
> >>>>>
> >>>>> El mar., 18 feb. 2020 0:10, James Huang <jamsphon@gmail.com> escribió:
> >>>>>
> >>>>>> Yes, I experienced similar issue in my application. In a short
> >>>>>> answer, set the swqs write burst value to 1 may reduce the latency
> >>>>>> significantly. The default write burst value is 32.
> >>>>>>
> >>>>>> On Mon., Feb. 17, 2020, 8:41 a.m. Victor Huertas <vhuertas@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I am developing my own DPDK application basing it in the dpdk-stable
> >>>>>>> ip_pipeline example.
> >>>>>>> At this moment I am using the 17.11 LTS version of DPDK and I amb
> >>>>>>> observing
> >>>>>>> some extrange behaviour. Maybe it is an old issue that can be solved
> >>>>>>> quickly so I would appreciate it if some expert can shade a light on
> >>>>>>> this.
> >>>>>>>
> >>>>>>> The ip_pipeline example allows you to develop Pipelines that perform
> >>>>>>> specific packet processing functions (ROUTING, FLOW_CLASSIFYING,
> >>>>>>> etc...).
> >>>>>>> The thing is that I am extending some of this pipelines with my own.
> >>>>>>> However I want to take advantage of the built-in ip_pipeline
> >>>>>>> capability of
> >>>>>>> arbitrarily assigning the logical core where the pipeline (f_run()
> >>>>>>> function) must be executed so that i can adapt the packet processing
> >>>>>>> power
> >>>>>>> to the amount of the number of cores available.
> >>>>>>> Taking this into account I have observed something strange. I show
> >>>>>>> you this
> >>>>>>> simple example below.
> >>>>>>>
> >>>>>>> Case 1:
> >>>>>>> [PIPELINE 0 MASTER core =0]
> >>>>>>> [PIPELINE 1 core=1] --- SWQ1--->[PIPELINE 2 core=2] -----SWQ2---->
> >>>>>>> [PIPELINE 3 core=3]
> >>>>>>>
> >>>>>>> Case 2:
> >>>>>>> [PIPELINE 0 MASTER core =0]
> >>>>>>> [PIPELINE 1 core=1] --- SWQ1--->[PIPELINE 2 core=1] -----SWQ2---->
> >>>>>>> [PIPELINE 3 core=1]
> >>>>>>>
> >>>>>>> I send a ping between two hosts connected at both sides of the
> >>>>>>> pipeline
> >>>>>>> model which allows these pings to cross all the pipelines (from 1 to
> >>>>>>> 3).
> >>>>>>> What I observe in Case 1 (each pipeline has its own thread in
> >>>>>>> different
> >>>>>>> core) is that the reported RTT is less than 1 ms, whereas in Case 2
> >>>>>>> (all
> >>>>>>> pipelines except MASTER are run in the same thread) is 20 ms.
> >>>>>>> Furthermore,
> >>>>>>> in Case 2, if I increase a lot (hundreds of Mbps) the packet rate
> >>>>>>> this RTT
> >>>>>>> decreases to 3 or 4 ms.
> >>>>>>>
> >>>>>>> Has somebody observed this behaviour in the past? Can it be solved
> >>>>>>> somehow?
> >>>>>>>
> >>>>>>> Thanks a lot for your attention
> >>>>>>> --
> >>>>>>> Victor
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Victor
> >>>>>>>
> >>>>>>
> >>>
> >>> --
> >>> Victor
> >>>
> >>
> >
> > --
> > Victor
> >
>
>
> --
> Victor
next prev parent reply other threads:[~2020-02-19 10:53 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CAGxG5cjY+npJ7wVqcb9MXdtKkpC6RrgYpDQA2qbaAjD7i7C2EQ@mail.gmail.com>
2020-02-17 16:41 ` [dpdk-dev] " Victor Huertas
2020-02-17 23:10 ` James Huang
2020-02-18 7:04 ` Victor Huertas
2020-02-18 7:18 ` James Huang
2020-02-18 9:49 ` Victor Huertas
2020-02-18 22:08 ` James Huang
2020-02-19 8:29 ` Victor Huertas
2020-02-19 10:37 ` [dpdk-dev] Fwd: " Victor Huertas
2020-02-19 10:53 ` Olivier Matz [this message]
2020-02-19 12:05 ` Victor Huertas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200219105328.GF22738@platinum \
--to=olivier.matz@6wind.com \
--cc=cristian.dumitrescu@intel.com \
--cc=dev@dpdk.org \
--cc=jamsphon@gmail.com \
--cc=vhuertas@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).