From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id DCE28A0555; Wed, 19 Feb 2020 11:53:32 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 0F7394C81; Wed, 19 Feb 2020 11:53:32 +0100 (CET) Received: from mail-wm1-f51.google.com (mail-wm1-f51.google.com [209.85.128.51]) by dpdk.org (Postfix) with ESMTP id 9B06423D for ; Wed, 19 Feb 2020 11:53:30 +0100 (CET) Received: by mail-wm1-f51.google.com with SMTP id p9so88568wmc.2 for ; Wed, 19 Feb 2020 02:53:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind.com; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=b0VvQdNadHDJ9WPYmBwjWtJPzMWfxLQmG7Xzv0vWcnk=; b=SqoSX9uK8by/67mnOd0KziwZP2LeKLSR7pHbcN3ubaJhL7Y4D+wl3OlrDj7syNnDib /Mzri79+Fkv0TxyaiYwHVCt+bTtV5fhaBxLoZtlG0hZgOgOZXnD5D/x/GBU6tQ3H3+n8 SjbtFweXPqAhG3X/Yiy+iQVzPBgiBD8Mo5wqTQt82GSmabYy1tMvgK3t6NqavW4+0pum bQl/OVncjk3TVp2O8/H9tDT4PVcHYlGy2Sx23YJhHYXTyP8QaP7LqiRI5kP0nclPR8Xc QVSEW4To5YxrGpWEd20GLQOEMUpJ1TKm4o0ncpmVMNlt7nUljsB0SSFYXlqxR2gcJp+9 pj2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=b0VvQdNadHDJ9WPYmBwjWtJPzMWfxLQmG7Xzv0vWcnk=; b=blBATWBOk63yMIiWrdapd8sVcFiUvw/zsy5A0clZywVGaZWmuIEtvKmQepIazc6BkD sxOFz5VdN1CP6q3MT+/d6aMXNVZIlM+1JAhompGmjIscweFrT4dJ9xBuk21ii3DA60tk vsxc7e5H/2UbdyXQTt80h1m/mvaVGhrCQ6jA0wfk2QyAB5MfkIh6cIWsGZEr7On1QDDA Th/yTE3oDS8C/IV22zlnLd2TO8Q3lGBqP9rJDfOR1Rj7oaOnrD3JGLgTkkO19Ci7kAOi aiWeiEKAqHPMz+4a6CMgSFNAtndi318Cr6+y5xkxV/FIi4IOop9EoNEKnRRF1GpLp7ha nLWA== X-Gm-Message-State: APjAAAWSFVxep2ZxKevNvqzgLh0NrK/TMok2WFBMmRWEEgQCyoYVUHWP pjdNVQiPhKyFod9M6h6K7Y7JyQQfI5M= X-Google-Smtp-Source: APXvYqzkFy2inAh+xdafp3q4MzX0ACQEppoW34xvRsGeNfO8LbTU32WCYov+GvsQWzCDryd0sdIZ3w== X-Received: by 2002:a05:600c:228f:: with SMTP id 15mr9764123wmf.56.1582109610051; Wed, 19 Feb 2020 02:53:30 -0800 (PST) Received: from 6wind.com (2a01cb0c0005a600345636f7e65ed1a0.ipv6.abo.wanadoo.fr. [2a01:cb0c:5:a600:3456:36f7:e65e:d1a0]) by smtp.gmail.com with ESMTPSA id g19sm2511194wmh.36.2020.02.19.02.53.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 19 Feb 2020 02:53:29 -0800 (PST) Date: Wed, 19 Feb 2020 11:53:28 +0100 From: Olivier Matz To: Victor Huertas Cc: James Huang , cristian.dumitrescu@intel.com, dev Message-ID: <20200219105328.GF22738@platinum> References: MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Subject: Re: [dpdk-dev] Fwd: Fwd: high latency detected in IP pipeline example X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Victor, I have no experience with ip_pipeline. I can at least say that this latency is much higher that what you should get. My initial thought was that you were using several pthreads bound to the same core, but from what I read in your first mail, this is not the case. Do you have a simple way to reproduce your issue with the original example app? Olivier On Wed, Feb 19, 2020 at 11:37:21AM +0100, Victor Huertas wrote: > Hi , > > I put some maintainers as destination that could provide some extra > information on this issue. > I hope they can shed some light on this. > > Regards > > El mié., 19 feb. 2020 a las 9:29, Victor Huertas () > escribió: > > > OK James, > > Thanks for sharing your own experience. > > What I would need right now is to know from maintainers if this latency > > behaviour is something inherent in DPDK in the particular case we are > > talking about. Furthermore, I would also appreciate it if some maintainer > > could tell us if there is some workaround or special configuration that > > completely mitigate this latency. I guess that there is one mitigation > > mechanism, which is the approach that the new ip_pipeline app example > > exposes: if two or more pipelines are in the same core the "connection" > > between them is not a software queue but a "direct table connection". > > > > This proposed approach has a big impact on my application and I would like > > to know if there is other mitigation approach taking into account the "old" > > version of ip_pipeline example. > > > > Thanks for your attention > > > > > > El mar., 18 feb. 2020 a las 23:09, James Huang () > > escribió: > > > >> No. I didn't notice the RTT bouncing symptoms. > >> In high throughput scenario, if multiple pipelines runs in a single cpu > >> core, it does increase the latency. > >> > >> > >> Regards, > >> James Huang > >> > >> > >> On Tue, Feb 18, 2020 at 1:50 AM Victor Huertas > >> wrote: > >> > >>> Dear James, > >>> > >>> I have done two different tests with the following configuration: > >>> [PIPELINE 0 MASTER core =0] > >>> [PIPELINE 1 core=1] --- SWQ1--->[PIPELINE 2 core=1] -----SWQ2----> > >>> [PIPELINE 3 core=1] > >>> > >>> The first test (sending a single ping to cross all the pipelines to > >>> measure RTT) has been done by setting the burst_write to 32 in SWQ1 and > >>> SWQ2. NOTE: All the times we use rte_ring_enqueue_burst in the pipelines 1 > >>> and 2 we set the number of packets to write to 1. > >>> > >>> The result of this first test is as shown subsquently: > >>> 64 bytes from 192.168.0.101: icmp_seq=343 ttl=63 time=59.8 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=344 ttl=63 time=59.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=345 ttl=63 time=59.2 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=346 ttl=63 time=59.0 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=347 ttl=63 time=59.0 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=348 ttl=63 time=59.2 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=349 ttl=63 time=59.3 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=350 ttl=63 time=59.1 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=351 ttl=63 time=58.9 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=352 ttl=63 time=58.5 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=353 ttl=63 time=58.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=354 ttl=63 time=58.0 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=355 ttl=63 time=58.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=356 ttl=63 time=57.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=357 ttl=63 time=56.9 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=358 ttl=63 time=57.2 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=359 ttl=63 time=57.5 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=360 ttl=63 time=57.3 ms > >>> > >>> As you can see, the RTT is quite high and the range of values is more or > >>> less stable. > >>> > >>> The second test is the same as the first one but setting burst_write to > >>> 1 for all SWQs. The result is this one: > >>> > >>> 64 bytes from 192.168.0.101: icmp_seq=131 ttl=63 time=10.6 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=132 ttl=63 time=10.6 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=133 ttl=63 time=10.5 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=134 ttl=63 time=10.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=135 ttl=63 time=10.8 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=136 ttl=63 time=10.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=137 ttl=63 time=10.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=138 ttl=63 time=10.5 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=139 ttl=63 time=10.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=140 ttl=63 time=10.2 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=141 ttl=63 time=10.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=142 ttl=63 time=10.9 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=143 ttl=63 time=11.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=144 ttl=63 time=11.3 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=145 ttl=63 time=11.5 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=146 ttl=63 time=11.6 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=147 ttl=63 time=11.0 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=148 ttl=63 time=11.3 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=149 ttl=63 time=12.0 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=150 ttl=63 time=12.6 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=151 ttl=63 time=12.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=152 ttl=63 time=12.3 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=153 ttl=63 time=12.8 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=154 ttl=63 time=12.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=155 ttl=63 time=12.8 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=156 ttl=63 time=12.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=157 ttl=63 time=12.6 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=158 ttl=63 time=12.9 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=159 ttl=63 time=13.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=160 ttl=63 time=13.8 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=161 ttl=63 time=13.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=162 ttl=63 time=13.3 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=163 ttl=63 time=13.3 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=164 ttl=63 time=13.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=165 ttl=63 time=13.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=166 ttl=63 time=13.8 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=167 ttl=63 time=14.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=168 ttl=63 time=14.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=169 ttl=63 time=14.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=170 ttl=63 time=14.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=171 ttl=63 time=14.6 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=172 ttl=63 time=14.6 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=173 ttl=63 time=14.5 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=174 ttl=63 time=14.5 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=175 ttl=63 time=15.1 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=176 ttl=63 time=15.6 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=177 ttl=63 time=16.0 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=178 ttl=63 time=16.9 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=179 ttl=63 time=17.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=180 ttl=63 time=17.6 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=181 ttl=63 time=17.9 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=182 ttl=63 time=17.9 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=183 ttl=63 time=18.5 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=184 ttl=63 time=18.9 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=185 ttl=63 time=19.8 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=186 ttl=63 time=19.8 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=187 ttl=63 time=10.7 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=188 ttl=63 time=10.5 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=189 ttl=63 time=10.4 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=190 ttl=63 time=10.3 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=191 ttl=63 time=10.5 ms > >>> 64 bytes from 192.168.0.101: icmp_seq=192 ttl=63 time=10.7 ms > >>> As you mentioned, the delay has decreased a lot but it is still > >>> considerably high (in a normal router this delay is less than 1 ms). > >>> A second strange behaviour is seen in the evolution of the RTT detected. > >>> It begins in 10 ms and goes increasing little by litttle to reach a peak of > >>> 20 ms aprox and then it suddely comes back to 10 ms again to increase again > >>> till 20 ms. > >>> > >>> Is this the behaviour you have in your case when the burst_write is set > >>> to 1? > >>> > >>> Regards, > >>> > >>> El mar., 18 feb. 2020 a las 8:18, James Huang () > >>> escribió: > >>> > >>>> No. We didn't see noticable throughput difference in our test. > >>>> > >>>> On Mon., Feb. 17, 2020, 11:04 p.m. Victor Huertas > >>>> wrote: > >>>> > >>>>> Thanks James for your quick answer. > >>>>> I guess that this configuration modification implies that the packets > >>>>> must be written one by one in the sw ring. Did you notice loose of > >>>>> performance (in throughput) in your aplicación because of that? > >>>>> > >>>>> Regards > >>>>> > >>>>> El mar., 18 feb. 2020 0:10, James Huang escribió: > >>>>> > >>>>>> Yes, I experienced similar issue in my application. In a short > >>>>>> answer, set the swqs write burst value to 1 may reduce the latency > >>>>>> significantly. The default write burst value is 32. > >>>>>> > >>>>>> On Mon., Feb. 17, 2020, 8:41 a.m. Victor Huertas > >>>>>> wrote: > >>>>>> > >>>>>>> Hi all, > >>>>>>> > >>>>>>> I am developing my own DPDK application basing it in the dpdk-stable > >>>>>>> ip_pipeline example. > >>>>>>> At this moment I am using the 17.11 LTS version of DPDK and I amb > >>>>>>> observing > >>>>>>> some extrange behaviour. Maybe it is an old issue that can be solved > >>>>>>> quickly so I would appreciate it if some expert can shade a light on > >>>>>>> this. > >>>>>>> > >>>>>>> The ip_pipeline example allows you to develop Pipelines that perform > >>>>>>> specific packet processing functions (ROUTING, FLOW_CLASSIFYING, > >>>>>>> etc...). > >>>>>>> The thing is that I am extending some of this pipelines with my own. > >>>>>>> However I want to take advantage of the built-in ip_pipeline > >>>>>>> capability of > >>>>>>> arbitrarily assigning the logical core where the pipeline (f_run() > >>>>>>> function) must be executed so that i can adapt the packet processing > >>>>>>> power > >>>>>>> to the amount of the number of cores available. > >>>>>>> Taking this into account I have observed something strange. I show > >>>>>>> you this > >>>>>>> simple example below. > >>>>>>> > >>>>>>> Case 1: > >>>>>>> [PIPELINE 0 MASTER core =0] > >>>>>>> [PIPELINE 1 core=1] --- SWQ1--->[PIPELINE 2 core=2] -----SWQ2----> > >>>>>>> [PIPELINE 3 core=3] > >>>>>>> > >>>>>>> Case 2: > >>>>>>> [PIPELINE 0 MASTER core =0] > >>>>>>> [PIPELINE 1 core=1] --- SWQ1--->[PIPELINE 2 core=1] -----SWQ2----> > >>>>>>> [PIPELINE 3 core=1] > >>>>>>> > >>>>>>> I send a ping between two hosts connected at both sides of the > >>>>>>> pipeline > >>>>>>> model which allows these pings to cross all the pipelines (from 1 to > >>>>>>> 3). > >>>>>>> What I observe in Case 1 (each pipeline has its own thread in > >>>>>>> different > >>>>>>> core) is that the reported RTT is less than 1 ms, whereas in Case 2 > >>>>>>> (all > >>>>>>> pipelines except MASTER are run in the same thread) is 20 ms. > >>>>>>> Furthermore, > >>>>>>> in Case 2, if I increase a lot (hundreds of Mbps) the packet rate > >>>>>>> this RTT > >>>>>>> decreases to 3 or 4 ms. > >>>>>>> > >>>>>>> Has somebody observed this behaviour in the past? Can it be solved > >>>>>>> somehow? > >>>>>>> > >>>>>>> Thanks a lot for your attention > >>>>>>> -- > >>>>>>> Victor > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> Victor > >>>>>>> > >>>>>> > >>> > >>> -- > >>> Victor > >>> > >> > > > > -- > > Victor > > > > > -- > Victor