From: Victor Huertas
Date: Wed, 19 Feb 2020 11:37:21 +0100
To: James Huang, cristian.dumitrescu@intel.com, dev, olivier.matz@6wind.com
Subject: [dpdk-dev] Fwd: Fwd: high latency detected in IP pipeline example
List-Id: DPDK patches and discussions

Hi,

I have added some maintainers as recipients who may be able to provide
extra information on this issue. I hope they can shed some light on it.

Regards

On Wed, Feb 19, 2020 at 9:29 AM, Victor Huertas () wrote:

> OK James,
> Thanks for sharing your own experience.
> What I would need right now is to know from the maintainers whether this
> latency behaviour is inherent to DPDK in the particular case we are
> discussing. Furthermore, I would also appreciate it if a maintainer could
> tell us whether there is a workaround or special configuration that
> completely mitigates this latency. I suspect there is one mitigation
> mechanism, which is the approach taken by the new ip_pipeline app
> example: if two or more pipelines run on the same core, the "connection"
> between them is not a software queue but a "direct table connection".
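>
> Roughly, the difference between the two wirings is something like the
> sketch below (illustrative only, not the actual ip_pipeline code;
> stage1_run() and stage2_run() are hypothetical stand-ins for the work
> the pipelines' f_run() functions do):
>
>     #include <stdint.h>
>     #include <rte_ring.h>
>     #include <rte_mbuf.h>
>
>     #define BURST 32
>
>     /* hypothetical per-stage packet handlers */
>     static uint16_t stage1_run(struct rte_mbuf **pkts, uint16_t n);
>     static void     stage2_run(struct rte_mbuf **pkts, uint16_t n);
>
>     /* (a) stages joined by a software queue: stage 1 enqueues, and
>      * the packets sit in the ring until stage 2 is scheduled again */
>     static void
>     run_stage1_swq(struct rte_ring *swq, struct rte_mbuf **pkts,
>         uint16_t n)
>     {
>         n = stage1_run(pkts, n);
>         rte_ring_enqueue_burst(swq, (void **)pkts, n, NULL);
>     }
>
>     static void
>     run_stage2_swq(struct rte_ring *swq)
>     {
>         struct rte_mbuf *pkts[BURST];
>         unsigned int n;
>
>         n = rte_ring_dequeue_burst(swq, (void **)pkts, BURST, NULL);
>         stage2_run(pkts, (uint16_t)n);
>     }
>
>     /* (b) same-core "direct connection": stage 1 hands its output
>      * straight to stage 2 within the same iteration, so no packet
>      * waits in an intermediate queue */
>     static void
>     run_direct(struct rte_mbuf **pkts, uint16_t n)
>     {
>         n = stage1_run(pkts, n);
>         stage2_run(pkts, n);
>     }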
> This proposed approach has a big impact on my application, and I would
> like to know whether there is another mitigation approach that works
> with the "old" version of the ip_pipeline example.
>
> Thanks for your attention
>
>
> On Tue, Feb 18, 2020 at 11:09 PM, James Huang () wrote:
>
>> No. I didn't notice the RTT bouncing symptom.
>> In a high-throughput scenario, running multiple pipelines on a single
>> CPU core does increase the latency.
>>
>>
>> Regards,
>> James Huang
>>
>>
>> On Tue, Feb 18, 2020 at 1:50 AM Victor Huertas
>> wrote:
>>
>>> Dear James,
>>>
>>> I have done two different tests with the following configuration:
>>> [PIPELINE 0 MASTER core=0]
>>> [PIPELINE 1 core=1] ---SWQ1---> [PIPELINE 2 core=1] ---SWQ2---> [PIPELINE 3 core=1]
>>>
>>> The first test (sending a single ping across all the pipelines to
>>> measure RTT) was done with burst_write set to 32 on SWQ1 and SWQ2.
>>> NOTE: every time we call rte_ring_enqueue_burst in pipelines 1 and 2,
>>> we set the number of packets to write to 1.
>>>
>>> The result of this first test is shown below:
>>> 64 bytes from 192.168.0.101: icmp_seq=343 ttl=63 time=59.8 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=344 ttl=63 time=59.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=345 ttl=63 time=59.2 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=346 ttl=63 time=59.0 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=347 ttl=63 time=59.0 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=348 ttl=63 time=59.2 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=349 ttl=63 time=59.3 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=350 ttl=63 time=59.1 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=351 ttl=63 time=58.9 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=352 ttl=63 time=58.5 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=353 ttl=63 time=58.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=354 ttl=63 time=58.0 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=355 ttl=63 time=58.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=356 ttl=63 time=57.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=357 ttl=63 time=56.9 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=358 ttl=63 time=57.2 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=359 ttl=63 time=57.5 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=360 ttl=63 time=57.3 ms
>>>
>>> As you can see, the RTT is quite high, and the range of values is
>>> more or less stable.
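>>>
>>> My working theory for those ~59 ms figures is the buffering that
>>> burst_write controls on the SWQ writer side. A simplified sketch of
>>> that behaviour (modelled loosely on librte_port's ring writer;
>>> illustrative only, not the real code):
>>>
>>>     #include <stdint.h>
>>>     #include <rte_ring.h>
>>>     #include <rte_mbuf.h>
>>>
>>>     struct swq_writer {
>>>         struct rte_ring *ring;
>>>         struct rte_mbuf *buf[32];
>>>         uint32_t count;
>>>         uint32_t burst;   /* burst_write from the config, <= 32 */
>>>     };
>>>
>>>     static void
>>>     swq_writer_tx(struct swq_writer *w, struct rte_mbuf *pkt)
>>>     {
>>>         w->buf[w->count++] = pkt;
>>>         if (w->count < w->burst)
>>>             return;   /* with burst_write=32, a lone ping parks here
>>>                        * until more packets pile up or a periodic
>>>                        * flush finally pushes it out */
>>>         rte_ring_enqueue_burst(w->ring, (void **)w->buf, w->count,
>>>             NULL);
>>>         w->count = 0;
>>>     }
>>>
>>> With burst_write=1 the buffer is flushed on every packet, which is
>>> consistent with the second test below.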
>>> The second test is the same as the first one, but with burst_write
>>> set to 1 for all SWQs. The result is this:
>>>
>>> 64 bytes from 192.168.0.101: icmp_seq=131 ttl=63 time=10.6 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=132 ttl=63 time=10.6 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=133 ttl=63 time=10.5 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=134 ttl=63 time=10.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=135 ttl=63 time=10.8 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=136 ttl=63 time=10.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=137 ttl=63 time=10.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=138 ttl=63 time=10.5 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=139 ttl=63 time=10.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=140 ttl=63 time=10.2 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=141 ttl=63 time=10.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=142 ttl=63 time=10.9 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=143 ttl=63 time=11.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=144 ttl=63 time=11.3 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=145 ttl=63 time=11.5 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=146 ttl=63 time=11.6 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=147 ttl=63 time=11.0 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=148 ttl=63 time=11.3 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=149 ttl=63 time=12.0 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=150 ttl=63 time=12.6 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=151 ttl=63 time=12.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=152 ttl=63 time=12.3 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=153 ttl=63 time=12.8 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=154 ttl=63 time=12.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=155 ttl=63 time=12.8 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=156 ttl=63 time=12.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=157 ttl=63 time=12.6 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=158 ttl=63 time=12.9 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=159 ttl=63 time=13.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=160 ttl=63 time=13.8 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=161 ttl=63 time=13.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=162 ttl=63 time=13.3 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=163 ttl=63 time=13.3 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=164 ttl=63 time=13.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=165 ttl=63 time=13.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=166 ttl=63 time=13.8 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=167 ttl=63 time=14.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=168 ttl=63 time=14.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=169 ttl=63 time=14.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=170 ttl=63 time=14.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=171 ttl=63 time=14.6 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=172 ttl=63 time=14.6 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=173 ttl=63 time=14.5 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=174 ttl=63 time=14.5 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=175 ttl=63 time=15.1 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=176 ttl=63 time=15.6 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=177 ttl=63 time=16.0 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=178 ttl=63 time=16.9 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=179 ttl=63 time=17.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=180 ttl=63 time=17.6 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=181 ttl=63 time=17.9 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=182 ttl=63 time=17.9 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=183 ttl=63 time=18.5 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=184 ttl=63 time=18.9 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=185 ttl=63 time=19.8 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=186 ttl=63 time=19.8 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=187 ttl=63 time=10.7 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=188 ttl=63 time=10.5 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=189 ttl=63 time=10.4 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=190 ttl=63 time=10.3 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=191 ttl=63 time=10.5 ms
>>> 64 bytes from 192.168.0.101: icmp_seq=192 ttl=63 time=10.7 ms
>>>
>>> As you mentioned, the delay has decreased a lot, but it is still
>>> considerably high (in a normal router this delay is less than 1 ms).
>>> A second strange behaviour shows up in the evolution of the RTT: it
>>> starts at 10 ms, increases little by little up to a peak of about
>>> 20 ms, then suddenly drops back to 10 ms and starts climbing towards
>>> 20 ms again.
>>>
>>> Is this the behaviour you see in your case when burst_write is set
>>> to 1?
>>>
>>> Regards,
>>>
>>> On Tue, Feb 18, 2020 at 8:18 AM, James Huang () wrote:
>>>
>>>> No. We didn't see a noticeable throughput difference in our test.
>>>>
>>>> On Mon, Feb 17, 2020 at 11:04 PM Victor Huertas
>>>> wrote:
>>>>
>>>>> Thanks James for your quick answer.
>>>>> I guess this configuration change implies that the packets must be
>>>>> written one by one into the software ring. Did you notice a loss
>>>>> of throughput in your application because of that?
>>>>>
>>>>> Regards
>>>>>
>>>>> On Tue, Feb 18, 2020 at 12:10 AM, James Huang () wrote:
>>>>>
>>>>>> Yes, I experienced a similar issue in my application. In short,
>>>>>> setting the SWQ write burst value to 1 may reduce the latency
>>>>>> significantly. The default write burst value is 32.
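>>>>>>
>>>>>> In the config file that is the burst_write knob on the queue
>>>>>> section, something like the sketch below (assuming the 17.11
>>>>>> ip_pipeline config schema; SWQ0 and the other values are just
>>>>>> illustrative):
>>>>>>
>>>>>>     [SWQ0]
>>>>>>     size = 4096
>>>>>>     burst_read = 32
>>>>>>     burst_write = 1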
>>>>>>
>>>>>> On Mon, Feb 17, 2020 at 8:41 AM Victor Huertas
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I am developing my own DPDK application based on the dpdk-stable
>>>>>>> ip_pipeline example.
>>>>>>> At the moment I am using the 17.11 LTS version of DPDK, and I am
>>>>>>> observing some strange behaviour. Maybe it is an old issue that
>>>>>>> can be solved quickly, so I would appreciate it if an expert
>>>>>>> could shed some light on it.
>>>>>>>
>>>>>>> The ip_pipeline example lets you develop pipelines that perform
>>>>>>> specific packet processing functions (ROUTING, FLOW_CLASSIFYING,
>>>>>>> etc.). I am extending some of these pipelines with my own, but I
>>>>>>> want to keep the built-in ip_pipeline capability of arbitrarily
>>>>>>> assigning the logical core on which each pipeline (its f_run()
>>>>>>> function) is executed, so that I can adapt the packet processing
>>>>>>> power to the number of cores available.
>>>>>>> With this in mind, I have observed something strange. Consider
>>>>>>> the simple example below.
>>>>>>>
>>>>>>> Case 1:
>>>>>>> [PIPELINE 0 MASTER core=0]
>>>>>>> [PIPELINE 1 core=1] ---SWQ1---> [PIPELINE 2 core=2] ---SWQ2---> [PIPELINE 3 core=3]
>>>>>>>
>>>>>>> Case 2:
>>>>>>> [PIPELINE 0 MASTER core=0]
>>>>>>> [PIPELINE 1 core=1] ---SWQ1---> [PIPELINE 2 core=1] ---SWQ2---> [PIPELINE 3 core=1]
>>>>>>>
>>>>>>> I send pings between two hosts connected at the two ends of the
>>>>>>> pipeline model, so that the pings cross all the pipelines (from
>>>>>>> 1 to 3).
>>>>>>> What I observe in Case 1 (each pipeline in its own thread on a
>>>>>>> different core) is that the reported RTT is less than 1 ms,
>>>>>>> whereas in Case 2 (all pipelines except MASTER running in the
>>>>>>> same thread) it is 20 ms. Furthermore, in Case 2, if I increase
>>>>>>> the packet rate a lot (hundreds of Mbps), this RTT decreases to
>>>>>>> 3 or 4 ms.
>>>>>>>
>>>>>>> Has somebody observed this behaviour in the past? Can it be
>>>>>>> solved somehow?
>>>>>>>
>>>>>>> Thanks a lot for your attention
>>>>>>> --
>>>>>>> Victor
>>>>>
>>>
>>> --
>>> Victor
>>
>
> --
> Victor

--
Victor