Date: Mon, 07 Dec 2020 23:16:04 +0100
From: Alex Kiselev
To: "Singh, Jasvinder"
Cc: users@dpdk.org, "Dumitrescu, Cristian", "Dharmappa, Savinay"
Message-ID: <4e5bde1cf78b0f77f4a5ec016a7217d6@therouter.net>
Subject: Re: [dpdk-users] scheduler issue

On 2020-12-07 21:34, Alex Kiselev wrote: > On 2020-12-07 20:29, Singh, Jasvinder wrote: >>> On 7 Dec 2020, at 19:09, Alex Kiselev wrote: >>> >>> On 2020-12-07 20:07, Alex Kiselev wrote: >>>>> On 2020-12-07 19:18, Alex Kiselev wrote: >>>>> On 2020-12-07 18:59, Singh, Jasvinder wrote: >>>>>>> On 7 Dec 2020, at 17:45, Alex Kiselev wrote: >>>>>>> On 2020-12-07 18:31, Singh, Jasvinder wrote: >>>>>>>>> -----Original Message----- >>>>>>>>> From: Alex Kiselev >>>>>>>>> Sent: Monday, December 7, 2020 4:50 PM >>>>>>>>> To: Singh, Jasvinder >>>>>>>>> Cc: users@dpdk.org; Dumitrescu, Cristian >>>>>>>>> ; >>>>>>>>> Dharmappa, Savinay >>>>>>>>> Subject: Re: [dpdk-users] scheduler issue >>>>>>>>>> On 2020-12-07 12:32, Singh, Jasvinder wrote: >>>>>>>>> >> -----Original Message----- >>>>>>>>> >> From: Alex Kiselev >>>>>>>>> >> Sent: Monday, December 7, 2020 10:46 AM >>>>>>>>> >> To: Singh, Jasvinder >>>>>>>>> >> Cc: users@dpdk.org; Dumitrescu, Cristian >>>>>>>>> >> ; Dharmappa, Savinay >>>>>>>>> >> >>>>>>>>> >> Subject: Re: [dpdk-users] scheduler issue >>>>>>>>> >> >>>>>>>>> >> On 2020-12-07 11:00, Singh, Jasvinder wrote: >>>>>>>>> >> >> -----Original Message----- >>>>>>>>> >> >> From: users On Behalf Of Alex Kiselev >>>>>>>>> >> >> Sent: Friday, November 27, 2020 12:12 PM >>>>>>>>> >> >> To: users@dpdk.org >>>>>>>>> >> >> Cc: Dumitrescu, Cristian >>>>>>>>> >> >> Subject: Re: [dpdk-users] scheduler issue >>>>>>>>> >> >> >>>>>>>>> >> >> On 
2020-11-25 16:04, Alex Kiselev wrote: >>>>>>>>> >> >> > On 2020-11-24 16:34, Alex Kiselev wrote: >>>>>>>>> >> >> >> Hello, >>>>>>>>> >> >> >> >>>>>>>>> >> >> >> I am facing a problem with the scheduler library of DPDK 18.11.10 >>>>>>>>> >> >> >> with default scheduler settings (RED is off). >>>>>>>>> >> >> >> It seems like some of the pipes (last time it was 4 out of 600 >>>>>>>>> >> >> >> pipes) start incorrectly dropping most of the traffic after a >>>>>>>>> >> >> >> couple of days of successful operation. >>>>>>>>> >> >> >> >>>>>>>>> >> >> >> So far I've checked that there are no mbuf leaks or any other >>>>>>>>> >> >> >> errors in my code, and I am sure that traffic enters the problematic >>>>>>>>> pipes. >>>>>>>>> >> >> >> Also, switching the traffic at runtime to pipes of another >>>>>>>>> >> >> >> port restores the traffic flow. >>>>>>>>> >> >> >> >>>>>>>>> >> >> >> How do I approach debugging this issue? >>>>>>>>> >> >> >> >>>>>>>>> >> >> >> I've added calls to rte_sched_queue_read_stats(), but it doesn't >>>>>>>>> >> >> >> give me counters that accumulate values (packet drops, for >>>>>>>>> >> >> >> example); it gives me some kind of current values, and after a >>>>>>>>> >> >> >> couple of seconds those values are reset to zero, so I can conclude >>>>>>>>> nothing based on that API. >>>>>>>>> >> >> >> >>>>>>>>> >> >> >> I would appreciate any ideas and help. >>>>>>>>> >> >> >> Thanks. >>>>>>>>> >> >> > >>>>>>>>> >> >> > Problematic pipes had a very low bandwidth limit (1 Mbit/s), and >>>>>>>>> >> >> > there is also an oversubscription configuration at subport >>>>>>>>> >> >> > 0 of port >>>>>>>>> >> >> > 13, to which those pipes belong, while >>>>>>>>> >> >> CONFIG_RTE_SCHED_SUBPORT_TC_OV is >>>>>>>>> >> >> > disabled. >>>>>>>>> >> >> > >>>>>>>>> >> >> > Could congestion at that subport be the reason for the problem? >>>>>>>>> >> >> > >>>>>>>>> >> >> > How much overhead and performance degradation would enabling the >>>>>>>>> >> >> > CONFIG_RTE_SCHED_SUBPORT_TC_OV feature add? 
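For reference, rte_sched_queue_read_stats() in DPDK 18.11 returns the per-queue counters accumulated since the previous call and clears them inside the library, so cumulative drop counters have to be kept on the application side. A minimal sketch of such accumulation (the helper name, MAX_APP_QUEUES and the totals[] array are illustrative, not taken from the application):

#include <stdint.h>
#include <rte_sched.h>

#define MAX_APP_QUEUES 16384            /* illustrative upper bound */

struct queue_totals {
        uint64_t pkts;
        uint64_t bytes;
        uint64_t pkts_dropped;
        uint64_t bytes_dropped;
};

static struct queue_totals totals[MAX_APP_QUEUES];

/* Call periodically; the library hands back the deltas since the last
 * read and resets its own counters. */
static void
poll_queue_stats(struct rte_sched_port *port, uint32_t queue_id)
{
        struct rte_sched_queue_stats stats;
        uint16_t qlen;

        if (rte_sched_queue_read_stats(port, queue_id, &stats, &qlen) != 0)
                return;

        totals[queue_id].pkts          += stats.n_pkts;
        totals[queue_id].bytes         += stats.n_bytes;
        totals[queue_id].pkts_dropped  += stats.n_pkts_dropped;
        totals[queue_id].bytes_dropped += stats.n_bytes_dropped;
}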
>>>>>>>>> >> >> > >>>>>>>>> >> >> > Configuration: >>>>>>>>> >> >> > >>>>>>>>> >> >> > # >>>>>>>>> >> >> > # QoS Scheduler Profiles >>>>>>>>> >> >> > # >>>>>>>>> >> >> > hqos add profile 1 rate 8 K size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 2 rate 400 K size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 3 rate 600 K size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 4 rate 800 K size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 5 rate 1 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 6 rate 1500 K size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 7 rate 2 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 8 rate 3 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 9 rate 4 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 10 rate 5 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 11 rate 6 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 12 rate 8 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 13 rate 10 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 14 rate 12 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 15 rate 15 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 16 rate 16 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 17 rate 20 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 18 rate 30 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 19 rate 32 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 20 rate 40 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 21 rate 50 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 22 rate 60 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 23 rate 100 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 24 rate 25 M size 1000000 tc period 40 >>>>>>>>> >> >> > hqos add profile 25 rate 50 M size 1000000 tc period 40 >>>>>>>>> >> >> > >>>>>>>>> >> >> > # >>>>>>>>> >> >> > # Port 13 >>>>>>>>> >> >> > # >>>>>>>>> >> >> > hqos add port 13 rate 40 G mtu 1522 frame overhead 24 queue >>>>>>>>> >> >> > sizes >>>>>>>>> >> >> > 64 >>>>>>>>> >> >> > 64 64 64 >>>>>>>>> >> >> > hqos add port 13 subport 0 rate 1500 M size 1000000 tc period 10 >>>>>>>>> >> >> > hqos add port 13 subport 0 pipes 3000 profile 2 >>>>>>>>> >> >> > hqos add port 13 subport 0 pipes 3000 profile 5 >>>>>>>>> >> >> > hqos add port 13 subport 0 pipes 3000 profile 6 >>>>>>>>> >> >> > hqos add port 13 subport 0 pipes 3000 profile 7 >>>>>>>>> >> >> > hqos add port 13 subport 0 pipes 3000 profile 9 >>>>>>>>> >> >> > hqos add port 13 subport 0 pipes 3000 profile 11 >>>>>>>>> >> >> > hqos set port 13 lcore 5 >>>>>>>>> >> >> >>>>>>>>> >> >> I've enabled TC_OV feature and redirected most of the traffic to TC3. >>>>>>>>> >> >> But the issue still exists. >>>>>>>>> >> >> >>>>>>>>> >> >> Below is queue statistics of one of problematic pipes. >>>>>>>>> >> >> Almost all of the traffic entering the pipe is dropped. >>>>>>>>> >> >> >>>>>>>>> >> >> And the pipe is also configured with the 1Mbit/s profile. >>>>>>>>> >> >> So, the issue is only with very low bandwidth pipe profiles. >>>>>>>>> >> >> >>>>>>>>> >> >> And this time there was no congestion on the subport. 
>>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>>> >> >> Egress qdisc >>>>>>>>> >> >> dir 0 >>>>>>>>> >> >> rate 1M >>>>>>>>> >> >> port 6, subport 0, pipe_id 138, profile_id 5 >>>>>>>>> >> >> tc 0, queue 0: bytes 752, bytes dropped 0, pkts 8, pkts dropped 0 >>>>>>>>> >> >> tc 0, queue 1: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0 >>>>>>>>> >> >> tc 0, queue 2: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0 >>>>>>>>> >> >> tc 0, queue 3: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0 >>>>>>>>> >> >> tc 1, queue 0: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0 >>>>>>>>> >> >> tc 1, queue 1: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0 >>>>>>>>> >> >> tc 1, queue 2: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0 >>>>>>>>> >> >> tc 1, queue 3: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0 >>>>>>>>> >> >> tc 2, queue 0: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0 >>>>>>>>> >> >> tc 2, queue 1: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0 >>>>>>>>> >> >> tc 2, queue 2: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0 >>>>>>>>> >> >> tc 2, queue 3: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0 >>>>>>>>> >> >> tc 3, queue 0: bytes 56669, bytes dropped 360242, pkts 150, >>>>>>>>> >> >> pkts dropped >>>>>>>>> >> >> 3749 >>>>>>>>> >> >> tc 3, queue 1: bytes 63005, bytes dropped 648782, pkts 150, >>>>>>>>> >> >> pkts dropped >>>>>>>>> >> >> 3164 >>>>>>>>> >> >> tc 3, queue 2: bytes 9984, bytes dropped 49704, pkts 128, pkts >>>>>>>>> >> >> dropped >>>>>>>>> >> >> 636 >>>>>>>>> >> >> tc 3, queue 3: bytes 15436, bytes dropped 107198, pkts 130, >>>>>>>>> >> >> pkts dropped >>>>>>>>> >> >> 354 >>>>>>>>> >> > >>>>>>>>> >> > >>>>>>>>> >> > Hi Alex, >>>>>>>>> >> > >>>>>>>>> >> > Can you try newer version of the library, say dpdk 20.11? >>>>>>>>> >> >>>>>>>>> >> Right now no, since switching to another DPDK will take a lot of time >>>>>>>>> >> because I am using a lot of custom patches. >>>>>>>>> >> >>>>>>>>> >> I've tried to simply copy the entire rte_sched lib from DPDK 19 to >>>>>>>>> >> DPDK 18. >>>>>>>>> >> And I was able to successful back port and resolve all dependency >>>>>>>>> >> issues, but it also will take some time to test this approach. >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> > Are you >>>>>>>>> >> > using dpdk qos sample app or your own app? >>>>>>>>> >> >>>>>>>>> >> My own app. >>>>>>>>> >> >>>>>>>>> >> >> What are the packets size? >>>>>>>>> >> >>>>>>>>> >> Application is used as BRAS/BNG server, so it's used to provide >>>>>>>>> >> internet access to residential customers. Therefore packet sizes are >>>>>>>>> >> typical to the internet and vary from 64 to 1500 bytes. Most of the >>>>>>>>> >> packets are around >>>>>>>>> >> 1000 bytes. >>>>>>>>> >> >>>>>>>>> >> > >>>>>>>>> >> > Couple of other things for clarification- 1. At what rate you are >>>>>>>>> >> > injecting the traffic to low bandwidth pipes? >>>>>>>>> >> >>>>>>>>> >> Well, the rate vary also, there could be congestion on some pipes at >>>>>>>>> >> some date time. >>>>>>>>> >> >>>>>>>>> >> But the problem is that once the problem occurs at a pipe or at some >>>>>>>>> >> queues inside the pipe, the pipe stops transmitting even when >>>>>>>>> >> incoming traffic rate is much lower than the pipe's rate. >>>>>>>>> >> >>>>>>>>> >> > 2. How is traffic distributed among pipes and their traffic class? >>>>>>>>> >> >>>>>>>>> >> I am using IPv4 TOS field to choose the TC and there is a tos2tc map. >>>>>>>>> >> Most of my traffic has 0 tos value which is mapped to TC3 inside my >>>>>>>>> >> app. 
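A rough sketch of such a TOS-based classification in front of the scheduler enqueue, using the 18.11-era rte_sched_port_pkt_write() signature (no port argument yet). The tos2tc[] contents, the subport/pipe arguments and the assumption of an untagged IPv4-over-Ethernet frame are illustrative; the queue index uses the (ipv4.src + ipv4.dst) & 3 formula mentioned just below:

#include <rte_byteorder.h>
#include <rte_ether.h>
#include <rte_ip.h>
#include <rte_mbuf.h>
#include <rte_meter.h>
#include <rte_sched.h>

/* Illustrative map: every TOS value goes to TC3, as in the "map everything
 * to TC3" experiment; real contents are application specific. */
static uint8_t tos2tc[256] = { [0 ... 255] = 3 };

static void
classify_and_mark(struct rte_mbuf *m, uint32_t subport, uint32_t pipe)
{
        /* assumes an untagged Ethernet + IPv4 packet */
        const struct ipv4_hdr *ip = rte_pktmbuf_mtod_offset(m,
                        const struct ipv4_hdr *, sizeof(struct ether_hdr));

        uint32_t tc = tos2tc[ip->type_of_service];
        /* queue selection formula from the thread: (ipv4.src + ipv4.dst) & 3 */
        uint32_t queue = (rte_be_to_cpu_32(ip->src_addr) +
                          rte_be_to_cpu_32(ip->dst_addr)) & 3;

        /* DPDK 18.11 signature; later releases add a struct rte_sched_port *
         * argument and use enum rte_color instead. */
        rte_sched_port_pkt_write(m, subport, pipe, tc, queue, e_RTE_METER_GREEN);
}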
>>>>>>>>> >> >>>>>>>>> >> Recently I've switched to a tos2map which maps all traffic to TC3 to >>>>>>>>> >> see if it solves the problem. >>>>>>>>> >> >>>>>>>>> >> Packet distribution to queues is done using the formula (ipv4.src + >>>>>>>>> >> ipv4.dst) & 3 >>>>>>>>> >> >>>>>>>>> >> > 3. Can you try putting your own counters on those pipes queues >>>>>>>>> >> > which periodically show the #packets in the queues to understand >>>>>>>>> >> > the dynamics? >>>>>>>>> >> >>>>>>>>> >> I will try. >>>>>>>>> >> >>>>>>>>> >> P.S. >>>>>>>>> >> >>>>>>>>> >> Recently I've got another problem with scheduler. >>>>>>>>> >> >>>>>>>>> >> After enabling the TC_OV feature one of the ports stops transmitting. >>>>>>>>> >> All port's pipes were affected. >>>>>>>>> >> Port had only one support, and there were only pipes with 1 Mbit/s >>>>>>>>> >> profile. >>>>>>>>> >> The problem was solved by adding a 10Mit/s profile to that port. Only >>>>>>>>> >> after that port's pipes started to transmit. >>>>>>>>> >> I guess it has something to do with calculating tc_ov_wm as it >>>>>>>>> >> depends on the maximum pipe rate. >>>>>>>>> >> >>>>>>>>> >> I am gonna make a test lab and a test build to reproduce this. >>>>>>>>> I've made some tests and was able to reproduce the port >>>>>>>>> configuration issue >>>>>>>>> using a test build of my app. >>>>>>>>> Tests showed that TC_OV feature works not correctly in DPDK >>>>>>>>> 18.11, but >>>>>>>>> there are workarounds. >>>>>>>>> I still can't reproduce my main problem which is random pipes >>>>>>>>> stop >>>>>>>>> transmitting. >>>>>>>>> Here are details: >>>>>>>>> All tests use the same test traffic generator that produce >>>>>>>>> 10 traffic flows entering 10 different pipes of port 1 subport >>>>>>>>> 0. >>>>>>>>> Only queue 0 of each pipe is used. >>>>>>>>> TX rate is 800 kbit/s. packet size is 800 byte. >>>>>>>>> Pipes rate are 1 Mbit/s. Subport 0 rate is 500 Mbit/s. >>>>>>>>> ### >>>>>>>>> ### test 1 >>>>>>>>> ### >>>>>>>>> Traffic generator is configured to use TC3. >>>>>>>>> Configuration: >>>>>>>>> hqos add profile 27 rate 1 M size 1000000 tc period 40 >>>>>>>>> hqos add profile 23 rate 100 M size 1000000 tc period 40 >>>>>>>>> # qos test port >>>>>>>>> hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue >>>>>>>>> sizes 64 64 >>>>>>>>> 64 64 >>>>>>>>> hqos add port 1 subport 0 rate 500 M size 1000000 tc period 10 >>>>>>>>> hqos add port 1 subport 0 pipes 2000 profile 27 >>>>>>>>> hqos set port 1 lcore 3 >>>>>>>>> Results: >>>>>>>>> h5 ~ # rcli sh qos rcv >>>>>>>>> rcv 0: rx rate 641280, nb pkts 501, ind 1 rcv 1: rx rate >>>>>>>>> 641280, nb pkts 501, ind >>>>>>>>> 1 rcv 2: rx rate 641280, nb pkts 501, ind 1 rcv 3: rx rate >>>>>>>>> 641280, nb pkts 501, >>>>>>>>> ind 1 rcv 4: rx rate 641280, nb pkts 501, ind 1 rcv 5: rx rate >>>>>>>>> 641280, nb pkts >>>>>>>>> 501, ind 1 rcv 6: rx rate 641280, nb pkts 501, ind 1 rcv 7: rx >>>>>>>>> rate 641280, nb >>>>>>>>> pkts 501, ind 1 rcv 8: rx rate 641280, nb pkts 501, ind 1 rcv >>>>>>>>> 9: rx rate 641280, >>>>>>>>> nb pkts 501, ind 1 >>>>>>>>> ! BUG >>>>>>>>> ! RX rate is lower then expected 800000 bit/s despite that >>>>>>>>> there is no >>>>>>>>> congestion neither at subport nor at pipes levels. >>>>>>>> [JS] - Can you elaborate on your scheduler hierarchy? >>>>>>> sure, take a look below at the output >>>>>>> "number of pipes per subport" >>>>>>> TR application always round the total number of pipes per port >>>>>>> to a power2 value. >>>>>>> I mean- how >>>>>>>> many pipes per subport? 
It has to be the number that can be >>>>>>>> expressed >>>>>>>> as power of 2, for e.g 4K, 2K, 1K etc. In run time, scheduler >>>>>>>> will >>>>>>>> scan all the pipes and will process only those which have got >>>>>>>> packets >>>>>>>> in their queue. >>>>>>> Configuration of port 1 with enabled profile 23 >>>>>>> h5 ~ # rcli sh hqos ports >>>>>>> hqos scheduler port: 1 >>>>>>> lcore_id: 3 >>>>>>> socket: 0 >>>>>>> rate: 0 >>>>>>> mtu: 1522 >>>>>>> frame overhead: 24 >>>>>>> number of pipes per subport: 4096 >>>>>>> pipe profiles: 2 >>>>>>> pipe profile id: 27 >>>>>>> pipe rate: 1000000 >>>>>>> number of pipes: 2000 >>>>>>> pipe pool size: 2000 >>>>>>> number of pipes in use: 0 >>>>>>> pipe profile id: 23 >>>>>>> pipe rate: 100000000 >>>>>>> number of pipes: 200 >>>>>>> pipe pool size: 200 >>>>>>> number of pipes in use: 0 >>>>>>> Configuration with only one profile at port 1 >>>>>>> hqos scheduler port: 1 >>>>>>> lcore_id: 3 >>>>>>> socket: 0 >>>>>>> rate: 0 >>>>>>> mtu: 1522 >>>>>>> frame overhead: 24 >>>>>>> number of pipes per subport: 2048 >>>>>>> pipe profiles: 1 >>>>>>> pipe profile id: 27 >>>>>>> pipe rate: 1000000 >>>>>>> number of pipes: 2000 >>>>>>> pipe pool size: 2000 >>>>>>> number of pipes in use: 0 >>>>>>> [JS] what is the meaning of number of pipes , Pipe pool size, >>>>>>> and number of pipes in use which is zero above? Does your >>>>>>> application map packet field values to these number of pipes in >>>>>>> run time ? Can you give me example of mapping of packet field >>>>>>> values to pipe id, tc, queue? >>>>> please, ignore all information from the outputs above except the >>>>> "number of pipes per subport". >>>>> since the tests were made with a simple test application which is >>>>> based on TR but doesn't use >>>>> it's production QoS logic. >>>>> The tests 1 - 5 were made with a simple test application with a >>>>> very >>>>> straitforward qos mappings >>>>> which I described at the beginning. >>>>> Here they are: >>>>> All tests use the same test traffic generator that produce >>>>> 10 traffic flows entering 10 different pipes (0 - 9) of port 1 >>>>> subport 0. >>>>> Only queue 0 of each pipe is used. >>>>> TX rate is 800 kbit/s. packet size is 800 byte. >>>>> Pipes rate are 1 Mbit/s. Subport 0 rate is 500 Mbit/s. >>>>>>>>> ### >>>>>>>>> ### test 2 >>>>>>>>> ### >>>>>>>>> Traffic generator is configured to use TC3. >>>>>>>>> !!! profile 23 has been added to the test port. >>>>>>>>> Configuration: >>>>>>>>> hqos add profile 27 rate 1 M size 1000000 tc period 40 >>>>>>>>> hqos add profile 23 rate 100 M size 1000000 tc period 40 >>>>>>>>> # qos test port >>>>>>>>> hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue >>>>>>>>> sizes 64 64 >>>>>>>>> 64 64 >>>>>>>>> hqos add port 1 subport 0 rate 500 M size 1000000 tc period 10 >>>>>>>>> hqos add port 1 subport 0 pipes 2000 profile 27 >>>>>>>>> hqos add port 1 subport 0 pipes 200 profile 23 >>>>>>>>> hqos set port 1 lcore 3 >>>>>>>>> Results: >>>>>>>>> h5 ~ # rcli sh qos rcv >>>>>>>>> rcv 0: rx rate 798720, nb pkts 624, ind 1 rcv 1: rx rate >>>>>>>>> 798720, nb pkts 624, ind >>>>>>>>> 1 rcv 2: rx rate 798720, nb pkts 624, ind 1 rcv 3: rx rate >>>>>>>>> 798720, nb pkts 624, >>>>>>>>> ind 1 rcv 4: rx rate 798720, nb pkts 624, ind 1 rcv 5: rx rate >>>>>>>>> 798720, nb pkts >>>>>>>>> 624, ind 1 rcv 6: rx rate 798720, nb pkts 624, ind 1 rcv 7: rx >>>>>>>>> rate 798720, nb >>>>>>>>> pkts 624, ind 1 rcv 8: rx rate 798720, nb pkts 624, ind 1 rcv >>>>>>>>> 9: rx rate 798720, >>>>>>>>> nb pkts 624, ind 1 >>>>>>>>> OK. 
>>>>>>>>> Receiving traffic is rate is equal to expected values. >>>>>>>>> So, just adding a pipes which are not being used solves the >>>>>>>>> problem. >>>>>>>>> ### >>>>>>>>> ### test 3 >>>>>>>>> ### >>>>>>>>> !!! traffic generator uses TC 0, so tc_ov is not being used in >>>>>>>>> this test. >>>>>>>>> profile 23 is not used. >>>>>>>>> Configuration without profile 23. >>>>>>>>> hqos add profile 27 rate 1 M size 1000000 tc period 40 >>>>>>>>> hqos add profile 23 rate 100 M size 1000000 tc period 40 >>>>>>>>> # qos test port >>>>>>>>> hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue >>>>>>>>> sizes 64 64 >>>>>>>>> 64 64 >>>>>>>>> hqos add port 1 subport 0 rate 500 M size 1000000 tc period 10 >>>>>>>>> hqos add port 1 subport 0 pipes 2000 profile 27 >>>>>>>>> hqos set port 1 lcore 3 >>>>>>>>> Restuls: >>>>>>>>> h5 ~ # rcli sh qos rcv >>>>>>>>> rcv 0: rx rate 798720, nb pkts 624, ind 0 rcv 1: rx rate >>>>>>>>> 798720, nb pkts 624, ind >>>>>>>>> 0 rcv 2: rx rate 798720, nb pkts 624, ind 0 rcv 3: rx rate >>>>>>>>> 798720, nb pkts 624, >>>>>>>>> ind 0 rcv 4: rx rate 798720, nb pkts 624, ind 0 rcv 5: rx rate >>>>>>>>> 798720, nb pkts >>>>>>>>> 624, ind 0 rcv 6: rx rate 798720, nb pkts 624, ind 0 rcv 7: rx >>>>>>>>> rate 798720, nb >>>>>>>>> pkts 624, ind 0 rcv 8: rx rate 798720, nb pkts 624, ind 0 rcv >>>>>>>>> 9: rx rate 798720, >>>>>>>>> nb pkts 624, ind 0 >>>>>>>>> OK. >>>>>>>>> Receiving traffic is rate is equal to expected values. >>>>>>>>> ### >>>>>>>>> ### test 4 >>>>>>>>> ### >>>>>>>>> Traffic generator is configured to use TC3. >>>>>>>>> no profile 23. >>>>>>>>> !! subport tc period has been changed from 10 to 5. >>>>>>>>> Configuration: >>>>>>>>> hqos add profile 27 rate 1 M size 1000000 tc period 40 >>>>>>>>> hqos add profile 23 rate 100 M size 1000000 tc period 40 >>>>>>>>> # qos test port >>>>>>>>> hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue >>>>>>>>> sizes 64 64 >>>>>>>>> 64 64 >>>>>>>>> hqos add port 1 subport 0 rate 500 M size 1000000 tc period 5 >>>>>>>>> hqos add port 1 subport 0 pipes 2000 profile 27 >>>>>>>>> hqos set port 1 lcore 3 >>>>>>>>> Restuls: >>>>>>>>> rcv 0: rx rate 0, nb pkts 0, ind 1 >>>>>>>>> rcv 1: rx rate 0, nb pkts 0, ind 1 >>>>>>>>> rcv 2: rx rate 0, nb pkts 0, ind 1 >>>>>>>>> rcv 3: rx rate 0, nb pkts 0, ind 1 >>>>>>>>> rcv 4: rx rate 0, nb pkts 0, ind 1 >>>>>>>>> rcv 5: rx rate 0, nb pkts 0, ind 1 >>>>>>>>> rcv 6: rx rate 0, nb pkts 0, ind 1 >>>>>>>>> rcv 7: rx rate 0, nb pkts 0, ind 1 >>>>>>>>> rcv 8: rx rate 0, nb pkts 0, ind 1 >>>>>>>>> rcv 9: rx rate 0, nb pkts 0, ind 1 >>>>>>>>> ! zero traffic >>>>>>>>> ### >>>>>>>>> ### test 5 >>>>>>>>> ### >>>>>>>>> Traffic generator is configured to use TC3. >>>>>>>>> profile 23 is enabled. >>>>>>>>> subport tc period has been changed from 10 to 5. 
>>>>>>>>> Configuration: >>>>>>>>> hqos add profile 27 rate 1 M size 1000000 tc period 40 >>>>>>>>> hqos add profile 23 rate 100 M size 1000000 tc period 40 >>>>>>>>> # qos test port >>>>>>>>> hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue >>>>>>>>> sizes 64 64 >>>>>>>>> 64 64 >>>>>>>>> hqos add port 1 subport 0 rate 500 M size 1000000 tc period 5 >>>>>>>>> hqos add port 1 subport 0 pipes 2000 profile 27 >>>>>>>>> hqos add port 1 subport 0 pipes 200 profile 23 >>>>>>>>> hqos set port 1 lcore 3 >>>>>>>>> Restuls: >>>>>>>>> h5 ~ # rcli sh qos rcv >>>>>>>>> rcv 0: rx rate 800000, nb pkts 625, ind 1 rcv 1: rx rate >>>>>>>>> 800000, nb pkts 625, ind >>>>>>>>> 1 rcv 2: rx rate 800000, nb pkts 625, ind 1 rcv 3: rx rate >>>>>>>>> 800000, nb pkts 625, >>>>>>>>> ind 1 rcv 4: rx rate 800000, nb pkts 625, ind 1 rcv 5: rx rate >>>>>>>>> 800000, nb pkts >>>>>>>>> 625, ind 1 rcv 6: rx rate 800000, nb pkts 625, ind 1 rcv 7: rx >>>>>>>>> rate 800000, nb >>>>>>>>> pkts 625, ind 1 rcv 8: rx rate 800000, nb pkts 625, ind 1 rcv >>>>>>>>> 9: rx rate 800000, >>>>>>>>> nb pkts 625, ind 1 >>>>>>>>> OK >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > Does this problem exist when you disable oversubscription mode? Worth >>>>>>>>> > looking at grinder_tc_ov_credits_update() and grinder_credits_update() >>>>>>>>> > functions where tc_ov_wm is altered. >>>>>>>>> > >>>>>>>>> > >>>>>>>>> >> > >>>>>>>>> >> > Thanks, >>>>>>>>> >> > Jasvinder >>>> Ok, these new two tests show even more clearly that TC_OV feature is >>>> broken. >>>> Test 1 doesn't use TC_OV and all available support bandwidth is >>>> distributed to 300 pipes in a very fair way. There are 300 >>>> generators >>>> with tx rate 1M. They produce 300 traffic flows which enter each >>>> own pipe (queue 0) of port 1 subport 0. >>>> port 1 >>>> subport rate 300 M >>>> Then application measures the rx rate of each flows after flow's >>>> traffic leaves the scheduler. >>>> For example, the following line >>>> rcv 284 rx rate 995840 nb pkts 778 >>>> shows that rx rate of the flow with number 284 is 995840 bit/s. >>>> All 300 rx rates are about 995840 bit/s (1Mbit/s) as expected. >> >> [JS] May be try repeat same test but change traffic from tc 0 to >> tc3. See if this works. > > The second test with incorrect restuls was already done with tc3. > >> >>>> The second test uses the same configuration >>>> but uses TC3, so the TC_OV function is being used. >>>> And the distribution of traffic in the test is very unfair. >>>> Some of the pipes get 875520 bit/s, some of the pipes get only >>>> 604160 bit/s despite that there is >>> >> >> [JS] try repeat test with increase pipe bandwidth let’s say 50 mbps or >> even greater. > > I increased pipe rate to 10Mbit/s and both tests (tc0 and tc3) showed > correct and identical results. > > But, then I changed the tests and increased the number of pipes to 600 > to see how it would work with subport congestion. I added 600 pipes > generatinig 10Mbit/s > and 3G subport limit, therefore each pipe should get equal share which > is about 5mbits. > And results of both tests (tc0 or tc3) are not very good. > > First pipes are getting much more bandwidth than the last ones. > The difference is 3 times. So, TC_OV is still not working!! decreasing subport tc_period from 10 to 5 has solved that problem and scheduler started to distribute subport bandwidth between 10 mbit/s pipes almost ideally. 
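The earlier observation that adding the 100 Mbit/s profile 23 changes behaviour fits how, as far as I can tell, DPDK 18.11 sizes the oversubscription watermark in rte_sched_subport_config(): the ceiling is roughly the number of bytes the fastest pipe TC3 rate allows in one subport tc period, and the floor is the port MTU plus frame overhead. The subport tc period enters the same formula, so changing it moves the ceiling as well. A back-of-the-envelope sketch of that arithmetic (the helper only mirrors the computation, it is not the library code, and it assumes each profile's TC3 rate equals the pipe rate):

#include <stdint.h>
#include <stdio.h>

/* bytes the fastest pipe TC3 may send in one subport tc_period, i.e. the
 * watermark ceiling (mirrors tc_period * pipe_tc3_rate_max) */
static uint32_t
tc_ov_wm_max(uint32_t tc_period_ms, uint64_t pipe_tc3_rate_bytes_per_s)
{
        return (uint32_t)((tc_period_ms * pipe_tc3_rate_bytes_per_s) / 1000);
}

int main(void)
{
        uint32_t wm_min = 1522 + 24;  /* MTU + frame overhead = 1546 bytes */

        /* only the 1 Mbit/s profile, subport tc period 10 ms:
         * 125000 B/s * 10 ms = 1250 bytes, less than one full frame */
        printf("1M profile only : wm_max=%u, wm_min=%u\n",
               tc_ov_wm_max(10, 125000), wm_min);

        /* with the 100 Mbit/s profile 23 added, the fastest TC3 rate rises:
         * 12500000 B/s * 10 ms = 125000 bytes */
        printf("with profile 23 : wm_max=%u, wm_min=%u\n",
               tc_ov_wm_max(10, 12500000), wm_min);

        return 0;
}

With only 1 Mbit/s profiles and a 10 ms subport tc period the ceiling (1250 bytes) sits below a single 1546-byte frame, which at least suggests why such configurations misbehave and why the maximum pipe rate has such a strong effect; whether the tc period change helps through this path or through the subport TC credit granularity I cannot say for sure.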
> > rcv 0 rx rate 7324160 nb pkts 5722 > rcv 1 rx rate 7281920 nb pkts 5689 > rcv 2 rx rate 7226880 nb pkts 5646 > rcv 3 rx rate 7124480 nb pkts 5566 > rcv 4 rx rate 7324160 nb pkts 5722 > rcv 5 rx rate 7271680 nb pkts 5681 > rcv 6 rx rate 7188480 nb pkts 5616 > rcv 7 rx rate 7150080 nb pkts 5586 > rcv 8 rx rate 7328000 nb pkts 5725 > rcv 9 rx rate 7249920 nb pkts 5664 > rcv 10 rx rate 7188480 nb pkts 5616 > rcv 11 rx rate 7179520 nb pkts 5609 > rcv 12 rx rate 7324160 nb pkts 5722 > rcv 13 rx rate 7208960 nb pkts 5632 > rcv 14 rx rate 7152640 nb pkts 5588 > rcv 15 rx rate 7127040 nb pkts 5568 > rcv 16 rx rate 7303680 nb pkts 5706 > .... > rcv 587 rx rate 2406400 nb pkts 1880 > rcv 588 rx rate 2406400 nb pkts 1880 > rcv 589 rx rate 2406400 nb pkts 1880 > rcv 590 rx rate 2406400 nb pkts 1880 > rcv 591 rx rate 2406400 nb pkts 1880 > rcv 592 rx rate 2398720 nb pkts 1874 > rcv 593 rx rate 2400000 nb pkts 1875 > rcv 594 rx rate 2400000 nb pkts 1875 > rcv 595 rx rate 2400000 nb pkts 1875 > rcv 596 rx rate 2401280 nb pkts 1876 > rcv 597 rx rate 2401280 nb pkts 1876 > rcv 598 rx rate 2401280 nb pkts 1876 > rcv 599 rx rate 2402560 nb pkts 1877 > rx rate sum 3156416000 > > > >> >> >> >>> ... despite that there is _NO_ congestion... >>> >>> congestion at the subport or pipe. >>>> And the subport !! doesn't use about 42 mbit/s of available >>>> bandwidth. >>>> The only difference is those test configurations is TC of generated >>>> traffic. >>>> Test 1 uses TC 1 while test 2 uses TC 3 (which is use TC_OV >>>> function). >>>> So, enabling TC_OV changes the results dramatically. >>>> ## >>>> ## test1 >>>> ## >>>> hqos add profile 7 rate 2 M size 1000000 tc period 40 >>>> # qos test port >>>> hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue sizes 64 >>>> 64 64 64 >>>> hqos add port 1 subport 0 rate 300 M size 1000000 tc period 10 >>>> hqos add port 1 subport 0 pipes 2000 profile 7 >>>> hqos add port 1 subport 0 pipes 200 profile 23 >>>> hqos set port 1 lcore 3 >>>> port 1 >>>> subport rate 300 M >>>> number of tx flows 300 >>>> generator tx rate 1M >>>> TC 1 >>>> ... >>>> rcv 284 rx rate 995840 nb pkts 778 >>>> rcv 285 rx rate 995840 nb pkts 778 >>>> rcv 286 rx rate 995840 nb pkts 778 >>>> rcv 287 rx rate 995840 nb pkts 778 >>>> rcv 288 rx rate 995840 nb pkts 778 >>>> rcv 289 rx rate 995840 nb pkts 778 >>>> rcv 290 rx rate 995840 nb pkts 778 >>>> rcv 291 rx rate 995840 nb pkts 778 >>>> rcv 292 rx rate 995840 nb pkts 778 >>>> rcv 293 rx rate 995840 nb pkts 778 >>>> rcv 294 rx rate 995840 nb pkts 778 >>>> ... >>>> sum pipe's rx rate is 298 494 720 >>>> OK. >>>> The subport rate is equally distributed to 300 pipes. 
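A quick sanity check of the test 1 numbers (only arithmetic on the figures quoted above; the measurement window is an inference, not stated in the thread):

#include <stdio.h>

int main(void)
{
        const double pkt_bits  = 800.0 * 8;   /* 800-byte packets          */
        const double n_pkts    = 778.0;       /* per receiver              */
        const double rx_rate   = 995840.0;    /* bit/s, per receiver       */
        const double n_flows   = 300.0;

        /* 778 * 6400 / 995840 = 5.0, i.e. an apparent 5-second window */
        printf("implied window  ~ %.1f s\n", n_pkts * pkt_bits / rx_rate);
        /* ~298.8 Mbit/s, consistent with the quoted sum of 298,494,720
         * bit/s and the 300 Mbit/s subport rate */
        printf("aggregate rate  ~ %.0f bit/s\n", n_flows * rx_rate);
        return 0;
}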
>>>> ## >>>> ## test 2 >>>> ## >>>> hqos add profile 7 rate 2 M size 1000000 tc period 40 >>>> # qos test port >>>> hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue sizes 64 >>>> 64 64 64 >>>> hqos add port 1 subport 0 rate 300 M size 1000000 tc period 10 >>>> hqos add port 1 subport 0 pipes 2000 profile 7 >>>> hqos add port 1 subport 0 pipes 200 profile 23 >>>> hqos set port 1 lcore 3 >>>> port 1 >>>> subport rate 300 M >>>> number of tx flows 300 >>>> generator tx rate 1M >>>> TC 3 >>>> h5 ~ # rcli sh qos rcv >>>> rcv 0 rx rate 875520 nb pkts 684 >>>> rcv 1 rx rate 856320 nb pkts 669 >>>> rcv 2 rx rate 849920 nb pkts 664 >>>> rcv 3 rx rate 853760 nb pkts 667 >>>> rcv 4 rx rate 867840 nb pkts 678 >>>> rcv 5 rx rate 844800 nb pkts 660 >>>> rcv 6 rx rate 852480 nb pkts 666 >>>> rcv 7 rx rate 855040 nb pkts 668 >>>> rcv 8 rx rate 865280 nb pkts 676 >>>> rcv 9 rx rate 846080 nb pkts 661 >>>> rcv 10 rx rate 858880 nb pkts 671 >>>> rcv 11 rx rate 870400 nb pkts 680 >>>> rcv 12 rx rate 864000 nb pkts 675 >>>> rcv 13 rx rate 852480 nb pkts 666 >>>> rcv 14 rx rate 855040 nb pkts 668 >>>> rcv 15 rx rate 857600 nb pkts 670 >>>> rcv 16 rx rate 864000 nb pkts 675 >>>> rcv 17 rx rate 866560 nb pkts 677 >>>> rcv 18 rx rate 865280 nb pkts 676 >>>> rcv 19 rx rate 858880 nb pkts 671 >>>> rcv 20 rx rate 856320 nb pkts 669 >>>> rcv 21 rx rate 864000 nb pkts 675 >>>> rcv 22 rx rate 869120 nb pkts 679 >>>> rcv 23 rx rate 856320 nb pkts 669 >>>> rcv 24 rx rate 862720 nb pkts 674 >>>> rcv 25 rx rate 865280 nb pkts 676 >>>> rcv 26 rx rate 867840 nb pkts 678 >>>> rcv 27 rx rate 870400 nb pkts 680 >>>> rcv 28 rx rate 860160 nb pkts 672 >>>> rcv 29 rx rate 870400 nb pkts 680 >>>> rcv 30 rx rate 869120 nb pkts 679 >>>> rcv 31 rx rate 870400 nb pkts 680 >>>> rcv 32 rx rate 858880 nb pkts 671 >>>> rcv 33 rx rate 858880 nb pkts 671 >>>> rcv 34 rx rate 852480 nb pkts 666 >>>> rcv 35 rx rate 874240 nb pkts 683 >>>> rcv 36 rx rate 855040 nb pkts 668 >>>> rcv 37 rx rate 853760 nb pkts 667 >>>> rcv 38 rx rate 869120 nb pkts 679 >>>> rcv 39 rx rate 885760 nb pkts 692 >>>> rcv 40 rx rate 861440 nb pkts 673 >>>> rcv 41 rx rate 852480 nb pkts 666 >>>> rcv 42 rx rate 871680 nb pkts 681 >>>> ... >>>> ... >>>> rcv 288 rx rate 766720 nb pkts 599 >>>> rcv 289 rx rate 766720 nb pkts 599 >>>> rcv 290 rx rate 766720 nb pkts 599 >>>> rcv 291 rx rate 766720 nb pkts 599 >>>> rcv 292 rx rate 762880 nb pkts 596 >>>> rcv 293 rx rate 762880 nb pkts 596 >>>> rcv 294 rx rate 762880 nb pkts 596 >>>> rcv 295 rx rate 760320 nb pkts 594 >>>> rcv 296 rx rate 604160 nb pkts 472 >>>> rcv 297 rx rate 604160 nb pkts 472 >>>> rcv 298 rx rate 604160 nb pkts 472 >>>> rcv 299 rx rate 604160 nb pkts 472 >>>> rx rate sum 258839040 >>>> FAILED. >>>> The subport rate is distributed NOT equally between 300 pipes. >>>> Some subport bandwith (about 42) is not being used!
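For the numbers above: 300 flows at 1 Mbit/s offer 300 Mbit/s, and the received sum is 258,839,040 bit/s, so roughly 41 Mbit/s of the subport is indeed left unused, matching the "about 42" figure. Regarding grinder_tc_ov_credits_update() and grinder_credits_update() mentioned earlier: my reading of the 18.11 logic, heavily simplified and paraphrased rather than copied, is that the shared TC3 watermark is adapted once per subport tc period and clamped between the two bounds discussed above, roughly like this:

#include <stdint.h>

/* Paraphrase (NOT the actual DPDK code) of how the 18.11 scheduler appears
 * to adapt the per-subport TC3 oversubscription watermark: shrink it when
 * best-effort traffic ate into the bandwidth left over by TC0-TC2, grow it
 * back otherwise, and clamp it to [wm_min, wm_max]. */
struct tc_ov_state {
        uint32_t wm;      /* current watermark, bytes per tc_period        */
        uint32_t wm_min;  /* port MTU + frame overhead                     */
        uint32_t wm_max;  /* tc_period * fastest pipe TC3 rate, in bytes   */
};

void
tc_ov_wm_update(struct tc_ov_state *s,
                uint32_t tc3_consumed,   /* TC3 bytes sent last period     */
                uint32_t tc3_leftover)   /* subport bytes left unused by   */
                                         /* TC0-TC2 last period            */
{
        if (tc3_consumed > tc3_leftover) {
                /* best effort overshot: back off by ~1/128 */
                s->wm -= s->wm >> 7;
                if (s->wm < s->wm_min)
                        s->wm = s->wm_min;
        } else {
                /* spare subport bandwidth: grow by ~1/128 */
                s->wm += (s->wm >> 7) + 1;
                if (s->wm > s->wm_max)
                        s->wm = s->wm_max;
        }
}

Again, this is only a paraphrase; the authoritative logic is in lib/librte_sched/rte_sched.c.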