DPDK usage discussions
From: Alex Kiselev <alex@therouter.net>
To: "Singh, Jasvinder" <jasvinder.singh@intel.com>
Cc: users@dpdk.org, "Dumitrescu,
	Cristian" <cristian.dumitrescu@intel.com>,
	"Dharmappa, Savinay" <savinay.dharmappa@intel.com>
Subject: Re: [dpdk-users] scheduler issue
Date: Mon, 07 Dec 2020 23:16:04 +0100	[thread overview]
Message-ID: <4e5bde1cf78b0f77f4a5ec016a7217d6@therouter.net> (raw)
In-Reply-To: <e6a0429dc4a1a33861a066e3401e85b6@therouter.net>

On 2020-12-07 21:34, Alex Kiselev wrote:
> On 2020-12-07 20:29, Singh, Jasvinder wrote:
>>> On 7 Dec 2020, at 19:09, Alex Kiselev <alex@therouter.net> wrote:
>>> 
>>> On 2020-12-07 20:07, Alex Kiselev wrote:
>>>>> On 2020-12-07 19:18, Alex Kiselev wrote:
>>>>> On 2020-12-07 18:59, Singh, Jasvinder wrote:
>>>>>>> On 7 Dec 2020, at 17:45, Alex Kiselev <alex@therouter.net> wrote:
>>>>>>> On 2020-12-07 18:31, Singh, Jasvinder wrote:
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Alex Kiselev <alex@therouter.net>
>>>>>>>>> Sent: Monday, December 7, 2020 4:50 PM
>>>>>>>>> To: Singh, Jasvinder <jasvinder.singh@intel.com>
>>>>>>>>> Cc: users@dpdk.org; Dumitrescu, Cristian 
>>>>>>>>> <cristian.dumitrescu@intel.com>;
>>>>>>>>> Dharmappa, Savinay <savinay.dharmappa@intel.com>
>>>>>>>>> Subject: Re: [dpdk-users] scheduler issue
>>>>>>>>>> On 2020-12-07 12:32, Singh, Jasvinder wrote:
>>>>>>>>> >> -----Original Message-----
>>>>>>>>> >> From: Alex Kiselev <alex@therouter.net>
>>>>>>>>> >> Sent: Monday, December 7, 2020 10:46 AM
>>>>>>>>> >> To: Singh, Jasvinder <jasvinder.singh@intel.com>
>>>>>>>>> >> Cc: users@dpdk.org; Dumitrescu, Cristian
>>>>>>>>> >> <cristian.dumitrescu@intel.com>; Dharmappa, Savinay
>>>>>>>>> >> <savinay.dharmappa@intel.com>
>>>>>>>>> >> Subject: Re: [dpdk-users] scheduler issue
>>>>>>>>> >>
>>>>>>>>> >> On 2020-12-07 11:00, Singh, Jasvinder wrote:
>>>>>>>>> >> >> -----Original Message-----
>>>>>>>>> >> >> From: users <users-bounces@dpdk.org> On Behalf Of Alex Kiselev
>>>>>>>>> >> >> Sent: Friday, November 27, 2020 12:12 PM
>>>>>>>>> >> >> To: users@dpdk.org
>>>>>>>>> >> >> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
>>>>>>>>> >> >> Subject: Re: [dpdk-users] scheduler issue
>>>>>>>>> >> >>
>>>>>>>>> >> >> On 2020-11-25 16:04, Alex Kiselev wrote:
>>>>>>>>> >> >> > On 2020-11-24 16:34, Alex Kiselev wrote:
>>>>>>>>> >> >> >> Hello,
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> I am facing a problem with the scheduler library DPDK 18.11.10
>>>>>>>>> >> >> >> with default scheduler settings (RED is off).
>>>>>>>>> >> >> >> It seems like some of the pipes (last time it was 4 out of 600
>>>>>>>>> >> >> >> pipes) start incorrectly dropping most of the traffic after a
>>>>>>>>> >> >> >> couple of days of successful work.
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> So far I've checked that there are no mbuf leaks or any other
>>>>>>>>> >> >> >> errors in my code, and I am sure that traffic enters the
>>>>>>>>> >> >> >> problematic pipes.
>>>>>>>>> >> >> >> Also, switching traffic at runtime to pipes of another port
>>>>>>>>> >> >> >> restores the traffic flow.
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> How do I approach debugging this issue?
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> I've tried using rte_sched_queue_read_stats(), but it doesn't
>>>>>>>>> >> >> >> give me counters that accumulate values (packet drops, for
>>>>>>>>> >> >> >> example); it gives me some kind of current values, and after a
>>>>>>>>> >> >> >> couple of seconds those values are reset to zero, so I can say
>>>>>>>>> >> >> >> nothing based on that API.
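>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> (For reference, how I plan to work around that: since
>>>>>>>>> >> >> >> rte_sched_queue_read_stats() clears the queue counters on every
>>>>>>>>> >> >> >> call, the deltas have to be summed into application-level
>>>>>>>>> >> >> >> counters. A minimal sketch; struct queue_acc is my own helper,
>>>>>>>>> >> >> >> not a DPDK type, and the queue-id arithmetic assumes the 18.11
>>>>>>>>> >> >> >> layout of 4 TCs x 4 queues per pipe:)
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>   #include <rte_sched.h>
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>   struct queue_acc {
>>>>>>>>> >> >> >>           uint64_t pkts, pkts_dropped, bytes, bytes_dropped;
>>>>>>>>> >> >> >>   };
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>   static void
>>>>>>>>> >> >> >>   poll_queue_stats(struct rte_sched_port *port, uint32_t subport,
>>>>>>>>> >> >> >>                    uint32_t n_pipes_per_subport, uint32_t pipe,
>>>>>>>>> >> >> >>                    uint32_t tc, uint32_t q, struct queue_acc *acc)
>>>>>>>>> >> >> >>   {
>>>>>>>>> >> >> >>           struct rte_sched_queue_stats stats;
>>>>>>>>> >> >> >>           uint16_t qlen;
>>>>>>>>> >> >> >>           uint32_t qid = ((subport * n_pipes_per_subport + pipe) *
>>>>>>>>> >> >> >>                           RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE + tc) *
>>>>>>>>> >> >> >>                           RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + q;
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >>           /* the read also clears, so accumulate the deltas here */
>>>>>>>>> >> >> >>           if (rte_sched_queue_read_stats(port, qid, &stats, &qlen) == 0) {
>>>>>>>>> >> >> >>                   acc->pkts          += stats.n_pkts;
>>>>>>>>> >> >> >>                   acc->pkts_dropped  += stats.n_pkts_dropped;
>>>>>>>>> >> >> >>                   acc->bytes         += stats.n_bytes;
>>>>>>>>> >> >> >>                   acc->bytes_dropped += stats.n_bytes_dropped;
>>>>>>>>> >> >> >>           }
>>>>>>>>> >> >> >>   }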
>>>>>>>>> >> >> >>
>>>>>>>>> >> >> >> I would appreciate any ideas and help.
>>>>>>>>> >> >> >> Thanks.
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > Problematic pipes had a very low bandwidth limit (1 Mbit/s),
>>>>>>>>> >> >> > and there is also an oversubscription configuration at subport 0
>>>>>>>>> >> >> > of port 13, to which those pipes belong, while
>>>>>>>>> >> >> > CONFIG_RTE_SCHED_SUBPORT_TC_OV is disabled.
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > Could congestion at that subport be the reason for the problem?
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > How much overhead and performance degradation would enabling the
>>>>>>>>> >> >> > CONFIG_RTE_SCHED_SUBPORT_TC_OV feature add?
>>>>>>>>> >> >> >
>>>>>>>>> >> >> > Configuration:
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >   #
>>>>>>>>> >> >> >   # QoS Scheduler Profiles
>>>>>>>>> >> >> >   #
>>>>>>>>> >> >> >   hqos add profile  1 rate    8 K size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile  2 rate  400 K size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile  3 rate  600 K size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile  4 rate  800 K size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile  5 rate    1 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile  6 rate 1500 K size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile  7 rate    2 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile  8 rate    3 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile  9 rate    4 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 10 rate    5 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 11 rate    6 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 12 rate    8 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 13 rate   10 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 14 rate   12 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 15 rate   15 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 16 rate   16 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 17 rate   20 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 18 rate   30 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 19 rate   32 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 20 rate   40 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 21 rate   50 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 22 rate   60 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 23 rate  100 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 24 rate 25 M size 1000000 tc period 40
>>>>>>>>> >> >> >   hqos add profile 25 rate 50 M size 1000000 tc period 40
>>>>>>>>> >> >> >
>>>>>>>>> >> >> >   #
>>>>>>>>> >> >> >   # Port 13
>>>>>>>>> >> >> >   #
>>>>>>>>> >> >> >   hqos add port 13 rate 40 G mtu 1522 frame overhead 24 queue sizes 64 64 64 64
>>>>>>>>> >> >> >   hqos add port 13 subport 0 rate 1500 M size 1000000 tc period 10
>>>>>>>>> >> >> >   hqos add port 13 subport 0 pipes 3000 profile 2
>>>>>>>>> >> >> >   hqos add port 13 subport 0 pipes 3000 profile 5
>>>>>>>>> >> >> >   hqos add port 13 subport 0 pipes 3000 profile 6
>>>>>>>>> >> >> >   hqos add port 13 subport 0 pipes 3000 profile 7
>>>>>>>>> >> >> >   hqos add port 13 subport 0 pipes 3000 profile 9
>>>>>>>>> >> >> >   hqos add port 13 subport 0 pipes 3000 profile 11
>>>>>>>>> >> >> >   hqos set port 13 lcore 5
>>>>>>>>> >> >>
>>>>>>>>> >> >> I've enabled the TC_OV feature and redirected most of the traffic to TC3.
>>>>>>>>> >> >> But the issue still exists.
>>>>>>>>> >> >>
>>>>>>>>> >> >> Below are the queue statistics of one of the problematic pipes.
>>>>>>>>> >> >> Almost all of the traffic entering the pipe is dropped.
>>>>>>>>> >> >>
>>>>>>>>> >> >> The pipe is also configured with the 1 Mbit/s profile,
>>>>>>>>> >> >> so the issue only affects very low bandwidth pipe profiles.
>>>>>>>>> >> >>
>>>>>>>>> >> >> And this time there was no congestion on the subport.
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >> Egress qdisc
>>>>>>>>> >> >> dir 0
>>>>>>>>> >> >>    rate 1M
>>>>>>>>> >> >>    port 6, subport 0, pipe_id 138, profile_id 5
>>>>>>>>> >> >>    tc 0, queue 0: bytes 752, bytes dropped 0, pkts 8, pkts dropped 0
>>>>>>>>> >> >>    tc 0, queue 1: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
>>>>>>>>> >> >>    tc 0, queue 2: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
>>>>>>>>> >> >>    tc 0, queue 3: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
>>>>>>>>> >> >>    tc 1, queue 0: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
>>>>>>>>> >> >>    tc 1, queue 1: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
>>>>>>>>> >> >>    tc 1, queue 2: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
>>>>>>>>> >> >>    tc 1, queue 3: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
>>>>>>>>> >> >>    tc 2, queue 0: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
>>>>>>>>> >> >>    tc 2, queue 1: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
>>>>>>>>> >> >>    tc 2, queue 2: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
>>>>>>>>> >> >>    tc 2, queue 3: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
>>>>>>>>> >> >>    tc 3, queue 0: bytes 56669, bytes dropped 360242, pkts 150, pkts dropped 3749
>>>>>>>>> >> >>    tc 3, queue 1: bytes 63005, bytes dropped 648782, pkts 150, pkts dropped 3164
>>>>>>>>> >> >>    tc 3, queue 2: bytes 9984, bytes dropped 49704, pkts 128, pkts dropped 636
>>>>>>>>> >> >>    tc 3, queue 3: bytes 15436, bytes dropped 107198, pkts 130, pkts dropped 354
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> > Hi Alex,
>>>>>>>>> >> >
>>>>>>>>> >> > Can you try a newer version of the library, say DPDK 20.11?
>>>>>>>>> >>
>>>>>>>>> >> Right now, no, since switching to another DPDK version would take a
>>>>>>>>> >> lot of time because I am using a lot of custom patches.
>>>>>>>>> >>
>>>>>>>>> >> I've tried simply copying the entire rte_sched lib from DPDK 19 to
>>>>>>>>> >> DPDK 18.
>>>>>>>>> >> I was able to successfully backport it and resolve all dependency
>>>>>>>>> >> issues, but it will also take some time to test this approach.
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> >> > Are you
>>>>>>>>> >> > using the DPDK QoS sample app or your own app?
>>>>>>>>> >>
>>>>>>>>> >> My own app.
>>>>>>>>> >>
>>>>>>>>> >> >> What are the packet sizes?
>>>>>>>>> >>
>>>>>>>>> >> The application is used as a BRAS/BNG server, i.e. it provides
>>>>>>>>> >> internet access to residential customers. Therefore packet sizes are
>>>>>>>>> >> typical of internet traffic and vary from 64 to 1500 bytes. Most of
>>>>>>>>> >> the packets are around 1000 bytes.
>>>>>>>>> >>
>>>>>>>>> >> >
>>>>>>>>> >> > A couple of other things for clarification: 1. At what rate are you
>>>>>>>>> >> > injecting traffic into the low bandwidth pipes?
>>>>>>>>> >>
>>>>>>>>> >> Well, the rate varies too; there can be congestion on some pipes at
>>>>>>>>> >> certain times of day.
>>>>>>>>> >>
>>>>>>>>> >> But the problem is that once the issue occurs at a pipe or at some
>>>>>>>>> >> queues inside the pipe, the pipe stops transmitting even when the
>>>>>>>>> >> incoming traffic rate is much lower than the pipe's rate.
>>>>>>>>> >>
>>>>>>>>> >> > 2. How is traffic distributed among pipes and their traffic classes?
>>>>>>>>> >>
>>>>>>>>> >> I am using the IPv4 TOS field to choose the TC, and there is a tos2tc map.
>>>>>>>>> >> Most of my traffic has a TOS value of 0, which is mapped to TC3 inside
>>>>>>>>> >> my app.
>>>>>>>>> >>
>>>>>>>>> >> Recently I've switched to a tos2tc map which maps all traffic to TC3,
>>>>>>>>> >> to see if it solves the problem.
>>>>>>>>> >>
>>>>>>>>> >> Packet distribution to queues is done using the formula (ipv4.src +
>>>>>>>>> >> ipv4.dst) & 3
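>>>>>>>>> >>
>>>>>>>>> >> (Roughly like the sketch below; simplified, not the exact TR code.
>>>>>>>>> >> The tos2tc[] table and the pipe lookup are app-specific, and the
>>>>>>>>> >> rte_sched_port_pkt_write() call uses the DPDK 18.11 signature; newer
>>>>>>>>> >> releases add a port argument and use enum rte_color.)
>>>>>>>>> >>
>>>>>>>>> >>   #include <rte_byteorder.h>
>>>>>>>>> >>   #include <rte_ip.h>
>>>>>>>>> >>   #include <rte_sched.h>
>>>>>>>>> >>
>>>>>>>>> >>   static void
>>>>>>>>> >>   classify(struct rte_mbuf *m, const struct ipv4_hdr *ip,
>>>>>>>>> >>            uint32_t subport, uint32_t pipe, const uint8_t tos2tc[256])
>>>>>>>>> >>   {
>>>>>>>>> >>           uint32_t tc = tos2tc[ip->type_of_service];          /* 0..3 */
>>>>>>>>> >>           uint32_t q  = (rte_be_to_cpu_32(ip->src_addr) +
>>>>>>>>> >>                          rte_be_to_cpu_32(ip->dst_addr)) & 3; /* 0..3 */
>>>>>>>>> >>
>>>>>>>>> >>           rte_sched_port_pkt_write(m, subport, pipe, tc, q,
>>>>>>>>> >>                                    e_RTE_METER_GREEN);
>>>>>>>>> >>   }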
>>>>>>>>> >>
>>>>>>>>> >> > 3. Can you try putting your own counters on those pipes' queues,
>>>>>>>>> >> > periodically showing the number of packets in the queues, to
>>>>>>>>> >> > understand the dynamics?
>>>>>>>>> >>
>>>>>>>>> >> I will try.
>>>>>>>>> >>
>>>>>>>>> >> P.S.
>>>>>>>>> >>
>>>>>>>>> >> Recently I ran into another problem with the scheduler.
>>>>>>>>> >>
>>>>>>>>> >> After enabling the TC_OV feature, one of the ports stopped transmitting.
>>>>>>>>> >> All of the port's pipes were affected.
>>>>>>>>> >> The port had only one subport, and there were only pipes with the
>>>>>>>>> >> 1 Mbit/s profile.
>>>>>>>>> >> The problem was solved by adding a 10 Mbit/s profile to that port; only
>>>>>>>>> >> after that did the port's pipes start to transmit.
>>>>>>>>> >> I guess it has something to do with the calculation of tc_ov_wm, as it
>>>>>>>>> >> depends on the maximum pipe rate.
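>>>>>>>>> >>
>>>>>>>>> >> (For context, the part of the 18.11 sources I suspect, paraphrased
>>>>>>>>> >> from memory from rte_sched_subport_config(); the arithmetic in the
>>>>>>>>> >> comment is my own, so treat it as a guess rather than a verified
>>>>>>>>> >> explanation:)
>>>>>>>>> >>
>>>>>>>>> >>   s->tc_ov_wm_min = port->mtu;
>>>>>>>>> >>   s->tc_ov_wm_max = rte_sched_time_ms_to_bytes(params->tc_period,
>>>>>>>>> >>                                                port->pipe_tc3_rate_max);
>>>>>>>>> >>   s->tc_ov_wm = s->tc_ov_wm_max;
>>>>>>>>> >>
>>>>>>>>> >>   /* If the highest pipe rate on the port is 1 Mbit/s (125000 bytes/s)
>>>>>>>>> >>    * and the subport tc period is 10 ms, tc_ov_wm_max comes out at
>>>>>>>>> >>    * about 1250 bytes, i.e. less than one full frame (mtu 1522 + 24
>>>>>>>>> >>    * overhead), so TC3 could starve. Adding a 100 Mbit/s profile raises
>>>>>>>>> >>    * pipe_tc3_rate_max, and with it tc_ov_wm_max, to ~125000 bytes,
>>>>>>>>> >>    * which would explain why that workaround helps. */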
>>>>>>>>> >>
>>>>>>>>> >> I am going to set up a test lab and a test build to reproduce this.
>>>>>>>>> I've run some tests and was able to reproduce the port
>>>>>>>>> configuration issue using a test build of my app.
>>>>>>>>> Tests showed that the TC_OV feature does not work correctly in
>>>>>>>>> DPDK 18.11, but there are workarounds.
>>>>>>>>> I still can't reproduce my main problem, which is random pipes
>>>>>>>>> that stop transmitting.
>>>>>>>>> Here are the details:
>>>>>>>>> All tests use the same test traffic generator, which produces
>>>>>>>>> 10 traffic flows entering 10 different pipes of port 1, subport 0.
>>>>>>>>> Only queue 0 of each pipe is used.
>>>>>>>>> TX rate is 800 kbit/s; packet size is 800 bytes.
>>>>>>>>> Pipe rates are 1 Mbit/s. Subport 0 rate is 500 Mbit/s.
>>>>>>>>> ###
>>>>>>>>> ### test 1
>>>>>>>>> ###
>>>>>>>>> Traffic generator is configured to use TC3.
>>>>>>>>> Configuration:
>>>>>>>>>  hqos add profile 27 rate 1 M size 1000000 tc period 40
>>>>>>>>>  hqos add profile 23 rate  100 M size 1000000 tc period 40
>>>>>>>>>  # qos test port
>>>>>>>>>  hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue sizes 64 64 64 64
>>>>>>>>>  hqos add port 1 subport 0 rate 500 M size 1000000 tc period 10
>>>>>>>>>  hqos add port 1 subport 0 pipes 2000 profile 27
>>>>>>>>>  hqos set port 1 lcore 3
>>>>>>>>> Results:
>>>>>>>>> h5 ~ # rcli sh qos rcv
>>>>>>>>> rcv 0: rx rate 641280, nb pkts 501, ind 1
>>>>>>>>> rcv 1: rx rate 641280, nb pkts 501, ind 1
>>>>>>>>> rcv 2: rx rate 641280, nb pkts 501, ind 1
>>>>>>>>> rcv 3: rx rate 641280, nb pkts 501, ind 1
>>>>>>>>> rcv 4: rx rate 641280, nb pkts 501, ind 1
>>>>>>>>> rcv 5: rx rate 641280, nb pkts 501, ind 1
>>>>>>>>> rcv 6: rx rate 641280, nb pkts 501, ind 1
>>>>>>>>> rcv 7: rx rate 641280, nb pkts 501, ind 1
>>>>>>>>> rcv 8: rx rate 641280, nb pkts 501, ind 1
>>>>>>>>> rcv 9: rx rate 641280, nb pkts 501, ind 1
>>>>>>>>> ! BUG
>>>>>>>>> ! RX rate is lower than the expected 800000 bit/s even though there
>>>>>>>>> is no congestion at either the subport or the pipe level.
>>>>>>>> [JS] - Can you elaborate on your scheduler hierarchy?
>>>>>>> Sure, take a look below at the "number of pipes per subport" line
>>>>>>> in the output.
>>>>>>> The TR application always rounds the total number of pipes per port
>>>>>>> up to a power-of-2 value.
>>>>>>>> I mean, how many pipes per subport? It has to be a number that can
>>>>>>>> be expressed as a power of 2, e.g. 4K, 2K, 1K, etc. At run time, the
>>>>>>>> scheduler will scan all the pipes and will process only those which
>>>>>>>> have packets in their queues.
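>>>>>>>
>>>>>>> (To illustrate the rounding I mentioned above; a sketch, not the exact
>>>>>>> TR code. The pipe counts come from my test config, and rte_align32pow2()
>>>>>>> is the DPDK helper from rte_common.h that I rely on:)
>>>>>>>
>>>>>>>   #include <rte_common.h>
>>>>>>>
>>>>>>>   /* n_pipes_per_subport passed to rte_sched_port_config() has to be a
>>>>>>>    * power of two, so the configured pipe count is rounded up: */
>>>>>>>   uint32_t n_pipes_cfg = 2000 + 200;  /* profile 27 pipes + profile 23 pipes */
>>>>>>>   uint32_t n_pipes_per_subport = rte_align32pow2(n_pipes_cfg);  /* 4096 */
>>>>>>>   /* with only the 2000 pipes of profile 27, this gives 2048 */
>>>>>>>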
>>>>>>> Configuration of port 1 with enabled profile 23
>>>>>>> h5 ~ # rcli sh hqos ports
>>>>>>> hqos scheduler port: 1
>>>>>>> lcore_id: 3
>>>>>>> socket: 0
>>>>>>> rate: 0
>>>>>>> mtu: 1522
>>>>>>> frame overhead: 24
>>>>>>> number of pipes per subport: 4096
>>>>>>> pipe profiles: 2
>>>>>>>   pipe profile id: 27
>>>>>>>   pipe rate: 1000000
>>>>>>>   number of pipes: 2000
>>>>>>>   pipe pool size: 2000
>>>>>>>   number of pipes in use: 0
>>>>>>>   pipe profile id: 23
>>>>>>>   pipe rate: 100000000
>>>>>>>   number of pipes: 200
>>>>>>>   pipe pool size: 200
>>>>>>>   number of pipes in use: 0
>>>>>>> Configuration with only one profile at port 1
>>>>>>> hqos scheduler port: 1
>>>>>>> lcore_id: 3
>>>>>>> socket: 0
>>>>>>> rate: 0
>>>>>>> mtu: 1522
>>>>>>> frame overhead: 24
>>>>>>> number of pipes per subport: 2048
>>>>>>> pipe profiles: 1
>>>>>>>   pipe profile id: 27
>>>>>>>   pipe rate: 1000000
>>>>>>>   number of pipes: 2000
>>>>>>>   pipe pool size: 2000
>>>>>>>   number of pipes in use: 0
>>>>>>> [JS] What is the meaning of "number of pipes", "pipe pool size",
>>>>>>> and "number of pipes in use", which is zero above? Does your
>>>>>>> application map packet field values to these pipes at run time?
>>>>>>> Can you give me an example of the mapping of packet field values
>>>>>>> to pipe id, tc, queue?
>>>>> Please ignore all information in the outputs above except the
>>>>> "number of pipes per subport",
>>>>> since the tests were made with a simple test application which is
>>>>> based on TR but doesn't use
>>>>> its production QoS logic.
>>>>> Tests 1 - 5 were made with a simple test application with very
>>>>> straightforward QoS mappings,
>>>>> which I described at the beginning.
>>>>> Here they are:
>>>>> All tests use the same test traffic generator, which produces
>>>>> 10 traffic flows entering 10 different pipes (0 - 9) of port 1
>>>>> subport 0.
>>>>> Only queue 0 of each pipe is used.
>>>>> TX rate is 800 kbit/s; packet size is 800 bytes.
>>>>> Pipe rates are 1 Mbit/s. Subport 0 rate is 500 Mbit/s.
>>>>>>>>> ###
>>>>>>>>> ### test 2
>>>>>>>>> ###
>>>>>>>>> Traffic generator is configured to use TC3.
>>>>>>>>> !!! profile 23 has been added to the test port.
>>>>>>>>> Configuration:
>>>>>>>>>  hqos add profile 27 rate 1 M size 1000000 tc period 40
>>>>>>>>>  hqos add profile 23 rate  100 M size 1000000 tc period 40
>>>>>>>>>  # qos test port
>>>>>>>>>  hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue sizes 64 64 64 64
>>>>>>>>>  hqos add port 1 subport 0 rate 500 M size 1000000 tc period 10
>>>>>>>>>  hqos add port 1 subport 0 pipes 2000 profile 27
>>>>>>>>>  hqos add port 1 subport 0 pipes 200 profile 23
>>>>>>>>>  hqos set port 1 lcore 3
>>>>>>>>> Results:
>>>>>>>>> h5 ~ # rcli sh qos rcv
>>>>>>>>> rcv 0: rx rate 798720, nb pkts 624, ind 1
>>>>>>>>> rcv 1: rx rate 798720, nb pkts 624, ind 1
>>>>>>>>> rcv 2: rx rate 798720, nb pkts 624, ind 1
>>>>>>>>> rcv 3: rx rate 798720, nb pkts 624, ind 1
>>>>>>>>> rcv 4: rx rate 798720, nb pkts 624, ind 1
>>>>>>>>> rcv 5: rx rate 798720, nb pkts 624, ind 1
>>>>>>>>> rcv 6: rx rate 798720, nb pkts 624, ind 1
>>>>>>>>> rcv 7: rx rate 798720, nb pkts 624, ind 1
>>>>>>>>> rcv 8: rx rate 798720, nb pkts 624, ind 1
>>>>>>>>> rcv 9: rx rate 798720, nb pkts 624, ind 1
>>>>>>>>> OK.
>>>>>>>>> The receive traffic rate is equal to the expected values.
>>>>>>>>> So, just adding pipes which are not being used solves the
>>>>>>>>> problem.
>>>>>>>>> ###
>>>>>>>>> ### test 3
>>>>>>>>> ###
>>>>>>>>> !!! traffic generator uses TC 0, so tc_ov is not being used in 
>>>>>>>>> this test.
>>>>>>>>> profile 23 is not used.
>>>>>>>>> Configuration without profile 23.
>>>>>>>>>  hqos add profile 27 rate 1 M size 1000000 tc period 40
>>>>>>>>>  hqos add profile 23 rate  100 M size 1000000 tc period 40
>>>>>>>>>  # qos test port
>>>>>>>>>  hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue sizes 64 64 64 64
>>>>>>>>>  hqos add port 1 subport 0 rate 500 M size 1000000 tc period 10
>>>>>>>>>  hqos add port 1 subport 0 pipes 2000 profile 27
>>>>>>>>>  hqos set port 1 lcore 3
>>>>>>>>> Results:
>>>>>>>>> h5 ~ # rcli sh qos rcv
>>>>>>>>> rcv 0: rx rate 798720, nb pkts 624, ind 0
>>>>>>>>> rcv 1: rx rate 798720, nb pkts 624, ind 0
>>>>>>>>> rcv 2: rx rate 798720, nb pkts 624, ind 0
>>>>>>>>> rcv 3: rx rate 798720, nb pkts 624, ind 0
>>>>>>>>> rcv 4: rx rate 798720, nb pkts 624, ind 0
>>>>>>>>> rcv 5: rx rate 798720, nb pkts 624, ind 0
>>>>>>>>> rcv 6: rx rate 798720, nb pkts 624, ind 0
>>>>>>>>> rcv 7: rx rate 798720, nb pkts 624, ind 0
>>>>>>>>> rcv 8: rx rate 798720, nb pkts 624, ind 0
>>>>>>>>> rcv 9: rx rate 798720, nb pkts 624, ind 0
>>>>>>>>> OK.
>>>>>>>>> The receive traffic rate is equal to the expected values.
>>>>>>>>> ###
>>>>>>>>> ### test 4
>>>>>>>>> ###
>>>>>>>>> Traffic generator is configured to use TC3.
>>>>>>>>> no profile 23.
>>>>>>>>> !! subport tc period has been changed from 10 to 5.
>>>>>>>>> Configuration:
>>>>>>>>>  hqos add profile 27 rate 1 M size 1000000 tc period 40
>>>>>>>>>  hqos add profile 23 rate  100 M size 1000000 tc period 40
>>>>>>>>>  # qos test port
>>>>>>>>>  hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue sizes 64 64 64 64
>>>>>>>>>  hqos add port 1 subport 0 rate 500 M size 1000000 tc period 5
>>>>>>>>>  hqos add port 1 subport 0 pipes 2000 profile 27
>>>>>>>>>  hqos set port 1 lcore 3
>>>>>>>>> Results:
>>>>>>>>> rcv 0: rx rate 0, nb pkts 0, ind 1
>>>>>>>>> rcv 1: rx rate 0, nb pkts 0, ind 1
>>>>>>>>> rcv 2: rx rate 0, nb pkts 0, ind 1
>>>>>>>>> rcv 3: rx rate 0, nb pkts 0, ind 1
>>>>>>>>> rcv 4: rx rate 0, nb pkts 0, ind 1
>>>>>>>>> rcv 5: rx rate 0, nb pkts 0, ind 1
>>>>>>>>> rcv 6: rx rate 0, nb pkts 0, ind 1
>>>>>>>>> rcv 7: rx rate 0, nb pkts 0, ind 1
>>>>>>>>> rcv 8: rx rate 0, nb pkts 0, ind 1
>>>>>>>>> rcv 9: rx rate 0, nb pkts 0, ind 1
>>>>>>>>> ! zero traffic
>>>>>>>>> ###
>>>>>>>>> ### test 5
>>>>>>>>> ###
>>>>>>>>> Traffic generator is configured to use TC3.
>>>>>>>>> profile 23 is enabled.
>>>>>>>>> subport tc period has been changed from 10 to 5.
>>>>>>>>> Configuration:
>>>>>>>>>  hqos add profile 27 rate 1 M size 1000000 tc period 40
>>>>>>>>>  hqos add profile 23 rate  100 M size 1000000 tc period 40
>>>>>>>>>  # qos test port
>>>>>>>>>  hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue sizes 64 64 64 64
>>>>>>>>>  hqos add port 1 subport 0 rate 500 M size 1000000 tc period 5
>>>>>>>>>  hqos add port 1 subport 0 pipes 2000 profile 27
>>>>>>>>>  hqos add port 1 subport 0 pipes 200 profile 23
>>>>>>>>>  hqos set port 1 lcore 3
>>>>>>>>> Results:
>>>>>>>>> h5 ~ # rcli sh qos rcv
>>>>>>>>> rcv 0: rx rate 800000, nb pkts 625, ind 1
>>>>>>>>> rcv 1: rx rate 800000, nb pkts 625, ind 1
>>>>>>>>> rcv 2: rx rate 800000, nb pkts 625, ind 1
>>>>>>>>> rcv 3: rx rate 800000, nb pkts 625, ind 1
>>>>>>>>> rcv 4: rx rate 800000, nb pkts 625, ind 1
>>>>>>>>> rcv 5: rx rate 800000, nb pkts 625, ind 1
>>>>>>>>> rcv 6: rx rate 800000, nb pkts 625, ind 1
>>>>>>>>> rcv 7: rx rate 800000, nb pkts 625, ind 1
>>>>>>>>> rcv 8: rx rate 800000, nb pkts 625, ind 1
>>>>>>>>> rcv 9: rx rate 800000, nb pkts 625, ind 1
>>>>>>>>> OK
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > Does this problem exist when you disable oversubscription mode? Worth
>>>>>>>>> > looking at grinder_tc_ov_credits_update() and grinder_credits_update()
>>>>>>>>> > functions where tc_ov_wm is altered.
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >> >
>>>>>>>>> >> > Thanks,
>>>>>>>>> >> > Jasvinder
>>>> OK, these two new tests show even more clearly that the TC_OV feature
>>>> is broken.
>>>> Test 1 doesn't use TC_OV, and all available subport bandwidth is
>>>> distributed among the 300 pipes in a very fair way. There are 300
>>>> generators with a tx rate of 1M. They produce 300 traffic flows, each
>>>> entering its own pipe (queue 0) of port 1 subport 0.
>>>>  port 1
>>>>  subport rate 300 M
>>>> Then the application measures the rx rate of each flow after the
>>>> flow's traffic leaves the scheduler.
>>>> For example, the following line
>>>>  rcv 284 rx rate 995840  nb pkts 778
>>>> shows that the rx rate of flow number 284 is 995840 bit/s.
>>>> All 300 rx rates are about 995840 bit/s (1 Mbit/s), as expected.
>> 
>> [JS] Maybe try repeating the same test but change the traffic from
>> TC0 to TC3. See if this works.
> 
> The second test, with the incorrect results, was already done with TC3.
> 
>> 
>>>> The second test uses the same configuration
>>>> but uses TC3, so the TC_OV function is being used.
>>>> And the distribution of traffic in the test is very unfair.
>>>> Some of the pipes get 875520 bit/s, while some of the pipes get only
>>>> 604160 bit/s, despite the fact that there is
>>> 
>> 
>> [JS] Try repeating the test with increased pipe bandwidth, let's say
>> 50 Mbps or even greater.
> 
> I increased the pipe rate to 10 Mbit/s, and both tests (tc0 and tc3)
> showed correct and identical results.
> 
> But then I changed the tests and increased the number of pipes to 600
> to see how it would work with subport congestion. I added 600 pipes
> generating 10 Mbit/s and a 3 Gbit/s subport limit, so each pipe should
> get an equal share, which is about 5 Mbit/s.
> And the results of both tests (tc0 and tc3) are not very good.
> 
> The first pipes are getting much more bandwidth than the last ones.
> The difference is a factor of 3. So, TC_OV is still not working!!

Decreasing the subport tc_period from 10 to 5 solved that problem,
and the scheduler started to distribute the subport bandwidth among the
10 Mbit/s pipes almost ideally.
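
(For the record, my arithmetic on what this change does, under my
assumption above about how tc_ov_wm_max is computed from the subport
tc_period and the maximum pipe TC3 rate:
  10 ms * 1.25 MB/s = 12500 bytes
   5 ms * 1.25 MB/s =  6250 bytes
Both values are well above one MTU, so with 10 Mbit/s pipes the change
seems to affect only how often the watermark is re-estimated, not its
ceiling.)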

> 
> rcv 0   rx rate 7324160 nb pkts 5722
> rcv 1   rx rate 7281920 nb pkts 5689
> rcv 2   rx rate 7226880 nb pkts 5646
> rcv 3   rx rate 7124480 nb pkts 5566
> rcv 4   rx rate 7324160 nb pkts 5722
> rcv 5   rx rate 7271680 nb pkts 5681
> rcv 6   rx rate 7188480 nb pkts 5616
> rcv 7   rx rate 7150080 nb pkts 5586
> rcv 8   rx rate 7328000 nb pkts 5725
> rcv 9   rx rate 7249920 nb pkts 5664
> rcv 10  rx rate 7188480 nb pkts 5616
> rcv 11  rx rate 7179520 nb pkts 5609
> rcv 12  rx rate 7324160 nb pkts 5722
> rcv 13  rx rate 7208960 nb pkts 5632
> rcv 14  rx rate 7152640 nb pkts 5588
> rcv 15  rx rate 7127040 nb pkts 5568
> rcv 16  rx rate 7303680 nb pkts 5706
> ....
> rcv 587 rx rate 2406400 nb pkts 1880
> rcv 588 rx rate 2406400 nb pkts 1880
> rcv 589 rx rate 2406400 nb pkts 1880
> rcv 590 rx rate 2406400 nb pkts 1880
> rcv 591 rx rate 2406400 nb pkts 1880
> rcv 592 rx rate 2398720 nb pkts 1874
> rcv 593 rx rate 2400000 nb pkts 1875
> rcv 594 rx rate 2400000 nb pkts 1875
> rcv 595 rx rate 2400000 nb pkts 1875
> rcv 596 rx rate 2401280 nb pkts 1876
> rcv 597 rx rate 2401280 nb pkts 1876
> rcv 598 rx rate 2401280 nb pkts 1876
> rcv 599 rx rate 2402560 nb pkts 1877
> rx rate sum 3156416000



> 
> 
> 
>> 
>> 
>> 
>>> ... despite the fact that there is _NO_ congestion
>>> at the subport or pipe level.
>>>> And the subport doesn't use about 42 Mbit/s of the available
>>>> bandwidth!!
>>>> The only difference between those test configurations is the TC of the
>>>> generated traffic.
>>>> Test 1 uses TC 1 while test 2 uses TC 3 (which uses the TC_OV
>>>> function).
>>>> So, enabling TC_OV changes the results dramatically.
>>>> ##
>>>> ## test1
>>>> ##
>>>>  hqos add profile  7 rate    2 M size 1000000 tc period 40
>>>>  # qos test port
>>>>  hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue sizes 64 64 64 64
>>>>  hqos add port 1 subport 0 rate 300 M size 1000000 tc period 10
>>>>  hqos add port 1 subport 0 pipes 2000 profile 7
>>>>  hqos add port 1 subport 0 pipes 200 profile 23
>>>>  hqos set port 1 lcore 3
>>>> port 1
>>>> subport rate 300 M
>>>> number of tx flows 300
>>>> generator tx rate 1M
>>>> TC 1
>>>> ...
>>>> rcv 284 rx rate 995840  nb pkts 778
>>>> rcv 285 rx rate 995840  nb pkts 778
>>>> rcv 286 rx rate 995840  nb pkts 778
>>>> rcv 287 rx rate 995840  nb pkts 778
>>>> rcv 288 rx rate 995840  nb pkts 778
>>>> rcv 289 rx rate 995840  nb pkts 778
>>>> rcv 290 rx rate 995840  nb pkts 778
>>>> rcv 291 rx rate 995840  nb pkts 778
>>>> rcv 292 rx rate 995840  nb pkts 778
>>>> rcv 293 rx rate 995840  nb pkts 778
>>>> rcv 294 rx rate 995840  nb pkts 778
>>>> ...
>>>> The sum of the pipes' rx rates is 298494720.
>>>> OK.
>>>> The subport rate is equally distributed among the 300 pipes.
>>>> ##
>>>> ##  test 2
>>>> ##
>>>>  hqos add profile  7 rate    2 M size 1000000 tc period 40
>>>>  # qos test port
>>>>  hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue sizes 64 64 64 64
>>>>  hqos add port 1 subport 0 rate 300 M size 1000000 tc period 10
>>>>  hqos add port 1 subport 0 pipes 2000 profile 7
>>>>  hqos add port 1 subport 0 pipes 200 profile 23
>>>>  hqos set port 1 lcore 3
>>>> port 1
>>>> subport rate 300 M
>>>> number of tx flows 300
>>>> generator tx rate 1M
>>>> TC 3
>>>> h5 ~ # rcli sh qos rcv
>>>> rcv 0   rx rate 875520  nb pkts 684
>>>> rcv 1   rx rate 856320  nb pkts 669
>>>> rcv 2   rx rate 849920  nb pkts 664
>>>> rcv 3   rx rate 853760  nb pkts 667
>>>> rcv 4   rx rate 867840  nb pkts 678
>>>> rcv 5   rx rate 844800  nb pkts 660
>>>> rcv 6   rx rate 852480  nb pkts 666
>>>> rcv 7   rx rate 855040  nb pkts 668
>>>> rcv 8   rx rate 865280  nb pkts 676
>>>> rcv 9   rx rate 846080  nb pkts 661
>>>> rcv 10  rx rate 858880  nb pkts 671
>>>> rcv 11  rx rate 870400  nb pkts 680
>>>> rcv 12  rx rate 864000  nb pkts 675
>>>> rcv 13  rx rate 852480  nb pkts 666
>>>> rcv 14  rx rate 855040  nb pkts 668
>>>> rcv 15  rx rate 857600  nb pkts 670
>>>> rcv 16  rx rate 864000  nb pkts 675
>>>> rcv 17  rx rate 866560  nb pkts 677
>>>> rcv 18  rx rate 865280  nb pkts 676
>>>> rcv 19  rx rate 858880  nb pkts 671
>>>> rcv 20  rx rate 856320  nb pkts 669
>>>> rcv 21  rx rate 864000  nb pkts 675
>>>> rcv 22  rx rate 869120  nb pkts 679
>>>> rcv 23  rx rate 856320  nb pkts 669
>>>> rcv 24  rx rate 862720  nb pkts 674
>>>> rcv 25  rx rate 865280  nb pkts 676
>>>> rcv 26  rx rate 867840  nb pkts 678
>>>> rcv 27  rx rate 870400  nb pkts 680
>>>> rcv 28  rx rate 860160  nb pkts 672
>>>> rcv 29  rx rate 870400  nb pkts 680
>>>> rcv 30  rx rate 869120  nb pkts 679
>>>> rcv 31  rx rate 870400  nb pkts 680
>>>> rcv 32  rx rate 858880  nb pkts 671
>>>> rcv 33  rx rate 858880  nb pkts 671
>>>> rcv 34  rx rate 852480  nb pkts 666
>>>> rcv 35  rx rate 874240  nb pkts 683
>>>> rcv 36  rx rate 855040  nb pkts 668
>>>> rcv 37  rx rate 853760  nb pkts 667
>>>> rcv 38  rx rate 869120  nb pkts 679
>>>> rcv 39  rx rate 885760  nb pkts 692
>>>> rcv 40  rx rate 861440  nb pkts 673
>>>> rcv 41  rx rate 852480  nb pkts 666
>>>> rcv 42  rx rate 871680  nb pkts 681
>>>> ...
>>>> ...
>>>> rcv 288 rx rate 766720  nb pkts 599
>>>> rcv 289 rx rate 766720  nb pkts 599
>>>> rcv 290 rx rate 766720  nb pkts 599
>>>> rcv 291 rx rate 766720  nb pkts 599
>>>> rcv 292 rx rate 762880  nb pkts 596
>>>> rcv 293 rx rate 762880  nb pkts 596
>>>> rcv 294 rx rate 762880  nb pkts 596
>>>> rcv 295 rx rate 760320  nb pkts 594
>>>> rcv 296 rx rate 604160  nb pkts 472
>>>> rcv 297 rx rate 604160  nb pkts 472
>>>> rcv 298 rx rate 604160  nb pkts 472
>>>> rcv 299 rx rate 604160  nb pkts 472
>>>> rx rate sum 258839040
>>>> FAILED.
>>>> The subport rate is NOT distributed equally among the 300 pipes.
>>>> Some subport bandwidth (about 42 Mbit/s) is not being used!


Thread overview: 29+ messages
2020-11-24 13:34 Alex Kiselev
2020-11-25 15:04 ` Alex Kiselev
2020-11-27 12:11   ` Alex Kiselev
2020-12-07 10:00     ` Singh, Jasvinder
2020-12-07 10:46       ` Alex Kiselev
2020-12-07 11:32         ` Singh, Jasvinder
2020-12-07 12:29           ` Alex Kiselev
2020-12-07 16:49           ` Alex Kiselev
2020-12-07 17:31             ` Singh, Jasvinder
2020-12-07 17:45               ` Alex Kiselev
     [not found]                 ` <49019BC8-DDA6-4B39-B395-2A68E91AB424@intel.com>
     [not found]                   ` <226b13286c876e69ad40a65858131b66@therouter.net>
     [not found]                     ` <4536a02973015dc8049834635f145a19@therouter.net>
     [not found]                       ` <f9a27b6493ae1e1e2850a3b459ab9d33@therouter.net>
     [not found]                         ` <B8241A33-0927-4411-A340-9DD0BEE07968@intel.com>
     [not found]                           ` <e6a0429dc4a1a33861a066e3401e85b6@therouter.net>
2020-12-07 22:16                             ` Alex Kiselev [this message]
2020-12-07 22:32                               ` Singh, Jasvinder
2020-12-08 10:52                                 ` Alex Kiselev
2020-12-08 13:24                                   ` Singh, Jasvinder
2020-12-09 13:41                                     ` Alex Kiselev
2020-12-10 10:29                                       ` Singh, Jasvinder
2020-12-11 21:29                                     ` Alex Kiselev
2020-12-11 22:06                                       ` Singh, Jasvinder
2020-12-11 22:27                                         ` Alex Kiselev
2020-12-11 22:36                                           ` Alex Kiselev
2020-12-11 22:55                                           ` Singh, Jasvinder
2020-12-11 23:36                                             ` Alex Kiselev
2020-12-12  0:20                                               ` Singh, Jasvinder
2020-12-12  0:45                                                 ` Alex Kiselev
2020-12-12  0:54                                                   ` Alex Kiselev
2020-12-12  1:45                                                     ` Alex Kiselev
2020-12-12 10:22                                                       ` Singh, Jasvinder
2020-12-12 10:46                                                         ` Alex Kiselev
2020-12-12 17:19                                                           ` Alex Kiselev
