From: "Singh, Jasvinder"
To: Alex Kiselev
Cc: "users@dpdk.org", "Dumitrescu, Cristian", "Dharmappa, Savinay"
Subject: Re: [dpdk-users] scheduler issue
Date: Mon, 7 Dec 2020 11:32:55 +0000
References: <090256f7b7a6739f80353be3339fd062@therouter.net> <7e314aa3562c380a573781a4c0562b93@therouter.net>
In-Reply-To: <7e314aa3562c380a573781a4c0562b93@therouter.net>

> -----Original Message-----
> From: Alex Kiselev
> Sent: Monday, December 7, 2020 10:46 AM
> To: Singh, Jasvinder
> Cc: users@dpdk.org; Dumitrescu, Cristian; Dharmappa, Savinay
> Subject: Re: [dpdk-users] scheduler issue
>
> On 2020-12-07 11:00, Singh, Jasvinder wrote:
> >> -----Original Message-----
> >> From: users On Behalf Of Alex Kiselev
> >> Sent: Friday, November 27, 2020 12:12 PM
> >> To: users@dpdk.org
> >> Cc: Dumitrescu, Cristian
> >> Subject: Re: [dpdk-users] scheduler issue
> >>
> >> On 2020-11-25 16:04, Alex Kiselev wrote:
> >> > On 2020-11-24 16:34, Alex Kiselev wrote:
> >> >> Hello,
> >> >>
> >> >> I am facing a problem with the scheduler library in DPDK 18.11.10
> >> >> with default scheduler settings (RED is off).
> >> >> It seems like some of the pipes (last time it was 4 out of 600
> >> >> pipes) start incorrectly dropping most of the traffic after a
> >> >> couple of days of successful work.
> >> >>
> >> >> So far I've checked that there are no mbuf leaks or any other
> >> >> errors in my code, and I am sure that traffic enters the problematic
> >> >> pipes. Also, switching the traffic at runtime to pipes of another
> >> >> port restores the traffic flow.
> >> >>
> >> >> How do I approach debugging this issue?
> >> >>
> >> >> I've added a call to rte_sched_queue_read_stats(), but it doesn't
> >> >> give me counters that accumulate values (packet drops, for example);
> >> >> it gives me some kind of current values, and after a couple of
> >> >> seconds those values are reset to zero, so I can say nothing based
> >> >> on that API.
> >> >>
> >> >> I would appreciate any ideas and help.
> >> >> Thanks.
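
A note on the stats API: in 18.11 rte_sched_queue_read_stats() is
read-and-clear, i.e. it returns the counters gathered since the previous call
and then zeroes them, so accumulated totals have to be kept on the application
side. A minimal, untested sketch of that idea is below; the accumulator struct
and helper name are only illustrative, and queue_id is assumed to be the
flattened subport/pipe/TC/queue index you already compute for your printouts.

    #include <stdint.h>
    #include <string.h>
    #include <rte_sched.h>

    /* Caller-side totals; the library clears its own counters on read. */
    struct queue_stats_acc {
            uint64_t n_pkts;
            uint64_t n_pkts_dropped;
            uint64_t n_bytes;
            uint64_t n_bytes_dropped;
    };

    static int
    queue_stats_poll(struct rte_sched_port *port, uint32_t queue_id,
            struct queue_stats_acc *acc)
    {
            struct rte_sched_queue_stats stats;
            uint16_t qlen;
            int ret;

            memset(&stats, 0, sizeof(stats));
            ret = rte_sched_queue_read_stats(port, queue_id, &stats, &qlen);
            if (ret != 0)
                    return ret;

            /* Fold the just-cleared snapshot into the persistent totals. */
            acc->n_pkts += stats.n_pkts;
            acc->n_pkts_dropped += stats.n_pkts_dropped;
            acc->n_bytes += stats.n_bytes;
            acc->n_bytes_dropped += stats.n_bytes_dropped;
            return 0;
    }

Calling something like this periodically (per problematic queue) gives
monotonic drop counters instead of the resetting snapshots.
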
> >> >
> >> > Problematic pipes had a very low bandwidth limit (1 Mbit/s), and there
> >> > is also an oversubscription configuration at subport 0 of port 13, to
> >> > which those pipes belong, and CONFIG_RTE_SCHED_SUBPORT_TC_OV is
> >> > disabled.
> >> >
> >> > Could congestion at that subport be the reason for the problem?
> >> >
> >> > How much overhead and performance degradation would enabling the
> >> > CONFIG_RTE_SCHED_SUBPORT_TC_OV feature add?
> >> >
> >> > Configuration:
> >> >
> >> > #
> >> > # QoS Scheduler Profiles
> >> > #
> >> > hqos add profile 1 rate 8 K size 1000000 tc period 40
> >> > hqos add profile 2 rate 400 K size 1000000 tc period 40
> >> > hqos add profile 3 rate 600 K size 1000000 tc period 40
> >> > hqos add profile 4 rate 800 K size 1000000 tc period 40
> >> > hqos add profile 5 rate 1 M size 1000000 tc period 40
> >> > hqos add profile 6 rate 1500 K size 1000000 tc period 40
> >> > hqos add profile 7 rate 2 M size 1000000 tc period 40
> >> > hqos add profile 8 rate 3 M size 1000000 tc period 40
> >> > hqos add profile 9 rate 4 M size 1000000 tc period 40
> >> > hqos add profile 10 rate 5 M size 1000000 tc period 40
> >> > hqos add profile 11 rate 6 M size 1000000 tc period 40
> >> > hqos add profile 12 rate 8 M size 1000000 tc period 40
> >> > hqos add profile 13 rate 10 M size 1000000 tc period 40
> >> > hqos add profile 14 rate 12 M size 1000000 tc period 40
> >> > hqos add profile 15 rate 15 M size 1000000 tc period 40
> >> > hqos add profile 16 rate 16 M size 1000000 tc period 40
> >> > hqos add profile 17 rate 20 M size 1000000 tc period 40
> >> > hqos add profile 18 rate 30 M size 1000000 tc period 40
> >> > hqos add profile 19 rate 32 M size 1000000 tc period 40
> >> > hqos add profile 20 rate 40 M size 1000000 tc period 40
> >> > hqos add profile 21 rate 50 M size 1000000 tc period 40
> >> > hqos add profile 22 rate 60 M size 1000000 tc period 40
> >> > hqos add profile 23 rate 100 M size 1000000 tc period 40
> >> > hqos add profile 24 rate 25 M size 1000000 tc period 40
> >> > hqos add profile 25 rate 50 M size 1000000 tc period 40
> >> >
> >> > #
> >> > # Port 13
> >> > #
> >> > hqos add port 13 rate 40 G mtu 1522 frame overhead 24 queue sizes 64 64 64 64
> >> > hqos add port 13 subport 0 rate 1500 M size 1000000 tc period 10
> >> > hqos add port 13 subport 0 pipes 3000 profile 2
> >> > hqos add port 13 subport 0 pipes 3000 profile 5
> >> > hqos add port 13 subport 0 pipes 3000 profile 6
> >> > hqos add port 13 subport 0 pipes 3000 profile 7
> >> > hqos add port 13 subport 0 pipes 3000 profile 9
> >> > hqos add port 13 subport 0 pipes 3000 profile 11
> >> > hqos set port 13 lcore 5
> >>
> >> I've enabled the TC_OV feature and redirected most of the traffic to TC3,
> >> but the issue still exists.
> >>
> >> Below are the queue statistics of one of the problematic pipes.
> >> Almost all of the traffic entering the pipe is dropped.
> >>
> >> The pipe is also configured with the 1 Mbit/s profile,
> >> so the issue is only with very low bandwidth pipe profiles.
> >>
> >> And this time there was no congestion on the subport.
> >>
> >>
> >> Egress qdisc
> >> dir 0
> >> rate 1M
> >> port 6, subport 0, pipe_id 138, profile_id 5
> >> tc 0, queue 0: bytes 752, bytes dropped 0, pkts 8, pkts dropped 0
> >> tc 0, queue 1: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
> >> tc 0, queue 2: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
> >> tc 0, queue 3: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
> >> tc 1, queue 0: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
> >> tc 1, queue 1: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
> >> tc 1, queue 2: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
> >> tc 1, queue 3: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
> >> tc 2, queue 0: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
> >> tc 2, queue 1: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
> >> tc 2, queue 2: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
> >> tc 2, queue 3: bytes 0, bytes dropped 0, pkts 0, pkts dropped 0
> >> tc 3, queue 0: bytes 56669, bytes dropped 360242, pkts 150, pkts dropped 3749
> >> tc 3, queue 1: bytes 63005, bytes dropped 648782, pkts 150, pkts dropped 3164
> >> tc 3, queue 2: bytes 9984, bytes dropped 49704, pkts 128, pkts dropped 636
> >> tc 3, queue 3: bytes 15436, bytes dropped 107198, pkts 130, pkts dropped 354
> >
> >
> > Hi Alex,
> >
> > Can you try a newer version of the library, say DPDK 20.11?
>
> Right now, no, since switching to another DPDK version would take a lot of
> time because I am using a lot of custom patches.
>
> I've tried to simply copy the entire rte_sched lib from DPDK 19 to DPDK 18,
> and I was able to successfully backport it and resolve all dependency
> issues, but it will also take some time to test this approach.
>
> > Are you using the dpdk qos sample app or your own app?
>
> My own app.
>
> > What are the packet sizes?
>
> The application is used as a BRAS/BNG server, i.e. it provides internet
> access to residential customers. Therefore packet sizes are typical of the
> internet and vary from 64 to 1500 bytes. Most of the packets are around
> 1000 bytes.
>
> > A couple of other things for clarification:
> > 1. At what rate are you injecting the traffic into the low bandwidth pipes?
>
> Well, the rate varies too; there can be congestion on some pipes at certain
> times of day.
>
> But the problem is that once the problem occurs on a pipe, or on some queues
> inside the pipe, the pipe stops transmitting even when the incoming traffic
> rate is much lower than the pipe's rate.
>
> > 2. How is traffic distributed among pipes and their traffic classes?
>
> I am using the IPv4 TOS field to choose the TC, and there is a tos2tc map.
> Most of my traffic has a TOS value of 0, which is mapped to TC3 inside my app.
>
> Recently I've switched to a tos2tc map which maps all traffic to TC3 to see
> if it solves the problem.
>
> Packet distribution to queues is done using the formula
> (ipv4.src + ipv4.dst) & 3.
>
> > 3. Can you try putting your own counters on those pipes' queues which
> > periodically show the number of packets in the queues, to understand the
> > dynamics?
>
> I will try.
>
> P.S.
>
> Recently I've run into another problem with the scheduler.
>
> After enabling the TC_OV feature, one of the ports stopped transmitting.
> All of the port's pipes were affected.
> The port had only one subport, and there were only pipes with the 1 Mbit/s
> profile. The problem was solved by adding a 10 Mbit/s profile to that port;
> only after that did the port's pipes start to transmit.
> I guess it has something to do with calculating tc_ov_wm, as it depends on
> the maximum pipe rate.
>
> I am going to set up a test lab and a test build to reproduce this.

Does this problem exist when you disable the oversubscription mode? It is
worth looking at the grinder_tc_ov_credits_update() and
grinder_credits_update() functions, where tc_ov_wm is altered.

> >
> > Thanks,
> > Jasvinder
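
For reference while you set up the test build: the sketch below shows, in
simplified form, how the oversubscription watermark is meant to behave. It is
not a copy of the DPDK code; the function and parameter names are illustrative
only, so please check grinder_tc_ov_credits_update() in your rte_sched.c for
the real logic. The idea is that tc_ov_wm oscillates between a minimum (about
one MTU) and a maximum derived from the largest pipe TC3 rate over the subport
tc_period, shrinking while subport TC3 is congested and growing back otherwise.
With only 1 Mbit/s profiles on a subport that maximum is very small, which
could be one reason why adding a 10 Mbit/s profile changed the behaviour.

    #include <stdint.h>

    /*
     * Simplified sketch of the TC oversubscription watermark update
     * (illustrative only, not the actual DPDK implementation).
     * tc_ov_wm is the extra byte budget a pipe's TC3 may consume in one
     * tc_period on top of its own rate.
     */
    static uint32_t
    tc_ov_watermark_update(uint32_t tc_ov_wm, uint32_t tc_ov_wm_min,
            uint32_t tc_ov_wm_max, int subport_tc3_congested)
    {
            if (subport_tc3_congested) {
                    /* Subport TC3 ran out of credits: shrink every pipe's
                     * share, but never below the minimum. */
                    tc_ov_wm -= tc_ov_wm >> 7;
                    if (tc_ov_wm < tc_ov_wm_min)
                            tc_ov_wm = tc_ov_wm_min;
                    return tc_ov_wm;
            }

            /* Subport TC3 had spare credits: grow the share again, capped
             * at a maximum derived from the largest pipe TC3 rate. */
            tc_ov_wm += (tc_ov_wm >> 7) + 1;
            if (tc_ov_wm > tc_ov_wm_max)
                    tc_ov_wm = tc_ov_wm_max;
            return tc_ov_wm;
    }
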