From: Yuyong Zhang <yuyong.zhang@casa-systems.com>
To: "Dumitrescu, Cristian" <cristian.dumitrescu@intel.com>; dev@dpdk.org; users@dpdk.org
Date: Thu, 4 Aug 2016 13:46:30 +0000
Subject: Re: [dpdk-dev] how to design high performance QoS support for a large amount of subscribers
Thank you very much, Cristian, for the insightful response. Very much appreciated.

Regards,
Yuyong

-----Original Message-----
From: Dumitrescu, Cristian [mailto:cristian.dumitrescu@intel.com]
Sent: Thursday, August 4, 2016 9:01 AM
To: Yuyong Zhang <yuyong.zhang@casa-systems.com>; dev@dpdk.org; users@dpdk.org
Subject: RE: how to design high performance QoS support for a large amount of subscribers

Hi Yuyong,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Yuyong Zhang
> Sent: Tuesday, August 2, 2016 4:26 PM
> To: dev@dpdk.org; users@dpdk.org
> Subject: [dpdk-dev] how to design high performance QoS support for a
> large amount of subscribers
>
> Hi,
>
> I am trying to add QoS support for a high performance VNF with a large
> number of subscribers (millions).

Welcome to the world of DPDK QoS users!

> It needs to support a guaranteed bit rate for each subscriber service
> level, i.e. four service levels need to be supported:
>
> * Diamond, 500 Mbps
> * Gold, 100 Mbps
> * Silver, 50 Mbps
> * Bronze, 10 Mbps

Service levels translate to pipe profiles in our DPDK implementation. The set of pipe profiles is defined per port.
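As a concrete illustration (a minimal sketch only, assuming the DPDK 16.07-era librte_sched API), the four service levels above could be expressed as a pipe profile table along these lines. All rates are in bytes per second (500 Mbps = 62500000 bytes/sec, etc.); the tb_size, tc_period and WRR values are illustrative assumptions, not tuned recommendations.

#include <rte_sched.h>

/* One pipe profile per service level. The token bucket rate (tb_rate)
 * caps the subscriber's total rate; here each of the 4 traffic classes
 * is allowed up to the full pipe rate, and all 16 queues get equal
 * WRR weight. */
static struct rte_sched_pipe_params pipe_profiles[] = {
	[0] = { /* Diamond, 500 Mbps */
		.tb_rate = 62500000, .tb_size = 1000000,
		.tc_rate = {62500000, 62500000, 62500000, 62500000},
		.tc_period = 40,
		.wrr_weights = {1, 1, 1, 1, 1, 1, 1, 1,
				1, 1, 1, 1, 1, 1, 1, 1},
	},
	[1] = { /* Gold, 100 Mbps */
		.tb_rate = 12500000, .tb_size = 1000000,
		.tc_rate = {12500000, 12500000, 12500000, 12500000},
		.tc_period = 40,
		.wrr_weights = {1, 1, 1, 1, 1, 1, 1, 1,
				1, 1, 1, 1, 1, 1, 1, 1},
	},
	[2] = { /* Silver, 50 Mbps */
		.tb_rate = 6250000, .tb_size = 1000000,
		.tc_rate = {6250000, 6250000, 6250000, 6250000},
		.tc_period = 40,
		.wrr_weights = {1, 1, 1, 1, 1, 1, 1, 1,
				1, 1, 1, 1, 1, 1, 1, 1},
	},
	[3] = { /* Bronze, 10 Mbps */
		.tb_rate = 1250000, .tb_size = 1000000,
		.tc_rate = {1250000, 1250000, 1250000, 1250000},
		.tc_period = 40,
		.wrr_weights = {1, 1, 1, 1, 1, 1, 1, 1,
				1, 1, 1, 1, 1, 1, 1, 1},
	},
};

A subscriber pipe is then attached to its service level at runtime via rte_sched_pipe_config(port, subport_id, pipe_id, profile_index).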
> Here is the current pipeline design using DPDK:
>
> * 4 RX threads: packet classification and load balancing
> * 10-20 worker threads: application subscriber management
> * 4 TX threads: send packets to the TX NICs
> * Ring buffers between the RX threads, worker threads, and TX threads
>
> I read the DPDK programmer's guide chapter on the QoS framework and its
> hierarchical scheduler (port, sub-port, pipe, TC and queues). I am
> looking for advice on how to design the QoS scheduler to support
> millions of subscribers (pipes) whose traffic is processed by tens of
> worker threads that handle the subscriber management.

Having millions of pipes per port poses some challenges:

1. Does it actually make sense? Assuming the port rate is 10GbE and looking at the smallest user rate you mention above (Bronze, 10 Mbps/user), fully provisioning all users (i.e. making sure you can fully handle each user in the worst-case scenario) results in a maximum of 1000 users per port. Assuming overprovisioning of 50:1, this means a maximum of 50K users per port.

2. Memory challenge. The number of pipes per port is configurable -- hey, this is SW! :) -- but each of these pipes has 16 queues. For 4K pipes per port, this is 64K queues per port; for a typical value of 64 packets per queue, this is 4M packets per port, so in the worst case we need to provision 4M packets in the buffer pool for each output port that has the hierarchical scheduler enabled; for a buffer size of ~2KB each, this means ~8GB of memory per output port. If you go from 4K pipes per port to 4M pipes per port, this means 8TB of memory per port. Do you have enough memory in your system? :)

One thing to realize is that even with millions of users in your system, not all of them are active at the same time. So maybe use a smaller number of pipes and map only the active users (those that have packets to send right now) onto them, with the set of active users changing over time. You can also consider mapping several users to the same pipe.

> One design thought is as follows:
>
> 8 ports (each associated with one physical port) and 16-20 sub-ports
> (each used by one worker thread), with each sub-port supporting 250K
> pipes for subscribers. Each worker thread manages one sub-port: it does
> metering for the sub-port to get the packet color, and after
> identifying the subscriber flow it picks an unused pipe, does the
> scheduler enqueue/dequeue, and then puts the packets into TX rings to
> the TX threads, which send them to the TX NICs.

In the current implementation, each port scheduler object has to be owned by a single thread, i.e. you cannot split a port across multiple threads, so it is not straightforward to have different sub-ports handled by different threads. The workaround is to split the physical NIC port yourself into multiple port scheduler objects: for example, create 8 port scheduler objects, set the rate of each to 1/8 of 10GbE, and have each of them feed a different NIC TX queue of the same physical NIC port.

You can probably get this scenario (or something very similar) up pretty quickly just by handcrafting a configuration file for the examples/ip_pipeline application.

> Are there functional and performance issues with the above approach?
>
> Any advice and input are appreciated.
>
> Regards,
>
> Yuyong

Regards,
Cristian
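To make the port-splitting workaround above concrete, a rough sketch under the same assumptions (16.07-era librte_sched API; the 8-way split, 4096 pipes per scheduler, 64-packet queues, and defaulting every pipe to the Bronze profile are example choices for illustration, not recommendations):

#include <stdio.h>
#include <rte_ethdev.h>
#include <rte_sched.h>

#define N_SCHED   8           /* one scheduler object per NIC TX queue   */
#define RATE_10GE 1250000000  /* 10 GbE expressed in bytes/sec           */
#define N_PIPES   4096        /* active pipes; must be a power of 2      */

extern struct rte_sched_pipe_params pipe_profiles[4]; /* table above */

static struct rte_sched_port *sched[N_SCHED];

static int
sched_init(int socket)
{
	/* One subport per scheduler, capped at 1/8 of line rate. */
	struct rte_sched_subport_params subport = {
		.tb_rate = RATE_10GE / N_SCHED, .tb_size = 1000000,
		.tc_rate = {RATE_10GE / N_SCHED, RATE_10GE / N_SCHED,
			    RATE_10GE / N_SCHED, RATE_10GE / N_SCHED},
		.tc_period = 10,
	};
	uint32_t i, pipe;

	for (i = 0; i < N_SCHED; i++) {
		char name[32];

		snprintf(name, sizeof(name), "sched_%u", i);

		struct rte_sched_port_params pp = {
			.name = name,
			.socket = socket,
			.rate = RATE_10GE / N_SCHED, /* 1/8 of line rate */
			.mtu = 1522,
			.frame_overhead = RTE_SCHED_FRAME_OVERHEAD_DEFAULT,
			.n_subports_per_port = 1,
			.n_pipes_per_subport = N_PIPES,
			.qsize = {64, 64, 64, 64},
			.pipe_profiles = pipe_profiles,
			.n_pipe_profiles = 4,
		};

		sched[i] = rte_sched_port_config(&pp);
		if (sched[i] == NULL ||
		    rte_sched_subport_config(sched[i], 0, &subport) != 0)
			return -1;

		/* Start every pipe on Bronze (profile 3). When a subscriber
		 * becomes active, remap a free pipe to its profile; recycle
		 * the pipe when the subscriber goes idle, so a few thousand
		 * pipes can serve millions of provisioned users. */
		for (pipe = 0; pipe < N_PIPES; pipe++)
			if (rte_sched_pipe_config(sched[i], 0, pipe, 3) != 0)
				return -1;
	}
	return 0;
}

/* TX thread i drains its scheduler into TX queue i of the same physical
 * port. Packets must already carry (subport, pipe, tc, queue, color),
 * written by the classifier/meter via rte_sched_port_pkt_write().
 * burst[] must hold at least 32 entries; packets that do not fit in the
 * scheduler queues are dropped and freed internally. */
static void
tx_drain(uint8_t eth_port, uint32_t i, struct rte_mbuf **burst, uint32_t n)
{
	rte_sched_port_enqueue(sched[i], burst, n);
	n = rte_sched_port_dequeue(sched[i], burst, 32);
	if (n > 0)
		rte_eth_tx_burst(eth_port, i, burst, n);
}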