From: "Dumitrescu, Cristian"
To: Alan Robertson
Cc: dev@dpdk.org, Thomas Monjalon
Date: Wed, 7 Dec 2016 19:52:01 +0000
Subject: Re: [dpdk-dev] [RFC] ethdev: abstraction layer for QoS hierarchical scheduler
Message-ID: <3EB4FA525960D640B5BDFFD6A3D8912652711302@IRSMSX108.ger.corp.intel.com>
In-Reply-To: <6d862b500e1e4f34a4cbf790db8d5d48@EMEAWP-EXMB11.corp.brocade.com>

Hi Alan,

Thanks for your comments!

> Hi Cristian,
>
> Looking at points 10 and 11 it's good to hear nodes can be dynamically
> added.

Yes, many implementations allow remapping a node from one parent to another
on the fly, or simply adding more nodes post-initialization, so it is natural
for the API to provide this (a rough sketch of what this could look like at
the API level follows below).
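Purely as an illustration of adding and remapping nodes at run-time, here is
a minimal sketch in C. The function names and signatures are hypothetical,
invented for this example; they are not the RFC's actual API.

#include <stdint.h>

/* Hypothetical API, for illustration only (not the RFC's actual
 * signatures): add a scheduler node while the port is running, then
 * move it under a different parent.
 */
int sched_node_add(uint8_t port_id, uint32_t node_id,
		   uint32_t parent_node_id, uint32_t priority,
		   uint32_t weight);
int sched_node_parent_update(uint8_t port_id, uint32_t node_id,
			     uint32_t new_parent_node_id,
			     uint32_t priority, uint32_t weight);

static int add_then_remap(uint8_t port_id, uint32_t node_id,
			  uint32_t first_parent, uint32_t second_parent)
{
	/* Add the node under first_parent post-initialization. */
	int ret = sched_node_add(port_id, node_id, first_parent, 0, 1);

	if (ret != 0)
		return ret;

	/* Later, remap the same node to second_parent on the fly. */
	return sched_node_parent_update(port_id, node_id, second_parent, 0, 1);
}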
> We've been trying to decide the best way to do this for support of qos on
> tunnels for some time now, and the existing implementation doesn't allow
> this, so it effectively ruled out hierarchical queueing for tunnel targets
> on the output interface.
>
> Having said that, has thought been given to separating the queueing from
> being so closely tied to the Ethernet transmit process? When queueing on a
> tunnel, for example, we may be working with encryption. When running with
> an anti-replay window it is really much better to do the QoS (packet
> reordering) before the encryption. To support this, would it be possible to
> have a separate scheduler structure which can be passed into the scheduling
> API? This means the calling code can hang the structure off whatever entity
> it wishes to perform qos on, and we get dynamic target support
> (sessions/tunnels etc).

Yes, this is one point where we need to look for a better solution. The
current proposal attaches the hierarchical scheduler function to an ethdev,
so scheduling traffic for tunnels that have a pre-defined bandwidth is not
supported nicely. This question was also raised in VPP, but there tunnels are
supported as a type of output interface, so attaching scheduling to an output
interface also covers the tunnel case.

It looks to me like nice tunnel abstractions are a gap in DPDK as well. Any
thoughts about how tunnels should be supported in DPDK? What do other people
think about this?

> Regarding the structure allocation, would it be possible to make the number
> of queues associated with a TC a compile-time option which the scheduler
> would accommodate? We frequently only use one queue per TC, which means 75%
> of the space allocated at the queueing layer for that TC is never used.
> This may be specific to our implementation, but if other implementations do
> the same and folks could say so, we may get a better idea of whether this
> is a common case.
>
> Whilst touching on the scheduler, the token replenishment works using a
> division and a multiplication, obviously to cater for the fact that it may
> be run after several TC windows have passed. The most commonly used
> industrial scheduler simply checks that the TC window has elapsed and then
> adds the Bc. This relies on the scheduler being called within the TC
> window, though. It would be nice to have this as a configurable option,
> since it's much more efficient, assuming the infra code from which it's
> called can guarantee the calling frequency.

This is probably feedback for librte_sched as opposed to the current API
proposal, as the latter is intended to be generic/implementation-agnostic and
therefore its scope far exceeds the existing set of librte_sched features.
(A rough sketch contrasting the two replenishment schemes is appended at the
end of this mail.)

Btw, we do plan to use librte_sched as the default fall-back when the HW
ethdev is not scheduler-enabled, as well as the implementation of choice for
a lot of use cases where it fits really well, so we do have to continue to
evolve and improve librte_sched feature-wise and performance-wise.

> I hope you'll consider these points for inclusion into a future road map.
> Hopefully in the future my employer will increase the priority of some of
> the tasks and a PR may appear on the mailing list.
>
> Thanks,
> Alan.
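For reference, here is a minimal sketch of the two replenishment schemes
discussed above. The struct and field names are illustrative, loosely modeled
on a generic token bucket; this is not the actual librte_sched code.

#include <stdint.h>

struct tc_bucket {
	uint64_t tc_time;               /* time of last replenishment */
	uint64_t tc_period;             /* replenishment period (TC window) */
	uint64_t tc_credits_per_period; /* Bc: credits added per period */
	uint64_t tc_credits;            /* current credit balance */
};

/* Scheme 1 (division + multiplication): correct even when several TC
 * windows have elapsed since the last call.
 */
static void tc_refill_div_mul(struct tc_bucket *b, uint64_t now)
{
	uint64_t n_periods = (now - b->tc_time) / b->tc_period;

	b->tc_credits += n_periods * b->tc_credits_per_period;
	b->tc_time += n_periods * b->tc_period;
}

/* Scheme 2 (elapsed + Bc): cheaper per call, but only correct if the
 * caller guarantees at least one call per TC window.
 */
static void tc_refill_elapsed(struct tc_bucket *b, uint64_t now)
{
	if (now - b->tc_time >= b->tc_period) {
		b->tc_credits += b->tc_credits_per_period;
		b->tc_time = now;
	}
}

Making scheme 2 selectable would only be safe when the calling code can
guarantee the calling frequency, as noted above.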