From: Hemant Agrawal
To: Cristian Dumitrescu
Date: Fri, 13 Jan 2017 16:06:20 +0530
In-Reply-To: <1480529810-95280-1-git-send-email-cristian.dumitrescu@intel.com>
Subject: Re: [dpdk-dev] [RFC] ethdev: abstraction layer for QoS hierarchical scheduler

On 11/30/2016 11:46 PM, Cristian Dumitrescu wrote: > This RFC proposes an ethdev-based abstraction layer for Quality of Service (QoS) > hierarchical scheduler. The goal of the abstraction layer is to provide a simple > generic API that is agnostic of the underlying HW, SW or mixed HW-SW complex > implementation. > > Q1: What is the benefit for having an abstraction layer for QoS hierarchical > layer? > A1: There is growing interest in the industry for handling various HW-based, > SW-based or mixed hierarchical scheduler implementations using a unified DPDK > API. > > Q2: Which devices are targeted by this abstraction layer? > A2: All current and future devices that expose a hierarchical scheduler feature > under DPDK, including NICs, FPGAs, ASICs, SOCs, SW libraries. > > Q3: Which scheduler hierarchies are supported by the API? > A3: Hopefully any scheduler hierarchy can be described and covered by the > current API. Of course, functional correctness, accuracy and performance levels > depend on the specific implementations of this API. > > Q4: Why have this abstraction layer into ethdev as opposed to a new type of > device (e.g. scheddev) similar to ethdev, cryptodev, eventdev, etc? > A4: Packets are sent to the Ethernet device using the ethdev API > rte_eth_tx_burst() function, with the hierarchical scheduling taking place > automatically (i.e. no SW intervention) in HW implementations. Basically, the > hierarchical scheduler is done as part of packet TX operation. > The hierarchical scheduler is typically the last stage before packet TX and it > is tightly integrated with the TX stage. The hierarchical scheduler is just > another offload feature of the Ethernet device, which needs to be accommodated > by the ethdev API similar to any other offload feature (such as RSS, DCB, > flow director, etc). 
> Once the decision to schedule a specific packet has been taken, this packet > cannot be dropped and it has to be sent over the wire as is, otherwise what > takes place on the wire is not what was planned at scheduling time, so the > scheduling is not accurate (Note: there are some devices which allow prepending > headers to the packet after the scheduling stage at the expense of sending > correction requests back to the scheduler, but this only strengthens the bond > between scheduling and TX). >

Egress QoS can be applied to a physical or a logical network device. At present, network devices are represented as ethdev in DPDK, and even a logical device can be exposed by creating a new ethdev. So it seems a good idea to associate the scheduler with ethdev.

> Q5: Given that the packet scheduling takes place automatically for pure HW > implementations, how does packet scheduling take place for poll-mode SW > implementations? > A5: The API provided function rte_sched_run() is designed to take care of this. > For HW implementations, this function typically does nothing. For SW > implementations, this function is typically expected to perform dequeue of > packets from the hierarchical scheduler and their write to Ethernet device TX > queue, periodic flush of any buffers on enqueue-side into the hierarchical > scheduler for burst-oriented implementations, etc. >

I think this is *rte_eth_sched_run* in your APIs. It will be a no-op for HW; how do you envision its usage in typical software, e.g. in the l3fwd application? Some options would be:
- call it every time you do rte_eth_tx_burst() (there may be a locking concern here)
- create a per-port thread that keeps calling rte_eth_sched_run()
- call it from one of the existing polling threads for a port (a minimal sketch of this option is at the end of this mail)

> Q6: Which are the scheduling algorithms supported? > A6: The fundamental scheduling algorithms that are supported are Strict Priority > (SP) and Weighted Fair Queuing (WFQ). The SP and WFQ algorithms are supported at > the level of each node of the scheduling hierarchy, regardless of the node > level/position in the tree. The SP algorithm is used to schedule between sibling > nodes with different priority, while WFQ is used to schedule between groups of > siblings that have the same priority. > Algorithms such as Weighed Round Robin (WRR), byte-level WRR, Deficit WRR > (DWRR), etc are considered approximations of the ideal WFQ and are therefore > assimilated to WFQ, although an associated implementation-dependent accuracy, > performance and resource usage trade-off might exist. > > Q7: Which are the supported congestion management algorithms? > A7: Tail drop, head drop and Weighted Random Early Detection (WRED). They are > available for every leaf node in the hierarchy, subject to the specific > implementation supporting them. >

We may need to introduce some kind of capability APIs, e.g. NXP HW does not support head drop.

> Q8: Is traffic shaping supported? > A8: Yes, there are a number of shapers (rate limiters) that can be supported for > each node in the hierarchy (built-in limit is currently set to 4 per node). Each > shaper can be private to a node (used only by that node) or shared between > multiple nodes. >

What do you mean by supporting 4 shapers per node? If you need more shapers, create new hierarchy nodes. Similarly, if a shaper is to be shared between two nodes, shouldn't it be in the parent node? Why would you want to create a shaper hierarchy within a single node of the hierarchical QoS?

> Q9: What is the purpose of having shaper profiles and WRED profiles? 
> A9: In most implementations, many shapers typically share the same configuration > parameters, so defining shaper profiles simplifies the configuration task. Same > considerations apply to WRED contexts and profiles. >

Agreed.

> Q10: How is the scheduling hierarchy defined and created? > A10: Scheduler hierarchy tree is set up by creating new nodes and connecting > them to other existing nodes, which thus become parent nodes. The unique ID that > is assigned to each node when the node is created is further used to update the > node configuration or to connect children nodes to it. The leaf nodes of the > scheduler hierarchy are each attached to one of the Ethernet device TX queues.

It may be cleaner to differentiate between a leaf (i.e. a qos_queue) and a scheduling node.

> Q11: Are on-the-fly changes of the scheduling hierarchy allowed by the API? > A11: Yes. The actual changes take place subject to the specific implementation > supporting them, otherwise error code is returned.

What kind of changes are you expecting here? Creating new nodes/levels? Reconnecting a node from one parent node to another? This looks more like an implementation capability.

> Q12: What is the typical function call sequence to set up and run the Ethernet > device scheduler? > A12: The typical simplified function call sequence is listed below: > i) Configure the Ethernet device and its TX queues: rte_eth_dev_configure(), > rte_eth_tx_queue_setup() > ii) Create WRED profiles and WRED contexts, shaper profiles and shapers: > rte_eth_sched_wred_profile_add(), rte_eth_sched_wred_context_add(), > rte_eth_sched_shaper_profile_add(), rte_eth_sched_shaper_add() > iii) Create the scheduler hierarchy nodes and tree: rte_eth_sched_node_add() > iv) Freeze the start-up hierarchy and ask the device whether it supports it: > rte_eth_sched_node_add() > v) Start the Ethernet port: rte_eth_dev_start() > vi) Run-time scheduler hierarchy updates: rte_eth_sched_node_add(), > rte_eth_sched_node__set() > vii) Run-time packet enqueue into the hierarchical scheduler: rte_eth_tx_burst() > viii) Run-time support for SW poll-mode implementations (see previous answer): > rte_sched_run() > > Q13: Which are the possible options for the user when the Ethernet port does not > support the scheduling hierarchy required by the user? > A13: The following options are available to the user: > i) abort > ii) try out a new hierarchy (e.g. with less leaf nodes), if acceptable > iii) wrap the Ethernet device into a new type of Ethernet device that has a SW > front-end implementing the hierarchical scheduler (e.g. existing DPDK library > librte_sched); instantiate the new device type on-the-fly and check if the > hierarchy requirements can be met by the new device. > >

I would like to see some kind of capability APIs upfront (a rough sketch follows below), for example:
1. Number of levels supported
2. Per-level capability (the capability of each level may be different):
   - Number of nodes supported at a given level
   - Max number of input (child) nodes supported per node
   - Type of scheduling algorithms supported (SP, WFQ, etc.)
   - Shaper support, including dual rate
   - Congestion control support
   - Max number of priorities
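To make the above concrete, here is a minimal sketch of what such a capability query could look like. All names below (the structs, RTE_ETH_SCHED_LEVELS_MAX, rte_eth_sched_capability_get()) are hypothetical and not part of this RFC:

/* Hypothetical sketch only -- none of these names exist in the RFC. */
#define RTE_ETH_SCHED_LEVELS_MAX 8 /**< Assumed upper bound for the sketch */

struct rte_eth_sched_level_capability {
	uint32_t n_nodes_max;             /**< Max nodes at this level */
	uint32_t n_children_per_node_max; /**< Max input (child) nodes per node */
	uint32_t n_priorities_max;        /**< Max SP priorities per node */
	int sp_supported;                 /**< Non-zero if SP is supported */
	int wfq_supported;                /**< Non-zero if WFQ/WRR is supported */
	int shaper_supported;             /**< Non-zero if shaping is supported */
	int shaper_dual_rate_supported;   /**< Non-zero if dual-rate shaping */
	uint32_t cman_modes_supported;    /**< Bitmask of rte_eth_sched_cman_mode */
};

struct rte_eth_sched_capability {
	uint32_t n_levels;                /**< Number of hierarchy levels supported */
	struct rte_eth_sched_level_capability level[RTE_ETH_SCHED_LEVELS_MAX];
};

/* Query the scheduler capabilities of an Ethernet device. */
int rte_eth_sched_capability_get(uint8_t port_id,
	struct rte_eth_sched_capability *cap);

The application could then query this before building the hierarchy, instead of discovering an unsupported configuration only when freezing it via rte_eth_sched_hierarchy_set().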
> Signed-off-by: Cristian Dumitrescu > --- > lib/librte_ether/rte_ethdev.h | 794 ++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 794 insertions(+) > mode change 100644 => 100755 lib/librte_ether/rte_ethdev.h > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h > old mode 100644 > new mode 100755 > index 9678179..d4d8604 > --- a/lib/librte_ether/rte_ethdev.h > +++ b/lib/librte_ether/rte_ethdev.h > @@ -182,6 +182,8 @@ extern "C" { > #include > #include > #include > +#include > +#include > #include "rte_ether.h" > #include "rte_eth_ctrl.h" > #include "rte_dev_info.h" > @@ -1038,6 +1040,152 @@ TAILQ_HEAD(rte_eth_dev_cb_list, rte_eth_dev_callback); > /**< l2 tunnel forwarding mask */ > #define ETH_L2_TUNNEL_FORWARDING_MASK 0x00000008 > > +/** > + * Scheduler configuration > + */ > + > +/**< Max number of shapers per node */ > +#define RTE_ETH_SCHED_SHAPERS_PER_NODE 4 > +/**< Invalid shaper ID */ > +#define RTE_ETH_SCHED_SHAPER_ID_NONE UINT32_MAX > +/**< Max number of WRED contexts per node */ > +#define RTE_ETH_SCHED_WRED_CONTEXTS_PER_NODE 4 > +/**< Invalid WRED context ID */ > +#define RTE_ETH_SCHED_WRED_CONTEXT_ID_NONE UINT32_MAX > +/**< Invalid node ID */ > +#define RTE_ETH_SCHED_NODE_NULL UINT32_MAX > + > +/** > + * Congestion management (CMAN) mode > + * > + * This is used for controlling the admission of packets into a packet queue or > + * group of packet queues on congestion. On request of writing a new packet > + * into the current queue while the queue is full, the *tail drop* algorithm > + * drops the new packet while leaving the queue unmodified, as opposed to *head > + * drop* algorithm, which drops the packet at the head of the queue (the oldest > + * packet waiting in the queue) and admits the new packet at the tail of the > + * queue. > + * > + * The *Random Early Detection (RED)* algorithm works by proactively dropping > + * more and more input packets as the queue occupancy builds up. When the queue > + * is full or almost full, RED effectively works as *tail drop*. The *Weighted > + * RED* algorithm uses a separate set of RED thresholds per packet color. > + */ > +enum rte_eth_sched_cman_mode { > + RTE_ETH_SCHED_CMAN_TAIL_DROP = 0, /**< Tail drop */ > + RTE_ETH_SCHED_CMAN_HEAD_DROP, /**< Head drop */ > + RTE_ETH_SCHED_CMAN_WRED, /**< Weighted Random Early Detection (WRED) */ > +}; > + you may also need parameters whether the cman is byte based or frame based. > +/** > + * WRED profile > + */ > +struct rte_eth_sched_wred_params { > + /**< One set of RED parameters per packet color */ > + struct rte_red_params red_params[e_RTE_METER_COLORS]; > +}; > + > +/** > + * Shaper (rate limiter) profile > + * > + * Multiple shaper instances can share the same shaper profile. Each node can > + * have multiple shapers enabled (up to RTE_ETH_SCHED_SHAPERS_PER_NODE). Each > + * shaper can be private to a node (only one node using it) or shared (multiple > + * nodes use the same shaper instance). > + */ > +struct rte_eth_sched_shaper_params { > + uint64_t rate; /**< Token bucket rate (bytes per second) */ > + uint64_t size; /**< Token bucket size (bytes) */ > +}; > + dual rate shaper can be supported here. I guess by size you mean the max burst size? > +/** > + * Node parameters > + * > + * Each scheduler hierarchy node has multiple inputs (children nodes of the > + * current parent node) and a single output (which is input to its parent > + * node). 
The current node arbitrates its inputs using Strict Priority (SP) > + * and Weighted Fair Queuing (WFQ) algorithms to schedule input packets on its > + * output while observing its shaping/rate limiting constraints. Algorithms > + * such as Weighted Round Robin (WRR), byte-level WRR, Deficit WRR (DWRR), etc > + * are considered approximations of the ideal WFQ and are assimilated to WFQ, > + * although an associated implementation-dependent trade-off on accuracy, > + * performance and resource usage might exist. > + * > + * Children nodes with different priorities are scheduled using the SP > + * algorithm, based on their priority, with zero (0) as the highest priority. > + * Children with same priority are scheduled using the WFQ algorithm, based on > + * their weight, which is relative to the sum of the weights of all siblings > + * with same priority, with one (1) as the lowest weight. > + */ > +struct rte_eth_sched_node_params { > + /**< Child node priority (used by SP). The highest priority is zero. */ > + uint32_t priority; > + /**< Child node weight (used by WFQ), relative to some of weights of all > + siblings with same priority). The lowest weight is one. */ > + uint32_t weight; > + /**< Set of shaper instances enabled for current node. Each node shaper > + can be disabled by setting it to RTE_ETH_SCHED_SHAPER_ID_NONE. */ > + uint32_t shaper_id[RTE_ETH_SCHED_SHAPERS_PER_NODE]; > + /**< Set to zero if current node is not a hierarchy leaf node, set to a > + non-zero value otherwise. A leaf node is a hierarchy node that does > + not have any children. A leaf node has to be connected to a valid > + packet queue. */ > + int is_leaf; > + /**< Parameters valid for leaf nodes only */ > + struct { > + /**< Packet queue ID */ > + uint64_t queue_id; > + /**< Congestion management mode */ > + enum rte_eth_sched_cman_mode cman; > + /**< Set of WRED contexts enabled for current leaf node. Each > + leaf node WRED context can be disabled by setting it to > + RTE_ETH_SCHED_WRED_CONTEXT_ID_NONE. Only valid when > + congestion management for current leaf node is set to WRED. */ > + uint32_t wred_context_id[RTE_ETH_SCHED_WRED_CONTEXTS_PER_NODE]; > + } leaf; > +}; > +

It would be better to separate the leaf, i.e. a qos_queue, from a sched node; it will simplify things. e.g.:

struct rte_eth_sched_qos_queue {
	/**< Child node priority (used by SP). The highest priority is zero. */
	uint32_t priority;
	/**< Child node weight (used by WFQ), relative to the sum of weights of
	     all siblings with same priority. The lowest weight is one. */
	uint32_t weight;
	/**< Packet queue ID */
	uint64_t queue_id;
	/**< Congestion management params */
	enum rte_eth_sched_cman_mode cman;
};

struct rte_eth_sched_node_params {
	/**< Child node priority (used by SP). The highest priority is zero. */
	uint32_t priority;
	/**< Child node weight (used by WFQ), relative to the sum of weights of
	     all siblings with same priority. The lowest weight is one. */
	uint32_t weight;
	/**< Set of shaper instances enabled for current node. Each node shaper
	     can be disabled by setting it to RTE_ETH_SCHED_SHAPER_ID_NONE. */
	uint32_t shaper_id;
	/**< WRED contexts enabled for current leaf node. Each leaf node WRED
	     context can be disabled by setting it to
	     RTE_ETH_SCHED_WRED_CONTEXT_ID_NONE. Only valid when congestion
	     management for current leaf node is set to WRED. */
	uint32_t wred_context_id;
};

sched_qos_queue(s) will then be connected to a sched node; see the attach sketch below.
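Just to illustrate the idea, a hypothetical attach call could then look like the following (the function name is made up for this sketch and is not part of the RFC):

/* Hypothetical sketch only: attach a qos queue to an existing non-leaf
 * scheduler node once queues and nodes are separate objects. */
int rte_eth_sched_qos_queue_add(uint8_t port_id,
	uint32_t parent_node_id,
	uint32_t qos_queue_id,
	struct rte_eth_sched_qos_queue *queue);

This would keep rte_eth_sched_node_add() strictly for building the non-leaf part of the hierarchy.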
> +/** > + * Node statistics counter type > + */ > +enum rte_eth_sched_stats_counter { > + /**< Number of packets scheduled from current node. */ > + RTE_ETH_SCHED_STATS_COUNTER_N_PKTS = 1<< 0, > + /**< Number of bytes scheduled from current node. */ > + RTE_ETH_SCHED_STATS_COUNTER_N_BYTES = 1 << 1, > + RTE_ETH_SCHED_STATS_COUNTER_N_PKTS_DROPPED = 1 << 2, > + RTE_ETH_SCHED_STATS_COUNTER_N_BYTES_DROPPED = 1 << 3, > + /**< Number of packets currently waiting in the packet queue of current > + leaf node. */ > + RTE_ETH_SCHED_STATS_COUNTER_N_PKTS_QUEUED = 1 << 4, > + /**< Number of bytes currently waiting in the packet queue of current > + leaf node. */ > + RTE_ETH_SCHED_STATS_COUNTER_N_BYTES_QUEUED = 1 << 5, > +}; > + > +/** > + * Node statistics counters > + */ > +struct rte_eth_sched_node_stats { > + /**< Number of packets scheduled from current node. */ > + uint64_t n_pkts; > + /**< Number of bytes scheduled from current node. */ > + uint64_t n_bytes; > + /**< Statistics counters for leaf nodes only */ > + struct { > + /**< Number of packets dropped by current leaf node. */ > + uint64_t n_pkts_dropped; > + /**< Number of bytes dropped by current leaf node. */ > + uint64_t n_bytes_dropped; > + /**< Number of packets currently waiting in the packet queue of > + current leaf node. */ > + uint64_t n_pkts_queued; > + /**< Number of bytes currently waiting in the packet queue of > + current leaf node. */ > + uint64_t n_bytes_queued; > + } leaf; > +}; > + > /* > * Definitions of all functions exported by an Ethernet driver through the > * the generic structure of type *eth_dev_ops* supplied in the *rte_eth_dev* > @@ -1421,6 +1569,120 @@ typedef int (*eth_get_dcb_info)(struct rte_eth_dev *dev, > struct rte_eth_dcb_info *dcb_info); > /**< @internal Get dcb information on an Ethernet device */ > > +typedef int (*eth_sched_wred_profile_add_t)(struct rte_eth_dev *dev, > + uint32_t wred_profile_id, > + struct rte_eth_sched_wred_params *profile); > +/**< @internal Scheduler WRED profile add */ > + > +typedef int (*eth_sched_wred_profile_delete_t)(struct rte_eth_dev *dev, > + uint32_t wred_profile_id); > +/**< @internal Scheduler WRED profile delete */ > + > +typedef int (*eth_sched_wred_context_add_t)(struct rte_eth_dev *dev, > + uint32_t wred_context_id, > + uint32_t wred_profile_id); > +/**< @internal Scheduler WRED context add */ > + > +typedef int (*eth_sched_wred_context_delete_t)(struct rte_eth_dev *dev, > + uint32_t wred_context_id); > +/**< @internal Scheduler WRED context delete */ > + > +typedef int (*eth_sched_shaper_profile_add_t)(struct rte_eth_dev *dev, > + uint32_t shaper_profile_id, > + struct rte_eth_sched_shaper_params *profile); > +/**< @internal Scheduler shaper profile add */ > + > +typedef int (*eth_sched_shaper_profile_delete_t)(struct rte_eth_dev *dev, > + uint32_t shaper_profile_id); > +/**< @internal Scheduler shaper profile delete */ > + > +typedef int (*eth_sched_shaper_add_t)(struct rte_eth_dev *dev, > + uint32_t shaper_id, > + uint32_t shaper_profile_id); > +/**< @internal Scheduler shaper instance add */ > + > +typedef int (*eth_sched_shaper_delete_t)(struct rte_eth_dev *dev, > + uint32_t shaper_id); > +/**< @internal Scheduler shaper instance delete */ > + > +typedef int (*eth_sched_node_add_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t parent_node_id, > + struct rte_eth_sched_node_params *params); > +/**< @internal Scheduler node add */ > + > +typedef int (*eth_sched_node_delete_t)(struct rte_eth_dev *dev, > + uint32_t node_id); > +/**< @internal 
Scheduler node delete */ > + > +typedef int (*eth_sched_hierarchy_set_t)(struct rte_eth_dev *dev, > + int clear_on_fail); > +/**< @internal Scheduler hierarchy set */ > + > +typedef int (*eth_sched_node_priority_set_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t priority); > +/**< @internal Scheduler node priority set */ > + > +typedef int (*eth_sched_node_weight_set_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t weight); > +/**< @internal Scheduler node weight set */ > + > +typedef int (*eth_sched_node_shaper_set_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t shaper_pos, > + uint32_t shaper_id); > +/**< @internal Scheduler node shaper set */ > + > +typedef int (*eth_sched_node_queue_set_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t queue_id); > +/**< @internal Scheduler node queue set */ > + > +typedef int (*eth_sched_node_cman_set_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + enum rte_eth_sched_cman_mode cman); > +/**< @internal Scheduler node congestion management mode set */ > + > +typedef int (*eth_sched_node_wred_context_set_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint32_t wred_context_pos, > + uint32_t wred_context_id); > +/**< @internal Scheduler node WRED context set */ > + > +typedef int (*eth_sched_stats_get_enabled_t)(struct rte_eth_dev *dev, > + uint64_t *nonleaf_node_capability_stats_mask, > + uint64_t *nonleaf_node_enabled_stats_mask, > + uint64_t *leaf_node_capability_stats_mask, > + uint64_t *leaf_node_enabled_stats_mask); > +/**< @internal Scheduler get set of stats counters enabled for all nodes */ > + > +typedef int (*eth_sched_stats_enable_t)(struct rte_eth_dev *dev, > + uint64_t nonleaf_node_enabled_stats_mask, > + uint64_t leaf_node_enabled_stats_mask); > +/**< @internal Scheduler enable selected stats counters for all nodes */ > + > +typedef int (*eth_sched_node_stats_get_enabled_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint64_t *capability_stats_mask, > + uint64_t *enabled_stats_mask); > +/**< @internal Scheduler get set of stats counters enabled for specific node */ > + > +typedef int (*eth_sched_node_stats_enable_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + uint64_t enabled_stats_mask); > +/**< @internal Scheduler enable selected stats counters for specific node */ > + > +typedef int (*eth_sched_node_stats_read_t)(struct rte_eth_dev *dev, > + uint32_t node_id, > + struct rte_eth_sched_node_stats *stats, > + int clear); > +/**< @internal Scheduler read stats counters for specific node */ > + > +typedef int (*eth_sched_run_t)(struct rte_eth_dev *dev); > +/**< @internal Scheduler run */ > + > /** > * @internal A structure containing the functions exported by an Ethernet driver. 
> */ > @@ -1547,6 +1809,53 @@ struct eth_dev_ops { > eth_l2_tunnel_eth_type_conf_t l2_tunnel_eth_type_conf; > /** Enable/disable l2 tunnel offload functions */ > eth_l2_tunnel_offload_set_t l2_tunnel_offload_set; > + > + /** Scheduler WRED profile add */ > + eth_sched_wred_profile_add_t sched_wred_profile_add; > + /** Scheduler WRED profile delete */ > + eth_sched_wred_profile_delete_t sched_wred_profile_delete; > + /** Scheduler WRED context add */ > + eth_sched_wred_context_add_t sched_wred_context_add; > + /** Scheduler WRED context delete */ > + eth_sched_wred_context_delete_t sched_wred_context_delete; > + /** Scheduler shaper profile add */ > + eth_sched_shaper_profile_add_t sched_shaper_profile_add; > + /** Scheduler shaper profile delete */ > + eth_sched_shaper_profile_delete_t sched_shaper_profile_delete; > + /** Scheduler shaper instance add */ > + eth_sched_shaper_add_t sched_shaper_add; > + /** Scheduler shaper instance delete */ > + eth_sched_shaper_delete_t sched_shaper_delete; > + /** Scheduler node add */ > + eth_sched_node_add_t sched_node_add; > + /** Scheduler node delete */ > + eth_sched_node_delete_t sched_node_delete; > + /** Scheduler hierarchy set */ > + eth_sched_hierarchy_set_t sched_hierarchy_set; > + /** Scheduler node priority set */ > + eth_sched_node_priority_set_t sched_node_priority_set; > + /** Scheduler node weight set */ > + eth_sched_node_weight_set_t sched_node_weight_set; > + /** Scheduler node shaper set */ > + eth_sched_node_shaper_set_t sched_node_shaper_set; > + /** Scheduler node queue set */ > + eth_sched_node_queue_set_t sched_node_queue_set; > + /** Scheduler node congestion management mode set */ > + eth_sched_node_cman_set_t sched_node_cman_set; > + /** Scheduler node WRED context set */ > + eth_sched_node_wred_context_set_t sched_node_wred_context_set; > + /** Scheduler get statistics counter type enabled for all nodes */ > + eth_sched_stats_get_enabled_t sched_stats_get_enabled; > + /** Scheduler enable selected statistics counters for all nodes */ > + eth_sched_stats_enable_t sched_stats_enable; > + /** Scheduler get statistics counter type enabled for current node */ > + eth_sched_node_stats_get_enabled_t sched_node_stats_get_enabled; > + /** Scheduler enable selected statistics counters for current node */ > + eth_sched_node_stats_enable_t sched_node_stats_enable; > + /** Scheduler read statistics counters for current node */ > + eth_sched_node_stats_read_t sched_node_stats_read; > + /** Scheduler run */ > + eth_sched_run_t sched_run; > }; > > /** > @@ -4336,6 +4645,491 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id, > uint8_t en); > > /** > + * Scheduler WRED profile add > + * > + * Create a new WRED profile with ID set to *wred_profile_id*. The new profile > + * is used to create one or several WRED contexts. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param wred_profile_id > + * WRED profile ID for the new profile. Needs to be unused. > + * @param profile > + * WRED profile parameters. Needs to be pre-allocated and valid. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_wred_profile_add(uint8_t port_id, > + uint32_t wred_profile_id, > + struct rte_eth_sched_wred_params *profile); > + > +/** > + * Scheduler WRED profile delete > + * > + * Delete an existing WRED profile. This operation fails when there is currently > + * at least one user (i.e. WRED context) of this WRED profile. 
> + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param wred_profile_id > + * WRED profile ID. Needs to be the valid. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_wred_profile_delete(uint8_t port_id, > + uint32_t wred_profile_id); > + > +/** > + * Scheduler WRED context add or update > + * > + * When *wred_context_id* is invalid, a new WRED context with this ID is created > + * by using the WRED profile identified by *wred_profile_id*. > + * > + * When *wred_context_id* is valid, this WRED context is no longer using the > + * profile previously assigned to it and is updated to use the profile > + * identified by *wred_profile_id*. > + * > + * A valid WRED context is assigned to one or several scheduler hierarchy leaf > + * nodes configured to use WRED as the congestion management mode. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param wred_context_id > + * WRED context ID > + * @param wred_profile_id > + * WRED profile ID. Needs to be the valid. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_wred_context_add(uint8_t port_id, > + uint32_t wred_context_id, > + uint32_t wred_profile_id); > + > +/** > + * Scheduler WRED context delete > + * > + * Delete an existing WRED context. This operation fails when there is currently > + * at least one user (i.e. scheduler hierarchy leaf node) of this WRED context. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param wred_context_id > + * WRED context ID. Needs to be the valid. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_wred_context_delete(uint8_t port_id, > + uint32_t wred_context_id); > + > +/** > + * Scheduler shaper profile add > + * > + * Create a new shaper profile with ID set to *shaper_profile_id*. The new > + * shaper profile is used to create one or several shapers. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param shaper_profile_id > + * Shaper profile ID for the new profile. Needs to be unused. > + * @param profile > + * Shaper profile parameters. Needs to be pre-allocated and valid. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_shaper_profile_add(uint8_t port_id, > + uint32_t shaper_profile_id, > + struct rte_eth_sched_shaper_params *profile); > + > +/** > + * Scheduler shaper profile delete > + * > + * Delete an existing shaper profile. This operation fails when there is > + * currently at least one user (i.e. shaper) of this shaper profile. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param shaper_profile_id > + * Shaper profile ID. Needs to be the valid. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +/* no users (shapers) using this profile */ > +int rte_eth_sched_shaper_profile_delete(uint8_t port_id, > + uint32_t shaper_profile_id); > + > +/** > + * Scheduler shaper add or update > + * > + * When *shaper_id* is not a valid shaper ID, a new shaper with this ID is > + * created using the shaper profile identified by *shaper_profile_id*. > + * > + * When *shaper_id* is a valid shaper ID, this shaper is no longer using the > + * shaper profile previously assigned to it and is updated to use the shaper > + * profile identified by *shaper_profile_id*. > + * > + * A valid shaper is assigned to one or several scheduler hierarchy nodes. 
> + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param shaper_id > + * Shaper ID > + * @param shaper_profile_id > + * Shaper profile ID. Needs to be the valid. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_shaper_add(uint8_t port_id, > + uint32_t shaper_id, > + uint32_t shaper_profile_id); > + > +/** > + * Scheduler shaper delete > + * > + * Delete an existing shaper. This operation fails when there is currently at > + * least one user (i.e. scheduler hierarchy node) of this shaper. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param shaper_id > + * Shaper ID. Needs to be the valid. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_shaper_delete(uint8_t port_id, > + uint32_t shaper_id); > + > +/** > + * Scheduler node add or remap > + * > + * When *node_id* is not a valid node ID, a new node with this ID is created and > + * connected as child to the existing node identified by *parent_node_id*. > + * > + * When *node_id* is a valid node ID, this node is disconnected from its current > + * parent and connected as child to another existing node identified by > + * *parent_node_id *. > + * > + * This function can be called during port initialization phase (before the > + * Ethernet port is started) for building the scheduler start-up hierarchy. > + * Subject to the specific Ethernet port supporting on-the-fly scheduler > + * hierarchy updates, this function can also be called during run-time (after > + * the Ethernet port is started). > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID > + * @param parent_node_id > + * Parent node ID. Needs to be the valid. > + * @param params > + * Node parameters. Needs to be pre-allocated and valid. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_node_add(uint8_t port_id, > + uint32_t node_id, > + uint32_t parent_node_id, > + struct rte_eth_sched_node_params *params); > + > +/** > + * Scheduler node delete > + * > + * Delete an existing node. This operation fails when this node currently has at > + * least one user (i.e. child node). > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_node_delete(uint8_t port_id, > + uint32_t node_id); > + > +/** > + * Scheduler hierarchy set > + * > + * This function is called during the port initialization phase (before the > + * Ethernet port is started) to freeze the scheduler start-up hierarchy. > + * > + * This function fails when the currently configured scheduler hierarchy is not > + * supported by the Ethernet port, in which case the user can abort or try out > + * another hierarchy configuration (e.g. a hierarchy with less leaf nodes), > + * which can be build from scratch (when *clear_on_fail* is enabled) or by > + * modifying the existing hierarchy configuration (when *clear_on_fail* is > + * disabled). > + * > + * Note that, even when the configured scheduler hierarchy is supported (so this > + * function is successful), the Ethernet port start might still fail due to e.g. > + * not enough memory being available in the system, etc. > + * > + * @param port_id > + * The port identifier of the Ethernet device. 
> + * @param clear_on_fail > + * On function call failure, hierarchy is cleared when this parameter is > + * non-zero and preserved when this parameter is equal to zero. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_hierarchy_set(uint8_t port_id, > + int clear_on_fail); > + > +/** > + * Scheduler node priority set > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param priority > + * Node priority. The highest node priority is zero. Used by the SP algorithm > + * running on the parent of the current node for scheduling this child node. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_node_priority_set(uint8_t port_id, > + uint32_t node_id, > + uint32_t priority); > + > +/** > + * Scheduler node weight set > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param weight > + * Node weight. The node weight is relative to the weight sum of all siblings > + * that have the same priority. The lowest weight is zero. Used by the WFQ > + * algorithm running on the parent of the current node for scheduling this > + * child node. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_node_weight_set(uint8_t port_id, > + uint32_t node_id, > + uint32_t weight); > + > +/** > + * Scheduler node shaper set > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param shaper_pos > + * Position in the shaper array of the current node > + * (0 .. RTE_ETH_SCHED_SHAPERS_PER_NODE-1). > + * @param shaper_id > + * Shaper ID. Needs to be either valid shaper ID or set to > + * RTE_ETH_SCHED_SHAPER_ID_NONE in order to invalidate the shaper on position > + * *shaper_pos* within the current node. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_node_shaper_set(uint8_t port_id, > + uint32_t node_id, > + uint32_t shaper_pos, > + uint32_t shaper_id); > + > +/** > + * Scheduler node queue set > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param queue_id > + * Queue ID. Needs to be valid. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_node_queue_set(uint8_t port_id, > + uint32_t node_id, > + uint32_t queue_id); > + > +/** > + * Scheduler node congestion management mode set > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid leaf node ID. > + * @param cman > + * Congestion management mode. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_node_cman_set(uint8_t port_id, > + uint32_t node_id, > + enum rte_eth_sched_cman_mode cman); > + > +/** > + * Scheduler node WRED context set > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid leaf node ID that has WRED selected as the > + * congestion management mode. > + * @param wred_context_pos > + * Position in the WRED context array of the current leaf node > + * (0 .. RTE_ETH_SCHED_WRED_CONTEXTS_PER_NODE-1) > + * @param wred_context_id > + * WRED context ID. 
Needs to be either valid WRED context ID or set to > + * RTE_ETH_SCHED_WRED_CONTEXT_ID_NONE in order to invalidate the WRED context > + * on position *wred_context_pos* within the current leaf node. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_node_wred_context_set(uint8_t port_id, > + uint32_t node_id, > + uint32_t wred_context_pos, > + uint32_t wred_context_id); > + > +/** > + * Scheduler get statistics counter types enabled for all nodes > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param nonleaf_node_capability_stats_mask > + * Statistics counter types available per node for all non-leaf nodes. Needs > + * to be pre-allocated. > + * @param nonleaf_node_enabled_stats_mask > + * Statistics counter types currently enabled per node for each non-leaf node. > + * This is a subset of *nonleaf_node_capability_stats_mask*. Needs to be > + * pre-allocated. > + * @param leaf_node_capability_stats_mask > + * Statistics counter types available per node for all leaf nodes. Needs to > + * be pre-allocated. > + * @param leaf_node_enabled_stats_mask > + * Statistics counter types currently enabled for each leaf node. This is > + * a subset of *leaf_node_capability_stats_mask*. Needs to be pre-allocated. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_stats_get_enabled(uint8_t port_id, > + uint64_t *nonleaf_node_capability_stats_mask, > + uint64_t *nonleaf_node_enabled_stats_mask, > + uint64_t *leaf_node_capability_stats_mask, > + uint64_t *leaf_node_enabled_stats_mask); > + > +/** > + * Scheduler enable selected statistics counters for all nodes > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param nonleaf_node_enabled_stats_mask > + * Statistics counter types to be enabled per node for each non-leaf node. > + * This needs to be a subset of the statistics counter types available per > + * node for all non-leaf nodes. Any statistics counter type not included in > + * this set is to be disabled for all non-leaf nodes. > + * @param leaf_node_enabled_stats_mask > + * Statistics counter types to be enabled per node for each leaf node. This > + * needs to be a subset of the statistics counter types available per node for > + * all leaf nodes. Any statistics counter type not included in this set is to > + * be disabled for all leaf nodes. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_stats_enable(uint8_t port_id, > + uint64_t nonleaf_node_enabled_stats_mask, > + uint64_t leaf_node_enabled_stats_mask); > + > +/** > + * Scheduler get statistics counter types enabled for current node > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param capability_stats_mask > + * Statistics counter types available for the current node. Needs to be pre-allocated. > + * @param enabled_stats_mask > + * Statistics counter types currently enabled for the current node. This is > + * a subset of *capability_stats_mask*. Needs to be pre-allocated. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_node_stats_get_enabled(uint8_t port_id, > + uint32_t node_id, > + uint64_t *capability_stats_mask, > + uint64_t *enabled_stats_mask); > + > +/** > + * Scheduler enable selected statistics counters for current node > + * > + * @param port_id > + * The port identifier of the Ethernet device. 
> + * @param node_id > + * Node ID. Needs to be valid. > + * @param enabled_stats_mask > + * Statistics counter types to be enabled for the current node. This needs to > + * be a subset of the statistics counter types available for the current node. > + * Any statistics counter type not included in this set is to be disabled for > + * the current node. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_node_stats_enable(uint8_t port_id, > + uint32_t node_id, > + uint64_t enabled_stats_mask); > + > +/** > + * Scheduler node statistics counters read > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param node_id > + * Node ID. Needs to be valid. > + * @param stats > + * When non-NULL, it contains the current value for the statistics counters > + * enabled for the current node. > + * @param clear > + * When this parameter has a non-zero value, the statistics counters are > + * cleared (i.e. set to zero) immediately after they have been read, otherwise > + * the statistics counters are left untouched. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +int rte_eth_sched_node_stats_read(uint8_t port_id, > + uint32_t node_id, > + struct rte_eth_sched_node_stats *stats, > + int clear); > + > +/** > + * Scheduler run > + * > + * The packet enqueue side of the scheduler hierarchy is typically done through > + * the Ethernet device TX function. For HW implementations, the packet dequeue > + * side is typically done by the Ethernet device without any SW intervention, > + * therefore this functions should not do anything. > + * > + * However, for poll-mode SW or mixed HW-SW implementations, the SW intervention > + * is likely to be required for running the packet dequeue side of the scheduler > + * hierarchy. Other potential task performed by this function is periodic flush > + * of any packet enqueue-side buffers used by the burst-mode implementations. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @return > + * 0 on success, non-zero error code otherwise. > + */ > +static inline int > +rte_eth_sched_run(uint8_t port_id) > +{ > + struct rte_eth_dev *dev; > + > +#ifdef RTE_LIBRTE_ETHDEV_DEBUG > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0); > +#endif > + > + dev = &rte_eth_devices[port_id]; > + > + return (dev->dev_ops->sched_run)? dev->dev_ops->sched_run(dev) : 0; > +} > + > +/** > * Get the port id from pci adrress or device name > * Ex: 0000:2:00.0 or vdev name net_pcap0 > * >
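
Coming back to the rte_eth_sched_run() discussion above: a minimal sketch of the "call it from an existing polling thread" option, assuming a SW or mixed implementation where the function actually performs the dequeue work. The application-side names (port_poll_loop, force_quit, BURST_SIZE) are just placeholders from an l3fwd-style main loop and not part of the RFC:

/* Sketch only: per-port polling loop that also drives the scheduler. */
static void
port_poll_loop(uint8_t port_id, uint16_t rx_queue_id)
{
	struct rte_mbuf *pkts[BURST_SIZE];

	while (!force_quit) {
		uint16_t nb_rx = rte_eth_rx_burst(port_id, rx_queue_id,
				pkts, BURST_SIZE);

		/* ... lookup/classify, then enqueue into the hierarchical
		 * scheduler through the regular TX path ... */
		if (nb_rx)
			rte_eth_tx_burst(port_id, 0, pkts, nb_rx);

		/* Drive the dequeue side of the scheduler; expected to be
		 * a no-op for pure HW implementations. */
		rte_eth_sched_run(port_id);
	}
}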