DPDK usage discussions
* [dpdk-users] how to design high performance QoS support for a large amount of subscribers
@ 2016-08-02 15:26 Yuyong Zhang
  2016-08-04 13:01 ` Dumitrescu, Cristian
  0 siblings, 1 reply; 3+ messages in thread
From: Yuyong Zhang @ 2016-08-02 15:26 UTC (permalink / raw)
  To: dev, users

Hi,

I am trying to add QoS support for a high performance VNF with a large number of subscribers (millions). It needs to support a guaranteed bit rate for each subscriber service level; i.e., four service levels need to be supported:

*         Diamond, 500M

*         Gold, 100M

*         Silver, 50M

*         Bronze, 10M

Here is the current pipeline design using DPDK:


*         4 RX threads, which do packet classification and load balancing

*         10-20 worker threads, which do application-level subscriber management

*         4 TX threads, which send packets to the TX NICs

*         Ring buffers connecting the RX threads, worker threads, and TX threads

I have read the DPDK Programmer's Guide section on the QoS framework and its hierarchical scheduler (port, sub-port, pipe, traffic class and queues). I am looking for advice on how to design a QoS scheduler that supports millions of subscribers (pipes) whose traffic is processed in tens of worker threads, where the subscriber management processing is handled.

One design thought is as the following:

8 ports (each associated with one physical port) and 16-20 sub-ports (each used by one worker thread), with each sub-port supporting 250K pipes for subscribers. Each worker thread manages one sub-port: it meters the sub-port traffic to obtain the packet color, and after identifying the subscriber flow it picks an unused pipe, performs the scheduler enqueue/dequeue, and then puts the packets into the TX rings toward the TX threads, which send them to the TX NICs.
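
For concreteness, here is a rough sketch of the per-packet flow I have in mind inside one worker thread. find_subscriber() and the subscriber struct are placeholders for our application logic; the DPDK calls are from librte_meter, librte_sched and librte_ring as I understand their current API, so treat this as an illustration rather than working code:

#include <rte_mbuf.h>
#include <rte_ring.h>
#include <rte_meter.h>
#include <rte_sched.h>
#include <rte_cycles.h>

struct subscriber {                     /* placeholder for per-subscriber state */
    uint32_t subport_id;
    uint32_t pipe_id;
    struct rte_meter_srtcm meter;
};

struct subscriber *find_subscriber(struct rte_mbuf *pkt);  /* app-specific lookup */

static void
worker_qos_pass(struct rte_sched_port *sched, struct rte_ring *rx_ring,
                struct rte_ring *tx_ring)
{
    struct rte_mbuf *pkts[32], *out[32];
    unsigned int i, nb_rx, nb_out;

    nb_rx = rte_ring_dequeue_burst(rx_ring, (void **)pkts, 32);
    for (i = 0; i < nb_rx; i++) {
        struct subscriber *sub = find_subscriber(pkts[i]);
        enum rte_meter_color color = rte_meter_srtcm_color_blind_check(
            &sub->meter, rte_rdtsc(), rte_pktmbuf_pkt_len(pkts[i]));

        /* tag the mbuf with subport/pipe/TC/queue + color for the scheduler */
        rte_sched_port_pkt_write(pkts[i], sub->subport_id, sub->pipe_id,
                                 0 /* tc */, 0 /* queue */, color);
    }

    /* push into the hierarchical scheduler, then pull out what it allows */
    rte_sched_port_enqueue(sched, pkts, nb_rx);
    nb_out = rte_sched_port_dequeue(sched, out, 32);

    /* hand the scheduled packets to a TX thread over a ring */
    rte_ring_enqueue_burst(tx_ring, (void **)out, nb_out);
}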

Are there functional or performance issues with the above approach?

Any advice and input are appreciated.

Regards,

Yuyong


* Re: [dpdk-users] how to design high performance QoS support for a large amount of subscribers
  2016-08-02 15:26 [dpdk-users] how to design high performance QoS support for a large amount of subscribers Yuyong Zhang
@ 2016-08-04 13:01 ` Dumitrescu, Cristian
  2016-08-04 13:46   ` Yuyong Zhang
  0 siblings, 1 reply; 3+ messages in thread
From: Dumitrescu, Cristian @ 2016-08-04 13:01 UTC (permalink / raw)
  To: Yuyong Zhang, dev, users

Hi Yuyong,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Yuyong Zhang
> Sent: Tuesday, August 2, 2016 4:26 PM
> To: dev@dpdk.org; users@dpdk.org
> Subject: [dpdk-dev] how to design high performance QoS support for a large
> amount of subscribers
> 
> Hi,
> 
> I am trying to add QoS support for a high performance VNF with a large
> number of subscribers (millions).

Welcome to the world of DPDK QoS users!

> It needs to support a guaranteed bit
> rate for each subscriber service level; i.e., four service levels need to be
> supported:
> 
> *         Diamond, 500M
> 
> *         Gold, 100M
> 
> *         Silver, 50M
> 
> *         Bronze, 10M

Service levels translate to pipe profiles in our DPDK implementation. The set of pipe profiles is defined per port.
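
For example, the four service levels above could map to pipe profiles along these lines. This is a sketch only: the rates are in bytes per second, and the token bucket sizes, tc_period and WRR weights are illustrative values, not recommendations:

#include <rte_sched.h>

/* Sketch: one pipe profile per service level (rates in bytes/sec).
 * tb_size, tc_period and wrr_weights are illustrative values only. */
static struct rte_sched_pipe_params pipe_profiles[] = {
    [0] = { /* Diamond, 500 Mbps */
        .tb_rate = 62500000, .tb_size = 1000000,
        .tc_rate = {62500000, 62500000, 62500000, 62500000},
        .tc_period = 40,
        .wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
    },
    [1] = { /* Gold, 100 Mbps */
        .tb_rate = 12500000, .tb_size = 1000000,
        .tc_rate = {12500000, 12500000, 12500000, 12500000},
        .tc_period = 40,
        .wrr_weights = {1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1,  1, 1, 1, 1},
    },
    /* Silver (6250000 bytes/s) and Bronze (1250000 bytes/s) are defined the
     * same way. */
};

/* Each subscriber's pipe is then bound to one of these profiles, e.g.:
 *     rte_sched_pipe_config(port, subport_id, pipe_id, 0);   // 0 = Diamond
 */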

> 
> Here is the current pipeline design using DPDK:
> 
> 
> *         4 RX threads, which do packet classification and load balancing
> 
> *         10-20 worker threads, which do application-level subscriber management
> 
> *         4 TX threads, which send packets to the TX NICs
> 
> *         Ring buffers connecting the RX threads, worker threads, and TX threads
> 
> I have read the DPDK Programmer's Guide section on the QoS framework and its
> hierarchical scheduler (port, sub-port, pipe, traffic class and queues). I am
> looking for advice on how to design a QoS scheduler that supports millions of
> subscribers (pipes) whose traffic is processed in tens of worker threads,
> where the subscriber management processing is handled.

Having millions of pipes per port poses some challenges:
1. Does it actually make sense? Assuming the port rate is 10GbE and looking at the smallest user rate you mention above (Bronze, 10 Mbps/user), fully provisioning all users (i.e. making sure you can fully handle each user in the worst-case scenario) results in a maximum of 1,000 users per port. Assuming overprovisioning of 50:1, this means a maximum of 50K users per port.
2. Memory challenge. The number of pipes per port is configurable -- hey, this is SW! :) -- but each of these pipes has 16 queues. For 4K pipes per port, this is 64K queues per port; for a typical value of 64 packets per queue, this is 4M packets per port, so in the worst case we need to provision 4M packets in the buffer pool for each output port that has the hierarchical scheduler enabled; for a buffer size of ~2KB each, this means ~8GB of memory for each output port. If you go from 4K pipes per port to 4M pipes per port, this means 8TB of memory per port. Do you have enough memory in your system? :)
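
For reference, the same arithmetic as a quick back-of-the-envelope check (the 64-packet queue depth and ~2KB mbuf size are the assumptions stated above):

#include <stdio.h>

int main(void)
{
    const double users_full    = 10e9 / 10e6;          /* 10 Gbps / 10 Mbps = 1,000 */
    const double users_oversub = users_full * 50;       /* 50:1 oversub     = 50,000 */

    const double queues        = 4096.0 * 16;           /* 4K pipes x 16    = 64K    */
    const double pkts_reserved = queues * 64;            /* x 64 pkts/queue  = ~4M    */
    const double mem_bytes     = pkts_reserved * 2048;   /* x ~2KB/mbuf      = ~8 GB  */

    printf("users per port (full / oversubscribed): %.0f / %.0f\n",
           users_full, users_oversub);
    printf("mbufs to reserve per port: %.0f (~%.1f GB); "
           "1000x more pipes -> ~%.1f TB\n",
           pkts_reserved, mem_bytes / 1e9, mem_bytes * 1000 / 1e12);
    return 0;
}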

One thing to realize is that even for millions of users in your system, not all of them are active at the same time. So maybe have a smaller number of pipes and only map the active users (those that have any packets to send now) to them (a fraction of the total set of users), with the set of active users changing over time.

You can also consider mapping several users to the same pipe.
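
A minimal sketch of that idea, with a free list of pipe ids recycled between active subscribers (all names here are illustrative application code, not an existing DPDK API; the lookup of the subscriber itself could be rte_hash or your existing subscriber table):

#include <stdint.h>

struct subscriber {
    uint32_t pipe_id;
    /* ... rest of the per-subscriber state ... */
};

#define N_PIPES 4096              /* pipes actually configured in the scheduler */

static uint32_t free_pipes[N_PIPES];
static uint32_t n_free;

static void
pipe_pool_init(void)
{
    for (n_free = 0; n_free < N_PIPES; n_free++)
        free_pipes[n_free] = n_free;
}

/* called when a subscriber becomes active (first packet seen) */
static int
pipe_assign(struct subscriber *sub)
{
    if (n_free == 0)
        return -1;                /* fall back: share a pipe, or best effort */
    sub->pipe_id = free_pipes[--n_free];
    return 0;
}

/* called from an idle/aging timer once the subscriber has gone quiet */
static void
pipe_release(struct subscriber *sub)
{
    free_pipes[n_free++] = sub->pipe_id;
}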

> 
> One design thought is as the following:
> 
> 8 ports (each associated with one physical port) and 16-20 sub-ports (each
> used by one worker thread), with each sub-port supporting 250K pipes for
> subscribers. Each worker thread manages one sub-port: it meters the sub-port
> traffic to obtain the packet color, and after identifying the subscriber flow
> it picks an unused pipe, performs the scheduler enqueue/dequeue, and then puts
> the packets into the TX rings toward the TX threads, which send them to the TX NICs.
> 

In the current implementation, each port scheduler object has to be owned by a single thread, i.e. you cannot split a port across multiple threads, so it is not straightforward to have different sub-ports handled by different threads. The workaround is to split the physical NIC port yourself into multiple port scheduler objects: for example, create 8 port scheduler objects, set the rate of each to 1/8 of 10GbE, and have each of them feed a different NIC TX queue of the same physical NIC port.
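
Roughly, that workaround looks like this (a sketch only: the qsize/mtu/frame_overhead values are illustrative, and pipe_profiles is the profile array from the sketch above):

#include <rte_common.h>
#include <rte_sched.h>

/* One scheduler object per worker, each rated at 1/8 of the 10GbE link. */
static struct rte_sched_port *
worker_sched_create(int socket)
{
    struct rte_sched_port_params p = {
        .name = "sched_worker",        /* give each object a unique name      */
        .socket = socket,
        .rate = 1250000000 / 8,        /* 10 Gbps in bytes/sec, split 8 ways  */
        .mtu = 1522,
        .frame_overhead = 24,
        .n_subports_per_port = 1,
        .n_pipes_per_subport = 4096,
        .qsize = {64, 64, 64, 64},
        .pipe_profiles = pipe_profiles,
        .n_pipe_profiles = RTE_DIM(pipe_profiles),
    };

    return rte_sched_port_config(&p);
}

/* Sub-ports and pipes still need to be set up with rte_sched_subport_config()
 * and rte_sched_pipe_config(). Each worker then drains its own scheduler into
 * "its" TX queue of the same physical port, e.g.:
 *
 *     n = rte_sched_port_dequeue(sched, pkts, BURST);
 *     rte_eth_tx_burst(phys_port_id, worker_tx_queue_id, pkts, n);
 */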

You can probably get this scenario (or something very similar) up and running pretty quickly just by handcrafting a configuration file for the examples/ip_pipeline application.

> Are there functional or performance issues with the above approach?
> 
> Any advice and input are appreciated.
> 
> Regards,
> 
> Yuyong
> 
> 
> 

Regards,
Cristian


* Re: [dpdk-users] how to design high performance QoS support for a large amount of subscribers
  2016-08-04 13:01 ` Dumitrescu, Cristian
@ 2016-08-04 13:46   ` Yuyong Zhang
  0 siblings, 0 replies; 3+ messages in thread
From: Yuyong Zhang @ 2016-08-04 13:46 UTC (permalink / raw)
  To: Dumitrescu, Cristian, dev, users

Thank you very much Cristian for the insightful response. 

Very much appreciated.

Regards,

Yuyong



end of thread, other threads:[~2016-08-04 13:46 UTC | newest]

Thread overview: 3+ messages
2016-08-02 15:26 [dpdk-users] how to design high performance QoS support for a large amount of subscribers Yuyong Zhang
2016-08-04 13:01 ` Dumitrescu, Cristian
2016-08-04 13:46   ` Yuyong Zhang
