Thanks, Stephen, for addressing my queries; your reply was helpful.

One more follow-up question on the same topic: can DPDK HQoS be customized based on the use case?

For example, the HQoS config for one of our use cases is *One Port, One Subport, 16 Pipes, and each Pipe with only one TC*. The 16-pipe config was allowed, but changing the 13 TCs per pipe to 1 TC is not allowed. Can I still use 13 TCs but set QueueSize to 0? Would that impact performance?

Thanks,
Farooq.J

On Wed, May 21, 2025 at 7:48 PM Stephen Hemminger <stephen@networkplumber.org> wrote:

> On Mon, 28 Apr 2025 16:55:07 +0530
> farooq basha wrote:
>
> > Hello DevTeam,
> >
> > I am planning to use DPDK HQoS for traffic shaping with a
> > run-to-completion model. While I was reading the dpdk-qos document, I
> > came across the following statement:
> >
> > "*Running enqueue and dequeue operations for the same output port from
> > different cores is likely to cause significant impact on scheduler’s
> > performance and it is therefore not recommended*"
> >
> > Let's take an example: Port1 and Port2 have 4 Rx queues, and each queue
> > is mapped to a different CPU. Traffic coming in on Port1 gets forwarded
> > to Port2. With the above limitation, the application needs to take a
> > lock before doing the rte_sched_port_enqueue and dequeue operations.
> > Performance is limited to only 1 CPU even though traffic is coming in
> > on 4 different CPUs.
> >
> > Please correct me if my understanding is wrong.
> >
> > Thanks
> > Basha
>
> The HQoS code is not thread safe, so yes, you need a lock.
> The traffic scheduling (QoS) needs to be at the last stage of the pipeline,
> just before mbufs are passed to the device.
>
> The issue is that QoS is single threaded, so a lock is required.
>
> The statement is misleading: the real overhead is the lock; the secondary
> overhead is the cache misses that will happen if processing is on
> different cores.
> But if you are doing that, you are going to cut performance a lot from
> cache misses.
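
For reference, the locking pattern described in the quoted reply might look roughly like the sketch below. It is a self-contained illustration, not DPDK code: `toy_sched`, `toy_enqueue`, `toy_dequeue`, `locked_port` and `port_enqueue_dequeue` are hypothetical names, with the toy FIFO standing in for an `rte_sched_port` and its `rte_sched_port_enqueue` / `rte_sched_port_dequeue` calls. The spinlock is built on C11 `atomic_flag`, on the assumption that a real application would use something like `rte_spinlock_lock` / `rte_spinlock_unlock` instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_QSIZE 64

/* Hypothetical stand-in for a scheduler port: a small FIFO of packet ids. */
struct toy_sched {
    int pkts[TOY_QSIZE];
    size_t head;   /* index of the next packet to dequeue */
    size_t count;  /* number of packets currently queued */
};

/* One scheduler plus one lock per output port. */
struct locked_port {
    atomic_flag lock;
    struct toy_sched sched;
};

static void port_init(struct locked_port *p)
{
    atomic_flag_clear(&p->lock);
    p->sched.head = 0;
    p->sched.count = 0;
}

static void port_lock(struct locked_port *p)
{
    /* Spin until the flag was previously clear, i.e. we own the lock. */
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ;
}

static void port_unlock(struct locked_port *p)
{
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Not thread safe by itself. Returns packets accepted; the rest are dropped. */
static size_t toy_enqueue(struct toy_sched *s, const int *pkts, size_t n)
{
    size_t i;
    for (i = 0; i < n && s->count < TOY_QSIZE; i++) {
        s->pkts[(s->head + s->count) % TOY_QSIZE] = pkts[i];
        s->count++;
    }
    return i;
}

/* Not thread safe by itself. Returns packets written to out. */
static size_t toy_dequeue(struct toy_sched *s, int *out, size_t n)
{
    size_t i;
    for (i = 0; i < n && s->count > 0; i++) {
        out[i] = s->pkts[s->head];
        s->head = (s->head + 1) % TOY_QSIZE;
        s->count--;
    }
    return i;
}

/*
 * Called by every forwarding core for the same output port: enqueue this
 * core's burst and dequeue up to out_max packets for TX, both inside one
 * critical section so that only one core touches the scheduler at a time.
 */
static size_t port_enqueue_dequeue(struct locked_port *p,
                                   const int *in, size_t n_in,
                                   int *out, size_t out_max)
{
    size_t n_out;

    assert(p != NULL && out != NULL);
    port_lock(p);
    toy_enqueue(&p->sched, in, n_in);
    n_out = toy_dequeue(&p->sched, out, out_max);
    port_unlock(p);
    return n_out;
}
```

Each forwarding core would call `port_enqueue_dequeue` on the shared `locked_port` for its output port and pass the returned packets to TX. The critical section serializes scheduler access, which is the lock overhead mentioned above; the scheduler state also moves between the cores' caches each time a different core takes the lock.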