Thanks, Stephen, for addressing my queries; that was helpful.

    One more follow-up question on the same topic: can DPDK HQoS be customized
based on the use case?

    For example, the HQoS config for one of my use cases is *one port, one
subport, 16 pipes, and each pipe with only one TC*. The 16-pipe config was
allowed, but changing the 13 TCs per pipe to 1 TC is not allowed.

    Can I still keep 13 TCs but set the queue size to 0 for the ones I do not
use? Would that impact performance?
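
    To make the question concrete, here is a minimal sketch of the queue-size
table I have in mind. It is plain C with local stand-ins so it compiles without
DPDK: N_TC, TC_BE, fill_qsize_single_tc and active_tc_count are hypothetical
names, not DPDK API. In the real config the table would be the per-TC qsize
array in the scheduler parameters, which I am assuming takes one entry per
traffic class, with best effort as the last class. Whether the library accepts
the zero entries is part of what I am asking.

```c
#include <stdint.h>

/* Local stand-ins so the sketch builds without DPDK. They are assumed to
 * mirror RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE and RTE_SCHED_TRAFFIC_CLASS_BE
 * from rte_sched.h. */
#define N_TC  13
#define TC_BE (N_TC - 1)

/* Fill a per-TC queue-size table so that only the best-effort class
 * owns a queue; every other class gets size 0. */
static void fill_qsize_single_tc(uint16_t qsize[N_TC], uint16_t be_qsize)
{
    for (int i = 0; i < N_TC; i++)
        qsize[i] = 0;
    qsize[TC_BE] = be_qsize;
}

/* Count the traffic classes that still have a non-zero queue size. */
static int active_tc_count(const uint16_t qsize[N_TC])
{
    int n = 0;

    for (int i = 0; i < N_TC; i++)
        if (qsize[i] != 0)
            n++;
    return n;
}
```

With be_qsize set to 64, this leaves exactly one active TC per pipe, which is
the layout I would like to run with.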


Thanks
Farooq.J



On Wed, May 21, 2025 at 7:48 PM Stephen Hemminger <
stephen@networkplumber.org> wrote:

> On Mon, 28 Apr 2025 16:55:07 +0530
> farooq basha <farooq.juturu@gmail.com> wrote:
>
> > Hello DevTeam,
> >
> >     I am planning to use DPDK HQOS for Traffic shaping with a
> > run-to-completion Model. While I was reading the dpdk-qos document, I came
> > across the following statement.
> >
> > "*Running enqueue and dequeue operations for the same output port from
> > different cores is likely to cause significant impact on scheduler’s
> > performance and it is therefore not recommended"*
> >
> >  Let's take an example: Port1 and Port2 each have 4 Rx queues, and each
> > queue is mapped to a different CPU. Traffic coming in on Port1 gets
> > forwarded to Port2. With the above limitation, the application needs to
> > take a lock before calling rte_sched_port_enqueue and
> > rte_sched_port_dequeue, so performance is limited to only 1 CPU even
> > though traffic is arriving on 4 different CPUs.
> >
> > Please correct me if my understanding is wrong.
> >
> > Thanks
> > Basha
>
> The HQOS code is not thread safe, so yes, you need a lock.
> The traffic scheduling (QOS) needs to be at the last stage of the pipeline,
> just before mbufs are passed to the device.
>
> The issue is that QOS is single threaded, so lock is required.
>
> The statement is misleading: the real overhead is the lock, and the
> secondary overhead is the cache misses that will happen if processing runs
> on different cores.
> If you do that, the cache misses are going to cut performance a lot.
>
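
A minimal sketch of the locking pattern described in the quoted reply, under
stated assumptions: struct fake_sched is a stand-in counter, not the DPDK
scheduler, and the spinlock is built on C11 atomic_flag so the example
compiles on its own. In a real application the two wrappers would call
rte_sched_port_enqueue() and rte_sched_port_dequeue() on the shared port, most
likely under an rte_spinlock; all other names here are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for the shared scheduler port: NOT struct rte_sched_port. */
struct fake_sched {
    atomic_flag lock;  /* serializes every access to the scheduler */
    uint32_t queued;   /* packets currently held by the scheduler */
};

static void fake_sched_init(struct fake_sched *s)
{
    atomic_flag_clear(&s->lock);
    s->queued = 0;
}

/* Spin until the flag is acquired. */
static void sched_lock(struct fake_sched *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ; /* busy-wait */
}

static void sched_unlock(struct fake_sched *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Every worker core goes through this wrapper to enqueue a burst. */
static void sched_enqueue_locked(struct fake_sched *s, uint32_t n_pkts)
{
    sched_lock(s);
    s->queued += n_pkts;  /* real code: rte_sched_port_enqueue() */
    sched_unlock(s);
}

/* Dequeue up to n_pkts under the same lock; returns the number taken. */
static uint32_t sched_dequeue_locked(struct fake_sched *s, uint32_t n_pkts)
{
    uint32_t n;

    sched_lock(s);
    n = s->queued < n_pkts ? s->queued : n_pkts;
    s->queued -= n;       /* real code: rte_sched_port_dequeue() */
    sched_unlock(s);
    return n;
}
```

Each worker would call both wrappers for the same output port, so all four
CPUs contend on the one lock, which is the serialization the thread is about.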