* [dpdk-dev] Regarding HQOS with run-to-completion Model
@ 2025-04-28 11:25 farooq basha
2025-05-21 14:18 ` Stephen Hemminger
From: farooq basha @ 2025-04-28 11:25 UTC (permalink / raw)
To: dev
Hello DevTeam,
I am planning to use DPDK HQOS for traffic shaping with a
run-to-completion model. While reading the DPDK QoS documentation, I came
across the following statement:
"Running enqueue and dequeue operations for the same output port from
different cores is likely to cause significant impact on scheduler’s
performance and it is therefore not recommended"
As an example, Port1 and Port2 each have 4 Rx queues, and each queue is
mapped to a different CPU. Traffic arriving on Port1 is forwarded to Port2.
Given the above limitation, the application needs to take a lock around the
rte_sched_port_enqueue and rte_sched_port_dequeue operations, so performance
is limited to that of 1 CPU even though traffic arrives on 4 different CPUs.
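The locking pattern described above can be sketched roughly as follows. This
is a minimal, self-contained model and not DPDK code: mock_sched stands in for
struct rte_sched_port, packets are plain integers rather than mbufs, and the
names mock_sched_init, locked_enqueue and locked_dequeue are hypothetical. In
a real application the loop bodies would be calls to rte_sched_port_enqueue()
and rte_sched_port_dequeue(), and the lock could be an rte_spinlock_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MOCK_QSIZE 64 /* power of two, so index wrap-around stays consistent */

/* Hypothetical stand-in for one output port's scheduler state. */
struct mock_sched {
    atomic_flag lock;           /* per-port lock shared by all worker cores */
    uint32_t pkts[MOCK_QSIZE];  /* FIFO standing in for the HQOS queues */
    uint32_t head, tail;        /* only touched while the lock is held */
};

static void mock_sched_init(struct mock_sched *s)
{
    atomic_flag_clear(&s->lock);
    s->head = s->tail = 0;
}

static void sched_lock(struct mock_sched *s)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;
}

static void sched_unlock(struct mock_sched *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Called by any worker core; returns the number of packets accepted. */
static uint32_t locked_enqueue(struct mock_sched *s, const uint32_t *pkts,
                               uint32_t n)
{
    uint32_t done = 0;

    sched_lock(s);
    while (done < n && s->tail - s->head < MOCK_QSIZE)
        s->pkts[s->tail++ % MOCK_QSIZE] = pkts[done++];
    sched_unlock(s);
    return done;
}

/* Called by any worker core; returns the number of packets handed back. */
static uint32_t locked_dequeue(struct mock_sched *s, uint32_t *out, uint32_t n)
{
    uint32_t done = 0;

    sched_lock(s);
    while (done < n && s->head != s->tail)
        out[done++] = s->pkts[s->head++ % MOCK_QSIZE];
    sched_unlock(s);
    return done;
}
```

With 4 worker cores all calling locked_enqueue() and locked_dequeue() on the
same mock_sched, only one of them can be inside the scheduler at any time,
which is the serialization the question is about.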
Please correct me if my understanding is wrong.
Thanks
Basha
* Re: [dpdk-dev] Regarding HQOS with run-to-completion Model
2025-04-28 11:25 [dpdk-dev] Regarding HQOS with run-to-completion Model farooq basha
@ 2025-05-21 14:18 ` Stephen Hemminger
From: Stephen Hemminger @ 2025-05-21 14:18 UTC (permalink / raw)
To: farooq basha; +Cc: dev
On Mon, 28 Apr 2025 16:55:07 +0530
farooq basha <farooq.juturu@gmail.com> wrote:
> Hello DevTeam,
>
> I am planning to use DPDK HQOS for traffic shaping with a
> run-to-completion model. While reading the DPDK QoS documentation, I came
> across the following statement:
>
> "Running enqueue and dequeue operations for the same output port from
> different cores is likely to cause significant impact on scheduler’s
> performance and it is therefore not recommended"
>
> As an example, Port1 and Port2 each have 4 Rx queues, and each queue is
> mapped to a different CPU. Traffic arriving on Port1 is forwarded to Port2.
> Given the above limitation, the application needs to take a lock around the
> rte_sched_port_enqueue and rte_sched_port_dequeue operations, so performance
> is limited to that of 1 CPU even though traffic arrives on 4 different CPUs.
>
> Please correct me if my understanding is wrong.
>
> Thanks
> Basha
The HQOS code is not thread safe, so yes, you need a lock if more than one
core calls enqueue or dequeue on the same port.
Traffic scheduling (QoS) needs to be the last stage of the pipeline, just
before mbufs are passed to the device.
The statement in the documentation is misleading. The primary overhead is the
lock; the secondary overhead is the cache misses that occur when the scheduler
is processed on different cores. If you do that, the cache misses will cut
performance a lot.
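One way to keep the scheduler on a single core, sketched here under stated
assumptions: each worker core hands its packets to a dedicated scheduler core
over its own single-producer/single-consumer ring, and only that core runs
enqueue, dequeue and transmit. The ring below is a simplified stand-in for
something like rte_ring, packets are plain integers rather than mbufs, and
spsc_ring, ring_init, ring_put, ring_get and sched_core_poll are hypothetical
names, not DPDK APIs. In a real application the marked line is where
rte_sched_port_enqueue(), rte_sched_port_dequeue() and the Tx burst would go.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE  8 /* power of two */
#define NB_WORKERS 4 /* one ring per worker core, as in the 4-queue example */

/* Single-producer/single-consumer ring: one worker writes, one core reads. */
struct spsc_ring {
    _Atomic uint32_t head;      /* advanced only by the scheduler core */
    _Atomic uint32_t tail;      /* advanced only by the owning worker */
    uint32_t slots[RING_SIZE];
};

static void ring_init(struct spsc_ring *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Worker side: returns 0 on success, -1 if the ring is full. */
static int ring_put(struct spsc_ring *r, uint32_t pkt)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == RING_SIZE)
        return -1;
    r->slots[tail % RING_SIZE] = pkt;
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 0;
}

/* Scheduler side: returns 0 on success, -1 if the ring is empty. */
static int ring_get(struct spsc_ring *r, uint32_t *pkt)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail)
        return -1;
    *pkt = r->slots[head % RING_SIZE];
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}

/*
 * Runs on the single scheduler core only, so the scheduler itself needs no
 * lock. Drains the worker rings in order and returns the number of packets
 * copied to tx[].
 */
static uint32_t sched_core_poll(struct spsc_ring rings[NB_WORKERS],
                                uint32_t *tx, uint32_t tx_max)
{
    uint32_t n = 0, pkt;

    for (int w = 0; w < NB_WORKERS; w++)
        while (n < tx_max && ring_get(&rings[w], &pkt) == 0)
            tx[n++] = pkt; /* stand-in for sched enqueue + dequeue + Tx */
    return n;
}
```

Note that this is a pipeline design rather than pure run-to-completion: the
workers still process packets in parallel, but the final QoS and Tx stage is
owned by one core, so its state stays in that core's cache.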