* Re: [dpdk-dev] Eventdev DSW correctness and pathologies
2018-12-19 18:53 [dpdk-dev] Eventdev DSW correctness and pathologies Venky Venkatesh
From: Mattias Rönnblom @ 2018-12-19 20:38 UTC
To: Venky Venkatesh, dev
On 2018-12-19 19:53, Venky Venkatesh wrote:
> Couple of questions on DSW scheduling:
>
> 1. how was the correctness of the scheduling verified -- specifically the fact that ATOMIC is not scheduled simultaneously to 2 cores? I can think of feeding the same flowid on all cores and see where the various cores are busy. Any other test cases that can be run? But I am not satisfied with my verification scheme as any spurious/infrequent scheduling on 2 cores would be missed -- is there some invariant I could check for ATOMICity?
I have test cases that verify that ordering is maintained, and others
that attempt to verify that processing happens in order. The former
is easy - just add a sequence number at ingress, and make sure the
packets egress the system in the same order. The way I went about the
latter was to have a per-flow spinlock, and have the receiving worker
take the lock (with spinlock_try_lock()), to make sure no other lcore
was processing the same flow id at the same time. A failed try-lock
means two lcores are holding events from the same atomic flow at once,
which is exactly the violation you are looking for.
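As a rough illustration of the two checks described above, here is a
minimal sketch in plain C11. It is not the actual test code: all names
(NUM_FLOWS, flow_enter(), flow_exit(), egress_in_order()) are
hypothetical, and a C11 atomic flag stands in for the per-flow spinlock
so the example has no DPDK dependency. In a DPDK application,
rte_spinlock_trylock() would be a natural replacement. The egress check
assumes a single lcore checks each flow's sequence numbers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_FLOWS 1024 /* hypothetical size of the flow id space */

struct flow_check {
	atomic_bool busy;  /* true while a worker is processing this flow */
	uint64_t next_seq; /* next sequence number expected at egress */
};

static struct flow_check flows[NUM_FLOWS];
static atomic_uint atomic_violations; /* concurrent-processing hits */

/* Reset all per-flow state; call once before the test starts. */
static void flow_check_init(void)
{
	for (uint32_t i = 0; i < NUM_FLOWS; i++) {
		atomic_store(&flows[i].busy, false);
		flows[i].next_seq = 0;
	}
	atomic_store(&atomic_violations, 0);
}

/*
 * Worker side: call right after dequeuing an event of an atomic flow.
 * Acts as a try-lock. Returns false, and counts a violation, if another
 * worker is already inside the same flow. On false, the caller must
 * NOT call flow_exit(), since it never took the lock.
 */
static bool flow_enter(uint32_t flow_id)
{
	struct flow_check *f = &flows[flow_id % NUM_FLOWS];

	if (atomic_exchange_explicit(&f->busy, true, memory_order_acquire)) {
		atomic_fetch_add(&atomic_violations, 1);
		return false;
	}
	return true;
}

/* Worker side: call when done processing the event. */
static void flow_exit(uint32_t flow_id)
{
	struct flow_check *f = &flows[flow_id % NUM_FLOWS];

	assert(atomic_load(&f->busy)); /* must be held by the caller */
	atomic_store_explicit(&f->busy, false, memory_order_release);
}

/*
 * Egress side: seq is the per-flow sequence number stamped at ingress.
 * Returns false if the packet left the system out of order, then
 * resynchronizes so that one reordering is reported only once.
 */
static bool egress_in_order(uint32_t flow_id, uint64_t seq)
{
	struct flow_check *f = &flows[flow_id % NUM_FLOWS];
	bool ok = (seq == f->next_seq);

	f->next_seq = seq + 1;
	return ok;
}
```

Because the check runs on every event rather than on sampled core
activity, even a rare double scheduling of a flow should show up as a
non-zero atomic_violations count after a long enough run.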
> 2. Is the following understanding correct: for DSW_MIGRATION_INTERVAL (viz 1 ms.) a flow is pinned to a core?
Yes.
> Which means if in this 1ms interval even if there are other cores idling and this core is oversubscribed due to another flow as well then the core would be shared between the 2 flows.
Yes, such load imbalance-related inefficiencies are certainly possible.
I touch upon this issue in my DPDK Userspace DSW seminar:
https://www.youtube.com/watch?v=M1t3cRZ2mjg
> So would an alternating pattern of simultaneous “1ms burst and silence” on the two flows be the pathological case for under-utilization of the cores and low thruput? – since the core would be shared for the busy period and just when migration is planned there is silence and then the cycle repeats. I want to have a good grasp of the pathological cases of this scheduler.
Worst case would be a situation where all the flows are owned by the
same eventdev port, and at the exact time of migration, the migrated
flow would go silent, and a new flow, previously idle, also owned by
this bottleneck port, would start sending.
After a while, all flows will have migrated from this bottleneck port to
other ports. To maintain maximum imbalance, this magical, omniscient
traffic generator would then have to replace all the flows it's
generating with new flow ids which are, again, all owned by the same
port.
It's a situation not likely to be seen in the wild. That said, if you
have a system with very few, very bursty flows and a short pipeline,
imbalance might well become a real problem.