From: Mattias Rönnblom
To: Venky Venkatesh, dev@dpdk.org
Date: Wed, 19 Dec 2018 21:38:03 +0100
Subject: Re: [dpdk-dev] Eventdev DSW correctness and pathologies

On 2018-12-19 19:53, Venky Venkatesh wrote:
> Couple of questions on DSW scheduling:
>
> 1. How was the correctness of the scheduling verified -- specifically the fact that an ATOMIC flow is not scheduled simultaneously to 2 cores? I can think of feeding the same flow id on all cores and seeing where the various cores are busy. Are there any other test cases that can be run? I am not satisfied with my verification scheme, as any spurious/infrequent scheduling on 2 cores would be missed -- is there some invariant I could check for ATOMICity?

I have test cases that verify that ordering is maintained, and also ones that attempt to verify that processing of an atomic flow happens on only one lcore at a time.

The former is easy - just add a sequence number at ingress, and make sure the packets egress the system in the same order. The way I went about the latter was to have a per-flow spinlock, and have the receiving worker take the lock (with spinlock_try_lock()), to make sure no other lcore was processing the same flow id at the same time.
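For illustration, here is a minimal sketch of that per-flow spinlock check. This is not the actual test code; NUM_FLOWS, BURST_SIZE and the flow-id-to-lock mapping are assumptions made purely for the example.

/*
 * Minimal sketch (not the actual test code) of the per-flow spinlock
 * check. NUM_FLOWS, BURST_SIZE and the flow-id-to-lock mapping are
 * assumptions made for illustration only.
 */
#include <rte_eventdev.h>
#include <rte_spinlock.h>
#include <rte_debug.h>

#define NUM_FLOWS 1024
#define BURST_SIZE 32

/* One lock per flow id; each initialized with rte_spinlock_init(). */
static rte_spinlock_t flow_locks[NUM_FLOWS];

static void
worker_check_atomicity(uint8_t dev_id, uint8_t port_id)
{
        struct rte_event evs[BURST_SIZE];
        uint16_t n, i;

        n = rte_event_dequeue_burst(dev_id, port_id, evs, BURST_SIZE, 0);

        for (i = 0; i < n; i++) {
                rte_spinlock_t *lock =
                        &flow_locks[evs[i].flow_id % NUM_FLOWS];

                /*
                 * For an ATOMIC queue, no other lcore may be processing
                 * this flow id, so the trylock must always succeed.
                 */
                RTE_VERIFY(rte_spinlock_trylock(lock));

                /* ... application processing of evs[i] goes here ... */

                rte_spinlock_unlock(lock);
        }
}

If the trylock ever fails, two lcores were processing the same flow id concurrently, which for an ATOMIC queue would be a scheduling bug.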
> 2. Is the following understanding correct: for DSW_MIGRATION_INTERVAL (viz. 1 ms) a flow is pinned to a core?

Yes.

> Which means that in this 1 ms interval, even if there are other cores idling and this core is oversubscribed due to another flow as well, the core would be shared between the 2 flows.

Yes, such load imbalance-related inefficiencies are certainly possible. I touch upon this issue in my DPDK Userspace DSW seminar:
https://www.youtube.com/watch?v=M1t3cRZ2mjg

> So would an alternating pattern of simultaneous "1 ms burst and silence" on the two flows be the pathological case for under-utilization of the cores and low throughput -- since the core would be shared for the busy period, and just when migration is planned there is silence, and then the cycle repeats? I want to have a good grasp of the pathological cases of this scheduler.

The worst case would be a situation where all the flows are owned by the same eventdev port, and at the exact time of migration, the migrated flow goes silent and a new flow, previously idle and also owned by this bottleneck port, starts sending. After a while, all flows will have migrated from this bottleneck port to other ports. To maintain maximum imbalance, this magical, omniscient traffic generator would then have to change all the flows it's generating and pick flow ids which are all owned by the same port. It's a situation not likely to be seen in the wild.

That said, if you have a system with very few, very bursty flows and a short pipeline, imbalance might well become a real problem.