From: "Venkatesan, Venky"
Date: Wed, 27 Aug 2014 07:54:21 -0700
To: dev@dpdk.org
Subject: Re: [dpdk-dev] overcommitting CPUs
Message-ID: <53FDF11D.3040504@intel.com>
References: <20140826093837.4e3d1d4b@urahara>
List-Id: patches and discussions about DPDK

DPDK currently isn't exactly poll mode - it has an API that receives and
transmits packets. How you enter that API could be interrupt or polled -
we've left that up to the application to decide, rather than force an
interrupt/NAPI type architecture. I do agree with Alex in that
implementing an interrupt/load driven entry point as an option will make
it usable more widely. There are multiple challenges here - managing the
latency of an interrupt driven scheme in a user-space context and very
high jitter, to mention a few.

That said, overcommitment of CPUs can be achieved in other ways as well.
You could allocate and enforce CPU sharing via cgroups, and allocate x%
of a core to the DPDK pthread.
It does introduce a degree of indeterminism to when the DPDK pthread
gets scheduled back in (depending on how many other threads are running
on that core). But it is another option ...

Regards,
-Venky

On 8/27/2014 1:40 AM, Alex Markuze wrote:
> IMHO adding "Interrupt Mode" to DPDK is important as this can open
> DPDK to a larger public of consumers. I can easily imagine someone
> trying to find a user space networking solution (and deciding against
> verbs - RDMA) for the obvious reasons and not needing deterministic
> latency.
>
> A few thoughts:
>
> Deterministic Latency: It's a fiction in the sense that this is
> something you will be able to see only in a small controlled
> environment, as network latencies in Data Centres (DC) are dominated
> by switch queuing (one good reference is http://fastpass.mit.edu that
> Vincent shared a few days back).
>
> Virtual environments: In virtual environments this is especially
> interesting as the NIC driver (Hypervisor) is working in IRQ mode,
> which, unless the interrupts are pinned to different CPUs than the VM,
> will have a disruptive effect on the VM's performance. Moving to
> interrupt mode in paravirtualised environments makes sense, as in any
> environment that is not carefully crafted you should not expect any
> deterministic guarantees and would opt for a simpler programming model
> - like interrupt mode.
>
> NAPI: With 10G NICs, most CPUs' poll rate is faster than the NIC
> message rate, resulting in a 1:1 napi_poll callback to IRQ ratio; this
> is true even with small packets. In some cases where the CPU is working
> slower - for example when intel_iommu=on,strict is set - you can
> actually see a performance inversion where the "slower" CPU can reach
> higher B/W because the slowdown makes NAPI work, with the kernel
> effectively moving to polling mode.
>
> I think that a smarter DPDK-NAPI is important, but it is a next step
> IFF the interrupt mode is adopted.
>
> On Wed, Aug 27, 2014 at 8:48 AM, Patel, Rashmin N wrote:
>> You're right and I've felt the same harder part of determinism with
>> other hypervisors' soft switch solutions as well. I think it's worth
>> thinking about.
>>
>> Thanks,
>> Rashmin
>>
>> On Aug 26, 2014 9:15 PM, Stephen Hemminger wrote:
>> The way to handle the switch in and out of poll mode is to use IRQ
>> coalescing parameters.
>> You want to hold off the IRQ until there are a couple of packets or a
>> short delay has passed. Going out of poll mode is harder to determine.
>>
>>
>> On Tue, Aug 26, 2014 at 9:59 AM, Zhou, Danny wrote:
>>
>>>> -----Original Message-----
>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen Hemminger
>>>> Sent: Wednesday, August 27, 2014 12:39 AM
>>>> To: Michael Marchetti
>>>> Cc: dev@dpdk.org
>>>> Subject: Re: [dpdk-dev] overcommitting CPUs
>>>>
>>>> On Tue, 26 Aug 2014 16:27:14 +0000
>>>> "Michael Marchetti" wrote:
>>>>
>>>>> Hi, has there been any consideration to introduce a non-spinning
>>>>> network driver (interrupt based), for the purpose of overcommitting
>>>>> CPUs in a virtualized environment? This would obviously have reduced
>>>>> high-end performance but would allow for increased guest density
>>>>> (sharing of physical CPUs) on a host.
>>>>> I am interested in adding support for this kind of operation, is
>>>>> there any interest in the community?
>>>>> Thanks,
>>>>>
>>>>> Mike.
>>>> Better to implement a NAPI like algorithm that adapts from poll to
>>>> interrupt.
>>>
>>> Agreed, but DPDK is currently pure poll-mode based, so unlike NAPI's
>>> simple algorithm, the new heuristic algorithm should not switch from
>>> poll-mode to interrupt-mode immediately once there is no packet in the
>>> recent poll. Otherwise, mode switching will be too frequent, which
>>> would have a serious negative performance impact on DPDK.
>>>
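
To make the last point above concrete (do not drop to interrupt mode
after a single empty poll), here is a minimal sketch of a hysteresis
counter. It is only an illustration, not DPDK code: the names
RxModeController, idle_threshold and count_switches are hypothetical,
and the replay helper assumes an interrupt wakes the thread right
before the next non-empty poll.

```python
from dataclasses import dataclass

POLL = "poll"
INTERRUPT = "interrupt"


@dataclass
class RxModeController:
    """Hypothetical poll/interrupt selector with hysteresis (not a DPDK API)."""
    idle_threshold: int = 3  # assumed: empty polls tolerated before sleeping
    mode: str = POLL
    empty_polls: int = 0

    def on_poll(self, packets_received: int) -> str:
        """Record the result of one poll and return the mode to use next."""
        if packets_received > 0:
            # Traffic seen: reset the idle counter and keep polling.
            self.empty_polls = 0
            self.mode = POLL
        else:
            self.empty_polls += 1
            if self.empty_polls >= self.idle_threshold:
                # Only a sustained idle period justifies arming the interrupt.
                self.mode = INTERRUPT
        return self.mode

    def on_interrupt(self) -> str:
        """An RX interrupt fired: return to polling immediately."""
        self.empty_polls = 0
        self.mode = POLL
        return self.mode


def count_switches(trace, idle_threshold):
    """Replay per-poll packet counts; count poll -> interrupt transitions."""
    ctrl = RxModeController(idle_threshold=idle_threshold)
    switches = 0
    for pkts in trace:
        if ctrl.mode == INTERRUPT:
            if pkts == 0:
                continue  # thread is asleep and nothing arrived
            ctrl.on_interrupt()  # simplification: the packet raises the IRQ
        before = ctrl.mode
        if ctrl.on_poll(pkts) == INTERRUPT and before == POLL:
            switches += 1
    return switches
```

With idle_threshold=1 the controller behaves like the "switch as soon as
a poll comes back empty" scheme and flips modes on every short gap; a
larger threshold rides out brief gaps at the cost of a few wasted polls.
The right value would have to be tuned per workload, and IRQ coalescing
on the NIC remains a complementary knob.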