From: Alex Markuze
To: "Patel, Rashmin N"
Cc: "dev@dpdk.org"
Date: Wed, 27 Aug 2014 11:40:45 +0300
Subject: Re: [dpdk-dev] overcommitting CPUs

IMHO, adding an "Interrupt Mode" to DPDK is important, as it can open DPDK to a larger audience of consumers. I can easily imagine someone looking for a user-space networking solution (and deciding against verbs/RDMA for the obvious reasons) who does not need deterministic latency.

A few thoughts:

Deterministic latency: It is a fiction, in the sense that you will only see it in a small, controlled environment. Network latencies in data centres (DCs) are dominated by switch queueing (one good reference is http://fastpass.mit.edu, which Vincent shared a few days back).

Virtual environments: Interrupt mode is especially interesting here, because the NIC driver (in the hypervisor) works in IRQ mode; unless the interrupts are pinned to CPUs other than the VM's, they will have a disruptive effect on the VM's performance. Moving to interrupt mode in paravirtualised environments makes sense: in any environment that is not carefully crafted you should not expect deterministic guarantees, and you would opt for a simpler programming model, like interrupt mode.

NAPI: With 10G NICs, most CPUs poll faster than the NIC's message rate, resulting in a 1:1 napi_poll-callback-to-IRQ ratio; this is true even with small packets. In some cases where the CPU works slower - for example, when intel_iommu=on,strict is set - you can actually see a performance inversion, where the "slower" CPU reaches higher bandwidth because the slowdown effectively moves NAPI into polling mode. I think a smarter DPDK-NAPI is important, but it is a next step, IFF interrupt mode is adopted.

On Wed, Aug 27, 2014 at 8:48 AM, Patel, Rashmin N wrote:
> You're right, and I've felt the same harder part of determinism with other
> hypervisors' soft-switch solutions as well. I think it's worth thinking about.
>
> Thanks,
> Rashmin
>
> On Aug 26, 2014 9:15 PM, Stephen Hemminger wrote:
> The way to handle the switch out of poll mode is to use IRQ coalescing
> parameters. You want to hold off the IRQ until there are a couple of
> packets or a short delay. Going out of poll mode is harder to determine.
>
>
> On Tue, Aug 26, 2014 at 9:59 AM, Zhou, Danny wrote:
>
>> > -----Original Message-----
>> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen Hemminger
>> > Sent: Wednesday, August 27, 2014 12:39 AM
>> > To: Michael Marchetti
>> > Cc: dev@dpdk.org
>> > Subject: Re: [dpdk-dev] overcommitting CPUs
>> >
>> > On Tue, 26 Aug 2014 16:27:14 +0000
>> > "Michael Marchetti" wrote:
>> >
>> > > Hi, has there been any consideration to introduce a non-spinning
>> > > (interrupt-based) network driver, for the purpose of overcommitting
>> > > CPUs in a virtualized environment? This would obviously have reduced
>> > > high-end performance but would allow for increased guest density
>> > > (sharing of physical CPUs) on a host.
>> > >
>> > > I am interested in adding support for this kind of operation, is there
>> > > any interest in the community?
>> > >
>> > > Thanks,
>> > >
>> > > Mike.
>> >
>> > Better to implement a NAPI-like algorithm that adapts from poll to
>> > interrupt.
>>
>> Agreed, but DPDK is currently purely poll-mode based, so unlike NAPI's
>> simple algorithm, the new heuristic algorithm should not switch from
>> poll mode to interrupt mode immediately once there is no packet in the
>> most recent poll. Otherwise, mode switching will be too frequent, which
>> brings a serious negative performance impact to DPDK.
>>