From: Jay Rolette
To: "Zhou, Danny"
Date: Tue, 23 Sep 2014 14:24:50 -0500
Subject: Re: [dpdk-dev] KNI and memzones

Yep, good way to describe it. Not really related to network security functions, but a very similar architecture.

On Tue, Sep 23, 2014 at 2:12 PM, Zhou, Danny wrote:

> It looks like a typical network middle box usage with IDS/IPS/DPI sort
> of functionality. Good-enough performance rather than line-rate
> performance should be OK for this case, and multi-threaded KNI (multiple
> software rx/tx queues are established between DPDK and a single vEth
> netdev, with multiple kernel threads affinitized to several lcores)
> should fit, with linear performance scaling if you can allocate multiple
> lcores to achieve satisfactory throughput for relatively big packets.
>
> Since NIC control is still in DPDK's PMD for this case, the bifurcated
> driver does not fit, unless you only use DPDK to rx/tx packets in your
> box.
>
> *From:* Jay Rolette [mailto:rolette@infiniteio.com]
> *Sent:* Wednesday, September 24, 2014 2:53 AM
> *To:* Zhou, Danny
> *Cc:* Marc Sune; dev-team@bisdn.de
> *Subject:* Re: [dpdk-dev] KNI and memzones
>
> I can't discuss product details openly yet, but I'm happy to have a
> detailed discussion under NDA with Intel. In fact, we had an early NDA
> discussion with Intel about it a few months ago.
>
> That said, the use case isn't tied so closely to my product that I can't
> describe it in general terms...
>
> Imagine a box that installs in your network as a transparent
> bump-in-the-wire. Traffic comes in port 1 and is processed by our
> DPDK-based engine, then the packets are forwarded out port 2, where they
> head to their original destination. From a network topology point of
> view, the box is mostly invisible.
>
> Same process applies for traffic going the other way (RX on port 2,
> special-sauce processing in DPDK app, TX on port 1).
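(A minimal sketch of the per-direction fast path described above -- just the stock DPDK burst rx/tx loop, not our actual code; port ids, queue id and burst size are purely illustrative:)

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST_SIZE 32

    /* One direction of the bump-in-the-wire path: port 1 -> engine -> port 2.
     * The opposite direction runs the same loop with the ports swapped. */
    static void
    forward_one_direction(uint8_t rx_port, uint8_t tx_port)
    {
        struct rte_mbuf *bufs[BURST_SIZE];
        uint16_t nb_rx, nb_tx, i;

        for (;;) {
            /* Pull a burst of packets from the ingress port. */
            nb_rx = rte_eth_rx_burst(rx_port, 0, bufs, BURST_SIZE);
            if (nb_rx == 0)
                continue;

            /* "Special sauce" inspection/processing would happen here. */

            /* Forward toward the original destination. */
            nb_tx = rte_eth_tx_burst(tx_port, 0, bufs, nb_rx);

            /* Drop whatever the TX queue could not take. */
            for (i = nb_tx; i < nb_rx; i++)
                rte_pktmbuf_free(bufs[i]);
        }
    }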
>
> If you are familiar with network security products, this is very much
> how IPS devices work.
>
> Where KNI comes into play is for several user-space apps that need to
> use the normal network stack (sockets) to communicate over the _same_
> ports used on the main data path. We use KNI to create a virtual port
> with an IP address overlaid on the "invisible" data path ports.
>
> This isn't just for control traffic. It's obviously not line-rate
> processing, but we need to get all the bandwidth we can out of it.
>
> Let me know if that makes sense or if I need to clarify some things. If
> you'd rather continue this as an NDA discussion, just shoot me an email
> directly.
>
> Regards,
>
> Jay
>
> On Tue, Sep 23, 2014 at 11:38 AM, Zhou, Danny wrote:
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jay Rolette
> > Sent: Tuesday, September 23, 2014 8:39 PM
> > To: Marc Sune
> > Cc: dev-team@bisdn.de
> > Subject: Re: [dpdk-dev] KNI and memzones
> >
> > *> p.s. Lately someone involved with DPDK said KNI would be deprecated
> > in future DPDK releases; I haven't read or heard about this before, is
> > this true? What would be the natural replacement then?*
> >
> > KNI is a non-trivial part of the product I'm in the process of
> > building. I'd appreciate someone "in the know" addressing this one
> > please. Are there specific roadmap plans relative to KNI that we need
> > to be aware of?
>
> KNI and multi-threaded KNI have several limitations:
>
> 1) Flow classification and packet distribution are both done in
> software, specifically in the KNI user space library, at the cost of
> CPU cycles.
> 2) Low performance: skb creation/free and packet copies between skb and
> mbuf kill performance significantly.
> 3) Dedicated cores in user space and kernel space are responsible for
> rx/tx of packets between the DPDK app and the KNI device, which seems
> to me to waste too many core resources.
> 4) GPL license jail, as KNI sits in the kernel.
>
> We actually have a bifurcated driver prototype that meets both the
> high-performance and upstreamable requirements, which is treated as an
> alternative solution to KNI. The idea is to leverage the NIC's flow
> director capability to bifurcate data plane packets to DPDK, and to keep
> control plane packets, or whatever packets need to go through the
> kernel's TCP/IP stack, being processed in the kernel (NIC driver +
> stack). Basically, the kernel NIC driver and DPDK co-exist to drive the
> same NIC device, but manipulate different rx/tx queue pairs. There is a
> tough NIC-control consistency issue that still needs to be resolved and
> upstreamed to the kernel, whose details I do not want to expose at the
> moment.
>
> IMHO, KNI should NOT be removed unless there is a really good
> user-space, open-source and socket backward-compatible TCP/IP stack,
> which is unlikely to appear very soon. The bifurcated driver approach
> could certainly replace KNI for some use cases where DPDK does not own
> the NIC control.
>
> Do you mind sharing your KNI use case in more detail to help determine
> whether the bifurcated driver could help?
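(For context on points 2 and 3 above: the user-space side of a KNI "overlay" port boils down to a shuttle loop like the minimal sketch below. It is illustrative only, not our actual code; the burst size and helper name are made up. The copy and dedicated-core costs Danny mentions live in and around these calls.)

    #include <rte_kni.h>
    #include <rte_mbuf.h>

    #define PKT_BURST 32

    /* Shuttle packets between the DPDK fast path and the kernel vEth device
     * backing a KNI interface. 'kni' comes from an earlier rte_kni_alloc(). */
    static void
    kni_shuttle(struct rte_kni *kni, struct rte_mbuf **to_kernel, unsigned nb)
    {
        struct rte_mbuf *from_kernel[PKT_BURST];
        unsigned i, nb_tx, nb_rx;

        /* Packets classified as "for the box itself" (sockets, management)
         * go up to the kernel network stack through the KNI FIFOs. */
        nb_tx = rte_kni_tx_burst(kni, to_kernel, nb);
        for (i = nb_tx; i < nb; i++)
            rte_pktmbuf_free(to_kernel[i]);

        /* Packets the kernel stack transmitted on the vEth interface come
         * back down here; a real app would hand them to rte_eth_tx_burst().
         * They are freed here only to keep the sketch self-contained. */
        nb_rx = rte_kni_rx_burst(kni, from_kernel, PKT_BURST);
        for (i = 0; i < nb_rx; i++)
            rte_pktmbuf_free(from_kernel[i]);

        /* Service MTU/link-state requests coming from the kernel side. */
        rte_kni_handle_request(kni);
    }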
>
> > Regards,
> > Jay
> >
> > On Tue, Sep 23, 2014 at 4:27 AM, Marc Sune wrote:
> >
> > > Hi all,
> > >
> > > So we are having some problems with KNI. In short, we have a DPDK
> > > application that creates KNI interfaces, connects them to Docker
> > > containers, and destroys them again during its lifecycle. Interfaces
> > > may eventually even be named the same (see below).
> > >
> > > We were wondering why, even when calling rte_kni_release(), hugepage
> > > memory was rapidly being exhausted, and we also realised that even
> > > after destruction you cannot reuse the same name for the interface.
> > >
> > > After close inspection of the rte_kni lib, we think the core issue is
> > > mostly a design issue. rte_kni_alloc ends up calling
> > > kni_memzone_reserve(), which in turn calls rte_memzone_reserve(),
> > > which cannot be unreserved by rte_kni_release() (by design of
> > > memzones). The exhaustion is rapid due to the number of FIFOs created
> > > (6).
> > >
> > > If this is right, we would propose and try to provide a patch as
> > > follows:
> > >
> > > * Create a new rte_kni_init(unsigned int max_knis);
> > >
> > > This would preallocate all the FIFO rings (TX, RX, ALLOC, FREE,
> > > Request and Response) * max_knis by calling kni_memzone_reserve(),
> > > and store them in a kni_fifo_pool. This should only be called once by
> > > DPDK applications at bootstrapping time.
> > >
> > > * rte_kni_alloc would just take one slot from the kni_fifo_pool (one
> > > slot => a set of 6 FIFOs).
> > > * rte_kni_release would return the slot to the pool.
> > >
> > > This should solve both issues. We would base the patch on 1.7.2.
> > >
> > > Thoughts?
> > > marc
> > >
> > > p.s. Lately someone involved with DPDK said KNI would be deprecated
> > > in future DPDK releases; I haven't read or heard about this before,
> > > is this true? What would be the natural replacement then?
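(As a concrete illustration of Marc's proposal above, a rough sketch of what the pre-allocated FIFO pool could look like. Names, sizes and layout here are hypothetical and not an existing DPDK API; only rte_memzone_reserve() and its signature are real.)

    #include <stdio.h>
    #include <stdlib.h>
    #include <rte_memory.h>
    #include <rte_memzone.h>

    /* Hypothetical sizes/names, for illustration only. */
    #define KNI_FIFOS_PER_SLOT 6          /* tx, rx, alloc, free, req, resp */
    #define KNI_FIFO_ZONE_SIZE 4096

    struct kni_fifo_slot {
        const struct rte_memzone *mz[KNI_FIFOS_PER_SLOT];
        int in_use;
    };

    static struct kni_fifo_slot *kni_fifo_pool;
    static unsigned int kni_fifo_pool_size;

    /* Called once at application bootstrap: reserve every FIFO memzone up
     * front, so later alloc/release cycles never call rte_memzone_reserve(). */
    int
    rte_kni_init(unsigned int max_knis)
    {
        char mz_name[RTE_MEMZONE_NAMESIZE];
        unsigned int i, f;

        kni_fifo_pool = calloc(max_knis, sizeof(*kni_fifo_pool));
        if (kni_fifo_pool == NULL)
            return -1;

        for (i = 0; i < max_knis; i++) {
            for (f = 0; f < KNI_FIFOS_PER_SLOT; f++) {
                snprintf(mz_name, sizeof(mz_name), "kni_fifo_%u_%u", i, f);
                kni_fifo_pool[i].mz[f] = rte_memzone_reserve(mz_name,
                        KNI_FIFO_ZONE_SIZE, SOCKET_ID_ANY, 0);
                if (kni_fifo_pool[i].mz[f] == NULL)
                    return -1;      /* out of hugepage memory */
            }
        }
        kni_fifo_pool_size = max_knis;
        return 0;
    }

    /* rte_kni_alloc() would then grab the first slot with in_use == 0 instead
     * of reserving fresh memzones, and rte_kni_release() would simply clear
     * in_use, making both the memory and the interface name reusable. */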