From mboxrd@z Thu Jan 1 00:00:00 1970
To: "Wiles, Keith", "dev@dpdk.org"
From: Avi Kivity
Message-ID: <561B746E.4070807@scylladb.com>
Date: Mon, 12 Oct 2015 11:50:54 +0300
Subject: Re: [dpdk-dev] Network Stack discussion notes from 2015 DPDK Userspace
List-Id: patches and discussions about DPDK
X-List-Received-Date: Mon, 12 Oct 2015 08:50:57 -0000

On 10/10/2015 02:19 AM, Wiles, Keith wrote:
> Here are some notes from the DPDK Network Stack discussion, I can remember please help me fill in anything I missed.
>
> Items I remember we talked about:
>
> * The only reason for a DPDK TCP/IP stack is for performance and possibly lower latency
> * Meaning the developer is willing to re-write or write his application to get the best performance.
> * A TCP/IPv4/v6 stack is the minimum stack we need to support applications linked with DPDK.
> * SCTP is also another protocol that maybe required
> * TCP is the primary protocol, usage model for most use cases
> * Stack must be able to terminate TCP traffic to an application linked to DPDK
> * For DPDK the customer is looking for fast applications and is willing to write the application just for DPDK network stack
> * Converting an existing application could be done, but the design is for performance and may require a lot of changes to an application
> * Using an application API that is not Socket is fine for high performance and maybe the only way we get best performance.
> * Need to supply a Socket layer interface as a option if customer is willing to take a performance hit instead of rewriting the application
> * Native application acceleration is desired, but not required when using DPDK network stack
> * We have two projects related to network stack in DPDK
> * The first one is porting some TCP/IP stack to DPDK plus it needs to give a reasonable performance increase over native Linux applications
> * The stack code needs to be BSD/MIT like licensed (Open Sourced)
> * The stack should be up to date with the latest RFCs or at least close
> * A stack could be written for DPDK (not using a existing code base) and its environment for best performance
> * Need to be able to configure the DPDK stack(s) from the Linux command line tools if possible
> * Need a DPDK specific application layer API for application to interface with the network stack
> * Could have a socket layer API on top of the specific API for applications needing to use sockets (not expected to be the best performance)
> * The second item is figuring out a new IPC for East/West traffic within the same system.
> * The design needs to improve performance between applications and be transparent to the application when the remote end is not on the same system.
> * The new IPC path should be agnostic to local or remote end points
> * Needs to be very fast compared to current Linux IPC designs. (Will OVS work here?)

Basically, Seastar [1] matches this exactly. Its TCP stack, unlike most stacks, is sharded: a separate stack runs on each core (all sharing a single IP address), with no locking and with zero-copy on both transmit and receive. It has a fast IPC between cores (all data sharing in Seastar goes through IPC queues; locks and atomic RMW operations are not used). There is also an RPC subsystem that can be used for inter-node communication. We've seen 7X performance improvements over the Linux TCP stack when coding a simple HTTP server.

Of course, it's not all roses.
Seastar is written in C++, and the higher layers are asynchronous, so there's a high barrier to entry for DPDK developers. Maybe it can't be merged outright, but perhaps it can provide some inspiration.

(Seastar supports subsets of TCP, UDP, ICMP, and DHCP over IPv4; there is no IPv6 support.)

[1] https://github.com/scylladb/seastar

> Did I miss any details or comments, please reply and help me correct the comment or understanding.
>
> Thanks for everyone attending and packing into a small space.
>
> --
> Regards,
> ++Keith Wiles
> Intel Corporation
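
To make the sharded design described in the reply above a bit more concrete, here is a minimal Python sketch of the general idea: each core owns its connections exclusively, and a packet that arrives on the wrong core is handed over through a queue rather than by touching shared state. This is not Seastar's code or API. The names `shard_for`, `Shard`, and `NUM_SHARDS` are hypothetical, CRC32 is only a stand-in for whatever flow hash the NIC or stack actually uses, and the `deque` objects stand in for single-producer, single-consumer rings between cores.

```python
import zlib
from collections import deque

NUM_SHARDS = 4  # hypothetical core count, chosen only for illustration


def shard_for(src_ip, src_port, dst_ip, dst_port, num_shards=NUM_SHARDS):
    """Map a TCP 4-tuple to the shard that owns the flow.

    CRC32 is an assumption made for determinism; it is not the hash
    a real NIC or stack would use.
    """
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_shards


class Shard:
    """One per-core stack instance; its state is never touched by other shards."""

    def __init__(self, shard_id, num_shards):
        self.shard_id = shard_id
        self.connections = {}  # flow -> list of payloads, private to this shard
        # One inbound queue per peer shard, so every queue has exactly
        # one producer and one consumer.
        self.inbox = {
            peer: deque() for peer in range(num_shards) if peer != shard_id
        }

    def handle_packet(self, flow, payload, shards):
        owner = shard_for(*flow, num_shards=len(shards))
        if owner != self.shard_id:
            # Wrong core: forward as a message instead of locking
            # the owner's connection table.
            shards[owner].inbox[self.shard_id].append((flow, payload))
            return
        self.connections.setdefault(flow, []).append(payload)

    def poll(self, shards):
        """Drain messages that other shards queued for this one."""
        for queue in self.inbox.values():
            while queue:
                flow, payload = queue.popleft()
                self.handle_packet(flow, payload, shards)


# Usage: a packet lands on shard 0 and ends up on the shard that owns its flow.
shards = [Shard(i, NUM_SHARDS) for i in range(NUM_SHARDS)]
flow = ("10.0.0.1", 40000, "10.0.0.2", 80)
shards[0].handle_packet(flow, b"GET /", shards)
for shard in shards:
    shard.poll(shards)
print("flow owned by shard", shard_for(*flow))
```

The point of the one-queue-per-pair layout is that no data structure ever has two writers, which is what lets a design like this avoid locks; in a real implementation each queue would be a fixed-size ring polled by the consuming core.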