Date: Mon, 23 Jan 2017 09:15:50 -0800
From: Stephen Hemminger
To: Jiayu Hu
Cc: dev@dpdk.org, keith.wiles@intel.com, ray.kinsella@intel.com,
 konstantin.ananyev@intel.com, walter.e.gilmore@intel.com,
 venky.venkatesan@intel.com, yuanhan.liu@linux.intel.com
Subject: Re: [dpdk-dev] [RFC] Add GRO support in DPDK
Message-ID: <20170123091550.212dca35@xeon-e3>
In-Reply-To: <1485176592-111525-1-git-send-email-jiayu.hu@intel.com>
References: <1485176592-111525-1-git-send-email-jiayu.hu@intel.com>
List-Id: DPDK patches and discussions

On Mon, 23 Jan 2017 21:03:12 +0800
Jiayu Hu wrote:

> With the support of hardware segmentation techniques in DPDK, the
> networking stack overheads of send-side of applications, which directly
> leverage DPDK, have been greatly reduced. But for receive-side, numbers of
> segmented packets seriously burden the networking stack of applications.
> Generic Receive Offload (GRO) is a widely used method to solve the
> receive-side issue, which gains performance by reducing the amount of
> packets processed by the networking stack. But currently, DPDK doesn't
> support GRO. Therefore, we propose to add GRO support in DPDK, and this
> RFC is used to explain the basic DPDK GRO design.
>
> DPDK GRO is a SW-based packets assembly library, which provides GRO
> abilities for numbers of protocols. In DPDK GRO, packets are merged
> before returning to applications and after receiving from drivers.
>
> In DPDK, GRO is a capability of NIC drivers.
> That support GRO or not and
> what GRO types are supported are up to NIC drivers. Different drivers may
> support different GRO types. By default, drivers enable all supported GRO
> types. For applications, they can inquire the supported GRO types by
> each driver, and can control what GRO types are applied. For example,
> ixgbe supports TCP and UDP GRO, but the application just needs TCP GRO.
> The application can disable ixgbe UDP GRO.
>
> To support GRO, a driver should provide a way to tell applications what
> GRO types are supported, and provides a GRO function, which is in charge
> of assembling packets. Since different drivers may support different GRO
> types, their GRO functions may be different. For applications, they don't
> need extra operations to enable GRO. But if there are some GRO types that
> are not needed, applications can use an API, like
> rte_eth_gro_disable_protocols, to disable them. Besides, they can
> re-enable the disabled ones.
>
> The GRO function processes numbers of packets at a time. In each
> invocation, what GRO types are applied depends on applications, and the
> amount of packets to merge depends on the networking status and
> applications. Specifically, applications determine the maximum number of
> packets to be processed by the GRO function, but how many packets are
> actually processed depends on if there are available packets to receive.
> For example, the receive-side application asks the GRO function to
> process 64 packets, but the sender only sends 40 packets. At this time,
> the GRO function returns after processing 40 packets. To reassemble the
> given packets, the GRO function performs an "assembly procedure" on each
> packet. We use an example to demonstrate this procedure. Supposing the
> GRO function is going to process packetX, it will do the following two
> things:
>    a. Find a L4 assembly function according to the packet type of
>       packetX.
>       A L4 assembly function is in charge of merging packets of a
>       specific type. For example, TCPv4 assembly function merges packets
>       whose L3 IPv4 and L4 is TCP. Each L4 assembly function has a packet
>       array, which keeps the packets that are unable to assemble.
>       Initially, the packet array is empty;
>    b. The L4 assembly function traverses own packet array to find a
>       mergeable packet (comparing Ethernet, IP and L4 header fields). If
>       finds, merges it and packetX via chaining them together; if doesn't,
>       allocates a new array element to store packetX and updates element
>       number of the array.
> After performing the assembly procedure to all packets, the GRO function
> combines the results of all packet arrays, and returns these packets to
> applications.
>
> There are lots of ways to implement the above design in DPDK. One of the
> ways is:
>    a. Drivers tell applications what GRO types are supported via
>       dev->dev_ops->dev_infos_get;
>    b. When initialize, drivers register own GRO function as a RX
>       callback, which is invoked inside rte_eth_rx_burst. The name of the
>       GRO function should be like xxx_gro_receive (e.g. ixgbe_gro_receive).
>       Currently, the RX callback can only process the packets returned by
>       dev->rx_pkt_burst each time, and the maximum packet number
>       dev->rx_pkt_burst returns is determined by each driver, which can't
>       be interfered by applications. Therefore, to implement the above GRO
>       design, we have to modify current RX implementation to make driver
>       return packets as many as possible until the packet number meets the
>       demand of applications or there are not available packets to receive.
>       This modification is also proposed in patch:
>       http://dpdk.org/ml/archives/dev/2017-January/055887.html;
>    c. The GRO types to apply and the maximum number of packets to merge
>       are passed by resetting RX callback parameters. It can be achieved by
>       invoking rte_eth_rx_callback;
>    d. Simply, we can just store packet addresses into the packet array.
>       To check one element, we need to fetch the packet via its address.
>       However, this simple design is not efficient enough. Since whenever
>       checking one packet, one pointer dereference is generated. And a
>       pointer dereference always causes a cache line miss. A better way is
>       to store some rules in each array element. The rules must be the
>       prerequisites of merging two packets, like the sequence number of TCP
>       packets. We first compare the rules, then retrieve the packet if the
>       rules match. If storing the rules causes the packet array structure
>       is cache-unfriendly, we can store a fixed-length signature of the
>       rules instead. For example, the signature can be calculated by
>       performing XOR operation on IP addresses. Both design can avoid
>       unnecessary pointer dereferences.

Since DPDK does burst mode already, GRO is a lot less relevant.
GRO in Linux was invented because there is no burst mode in the
receive API.

If you look at VPP in FD.io you will see they already do aggregation
and steering at the higher level in the stack.

The point of GRO is that it is generic, no driver changes are necessary.
Your proposal would add a lot of overhead, and cause drivers to have to
be aware of higher level flows.
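
For readers following the thread, the packet-array lookup from item d of
the quoted RFC (compare a stored signature and sequence number first,
touch the packet only on a match) could look roughly like the sketch
below. This is only an illustration under stated assumptions: struct pkt,
struct gro_item, struct gro_tbl, flow_sig, same_flow and gro_tcp4_process
are hypothetical names, not DPDK APIs, and a real implementation would
parse headers out of an rte_mbuf and handle flags, checksums and
timeouts. Note that it needs no driver involvement; it works on whatever
burst of packets it is handed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified packet. A real implementation would read
 * these fields from the headers stored in an rte_mbuf. */
struct pkt {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint32_t seq;          /* TCP sequence number of first payload byte */
    uint32_t payload_len;
    struct pkt *next;      /* chain of merged segments */
};

/* One packet-array element. The "rules" (signature and expected next
 * sequence number) live in the array itself, so most non-matching
 * elements are rejected without dereferencing a packet pointer. */
struct gro_item {
    uint32_t sig;
    uint32_t next_seq;
    struct pkt *head;
    struct pkt *tail;
};

#define GRO_MAX_ITEMS 64   /* assumed table size, chosen arbitrarily */

struct gro_tbl {
    struct gro_item items[GRO_MAX_ITEMS];
    unsigned int nb_items;
};

/* Signature as suggested in the RFC: XOR of the IP addresses. */
static uint32_t flow_sig(const struct pkt *p)
{
    return p->src_ip ^ p->dst_ip;
}

/* Full flow comparison; only reached after the cheap checks pass. */
static int same_flow(const struct pkt *a, const struct pkt *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Returns 1 if p was chained onto an existing element, 0 if it was
 * stored in a new element, -1 if the table is full. */
static int gro_tcp4_process(struct gro_tbl *tbl, struct pkt *p)
{
    uint32_t sig = flow_sig(p);
    unsigned int i;

    p->next = NULL;
    for (i = 0; i < tbl->nb_items; i++) {
        struct gro_item *it = &tbl->items[i];

        /* Cheap checks on data held in the array element. */
        if (it->sig != sig || it->next_seq != p->seq)
            continue;
        /* Signatures can collide, so confirm against the packet. */
        if (!same_flow(it->head, p))
            continue;

        it->tail->next = p;
        it->tail = p;
        it->next_seq += p->payload_len;
        return 1;
    }

    if (tbl->nb_items == GRO_MAX_ITEMS)
        return -1;

    tbl->items[tbl->nb_items].sig = sig;
    tbl->items[tbl->nb_items].next_seq = p->seq + p->payload_len;
    tbl->items[tbl->nb_items].head = p;
    tbl->items[tbl->nb_items].tail = p;
    tbl->nb_items++;
    return 0;
}
```

A caller would run gro_tcp4_process over each packet of a received burst
and then hand the head of every table element to the application. Only
in-order segments are merged here; anything else simply starts a new
element.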