From: Stephen Hemminger
To: Olivier Matz
Cc: dev@dpdk.org
Date: Fri, 9 May 2014 10:04:31 -0700
Subject: Re: [dpdk-dev] [PATCH RFC 00/11] ixgbe/mbuf: add TSO support
Message-ID: <20140509100431.7af69959@nehalam.linuxnetplumber.net>
In-Reply-To: <1399647038-15095-1-git-send-email-olivier.matz@6wind.com>
References: <1399647038-15095-1-git-send-email-olivier.matz@6wind.com>

On Fri, 9 May 2014 16:50:27 +0200
Olivier Matz wrote:

> This series add TSO support in ixgbe DPDK driver. As discussed
> previously on the list [1], one problem is that there is not enough room
> in rte_mbuf today to store the required information to implement this
> feature:
> - a new ol_flag
> - the MSS
> - the L4 header len
>
> A solution would be to increase the size of the mbuf to 2 cache lines
> but it could have a bad impact on performance. This series proposes some
> rework to drastically reduce the size of the rte_mbuf structures before
> implementing the TSO, avoiding to change the mbuf size to 128 bytes.
>
> After the rework of mbuf structures, the size of rte_mbuf structure is
> reduced by 9 bytes. The implementation of TSO requires to double the
> size of ol_flags (16 to 32 bits) and to double the size of offload
> information in order to add the mss and the l4 header length (32 to 64
> bits). At the end of the whole series, sizeof(rte_mbuf) is still 64
> bytes and 4 bytes are available for future use.
>
> This rework causes a lot of modifications in the mbuf structure,
> implying some changes in the applications that directly use the mbuf
> structure fields instead of using the API functions (sometimes there is
> no function). That's why this series is a RFC. In my opinion, it's the
> proper moment for this evolution as the 1.7.0 window is open.
>
> About TSO, the new fields in mbuf try to be generic enough to apply to
> other hardware in the future. To delegate the TCP segmentation to the
> hardware, the user has to:
>
> - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
>   PKT_TX_IP_CKSUM and PKT_TX_TCP_CKSUM)
> - fill the mbuf->hw_offload information: l2_len, l3_len, l4_len, mss
> - calculate the pseudo header checksum and set it in the TCP header,
>   as required when doing hardware TCP checksum offload
> - set the IP checksum to 0
>
> Compilation of DPDK and examples is tested for the following
> targets: x86_64-*-linuxapp-gcc, i686-*-linuxapp-gcc, x86_64-*-bsdapp-gcc
>
> The mbuf rework series is validated with autotests:
>
>   cd dpdk.org/
>   make install T=x86_64-default-linuxapp-gcc
>   cd x86_64-default-linuxapp-gcc/
>   modprobe uio
>   insmod kmod/igb_uio.ko
>   python ../tools/igb_uio_bind.py -b igb_uio 0000:02:00.0
>   echo 0 > /proc/sys/kernel/randomize_va_space
>   echo 1000 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
>   echo 1000 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
>   mount -t hugetlbfs none /mnt/huge
>   make test
>
> TSO is validated with IPv4 and IPv6 with testpmd (see the commit log of
> last patch for details).
>
> The performance non-regression has been tested with 6WINDGate fast path.
>
> Note: this patches may conflict with patch [2] which is pushed yet, but
> will probably be integrated before this series.
>
> [1] http://dpdk.org/ml/archives/dev/2013-October/thread.html#572
> [2] http://dpdk.org/ml/archives/dev/2014-April/002166.html
>

I would also like to propose changing the checksum offload flags.
Many devices can indicate a good checksum in some cases but can't
validate many other types of packets. By changing the flags to be:
	PKT_RX_L4_CKSUM_GOOD and PKT_RX_IP_CKSUM_GOOD
it is then possible to support devices where some cases (IPv4 + TCP)
are supported but others are not. This also better aligns with the
Linux checksum code for cases where the mbuf and metadata are being
passed into the kernel.
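
To make the proposal concrete, here is a rough sketch of how a receive
path might consume "GOOD" flags. It assumes that a missing flag means
"hardware did not verify this packet", not "checksum is bad", so the
application falls back to a software check. The bit values and the helper
names (csum_fold_sum, ipv4_hdr_cksum_ok_sw, ipv4_hdr_cksum_ok) are made up
for illustration; none of them come from the patch series or from DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values for the proposed flags (illustration only). */
#define PKT_RX_IP_CKSUM_GOOD 0x0001u
#define PKT_RX_L4_CKSUM_GOOD 0x0002u

/*
 * Ones' complement sum of a short buffer (RFC 1071 style), folded to
 * 16 bits and NOT inverted. Intended for header-sized inputs.
 */
static inline uint16_t csum_fold_sum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(data != NULL || len == 0);

    /* Add the buffer as big-endian 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    /* A trailing odd byte is padded with a zero low byte. */
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);

    return (uint16_t)sum;
}

/* Software check: a valid IPv4 header, checksum included, sums to 0xffff. */
static inline int ipv4_hdr_cksum_ok_sw(const uint8_t *hdr, size_t hdr_len)
{
    return csum_fold_sum(hdr, hdr_len) == 0xffffu;
}

/*
 * Returns 1 if the IPv4 header checksum can be trusted, 0 otherwise.
 * If the (hypothetical) GOOD flag is set, the hardware result is used;
 * otherwise the state is unknown and the header is verified in software.
 */
static inline int ipv4_hdr_cksum_ok(uint32_t ol_flags, const uint8_t *hdr,
                                    size_t hdr_len)
{
    if (ol_flags & PKT_RX_IP_CKSUM_GOOD)
        return 1;
    return ipv4_hdr_cksum_ok_sw(hdr, hdr_len);
}
```

The same pattern would apply to PKT_RX_L4_CKSUM_GOOD with a software
TCP/UDP check as the fallback. A device that only validates IPv4 + TCP
would simply never set the flags for other traffic.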