From: Jiayu Hu
To: "Tan, Jianfeng"
Cc: dev@dpdk.org, konstantin.ananyev@intel.com, stephen@networkplumber.org, yliu@fridaylinux.org, keith.wiles@intel.com, tiwei.bie@intel.com, lei.a.yao@intel.com
Date: Mon, 26 Jun 2017 09:35:07 +0800
Message-ID: <20170626013507.GA107424@localhost.localdomain>
References: <1497770469-16661-1-git-send-email-jiayu.hu@intel.com> <1498229000-94867-1-git-send-email-jiayu.hu@intel.com> <79f2a001-aa91-8515-3b0d-bccff894deb9@intel.com>
In-Reply-To: <79f2a001-aa91-8515-3b0d-bccff894deb9@intel.com>
User-Agent: Mutt/1.7.1 (2016-10-04)
Subject: Re: [dpdk-dev] [PATCH v6 0/3] Support TCP/IPv4 GRO in DPDK

Hi Jianfeng,

On Mon, Jun 26, 2017 at 12:03:33AM +0800, Tan, Jianfeng wrote:
> 
> 
> On 6/23/2017 10:43 PM, Jiayu Hu wrote:
> > Generic Receive Offload (GRO) is a widely used SW-based offloading
> > technique to reduce per-packet processing overhead. It gains
> > performance by reassembling small packets into large ones. Therefore,
> > we propose to support GRO in DPDK.
> >
> > To enable more flexibility for applications, DPDK GRO is implemented
> > as a user library. Applications explicitly use the GRO library to
> > merge small packets into large ones. DPDK GRO provides two reassembly
> > modes: lightweight mode and heavyweight mode. If applications want to
> > merge packets in a simple way, they can select the lightweight mode
> > API. If applications need more fine-grained control, they can select
> > the heavyweight mode API.
> >
> > This patchset supports TCP/IPv4 GRO in DPDK. The first patch provides
> > a GRO API framework. The second patch supports TCP/IPv4 GRO. The last
> > patch enables TCP/IPv4 GRO in testpmd.
> >
> > We ran many iperf tests to measure the performance gains from DPDK
> > GRO.
> >
> > The test environment is:
> > a. two 25Gbps physical ports (p0 and p1) are linked together. Assign
> >    p0 to one network namespace and assign p1 to DPDK;
> > b. enable TSO for p0. Run the iperf client on p0;
> > c. launch testpmd with p1 and a vhost-user port, and run it in csum
> >    forwarding mode. Select TCP HW checksum calculation for the
> >    vhost-user port in the csum forwarding engine. For better
> >    performance, we also select IPv4 and TCP HW checksum calculation
> >    for p1;
> > d. launch a VM with one CPU core and a virtio-net port. The VM OS is
> >    Ubuntu 16.04, whose virtio-net driver supports GRO. Enable RX csum
> >    offloading and mrg_rxbuf for the VM. The iperf server runs in the
> >    VM;
> > e. to run iperf tests, we need to keep the csum forwarding engine
> >    from forcibly changing packet MAC addresses. So in our tests, we
> >    comment those lines out (line 701 ~ line 704 in csumonly.c).
> >
> > In each test, we run iperf with the following three configurations:
> >  - single flow and single TCP stream
> >  - multiple flows and single TCP stream
> >  - single flow and parallel TCP streams
> 
> To me, flow == TCP stream; so could you explain what 'flow' means?

Sorry, I used inappropriate terms.
Here, 'flow' means a TCP connection, and 'multiple TCP streams' means
parallel iperf client threads.

Thanks,
Jiayu

> >
> > We run the above iperf tests in three scenarios:
> >  s1: disabling kernel GRO and enabling DPDK GRO
> >  s2: disabling kernel GRO and disabling DPDK GRO
> >  s3: enabling kernel GRO and disabling DPDK GRO
> > Comparing the throughput of s1 with s2 shows the performance gains
> > from DPDK GRO. Comparing the throughput of s1 with s3 compares DPDK
> > GRO performance with kernel GRO performance.
> >
> > Test results:
> >  - DPDK GRO throughput is almost 2 times the throughput with neither
> >    DPDK GRO nor kernel GRO;
> >  - DPDK GRO throughput is almost 1.2 times the throughput of kernel
> >    GRO.
> >
> > Change log
> > ==========
> > v6:
> >  - avoid checksum validation and calculation
> >  - enable processing of IP fragmented packets
> >  - add a command in testpmd
> >  - update documents
> >  - modify rte_gro_timeout_flush and rte_gro_reassemble_burst
> >  - rename variables
> > v5:
> >  - fix some bugs
> >  - fix coding style issues
> > v4:
> >  - implement DPDK GRO as an application-used library
> >  - introduce lightweight and heavyweight working modes to enable
> >    fine-grained control for applications
> >  - replace cuckoo hash tables with a simpler table structure
> > v3:
> >  - fix compilation issues
> > v2:
> >  - provide a generic reassembly function;
> >  - implement GRO as a device ability:
> >    add APIs for devices to support GRO;
> >    add APIs for applications to enable/disable GRO;
> >  - update the testpmd example.
> >
> > Jiayu Hu (3):
> >   lib: add Generic Receive Offload API framework
> >   lib/gro: add TCP/IPv4 GRO support
> >   app/testpmd: enable TCP/IPv4 GRO
> >
> >  app/test-pmd/cmdline.c                      | 125 +++++++++
> >  app/test-pmd/config.c                       |  37 +++
> >  app/test-pmd/csumonly.c                     |   5 +
> >  app/test-pmd/testpmd.c                      |   3 +
> >  app/test-pmd/testpmd.h                      |  11 +
> >  config/common_base                          |   5 +
> >  doc/guides/rel_notes/release_17_08.rst      |   7 +
> >  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  34 +++
> >  lib/Makefile                                |   2 +
> >  lib/librte_gro/Makefile                     |  51 ++++
> >  lib/librte_gro/rte_gro.c                    | 221 ++++++++++++++++
> >  lib/librte_gro/rte_gro.h                    | 195 ++++++++++++++
> >  lib/librte_gro/rte_gro_tcp.c                | 393 ++++++++++++++++++++++++++++
> >  lib/librte_gro/rte_gro_tcp.h                | 188 +++++++++++++
> >  lib/librte_gro/rte_gro_version.map          |  12 +
> >  mk/rte.app.mk                               |   1 +
> >  16 files changed, 1290 insertions(+)
> >  create mode 100644 lib/librte_gro/Makefile
> >  create mode 100644 lib/librte_gro/rte_gro.c
> >  create mode 100644 lib/librte_gro/rte_gro.h
> >  create mode 100644 lib/librte_gro/rte_gro_tcp.c
> >  create mode 100644 lib/librte_gro/rte_gro_tcp.h
> >  create mode 100644 lib/librte_gro/rte_gro_version.map
> >