From: "Tan, Jianfeng"
To: Jiayu Hu, dev@dpdk.org
Cc: konstantin.ananyev@intel.com, stephen@networkplumber.org,
 yliu@fridaylinux.org, keith.wiles@intel.com, tiwei.bie@intel.com,
 lei.a.yao@intel.com
Subject: Re: [dpdk-dev] [PATCH v6 0/3] Support TCP/IPv4 GRO in DPDK
Date: Mon, 26 Jun 2017 00:03:33 +0800
Message-ID: <79f2a001-aa91-8515-3b0d-bccff894deb9@intel.com>
In-Reply-To: <1498229000-94867-1-git-send-email-jiayu.hu@intel.com>
References: <1497770469-16661-1-git-send-email-jiayu.hu@intel.com>
 <1498229000-94867-1-git-send-email-jiayu.hu@intel.com>

On 6/23/2017 10:43 PM, Jiayu Hu wrote:
> Generic Receive Offload (GRO) is a widely used SW-based offloading
> technique to reduce per-packet processing overhead. It gains performance
> by reassembling small packets into large ones. Therefore, we propose to
> support GRO in DPDK.
>
> To give applications more flexibility, DPDK GRO is implemented as a
> user library. Applications explicitly call the GRO library to merge
> small packets into large ones. DPDK GRO provides two reassembly modes:
> lightweight mode and heavyweight mode. If applications want to merge
> packets in a simple way, they can select the lightweight mode API. If
> applications need more fine-grained control, they can select the
> heavyweight mode API.
>
> This patchset adds TCP/IPv4 GRO support to DPDK. The first patch
> provides a GRO API framework. The second patch adds TCP/IPv4 GRO
> support. The last patch enables TCP/IPv4 GRO in testpmd.
>
> We run a series of iperf tests to measure the performance gains from
> DPDK GRO.
>
> The test environment is:
> a. two 25Gbps physical ports (p0 and p1) are linked together. Assign p0
>    to one networking namespace and assign p1 to DPDK;
> b. enable TSO for p0. Run the iperf client on p0;
> c. launch testpmd with p1 and a vhost-user port, and run it in csum
>    forwarding mode. Select TCP HW checksum calculation for the
>    vhost-user port in the csum forwarding engine. For better
>    performance, we also select IPv4 and TCP HW checksum calculation
>    for p1;
> d. launch a VM with one CPU core and a virtio-net port. The VM OS is
>    Ubuntu 16.04, whose virtio-net driver supports GRO. Enable RX csum
>    offloading and mrg_rxbuf for the VM. The iperf server runs in the VM;
> e. to run iperf tests, we need to keep the csum forwarding engine from
>    unconditionally rewriting packet MAC addresses. So in our tests, we
>    comment that code out (line 701 ~ line 704 in csumonly.c).
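
A note for anyone reproducing step (e): the code being commented out is the
unconditional destination/source MAC rewrite in the csum forwarding path.
It looks roughly like the excerpt below (from memory of
app/test-pmd/csumonly.c, inside pkt_burst_checksum_forward(); the exact
lines and wrapping depend on the tree, so treat this as an approximation):

    /* Unconditional MAC rewrite that the test setup above disables so
     * that iperf traffic keeps its real Ethernet addresses. */
    ether_addr_copy(&peer_eth_addrs[fs->peer_addr],
                    &eth_hdr->d_addr);
    ether_addr_copy(&ports[fs->tx_port].eth_addr,
                    &eth_hdr->s_addr);
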
>
> In each test, we run iperf with the following three configurations:
>     - single flow and single TCP stream
>     - multiple flows and single TCP stream
>     - single flow and parallel TCP streams

To me, flow == TCP stream; so could you explain what "flow" means here?

>
> We run the above iperf tests in three scenarios:
>     s1: disabling kernel GRO and enabling DPDK GRO
>     s2: disabling kernel GRO and disabling DPDK GRO
>     s3: enabling kernel GRO and disabling DPDK GRO
> Comparing the throughput of s1 with s2, we can see the performance gains
> from DPDK GRO. Comparing the throughput of s1 with s3, we can compare
> DPDK GRO performance with kernel GRO performance.
>
> Test results:
>     - DPDK GRO throughput is almost 2 times the throughput with neither
>       DPDK GRO nor kernel GRO;
>     - DPDK GRO throughput is almost 1.2 times the throughput of
>       kernel GRO.
>
> Change log
> ==========
> v6:
> - avoid checksum validation and calculation
> - enable processing of IP fragmented packets
> - add a command in testpmd
> - update documentation
> - modify rte_gro_timeout_flush and rte_gro_reassemble_burst
> - rename variable names
> v5:
> - fix some bugs
> - fix coding style issues
> v4:
> - implement DPDK GRO as an application-used library
> - introduce lightweight and heavyweight working modes to enable
>   fine-grained control for applications
> - replace cuckoo hash tables with a simpler table structure
> v3:
> - fix compilation issues.
> v2:
> - provide a generic reassembly function;
> - implement GRO as a device capability:
>   add APIs for devices to support GRO;
>   add APIs for applications to enable/disable GRO;
> - update testpmd example.
>
> Jiayu Hu (3):
>   lib: add Generic Receive Offload API framework
>   lib/gro: add TCP/IPv4 GRO support
>   app/testpmd: enable TCP/IPv4 GRO
>
>  app/test-pmd/cmdline.c                      | 125 +++++++++
>  app/test-pmd/config.c                       |  37 +++
>  app/test-pmd/csumonly.c                     |   5 +
>  app/test-pmd/testpmd.c                      |   3 +
>  app/test-pmd/testpmd.h                      |  11 +
>  config/common_base                          |   5 +
>  doc/guides/rel_notes/release_17_08.rst      |   7 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  34 +++
>  lib/Makefile                                |   2 +
>  lib/librte_gro/Makefile                     |  51 ++++
>  lib/librte_gro/rte_gro.c                    | 221 ++++++++++++++++
>  lib/librte_gro/rte_gro.h                    | 195 ++++++++++++++
>  lib/librte_gro/rte_gro_tcp.c                | 393 ++++++++++++++++++++++++++++
>  lib/librte_gro/rte_gro_tcp.h                | 188 +++++++++++++
>  lib/librte_gro/rte_gro_version.map          |  12 +
>  mk/rte.app.mk                               |   1 +
>  16 files changed, 1290 insertions(+)
>  create mode 100644 lib/librte_gro/Makefile
>  create mode 100644 lib/librte_gro/rte_gro.c
>  create mode 100644 lib/librte_gro/rte_gro.h
>  create mode 100644 lib/librte_gro/rte_gro_tcp.c
>  create mode 100644 lib/librte_gro/rte_gro_tcp.h
>  create mode 100644 lib/librte_gro/rte_gro_version.map
>
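
As a side note for other readers of this thread: from the two entry points
named in the v6 changelog above (rte_gro_reassemble_burst for the lightweight
mode and rte_gro_timeout_flush for the heavyweight mode), the application-side
call flow looks roughly like the sketch below. The parameter struct, the table
handle, the reassemble call used in heavyweight mode and all exact signatures
are my assumptions from reading the cover letter, not necessarily the API
defined in rte_gro.h of this patchset.

    #include <stdint.h>
    #include <rte_mbuf.h>
    #include <rte_gro.h>  /* lib/librte_gro/rte_gro.h added by this series */

    /*
     * Hypothetical usage sketch. Only rte_gro_reassemble_burst() and
     * rte_gro_timeout_flush() are named in the cover letter; everything
     * else (struct rte_gro_param, rte_gro_reassemble(), argument lists)
     * is assumed for illustration.
     */
    static uint16_t
    app_rx_gro(struct rte_mbuf **pkts, uint16_t nb_pkts,
               void *gro_tbl,         /* heavyweight-mode table, assumed */
               uint64_t flush_cycles) /* flow timeout, assumed in TSC cycles */
    {
            struct rte_gro_param param = { 0 }; /* assumed: max flows, GRO types */

            if (gro_tbl == NULL) {
                    /* Lightweight mode: merge what can be merged inside this
                     * burst only; returns the new number of mbufs in pkts[]. */
                    return rte_gro_reassemble_burst(pkts, nb_pkts, &param);
            }

            /* Heavyweight mode: feed the burst into a table kept across
             * bursts (assumed to return the packets it did not buffer),
             * then flush flows whose timeout has expired back into pkts[]. */
            uint16_t nb_left = rte_gro_reassemble(pkts, nb_pkts, gro_tbl);

            nb_left += rte_gro_timeout_flush(gro_tbl, flush_cycles,
                                             pkts + nb_left,
                                             (uint16_t)(nb_pkts - nb_left));
            return nb_left;
    }

Even if the real v6 prototypes differ, the split is the point: lightweight
mode is a single call per RX burst, while heavyweight mode separates insertion
from timed flushing so the application controls the added latency.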