From: Jingjing Wu
To: john.mcnamara@intel.com
Cc: dev@dpdk.org, jingjing.wu@intel.com, yong.liu@intel.com, helin.zhang@intel.com
Date: Thu, 14 Jul 2016 16:04:25 +0800
Message-Id: <1468483465-29135-1-git-send-email-jingjing.wu@intel.com>
X-Mailer: git-send-email 1.7.4.1
In-Reply-To: <1466650946-22523-1-git-send-email-jingjing.wu@intel.com>
References: <1466650946-22523-1-git-send-email-jingjing.wu@intel.com>
Subject: [dpdk-dev] [PATCH v3] doc: flow bifurcation guide on Linux

Flow bifurcation is a mechanism which depends on an advanced Ethernet device
to split traffic between queues. It lets the kernel driver and the DPDK
driver co-exist, so that the advantages of both can be used.
It is achieved by using SR-IOV and the NIC's advanced filtering.

This patch describes it and adds the user guide for ixgbe and i40e NICs.

Signed-off-by: Jingjing Wu
---
v3 changes:
 - rename bifurcated driver to flow bifurcation
 - move the doc from nics to howto

This patch is based on patch set
"[PATCH v3 0/2] doc: live migration procedure"
http://www.dpdk.org/ml/archives/dev/2016-July/043476.html

 doc/guides/howto/flow_bifurcation.rst              | 288 +++++++++++
 doc/guides/howto/img/flow_bifurcation_overview.svg | 544 +++++++++++++++++++++
 doc/guides/howto/img/ixgbe_bifu_queue_idx.svg      | 101 ++++
 doc/guides/howto/index.rst                         |   1 +
 4 files changed, 934 insertions(+)
 create mode 100644 doc/guides/howto/flow_bifurcation.rst
 create mode 100644 doc/guides/howto/img/flow_bifurcation_overview.svg
 create mode 100644 doc/guides/howto/img/ixgbe_bifu_queue_idx.svg

diff --git a/doc/guides/howto/flow_bifurcation.rst b/doc/guides/howto/flow_bifurcation.rst
new file mode 100644
index 0000000..5e10952
--- /dev/null
+++ b/doc/guides/howto/flow_bifurcation.rst
@@ -0,0 +1,288 @@
+..  BSD LICENSE
+    Copyright(c) 2016 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+Flow bifurcation guide
+======================
+
+Flow bifurcation is a mechanism which depends on an advanced Ethernet device to
+split traffic between Linux user space and kernel space. Because it is a
+hardware-assisted design, this approach can provide line-rate processing
+capability. Unlike KNI, the software only has to configure the device; it does
+not need to take care of the packet movement during the traffic split. This
+gives better performance with less CPU overhead.
+
+Flow bifurcation takes advantage of Ethernet device features to split the
+incoming data traffic between a user space application (such as a DPDK
+application) and/or a kernel space program (the Linux kernel stack). It can
+direct some traffic (e.g. data plane traffic) to DPDK, while directing other
+traffic (e.g. control plane traffic) to the traditional Linux networking stack.
+
+There are a number of technical options to achieve this. A typical example is
+to combine SR-IOV with packet classification filtering.
+
+SR-IOV is a PCI standard that allows the same physical adapter to be split into
+multiple virtual functions. Each virtual function has queues separate from
+those of the physical function.
+Traffic arriving at the network adapter with a virtual function's destination
+MAC address is directed to that virtual function. In a sense, SR-IOV provides
+queue division.
+
+Packet classification filtering is a hardware capability available on most
+network adapters. Filters can be configured so that the hardware directs
+specific flows to a given receive queue. Different NICs may have different
+filter types to direct flows to a Virtual Function or to a queue belonging to
+it.
+
+The Linux networking stack can receive specific traffic through the kernel
+driver, while DPDK can receive specific traffic bypassing the Linux kernel by
+using drivers like VFIO or the DPDK igb_uio module.
+
+.. _figure_flow_bifurcation_overview:
+
+.. figure:: img/flow_bifurcation_overview.*
+
+   Flow Bifurcation Overview
+
+
+Use Flow bifurcation on IXGBE in Linux
+--------------------------------------
+
+On Intel 82599 10 Gigabit Ethernet Controller series NICs, flow bifurcation
+can be achieved with SR-IOV and flow director technologies. Traffic can be
+directed to queues by the flow director capability, typically by matching the
+5-tuple of UDP/TCP packets.
+
+The procedure is as follows:
+
+#. Boot the system without iommu, or with ``iommu=pt``.
+
+#. Create Virtual Functions:
+
+   .. code-block:: console
+
+       echo 2 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
+
+#. Enable and set flow filters:
+
+   .. code-block:: console
+
+       ethtool -K eth1 ntuple on
+       ethtool -N eth1 flow-type udp4 src-ip 192.0.2.2 dst-ip 198.51.100.2 \
+               action $queue_index_in_VF0
+       ethtool -N eth1 flow-type udp4 src-ip 198.51.100.2 dst-ip 192.0.2.2 \
+               action $queue_index_in_VF1
+
+   where:
+
+   * $queue_index_in_PF: [queue index]
+
+   * $queue_index_in_VFn: bits 39:32 of the value define VF id + 1; the lower
+     32 bits indicate the queue index within the VF.
+
+   * $queue_index_in_VF0 = ((0x1 & 0xFF) << 32) + [queue index];
+
+   * $queue_index_in_VF1 = ((0x2 & 0xFF) << 32) + [queue index];
+
+   .. _figure_ixgbe_bifu_queue_idx:
+
+   .. figure:: img/ixgbe_bifu_queue_idx.*
+
+#. Compile the DPDK and insert igb_uio or probe vfio-pci kernel modules as normal.
+
+#. Bind the virtual functions:
+
+   .. code-block:: console
+
+       modprobe vfio-pci
+       dpdk_nic_bind.py -b vfio-pci 01:10.0
+       dpdk_nic_bind.py -b vfio-pci 01:10.1
+
+#. Run a DPDK application on the VFs:
+
+   .. code-block:: console
+
+       testpmd -c 0xff -n 4 -w 01:10.0 -w 01:10.1 -- -i --forward-mode=mac
+
+In this example, traffic matching the filter rules goes through the VFs. All
+other traffic, which does not match the rules, goes through the default queue
+or is spread across the PF queues by scaling. That is to say, UDP packets with
+those source and destination IP addresses go through DPDK. All other
+traffic, with different hosts or different protocols, goes through the Linux
+networking stack.
+
+.. note::
+
+   * The above steps work on the Linux kernel v4.2.
+
+   * Flow bifurcation is implemented in the Linux kernel and the ixgbe kernel
+     driver by the following patches:
+
+     * `ethtool: Add helper routines to pass vf to rx_flow_spec `_
+
+     * `ixgbe: Allow flow director to use entire queue space `_
+
+   * The ethtool version used in this example is 3.18.
+
+
+Use Flow bifurcation on I40E in Linux
+-------------------------------------
+
+On Intel X710/XL710 series Ethernet Controllers, flow bifurcation can be
+achieved with SR-IOV, cloud filters and the L3 VEB switch. Traffic can be
+directed to queues by the matching rules of the cloud filters and the L3 VEB
+switch.
+
+* L3 VEB filter for non-tunnelled packets. It can direct a packet to a queue
+  in a VF based only on the destination IP address.
+
+* Cloud filters for tunnelled packets, which have the following types:
+
+  * Inner mac
+
+  * Inner mac + VNI
+
+  * Outer mac + Inner mac + VNI
+
+  * Inner mac + Inner vlan + VNI
+
+  * Inner mac + Inner vlan
+
+The procedure is as follows:
+
+#. Boot the system without iommu, or with ``iommu=pt``.
+
+#. Build and insert the i40e.ko module.
+
+#. Create Virtual Functions:
+
+   .. code-block:: console
+
+       echo 2 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
+
+#. Add UDP port offload to the NIC if using cloud filters:
+
+   .. code-block:: console
+
+       ip li add vxlan0 type vxlan id 42 group 239.1.1.1 local 10.16.43.214 dev <name>
+       ifconfig vxlan0 up
+       ip -d li show vxlan0
+
+   .. note::
+
+      The message ``add vxlan port 8472, index 0 success`` can be found in
+      the system log.
+
+#. Enable and set flow filters:
+
+   * L3 VEB filter: route packets whose destination IP is 192.168.50.108 to
+     VF 0's queue 2.
+
+     .. code-block:: console
+
+         ethtool -N <dev_name> flow-type ip4 dst-ip 192.168.50.108 \
+                 user-def 0xffffffff00000000 action 2 loc 8
+
+   * Inner mac: route packets whose inner destination mac is 0:0:0:0:9:0 to
+     the PF's queue 6.
+
+     .. code-block:: console
+
+         ethtool -N <dev_name> flow-type ether dst 00:00:00:00:00:00 \
+                 m ff:ff:ff:ff:ff:ff src 00:00:00:00:09:00 m 00:00:00:00:00:00 \
+                 user-def 0xffffffff00000003 action 6 loc 1
+
+   * Inner mac + VNI: route packets whose inner destination mac is 0:0:0:0:9:0
+     and VNI is 8 to the PF's queue 4.
+
+     .. code-block:: console
+
+         ethtool -N <dev_name> flow-type ether dst 00:00:00:00:00:00 \
+                 m ff:ff:ff:ff:ff:ff src 00:00:00:00:09:00 m 00:00:00:00:00:00 \
+                 user-def 0x800000003 action 4 loc 4
+
+   * Outer mac + Inner mac + VNI: route packets whose outer mac is
+     68:05:ca:24:03:8b, inner destination mac is c2:1a:e1:53:bc:57, and VNI
+     is 8 to the PF's queue 2.
+
+     .. code-block:: console
+
+         ethtool -N <dev_name> flow-type ether dst 68:05:ca:24:03:8b \
+                 m 00:00:00:00:00:00 src c2:1a:e1:53:bc:57 m 00:00:00:00:00:00 \
+                 user-def 0x800000003 action 2 loc 2
+
+   * Inner mac + Inner vlan + VNI: route packets whose inner destination mac
+     is 00:00:00:00:20:00, inner vlan is 10, and VNI is 8 to VF 0's queue 1.
+
+     .. code-block:: console
+
+         ethtool -N <dev_name> flow-type ether dst 00:00:00:00:01:00 \
+                 m ff:ff:ff:ff:ff:ff src 00:00:00:00:20:00 m 00:00:00:00:00:00 \
+                 vlan 10 user-def 0x800000000 action 1 loc 5
+
+   * Inner mac + Inner vlan: route packets whose inner destination mac is
+     00:00:00:00:20:00 and inner vlan is 10 to VF 0's queue 1.
+
+     .. code-block:: console
+
+         ethtool -N <dev_name> flow-type ether dst 00:00:00:00:01:00 \
+                 m ff:ff:ff:ff:ff:ff src 00:00:00:00:20:00 m 00:00:00:00:00:00 \
+                 vlan 10 user-def 0xffffffff00000000 action 1 loc 5
+
+   .. note::
+
+      * If the upper 32 bits of 'user-def' are 0xffffffff, then the filter can
+        be used for programming an L3 VEB filter, otherwise the upper 32 bits
+        of 'user-def' can carry the tenant ID/VNI if specified/required.
+
+      * Cloud filters can be defined with inner mac, outer mac, inner ip,
+        inner vlan and VNI as part of the cloud tuple. These filters always
+        match on the destination (not source) mac/ip. In all these examples
+        the dst and src mac address fields are overloaded: dst == outer,
+        src == inner.
+
+      * The filter directs a packet matching the rule to the vf id specified
+        in the lower 32 bits of user-def, and to the queue specified by
+        'action'.
+
+      * If the vf id specified by the lower 32 bits of user-def is greater
+        than or equal to max_vfs, then the filter is for the PF queues.
+
+#. Compile the DPDK and insert igb_uio or probe vfio-pci kernel modules as normal.
+
+#. Bind the virtual functions:
+
+   .. code-block:: console
+
+       modprobe vfio-pci
+       dpdk_nic_bind.py -b vfio-pci 01:10.0
+       dpdk_nic_bind.py -b vfio-pci 01:10.1
+
+#. Run a DPDK application on the VFs:
+
+   .. code-block:: console
+
+       testpmd -c 0xff -n 4 -w 01:10.0 -w 01:10.1 -- -i --forward-mode=mac
+
+.. note::
+
+   * The above steps work on the i40e Linux kernel driver v1.5.16.
+
+   * The ethtool version used in this example is 3.18. A mask of ``ff`` means
+     the field is not involved in matching, while ``00`` or no mask means it
+     is involved.
+
+   * For more details of the configuration, refer to the `cloud filter test plan `_
diff --git a/doc/guides/howto/img/flow_bifurcation_overview.svg b/doc/guides/howto/img/flow_bifurcation_overview.svg
new file mode 100644
index 0000000..4fa2764
--- /dev/null
+++ b/doc/guides/howto/img/flow_bifurcation_overview.svg
@@ -0,0 +1,544 @@
[SVG markup not preserved. Text labels in the figure: NIC; LINUX; Kernel pf driver; Filters support traffic steering to VF; Rx Queues (0-N) PF; Rx Queues (0-M) VF (vf 0); filters; Tools to program filters; Director flows to queue index in specified VF; DPDK; Socket]
\ No newline at end of file
diff --git a/doc/guides/howto/img/ixgbe_bifu_queue_idx.svg b/doc/guides/howto/img/ixgbe_bifu_queue_idx.svg
new file mode 100644
index 0000000..f7e2bd8
--- /dev/null
+++ b/doc/guides/howto/img/ixgbe_bifu_queue_idx.svg
@@ -0,0 +1,101 @@
[SVG markup not preserved. Text labels in the figure: 0x000000; VF ID + 1; Queue Index; bit positions 0, 31, 39, 63]
diff --git a/doc/guides/howto/index.rst b/doc/guides/howto/index.rst
index 4b97a32..a912d82 100644
--- a/doc/guides/howto/index.rst
+++ b/doc/guides/howto/index.rst
@@ -36,3 +36,4 @@ How To User Guide
     :numbered:
 
     lm_bond_virtio_sriov
+    flow_bifurcation
-- 
2.4.0
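
The ixgbe section of the guide defines the ethtool "action" value for a VF
queue as VF id + 1 in bits 39:32 plus the queue index in the lower 32 bits.
A minimal shell sketch of that arithmetic follows. The helper name vf_action
is hypothetical (it is not part of ethtool or DPDK), and the sketch assumes a
shell with 64-bit arithmetic. The shift is parenthesized explicitly because
"+" binds tighter than "<<" in shell arithmetic.

```shell
#!/bin/sh
# Sketch only: vf_action is a hypothetical helper, not an ethtool or DPDK tool.
# It encodes the ixgbe "action" value as described in the guide:
#   bits 39:32 = VF id + 1, lower 32 bits = queue index within the VF.
vf_action() {
    vf_id=$1
    queue=$2
    # Shift (VF id + 1) into bits 39:32, then add the queue index.
    echo $(( (((vf_id + 1) & 0xFF) << 32) + queue ))
}

# Queue 0 of VF0 and queue 0 of VF1, matching the two rules in the guide.
queue_index_in_VF0=$(vf_action 0 0)
queue_index_in_VF1=$(vf_action 1 0)
echo "$queue_index_in_VF0 $queue_index_in_VF1"
```

The resulting variables could then be passed to the guide's commands, for
example "ethtool -N eth1 flow-type udp4 ... action $queue_index_in_VF0".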