From: Declan Doherty <declan.doherty@intel.com>
To: dev@dpdk.org
Cc: Alex Rosenbaum <alexr@mellanox.com>,
Ferruh Yigit <ferruh.yigit@intel.com>,
Thomas Monjalon <thomas@monjalon.net>,
Shahaf Shuler <shahafs@mellanox.com>,
Qi Zhang <qi.z.zhang@intel.com>,
Alejandro Lucero <alejandro.lucero@netronome.com>,
Andrew Rybchenko <arybchenko@solarflare.com>,
Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>,
Remy Horton <remy.horton@intel.com>,
John McNamara <john.mcnamara@intel.com>,
Rony Efraim <ronye@mellanox.com>,
Wu, Jingjing <jingjing.wu@intel.com>,
Lu, Wenzhuo <wenzhuo.lu@intel.com>,
Vincent JArdin <vincent.jardin@6wind.com>,
Yuanhan Liu <yliu@fridaylinux.org>,
Richardson, Bruce <bruce.richardson@intel.com>,
Ananyev, Konstantin <konstantin.ananyev@intel.com>,
Wang, Zhihong <zhihong.wang@intel.com>,
Adrien Mazarguil <adrien.mazarguil@6wind.com>,
Declan Doherty <declan.doherty@intel.com>
Subject: [dpdk-dev] [PATCH v6 1/8] doc: add switch representation documentation
Date: Wed, 28 Mar 2018 14:54:26 +0100
Message-ID: <20180328135433.20203-2-declan.doherty@intel.com>
In-Reply-To: <20180328135433.20203-1-declan.doherty@intel.com>
From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Add a document describing a model for representing switching-capable
devices in DPDK, using a general ethdev port model and port
representors. This document also details the port model and the
rte_flow semantics required for flow programming, and lists some
example use cases.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
---
doc/guides/prog_guide/index.rst | 1 +
doc/guides/prog_guide/switch_representation.rst | 829 ++++++++++++++++++++++++
2 files changed, 830 insertions(+)
create mode 100644 doc/guides/prog_guide/switch_representation.rst
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index bbbe7895d..09224af2e 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -17,6 +17,7 @@ Programmer's Guide
mbuf_lib
poll_mode_drv
rte_flow
+ switch_representation
traffic_metering_and_policing
traffic_management
bbdev
diff --git a/doc/guides/prog_guide/switch_representation.rst b/doc/guides/prog_guide/switch_representation.rst
new file mode 100644
index 000000000..f1a84f6b7
--- /dev/null
+++ b/doc/guides/prog_guide/switch_representation.rst
@@ -0,0 +1,829 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright(c) 2018 6WIND S.A.
+
+.. _switch_representation:
+
+Switch representation within DPDK applications
+==============================================
+
+.. contents:: :local:
+
+Introduction
+------------
+
+Network adapters with multiple physical ports and/or SR-IOV capabilities
+usually support the offload of traffic steering rules between their virtual
+functions (VFs), physical functions (PFs) and ports.
+
+As with standard Ethernet switches, this involves a combination of
+automatic MAC learning and manual configuration. For most purposes it is
+managed by the host system and fully transparent to users and applications.
+
+On the other hand, applications typically found on hypervisors that process
+layer 2 (L2) traffic (such as OVS) need to steer traffic themselves
+according to their own criteria.
+
+Without a standard software interface to manage traffic steering rules
+between VFs, PFs and the various physical ports of a given device,
+applications cannot take advantage of these offloads; software processing is
+mandatory even for traffic which ends up re-injected into the device it
+originates from.
+
+This document describes how such steering rules can be configured through
+the DPDK flow API (**rte_flow**), with emphasis on the SR-IOV use case
+(PF/VF steering) using a single physical port for clarity. However, the same
+logic applies to any number of ports without necessarily involving SR-IOV.
+
+Port Representors
+-----------------
+
+In many cases, traffic steering rules cannot be determined in advance;
+applications usually have to process some traffic in software before
+deciding to offload specific flows to hardware.
+
+Applications therefore need the ability to receive and inject traffic to
+various device endpoints (other VFs, PFs or physical ports) before
+connecting them together. Device drivers must provide means to hook the
+"other end" of these endpoints and to refer to them when configuring flow
+rules.
+
+This role is left to so-called "port representors" (also known as "VF
+representors" in the specific context of VFs), which are to DPDK what the
+Ethernet switch device driver model (**switchdev**) [1]_ is to Linux, and
+which can be thought of as a software "patch panel" front-end for applications.
+
+- DPDK port representors are implemented as additional virtual Ethernet
+  device (**ethdev**) instances, spawned on an as-needed basis through
+  configuration parameters passed to the driver of the underlying
+  device using devargs.
+
+::
+
+ -w pci:dbdf,representor=0
+ -w pci:dbdf,representor=[0-3]
+ -w pci:dbdf,representor=[0,5-11]
+
+- As virtual devices, they may be more limited than their physical
+ counterparts, for instance by exposing only a subset of device
+ configuration callbacks and/or by not necessarily having Rx/Tx capability.
+
+- Among other things, they can be used to assign MAC addresses to the
+ resource they represent.
+
+- Applications can tell port representors apart from other physical or virtual
+  ports by checking the dev_flags field within their device information
+  structure for the RTE_ETH_DEV_REPRESENTOR bit-field.
+
+.. code-block:: c
+
+ struct rte_eth_dev_info {
+ ..
+ uint32_t dev_flags; /**< Device flags */
+ ..
+ };
+
+- The device or group relationship of ports can be discovered using the
+  switch_id field within the device information structure. By default the
+  switch_id of a port will be its port_id, but ports within the same switch
+  domain will share the same *switch_id*, which in the case of SR-IOV devices
+  would align with the port_id of the physical function port.
+
+.. code-block:: c
+
+ struct rte_eth_dev_info {
+ ..
+ uint16_t switch_id; /**< Switch Domain Id */
+ ..
+ };
+
+
+.. [1] `Ethernet switch device driver model (switchdev)
+ <https://www.kernel.org/doc/Documentation/networking/switchdev.txt>`_
+
+Basic SR-IOV
+------------
+
+"Basic" in the sense that it is not managed by applications, which
+nonetheless expect traffic to flow between the various endpoints and the
+outside as if everything was linked by an Ethernet hub.
+
+The following diagram pictures a setup involving a device with one PF, two
+VFs and one shared physical port:
+
+::
+
+       .-------------.                 .-------------. .-------------.
+       | hypervisor  |                 |    VM 1     | |    VM 2     |
+       | application |                 | application | | application |
+       `--+----------'                 `----------+--' `--+----------'
+          |                                       |       |
+    .-----+-----.                                 |       |
+    | port_id 3 |                                 |       |
+    `-----+-----'                                 |       |
+          |                                       |       |
+        .-+--.                                .---+--. .--+---.
+        | PF |                                | VF 1 | | VF 2 |
+        `-+--'                                `---+--' `--+---'
+          |                                       |       |
+          `---------.     .-----------------------'       |
+                    |     |     .-------------------------'
+                    |     |     |
+                 .--+-----+-----+--.
+                 | interconnection |
+                 `--------+--------'
+                          |
+                     .----+-----.
+                     | physical |
+                     |  port 0  |
+                     `----------'
+
+- A DPDK application running on the hypervisor owns the PF device, which is
+ arbitrarily assigned port index 3.
+
+- Both VFs are assigned to VMs and used by unknown applications; they may be
+ DPDK-based or anything else.
+
+- Interconnection is not necessarily done through a true Ethernet switch and
+ may not even exist as a separate entity. The role of this block is to show
+ that something brings PF, VFs and physical ports together and enables
+ communication between them, with a number of built-in restrictions.
+
+Subsequent sections in this document describe means for DPDK applications
+running on the hypervisor to freely assign specific flows between PF, VFs
+and physical ports based on traffic properties, by managing this
+interconnection.
+
+Controlled SR-IOV
+-----------------
+
+Initialization
+~~~~~~~~~~~~~~
+
+When a DPDK application gets assigned a PF device and is deliberately not
+started in `basic SR-IOV`_ mode, any traffic coming from physical ports is
+received by PF according to default rules, while VFs remain isolated.
+
+::
+
+       .-------------.                 .-------------. .-------------.
+       | hypervisor  |                 |    VM 1     | |    VM 2     |
+       | application |                 | application | | application |
+       `--+----------'                 `----------+--' `--+----------'
+          |                                       |       |
+    .-----+-----.                                 |       |
+    | port_id 3 |                                 |       |
+    `-----+-----'                                 |       |
+          |                                       |       |
+        .-+--.                                .---+--. .--+---.
+        | PF |                                | VF 1 | | VF 2 |
+        `-+--'                                `------' `------'
+          |
+          `-----.
+                |
+             .--+----------------------.
+             | managed interconnection |
+             `------------+------------'
+                          |
+                     .----+-----.
+                     | physical |
+                     |  port 0  |
+                     `----------'
+
+In this mode, interconnection must be configured by the application to
+enable VF communication, for instance by explicitly directing traffic with a
+given destination MAC address to VF 1 and allowing traffic with the same MAC
+address as source to come out of it.
+
+For this to work, hypervisor applications need a way to refer to either VF 1
+or VF 2 in addition to the PF. This is addressed by `VF representors`_.
+
+VF representors
+~~~~~~~~~~~~~~~
+
+VF representors are virtual but standard DPDK network devices (albeit with
+limited capabilities) created by PMDs when managing a PF device.
+
+Since they represent VF instances used by other applications, configuring
+them (e.g. assigning a MAC address or setting up promiscuous mode) affects
+interconnection accordingly. If supported, they may also be used as two-way
+communication ports with VFs (assuming **switchdev** topology).
+
+
+::
+
+       .-------------.                 .-------------. .-------------.
+       | hypervisor  |                 |    VM 1     | |    VM 2     |
+       | application |                 | application | | application |
+       `--+---+---+--'                 `----------+--' `--+----------'
+          |   |   |                               |       |
+          |   |   `-------------------.           |       |
+          |   `---------.             |           |       |
+          |             |             |           |       |
+    .-----+-----. .-----+-----. .-----+-----.     |       |
+    | port_id 3 | | port_id 4 | | port_id 5 |     |       |
+    `-----+-----' `-----+-----' `-----+-----'     |       |
+          |             |             |           |       |
+        .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
+        | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
+        `-+--'    `-----+-----' `-----+-----' `---+--' `--+---'
+          |             |             |           |       |
+          |             |   .---------'           |       |
+          `-----.       |   |   .-----------------'       |
+                |       |   |   |   .---------------------'
+                |       |   |   |   |
+             .--+-------+---+---+---+--.
+             | managed interconnection |
+             `------------+------------'
+                          |
+                     .----+-----.
+                     | physical |
+                     |  port 0  |
+                     `----------'
+
+- VF representors are assigned arbitrary port indices 4 and 5 in the
+ hypervisor application and are respectively associated with VF 1 and VF 2.
+
+- They can't be dissociated; even if VF 1 and VF 2 were not connected,
+ representors could still be used for configuration.
+
+- In this context, port index 3 can be thought of as a representor for
+  physical port 0.
+
+As previously described, the "interconnection" block represents a logical
+concept. Interconnection occurs when hardware configuration enables traffic
+flows from one place to another (e.g. physical port 0 to VF 1) according to
+some criteria.
+
+This is discussed in more detail in `traffic steering`_.
+
+Traffic steering
+~~~~~~~~~~~~~~~~
+
+In the following diagram, each meaningful traffic origin or endpoint as seen
+by the hypervisor application is tagged with a unique letter from A to F.
+
+::
+
+       .-------------.                 .-------------. .-------------.
+       | hypervisor  |                 |    VM 1     | |    VM 2     |
+       | application |                 | application | | application |
+       `--+---+---+--'                 `----------+--' `--+----------'
+          |   |   |                               |       |
+          |   |   `-------------------.           |       |
+          |   `---------.             |           |       |
+          |             |             |           |       |
+    .----(A)----. .----(B)----. .----(C)----.     |       |
+    | port_id 3 | | port_id 4 | | port_id 5 |     |       |
+    `-----+-----' `-----+-----' `-----+-----'     |       |
+          |             |             |           |       |
+        .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
+        | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
+        `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
+          |             |             |           |       |
+          |             |   .---------'           |       |
+          `-----.       |   |   .-----------------'       |
+                |       |   |   |   .---------------------'
+                |       |   |   |   |
+             .--+-------+---+---+---+--.
+             | managed interconnection |
+             `------------+------------'
+                          |
+                     .---(F)----.
+                     | physical |
+                     |  port 0  |
+                     `----------'
+
+- **A**: PF device.
+- **B**: port representor for VF 1.
+- **C**: port representor for VF 2.
+- **D**: VF 1 proper.
+- **E**: VF 2 proper.
+- **F**: physical port.
+
+Although uncommon, some devices do not enforce a one-to-one mapping between
+PF and physical ports. For instance, by default all ports of **mlx4**
+adapters are available to all their PF/VF instances, in which case
+additional ports appear next to **F** in the above diagram.
+
+Assuming no interconnection is provided by default in this mode, setting up
+a `basic SR-IOV`_ configuration involving physical port 0 could be broken
+down as:
+
+PF:
+
+- **A to F**: let everything through.
+- **F to A**: PF MAC as destination.
+
+VF 1:
+
+- **A to D**, **E to D** and **F to D**: VF 1 MAC as destination.
+- **D to A**: VF 1 MAC as source and PF MAC as destination.
+- **D to E**: VF 1 MAC as source and VF 2 MAC as destination.
+- **D to F**: VF 1 MAC as source.
+
+VF 2:
+
+- **A to E**, **D to E** and **F to E**: VF 2 MAC as destination.
+- **E to A**: VF 2 MAC as source and PF MAC as destination.
+- **E to D**: VF 2 MAC as source and VF 1 MAC as destination.
+- **E to F**: VF 2 MAC as source.
+
+Devices may additionally support advanced matching criteria such as
+IPv4/IPv6 addresses or TCP/UDP ports.
+
+The combination of matching criteria with target endpoints fits well with
+**rte_flow** [6]_, which expresses flow rules as combinations of patterns
+and actions.
+
+Enhancing **rte_flow** with the ability to make flow rules match and target
+these endpoints provides a standard interface to manage their
+interconnection without introducing new concepts and a whole new API to
+implement them. This is described in `flow API (rte_flow)`_.
+
+.. [6] `Generic flow API (rte_flow)
+ <http://dpdk.org/doc/guides/prog_guide/rte_flow.html>`_
+
+Flow API (rte_flow)
+-------------------
+
+Extensions
+~~~~~~~~~~
+
+Compared to creating a brand new dedicated interface, **rte_flow** was
+deemed flexible enough to manage representor traffic with only minor
+extensions:
+
+- Using physical ports, PF, VF or port representors as targets.
+
+- Affecting traffic that is not necessarily addressed to the DPDK port ID a
+ flow rule is associated with (e.g. forcing VF traffic redirection to PF).
+
+For advanced uses:
+
+- Rule-based packet counters.
+
+- The ability to combine several identical actions for traffic duplication
+ (e.g. VF representor in addition to a physical port).
+
+- Dedicated actions for traffic encapsulation / decapsulation before
+  reaching an endpoint.
+
+Traffic direction
+~~~~~~~~~~~~~~~~~
+
+From an application standpoint, "ingress" and "egress" flow rule attributes
+apply to the DPDK port ID they are associated with. They select a traffic
+direction for matching patterns, but have no impact on actions.
+
+When matching traffic coming from or going to a different place than the
+immediate port ID a flow rule is associated with, these attributes keep
+their meaning while applying to the chosen origin, as highlighted by the
+following diagram:
+
+::
+
+       .-------------.                 .-------------. .-------------.
+       | hypervisor  |                 |    VM 1     | |    VM 2     |
+       | application |                 | application | | application |
+       `--+---+---+--'                 `----------+--' `--+----------'
+          |   |   |                               |       |
+          |   |   `-------------------.           |       |
+          |   `---------.             |           |       |
+          | ^           | ^           | ^         |       |
+          | | ingress   | | ingress   | | ingress |       |
+          | | egress    | | egress    | | egress  |       |
+          | v           | v           | v         |       |
+    .----(A)----. .----(B)----. .----(C)----.     |       |
+    | port_id 3 | | port_id 4 | | port_id 5 |     |       |
+    `-----+-----' `-----+-----' `-----+-----'     |       |
+          |             |             |           |       |
+        .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
+        | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
+        `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
+          |             |             |         ^ |       | ^
+          |             |             |  egress | |       | | egress
+          |             |             | ingress | |       | | ingress
+          |             |   .---------'         v |       | v
+          `-----.       |   |   .-----------------'       |
+                |       |   |   |   .---------------------'
+                |       |   |   |   |
+             .--+-------+---+---+---+--.
+             | managed interconnection |
+             `------------+------------'
+                        ^ |
+                ingress | |
+                 egress | |
+                        v |
+                     .---(F)----.
+                     | physical |
+                     |  port 0  |
+                     `----------'
+
+Ingress and egress are defined as relative to the application creating the
+flow rule.
+
+For instance, matching traffic sent by VM 2 would be done through an ingress
+flow rule on VF 2 (**E**), and likewise for incoming traffic on the physical
+port (**F**). This also applies to **C** and **A** respectively.
+
+Transferring traffic
+~~~~~~~~~~~~~~~~~~~~
+
+Without port representors
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+`Traffic direction`_ describes how an application could match traffic coming
+from or going to a specific place reachable from a DPDK port ID. This makes
+sense when the traffic in question is normally seen (i.e. sent or received)
+by the application creating the flow rule (e.g. as in "redirect all traffic
+coming from VF 1 to local queue 6").
+
+However, this does not force such traffic to take a specific route. Creating
+a flow rule on **A** matching traffic coming from **D** is only meaningful
+if it can be received by **A** in the first place, otherwise doing so simply
+has no effect.
+
+A new flow rule attribute named "transfer" is necessary for that. Combining
+it with "ingress" or "egress" and a specific origin requests a flow rule to
+be applied at the lowest level:
+
+::
+
+            ingress only            :        ingress + transfer
+                                    :
+   .-------------. .-------------.  :  .-------------. .-------------.
+   | hypervisor  | |    VM 1     |  :  | hypervisor  | |    VM 1     |
+   | application | | application |  :  | application | | application |
+   `------+------' `--+----------'  :  `------+------' `--+----------'
+          |           | | traffic   :         |           | | traffic
+    .----(A)----.     | v           :   .----(A)----.     | v
+    | port_id 3 |     |             :   | port_id 3 |     |
+    `-----+-----'     |             :   `-----+-----'     |
+          |           |             :         | ^         |
+          |           |             :         | | traffic |
+        .-+--.    .---+--.          :       .-+--.    .---+--.
+        | PF |    | VF 1 |          :       | PF |    | VF 1 |
+        `-+--'    `--(D)-'          :       `-+--'    `--(D)-'
+          |           | | traffic   :         | ^         | | traffic
+          |           | v           :         | | traffic | v
+       .--+-----------+--.          :      .--+-----------+--.
+       | interconnection |          :      | interconnection |
+       `--------+--------'          :      `--------+--------'
+                | | traffic         :               |
+                | v                 :               |
+           .---(F)----.             :          .---(F)----.
+           | physical |             :          | physical |
+           |  port 0  |             :          |  port 0  |
+           `----------'             :          `----------'
+
+With "ingress" only, traffic is matched on **A** and thus still goes to
+physical port **F** by default:
+
+
+::
+
+ testpmd> flow create 3 ingress pattern vf id is 1 / end
+ actions queue index 6 / end
+
+With "ingress + transfer", traffic is matched on **D** and is therefore
+successfully assigned to queue 6 on **A**:
+
+
+::
+
+ testpmd> flow create 3 ingress transfer pattern vf id is 1 / end
+ actions queue index 6 / end
+
+
+With port representors
+^^^^^^^^^^^^^^^^^^^^^^
+
+When port representors exist, implicit flow rules with the "transfer"
+attribute (described in `without port representors`_) are assumed to
+exist between them and their represented resources. These may be immutable.
+
+In this case, traffic is received by default through the representor and
+neither the "transfer" attribute nor traffic origin in flow rule patterns
+are necessary. They simply have to be created on the representor port
+directly and may target a different representor as described in `PORT_ID
+action`_.
+
+Implicit traffic flow with port representor:
+
+::
+
+       .-------------.   .-------------.
+       | hypervisor  |   |    VM 1     |
+       | application |   | application |
+       `--+-------+--'   `----------+--'
+          |       | ^               | | traffic
+          |       | | traffic       | v
+          |       `-----.           |
+          |             |           |
+    .----(A)----. .----(B)----.     |
+    | port_id 3 | | port_id 4 |     |
+    `-----+-----' `-----+-----'     |
+          |             |           |
+        .-+--.    .-----+-----. .---+--.
+        | PF |    | VF 1 rep. | | VF 1 |
+        `-+--'    `-----+-----' `--(D)-'
+          |             |           |
+       .--|-------------|-----------|--.
+       |  |             |           |  |
+       |  |             `-----------'  |
+       |  |              <-- traffic   |
+       `--|----------------------------'
+          |
+     .---(F)----.
+     | physical |
+     |  port 0  |
+     `----------'
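+
+As a minimal sketch in the Testpmd flow syntax used elsewhere in this
+document, and assuming the port numbering of `traffic steering`_ (port 4
+represents VF 1, port 5 represents VF 2), a rule created directly on the
+VF 1 representor needs neither the "transfer" attribute nor a ``vf``
+pattern item and can target the VF 2 representor. Whether a given PMD
+accepts such a rule is an assumption:
+
+::
+
+   flow create 4 ingress
+      pattern eth / end
+      actions port_id id 5 / end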
+
+Pattern items and actions
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+PORT pattern item
+^^^^^^^^^^^^^^^^^
+
+Matches traffic originating from (ingress) or going to (egress) a physical
+port of the underlying device.
+
+Using this pattern item without specifying a port index matches the physical
+port associated with the current DPDK port ID by default. As described in
+`traffic steering`_, specifying it should rarely be needed.
+
+- Matches **F** in `traffic steering`_.
+
+PORT action
+^^^^^^^^^^^
+
+Directs matching traffic to a given physical port index.
+
+- Targets **F** in `traffic steering`_.
+
+PORT_ID pattern item
+^^^^^^^^^^^^^^^^^^^^
+
+Matches traffic originating from (ingress) or going to (egress) a given DPDK
+port ID.
+
+Normally only supported if the port ID in question is known by the
+underlying PMD and related to the device the flow rule is created against.
+
+This must not be confused with the `PORT pattern item`_ which refers to the
+physical port of a device. ``PORT_ID`` refers to a ``struct rte_eth_dev``
+object on the application side (also known as "port representor" depending
+on the kind of underlying device).
+
+- Matches **A**, **B** or **C** in `traffic steering`_.
+
+PORT_ID action
+^^^^^^^^^^^^^^
+
+Directs matching traffic to a given DPDK port ID.
+
+Same restrictions as `PORT_ID pattern item`_.
+
+- Targets **A**, **B** or **C** in `traffic steering`_.
+
+PF pattern item
+^^^^^^^^^^^^^^^
+
+Matches traffic originating from (ingress) or going to (egress) the physical
+function of the current device.
+
+If supported, should work even if the physical function is not managed by
+the application and thus not associated with a DPDK port ID. Its behavior is
+otherwise similar to `PORT_ID pattern item`_ using PF port ID.
+
+- Matches **A** in `traffic steering`_.
+
+PF action
+^^^^^^^^^
+
+Directs matching traffic to the physical function of the current device.
+
+Same restrictions as `PF pattern item`_.
+
+- Targets **A** in `traffic steering`_.
+
+VF pattern item
+^^^^^^^^^^^^^^^
+
+Matches traffic originating from (ingress) or going to (egress) a given
+virtual function of the current device.
+
+If supported, should work even if the virtual function is not managed by
+the application and thus not associated with a DPDK port ID. Its behavior is
+otherwise similar to `PORT_ID pattern item`_ using VF port ID.
+
+Note this pattern item does not match VF representor traffic which, as
+separate entities, should be addressed through their own port IDs.
+
+- Matches **D** or **E** in `traffic steering`_.
+
+VF action
+^^^^^^^^^
+
+Directs matching traffic to a given virtual function of the current device.
+
+Same restrictions as `VF pattern item`_.
+
+- Targets **D** or **E** in `traffic steering`_.
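+
+The sketch below illustrates the PF and VF items and actions, reusing the
+``vf id is`` pattern item and "transfer" attribute shown in `without port
+representors`_. The ``pf`` and ``vf id`` action keywords are assumptions
+about how Testpmd spells `PF action`_ and `VF action`_; they are not used
+elsewhere in this document. These are two alternative rules: the first
+redirects VF 1 traffic (**D**) to the PF (**A**), the second redirects it to
+VF 2 (**E**) instead:
+
+::
+
+   flow create 3 ingress transfer
+      pattern vf id is 1 / end
+      actions pf / end
+
+   flow create 3 ingress transfer
+      pattern vf id is 1 / end
+      actions vf id 2 / end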
+
+\*_ENCAP actions
+^^^^^^^^^^^^^^^^
+
+These actions are named according to the protocol they encapsulate traffic
+with (e.g. ``VXLAN_ENCAP``) and using specific parameters (e.g. VNI for
+VXLAN).
+
+While they modify traffic and can be used multiple times (order matters),
+unlike `PORT_ID action`_ and friends, they have no impact on steering.
+
+As described in `actions order and repetition`_, this means they are useless
+if used alone in an action list: the resulting traffic gets dropped unless
+they are combined with either ``PASSTHRU`` or other endpoint-targeting actions.
+
+\*_DECAP actions
+^^^^^^^^^^^^^^^^
+
+They perform the reverse of `\*_ENCAP actions`_ by popping protocol headers
+from traffic instead of pushing them. They can be used multiple times as
+well.
+
+Note that using these actions on non-matching traffic results in undefined
+behavior. It is recommended to match the protocol headers to decapsulate on
+the pattern side of a flow rule in order to use these actions or otherwise
+make sure only matching traffic goes through.
+
+Actions Order and Repetition
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Flow rules are currently restricted to at most a single action of each
+supported type, performed in an unpredictable order (or all at once). To
+repeat actions in a predictable fashion, applications have to make rules
+pass-through and use priority levels.
+
+It's now clear that PMD support for chaining multiple non-terminating flow
+rules of varying priority levels is prohibitively difficult to implement
+compared to simply allowing multiple identical actions performed in a
+defined order by a single flow rule.
+
+- This change is required to support protocol encapsulation offloads and the
+ ability to perform them multiple times (e.g. VLAN then VXLAN).
+
+- It makes the ``DUP`` action redundant since multiple ``QUEUE`` actions can
+ be combined for duplication.
+
+- The (non-)terminating property of actions must be discarded. Instead, flow
+ rules themselves must be considered terminating by default (i.e. dropping
+ traffic if there is no specific target) unless a ``PASSTHRU`` action is
+ also specified.
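+
+For illustration, and assuming a PMD that accepts repeated actions as
+proposed above, duplicating traffic to two queues would no longer need
+``DUP``. Queue indices 6 and 7 are arbitrary values chosen for this sketch:
+
+::
+
+   flow create 3 ingress
+      pattern eth / end
+      actions queue index 6 / queue index 7 / end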
+
+Switching Examples
+------------------
+
+This section provides practical examples based on the established Testpmd
+flow command syntax [2]_, in the context described in `traffic steering`_:
+
+::
+
+       .-------------.                 .-------------. .-------------.
+       | hypervisor  |                 |    VM 1     | |    VM 2     |
+       | application |                 | application | | application |
+       `--+---+---+--'                 `----------+--' `--+----------'
+          |   |   |                               |       |
+          |   |   `-------------------.           |       |
+          |   `---------.             |           |       |
+          |             |             |           |       |
+    .----(A)----. .----(B)----. .----(C)----.     |       |
+    | port_id 3 | | port_id 4 | | port_id 5 |     |       |
+    `-----+-----' `-----+-----' `-----+-----'     |       |
+          |             |             |           |       |
+        .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
+        | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
+        `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
+          |             |             |           |       |
+          |             |   .---------'           |       |
+          `-----.       |   |   .-----------------'       |
+                |       |   |   |   .---------------------'
+                |       |   |   |   |
+             .--|-------|---|---|---|--.
+             |  |       |   `---|---'  |
+             |  |       `-------'      |
+             |  `---------.            |
+             `------------|------------'
+                          |
+                     .---(F)----.
+                     | physical |
+                     |  port 0  |
+                     `----------'
+
+By default, PF (**A**) can communicate with the physical port it is
+associated with (**F**), while VF 1 (**D**) and VF 2 (**E**) are isolated
+and restricted to communicate with the hypervisor application through their
+respective representors (**B** and **C**) if supported.
+
+Examples in subsequent sections apply to hypervisor applications only and
+are based on port representors **A**, **B** and **C**.
+
+.. [2] `Flow syntax
+ <http://dpdk.org/doc/guides/testpmd_app_ug/testpmd_funcs.html#flow-syntax>`_
+
+Associating VF 1 with physical port 0
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Assign all port traffic (**F**) to VF 1 (**D**) indiscriminately through
+their representors:
+
+::
+
+ flow create 3 ingress pattern / end actions port_id id 4 / end
+ flow create 4 ingress pattern / end actions port_id id 3 / end
+
+A more practical example, with MAC address restrictions:
+
+::
+
+ flow create 3 ingress
+ pattern eth dst is {VF 1 MAC} / end
+ actions port_id id 4 / end
+
+::
+
+ flow create 4 ingress
+ pattern eth src is {VF 1 MAC} / end
+ actions port_id id 3 / end
+
+
+Sharing broadcasts
+~~~~~~~~~~~~~~~~~~
+
+From outside to PF and VFs:
+
+::
+
+ flow create 3 ingress
+ pattern eth dst is ff:ff:ff:ff:ff:ff / end
+ actions port_id id 3 / port_id id 4 / port_id id 5 / end
+
+Note ``port_id id 3`` is necessary otherwise only VFs would receive matching
+traffic.
+
+From PF to outside and VFs:
+
+::
+
+ flow create 3 egress
+ pattern eth dst is ff:ff:ff:ff:ff:ff / end
+ actions port / port_id id 4 / port_id id 5 / end
+
+From VFs to outside and PF:
+
+::
+
+ flow create 4 ingress
+ pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 1 MAC} / end
+ actions port_id id 3 / port_id id 5 / end
+
+ flow create 5 ingress
+ pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 2 MAC} / end
+ actions port_id id 3 / port_id id 4 / end
+
+Similar ``33:33:*`` rules based on known MAC addresses should be added for
+IPv6 traffic.
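+
+For instance, a sketch of the outside-to-PF-and-VFs rule for the IPv6
+all-nodes multicast group, whose Ethernet address is ``33:33:00:00:00:01``.
+Which multicast groups actually need forwarding depends on the deployment
+and is left as an assumption:
+
+::
+
+   flow create 3 ingress
+      pattern eth dst is 33:33:00:00:00:01 / end
+      actions port_id id 3 / port_id id 4 / port_id id 5 / end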
+
+Encapsulating VF 2 traffic in VXLAN
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Assuming pass-through flow rules are supported:
+
+::
+
+ flow create 5 ingress
+ pattern eth / end
+ actions vxlan_encap vni 42 / passthru / end
+
+::
+
+ flow create 5 egress
+ pattern vxlan vni is 42 / end
+ actions vxlan_decap / passthru / end
+
+Here ``passthru`` is needed since, as described in `actions order and
+repetition`_, flow rules are otherwise terminating; if supported, a rule
+without a target endpoint will drop traffic.
+
+Without pass-through support, ingress encapsulation on the destination
+endpoint might not be supported and the action list must provide one:
+::
+
+ flow create 5 ingress
+ pattern eth src is {VF 2 MAC} / end
+ actions vxlan_encap vni 42 / port_id id 3 / end
+
+ flow create 3 ingress
+ pattern vxlan vni is 42 / end
+ actions vxlan_decap / port_id id 5 / end
--
2.14.3