From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by dpdk.org (Postfix) with ESMTP id 05C108019 for ; Thu, 26 Apr 2018 12:49:26 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga104.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 26 Apr 2018 03:49:26 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.49,330,1520924400"; d="scan'208";a="53809270" Received: from dwdohert-ws.ir.intel.com ([163.33.210.60]) by orsmga002.jf.intel.com with ESMTP; 26 Apr 2018 03:49:23 -0700 From: Declan Doherty To: dev@dpdk.org Cc: Adrien Mazarguil , Ferruh Yigit , Thomas Monjalon , Shahaf Shuler , Konstantin Ananyev , Declan Doherty , Adrien Mazarguil Date: Thu, 26 Apr 2018 11:40:57 +0100 Message-Id: <20180426104105.18342-2-declan.doherty@intel.com> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180426104105.18342-1-declan.doherty@intel.com> References: <20180416130605.6509-1-declan.doherty@intel.com> <20180426104105.18342-1-declan.doherty@intel.com> Subject: [dpdk-dev] [PATCH v8 1/9] doc: add switch representation documentation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Apr 2018 10:49:27 -0000 Add document to describe the model for representing switching capable devices in DPDK, using a general ethdev port model and through port representors. This document also details the port model and the rte_flow semantics required for flow programming, as well as listing some example use cases. Signed-off-by: Adrien Mazarguil Signed-off-by: Declan Doherty Reviewed-by: Marko Kovacevic --- doc/guides/prog_guide/index.rst | 1 + doc/guides/prog_guide/switch_representation.rst | 837 ++++++++++++++++++++++++ 2 files changed, 838 insertions(+) create mode 100644 doc/guides/prog_guide/switch_representation.rst diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst index 589c05d96..235ad0201 100644 --- a/doc/guides/prog_guide/index.rst +++ b/doc/guides/prog_guide/index.rst @@ -17,6 +17,7 @@ Programmer's Guide mbuf_lib poll_mode_drv rte_flow + switch_representation traffic_metering_and_policing traffic_management bbdev diff --git a/doc/guides/prog_guide/switch_representation.rst b/doc/guides/prog_guide/switch_representation.rst new file mode 100644 index 000000000..f5ee516f6 --- /dev/null +++ b/doc/guides/prog_guide/switch_representation.rst @@ -0,0 +1,837 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2018 6WIND S.A. + +.. _switch_representation: + +Switch Representation within DPDK Applications +============================================== + +.. contents:: :local: + +Introduction +------------ + +Network adapters with multiple physical ports and/or SR-IOV capabilities +usually support the offload of traffic steering rules between their virtual +functions (VFs), physical functions (PFs) and ports. + +Like for standard Ethernet switches, this involves a combination of +automatic MAC learning and manual configuration. For most purposes it is +managed by the host system and fully transparent to users and applications. + +On the other hand, applications typically found on hypervisors that process +layer 2 (L2) traffic (such as OVS) need to steer traffic themselves +according on their own criteria. 
+
+Without a standard software interface to manage traffic steering rules
+between VFs, PFs and the various physical ports of a given device,
+applications cannot take advantage of these offloads; software processing is
+mandatory even for traffic which ends up re-injected into the device it
+originates from.
+
+This document describes how such steering rules can be configured through
+the DPDK flow API (**rte_flow**), with emphasis on the SR-IOV use case
+(PF/VF steering) using a single physical port for clarity; however, the same
+logic applies to any number of ports without necessarily involving SR-IOV.
+
+Port Representors
+-----------------
+
+In many cases, traffic steering rules cannot be determined in advance;
+applications usually have to process a bit of traffic in software before
+thinking about offloading specific flows to hardware.
+
+Applications therefore need the ability to receive and inject traffic to
+various device endpoints (other VFs, PFs or physical ports) before
+connecting them together. Device drivers must provide means to hook the
+"other end" of these endpoints and to refer to them when configuring flow
+rules.
+
+This role is left to so-called "port representors" (also known as "VF
+representors" in the specific context of VFs), which are to DPDK what the
+Ethernet switch device driver model (**switchdev**) [1]_ is to Linux, and
+which can be thought of as a software "patch panel" front-end for
+applications.
+
+- DPDK port representors are implemented as additional virtual Ethernet
+  device (**ethdev**) instances, spawned on an as-needed basis through
+  configuration parameters passed to the driver of the underlying
+  device using devargs.
+
+::
+
+   -w pci:dbdf,representor=0
+   -w pci:dbdf,representor=[0-3]
+   -w pci:dbdf,representor=[0,5-11]
+
+- As virtual devices, they may be more limited than their physical
+  counterparts, for instance by exposing only a subset of device
+  configuration callbacks and/or by not necessarily having Rx/Tx capability.
+
+- Among other things, they can be used to assign MAC addresses to the
+  resource they represent.
+
+- Applications can tell port representors apart from other physical or
+  virtual ports by checking the ``dev_flags`` field within their device
+  information structure for the ``RTE_ETH_DEV_REPRESENTOR`` bit.
+
+.. code-block:: c
+
+   struct rte_eth_dev_info {
+       ...
+       uint32_t dev_flags; /**< Device flags */
+       ...
+   };
+
+- The device or group relationship of ports can be discovered using the
+  switch ``domain_id`` field within the device's switch information
+  structure. By default the switch ``domain_id`` of a port is
+  ``RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID``, indicating that the port does
+  not support the concept of a switch domain. Ports which do support the
+  concept are allocated a unique switch ``domain_id``, and ports within the
+  same switch domain share that ``domain_id``. The switch ``port_id``
+  identifies the port in terms of the switch, so in the case of SR-IOV
+  devices the switch ``port_id`` would be the virtual function identifier
+  of the port.
+
+.. code-block:: c
+
+   /**
+    * Ethernet device associated switch information
+    */
+   struct rte_eth_switch_info {
+       const char *name;   /**< switch name */
+       uint16_t domain_id; /**< switch domain id */
+       uint16_t port_id;   /**< switch port id */
+   };
+
+
+.. 
[1] `Ethernet switch device driver model (switchdev) + `_ + +Basic SR-IOV +------------ + +"Basic" in the sense that it is not managed by applications, which +nonetheless expect traffic to flow between the various endpoints and the +outside as if everything was linked by an Ethernet hub. + +The following diagram pictures a setup involving a device with one PF, two +VFs and one shared physical port + +:: + + .-------------. .-------------. .-------------. + | hypervisor | | VM 1 | | VM 2 | + | application | | application | | application | + `--+----------' `----------+--' `--+----------' + | | | + .-----+-----. | | + | port_id 3 | | | + `-----+-----' | | + | | | + .-+--. .---+--. .--+---. + | PF | | VF 1 | | VF 2 | + `-+--' `---+--' `--+---' + | | | + `---------. .-----------------------' | + | | .-------------------------' + | | | + .--+-----+-----+--. + | interconnection | + `--------+--------' + | + .----+-----. + | physical | + | port 0 | + `----------' + +- A DPDK application running on the hypervisor owns the PF device, which is + arbitrarily assigned port index 3. + +- Both VFs are assigned to VMs and used by unknown applications; they may be + DPDK-based or anything else. + +- Interconnection is not necessarily done through a true Ethernet switch and + may not even exist as a separate entity. The role of this block is to show + that something brings PF, VFs and physical ports together and enables + communication between them, with a number of built-in restrictions. + +Subsequent sections in this document describe means for DPDK applications +running on the hypervisor to freely assign specific flows between PF, VFs +and physical ports based on traffic properties, by managing this +interconnection. + +Controlled SR-IOV +----------------- + +Initialization +~~~~~~~~~~~~~~ + +When a DPDK application gets assigned a PF device and is deliberately not +started in `basic SR-IOV`_ mode, any traffic coming from physical ports is +received by PF according to default rules, while VFs remain isolated. + +:: + + .-------------. .-------------. .-------------. + | hypervisor | | VM 1 | | VM 2 | + | application | | application | | application | + `--+----------' `----------+--' `--+----------' + | | | + .-----+-----. | | + | port_id 3 | | | + `-----+-----' | | + | | | + .-+--. .---+--. .--+---. + | PF | | VF 1 | | VF 2 | + `-+--' `------' `------' + | + `-----. + | + .--+----------------------. + | managed interconnection | + `------------+------------' + | + .----+-----. + | physical | + | port 0 | + `----------' + +In this mode, interconnection must be configured by the application to +enable VF communication, for instance by explicitly directing traffic with a +given destination MAC address to VF 1 and allowing that with the same source +MAC address to come out of it. + +For this to work, hypervisor applications need a way to refer to either VF 1 +or VF 2 in addition to the PF. This is addressed by `VF representors`_. + +VF Representors +~~~~~~~~~~~~~~~ + +VF representors are virtual but standard DPDK network devices (albeit with +limited capabilities) created by PMDs when managing a PF device. + +Since they represent VF instances used by other applications, configuring +them (e.g. assigning a MAC address or setting up promiscuous mode) affects +interconnection accordingly. If supported, they may also be used as two-way +communication ports with VFs (assuming **switchdev** topology) + + +:: + + .-------------. .-------------. .-------------. 
+ | hypervisor | | VM 1 | | VM 2 | + | application | | application | | application | + `--+---+---+--' `----------+--' `--+----------' + | | | | | + | | `-------------------. | | + | `---------. | | | + | | | | | + .-----+-----. .-----+-----. .-----+-----. | | + | port_id 3 | | port_id 4 | | port_id 5 | | | + `-----+-----' `-----+-----' `-----+-----' | | + | | | | | + .-+--. .-----+-----. .-----+-----. .---+--. .--+---. + | PF | | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 | + `-+--' `-----+-----' `-----+-----' `---+--' `--+---' + | | | | | + | | .---------' | | + `-----. | | .-----------------' | + | | | | .---------------------' + | | | | | + .--+-------+---+---+---+--. + | managed interconnection | + `------------+------------' + | + .----+-----. + | physical | + | port 0 | + `----------' + +- VF representors are assigned arbitrary port indices 4 and 5 in the + hypervisor application and are respectively associated with VF 1 and VF 2. + +- They can't be dissociated; even if VF 1 and VF 2 were not connected, + representors could still be used for configuration. + +- In this context, port index 3 can be thought as a representor for physical + port 0. + +As previously described, the "interconnection" block represents a logical +concept. Interconnection occurs when hardware configuration enables traffic +flows from one place to another (e.g. physical port 0 to VF 1) according to +some criteria. + +This is discussed in more detail in `traffic steering`_. + +Traffic Steering +~~~~~~~~~~~~~~~~ + +In the following diagram, each meaningful traffic origin or endpoint as seen +by the hypervisor application is tagged with a unique letter from A to F. + +:: + + .-------------. .-------------. .-------------. + | hypervisor | | VM 1 | | VM 2 | + | application | | application | | application | + `--+---+---+--' `----------+--' `--+----------' + | | | | | + | | `-------------------. | | + | `---------. | | | + | | | | | + .----(A)----. .----(B)----. .----(C)----. | | + | port_id 3 | | port_id 4 | | port_id 5 | | | + `-----+-----' `-----+-----' `-----+-----' | | + | | | | | + .-+--. .-----+-----. .-----+-----. .---+--. .--+---. + | PF | | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 | + `-+--' `-----+-----' `-----+-----' `--(D)-' `-(E)--' + | | | | | + | | .---------' | | + `-----. | | .-----------------' | + | | | | .---------------------' + | | | | | + .--+-------+---+---+---+--. + | managed interconnection | + `------------+------------' + | + .---(F)----. + | physical | + | port 0 | + `----------' + +- **A**: PF device. +- **B**: port representor for VF 1. +- **C**: port representor for VF 2. +- **D**: VF 1 proper. +- **E**: VF 2 proper. +- **F**: physical port. + +Although uncommon, some devices do not enforce a one to one mapping between +PF and physical ports. For instance, by default all ports of **mlx4** +adapters are available to all their PF/VF instances, in which case +additional ports appear next to **F** in the above diagram. + +Assuming no interconnection is provided by default in this mode, setting up +a `basic SR-IOV`_ configuration involving physical port 0 could be broken +down as: + +PF: + +- **A to F**: let everything through. +- **F to A**: PF MAC as destination. + +VF 1: + +- **A to D**, **E to D** and **F to D**: VF 1 MAC as destination. +- **D to A**: VF 1 MAC as source and PF MAC as destination. +- **D to E**: VF 1 MAC as source and VF 2 MAC as destination. +- **D to F**: VF 1 MAC as source. + +VF 2: + +- **A to E**, **D to E** and **F to E**: VF 2 MAC as destination. 
+- **E to A**: VF 2 MAC as source and PF MAC as destination. +- **E to D**: VF 2 MAC as source and VF 1 MAC as destination. +- **E to F**: VF 2 MAC as source. + +Devices may additionally support advanced matching criteria such as +IPv4/IPv6 addresses or TCP/UDP ports. + +The combination of matching criteria with target endpoints fits well with +**rte_flow** [6]_, which expresses flow rules as combinations of patterns +and actions. + +Enhancing **rte_flow** with the ability to make flow rules match and target +these endpoints provides a standard interface to manage their +interconnection without introducing new concepts and whole new API to +implement them. This is described in `flow API (rte_flow)`_. + +.. [6] `Generic flow API (rte_flow) + `_ + +Flow API (rte_flow) +------------------- + +Extensions +~~~~~~~~~~ + +Compared to creating a brand new dedicated interface, **rte_flow** was +deemed flexible enough to manage representor traffic only with minor +extensions: + +- Using physical ports, PF, VF or port representors as targets. + +- Affecting traffic that is not necessarily addressed to the DPDK port ID a + flow rule is associated with (e.g. forcing VF traffic redirection to PF). + +For advanced uses: + +- Rule-based packet counters. + +- The ability to combine several identical actions for traffic duplication + (e.g. VF representor in addition to a physical port). + +- Dedicated actions for traffic encapsulation / decapsulation before + reaching an endpoint. + +Traffic Direction +~~~~~~~~~~~~~~~~~ + +From an application standpoint, "ingress" and "egress" flow rule attributes +apply to the DPDK port ID they are associated with. They select a traffic +direction for matching patterns, but have no impact on actions. + +When matching traffic coming from or going to a different place than the +immediate port ID a flow rule is associated with, these attributes keep +their meaning while applying to the chosen origin, as highlighted by the +following diagram + +:: + + .-------------. .-------------. .-------------. + | hypervisor | | VM 1 | | VM 2 | + | application | | application | | application | + `--+---+---+--' `----------+--' `--+----------' + | | | | | + | | `-------------------. | | + | `---------. | | | + | ^ | ^ | ^ | | + | | ingress | | ingress | | ingress | | + | | egress | | egress | | egress | | + | v | v | v | | + .----(A)----. .----(B)----. .----(C)----. | | + | port_id 3 | | port_id 4 | | port_id 5 | | | + `-----+-----' `-----+-----' `-----+-----' | | + | | | | | + .-+--. .-----+-----. .-----+-----. .---+--. .--+---. + | PF | | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 | + `-+--' `-----+-----' `-----+-----' `--(D)-' `-(E)--' + | | | ^ | | ^ + | | | egress | | | | egress + | | | ingress | | | | ingress + | | .---------' v | | v + `-----. | | .-----------------' | + | | | | .---------------------' + | | | | | + .--+-------+---+---+---+--. + | managed interconnection | + `------------+------------' + ^ | + ingress | | + egress | | + v | + .---(F)----. + | physical | + | port 0 | + `----------' + +Ingress and egress are defined as relative to the application creating the +flow rule. + +For instance, matching traffic sent by VM 2 would be done through an ingress +flow rule on VF 2 (**E**). Likewise for incoming traffic on physical port +(**F**). This also applies to **C** and **A** respectively. 
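+
+As an illustration only (not part of the original flow API definition), the
+following sketch shows how an application could express the VF 2 example
+above through the C API: an "ingress" rule created on port ID 5 (**C**, the
+VF 2 representor) matches traffic received by that port and directs it to
+one of its own Rx queues. The port and queue numbers are assumptions carried
+over from the diagram.
+
+.. code-block:: c
+
+   #include <stdio.h>
+   #include <rte_flow.h>
+
+   static int
+   redirect_vf2_traffic(void)
+   {
+       /* "ingress" is relative to the port the rule is created on. */
+       const struct rte_flow_attr attr = { .ingress = 1 };
+
+       /* Match any Ethernet frame received on this port. */
+       const struct rte_flow_item pattern[] = {
+           { .type = RTE_FLOW_ITEM_TYPE_ETH },
+           { .type = RTE_FLOW_ITEM_TYPE_END },
+       };
+
+       /* Send matching traffic to Rx queue 0 of the same port. */
+       const struct rte_flow_action_queue queue = { .index = 0 };
+       const struct rte_flow_action actions[] = {
+           { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
+           { .type = RTE_FLOW_ACTION_TYPE_END },
+       };
+
+       struct rte_flow_error error;
+       struct rte_flow *flow;
+
+       /* Port ID 5 is assumed to be the VF 2 representor. */
+       flow = rte_flow_create(5, &attr, pattern, actions, &error);
+       if (flow == NULL) {
+           printf("flow rule creation failed: %s\n",
+                  error.message ? error.message : "(no stated reason)");
+           return -1;
+       }
+       return 0;
+   }
+
+In testpmd flow syntax this corresponds to something like ``flow create 5
+ingress pattern eth / end actions queue index 0 / end``.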
+ +Transferring Traffic +~~~~~~~~~~~~~~~~~~~~ + +Without Port Representors +^^^^^^^^^^^^^^^^^^^^^^^^^ + +`Traffic direction`_ describes how an application could match traffic coming +from or going to a specific place reachable from a DPDK port ID. This makes +sense when the traffic in question is normally seen (i.e. sent or received) +by the application creating the flow rule (e.g. as in "redirect all traffic +coming from VF 1 to local queue 6"). + +However this does not force such traffic to take a specific route. Creating +a flow rule on **A** matching traffic coming from **D** is only meaningful +if it can be received by **A** in the first place, otherwise doing so simply +has no effect. + +A new flow rule attribute named "transfer" is necessary for that. Combining +it with "ingress" or "egress" and a specific origin requests a flow rule to +be applied at the lowest level + +:: + + ingress only : ingress + transfer + : + .-------------. .-------------. : .-------------. .-------------. + | hypervisor | | VM 1 | : | hypervisor | | VM 1 | + | application | | application | : | application | | application | + `------+------' `--+----------' : `------+------' `--+----------' + | | | traffic : | | | traffic + .----(A)----. | v : .----(A)----. | v + | port_id 3 | | : | port_id 3 | | + `-----+-----' | : `-----+-----' | + | | : | ^ | + | | : | | traffic | + .-+--. .---+--. : .-+--. .---+--. + | PF | | VF 1 | : | PF | | VF 1 | + `-+--' `--(D)-' : `-+--' `--(D)-' + | | | traffic : | ^ | | traffic + | | v : | | traffic | v + .--+-----------+--. : .--+-----------+--. + | interconnection | : | interconnection | + `--------+--------' : `--------+--------' + | | traffic : | + | v : | + .---(F)----. : .---(F)----. + | physical | : | physical | + | port 0 | : | port 0 | + `----------' : `----------' + +With "ingress" only, traffic is matched on **A** thus still goes to physical +port **F** by default + + +:: + + testpmd> flow create 3 ingress pattern vf id is 1 / end + actions queue index 6 / end + +With "ingress + transfer", traffic is matched on **D** and is therefore +successfully assigned to queue 6 on **A** + + +:: + + testpmd> flow create 3 ingress transfer pattern vf id is 1 / end + actions queue index 6 / end + + +With Port Representors +^^^^^^^^^^^^^^^^^^^^^^ + +When port representors exist, implicit flow rules with the "transfer" +attribute (described in `without port representors`_) are be assumed to +exist between them and their represented resources. These may be immutable. + +In this case, traffic is received by default through the representor and +neither the "transfer" attribute nor traffic origin in flow rule patterns +are necessary. They simply have to be created on the representor port +directly and may target a different representor as described in `PORT_ID +action`_. + +Implicit traffic flow with port representor + +:: + + .-------------. .-------------. + | hypervisor | | VM 1 | + | application | | application | + `--+-------+--' `----------+--' + | | ^ | | traffic + | | | traffic | v + | `-----. | + | | | + .----(A)----. .----(B)----. | + | port_id 3 | | port_id 4 | | + `-----+-----' `-----+-----' | + | | | + .-+--. .-----+-----. .---+--. + | PF | | VF 1 rep. | | VF 1 | + `-+--' `-----+-----' `--(D)-' + | | | + .--|-------------|-----------|--. + | | | | | + | | `-----------' | + | | <-- traffic | + `--|----------------------------' + | + .---(F)----. 
+ | physical | + | port 0 | + `----------' + +Pattern Items And Actions +~~~~~~~~~~~~~~~~~~~~~~~~~ + +PORT Pattern Item +^^^^^^^^^^^^^^^^^ + +Matches traffic originating from (ingress) or going to (egress) a physical +port of the underlying device. + +Using this pattern item without specifying a port index matches the physical +port associated with the current DPDK port ID by default. As described in +`traffic steering`_, specifying it should be rarely needed. + +- Matches **F** in `traffic steering`_. + +PORT Action +^^^^^^^^^^^ + +Directs matching traffic to a given physical port index. + +- Targets **F** in `traffic steering`_. + +PORT_ID Pattern Item +^^^^^^^^^^^^^^^^^^^^ + +Matches traffic originating from (ingress) or going to (egress) a given DPDK +port ID. + +Normally only supported if the port ID in question is known by the +underlying PMD and related to the device the flow rule is created against. + +This must not be confused with the `PORT pattern item`_ which refers to the +physical port of a device. ``PORT_ID`` refers to a ``struct rte_eth_dev`` +object on the application side (also known as "port representor" depending +on the kind of underlying device). + +- Matches **A**, **B** or **C** in `traffic steering`_. + +PORT_ID Action +^^^^^^^^^^^^^^ + +Directs matching traffic to a given DPDK port ID. + +Same restrictions as `PORT_ID pattern item`_. + +- Targets **A**, **B** or **C** in `traffic steering`_. + +PF Pattern Item +^^^^^^^^^^^^^^^ + +Matches traffic originating from (ingress) or going to (egress) the physical +function of the current device. + +If supported, should work even if the physical function is not managed by +the application and thus not associated with a DPDK port ID. Its behavior is +otherwise similar to `PORT_ID pattern item`_ using PF port ID. + +- Matches **A** in `traffic steering`_. + +PF Action +^^^^^^^^^ + +Directs matching traffic to the physical function of the current device. + +Same restrictions as `PF pattern item`_. + +- Targets **A** in `traffic steering`_. + +VF Pattern Item +^^^^^^^^^^^^^^^ + +Matches traffic originating from (ingress) or going to (egress) a given +virtual function of the current device. + +If supported, should work even if the virtual function is not managed by +the application and thus not associated with a DPDK port ID. Its behavior is +otherwise similar to `PORT_ID pattern item`_ using VF port ID. + +Note this pattern item does not match VF representors traffic which, as +separate entities, should be addressed through their own port IDs. + +- Matches **D** or **E** in `traffic steering`_. + +VF Action +^^^^^^^^^ + +Directs matching traffic to a given virtual function of the current device. + +Same restrictions as `VF pattern item`_. + +- Targets **D** or **E** in `traffic steering`_. + +\*_ENCAP actions +^^^^^^^^^^^^^^^^ + +These actions are named according to the protocol they encapsulate traffic +with (e.g. ``VXLAN_ENCAP``) and using specific parameters (e.g. VNI for +VXLAN). + +While they modify traffic and can be used multiple times (order matters), +unlike `PORT_ID action`_ and friends, they have no impact on steering. + +As described in `actions order and repetition`_ this means they are useless +if used alone in an action list, the resulting traffic gets dropped unless +combined with either ``PASSTHRU`` or other endpoint-targeting actions. + +\*_DECAP actions +^^^^^^^^^^^^^^^^ + +They perform the reverse of `\*_ENCAP actions`_ by popping protocol headers +from traffic instead of pushing them. 
They can be used multiple times as +well. + +Note that using these actions on non-matching traffic results in undefined +behavior. It is recommended to match the protocol headers to decapsulate on +the pattern side of a flow rule in order to use these actions or otherwise +make sure only matching traffic goes through. + +Actions Order and Repetition +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Flow rules are currently restricted to at most a single action of each +supported type, performed in an unpredictable order (or all at once). To +repeat actions in a predictable fashion, applications have to make rules +pass-through and use priority levels. + +It's now clear that PMD support for chaining multiple non-terminating flow +rules of varying priority levels is prohibitively difficult to implement +compared to simply allowing multiple identical actions performed in a +defined order by a single flow rule. + +- This change is required to support protocol encapsulation offloads and the + ability to perform them multiple times (e.g. VLAN then VXLAN). + +- It makes the ``DUP`` action redundant since multiple ``QUEUE`` actions can + be combined for duplication. + +- The (non-)terminating property of actions must be discarded. Instead, flow + rules themselves must be considered terminating by default (i.e. dropping + traffic if there is no specific target) unless a ``PASSTHRU`` action is + also specified. + +Switching Examples +------------------ + +This section provides practical examples based on the established testpmd +flow command syntax [2]_, in the context described in `traffic steering`_ + +:: + + .-------------. .-------------. .-------------. + | hypervisor | | VM 1 | | VM 2 | + | application | | application | | application | + `--+---+---+--' `----------+--' `--+----------' + | | | | | + | | `-------------------. | | + | `---------. | | | + | | | | | + .----(A)----. .----(B)----. .----(C)----. | | + | port_id 3 | | port_id 4 | | port_id 5 | | | + `-----+-----' `-----+-----' `-----+-----' | | + | | | | | + .-+--. .-----+-----. .-----+-----. .---+--. .--+---. + | PF | | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 | + `-+--' `-----+-----' `-----+-----' `--(D)-' `-(E)--' + | | | | | + | | .---------' | | + `-----. | | .-----------------' | + | | | | .---------------------' + | | | | | + .--|-------|---|---|---|--. + | | | `---|---' | + | | `-------' | + | `---------. | + `------------|------------' + | + .---(F)----. + | physical | + | port 0 | + `----------' + +By default, PF (**A**) can communicate with the physical port it is +associated with (**F**), while VF 1 (**D**) and VF 2 (**E**) are isolated +and restricted to communicate with the hypervisor application through their +respective representors (**B** and **C**) if supported. + +Examples in subsequent sections apply to hypervisor applications only and +are based on port representors **A**, **B** and **C**. + +.. 
[2] `Flow syntax
+   `_
+
+Associating VF 1 with Physical Port 0
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Assign all port traffic (**F**) to VF 1 (**D**) indiscriminately through
+their representors
+
+::
+
+   flow create 3 ingress pattern / end actions port_id id 4 / end
+   flow create 4 ingress pattern / end actions port_id id 3 / end
+
+A more practical example with MAC address restrictions
+
+::
+
+   flow create 3 ingress
+       pattern eth dst is {VF 1 MAC} / end
+       actions port_id id 4 / end
+
+::
+
+   flow create 4 ingress
+       pattern eth src is {VF 1 MAC} / end
+       actions port_id id 3 / end
+
+
+Sharing Broadcasts
+~~~~~~~~~~~~~~~~~~
+
+From outside to PF and VFs
+
+::
+
+   flow create 3 ingress
+      pattern eth dst is ff:ff:ff:ff:ff:ff / end
+      actions port_id id 3 / port_id id 4 / port_id id 5 / end
+
+Note ``port_id id 3`` is necessary, otherwise only VFs would receive matching
+traffic.
+
+From PF to outside and VFs
+
+::
+
+   flow create 3 egress
+      pattern eth dst is ff:ff:ff:ff:ff:ff / end
+      actions port / port_id id 4 / port_id id 5 / end
+
+From VFs to outside and PF
+
+::
+
+   flow create 4 ingress
+      pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 1 MAC} / end
+      actions port_id id 3 / port_id id 5 / end
+
+   flow create 5 ingress
+      pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 2 MAC} / end
+      actions port_id id 3 / port_id id 4 / end
+
+Similar ``33:33:*`` rules based on known MAC addresses should be added for
+IPv6 traffic.
+
+Encapsulating VF 2 Traffic in VXLAN
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Assuming pass-through flow rules are supported
+
+::
+
+   flow create 5 ingress
+      pattern eth / end
+      actions vxlan_encap vni 42 / passthru / end
+
+::
+
+   flow create 5 egress
+      pattern vxlan vni is 42 / end
+      actions vxlan_decap / passthru / end
+
+Here ``passthru`` is needed since, as described in `actions order and
+repetition`_, flow rules are otherwise terminating; if supported, a rule
+without a target endpoint will drop traffic.
+
+Without pass-through support, ingress encapsulation on the destination
+endpoint might not be supported and the action list must provide one
+
+::
+
+   flow create 5 ingress
+      pattern eth src is {VF 2 MAC} / end
+      actions vxlan_encap vni 42 / port_id id 3 / end
+
+   flow create 3 ingress
+      pattern vxlan vni is 42 / end
+      actions vxlan_decap / port_id id 5 / end
-- 
2.14.3