DPDK patches and discussions
From: Slava Ovsiienko <viacheslavo@mellanox.com>
To: Shahaf Shuler <shahafs@mellanox.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, Yongseok Koh <yskoh@mellanox.com>,
	Slava Ovsiienko <viacheslavo@mellanox.com>
Subject: [dpdk-dev] [PATCH v3 00/13] net/mlx5: e-switch VXLAN encap/decap hardware offload
Date: Thu, 1 Nov 2018 12:19:21 +0000
Message-ID: <1541074741-41368-1-git-send-email-viacheslavo@mellanox.com>
In-Reply-To: <1539612815-47199-1-git-send-email-viacheslavo@mellanox.com>

This patchset adds the VXLAN encapsulation/decapsulation hardware
offload feature for E-Switch.
 
A typical use case of tunneling infrastructure is port representors 
in switchdev mode, with VXLAN traffic encapsulation performed on
traffic coming *from* a representor and decapsulation on traffic
going *to* that representor, in order to transparently assign
a given VXLAN to VF traffic.

Since these actions are supported at the E-Switch level, the "transfer" 
attribute must be set on such flow rules. They must also be combined
with a port redirection action to make sense.

Since only ingress is supported, encapsulation flow rules are normally
applied on a physical port and emit traffic to a port representor. 
The opposite order is used for decapsulation.

Like other mlx5 E-Switch flow rule actions, these are implemented
through Linux's TC flower API. Since the Linux interface for VXLAN
encap/decap involves virtual network devices (i.e. ip link add type
vxlan [...]), the PMD dynamically spawns them as needed through
Netlink calls. These implicitly created VXLAN devices are called
VTEPs (Virtual Tunnel End Points).

For encapsulation, a VXLAN interface is dynamically created for each
local port of the outer networks and then used as the target of TC
"flower" filters. For decapsulation, a VXLAN device is created for
each unique UDP port. These VXLAN interfaces are system-wide: only
one device with a given local UDP port can exist in the system (an
attempt to create another device with the same local UDP port
returns EEXIST), so the PMD should support a device database shared
between PMD instances.

Rule sample considerations:

$PF 		- physical device, outer network
$VF 		- representor for VF, outer/inner network
$VXLAN		- VTEP netdev name
$PF_OUTER_IP 	- $PF IP (v4 or v6) within outer network
$REMOTE_IP 	- remote peer IP (v4 or v6) within outer network
$LOCAL_PORT	- local UDP port
$REMOTE_PORT	- remote UDP port

VXLAN VTEP creation with iproute2 (PMD does the same via Netlink):

- for encapsulation:

  ip link add $VXLAN type vxlan dstport $LOCAL_PORT external dev $PF
  ip link set dev $VXLAN up
  tc qdisc del dev $VXLAN ingress
  tc qdisc add dev $VXLAN ingress

$LOCAL_PORT for egress encapsulated traffic is selected automatically
from the available UDP ports in the range 30000-60000. Note that this
is not the source UDP port in the VXLAN header; it is just the UDP
port assigned to the VTEP and has no practical use.
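The port selection above can be sketched in shell. This is only an
illustration: the helper name pick_vtep_port is hypothetical, and the
first-free linear scan is an assumption, since the text states the
range but not the selection strategy the PMD uses.

```sh
# Hypothetical helper (not part of the PMD): print the first port in
# the range 30000-60000 that is not among the ports given as arguments.
# Returns 1 if every port in the range is already taken.
pick_vtep_port() {
    port=30000
    while [ "$port" -le 60000 ]; do
        in_use=0
        for used in "$@"; do
            if [ "$used" -eq "$port" ]; then
                in_use=1
                break
            fi
        done
        if [ "$in_use" -eq 0 ]; then
            echo "$port"
            return 0
        fi
        port=$((port + 1))
    done
    return 1
}
```

For example, "pick_vtep_port 30000 30001" prints 30002.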

- for decapsulation:

  ip link add $VXLAN type vxlan dstport $LOCAL_PORT external
  ip link set dev $VXLAN up
  tc qdisc del dev $VXLAN ingress
  tc qdisc add dev $VXLAN ingress

$LOCAL_PORT is the UDP port receiving the VXLAN traffic from outer networks.

All ingress UDP traffic with the given UDP destination port from ALL
existing netdevs is routed by the kernel to the $VXLAN net device. While
applying the rule, the kernel checks the IP parameters within the rule,
determines the appropriate underlying PF and tries to set up the rule
hardware offload.

VXLAN encapsulation 

VXLAN encap rules are applied to the VF ingress traffic and have the
VTEP as the actual redirection destination instead of the outer PF.
The encapsulation rule should provide:
- redirection action VF->PF
- VF port ID
- some inner network parameters (MACs)
- the tunnel outer source IP (v4/v6) (IS A MUST)
- the tunnel outer destination IP (v4/v6) (IS A MUST)
- VNI - Virtual Network Identifier (IS A MUST)
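
The mandatory items in the list above can be checked before a rule is
submitted. The sketch below is a stand-alone illustration, not the
PMD's validation routine; the function name check_encap_params and its
messages are hypothetical. The VNI range check relies on the VNI being
a 24-bit field in the VXLAN header.

```sh
# Hypothetical pre-flight check for a VXLAN encapsulation rule.
# Arguments: outer source IP, outer destination IP, VNI.
# Prints the first problem found and returns 1, or returns 0 if OK.
check_encap_params() {
    src_ip=${1:-}
    dst_ip=${2:-}
    vni=${3:-}
    [ -n "$src_ip" ] || { echo "missing outer source IP"; return 1; }
    [ -n "$dst_ip" ] || { echo "missing outer destination IP"; return 1; }
    [ -n "$vni" ] || { echo "missing VNI"; return 1; }
    # Reject anything that is not a plain decimal number.
    case "$vni" in
    *[!0-9]*) echo "VNI is not a decimal number"; return 1 ;;
    esac
    # The VNI is 24 bits wide, so only 0..16777215 can be encoded.
    [ "$vni" -le 16777215 ] || { echo "VNI out of range"; return 1; }
    return 0
}
```

For example, "check_encap_params 192.0.2.1 192.0.2.2 16777216" prints
"VNI out of range" and returns 1.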

VXLAN encapsulation rule sample for tc utility:

  tc filter add dev $VF protocol all parent ffff: flower skip_sw \
	action tunnel_key set dst_port $REMOTE_PORT \
	src_ip $PF_OUTER_IP dst_ip $REMOTE_IP id $VNI \
	action mirred egress redirect dev $VXLAN

VXLAN encapsulation rule sample for testpmd:

- Setting up outer properties of VXLAN tunnel:

  set vxlan ip-version ipv4 vni $VNI \
	udp-src $IGNORED udp-dst $REMOTE_PORT \
	ip-src $PF_OUTER_IP ip-dst $REMOTE_IP \
 	eth-src $IGNORED eth-dst $REMOTE_MAC

- Creating a flow rule on port ID 4 performing VXLAN encapsulation
  with the abovementioned properties and directing the resulting
  traffic to port ID 0:

  flow create 4 ingress transfer pattern eth src is $INNER_MAC / end
	actions vxlan_encap / port_id id 0 / end

No direct way was found to provide the kernel with all required
encapsulation header parameters. The encapsulation VTEP is created
attached to the outer interface, which is assumed to be the default
path for egress encapsulated traffic. The outer tunnel IP addresses
are assigned to the interface using Netlink, and the implicit route
is created like this:

  ip addr add <src_ip> peer <dst_ip> dev <outer> scope link

The peer address option provides the implicit route, and the scope link
attribute reduces the risk of conflicts. At initialization time all
local scope link addresses are flushed from the outer network device.

The destination MAC address is provided via a permanent neigh rule:

  ip neigh add dev <outer> lladdr <dst_mac> to <dst_ip> nud permanent

At initialization time all neigh rules of permanent type are flushed
from the outer network device. 
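
The address, neighbour and cleanup steps described above can be put
side by side as a dry-run sketch. The helper names encap_path_cmds and
outer_cleanup_cmds are hypothetical, and the commands are only rough
iproute2 equivalents: the PMD works through Netlink and may select the
entries to flush more narrowly. In particular, "ip addr flush ... scope
link" can also match IPv6 link-local addresses, so the output should be
reviewed before it is run.

```sh
# Hypothetical dry-run helpers: they only print commands, nothing is
# executed here.

# Arguments: outer device, tunnel source IP, tunnel destination IP,
# destination MAC address.
encap_path_cmds() {
    outer=$1
    src_ip=$2
    dst_ip=$3
    dst_mac=$4
    # Local address with a peer: gives the implicit route to the peer.
    echo "ip addr add $src_ip peer $dst_ip dev $outer scope link"
    # Permanent neighbour entry: supplies the outer destination MAC.
    echo "ip neigh add dev $outer lladdr $dst_mac to $dst_ip nud permanent"
}

# Argument: outer device. Approximates the initialization-time cleanup.
outer_cleanup_cmds() {
    outer=$1
    echo "ip addr flush dev $outer scope link"
    echo "ip neigh flush dev $outer nud permanent"
}
```

Printing the commands instead of running them keeps the sketch usable
without root privileges or an actual outer device.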

VXLAN decapsulation 

VXLAN decap rules are applied to the ingress traffic of the VTEP ($VXLAN)
device instead of the PF. The decapsulation rule should provide:
- redirection action PF->VF
- VF port ID as redirection destination
- $VXLAN device as ingress traffic source
- the tunnel outer source IP (v4/v6) (optional)
- the tunnel outer destination IP (v4/v6) (IS A MUST)
- the tunnel local UDP port (IS A MUST, PMD looks for the appropriate
  VTEP with the given local UDP port)
- VNI - Virtual Network Identifier (IS A MUST)

VXLAN decap rule sample for tc utility: 

  tc filter add dev $VXLAN protocol all parent ffff: flower skip_sw \
	enc_src_ip $REMOTE_IP enc_dst_ip $PF_OUTER_IP enc_key_id $VNI \
	enc_dst_port $LOCAL_PORT \
	action tunnel_key unset action mirred egress redirect dev $VF
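
The split between mandatory and optional decapsulation parameters can
be illustrated with a small dry-run helper. The function name
decap_filter_cmd is hypothetical; it only prints a tc command of the
same shape as the sample and leaves out the enc_src_ip match when no
outer source IP is given.

```sh
# Hypothetical dry-run helper: print a tc flower decapsulation filter.
# Arguments: VTEP device, VF representor, outer destination IP,
# local UDP port, VNI, and optionally the outer source IP.
decap_filter_cmd() {
    vtep=$1
    vf=$2
    dst_ip=$3
    port=$4
    vni=$5
    src_ip=${6:-}
    # Mandatory matches: outer destination IP, local UDP port, VNI.
    match="enc_dst_ip $dst_ip enc_dst_port $port enc_key_id $vni"
    # The outer source IP is optional; match on it only when given.
    if [ -n "$src_ip" ]; then
        match="enc_src_ip $src_ip $match"
    fi
    echo "tc filter add dev $vtep protocol all parent ffff: flower skip_sw" \
         "$match action tunnel_key unset" \
         "action mirred egress redirect dev $vf"
}
```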
						
VXLAN decap rule sample for testpmd: 

- Creating a flow on port ID 0 performing VXLAN decapsulation and directing
  the result to port ID 4, while also matching on inner properties:

  flow create 0 ingress transfer pattern /
	ipv4 src is $REMOTE_IP dst is $PF_OUTER_IP /
	udp src is 9999 dst is $LOCAL_PORT / vxlan vni is $VNI /
	eth src is 00:11:22:33:44:55 dst is $INNER_MAC / end
	actions vxlan_decap / port_id id 4 / end

VXLAN encap/decap rule constraints (implied by current kernel support):

- VXLAN decapsulation is provided for the PF->VF direction only
- VXLAN encapsulation is provided for the VF->PF direction only
- the current implementation supports only a non-shared database of
  VTEPs (several DPDK application instances cannot use the same UDP
  port simultaneously)

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

---
v3:
  * patchset is re-split into more dedicated parts
  * decapsulation rule takes MAC from inner eth item
  * appropriate RTE_BEx are replaced with runtime rte_cpu_xxx
  * E-Switch Flow counter deletion is fixed
  * VTEP management routines are refactored
  * found typos are corrected

v2:
  * removed non-VXLAN related parts
  * multipart Netlink messages support
  * local IP and peer IP rules management
  * neigh IP address to MAC address rules
  * management rules cleanup at outer device initialization
  * attached devices cleanup at outer device initialization

v1:
 * http://patches.dpdk.org/patch/45800/
 * Refactored code of initial experimental proposal

v0:
 * http://patches.dpdk.org/cover/44080/
 * Initial proposal by Adrien Mazarguil <adrien.mazarguil@6wind.com>

Viacheslav Ovsiienko (13):
  net/mlx5: prepare makefile for adding e-switch VXLAN
  net/mlx5: prepare meson.build for adding e-switch VXLAN
  net/mlx5: add necessary definitions for e-switch VXLAN
  net/mlx5: add necessary structures for e-switch VXLAN
  net/mlx5: swap items/actions validations for e-switch rules
  net/mlx5: add e-switch VXLAN support to validation routine
  net/mlx5: add VXLAN support to flow prepare routine
  net/mlx5: add VXLAN support to flow translate routine
  net/mlx5: e-switch VXLAN netlink routines update
  net/mlx5: fix e-switch Flow counter deletion
  net/mlx5: add e-switch VXLAN tunnel devices management
  net/mlx5: add e-switch VXLAN encapsulation rules
  net/mlx5: add e-switch VXLAN rule cleanup routines

 drivers/net/mlx5/Makefile        |   85 +
 drivers/net/mlx5/meson.build     |   34 +
 drivers/net/mlx5/mlx5_flow.h     |   11 +
 drivers/net/mlx5/mlx5_flow_tcf.c | 5118 +++++++++++++++++++++++++++++---------
 4 files changed, 4107 insertions(+), 1141 deletions(-)

-- 
1.8.3.1

