From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shahaf Shuler
To: Slava Ovsiienko
CC: "dev@dpdk.org", Yongseok Koh
Date: Sun, 4 Nov 2018 06:48:07 +0000
Subject: Re: [dpdk-dev] [PATCH v5 00/13] net/mlx5: e-switch VXLAN encap/decap hardware offload
References: <1541181152-15788-2-git-send-email-viacheslavo@mellanox.com>
 <1541225876-8817-1-git-send-email-viacheslavo@mellanox.com>
In-Reply-To: <1541225876-8817-1-git-send-email-viacheslavo@mellanox.com>
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
List-Id: DPDK patches and discussions

Saturday, November 3, 2018 8:19 AM, Slava Ovsiienko:
> Subject: [PATCH v5 00/13] net/mlx5: e-switch VXLAN encap/decap hardware
> offload
>
> This patchset adds the VXLAN encapsulation/decapsulation hardware offload
> feature for E-Switch.
>
> A typical use case of tunneling infrastructure is port representors in
> switchdev mode, with VXLAN traffic encapsulation performed on traffic
> coming *from* a representor and decapsulation on traffic going *to* that
> representor, in order to transparently assign a given VXLAN to VF traffic.
>
> Since these actions are supported at the E-Switch level, the "transfer"
> attribute must be set on such flow rules. They must also be combined with a
> port redirection action to make sense.
>
> Since only ingress is supported, encapsulation flow rules are normally
> applied on a physical port and emit traffic to a port representor.
> The opposite order is used for decapsulation.
>
> Like other mlx5 E-Switch flow rule actions, these are implemented
> through Linux's TC flower API. Since the Linux interface for VXLAN
> encap/decap involves virtual network devices (i.e. ip link add type
> vxlan [...]), the PMD dynamically spawns them on an as-needed
> basis through Netlink calls. These implicitly created VXLAN devices are
> called VTEPs (Virtual Tunnel End Points).
>
> VXLAN interfaces are dynamically created for each local port of outer
> networks and then used as targets for TC "flower" filters in order to
> perform encapsulation. For decapsulation the VXLAN devices are created for
> each unique UDP port. These VXLAN interfaces are system-wide: only one
> device with a given UDP port can exist in the system (an attempt to create
> another device with the same local UDP port returns EEXIST), so the PMD
> should support a device database shared between PMD instances.
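The one-device-per-UDP-port rule described above can be pictured as a small reference-counted table. The following is only a minimal sketch in C under stated assumptions, not the PMD's actual code: the names `vtep_db`, `vtep_create`, `vtep_acquire` and `vtep_release`, the return conventions, and the fixed table size are all hypothetical, and the table is per-process rather than shared between PMD instances.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define VTEP_MAX 16 /* hypothetical table size, chosen for illustration */

/* One entry per VXLAN netdev; refcnt == 0 marks a free slot. */
struct vtep {
	uint16_t udp_port;   /* local UDP port bound to the VXLAN device */
	unsigned int refcnt; /* number of flow rules using this device */
};

struct vtep_db {
	struct vtep slot[VTEP_MAX];
};

/* Return the in-use entry for udp_port, or NULL if there is none. */
static struct vtep *
vtep_lookup(struct vtep_db *db, uint16_t udp_port)
{
	for (size_t i = 0; i < VTEP_MAX; i++)
		if (db->slot[i].refcnt != 0 &&
		    db->slot[i].udp_port == udp_port)
			return &db->slot[i];
	return NULL;
}

/*
 * Exclusive create, modelled on the kernel behaviour quoted above:
 * a second device on the same UDP port is refused with -EEXIST.
 */
static int
vtep_create(struct vtep_db *db, uint16_t udp_port)
{
	if (vtep_lookup(db, udp_port) != NULL)
		return -EEXIST;
	for (size_t i = 0; i < VTEP_MAX; i++) {
		if (db->slot[i].refcnt == 0) {
			db->slot[i].udp_port = udp_port;
			db->slot[i].refcnt = 1;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Reuse the existing device for udp_port if any, else create one. */
static int
vtep_acquire(struct vtep_db *db, uint16_t udp_port)
{
	struct vtep *v = vtep_lookup(db, udp_port);

	if (v != NULL) {
		v->refcnt++;
		return 0;
	}
	return vtep_create(db, udp_port);
}

/*
 * Drop one reference. Returns 1 when the last user is gone (the caller
 * would then delete the netdev), 0 while still in use, -ENOENT if the
 * port is unknown.
 */
static int
vtep_release(struct vtep_db *db, uint16_t udp_port)
{
	struct vtep *v = vtep_lookup(db, udp_port);

	if (v == NULL)
		return -ENOENT;
	return --v->refcnt == 0 ? 1 : 0;
}
```

In this sketch several flow rules that need the same UDP port would call `vtep_acquire()` and share one entry, while a direct `vtep_create()` on a port that is already in use fails the same way the kernel does.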
>
> Rule sample considerations:
>
> $PF - physical device, outer network
> $VF - representor for VF, outer/inner network
> $VXLAN - VTEP netdev name
> $PF_OUTER_IP - $PF IP (v4 or v6) within outer network
> $REMOTE_IP - remote peer IP (v4 or v6) within outer network
> $LOCAL_PORT - local UDP port
> $REMOTE_PORT - remote UDP port
>
> VXLAN VTEP creation with iproute2 (PMD does the same via Netlink):
>
> - for encapsulation:
>
>   ip link add $VXLAN type vxlan dstport $LOCAL_PORT external dev $PF
>   ip link set dev $VXLAN up
>   tc qdisc del dev $VXLAN ingress
>   tc qdisc add dev $VXLAN ingress
>
> $LOCAL_PORT for egress encapsulated traffic is selected automatically
> from the available UDP ports in the range 30000-60000. Note that this is
> not the source UDP port in the VXLAN header; it is just the UDP port
> assigned to the VTEP and has no practical usage.
>
> - for decapsulation:
>
>   ip link add $VXLAN type vxlan dstport $LOCAL_PORT external
>   ip link set dev $VXLAN up
>   tc qdisc del dev $VXLAN ingress
>   tc qdisc add dev $VXLAN ingress
>
> $LOCAL_PORT is the UDP port receiving the VXLAN traffic from outer networks.
>
> All ingress UDP traffic with the given UDP destination port from ALL
> existing netdevs is routed by the kernel to the $VXLAN net device. While
> applying the rule the kernel checks the IP parameters within the rule,
> determines the appropriate underlying PF and tries to set up the rule
> hardware offload.
>
> VXLAN encapsulation
>
> VXLAN encap rules are applied to the VF ingress traffic and have the VTEP
> as the actual redirection destination instead of the outer PF.
> The encapsulation rule should provide:
> - redirection action VF->PF
> - VF port ID
> - some inner network parameters (MACs)
> - the tunnel outer source IP (v4/v6), (IS A MUST)
> - the tunnel outer destination IP (v4/v6), (IS A MUST).
> - VNI - Virtual Network Identifier (IS A MUST)
>
> VXLAN encapsulation rule sample for tc utility:
>
>   tc filter add dev $VF protocol all parent ffff: flower skip_sw \
>     action tunnel_key set dst_port $REMOTE_PORT \
>     src_ip $PF_OUTER_IP dst_ip $REMOTE_IP id $VNI \
>     action mirred egress redirect dev $VXLAN
>
> VXLAN encapsulation rule sample for testpmd:
>
> - Setting up outer properties of VXLAN tunnel:
>
>   set vxlan ip-version ipv4 vni $VNI \
>     udp-src $IGNORED udp-dst $REMOTE_PORT \
>     ip-src $PF_OUTER_IP ip-dst $REMOTE_IP \
>     eth-src $IGNORED eth-dst $REMOTE_MAC
>
> - Creating a flow rule on port ID 4 performing VXLAN encapsulation
>   with the abovementioned properties and directing the resulting
>   traffic to port ID 0:
>
>   flow create 4 ingress transfer pattern eth src is $INNER_MAC / end
>     actions vxlan_encap / port_id id 0 / end
>
> No direct way was found to provide the kernel with all required
> encapsulation header parameters. The encapsulation VTEP is created
> attached to the outer interface and is assumed to be the default path for
> egress encapsulated traffic. The outer tunnel IP addresses are assigned to
> the interface using Netlink; the implicit route is created like this:
>
>   ip addr add peer dev scope link
>
> The peer address option provides the implicit route, and the scope link
> attribute reduces the risk of conflicts. At initialization time all local
> scope link addresses are flushed from the outer network device.
>
> The destination MAC address is provided via a permanent neigh rule:
>
>   ip neigh add dev lladdr to nud permanent
>
> At initialization time all neigh rules of permanent type are flushed from
> the outer network device.
>
> VXLAN decapsulation
>
> VXLAN decap rules are applied to the ingress traffic of the VTEP ($VXLAN)
> device instead of the PF.
> The decapsulation rule should provide:
> - redirection action PF->VF
> - VF port ID as redirection destination
> - $VXLAN device as ingress traffic source
> - the tunnel outer source IP (v4/v6), (optional)
> - the tunnel outer destination IP (v4/v6), (IS A MUST)
> - the tunnel local UDP port (IS A MUST, PMD looks for appropriate VTEP
>   with given local UDP port)
> - VNI - Virtual Network Identifier (IS A MUST)
>
> VXLAN decap rule sample for tc utility:
>
>   tc filter add dev $VXLAN protocol all parent ffff: flower skip_sw \
>     enc_src_ip $REMOTE_IP enc_dst_ip $PF_OUTER_IP enc_key_id $VNI \
>     enc_dst_port $LOCAL_PORT \
>     action tunnel_key unset action mirred egress redirect dev $VF
>
> VXLAN decap rule sample for testpmd:
>
> - Creating a flow on port ID 0 performing VXLAN decapsulation and
>   directing the result to port ID 4, with checking of inner properties:
>
>   flow create 0 ingress transfer pattern /
>     ipv4 src is $REMOTE_IP dst is $PF_LOCAL_IP /
>     udp src is 9999 dst is $LOCAL_PORT / vxlan vni is $VNI /
>     eth src is 00:11:22:33:44:55 dst is $INNER_MAC / end
>     actions vxlan_decap / port_id id 4 / end
>
> The VXLAN encap/decap rule constraints (implied by current kernel support):
>
> - VXLAN decapsulation provided for PF->VF direction only
> - VXLAN encapsulation provided for VF->PF direction only
> - current implementation will support non-shared database of VTEPs
>   (simultaneous usage of the same UDP port by several instances of
>   DPDK apps is impossible)
>
> Suggested-by: Adrien Mazarguil
> Signed-off-by: Viacheslav Ovsiienko
>

Well done.
Applied to next-net-mlx, thanks.