From mboxrd@z Thu Jan 1 00:00:00 1970
From: Viacheslav Ovsiienko
To: shahafs@mellanox.com, yskoh@mellanox.com
Cc: dev@dpdk.org, Viacheslav Ovsiienko
Date: Mon, 15 Oct 2018 14:13:33 +0000
Message-Id: <1539612815-47199-6-git-send-email-viacheslavo@mellanox.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1539612815-47199-1-git-send-email-viacheslavo@mellanox.com>
References: <1538461807-37507-1-git-send-email-viacheslavo@mellanox.com> <1539612815-47199-1-git-send-email-viacheslavo@mellanox.com>
MIME-Version: 1.0
Content-Type: text/plain
Subject: [dpdk-dev] [PATCH v2 5/7] net/mlx5: e-switch VXLAN tunnel devices management
List-Id: DPDK patches and discussions

VXLAN interfaces are dynamically created for each local UDP port
of outer networks and then used as targets for TC "flower" filters
in order to perform encapsulation. These VXLAN interfaces are
system-wide: only one device with a given UDP port can exist in
the system (an attempt to create another device with the same
local UDP port returns EEXIST), so the PMD should maintain a
database of device instances shared between PMD instances. These
implicitly created VXLAN devices are called VTEPs (Virtual Tunnel
End Points).

A VTEP is created at the moment a rule is applied. The link is
set up and the root ingress qdisc is initialized as well.

Encapsulation VTEPs are created on a per-port basis: a single
VTEP is attached to the outer interface and is shared by all
encapsulation rules on this interface. The source UDP port is
automatically selected from the range 30000-60000.

For decapsulation, one VTEP is created for every unique local UDP
port to accept tunnel traffic. The name of a created VTEP consists
of the prefix "vmlx_" and the UDP port number in decimal digits
without leading zeros (vmlx_4789). A VTEP can also be created in
the system in advance, before the application is launched; this
allows UDP ports to be shared between primary and secondary
processes.

Suggested-by: Adrien Mazarguil
Signed-off-by: Viacheslav Ovsiienko
---
 drivers/net/mlx5/mlx5_flow_tcf.c | 503 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 499 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_tcf.c b/drivers/net/mlx5/mlx5_flow_tcf.c
index d6840d5..efa9c3b 100644
--- a/drivers/net/mlx5/mlx5_flow_tcf.c
+++ b/drivers/net/mlx5/mlx5_flow_tcf.c
@@ -3443,6 +3443,432 @@ struct pedit_parser {
 	return -err;
 }
 
+/* VTEP device list is shared between PMD port instances.
*/
+static LIST_HEAD(, mlx5_flow_tcf_vtep)
+			vtep_list_vxlan = LIST_HEAD_INITIALIZER();
+static pthread_mutex_t vtep_list_mutex = PTHREAD_MUTEX_INITIALIZER;
+
+/**
+ * Deletes VTEP network device.
+ *
+ * @param[in] tcf
+ *   Context object initialized by mlx5_flow_tcf_context_create().
+ * @param[in] vtep
+ *   Object representing the network device to delete. Memory
+ *   allocated for this object is freed by this routine.
+ */
+static void
+flow_tcf_delete_iface(struct mlx5_flow_tcf_context *tcf,
+		      struct mlx5_flow_tcf_vtep *vtep)
+{
+	struct nlmsghdr *nlh;
+	struct ifinfomsg *ifm;
+	alignas(struct nlmsghdr)
+	uint8_t buf[mnl_nlmsg_size(MNL_ALIGN(sizeof(*ifm))) + 8];
+	int ret;
+
+	assert(!vtep->refcnt);
+	if (vtep->created && vtep->ifindex) {
+		DRV_LOG(INFO, "VTEP delete (%d)", vtep->ifindex);
+		nlh = mnl_nlmsg_put_header(buf);
+		nlh->nlmsg_type = RTM_DELLINK;
+		nlh->nlmsg_flags = NLM_F_REQUEST;
+		ifm = mnl_nlmsg_put_extra_header(nlh, sizeof(*ifm));
+		ifm->ifi_family = AF_UNSPEC;
+		ifm->ifi_index = vtep->ifindex;
+		ret = flow_tcf_nl_ack(tcf, nlh, 0, NULL, NULL);
+		if (ret)
+			DRV_LOG(WARNING, "netlink: error deleting VXLAN "
+				"encap/decap ifindex %u",
+				ifm->ifi_index);
+	}
+	rte_free(vtep);
+}
+
+/**
+ * Creates VTEP network device.
+ *
+ * @param[in] tcf
+ *   Context object initialized by mlx5_flow_tcf_context_create().
+ * @param[in] ifouter
+ *   Outer interface to attach the newly created VXLAN device to.
+ *   If zero the VXLAN device will not be attached to any device.
+ * @param[in] port
+ *   UDP port of created VTEP device.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ *
+ * @return
+ *   Pointer to created device structure on success, NULL otherwise
+ *   and rte_errno is set.
+ */
+#ifndef HAVE_IFLA_VXLAN_COLLECT_METADATA
+static struct mlx5_flow_tcf_vtep*
+flow_tcf_create_iface(struct mlx5_flow_tcf_context *tcf __rte_unused,
+		      unsigned int ifouter __rte_unused,
+		      uint16_t port __rte_unused,
+		      struct rte_flow_error *error)
+{
+	rte_flow_error_set(error, ENOTSUP,
+			RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+			"netlink: failed to create VTEP, "
+			"VXLAN metadata is not supported by kernel");
+	return NULL;
+}
+#else
+static struct mlx5_flow_tcf_vtep*
+flow_tcf_create_iface(struct mlx5_flow_tcf_context *tcf,
+		      unsigned int ifouter,
+		      uint16_t port, struct rte_flow_error *error)
+{
+	struct mlx5_flow_tcf_vtep *vtep;
+	struct nlmsghdr *nlh;
+	struct ifinfomsg *ifm;
+	char name[sizeof(MLX5_VXLAN_DEVICE_PFX) + 24];
+	alignas(struct nlmsghdr)
+	uint8_t buf[mnl_nlmsg_size(sizeof(*ifm)) + 128 +
+		    SZ_NLATTR_DATA_OF(sizeof(name)) +
+		    SZ_NLATTR_NEST * 2 +
+		    SZ_NLATTR_STRZ_OF("vxlan") +
+		    SZ_NLATTR_DATA_OF(sizeof(uint32_t)) +
+		    SZ_NLATTR_DATA_OF(sizeof(uint32_t)) +
+		    SZ_NLATTR_DATA_OF(sizeof(uint16_t)) +
+		    SZ_NLATTR_DATA_OF(sizeof(uint8_t))];
+	struct nlattr *na_info;
+	struct nlattr *na_vxlan;
+	rte_be16_t vxlan_port = rte_cpu_to_be_16(port);
+	int ret;
+
+	vtep = rte_zmalloc(__func__, sizeof(*vtep),
+			alignof(struct mlx5_flow_tcf_vtep));
+	if (!vtep) {
+		rte_flow_error_set
+			(error, ENOMEM, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+			 NULL, "unable to allocate memory for VTEP desc");
+		return NULL;
+	}
+	*vtep = (struct mlx5_flow_tcf_vtep){
+			.refcnt = 0,
+			.port = port,
+			.created = 0,
+			.ifouter = 0,
+			.ifindex = 0,
+			.local = LIST_HEAD_INITIALIZER(),
+			.neigh = LIST_HEAD_INITIALIZER(),
+	};
+	memset(buf, 0, sizeof(buf));
+	nlh = mnl_nlmsg_put_header(buf);
+	nlh->nlmsg_type = RTM_NEWLINK;
+	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
+	ifm = mnl_nlmsg_put_extra_header(nlh, sizeof(*ifm));
+	ifm->ifi_family = AF_UNSPEC;
+	ifm->ifi_type = 0;
+	ifm->ifi_index = 0;
+	ifm->ifi_flags = IFF_UP;
+	ifm->ifi_change = 0xffffffff;
+	snprintf(name, sizeof(name),
"%s%u", MLX5_VXLAN_DEVICE_PFX, port); + mnl_attr_put_strz(nlh, IFLA_IFNAME, name); + na_info = mnl_attr_nest_start(nlh, IFLA_LINKINFO); + assert(na_info); + mnl_attr_put_strz(nlh, IFLA_INFO_KIND, "vxlan"); + na_vxlan = mnl_attr_nest_start(nlh, IFLA_INFO_DATA); + if (ifouter) + mnl_attr_put_u32(nlh, IFLA_VXLAN_LINK, ifouter); + assert(na_vxlan); + mnl_attr_put_u8(nlh, IFLA_VXLAN_COLLECT_METADATA, 1); + mnl_attr_put_u8(nlh, IFLA_VXLAN_UDP_ZERO_CSUM6_RX, 1); + mnl_attr_put_u8(nlh, IFLA_VXLAN_LEARNING, 0); + mnl_attr_put_u16(nlh, IFLA_VXLAN_PORT, vxlan_port); + mnl_attr_nest_end(nlh, na_vxlan); + mnl_attr_nest_end(nlh, na_info); + assert(sizeof(buf) >= nlh->nlmsg_len); + ret = flow_tcf_nl_ack(tcf, nlh, 0, NULL, NULL); + if (ret) + DRV_LOG(WARNING, + "netlink: VTEP %s create failure (%d)", + name, rte_errno); + else + vtep->created = 1; + if (ret && ifouter) + ret = 0; + else + ret = if_nametoindex(name); + if (ret) { + vtep->ifindex = ret; + vtep->ifouter = ifouter; + memset(buf, 0, sizeof(buf)); + nlh = mnl_nlmsg_put_header(buf); + nlh->nlmsg_type = RTM_NEWLINK; + nlh->nlmsg_flags = NLM_F_REQUEST; + ifm = mnl_nlmsg_put_extra_header(nlh, sizeof(*ifm)); + ifm->ifi_family = AF_UNSPEC; + ifm->ifi_type = 0; + ifm->ifi_index = vtep->ifindex; + ifm->ifi_flags = IFF_UP; + ifm->ifi_change = IFF_UP; + ret = flow_tcf_nl_ack(tcf, nlh, 0, NULL, NULL); + if (ret) { + DRV_LOG(WARNING, + "netlink: VTEP %s set link up failure (%d)", + name, rte_errno); + rte_free(vtep); + rte_flow_error_set + (error, -errno, + RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL, + "netlink: failed to set VTEP link up"); + vtep = NULL; + } else { + ret = mlx5_flow_tcf_init(tcf, vtep->ifindex, error); + if (ret) + DRV_LOG(WARNING, + "VTEP %s init failure (%d)", name, rte_errno); + } + } else { + DRV_LOG(WARNING, + "VTEP %s failed to get index (%d)", name, errno); + rte_flow_error_set + (error, -errno, + RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL, + !vtep->created ? 
"netlink: failed to create VTEP" : + "netlink: failed to retrieve VTEP ifindex"); + ret = 1; + } + if (ret) { + flow_tcf_delete_iface(tcf, vtep); + vtep = NULL; + } + DRV_LOG(INFO, "VTEP create (%d, %s)", vtep->port, vtep ? "OK" : "error"); + return vtep; +} +#endif /* HAVE_IFLA_VXLAN_COLLECT_METADATA */ + +/** + * Create target interface index for VXLAN tunneling decapsulation. + * In order to share the UDP port within the other interfaces the + * VXLAN device created as not attached to any interface (if created). + * + * @param[in] tcf + * Context object initialized by mlx5_flow_tcf_context_create(). + * @param[in] dev_flow + * Flow tcf object with tunnel structure pointer set. + * @param[out] error + * Perform verbose error reporting if not NULL. + * @return + * Interface index on success, zero otherwise and rte_errno is set. + */ +static unsigned int +flow_tcf_decap_vtep_create(struct mlx5_flow_tcf_context *tcf, + struct mlx5_flow *dev_flow, + struct rte_flow_error *error) +{ + struct mlx5_flow_tcf_vtep *vtep, *vlst; + uint16_t port = dev_flow->tcf.vxlan_decap->udp_port; + + vtep = NULL; + LIST_FOREACH(vlst, &vtep_list_vxlan, next) { + if (vlst->port == port) { + vtep = vlst; + break; + } + } + if (!vtep) { + vtep = flow_tcf_create_iface(tcf, 0, port, error); + if (vtep) + LIST_INSERT_HEAD(&vtep_list_vxlan, vtep, next); + } else { + if (vtep->ifouter) { + rte_flow_error_set(error, -errno, + RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL, + "Failed to create decap VTEP, attached " + "device with the same UDP port exists"); + vtep = NULL; + } + } + if (vtep) { + vtep->refcnt++; + assert(vtep->ifindex); + return vtep->ifindex; + } else { + return 0; + } +} + +/** + * Creates target interface index for VXLAN tunneling encapsulation. + * + * @param[in] tcf + * Context object initialized by mlx5_flow_tcf_context_create(). + * @param[in] ifouter + * Network interface index to attach VXLAN encap device to. 
+ * @param[in] dev_flow
+ *   Flow tcf object with tunnel structure pointer set.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ * @return
+ *   Interface index on success, zero otherwise and rte_errno is set.
+ */
+static unsigned int
+flow_tcf_encap_vtep_create(struct mlx5_flow_tcf_context *tcf,
+			   unsigned int ifouter,
+			   struct mlx5_flow *dev_flow __rte_unused,
+			   struct rte_flow_error *error)
+{
+	static uint16_t encap_port = MLX5_VXLAN_PORT_RANGE_MIN - 1;
+	struct mlx5_flow_tcf_vtep *vtep, *vlst;
+
+	assert(ifouter);
+	/* Check whether the attached VTEP for encap already exists. */
+	vtep = NULL;
+	LIST_FOREACH(vlst, &vtep_list_vxlan, next) {
+		if (vlst->ifouter == ifouter) {
+			vtep = vlst;
+			break;
+		}
+	}
+	if (!vtep) {
+		uint16_t pcnt;
+
+		/* Not found, we should create the new attached VTEP. */
+/*
+ * TODO: not implemented yet
+ * flow_tcf_encap_iface_cleanup(tcf, ifouter);
+ * flow_tcf_encap_local_cleanup(tcf, ifouter);
+ * flow_tcf_encap_neigh_cleanup(tcf, ifouter);
+ */
+		for (pcnt = 0; pcnt <= (MLX5_VXLAN_PORT_RANGE_MAX
+				     - MLX5_VXLAN_PORT_RANGE_MIN); pcnt++) {
+			encap_port++;
+			/* Wrap around the UDP port index. */
+			if (encap_port < MLX5_VXLAN_PORT_RANGE_MIN ||
+			    encap_port > MLX5_VXLAN_PORT_RANGE_MAX)
+				encap_port = MLX5_VXLAN_PORT_RANGE_MIN;
+			/* Check whether the UDP port is already in use. */
+			vtep = NULL;
+			LIST_FOREACH(vlst, &vtep_list_vxlan, next) {
+				if (vlst->port == encap_port) {
+					vtep = vlst;
+					break;
+				}
+			}
+			if (vtep) {
+				vtep = NULL;
+				continue;
+			}
+			vtep = flow_tcf_create_iface(tcf, ifouter,
+						     encap_port, error);
+			if (vtep) {
+				LIST_INSERT_HEAD(&vtep_list_vxlan, vtep, next);
+				break;
+			}
+			if (rte_errno != EEXIST)
+				break;
+		}
+	}
+	if (!vtep)
+		return 0;
+	vtep->refcnt++;
+	assert(vtep->ifindex);
+	return vtep->ifindex;
+}
+
+/**
+ * Creates target interface index for tunneling of any type.
+ *
+ * @param[in] tcf
+ *   Context object initialized by mlx5_flow_tcf_context_create().
+ * @param[in] ifouter
+ *   Network interface index to attach VXLAN encap device to.
+ * @param[in] dev_flow
+ *   Flow tcf object with tunnel structure pointer set.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ * @return
+ *   Interface index on success, zero otherwise and rte_errno is set.
+ */
+static unsigned int
+flow_tcf_tunnel_vtep_create(struct mlx5_flow_tcf_context *tcf,
+			    unsigned int ifouter,
+			    struct mlx5_flow *dev_flow,
+			    struct rte_flow_error *error)
+{
+	unsigned int ret;
+
+	assert(dev_flow->tcf.tunnel);
+	pthread_mutex_lock(&vtep_list_mutex);
+	switch (dev_flow->tcf.tunnel->type) {
+	case MLX5_FLOW_TCF_TUNACT_VXLAN_ENCAP:
+		ret = flow_tcf_encap_vtep_create(tcf, ifouter,
+						 dev_flow, error);
+		break;
+	case MLX5_FLOW_TCF_TUNACT_VXLAN_DECAP:
+		ret = flow_tcf_decap_vtep_create(tcf, dev_flow, error);
+		break;
+	default:
+		rte_flow_error_set(error, ENOTSUP,
+				RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+				"unsupported tunnel type");
+		ret = 0;
+		break;
+	}
+	pthread_mutex_unlock(&vtep_list_mutex);
+	return ret;
+}
+
+/**
+ * Deletes tunneling interface by interface index.
+ *
+ * @param[in] tcf
+ *   Context object initialized by mlx5_flow_tcf_context_create().
+ * @param[in] ifindex
+ *   Network interface index of VXLAN device.
+ * @param[in] dev_flow
+ *   Flow tcf object with tunnel structure pointer set.
+ */
+static void
+flow_tcf_tunnel_vtep_delete(struct mlx5_flow_tcf_context *tcf,
+			    unsigned int ifindex,
+			    struct mlx5_flow *dev_flow)
+{
+	struct mlx5_flow_tcf_vtep *vtep, *vlst;
+
+	assert(dev_flow->tcf.tunnel);
+	pthread_mutex_lock(&vtep_list_mutex);
+	vtep = NULL;
+	LIST_FOREACH(vlst, &vtep_list_vxlan, next) {
+		if (vlst->ifindex == ifindex) {
+			vtep = vlst;
+			break;
+		}
+	}
+	if (!vtep) {
+		DRV_LOG(WARNING, "No VTEP device found in the list");
+		goto exit;
+	}
+	switch (dev_flow->tcf.tunnel->type) {
+	case MLX5_FLOW_TCF_TUNACT_VXLAN_DECAP:
+		break;
+	case MLX5_FLOW_TCF_TUNACT_VXLAN_ENCAP:
+/*
+ * TODO: Remove the encap ancillary rules first.
+ * flow_tcf_encap_neigh(tcf, vtep, dev_flow, false, NULL);
+ * flow_tcf_encap_local(tcf, vtep, dev_flow, false, NULL);
+ */
+		break;
+	default:
+		assert(false);
+		DRV_LOG(WARNING, "Unsupported tunnel type");
+		break;
+	}
+	assert(dev_flow->tcf.tunnel->ifindex_tun == vtep->ifindex);
+	assert(vtep->refcnt);
+	if (!vtep->refcnt || !--vtep->refcnt) {
+		LIST_REMOVE(vtep, next);
+		flow_tcf_delete_iface(tcf, vtep);
+	}
+exit:
+	pthread_mutex_unlock(&vtep_list_mutex);
+}
+
 /**
  * Apply flow to E-Switch by sending Netlink message.
  *
@@ -3461,18 +3887,61 @@ struct pedit_parser {
 		struct rte_flow_error *error)
 {
 	struct priv *priv = dev->data->dev_private;
-	struct mlx5_flow_tcf_context *nl = priv->tcf_context;
+	struct mlx5_flow_tcf_context *tcf = priv->tcf_context;
 	struct mlx5_flow *dev_flow;
 	struct nlmsghdr *nlh;
+	int ret;
 
 	dev_flow = LIST_FIRST(&flow->dev_flows);
 	/* E-Switch flow can't be expanded. */
 	assert(!LIST_NEXT(dev_flow, next));
+	if (dev_flow->tcf.applied)
+		return 0;
 	nlh = dev_flow->tcf.nlh;
 	nlh->nlmsg_type = RTM_NEWTFILTER;
 	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
-	if (!flow_tcf_nl_ack(nl, nlh, 0, NULL, NULL))
+	if (dev_flow->tcf.tunnel) {
+		/*
+		 * Replace the interface index, target for
+		 * encapsulation, source for decapsulation.
+		 */
+		assert(!dev_flow->tcf.tunnel->ifindex_tun);
+		assert(dev_flow->tcf.tunnel->ifindex_ptr);
+		/* Create actual VTEP device when rule is being applied. */
+		dev_flow->tcf.tunnel->ifindex_tun
+			= flow_tcf_tunnel_vtep_create(tcf,
+					*dev_flow->tcf.tunnel->ifindex_ptr,
+					dev_flow, error);
+		DRV_LOG(INFO, "Replace ifindex: %u->%u",
+				*dev_flow->tcf.tunnel->ifindex_ptr,
+				dev_flow->tcf.tunnel->ifindex_tun);
+		if (!dev_flow->tcf.tunnel->ifindex_tun)
+			return -rte_errno;
+		dev_flow->tcf.tunnel->ifindex_org
+			= *dev_flow->tcf.tunnel->ifindex_ptr;
+		*dev_flow->tcf.tunnel->ifindex_ptr
+			= dev_flow->tcf.tunnel->ifindex_tun;
+	}
+	ret = flow_tcf_nl_ack(tcf, nlh, 0, NULL, NULL);
+	if (dev_flow->tcf.tunnel) {
+		DRV_LOG(INFO, "Restore ifindex: %u->%u",
+				*dev_flow->tcf.tunnel->ifindex_ptr,
+				dev_flow->tcf.tunnel->ifindex_org);
+		*dev_flow->tcf.tunnel->ifindex_ptr
+			= dev_flow->tcf.tunnel->ifindex_org;
+		dev_flow->tcf.tunnel->ifindex_org = 0;
+	}
+	if (!ret) {
+		dev_flow->tcf.applied = 1;
 		return 0;
+	}
+	DRV_LOG(WARNING, "netlink: failed to create TC rule (%d)", rte_errno);
+	if (dev_flow->tcf.tunnel && dev_flow->tcf.tunnel->ifindex_tun) {
+		flow_tcf_tunnel_vtep_delete(tcf,
+					    dev_flow->tcf.tunnel->ifindex_tun,
+					    dev_flow);
+		dev_flow->tcf.tunnel->ifindex_tun = 0;
+	}
 	return rte_flow_error_set(error, rte_errno,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 				  "netlink: failed to create TC flow rule");
@@ -3490,7 +3959,7 @@ struct pedit_parser {
 flow_tcf_remove(struct rte_eth_dev *dev, struct rte_flow *flow)
 {
 	struct priv *priv = dev->data->dev_private;
-	struct mlx5_flow_tcf_context *nl = priv->tcf_context;
+	struct mlx5_flow_tcf_context *tcf = priv->tcf_context;
 	struct mlx5_flow *dev_flow;
 	struct nlmsghdr *nlh;
 
@@ -3501,10 +3970,36 @@ struct pedit_parser {
 		return;
 	/* E-Switch flow can't be expanded. */
 	assert(!LIST_NEXT(dev_flow, next));
+	if (!dev_flow->tcf.applied)
+		return;
+	if (dev_flow->tcf.tunnel) {
+		/*
+		 * Replace the interface index, target for
+		 * encapsulation, source for decapsulation.
+		 */
+		assert(dev_flow->tcf.tunnel->ifindex_tun);
+		assert(dev_flow->tcf.tunnel->ifindex_ptr);
+		dev_flow->tcf.tunnel->ifindex_org
+			= *dev_flow->tcf.tunnel->ifindex_ptr;
+		*dev_flow->tcf.tunnel->ifindex_ptr
+			= dev_flow->tcf.tunnel->ifindex_tun;
+	}
 	nlh = dev_flow->tcf.nlh;
 	nlh->nlmsg_type = RTM_DELTFILTER;
 	nlh->nlmsg_flags = NLM_F_REQUEST;
-	flow_tcf_nl_ack(nl, nlh, 0, NULL, NULL);
+	flow_tcf_nl_ack(tcf, nlh, 0, NULL, NULL);
+	if (dev_flow->tcf.tunnel) {
+		*dev_flow->tcf.tunnel->ifindex_ptr
+			= dev_flow->tcf.tunnel->ifindex_org;
+		dev_flow->tcf.tunnel->ifindex_org = 0;
+		if (dev_flow->tcf.tunnel->ifindex_tun) {
+			flow_tcf_tunnel_vtep_delete(tcf,
+					dev_flow->tcf.tunnel->ifindex_tun,
+					dev_flow);
+			dev_flow->tcf.tunnel->ifindex_tun = 0;
+		}
+	}
+	dev_flow->tcf.applied = 0;
 }
 
 /**
-- 
1.8.3.1
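
Note on the encap port selection: the commit message says the source UDP
port is picked automatically from 30000-60000 and that VTEP names are the
"vmlx_" prefix plus the decimal port. The standalone sketch below only
illustrates that naming and the round-robin search with wraparound; it is
not the driver code. The macro values are assumptions copied from the
commit message, and struct used_ports, port_in_use(), vtep_name() and
vtep_next_port() are hypothetical helpers that stand in for the shared
VTEP list walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed values, taken from the commit message (not from mlx5 headers). */
#define VTEP_PFX "vmlx_"
#define VTEP_PORT_MIN 30000
#define VTEP_PORT_MAX 60000

/* Hypothetical stand-in for the shared VTEP list: a flat array of ports. */
struct used_ports {
	const uint16_t *ports;
	size_t n;
};

/* Return 1 if the port is already taken, 0 otherwise. */
static int
port_in_use(const struct used_ports *used, uint16_t port)
{
	size_t i;

	for (i = 0; i < used->n; i++)
		if (used->ports[i] == port)
			return 1;
	return 0;
}

/* Build the device name, e.g. "vmlx_4789"; returns the snprintf() result. */
static int
vtep_name(char *buf, size_t len, uint16_t port)
{
	return snprintf(buf, len, "%s%u", VTEP_PFX, (unsigned int)port);
}

/*
 * Advance *last round-robin through [VTEP_PORT_MIN, VTEP_PORT_MAX] and
 * return the first port not in use, or 0 when the whole range is taken.
 */
static uint16_t
vtep_next_port(uint16_t *last, const struct used_ports *used)
{
	unsigned int tries = VTEP_PORT_MAX - VTEP_PORT_MIN + 1;

	assert(last && used);
	while (tries--) {
		/* Wrap around when the cursor is outside or at the end. */
		if (*last < VTEP_PORT_MIN || *last >= VTEP_PORT_MAX)
			*last = VTEP_PORT_MIN;
		else
			(*last)++;
		if (!port_in_use(used, *last))
			return *last;
	}
	return 0;
}
```

In the patch itself the cursor is the static encap_port variable, the
"in use" check is a LIST_FOREACH over vtep_list_vxlan done under
vtep_list_mutex, and the loop additionally moves on to the next port when
device creation fails with EEXIST, since a port may be taken by a device
that is not in the PMD list.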