From: "Minggang Li(Gavin)" <gavinl@nvidia.com>
To: <matan@nvidia.com>, <viacheslavo@nvidia.com>, <orika@nvidia.com>, <thomas@monjalon.net>, Dariusz Sosnowski <dsosnowski@nvidia.com>, Bing Zhao <bingz@nvidia.com>, Suanming Mou <suanmingm@nvidia.com>
CC: <dev@dpdk.org>, <rasland@nvidia.com>
Subject: [PATCH 6/7] mlx5: use RDMA Netlink to update port information
Date: Mon, 23 Dec 2024 12:11:00 +0200
Message-ID: <20241223101101.677449-7-gavinl@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241223101101.677449-1-gavinl@nvidia.com>
References: <20241223101101.677449-1-gavinl@nvidia.com>
List-Id: DPDK patches and discussions <dev.dpdk.org>

Previously, port information was updated via route netlink whenever a
port was added or deleted. The events used were link up/down rather
than dedicated port add/delete events, which did not perform well.

To improve performance, use RDMA monitor events to track port addition
and deletion and update the corresponding port information.

Signed-off-by: Minggang Li(Gavin) <gavinl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst                |  6 ++
 drivers/common/mlx5/linux/mlx5_nl.c     | 74 ++++++++++++++++++-----
 drivers/common/mlx5/linux/mlx5_nl.h     | 28 +++++++++
 drivers/common/mlx5/version.map         |  2 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c | 79 +++++++++++++++++++++++++
 drivers/net/mlx5/linux/mlx5_os.c        | 20 +++++++
 drivers/net/mlx5/mlx5.h                 |  2 +
 7 files changed, 195 insertions(+), 16 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 3bc8495e7a..af521c5d9b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -1524,6 +1524,12 @@ for an additional list of options shared with other mlx5 drivers.
 
   By default, the PMD will set this value to 0.
 
+  .. note::
+
+     There is a race condition in probing port if probe_opt_en is set to 1.
+     Port probe may fail with wrong ifindex in cache while the interrupt
+     thread is updating the cache. Please try again if port probe failed.
+
 - ``lacp_by_user`` parameter [int]
 
   A nonzero value enables the control of LACP traffic by the user application.
diff --git a/drivers/common/mlx5/linux/mlx5_nl.c b/drivers/common/mlx5/linux/mlx5_nl.c
index e03db4f918..ce1c2a8e75 100644
--- a/drivers/common/mlx5/linux/mlx5_nl.c
+++ b/drivers/common/mlx5/linux/mlx5_nl.c
@@ -101,6 +101,7 @@
 #ifndef HAVE_RDMA_NL_GROUP_NOTIFY
 #define RDMA_NL_GROUP_NOTIFY 4
 #endif
+#define RDMA_NL_GROUP_NOTIFICATION (1 << (RDMA_NL_GROUP_NOTIFY - 1))
 
 /* These are normally found in linux/if_link.h. */
 #ifndef HAVE_IFLA_NUM_VF
@@ -176,22 +177,6 @@ struct mlx5_nl_mac_addr {
 	int mac_n; /**< Number of addresses in the array. */
 };
 
-#define MLX5_NL_CMD_GET_IB_NAME (1 << 0)
-#define MLX5_NL_CMD_GET_IB_INDEX (1 << 1)
-#define MLX5_NL_CMD_GET_NET_INDEX (1 << 2)
-#define MLX5_NL_CMD_GET_PORT_INDEX (1 << 3)
-#define MLX5_NL_CMD_GET_PORT_STATE (1 << 4)
-
-/** Data structure used by mlx5_nl_cmdget_cb(). */
-struct mlx5_nl_port_info {
-	const char *name; /**< IB device name (in). */
-	uint32_t flags; /**< found attribute flags (out). */
-	uint32_t ibindex; /**< IB device index (out). */
-	uint32_t ifindex; /**< Network interface index (out). */
-	uint32_t portnum; /**< IB device max port number (out). */
-	uint16_t state; /**< IB device port state (out). */
-};
-
 RTE_ATOMIC(uint32_t) atomic_sn;
 
 /* Generate Netlink sequence number. */
@@ -2110,3 +2095,60 @@ mlx5_nl_devlink_esw_multiport_get(int nlsk_fd, int family_id, const char *pci_ad
 		*enable ? "en" : "dis", pci_addr);
 	return ret;
 }
+
+int
+mlx5_nl_rdma_monitor_init(void)
+{
+	return mlx5_nl_init(NETLINK_RDMA, RDMA_NL_GROUP_NOTIFICATION);
+}
+
+void
+mlx5_nl_rdma_monitor_info_get(struct nlmsghdr *hdr, struct mlx5_nl_port_info *data)
+{
+	size_t off = NLMSG_HDRLEN;
+	uint8_t event_type = 0;
+
+	if (hdr->nlmsg_type != RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_MONITOR))
+		goto error;
+
+	while (off < hdr->nlmsg_len) {
+		struct nlattr *na = (void *)((uintptr_t)hdr + off);
+		void *payload = (void *)((uintptr_t)na + NLA_HDRLEN);
+
+		if (na->nla_len > hdr->nlmsg_len - off)
+			goto error;
+		switch (na->nla_type) {
+		case RDMA_NLDEV_ATTR_EVENT_TYPE:
+			event_type = *(uint8_t *)payload;
+			if (event_type == RDMA_NETDEV_ATTACH_EVENT) {
+				data->flags |= MLX5_NL_CMD_GET_EVENT_TYPE;
+				data->event_type = MLX5_NL_RDMA_NETDEV_ATTACH_EVENT;
+			} else if (event_type == RDMA_NETDEV_DETACH_EVENT) {
+				data->flags |= MLX5_NL_CMD_GET_EVENT_TYPE;
+				data->event_type = MLX5_NL_RDMA_NETDEV_DETACH_EVENT;
+			}
+			break;
+		case RDMA_NLDEV_ATTR_DEV_INDEX:
+			data->ibindex = *(uint32_t *)payload;
+			data->flags |= MLX5_NL_CMD_GET_IB_INDEX;
+			break;
+		case RDMA_NLDEV_ATTR_PORT_INDEX:
+			data->portnum = *(uint32_t *)payload;
+			data->flags |= MLX5_NL_CMD_GET_PORT_INDEX;
+			break;
+		case RDMA_NLDEV_ATTR_NDEV_INDEX:
+			data->ifindex = *(uint32_t *)payload;
+			data->flags |= MLX5_NL_CMD_GET_NET_INDEX;
+			break;
+		default:
+			DRV_LOG(DEBUG, "Unknown attribute[%d] found", na->nla_type);
+			break;
+		}
+		off += NLA_ALIGN(na->nla_len);
+	}
+
+	return;
+
+error:
+	rte_errno = EINVAL;
+}
diff --git a/drivers/common/mlx5/linux/mlx5_nl.h b/drivers/common/mlx5/linux/mlx5_nl.h
index 396ffc98ce..e32080fa63 100644
--- a/drivers/common/mlx5/linux/mlx5_nl.h
+++ b/drivers/common/mlx5/linux/mlx5_nl.h
@@ -32,6 +32,27 @@ struct mlx5_nl_vlan_vmwa_context {
 	struct mlx5_nl_vlan_dev vlan_dev[4096];
 };
 
+#define MLX5_NL_CMD_GET_IB_NAME (1 << 0)
+#define MLX5_NL_CMD_GET_IB_INDEX (1 << 1)
+#define MLX5_NL_CMD_GET_NET_INDEX (1 << 2)
+#define MLX5_NL_CMD_GET_PORT_INDEX (1 << 3)
+#define MLX5_NL_CMD_GET_PORT_STATE (1 << 4)
+#define MLX5_NL_CMD_GET_EVENT_TYPE (1 << 5)
+
+/** Data structure used by mlx5_nl_cmdget_cb(). */
+struct mlx5_nl_port_info {
+	const char *name; /**< IB device name (in). */
+	uint32_t flags; /**< found attribute flags (out). */
+	uint32_t ibindex; /**< IB device index (out). */
+	uint32_t ifindex; /**< Network interface index (out). */
+	uint32_t portnum; /**< IB device max port number (out). */
+	uint16_t state; /**< IB device port state (out). */
+	uint8_t event_type; /**< IB RDMA event type (out). */
+};
+
+#define MLX5_NL_RDMA_NETDEV_ATTACH_EVENT (1)
+#define MLX5_NL_RDMA_NETDEV_DETACH_EVENT (2)
+
 __rte_internal
 int mlx5_nl_init(int protocol, int groups);
 __rte_internal
@@ -89,4 +110,11 @@ __rte_internal
 int mlx5_nl_devlink_esw_multiport_get(int nlsk_fd, int family_id,
 				      const char *pci_addr, int *enable);
 
+__rte_internal
+int mlx5_nl_rdma_monitor_init(void);
+__rte_internal
+void mlx5_nl_rdma_monitor_info_get(struct nlmsghdr *hdr, struct mlx5_nl_port_info *data);
+__rte_internal
+int mlx5_nl_rdma_monitor_cap_get(int nl, uint8_t *cap);
+
 #endif /* RTE_PMD_MLX5_NL_H_ */
diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map
index a2f72ef46a..5230576006 100644
--- a/drivers/common/mlx5/version.map
+++ b/drivers/common/mlx5/version.map
@@ -146,6 +146,8 @@ INTERNAL {
 	mlx5_nl_vf_mac_addr_modify; # WINDOWS_NO_EXPORT
 	mlx5_nl_vlan_vmwa_create; # WINDOWS_NO_EXPORT
 	mlx5_nl_vlan_vmwa_delete; # WINDOWS_NO_EXPORT
+	mlx5_nl_rdma_monitor_init; # WINDOWS_NO_EXPORT
+	mlx5_nl_rdma_monitor_info_get; # WINDOWS_NO_EXPORT
 
 	mlx5_os_umem_dereg;
 	mlx5_os_umem_reg;
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 88d3c57c6e..5156d96b3a 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -894,6 +894,85 @@ mlx5_dev_interrupt_handler_devx(void *cb_arg)
 #endif /* HAVE_IBV_DEVX_ASYNC */
 }
 
+static void
+mlx5_dev_interrupt_ib_cb(struct nlmsghdr *hdr, void *cb_arg)
+{
+	mlx5_nl_rdma_monitor_info_get(hdr, (struct mlx5_nl_port_info *)cb_arg);
+}
+
+void
+mlx5_dev_interrupt_handler_ib(void *arg)
+{
+	struct mlx5_dev_ctx_shared *sh = arg;
+	struct mlx5_nl_port_info data = {
+		.flags = 0,
+		.name = "",
+		.ifindex = 0,
+		.ibindex = 0,
+		.portnum = 0,
+	};
+	int nlsk_fd = rte_intr_fd_get(sh->intr_handle_ib);
+	struct mlx5_dev_info *dev_info;
+	uint32_t i;
+
+	dev_info = &sh->cdev->dev_info;
+	DRV_LOG(DEBUG, "IB device %s received RDMA monitor netlink event", dev_info->ibname);
+	if (dev_info->port_num <= 1 || dev_info->port_info == NULL)
+		return;
+
+	if (nlsk_fd < 0)
+		return;
+
+	if (mlx5_nl_read_events(nlsk_fd, mlx5_dev_interrupt_ib_cb, &data) < 0)
+		DRV_LOG(ERR, "Failed to process Netlink events: %s",
+			rte_strerror(rte_errno));
+
+	if (!(data.flags & MLX5_NL_CMD_GET_EVENT_TYPE) ||
+	    !(data.flags & MLX5_NL_CMD_GET_PORT_INDEX) ||
+	    !(data.flags & MLX5_NL_CMD_GET_IB_INDEX))
+		return;
+
+	if (data.ibindex != dev_info->ibindex)
+		return;
+
+	if (data.event_type != MLX5_NL_RDMA_NETDEV_ATTACH_EVENT &&
+	    data.event_type != MLX5_NL_RDMA_NETDEV_DETACH_EVENT)
+		return;
+
+	if (data.event_type == MLX5_NL_RDMA_NETDEV_ATTACH_EVENT &&
+	    !(data.flags & MLX5_NL_CMD_GET_NET_INDEX))
+		return;
+
+	DRV_LOG(DEBUG, "Event info: type %d, ibindex %d, ifindex %d, portnum %d,",
+		data.event_type, data.ibindex, data.ifindex, data.portnum);
+
+	/* Changes found in number of SF/VF ports. All information is likely unreliable. */
+	if (data.portnum > dev_info->port_num) {
+		DRV_LOG(ERR, "Port[%d] exceeds maximum[%d]", data.portnum, dev_info->port_num);
+		goto flush_all;
+	}
+	if (data.event_type == MLX5_NL_RDMA_NETDEV_ATTACH_EVENT) {
+		if (!dev_info->port_info[data.portnum].ifindex) {
+			dev_info->port_info[data.portnum].ifindex = data.ifindex;
+			dev_info->port_info[data.portnum].valid = 1;
+		} else {
+			DRV_LOG(WARNING, "Duplicate RDMA event for port[%d] ifindex[%d]",
+				data.portnum, data.ifindex);
+			if (data.ifindex != dev_info->port_info[data.portnum].ifindex)
+				goto flush_all;
+		}
+	} else if (data.event_type == MLX5_NL_RDMA_NETDEV_DETACH_EVENT) {
+		memset(dev_info->port_info + data.portnum, 0, sizeof(struct mlx5_port_nl_info));
+	}
+	return;
+
+flush_all:
+	for (i = 1; i <= dev_info->port_num; i++) {
+		dev_info->port_info[i].ifindex = 0;
+		dev_info->port_info[i].valid = 0;
+	}
+}
+
 /**
  * DPDK callback to bring the link DOWN.
  *
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 4537ca0466..16b275c71e 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -3025,6 +3025,21 @@ mlx5_os_dev_shared_handler_install(struct mlx5_dev_ctx_shared *sh)
 		DRV_LOG(ERR, "Failed to allocate intr_handle.");
 		return;
 	}
+	if (sh->cdev->config.probe_opt && sh->cdev->dev_info.port_num > 1) {
+		nlsk_fd = mlx5_nl_rdma_monitor_init();
+		if (nlsk_fd < 0) {
+			DRV_LOG(ERR, "Failed to create a socket for RDMA Netlink events: %s",
+				rte_strerror(rte_errno));
+			return;
+		}
+		sh->intr_handle_ib = mlx5_os_interrupt_handler_create
+			(RTE_INTR_INSTANCE_F_SHARED, true,
+			 nlsk_fd, mlx5_dev_interrupt_handler_ib, sh);
+		if (sh->intr_handle_ib == NULL) {
+			DRV_LOG(ERR, "Fail to allocate intr_handle");
+			return;
+		}
+	}
 	nlsk_fd = mlx5_nl_init(NETLINK_ROUTE, RTMGRP_LINK);
 	if (nlsk_fd < 0) {
 		DRV_LOG(ERR, "Failed to create a socket for Netlink events: %s",
@@ -3086,6 +3101,11 @@ mlx5_os_dev_shared_handler_uninstall(struct mlx5_dev_ctx_shared *sh)
 	if (sh->devx_comp)
 		mlx5_glue->devx_destroy_cmd_comp(sh->devx_comp);
 #endif
+	fd = rte_intr_fd_get(sh->intr_handle_ib);
+	mlx5_os_interrupt_handler_destroy(sh->intr_handle_ib,
+					  mlx5_dev_interrupt_handler_ib, sh);
+	if (fd >= 0)
+		close(fd);
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 89d277b523..688a7270ca 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1602,6 +1602,7 @@ struct mlx5_dev_ctx_shared {
 	struct rte_intr_handle *intr_handle; /* Interrupt handler for device. */
 	struct rte_intr_handle *intr_handle_devx; /* DEVX interrupt handler. */
 	struct rte_intr_handle *intr_handle_nl; /* Netlink interrupt handler. */
+	struct rte_intr_handle *intr_handle_ib; /* Interrupt handler for IB device. */
 	void *devx_comp; /* DEVX async comp obj. */
 	struct mlx5_devx_obj *tis[16]; /* TIS object. */
 	struct mlx5_devx_obj *td; /* Transport domain. */
@@ -2302,6 +2303,7 @@ int mlx5_dev_set_flow_ctrl(struct rte_eth_dev *dev,
 void mlx5_dev_interrupt_handler(void *arg);
 void mlx5_dev_interrupt_handler_devx(void *arg);
 void mlx5_dev_interrupt_handler_nl(void *arg);
+void mlx5_dev_interrupt_handler_ib(void *arg);
 int mlx5_set_link_down(struct rte_eth_dev *dev);
 int mlx5_set_link_up(struct rte_eth_dev *dev);
 int mlx5_is_removed(struct rte_eth_dev *dev);
-- 
2.34.1
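
Note on the attribute walk in mlx5_nl_rdma_monitor_info_get(): the loop can be illustrated in isolation. What follows is a minimal standalone sketch, not code from this patch. struct attr_hdr, the ATTR_* types, the GOT_* flags, struct port_event, parse_attrs() and put_u32_attr() are all hypothetical stand-ins; attr_hdr only mirrors the 4-byte layout of struct nlattr. As an assumption about what a defensive reader might want, the sketch also rejects an attribute whose length is smaller than its own header, so a zero-length attribute cannot stall the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct nlattr from linux/netlink.h (same 4-byte layout). */
struct attr_hdr {
	uint16_t len;  /* Header plus payload, excluding padding. */
	uint16_t type;
};

#define ATTR_HDRLEN sizeof(struct attr_hdr)
#define ATTR_ALIGN(n) (((size_t)(n) + 3) & ~(size_t)3)

/* Hypothetical attribute types; not the real RDMA_NLDEV_ATTR_* values. */
enum { ATTR_DEV_INDEX = 1, ATTR_PORT_INDEX = 2, ATTR_NDEV_INDEX = 3 };

#define GOT_DEV_INDEX  (1u << 0)
#define GOT_PORT_INDEX (1u << 1)
#define GOT_NDEV_INDEX (1u << 2)

struct port_event {
	uint32_t flags;   /* Which fields below were found. */
	uint32_t ibindex;
	uint32_t portnum;
	uint32_t ifindex;
};

/* Walk a buffer of attributes; return 0 on success, -1 if malformed. */
static int
parse_attrs(const uint8_t *buf, size_t len, struct port_event *ev)
{
	size_t off = 0;

	while (off < len) {
		struct attr_hdr hdr;
		uint32_t *dst = NULL;
		uint32_t flag = 0;

		if (len - off < ATTR_HDRLEN)
			return -1; /* Not enough room for a header. */
		memcpy(&hdr, buf + off, ATTR_HDRLEN);
		if (hdr.len < ATTR_HDRLEN || hdr.len > len - off)
			return -1; /* Too short to advance, or overruns. */
		switch (hdr.type) {
		case ATTR_DEV_INDEX:
			dst = &ev->ibindex;
			flag = GOT_DEV_INDEX;
			break;
		case ATTR_PORT_INDEX:
			dst = &ev->portnum;
			flag = GOT_PORT_INDEX;
			break;
		case ATTR_NDEV_INDEX:
			dst = &ev->ifindex;
			flag = GOT_NDEV_INDEX;
			break;
		default:
			break; /* Unknown attributes are skipped. */
		}
		if (dst != NULL) {
			if (hdr.len - ATTR_HDRLEN < sizeof(*dst))
				return -1; /* Payload too small for a u32. */
			memcpy(dst, buf + off + ATTR_HDRLEN, sizeof(*dst));
			ev->flags |= flag;
		}
		off += ATTR_ALIGN(hdr.len);
	}
	return 0;
}

/* Test helper: append one u32 attribute at off, return the next offset. */
static size_t
put_u32_attr(uint8_t *buf, size_t off, uint16_t type, uint32_t val)
{
	struct attr_hdr hdr = {
		.len = (uint16_t)(ATTR_HDRLEN + sizeof(val)),
		.type = type,
	};

	assert(buf != NULL);
	memcpy(buf + off, &hdr, ATTR_HDRLEN);
	memcpy(buf + off + ATTR_HDRLEN, &val, sizeof(val));
	return off + ATTR_ALIGN(hdr.len);
}
```

parse_attrs() sets one flag per recognised attribute, so a caller can confirm that every required field arrived before acting on the event, in the same way the interrupt handler in this patch checks data.flags.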