From: "Minggang Li(Gavin)"
To: , , , , Dariusz Sosnowski , Bing Zhao , Suanming Mou
CC: ,
Subject: [PATCH V3 7/7] mlx5: add backward compatibility for RDMA monitor
Date: Tue, 29 Oct 2024 16:31:18 +0200
Message-ID: <20241029143118.875214-8-gavinl@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241029143118.875214-1-gavinl@nvidia.com>
References: <20241029134256.874767-8-gavinl@nvidia.com> <20241029143118.875214-1-gavinl@nvidia.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

Fall back to the old way of updating port information if the kernel
driver does not support the RDMA monitor.

Signed-off-by: Minggang Li(Gavin)
Acked-by: Viacheslav Ovsiienko
---
 doc/guides/rel_notes/release_24_11.rst  | 14 +++++
 drivers/common/mlx5/linux/mlx5_nl.c     | 73 +++++++++++++++++++++++++
 drivers/common/mlx5/version.map         |  1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c |  2 +-
 drivers/net/mlx5/linux/mlx5_os.c        | 27 +++++++--
 drivers/net/mlx5/mlx5.h                 |  1 +
 6 files changed, 111 insertions(+), 7 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index fa4822d928..bc868bb74a 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -247,6 +247,20 @@ New Features
   Added ability for node to advertise and update multiple xstat counters,
   that can be retrieved using ``rte_graph_cluster_stats_get``.
 
+* **Updated NVIDIA mlx5 driver.**
+
+  Optimized port probe in large scale.
+  This feature enhances the efficiency of probing VFs/SFs on a large scale
+  by significantly reducing the probing time. To activate this feature,
+  set ``probe_opt_en`` to a non-zero value during device probing. It
+  leverages a capability from the RDMA driver, expected to be released in
+  the upcoming kernel version 6.12 or its equivalent in OFED 24.10,
+  specifically the RDMA monitor. For additional details on the limitations
+  of devargs, refer to "doc/guides/nics/mlx5.rst".
+
+  If there are lots of VFs/SFs to be probed by the application, e.g., 300
+  VFs/SFs, the option should be enabled to save probing time.
+
 
 Removed Items
 -------------
diff --git a/drivers/common/mlx5/linux/mlx5_nl.c b/drivers/common/mlx5/linux/mlx5_nl.c
index ce1c2a8e75..12f1a620f3 100644
--- a/drivers/common/mlx5/linux/mlx5_nl.c
+++ b/drivers/common/mlx5/linux/mlx5_nl.c
@@ -2152,3 +2152,76 @@ mlx5_nl_rdma_monitor_info_get(struct nlmsghdr *hdr, struct mlx5_nl_port_info *da
 error:
 	rte_errno = EINVAL;
 }
+
+static int
+mlx5_nl_rdma_monitor_cap_get_cb(struct nlmsghdr *hdr, void *arg)
+{
+	size_t off = NLMSG_HDRLEN;
+	uint8_t *cap = arg;
+
+	if (hdr->nlmsg_type != RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_SYS_GET))
+		goto error;
+
+	*cap = 0;
+	while (off < hdr->nlmsg_len) {
+		struct nlattr *na = (void *)((uintptr_t)hdr + off);
+		void *payload = (void *)((uintptr_t)na + NLA_HDRLEN);
+
+		if (na->nla_len > hdr->nlmsg_len - off)
+			goto error;
+		switch (na->nla_type) {
+		case RDMA_NLDEV_SYS_ATTR_MONITOR_MODE:
+			*cap = *(uint8_t *)payload;
+			return 0;
+		default:
+			break;
+		}
+		off += NLA_ALIGN(na->nla_len);
+	}
+
+	return 0;
+
+error:
+	return -EINVAL;
+}
+
+/**
+ * Get RDMA monitor support in driver.
+ *
+ *
+ * @param nl
+ *   Netlink socket of the RDMA kind (NETLINK_RDMA).
+ * @param[out] cap
+ *   Set to the monitor mode reported by the kernel, 0 if the attribute is absent.
+ * @return
+ *   0 on success, negative on error and rte_errno is set.
+ */
+int
+mlx5_nl_rdma_monitor_cap_get(int nl, uint8_t *cap)
+{
+	union {
+		struct nlmsghdr nh;
+		uint8_t buf[NLMSG_HDRLEN];
+	} req = {
+		.nh = {
+			.nlmsg_len = NLMSG_LENGTH(0),
+			.nlmsg_type = RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
+						       RDMA_NLDEV_CMD_SYS_GET),
+			.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
+		},
+	};
+	uint32_t sn = MLX5_NL_SN_GENERATE;
+	int ret;
+
+	ret = mlx5_nl_send(nl, &req.nh, sn);
+	if (ret < 0) {
+		rte_errno = -ret;
+		return ret;
+	}
+	ret = mlx5_nl_recv(nl, sn, mlx5_nl_rdma_monitor_cap_get_cb, cap);
+	if (ret < 0) {
+		rte_errno = -ret;
+		return ret;
+	}
+	return 0;
+}
diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map
index 5230576006..8301485839 100644
--- a/drivers/common/mlx5/version.map
+++ b/drivers/common/mlx5/version.map
@@ -148,6 +148,7 @@ INTERNAL {
 	mlx5_nl_vlan_vmwa_delete; # WINDOWS_NO_EXPORT
 	mlx5_nl_rdma_monitor_init; # WINDOWS_NO_EXPORT
 	mlx5_nl_rdma_monitor_info_get; # WINDOWS_NO_EXPORT
+	mlx5_nl_rdma_monitor_cap_get; # WINDOWS_NO_EXPORT
 
 	mlx5_os_umem_dereg;
 	mlx5_os_umem_reg;
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 5156d96b3a..6b2c25a7c2 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -736,7 +736,7 @@ mlx5_dev_interrupt_nl_cb(struct nlmsghdr *hdr, void *cb_arg)
 
 	if (mlx5_nl_parse_link_status_update(hdr, &if_index) < 0)
 		return;
-	if (sh->cdev->config.probe_opt && sh->cdev->dev_info.port_num > 1)
+	if (sh->cdev->config.probe_opt && sh->cdev->dev_info.port_num > 1 && !sh->rdma_monitor_supp)
 		mlx5_handle_port_info_update(&sh->cdev->dev_info, if_index, hdr->nlmsg_type);
 
 	for (i = 0; i < sh->max_port; i++) {
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 16b275c71e..d3fd77af58 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -3017,6 +3017,7 @@ mlx5_os_dev_shared_handler_install(struct mlx5_dev_ctx_shared *sh)
 {
 	struct ibv_context *ctx = sh->cdev->ctx;
 	int nlsk_fd;
+	uint8_t rdma_monitor_supp = 0;
 
 	sh->intr_handle = mlx5_os_interrupt_handler_create
 		(RTE_INTR_INSTANCE_F_SHARED, true,
@@ -3025,20 +3026,34 @@ mlx5_os_dev_shared_handler_install(struct mlx5_dev_ctx_shared *sh)
 		DRV_LOG(ERR, "Failed to allocate intr_handle.");
 		return;
 	}
-	if (sh->cdev->config.probe_opt && sh->cdev->dev_info.port_num > 1) {
+	if (sh->cdev->config.probe_opt &&
+	    sh->cdev->dev_info.port_num > 1 &&
+	    !sh->rdma_monitor_supp) {
 		nlsk_fd = mlx5_nl_rdma_monitor_init();
 		if (nlsk_fd < 0) {
 			DRV_LOG(ERR, "Failed to create a socket for RDMA Netlink events: %s",
 				rte_strerror(rte_errno));
 			return;
 		}
-		sh->intr_handle_ib = mlx5_os_interrupt_handler_create
-			(RTE_INTR_INSTANCE_F_SHARED, true,
-			 nlsk_fd, mlx5_dev_interrupt_handler_ib, sh);
-		if (sh->intr_handle_ib == NULL) {
-			DRV_LOG(ERR, "Fail to allocate intr_handle");
+		if (mlx5_nl_rdma_monitor_cap_get(nlsk_fd, &rdma_monitor_supp)) {
+			DRV_LOG(ERR, "Failed to query RDMA monitor support: %s",
+				rte_strerror(rte_errno));
+			close(nlsk_fd);
 			return;
 		}
+		sh->rdma_monitor_supp = rdma_monitor_supp;
+		if (sh->rdma_monitor_supp) {
+			sh->intr_handle_ib = mlx5_os_interrupt_handler_create
+				(RTE_INTR_INSTANCE_F_SHARED, true,
+				 nlsk_fd, mlx5_dev_interrupt_handler_ib, sh);
+			if (sh->intr_handle_ib == NULL) {
+				DRV_LOG(ERR, "Failed to allocate intr_handle.");
+				close(nlsk_fd);
+				return;
+			}
+		} else {
+			close(nlsk_fd);
+		}
 	}
 	nlsk_fd = mlx5_nl_init(NETLINK_ROUTE, RTMGRP_LINK);
 	if (nlsk_fd < 0) {
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index adc21c272b..b6be4646ef 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1517,6 +1517,7 @@ struct mlx5_dev_ctx_shared {
 	uint32_t lag_rx_port_affinity_en:1;
 	/* lag_rx_port_affinity is supported. */
 	uint32_t hws_max_log_bulk_sz:5;
+	uint32_t rdma_monitor_supp:1;
 	/* Log of minimal HWS counters created hard coded. */
 	uint32_t hws_max_nb_counters; /* Maximal number for HWS counters. */
 	uint32_t max_port; /* Maximal IB device port index. */
-- 
2.34.1
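
Illustration only, not part of the patch: the attribute walk in
mlx5_nl_rdma_monitor_cap_get_cb() can be tried outside the driver with the
minimal sketch below. It assumes nothing from the kernel or DPDK headers:
struct sk_attr, SK_ALIGN, SK_ATTR_MONITOR_MODE (value 7 is made up),
sk_monitor_cap_parse() and sk_attr_put() are hypothetical stand-ins for
struct nlattr, NLA_ALIGN, RDMA_NLDEV_SYS_ATTR_MONITOR_MODE and the callback.
As in the patch, a missing attribute is not an error and leaves the capability
at 0, which is what selects the fallback path. Unlike the patch, the sketch
also rejects an attribute shorter than its own header, since a zero length
would keep the offset from advancing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel netlink attribute definitions. */
#define SK_ALIGN(len) (((size_t)(len) + 3u) & ~(size_t)3u)
#define SK_ATTR_MONITOR_MODE 7u /* made-up attribute type, not the real value */

struct sk_attr {
	uint16_t len;  /* header plus payload, in bytes, excluding padding */
	uint16_t type;
};

#define SK_ATTR_HDRLEN SK_ALIGN(sizeof(struct sk_attr))

/*
 * Walk the attributes stored in buf[0..len) and copy out the one-byte
 * payload of SK_ATTR_MONITOR_MODE.
 * Returns 0 if the buffer is well formed (*cap stays 0 when the attribute
 * is absent) and -1 if an attribute length is inconsistent.
 */
static int
sk_monitor_cap_parse(const uint8_t *buf, size_t len, uint8_t *cap)
{
	size_t off = 0;

	assert(buf != NULL && cap != NULL);
	*cap = 0;
	while (off < len) {
		struct sk_attr na;

		if (len - off < sizeof(na))
			return -1;
		memcpy(&na, buf + off, sizeof(na));
		/* Too short to hold its header, or longer than what is left. */
		if (na.len < SK_ATTR_HDRLEN || na.len > len - off)
			return -1;
		if (na.type == SK_ATTR_MONITOR_MODE) {
			if (na.len < SK_ATTR_HDRLEN + 1)
				return -1;
			*cap = buf[off + SK_ATTR_HDRLEN];
			return 0;
		}
		off += SK_ALIGN(na.len);
	}
	return 0;
}

/* Append one attribute at offset off and return the offset after padding. */
static size_t
sk_attr_put(uint8_t *buf, size_t off, uint16_t type,
	    const void *data, uint16_t dlen)
{
	struct sk_attr na = {
		.len = (uint16_t)(SK_ATTR_HDRLEN + dlen),
		.type = type,
	};

	memset(buf + off, 0, SK_ALIGN(na.len));
	memcpy(buf + off, &na, sizeof(na));
	memcpy(buf + off + SK_ATTR_HDRLEN, data, dlen);
	return off + SK_ALIGN(na.len);
}
```

A caller would build a buffer with sk_attr_put(), pass it to
sk_monitor_cap_parse(), and take the fallback branch whenever the reported
capability is 0, mirroring the !sh->rdma_monitor_supp checks in the patch.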