From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Minggang Li(Gavin)"
To: Dariusz Sosnowski, Bing Zhao, Suanming Mou
Subject: [PATCH 7/7] mlx5: add backward compatibility for RDMA monitor
Date: Mon, 23 Dec 2024 12:11:01 +0200
Message-ID: <20241223101101.677449-8-gavinl@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20241223101101.677449-1-gavinl@nvidia.com>
References: <20241223101101.677449-1-gavinl@nvidia.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
X-BeenThere: dev@dpdk.org
List-Id: DPDK patches and discussions

Fall back to the old way of updating port information if the kernel
driver does not support RDMA monitor.

Signed-off-by: Minggang Li(Gavin)
Acked-by: Viacheslav Ovsiienko
---
 doc/guides/rel_notes/release_24_11.rst  | 14 +++++
 drivers/common/mlx5/linux/mlx5_nl.c     | 73 +++++++++++++++++++++++++
 drivers/common/mlx5/version.map         |  1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c |  2 +-
 drivers/net/mlx5/linux/mlx5_os.c        | 27 +++++++--
 drivers/net/mlx5/mlx5.h                 |  1 +
 6 files changed, 111 insertions(+), 7 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 8486cd986f..567ac42663 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -288,6 +288,20 @@ New Features
   Added ability for a node to advertise and update multiple xstat counters,
   that can be retrieved using ``rte_graph_cluster_stats_get``.
 
+* **Updated NVIDIA mlx5 driver.**
+
+  Optimized port probing at large scale.
+  This feature enhances the efficiency of probing VFs/SFs on a large scale
+  by significantly reducing the probing time. To activate this feature,
+  set ``probe_opt_en`` to a non-zero value during device probing. It
+  leverages a capability from the RDMA driver, expected to be released in
+  the upcoming kernel version 6.12 or its equivalent in OFED 24.10,
+  specifically the RDMA monitor. For additional details on the limitations
+  of devargs, refer to "doc/guides/nics/mlx5.rst".
+
+  If there are lots of VFs/SFs to be probed by the application, e.g., 300
+  VFs/SFs, the option should be enabled to save probing time.
+
 
 Removed Items
 -------------
diff --git a/drivers/common/mlx5/linux/mlx5_nl.c b/drivers/common/mlx5/linux/mlx5_nl.c
index ce1c2a8e75..12f1a620f3 100644
--- a/drivers/common/mlx5/linux/mlx5_nl.c
+++ b/drivers/common/mlx5/linux/mlx5_nl.c
@@ -2152,3 +2152,76 @@ mlx5_nl_rdma_monitor_info_get(struct nlmsghdr *hdr, struct mlx5_nl_port_info *da
 error:
 	rte_errno = EINVAL;
 }
+
+static int
+mlx5_nl_rdma_monitor_cap_get_cb(struct nlmsghdr *hdr, void *arg)
+{
+	size_t off = NLMSG_HDRLEN;
+	uint8_t *cap = arg;
+
+	if (hdr->nlmsg_type != RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_SYS_GET))
+		goto error;
+
+	*cap = 0;
+	while (off < hdr->nlmsg_len) {
+		struct nlattr *na = (void *)((uintptr_t)hdr + off);
+		void *payload = (void *)((uintptr_t)na + NLA_HDRLEN);
+
+		if (na->nla_len > hdr->nlmsg_len - off)
+			goto error;
+		switch (na->nla_type) {
+		case RDMA_NLDEV_SYS_ATTR_MONITOR_MODE:
+			*cap = *(uint8_t *)payload;
+			return 0;
+		default:
+			break;
+		}
+		off += NLA_ALIGN(na->nla_len);
+	}
+
+	return 0;
+
+error:
+	return -EINVAL;
+}
+
+/**
+ * Get RDMA monitor support in driver.
+ *
+ *
+ * @param nl
+ *   Netlink socket of the RDMA kind (NETLINK_RDMA).
+ * @param[out] cap
+ *   Pointer to the RDMA monitor capability (0 when the attribute is absent).
+ * @return
+ *   0 on success, negative on error and rte_errno is set.
+ */
+int
+mlx5_nl_rdma_monitor_cap_get(int nl, uint8_t *cap)
+{
+	union {
+		struct nlmsghdr nh;
+		uint8_t buf[NLMSG_HDRLEN];
+	} req = {
+		.nh = {
+			.nlmsg_len = NLMSG_LENGTH(0),
+			.nlmsg_type = RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
+						       RDMA_NLDEV_CMD_SYS_GET),
+			.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
+		},
+	};
+	uint32_t sn = MLX5_NL_SN_GENERATE;
+	int ret;
+
+	ret = mlx5_nl_send(nl, &req.nh, sn);
+	if (ret < 0) {
+		rte_errno = -ret;
+		return ret;
+	}
+	ret = mlx5_nl_recv(nl, sn, mlx5_nl_rdma_monitor_cap_get_cb, cap);
+	if (ret < 0) {
+		rte_errno = -ret;
+		return ret;
+	}
+	return 0;
+}
diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map
index 5230576006..8301485839 100644
--- a/drivers/common/mlx5/version.map
+++ b/drivers/common/mlx5/version.map
@@ -148,6 +148,7 @@ INTERNAL {
 	mlx5_nl_vlan_vmwa_delete; # WINDOWS_NO_EXPORT
 	mlx5_nl_rdma_monitor_init; # WINDOWS_NO_EXPORT
 	mlx5_nl_rdma_monitor_info_get; # WINDOWS_NO_EXPORT
+	mlx5_nl_rdma_monitor_cap_get; # WINDOWS_NO_EXPORT
 
 	mlx5_os_umem_dereg;
 	mlx5_os_umem_reg;
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 5156d96b3a..6b2c25a7c2 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -736,7 +736,7 @@ mlx5_dev_interrupt_nl_cb(struct nlmsghdr *hdr, void *cb_arg)
 
 	if (mlx5_nl_parse_link_status_update(hdr, &if_index) < 0)
 		return;
-	if (sh->cdev->config.probe_opt && sh->cdev->dev_info.port_num > 1)
+	if (sh->cdev->config.probe_opt && sh->cdev->dev_info.port_num > 1 && !sh->rdma_monitor_supp)
 		mlx5_handle_port_info_update(&sh->cdev->dev_info, if_index, hdr->nlmsg_type);
 
 	for (i = 0; i < sh->max_port; i++) {
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 16b275c71e..d3fd77af58 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -3017,6 +3017,7 @@ mlx5_os_dev_shared_handler_install(struct mlx5_dev_ctx_shared *sh)
 {
 	struct ibv_context *ctx = sh->cdev->ctx;
 	int nlsk_fd;
+	uint8_t rdma_monitor_supp = 0;
 
 	sh->intr_handle = mlx5_os_interrupt_handler_create
 		(RTE_INTR_INSTANCE_F_SHARED, true,
@@ -3025,20 +3026,34 @@ mlx5_os_dev_shared_handler_install(struct mlx5_dev_ctx_shared *sh)
 		DRV_LOG(ERR, "Failed to allocate intr_handle.");
 		return;
 	}
-	if (sh->cdev->config.probe_opt && sh->cdev->dev_info.port_num > 1) {
+	if (sh->cdev->config.probe_opt &&
+	    sh->cdev->dev_info.port_num > 1 &&
+	    !sh->rdma_monitor_supp) {
 		nlsk_fd = mlx5_nl_rdma_monitor_init();
 		if (nlsk_fd < 0) {
 			DRV_LOG(ERR, "Failed to create a socket for RDMA Netlink events: %s",
 				rte_strerror(rte_errno));
 			return;
 		}
-		sh->intr_handle_ib = mlx5_os_interrupt_handler_create
-			(RTE_INTR_INSTANCE_F_SHARED, true,
-			 nlsk_fd, mlx5_dev_interrupt_handler_ib, sh);
-		if (sh->intr_handle_ib == NULL) {
-			DRV_LOG(ERR, "Fail to allocate intr_handle");
+		if (mlx5_nl_rdma_monitor_cap_get(nlsk_fd, &rdma_monitor_supp)) {
+			DRV_LOG(ERR, "Failed to query RDMA monitor support: %s",
+				rte_strerror(rte_errno));
+			close(nlsk_fd);
 			return;
 		}
+		sh->rdma_monitor_supp = rdma_monitor_supp;
+		if (sh->rdma_monitor_supp) {
+			sh->intr_handle_ib = mlx5_os_interrupt_handler_create
+				(RTE_INTR_INSTANCE_F_SHARED, true,
+				 nlsk_fd, mlx5_dev_interrupt_handler_ib, sh);
+			if (sh->intr_handle_ib == NULL) {
+				DRV_LOG(ERR, "Fail to allocate intr_handle");
+				close(nlsk_fd);
+				return;
+			}
+		} else {
+			close(nlsk_fd);
+		}
 	}
 	nlsk_fd = mlx5_nl_init(NETLINK_ROUTE, RTMGRP_LINK);
 	if (nlsk_fd < 0) {
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 688a7270ca..ab604042b9 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1545,6 +1545,7 @@ struct mlx5_dev_ctx_shared {
 	uint32_t lag_rx_port_affinity_en:1;
 	/* lag_rx_port_affinity is supported. */
 	uint32_t hws_max_log_bulk_sz:5;
+	uint32_t rdma_monitor_supp:1;
 	/* Log of minimal HWS counters created hard coded. */
 	uint32_t hws_max_nb_counters; /* Maximal number for HWS counters. */
 	uint32_t max_port; /* Maximal IB device port index. */
-- 
2.34.1
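
For readers following the capability query, the attribute walk done by
mlx5_nl_rdma_monitor_cap_get_cb() can be sketched outside DPDK as below.
This is a minimal, self-contained illustration and not the driver code:
struct attr_hdr, ATTR_HDRLEN, ATTR_ALIGN, ATTR_MONITOR_MODE (its value is
chosen arbitrarily) and monitor_cap_find() are hypothetical stand-ins for
struct nlattr, NLA_HDRLEN, NLA_ALIGN, RDMA_NLDEV_SYS_ATTR_MONITOR_MODE and
the callback. The sketch also rejects attributes shorter than their own
header so that the loop always advances; that extra check is an assumption
of the sketch and is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the netlink attribute layout (illustration only). */
#define ATTR_HDRLEN 4u
#define ATTR_ALIGN(len) (((len) + 3u) & ~3u)
#define ATTR_MONITOR_MODE 7u /* arbitrary value, not the kernel's */

struct attr_hdr {
	uint16_t len;  /* header plus payload, in bytes */
	uint16_t type; /* attribute identifier */
};

/*
 * Walk a buffer of length-prefixed attributes and report the monitor
 * capability byte. *cap stays 0 when the attribute is absent, which the
 * caller treats as "not supported". Returns 0 on success, -1 on a
 * malformed attribute.
 */
static int
monitor_cap_find(const uint8_t *buf, size_t buf_len, uint8_t *cap)
{
	size_t off = 0;

	assert(buf != NULL && cap != NULL);
	*cap = 0;
	while (off < buf_len && buf_len - off >= ATTR_HDRLEN) {
		struct attr_hdr ah;

		/* Copy the header out to avoid unaligned access. */
		memcpy(&ah, buf + off, sizeof(ah));
		/* Reject attributes that are too short or overrun the buffer. */
		if (ah.len < ATTR_HDRLEN || ah.len > buf_len - off)
			return -1;
		if (ah.type == ATTR_MONITOR_MODE) {
			if (ah.len < ATTR_HDRLEN + 1u)
				return -1; /* no payload byte to read */
			*cap = buf[off + ATTR_HDRLEN];
			return 0;
		}
		/* Attributes are padded to a 4-byte boundary. */
		off += ATTR_ALIGN(ah.len);
	}
	return 0;
}
```

A zero left in *cap corresponds to the case where the driver keeps updating
port information from the NETLINK_ROUTE link events; a non-zero value is the
case where the RDMA monitor interrupt handler is installed instead.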