From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Baum
CC: Matan Azrad, Thomas Monjalon, Michael Baum
Date: Thu, 7 Oct 2021 01:03:41 +0300
Message-ID: <20211006220350.2357487-10-michaelba@nvidia.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20211006220350.2357487-1-michaelba@nvidia.com>
References: <20210930172822.1949969-1-michaelba@nvidia.com> <20211006220350.2357487-1-michaelba@nvidia.com>
Subject: [dpdk-dev] [PATCH v2 09/18] common/mlx5: add ROCE disable in context device creation
List-Id: DPDK patches and discussions

Add option to get IB device after disabling RoCE. It is relevant if there is vDPA class in device arguments list. Use common device context in vDPA driver and remove the ctx field from its private structure.

Signed-off-by: Michael Baum Acked-by: Matan Azrad --- drivers/common/mlx5/linux/mlx5_common_os.c | 144 ++++++++++++++++- drivers/common/mlx5/linux/mlx5_common_os.h | 7 - drivers/common/mlx5/linux/mlx5_common_verbs.c | 19 --- drivers/common/mlx5/linux/mlx5_nl.c | 2 +- drivers/common/mlx5/linux/mlx5_nl.h | 6 +- drivers/common/mlx5/mlx5_common.h | 1 - drivers/common/mlx5/mlx5_common_defs.h | 3 + drivers/common/mlx5/version.map | 6 - drivers/vdpa/mlx5/mlx5_vdpa.c | 147 ++---------------- drivers/vdpa/mlx5/mlx5_vdpa.h | 2 +- drivers/vdpa/mlx5/mlx5_vdpa_event.c | 17 +- drivers/vdpa/mlx5/mlx5_vdpa_lm.c | 4 +- drivers/vdpa/mlx5/mlx5_vdpa_mem.c | 7 +- drivers/vdpa/mlx5/mlx5_vdpa_steer.c | 11 +- drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 13 +- 15 files changed, 188 insertions(+), 201 deletions(-) diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c index 1589212172..341822cf71 100644 --- a/drivers/common/mlx5/linux/mlx5_common_os.c +++ b/drivers/common/mlx5/linux/mlx5_common_os.c @@ -13,9 +13,13 @@ #include #include +#include +#include #include "mlx5_common.h" +#include "mlx5_nl.h" #include "mlx5_common_log.h" +#include "mlx5_common_private.h" #include "mlx5_common_defs.h" #include "mlx5_common_os.h" #include "mlx5_glue.h" @@ -402,7 +406,7 @@ mlx5_glue_constructor(void) mlx5_glue = NULL; } -struct ibv_device * +static struct ibv_device * mlx5_os_get_ibv_device(const struct rte_pci_addr *addr) { int n; @@ -435,6 +439,139 @@ mlx5_os_get_ibv_device(const struct rte_pci_addr *addr) return ibv_match; } +/* Try to disable ROCE by Netlink\Devlink. 
*/ +static int +mlx5_nl_roce_disable(const char *addr) +{ + int nlsk_fd = mlx5_nl_init(NETLINK_GENERIC); + int devlink_id; + int enable; + int ret; + + if (nlsk_fd < 0) + return nlsk_fd; + devlink_id = mlx5_nl_devlink_family_id_get(nlsk_fd); + if (devlink_id < 0) { + ret = devlink_id; + DRV_LOG(DEBUG, + "Failed to get devlink id for ROCE operations by Netlink."); + goto close; + } + ret = mlx5_nl_enable_roce_get(nlsk_fd, devlink_id, addr, &enable); + if (ret) { + DRV_LOG(DEBUG, "Failed to get ROCE enable by Netlink: %d.", + ret); + goto close; + } else if (!enable) { + DRV_LOG(INFO, "ROCE has already disabled(Netlink)."); + goto close; + } + ret = mlx5_nl_enable_roce_set(nlsk_fd, devlink_id, addr, 0); + if (ret) + DRV_LOG(DEBUG, "Failed to disable ROCE by Netlink: %d.", ret); + else + DRV_LOG(INFO, "ROCE is disabled by Netlink successfully."); +close: + close(nlsk_fd); + return ret; +} + +/* Try to disable ROCE by sysfs. */ +static int +mlx5_sys_roce_disable(const char *addr) +{ + FILE *file_o; + int enable; + int ret; + + MKSTR(file_p, "/sys/bus/pci/devices/%s/roce_enable", addr); + file_o = fopen(file_p, "rb"); + if (!file_o) { + rte_errno = ENOTSUP; + return -ENOTSUP; + } + ret = fscanf(file_o, "%d", &enable); + if (ret != 1) { + rte_errno = EINVAL; + ret = EINVAL; + goto close; + } else if (!enable) { + ret = 0; + DRV_LOG(INFO, "ROCE has already disabled(sysfs)."); + goto close; + } + fclose(file_o); + file_o = fopen(file_p, "wb"); + if (!file_o) { + rte_errno = ENOTSUP; + return -ENOTSUP; + } + fprintf(file_o, "0\n"); + ret = 0; +close: + if (ret) + DRV_LOG(DEBUG, "Failed to disable ROCE by sysfs: %d.", ret); + else + DRV_LOG(INFO, "ROCE is disabled by sysfs successfully."); + fclose(file_o); + return ret; +} + +static int +mlx5_roce_disable(const struct rte_device *dev) +{ + char pci_addr[PCI_PRI_STR_SIZE] = { 0 }; + + if (mlx5_dev_to_pci_str(dev, pci_addr, sizeof(pci_addr)) < 0) + return -rte_errno; + /* Firstly try to disable ROCE by Netlink and fallback to 
sysfs. */ + if (mlx5_nl_roce_disable(pci_addr) != 0 && + mlx5_sys_roce_disable(pci_addr) != 0) + return -rte_errno; + return 0; +} + +static struct ibv_device * +mlx5_os_get_ibv_dev(const struct rte_device *dev) +{ + struct ibv_device *ibv; + + if (mlx5_dev_is_pci(dev)) + ibv = mlx5_os_get_ibv_device(&RTE_DEV_TO_PCI_CONST(dev)->addr); + else + ibv = mlx5_get_aux_ibv_device(RTE_DEV_TO_AUXILIARY_CONST(dev)); + if (ibv == NULL) { + rte_errno = ENODEV; + DRV_LOG(ERR, "Verbs device not found: %s", dev->name); + } + return ibv; +} + +static struct ibv_device * +mlx5_vdpa_get_ibv_dev(const struct rte_device *dev) +{ + struct ibv_device *ibv; + int retry; + + if (mlx5_roce_disable(dev) != 0) { + DRV_LOG(WARNING, "Failed to disable ROCE for \"%s\".", + dev->name); + return NULL; + } + /* Wait for the IB device to appear again after reload. */ + for (retry = MLX5_VDPA_MAX_RETRIES; retry > 0; --retry) { + ibv = mlx5_os_get_ibv_dev(dev); + if (ibv != NULL) + return ibv; + usleep(MLX5_VDPA_USEC); + } + DRV_LOG(ERR, + "Cannot get IB device after disabling RoCE for \"%s\", retries exceed %d.", + dev->name, MLX5_VDPA_MAX_RETRIES); + rte_errno = EAGAIN; + return NULL; +} + static int mlx5_config_doorbell_mapping_env(int dbnc) { @@ -483,7 +620,10 @@ mlx5_os_open_device(struct mlx5_common_device *cdev, uint32_t classes) struct ibv_context *ctx = NULL; int dbmap_env; - ibv = mlx5_os_get_ibv_dev(cdev->dev); + if (classes & MLX5_CLASS_VDPA) + ibv = mlx5_vdpa_get_ibv_dev(cdev->dev); + else + ibv = mlx5_os_get_ibv_dev(cdev->dev); if (!ibv) return -rte_errno; DRV_LOG(INFO, "Dev information matches for device \"%s\".", ibv->name); diff --git a/drivers/common/mlx5/linux/mlx5_common_os.h b/drivers/common/mlx5/linux/mlx5_common_os.h index 05c8ae1ba5..0e605c3a9e 100644 --- a/drivers/common/mlx5/linux/mlx5_common_os.h +++ b/drivers/common/mlx5/linux/mlx5_common_os.h @@ -289,13 +289,6 @@ mlx5_os_free(void *addr) free(addr); } -struct ibv_device * -mlx5_os_get_ibv_device(const struct rte_pci_addr 
*addr); - -__rte_internal -struct ibv_device * -mlx5_os_get_ibv_dev(const struct rte_device *dev); - void mlx5_set_context_attr(struct rte_device *dev, struct ibv_context *ctx); diff --git a/drivers/common/mlx5/linux/mlx5_common_verbs.c b/drivers/common/mlx5/linux/mlx5_common_verbs.c index e5a1244867..519cb8d056 100644 --- a/drivers/common/mlx5/linux/mlx5_common_verbs.c +++ b/drivers/common/mlx5/linux/mlx5_common_verbs.c @@ -11,35 +11,16 @@ #include #include -#include #include -#include #include "mlx5_common_utils.h" #include "mlx5_common_log.h" -#include "mlx5_common_private.h" #include "mlx5_autoconf.h" #include #include #include #include -struct ibv_device * -mlx5_os_get_ibv_dev(const struct rte_device *dev) -{ - struct ibv_device *ibv; - - if (mlx5_dev_is_pci(dev)) - ibv = mlx5_os_get_ibv_device(&RTE_DEV_TO_PCI_CONST(dev)->addr); - else - ibv = mlx5_get_aux_ibv_device(RTE_DEV_TO_AUXILIARY_CONST(dev)); - if (ibv == NULL) { - rte_errno = ENODEV; - DRV_LOG(ERR, "Verbs device not found: %s", dev->name); - } - return ibv; -} - /** * Verbs callback to allocate a memory. This function should allocate the space * according to the size provided residing inside a huge page. diff --git a/drivers/common/mlx5/linux/mlx5_nl.c b/drivers/common/mlx5/linux/mlx5_nl.c index 9120a697fd..530d491b66 100644 --- a/drivers/common/mlx5/linux/mlx5_nl.c +++ b/drivers/common/mlx5/linux/mlx5_nl.c @@ -1700,7 +1700,7 @@ mlx5_nl_enable_roce_get(int nlsk_fd, int family_id, const char *pci_addr, * @return * 0 on success, a negative errno value otherwise and rte_errno is set. 
*/ -int +static int mlx5_nl_driver_reload(int nlsk_fd, int family_id, const char *pci_addr) { struct nlmsghdr *nlh; diff --git a/drivers/common/mlx5/linux/mlx5_nl.h b/drivers/common/mlx5/linux/mlx5_nl.h index 15129ffdc8..202849f52a 100644 --- a/drivers/common/mlx5/linux/mlx5_nl.h +++ b/drivers/common/mlx5/linux/mlx5_nl.h @@ -66,14 +66,10 @@ void mlx5_nl_vlan_vmwa_delete(struct mlx5_nl_vlan_vmwa_context *vmwa, __rte_internal uint32_t mlx5_nl_vlan_vmwa_create(struct mlx5_nl_vlan_vmwa_context *vmwa, uint32_t ifindex, uint16_t tag); -__rte_internal + int mlx5_nl_devlink_family_id_get(int nlsk_fd); -__rte_internal int mlx5_nl_enable_roce_get(int nlsk_fd, int family_id, const char *pci_addr, int *enable); -__rte_internal -int mlx5_nl_driver_reload(int nlsk_fd, int family_id, const char *pci_addr); -__rte_internal int mlx5_nl_enable_roce_set(int nlsk_fd, int family_id, const char *pci_addr, int enable); diff --git a/drivers/common/mlx5/mlx5_common.h b/drivers/common/mlx5/mlx5_common.h index d7d9e43a4d..066860045a 100644 --- a/drivers/common/mlx5/mlx5_common.h +++ b/drivers/common/mlx5/mlx5_common.h @@ -218,7 +218,6 @@ check_cqe(volatile struct mlx5_cqe *cqe, const uint16_t cqes_n, * - 0 on success. * - Negative value and rte_errno is set otherwise. 
*/ -__rte_internal int mlx5_dev_to_pci_str(const struct rte_device *dev, char *addr, size_t size); /* diff --git a/drivers/common/mlx5/mlx5_common_defs.h b/drivers/common/mlx5/mlx5_common_defs.h index 6fd30f2c97..8f43b8e8ad 100644 --- a/drivers/common/mlx5/mlx5_common_defs.h +++ b/drivers/common/mlx5/mlx5_common_defs.h @@ -39,4 +39,7 @@ #define MLX5_TXDB_NCACHED 1 #define MLX5_TXDB_HEURISTIC 2 +#define MLX5_VDPA_MAX_RETRIES 20 +#define MLX5_VDPA_USEC 1000 + #endif /* RTE_PMD_MLX5_COMMON_DEFS_H_ */ diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map index 9d17366d19..24925fc4e4 100644 --- a/drivers/common/mlx5/version.map +++ b/drivers/common/mlx5/version.map @@ -15,7 +15,6 @@ INTERNAL { mlx5_create_mr_ext; mlx5_dev_is_pci; - mlx5_dev_to_pci_str; mlx5_devx_alloc_uar; # WINDOWS_NO_EXPORT @@ -123,10 +122,6 @@ INTERNAL { mlx5_mr_release_cache; mlx5_nl_allmulti; # WINDOWS_NO_EXPORT - mlx5_nl_devlink_family_id_get; # WINDOWS_NO_EXPORT - mlx5_nl_driver_reload; # WINDOWS_NO_EXPORT - mlx5_nl_enable_roce_get; # WINDOWS_NO_EXPORT - mlx5_nl_enable_roce_set; # WINDOWS_NO_EXPORT mlx5_nl_ifindex; # WINDOWS_NO_EXPORT mlx5_nl_init; # WINDOWS_NO_EXPORT mlx5_nl_mac_addr_add; # WINDOWS_NO_EXPORT @@ -143,7 +138,6 @@ INTERNAL { mlx5_os_alloc_pd; mlx5_os_dealloc_pd; mlx5_os_dereg_mr; - mlx5_os_get_ibv_dev; # WINDOWS_NO_EXPORT mlx5_os_reg_mr; mlx5_os_umem_dereg; mlx5_os_umem_reg; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c index d7ef303cfe..2468202ceb 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c @@ -16,6 +16,7 @@ #include #include +#include #include #include #include @@ -42,8 +43,6 @@ (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU) | \ (1ULL << VHOST_USER_PROTOCOL_F_STATUS)) -#define MLX5_VDPA_MAX_RETRIES 20 -#define MLX5_VDPA_USEC 1000 #define MLX5_VDPA_DEFAULT_NO_TRAFFIC_MAX 16LLU TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list = @@ -193,7 +192,7 @@ static int mlx5_vdpa_pd_create(struct 
mlx5_vdpa_priv *priv) { #ifdef HAVE_IBV_FLOW_DV_SUPPORT - priv->pd = mlx5_glue->alloc_pd(priv->ctx); + priv->pd = mlx5_glue->alloc_pd(priv->cdev->ctx); if (priv->pd == NULL) { DRV_LOG(ERR, "Failed to allocate PD."); return errno ? -errno : -ENOMEM; @@ -238,8 +237,9 @@ mlx5_vdpa_mtu_set(struct mlx5_vdpa_priv *priv) DRV_LOG(DEBUG, "Vhost MTU is 0."); return ret; } - ret = mlx5_get_ifname_sysfs(priv->ctx->device->ibdev_path, - request.ifr_name); + ret = mlx5_get_ifname_sysfs + (mlx5_os_get_ctx_device_name(priv->cdev->ctx), + request.ifr_name); if (ret) { DRV_LOG(DEBUG, "Cannot get kernel IF name - %d.", ret); return ret; @@ -343,7 +343,7 @@ mlx5_vdpa_get_device_fd(int vid) DRV_LOG(ERR, "Invalid vDPA device: %s.", vdev->device->name); return -EINVAL; } - return priv->ctx->cmd_fd; + return ((struct ibv_context *)priv->cdev->ctx)->cmd_fd; } static int @@ -472,98 +472,6 @@ static struct rte_vdpa_dev_ops mlx5_vdpa_ops = { .reset_stats = mlx5_vdpa_reset_stats, }; -/* Try to disable ROCE by Netlink\Devlink. */ -static int -mlx5_vdpa_nl_roce_disable(const char *addr) -{ - int nlsk_fd = mlx5_nl_init(NETLINK_GENERIC); - int devlink_id; - int enable; - int ret; - - if (nlsk_fd < 0) - return nlsk_fd; - devlink_id = mlx5_nl_devlink_family_id_get(nlsk_fd); - if (devlink_id < 0) { - ret = devlink_id; - DRV_LOG(DEBUG, "Failed to get devlink id for ROCE operations by" - " Netlink."); - goto close; - } - ret = mlx5_nl_enable_roce_get(nlsk_fd, devlink_id, addr, &enable); - if (ret) { - DRV_LOG(DEBUG, "Failed to get ROCE enable by Netlink: %d.", - ret); - goto close; - } else if (!enable) { - DRV_LOG(INFO, "ROCE has already disabled(Netlink)."); - goto close; - } - ret = mlx5_nl_enable_roce_set(nlsk_fd, devlink_id, addr, 0); - if (ret) - DRV_LOG(DEBUG, "Failed to disable ROCE by Netlink: %d.", ret); - else - DRV_LOG(INFO, "ROCE is disabled by Netlink successfully."); -close: - close(nlsk_fd); - return ret; -} - -/* Try to disable ROCE by sysfs. 
*/ -static int -mlx5_vdpa_sys_roce_disable(const char *addr) -{ - FILE *file_o; - int enable; - int ret; - - MKSTR(file_p, "/sys/bus/pci/devices/%s/roce_enable", addr); - file_o = fopen(file_p, "rb"); - if (!file_o) { - rte_errno = ENOTSUP; - return -ENOTSUP; - } - ret = fscanf(file_o, "%d", &enable); - if (ret != 1) { - rte_errno = EINVAL; - ret = EINVAL; - goto close; - } else if (!enable) { - ret = 0; - DRV_LOG(INFO, "ROCE has already disabled(sysfs)."); - goto close; - } - fclose(file_o); - file_o = fopen(file_p, "wb"); - if (!file_o) { - rte_errno = ENOTSUP; - return -ENOTSUP; - } - fprintf(file_o, "0\n"); - ret = 0; -close: - if (ret) - DRV_LOG(DEBUG, "Failed to disable ROCE by sysfs: %d.", ret); - else - DRV_LOG(INFO, "ROCE is disabled by sysfs successfully."); - fclose(file_o); - return ret; -} - -static int -mlx5_vdpa_roce_disable(struct rte_device *dev) -{ - char pci_addr[PCI_PRI_STR_SIZE] = { 0 }; - - if (mlx5_dev_to_pci_str(dev, pci_addr, sizeof(pci_addr)) < 0) - return -rte_errno; - /* Firstly try to disable ROCE by Netlink and fallback to sysfs. */ - if (mlx5_vdpa_nl_roce_disable(pci_addr) != 0 && - mlx5_vdpa_sys_roce_disable(pci_addr) != 0) - return -rte_errno; - return 0; -} - static int mlx5_vdpa_args_check_handler(const char *key, const char *val, void *opaque) { @@ -632,48 +540,20 @@ mlx5_vdpa_config_get(struct rte_devargs *devargs, struct mlx5_vdpa_priv *priv) static int mlx5_vdpa_dev_probe(struct mlx5_common_device *cdev) { - struct ibv_device *ibv; struct mlx5_vdpa_priv *priv = NULL; - struct ibv_context *ctx = NULL; struct mlx5_hca_attr attr; - int retry; int ret; - if (mlx5_vdpa_roce_disable(cdev->dev) != 0) { - DRV_LOG(WARNING, "Failed to disable ROCE for \"%s\".", - cdev->dev->name); - return -rte_errno; - } - /* Wait for the IB device to appear again after reload. 
*/ - for (retry = MLX5_VDPA_MAX_RETRIES; retry > 0; --retry) { - ibv = mlx5_os_get_ibv_dev(cdev->dev); - if (ibv != NULL) - break; - usleep(MLX5_VDPA_USEC); - } - if (ibv == NULL) { - DRV_LOG(ERR, "Cannot get IB device after disabling RoCE for " - "\"%s\", retries exceed %d.", - cdev->dev->name, MLX5_VDPA_MAX_RETRIES); - rte_errno = EAGAIN; - return -rte_errno; - } - ctx = mlx5_glue->dv_open_device(ibv); - if (!ctx) { - DRV_LOG(ERR, "Failed to open IB device \"%s\".", ibv->name); - rte_errno = ENODEV; - return -rte_errno; - } - ret = mlx5_devx_cmd_query_hca_attr(ctx, &attr); + ret = mlx5_devx_cmd_query_hca_attr(cdev->ctx, &attr); if (ret) { DRV_LOG(ERR, "Unable to read HCA capabilities."); rte_errno = ENOTSUP; - goto error; + return -rte_errno; } else if (!attr.vdpa.valid || !attr.vdpa.max_num_virtio_queues) { DRV_LOG(ERR, "Not enough capabilities to support vdpa, maybe " "old FW/OFED version?"); rte_errno = ENOTSUP; - goto error; + return -rte_errno; } if (!attr.vdpa.queue_counters_valid) DRV_LOG(DEBUG, "No capability to support virtq statistics."); @@ -684,7 +564,7 @@ mlx5_vdpa_dev_probe(struct mlx5_common_device *cdev) if (!priv) { DRV_LOG(ERR, "Failed to allocate private memory."); rte_errno = ENOMEM; - goto error; + return -rte_errno; } priv->caps = attr.vdpa; priv->log_max_rqt_size = attr.log_max_rqt_size; @@ -692,8 +572,8 @@ mlx5_vdpa_dev_probe(struct mlx5_common_device *cdev) priv->qp_ts_format = attr.qp_ts_format; if (attr.num_lag_ports == 0) priv->num_lag_ports = 1; - priv->ctx = ctx; - priv->var = mlx5_glue->dv_alloc_var(ctx, 0); + priv->cdev = cdev; + priv->var = mlx5_glue->dv_alloc_var(priv->cdev->ctx, 0); if (!priv->var) { DRV_LOG(ERR, "Failed to allocate VAR %u.", errno); goto error; @@ -718,8 +598,6 @@ mlx5_vdpa_dev_probe(struct mlx5_common_device *cdev) mlx5_glue->dv_free_var(priv->var); rte_free(priv); } - if (ctx) - mlx5_glue->close_device(ctx); return -rte_errno; } @@ -748,7 +626,6 @@ mlx5_vdpa_dev_remove(struct mlx5_common_device *cdev) } if 
(priv->vdev) rte_vdpa_unregister_device(priv->vdev); - mlx5_glue->close_device(priv->ctx); pthread_mutex_destroy(&priv->vq_config_lock); rte_free(priv); } diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h index a27f3fdadb..1fe57c72b8 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa.h +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h @@ -128,8 +128,8 @@ struct mlx5_vdpa_priv { uint16_t hw_max_latency_us; /* Hardware CQ moderation period in usec. */ uint16_t hw_max_pending_comp; /* Hardware CQ moderation counter. */ struct rte_vdpa_device *vdev; /* vDPA device. */ + struct mlx5_common_device *cdev; /* Backend mlx5 device. */ int vid; /* vhost device id. */ - struct ibv_context *ctx; /* Device context. */ struct mlx5_hca_vdpa_attr caps; uint32_t pdn; /* Protection Domain number. */ struct ibv_pd *pd; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c index bb6722839a..979a2abd41 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c @@ -48,7 +48,7 @@ mlx5_vdpa_event_qp_global_prepare(struct mlx5_vdpa_priv *priv) { if (priv->eventc) return 0; - priv->eventc = mlx5_os_devx_create_event_channel(priv->ctx, + priv->eventc = mlx5_os_devx_create_event_channel(priv->cdev->ctx, MLX5DV_DEVX_CREATE_EVENT_CHANNEL_FLAGS_OMIT_EV_DATA); if (!priv->eventc) { rte_errno = errno; @@ -61,7 +61,7 @@ mlx5_vdpa_event_qp_global_prepare(struct mlx5_vdpa_priv *priv) * registers writings, it is safe to allocate UAR with any * memory mapping type. 
*/ - priv->uar = mlx5_devx_alloc_uar(priv->ctx, -1); + priv->uar = mlx5_devx_alloc_uar(priv->cdev->ctx, -1); if (!priv->uar) { rte_errno = errno; DRV_LOG(ERR, "Failed to allocate UAR."); @@ -115,8 +115,8 @@ mlx5_vdpa_cq_create(struct mlx5_vdpa_priv *priv, uint16_t log_desc_n, uint16_t event_nums[1] = {0}; int ret; - ret = mlx5_devx_cq_create(priv->ctx, &cq->cq_obj, log_desc_n, &attr, - SOCKET_ID_ANY); + ret = mlx5_devx_cq_create(priv->cdev->ctx, &cq->cq_obj, log_desc_n, + &attr, SOCKET_ID_ANY); if (ret) goto error; cq->cq_ci = 0; @@ -397,7 +397,8 @@ mlx5_vdpa_err_event_setup(struct mlx5_vdpa_priv *priv) int flags; /* Setup device event channel. */ - priv->err_chnl = mlx5_glue->devx_create_event_channel(priv->ctx, 0); + priv->err_chnl = mlx5_glue->devx_create_event_channel(priv->cdev->ctx, + 0); if (!priv->err_chnl) { rte_errno = errno; DRV_LOG(ERR, "Failed to create device event channel %d.", @@ -594,7 +595,7 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, return -1; attr.pd = priv->pdn; attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); - eqp->fw_qp = mlx5_devx_cmd_create_qp(priv->ctx, &attr); + eqp->fw_qp = mlx5_devx_cmd_create_qp(priv->cdev->ctx, &attr); if (!eqp->fw_qp) { DRV_LOG(ERR, "Failed to create FW QP(%u).", rte_errno); goto error; @@ -605,8 +606,8 @@ mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, uint16_t desc_n, attr.log_rq_stride = rte_log2_u32(MLX5_WSEG_SIZE); attr.sq_size = 0; /* No need SQ. 
*/ attr.ts_format = mlx5_ts_format_conv(priv->qp_ts_format); - ret = mlx5_devx_qp_create(priv->ctx, &(eqp->sw_qp), log_desc_n, &attr, - SOCKET_ID_ANY); + ret = mlx5_devx_qp_create(priv->cdev->ctx, &(eqp->sw_qp), log_desc_n, + &attr, SOCKET_ID_ANY); if (ret) { DRV_LOG(ERR, "Failed to create SW QP(%u).", rte_errno); goto error; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_lm.c b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c index f391813745..0b0ffeb07d 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_lm.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c @@ -54,7 +54,7 @@ mlx5_vdpa_dirty_bitmap_set(struct mlx5_vdpa_priv *priv, uint64_t log_base, DRV_LOG(ERR, "Failed to allocate mem for lm mr."); return -1; } - mr->umem = mlx5_glue->devx_umem_reg(priv->ctx, + mr->umem = mlx5_glue->devx_umem_reg(priv->cdev->ctx, (void *)(uintptr_t)log_base, log_size, IBV_ACCESS_LOCAL_WRITE); if (!mr->umem) { @@ -62,7 +62,7 @@ mlx5_vdpa_dirty_bitmap_set(struct mlx5_vdpa_priv *priv, uint64_t log_base, goto err; } mkey_attr.umem_id = mr->umem->umem_id; - mr->mkey = mlx5_devx_cmd_mkey_create(priv->ctx, &mkey_attr); + mr->mkey = mlx5_devx_cmd_mkey_create(priv->cdev->ctx, &mkey_attr); if (!mr->mkey) { DRV_LOG(ERR, "Failed to create Mkey for lm."); goto err; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_mem.c b/drivers/vdpa/mlx5/mlx5_vdpa_mem.c index a06681b494..c5cdb3abd7 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_mem.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_mem.c @@ -209,7 +209,7 @@ mlx5_vdpa_mem_register(struct mlx5_vdpa_priv *priv) DRV_LOG(ERR, "Failed to allocate mem entry memory."); goto error; } - entry->umem = mlx5_glue->devx_umem_reg(priv->ctx, + entry->umem = mlx5_glue->devx_umem_reg(priv->cdev->ctx, (void *)(uintptr_t)reg->host_user_addr, reg->size, IBV_ACCESS_LOCAL_WRITE); if (!entry->umem) { @@ -222,7 +222,8 @@ mlx5_vdpa_mem_register(struct mlx5_vdpa_priv *priv) mkey_attr.umem_id = entry->umem->umem_id; mkey_attr.pd = priv->pdn; mkey_attr.pg_access = 1; - entry->mkey = mlx5_devx_cmd_mkey_create(priv->ctx, &mkey_attr); + 
entry->mkey = mlx5_devx_cmd_mkey_create(priv->cdev->ctx, + &mkey_attr); if (!entry->mkey) { DRV_LOG(ERR, "Failed to create direct Mkey."); ret = -rte_errno; @@ -281,7 +282,7 @@ mlx5_vdpa_mem_register(struct mlx5_vdpa_priv *priv) ret = -ENOMEM; goto error; } - entry->mkey = mlx5_devx_cmd_mkey_create(priv->ctx, &mkey_attr); + entry->mkey = mlx5_devx_cmd_mkey_create(priv->cdev->ctx, &mkey_attr); if (!entry->mkey) { DRV_LOG(ERR, "Failed to create indirect Mkey."); ret = -rte_errno; diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_steer.c b/drivers/vdpa/mlx5/mlx5_vdpa_steer.c index 383f003966..a0fd2776e5 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_steer.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_steer.c @@ -98,7 +98,8 @@ mlx5_vdpa_rqt_prepare(struct mlx5_vdpa_priv *priv) attr->rqt_max_size = rqt_n; attr->rqt_actual_size = rqt_n; if (!priv->steer.rqt) { - priv->steer.rqt = mlx5_devx_cmd_create_rqt(priv->ctx, attr); + priv->steer.rqt = mlx5_devx_cmd_create_rqt(priv->cdev->ctx, + attr); if (!priv->steer.rqt) { DRV_LOG(ERR, "Failed to create RQT."); ret = -rte_errno; @@ -204,13 +205,13 @@ mlx5_vdpa_rss_flows_create(struct mlx5_vdpa_priv *priv) tir_att.rx_hash_field_selector_outer.selected_fields = vars[i][HASH]; priv->steer.rss[i].matcher = mlx5_glue->dv_create_flow_matcher - (priv->ctx, &dv_attr, priv->steer.tbl); + (priv->cdev->ctx, &dv_attr, priv->steer.tbl); if (!priv->steer.rss[i].matcher) { DRV_LOG(ERR, "Failed to create matcher %d.", i); goto error; } - priv->steer.rss[i].tir = mlx5_devx_cmd_create_tir(priv->ctx, - &tir_att); + priv->steer.rss[i].tir = mlx5_devx_cmd_create_tir + (priv->cdev->ctx, &tir_att); if (!priv->steer.rss[i].tir) { DRV_LOG(ERR, "Failed to create TIR %d.", i); goto error; @@ -268,7 +269,7 @@ int mlx5_vdpa_steer_setup(struct mlx5_vdpa_priv *priv) { #ifdef HAVE_MLX5DV_DR - priv->steer.domain = mlx5_glue->dr_create_domain(priv->ctx, + priv->steer.domain = mlx5_glue->dr_create_domain(priv->cdev->ctx, MLX5DV_DR_DOMAIN_TYPE_NIC_RX); if (!priv->steer.domain) { 
DRV_LOG(ERR, "Failed to create Rx domain."); diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c index f530646058..5ef31de834 100644 --- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c +++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c @@ -250,7 +250,7 @@ mlx5_vdpa_virtq_setup(struct mlx5_vdpa_priv *priv, int index) if (priv->caps.queue_counters_valid) { if (!virtq->counters) virtq->counters = mlx5_devx_cmd_create_virtio_q_counters - (priv->ctx); + (priv->cdev->ctx); if (!virtq->counters) { DRV_LOG(ERR, "Failed to create virtq couners for virtq" " %d.", index); @@ -269,7 +269,7 @@ mlx5_vdpa_virtq_setup(struct mlx5_vdpa_priv *priv, int index) " %u.", i, index); goto error; } - virtq->umems[i].obj = mlx5_glue->devx_umem_reg(priv->ctx, + virtq->umems[i].obj = mlx5_glue->devx_umem_reg(priv->cdev->ctx, virtq->umems[i].buf, virtq->umems[i].size, IBV_ACCESS_LOCAL_WRITE); @@ -326,7 +326,7 @@ mlx5_vdpa_virtq_setup(struct mlx5_vdpa_priv *priv, int index) attr.hw_latency_mode = priv->hw_latency_mode; attr.hw_max_latency_us = priv->hw_max_latency_us; attr.hw_max_pending_comp = priv->hw_max_pending_comp; - virtq->virtq = mlx5_devx_cmd_create_virtq(priv->ctx, &attr); + virtq->virtq = mlx5_devx_cmd_create_virtq(priv->cdev->ctx, &attr); virtq->priv = priv; if (!virtq->virtq) goto error; @@ -434,6 +434,7 @@ int mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv) { struct mlx5_devx_tis_attr tis_attr = {0}; + struct ibv_context *ctx = priv->cdev->ctx; uint32_t i; uint16_t nr_vring = rte_vhost_get_vring_num(priv->vid); int ret = rte_vhost_get_negotiated_features(priv->vid, &priv->features); @@ -457,7 +458,7 @@ mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv) } /* Always map the entire page. 
*/ priv->virtq_db_addr = mmap(NULL, priv->var->length, PROT_READ | - PROT_WRITE, MAP_SHARED, priv->ctx->cmd_fd, + PROT_WRITE, MAP_SHARED, ctx->cmd_fd, priv->var->mmap_off); if (priv->virtq_db_addr == MAP_FAILED) { DRV_LOG(ERR, "Failed to map doorbell page %u.", errno); @@ -467,7 +468,7 @@ mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv) DRV_LOG(DEBUG, "VAR address of doorbell mapping is %p.", priv->virtq_db_addr); } - priv->td = mlx5_devx_cmd_create_td(priv->ctx); + priv->td = mlx5_devx_cmd_create_td(ctx); if (!priv->td) { DRV_LOG(ERR, "Failed to create transport domain."); return -rte_errno; @@ -476,7 +477,7 @@ mlx5_vdpa_virtqs_prepare(struct mlx5_vdpa_priv *priv) for (i = 0; i < priv->num_lag_ports; i++) { /* 0 is auto affinity, non-zero value to propose port. */ tis_attr.lag_tx_port_affinity = i + 1; - priv->tiss[i] = mlx5_devx_cmd_create_tis(priv->ctx, &tis_attr); + priv->tiss[i] = mlx5_devx_cmd_create_tis(ctx, &tis_attr); if (!priv->tiss[i]) { DRV_LOG(ERR, "Failed to create TIS %u.", i); goto error; -- 2.25.1
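
Note on the retry loop in mlx5_vdpa_get_ibv_dev(): after RoCE is disabled the IB device may disappear for a short time, so the patch polls for it a bounded number of times (MLX5_VDPA_MAX_RETRIES) with a fixed sleep (MLX5_VDPA_USEC) between attempts. The standalone sketch below shows the same bounded-poll pattern outside the driver. It is only an illustration under stated assumptions: poll_for_device(), lookup_fn, struct fake_state and fake_lookup() are hypothetical names that do not exist in DPDK, and the toy lookup stands in for mlx5_os_get_ibv_dev().

```c
#define _DEFAULT_SOURCE /* For usleep() under strict C standard modes. */
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical constants mirroring MLX5_VDPA_MAX_RETRIES / MLX5_VDPA_USEC. */
#define MAX_RETRIES 20
#define RETRY_USEC 1000

/* Hypothetical lookup callback: returns NULL until the device shows up. */
typedef void *(*lookup_fn)(void *arg);

/*
 * Call lookup() up to max_retries times and sleep delay_us microseconds
 * after every failed attempt. Return the first non-NULL result, or NULL
 * when all attempts failed (the caller picks the errno, e.g. EAGAIN).
 */
static void *
poll_for_device(lookup_fn lookup, void *arg, int max_retries,
		unsigned int delay_us)
{
	int retry;

	assert(lookup != NULL && max_retries > 0);
	for (retry = max_retries; retry > 0; --retry) {
		void *dev = lookup(arg);

		if (dev != NULL)
			return dev;
		usleep(delay_us);
	}
	return NULL;
}

/* Toy state for illustration only: the lookup succeeds on call N. */
struct fake_state {
	int calls;       /* Number of lookups done so far. */
	int ready_after; /* Lookup succeeds once calls reaches this value. */
};

/* Toy lookup standing in for the real device search. */
static void *
fake_lookup(void *arg)
{
	struct fake_state *st = arg;

	st->calls++;
	return st->calls >= st->ready_after ? (void *)st : NULL;
}
```

A caller would use it roughly as poll_for_device(fake_lookup, &state, MAX_RETRIES, RETRY_USEC). Returning from inside the loop on the first success, as the patch does, avoids the separate "ibv == NULL" check that the old probe code needed after its loop.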