From: Rongwei Liu <rongweil@nvidia.com>
To: <matan@nvidia.com>, <viacheslavo@nvidia.com>, <orika@nvidia.com>,
<thomas@monjalon.net>
Cc: <dev@dpdk.org>, <rasland@nvidia.com>
Subject: [dpdk-dev] [PATCH 2/2] net/mlx5: support socket direct mode bonding
Date: Tue, 28 Sep 2021 11:23:55 +0300 [thread overview]
Message-ID: <20210928082355.1616087-3-rongweil@nvidia.com> (raw)
In-Reply-To: <20210928082355.1616087-1-rongweil@nvidia.com>
In socket direct mode, it's possible to bind any two (maybe four
in future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces are unnecessary to have
the same PCIe domain/bus/device ID anymore,
Kernel driver uses "system_image_guid" to identify if devices can
be bound together or not. Sysfs "phys_switch_id" is used to get
"system_image_guid" of each network interface.
OFED 5.4+ is required to support "phys_switch_id".
Centos 8.1 needs to enable switch_dev mode to read "phys_switch_id".
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko viacheslavo@nvidia.com
---
drivers/net/mlx5/linux/mlx5_os.c | 43 +++++++++++++++++++++++++-------
1 file changed, 34 insertions(+), 9 deletions(-)
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 3746057673..b8ce6cd7f3 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -2008,6 +2008,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
FILE *bond_file = NULL, *file;
int pf = -1;
int ret;
+ uint8_t cur_guid[32] = {0};
+ uint8_t guid[32] = {0};
/*
* Try to get master device name. If something goes
@@ -2022,6 +2024,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
np = mlx5_nl_portnum(nl_rdma, ibv_dev->name);
if (!np)
return -1;
+ if (mlx5_get_device_guid(pci_dev, cur_guid, sizeof(cur_guid)) < 0)
+ return -1;
/*
* The Master device might not be on the predefined
* port (not on port index 1, it is not garanted),
@@ -2050,6 +2054,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
char tmp_str[IF_NAMESIZE + 32];
struct rte_pci_addr pci_addr;
struct mlx5_switch_info info;
+ int ret;
/* Process slave interface names in the loop. */
snprintf(tmp_str, sizeof(tmp_str),
@@ -2080,15 +2085,6 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
tmp_str);
break;
}
- /* Match PCI address, allows BDF0+pfx or BDFx+pfx. */
- if (pci_dev->domain == pci_addr.domain &&
- pci_dev->bus == pci_addr.bus &&
- pci_dev->devid == pci_addr.devid &&
- ((pci_dev->function == 0 &&
- pci_dev->function + owner == pci_addr.function) ||
- (pci_dev->function == owner &&
- pci_addr.function == owner)))
- pf = info.port_name;
/* Get ifindex. */
snprintf(tmp_str, sizeof(tmp_str),
"/sys/class/net/%s/ifindex", ifname);
@@ -2105,6 +2101,30 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
bond_info->ports[info.port_name].pci_addr = pci_addr;
bond_info->ports[info.port_name].ifindex = ifindex;
bond_info->n_port++;
+ /*
+ * Under socket direct mode, bonding will use
+ * system_image_guid as identification.
+ * After OFED 5.4, guid is readable (ret >= 0) under sysfs.
+ * All bonding slaves should have the same guid even if driver
+ * is using PCIe BDF.
+ */
+ ret = mlx5_get_device_guid(&pci_addr, guid, sizeof(guid));
+ if (ret < 0)
+ break;
+ else if (ret > 0) {
+ if (!memcmp(guid, cur_guid, sizeof(guid)) &&
+ owner == info.port_name &&
+ (owner != 0 || (owner == 0 &&
+ !rte_pci_addr_cmp(pci_dev, &pci_addr))))
+ pf = info.port_name;
+ } else if (pci_dev->domain == pci_addr.domain &&
+ pci_dev->bus == pci_addr.bus &&
+ pci_dev->devid == pci_addr.devid &&
+ ((pci_dev->function == 0 &&
+ pci_dev->function + owner == pci_addr.function) ||
+ (pci_dev->function == owner &&
+ pci_addr.function == owner)))
+ pf = info.port_name;
}
if (pf >= 0) {
/* Get bond interface info */
@@ -2117,6 +2137,11 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
ifindex, bond_info->ifindex, bond_info->ifname);
}
+ if (owner == 0 && pf != 0) {
+ DRV_LOG(INFO, "PCIe instance %04x:%02x:%02x.%x isn't bonding master",
+ pci_dev->domain, pci_dev->bus, pci_dev->devid,
+ pci_dev->function);
+ }
return pf;
}
--
2.27.0
prev parent reply other threads:[~2021-09-28 8:35 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-28 8:23 [dpdk-dev] [PATCH 0/2] " Rongwei Liu
2021-09-28 8:23 ` [dpdk-dev] [PATCH 1/2] common/mlx5: support pcie device guid query Rongwei Liu
2021-09-28 8:23 ` Rongwei Liu [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210928082355.1616087-3-rongweil@nvidia.com \
--to=rongweil@nvidia.com \
--cc=dev@dpdk.org \
--cc=matan@nvidia.com \
--cc=orika@nvidia.com \
--cc=rasland@nvidia.com \
--cc=thomas@monjalon.net \
--cc=viacheslavo@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).