* [dpdk-dev] [PATCH v2 0/2] support socket direct mode bonding @ 2021-09-28 8:50 Rongwei Liu 2021-09-28 8:50 ` [dpdk-dev] [PATCH v2 1/2] common/mlx5: support pcie device guid query Rongwei Liu 2021-09-28 8:50 ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu 0 siblings, 2 replies; 11+ messages in thread From: Rongwei Liu @ 2021-09-28 8:50 UTC (permalink / raw) To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland In socket direct mode, it's possible to bind any two (maybe four in the future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y. Bonding member interfaces no longer need to share the same PCIe domain/bus/device ID. No backport to DPDK 20.11 is needed. v2: fix CI warnings. Rongwei Liu (2): common/mlx5: support pcie device guid query net/mlx5: support socket direct mode bonding drivers/common/mlx5/linux/mlx5_common_os.c | 41 +++++++++++++++++++++ drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++ drivers/net/mlx5/linux/mlx5_os.c | 43 +++++++++++++++++----- 3 files changed, 94 insertions(+), 9 deletions(-) -- 2.27.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [dpdk-dev] [PATCH v2 1/2] common/mlx5: support pcie device guid query 2021-09-28 8:50 [dpdk-dev] [PATCH v2 0/2] support socket direct mode bonding Rongwei Liu @ 2021-09-28 8:50 ` Rongwei Liu 2021-09-28 8:50 ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu 1 sibling, 0 replies; 11+ messages in thread From: Rongwei Liu @ 2021-09-28 8:50 UTC (permalink / raw) To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland sysfs entry "phys_switch_id" holds each PCIe device' guid. The devices which reside in the same physical NIC should have the same guid. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> --- drivers/common/mlx5/linux/mlx5_common_os.c | 41 ++++++++++++++++++++++ drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++ 2 files changed, 60 insertions(+) diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c index 9e0c823c97..8b3ee2baea 100644 --- a/drivers/common/mlx5/linux/mlx5_common_os.c +++ b/drivers/common/mlx5/linux/mlx5_common_os.c @@ -2,6 +2,7 @@ * Copyright 2020 Mellanox Technologies, Ltd */ +#include <sys/types.h> #include <unistd.h> #include <string.h> #include <stdio.h> @@ -428,3 +429,43 @@ mlx5_os_get_ibv_device(const struct rte_pci_addr *addr) mlx5_glue->free_device_list(ibv_list); return ibv_match; } + +int +mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len) +{ + char tmp[512]; + char cur_ifname[IF_NAMESIZE + 1]; + FILE *id_file; + DIR *dir; + struct dirent *ptr; + int ret; + + if (guid == NULL || len < sizeof(u_int64_t) + 1) + return -1; + memset(guid, 0, len); + snprintf(tmp, sizeof(tmp), "/sys/bus/pci/devices/%04x:%02x:%02x.%x/net", + dev->domain, dev->bus, dev->devid, dev->function); + dir = opendir(tmp); + if (dir == NULL) + return -1; + /* Traverse to identify PF interface */ + do { + ptr = readdir(dir); + if (ptr == NULL || ptr->d_type != DT_DIR) { + closedir(dir); + 
return -1; + } + } while (strchr(ptr->d_name, '.') || strchr(ptr->d_name, '_') || + strchr(ptr->d_name, 'v')); + snprintf(cur_ifname, sizeof(cur_ifname), "%s", ptr->d_name); + closedir(dir); + snprintf(tmp + strlen(tmp), sizeof(tmp) - strlen(tmp), + "/%s/phys_switch_id", cur_ifname); + /* Older OFED like 5.3 doesn't support read */ + id_file = fopen(tmp, "r"); + if (!id_file) + return 0; + ret = fscanf(id_file, "%16s", guid); + fclose(id_file); + return ret; +} diff --git a/drivers/common/mlx5/linux/mlx5_common_os.h b/drivers/common/mlx5/linux/mlx5_common_os.h index c3202b6786..3cdea75373 100644 --- a/drivers/common/mlx5/linux/mlx5_common_os.h +++ b/drivers/common/mlx5/linux/mlx5_common_os.h @@ -296,4 +296,23 @@ __rte_internal struct ibv_device * mlx5_os_get_ibv_dev(const struct rte_device *dev); +/** + * This is used to query system_image_guid as describing in PRM. + * + * @param dev[in] + * Pointer to a device instance as PCIe id. + * @param guid[out] + * Pointer to the buffer to hold device guid. + * Guid is uint64_t and corresponding to 17 bytes string. + * @param len[in] + * Guid buffer length, 17 bytes at least. + * + * @return + * -1 if internal failure. + * 0 if OFED doesn't support. + * >0 if success. + */ +int +mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len); + #endif /* RTE_PMD_MLX5_COMMON_OS_H_ */ -- 2.27.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
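For readers following patch 1/2, the sysfs lookup can be sketched in isolation. This is a minimal sketch under stated assumptions, not the code from the patch: `name_looks_like_pf` and `read_switch_id` are hypothetical names, the caller is assumed to have already built the full path to the `phys_switch_id` file, and the length check uses the 17 bytes documented in the patch's header comment (16 characters plus the terminating NUL) instead of `sizeof(u_int64_t) + 1`, because `%16s` may store up to 17 bytes.

```c
#include <stdio.h>
#include <string.h>

/* 16 characters plus the terminating NUL, as the header comment documents. */
#define GUID_STR_LEN 17

/*
 * Hypothetical helper mirroring the name filter in the patch: an entry is
 * skipped when its name contains '.', '_' or 'v', which drops ".", ".."
 * and names that look like VF or representor interfaces.
 */
static int
name_looks_like_pf(const char *name)
{
	return strchr(name, '.') == NULL &&
	       strchr(name, '_') == NULL &&
	       strchr(name, 'v') == NULL;
}

/*
 * Hypothetical helper: read the GUID string from a phys_switch_id file.
 * Returns -1 on bad arguments or a failed read, 0 if the file cannot be
 * opened (treated as "not supported"), 1 on success.
 */
static int
read_switch_id(const char *path, char *guid, size_t len)
{
	FILE *f;
	int ret;

	if (path == NULL || guid == NULL || len < GUID_STR_LEN)
		return -1;
	memset(guid, 0, len);
	f = fopen(path, "r");
	if (f == NULL)
		return 0;
	/* The field width bounds the write to 16 characters plus NUL. */
	ret = fscanf(f, "%16s", guid);
	fclose(f);
	return ret == 1 ? 1 : -1;
}
```

The three-way return value keeps the distinction the patch relies on: a negative value aborts the bond scan, zero lets the caller fall back to PCIe BDF matching, and a positive value means the GUID can be compared.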
* [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding 2021-09-28 8:50 [dpdk-dev] [PATCH v2 0/2] support socket direct mode bonding Rongwei Liu 2021-09-28 8:50 ` [dpdk-dev] [PATCH v2 1/2] common/mlx5: support pcie device guid query Rongwei Liu @ 2021-09-28 8:50 ` Rongwei Liu 2021-09-29 21:58 ` Thomas Monjalon 1 sibling, 1 reply; 11+ messages in thread From: Rongwei Liu @ 2021-09-28 8:50 UTC (permalink / raw) To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland In socket direct mode, it's possible to bind any two (maybe four in future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y. Bonding member interfaces are unnecessary to have the same PCIe domain/bus/device ID anymore, Kernel driver uses "system_image_guid" to identify if devices can be bound together or not. Sysfs "phys_switch_id" is used to get "system_image_guid" of each network interface. OFED 5.4+ is required to support "phys_switch_id". Centos 8.1 needs to enable switch_dev mode first. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> --- drivers/net/mlx5/linux/mlx5_os.c | 43 +++++++++++++++++++++++++------- 1 file changed, 34 insertions(+), 9 deletions(-) diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c index 3746057673..1d57b934fc 100644 --- a/drivers/net/mlx5/linux/mlx5_os.c +++ b/drivers/net/mlx5/linux/mlx5_os.c @@ -2008,6 +2008,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, FILE *bond_file = NULL, *file; int pf = -1; int ret; + uint8_t cur_guid[32] = {0}; + uint8_t guid[32] = {0}; /* * Try to get master device name. 
If something goes @@ -2022,6 +2024,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, np = mlx5_nl_portnum(nl_rdma, ibv_dev->name); if (!np) return -1; + if (mlx5_get_device_guid(pci_dev, cur_guid, sizeof(cur_guid)) < 0) + return -1; /* * The Master device might not be on the predefined * port (not on port index 1, it is not garanted), @@ -2050,6 +2054,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, char tmp_str[IF_NAMESIZE + 32]; struct rte_pci_addr pci_addr; struct mlx5_switch_info info; + int ret; /* Process slave interface names in the loop. */ snprintf(tmp_str, sizeof(tmp_str), @@ -2080,15 +2085,6 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, tmp_str); break; } - /* Match PCI address, allows BDF0+pfx or BDFx+pfx. */ - if (pci_dev->domain == pci_addr.domain && - pci_dev->bus == pci_addr.bus && - pci_dev->devid == pci_addr.devid && - ((pci_dev->function == 0 && - pci_dev->function + owner == pci_addr.function) || - (pci_dev->function == owner && - pci_addr.function == owner))) - pf = info.port_name; /* Get ifindex. */ snprintf(tmp_str, sizeof(tmp_str), "/sys/class/net/%s/ifindex", ifname); @@ -2105,6 +2101,30 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, bond_info->ports[info.port_name].pci_addr = pci_addr; bond_info->ports[info.port_name].ifindex = ifindex; bond_info->n_port++; + /* + * Under socket direct mode, bonding will use + * system_image_guid as identification. + * After OFED 5.4, guid is readable (ret >= 0) under sysfs. + * All bonding members should have the same guid even if driver + * is using PCIe BDF. 
+ */ + ret = mlx5_get_device_guid(&pci_addr, guid, sizeof(guid)); + if (ret < 0) + break; + else if (ret > 0) { + if (!memcmp(guid, cur_guid, sizeof(guid)) && + owner == info.port_name && + (owner != 0 || (owner == 0 && + !rte_pci_addr_cmp(pci_dev, &pci_addr)))) + pf = info.port_name; + } else if (pci_dev->domain == pci_addr.domain && + pci_dev->bus == pci_addr.bus && + pci_dev->devid == pci_addr.devid && + ((pci_dev->function == 0 && + pci_dev->function + owner == pci_addr.function) || + (pci_dev->function == owner && + pci_addr.function == owner))) + pf = info.port_name; } if (pf >= 0) { /* Get bond interface info */ @@ -2117,6 +2137,11 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, DRV_LOG(INFO, "PF device %u, bond device %u(%s)", ifindex, bond_info->ifindex, bond_info->ifname); } + if (owner == 0 && pf != 0) { + DRV_LOG(INFO, "PCIe instance %04x:%02x:%02x.%x isn't bonding owner", + pci_dev->domain, pci_dev->bus, pci_dev->devid, + pci_dev->function); + } return pf; } -- 2.27.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
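The matching rule that patch 2/2 adds inside the bond member loop can be read as a pure decision function. The sketch below is an illustration under assumptions, not the driver code: `struct pci_addr` is a stand-in for `rte_pci_addr`, `member_is_owner_pf` is a hypothetical name, and the GUIDs are compared with `strcmp` on NUL-terminated strings where the patch uses `memcmp` on zero-initialised 32-byte buffers.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for rte_pci_addr, kept local so the sketch is self-contained. */
struct pci_addr {
	uint32_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

static int
pci_addr_equal(const struct pci_addr *a, const struct pci_addr *b)
{
	return a->domain == b->domain && a->bus == b->bus &&
	       a->devid == b->devid && a->function == b->function;
}

/*
 * Hypothetical helper: decide whether a bond member is the PF owned by the
 * probed device. guid_ret is the result of the GUID query for the member:
 * > 0 means the GUID was read, 0 means sysfs does not expose it, < 0 means
 * failure (the patch breaks out of the loop in that case).
 */
static int
member_is_owner_pf(const struct pci_addr *probed,
		   const struct pci_addr *member,
		   int owner, int port_name, int guid_ret,
		   const char *probed_guid, const char *member_guid)
{
	if (guid_ret < 0)
		return 0;
	if (guid_ret > 0) {
		/* GUID path: same physical NIC, any PCIe address. */
		return strcmp(probed_guid, member_guid) == 0 &&
		       owner == port_name &&
		       (owner != 0 || pci_addr_equal(probed, member));
	}
	/* Fallback path: the original BDF0+pfx or BDFx+pfx match. */
	return probed->domain == member->domain &&
	       probed->bus == member->bus &&
	       probed->devid == member->devid &&
	       ((probed->function == 0 &&
		 probed->function + owner == member->function) ||
		(probed->function == owner &&
		 member->function == owner));
}
```

With a readable GUID, two members on different buses (for example 0000:03:00.0 and 0000:83:00.0) can match when `owner` is nonzero. With `owner == 0` the full PCIe address must also be equal, so only one of the bonded devices is reported as the owner.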
* Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding 2021-09-28 8:50 ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu @ 2021-09-29 21:58 ` Thomas Monjalon 2021-10-04 6:45 ` Slava Ovsiienko ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Thomas Monjalon @ 2021-09-29 21:58 UTC (permalink / raw) To: matan, viacheslavo, Rongwei Liu; +Cc: orika, dev, rasland 28/09/2021 10:50, Rongwei Liu: > In socket direct mode, it's possible to bind any two (maybe four > in future) PCIe devices with IDs like xxxx:xx:xx.x and > yyyy:yy:yy.y. Bonding member interfaces are unnecessary to have > the same PCIe domain/bus/device ID anymore, > > Kernel driver uses "system_image_guid" to identify if devices can > be bound together or not. Sysfs "phys_switch_id" is used to get > "system_image_guid" of each network interface. > > OFED 5.4+ is required to support "phys_switch_id". > Centos 8.1 needs to enable switch_dev mode first. > > Signed-off-by: Rongwei Liu <rongweil@nvidia.com> > Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> > --- > drivers/net/mlx5/linux/mlx5_os.c | 43 +++++++++++++++++++++++++------- > 1 file changed, 34 insertions(+), 9 deletions(-) Does it deserve a line in the release notes? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct mode bonding 2021-09-29 21:58 ` Thomas Monjalon @ 2021-10-04 6:45 ` Slava Ovsiienko 2021-10-08 10:05 ` [dpdk-dev] [PATCH v3 0/2] " Rongwei Liu 2021-10-14 2:57 ` [dpdk-dev] [PATCH v4 0/2] " Rongwei Liu 2 siblings, 0 replies; 11+ messages in thread From: Slava Ovsiienko @ 2021-10-04 6:45 UTC (permalink / raw) To: NBU-Contact-Thomas Monjalon, Matan Azrad, Rongwei Liu Cc: Ori Kam, dev, Raslan Darawsheh > -----Original Message----- > From: Thomas Monjalon <thomas@monjalon.net> > Sent: Thursday, September 30, 2021 0:58 > To: Matan Azrad <matan@nvidia.com>; Slava Ovsiienko > <viacheslavo@nvidia.com>; Rongwei Liu <rongweil@nvidia.com> > Cc: Ori Kam <orika@nvidia.com>; dev@dpdk.org; Raslan Darawsheh > <rasland@nvidia.com> > Subject: Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support socket direct > mode bonding > > 28/09/2021 10:50, Rongwei Liu: > > In socket direct mode, it's possible to bind any two (maybe four in > > future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y. > > Bonding member interfaces are unnecessary to have the same PCIe > > domain/bus/device ID anymore, > > > > Kernel driver uses "system_image_guid" to identify if devices can be > > bound together or not. Sysfs "phys_switch_id" is used to get > > "system_image_guid" of each network interface. > > > > OFED 5.4+ is required to support "phys_switch_id". > > Centos 8.1 needs to enable switch_dev mode first. > > > > Signed-off-by: Rongwei Liu <rongweil@nvidia.com> > > Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> > > --- > > drivers/net/mlx5/linux/mlx5_os.c | 43 > > +++++++++++++++++++++++++------- > > 1 file changed, 34 insertions(+), 9 deletions(-) > > Does it deserve a line in the release notes? Not sure, it is a minor update. ^ permalink raw reply [flat|nested] 11+ messages in thread
* [dpdk-dev] [PATCH v3 0/2] support socket direct mode bonding 2021-09-29 21:58 ` Thomas Monjalon 2021-10-04 6:45 ` Slava Ovsiienko @ 2021-10-08 10:05 ` Rongwei Liu 2021-10-08 10:05 ` [dpdk-dev] [PATCH v3 1/2] common/mlx5: support pcie device guid query Rongwei Liu 2021-10-08 10:05 ` [dpdk-dev] [PATCH v3 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu 2021-10-14 2:57 ` [dpdk-dev] [PATCH v4 0/2] " Rongwei Liu 2 siblings, 2 replies; 11+ messages in thread From: Rongwei Liu @ 2021-10-08 10:05 UTC (permalink / raw) To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland In socket direct mode, it's possible to bind any two (maybe four in the future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y. Bonding member interfaces no longer need to share the same PCIe domain/bus/device ID. No backport to DPDK 20.11 is needed. v2: fix CI warnings. v3: add description in release_21_11. Rongwei Liu (2): common/mlx5: support pcie device guid query net/mlx5: support socket direct mode bonding doc/guides/rel_notes/release_21_11.rst | 4 ++ drivers/common/mlx5/linux/mlx5_common_os.c | 41 +++++++++++++++++++++ drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++ drivers/net/mlx5/linux/mlx5_os.c | 43 +++++++++++++++++----- 4 files changed, 98 insertions(+), 9 deletions(-) -- 2.27.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [dpdk-dev] [PATCH v3 1/2] common/mlx5: support pcie device guid query 2021-10-08 10:05 ` [dpdk-dev] [PATCH v3 0/2] " Rongwei Liu @ 2021-10-08 10:05 ` Rongwei Liu 2021-10-08 10:05 ` [dpdk-dev] [PATCH v3 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu 1 sibling, 0 replies; 11+ messages in thread From: Rongwei Liu @ 2021-10-08 10:05 UTC (permalink / raw) To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland sysfs entry "phys_switch_id" holds each PCIe device' guid. The devices which reside in the same physical NIC should have the same guid. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> --- drivers/common/mlx5/linux/mlx5_common_os.c | 41 ++++++++++++++++++++++ drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++ 2 files changed, 60 insertions(+) diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c index 9e0c823c97..8b3ee2baea 100644 --- a/drivers/common/mlx5/linux/mlx5_common_os.c +++ b/drivers/common/mlx5/linux/mlx5_common_os.c @@ -2,6 +2,7 @@ * Copyright 2020 Mellanox Technologies, Ltd */ +#include <sys/types.h> #include <unistd.h> #include <string.h> #include <stdio.h> @@ -428,3 +429,43 @@ mlx5_os_get_ibv_device(const struct rte_pci_addr *addr) mlx5_glue->free_device_list(ibv_list); return ibv_match; } + +int +mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len) +{ + char tmp[512]; + char cur_ifname[IF_NAMESIZE + 1]; + FILE *id_file; + DIR *dir; + struct dirent *ptr; + int ret; + + if (guid == NULL || len < sizeof(u_int64_t) + 1) + return -1; + memset(guid, 0, len); + snprintf(tmp, sizeof(tmp), "/sys/bus/pci/devices/%04x:%02x:%02x.%x/net", + dev->domain, dev->bus, dev->devid, dev->function); + dir = opendir(tmp); + if (dir == NULL) + return -1; + /* Traverse to identify PF interface */ + do { + ptr = readdir(dir); + if (ptr == NULL || ptr->d_type != DT_DIR) { + closedir(dir); + return -1; + } + } while 
(strchr(ptr->d_name, '.') || strchr(ptr->d_name, '_') || + strchr(ptr->d_name, 'v')); + snprintf(cur_ifname, sizeof(cur_ifname), "%s", ptr->d_name); + closedir(dir); + snprintf(tmp + strlen(tmp), sizeof(tmp) - strlen(tmp), + "/%s/phys_switch_id", cur_ifname); + /* Older OFED like 5.3 doesn't support read */ + id_file = fopen(tmp, "r"); + if (!id_file) + return 0; + ret = fscanf(id_file, "%16s", guid); + fclose(id_file); + return ret; +} diff --git a/drivers/common/mlx5/linux/mlx5_common_os.h b/drivers/common/mlx5/linux/mlx5_common_os.h index c3202b6786..3cdea75373 100644 --- a/drivers/common/mlx5/linux/mlx5_common_os.h +++ b/drivers/common/mlx5/linux/mlx5_common_os.h @@ -296,4 +296,23 @@ __rte_internal struct ibv_device * mlx5_os_get_ibv_dev(const struct rte_device *dev); +/** + * This is used to query system_image_guid as describing in PRM. + * + * @param dev[in] + * Pointer to a device instance as PCIe id. + * @param guid[out] + * Pointer to the buffer to hold device guid. + * Guid is uint64_t and corresponding to 17 bytes string. + * @param len[in] + * Guid buffer length, 17 bytes at least. + * + * @return + * -1 if internal failure. + * 0 if OFED doesn't support. + * >0 if success. + */ +int +mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len); + #endif /* RTE_PMD_MLX5_COMMON_OS_H_ */ -- 2.27.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [dpdk-dev] [PATCH v3 2/2] net/mlx5: support socket direct mode bonding 2021-10-08 10:05 ` [dpdk-dev] [PATCH v3 0/2] " Rongwei Liu 2021-10-08 10:05 ` [dpdk-dev] [PATCH v3 1/2] common/mlx5: support pcie device guid query Rongwei Liu @ 2021-10-08 10:05 ` Rongwei Liu 1 sibling, 0 replies; 11+ messages in thread From: Rongwei Liu @ 2021-10-08 10:05 UTC (permalink / raw) To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland In socket direct mode, it's possible to bind any two (maybe four in future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y. Bonding member interfaces are unnecessary to have the same PCIe domain/bus/device ID anymore, Kernel driver uses "system_image_guid" to identify if devices can be bound together or not. Sysfs "phys_switch_id" is used to get "system_image_guid" of each network interface. OFED 5.4+ is required to support "phys_switch_id". Centos 8.1 needs to enable switch_dev mode first Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> --- doc/guides/rel_notes/release_21_11.rst | 4 +++ drivers/net/mlx5/linux/mlx5_os.c | 43 ++++++++++++++++++++------ 2 files changed, 38 insertions(+), 9 deletions(-) diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index dfc2cbdeed..54a7bd230f 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -106,6 +106,10 @@ New Features * Added DES-CBC, AES-XCBC-MAC, AES-CMAC and non-HMAC algo support. * Added PDCP short MAC-I support. +* **Updated Mellanox mlx5 driver.** + + * Added socket direct mode bonding support which needs OFED 5.4+. + * **Updated NXP dpaa2_sec crypto PMD.** * Added PDCP short MAC-I support. 
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c index 3746057673..1d57b934fc 100644 --- a/drivers/net/mlx5/linux/mlx5_os.c +++ b/drivers/net/mlx5/linux/mlx5_os.c @@ -2008,6 +2008,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, FILE *bond_file = NULL, *file; int pf = -1; int ret; + uint8_t cur_guid[32] = {0}; + uint8_t guid[32] = {0}; /* * Try to get master device name. If something goes @@ -2022,6 +2024,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, np = mlx5_nl_portnum(nl_rdma, ibv_dev->name); if (!np) return -1; + if (mlx5_get_device_guid(pci_dev, cur_guid, sizeof(cur_guid)) < 0) + return -1; /* * The Master device might not be on the predefined * port (not on port index 1, it is not garanted), @@ -2050,6 +2054,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, char tmp_str[IF_NAMESIZE + 32]; struct rte_pci_addr pci_addr; struct mlx5_switch_info info; + int ret; /* Process slave interface names in the loop. */ snprintf(tmp_str, sizeof(tmp_str), @@ -2080,15 +2085,6 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, tmp_str); break; } - /* Match PCI address, allows BDF0+pfx or BDFx+pfx. */ - if (pci_dev->domain == pci_addr.domain && - pci_dev->bus == pci_addr.bus && - pci_dev->devid == pci_addr.devid && - ((pci_dev->function == 0 && - pci_dev->function + owner == pci_addr.function) || - (pci_dev->function == owner && - pci_addr.function == owner))) - pf = info.port_name; /* Get ifindex. */ snprintf(tmp_str, sizeof(tmp_str), "/sys/class/net/%s/ifindex", ifname); @@ -2105,6 +2101,30 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, bond_info->ports[info.port_name].pci_addr = pci_addr; bond_info->ports[info.port_name].ifindex = ifindex; bond_info->n_port++; + /* + * Under socket direct mode, bonding will use + * system_image_guid as identification. + * After OFED 5.4, guid is readable (ret >= 0) under sysfs. 
+ * All bonding members should have the same guid even if driver + * is using PCIe BDF. + */ + ret = mlx5_get_device_guid(&pci_addr, guid, sizeof(guid)); + if (ret < 0) + break; + else if (ret > 0) { + if (!memcmp(guid, cur_guid, sizeof(guid)) && + owner == info.port_name && + (owner != 0 || (owner == 0 && + !rte_pci_addr_cmp(pci_dev, &pci_addr)))) + pf = info.port_name; + } else if (pci_dev->domain == pci_addr.domain && + pci_dev->bus == pci_addr.bus && + pci_dev->devid == pci_addr.devid && + ((pci_dev->function == 0 && + pci_dev->function + owner == pci_addr.function) || + (pci_dev->function == owner && + pci_addr.function == owner))) + pf = info.port_name; } if (pf >= 0) { /* Get bond interface info */ @@ -2117,6 +2137,11 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, DRV_LOG(INFO, "PF device %u, bond device %u(%s)", ifindex, bond_info->ifindex, bond_info->ifname); } + if (owner == 0 && pf != 0) { + DRV_LOG(INFO, "PCIe instance %04x:%02x:%02x.%x isn't bonding owner", + pci_dev->domain, pci_dev->bus, pci_dev->devid, + pci_dev->function); + } return pf; } -- 2.27.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [dpdk-dev] [PATCH v4 0/2] support socket direct mode bonding 2021-09-29 21:58 ` Thomas Monjalon 2021-10-04 6:45 ` Slava Ovsiienko 2021-10-08 10:05 ` [dpdk-dev] [PATCH v3 0/2] " Rongwei Liu @ 2021-10-14 2:57 ` Rongwei Liu 2021-10-14 2:58 ` [dpdk-dev] [PATCH v4 1/2] common/mlx5: support pcie device guid query Rongwei Liu 2021-10-14 2:58 ` [dpdk-dev] [PATCH v4 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu 2 siblings, 2 replies; 11+ messages in thread From: Rongwei Liu @ 2021-10-14 2:57 UTC (permalink / raw) To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland In socket direct mode, it's possible to bind any two (maybe four in the future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y. Bonding member interfaces no longer need to share the same PCIe domain/bus/device ID. No backport to DPDK 20.11 is needed. v2: fix CI warnings. v3: add description in release_21_11.rst. v4: add description in mlx5.rst. Rongwei Liu (2): common/mlx5: support pcie device guid query net/mlx5: support socket direct mode bonding doc/guides/nics/mlx5.rst | 4 ++ doc/guides/rel_notes/release_21_11.rst | 4 ++ drivers/common/mlx5/linux/mlx5_common_os.c | 41 +++++++++++++++++++++ drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++ drivers/net/mlx5/linux/mlx5_os.c | 43 +++++++++++++++++----- 5 files changed, 102 insertions(+), 9 deletions(-) -- 2.27.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [dpdk-dev] [PATCH v4 1/2] common/mlx5: support pcie device guid query 2021-10-14 2:57 ` [dpdk-dev] [PATCH v4 0/2] " Rongwei Liu @ 2021-10-14 2:58 ` Rongwei Liu 2021-10-14 2:58 ` [dpdk-dev] [PATCH v4 2/2] net/mlx5: support socket direct mode bonding Rongwei Liu 1 sibling, 0 replies; 11+ messages in thread From: Rongwei Liu @ 2021-10-14 2:58 UTC (permalink / raw) To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland sysfs entry "phys_switch_id" holds each PCIe device' guid. The devices which reside in the same physical NIC should have the same guid. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> --- drivers/common/mlx5/linux/mlx5_common_os.c | 41 ++++++++++++++++++++++ drivers/common/mlx5/linux/mlx5_common_os.h | 19 ++++++++++ 2 files changed, 60 insertions(+) diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c index 9e0c823c97..8b3ee2baea 100644 --- a/drivers/common/mlx5/linux/mlx5_common_os.c +++ b/drivers/common/mlx5/linux/mlx5_common_os.c @@ -2,6 +2,7 @@ * Copyright 2020 Mellanox Technologies, Ltd */ +#include <sys/types.h> #include <unistd.h> #include <string.h> #include <stdio.h> @@ -428,3 +429,43 @@ mlx5_os_get_ibv_device(const struct rte_pci_addr *addr) mlx5_glue->free_device_list(ibv_list); return ibv_match; } + +int +mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len) +{ + char tmp[512]; + char cur_ifname[IF_NAMESIZE + 1]; + FILE *id_file; + DIR *dir; + struct dirent *ptr; + int ret; + + if (guid == NULL || len < sizeof(u_int64_t) + 1) + return -1; + memset(guid, 0, len); + snprintf(tmp, sizeof(tmp), "/sys/bus/pci/devices/%04x:%02x:%02x.%x/net", + dev->domain, dev->bus, dev->devid, dev->function); + dir = opendir(tmp); + if (dir == NULL) + return -1; + /* Traverse to identify PF interface */ + do { + ptr = readdir(dir); + if (ptr == NULL || ptr->d_type != DT_DIR) { + closedir(dir); + return -1; + } + } while 
(strchr(ptr->d_name, '.') || strchr(ptr->d_name, '_') || + strchr(ptr->d_name, 'v')); + snprintf(cur_ifname, sizeof(cur_ifname), "%s", ptr->d_name); + closedir(dir); + snprintf(tmp + strlen(tmp), sizeof(tmp) - strlen(tmp), + "/%s/phys_switch_id", cur_ifname); + /* Older OFED like 5.3 doesn't support read */ + id_file = fopen(tmp, "r"); + if (!id_file) + return 0; + ret = fscanf(id_file, "%16s", guid); + fclose(id_file); + return ret; +} diff --git a/drivers/common/mlx5/linux/mlx5_common_os.h b/drivers/common/mlx5/linux/mlx5_common_os.h index c3202b6786..3cdea75373 100644 --- a/drivers/common/mlx5/linux/mlx5_common_os.h +++ b/drivers/common/mlx5/linux/mlx5_common_os.h @@ -296,4 +296,23 @@ __rte_internal struct ibv_device * mlx5_os_get_ibv_dev(const struct rte_device *dev); +/** + * This is used to query system_image_guid as describing in PRM. + * + * @param dev[in] + * Pointer to a device instance as PCIe id. + * @param guid[out] + * Pointer to the buffer to hold device guid. + * Guid is uint64_t and corresponding to 17 bytes string. + * @param len[in] + * Guid buffer length, 17 bytes at least. + * + * @return + * -1 if internal failure. + * 0 if OFED doesn't support. + * >0 if success. + */ +int +mlx5_get_device_guid(const struct rte_pci_addr *dev, uint8_t *guid, size_t len); + #endif /* RTE_PMD_MLX5_COMMON_OS_H_ */ -- 2.27.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [dpdk-dev] [PATCH v4 2/2] net/mlx5: support socket direct mode bonding 2021-10-14 2:57 ` [dpdk-dev] [PATCH v4 0/2] " Rongwei Liu 2021-10-14 2:58 ` [dpdk-dev] [PATCH v4 1/2] common/mlx5: support pcie device guid query Rongwei Liu @ 2021-10-14 2:58 ` Rongwei Liu 1 sibling, 0 replies; 11+ messages in thread From: Rongwei Liu @ 2021-10-14 2:58 UTC (permalink / raw) To: matan, viacheslavo, orika, thomas; +Cc: dev, rasland In socket direct mode, it's possible to bind any two (maybe four in future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y. Bonding member interfaces are unnecessary to have the same PCIe domain/bus/device ID anymore, Kernel driver uses "system_image_guid" to identify if devices can be bound together or not. Sysfs "phys_switch_id" is used to get "system_image_guid" of each network interface. OFED 5.4+ is required to support "phys_switch_id". Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> --- doc/guides/nics/mlx5.rst | 4 +++ doc/guides/rel_notes/release_21_11.rst | 4 +++ drivers/net/mlx5/linux/mlx5_os.c | 43 ++++++++++++++++++++------ 3 files changed, 42 insertions(+), 9 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index bae73f42d8..b58236e00a 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -464,6 +464,10 @@ Limitations - In order to achieve best insertion rate, application should manage the flows per lcore. - Better to disable memory reclaim by setting ``reclaim_mem_mode`` to 0 to accelerate the flow object allocation and release with cache. +- Bonding under socket direct mode + + - Needs OFED 5.4+. 
+ Statistics ---------- diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst index dfc2cbdeed..2a6cc765c2 100644 --- a/doc/guides/rel_notes/release_21_11.rst +++ b/doc/guides/rel_notes/release_21_11.rst @@ -106,6 +106,10 @@ New Features * Added DES-CBC, AES-XCBC-MAC, AES-CMAC and non-HMAC algo support. * Added PDCP short MAC-I support. +* **Updated Mellanox mlx5 driver.** + + * Added socket direct mode bonding support. + * **Updated NXP dpaa2_sec crypto PMD.** * Added PDCP short MAC-I support. diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c index 3746057673..1d57b934fc 100644 --- a/drivers/net/mlx5/linux/mlx5_os.c +++ b/drivers/net/mlx5/linux/mlx5_os.c @@ -2008,6 +2008,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, FILE *bond_file = NULL, *file; int pf = -1; int ret; + uint8_t cur_guid[32] = {0}; + uint8_t guid[32] = {0}; /* * Try to get master device name. If something goes @@ -2022,6 +2024,8 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, np = mlx5_nl_portnum(nl_rdma, ibv_dev->name); if (!np) return -1; + if (mlx5_get_device_guid(pci_dev, cur_guid, sizeof(cur_guid)) < 0) + return -1; /* * The Master device might not be on the predefined * port (not on port index 1, it is not garanted), @@ -2050,6 +2054,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, char tmp_str[IF_NAMESIZE + 32]; struct rte_pci_addr pci_addr; struct mlx5_switch_info info; + int ret; /* Process slave interface names in the loop. */ snprintf(tmp_str, sizeof(tmp_str), @@ -2080,15 +2085,6 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, tmp_str); break; } - /* Match PCI address, allows BDF0+pfx or BDFx+pfx. 
*/ - if (pci_dev->domain == pci_addr.domain && - pci_dev->bus == pci_addr.bus && - pci_dev->devid == pci_addr.devid && - ((pci_dev->function == 0 && - pci_dev->function + owner == pci_addr.function) || - (pci_dev->function == owner && - pci_addr.function == owner))) - pf = info.port_name; /* Get ifindex. */ snprintf(tmp_str, sizeof(tmp_str), "/sys/class/net/%s/ifindex", ifname); @@ -2105,6 +2101,30 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, bond_info->ports[info.port_name].pci_addr = pci_addr; bond_info->ports[info.port_name].ifindex = ifindex; bond_info->n_port++; + /* + * Under socket direct mode, bonding will use + * system_image_guid as identification. + * After OFED 5.4, guid is readable (ret >= 0) under sysfs. + * All bonding members should have the same guid even if driver + * is using PCIe BDF. + */ + ret = mlx5_get_device_guid(&pci_addr, guid, sizeof(guid)); + if (ret < 0) + break; + else if (ret > 0) { + if (!memcmp(guid, cur_guid, sizeof(guid)) && + owner == info.port_name && + (owner != 0 || (owner == 0 && + !rte_pci_addr_cmp(pci_dev, &pci_addr)))) + pf = info.port_name; + } else if (pci_dev->domain == pci_addr.domain && + pci_dev->bus == pci_addr.bus && + pci_dev->devid == pci_addr.devid && + ((pci_dev->function == 0 && + pci_dev->function + owner == pci_addr.function) || + (pci_dev->function == owner && + pci_addr.function == owner))) + pf = info.port_name; } if (pf >= 0) { /* Get bond interface info */ @@ -2117,6 +2137,11 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev, DRV_LOG(INFO, "PF device %u, bond device %u(%s)", ifindex, bond_info->ifindex, bond_info->ifname); } + if (owner == 0 && pf != 0) { + DRV_LOG(INFO, "PCIe instance %04x:%02x:%02x.%x isn't bonding owner", + pci_dev->domain, pci_dev->bus, pci_dev->devid, + pci_dev->function); + } return pf; } -- 2.27.0 ^ permalink raw reply [flat|nested] 11+ messages in thread