DPDK patches and discussions
* [dpdk-dev] [RFC 0/9] support global syntax
@ 2020-12-18 15:16 Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                   ` (8 more replies)
  0 siblings, 9 replies; 118+ messages in thread
From: Xueming Li @ 2020-12-18 15:16 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Olivier Matz, Matan Azrad
  Cc: dev, xuemingl, Asaf Penso

The new global device syntax [1] identifies a device by a full bus,
class and driver description, for example:
 -a bus=pci,id=82:00.0/class=eth/driver=mlx5,dv_flow_en=1

This series enables the global syntax with backward compatibility by
trying the new global syntax first and falling back to legacy parsing.

For PCI devices, the BDF is retrieved from the "id" attribute of the bus
section, or parsed from the device name if "id" is not available.
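
As a rough illustration of the layering described above, the sketch below
splits a device string on '/' and pulls one key out of one layer. It is not
the EAL parser: devstr_get() and layer_get() are hypothetical helper names
invented for this example, and the real code resolves bus and class handles
instead of copying strings.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not a DPDK API: copy the value of "key" from one
 * comma-separated "k=v,k=v" layer into out. Returns 0 on success, -1 if
 * the key is absent or the value does not fit.
 */
static int
layer_get(const char *layer, size_t layer_len, const char *key,
	  char *out, size_t out_len)
{
	size_t key_len = strlen(key);
	const char *p = layer;
	const char *end = layer + layer_len;

	while (p < end) {
		const char *pair_end = memchr(p, ',', (size_t)(end - p));

		if (pair_end == NULL)
			pair_end = end;
		if ((size_t)(pair_end - p) > key_len &&
		    memcmp(p, key, key_len) == 0 && p[key_len] == '=') {
			size_t val_len = (size_t)(pair_end - p) - key_len - 1;

			if (val_len >= out_len)
				return -1;
			memcpy(out, p + key_len + 1, val_len);
			out[val_len] = '\0';
			return 0;
		}
		/* Step over the comma, or stop at the end of the layer. */
		p = (pair_end < end) ? pair_end + 1 : end;
	}
	return -1;
}

/*
 * Hypothetical helper: find the layer that starts with "layer_key="
 * ("bus", "class" or "driver") in a '/'-separated device string and
 * extract "key" from it.
 */
static int
devstr_get(const char *devstr, const char *layer_key, const char *key,
	   char *out, size_t out_len)
{
	size_t lk_len = strlen(layer_key);
	const char *p = devstr;

	while (*p != '\0') {
		size_t len = strcspn(p, "/");

		if (len > lk_len && memcmp(p, layer_key, lk_len) == 0 &&
		    p[lk_len] == '=')
			return layer_get(p, len, key, out, out_len);
		p += len;
		if (*p == '/')
			p++;
	}
	return -1;
}
```

With the example string above, looking up "id" in the "bus" layer yields the
BDF, while a legacy string such as "82:00.0,dv_flow_en=1" has no "bus" layer
and the lookup fails, which is the case where the legacy parser takes over.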

[1]:
https://www.dpdk.org/wp-content/uploads/sites/35/2018/10/am-07-DPDK-hotplug-20180905.pdf


Xueming Li (9):
  devargs: fix data buffer storage type
  devargs: fix memory leak on parsing error
  devargs: fix memory leak in legacy parser
  devargs: fix buffer data memory leak
  kvargs: add get by key function
  devargs: support new global device syntax
  bus/pci: add new global device syntax support
  common/mlx5: support device global syntax
  net/mlx5: support new device global syntax

 app/test-pmd/config.c                        |  4 +--
 app/test-pmd/testpmd.c                       |  4 +--
 drivers/bus/pci/pci_common.c                 | 18 ++++++++--
 drivers/bus/vdev/vdev.c                      |  5 +--
 drivers/common/mlx5/mlx5_common_pci.c        |  6 +++-
 drivers/net/failsafe/failsafe_args.c         |  3 +-
 drivers/net/failsafe/failsafe_eal.c          |  2 +-
 drivers/net/mlx5/linux/mlx5_os.c             | 18 ++++++++--
 drivers/net/mlx5/mlx5.c                      |  6 +++-
 examples/multi_process/hotplug_mp/commands.c |  8 ++---
 examples/vdpa/main.c                         |  6 ++--
 lib/librte_eal/common/eal_common_dev.c       |  7 ++--
 lib/librte_eal/common/eal_common_devargs.c   | 36 ++++++++++++++++----
 lib/librte_eal/common/hotplug_mp.c           |  5 ++-
 lib/librte_eal/include/rte_dev.h             |  2 +-
 lib/librte_eal/include/rte_devargs.h         |  2 +-
 lib/librte_ethdev/rte_ethdev.c               |  5 +--
 lib/librte_kvargs/rte_kvargs.c               | 20 +++++++++++
 lib/librte_kvargs/rte_kvargs.h               | 14 ++++++++
 lib/librte_kvargs/version.map                |  1 +
 20 files changed, 134 insertions(+), 38 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type
  2020-12-18 15:16 [dpdk-dev] [RFC 0/9] support global syntax Xueming Li
@ 2020-12-18 15:16 ` Xueming Li
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 0/7] eal: support global syntax Xueming Li
                     ` (66 more replies)
  2020-12-18 15:16 ` [dpdk-dev] [RFC 2/9] devargs: fix memory leak on parsing error Xueming Li
                   ` (7 subsequent siblings)
  8 siblings, 67 replies; 118+ messages in thread
From: Xueming Li @ 2020-12-18 15:16 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Olivier Matz, Matan Azrad
  Cc: dev, xuemingl, Asaf Penso, gaetan.rivet, stable

The data field of struct rte_devargs is used as a scratch buffer and is
therefore not constant; remove the const qualifier.

Also fix references to the data field of struct rte_devargs.

Fixes: 338327d731e6 ("devargs: add function to parse device layers")
Fixes: c99a2d4c6b7f ("eal: implement device iteration initialization")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 examples/vdpa/main.c                   | 6 ++++--
 lib/librte_eal/common/eal_common_dev.c | 3 +--
 lib/librte_eal/include/rte_dev.h       | 2 +-
 lib/librte_eal/include/rte_devargs.h   | 2 +-
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/examples/vdpa/main.c b/examples/vdpa/main.c
index 97e967b9a2..88e9d8780a 100644
--- a/examples/vdpa/main.c
+++ b/examples/vdpa/main.c
@@ -294,9 +294,10 @@ static void cmd_list_vdpa_devices_parsed(
 	struct rte_vdpa_device *vdev;
 	struct rte_device *dev;
 	struct rte_dev_iterator dev_iter;
+	char args[16];
 
 	cmdline_printf(cl, "device name\tqueue num\tsupported features\n");
-	RTE_DEV_FOREACH(dev, "class=vdpa", &dev_iter) {
+	RTE_DEV_FOREACH(dev, strcpy(args, "class=vdpa"), &dev_iter) {
 		vdev = rte_vdpa_find_device_by_name(dev->name);
 		if (!vdev)
 			continue;
@@ -528,6 +529,7 @@ main(int argc, char *argv[])
 	struct rte_vdpa_device *vdev;
 	struct rte_device *dev;
 	struct rte_dev_iterator dev_iter;
+	char args[16];
 
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
@@ -549,7 +551,7 @@ main(int argc, char *argv[])
 		cmdline_interact(cl);
 		cmdline_stdin_exit(cl);
 	} else {
-		RTE_DEV_FOREACH(dev, "class=vdpa", &dev_iter) {
+		RTE_DEV_FOREACH(dev, strcpy(args, "class=vdpa"), &dev_iter) {
 			vdev = rte_vdpa_find_device_by_name(dev->name);
 			if (vdev == NULL) {
 				rte_panic("Failed to find vDPA dev for %s\n",
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 8a3bd3100a..793fbdf24b 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -572,8 +572,7 @@ rte_dev_event_callback_process(const char *device_name,
 }
 
 int
-rte_dev_iterator_init(struct rte_dev_iterator *it,
-		      const char *dev_str)
+rte_dev_iterator_init(struct rte_dev_iterator *it, char *dev_str)
 {
 	struct rte_devargs devargs;
 	struct rte_class *cls = NULL;
diff --git a/lib/librte_eal/include/rte_dev.h b/lib/librte_eal/include/rte_dev.h
index 6dd72c11a1..b320e98637 100644
--- a/lib/librte_eal/include/rte_dev.h
+++ b/lib/librte_eal/include/rte_dev.h
@@ -299,7 +299,7 @@ typedef void *(*rte_dev_iterate_t)(const void *start,
  */
 __rte_experimental
 int
-rte_dev_iterator_init(struct rte_dev_iterator *it, const char *str);
+rte_dev_iterator_init(struct rte_dev_iterator *it, char *str);
 
 /**
  * Iterates on a device iterator.
diff --git a/lib/librte_eal/include/rte_devargs.h b/lib/librte_eal/include/rte_devargs.h
index 296f19324f..8a5ffa2af2 100644
--- a/lib/librte_eal/include/rte_devargs.h
+++ b/lib/librte_eal/include/rte_devargs.h
@@ -69,7 +69,7 @@ struct rte_devargs {
 	struct rte_class *cls; /**< class handle. */
 	const char *bus_str; /**< bus-related part of device string. */
 	const char *cls_str; /**< class-related part of device string. */
-	const char *data; /**< Device string storage. */
+	char *data; /**< Device string storage. */
 };
 
 /**
-- 
2.25.1



* [dpdk-dev] [RFC 2/9] devargs: fix memory leak on parsing error
  2020-12-18 15:16 [dpdk-dev] [RFC 0/9] support global syntax Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
@ 2020-12-18 15:16 ` Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 3/9] devargs: fix memory leak in legacy parser Xueming Li
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2020-12-18 15:16 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Olivier Matz, Matan Azrad
  Cc: dev, xuemingl, Asaf Penso, gaetan.rivet, stable

This patch fixes a memory leak in the parsing error handling.

Fixes: 338327d731e6 ("devargs: add function to parse device layers")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_eal/common/eal_common_devargs.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index fcf3d9a3cc..f36f71fbce 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -163,6 +163,11 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist)
 			rte_kvargs_free(layers[i].kvlist);
 	}
+	if (ret && devargs->data && devargs->data != devstr) {
+		/* Free duplicated data. */
+		free(devargs->data);
+		devargs->data = NULL;
+	}
 	if (ret != 0)
 		rte_errno = -ret;
 	return ret;
-- 
2.25.1



* [dpdk-dev] [RFC 3/9] devargs: fix memory leak in legacy parser
  2020-12-18 15:16 [dpdk-dev] [RFC 0/9] support global syntax Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 2/9] devargs: fix memory leak on parsing error Xueming Li
@ 2020-12-18 15:16 ` Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 4/9] devargs: fix buffer data memory leak Xueming Li
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2020-12-18 15:16 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Olivier Matz, Matan Azrad
  Cc: dev, xuemingl, Asaf Penso, gaetan.rivet, stable

The data field was designed as the parser buffer and is released when
the devargs structure is freed. The legacy parser did not save the
duplicated device arguments to data, which caused a memory leak.

This patch fixes the leak by saving the newly allocated memory in the
data field.

Fixes: 4969f5914c9e ("devargs: introduce new parsing helper")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_eal/common/eal_common_devargs.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index f36f71fbce..3c4774c88a 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -224,13 +224,14 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	da->bus = bus;
 	/* Parse eventual device arguments */
 	if (devname[i] == ',')
-		da->args = strdup(&devname[i + 1]);
+		da->data = strdup(&devname[i + 1]);
 	else
-		da->args = strdup("");
-	if (da->args == NULL) {
+		da->data = strdup("");
+	if (da->data == NULL) {
 		RTE_LOG(ERR, EAL, "not enough memory to parse arguments\n");
 		return -ENOMEM;
 	}
+	da->args = da->data;
 	return 0;
 }
 
-- 
2.25.1
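
The ownership rule this patch relies on (data owns the allocation, args is
only a view into it) can be sketched with a stand-in structure. Everything
named demo_* below is hypothetical and only mirrors the two relevant fields
of struct rte_devargs; demo_strdup() stands in for strdup().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the two relevant fields of struct rte_devargs. */
struct demo_devargs {
	const char *args; /* view into data, never freed directly */
	char *data;       /* owning pointer for the duplicated string */
};

/* Portable stand-in for strdup(). */
static char *
demo_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/* Duplicate everything after the first ',' (or "") into da->data. */
static int
demo_parse_args(struct demo_devargs *da, const char *devname)
{
	const char *comma = strchr(devname, ',');

	da->data = demo_strdup(comma != NULL ? comma + 1 : "");
	if (da->data == NULL)
		return -ENOMEM;
	da->args = da->data; /* alias, ownership stays with data */
	return 0;
}

/* Release the buffer exactly once, through the owning pointer. */
static void
demo_reset(struct demo_devargs *da)
{
	free(da->data);
	da->data = NULL;
	da->args = NULL;
}
```

Because args only aliases data, every release site frees data and clears
both pointers, so no caller needs to know which parser filled the structure.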



* [dpdk-dev] [RFC 4/9] devargs: fix buffer data memory leak
  2020-12-18 15:16 [dpdk-dev] [RFC 0/9] support global syntax Xueming Li
                   ` (2 preceding siblings ...)
  2020-12-18 15:16 ` [dpdk-dev] [RFC 3/9] devargs: fix memory leak in legacy parser Xueming Li
@ 2020-12-18 15:16 ` Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 5/9] kvargs: add get by key function Xueming Li
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2020-12-18 15:16 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Olivier Matz, Matan Azrad
  Cc: dev, xuemingl, Asaf Penso, gaetan.rivet, stable

The buffer of struct rte_devargs was moved from the args field to the
data field, but not all references were updated accordingly, so memory
leaked when devargs were released.

Free the data field of struct rte_devargs instead.

Fixes: 338327d731e6 ("devargs: add function to parse device layers")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 app/test-pmd/config.c                        | 4 ++--
 app/test-pmd/testpmd.c                       | 4 ++--
 drivers/bus/vdev/vdev.c                      | 5 +++--
 drivers/net/failsafe/failsafe_args.c         | 3 ++-
 drivers/net/failsafe/failsafe_eal.c          | 2 +-
 examples/multi_process/hotplug_mp/commands.c | 8 ++++----
 lib/librte_eal/common/eal_common_dev.c       | 4 ++--
 lib/librte_eal/common/eal_common_devargs.c   | 7 ++++---
 lib/librte_eal/common/hotplug_mp.c           | 5 ++---
 lib/librte_ethdev/rte_ethdev.c               | 5 +++--
 10 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index b51de59e1e..e7f456692b 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -593,8 +593,8 @@ device_infos_display(const char *identifier)
 
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
+		if (da.data)
+			free(da.data);
 		return;
 	}
 
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 33fc0fddf5..66f3ff9320 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3056,8 +3056,8 @@ detach_devargs(char *identifier)
 	memset(&da, 0, sizeof(da));
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
+		if (da.data)
+			free(da.data);
 		return;
 	}
 
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index acfd78828f..43375bb334 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -236,9 +236,10 @@ alloc_devargs(const char *name, const char *args)
 
 	devargs->bus = &rte_vdev_bus;
 	if (args)
-		devargs->args = strdup(args);
+		devargs->data = strdup(args);
 	else
-		devargs->args = strdup("");
+		devargs->data = strdup("");
+	devargs->args = devargs->data;
 
 	ret = strlcpy(devargs->name, name, sizeof(devargs->name));
 	if (ret < 0 || ret >= (int)sizeof(devargs->name)) {
diff --git a/drivers/net/failsafe/failsafe_args.c b/drivers/net/failsafe/failsafe_args.c
index 707490b94c..5e507bffbc 100644
--- a/drivers/net/failsafe/failsafe_args.c
+++ b/drivers/net/failsafe/failsafe_args.c
@@ -451,7 +451,8 @@ failsafe_args_free(struct rte_eth_dev *dev)
 		sdev->cmdline = NULL;
 		free(sdev->fd_str);
 		sdev->fd_str = NULL;
-		free(sdev->devargs.args);
+		free(sdev->devargs.data);
+		sdev->devargs.data = NULL;
 		sdev->devargs.args = NULL;
 	}
 }
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index b9fc508673..f066c053f3 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -79,7 +79,7 @@ fs_bus_init(struct rte_eth_dev *dev)
 					rte_eth_devices[pid].device->devargs;
 
 			/* Take control of probed device. */
-			free(da->args);
+			free(da->data);
 			memset(da, 0, sizeof(*da));
 			if (probed_da != NULL)
 				snprintf(devstr, sizeof(devstr), "%s,%s",
diff --git a/examples/multi_process/hotplug_mp/commands.c b/examples/multi_process/hotplug_mp/commands.c
index a8a39d07f7..e77585e5b4 100644
--- a/examples/multi_process/hotplug_mp/commands.c
+++ b/examples/multi_process/hotplug_mp/commands.c
@@ -121,8 +121,8 @@ static void cmd_dev_attach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
+		if (da.data)
+			free(da.data);
 		return;
 	}
 
@@ -168,8 +168,8 @@ static void cmd_dev_detach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
+		if (da.data)
+			free(da.data);
 		return;
 	}
 
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 793fbdf24b..f65a9594cc 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -186,7 +186,7 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
 
 err_devarg:
 	if (rte_devargs_remove(da) != 0) {
-		free(da->args);
+		free(da->data);
 		free(da);
 	}
 	return ret;
@@ -585,7 +585,7 @@ rte_dev_iterator_init(struct rte_dev_iterator *it, char *dev_str)
 	it->bus_str = NULL;
 	it->cls_str = NULL;
 
-	devargs.data = dev_str;
+	devargs.data = NULL;
 	if (rte_devargs_layers_parse(&devargs, dev_str))
 		goto get_out;
 
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index 3c4774c88a..e1a3cd7367 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -284,7 +284,8 @@ rte_devargs_insert(struct rte_devargs **da)
 			/* device already in devargs list, must be updated */
 			listed_da->type = (*da)->type;
 			listed_da->policy = (*da)->policy;
-			free(listed_da->args);
+			if (listed_da->data)
+				free(listed_da->data);
 			listed_da->args = (*da)->args;
 			listed_da->bus = (*da)->bus;
 			listed_da->cls = (*da)->cls;
@@ -332,7 +333,7 @@ rte_devargs_add(enum rte_devtype devtype, const char *devargs_str)
 
 fail:
 	if (devargs) {
-		free(devargs->args);
+		free(devargs->data);
 		free(devargs);
 	}
 
@@ -352,7 +353,7 @@ rte_devargs_remove(struct rte_devargs *devargs)
 		if (strcmp(d->bus->name, devargs->bus->name) == 0 &&
 		    strcmp(d->name, devargs->name) == 0) {
 			TAILQ_REMOVE(&devargs_list, d, next);
-			free(d->args);
+			free(d->data);
 			free(d);
 			return 0;
 		}
diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c
index ee791903b3..f0f7c61048 100644
--- a/lib/librte_eal/common/hotplug_mp.c
+++ b/lib/librte_eal/common/hotplug_mp.c
@@ -118,8 +118,7 @@ __handle_secondary_request(void *param)
 		ret = rte_devargs_parse(&da, req->devargs);
 		if (ret != 0)
 			goto finish;
-		free(da.args); /* we don't need those */
-		da.args = NULL;
+		free(da.data); /* we don't need those */
 
 		ret = eal_dev_hotplug_request_to_secondary(&tmp_req);
 		if (ret != 0) {
@@ -283,7 +282,7 @@ static void __handle_primary_request(void *param)
 
 		ret = local_dev_remove(dev);
 quit:
-		free(da->args);
+		free(da->data);
 		free(da);
 		break;
 	default:
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 17ddacc78d..4976961d13 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -244,7 +244,8 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 		goto error;
 	}
 	iter->cls_str = cls_str;
-	free(devargs.args); /* allocated by rte_devargs_parse() */
+	free(devargs.data); /* allocated by rte_devargs_parse() */
+	devargs.data = NULL;
 	devargs.args = NULL;
 
 	iter->bus = devargs.bus;
@@ -284,7 +285,7 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 	if (ret == -ENOTSUP)
 		RTE_ETHDEV_LOG(ERR, "Bus %s does not support iterating.\n",
 				iter->bus->name);
-	free(devargs.args);
+	free(devargs.data);
 	free(bus_str);
 	free(cls_str);
 	return ret;
-- 
2.25.1



* [dpdk-dev] [RFC 5/9] kvargs: add get by key function
  2020-12-18 15:16 [dpdk-dev] [RFC 0/9] support global syntax Xueming Li
                   ` (3 preceding siblings ...)
  2020-12-18 15:16 ` [dpdk-dev] [RFC 4/9] devargs: fix buffer data memory leak Xueming Li
@ 2020-12-18 15:16 ` Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 6/9] devargs: support new global device syntax Xueming Li
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2020-12-18 15:16 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Olivier Matz, Matan Azrad
  Cc: dev, xuemingl, Asaf Penso

Add a new function to get the value of a specific key from a kvargs list.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_kvargs/rte_kvargs.c | 20 ++++++++++++++++++++
 lib/librte_kvargs/rte_kvargs.h | 14 ++++++++++++++
 lib/librte_kvargs/version.map  |  1 +
 3 files changed, 35 insertions(+)

diff --git a/lib/librte_kvargs/rte_kvargs.c b/lib/librte_kvargs/rte_kvargs.c
index 285081c86c..bc734915f9 100644
--- a/lib/librte_kvargs/rte_kvargs.c
+++ b/lib/librte_kvargs/rte_kvargs.c
@@ -160,6 +160,26 @@ rte_kvargs_free(struct rte_kvargs *kvlist)
 	free(kvlist);
 }
 
+/* Look up the value for a key in the rte_kvargs structure. */
+const char *
+rte_kvargs_get(struct rte_kvargs *kvlist, const char *key)
+{
+	unsigned int i;
+
+	if (!kvlist)
+		return NULL;
+	for (i = 0; i < kvlist->count; ++i) {
+		/* Allows key to be NULL. */
+		if (!key && !kvlist->pairs[i].key)
+			return kvlist->pairs[i].value;
+		if (!key || !kvlist->pairs[i].key)
+			continue;
+		if (!strcmp(kvlist->pairs[i].key, key))
+			return kvlist->pairs[i].value;
+	}
+	return NULL;
+}
+
 /*
  * Parse the arguments "key=value,key=value,..." string and return
  * an allocated structure that contains a key/value list. Also
diff --git a/lib/librte_kvargs/rte_kvargs.h b/lib/librte_kvargs/rte_kvargs.h
index eff598e08b..6d426241ea 100644
--- a/lib/librte_kvargs/rte_kvargs.h
+++ b/lib/librte_kvargs/rte_kvargs.h
@@ -114,6 +114,20 @@ struct rte_kvargs *rte_kvargs_parse_delim(const char *args,
  */
 void rte_kvargs_free(struct rte_kvargs *kvlist);
 
+/**
+ * Get the value matching the given key.
+ *
+ * @param kvlist
+ *   The rte_kvargs structure.
+ * @param key
+ *   The key that should match.
+ *
+ * @return
+ *   The value that matches, NULL if not found.
+ */
+__rte_experimental
+const char *rte_kvargs_get(struct rte_kvargs *kvlist, const char *key);
+
 /**
  * Call a handler function for each key/value matching the key
  *
diff --git a/lib/librte_kvargs/version.map b/lib/librte_kvargs/version.map
index ed375bf4a3..d6cde16f30 100644
--- a/lib/librte_kvargs/version.map
+++ b/lib/librte_kvargs/version.map
@@ -14,5 +14,6 @@ EXPERIMENTAL {
 
 	rte_kvargs_parse_delim;
 	rte_kvargs_strcmp;
+	rte_kvargs_get;
 
 };
-- 
2.25.1
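
The lookup semantics of the new function can be tried outside DPDK with
stand-in types. The demo_* names below are hypothetical and are not the
librte_kvargs definitions; the loop mirrors the logic of the patch, including
the first-match rule and the NULL-key case.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for the kvargs pair and list, sized for the example. */
struct demo_pair {
	const char *key;
	const char *value;
};

struct demo_kvlist {
	unsigned int count;
	struct demo_pair pairs[4];
};

/* Return the value of the first pair whose key matches, else NULL. */
static const char *
demo_kvargs_get(const struct demo_kvlist *kvlist, const char *key)
{
	unsigned int i;

	if (kvlist == NULL)
		return NULL;
	for (i = 0; i < kvlist->count; ++i) {
		const char *k = kvlist->pairs[i].key;

		/* A NULL key only matches a pair stored without a key. */
		if (key == NULL && k == NULL)
			return kvlist->pairs[i].value;
		if (key == NULL || k == NULL)
			continue;
		if (strcmp(k, key) == 0)
			return kvlist->pairs[i].value;
	}
	return NULL;
}
```

Note that the returned pointer refers to storage owned by the list, so in
the real API it would stop being valid once the list is freed.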



* [dpdk-dev] [RFC 6/9] devargs: support new global device syntax
  2020-12-18 15:16 [dpdk-dev] [RFC 0/9] support global syntax Xueming Li
                   ` (4 preceding siblings ...)
  2020-12-18 15:16 ` [dpdk-dev] [RFC 5/9] kvargs: add get by key function Xueming Li
@ 2020-12-18 15:16 ` Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 7/9] bus/pci: add new global device syntax support Xueming Li
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2020-12-18 15:16 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Olivier Matz, Matan Azrad
  Cc: dev, xuemingl, Asaf Penso

When parsing a device string, try the new global syntax first, then
fall back to the legacy syntax if that fails.

Example of the new global syntax:
 -a bus=pci,id=82:00.0/class=eth/driver=mlx5,dv_flow_en=1

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_eal/common/eal_common_devargs.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index e1a3cd7367..a79eea12d3 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -57,6 +57,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 	struct rte_class *cls = NULL;
 	struct rte_bus *bus = NULL;
 	const char *s = devstr;
+	const char *id;
 	size_t nblayer;
 	size_t i = 0;
 	int ret = 0;
@@ -116,6 +117,8 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist == NULL)
 			continue;
 		kv = &layers[i].kvlist->pairs[0];
+		if (!kv->key)
+			continue;
 		if (strcmp(kv->key, "bus") == 0) {
 			bus = rte_bus_find_by_name(kv->value);
 			if (bus == NULL) {
@@ -124,6 +127,14 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 				ret = -EFAULT;
 				goto get_out;
 			}
+			id = rte_kvargs_get(layers[i].kvlist, "id");
+			if (!id) {
+				RTE_LOG(ERR, EAL, "Could not find bus id \"%s\"\n",
+					devstr);
+				ret = -EFAULT;
+				goto get_out;
+			}
+			strncpy(devargs->name, id, sizeof(devargs->name) - 1);
 		} else if (strcmp(kv->key, "class") == 0) {
 			cls = rte_class_find_by_name(kv->value);
 			if (cls == NULL) {
@@ -190,6 +201,12 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	if (da == NULL)
 		return -EINVAL;
 
+	/* First try parsing according to the new global syntax. */
+	if (rte_devargs_layers_parse(da, dev) == 0 && da->bus && da->cls)
+		return 0;
+
+	/* Legacy syntax check: */
+
 	/* Retrieve eventual bus info */
 	do {
 		devname = dev;
-- 
2.25.1
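
The selection rule in rte_devargs_parse() above accepts the global syntax
only when both a bus and a class were resolved. A simplified sketch of that
decision follows; it only checks that the layers are present in the string,
whereas the real code looks up registered bus and class handles. All demo_*
names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

enum demo_syntax {
	DEMO_SYNTAX_GLOBAL,
	DEMO_SYNTAX_LEGACY,
};

/* Return 1 if any '/'-separated layer starts with prefix, else 0. */
static int
demo_has_layer(const char *dev, const char *prefix)
{
	size_t plen = strlen(prefix);
	const char *p = dev;

	while (*p != '\0') {
		size_t len = strcspn(p, "/");

		if (len >= plen && strncmp(p, prefix, plen) == 0)
			return 1;
		p += len;
		if (*p == '/')
			p++;
	}
	return 0;
}

/* Global syntax needs both layers; anything else goes to legacy. */
static enum demo_syntax
demo_select_syntax(const char *dev)
{
	if (demo_has_layer(dev, "bus=") && demo_has_layer(dev, "class="))
		return DEMO_SYNTAX_GLOBAL;
	return DEMO_SYNTAX_LEGACY;
}
```

One consequence of the rule as written in this RFC appears to be that a
string carrying only a bus layer, such as "bus=pci,id=82:00.0", is handed to
the legacy parser.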



* [dpdk-dev] [RFC 7/9] bus/pci: add new global device syntax support
  2020-12-18 15:16 [dpdk-dev] [RFC 0/9] support global syntax Xueming Li
                   ` (5 preceding siblings ...)
  2020-12-18 15:16 ` [dpdk-dev] [RFC 6/9] devargs: support new global device syntax Xueming Li
@ 2020-12-18 15:16 ` Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 8/9] common/mlx5: support device global syntax Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 9/9] net/mlx5: support new " Xueming Li
  8 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2020-12-18 15:16 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Olivier Matz, Matan Azrad
  Cc: dev, xuemingl, Asaf Penso

With the new global device syntax, this patch first tries to get the PCI
BDF from the bus "id" argument and falls back to the device name if it
is not found. Example:

 -a bus=pci,id=82:00.0/class=eth/driver=mlx5,dv_flow_en=1

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 drivers/bus/pci/pci_common.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 9b8d769287..f6fc80abe8 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -23,6 +23,7 @@
 #include <rte_string_fns.h>
 #include <rte_common.h>
 #include <rte_devargs.h>
+#include <rte_kvargs.h>
 #include <rte_vfio.h>
 
 #include "private.h"
@@ -48,9 +49,20 @@ pci_devargs_lookup(const struct rte_pci_addr *pci_addr)
 {
 	struct rte_devargs *devargs;
 	struct rte_pci_addr addr;
+	struct rte_kvargs *kvlist = NULL;
+	const char *name;
 
 	RTE_EAL_DEVARGS_FOREACH("pci", devargs) {
-		devargs->bus->parse(devargs->name, &addr);
+		name = NULL;
+		if (devargs->bus_str) {
+			kvlist = rte_kvargs_parse(devargs->bus_str, NULL);
+			name = rte_kvargs_get(kvlist, "id");
+		}
+		if (!name)
+			name = devargs->name;
+		devargs->bus->parse(name, &addr);
+		if (kvlist)
+			rte_kvargs_free(kvlist);
 		if (!rte_pci_addr_cmp(pci_addr, &addr))
 			return devargs;
 	}
@@ -71,11 +83,11 @@ pci_name_set(struct rte_pci_device *dev)
 	/* When using a blocklist, only blocked devices will have
 	 * an rte_devargs. Allowed devices won't have one.
 	 */
-	if (devargs != NULL)
+	if (devargs != NULL && strlen(devargs->name))
 		/* If an rte_devargs exists, the generic rte_device uses the
 		 * given name as its name.
 		 */
-		dev->device.name = dev->device.devargs->name;
+		dev->device.name = devargs->name;
 	else
 		/* Otherwise, it uses the internal, canonical form. */
 		dev->device.name = dev->name;
-- 
2.25.1
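
The "id first, then name" lookup in pci_devargs_lookup() can be sketched
without DPDK as follows. The structure and the demo_* functions are
hypothetical stand-ins; the real code calls the bus parse callback on the
chosen string. The sketch accepts both the short "BB:DD.F" and the full
"DDDD:BB:DD.F" forms.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for a PCI address. */
struct demo_pci_addr {
	uint32_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/* Parse "DDDD:BB:DD.F" or "BB:DD.F" (hex fields); 0 on success. */
static int
demo_bdf_parse(const char *s, struct demo_pci_addr *addr)
{
	unsigned int dom = 0, bus = 0, dev = 0, fn = 0;
	char tail;

	/* The trailing %c catches garbage after the function number. */
	if (sscanf(s, "%x:%x:%x.%x%c", &dom, &bus, &dev, &fn, &tail) != 4) {
		dom = 0;
		if (sscanf(s, "%x:%x.%x%c", &bus, &dev, &fn, &tail) != 3)
			return -1;
	}
	if (bus > 0xff || dev > 0x1f || fn > 0x7)
		return -1;
	addr->domain = dom;
	addr->bus = (uint8_t)bus;
	addr->devid = (uint8_t)dev;
	addr->function = (uint8_t)fn;
	return 0;
}

/* Prefer the bus-layer "id" value; fall back to the devargs name. */
static int
demo_pick_addr(const char *id, const char *name, struct demo_pci_addr *addr)
{
	return demo_bdf_parse(id != NULL ? id : name, addr);
}
```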



* [dpdk-dev] [RFC 8/9] common/mlx5: support device global syntax
  2020-12-18 15:16 [dpdk-dev] [RFC 0/9] support global syntax Xueming Li
                   ` (6 preceding siblings ...)
  2020-12-18 15:16 ` [dpdk-dev] [RFC 7/9] bus/pci: add new global device syntax support Xueming Li
@ 2020-12-18 15:16 ` Xueming Li
  2020-12-18 15:16 ` [dpdk-dev] [RFC 9/9] net/mlx5: support new " Xueming Li
  8 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2020-12-18 15:16 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Olivier Matz, Matan Azrad
  Cc: dev, xuemingl, Asaf Penso

This patch supports the new global device syntax:
bus=<bus>,k=v,,,/class=<cls>,k=v,,,/driver=<pmd>,k=v,,,,

To reuse the class name of the global syntax, this patch also changes the
internal class name introduced by commit [1] to align with the RTE class
name.

[1]
8a41f4deccc3: common/mlx5: introduce layer for multiple class drivers

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 drivers/common/mlx5/mlx5_common_pci.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/common/mlx5/mlx5_common_pci.c b/drivers/common/mlx5/mlx5_common_pci.c
index 5208972bb6..b1eda7b3c8 100644
--- a/drivers/common/mlx5/mlx5_common_pci.c
+++ b/drivers/common/mlx5/mlx5_common_pci.c
@@ -4,6 +4,7 @@
 
 #include <stdlib.h>
 #include <rte_malloc.h>
+#include <rte_class.h>
 #include "mlx5_common_utils.h"
 #include "mlx5_common_pci.h"
 
@@ -26,7 +27,7 @@ static const struct {
 	unsigned int driver_class;
 } mlx5_classes[] = {
 	{ .name = "vdpa", .driver_class = MLX5_CLASS_VDPA },
-	{ .name = "net", .driver_class = MLX5_CLASS_NET },
+	{ .name = "eth", .driver_class = MLX5_CLASS_NET },
 	{ .name = "regex", .driver_class = MLX5_CLASS_REGEX },
 };
 
@@ -115,6 +116,9 @@ parse_class_options(const struct rte_devargs *devargs)
 
 	if (devargs == NULL)
 		return 0;
+	if (devargs->cls)
+		/* support new global syntax */
+		return class_name_to_value(devargs->cls->name);
 	kvlist = rte_kvargs_parse(devargs->args, NULL);
 	if (kvlist == NULL)
 		return 0;
-- 
2.25.1
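
The effect of renaming "net" to "eth" in the class table, and of giving the
class layer precedence over the legacy class devarg, can be sketched as
below. The flag values and demo_* names are hypothetical and do not match
the mlx5 definitions; returning -1 for an unknown name is also an assumption
of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical class flags, not the mlx5 values. */
enum {
	DEMO_CLASS_NET = 1 << 0,
	DEMO_CLASS_VDPA = 1 << 1,
	DEMO_CLASS_REGEX = 1 << 2,
};

static const struct {
	const char *name;
	int driver_class;
} demo_classes[] = {
	{ "vdpa", DEMO_CLASS_VDPA },
	{ "eth", DEMO_CLASS_NET }, /* matches the RTE class name */
	{ "regex", DEMO_CLASS_REGEX },
};

/* Map a class name to its flag, or -1 if the name is unknown. */
static int
demo_class_name_to_value(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(demo_classes) / sizeof(demo_classes[0]); i++)
		if (strcmp(demo_classes[i].name, name) == 0)
			return demo_classes[i].driver_class;
	return -1;
}

/*
 * Prefer the class from the global-syntax class layer (cls_name);
 * otherwise use the legacy "class=" devarg value, 0 if neither is set.
 */
static int
demo_parse_class(const char *cls_name, const char *legacy_class)
{
	if (cls_name != NULL)
		return demo_class_name_to_value(cls_name);
	if (legacy_class != NULL)
		return demo_class_name_to_value(legacy_class);
	return 0;
}
```

With a single table, a legacy "class=net" devarg would no longer match after
the rename, which may be worth checking against existing command lines.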



* [dpdk-dev] [RFC 9/9] net/mlx5: support new device global syntax
  2020-12-18 15:16 [dpdk-dev] [RFC 0/9] support global syntax Xueming Li
                   ` (7 preceding siblings ...)
  2020-12-18 15:16 ` [dpdk-dev] [RFC 8/9] common/mlx5: support device global syntax Xueming Li
@ 2020-12-18 15:16 ` Xueming Li
  8 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2020-12-18 15:16 UTC (permalink / raw)
  To: Viacheslav Ovsiienko, Thomas Monjalon, Ferruh Yigit,
	Andrew Rybchenko, Olivier Matz, Matan Azrad
  Cc: dev, xuemingl, Asaf Penso

This patch supports the new global device syntax, for example:
	bus=pci,id=BB:DD.F/class=eth/driver=mlx5,devargs,..

The "driver" key, which is part of the new global device syntax, is
ignored when it appears in devargs.

The representor devarg is expected to come from either the class section
or the driver section.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 18 ++++++++++++++++--
 drivers/net/mlx5/mlx5.c          |  6 +++++-
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 917d6be7b8..fae339584b 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -694,13 +694,27 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	if (switch_info->representor && dpdk_dev->devargs) {
 		struct rte_eth_devargs eth_da;
 
-		err = rte_eth_devargs_parse(dpdk_dev->devargs->args, &eth_da);
+		/* Representor should come from the class or driver arguments. */
+		if (dpdk_dev->devargs->cls_str)
+			err = rte_eth_devargs_parse(dpdk_dev->devargs->cls_str,
+						    &eth_da);
 		if (err) {
 			rte_errno = -err;
 			DRV_LOG(ERR, "failed to process device arguments: %s",
-				strerror(rte_errno));
+				dpdk_dev->devargs->cls_str);
 			return NULL;
 		}
+		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
+			/* Support legacy device argument */
+			err = rte_eth_devargs_parse(dpdk_dev->devargs->args,
+						    &eth_da);
+			if (err) {
+				rte_errno = -err;
+				DRV_LOG(ERR, "failed to process device arguments: %s",
+					dpdk_dev->devargs->args);
+				return NULL;
+			}
+		}
 		for (i = 0; i < eth_da.nb_representor_ports; ++i)
 			if (eth_da.representor_ports[i] ==
 			    (uint16_t)switch_info->port_name)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 52a8a252d4..3dc15e21ca 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -39,6 +39,9 @@
 #include "mlx5_flow.h"
 #include "rte_pmd_mlx5.h"
 
+/* Driver type key for new device global syntax. */
+#define MLX5_DRIVER_KEY "driver"
+
 /* Device parameter to enable RX completion queue compression. */
 #define MLX5_RXQ_CQE_COMP_EN "rxq_cqe_comp_en"
 
@@ -1449,7 +1452,7 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 	signed long tmp;
 
 	/* No-op, port representors are processed in mlx5_dev_spawn(). */
-	if (!strcmp(MLX5_REPRESENTOR, key))
+	if (!strcmp(MLX5_DRIVER_KEY, key) || !strcmp(MLX5_REPRESENTOR, key))
 		return 0;
 	errno = 0;
 	tmp = strtol(val, NULL, 0);
@@ -1603,6 +1606,7 @@ int
 mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 {
 	const char **params = (const char *[]){
+		MLX5_DRIVER_KEY,
 		MLX5_RXQ_CQE_COMP_EN,
 		MLX5_RXQ_CQE_PAD_EN,
 		MLX5_RXQ_PKT_PAD_EN,
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v1 0/7] eal: support global syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
@ 2021-01-08 14:54   ` Xueming Li
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 1/7] devargs: fix data buffer storage type Xueming Li
                     ` (65 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-08 14:54 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso

New Global device syntax [1] is used to identify a device with full bus,
class and driver description, example:
 -a bus=pci,id=82:00.0/class=eth/driver=mlx5,dv_flow_en=1

This patch enables the global syntax while keeping backward
compatibility: the parser tries the new global syntax first and falls
back to legacy parsing if that fails.

For a PCI device, the BDF is retrieved from the "id" attribute of the
bus section, or parsed from the device name if "id" is not available.
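
As a rough illustration of the layered string described above, the
following is a minimal standalone sketch, not the EAL parser. The names
split_layers(), is_global_syntax() and struct dev_layers are
hypothetical, and the rule that both a bus and a class layer must be
present before the string counts as global syntax is an assumption of
this sketch.

```c
#include <assert.h>
#include <string.h>

#define MAX_LAYER_LEN 128

/* Hypothetical holder for the three layers of a global-syntax string. */
struct dev_layers {
	char bus[MAX_LAYER_LEN]; /* e.g. "bus=pci,id=82:00.0" */
	char cls[MAX_LAYER_LEN]; /* e.g. "class=eth" */
	char drv[MAX_LAYER_LEN]; /* e.g. "driver=mlx5,dv_flow_en=1" */
};

/*
 * Split devstr on '/' and file each layer by its leading key.
 * Returns 0 on success, -1 on an unknown leading key or an oversized layer.
 */
static int
split_layers(const char *devstr, struct dev_layers *out)
{
	const char *s = devstr;

	memset(out, 0, sizeof(*out));
	while (*s != '\0') {
		size_t len = strcspn(s, "/");
		char *dst;

		if (strncmp(s, "bus=", 4) == 0)
			dst = out->bus;
		else if (strncmp(s, "class=", 6) == 0)
			dst = out->cls;
		else if (strncmp(s, "driver=", 7) == 0)
			dst = out->drv;
		else
			return -1;
		if (len >= MAX_LAYER_LEN)
			return -1;
		memcpy(dst, s, len);
		dst[len] = '\0';
		s += len;
		if (*s == '/')
			s++;
	}
	return 0;
}

/*
 * Returns 1 if the string can be handled as global syntax, 0 if the
 * caller should fall back to the legacy parser (assumed rule: bus and
 * class layers are both required).
 */
static int
is_global_syntax(const char *devstr)
{
	struct dev_layers l;

	return split_layers(devstr, &l) == 0 &&
	       l.bus[0] != '\0' && l.cls[0] != '\0';
}
```

A legacy string such as "82:00.0,dv_flow_en=1" does not start with a
known layer key, so is_global_syntax() returns 0 and the caller would
take the legacy path.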


Depends-on: patch-86058 ("ethdev: refactor representor infrastructure")


[1] Global Device Syntax:
https://www.dpdk.org/wp-content/uploads/sites/35/2018/10/am-07-DPDK-hotplug-20180905.pdf

[2] RFC:
http://patchwork.dpdk.org/project/dpdk/list/?series=14378


Xueming Li (7):
  devargs: fix data buffer storage type
  devargs: fix memory leak on parsing error
  devargs: fix memory leak in legacy parser
  devargs: fix buffer data memory leak
  kvargs: add get by key function
  devargs: support new global device syntax
  bus/pci: add new global device syntax support

 app/test-pmd/config.c                        |  4 +--
 app/test-pmd/testpmd.c                       |  4 +--
 drivers/bus/pci/pci_common.c                 | 18 ++++++++--
 drivers/bus/vdev/vdev.c                      |  5 +--
 drivers/net/failsafe/failsafe_args.c         |  3 +-
 drivers/net/failsafe/failsafe_eal.c          |  2 +-
 examples/multi_process/hotplug_mp/commands.c |  8 ++---
 examples/vdpa/main.c                         |  6 ++--
 lib/librte_eal/common/eal_common_dev.c       |  7 ++--
 lib/librte_eal/common/eal_common_devargs.c   | 36 ++++++++++++++++----
 lib/librte_eal/common/hotplug_mp.c           |  5 ++-
 lib/librte_eal/include/rte_dev.h             |  2 +-
 lib/librte_eal/include/rte_devargs.h         |  2 +-
 lib/librte_ethdev/rte_ethdev.c               |  5 +--
 lib/librte_kvargs/rte_kvargs.c               | 20 +++++++++++
 lib/librte_kvargs/rte_kvargs.h               | 14 ++++++++
 lib/librte_kvargs/version.map                |  1 +
 17 files changed, 108 insertions(+), 34 deletions(-)

-- 
2.25.1


* [dpdk-dev] [PATCH v1 1/7] devargs: fix data buffer storage type
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 0/7] eal: support global syntax Xueming Li
@ 2021-01-08 14:54   ` Xueming Li
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 2/7] devargs: fix memory leak on parsing error Xueming Li
                     ` (64 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-08 14:54 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso, gaetan.rivet, stable

The data field of struct rte_devargs is used as a scratch data buffer,
not as a constant, so remove the const qualifier.

Also fix the references to the data field of struct rte_devargs.

Fixes: 338327d731e6 ("devargs: add function to parse device layers")
Fixes: c99a2d4c6b7f ("eal: implement device iteration initialization")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 examples/vdpa/main.c                   | 6 ++++--
 lib/librte_eal/common/eal_common_dev.c | 3 +--
 lib/librte_eal/include/rte_dev.h       | 2 +-
 lib/librte_eal/include/rte_devargs.h   | 2 +-
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/examples/vdpa/main.c b/examples/vdpa/main.c
index 97e967b9a2..88e9d8780a 100644
--- a/examples/vdpa/main.c
+++ b/examples/vdpa/main.c
@@ -294,9 +294,10 @@ static void cmd_list_vdpa_devices_parsed(
 	struct rte_vdpa_device *vdev;
 	struct rte_device *dev;
 	struct rte_dev_iterator dev_iter;
+	char args[16];
 
 	cmdline_printf(cl, "device name\tqueue num\tsupported features\n");
-	RTE_DEV_FOREACH(dev, "class=vdpa", &dev_iter) {
+	RTE_DEV_FOREACH(dev, strcpy(args, "class=vdpa"), &dev_iter) {
 		vdev = rte_vdpa_find_device_by_name(dev->name);
 		if (!vdev)
 			continue;
@@ -528,6 +529,7 @@ main(int argc, char *argv[])
 	struct rte_vdpa_device *vdev;
 	struct rte_device *dev;
 	struct rte_dev_iterator dev_iter;
+	char args[16];
 
 	ret = rte_eal_init(argc, argv);
 	if (ret < 0)
@@ -549,7 +551,7 @@ main(int argc, char *argv[])
 		cmdline_interact(cl);
 		cmdline_stdin_exit(cl);
 	} else {
-		RTE_DEV_FOREACH(dev, "class=vdpa", &dev_iter) {
+		RTE_DEV_FOREACH(dev, strcpy(args, "class=vdpa"), &dev_iter) {
 			vdev = rte_vdpa_find_device_by_name(dev->name);
 			if (vdev == NULL) {
 				rte_panic("Failed to find vDPA dev for %s\n",
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 8a3bd3100a..793fbdf24b 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -572,8 +572,7 @@ rte_dev_event_callback_process(const char *device_name,
 }
 
 int
-rte_dev_iterator_init(struct rte_dev_iterator *it,
-		      const char *dev_str)
+rte_dev_iterator_init(struct rte_dev_iterator *it, char *dev_str)
 {
 	struct rte_devargs devargs;
 	struct rte_class *cls = NULL;
diff --git a/lib/librte_eal/include/rte_dev.h b/lib/librte_eal/include/rte_dev.h
index 6dd72c11a1..b320e98637 100644
--- a/lib/librte_eal/include/rte_dev.h
+++ b/lib/librte_eal/include/rte_dev.h
@@ -299,7 +299,7 @@ typedef void *(*rte_dev_iterate_t)(const void *start,
  */
 __rte_experimental
 int
-rte_dev_iterator_init(struct rte_dev_iterator *it, const char *str);
+rte_dev_iterator_init(struct rte_dev_iterator *it, char *str);
 
 /**
  * Iterates on a device iterator.
diff --git a/lib/librte_eal/include/rte_devargs.h b/lib/librte_eal/include/rte_devargs.h
index 296f19324f..8a5ffa2af2 100644
--- a/lib/librte_eal/include/rte_devargs.h
+++ b/lib/librte_eal/include/rte_devargs.h
@@ -69,7 +69,7 @@ struct rte_devargs {
 	struct rte_class *cls; /**< class handle. */
 	const char *bus_str; /**< bus-related part of device string. */
 	const char *cls_str; /**< class-related part of device string. */
-	const char *data; /**< Device string storage. */
+	char *data; /**< Device string storage. */
 };
 
 /**
-- 
2.25.1


* [dpdk-dev] [PATCH v1 2/7] devargs: fix memory leak on parsing error
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 0/7] eal: support global syntax Xueming Li
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 1/7] devargs: fix data buffer storage type Xueming Li
@ 2021-01-08 14:54   ` Xueming Li
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 3/7] devargs: fix memory leak in legacy parser Xueming Li
                     ` (63 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-08 14:54 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso, gaetan.rivet, stable

This patch fixes a memory leak in the parsing error handling path.

Fixes: 338327d731e6 ("devargs: add function to parse device layers")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_eal/common/eal_common_devargs.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index fcf3d9a3cc..f36f71fbce 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -163,6 +163,11 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist)
 			rte_kvargs_free(layers[i].kvlist);
 	}
+	if (ret && devargs->data && devargs->data != devstr) {
+		/* Free duplicated data. */
+		free(devargs->data);
+		devargs->data = NULL;
+	}
 	if (ret != 0)
 		rte_errno = -ret;
 	return ret;
-- 
2.25.1


* [dpdk-dev] [PATCH v1 3/7] devargs: fix memory leak in legacy parser
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (2 preceding siblings ...)
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 2/7] devargs: fix memory leak on parsing error Xueming Li
@ 2021-01-08 14:54   ` Xueming Li
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 4/7] devargs: fix buffer data memory leak Xueming Li
                     ` (62 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-08 14:54 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso, gaetan.rivet, stable

The data field was designed as the parser buffer and is released once,
when the struct memory is released. The duplicated device arguments were
not saved to data, which caused a memory leak.

This patch fixes the leak by saving the newly allocated memory to the
data field.
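
The ownership rule described above can be sketched in isolation as
follows. This is a simplified model, not the EAL code: struct
mini_devargs and its helpers are hypothetical stand-ins that keep only
the two fields involved, with data as the single owning pointer and args
as a view into it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct rte_devargs; only two fields modelled. */
struct mini_devargs {
	const char *args; /* view into data, never freed directly */
	char *data;       /* owning pointer, freed exactly once */
};

/* Duplicate the argument string into data and alias it through args. */
static int
mini_devargs_set(struct mini_devargs *da, const char *args)
{
	const char *src = args != NULL ? args : "";
	size_t len = strlen(src) + 1;

	da->data = malloc(len);
	if (da->data == NULL)
		return -1;
	memcpy(da->data, src, len);
	da->args = da->data;
	return 0;
}

/* Release the buffer through its owner and clear both pointers. */
static void
mini_devargs_reset(struct mini_devargs *da)
{
	free(da->data);
	da->data = NULL;
	da->args = NULL;
}
```

Because every release goes through data, a caller that only ever frees
data cannot miss the duplicated argument string.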

Fixes: 4969f5914c9e ("devargs: introduce new parsing helper")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_eal/common/eal_common_devargs.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index f36f71fbce..3c4774c88a 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -224,13 +224,14 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	da->bus = bus;
 	/* Parse eventual device arguments */
 	if (devname[i] == ',')
-		da->args = strdup(&devname[i + 1]);
+		da->data = strdup(&devname[i + 1]);
 	else
-		da->args = strdup("");
-	if (da->args == NULL) {
+		da->data = strdup("");
+	if (da->data == NULL) {
 		RTE_LOG(ERR, EAL, "not enough memory to parse arguments\n");
 		return -ENOMEM;
 	}
+	da->args = da->data;
 	return 0;
 }
 
-- 
2.25.1


* [dpdk-dev] [PATCH v1 4/7] devargs: fix buffer data memory leak
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (3 preceding siblings ...)
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 3/7] devargs: fix memory leak in legacy parser Xueming Li
@ 2021-01-08 14:54   ` Xueming Li
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 5/7] kvargs: add get by key function Xueming Li
                     ` (61 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-08 14:54 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso, gaetan.rivet, stable

The struct rte_devargs data buffer was moved from the args field to the
data field, but not all references were changed accordingly, so memory
leaked when releasing devargs.

Free the data field of the devargs struct.

Fixes: 338327d731e6 ("devargs: add function to parse device layers")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 app/test-pmd/config.c                        | 4 ++--
 app/test-pmd/testpmd.c                       | 4 ++--
 drivers/bus/vdev/vdev.c                      | 5 +++--
 drivers/net/failsafe/failsafe_args.c         | 3 ++-
 drivers/net/failsafe/failsafe_eal.c          | 2 +-
 examples/multi_process/hotplug_mp/commands.c | 8 ++++----
 lib/librte_eal/common/eal_common_dev.c       | 4 ++--
 lib/librte_eal/common/eal_common_devargs.c   | 7 ++++---
 lib/librte_eal/common/hotplug_mp.c           | 5 ++---
 lib/librte_ethdev/rte_ethdev.c               | 5 +++--
 10 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index b51de59e1e..e7f456692b 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -593,8 +593,8 @@ device_infos_display(const char *identifier)
 
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
+		if (da.data)
+			free(da.data);
 		return;
 	}
 
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 33fc0fddf5..66f3ff9320 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3056,8 +3056,8 @@ detach_devargs(char *identifier)
 	memset(&da, 0, sizeof(da));
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
+		if (da.data)
+			free(da.data);
 		return;
 	}
 
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index acfd78828f..43375bb334 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -236,9 +236,10 @@ alloc_devargs(const char *name, const char *args)
 
 	devargs->bus = &rte_vdev_bus;
 	if (args)
-		devargs->args = strdup(args);
+		devargs->data = strdup(args);
 	else
-		devargs->args = strdup("");
+		devargs->data = strdup("");
+	devargs->args = devargs->data;
 
 	ret = strlcpy(devargs->name, name, sizeof(devargs->name));
 	if (ret < 0 || ret >= (int)sizeof(devargs->name)) {
diff --git a/drivers/net/failsafe/failsafe_args.c b/drivers/net/failsafe/failsafe_args.c
index 707490b94c..5e507bffbc 100644
--- a/drivers/net/failsafe/failsafe_args.c
+++ b/drivers/net/failsafe/failsafe_args.c
@@ -451,7 +451,8 @@ failsafe_args_free(struct rte_eth_dev *dev)
 		sdev->cmdline = NULL;
 		free(sdev->fd_str);
 		sdev->fd_str = NULL;
-		free(sdev->devargs.args);
+		free(sdev->devargs.data);
+		sdev->devargs.data = NULL;
 		sdev->devargs.args = NULL;
 	}
 }
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index b9fc508673..f066c053f3 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -79,7 +79,7 @@ fs_bus_init(struct rte_eth_dev *dev)
 					rte_eth_devices[pid].device->devargs;
 
 			/* Take control of probed device. */
-			free(da->args);
+			free(da->data);
 			memset(da, 0, sizeof(*da));
 			if (probed_da != NULL)
 				snprintf(devstr, sizeof(devstr), "%s,%s",
diff --git a/examples/multi_process/hotplug_mp/commands.c b/examples/multi_process/hotplug_mp/commands.c
index a8a39d07f7..e77585e5b4 100644
--- a/examples/multi_process/hotplug_mp/commands.c
+++ b/examples/multi_process/hotplug_mp/commands.c
@@ -121,8 +121,8 @@ static void cmd_dev_attach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
+		if (da.data)
+			free(da.data);
 		return;
 	}
 
@@ -168,8 +168,8 @@ static void cmd_dev_detach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
+		if (da.data)
+			free(da.data);
 		return;
 	}
 
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 793fbdf24b..f65a9594cc 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -186,7 +186,7 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
 
 err_devarg:
 	if (rte_devargs_remove(da) != 0) {
-		free(da->args);
+		free(da->data);
 		free(da);
 	}
 	return ret;
@@ -585,7 +585,7 @@ rte_dev_iterator_init(struct rte_dev_iterator *it, char *dev_str)
 	it->bus_str = NULL;
 	it->cls_str = NULL;
 
-	devargs.data = dev_str;
+	devargs.data = NULL;
 	if (rte_devargs_layers_parse(&devargs, dev_str))
 		goto get_out;
 
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index 3c4774c88a..e1a3cd7367 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -284,7 +284,8 @@ rte_devargs_insert(struct rte_devargs **da)
 			/* device already in devargs list, must be updated */
 			listed_da->type = (*da)->type;
 			listed_da->policy = (*da)->policy;
-			free(listed_da->args);
+			if (listed_da->data)
+				free(listed_da->data);
 			listed_da->args = (*da)->args;
 			listed_da->bus = (*da)->bus;
 			listed_da->cls = (*da)->cls;
@@ -332,7 +333,7 @@ rte_devargs_add(enum rte_devtype devtype, const char *devargs_str)
 
 fail:
 	if (devargs) {
-		free(devargs->args);
+		free(devargs->data);
 		free(devargs);
 	}
 
@@ -352,7 +353,7 @@ rte_devargs_remove(struct rte_devargs *devargs)
 		if (strcmp(d->bus->name, devargs->bus->name) == 0 &&
 		    strcmp(d->name, devargs->name) == 0) {
 			TAILQ_REMOVE(&devargs_list, d, next);
-			free(d->args);
+			free(d->data);
 			free(d);
 			return 0;
 		}
diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c
index ee791903b3..f0f7c61048 100644
--- a/lib/librte_eal/common/hotplug_mp.c
+++ b/lib/librte_eal/common/hotplug_mp.c
@@ -118,8 +118,7 @@ __handle_secondary_request(void *param)
 		ret = rte_devargs_parse(&da, req->devargs);
 		if (ret != 0)
 			goto finish;
-		free(da.args); /* we don't need those */
-		da.args = NULL;
+		free(da.data); /* we don't need those */
 
 		ret = eal_dev_hotplug_request_to_secondary(&tmp_req);
 		if (ret != 0) {
@@ -283,7 +282,7 @@ static void __handle_primary_request(void *param)
 
 		ret = local_dev_remove(dev);
 quit:
-		free(da->args);
+		free(da->data);
 		free(da);
 		break;
 	default:
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 17ddacc78d..4976961d13 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -244,7 +244,8 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 		goto error;
 	}
 	iter->cls_str = cls_str;
-	free(devargs.args); /* allocated by rte_devargs_parse() */
+	free(devargs.data); /* allocated by rte_devargs_parse() */
+	devargs.data = NULL;
 	devargs.args = NULL;
 
 	iter->bus = devargs.bus;
@@ -284,7 +285,7 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 	if (ret == -ENOTSUP)
 		RTE_ETHDEV_LOG(ERR, "Bus %s does not support iterating.\n",
 				iter->bus->name);
-	free(devargs.args);
+	free(devargs.data);
 	free(bus_str);
 	free(cls_str);
 	return ret;
-- 
2.25.1


* [dpdk-dev] [PATCH v1 5/7] kvargs: add get by key function
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (4 preceding siblings ...)
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 4/7] devargs: fix buffer data memory leak Xueming Li
@ 2021-01-08 14:54   ` Xueming Li
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 6/7] devargs: support new global device syntax Xueming Li
                     ` (60 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-08 14:54 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso

Add a new function to get the value of a specific key from a kvargs list.
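
The lookup can be pictured with a small standalone model. The types
struct kv_pair and struct kv_list and the function kv_get() below are
hypothetical stand-ins, not the librte_kvargs definitions, and unlike
the function added by this patch the sketch simply rejects a NULL key.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kvargs pair and list types. */
struct kv_pair {
	const char *key;
	const char *value;
};

struct kv_list {
	unsigned int count;
	struct kv_pair pairs[8];
};

/* Return the value of the first pair whose key matches, or NULL. */
static const char *
kv_get(const struct kv_list *list, const char *key)
{
	unsigned int i;

	if (list == NULL || key == NULL)
		return NULL;
	for (i = 0; i < list->count; i++) {
		if (list->pairs[i].key != NULL &&
		    strcmp(list->pairs[i].key, key) == 0)
			return list->pairs[i].value;
	}
	return NULL;
}
```

With a list built from "bus=pci,id=82:00.0", kv_get(&list, "id") yields
"82:00.0" and kv_get(&list, "class") yields NULL.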

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_kvargs/rte_kvargs.c | 20 ++++++++++++++++++++
 lib/librte_kvargs/rte_kvargs.h | 14 ++++++++++++++
 lib/librte_kvargs/version.map  |  1 +
 3 files changed, 35 insertions(+)

diff --git a/lib/librte_kvargs/rte_kvargs.c b/lib/librte_kvargs/rte_kvargs.c
index 285081c86c..bc734915f9 100644
--- a/lib/librte_kvargs/rte_kvargs.c
+++ b/lib/librte_kvargs/rte_kvargs.c
@@ -160,6 +160,26 @@ rte_kvargs_free(struct rte_kvargs *kvlist)
 	free(kvlist);
 }
 
+/* lookup the rte_kvargs structure by key */
+const char *
+rte_kvargs_get(struct rte_kvargs *kvlist, const char *key)
+{
+	unsigned int i;
+
+	if (!kvlist)
+		return NULL;
+	for (i = 0; i < kvlist->count; ++i) {
+		/* Allows key to be NULL. */
+		if (!key && !kvlist->pairs[i].key)
+			return kvlist->pairs[i].value;
+		if (!key || !kvlist->pairs[i].key)
+			continue;
+		if (!strcmp(kvlist->pairs[i].key, key))
+			return kvlist->pairs[i].value;
+	}
+	return NULL;
+}
+
 /*
  * Parse the arguments "key=value,key=value,..." string and return
  * an allocated structure that contains a key/value list. Also
diff --git a/lib/librte_kvargs/rte_kvargs.h b/lib/librte_kvargs/rte_kvargs.h
index eff598e08b..6d426241ea 100644
--- a/lib/librte_kvargs/rte_kvargs.h
+++ b/lib/librte_kvargs/rte_kvargs.h
@@ -114,6 +114,20 @@ struct rte_kvargs *rte_kvargs_parse_delim(const char *args,
  */
 void rte_kvargs_free(struct rte_kvargs *kvlist);
 
+/**
+ * Get the value matching the given key
+ *
+ * @param kvlist
+ *   The rte_kvargs structure
+ * @param key
+ *   The key that should match
+
+ * @return
+ *   The value that match, NULL if not found.
+ */
+__rte_experimental
+const char *rte_kvargs_get(struct rte_kvargs *kvlist, const char *key);
+
 /**
  * Call a handler function for each key/value matching the key
  *
diff --git a/lib/librte_kvargs/version.map b/lib/librte_kvargs/version.map
index ed375bf4a3..d6cde16f30 100644
--- a/lib/librte_kvargs/version.map
+++ b/lib/librte_kvargs/version.map
@@ -14,5 +14,6 @@ EXPERIMENTAL {
 
 	rte_kvargs_parse_delim;
 	rte_kvargs_strcmp;
+	rte_kvargs_get;
 
 };
-- 
2.25.1


* [dpdk-dev] [PATCH v1 6/7] devargs: support new global device syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (5 preceding siblings ...)
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 5/7] kvargs: add get by key function Xueming Li
@ 2021-01-08 14:54   ` Xueming Li
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 7/7] bus/pci: add new global device syntax support Xueming Li
                     ` (59 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-08 14:54 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso

When parsing a device string, try the new global syntax first, then
fall back to the legacy syntax if that fails.

Example of new global syntax:
 -a bus=pci,id=82:00.0/class=eth/driver=mlx5,dv_flow_en=1

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_eal/common/eal_common_devargs.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index e1a3cd7367..a79eea12d3 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -57,6 +57,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 	struct rte_class *cls = NULL;
 	struct rte_bus *bus = NULL;
 	const char *s = devstr;
+	const char *id;
 	size_t nblayer;
 	size_t i = 0;
 	int ret = 0;
@@ -116,6 +117,8 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist == NULL)
 			continue;
 		kv = &layers[i].kvlist->pairs[0];
+		if (!kv->key)
+			continue;
 		if (strcmp(kv->key, "bus") == 0) {
 			bus = rte_bus_find_by_name(kv->value);
 			if (bus == NULL) {
@@ -124,6 +127,14 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 				ret = -EFAULT;
 				goto get_out;
 			}
+			id = rte_kvargs_get(layers[i].kvlist, "id");
+			if (!id) {
+				RTE_LOG(ERR, EAL, "Could not find bus id \"%s\"\n",
+					devstr);
+				ret = -EFAULT;
+				goto get_out;
+			}
+			strncpy(devargs->name, id, sizeof(devargs->name) - 1);
 		} else if (strcmp(kv->key, "class") == 0) {
 			cls = rte_class_find_by_name(kv->value);
 			if (cls == NULL) {
@@ -190,6 +201,12 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	if (da == NULL)
 		return -EINVAL;
 
+	/* First parse according new global syntax */
+	if (rte_devargs_layers_parse(da, dev) == 0 && da->bus && da->cls)
+		return 0;
+
+	/* Legacy syntax check: */
+
 	/* Retrieve eventual bus info */
 	do {
 		devname = dev;
-- 
2.25.1


* [dpdk-dev] [PATCH v1 7/7] bus/pci: add new global device syntax support
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (6 preceding siblings ...)
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 6/7] devargs: support new global device syntax Xueming Li
@ 2021-01-08 14:54   ` Xueming Li
  2021-01-08 15:14   ` [dpdk-dev] [PATCH v1 0/2] mlx5: support global syntax Xueming Li
                     ` (58 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-08 14:54 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso

With the new global device syntax, this patch first tries to get the PCI
BDF from the bus "id" argument and falls back to the name if it is not
found. Example:

 -w bus=pci,id=82:00.0/class=eth/driver=mlx5,dv_flow_en=1
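
The selection order can be sketched without any DPDK dependency. The
helpers bus_layer_id() and pci_addr_source() below are hypothetical and
use plain string scanning instead of rte_kvargs; the key name "id" is
the one the diff below looks up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of "id=" from a bus layer such as "bus=pci,id=82:00.0"
 * into out. Returns 0 on success, -1 if the key is absent or out is too
 * small.
 */
static int
bus_layer_id(const char *bus_str, char *out, size_t out_len)
{
	const char *p = bus_str;

	while (p != NULL && *p != '\0') {
		size_t len = strcspn(p, ",");

		if (strncmp(p, "id=", 3) == 0) {
			if (len - 3 >= out_len)
				return -1;
			memcpy(out, p + 3, len - 3);
			out[len - 3] = '\0';
			return 0;
		}
		p += len;
		if (*p == ',')
			p++;
	}
	return -1;
}

/* Pick the address string: "id" from the bus layer, else the device name. */
static const char *
pci_addr_source(const char *bus_str, const char *name,
		char *buf, size_t buf_len)
{
	if (bus_str != NULL && bus_layer_id(bus_str, buf, buf_len) == 0)
		return buf;
	return name;
}
```

The returned string is what would then be handed to the bus address
parser; a devargs entry without a bus layer keeps using its name.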

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 drivers/bus/pci/pci_common.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 9b8d769287..f6fc80abe8 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -23,6 +23,7 @@
 #include <rte_string_fns.h>
 #include <rte_common.h>
 #include <rte_devargs.h>
+#include <rte_kvargs.h>
 #include <rte_vfio.h>
 
 #include "private.h"
@@ -48,9 +49,20 @@ pci_devargs_lookup(const struct rte_pci_addr *pci_addr)
 {
 	struct rte_devargs *devargs;
 	struct rte_pci_addr addr;
+	struct rte_kvargs *kvlist = NULL;
+	const char *name;
 
 	RTE_EAL_DEVARGS_FOREACH("pci", devargs) {
-		devargs->bus->parse(devargs->name, &addr);
+		name = NULL;
+		if (devargs->bus_str) {
+			kvlist = rte_kvargs_parse(devargs->bus_str, NULL);
+			name = rte_kvargs_get(kvlist, "id");
+		}
+		if (!name)
+			name = devargs->name;
+		devargs->bus->parse(name, &addr);
+		if (kvlist)
+			rte_kvargs_free(kvlist);
 		if (!rte_pci_addr_cmp(pci_addr, &addr))
 			return devargs;
 	}
@@ -71,11 +83,11 @@ pci_name_set(struct rte_pci_device *dev)
 	/* When using a blocklist, only blocked devices will have
 	 * an rte_devargs. Allowed devices won't have one.
 	 */
-	if (devargs != NULL)
+	if (devargs != NULL && strlen(devargs->name))
 		/* If an rte_devargs exists, the generic rte_device uses the
 		 * given name as its name.
 		 */
-		dev->device.name = dev->device.devargs->name;
+		dev->device.name = devargs->name;
 	else
 		/* Otherwise, it uses the internal, canonical form. */
 		dev->device.name = dev->name;
-- 
2.25.1


* [dpdk-dev] [PATCH v1 0/2] mlx5: support global syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (7 preceding siblings ...)
  2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 7/7] bus/pci: add new global device syntax support Xueming Li
@ 2021-01-08 15:14   ` Xueming Li
  2021-01-08 15:14   ` [dpdk-dev] [PATCH v1 1/2] common/mlx5: support device " Xueming Li
                     ` (57 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-08 15:14 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

New Global device syntax [1] is used to identify a device with full bus,
class and driver description, example:
 -a bus=pci,id=82:00.0/class=eth/driver=mlx5,dv_flow_en=1

This patchset enables global syntax in mlx5 PMD.


Depends-on: patch-86058 ("ethdev: refactor representor infrastructure")
Depends-on: series-14610 ("eal: support global syntax")


[1] Global Device Syntax:
https://www.dpdk.org/wp-content/uploads/sites/35/2018/10/am-07-DPDK-hotplug-20180905.pdf

[2] RFC:
http://patchwork.dpdk.org/project/dpdk/list/?series=14378


Xueming Li (2):
  common/mlx5: support device global syntax
  net/mlx5: support new device global syntax

 drivers/common/mlx5/mlx5_common_pci.c |  6 +++++-
 drivers/net/mlx5/linux/mlx5_os.c      | 18 ++++++++++++++++--
 drivers/net/mlx5/mlx5.c               |  6 +++++-
 3 files changed, 26 insertions(+), 4 deletions(-)

-- 
2.25.1


* [dpdk-dev] [PATCH v1 1/2] common/mlx5: support device global syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (8 preceding siblings ...)
  2021-01-08 15:14   ` [dpdk-dev] [PATCH v1 0/2] mlx5: support global syntax Xueming Li
@ 2021-01-08 15:14   ` Xueming Li
  2021-01-08 15:15   ` [dpdk-dev] [PATCH v1 2/2] net/mlx5: support new " Xueming Li
                     ` (56 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-08 15:14 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

This patch supports the new device global syntax:
bus=<bus>,k=v,,,/class=<cls>,k=v,,,/driver=<pmd>,k=v,,,,

To reuse the class name of the global syntax, this patch also changes
the internal class name introduced by commit [1] to align with the RTE
class name.

[1]
8a41f4deccc3: common/mlx5: introduce layer for multiple class drivers

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 drivers/common/mlx5/mlx5_common_pci.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/common/mlx5/mlx5_common_pci.c b/drivers/common/mlx5/mlx5_common_pci.c
index 5208972bb6..b1eda7b3c8 100644
--- a/drivers/common/mlx5/mlx5_common_pci.c
+++ b/drivers/common/mlx5/mlx5_common_pci.c
@@ -4,6 +4,7 @@
 
 #include <stdlib.h>
 #include <rte_malloc.h>
+#include <rte_class.h>
 #include "mlx5_common_utils.h"
 #include "mlx5_common_pci.h"
 
@@ -26,7 +27,7 @@ static const struct {
 	unsigned int driver_class;
 } mlx5_classes[] = {
 	{ .name = "vdpa", .driver_class = MLX5_CLASS_VDPA },
-	{ .name = "net", .driver_class = MLX5_CLASS_NET },
+	{ .name = "eth", .driver_class = MLX5_CLASS_NET },
 	{ .name = "regex", .driver_class = MLX5_CLASS_REGEX },
 };
 
@@ -115,6 +116,9 @@ parse_class_options(const struct rte_devargs *devargs)
 
 	if (devargs == NULL)
 		return 0;
+	if (devargs->cls)
+		/* support new global syntax */
+		return class_name_to_value(devargs->cls->name);
 	kvlist = rte_kvargs_parse(devargs->args, NULL);
 	if (kvlist == NULL)
 		return 0;
-- 
2.25.1


* [dpdk-dev] [PATCH v1 2/2] net/mlx5: support new device global syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (9 preceding siblings ...)
  2021-01-08 15:14   ` [dpdk-dev] [PATCH v1 1/2] common/mlx5: support device " Xueming Li
@ 2021-01-08 15:15   ` Xueming Li
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 0/9] net/mlx5: support SubFunction representor Xueming Li
                     ` (55 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-08 15:15 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

This patch supports the new global device syntax, for example:
	bus=pci,addr=BB:DD.F/class=eth/driver=mlx5,devargs,..

The "driver" key, which is part of the new global device syntax, is ignored
when checking the driver devargs.

The representor devarg is expected to come from either the class section or
the driver section.
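
For illustration only, the layering described above can be sketched as a
plain string split. This is not the EAL devargs parser; the struct
devargs_layers and split_global_devargs() names are hypothetical, and the
fixed 64-byte buffers are an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the three layers of a global-syntax string. */
struct devargs_layers {
	char bus[64];    /* e.g. "bus=pci,addr=82:00.0" */
	char cls[64];    /* e.g. "class=eth" */
	char driver[64]; /* e.g. "driver=mlx5,dv_flow_en=1" */
};

/*
 * Split "bus=.../class=.../driver=..." on '/' and file each segment under
 * the layer named by its leading key. Returns 0 on success, -1 if a segment
 * starts with an unknown key or does not fit its buffer.
 */
static int
split_global_devargs(const char *str, struct devargs_layers *out)
{
	assert(str != NULL && out != NULL);
	memset(out, 0, sizeof(*out));
	while (*str != '\0') {
		size_t len = strcspn(str, "/");
		char *dst;

		if (strncmp(str, "bus=", 4) == 0)
			dst = out->bus;
		else if (strncmp(str, "class=", 6) == 0)
			dst = out->cls;
		else if (strncmp(str, "driver=", 7) == 0)
			dst = out->driver;
		else
			return -1;
		/* All three buffers have the same size. */
		if (len >= sizeof(out->bus))
			return -1;
		memcpy(dst, str, len);
		dst[len] = '\0';
		str += len;
		if (*str == '/')
			str++;
	}
	return 0;
}
```

With such a split, a representor key would be looked up in the class
segment first and in the driver segment (or the legacy argument string)
second, which is the order the patch below follows.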

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 18 ++++++++++++++++--
 drivers/net/mlx5/mlx5.c          |  6 +++++-
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 6812a1f215..f1ed3505b1 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -699,13 +699,27 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	if (switch_info->representor && dpdk_dev->devargs) {
 		struct rte_eth_devargs eth_da;
 
-		err = rte_eth_devargs_parse(dpdk_dev->devargs->args, &eth_da);
+		/* Representor should come from class argument or driver */
+		if (dpdk_dev->devargs->cls_str)
+			err = rte_eth_devargs_parse(dpdk_dev->devargs->cls_str,
+						    &eth_da);
 		if (err) {
 			rte_errno = -err;
 			DRV_LOG(ERR, "failed to process device arguments: %s",
-				strerror(rte_errno));
+				dpdk_dev->devargs->cls_str);
 			return NULL;
 		}
+		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
+			/* Support legacy device argument */
+			err = rte_eth_devargs_parse(dpdk_dev->devargs->args,
+						    &eth_da);
+			if (err) {
+				rte_errno = -err;
+				DRV_LOG(ERR, "failed to process device arguments: %s",
+					dpdk_dev->devargs->args);
+				return NULL;
+			}
+		}
 		for (i = 0; i < eth_da.nb_representor_ports; ++i)
 			if (eth_da.representor_ports[i] ==
 			    (uint16_t)switch_info->port_name)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 023ef50a77..f2b6cf9fd6 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -41,6 +41,9 @@
 #include "mlx5_flow_os.h"
 #include "rte_pmd_mlx5.h"
 
+/* Driver type key for new device global syntax. */
+#define MLX5_DRIVER_KEY "driver"
+
 /* Device parameter to enable RX completion queue compression. */
 #define MLX5_RXQ_CQE_COMP_EN "rxq_cqe_comp_en"
 
@@ -1600,7 +1603,7 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 	signed long tmp;
 
 	/* No-op, port representors are processed in mlx5_dev_spawn(). */
-	if (!strcmp(MLX5_REPRESENTOR, key))
+	if (!strcmp(MLX5_DRIVER_KEY, key) || !strcmp(MLX5_REPRESENTOR, key))
 		return 0;
 	errno = 0;
 	tmp = strtol(val, NULL, 0);
@@ -1754,6 +1757,7 @@ int
 mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 {
 	const char **params = (const char *[]){
+		MLX5_DRIVER_KEY,
 		MLX5_RXQ_CQE_COMP_EN,
 		MLX5_RXQ_CQE_PAD_EN,
 		MLX5_RXQ_PKT_PAD_EN,
-- 
2.25.1



* [dpdk-dev] [PATCH v3 0/9] net/mlx5: support SubFunction representor
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (10 preceding siblings ...)
  2021-01-08 15:15   ` [dpdk-dev] [PATCH v1 2/2] net/mlx5: support new " Xueming Li
@ 2021-01-18 11:29   ` Xueming Li
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 1/9] common/mlx5: update representor name parsing Xueming Li
                     ` (54 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 11:29 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

A SubFunction (SF) [1] is a portion of a PCI device. An SF netdev has its
own dedicated queues (txq, rxq) and supports E-Switch representation
offload similar to existing PF and VF representors. An SF shares PCI
level resources with other SFs and/or with its parent PCI function.

This patch set introduces SubFunction representor support for the mlx5
PMD.

Depends-on: series-14809 ("support SubFunction representor")

Version history:
 RFC:
 	initial version [2]
 V2:
    - support bonding representor probe with new pf#vf# devargs
    - adapt EAL api V2 [3] changes
    - update document
 V3:
    - support list of representor PF section for bonding device:
      example: representor=pf[0,1]vf[0-3]
    - add bonding information to shared PMD data
    - fix setting VF MAC through representor
    - fix bonding xstats, sum xstats from PF members.

[1] SubFunction in kernel:
https://lore.kernel.org/netdev/20201112192424.2742-1-parav@nvidia.com/

[2] RFC:
http://patchwork.dpdk.org/project/dpdk/list/?series=14376

[3] V2:
http://patchwork.dpdk.org/project/dpdk/list/?series=14560

[4] EAL part to support SF representor:
http://patchwork.dpdk.org/project/dpdk/list/?series=14809


Xueming Li (9):
  common/mlx5: update representor name parsing
  net/mlx5: support representor of sub function
  net/mlx5: revert setting bonding representor to first PF
  net/mlx5: refactor bonding representor probe
  net/mlx5: support representor from multiple PFs
  net/mlx5: save bonding member ports information
  net/mlx5: save bonding member ports information
  net/mlx5: fix setting VF default MAC through representor
  net/mlx5: improve bonding xstats

 doc/guides/nics/mlx5.rst                   |  62 +++-
 drivers/common/mlx5/linux/mlx5_common_os.c |  32 +-
 drivers/common/mlx5/linux/mlx5_nl.c        |   2 +
 drivers/common/mlx5/mlx5_common.h          |   2 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c    | 136 +++++--
 drivers/net/mlx5/linux/mlx5_os.c           | 399 ++++++++++++++-------
 drivers/net/mlx5/mlx5.c                    |  25 +-
 drivers/net/mlx5/mlx5.h                    |  25 +-
 drivers/net/mlx5/mlx5_defs.h               |   4 -
 drivers/net/mlx5/mlx5_ethdev.c             |  34 +-
 drivers/net/mlx5/mlx5_mac.c                |  26 +-
 11 files changed, 533 insertions(+), 214 deletions(-)

-- 
2.25.1



* [dpdk-dev] [PATCH v3 1/9] common/mlx5: update representor name parsing
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (11 preceding siblings ...)
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 0/9] net/mlx5: support SubFunction representor Xueming Li
@ 2021-01-18 11:29   ` Xueming Li
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 2/9] net/mlx5: support representor of sub function Xueming Li
                     ` (53 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 11:29 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

This patch updates representor name parsing to support SFs.
In sysfs, the representor name is stored under the "phys_port_name" key.
Similar to a VF representor, the switch port name of an SF representor is
"pf<x>sf<y>".

For netlink messages, the SF port name type is supported as well.

Examples:

pf0sf1
pf0sf[0-3]
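
As a rough, standalone sketch of the parsing idea (same sscanf() pattern as
the diff below, but not the driver code itself), the "pf<x>vf<y>" and
"pf<x>sf<y>" forms can be told apart as follows. The enum port_kind,
struct port_name and parse_port_name() names are hypothetical, and the
optional controller prefix is left out for brevity.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical classification of a phys_port_name string. */
enum port_kind { PORT_UNKNOWN, PORT_PFVF, PORT_PFSF };

struct port_name {
	enum port_kind kind;
	int pf;   /* PF index, -1 if unknown */
	int func; /* VF or SF index, -1 if unknown */
};

/* Classify names of the form "pf<x>vf<y>" or "pf<x>sf<y>". */
static struct port_name
parse_port_name(const char *name)
{
	struct port_name out = { PORT_UNKNOWN, -1, -1 };
	char p, f, t1, t2, eol;
	int pf, func;
	int n;

	assert(name != NULL);
	/* A trailing %c catches garbage after the last number. */
	n = sscanf(name, "%c%c%d%c%c%d%c",
		   &p, &f, &pf, &t1, &t2, &func, &eol);
	if (n != 6 || p != 'p' || f != 'f' || t2 != 'f')
		return out;
	if (t1 == 'v')
		out.kind = PORT_PFVF;
	else if (t1 == 's')
		out.kind = PORT_PFSF;
	else
		return out;
	out.pf = pf;
	out.func = func;
	return out;
}
```

Exactly six converted items means both numbers were found and nothing
followed the second one, so a name such as "pf0sf1x" is rejected.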

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/common/mlx5/linux/mlx5_common_os.c | 32 +++++++++++++++-------
 drivers/common/mlx5/linux/mlx5_nl.c        |  2 ++
 drivers/common/mlx5/mlx5_common.h          |  2 ++
 drivers/net/mlx5/linux/mlx5_ethdev_os.c    |  3 ++
 4 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c
index 0edd78ea6d..5cf9576921 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.c
+++ b/drivers/common/mlx5/linux/mlx5_common_os.c
@@ -97,22 +97,34 @@ void
 mlx5_translate_port_name(const char *port_name_in,
 			 struct mlx5_switch_info *port_info_out)
 {
-	char pf_c1, pf_c2, vf_c1, vf_c2, eol;
+	char ctrl = 0, pf_c1, pf_c2, vf_c1, vf_c2, eol;
 	char *end;
 	int sc_items;
 
-	/*
-	 * Check for port-name as a string of the form pf0vf0
-	 * (support kernel ver >= 5.0 or OFED ver >= 4.6).
-	 */
+	sc_items = sscanf(port_name_in, "%c%d",
+			  &ctrl, &port_info_out->ctrl_num);
+	if (sc_items == 2 && ctrl == 'c') {
+		port_name_in++; /* 'c' */
+		port_name_in += snprintf(NULL, 0, "%d",
+					  port_info_out->ctrl_num);
+	}
+	/* Check for port-name as a string of the form pf0vf0 or pf0sf0 */
 	sc_items = sscanf(port_name_in, "%c%c%d%c%c%d%c",
 			  &pf_c1, &pf_c2, &port_info_out->pf_num,
 			  &vf_c1, &vf_c2, &port_info_out->port_name, &eol);
-	if (sc_items == 6 &&
-	    pf_c1 == 'p' && pf_c2 == 'f' &&
-	    vf_c1 == 'v' && vf_c2 == 'f') {
-		port_info_out->name_type = MLX5_PHYS_PORT_NAME_TYPE_PFVF;
-		return;
+	if (sc_items == 6 && pf_c1 == 'p' && pf_c2 == 'f') {
+		if (vf_c1 == 'v' && vf_c2 == 'f') {
+			/* Kernel ver >= 5.0 or OFED ver >= 4.6 */
+			port_info_out->name_type =
+					MLX5_PHYS_PORT_NAME_TYPE_PFVF;
+			return;
+		}
+		if (vf_c1 == 's' && vf_c2 == 'f') {
+			/* Kernel ver >= 5.11 or OFED ver >= 5.1 */
+			port_info_out->name_type =
+					MLX5_PHYS_PORT_NAME_TYPE_PFSF;
+			return;
+		}
 	}
 	/*
 	 * Check for port-name as a string of the form p0
diff --git a/drivers/common/mlx5/linux/mlx5_nl.c b/drivers/common/mlx5/linux/mlx5_nl.c
index 40d8620300..3d55cc98b4 100644
--- a/drivers/common/mlx5/linux/mlx5_nl.c
+++ b/drivers/common/mlx5/linux/mlx5_nl.c
@@ -1148,6 +1148,8 @@ mlx5_nl_check_switch_info(bool num_vf_set,
 	case MLX5_PHYS_PORT_NAME_TYPE_PFHPF:
 		/* Fallthrough */
 	case MLX5_PHYS_PORT_NAME_TYPE_PFVF:
+		/* Fallthrough */
+	case MLX5_PHYS_PORT_NAME_TYPE_PFSF:
 		/* New representors naming schema. */
 		switch_info->representor = 1;
 		break;
diff --git a/drivers/common/mlx5/mlx5_common.h b/drivers/common/mlx5/mlx5_common.h
index e35188da4c..a422b74577 100644
--- a/drivers/common/mlx5/mlx5_common.h
+++ b/drivers/common/mlx5/mlx5_common.h
@@ -153,6 +153,7 @@ enum mlx5_nl_phys_port_name_type {
 	MLX5_PHYS_PORT_NAME_TYPE_UPLINK, /* p0, kernel ver >= 5.0 */
 	MLX5_PHYS_PORT_NAME_TYPE_PFVF, /* pf0vf0, kernel ver >= 5.0 */
 	MLX5_PHYS_PORT_NAME_TYPE_PFHPF, /* pf0, kernel ver >= 5.7, HPF rep */
+	MLX5_PHYS_PORT_NAME_TYPE_PFSF, /* pf0sf0, kernel ver >= 5.0 */
 	MLX5_PHYS_PORT_NAME_TYPE_UNKNOWN, /* Unrecognized. */
 };
 
@@ -161,6 +162,7 @@ struct mlx5_switch_info {
 	uint32_t master:1; /**< Master device. */
 	uint32_t representor:1; /**< Representor device. */
 	enum mlx5_nl_phys_port_name_type name_type; /** < Port name type. */
+	int32_t ctrl_num; /**< Controller number (valid for c#pf#vf# format). */
 	int32_t pf_num; /**< PF number (valid for pfxvfx format only). */
 	int32_t port_name; /**< Representor port name. */
 	uint64_t switch_id; /**< Switch identifier. */
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index e36a78091c..1b37970c21 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1013,6 +1013,9 @@ mlx5_sysfs_check_switch_info(bool device_dir,
 		/* New representors naming schema. */
 		switch_info->representor = 1;
 		break;
+	default:
+		switch_info->master = device_dir;
+		break;
 	}
 }
 
-- 
2.25.1



* [dpdk-dev] [PATCH v3 2/9] net/mlx5: support representor of sub function
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (12 preceding siblings ...)
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 1/9] common/mlx5: update representor name parsing Xueming Li
@ 2021-01-18 11:29   ` Xueming Li
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 3/9] net/mlx5: revert setting bonding representor to first PF Xueming Li
                     ` (52 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 11:29 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

This patch adds support for SF representors. Similar to a VF representor,
the switch port name of an SF representor in the phys_port_name sysfs key is
"pf<x>sf<y>".

The device representor argument is "representor=sf[list]", where list
members can be a mix of single instances and ranges. Example:
  representor=sf[0,2,4,8-12,-1]

To probe both VF and SF representors, specify them as two separate
devices:
  -a <BDF>,representor=vf[list] -a <BDF>,representor=sf[list]
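
To make the list format concrete, here is a minimal sketch of expanding
such a bracketed list into individual indexes. It is only an illustration
under stated assumptions, not rte_eth_devargs_parse(): expand_repr_list()
is a hypothetical helper, and it treats a leading "-" on a member as a
sign (so "-1" is the value -1) and a "-" after a number as a range
separator.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Expand a bracketed list such as "[0,2,4,8-12,-1]" into individual values.
 * Returns the number of values written, or -1 on a syntax error or when the
 * output array is too small.
 */
static int
expand_repr_list(const char *s, int *vals, int max)
{
	int n = 0;

	assert(s != NULL && vals != NULL);
	if (*s++ != '[')
		return -1;
	for (;;) {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s)
			return -1; /* no number where one was expected */
		s = end;
		if (*s == '-') { /* range "lo-hi" */
			hi = strtol(s + 1, &end, 10);
			if (end == s + 1 || hi < lo)
				return -1;
			s = end;
		}
		for (long v = lo; v <= hi; v++) {
			if (n == max)
				return -1;
			vals[n++] = (int)v;
		}
		if (*s == ']')
			return s[1] == '\0' ? n : -1;
		if (*s++ != ',')
			return -1;
	}
}
```

For the example above this yields nine values: 0, 2, 4, 8 through 12,
and -1.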

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst                |  58 +++++++++--
 drivers/net/mlx5/linux/mlx5_ethdev_os.c |   2 +
 drivers/net/mlx5/linux/mlx5_os.c        | 123 ++++++++++++++++++++----
 drivers/net/mlx5/mlx5_ethdev.c          |   2 +
 4 files changed, 154 insertions(+), 31 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index db0c8b6c20..c7829007a4 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -871,14 +871,18 @@ Driver options
 - ``representor`` parameter [list]
 
   This parameter can be used to instantiate DPDK Ethernet devices from
-  existing port (or VF) representors configured on the device.
+  existing port (PF, VF or SF) representors configured on the device.
 
   It is a standard parameter whose format is described in
   :ref:`ethernet_device_standard_device_arguments`.
 
-  For instance, to probe port representors 0 through 2::
+  For instance, to probe VF port representors 0 through 2::
 
-    representor=[0-2]
+    representor=vf[0-2]
+
+  To probe SF port representors 0 through 2::
+
+    representor=sf[0-2]
 
 - ``max_dump_files_num`` parameter [int]
 
@@ -1287,15 +1291,15 @@ Quick Start Guide on OFED/EN
 Enable switchdev mode
 ---------------------
 
-Switchdev mode is a mode in E-Switch, that binds between representor and VF.
-Representor is a port in DPDK that is connected to a VF in such a way
-that assuming there are no offload flows, each packet that is sent from the VF
-will be received by the corresponding representor. While each packet that is
-sent to a representor will be received by the VF.
+Switchdev mode is a mode in E-Switch, that binds between representor and VF or SF.
+Representor is a port in DPDK that is connected to a VF or SF in such a way
+that assuming there are no offload flows, each packet that is sent from the VF or SF
+will be received by the corresponding representor. While each packet that is
+sent to a representor will be received by the VF or SF.
 This is very useful in case of SRIOV mode, where the first packet that is sent
-by the VF will be received by the DPDK application which will decide if this
+by the VF or SF will be received by the DPDK application which will decide if this
 flow should be offloaded to the E-Switch. After offloading the flow packet
-that the VF that are matching the flow will not be received any more by
+that the VF or SF that are matching the flow will not be received any more by
 the DPDK application.
 
 1. Enable SRIOV mode::
@@ -1322,6 +1326,40 @@ the DPDK application.
 
         echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
 
+SubFunction representor support
+-------------------------------
+SubFunction is a portion of the PCI device, a SF netdev has its own
+dedicated queues(txq, rxq). A SF netdev supports E-Switch representation
+offload similar to existing PF and VF representors. A SF shares PCI
+level resources with other SFs and/or with its parent PCI function.
+
+1. Configure SF feature::
+
+        mlxconfig -d <mst device> set PF_BAR2_SIZE=<0/1/2/3> PF_BAR2_ENABLE=1
+
+        Value of PF_BAR2_SIZE:
+
+            0: 8 SFs
+            1: 16 SFs
+            2: 32 SFs
+            3: 64 SFs
+
+2. Reset the FW::
+
+        mlxfwreset -d <mst device> reset
+
+3. Enable switchdev mode::
+
+        echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
+
+4. Create SF::
+
+        mlnx-sf -d <PCI_BDF> -a create
+
+5. Probe SF representor::
+
+        testpmd> port attach <PCI_BDF>,representor=sf0,dv_flow_en=1
+
 Performance tuning
 ------------------
 
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 1b37970c21..ac311de46d 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1010,6 +1010,8 @@ mlx5_sysfs_check_switch_info(bool device_dir,
 	case MLX5_PHYS_PORT_NAME_TYPE_PFHPF:
 		/* Fallthrough */
 	case MLX5_PHYS_PORT_NAME_TYPE_PFVF:
+		/* Fallthrough */
+	case MLX5_PHYS_PORT_NAME_TYPE_PFSF:
 		/* New representors naming schema. */
 		switch_info->representor = 1;
 		break;
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 4d7940bcca..b2776c080a 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -654,6 +654,8 @@ mlx5_flow_counter_mode_config(struct rte_eth_dev *dev __rte_unused)
  *   Verbs device parameters (name, port, switch_info) to spawn.
  * @param config
  *   Device configuration parameters.
+ * @param eth_da
+ *   Device arguments.
  *
  * @return
  *   A valid Ethernet device object on success, NULL otherwise and rte_errno
@@ -665,7 +667,8 @@ mlx5_flow_counter_mode_config(struct rte_eth_dev *dev __rte_unused)
 static struct rte_eth_dev *
 mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	       struct mlx5_dev_spawn_data *spawn,
-	       struct mlx5_dev_config *config)
+	       struct mlx5_dev_config *config,
+	       struct rte_eth_devargs *eth_da)
 {
 	const struct mlx5_switch_info *switch_info = &spawn->info;
 	struct mlx5_dev_ctx_shared *sh = NULL;
@@ -696,34 +699,82 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 
 	/* Determine if this port representor is supposed to be spawned. */
 	if (switch_info->representor && dpdk_dev->devargs) {
-		struct rte_eth_devargs eth_da;
-
-		err = rte_eth_devargs_parse(dpdk_dev->devargs->args, &eth_da);
-		if (err) {
-			rte_errno = -err;
-			DRV_LOG(ERR, "failed to process device arguments: %s",
-				strerror(rte_errno));
-			return NULL;
-		}
-		if (eth_da.type != RTE_ETH_REPRESENTOR_NONE) {
-			/* Representor not specified. */
+		switch (eth_da->type) {
+		case RTE_ETH_REPRESENTOR_SF:
+			if (switch_info->name_type !=
+					MLX5_PHYS_PORT_NAME_TYPE_PFSF) {
+				rte_errno = EBUSY;
+				return NULL;
+			}
+			break;
+		case RTE_ETH_REPRESENTOR_VF:
+			/* Allows HPF representor index -1 as exception. */
+			if (!(spawn->info.port_name == -1 &&
+			      switch_info->name_type ==
+					MLX5_PHYS_PORT_NAME_TYPE_PFHPF) &&
+			    switch_info->name_type !=
+					MLX5_PHYS_PORT_NAME_TYPE_PFVF) {
+				rte_errno = EBUSY;
+				return NULL;
+			}
+			break;
+		case RTE_ETH_REPRESENTOR_NONE:
 			rte_errno = EBUSY;
 			return NULL;
-		}
-		if (eth_da.type != RTE_ETH_REPRESENTOR_VF) {
+			break;
+		default:
 			rte_errno = ENOTSUP;
 			DRV_LOG(ERR, "unsupported representor type: %s",
 				dpdk_dev->devargs->args);
 			return NULL;
 		}
-		for (i = 0; i < eth_da.nb_representor_ports; ++i)
-			if (eth_da.representor_ports[i] ==
+		/* Check controller ID: */
+		for (i = 0; i < eth_da->nb_mh_controllers; ++i)
+			if (eth_da->mh_controllers[i] ==
+			    (uint16_t)switch_info->ctrl_num)
+				break;
+		if (eth_da->nb_mh_controllers &&
+		    i == eth_da->nb_mh_controllers) {
+			rte_errno = EBUSY;
+			return NULL;
+		}
+		/* Check SF/VF ID: */
+		for (i = 0; i < eth_da->nb_representor_ports; ++i)
+			if (eth_da->representor_ports[i] ==
 			    (uint16_t)switch_info->port_name)
 				break;
-		if (i == eth_da.nb_representor_ports) {
+		if (eth_da->type != RTE_ETH_REPRESENTOR_PF &&
+		    i == eth_da->nb_representor_ports) {
 			rte_errno = EBUSY;
 			return NULL;
 		}
+		/* Check PF ID. Check after repr port to avoid warning flood. */
+		if (spawn->pf_bond >= 0) {
+			for (i = 0; i < eth_da->nb_ports; ++i)
+				if (eth_da->ports[i] ==
+				    (uint16_t)switch_info->pf_num)
+					break;
+			if (eth_da->nb_ports && i == eth_da->nb_ports) {
+				/* For backward compatibility, bonding
+				 * representor syntax supported with limitation,
+				 * device iterator won't find it:
+				 *    <PF1_BDF>,representor=#
+				 */
+				if (switch_info->pf_num > 0 &&
+				    eth_da->ports[0] == 0) {
+					DRV_LOG(WARNING, "Representor on Bonding PF should use pf#vf# format: %s",
+						dpdk_dev->devargs->args);
+				} else {
+					rte_errno = EBUSY;
+					return NULL;
+				}
+			}
+		} else if (eth_da->nb_ports > 1 || eth_da->ports[0]) {
+			rte_errno = EINVAL;
+			DRV_LOG(ERR, "PF id not supported by non-bond device: %s",
+				dpdk_dev->devargs->args);
+			return NULL;
+		}
 	}
 	/* Build device name. */
 	if (spawn->pf_bond <  0) {
@@ -731,8 +782,11 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		if (!switch_info->representor)
 			strlcpy(name, dpdk_dev->name, sizeof(name));
 		else
-			snprintf(name, sizeof(name), "%s_representor_%u",
-				 dpdk_dev->name, switch_info->port_name);
+			snprintf(name, sizeof(name), "%s_representor_%s%u",
+				 dpdk_dev->name,
+				 switch_info->name_type ==
+				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
+				 switch_info->port_name);
 	} else {
 		/* Bonding device. */
 		if (!switch_info->representor)
@@ -740,9 +794,11 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 				 dpdk_dev->name,
 				 mlx5_os_get_dev_device_name(spawn->phys_dev));
 		else
-			snprintf(name, sizeof(name), "%s_%s_representor_%u",
+			snprintf(name, sizeof(name), "%s_%s_representor_%s%u",
 				 dpdk_dev->name,
 				 mlx5_os_get_dev_device_name(spawn->phys_dev),
+				 switch_info->name_type ==
+				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
 				 switch_info->port_name);
 	}
 	/* check if the device is already spawned */
@@ -1790,6 +1846,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_spawn_data *list = NULL;
 	struct mlx5_dev_config dev_config;
 	unsigned int dev_config_vf;
+	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
 	int ret;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
@@ -1800,6 +1857,27 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			strerror(rte_errno));
 		return -rte_errno;
 	}
+	if (pci_dev->device.devargs) {
+		/* Parse representor information from device argument. */
+		if (pci_dev->device.devargs->cls_str)
+			ret = rte_eth_devargs_parse(
+				pci_dev->device.devargs->cls_str, &eth_da);
+		if (ret) {
+			DRV_LOG(ERR, "failed to parse device arguments: %s",
+				pci_dev->device.devargs->cls_str);
+			return -rte_errno;
+		}
+		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
+			/* Support legacy device argument */
+			ret = rte_eth_devargs_parse(
+				pci_dev->device.devargs->args, &eth_da);
+			if (ret) {
+				DRV_LOG(ERR, "failed to parse device arguments: %s",
+					pci_dev->device.devargs->args);
+				return -rte_errno;
+			}
+		}
+	}
 	errno = 0;
 	ibv_list = mlx5_glue->get_device_list(&ret);
 	if (!ibv_list) {
@@ -1972,6 +2050,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 				case MLX5_PHYS_PORT_NAME_TYPE_PFHPF:
 					/* Fallthrough */
 				case MLX5_PHYS_PORT_NAME_TYPE_PFVF:
+					/* Fallthrough */
+				case MLX5_PHYS_PORT_NAME_TYPE_PFSF:
 					if (list[ns].info.pf_num == bd)
 						ns++;
 					break;
@@ -2149,7 +2229,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		dev_config.log_hp_size = MLX5_ARG_UNSET;
 		list[i].eth_dev = mlx5_dev_spawn(&pci_dev->device,
 						 &list[i],
-						 &dev_config);
+						 &dev_config,
+						 &eth_da);
 		if (!list[i].eth_dev) {
 			if (rte_errno != EBUSY && rte_errno != EEXIST)
 				break;
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 45ee7e4488..ad6aacc329 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -374,6 +374,8 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 			break;
 		}
 	}
+	if (priv->master)
+		info->dev_capa = RTE_ETH_DEV_CAPA_REPRESENTOR_SF;
 	return 0;
 }
 
-- 
2.25.1



* [dpdk-dev] [PATCH v3 3/9] net/mlx5: revert setting bonding representor to first PF
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (13 preceding siblings ...)
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 2/9] net/mlx5: support representor of sub function Xueming Li
@ 2021-01-18 11:29   ` Xueming Li
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 4/9] net/mlx5: refactor bonding representor probe Xueming Li
                     ` (51 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 11:29 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

With kernel bonding, representors on the second PF are now probed with
devargs of the form:
	<primary_bdf>,representor=pf1vf<N>
There is no longer any need to save the primary PF port ID and look it up
when probing sibling ports, so revert patch [1].

[1]:
commit e6818853c022 ("net/mlx5: set representor to first PF in bonding
mode")

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 20 ++------------------
 drivers/net/mlx5/mlx5.c          |  1 -
 drivers/net/mlx5/mlx5.h          |  1 -
 3 files changed, 2 insertions(+), 20 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index b2776c080a..7b320e8b72 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -816,13 +816,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 			rte_errno = ENOMEM;
 			return NULL;
 		}
-		priv = eth_dev->data->dev_private;
-		if (priv->sh->bond_dev != UINT16_MAX)
-			/* For bonding port, use primary PCI device. */
-			eth_dev->device =
-				rte_eth_devices[priv->sh->bond_dev].device;
-		else
-			eth_dev->device = dpdk_dev;
+		eth_dev->device = dpdk_dev;
 		eth_dev->dev_ops = &mlx5_dev_sec_ops;
 		eth_dev->rx_descriptor_status = mlx5_rx_descriptor_status;
 		eth_dev->tx_descriptor_status = mlx5_tx_descriptor_status;
@@ -1439,17 +1433,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	eth_dev->data->dev_private = priv;
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
-	if (spawn->pf_bond < 0) {
-		eth_dev->device = dpdk_dev;
-	} else {
-		/* Use primary bond PCI as device. */
-		if (sh->bond_dev == UINT16_MAX) {
-			sh->bond_dev = eth_dev->data->port_id;
-			eth_dev->device = dpdk_dev;
-		} else {
-			eth_dev->device = rte_eth_devices[sh->bond_dev].device;
-		}
-	}
+	eth_dev->device = dpdk_dev;
 	eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;
 	/* Configure the first MAC address by default. */
 	if (mlx5_get_mac(eth_dev, &mac.addr_bytes)) {
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index e245276fce..5e8cd6a3df 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -914,7 +914,6 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn,
 		goto error;
 	}
 	sh->refcnt = 1;
-	sh->bond_dev = UINT16_MAX;
 	sh->max_port = spawn->max_port;
 	strncpy(sh->ibdev_name, mlx5_os_get_ctx_device_name(sh->ctx),
 		sizeof(sh->ibdev_name) - 1);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 3836a9696c..e06e0ff3bb 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -668,7 +668,6 @@ struct mlx5_flex_parser_profiles {
 struct mlx5_dev_ctx_shared {
 	LIST_ENTRY(mlx5_dev_ctx_shared) next;
 	uint32_t refcnt;
-	uint16_t bond_dev; /* Bond primary device id. */
 	uint32_t devx:1; /* Opened with DV. */
 	uint32_t flow_hit_aso_en:1; /* Flow Hit ASO is supported. */
 	uint32_t max_port; /* Maximal IB device port index. */
-- 
2.25.1



* [dpdk-dev] [PATCH v3 4/9] net/mlx5: refactor bonding representor probe
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (14 preceding siblings ...)
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 3/9] net/mlx5: revert setting bonding representor to first PF Xueming Li
@ 2021-01-18 11:29   ` Xueming Li
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 5/9] net/mlx5: support representor from multiple PFs Xueming Li
                     ` (50 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 11:29 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

To probe a representor on the 2nd PF of a kernel bonding device, the PF1
BDF had to be specified in the devarg:
  <PF1_BDF>,representor=0
When closing a bonding device, all representors have to be closed together,
which implies that all representors have to use the primary PF of the
bonding device. So after probing a representor port on the 2nd PF, when
locating the newly probed device using the device argument, the filter used
the 2nd PF as the PCI address and failed to locate the new device.

The conflict comes from how the current representor devargs are used:
 - The PCI BDF is used to specify the representor owner PF.
 - The PCI BDF is used to locate the probed representor device.
 - The PMD uses the primary PCI BDF as the PCI device.

To resolve this conflict, a new representor syntax is introduced here:
  <primary BDF>,representor=pfXvfY
All representors must use the primary PF as the owner PCI device; the PMD
internally locates the owner PCI address by checking the "pfX" part of the
representor. To EAL, all representors are registered to the primary PCI
device and the 2nd PF is hidden, so all searches are consistent.

As with VF representors, the HPF (host PF on BlueField) uses the same
syntax to probe, for example: representor=pf1vf[0-3,-1]

This patch also adds pf index into kernel bonding representor port name:
	<BDF>_<ib_name>_representor_pf<X>vf<Y>
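
A minimal sketch of how a name in this format could be assembled is shown
here. It is an illustration only: format_bond_repr_name() is a
hypothetical helper, not a function from the PMD, and the sample BDF and
IB device name used with it are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<BDF>_<ib_name>_representor_pf<X>vf<Y>" into buf.
 * Returns the name length, or -1 if the name does not fit.
 */
static int
format_bond_repr_name(char *buf, size_t size, const char *bdf,
		      const char *ib_name, int pf, int vf)
{
	int len;

	assert(buf != NULL && bdf != NULL && ib_name != NULL);
	len = snprintf(buf, size, "%s_%s_representor_pf%dvf%d",
		       bdf, ib_name, pf, vf);
	/* snprintf() reports the untruncated length; reject truncation. */
	if (len < 0 || (size_t)len >= size)
		return -1;
	return len;
}
```

Checking the snprintf() return value against the buffer size matters here
because the PF index makes bonding representor names longer than before.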

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst         |   4 +-
 drivers/net/mlx5/linux/mlx5_os.c | 263 +++++++++++++++++--------------
 drivers/net/mlx5/mlx5.c          |  22 +++
 drivers/net/mlx5/mlx5.h          |   3 +-
 drivers/net/mlx5/mlx5_defs.h     |   4 -
 drivers/net/mlx5/mlx5_ethdev.c   |  27 ----
 drivers/net/mlx5/mlx5_mac.c      |   8 +-
 7 files changed, 177 insertions(+), 154 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index c7829007a4..eaca4fc058 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -878,11 +878,11 @@ Driver options
 
   For instance, to probe VF port representors 0 through 2::
 
-    representor=vf[0-2]
+    <PCI_BDF>,representor=vf[0-2]
 
   To probe SF port representors 0 through 2::
 
-    representor=sf[0-2]
+    <PCI_BDF>,representor=sf[0-2]
 
 - ``max_dump_files_num`` parameter [int]
 
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 7b320e8b72..9ae5910f46 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -645,6 +645,72 @@ mlx5_flow_counter_mode_config(struct rte_eth_dev *dev __rte_unused)
 #endif
 }
 
+/**
+ * Check if representor spawn info match devargs.
+ *
+ * @param spawn
+ *   Verbs device parameters (name, port, switch_info) to spawn.
+ * @param eth_da
+ *   Device devargs to probe.
+ * @param repr_id
+ *   Encoded representor ID.
+ *
+ * @return
+ *   Match result.
+ */
+static bool
+mlx5_representor_match(struct mlx5_dev_spawn_data *spawn,
+		       struct rte_eth_devargs *eth_da,
+		       uint16_t repr_id)
+{
+	const struct mlx5_switch_info *switch_info = &spawn->info;
+	unsigned int c, p, f;
+	uint16_t repr;
+
+	switch (eth_da->type) {
+	case RTE_ETH_REPRESENTOR_SF:
+		if (switch_info->name_type != MLX5_PHYS_PORT_NAME_TYPE_PFSF) {
+			rte_errno = EBUSY;
+			return false;
+		}
+		break;
+	case RTE_ETH_REPRESENTOR_VF:
+		/* Allows HPF representor index -1 as exception. */
+		if (!(spawn->info.port_name == -1 &&
+		      switch_info->name_type ==
+				MLX5_PHYS_PORT_NAME_TYPE_PFHPF) &&
+		    switch_info->name_type != MLX5_PHYS_PORT_NAME_TYPE_PFVF) {
+			rte_errno = EBUSY;
+			return false;
+		}
+		break;
+	case RTE_ETH_REPRESENTOR_NONE:
+		rte_errno = EBUSY;
+		return false;
+	default:
+		rte_errno = ENOTSUP;
+		DRV_LOG(ERR, "unsupported representor type");
+		return false;
+	}
+	/* Check representor ID: */
+	for (c = 0; c < eth_da->nb_mh_controllers; ++c) {
+		for (p = 0; p < eth_da->nb_ports; ++p) {
+			for (f = 0; f < eth_da->nb_representor_ports; ++f) {
+				repr = rte_eth_representor_id_encode(
+					eth_da->mh_controllers[c],
+					eth_da->ports[p],
+					eth_da->type,
+					eth_da->representor_ports[f]);
+				if (repr_id == repr)
+					return true;
+			}
+		}
+	}
+	rte_errno = EBUSY;
+	return false;
+}
+
+
 /**
  * Spawn an Ethernet device from Verbs information.
  *
@@ -676,6 +742,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	struct mlx5dv_context dv_attr = { .comp_mask = 0 };
 	struct rte_eth_dev *eth_dev = NULL;
 	struct mlx5_priv *priv = NULL;
+	uint16_t repr_id = RTE_NO_REPRESENTOR_ID;
 	int err = 0;
 	unsigned int hw_padding = 0;
 	unsigned int mps;
@@ -692,115 +759,52 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	char name[RTE_ETH_NAME_MAX_LEN];
 	int own_domain_id = 0;
 	uint16_t port_id;
-	unsigned int i;
 #ifdef HAVE_MLX5DV_DR_DEVX_PORT
 	struct mlx5dv_devx_port devx_port = { .comp_mask = 0 };
 #endif
 
+	if (switch_info->representor)
+		repr_id = rte_eth_representor_id_encode(
+			switch_info->ctrl_num,
+			spawn->pf_bond >= 0 ? switch_info->pf_num : 0,
+			switch_info->name_type ==
+				MLX5_PHYS_PORT_NAME_TYPE_PFSF ?
+				RTE_ETH_REPRESENTOR_SF : RTE_ETH_REPRESENTOR_VF,
+			switch_info->port_name);
 	/* Determine if this port representor is supposed to be spawned. */
-	if (switch_info->representor && dpdk_dev->devargs) {
-		switch (eth_da->type) {
-		case RTE_ETH_REPRESENTOR_SF:
-			if (switch_info->name_type !=
-					MLX5_PHYS_PORT_NAME_TYPE_PFSF) {
-				rte_errno = EBUSY;
-				return NULL;
-			}
-			break;
-		case RTE_ETH_REPRESENTOR_VF:
-			/* Allows HPF representor index -1 as exception. */
-			if (!(spawn->info.port_name == -1 &&
-			      switch_info->name_type ==
-					MLX5_PHYS_PORT_NAME_TYPE_PFHPF) &&
-			    switch_info->name_type !=
-					MLX5_PHYS_PORT_NAME_TYPE_PFVF) {
-				rte_errno = EBUSY;
-				return NULL;
-			}
-			break;
-		case RTE_ETH_REPRESENTOR_NONE:
-			rte_errno = EBUSY;
-			return NULL;
-			break;
-		default:
-			rte_errno = ENOTSUP;
-			DRV_LOG(ERR, "unsupported representor type: %s",
-				dpdk_dev->devargs->args);
-			return NULL;
-		}
-		/* Check controller ID: */
-		for (i = 0; i < eth_da->nb_mh_controllers; ++i)
-			if (eth_da->mh_controllers[i] ==
-			    (uint16_t)switch_info->ctrl_num)
-				break;
-		if (eth_da->nb_mh_controllers &&
-		    i == eth_da->nb_mh_controllers) {
-			rte_errno = EBUSY;
-			return NULL;
-		}
-		/* Check SF/VF ID: */
-		for (i = 0; i < eth_da->nb_representor_ports; ++i)
-			if (eth_da->representor_ports[i] ==
-			    (uint16_t)switch_info->port_name)
-				break;
-		if (eth_da->type != RTE_ETH_REPRESENTOR_PF &&
-		    i == eth_da->nb_representor_ports) {
-			rte_errno = EBUSY;
-			return NULL;
-		}
-		/* Check PF ID. Check after repr port to avoid warning flood. */
-		if (spawn->pf_bond >= 0) {
-			for (i = 0; i < eth_da->nb_ports; ++i)
-				if (eth_da->ports[i] ==
-				    (uint16_t)switch_info->pf_num)
-					break;
-			if (eth_da->nb_ports && i == eth_da->nb_ports) {
-				/* For backward compatibility, bonding
-				 * representor syntax supported with limitation,
-				 * device iterator won't find it:
-				 *    <PF1_BDF>,representor=#
-				 */
-				if (switch_info->pf_num > 0 &&
-				    eth_da->ports[0] == 0) {
-					DRV_LOG(WARNING, "Representor on Bonding PF should use pf#vf# format: %s",
-						dpdk_dev->devargs->args);
-				} else {
-					rte_errno = EBUSY;
-					return NULL;
-				}
-			}
-		} else if (eth_da->nb_ports > 1 || eth_da->ports[0]) {
-			rte_errno = EINVAL;
-			DRV_LOG(ERR, "PF id not supported by non-bond device: %s",
-				dpdk_dev->devargs->args);
-			return NULL;
-		}
-	}
+	if (switch_info->representor && dpdk_dev->devargs &&
+	    !mlx5_representor_match(spawn, eth_da, repr_id))
+		return NULL;
 	/* Build device name. */
 	if (spawn->pf_bond <  0) {
 		/* Single device. */
 		if (!switch_info->representor)
 			strlcpy(name, dpdk_dev->name, sizeof(name));
 		else
-			snprintf(name, sizeof(name), "%s_representor_%s%u",
+			err = snprintf(name, sizeof(name), "%s_representor_%s%u",
 				 dpdk_dev->name,
 				 switch_info->name_type ==
 				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
 				 switch_info->port_name);
 	} else {
 		/* Bonding device. */
-		if (!switch_info->representor)
-			snprintf(name, sizeof(name), "%s_%s",
+		if (!switch_info->representor) {
+			err = snprintf(name, sizeof(name), "%s_%s",
 				 dpdk_dev->name,
 				 mlx5_os_get_dev_device_name(spawn->phys_dev));
-		else
-			snprintf(name, sizeof(name), "%s_%s_representor_%s%u",
-				 dpdk_dev->name,
-				 mlx5_os_get_dev_device_name(spawn->phys_dev),
-				 switch_info->name_type ==
-				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
-				 switch_info->port_name);
+		} else {
+			err = snprintf(name, sizeof(name), "%s_%s_representor_c%dpf%d%s%u",
+				dpdk_dev->name,
+				mlx5_os_get_dev_device_name(spawn->phys_dev),
+				switch_info->ctrl_num,
+				switch_info->pf_num,
+				switch_info->name_type ==
+				MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
+				switch_info->port_name);
+		}
 	}
+	if (err >= (int)sizeof(name))
+		DRV_LOG(WARNING, "device name overflow %s", name);
 	/* check if the device is already spawned */
 	if (rte_eth_dev_get_port_by_name(name, &port_id) == 0) {
 		rte_errno = EEXIST;
@@ -1073,11 +1077,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	priv->vport_id = switch_info->representor ?
 			 switch_info->port_name + 1 : -1;
 #endif
-	/* representor_id field keeps the unmodified VF index. */
-	priv->representor_id = switch_info->representor ?
-		rte_eth_representor_id_encode(0, 0, RTE_ETH_REPRESENTOR_VF,
-					      switch_info->port_name) :
-		-1;
+	priv->representor_id = repr_id;
 	/*
 	 * Look for sibling devices in order to reuse their switch domain
 	 * if any, otherwise allocate one.
@@ -1692,9 +1692,11 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
  * @param[in] ibv_dev
  *   Pointer to Infiniband device structure.
  * @param[in] pci_dev
- *   Pointer to PCI device structure to match PCI address.
+ *   Pointer to primary PCI address structure to match.
  * @param[in] nl_rdma
  *   Netlink RDMA group socket handle.
+ * @param[in] owner
+ *   Representor owner PF index.
  *
  * @return
  *   negative value if no bonding device found, otherwise
@@ -1702,8 +1704,8 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
  */
 static int
 mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
-			   const struct rte_pci_device *pci_dev,
-			   int nl_rdma)
+			   const struct rte_pci_addr *pci_dev,
+			   int nl_rdma, uint16_t owner)
 {
 	char ifname[IF_NAMESIZE + 1];
 	unsigned int ifindex;
@@ -1760,10 +1762,10 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 					 " for netdev \"%s\"", ifname);
 			continue;
 		}
-		if (pci_dev->addr.domain != pci_addr.domain ||
-		    pci_dev->addr.bus != pci_addr.bus ||
-		    pci_dev->addr.devid != pci_addr.devid ||
-		    pci_dev->addr.function != pci_addr.function)
+		if (pci_dev->domain != pci_addr.domain ||
+		    pci_dev->bus != pci_addr.bus ||
+		    pci_dev->devid != pci_addr.devid ||
+		    pci_dev->function + owner != pci_addr.function)
 			continue;
 		/* Slave interface PCI address match found. */
 		fclose(file);
@@ -1831,7 +1833,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_config dev_config;
 	unsigned int dev_config_vf;
 	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
-	int ret;
+	struct rte_pci_addr owner_pci = pci_dev->addr; /* Owner PF. */
+	int ret = -1;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
 		mlx5_pmd_socket_init();
@@ -1883,7 +1886,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 
 		DRV_LOG(DEBUG, "checking device \"%s\"", ibv_list[ret]->name);
 		bd = mlx5_device_bond_pci_match
-				(ibv_list[ret], pci_dev, nl_rdma);
+				(ibv_list[ret], &owner_pci, nl_rdma,
+				 eth_da.ports[0]);
 		if (bd >= 0) {
 			/*
 			 * Bonding device detected. Only one match is allowed,
@@ -1900,23 +1904,28 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 				ret = -rte_errno;
 				goto exit;
 			}
+			/* Amend owner pci address if owner PF ID specified. */
+			if (eth_da.nb_representor_ports)
+				owner_pci.function += eth_da.ports[0];
 			DRV_LOG(INFO, "PCI information matches for"
 				      " slave %d bonding device \"%s\"",
 				      bd, ibv_list[ret]->name);
 			ibv_match[nd++] = ibv_list[ret];
 			break;
+		} else {
+			/* Bonding device not found. */
+			if (mlx5_dev_to_pci_addr
+				(ibv_list[ret]->ibdev_path, &pci_addr))
+				continue;
+			if (owner_pci.domain != pci_addr.domain ||
+			    owner_pci.bus != pci_addr.bus ||
+			    owner_pci.devid != pci_addr.devid ||
+			    owner_pci.function != pci_addr.function)
+				continue;
+			DRV_LOG(INFO, "PCI information matches for device \"%s\"",
+				ibv_list[ret]->name);
+			ibv_match[nd++] = ibv_list[ret];
 		}
-		if (mlx5_dev_to_pci_addr
-			(ibv_list[ret]->ibdev_path, &pci_addr))
-			continue;
-		if (pci_dev->addr.domain != pci_addr.domain ||
-		    pci_dev->addr.bus != pci_addr.bus ||
-		    pci_dev->addr.devid != pci_addr.devid ||
-		    pci_dev->addr.function != pci_addr.function)
-			continue;
-		DRV_LOG(INFO, "PCI information matches for device \"%s\"",
-			ibv_list[ret]->name);
-		ibv_match[nd++] = ibv_list[ret];
 	}
 	ibv_match[nd] = NULL;
 	if (!nd) {
@@ -1924,8 +1933,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		DRV_LOG(WARNING,
 			"no Verbs device matches PCI device " PCI_PRI_FMT ","
 			" are kernel drivers loaded?",
-			pci_dev->addr.domain, pci_dev->addr.bus,
-			pci_dev->addr.devid, pci_dev->addr.function);
+			owner_pci.domain, owner_pci.bus,
+			owner_pci.devid, owner_pci.function);
 		rte_errno = ENOENT;
 		ret = -rte_errno;
 		goto exit;
@@ -2190,6 +2199,24 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		dev_config_vf = 0;
 		break;
 	}
+	if (eth_da.type != RTE_ETH_REPRESENTOR_NONE) {
+		/* Set devargs default values. */
+		if (eth_da.nb_mh_controllers == 0) {
+			eth_da.nb_mh_controllers = 1;
+			eth_da.mh_controllers[0] = 0;
+		}
+		if (eth_da.nb_ports == 0 && ns > 0) {
+			if (list[0].pf_bond >= 0 && list[0].info.representor)
+				DRV_LOG(WARNING, "Representor on Bonding device should use pf#vf# syntax: %s",
+					pci_dev->device.devargs->args);
+			eth_da.nb_ports = 1;
+			eth_da.ports[0] = list[0].info.pf_num;
+		}
+		if (eth_da.nb_representor_ports == 0) {
+			eth_da.nb_representor_ports = 1;
+			eth_da.representor_ports[0] = 0;
+		}
+	}
 	for (i = 0; i != ns; ++i) {
 		uint32_t restore;
 
@@ -2231,8 +2258,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		DRV_LOG(ERR,
 			"probe of PCI device " PCI_PRI_FMT " aborted after"
 			" encountering an error: %s",
-			pci_dev->addr.domain, pci_dev->addr.bus,
-			pci_dev->addr.devid, pci_dev->addr.function,
+			owner_pci.domain, owner_pci.bus,
+			owner_pci.devid, owner_pci.function,
 			strerror(rte_errno));
 		ret = -rte_errno;
 		/* Roll back. */
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 5e8cd6a3df..d613ffd655 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -355,6 +355,28 @@ static const struct mlx5_indexed_pool_config mlx5_ipool_cfg[] = {
 
 #define MLX5_FLOW_TABLE_HLIST_ARRAY_SIZE 4096
 
+/**
+ * Decide whether representor ID is an HPF (host PF) port on BF2.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   Non-zero if HPF, otherwise 0.
+ */
+int
+mlx5_is_hpf(struct rte_eth_dev *dev)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	enum rte_eth_representor_type type;
+	uint16_t port;
+
+	port = rte_eth_representor_id_parse(priv->representor_id,
+					    NULL, NULL, &type);
+	return priv->representor && type == RTE_ETH_REPRESENTOR_VF &&
+	       port == rte_eth_representor_id_parse(-1, NULL, NULL, NULL);
+}
+
 /**
  * Initialize the ASO aging management structure.
  *
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e06e0ff3bb..e7afa438ce 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -915,7 +915,7 @@ struct mlx5_priv {
 	uint16_t vport_id; /* Associated VF vport index (if any). */
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
 	uint32_t vport_meta_mask; /* Used for vport index field match mask. */
-	int32_t representor_id; /* Port representor identifier. */
+	int32_t representor_id; /* RTE_ETH_REPR(), -1 if not a representor. */
 	int32_t pf_bond; /* >=0 means PF index in bonding configuration. */
 	unsigned int if_index; /* Associated kernel network device index. */
 	uint32_t bond_ifindex; /**< Bond interface index. */
@@ -988,6 +988,7 @@ int mlx5_udp_tunnel_port_add(struct rte_eth_dev *dev,
 			      struct rte_eth_udp_tunnel *udp_tunnel);
 uint16_t mlx5_eth_find_next(uint16_t port_id, struct rte_pci_device *pci_dev);
 int mlx5_dev_close(struct rte_eth_dev *dev);
+int mlx5_is_hpf(struct rte_eth_dev *dev);
 void mlx5_age_event_prepare(struct mlx5_dev_ctx_shared *sh);
 
 /* Macro to iterate over all valid ports for mlx5 driver. */
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 85a0979653..4648196550 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -48,10 +48,6 @@
 #define MLX5_PMD_SOFT_COUNTERS 1
 #endif
 
-/* Switch port ID parameters for bonding configurations. */
-#define MLX5_PORT_ID_BONDING_PF_MASK 0xf
-#define MLX5_PORT_ID_BONDING_PF_SHIFT 12
-
 /* Alarm timeout. */
 #define MLX5_ALARM_TIMEOUT_US 100000
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index ad6aacc329..5341eb16c9 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -330,33 +330,6 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 	if (priv->representor) {
 		uint16_t port_id;
 
-		if (priv->pf_bond >= 0) {
-			/*
-			 * Switch port ID is opaque value with driver defined
-			 * format. Push the PF index in bonding configurations
-			 * in upper four bits of port ID. If we get too many
-			 * representors (more than 4K) or PFs (more than 15)
-			 * this approach must be reconsidered.
-			 */
-			/* Switch port ID for VF representors: 0 - 0xFFE */
-			if ((info->switch_info.port_id != 0xffff &&
-				info->switch_info.port_id >=
-				((1 << MLX5_PORT_ID_BONDING_PF_SHIFT) - 1)) ||
-			    priv->pf_bond > MLX5_PORT_ID_BONDING_PF_MASK) {
-				DRV_LOG(ERR, "can't update switch port ID"
-					     " for bonding device");
-				MLX5_ASSERT(false);
-				return -ENODEV;
-			}
-			/*
-			 * Switch port ID for Host PF representor
-			 * (representor_id is -1) , set to 0xFFF
-			 */
-			if (info->switch_info.port_id == 0xffff)
-				info->switch_info.port_id = 0xfff;
-			info->switch_info.port_id |=
-				priv->pf_bond << MLX5_PORT_ID_BONDING_PF_SHIFT;
-		}
 		MLX5_ETH_FOREACH_DEV(port_id, priv->pci_dev) {
 			struct mlx5_priv *opriv =
 				rte_eth_devices[port_id].data->dev_private;
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index bd786fd638..b5b810b508 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -159,7 +159,7 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 	 * Configuring the VF instead of its representor,
 	 * need to skip the special case of HPF on Bluefield.
 	 */
-	if (priv->representor && priv->representor_id >= 0) {
+	if (priv->representor && !mlx5_is_hpf(dev)) {
 		DRV_LOG(DEBUG, "VF represented by port %u setting primary MAC address",
 			dev->data->port_id);
 		RTE_ETH_FOREACH_DEV_SIBLING(port_id, dev->data->port_id) {
@@ -169,7 +169,11 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 				return mlx5_os_vf_mac_addr_modify
 				       (priv,
 					mlx5_ifindex(&rte_eth_devices[port_id]),
-					mac_addr, priv->representor_id);
+					mac_addr,
+					rte_eth_representor_id_parse(
+							priv->representor_id,
+							NULL, NULL, NULL)
+					);
 			}
 		}
 		rte_errno = -ENOTSUP;
-- 
2.25.1



* [dpdk-dev] [PATCH v3 5/9] net/mlx5: support representor from multiple PFs
From: Xueming Li @ 2021-01-18 11:29 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

To probe representors from different PFs of a kernel bonding device,
two separate devargs had to be specified:
    -a 03:00.0,representor=pf0vf[0-3] -a 03:00.0,representor=pf1vf[0-3]

This patch supports a range or list in the PF section of devargs, so
the above can be shortened to:
    -a 03:00.0,representor=pf[0-1]vf[0-3]

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst         |   4 ++
 drivers/net/mlx5/linux/mlx5_os.c | 100 +++++++++++++++++++++----------
 2 files changed, 72 insertions(+), 32 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index eaca4fc058..480c9d3fc1 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -884,6 +884,10 @@ Driver options
 
     <PCI_BDF>,representor=sf[0-2]
 
+  To probe VF port representors 0 through 2 on both PFs of a bonding device::
+
+    <Primary_PCI_BDF>,representor=pf[0,1]vf[0-2]
+
 - ``max_dump_files_num`` parameter [int]
 
   The maximum number of files per PMD entity that may be created for debug information.
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 9ae5910f46..521a0a5789 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1788,21 +1788,25 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 }
 
 /**
- * DPDK callback to register a PCI device.
+ * Register a PCI device within bonding.
  *
- * This function spawns Ethernet devices out of a given PCI device.
+ * This function spawns Ethernet devices out of a given PCI device and
+ * bonding owner PF index.
  *
- * @param[in] pci_drv
- *   PCI driver structure (mlx5_driver).
  * @param[in] pci_dev
  *   PCI device information.
+ * @param[in] req_eth_da
+ *   Requested ethdev device argument.
+ * @param[in] owner_id
+ *   Requested owner PF port ID within bonding device, defaults to 0.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-int
-mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
-		  struct rte_pci_device *pci_dev)
+static int
+mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
+		     struct rte_eth_devargs *req_eth_da,
+		     uint16_t owner_id)
 {
 	struct ibv_device **ibv_list;
 	/*
@@ -1832,7 +1836,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_spawn_data *list = NULL;
 	struct mlx5_dev_config dev_config;
 	unsigned int dev_config_vf;
-	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
+	struct rte_eth_devargs eth_da = *req_eth_da;
 	struct rte_pci_addr owner_pci = pci_dev->addr; /* Owner PF. */
 	int ret = -1;
 
@@ -1844,27 +1848,6 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			strerror(rte_errno));
 		return -rte_errno;
 	}
-	if (pci_dev->device.devargs) {
-		/* Parse representor information from device argument. */
-		if (pci_dev->device.devargs->cls_str)
-			ret = rte_eth_devargs_parse(
-				pci_dev->device.devargs->cls_str, &eth_da);
-		if (ret) {
-			DRV_LOG(ERR, "failed to parse device arguments: %s",
-				pci_dev->device.devargs->cls_str);
-			return -rte_errno;
-		}
-		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
-			/* Support legacy device argument */
-			ret = rte_eth_devargs_parse(
-				pci_dev->device.devargs->args, &eth_da);
-			if (ret) {
-				DRV_LOG(ERR, "failed to parse device arguments: %s",
-					pci_dev->device.devargs->args);
-				return -rte_errno;
-			}
-		}
-	}
 	errno = 0;
 	ibv_list = mlx5_glue->get_device_list(&ret);
 	if (!ibv_list) {
@@ -1886,8 +1869,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 
 		DRV_LOG(DEBUG, "checking device \"%s\"", ibv_list[ret]->name);
 		bd = mlx5_device_bond_pci_match
-				(ibv_list[ret], &owner_pci, nl_rdma,
-				 eth_da.ports[0]);
+				(ibv_list[ret], &owner_pci, nl_rdma, owner_id);
 		if (bd >= 0) {
 			/*
 			 * Bonding device detected. Only one match is allowed,
@@ -1906,7 +1888,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			}
 			/* Amend owner pci address if owner PF ID specified. */
 			if (eth_da.nb_representor_ports)
-				owner_pci.function += eth_da.ports[0];
+				owner_pci.function += owner_id;
 			DRV_LOG(INFO, "PCI information matches for"
 				      " slave %d bonding device \"%s\"",
 				      bd, ibv_list[ret]->name);
@@ -2294,6 +2276,60 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	return ret;
 }
 
+/**
+ * DPDK callback to register a PCI device.
+ *
+ * This function spawns Ethernet devices out of a given PCI device.
+ *
+ * @param[in] pci_drv
+ *   PCI driver structure (mlx5_driver).
+ * @param[in] pci_dev
+ *   PCI device information.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
+		  struct rte_pci_device *pci_dev)
+{
+	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
+	int ret = 0;
+	uint16_t p;
+
+	if (pci_dev->device.devargs) {
+		/* Parse representor information from device argument. */
+		if (pci_dev->device.devargs->cls_str)
+			ret = rte_eth_devargs_parse(
+				pci_dev->device.devargs->cls_str, &eth_da);
+		if (ret) {
+			DRV_LOG(ERR, "failed to parse device arguments: %s",
+				pci_dev->device.devargs->cls_str);
+			return -rte_errno;
+		}
+		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
+			/* Support legacy device argument */
+			ret = rte_eth_devargs_parse(
+				pci_dev->device.devargs->args, &eth_da);
+			if (ret) {
+				DRV_LOG(ERR, "failed to parse device arguments: %s",
+					pci_dev->device.devargs->args);
+				return -rte_errno;
+			}
+		}
+	}
+
+	if (eth_da.nb_ports > 0) {
+		/* Iterate all ports if devargs pf is a range: "pf[0-1]vf[...]". */
+		for (p = 0; p < eth_da.nb_ports; p++)
+			ret = mlx5_os_pci_probe_pf(pci_dev, &eth_da,
+						   eth_da.ports[p]);
+	} else {
+		ret = mlx5_os_pci_probe_pf(pci_dev, &eth_da, 0);
+	}
+	return ret;
+}
+
 static int
 mlx5_config_doorbell_mapping_env(const struct mlx5_dev_config *config)
 {
-- 
2.25.1



* [dpdk-dev] [PATCH v3 6/9] net/mlx5: save bonding member ports information
From: Xueming Li @ 2021-01-18 11:29 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

Since the kernel bonding netdev doesn't provide statistics counters
that reflect all member ports, the PMD has to sum the counters of each
member port manually.

As preparation, this patch collects bonding member port information
and saves it to the shared context data.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_ethdev_os.c |  4 +-
 drivers/net/mlx5/linux/mlx5_os.c        | 91 ++++++++++++++++---------
 drivers/net/mlx5/mlx5.c                 |  2 +
 drivers/net/mlx5/mlx5.h                 | 19 +++++-
 drivers/net/mlx5/mlx5_ethdev.c          |  5 +-
 5 files changed, 84 insertions(+), 37 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index ac311de46d..84610a7bc0 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -150,8 +150,8 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[MLX5_NAMESIZE])
 
 	MLX5_ASSERT(priv);
 	MLX5_ASSERT(priv->sh);
-	if (priv->bond_ifindex > 0) {
-		memcpy(ifname, priv->bond_name, MLX5_NAMESIZE);
+	if (priv->master && priv->sh->bond.ifindex > 0) {
+		memcpy(ifname, priv->sh->bond.ifname, MLX5_NAMESIZE);
 		return 0;
 	}
 	ifindex = mlx5_ifindex(dev);
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 521a0a5789..47a7c3dff0 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1417,19 +1417,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	 */
 	MLX5_ASSERT(spawn->ifindex);
 	priv->if_index = spawn->ifindex;
-	if (priv->pf_bond >= 0 && priv->master) {
-		/* Get bond interface info */
-		err = mlx5_sysfs_bond_info(priv->if_index,
-				     &priv->bond_ifindex,
-				     priv->bond_name);
-		if (err)
-			DRV_LOG(ERR, "unable to get bond info: %s",
-				strerror(rte_errno));
-		else
-			DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
-				priv->if_index, priv->bond_ifindex,
-				priv->bond_name);
-	}
 	eth_dev->data->dev_private = priv;
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
@@ -1697,6 +1684,8 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
  *   Netlink RDMA group socket handle.
  * @param[in] owner
 *   Representor owner PF index.
+ * @param[out] bond_info
+ *   Pointer to bonding information.
  *
  * @return
  *   negative value if no bonding device found, otherwise
@@ -1705,19 +1694,22 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
 static int
 mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 			   const struct rte_pci_addr *pci_dev,
-			   int nl_rdma, uint16_t owner)
+			   int nl_rdma, uint16_t owner,
+			   struct mlx5_bond_info *bond_info)
 {
 	char ifname[IF_NAMESIZE + 1];
 	unsigned int ifindex;
 	unsigned int np, i;
-	FILE *file = NULL;
+	FILE *bond_file = NULL, *file;
 	int pf = -1;
+	int ret;
 
 	/*
 	 * Try to get master device name. If something goes
 	 * wrong suppose the lack of kernel support and no
 	 * bonding devices.
 	 */
+	memset(bond_info, 0, sizeof(*bond_info));
 	if (nl_rdma < 0)
 		return -1;
 	if (!strstr(ibv_dev->name, "bond"))
@@ -1741,15 +1733,15 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		/* Try to read bonding slave names from sysfs. */
 		MKSTR(slaves,
 		      "/sys/class/net/%s/master/bonding/slaves", ifname);
-		file = fopen(slaves, "r");
-		if (file)
+		bond_file = fopen(slaves, "r");
+		if (bond_file)
 			break;
 	}
-	if (!file)
+	if (!bond_file)
 		return -1;
 	/* Use safe format to check maximal buffer length. */
 	MLX5_ASSERT(atol(RTE_STR(IF_NAMESIZE)) == IF_NAMESIZE);
-	while (fscanf(file, "%" RTE_STR(IF_NAMESIZE) "s", ifname) == 1) {
+	while (fscanf(bond_file, "%" RTE_STR(IF_NAMESIZE) "s", ifname) == 1) {
 		char tmp_str[IF_NAMESIZE + 32];
 		struct rte_pci_addr pci_addr;
 		struct mlx5_switch_info	info;
@@ -1762,13 +1754,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 					 " for netdev \"%s\"", ifname);
 			continue;
 		}
-		if (pci_dev->domain != pci_addr.domain ||
-		    pci_dev->bus != pci_addr.bus ||
-		    pci_dev->devid != pci_addr.devid ||
-		    pci_dev->function + owner != pci_addr.function)
-			continue;
 		/* Slave interface PCI address match found. */
-		fclose(file);
 		snprintf(tmp_str, sizeof(tmp_str),
 			 "/sys/class/net/%s/phys_port_name", ifname);
 		file = fopen(tmp_str, "rb");
@@ -1777,13 +1763,52 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		info.name_type = MLX5_PHYS_PORT_NAME_TYPE_NOTSET;
 		if (fscanf(file, "%32s", tmp_str) == 1)
 			mlx5_translate_port_name(tmp_str, &info);
-		if (info.name_type == MLX5_PHYS_PORT_NAME_TYPE_LEGACY ||
-		    info.name_type == MLX5_PHYS_PORT_NAME_TYPE_UPLINK)
+		fclose(file);
+		/* Only process PF ports. */
+		if (info.name_type != MLX5_PHYS_PORT_NAME_TYPE_LEGACY &&
+		    info.name_type != MLX5_PHYS_PORT_NAME_TYPE_UPLINK)
+			continue;
+		/* Check max bonding member. */
+		if (info.port_name >= MLX5_BOND_MAX_PORTS) {
+			DRV_LOG(WARNING, "bonding index out of range, "
+				"please increase MLX5_BOND_MAX_PORTS: %s",
+				tmp_str);
+			break;
+		}
+		/* Match PCI address. */
+		if (pci_dev->domain == pci_addr.domain &&
+		    pci_dev->bus == pci_addr.bus &&
+		    pci_dev->devid == pci_addr.devid &&
+		    pci_dev->function + owner == pci_addr.function)
 			pf = info.port_name;
-		break;
-	}
-	if (file)
+		/* Get ifindex. */
+		snprintf(tmp_str, sizeof(tmp_str),
+			 "/sys/class/net/%s/ifindex", ifname);
+		file = fopen(tmp_str, "rb");
+		if (!file)
+			break;
+		ret = fscanf(file, "%u", &ifindex);
 		fclose(file);
+		if (ret != 1)
+			break;
+		/* Save bonding info. */
+		strncpy(bond_info->ports[info.port_name].ifname, ifname,
+			sizeof(bond_info->ports[0].ifname));
+		bond_info->ports[info.port_name].pci_addr = pci_addr;
+		bond_info->ports[info.port_name].ifindex = ifindex;
+		bond_info->n_port++;
+	}
+	if (pf >= 0) {
+		/* Get bond interface info */
+		ret = mlx5_sysfs_bond_info(ifindex, &bond_info->ifindex,
+					   bond_info->ifname);
+		if (ret)
+			DRV_LOG(ERR, "unable to get bond info: %s",
+				strerror(rte_errno));
+		else
+			DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
+				ifindex, bond_info->ifindex, bond_info->ifname);
+	}
 	return pf;
 }
 
@@ -1838,6 +1863,7 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 	unsigned int dev_config_vf;
 	struct rte_eth_devargs eth_da = *req_eth_da;
 	struct rte_pci_addr owner_pci = pci_dev->addr; /* Owner PF. */
+	struct mlx5_bond_info bond_info;
 	int ret = -1;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
@@ -1869,7 +1895,8 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 
 		DRV_LOG(DEBUG, "checking device \"%s\"", ibv_list[ret]->name);
 		bd = mlx5_device_bond_pci_match
-				(ibv_list[ret], &owner_pci, nl_rdma, owner_id);
+				(ibv_list[ret], &owner_pci, nl_rdma, owner_id,
+				 &bond_info);
 		if (bd >= 0) {
 			/*
 			 * Bonding device detected. Only one match is allowed,
@@ -1978,6 +2005,7 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 		MLX5_ASSERT(nd == 1);
 		MLX5_ASSERT(np);
 		for (i = 1; i <= np; ++i) {
+			list[ns].bond_info = &bond_info;
 			list[ns].max_port = np;
 			list[ns].phys_port = i;
 			list[ns].phys_dev = ibv_match[0];
@@ -2068,6 +2096,7 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 		 */
 		for (i = 0; i != nd; ++i) {
 			memset(&list[ns].info, 0, sizeof(list[ns].info));
+			list[ns].bond_info = NULL;
 			list[ns].max_port = 1;
 			list[ns].phys_port = 1;
 			list[ns].phys_dev = ibv_match[i];
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index d613ffd655..e170db948d 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -927,6 +927,8 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn,
 		rte_errno  = ENOMEM;
 		goto exit;
 	}
+	if (spawn->bond_info)
+		sh->bond = *spawn->bond_info;
 	err = mlx5_os_open_device(spawn, config, sh);
 	if (!sh->ctx)
 		goto error;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e7afa438ce..508f98f8cd 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -115,6 +115,7 @@ struct mlx5_dev_spawn_data {
 	void *phys_dev; /**< Associated physical device. */
 	struct rte_eth_dev *eth_dev; /**< Associated Ethernet device. */
 	struct rte_pci_device *pci_dev; /**< Backend PCI device. */
+	struct mlx5_bond_info *bond_info;
 };
 
 /** Key string for IPC. */
@@ -661,6 +662,19 @@ struct mlx5_flex_parser_profiles {
 	void *obj;		/* Flex parser node object. */
 };
 
+/* Bonding device information. */
+struct mlx5_bond_info {
+	int n_port; /* Number of bond member ports. */
+	uint32_t ifindex;
+	char ifname[MLX5_NAMESIZE + 1];
+#define MLX5_BOND_MAX_PORTS 2
+	struct {
+		char ifname[MLX5_NAMESIZE + 1];
+		uint32_t ifindex;
+		struct rte_pci_addr pci_addr;
+	} ports[MLX5_BOND_MAX_PORTS];
+};
+
 /*
  * Shared Infiniband device context for Master/Representors
  * which belong to same IB device with multiple IB ports.
@@ -671,6 +685,7 @@ struct mlx5_dev_ctx_shared {
 	uint32_t devx:1; /* Opened with DV. */
 	uint32_t flow_hit_aso_en:1; /* Flow Hit ASO is supported. */
 	uint32_t max_port; /* Maximal IB device port index. */
+	struct mlx5_bond_info bond; /* Bonding information. */
 	void *ctx; /* Verbs/DV/DevX context. */
 	void *pd; /* Protection Domain. */
 	uint32_t pdn; /* Protection Domain number. */
@@ -916,10 +931,8 @@ struct mlx5_priv {
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
 	uint32_t vport_meta_mask; /* Used for vport index field match mask. */
 	int32_t representor_id; /* RTE_ETH_REPR(), -1 if not a representor. */
-	int32_t pf_bond; /* >=0 means PF index in bonding configuration. */
+	int32_t pf_bond; /* >=0, representor owner PF index in bonding. */
 	unsigned int if_index; /* Associated kernel network device index. */
-	uint32_t bond_ifindex; /**< Bond interface index. */
-	char bond_name[MLX5_NAMESIZE]; /**< Bond interface name. */
 	/* RX/TX queues. */
 	unsigned int rxqs_n; /* RX queues array size. */
 	unsigned int txqs_n; /* TX queues array size. */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 5341eb16c9..29389fc98f 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -42,7 +42,10 @@ mlx5_ifindex(const struct rte_eth_dev *dev)
 
 	MLX5_ASSERT(priv);
 	MLX5_ASSERT(priv->if_index);
-	ifindex = priv->bond_ifindex > 0 ? priv->bond_ifindex : priv->if_index;
+	if (priv->master && priv->sh->bond.ifindex > 0)
+		ifindex = priv->sh->bond.ifindex;
+	else
+		ifindex = priv->if_index;
 	if (!ifindex)
 		rte_errno = ENXIO;
 	return ifindex;
-- 
2.25.1



* [dpdk-dev] [PATCH v3 7/9] net/mlx5: save bonding member ports information
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (17 preceding siblings ...)
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 6/9] net/mlx5: save bonding member ports information Xueming Li
@ 2021-01-18 11:29   ` Xueming Li
  2021-01-18 16:17     ` Slava Ovsiienko
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 8/9] net/mlx5: fix setting VF default MAC through representor Xueming Li
                     ` (47 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-01-18 11:29 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

Since the kernel bonding interface doesn't provide a counter summary of
its member ports, the PMD has to aggregate the counters of the member
ports itself.

This patch collects bonding member information and saves it to the
shared context data.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 508f98f8cd..c15af1d794 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -662,12 +662,14 @@ struct mlx5_flex_parser_profiles {
 	void *obj;		/* Flex parser node object. */
 };
 
+/* Max member ports per bonding device. */
+#define MLX5_BOND_MAX_PORTS 2
+
 /* Bonding device information. */
 struct mlx5_bond_info {
 	int n_port; /* Number of bond member ports. */
 	uint32_t ifindex;
 	char ifname[MLX5_NAMESIZE + 1];
-#define MLX5_BOND_MAX_PORTS 2
 	struct {
 		char ifname[MLX5_NAMESIZE + 1];
 		uint32_t ifindex;
-- 
2.25.1



* [dpdk-dev] [PATCH v3 8/9] net/mlx5: fix setting VF default MAC through representor
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (18 preceding siblings ...)
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 7/9] " Xueming Li
@ 2021-01-18 11:29   ` Xueming Li
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 9/9] net/mlx5: improve bonding xstats Xueming Li
                     ` (46 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 11:29 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

With kernel bonding, setting the VF MAC address through a representor
failed. The Netlink API requires the ifindex of the owner PF, not the
ifindex of the bonding device.

Use the owner PF ifindex to modify the VF default MAC in case of a
bonding device.
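
The ifindex selection described above can be sketched in isolation as
follows. This is only an illustration under assumptions: the structure
and function names (sketch_bond_info, sketch_vf_mac_ifindex) are
hypothetical and simplified, not the driver's real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit for this sketch. */
#define SKETCH_BOND_MAX_PORTS 2

/* Simplified, hypothetical view of the saved bonding information. */
struct sketch_bond_info {
	int n_port;                                   /* Number of member ports. */
	uint32_t ifindex;                             /* Bonding device ifindex. */
	uint32_t port_ifindex[SKETCH_BOND_MAX_PORTS]; /* Member PF ifindexes. */
};

/*
 * Pick the ifindex to hand to the VF MAC request.
 * With bonding (pf_bond >= 0), use the owner PF member port,
 * never the bonding device itself; otherwise use the PF ifindex.
 */
static uint32_t
sketch_vf_mac_ifindex(const struct sketch_bond_info *bond, int pf_bond,
		      uint32_t pf_if_index)
{
	if (bond != NULL && pf_bond >= 0 && pf_bond < bond->n_port)
		return bond->port_ifindex[pf_bond];
	return pf_if_index;
}
```

The bounds check on pf_bond is an extra safety net added for the
sketch; it is not taken from the patch.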

Fixes: c21e5facf7d2 ("net/mlx5: use bond index for netdev operations")

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5_mac.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index b5b810b508..5a3aec89c1 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -154,6 +154,7 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 {
 	uint16_t port_id;
 	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_priv *pf_priv;
 
 	/*
 	 * Configuring the VF instead of its representor,
@@ -162,19 +163,24 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 	if (priv->representor && !mlx5_is_hpf(dev)) {
 		DRV_LOG(DEBUG, "VF represented by port %u setting primary MAC address",
 			dev->data->port_id);
+		if (priv->pf_bond >= 0) {
+			/* Bonding, get owner PF ifindex from shared data. */
+			return mlx5_os_vf_mac_addr_modify
+			       (priv,
+				priv->sh->bond.ports[priv->pf_bond].ifindex,
+				mac_addr,
+				rte_eth_representor_id_parse(
+						priv->representor_id,
+						NULL, NULL, NULL));
+		}
 		RTE_ETH_FOREACH_DEV_SIBLING(port_id, dev->data->port_id) {
-			priv = rte_eth_devices[port_id].data->dev_private;
-			if (priv->master == 1) {
-				priv = dev->data->dev_private;
+			pf_priv = rte_eth_devices[port_id].data->dev_private;
+			if (pf_priv->master == 1)
 				return mlx5_os_vf_mac_addr_modify
-				       (priv,
-					mlx5_ifindex(&rte_eth_devices[port_id]),
-					mac_addr,
+				       (priv, pf_priv->if_index, mac_addr,
 					rte_eth_representor_id_parse(
 							priv->representor_id,
-							NULL, NULL, NULL)
-					);
-			}
+							NULL, NULL, NULL));
 		}
 		rte_errno = -ENOTSUP;
 		return rte_errno;
-- 
2.25.1



* [dpdk-dev] [PATCH v3 9/9] net/mlx5: improve bonding xstats
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (19 preceding siblings ...)
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 8/9] net/mlx5: fix setting VF default MAC through representor Xueming Li
@ 2021-01-18 11:29   ` Xueming Li
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 0/5] eal: enable global device syntax Xueming Li
                     ` (45 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 11:29 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

In case of a kernel bonding device, counters were read only from the
first bonding PF member.

This patch reads the counters of all member PFs and sums them to get the
bond xstats.
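
The summation itself can be sketched without any driver context. The
names below (sketch_accumulate, sketch_bond_xstats, SKETCH_MAX_PORTS)
are hypothetical, and the per-port counter arrays stand in for the
values the driver would obtain through ethtool:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limit for this sketch; not the driver's definition. */
#define SKETCH_MAX_PORTS 2

/*
 * Add one member port's counters to the running totals.
 * The caller is expected to have zeroed the totals beforehand.
 */
static void
sketch_accumulate(uint64_t *totals, const uint64_t *port_counters, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		totals[i] += port_counters[i];
}

/*
 * Sum n counters over n_port member ports.
 * counters[p] points to the n counters read from member port p.
 * Returns 0 on success, -1 on invalid arguments.
 */
static int
sketch_bond_xstats(uint64_t *totals, const uint64_t *const *counters,
		   int n_port, size_t n)
{
	int p;

	if (totals == NULL || counters == NULL ||
	    n_port < 1 || n_port > SKETCH_MAX_PORTS)
		return -1;
	/* Zero once, then accumulate each member port. */
	memset(totals, 0, sizeof(*totals) * n);
	for (p = 0; p < n_port; p++)
		sketch_accumulate(totals, counters[p], n);
	return 0;
}
```

Zeroing the output once before the loop is what allows the per-port
step to use "+=" instead of plain assignment.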

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_ethdev_os.c | 127 +++++++++++++++++++-----
 1 file changed, 102 insertions(+), 25 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 84610a7bc0..27afb74aff 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -169,10 +169,10 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[MLX5_NAMESIZE])
 }
 
 /**
- * Perform ifreq ioctl() on associated Ethernet device.
+ * Perform ifreq ioctl() on associated netdev ifname.
  *
- * @param[in] dev
- *   Pointer to Ethernet device.
+ * @param[in] ifname
+ *   Pointer to netdev name.
  * @param req
  *   Request number to pass to ioctl().
  * @param[out] ifr
@@ -182,7 +182,7 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[MLX5_NAMESIZE])
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
+mlx5_ifreq_by_ifname(const char *ifname, int req, struct ifreq *ifr)
 {
 	int sock = socket(PF_INET, SOCK_DGRAM, IPPROTO_IP);
 	int ret = 0;
@@ -191,9 +191,7 @@ mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
 		rte_errno = errno;
 		return -rte_errno;
 	}
-	ret = mlx5_get_ifname(dev, &ifr->ifr_name);
-	if (ret)
-		goto error;
+	rte_strscpy(ifr->ifr_name, ifname, sizeof(ifr->ifr_name));
 	ret = ioctl(sock, req, ifr);
 	if (ret == -1) {
 		rte_errno = errno;
@@ -206,6 +204,31 @@ mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
 	return -rte_errno;
 }
 
+/**
+ * Perform ifreq ioctl() on associated Ethernet device.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet device.
+ * @param req
+ *   Request number to pass to ioctl().
+ * @param[out] ifr
+ *   Interface request structure output buffer.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
+{
+	char ifname[sizeof(ifr->ifr_name)];
+	int ret;
+
+	ret = mlx5_get_ifname(dev, &ifname);
+	if (ret)
+		return -rte_errno;
+	return mlx5_ifreq_by_ifname(ifname, req, ifr);
+}
+
 /**
  * Get device MTU.
  *
@@ -1243,6 +1266,8 @@ int mlx5_get_module_eeprom(struct rte_eth_dev *dev,
  *
  * @param dev
  *   Pointer to Ethernet device.
+ * @param[in] pf
+ *   PF index in case of bonding device, -1 otherwise
  * @param[out] stats
  *   Counters table output buffer.
  *
@@ -1250,8 +1275,8 @@ int mlx5_get_module_eeprom(struct rte_eth_dev *dev,
  *   0 on success and stats is filled, negative errno value otherwise and
  *   rte_errno is set.
  */
-int
-mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
+static int
+_mlx5_os_read_dev_counters(struct rte_eth_dev *dev, int pf, uint64_t *stats)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_xstats_ctrl *xstats_ctrl = &priv->xstats_ctrl;
@@ -1265,7 +1290,11 @@ mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
 	et_stats->cmd = ETHTOOL_GSTATS;
 	et_stats->n_stats = xstats_ctrl->stats_n;
 	ifr.ifr_data = (caddr_t)et_stats;
-	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (pf >= 0)
+		ret = mlx5_ifreq_by_ifname(priv->sh->bond.ports[pf].ifname,
+					   SIOCETHTOOL, &ifr);
+	else
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
 	if (ret) {
 		DRV_LOG(WARNING,
 			"port %u unable to read statistic values from device",
@@ -1273,23 +1302,60 @@ mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
 		return ret;
 	}
 	for (i = 0; i != xstats_ctrl->mlx5_stats_n; ++i) {
-		if (xstats_ctrl->info[i].dev) {
-			ret = mlx5_os_read_dev_stat(priv,
-					    xstats_ctrl->info[i].ctr_name,
-					    &stats[i]);
-			/* return last xstats counter if fail to read. */
-			if (ret == 0)
-				xstats_ctrl->xstats[i] = stats[i];
-			else
-				stats[i] = xstats_ctrl->xstats[i];
-		} else {
-			stats[i] = (uint64_t)
-				et_stats->data[xstats_ctrl->dev_table_idx[i]];
-		}
+		if (xstats_ctrl->info[i].dev)
+			continue;
+		stats[i] += (uint64_t)
+			    et_stats->data[xstats_ctrl->dev_table_idx[i]];
 	}
 	return 0;
 }
 
+/**
+ * Read device counters.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[out] stats
+ *   Counters table output buffer.
+ *
+ * @return
+ *   0 on success and stats is filled, negative errno value otherwise and
+ *   rte_errno is set.
+ */
+int
+mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_xstats_ctrl *xstats_ctrl = &priv->xstats_ctrl;
+	int ret = 0, i;
+
+	memset(stats, 0, sizeof(*stats) * xstats_ctrl->mlx5_stats_n);
+	/* Read ifreq counters. */
+	if (priv->master && priv->pf_bond >= 0) {
+		/* Sum xstats from bonding device member ports. */
+		for (i = 0; i < priv->sh->bond.n_port; i++) {
+			ret = _mlx5_os_read_dev_counters(dev, i, stats);
+			if (ret)
+				return ret;
+		}
+	} else {
+		ret = _mlx5_os_read_dev_counters(dev, -1, stats);
+	}
+	/* Read IB counters. */
+	for (i = 0; i != xstats_ctrl->mlx5_stats_n; ++i) {
+		if (!xstats_ctrl->info[i].dev)
+			continue;
+		ret = mlx5_os_read_dev_stat(priv, xstats_ctrl->info[i].ctr_name,
+					    &stats[i]);
+		/* return last xstats counter if fail to read. */
+		if (ret == 0)
+			xstats_ctrl->xstats[i] = stats[i];
+		else
+			stats[i] = xstats_ctrl->xstats[i];
+	}
+	return ret;
+}
+
 /**
  * Query the number of statistics provided by ETHTOOL.
  *
@@ -1303,13 +1369,19 @@ mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
 int
 mlx5_os_get_stats_n(struct rte_eth_dev *dev)
 {
+	struct mlx5_priv *priv = dev->data->dev_private;
 	struct ethtool_drvinfo drvinfo;
 	struct ifreq ifr;
 	int ret;
 
 	drvinfo.cmd = ETHTOOL_GDRVINFO;
 	ifr.ifr_data = (caddr_t)&drvinfo;
-	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (priv->master && priv->pf_bond >= 0)
+		/* Bonding PF. */
+		ret = mlx5_ifreq_by_ifname(priv->sh->bond.ports[0].ifname,
+					   SIOCETHTOOL, &ifr);
+	else
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
 	if (ret) {
 		DRV_LOG(WARNING, "port %u unable to query number of statistics",
 			dev->data->port_id);
@@ -1480,7 +1552,12 @@ mlx5_os_stats_init(struct rte_eth_dev *dev)
 	strings->string_set = ETH_SS_STATS;
 	strings->len = dev_stats_n;
 	ifr.ifr_data = (caddr_t)strings;
-	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (priv->master && priv->pf_bond >= 0)
+		/* Bonding master. */
+		ret = mlx5_ifreq_by_ifname(priv->sh->bond.ports[0].ifname,
+					   SIOCETHTOOL, &ifr);
+	else
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
 	if (ret) {
 		DRV_LOG(WARNING, "port %u unable to get statistic names",
 			dev->data->port_id);
-- 
2.25.1



* [dpdk-dev] [PATCH v2 0/5] eal: enable global device syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (20 preceding siblings ...)
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 9/9] net/mlx5: improve bonding xstats Xueming Li
@ 2021-01-18 15:16   ` Xueming Li
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 1/5] devargs: fix memory leak on parsing error Xueming Li
                     ` (44 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 15:16 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso

The new Global Device Syntax [1] is used to identify a device with full
bus, class and driver description, example:
 -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,...

This patchset enables the global device syntax with backward
compatibility by:
- unifying devargs memory cleanup
- parsing the device name from the bus parameters
- trying the new global syntax first and falling back to legacy parsing.
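
The "global first, legacy fallback" order can be sketched as below.
This is a simplified illustration, not the EAL parser: the structure,
the function names and the fixed-size buffers are hypothetical, and a
layer is recognized purely by its "bus=", "class=" or "driver=" prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_STR_MAX 128

/* Parsed result; field names are illustrative, not rte_devargs. */
struct sketch_devargs {
	int is_global;             /* 1 if global syntax was recognized. */
	char bus[SKETCH_STR_MAX];  /* "bus=..." layer, empty if absent. */
	char cls[SKETCH_STR_MAX];  /* "class=..." layer, empty if absent. */
	char drv[SKETCH_STR_MAX];  /* "driver=..." layer, or legacy args. */
	char name[SKETCH_STR_MAX]; /* Legacy device name. */
};

/* Global syntax: '/'-separated layers, each starting with a known key. */
static int
sketch_parse_global(struct sketch_devargs *da, const char *str)
{
	char buf[SKETCH_STR_MAX];
	char *layer, *next;

	if (strlen(str) >= sizeof(buf))
		return -1;
	strcpy(buf, str);
	for (layer = buf; layer != NULL; layer = next) {
		next = strchr(layer, '/');
		if (next != NULL)
			*next++ = '\0';
		if (strncmp(layer, "bus=", 4) == 0)
			strcpy(da->bus, layer);
		else if (strncmp(layer, "class=", 6) == 0)
			strcpy(da->cls, layer);
		else if (strncmp(layer, "driver=", 7) == 0)
			strcpy(da->drv, layer);
		else
			return -1; /* Unknown layer: not global syntax. */
	}
	da->is_global = 1;
	return 0;
}

/* Legacy syntax: "name[,args]", split at the first comma. */
static int
sketch_parse_legacy(struct sketch_devargs *da, const char *str)
{
	const char *comma = strchr(str, ',');
	size_t len = comma != NULL ? (size_t)(comma - str) : strlen(str);

	if (len == 0 || len >= sizeof(da->name))
		return -1;
	memcpy(da->name, str, len);
	da->name[len] = '\0';
	if (comma != NULL) {
		if (strlen(comma + 1) >= sizeof(da->drv))
			return -1;
		strcpy(da->drv, comma + 1);
	}
	return 0;
}

/* Try the global syntax first, then fall back to the legacy one. */
static int
sketch_parse(struct sketch_devargs *da, const char *str)
{
	memset(da, 0, sizeof(*da));
	if (sketch_parse_global(da, str) == 0)
		return 0;
	/* Discard any partial result before the fallback. */
	memset(da, 0, sizeof(*da));
	return sketch_parse_legacy(da, str);
}
```

Clearing the structure before the fallback keeps a half-parsed global
string from leaking into the legacy result.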


History:

V1:
 - Initial version

V2:
 - add devargs.src as complete source dev string
 - change devargs.data to scratch buffer
 - add rte_devargs_free() to release scratch memory
 - change name policy to align with rte_eth_iterator_init()
 - remove PCI bus fix as name already resolved in rte_devargs_parse().


[1] Global Device Syntax:
https://www.dpdk.org/wp-content/uploads/sites/35/2018/10/am-07-DPDK-hotplug-20180905.pdf

[2] RFC:
http://patchwork.dpdk.org/project/dpdk/list/?series=14378

[3] V1:
http://patchwork.dpdk.org/project/dpdk/list/?series=14610



Xueming Li (5):
  devargs: fix memory leak on parsing error
  devargs: refactor scratch buffer storage
  kvargs: add get by key function
  devargs: parse name from global device syntax
  devargs: enable global device syntax devargs

 app/test-pmd/config.c                        |  7 +--
 app/test-pmd/testpmd.c                       |  5 +-
 drivers/bus/vdev/vdev.c                      |  9 +--
 drivers/net/failsafe/failsafe_args.c         |  3 +-
 drivers/net/failsafe/failsafe_eal.c          |  2 +-
 examples/multi_process/hotplug_mp/commands.c |  6 +-
 lib/librte_eal/common/eal_common_dev.c       |  9 ++-
 lib/librte_eal/common/eal_common_devargs.c   | 60 ++++++++++++++------
 lib/librte_eal/common/hotplug_mp.c           |  6 +-
 lib/librte_eal/include/rte_devargs.h         | 18 ++++--
 lib/librte_eal/rte_eal_exports.def           |  1 +
 lib/librte_eal/version.map                   |  1 +
 lib/librte_ethdev/rte_ethdev.c               |  7 +--
 lib/librte_kvargs/rte_kvargs.c               | 20 +++++++
 lib/librte_kvargs/rte_kvargs.h               | 14 +++++
 lib/librte_kvargs/version.map                |  1 +
 16 files changed, 120 insertions(+), 49 deletions(-)

-- 
2.25.1



* [dpdk-dev] [PATCH v2 1/5] devargs: fix memory leak on parsing error
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (21 preceding siblings ...)
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 0/5] eal: enable global device syntax Xueming Li
@ 2021-01-18 15:16   ` Xueming Li
  2021-03-18  9:12     ` Thomas Monjalon
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 2/5] devargs: refactor scratch buffer storage Xueming Li
                     ` (43 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-01-18 15:16 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso, gaetan.rivet, stable

This patch fixes a memory leak in the parsing error handling path.

Fixes: 338327d731e6 ("devargs: add function to parse device layers")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_eal/common/eal_common_devargs.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index fcf3d9a3cc..c3969ff158 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -163,8 +163,14 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist)
 			rte_kvargs_free(layers[i].kvlist);
 	}
-	if (ret != 0)
+	if (ret != 0) {
+		if (devargs->data && devargs->data != devstr) {
+			/* Free duplicated data. */
+			free(devargs->data);
+			devargs->data = NULL;
+		}
 		rte_errno = -ret;
+	}
 	return ret;
 }
 
-- 
2.25.1



* [dpdk-dev] [PATCH v2 2/5] devargs: refactor scratch buffer storage
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (22 preceding siblings ...)
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 1/5] devargs: fix memory leak on parsing error Xueming Li
@ 2021-01-18 15:16   ` Xueming Li
  2021-03-18  9:14     ` Thomas Monjalon
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 3/5] kvargs: add get by key function Xueming Li
                     ` (42 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-01-18 15:16 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso

In the current design, the legacy parser rte_devargs_parse() saves its
scratch buffer to devargs.args, while the new parser
rte_devargs_layers_parse() saves it to devargs.data. Code using devargs
has to know the difference and clean up memory accordingly, which is
error prone.

This patch unifies both parsers on devargs.data as the dedicated scratch
buffer and introduces the rte_devargs_free() function to wrap the memory
cleanup.
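
The ownership rule can be sketched as follows: one owned scratch
buffer, read-only views into it, a borrowed source string, and a single
cleanup entry point. The sketch_da structure and both functions are
hypothetical stand-ins, not the rte_devargs API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for the devargs fields discussed above. */
struct sketch_da {
	const char *args; /* View into data; never freed directly. */
	char *data;       /* Scratch buffer owned by this structure. */
	const char *src;  /* Caller's original string; borrowed. */
};

/* Parse "name[,args]": copy the argument part into the scratch buffer. */
static int
sketch_da_parse(struct sketch_da *da, const char *dev)
{
	const char *comma = strchr(dev, ',');
	const char *args = comma != NULL ? comma + 1 : "";
	size_t len = strlen(args) + 1;

	da->data = malloc(len);
	if (da->data == NULL)
		return -1;
	memcpy(da->data, args, len);
	da->args = da->data;
	da->src = dev;
	return 0;
}

/* Single cleanup entry point; accepts NULL and may be called twice. */
static void
sketch_da_free(struct sketch_da *da)
{
	if (da == NULL)
		return;
	/* Only release the buffer when it is not the borrowed source. */
	if (da->data != NULL && da->data != da->src)
		free(da->data);
	da->data = NULL;
	da->args = NULL;
	da->src = NULL;
}
```

In this sketch the cleanup function returns early on a NULL pointer
before touching any field; that is a choice made for the example.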

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 app/test-pmd/config.c                        |  7 ++--
 app/test-pmd/testpmd.c                       |  5 ++-
 drivers/bus/vdev/vdev.c                      |  9 ++--
 drivers/net/failsafe/failsafe_args.c         |  3 +-
 drivers/net/failsafe/failsafe_eal.c          |  2 +-
 examples/multi_process/hotplug_mp/commands.c |  6 +--
 lib/librte_eal/common/eal_common_dev.c       |  9 ++--
 lib/librte_eal/common/eal_common_devargs.c   | 43 +++++++++++---------
 lib/librte_eal/common/hotplug_mp.c           |  6 +--
 lib/librte_eal/include/rte_devargs.h         | 18 ++++++--
 lib/librte_eal/rte_eal_exports.def           |  1 +
 lib/librte_eal/version.map                   |  1 +
 lib/librte_ethdev/rte_ethdev.c               |  7 ++--
 13 files changed, 64 insertions(+), 53 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 3f6c8642b1..21bdece399 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -509,8 +509,6 @@ device_infos_display(const char *identifier)
 
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -558,6 +556,7 @@ device_infos_display(const char *identifier)
 			}
 		}
 	};
+	rte_devargs_free(&da);
 }
 
 void
@@ -602,8 +601,8 @@ port_infos_display(portid_t port_id)
 	else
 		printf("\nFirmware-version: %s", "not available");
 
-	if (dev_info.device->devargs && dev_info.device->devargs->args)
-		printf("\nDevargs: %s", dev_info.device->devargs->args);
+	if (dev_info.device->devargs && dev_info.device->devargs->src)
+		printf("\nDevargs: %s", dev_info.device->devargs->src);
 	printf("\nConnect to socket: %u", port->socket_id);
 
 	if (port_numa[port_id] != NUMA_NO_CONFIG) {
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 2b60f6c5d3..ea27779bfd 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2987,8 +2987,6 @@ detach_devargs(char *identifier)
 	memset(&da, 0, sizeof(da));
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -2997,6 +2995,7 @@ detach_devargs(char *identifier)
 			if (ports[port_id].port_status != RTE_PORT_STOPPED) {
 				printf("Port %u not stopped\n", port_id);
 				rte_eth_iterator_cleanup(&iterator);
+				rte_devargs_free(&da);
 				return;
 			}
 			port_flow_flush(port_id);
@@ -3006,6 +3005,7 @@ detach_devargs(char *identifier)
 	if (rte_eal_hotplug_remove(da.bus->name, da.name) != 0) {
 		TESTPMD_LOG(ERR, "Failed to detach device %s(%s)\n",
 			    da.name, da.bus->name);
+		rte_devargs_free(&da);
 		return;
 	}
 
@@ -3014,6 +3014,7 @@ detach_devargs(char *identifier)
 	printf("Device %s is detached\n", identifier);
 	printf("Now total ports is %d\n", nb_ports);
 	printf("Done\n");
+	rte_devargs_free(&da);
 }
 
 void
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index acfd78828f..012326d809 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -236,13 +236,14 @@ alloc_devargs(const char *name, const char *args)
 
 	devargs->bus = &rte_vdev_bus;
 	if (args)
-		devargs->args = strdup(args);
+		devargs->data = strdup(args);
 	else
-		devargs->args = strdup("");
+		devargs->data = strdup("");
+	devargs->args = devargs->data;
 
 	ret = strlcpy(devargs->name, name, sizeof(devargs->name));
 	if (ret < 0 || ret >= (int)sizeof(devargs->name)) {
-		free(devargs->args);
+		rte_devargs_free(devargs);
 		free(devargs);
 		return NULL;
 	}
@@ -296,7 +297,7 @@ insert_vdev(const char *name, const char *args,
 
 	return 0;
 fail:
-	free(devargs->args);
+	rte_devargs_free(devargs);
 	free(devargs);
 	free(dev);
 	return ret;
diff --git a/drivers/net/failsafe/failsafe_args.c b/drivers/net/failsafe/failsafe_args.c
index 707490b94c..52fdcb977f 100644
--- a/drivers/net/failsafe/failsafe_args.c
+++ b/drivers/net/failsafe/failsafe_args.c
@@ -451,8 +451,7 @@ failsafe_args_free(struct rte_eth_dev *dev)
 		sdev->cmdline = NULL;
 		free(sdev->fd_str);
 		sdev->fd_str = NULL;
-		free(sdev->devargs.args);
-		sdev->devargs.args = NULL;
+		rte_devargs_free(&sdev->devargs);
 	}
 }
 
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index b9fc508673..3a4d8c835a 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -79,7 +79,7 @@ fs_bus_init(struct rte_eth_dev *dev)
 					rte_eth_devices[pid].device->devargs;
 
 			/* Take control of probed device. */
-			free(da->args);
+			rte_devargs_free(da);
 			memset(da, 0, sizeof(*da));
 			if (probed_da != NULL)
 				snprintf(devstr, sizeof(devstr), "%s,%s",
diff --git a/examples/multi_process/hotplug_mp/commands.c b/examples/multi_process/hotplug_mp/commands.c
index a8a39d07f7..e593cad56c 100644
--- a/examples/multi_process/hotplug_mp/commands.c
+++ b/examples/multi_process/hotplug_mp/commands.c
@@ -121,8 +121,6 @@ static void cmd_dev_attach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -131,6 +129,7 @@ static void cmd_dev_attach_parsed(void *parsed_result,
 	else
 		cmdline_printf(cl, "failed to attached device %s\n",
 				da.name);
+	rte_devargs_free(&da);
 }
 
 cmdline_parse_token_string_t cmd_dev_attach_attach =
@@ -168,8 +167,6 @@ static void cmd_dev_detach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -180,6 +177,7 @@ static void cmd_dev_detach_parsed(void *parsed_result,
 	else
 		cmdline_printf(cl, "failed to dettach device %s\n",
 			da.name);
+	rte_devargs_free(&da);
 }
 
 cmdline_parse_token_string_t cmd_dev_detach_detach =
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 8a3bd3100a..4b4d589f64 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -185,10 +185,8 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
 	return ret;
 
 err_devarg:
-	if (rte_devargs_remove(da) != 0) {
-		free(da->args);
-		free(da);
-	}
+	if (rte_devargs_remove(da) != 0)
+		rte_devargs_free(da);
 	return ret;
 }
 
@@ -586,7 +584,7 @@ rte_dev_iterator_init(struct rte_dev_iterator *it,
 	it->bus_str = NULL;
 	it->cls_str = NULL;
 
-	devargs.data = dev_str;
+	devargs.data = (void *)(intptr_t)dev_str;
 	if (rte_devargs_layers_parse(&devargs, dev_str))
 		goto get_out;
 
@@ -619,6 +617,7 @@ rte_dev_iterator_init(struct rte_dev_iterator *it,
 	it->device = NULL;
 	it->class_device = NULL;
 get_out:
+	rte_devargs_free(&devargs);
 	return -rte_errno;
 }
 
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index c3969ff158..9c7a7de30e 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -144,13 +144,14 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 	devargs->drv_str = layers[2].str;
 	devargs->bus = bus;
 	devargs->cls = cls;
+	devargs->src = devstr;
 
 	/* If we own the data, clean up a bit
 	 * the several layers string, to ease
 	 * their parsing afterward.
 	 */
 	if (devargs->data != devstr) {
-		char *s = (void *)(intptr_t)(devargs->data);
+		char *s = devargs->data;
 
 		while ((s = strchr(s, '/'))) {
 			*s = '\0';
@@ -164,12 +165,8 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 			rte_kvargs_free(layers[i].kvlist);
 	}
 	if (ret != 0) {
-		if (devargs->data && devargs->data != devstr) {
-			/* Free duplicated data. */
-			free(devargs->data);
-			devargs->data = NULL;
-		}
 		rte_errno = -ret;
+		rte_devargs_free(devargs);
 	}
 	return ret;
 }
@@ -225,13 +222,17 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	da->bus = bus;
 	/* Parse eventual device arguments */
 	if (devname[i] == ',')
-		da->args = strdup(&devname[i + 1]);
+		da->data = strdup(&devname[i + 1]);
 	else
-		da->args = strdup("");
-	if (da->args == NULL) {
+		da->data = strdup("");
+	if (da->data == NULL) {
 		RTE_LOG(ERR, EAL, "not enough memory to parse arguments\n");
 		return -ENOMEM;
 	}
+	da->drv_str = da->data;
+
+	da->src = dev;
+
 	return 0;
 }
 
@@ -266,6 +267,15 @@ rte_devargs_parsef(struct rte_devargs *da, const char *format, ...)
 	return ret;
 }
 
+void
+rte_devargs_free(struct rte_devargs *da)
+{
+	if (da && da->data && da->data != da->src)
+		free(da->data);
+	da->data = NULL;
+	da->src = NULL;
+}
+
 int
 rte_devargs_insert(struct rte_devargs **da)
 {
@@ -282,15 +292,8 @@ rte_devargs_insert(struct rte_devargs **da)
 		if (strcmp(listed_da->bus->name, (*da)->bus->name) == 0 &&
 				strcmp(listed_da->name, (*da)->name) == 0) {
 			/* device already in devargs list, must be updated */
-			listed_da->type = (*da)->type;
-			listed_da->policy = (*da)->policy;
-			free(listed_da->args);
-			listed_da->args = (*da)->args;
-			listed_da->bus = (*da)->bus;
-			listed_da->cls = (*da)->cls;
-			listed_da->bus_str = (*da)->bus_str;
-			listed_da->cls_str = (*da)->cls_str;
-			listed_da->data = (*da)->data;
+			rte_devargs_free(listed_da);
+			*listed_da = **da;
 			/* replace provided devargs with found one */
 			free(*da);
 			*da = listed_da;
@@ -332,7 +335,7 @@ rte_devargs_add(enum rte_devtype devtype, const char *devargs_str)
 
 fail:
 	if (devargs) {
-		free(devargs->args);
+		rte_devargs_free(devargs);
 		free(devargs);
 	}
 
@@ -352,7 +355,7 @@ rte_devargs_remove(struct rte_devargs *devargs)
 		if (strcmp(d->bus->name, devargs->bus->name) == 0 &&
 		    strcmp(d->name, devargs->name) == 0) {
 			TAILQ_REMOVE(&devargs_list, d, next);
-			free(d->args);
+			rte_devargs_free(d);
 			free(d);
 			return 0;
 		}
diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c
index ee791903b3..13f2a427cf 100644
--- a/lib/librte_eal/common/hotplug_mp.c
+++ b/lib/librte_eal/common/hotplug_mp.c
@@ -95,6 +95,7 @@ __handle_secondary_request(void *param)
 
 	tmp_req = *req;
 
+	memset(&da, 0, sizeof(da));
 	if (req->t == EAL_DEV_REQ_TYPE_ATTACH) {
 		ret = local_dev_probe(req->devargs, &dev);
 		if (ret != 0) {
@@ -118,8 +119,6 @@ __handle_secondary_request(void *param)
 		ret = rte_devargs_parse(&da, req->devargs);
 		if (ret != 0)
 			goto finish;
-		free(da.args); /* we don't need those */
-		da.args = NULL;
 
 		ret = eal_dev_hotplug_request_to_secondary(&tmp_req);
 		if (ret != 0) {
@@ -176,6 +175,7 @@ __handle_secondary_request(void *param)
 	if (ret)
 		RTE_LOG(ERR, EAL, "failed to send response to secondary\n");
 
+	rte_devargs_free(&da);
 	free(bundle->peer);
 	free(bundle);
 }
@@ -283,7 +283,7 @@ static void __handle_primary_request(void *param)
 
 		ret = local_dev_remove(dev);
 quit:
-		free(da->args);
+		rte_devargs_free(da);
 		free(da);
 		break;
 	default:
diff --git a/lib/librte_eal/include/rte_devargs.h b/lib/librte_eal/include/rte_devargs.h
index 296f19324f..4a917a266b 100644
--- a/lib/librte_eal/include/rte_devargs.h
+++ b/lib/librte_eal/include/rte_devargs.h
@@ -60,16 +60,16 @@ struct rte_devargs {
 	/** Name of the device. */
 	char name[RTE_DEV_NAME_MAX_LEN];
 	RTE_STD_C11
-	union {
-	/** Arguments string as given by user or "" for no argument. */
-		char *args;
+	union { /**< driver-related part of device string. */
+		const char *args; /**< legacy name. */
 		const char *drv_str;
 	};
 	struct rte_bus *bus; /**< bus handle. */
 	struct rte_class *cls; /**< class handle. */
 	const char *bus_str; /**< bus-related part of device string. */
 	const char *cls_str; /**< class-related part of device string. */
-	const char *data; /**< Device string storage. */
+	char *data; /**< Scratch buffer. */
+	const char *src; /**< Arguments given by user. */
 };
 
 /**
@@ -145,6 +145,16 @@ rte_devargs_parsef(struct rte_devargs *da,
 		   const char *format, ...)
 __rte_format_printf(2, 0);
 
+/**
+ * Free resources in devargs.
+ *
+ * @param da
+ *   The devargs structure holding the device information.
+ */
+__rte_experimental
+void
+rte_devargs_free(struct rte_devargs *da);
+
 /**
  * Insert an rte_devargs in the global list.
  *
diff --git a/lib/librte_eal/rte_eal_exports.def b/lib/librte_eal/rte_eal_exports.def
index fe27bffe45..6fb1aaf8a8 100644
--- a/lib/librte_eal/rte_eal_exports.def
+++ b/lib/librte_eal/rte_eal_exports.def
@@ -29,6 +29,7 @@ EXPORTS
 	rte_devargs_next
 	rte_devargs_parse
 	rte_devargs_parsef
+	rte_devargs_free
 	rte_devargs_remove
 	rte_devargs_type_count
 	rte_dump_physmem_layout
diff --git a/lib/librte_eal/version.map b/lib/librte_eal/version.map
index b1db7ec795..ef388a30a1 100644
--- a/lib/librte_eal/version.map
+++ b/lib/librte_eal/version.map
@@ -409,6 +409,7 @@ EXPERIMENTAL {
 	rte_thread_tls_key_delete;
 	rte_thread_tls_value_get;
 	rte_thread_tls_value_set;
+	rte_devargs_free;
 };
 
 INTERNAL {
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 17ddacc78d..325e7693eb 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -193,13 +193,14 @@ int
 rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 {
 	int ret;
-	struct rte_devargs devargs = {.args = NULL};
+	struct rte_devargs devargs;
 	const char *bus_param_key;
 	char *bus_str = NULL;
 	char *cls_str = NULL;
 	int str_size;
 
 	memset(iter, 0, sizeof(*iter));
+	memset(&devargs, 0, sizeof(devargs));
 
 	/*
 	 * The devargs string may use various syntaxes:
@@ -244,8 +245,6 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 		goto error;
 	}
 	iter->cls_str = cls_str;
-	free(devargs.args); /* allocated by rte_devargs_parse() */
-	devargs.args = NULL;
 
 	iter->bus = devargs.bus;
 	if (iter->bus->dev_iterate == NULL) {
@@ -284,7 +283,7 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 	if (ret == -ENOTSUP)
 		RTE_ETHDEV_LOG(ERR, "Bus %s does not support iterating.\n",
 				iter->bus->name);
-	free(devargs.args);
+	rte_devargs_free(&devargs);
 	free(bus_str);
 	free(cls_str);
 	return ret;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v2 3/5] kvargs: add get by key function
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (23 preceding siblings ...)
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 2/5] devargs: refactor scratch buffer storage Xueming Li
@ 2021-01-18 15:16   ` Xueming Li
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 4/5] devargs: parse name from global device syntax Xueming Li
                     ` (41 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 15:16 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso

Add a new function to get the value of a specific key from a kvargs list.
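
For illustration only, the lookup semantics can be sketched outside of
DPDK as below. The kv_pair/kv_list types and kv_get() are hypothetical
stand-ins, not the rte_kvargs definitions; the sketch only mirrors the
rule that a NULL key matches a key-less entry and that the first match
wins.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kvargs types; not the DPDK structs. */
struct kv_pair {
	const char *key;   /* may be NULL for a value-only entry */
	const char *value;
};

struct kv_list {
	unsigned int count;
	struct kv_pair pairs[8];
};

/* Return the value of the first pair whose key matches, or NULL. */
static const char *
kv_get(const struct kv_list *list, const char *key)
{
	unsigned int i;

	if (list == NULL)
		return NULL;
	for (i = 0; i < list->count; ++i) {
		const char *k = list->pairs[i].key;

		/* A NULL key only matches an entry that has no key. */
		if (key == NULL || k == NULL) {
			if (key == NULL && k == NULL)
				return list->pairs[i].value;
			continue;
		}
		if (strcmp(k, key) == 0)
			return list->pairs[i].value;
	}
	return NULL;
}
```

A caller would treat a NULL return as "key not present" and must not
free the returned string, since it points into the list.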

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_kvargs/rte_kvargs.c | 20 ++++++++++++++++++++
 lib/librte_kvargs/rte_kvargs.h | 14 ++++++++++++++
 lib/librte_kvargs/version.map  |  1 +
 3 files changed, 35 insertions(+)

diff --git a/lib/librte_kvargs/rte_kvargs.c b/lib/librte_kvargs/rte_kvargs.c
index 285081c86c..bc734915f9 100644
--- a/lib/librte_kvargs/rte_kvargs.c
+++ b/lib/librte_kvargs/rte_kvargs.c
@@ -160,6 +160,26 @@ rte_kvargs_free(struct rte_kvargs *kvlist)
 	free(kvlist);
 }
 
+/* lookup the rte_kvargs structure by key */
+const char *
+rte_kvargs_get(struct rte_kvargs *kvlist, const char *key)
+{
+	unsigned int i;
+
+	if (!kvlist)
+		return NULL;
+	for (i = 0; i < kvlist->count; ++i) {
+		/* Allows key to be NULL. */
+		if (!key && !kvlist->pairs[i].key)
+			return kvlist->pairs[i].value;
+		if (!key || !kvlist->pairs[i].key)
+			continue;
+		if (!strcmp(kvlist->pairs[i].key, key))
+			return kvlist->pairs[i].value;
+	}
+	return NULL;
+}
+
 /*
  * Parse the arguments "key=value,key=value,..." string and return
  * an allocated structure that contains a key/value list. Also
diff --git a/lib/librte_kvargs/rte_kvargs.h b/lib/librte_kvargs/rte_kvargs.h
index eff598e08b..6d426241ea 100644
--- a/lib/librte_kvargs/rte_kvargs.h
+++ b/lib/librte_kvargs/rte_kvargs.h
@@ -114,6 +114,20 @@ struct rte_kvargs *rte_kvargs_parse_delim(const char *args,
  */
 void rte_kvargs_free(struct rte_kvargs *kvlist);
 
+/**
+ * Get the value matching the given key
+ *
+ * @param kvlist
+ *   The rte_kvargs structure
+ * @param key
+ *   The key that should match
+ *
+ * @return
+ *   The value matching the key, or NULL if not found.
+ */
+__rte_experimental
+const char *rte_kvargs_get(struct rte_kvargs *kvlist, const char *key);
+
 /**
  * Call a handler function for each key/value matching the key
  *
diff --git a/lib/librte_kvargs/version.map b/lib/librte_kvargs/version.map
index ed375bf4a3..d6cde16f30 100644
--- a/lib/librte_kvargs/version.map
+++ b/lib/librte_kvargs/version.map
@@ -14,5 +14,6 @@ EXPERIMENTAL {
 
 	rte_kvargs_parse_delim;
 	rte_kvargs_strcmp;
+	rte_kvargs_get;
 
 };
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v2 4/5] devargs: parse name from global device syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (24 preceding siblings ...)
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 3/5] kvargs: add get by key function Xueming Li
@ 2021-01-18 15:16   ` Xueming Li
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 5/5] devargs: enable global device syntax devargs Xueming Li
                     ` (40 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 15:16 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso

To use Global Device Syntax as a devarg, a device name is required for
device management.

This patch resolves the device name from global device syntax using the
same strategy as rte_eth_iterator_init(): the name is parsed from the
"addr" bus parameter for the PCI bus and from the "name" bus parameter
for the vdev bus.
Example:
 -a bus=pci,addr=83:00.0/class=eth/driver=mlx5,...
    name: 83:00.0
 -a bus=vdev,name=pcap0/class=eth/driver=pcap,...
    name: pcap0
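
As a rough standalone sketch of the name resolution (not the code in
this patch): layer_get() and resolve_name() are hypothetical helpers
that scan only the first "/"-separated layer and pick "addr" or "name"
depending on the bus. Real parsing goes through rte_kvargs.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of "key" from a "k=v,k=v" layer of layer_len bytes
 * into out. Returns 0 on success, -1 if the key is absent or the
 * value does not fit.
 */
static int
layer_get(const char *layer, size_t layer_len, const char *key,
	  char *out, size_t out_len)
{
	size_t klen = strlen(key);
	const char *p = layer;
	const char *end = layer + layer_len;

	while (p < end) {
		const char *comma = memchr(p, ',', (size_t)(end - p));
		const char *stop = comma != NULL ? comma : end;

		if ((size_t)(stop - p) > klen &&
		    strncmp(p, key, klen) == 0 && p[klen] == '=') {
			size_t vlen = (size_t)(stop - p) - klen - 1;

			if (vlen >= out_len)
				return -1;
			memcpy(out, p + klen + 1, vlen);
			out[vlen] = '\0';
			return 0;
		}
		if (comma == NULL)
			break;
		p = comma + 1;
	}
	return -1;
}

/* Resolve the device name from the bus layer of a global syntax string. */
static int
resolve_name(const char *devstr, char *name, size_t name_len)
{
	char bus[16];
	size_t bus_len = strcspn(devstr, "/"); /* first layer only */

	if (layer_get(devstr, bus_len, "bus", bus, sizeof(bus)) != 0)
		return -1;
	if (strcmp(bus, "pci") == 0)
		return layer_get(devstr, bus_len, "addr", name, name_len);
	if (strcmp(bus, "vdev") == 0)
		return layer_get(devstr, bus_len, "name", name, name_len);
	return -1;
}
```

The sketch rejects values that do not fit instead of truncating them,
which is a design choice of this example rather than of the patch.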

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_eal/common/eal_common_devargs.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index 9c7a7de30e..27af4cc0e3 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -57,6 +57,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 	struct rte_class *cls = NULL;
 	struct rte_bus *bus = NULL;
 	const char *s = devstr;
+	const char *name = NULL;
 	size_t nblayer;
 	size_t i = 0;
 	int ret = 0;
@@ -116,6 +117,8 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist == NULL)
 			continue;
 		kv = &layers[i].kvlist->pairs[0];
+		if (!kv->key)
+			continue;
 		if (strcmp(kv->key, "bus") == 0) {
 			bus = rte_bus_find_by_name(kv->value);
 			if (bus == NULL) {
@@ -146,6 +149,16 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 	devargs->cls = cls;
 	devargs->src = devstr;
 
+	/* Parse device name. */
+	if (bus) {
+		if (strcmp(bus->name, "vdev") == 0)
+			name = rte_kvargs_get(layers[0].kvlist, "name");
+		else if (strcmp(bus->name, "pci") == 0)
+			name = rte_kvargs_get(layers[0].kvlist, "addr");
+		if (name != NULL)
+			strncpy(devargs->name, name, sizeof(devargs->name) - 1);
+	}
+
 	/* If we own the data, clean up a bit
 	 * the several layers string, to ease
 	 * their parsing afterward.
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v2 5/5] devargs: enable global device syntax devargs
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (25 preceding siblings ...)
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 4/5] devargs: parse name from global device syntax Xueming Li
@ 2021-01-18 15:16   ` Xueming Li
  2021-01-18 15:26   ` [dpdk-dev] [PATCH v2 0/2] mlx5: support global device syntax Xueming Li
                     ` (39 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 15:16 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, Olivier Matz
  Cc: dev, Viacheslav Ovsiienko, xuemingl, Asaf Penso

When parsing a device argument, try the new global device syntax first
and fall back to legacy syntax parsing on error.

Example of new global device syntax:
 -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,dv_flow_en=1
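
The try-new-then-fall-back order can be pictured with the simplified
sketch below. classify_devarg() is a hypothetical helper, and it
assumes the bus layer comes first and the class layer second; the
patch itself decides by running the layers parser and checking that
both bus and class were found.

```c
#include <stddef.h>
#include <string.h>

enum devarg_syntax {
	DEVARG_GLOBAL, /* bus=.../class=.../driver=... */
	DEVARG_LEGACY, /* anything else, e.g. "0000:82:00.0,key=val" */
};

/*
 * Treat the string as global syntax only when both a bus layer and a
 * class layer are present; otherwise let the legacy parser handle it.
 */
static enum devarg_syntax
classify_devarg(const char *dev)
{
	const char *cls;

	if (strncmp(dev, "bus=", 4) != 0)
		return DEVARG_LEGACY;
	cls = strchr(dev, '/');
	if (cls == NULL || strncmp(cls + 1, "class=", 6) != 0)
		return DEVARG_LEGACY;
	return DEVARG_GLOBAL;
}
```

With this ordering, existing legacy strings keep working unchanged
because they never satisfy the global syntax check.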

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_eal/common/eal_common_devargs.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index 27af4cc0e3..53ec8ad822 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -201,6 +201,12 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	if (da == NULL)
 		return -EINVAL;
 
+	/* First parse according to the new global syntax. */
+	if (rte_devargs_layers_parse(da, dev) == 0 && da->bus && da->cls)
+		return 0;
+
+	/* Legacy syntax check: */
+
 	/* Retrieve eventual bus info */
 	do {
 		devname = dev;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v2 0/2] mlx5: support global device syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (26 preceding siblings ...)
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 5/5] devargs: enable global device syntax devargs Xueming Li
@ 2021-01-18 15:26   ` Xueming Li
  2021-01-18 15:26   ` [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax Xueming Li
                     ` (38 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-18 15:26 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

New Global device syntax [1] is used to identify a device with full bus,
class and driver description, for example:
 -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,dv_flow_en=1

This patchset enables global syntax in mlx5 PMD.

Depends-on: series-14815 ("eal: support global device syntax")

History:
V1:
 - initial version
V2:
 - remove the code parsing "representor" from class parameters.
   representor parsing from class should be done by class "eth" in the
   future.


[1] Global Device Syntax:
https://www.dpdk.org/wp-content/uploads/sites/35/2018/10/am-07-DPDK-hotplug-20180905.pdf

[2] RFC:
http://patchwork.dpdk.org/project/dpdk/list/?series=14378

[3] V1:
http://patchwork.dpdk.org/project/dpdk/list/?series=14611


Xueming Li (2):
  common/mlx5: support device global syntax
  net/mlx5: support new global device syntax

 drivers/common/mlx5/mlx5_common_pci.c | 6 +++++-
 drivers/net/mlx5/mlx5.c               | 6 +++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (27 preceding siblings ...)
  2021-01-18 15:26   ` [dpdk-dev] [PATCH v2 0/2] mlx5: support global device syntax Xueming Li
@ 2021-01-18 15:26   ` Xueming Li
  2021-04-05 10:54     ` Slava Ovsiienko
  2021-01-18 15:26   ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: support new global device syntax Xueming Li
                     ` (37 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-01-18 15:26 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

This patch supports the new global device syntax: the class type is
resolved from the "class" section if the devarg uses global device syntax:
bus=<bus>,k=v,,,/class=<cls>,k=v,,,/driver=<pmd>,k=v,,,,

To reuse the class name of global device syntax, this patch also changes
the internal class name introduced by commit [1] to align with the RTE
class name.

[1]
8a41f4deccc3: common/mlx5: introduce layer for multiple class drivers

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 drivers/common/mlx5/mlx5_common_pci.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/common/mlx5/mlx5_common_pci.c b/drivers/common/mlx5/mlx5_common_pci.c
index 5208972bb6..c03bdbf4eb 100644
--- a/drivers/common/mlx5/mlx5_common_pci.c
+++ b/drivers/common/mlx5/mlx5_common_pci.c
@@ -4,6 +4,7 @@
 
 #include <stdlib.h>
 #include <rte_malloc.h>
+#include <rte_class.h>
 #include "mlx5_common_utils.h"
 #include "mlx5_common_pci.h"
 
@@ -26,7 +27,7 @@ static const struct {
 	unsigned int driver_class;
 } mlx5_classes[] = {
 	{ .name = "vdpa", .driver_class = MLX5_CLASS_VDPA },
-	{ .name = "net", .driver_class = MLX5_CLASS_NET },
+	{ .name = "eth", .driver_class = MLX5_CLASS_NET },
 	{ .name = "regex", .driver_class = MLX5_CLASS_REGEX },
 };
 
@@ -115,6 +116,9 @@ parse_class_options(const struct rte_devargs *devargs)
 
 	if (devargs == NULL)
 		return 0;
+	if (devargs->cls != NULL)
+		/* support new global syntax */
+		return class_name_to_value(devargs->cls->name);
 	kvlist = rte_kvargs_parse(devargs->args, NULL);
 	if (kvlist == NULL)
 		return 0;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v2 2/2] net/mlx5: support new global device syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (28 preceding siblings ...)
  2021-01-18 15:26   ` [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax Xueming Li
@ 2021-01-18 15:26   ` Xueming Li
  2021-04-05 10:56     ` Slava Ovsiienko
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 0/8] net/mlx5: support SubFunction representor Xueming Li
                     ` (36 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-01-18 15:26 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, Thomas Monjalon, xuemingl, Asaf Penso

This patch supports the new global device syntax, for example:
	bus=pci,addr=BB:DD.F/class=eth/driver=mlx5,devargs,..

The driver parameter check now ignores the "driver" key, which is part of
the new global device syntax, instead of reporting an error.
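
The idea can be sketched in isolation as follows. param_check() is a
hypothetical helper, not mlx5_args_check(); it only shows that the
"driver" key is accepted before any numeric conversion is attempted,
since its value is a driver name rather than a number.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Layer selector key of the global syntax; carries no numeric value. */
#define DRIVER_KEY "driver"

/*
 * Validate one key/value pair. Returns 0 if accepted, -1 if the value
 * is not a valid integer. *out is written only for numeric keys.
 */
static int
param_check(const char *key, const char *val, long *out)
{
	char *end;
	long tmp;

	/* No-op: the "driver" key only selects the driver layer. */
	if (strcmp(key, DRIVER_KEY) == 0)
		return 0;
	errno = 0;
	tmp = strtol(val, &end, 0);
	if (errno != 0 || end == val || *end != '\0')
		return -1;
	*out = tmp;
	return 0;
}
```

Without the early return, a value such as "mlx5" would fail the
integer conversion and the whole devarg would be rejected.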

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 drivers/net/mlx5/mlx5.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index e245276fce..3b0e59ce1d 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -41,6 +41,9 @@
 #include "mlx5_flow_os.h"
 #include "rte_pmd_mlx5.h"
 
+/* Driver type key for new device global syntax. */
+#define MLX5_DRIVER_KEY "driver"
+
 /* Device parameter to enable RX completion queue compression. */
 #define MLX5_RXQ_CQE_COMP_EN "rxq_cqe_comp_en"
 
@@ -1597,7 +1600,7 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 	signed long tmp;
 
 	/* No-op, port representors are processed in mlx5_dev_spawn(). */
-	if (!strcmp(MLX5_REPRESENTOR, key))
+	if (!strcmp(MLX5_DRIVER_KEY, key) || !strcmp(MLX5_REPRESENTOR, key))
 		return 0;
 	errno = 0;
 	tmp = strtol(val, NULL, 0);
@@ -1749,6 +1752,7 @@ int
 mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 {
 	const char **params = (const char *[]){
+		MLX5_DRIVER_KEY,
 		MLX5_RXQ_CQE_COMP_EN,
 		MLX5_RXQ_PKT_PAD_EN,
 		MLX5_RX_MPRQ_EN,
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/9] net/mlx5: save bonding member ports information
  2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 7/9] " Xueming Li
@ 2021-01-18 16:17     ` Slava Ovsiienko
  2021-01-18 23:05       ` Xueming(Steven) Li
  0 siblings, 1 reply; 118+ messages in thread
From: Slava Ovsiienko @ 2021-01-18 16:17 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: dev, Matan Azrad, Shahaf Shuler, NBU-Contact-Thomas Monjalon,
	Xueming(Steven) Li, Asaf Penso

Hi, Xueming

- this patch has the same headline as previous one
- typos: couters -> counters, collect -> collectS, save -> saveS

With best regards, Slava

> -----Original Message-----
> From: Xueming Li <xuemingl@nvidia.com>
> Sent: Monday, January 18, 2021 13:29
> To: Slava Ovsiienko <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf
> Penso <asafp@nvidia.com>
> Subject: [PATCH v3 7/9] net/mlx5: save bonding member ports information
> 
> Since kernel bonding interface doesn't provide counter summary of member
> ports, PMD has to aggregate couters from of member ports.
> 
> This patch collect bonding member information and save to shared context
> data.
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> ---
>  drivers/net/mlx5/mlx5.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index 508f98f8cd..c15af1d794 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -662,12 +662,14 @@ struct mlx5_flex_parser_profiles {
>  	void *obj;		/* Flex parser node object. */
>  };
> 
> +/* Max member ports per bonding device. */
> +#define MLX5_BOND_MAX_PORTS 2
> +
>  /* Bonding device information. */
>  struct mlx5_bond_info {
>  	int n_port; /* Number of bond member ports. */
>  	uint32_t ifindex;
>  	char ifname[MLX5_NAMESIZE + 1];
> -#define MLX5_BOND_MAX_PORTS 2
>  	struct {
>  		char ifname[MLX5_NAMESIZE + 1];
>  		uint32_t ifindex;
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/9] net/mlx5: save bonding member ports information
  2021-01-18 16:17     ` Slava Ovsiienko
@ 2021-01-18 23:05       ` Xueming(Steven) Li
  0 siblings, 0 replies; 118+ messages in thread
From: Xueming(Steven) Li @ 2021-01-18 23:05 UTC (permalink / raw)
  To: Slava Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, NBU-Contact-Thomas Monjalon, Asaf Penso

Hi Slava,

>-----Original Message-----
>From: Slava Ovsiienko <viacheslavo@nvidia.com>
>Sent: Tuesday, January 19, 2021 12:17 AM
>To: Xueming(Steven) Li <xuemingl@nvidia.com>
>Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
><shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
><thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf
>Penso <asafp@nvidia.com>
>Subject: RE: [PATCH v3 7/9] net/mlx5: save bonding member ports
>information
>
>Hi, Xueming
>
>- this patch has the same headline as previous one
>- typos: couters -> counters, collect -> collectS, save -> saveS

My bad, this patch should be combined with the previous one.

>
>With best regards, Slava
>
>> -----Original Message-----
>> From: Xueming Li <xuemingl@nvidia.com>
>> Sent: Monday, January 18, 2021 13:29
>> To: Slava Ovsiienko <viacheslavo@nvidia.com>
>> Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
>> <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
>> <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf
>> Penso <asafp@nvidia.com>
>> Subject: [PATCH v3 7/9] net/mlx5: save bonding member ports
>> information
>>
>> Since kernel bonding interface doesn't provide counter summary of
>> member ports, PMD has to aggregate couters from of member ports.
>>
>> This patch collect bonding member information and save to shared
>> context data.
>>
>> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
>> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
>> ---
>>  drivers/net/mlx5/mlx5.h | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
>> index 508f98f8cd..c15af1d794 100644
>> --- a/drivers/net/mlx5/mlx5.h
>> +++ b/drivers/net/mlx5/mlx5.h
>> @@ -662,12 +662,14 @@ struct mlx5_flex_parser_profiles {
>>  	void *obj;		/* Flex parser node object. */
>>  };
>>
>> +/* Max member ports per bonding device. */
>> +#define MLX5_BOND_MAX_PORTS 2
>> +
>>  /* Bonding device information. */
>>  struct mlx5_bond_info {
>>  	int n_port; /* Number of bond member ports. */
>>  	uint32_t ifindex;
>>  	char ifname[MLX5_NAMESIZE + 1];
>> -#define MLX5_BOND_MAX_PORTS 2
>>  	struct {
>>  		char ifname[MLX5_NAMESIZE + 1];
>>  		uint32_t ifindex;
>> --
>> 2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v4 0/8] net/mlx5: support SubFunction representor
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (29 preceding siblings ...)
  2021-01-18 15:26   ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: support new global device syntax Xueming Li
@ 2021-01-19  7:28   ` Xueming Li
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 1/8] common/mlx5: update representor name parsing Xueming Li
                     ` (35 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-19  7:28 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, xuemingl, Asaf Penso

A SubFunction (SF) [1] is a portion of a PCI device. An SF netdev has its
own dedicated queues (txq, rxq) and supports E-Switch representation
offload similar to the existing PF and VF representors. An SF shares
PCI-level resources with other SFs and/or with its parent PCI function.

This patch set introduces SubFunction representor support for mlx5
PMD driver.

Depends-on: series-14834 ("ethdev: support SubFunction representor")

Version history:
 RFC:
    - initial version [2]
 V2:
    - support bonding representor probe with new pf#vf# devargs
    - adapt EAL api V2 [3] changes
    - update document
 V3:
    - support list of representor PF section for bonding device:
      example: representor=pf[0,1]vf[0-3]
    - add bonding information to shared PMD data
    - fix setting VF MAC through representor
    - fix bonding xstats, sum xstats from PF members.
 V4:
    - combine unexpected patch, thanks Slava

[1] SubFunction in kernel:
https://lore.kernel.org/netdev/20201112192424.2742-1-parav@nvidia.com/

[2] RFC:
http://patchwork.dpdk.org/project/dpdk/list/?series=14376

[3] V2:
http://patchwork.dpdk.org/project/dpdk/list/?series=14560

[3] V3:
http://patchwork.dpdk.org/project/dpdk/list/?series=14810

[4] EAL part to support SF representor:
http://patchwork.dpdk.org/project/dpdk/list/?series=14834


Xueming Li (8):
  common/mlx5: update representor name parsing
  net/mlx5: support representor of sub function
  net/mlx5: revert setting bonding representor to first PF
  net/mlx5: refactor bonding representor probe
  net/mlx5: support representor from multiple PFs
  net/mlx5: save bonding member ports information
  net/mlx5: fix setting VF default MAC through representor
  net/mlx5: improve bonding xstats

 doc/guides/nics/mlx5.rst                   |  62 +++-
 drivers/common/mlx5/linux/mlx5_common_os.c |  32 +-
 drivers/common/mlx5/linux/mlx5_nl.c        |   2 +
 drivers/common/mlx5/mlx5_common.h          |   2 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c    | 136 +++++--
 drivers/net/mlx5/linux/mlx5_os.c           | 399 ++++++++++++++-------
 drivers/net/mlx5/mlx5.c                    |  25 +-
 drivers/net/mlx5/mlx5.h                    |  25 +-
 drivers/net/mlx5/mlx5_defs.h               |   4 -
 drivers/net/mlx5/mlx5_ethdev.c             |  34 +-
 drivers/net/mlx5/mlx5_mac.c                |  26 +-
 11 files changed, 533 insertions(+), 214 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v4 1/8] common/mlx5: update representor name parsing
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (30 preceding siblings ...)
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 0/8] net/mlx5: support SubFunction representor Xueming Li
@ 2021-01-19  7:28   ` Xueming Li
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 2/8] net/mlx5: support representor of sub function Xueming Li
                     ` (34 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-19  7:28 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, xuemingl, Asaf Penso

This patch updates representor name parsing for SF.
In sysfs, the representor name is stored under the "phys_port_name"
key. Similar to a VF representor, the switch port name of an SF
representor is "pf<x>sf<y>".

For netlink messages, the SF port name type is supported as well.

Examples:

pf0sf1
pf0sf[0-3]
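
A minimal standalone sketch of telling "pf<x>vf<y>" and "pf<x>sf<y>"
apart with sscanf(), in the same spirit as the parsing changed below.
The enum and parse_port_name() are hypothetical names for this
example and cover only these two forms.

```c
#include <stdio.h>

enum port_name_type {
	PORT_NAME_PFVF,    /* pf<x>vf<y> */
	PORT_NAME_PFSF,    /* pf<x>sf<y> */
	PORT_NAME_UNKNOWN, /* anything else */
};

/*
 * Parse a switch port name. On a match, *pf and *port receive the PF
 * number and the VF/SF number. The trailing %c detects extra input:
 * exactly 6 converted items means the whole string was consumed.
 */
static enum port_name_type
parse_port_name(const char *s, int *pf, int *port)
{
	char p1, p2, f1, f2, eol;
	int n;

	n = sscanf(s, "%c%c%d%c%c%d%c", &p1, &p2, pf, &f1, &f2, port, &eol);
	if (n == 6 && p1 == 'p' && p2 == 'f') {
		if (f1 == 'v' && f2 == 'f')
			return PORT_NAME_PFVF;
		if (f1 == 's' && f2 == 'f')
			return PORT_NAME_PFSF;
	}
	return PORT_NAME_UNKNOWN;
}
```

Sharing one scan for both forms keeps the PF and port number
extraction identical, so only the two-letter function tag differs.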

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/common/mlx5/linux/mlx5_common_os.c | 32 +++++++++++++++-------
 drivers/common/mlx5/linux/mlx5_nl.c        |  2 ++
 drivers/common/mlx5/mlx5_common.h          |  2 ++
 drivers/net/mlx5/linux/mlx5_ethdev_os.c    |  3 ++
 4 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c
index 0edd78ea6d..5cf9576921 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.c
+++ b/drivers/common/mlx5/linux/mlx5_common_os.c
@@ -97,22 +97,34 @@ void
 mlx5_translate_port_name(const char *port_name_in,
 			 struct mlx5_switch_info *port_info_out)
 {
-	char pf_c1, pf_c2, vf_c1, vf_c2, eol;
+	char ctrl = 0, pf_c1, pf_c2, vf_c1, vf_c2, eol;
 	char *end;
 	int sc_items;
 
-	/*
-	 * Check for port-name as a string of the form pf0vf0
-	 * (support kernel ver >= 5.0 or OFED ver >= 4.6).
-	 */
+	sc_items = sscanf(port_name_in, "%c%d",
+			  &ctrl, &port_info_out->ctrl_num);
+	if (sc_items == 2 && ctrl == 'c') {
+		port_name_in++; /* 'c' */
+		port_name_in += snprintf(NULL, 0, "%d",
+					  port_info_out->ctrl_num);
+	}
+	/* Check for port-name as a string of the form pf0vf0 or pf0sf0 */
 	sc_items = sscanf(port_name_in, "%c%c%d%c%c%d%c",
 			  &pf_c1, &pf_c2, &port_info_out->pf_num,
 			  &vf_c1, &vf_c2, &port_info_out->port_name, &eol);
-	if (sc_items == 6 &&
-	    pf_c1 == 'p' && pf_c2 == 'f' &&
-	    vf_c1 == 'v' && vf_c2 == 'f') {
-		port_info_out->name_type = MLX5_PHYS_PORT_NAME_TYPE_PFVF;
-		return;
+	if (sc_items == 6 && pf_c1 == 'p' && pf_c2 == 'f') {
+		if (vf_c1 == 'v' && vf_c2 == 'f') {
+			/* Kernel ver >= 5.0 or OFED ver >= 4.6 */
+			port_info_out->name_type =
+					MLX5_PHYS_PORT_NAME_TYPE_PFVF;
+			return;
+		}
+		if (vf_c1 == 's' && vf_c2 == 'f') {
+			/* Kernel ver >= 5.11 or OFED ver >= 5.1 */
+			port_info_out->name_type =
+					MLX5_PHYS_PORT_NAME_TYPE_PFSF;
+			return;
+		}
 	}
 	/*
 	 * Check for port-name as a string of the form p0
diff --git a/drivers/common/mlx5/linux/mlx5_nl.c b/drivers/common/mlx5/linux/mlx5_nl.c
index 40d8620300..3d55cc98b4 100644
--- a/drivers/common/mlx5/linux/mlx5_nl.c
+++ b/drivers/common/mlx5/linux/mlx5_nl.c
@@ -1148,6 +1148,8 @@ mlx5_nl_check_switch_info(bool num_vf_set,
 	case MLX5_PHYS_PORT_NAME_TYPE_PFHPF:
 		/* Fallthrough */
 	case MLX5_PHYS_PORT_NAME_TYPE_PFVF:
+		/* Fallthrough */
+	case MLX5_PHYS_PORT_NAME_TYPE_PFSF:
 		/* New representors naming schema. */
 		switch_info->representor = 1;
 		break;
diff --git a/drivers/common/mlx5/mlx5_common.h b/drivers/common/mlx5/mlx5_common.h
index e35188da4c..a422b74577 100644
--- a/drivers/common/mlx5/mlx5_common.h
+++ b/drivers/common/mlx5/mlx5_common.h
@@ -153,6 +153,7 @@ enum mlx5_nl_phys_port_name_type {
 	MLX5_PHYS_PORT_NAME_TYPE_UPLINK, /* p0, kernel ver >= 5.0 */
 	MLX5_PHYS_PORT_NAME_TYPE_PFVF, /* pf0vf0, kernel ver >= 5.0 */
 	MLX5_PHYS_PORT_NAME_TYPE_PFHPF, /* pf0, kernel ver >= 5.7, HPF rep */
+	MLX5_PHYS_PORT_NAME_TYPE_PFSF, /* pf0sf0, kernel ver >= 5.0 */
 	MLX5_PHYS_PORT_NAME_TYPE_UNKNOWN, /* Unrecognized. */
 };
 
@@ -161,6 +162,7 @@ struct mlx5_switch_info {
 	uint32_t master:1; /**< Master device. */
 	uint32_t representor:1; /**< Representor device. */
 	enum mlx5_nl_phys_port_name_type name_type; /** < Port name type. */
+	int32_t ctrl_num; /**< Controller number (valid for c#pf#vf# format). */
 	int32_t pf_num; /**< PF number (valid for pfxvfx format only). */
 	int32_t port_name; /**< Representor port name. */
 	uint64_t switch_id; /**< Switch identifier. */
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index e36a78091c..1b37970c21 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1013,6 +1013,9 @@ mlx5_sysfs_check_switch_info(bool device_dir,
 		/* New representors naming schema. */
 		switch_info->representor = 1;
 		break;
+	default:
+		switch_info->master = device_dir;
+		break;
 	}
 }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v4 2/8] net/mlx5: support representor of sub function
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (31 preceding siblings ...)
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 1/8] common/mlx5: update representor name parsing Xueming Li
@ 2021-01-19  7:28   ` Xueming Li
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 3/8] net/mlx5: revert setting bonding representor to first PF Xueming Li
                     ` (33 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-19  7:28 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, xuemingl, Asaf Penso, Anatoly Burakov

This patch adds support for SF representors. Similar to a VF representor,
the switch port name of an SF representor in the phys_port_name sysfs key
is "pf<x>sf<y>".

The device representor argument is "representor=sf[list]", where list
members can be a mix of single instances and ranges. Example:
  representor=sf[0,2,4,8-12,-1]

To probe both VF and SF representors, specify them as 2 separate
devices:
  -a <BDF>,representor=vf[list] -a <BDF>,representor=sf[list]

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst                |  58 +++++++++--
 drivers/net/mlx5/linux/mlx5_ethdev_os.c |   2 +
 drivers/net/mlx5/linux/mlx5_os.c        | 123 ++++++++++++++++++++----
 drivers/net/mlx5/mlx5_ethdev.c          |   2 +
 4 files changed, 154 insertions(+), 31 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index db0c8b6c20..c7829007a4 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -871,14 +871,18 @@ Driver options
 - ``representor`` parameter [list]
 
   This parameter can be used to instantiate DPDK Ethernet devices from
-  existing port (or VF) representors configured on the device.
+  existing port (PF, VF or SF) representors configured on the device.
 
   It is a standard parameter whose format is described in
   :ref:`ethernet_device_standard_device_arguments`.
 
-  For instance, to probe port representors 0 through 2::
+  For instance, to probe VF port representors 0 through 2::
 
-    representor=[0-2]
+    representor=vf[0-2]
+
+  To probe SF port representors 0 through 2::
+
+    representor=sf[0-2]
 
 - ``max_dump_files_num`` parameter [int]
 
@@ -1287,15 +1291,15 @@ Quick Start Guide on OFED/EN
 Enable switchdev mode
 ---------------------
 
-Switchdev mode is a mode in E-Switch, that binds between representor and VF.
-Representor is a port in DPDK that is connected to a VF in such a way
-that assuming there are no offload flows, each packet that is sent from the VF
-will be received by the corresponding representor. While each packet that is
-sent to a representor will be received by the VF.
+Switchdev mode is a mode in E-Switch, that binds between representor and VF or SF.
+Representor is a port in DPDK that is connected to a VF or SF in such a way
+that assuming there are no offload flows, each packet that is sent from the VF or SF
+will be received by the corresponding representor. While each packet that is
+sent to a representor will be received by the VF or SF.
 This is very useful in case of SRIOV mode, where the first packet that is sent
-by the VF will be received by the DPDK application which will decide if this
+by the VF or SF will be received by the DPDK application which will decide if this
 flow should be offloaded to the E-Switch. After offloading the flow packet
-that the VF that are matching the flow will not be received any more by
+that the VF or SF that are matching the flow will not be received any more by
 the DPDK application.
 
 1. Enable SRIOV mode::
@@ -1322,6 +1326,40 @@ the DPDK application.
 
         echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
 
+SubFunction representor support
+-------------------------------
+A SubFunction (SF) is a portion of the PCI device. An SF netdev has its own
+dedicated queues (txq, rxq). An SF netdev supports E-Switch representation
+offload similar to the existing PF and VF representors. An SF shares PCI
+level resources with other SFs and/or with its parent PCI function.
+
+1. Configure SF feature::
+
+        mlxconfig -d <mst device> set PF_BAR2_SIZE=<0/1/2/3> PF_BAR2_ENABLE=1
+
+        Value of PF_BAR2_SIZE:
+
+            0: 8 SFs
+            1: 16 SFs
+            2: 32 SFs
+            3: 64 SFs
+
+2. Reset the FW::
+
+        mlxfwreset -d <mst device> reset
+
+3. Enable switchdev mode::
+
+        echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
+
+4. Create SF::
+
+        mlnx-sf -d <PCI_BDF> -a create
+
+5. Probe SF representor::
+
+        testpmd> port attach <PCI_BDF>,representor=sf0,dv_flow_en=1
+
 Performance tuning
 ------------------
 
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 1b37970c21..ac311de46d 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1010,6 +1010,8 @@ mlx5_sysfs_check_switch_info(bool device_dir,
 	case MLX5_PHYS_PORT_NAME_TYPE_PFHPF:
 		/* Fallthrough */
 	case MLX5_PHYS_PORT_NAME_TYPE_PFVF:
+		/* Fallthrough */
+	case MLX5_PHYS_PORT_NAME_TYPE_PFSF:
 		/* New representors naming schema. */
 		switch_info->representor = 1;
 		break;
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 4d7940bcca..b2776c080a 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -654,6 +654,8 @@ mlx5_flow_counter_mode_config(struct rte_eth_dev *dev __rte_unused)
  *   Verbs device parameters (name, port, switch_info) to spawn.
  * @param config
  *   Device configuration parameters.
+ * @param eth_da
+ *   Device arguments.
  *
  * @return
  *   A valid Ethernet device object on success, NULL otherwise and rte_errno
@@ -665,7 +667,8 @@ mlx5_flow_counter_mode_config(struct rte_eth_dev *dev __rte_unused)
 static struct rte_eth_dev *
 mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	       struct mlx5_dev_spawn_data *spawn,
-	       struct mlx5_dev_config *config)
+	       struct mlx5_dev_config *config,
+	       struct rte_eth_devargs *eth_da)
 {
 	const struct mlx5_switch_info *switch_info = &spawn->info;
 	struct mlx5_dev_ctx_shared *sh = NULL;
@@ -696,34 +699,82 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 
 	/* Determine if this port representor is supposed to be spawned. */
 	if (switch_info->representor && dpdk_dev->devargs) {
-		struct rte_eth_devargs eth_da;
-
-		err = rte_eth_devargs_parse(dpdk_dev->devargs->args, &eth_da);
-		if (err) {
-			rte_errno = -err;
-			DRV_LOG(ERR, "failed to process device arguments: %s",
-				strerror(rte_errno));
-			return NULL;
-		}
-		if (eth_da.type != RTE_ETH_REPRESENTOR_NONE) {
-			/* Representor not specified. */
+		switch (eth_da->type) {
+		case RTE_ETH_REPRESENTOR_SF:
+			if (switch_info->name_type !=
+					MLX5_PHYS_PORT_NAME_TYPE_PFSF) {
+				rte_errno = EBUSY;
+				return NULL;
+			}
+			break;
+		case RTE_ETH_REPRESENTOR_VF:
+			/* Allows HPF representor index -1 as exception. */
+			if (!(spawn->info.port_name == -1 &&
+			      switch_info->name_type ==
+					MLX5_PHYS_PORT_NAME_TYPE_PFHPF) &&
+			    switch_info->name_type !=
+					MLX5_PHYS_PORT_NAME_TYPE_PFVF) {
+				rte_errno = EBUSY;
+				return NULL;
+			}
+			break;
+		case RTE_ETH_REPRESENTOR_NONE:
 			rte_errno = EBUSY;
 			return NULL;
-		}
-		if (eth_da.type != RTE_ETH_REPRESENTOR_VF) {
+			break;
+		default:
 			rte_errno = ENOTSUP;
 			DRV_LOG(ERR, "unsupported representor type: %s",
 				dpdk_dev->devargs->args);
 			return NULL;
 		}
-		for (i = 0; i < eth_da.nb_representor_ports; ++i)
-			if (eth_da.representor_ports[i] ==
+		/* Check controller ID: */
+		for (i = 0; i < eth_da->nb_mh_controllers; ++i)
+			if (eth_da->mh_controllers[i] ==
+			    (uint16_t)switch_info->ctrl_num)
+				break;
+		if (eth_da->nb_mh_controllers &&
+		    i == eth_da->nb_mh_controllers) {
+			rte_errno = EBUSY;
+			return NULL;
+		}
+		/* Check SF/VF ID: */
+		for (i = 0; i < eth_da->nb_representor_ports; ++i)
+			if (eth_da->representor_ports[i] ==
 			    (uint16_t)switch_info->port_name)
 				break;
-		if (i == eth_da.nb_representor_ports) {
+		if (eth_da->type != RTE_ETH_REPRESENTOR_PF &&
+		    i == eth_da->nb_representor_ports) {
 			rte_errno = EBUSY;
 			return NULL;
 		}
+		/* Check PF ID. Check after repr port to avoid warning flood. */
+		if (spawn->pf_bond >= 0) {
+			for (i = 0; i < eth_da->nb_ports; ++i)
+				if (eth_da->ports[i] ==
+				    (uint16_t)switch_info->pf_num)
+					break;
+			if (eth_da->nb_ports && i == eth_da->nb_ports) {
+				/* For backward compatibility, bonding
+				 * representor syntax supported with limitation,
+				 * device iterator won't find it:
+				 *    <PF1_BDF>,representor=#
+				 */
+				if (switch_info->pf_num > 0 &&
+				    eth_da->ports[0] == 0) {
+					DRV_LOG(WARNING, "Representor on Bonding PF should use pf#vf# format: %s",
+						dpdk_dev->devargs->args);
+				} else {
+					rte_errno = EBUSY;
+					return NULL;
+				}
+			}
+		} else if (eth_da->nb_ports > 1 || eth_da->ports[0]) {
+			rte_errno = EINVAL;
+			DRV_LOG(ERR, "PF id not supported by non-bond device: %s",
+				dpdk_dev->devargs->args);
+			return NULL;
+		}
 	}
 	/* Build device name. */
 	if (spawn->pf_bond <  0) {
@@ -731,8 +782,11 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		if (!switch_info->representor)
 			strlcpy(name, dpdk_dev->name, sizeof(name));
 		else
-			snprintf(name, sizeof(name), "%s_representor_%u",
-				 dpdk_dev->name, switch_info->port_name);
+			snprintf(name, sizeof(name), "%s_representor_%s%u",
+				 dpdk_dev->name,
+				 switch_info->name_type ==
+				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
+				 switch_info->port_name);
 	} else {
 		/* Bonding device. */
 		if (!switch_info->representor)
@@ -740,9 +794,11 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 				 dpdk_dev->name,
 				 mlx5_os_get_dev_device_name(spawn->phys_dev));
 		else
-			snprintf(name, sizeof(name), "%s_%s_representor_%u",
+			snprintf(name, sizeof(name), "%s_%s_representor_%s%u",
 				 dpdk_dev->name,
 				 mlx5_os_get_dev_device_name(spawn->phys_dev),
+				 switch_info->name_type ==
+				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
 				 switch_info->port_name);
 	}
 	/* check if the device is already spawned */
@@ -1790,6 +1846,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_spawn_data *list = NULL;
 	struct mlx5_dev_config dev_config;
 	unsigned int dev_config_vf;
+	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
 	int ret;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
@@ -1800,6 +1857,27 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			strerror(rte_errno));
 		return -rte_errno;
 	}
+	if (pci_dev->device.devargs) {
+		/* Parse representor information from device argument. */
+		if (pci_dev->device.devargs->cls_str)
+			ret = rte_eth_devargs_parse(
+				pci_dev->device.devargs->cls_str, &eth_da);
+		if (ret) {
+			DRV_LOG(ERR, "failed to parse device arguments: %s",
+				pci_dev->device.devargs->cls_str);
+			return -rte_errno;
+		}
+		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
+			/* Support legacy device argument */
+			ret = rte_eth_devargs_parse(
+				pci_dev->device.devargs->args, &eth_da);
+			if (ret) {
+				DRV_LOG(ERR, "failed to parse device arguments: %s",
+					pci_dev->device.devargs->args);
+				return -rte_errno;
+			}
+		}
+	}
 	errno = 0;
 	ibv_list = mlx5_glue->get_device_list(&ret);
 	if (!ibv_list) {
@@ -1972,6 +2050,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 				case MLX5_PHYS_PORT_NAME_TYPE_PFHPF:
 					/* Fallthrough */
 				case MLX5_PHYS_PORT_NAME_TYPE_PFVF:
+					/* Fallthrough */
+				case MLX5_PHYS_PORT_NAME_TYPE_PFSF:
 					if (list[ns].info.pf_num == bd)
 						ns++;
 					break;
@@ -2149,7 +2229,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		dev_config.log_hp_size = MLX5_ARG_UNSET;
 		list[i].eth_dev = mlx5_dev_spawn(&pci_dev->device,
 						 &list[i],
-						 &dev_config);
+						 &dev_config,
+						 &eth_da);
 		if (!list[i].eth_dev) {
 			if (rte_errno != EBUSY && rte_errno != EEXIST)
 				break;
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 45ee7e4488..ad6aacc329 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -374,6 +374,8 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 			break;
 		}
 	}
+	if (priv->master)
+		info->dev_capa = RTE_ETH_DEV_CAPA_REPRESENTOR_SF;
 	return 0;
 }
 
-- 
2.25.1



* [dpdk-dev] [PATCH v4 3/8] net/mlx5: revert setting bonding representor to first PF
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (32 preceding siblings ...)
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 2/8] net/mlx5: support representor of sub function Xueming Li
@ 2021-01-19  7:28   ` Xueming Li
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 4/8] net/mlx5: refactor bonding representor probe Xueming Li
                     ` (32 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-19  7:28 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, xuemingl, Asaf Penso

With kernel bonding, representors on the second PF are probed with
devargs:
	<primary_bdf>,representor=pf1vf<N>
There is no need to save the primary PF port ID and look it up when
probing sibling ports, so revert patch [1].

[1]:
commit e6818853c022 ("net/mlx5: set representor to first PF in bonding
mode")

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 20 ++------------------
 drivers/net/mlx5/mlx5.c          |  1 -
 drivers/net/mlx5/mlx5.h          |  1 -
 3 files changed, 2 insertions(+), 20 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index b2776c080a..7b320e8b72 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -816,13 +816,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 			rte_errno = ENOMEM;
 			return NULL;
 		}
-		priv = eth_dev->data->dev_private;
-		if (priv->sh->bond_dev != UINT16_MAX)
-			/* For bonding port, use primary PCI device. */
-			eth_dev->device =
-				rte_eth_devices[priv->sh->bond_dev].device;
-		else
-			eth_dev->device = dpdk_dev;
+		eth_dev->device = dpdk_dev;
 		eth_dev->dev_ops = &mlx5_dev_sec_ops;
 		eth_dev->rx_descriptor_status = mlx5_rx_descriptor_status;
 		eth_dev->tx_descriptor_status = mlx5_tx_descriptor_status;
@@ -1439,17 +1433,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	eth_dev->data->dev_private = priv;
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
-	if (spawn->pf_bond < 0) {
-		eth_dev->device = dpdk_dev;
-	} else {
-		/* Use primary bond PCI as device. */
-		if (sh->bond_dev == UINT16_MAX) {
-			sh->bond_dev = eth_dev->data->port_id;
-			eth_dev->device = dpdk_dev;
-		} else {
-			eth_dev->device = rte_eth_devices[sh->bond_dev].device;
-		}
-	}
+	eth_dev->device = dpdk_dev;
 	eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;
 	/* Configure the first MAC address by default. */
 	if (mlx5_get_mac(eth_dev, &mac.addr_bytes)) {
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index e245276fce..5e8cd6a3df 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -914,7 +914,6 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn,
 		goto error;
 	}
 	sh->refcnt = 1;
-	sh->bond_dev = UINT16_MAX;
 	sh->max_port = spawn->max_port;
 	strncpy(sh->ibdev_name, mlx5_os_get_ctx_device_name(sh->ctx),
 		sizeof(sh->ibdev_name) - 1);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 3836a9696c..e06e0ff3bb 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -668,7 +668,6 @@ struct mlx5_flex_parser_profiles {
 struct mlx5_dev_ctx_shared {
 	LIST_ENTRY(mlx5_dev_ctx_shared) next;
 	uint32_t refcnt;
-	uint16_t bond_dev; /* Bond primary device id. */
 	uint32_t devx:1; /* Opened with DV. */
 	uint32_t flow_hit_aso_en:1; /* Flow Hit ASO is supported. */
 	uint32_t max_port; /* Maximal IB device port index. */
-- 
2.25.1



* [dpdk-dev] [PATCH v4 4/8] net/mlx5: refactor bonding representor probe
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (33 preceding siblings ...)
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 3/8] net/mlx5: revert setting bonding representor to first PF Xueming Li
@ 2021-01-19  7:28   ` Xueming Li
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 5/8] net/mlx5: support representor from multiple PFs Xueming Li
                     ` (31 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-19  7:28 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, xuemingl, Asaf Penso, Anatoly Burakov

Previously, to probe a representor on the 2nd PF of a kernel bonding
device, the PF1 BDF had to be specified in the devargs:
  <PF1_BDF>,representor=0
When a bonding device is closed, all of its representors have to be
closed together, which implies that all representors must use the
primary PF of the bonding device. As a result, after probing a
representor port on the 2nd PF, locating the newly probed device by its
device argument failed, because the filter used the 2nd PF as the PCI
address.

The current representor devargs therefore conflict with each other:
 - The PCI BDF is used to specify the representor owner PF.
 - The PCI BDF is used to locate the probed representor device.
 - The PMD uses the primary PCI BDF as the PCI device.

To resolve these conflicts, a new representor syntax is introduced here:
  <primary BDF>,representor=pfXvfY
All representors must use the primary PF as the owner PCI device. The
PMD internally locates the owner PCI address by checking the "pfX" part
of the representor argument. To EAL, all representors are registered to
the primary PCI device and the 2nd PF is hidden, so all searches are
consistent.

As with VF representors, the HPF (host PF on BlueField) is probed with
the same syntax, for example: representor=pf1vf[0-3,-1]

This patch also adds the PF index to the kernel bonding representor port
name:
	<BDF>_<ib_name>_representor_pf<X>vf<Y>
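
A minimal sketch of building such a name with a truncation check is shown
below. It follows the format string used in the diff, which also carries a
controller field ("c<N>") in front of the PF index. The function name
toy_bond_repr_name and its parameters are hypothetical, and where the patch
only logs a warning on overflow, this sketch returns an error instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a bonding representor name into "name".
 * "kind" is "vf" or "sf". Returns the name length, or -1 if the name
 * does not fit into "size" bytes (snprintf would have truncated it).
 */
static int
toy_bond_repr_name(char *name, size_t size, const char *bdf,
		   const char *ib_name, int ctrl, int pf,
		   const char *kind, unsigned int port)
{
	int len;

	assert(name != NULL && size > 0);
	len = snprintf(name, size, "%s_%s_representor_c%dpf%d%s%u",
		       bdf, ib_name, ctrl, pf, kind, port);
	if (len < 0 || (size_t)len >= size)
		return -1;
	return len;
}
```

For example, BDF "0000:82:00.0", IB device "mlx5_bond_0", controller 0,
PF 1 and VF 3 would give "0000:82:00.0_mlx5_bond_0_representor_c0pf1vf3".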

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst         |   4 +-
 drivers/net/mlx5/linux/mlx5_os.c | 263 +++++++++++++++++--------------
 drivers/net/mlx5/mlx5.c          |  22 +++
 drivers/net/mlx5/mlx5.h          |   3 +-
 drivers/net/mlx5/mlx5_defs.h     |   4 -
 drivers/net/mlx5/mlx5_ethdev.c   |  27 ----
 drivers/net/mlx5/mlx5_mac.c      |   8 +-
 7 files changed, 177 insertions(+), 154 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index c7829007a4..eaca4fc058 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -878,11 +878,11 @@ Driver options
 
   For instance, to probe VF port representors 0 through 2::
 
-    representor=vf[0-2]
+    <PCI_BDF>,representor=vf[0-2]
 
   To probe SF port representors 0 through 2::
 
-    representor=sf[0-2]
+    <PCI_BDF>,representor=sf[0-2]
 
 - ``max_dump_files_num`` parameter [int]
 
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 7b320e8b72..9ae5910f46 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -645,6 +645,72 @@ mlx5_flow_counter_mode_config(struct rte_eth_dev *dev __rte_unused)
 #endif
 }
 
+/**
+ * Check if representor spawn info matches devargs.
+ *
+ * @param spawn
+ *   Verbs device parameters (name, port, switch_info) to spawn.
+ * @param eth_da
+ *   Device devargs to probe.
+ * @param repr_id
+ *   Encoded representor ID.
+ *
+ * @return
+ *   Match result.
+ */
+static bool
+mlx5_representor_match(struct mlx5_dev_spawn_data *spawn,
+		       struct rte_eth_devargs *eth_da,
+		       uint16_t repr_id)
+{
+	const struct mlx5_switch_info *switch_info = &spawn->info;
+	unsigned int c, p, f;
+	uint16_t repr;
+
+	switch (eth_da->type) {
+	case RTE_ETH_REPRESENTOR_SF:
+		if (switch_info->name_type != MLX5_PHYS_PORT_NAME_TYPE_PFSF) {
+			rte_errno = EBUSY;
+			return false;
+		}
+		break;
+	case RTE_ETH_REPRESENTOR_VF:
+		/* Allows HPF representor index -1 as exception. */
+		if (!(spawn->info.port_name == -1 &&
+		      switch_info->name_type ==
+				MLX5_PHYS_PORT_NAME_TYPE_PFHPF) &&
+		    switch_info->name_type != MLX5_PHYS_PORT_NAME_TYPE_PFVF) {
+			rte_errno = EBUSY;
+			return false;
+		}
+		break;
+	case RTE_ETH_REPRESENTOR_NONE:
+		rte_errno = EBUSY;
+		return false;
+	default:
+		rte_errno = ENOTSUP;
+		DRV_LOG(ERR, "unsupported representor type");
+		return false;
+	}
+	/* Check representor ID: */
+	for (c = 0; c < eth_da->nb_mh_controllers; ++c) {
+		for (p = 0; p < eth_da->nb_ports; ++p) {
+			for (f = 0; f < eth_da->nb_representor_ports; ++f) {
+				repr = rte_eth_representor_id_encode(
+					eth_da->mh_controllers[c],
+					eth_da->ports[p],
+					eth_da->type,
+					eth_da->representor_ports[f]);
+				if (repr_id == repr)
+					return true;
+			}
+		}
+	}
+	rte_errno = EBUSY;
+	return false;
+}
+
+
 /**
  * Spawn an Ethernet device from Verbs information.
  *
@@ -676,6 +742,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	struct mlx5dv_context dv_attr = { .comp_mask = 0 };
 	struct rte_eth_dev *eth_dev = NULL;
 	struct mlx5_priv *priv = NULL;
+	uint16_t repr_id = RTE_NO_REPRESENTOR_ID;
 	int err = 0;
 	unsigned int hw_padding = 0;
 	unsigned int mps;
@@ -692,115 +759,52 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	char name[RTE_ETH_NAME_MAX_LEN];
 	int own_domain_id = 0;
 	uint16_t port_id;
-	unsigned int i;
 #ifdef HAVE_MLX5DV_DR_DEVX_PORT
 	struct mlx5dv_devx_port devx_port = { .comp_mask = 0 };
 #endif
 
+	if (switch_info->representor)
+		repr_id = rte_eth_representor_id_encode(
+			switch_info->ctrl_num,
+			spawn->pf_bond >= 0 ? switch_info->pf_num : 0,
+			switch_info->name_type ==
+				MLX5_PHYS_PORT_NAME_TYPE_PFSF ?
+				RTE_ETH_REPRESENTOR_SF : RTE_ETH_REPRESENTOR_VF,
+			switch_info->port_name);
 	/* Determine if this port representor is supposed to be spawned. */
-	if (switch_info->representor && dpdk_dev->devargs) {
-		switch (eth_da->type) {
-		case RTE_ETH_REPRESENTOR_SF:
-			if (switch_info->name_type !=
-					MLX5_PHYS_PORT_NAME_TYPE_PFSF) {
-				rte_errno = EBUSY;
-				return NULL;
-			}
-			break;
-		case RTE_ETH_REPRESENTOR_VF:
-			/* Allows HPF representor index -1 as exception. */
-			if (!(spawn->info.port_name == -1 &&
-			      switch_info->name_type ==
-					MLX5_PHYS_PORT_NAME_TYPE_PFHPF) &&
-			    switch_info->name_type !=
-					MLX5_PHYS_PORT_NAME_TYPE_PFVF) {
-				rte_errno = EBUSY;
-				return NULL;
-			}
-			break;
-		case RTE_ETH_REPRESENTOR_NONE:
-			rte_errno = EBUSY;
-			return NULL;
-			break;
-		default:
-			rte_errno = ENOTSUP;
-			DRV_LOG(ERR, "unsupported representor type: %s",
-				dpdk_dev->devargs->args);
-			return NULL;
-		}
-		/* Check controller ID: */
-		for (i = 0; i < eth_da->nb_mh_controllers; ++i)
-			if (eth_da->mh_controllers[i] ==
-			    (uint16_t)switch_info->ctrl_num)
-				break;
-		if (eth_da->nb_mh_controllers &&
-		    i == eth_da->nb_mh_controllers) {
-			rte_errno = EBUSY;
-			return NULL;
-		}
-		/* Check SF/VF ID: */
-		for (i = 0; i < eth_da->nb_representor_ports; ++i)
-			if (eth_da->representor_ports[i] ==
-			    (uint16_t)switch_info->port_name)
-				break;
-		if (eth_da->type != RTE_ETH_REPRESENTOR_PF &&
-		    i == eth_da->nb_representor_ports) {
-			rte_errno = EBUSY;
-			return NULL;
-		}
-		/* Check PF ID. Check after repr port to avoid warning flood. */
-		if (spawn->pf_bond >= 0) {
-			for (i = 0; i < eth_da->nb_ports; ++i)
-				if (eth_da->ports[i] ==
-				    (uint16_t)switch_info->pf_num)
-					break;
-			if (eth_da->nb_ports && i == eth_da->nb_ports) {
-				/* For backward compatibility, bonding
-				 * representor syntax supported with limitation,
-				 * device iterator won't find it:
-				 *    <PF1_BDF>,representor=#
-				 */
-				if (switch_info->pf_num > 0 &&
-				    eth_da->ports[0] == 0) {
-					DRV_LOG(WARNING, "Representor on Bonding PF should use pf#vf# format: %s",
-						dpdk_dev->devargs->args);
-				} else {
-					rte_errno = EBUSY;
-					return NULL;
-				}
-			}
-		} else if (eth_da->nb_ports > 1 || eth_da->ports[0]) {
-			rte_errno = EINVAL;
-			DRV_LOG(ERR, "PF id not supported by non-bond device: %s",
-				dpdk_dev->devargs->args);
-			return NULL;
-		}
-	}
+	if (switch_info->representor && dpdk_dev->devargs &&
+	    !mlx5_representor_match(spawn, eth_da, repr_id))
+		return NULL;
 	/* Build device name. */
 	if (spawn->pf_bond <  0) {
 		/* Single device. */
 		if (!switch_info->representor)
 			strlcpy(name, dpdk_dev->name, sizeof(name));
 		else
-			snprintf(name, sizeof(name), "%s_representor_%s%u",
+			err = snprintf(name, sizeof(name), "%s_representor_%s%u",
 				 dpdk_dev->name,
 				 switch_info->name_type ==
 				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
 				 switch_info->port_name);
 	} else {
 		/* Bonding device. */
-		if (!switch_info->representor)
-			snprintf(name, sizeof(name), "%s_%s",
+		if (!switch_info->representor) {
+			err = snprintf(name, sizeof(name), "%s_%s",
 				 dpdk_dev->name,
 				 mlx5_os_get_dev_device_name(spawn->phys_dev));
-		else
-			snprintf(name, sizeof(name), "%s_%s_representor_%s%u",
-				 dpdk_dev->name,
-				 mlx5_os_get_dev_device_name(spawn->phys_dev),
-				 switch_info->name_type ==
-				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
-				 switch_info->port_name);
+		} else {
+			err = snprintf(name, sizeof(name), "%s_%s_representor_c%dpf%d%s%u",
+				dpdk_dev->name,
+				mlx5_os_get_dev_device_name(spawn->phys_dev),
+				switch_info->ctrl_num,
+				switch_info->pf_num,
+				switch_info->name_type ==
+				MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
+				switch_info->port_name);
+		}
 	}
+	if (err >= (int)sizeof(name))
+		DRV_LOG(WARNING, "device name overflow %s", name);
 	/* check if the device is already spawned */
 	if (rte_eth_dev_get_port_by_name(name, &port_id) == 0) {
 		rte_errno = EEXIST;
@@ -1073,11 +1077,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	priv->vport_id = switch_info->representor ?
 			 switch_info->port_name + 1 : -1;
 #endif
-	/* representor_id field keeps the unmodified VF index. */
-	priv->representor_id = switch_info->representor ?
-		rte_eth_representor_id_encode(0, 0, RTE_ETH_REPRESENTOR_VF,
-					      switch_info->port_name) :
-		-1;
+	priv->representor_id = repr_id;
 	/*
 	 * Look for sibling devices in order to reuse their switch domain
 	 * if any, otherwise allocate one.
@@ -1692,9 +1692,11 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
  * @param[in] ibv_dev
  *   Pointer to Infiniband device structure.
  * @param[in] pci_dev
- *   Pointer to PCI device structure to match PCI address.
+ *   Pointer to primary PCI address structure to match.
  * @param[in] nl_rdma
  *   Netlink RDMA group socket handle.
+ * @param[in] owner
+ *   Representor owner PF index.
  *
  * @return
  *   negative value if no bonding device found, otherwise
@@ -1702,8 +1704,8 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
  */
 static int
 mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
-			   const struct rte_pci_device *pci_dev,
-			   int nl_rdma)
+			   const struct rte_pci_addr *pci_dev,
+			   int nl_rdma, uint16_t owner)
 {
 	char ifname[IF_NAMESIZE + 1];
 	unsigned int ifindex;
@@ -1760,10 +1762,10 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 					 " for netdev \"%s\"", ifname);
 			continue;
 		}
-		if (pci_dev->addr.domain != pci_addr.domain ||
-		    pci_dev->addr.bus != pci_addr.bus ||
-		    pci_dev->addr.devid != pci_addr.devid ||
-		    pci_dev->addr.function != pci_addr.function)
+		if (pci_dev->domain != pci_addr.domain ||
+		    pci_dev->bus != pci_addr.bus ||
+		    pci_dev->devid != pci_addr.devid ||
+		    pci_dev->function + owner != pci_addr.function)
 			continue;
 		/* Slave interface PCI address match found. */
 		fclose(file);
@@ -1831,7 +1833,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_config dev_config;
 	unsigned int dev_config_vf;
 	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
-	int ret;
+	struct rte_pci_addr owner_pci = pci_dev->addr; /* Owner PF. */
+	int ret = -1;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
 		mlx5_pmd_socket_init();
@@ -1883,7 +1886,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 
 		DRV_LOG(DEBUG, "checking device \"%s\"", ibv_list[ret]->name);
 		bd = mlx5_device_bond_pci_match
-				(ibv_list[ret], pci_dev, nl_rdma);
+				(ibv_list[ret], &owner_pci, nl_rdma,
+				 eth_da.ports[0]);
 		if (bd >= 0) {
 			/*
 			 * Bonding device detected. Only one match is allowed,
@@ -1900,23 +1904,28 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 				ret = -rte_errno;
 				goto exit;
 			}
+			/* Amend owner pci address if owner PF ID specified. */
+			if (eth_da.nb_representor_ports)
+				owner_pci.function += eth_da.ports[0];
 			DRV_LOG(INFO, "PCI information matches for"
 				      " slave %d bonding device \"%s\"",
 				      bd, ibv_list[ret]->name);
 			ibv_match[nd++] = ibv_list[ret];
 			break;
+		} else {
+			/* Bonding device not found. */
+			if (mlx5_dev_to_pci_addr
+				(ibv_list[ret]->ibdev_path, &pci_addr))
+				continue;
+			if (owner_pci.domain != pci_addr.domain ||
+			    owner_pci.bus != pci_addr.bus ||
+			    owner_pci.devid != pci_addr.devid ||
+			    owner_pci.function != pci_addr.function)
+				continue;
+			DRV_LOG(INFO, "PCI information matches for device \"%s\"",
+				ibv_list[ret]->name);
+			ibv_match[nd++] = ibv_list[ret];
 		}
-		if (mlx5_dev_to_pci_addr
-			(ibv_list[ret]->ibdev_path, &pci_addr))
-			continue;
-		if (pci_dev->addr.domain != pci_addr.domain ||
-		    pci_dev->addr.bus != pci_addr.bus ||
-		    pci_dev->addr.devid != pci_addr.devid ||
-		    pci_dev->addr.function != pci_addr.function)
-			continue;
-		DRV_LOG(INFO, "PCI information matches for device \"%s\"",
-			ibv_list[ret]->name);
-		ibv_match[nd++] = ibv_list[ret];
 	}
 	ibv_match[nd] = NULL;
 	if (!nd) {
@@ -1924,8 +1933,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		DRV_LOG(WARNING,
 			"no Verbs device matches PCI device " PCI_PRI_FMT ","
 			" are kernel drivers loaded?",
-			pci_dev->addr.domain, pci_dev->addr.bus,
-			pci_dev->addr.devid, pci_dev->addr.function);
+			owner_pci.domain, owner_pci.bus,
+			owner_pci.devid, owner_pci.function);
 		rte_errno = ENOENT;
 		ret = -rte_errno;
 		goto exit;
@@ -2190,6 +2199,24 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		dev_config_vf = 0;
 		break;
 	}
+	if (eth_da.type != RTE_ETH_REPRESENTOR_NONE) {
+		/* Set devargs default values. */
+		if (eth_da.nb_mh_controllers == 0) {
+			eth_da.nb_mh_controllers = 1;
+			eth_da.mh_controllers[0] = 0;
+		}
+		if (eth_da.nb_ports == 0 && ns > 0) {
+			if (list[0].pf_bond >= 0 && list[0].info.representor)
+				DRV_LOG(WARNING, "Representor on Bonding device should use pf#vf# syntax: %s",
+					pci_dev->device.devargs->args);
+			eth_da.nb_ports = 1;
+			eth_da.ports[0] = list[0].info.pf_num;
+		}
+		if (eth_da.nb_representor_ports == 0) {
+			eth_da.nb_representor_ports = 1;
+			eth_da.representor_ports[0] = 0;
+		}
+	}
 	for (i = 0; i != ns; ++i) {
 		uint32_t restore;
 
@@ -2231,8 +2258,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		DRV_LOG(ERR,
 			"probe of PCI device " PCI_PRI_FMT " aborted after"
 			" encountering an error: %s",
-			pci_dev->addr.domain, pci_dev->addr.bus,
-			pci_dev->addr.devid, pci_dev->addr.function,
+			owner_pci.domain, owner_pci.bus,
+			owner_pci.devid, owner_pci.function,
 			strerror(rte_errno));
 		ret = -rte_errno;
 		/* Roll back. */
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 5e8cd6a3df..d613ffd655 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -355,6 +355,28 @@ static const struct mlx5_indexed_pool_config mlx5_ipool_cfg[] = {
 
 #define MLX5_FLOW_TABLE_HLIST_ARRAY_SIZE 4096
 
+/**
+ * Decide whether representor ID is a HPF(host PF) port on BF2.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   Non-zero if HPF, otherwise 0.
+ */
+int
+mlx5_is_hpf(struct rte_eth_dev *dev)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	enum rte_eth_representor_type type;
+	uint16_t port;
+
+	port = rte_eth_representor_id_parse(priv->representor_id,
+					    NULL, NULL, &type);
+	return priv->representor && type == RTE_ETH_REPRESENTOR_VF &&
+	       port == rte_eth_representor_id_parse(-1, NULL, NULL, NULL);
+}
+
 /**
  * Initialize the ASO aging management structure.
  *
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e06e0ff3bb..e7afa438ce 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -915,7 +915,7 @@ struct mlx5_priv {
 	uint16_t vport_id; /* Associated VF vport index (if any). */
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
 	uint32_t vport_meta_mask; /* Used for vport index field match mask. */
-	int32_t representor_id; /* Port representor identifier. */
+	int32_t representor_id; /* RTE_ETH_REPR(), -1 if not a representor. */
 	int32_t pf_bond; /* >=0 means PF index in bonding configuration. */
 	unsigned int if_index; /* Associated kernel network device index. */
 	uint32_t bond_ifindex; /**< Bond interface index. */
@@ -988,6 +988,7 @@ int mlx5_udp_tunnel_port_add(struct rte_eth_dev *dev,
 			      struct rte_eth_udp_tunnel *udp_tunnel);
 uint16_t mlx5_eth_find_next(uint16_t port_id, struct rte_pci_device *pci_dev);
 int mlx5_dev_close(struct rte_eth_dev *dev);
+int mlx5_is_hpf(struct rte_eth_dev *dev);
 void mlx5_age_event_prepare(struct mlx5_dev_ctx_shared *sh);
 
 /* Macro to iterate over all valid ports for mlx5 driver. */
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 85a0979653..4648196550 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -48,10 +48,6 @@
 #define MLX5_PMD_SOFT_COUNTERS 1
 #endif
 
-/* Switch port ID parameters for bonding configurations. */
-#define MLX5_PORT_ID_BONDING_PF_MASK 0xf
-#define MLX5_PORT_ID_BONDING_PF_SHIFT 12
-
 /* Alarm timeout. */
 #define MLX5_ALARM_TIMEOUT_US 100000
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index ad6aacc329..5341eb16c9 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -330,33 +330,6 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 	if (priv->representor) {
 		uint16_t port_id;
 
-		if (priv->pf_bond >= 0) {
-			/*
-			 * Switch port ID is opaque value with driver defined
-			 * format. Push the PF index in bonding configurations
-			 * in upper four bits of port ID. If we get too many
-			 * representors (more than 4K) or PFs (more than 15)
-			 * this approach must be reconsidered.
-			 */
-			/* Switch port ID for VF representors: 0 - 0xFFE */
-			if ((info->switch_info.port_id != 0xffff &&
-				info->switch_info.port_id >=
-				((1 << MLX5_PORT_ID_BONDING_PF_SHIFT) - 1)) ||
-			    priv->pf_bond > MLX5_PORT_ID_BONDING_PF_MASK) {
-				DRV_LOG(ERR, "can't update switch port ID"
-					     " for bonding device");
-				MLX5_ASSERT(false);
-				return -ENODEV;
-			}
-			/*
-			 * Switch port ID for Host PF representor
-			 * (representor_id is -1) , set to 0xFFF
-			 */
-			if (info->switch_info.port_id == 0xffff)
-				info->switch_info.port_id = 0xfff;
-			info->switch_info.port_id |=
-				priv->pf_bond << MLX5_PORT_ID_BONDING_PF_SHIFT;
-		}
 		MLX5_ETH_FOREACH_DEV(port_id, priv->pci_dev) {
 			struct mlx5_priv *opriv =
 				rte_eth_devices[port_id].data->dev_private;
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index bd786fd638..b5b810b508 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -159,7 +159,7 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 	 * Configuring the VF instead of its representor,
 	 * need to skip the special case of HPF on Bluefield.
 	 */
-	if (priv->representor && priv->representor_id >= 0) {
+	if (priv->representor && !mlx5_is_hpf(dev)) {
 		DRV_LOG(DEBUG, "VF represented by port %u setting primary MAC address",
 			dev->data->port_id);
 		RTE_ETH_FOREACH_DEV_SIBLING(port_id, dev->data->port_id) {
@@ -169,7 +169,11 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 				return mlx5_os_vf_mac_addr_modify
 				       (priv,
 					mlx5_ifindex(&rte_eth_devices[port_id]),
-					mac_addr, priv->representor_id);
+					mac_addr,
+					rte_eth_representor_id_parse(
+							priv->representor_id,
+							NULL, NULL, NULL)
+					);
 			}
 		}
 		rte_errno = -ENOTSUP;
-- 
2.25.1


* [dpdk-dev] [PATCH v4 5/8] net/mlx5: support representor from multiple PFs
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (34 preceding siblings ...)
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 4/8] net/mlx5: refactor bonding representor probe Xueming Li
@ 2021-01-19  7:28   ` Xueming Li
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 6/8] net/mlx5: save bonding member ports information Xueming Li
                     ` (30 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-19  7:28 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, xuemingl, Asaf Penso

To probe representors from different kernel bonding PFs, two separate
devargs had to be specified:
    -a 03:00.0,representor=pf0vf[0-3] -a 03:00.0,representor=pf1vf[0-3]

This patch supports a range or list in the PF section of devargs, so
the devargs above can be shortened to:
    -a 03:00.0,representor=pf[0-1]vf[0-3]
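
The dispatch implied by the short form can be sketched as below. This is
only an illustration, not the driver code: struct demo_eth_da,
demo_probe_pf_t, demo_probe_all_pfs() and demo_record_pf() are
hypothetical names standing in for the PF part of struct rte_eth_devargs
and for the per-PF probe routine. Stopping at the first failure is an
assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 32

/* Hypothetical stand-in for the PF part of struct rte_eth_devargs. */
struct demo_eth_da {
	uint16_t ports[DEMO_MAX_PORTS]; /* PF indexes taken from "pf[...]". */
	uint16_t nb_ports;              /* 0 when no PF section was given. */
};

/* Hypothetical per-PF probe callback: returns 0 or a negative errno. */
typedef int (*demo_probe_pf_t)(uint16_t owner_id, void *ctx);

/* Call the probe once per requested PF, or once with PF 0 by default. */
static int
demo_probe_all_pfs(const struct demo_eth_da *da, demo_probe_pf_t probe,
		   void *ctx)
{
	uint16_t p;
	int ret;

	assert(da != NULL && probe != NULL);
	if (da->nb_ports == 0)
		return probe(0, ctx); /* No PF section: owner defaults to 0. */
	for (p = 0; p < da->nb_ports; p++) {
		ret = probe(da->ports[p], ctx);
		if (ret != 0)
			return ret; /* Assumption: stop at the first failure. */
	}
	return 0;
}

/* Example callback: records each requested owner PF in a bitmask. */
static int
demo_record_pf(uint16_t owner_id, void *ctx)
{
	uint32_t *mask = ctx;

	if (owner_id >= 32)
		return -EINVAL;
	*mask |= UINT32_C(1) << owner_id;
	return 0;
}
```

With ports = {0, 1} the example callback is invoked for PF 0 and PF 1;
with nb_ports == 0 it is invoked once for PF 0.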

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst         |   4 ++
 drivers/net/mlx5/linux/mlx5_os.c | 100 +++++++++++++++++++++----------
 2 files changed, 72 insertions(+), 32 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index eaca4fc058..480c9d3fc1 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -884,6 +884,10 @@ Driver options
 
     <PCI_BDF>,representor=sf[0-2]
 
+  To probe VF port representors 0 through 2 on both PFs of a bonding device::
+
+    <Primary_PCI_BDF>,representor=pf[0,1]vf[0-2]
+
 - ``max_dump_files_num`` parameter [int]
 
   The maximum number of files per PMD entity that may be created for debug information.
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 9ae5910f46..521a0a5789 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1788,21 +1788,25 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 }
 
 /**
- * DPDK callback to register a PCI device.
+ * Register a PCI device within bonding.
  *
- * This function spawns Ethernet devices out of a given PCI device.
+ * This function spawns Ethernet devices out of a given PCI device and
+ * bonding owner PF index.
  *
- * @param[in] pci_drv
- *   PCI driver structure (mlx5_driver).
  * @param[in] pci_dev
  *   PCI device information.
+ * @param[in] req_eth_da
+ *   Requested ethdev device argument.
+ * @param[in] owner_id
+ *   Requested owner PF port ID within bonding device, default to 0.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-int
-mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
-		  struct rte_pci_device *pci_dev)
+static int
+mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
+		     struct rte_eth_devargs *req_eth_da,
+		     uint16_t owner_id)
 {
 	struct ibv_device **ibv_list;
 	/*
@@ -1832,7 +1836,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_spawn_data *list = NULL;
 	struct mlx5_dev_config dev_config;
 	unsigned int dev_config_vf;
-	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
+	struct rte_eth_devargs eth_da = *req_eth_da;
 	struct rte_pci_addr owner_pci = pci_dev->addr; /* Owner PF. */
 	int ret = -1;
 
@@ -1844,27 +1848,6 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			strerror(rte_errno));
 		return -rte_errno;
 	}
-	if (pci_dev->device.devargs) {
-		/* Parse representor information from device argument. */
-		if (pci_dev->device.devargs->cls_str)
-			ret = rte_eth_devargs_parse(
-				pci_dev->device.devargs->cls_str, &eth_da);
-		if (ret) {
-			DRV_LOG(ERR, "failed to parse device arguments: %s",
-				pci_dev->device.devargs->cls_str);
-			return -rte_errno;
-		}
-		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
-			/* Support legacy device argument */
-			ret = rte_eth_devargs_parse(
-				pci_dev->device.devargs->args, &eth_da);
-			if (ret) {
-				DRV_LOG(ERR, "failed to parse device arguments: %s",
-					pci_dev->device.devargs->args);
-				return -rte_errno;
-			}
-		}
-	}
 	errno = 0;
 	ibv_list = mlx5_glue->get_device_list(&ret);
 	if (!ibv_list) {
@@ -1886,8 +1869,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 
 		DRV_LOG(DEBUG, "checking device \"%s\"", ibv_list[ret]->name);
 		bd = mlx5_device_bond_pci_match
-				(ibv_list[ret], &owner_pci, nl_rdma,
-				 eth_da.ports[0]);
+				(ibv_list[ret], &owner_pci, nl_rdma, owner_id);
 		if (bd >= 0) {
 			/*
 			 * Bonding device detected. Only one match is allowed,
@@ -1906,7 +1888,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			}
 			/* Amend owner pci address if owner PF ID specified. */
 			if (eth_da.nb_representor_ports)
-				owner_pci.function += eth_da.ports[0];
+				owner_pci.function += owner_id;
 			DRV_LOG(INFO, "PCI information matches for"
 				      " slave %d bonding device \"%s\"",
 				      bd, ibv_list[ret]->name);
@@ -2294,6 +2276,60 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	return ret;
 }
 
+/**
+ * DPDK callback to register a PCI device.
+ *
+ * This function spawns Ethernet devices out of a given PCI device.
+ *
+ * @param[in] pci_drv
+ *   PCI driver structure (mlx5_driver).
+ * @param[in] pci_dev
+ *   PCI device information.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
+		  struct rte_pci_device *pci_dev)
+{
+	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
+	int ret = 0;
+	uint16_t p;
+
+	if (pci_dev->device.devargs) {
+		/* Parse representor information from device argument. */
+		if (pci_dev->device.devargs->cls_str)
+			ret = rte_eth_devargs_parse(
+				pci_dev->device.devargs->cls_str, &eth_da);
+		if (ret) {
+			DRV_LOG(ERR, "failed to parse device arguments: %s",
+				pci_dev->device.devargs->cls_str);
+			return -rte_errno;
+		}
+		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
+			/* Support legacy device argument */
+			ret = rte_eth_devargs_parse(
+				pci_dev->device.devargs->args, &eth_da);
+			if (ret) {
+				DRV_LOG(ERR, "failed to parse device arguments: %s",
+					pci_dev->device.devargs->args);
+				return -rte_errno;
+			}
+		}
+	}
+
+	if (eth_da.nb_ports > 0) {
+		/* Probe every PF if devargs pf is a range: "pf[0-1]vf[...]". */
+		for (p = 0; p < eth_da.nb_ports; p++)
+			ret = mlx5_os_pci_probe_pf(pci_dev, &eth_da,
+						   eth_da.ports[p]);
+	} else {
+		ret = mlx5_os_pci_probe_pf(pci_dev, &eth_da, 0);
+	}
+	return ret;
+}
+
 static int
 mlx5_config_doorbell_mapping_env(const struct mlx5_dev_config *config)
 {
-- 
2.25.1


* [dpdk-dev] [PATCH v4 6/8] net/mlx5: save bonding member ports information
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (35 preceding siblings ...)
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 5/8] net/mlx5: support representor from multiple PFs Xueming Li
@ 2021-01-19  7:28   ` Xueming Li
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 7/8] net/mlx5: fix setting VF default MAC through representor Xueming Li
                     ` (29 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-19  7:28 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, xuemingl, Asaf Penso, Anatoly Burakov

Since the kernel bonding netdev doesn't provide a statistics counter that
reflects all member ports, the PMD has to sum the counters of each
member port manually.

As preparation, this patch collects bonding member port information
and saves it to the shared context data.
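
As a rough sketch of the collection step: the member names are read as
whitespace-separated tokens, as the fscanf() loop in the patch suggests,
and stored in a bounded table. The code below parses from a string
rather than from sysfs so that it stays self-contained; DEMO_IFNAMSIZ,
DEMO_MAX_MEMBERS, struct demo_members and demo_parse_slaves() are
hypothetical names, not part of the driver.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_IFNAMSIZ 16
#define DEMO_MAX_MEMBERS 2

/* Hypothetical table of bonding member interface names. */
struct demo_members {
	int n;
	char ifname[DEMO_MAX_MEMBERS][DEMO_IFNAMSIZ];
};

/*
 * Split a whitespace-separated list of interface names.
 * Returns the number of members, or -1 if the table is too small.
 */
static int
demo_parse_slaves(const char *line, struct demo_members *out)
{
	char name[DEMO_IFNAMSIZ];
	int used = 0;

	assert(line != NULL && out != NULL);
	memset(out, 0, sizeof(*out));
	/* "%15s" bounds each token to DEMO_IFNAMSIZ - 1 characters. */
	while (sscanf(line, "%15s%n", name, &used) == 1) {
		if (out->n == DEMO_MAX_MEMBERS)
			return -1; /* More members than the table can hold. */
		strcpy(out->ifname[out->n], name);
		out->n++;
		line += used; /* Continue after the consumed token. */
	}
	return out->n;
}
```

Rejecting an oversized list, instead of silently truncating it, mirrors
the range check against MLX5_BOND_MAX_PORTS in the patch.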

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_ethdev_os.c |  4 +-
 drivers/net/mlx5/linux/mlx5_os.c        | 91 ++++++++++++++++---------
 drivers/net/mlx5/mlx5.c                 |  2 +
 drivers/net/mlx5/mlx5.h                 | 21 +++++-
 drivers/net/mlx5/mlx5_ethdev.c          |  5 +-
 5 files changed, 86 insertions(+), 37 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index ac311de46d..84610a7bc0 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -150,8 +150,8 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[MLX5_NAMESIZE])
 
 	MLX5_ASSERT(priv);
 	MLX5_ASSERT(priv->sh);
-	if (priv->bond_ifindex > 0) {
-		memcpy(ifname, priv->bond_name, MLX5_NAMESIZE);
+	if (priv->master && priv->sh->bond.ifindex > 0) {
+		memcpy(ifname, priv->sh->bond.ifname, MLX5_NAMESIZE);
 		return 0;
 	}
 	ifindex = mlx5_ifindex(dev);
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 521a0a5789..47a7c3dff0 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1417,19 +1417,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	 */
 	MLX5_ASSERT(spawn->ifindex);
 	priv->if_index = spawn->ifindex;
-	if (priv->pf_bond >= 0 && priv->master) {
-		/* Get bond interface info */
-		err = mlx5_sysfs_bond_info(priv->if_index,
-				     &priv->bond_ifindex,
-				     priv->bond_name);
-		if (err)
-			DRV_LOG(ERR, "unable to get bond info: %s",
-				strerror(rte_errno));
-		else
-			DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
-				priv->if_index, priv->bond_ifindex,
-				priv->bond_name);
-	}
 	eth_dev->data->dev_private = priv;
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
@@ -1697,6 +1684,8 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
  *   Netlink RDMA group socket handle.
  * @param[in] owner
  *   Rerepsentor owner PF index.
+ * @param[out] bond_info
+ *   Pointer to bonding information.
  *
  * @return
  *   negative value if no bonding device found, otherwise
@@ -1705,19 +1694,22 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
 static int
 mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 			   const struct rte_pci_addr *pci_dev,
-			   int nl_rdma, uint16_t owner)
+			   int nl_rdma, uint16_t owner,
+			   struct mlx5_bond_info *bond_info)
 {
 	char ifname[IF_NAMESIZE + 1];
 	unsigned int ifindex;
 	unsigned int np, i;
-	FILE *file = NULL;
+	FILE *bond_file = NULL, *file;
 	int pf = -1;
+	int ret;
 
 	/*
 	 * Try to get master device name. If something goes
 	 * wrong suppose the lack of kernel support and no
 	 * bonding devices.
 	 */
+	memset(bond_info, 0, sizeof(*bond_info));
 	if (nl_rdma < 0)
 		return -1;
 	if (!strstr(ibv_dev->name, "bond"))
@@ -1741,15 +1733,15 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		/* Try to read bonding slave names from sysfs. */
 		MKSTR(slaves,
 		      "/sys/class/net/%s/master/bonding/slaves", ifname);
-		file = fopen(slaves, "r");
-		if (file)
+		bond_file = fopen(slaves, "r");
+		if (bond_file)
 			break;
 	}
-	if (!file)
+	if (!bond_file)
 		return -1;
 	/* Use safe format to check maximal buffer length. */
 	MLX5_ASSERT(atol(RTE_STR(IF_NAMESIZE)) == IF_NAMESIZE);
-	while (fscanf(file, "%" RTE_STR(IF_NAMESIZE) "s", ifname) == 1) {
+	while (fscanf(bond_file, "%" RTE_STR(IF_NAMESIZE) "s", ifname) == 1) {
 		char tmp_str[IF_NAMESIZE + 32];
 		struct rte_pci_addr pci_addr;
 		struct mlx5_switch_info	info;
@@ -1762,13 +1754,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 					 " for netdev \"%s\"", ifname);
 			continue;
 		}
-		if (pci_dev->domain != pci_addr.domain ||
-		    pci_dev->bus != pci_addr.bus ||
-		    pci_dev->devid != pci_addr.devid ||
-		    pci_dev->function + owner != pci_addr.function)
-			continue;
 		/* Slave interface PCI address match found. */
-		fclose(file);
 		snprintf(tmp_str, sizeof(tmp_str),
 			 "/sys/class/net/%s/phys_port_name", ifname);
 		file = fopen(tmp_str, "rb");
@@ -1777,13 +1763,52 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		info.name_type = MLX5_PHYS_PORT_NAME_TYPE_NOTSET;
 		if (fscanf(file, "%32s", tmp_str) == 1)
 			mlx5_translate_port_name(tmp_str, &info);
-		if (info.name_type == MLX5_PHYS_PORT_NAME_TYPE_LEGACY ||
-		    info.name_type == MLX5_PHYS_PORT_NAME_TYPE_UPLINK)
+		fclose(file);
+		/* Only process PF ports. */
+		if (info.name_type != MLX5_PHYS_PORT_NAME_TYPE_LEGACY &&
+		    info.name_type != MLX5_PHYS_PORT_NAME_TYPE_UPLINK)
+			continue;
+		/* Check max bonding member. */
+		if (info.port_name >= MLX5_BOND_MAX_PORTS) {
+			DRV_LOG(WARNING, "bonding index out of range, "
+				"please increase MLX5_BOND_MAX_PORTS: %s",
+				tmp_str);
+			break;
+		}
+		/* Match PCI address. */
+		if (pci_dev->domain == pci_addr.domain &&
+		    pci_dev->bus == pci_addr.bus &&
+		    pci_dev->devid == pci_addr.devid &&
+		    pci_dev->function + owner == pci_addr.function)
 			pf = info.port_name;
-		break;
-	}
-	if (file)
+		/* Get ifindex. */
+		snprintf(tmp_str, sizeof(tmp_str),
+			 "/sys/class/net/%s/ifindex", ifname);
+		file = fopen(tmp_str, "rb");
+		if (!file)
+			break;
+		ret = fscanf(file, "%u", &ifindex);
 		fclose(file);
+		if (ret != 1)
+			break;
+		/* Save bonding info. */
+		strncpy(bond_info->ports[info.port_name].ifname, ifname,
+			sizeof(bond_info->ports[0].ifname));
+		bond_info->ports[info.port_name].pci_addr = pci_addr;
+		bond_info->ports[info.port_name].ifindex = ifindex;
+		bond_info->n_port++;
+	}
+	if (pf >= 0) {
+		/* Get bond interface info */
+		ret = mlx5_sysfs_bond_info(ifindex, &bond_info->ifindex,
+					   bond_info->ifname);
+		if (ret)
+			DRV_LOG(ERR, "unable to get bond info: %s",
+				strerror(rte_errno));
+		else
+			DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
+				ifindex, bond_info->ifindex, bond_info->ifname);
+	}
 	return pf;
 }
 
@@ -1838,6 +1863,7 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 	unsigned int dev_config_vf;
 	struct rte_eth_devargs eth_da = *req_eth_da;
 	struct rte_pci_addr owner_pci = pci_dev->addr; /* Owner PF. */
+	struct mlx5_bond_info bond_info;
 	int ret = -1;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
@@ -1869,7 +1895,8 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 
 		DRV_LOG(DEBUG, "checking device \"%s\"", ibv_list[ret]->name);
 		bd = mlx5_device_bond_pci_match
-				(ibv_list[ret], &owner_pci, nl_rdma, owner_id);
+				(ibv_list[ret], &owner_pci, nl_rdma, owner_id,
+				 &bond_info);
 		if (bd >= 0) {
 			/*
 			 * Bonding device detected. Only one match is allowed,
@@ -1978,6 +2005,7 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 		MLX5_ASSERT(nd == 1);
 		MLX5_ASSERT(np);
 		for (i = 1; i <= np; ++i) {
+			list[ns].bond_info = &bond_info;
 			list[ns].max_port = np;
 			list[ns].phys_port = i;
 			list[ns].phys_dev = ibv_match[0];
@@ -2068,6 +2096,7 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 		 */
 		for (i = 0; i != nd; ++i) {
 			memset(&list[ns].info, 0, sizeof(list[ns].info));
+			list[ns].bond_info = NULL;
 			list[ns].max_port = 1;
 			list[ns].phys_port = 1;
 			list[ns].phys_dev = ibv_match[i];
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index d613ffd655..e170db948d 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -927,6 +927,8 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn,
 		rte_errno  = ENOMEM;
 		goto exit;
 	}
+	if (spawn->bond_info)
+		sh->bond = *spawn->bond_info;
 	err = mlx5_os_open_device(spawn, config, sh);
 	if (!sh->ctx)
 		goto error;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e7afa438ce..c15af1d794 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -115,6 +115,7 @@ struct mlx5_dev_spawn_data {
 	void *phys_dev; /**< Associated physical device. */
 	struct rte_eth_dev *eth_dev; /**< Associated Ethernet device. */
 	struct rte_pci_device *pci_dev; /**< Backend PCI device. */
+	struct mlx5_bond_info *bond_info;
 };
 
 /** Key string for IPC. */
@@ -661,6 +662,21 @@ struct mlx5_flex_parser_profiles {
 	void *obj;		/* Flex parser node object. */
 };
 
+/* Max member ports per bonding device. */
+#define MLX5_BOND_MAX_PORTS 2
+
+/* Bonding device information. */
+struct mlx5_bond_info {
+	int n_port; /* Number of bond member ports. */
+	uint32_t ifindex;
+	char ifname[MLX5_NAMESIZE + 1];
+	struct {
+		char ifname[MLX5_NAMESIZE + 1];
+		uint32_t ifindex;
+		struct rte_pci_addr pci_addr;
+	} ports[MLX5_BOND_MAX_PORTS];
+};
+
 /*
  * Shared Infiniband device context for Master/Representors
  * which belong to same IB device with multiple IB ports.
@@ -671,6 +687,7 @@ struct mlx5_dev_ctx_shared {
 	uint32_t devx:1; /* Opened with DV. */
 	uint32_t flow_hit_aso_en:1; /* Flow Hit ASO is supported. */
 	uint32_t max_port; /* Maximal IB device port index. */
+	struct mlx5_bond_info bond; /* Bonding information. */
 	void *ctx; /* Verbs/DV/DevX context. */
 	void *pd; /* Protection Domain. */
 	uint32_t pdn; /* Protection Domain number. */
@@ -916,10 +933,8 @@ struct mlx5_priv {
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
 	uint32_t vport_meta_mask; /* Used for vport index field match mask. */
 	int32_t representor_id; /* RTE_ETH_REPR(), -1 if not a representor. */
-	int32_t pf_bond; /* >=0 means PF index in bonding configuration. */
+	int32_t pf_bond; /* >=0, representor owner PF index in bonding. */
 	unsigned int if_index; /* Associated kernel network device index. */
-	uint32_t bond_ifindex; /**< Bond interface index. */
-	char bond_name[MLX5_NAMESIZE]; /**< Bond interface name. */
 	/* RX/TX queues. */
 	unsigned int rxqs_n; /* RX queues array size. */
 	unsigned int txqs_n; /* TX queues array size. */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 5341eb16c9..29389fc98f 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -42,7 +42,10 @@ mlx5_ifindex(const struct rte_eth_dev *dev)
 
 	MLX5_ASSERT(priv);
 	MLX5_ASSERT(priv->if_index);
-	ifindex = priv->bond_ifindex > 0 ? priv->bond_ifindex : priv->if_index;
+	if (priv->master && priv->sh->bond.ifindex > 0)
+		ifindex = priv->sh->bond.ifindex;
+	else
+		ifindex = priv->if_index;
 	if (!ifindex)
 		rte_errno = ENXIO;
 	return ifindex;
-- 
2.25.1


* [dpdk-dev] [PATCH v4 7/8] net/mlx5: fix setting VF default MAC through representor
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (36 preceding siblings ...)
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 6/8] net/mlx5: save bonding member ports information Xueming Li
@ 2021-01-19  7:28   ` Xueming Li
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 8/8] net/mlx5: improve bonding xstats Xueming Li
                     ` (28 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-19  7:28 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, xuemingl, Asaf Penso

With kernel bonding, setting a VF MAC address through its representor
failed: the Netlink API requires the ifindex of the owner PF, not the
bonding device ifindex.

Use the owner PF ifindex to modify the VF default MAC in case of a
bonding device.
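
The ifindex selection can be sketched as follows. This is a simplified
illustration under stated assumptions: struct demo_bond_info is a
hypothetical, reduced view of the bonding information, and
demo_vf_mac_ifindex() is not a driver function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BOND_MAX_PORTS 2

/* Hypothetical, reduced view of the bonding information. */
struct demo_bond_info {
	uint32_t ifindex;                           /* Bonding netdev. */
	uint32_t port_ifindex[DEMO_BOND_MAX_PORTS]; /* Member PF netdevs. */
};

/*
 * Pick the ifindex to use for a VF MAC request.
 * pf_bond >= 0 selects the owner PF inside the bond; otherwise the
 * plain PF ifindex is used. The bonding netdev ifindex is never used.
 */
static uint32_t
demo_vf_mac_ifindex(const struct demo_bond_info *bond, int pf_bond,
		    uint32_t pf_ifindex)
{
	assert(bond != NULL);
	if (pf_bond >= 0 && pf_bond < DEMO_BOND_MAX_PORTS)
		return bond->port_ifindex[pf_bond];
	return pf_ifindex;
}
```

For a bond with member ifindexes {4, 5}, pf_bond == 1 selects 5, while
pf_bond == -1 falls back to the PF ifindex passed by the caller.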

Fixes: c21e5facf7d2 ("net/mlx5: use bond index for netdev operations")

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5_mac.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index b5b810b508..5a3aec89c1 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -154,6 +154,7 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 {
 	uint16_t port_id;
 	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_priv *pf_priv;
 
 	/*
 	 * Configuring the VF instead of its representor,
@@ -162,19 +163,24 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 	if (priv->representor && !mlx5_is_hpf(dev)) {
 		DRV_LOG(DEBUG, "VF represented by port %u setting primary MAC address",
 			dev->data->port_id);
+		if (priv->pf_bond >= 0) {
+			/* Bonding, get owner PF ifindex from shared data. */
+			return mlx5_os_vf_mac_addr_modify
+			       (priv,
+				priv->sh->bond.ports[priv->pf_bond].ifindex,
+				mac_addr,
+				rte_eth_representor_id_parse(
+						priv->representor_id,
+						NULL, NULL, NULL));
+		}
 		RTE_ETH_FOREACH_DEV_SIBLING(port_id, dev->data->port_id) {
-			priv = rte_eth_devices[port_id].data->dev_private;
-			if (priv->master == 1) {
-				priv = dev->data->dev_private;
+			pf_priv = rte_eth_devices[port_id].data->dev_private;
+			if (pf_priv->master == 1)
 				return mlx5_os_vf_mac_addr_modify
-				       (priv,
-					mlx5_ifindex(&rte_eth_devices[port_id]),
-					mac_addr,
+				       (priv, pf_priv->if_index, mac_addr,
 					rte_eth_representor_id_parse(
 							priv->representor_id,
-							NULL, NULL, NULL)
-					);
-			}
+							NULL, NULL, NULL));
 		}
 		rte_errno = -ENOTSUP;
 		return rte_errno;
-- 
2.25.1


* [dpdk-dev] [PATCH v4 8/8] net/mlx5: improve bonding xstats
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (37 preceding siblings ...)
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 7/8] net/mlx5: fix setting VF default MAC through representor Xueming Li
@ 2021-01-19  7:28   ` Xueming Li
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 0/9] net/mlx5: support SubFunction representor Xueming Li
                     ` (27 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-01-19  7:28 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, xuemingl, Asaf Penso

In case of a kernel bonding device, counters were read from the first
bonding PF member only.

This patch reads the counters of all member PFs and sums them to get
the bond xstats.
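
The summing scheme can be sketched as: clear the output table once, then
let every member port add its counters into it. The names below
(demo_read_port_t, demo_sum_bond_counters(), demo_table_reader()) are
hypothetical, and the reader takes its values from an in-memory table
instead of an ethtool ioctl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_N_STATS 3

/* Hypothetical reader: adds one member port's counters into stats[]. */
typedef int (*demo_read_port_t)(int pf, uint64_t *stats, size_t n,
				void *ctx);

/* Zero the output once, then accumulate every member port into it. */
static int
demo_sum_bond_counters(int n_port, demo_read_port_t read_port, void *ctx,
		       uint64_t *stats, size_t n)
{
	int pf, ret;

	assert(read_port != NULL && stats != NULL);
	memset(stats, 0, n * sizeof(*stats));
	for (pf = 0; pf < n_port; pf++) {
		ret = read_port(pf, stats, n, ctx);
		if (ret != 0)
			return ret; /* Give up if any member port fails. */
	}
	return 0;
}

/* Example reader: takes per-port values from an in-memory table. */
static int
demo_table_reader(int pf, uint64_t *stats, size_t n, void *ctx)
{
	uint64_t (*table)[DEMO_N_STATS] = ctx;
	size_t i;

	for (i = 0; i < n && i < DEMO_N_STATS; i++)
		stats[i] += table[pf][i]; /* Accumulate, do not overwrite. */
	return 0;
}
```

The reader must use "+=" rather than "=", otherwise only the last member
port would be reported.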

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_ethdev_os.c | 127 +++++++++++++++++++-----
 1 file changed, 102 insertions(+), 25 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 84610a7bc0..27afb74aff 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -169,10 +169,10 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[MLX5_NAMESIZE])
 }
 
 /**
- * Perform ifreq ioctl() on associated Ethernet device.
+ * Perform ifreq ioctl() on associated netdev ifname.
  *
- * @param[in] dev
- *   Pointer to Ethernet device.
+ * @param[in] ifname
+ *   Pointer to netdev name.
  * @param req
  *   Request number to pass to ioctl().
  * @param[out] ifr
@@ -182,7 +182,7 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[MLX5_NAMESIZE])
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
+mlx5_ifreq_by_ifname(const char *ifname, int req, struct ifreq *ifr)
 {
 	int sock = socket(PF_INET, SOCK_DGRAM, IPPROTO_IP);
 	int ret = 0;
@@ -191,9 +191,7 @@ mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
 		rte_errno = errno;
 		return -rte_errno;
 	}
-	ret = mlx5_get_ifname(dev, &ifr->ifr_name);
-	if (ret)
-		goto error;
+	rte_strscpy(ifr->ifr_name, ifname, sizeof(ifr->ifr_name));
 	ret = ioctl(sock, req, ifr);
 	if (ret == -1) {
 		rte_errno = errno;
@@ -206,6 +204,31 @@ mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
 	return -rte_errno;
 }
 
+/**
+ * Perform ifreq ioctl() on associated Ethernet device.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet device.
+ * @param req
+ *   Request number to pass to ioctl().
+ * @param[out] ifr
+ *   Interface request structure output buffer.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
+{
+	char ifname[sizeof(ifr->ifr_name)];
+	int ret;
+
+	ret = mlx5_get_ifname(dev, &ifname);
+	if (ret)
+		return -rte_errno;
+	return mlx5_ifreq_by_ifname(ifname, req, ifr);
+}
+
 /**
  * Get device MTU.
  *
@@ -1243,6 +1266,8 @@ int mlx5_get_module_eeprom(struct rte_eth_dev *dev,
  *
  * @param dev
  *   Pointer to Ethernet device.
+ * @param[in] pf
+ *   PF index in case of bonding device, -1 otherwise
  * @param[out] stats
  *   Counters table output buffer.
  *
@@ -1250,8 +1275,8 @@ int mlx5_get_module_eeprom(struct rte_eth_dev *dev,
  *   0 on success and stats is filled, negative errno value otherwise and
  *   rte_errno is set.
  */
-int
-mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
+static int
+_mlx5_os_read_dev_counters(struct rte_eth_dev *dev, int pf, uint64_t *stats)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_xstats_ctrl *xstats_ctrl = &priv->xstats_ctrl;
@@ -1265,7 +1290,11 @@ mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
 	et_stats->cmd = ETHTOOL_GSTATS;
 	et_stats->n_stats = xstats_ctrl->stats_n;
 	ifr.ifr_data = (caddr_t)et_stats;
-	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (pf >= 0)
+		ret = mlx5_ifreq_by_ifname(priv->sh->bond.ports[pf].ifname,
+					   SIOCETHTOOL, &ifr);
+	else
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
 	if (ret) {
 		DRV_LOG(WARNING,
 			"port %u unable to read statistic values from device",
@@ -1273,23 +1302,60 @@ mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
 		return ret;
 	}
 	for (i = 0; i != xstats_ctrl->mlx5_stats_n; ++i) {
-		if (xstats_ctrl->info[i].dev) {
-			ret = mlx5_os_read_dev_stat(priv,
-					    xstats_ctrl->info[i].ctr_name,
-					    &stats[i]);
-			/* return last xstats counter if fail to read. */
-			if (ret == 0)
-				xstats_ctrl->xstats[i] = stats[i];
-			else
-				stats[i] = xstats_ctrl->xstats[i];
-		} else {
-			stats[i] = (uint64_t)
-				et_stats->data[xstats_ctrl->dev_table_idx[i]];
-		}
+		if (xstats_ctrl->info[i].dev)
+			continue;
+		stats[i] += (uint64_t)
+			    et_stats->data[xstats_ctrl->dev_table_idx[i]];
 	}
 	return 0;
 }
 
+/**
+ * Read device counters.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[out] stats
+ *   Counters table output buffer.
+ *
+ * @return
+ *   0 on success and stats is filled, negative errno value otherwise and
+ *   rte_errno is set.
+ */
+int
+mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_xstats_ctrl *xstats_ctrl = &priv->xstats_ctrl;
+	int ret = 0, i;
+
+	memset(stats, 0, sizeof(*stats) * xstats_ctrl->mlx5_stats_n);
+	/* Read ifreq counters. */
+	if (priv->master && priv->pf_bond >= 0) {
+		/* Sum xstats from bonding device member ports. */
+		for (i = 0; i < priv->sh->bond.n_port; i++) {
+			ret = _mlx5_os_read_dev_counters(dev, i, stats);
+			if (ret)
+				return ret;
+		}
+	} else {
+		ret = _mlx5_os_read_dev_counters(dev, -1, stats);
+	}
+	/* Read IB counters. */
+	for (i = 0; i != xstats_ctrl->mlx5_stats_n; ++i) {
+		if (!xstats_ctrl->info[i].dev)
+			continue;
+		ret = mlx5_os_read_dev_stat(priv, xstats_ctrl->info[i].ctr_name,
+					    &stats[i]);
+		/* return last xstats counter if fail to read. */
+		if (ret == 0)
+			xstats_ctrl->xstats[i] = stats[i];
+		else
+			stats[i] = xstats_ctrl->xstats[i];
+	}
+	return ret;
+}
+
 /**
  * Query the number of statistics provided by ETHTOOL.
  *
@@ -1303,13 +1369,19 @@ mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
 int
 mlx5_os_get_stats_n(struct rte_eth_dev *dev)
 {
+	struct mlx5_priv *priv = dev->data->dev_private;
 	struct ethtool_drvinfo drvinfo;
 	struct ifreq ifr;
 	int ret;
 
 	drvinfo.cmd = ETHTOOL_GDRVINFO;
 	ifr.ifr_data = (caddr_t)&drvinfo;
-	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (priv->master && priv->pf_bond >= 0)
+		/* Bonding PF. */
+		ret = mlx5_ifreq_by_ifname(priv->sh->bond.ports[0].ifname,
+					   SIOCETHTOOL, &ifr);
+	else
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
 	if (ret) {
 		DRV_LOG(WARNING, "port %u unable to query number of statistics",
 			dev->data->port_id);
@@ -1480,7 +1552,12 @@ mlx5_os_stats_init(struct rte_eth_dev *dev)
 	strings->string_set = ETH_SS_STATS;
 	strings->len = dev_stats_n;
 	ifr.ifr_data = (caddr_t)strings;
-	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (priv->master && priv->pf_bond >= 0)
+		/* Bonding master. */
+		ret = mlx5_ifreq_by_ifname(priv->sh->bond.ports[0].ifname,
+					   SIOCETHTOOL, &ifr);
+	else
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
 	if (ret) {
 		DRV_LOG(WARNING, "port %u unable to get statistic names",
 			dev->data->port_id);
-- 
2.25.1



* Re: [dpdk-dev] [PATCH v2 1/5] devargs: fix memory leak on parsing error
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 1/5] devargs: fix memory leak on parsing error Xueming Li
@ 2021-03-18  9:12     ` Thomas Monjalon
  0 siblings, 0 replies; 118+ messages in thread
From: Thomas Monjalon @ 2021-03-18  9:12 UTC (permalink / raw)
  To: Xueming Li
  Cc: Ferruh Yigit, Andrew Rybchenko, Olivier Matz, dev,
	Viacheslav Ovsiienko, Asaf Penso, gaetan.rivet, stable

18/01/2021 16:16, Xueming Li:
> --- a/lib/librte_eal/common/eal_common_devargs.c
> +++ b/lib/librte_eal/common/eal_common_devargs.c
> +	if (ret != 0) {
> +		if (devargs->data && devargs->data != devstr) {

Better to make comparison explicit:
	if (devargs->data != NULL

> +			/* Free duplicated data. */
> +			free(devargs->data);

Before patch 2, devargs->data is const,
so we cannot free (compilation error).
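
The two remarks above can be illustrated with a minimal sketch. It assumes
the data field is already a non-const "char *" (as patch 2 makes it); the
demo_* names and the "must contain '='" rule are hypothetical stand-ins,
not the real EAL parser:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_devargs, reduced to one field. */
struct demo_devargs {
	char *data; /* Non-const, so it can legally be passed to free(). */
};

/*
 * Duplicate devstr into da->data, then "parse" it.
 * Parsing here only checks for a '=' character (an assumption made for
 * the sake of the example). On error the duplicated buffer is released.
 */
static int
demo_devargs_parse(struct demo_devargs *da, const char *devstr)
{
	size_t len = strlen(devstr) + 1;
	int ret;

	da->data = malloc(len);
	if (da->data == NULL)
		return -1;
	memcpy(da->data, devstr, len);
	ret = strchr(da->data, '=') != NULL ? 0 : -1;
	if (ret != 0) {
		/* Explicit comparison against NULL, as suggested in review. */
		if (da->data != NULL && da->data != devstr) {
			/* Free duplicated data. */
			free(da->data);
			da->data = NULL;
		}
	}
	return ret;
}
```

With a "const char *data" field the free() call would need a cast or fail
to build cleanly, which is why the ordering of the two patches matters.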





* Re: [dpdk-dev] [PATCH v2 2/5] devargs: refactor scratch buffer storage
  2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 2/5] devargs: refactor scratch buffer storage Xueming Li
@ 2021-03-18  9:14     ` Thomas Monjalon
  0 siblings, 0 replies; 118+ messages in thread
From: Thomas Monjalon @ 2021-03-18  9:14 UTC (permalink / raw)
  To: Xueming Li
  Cc: Ferruh Yigit, Andrew Rybchenko, Olivier Matz, dev,
	Viacheslav Ovsiienko, Asaf Penso, david.marchand,
	bruce.richardson

18/01/2021 16:16, Xueming Li:
> In current design, legacy parser rte_devargs_parse() saved scratch
> buffer to devargs.args while new parser rte_devargs_layers_parse() saved
> to devargs.data. Code using devargs had to know the difference and
> cleaned up memory accordingly - error prone.
> 
> This patch unifies data as the dedicated scratch buffer and introduces
> rte_devargs_free() function to wrap the memory clean up.
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> ---
> --- a/lib/librte_eal/include/rte_devargs.h
> +++ b/lib/librte_eal/include/rte_devargs.h
> @@ -60,16 +60,16 @@ struct rte_devargs {
>  	/** Name of the device. */
>  	char name[RTE_DEV_NAME_MAX_LEN];
>  	RTE_STD_C11
> -	union {
> -	/** Arguments string as given by user or "" for no argument. */
> -		char *args;
> +	union { /**< driver-related part of device string. */
> +		const char *args; /**< legacy name. */
>  		const char *drv_str;
>  	};
>  	struct rte_bus *bus; /**< bus handle. */
>  	struct rte_class *cls; /**< class handle. */
>  	const char *bus_str; /**< bus-related part of device string. */
>  	const char *cls_str; /**< class-related part of device string. */
> -	const char *data; /**< Device string storage. */
> +	char *data; /**< Scratch buffer. */
> +	const char *src; /**< Arguments given by user. */

Adding a field changes the size of the struct, which is an ABI break.
We need to plan this change for DPDK 21.11.
Let's think what can be done in the meantime.
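
To make the ABI concern concrete, here is a minimal sketch. The two demo
structs are simplified, hypothetical stand-ins for the layout before and
after the patch (not the real struct rte_devargs); they only show that an
appended pointer field changes the struct size seen by already compiled
applications:

```c
#include <stddef.h>

/* Hypothetical layout before the patch. */
struct demo_devargs_old {
	char name[64];
	const char *args;
	const char *data;
};

/* Hypothetical layout after the patch: one more pointer at the end. */
struct demo_devargs_new {
	char name[64];
	const char *args;
	char *data;
	const char *src; /* Newly added field. */
};

/* Return 1 if the struct size differs between the two layouts. */
static int
demo_size_changed(void)
{
	return sizeof(struct demo_devargs_new) !=
	       sizeof(struct demo_devargs_old);
}

/* Return 1 if the pre-existing data field kept its offset. */
static int
demo_data_offset_kept(void)
{
	return offsetof(struct demo_devargs_new, data) ==
	       offsetof(struct demo_devargs_old, data);
}
```

Existing fields keep their offsets, but any application that allocates or
embeds the struct itself was built against the old size.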




* [dpdk-dev] [PATCH v5 0/9] net/mlx5: support SubFunction representor
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (38 preceding siblings ...)
  2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 8/8] net/mlx5: improve bonding xstats Xueming Li
@ 2021-03-28 13:48   ` Xueming Li
  2021-03-31  7:20     ` Raslan Darawsheh
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 1/9] common/mlx5: sub-function representor port name parsing Xueming Li
                     ` (26 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-03-28 13:48 UTC (permalink / raw)
  To: Viacheslav Ovsiienko; +Cc: dev, xuemingl, Asaf Penso

SubFunction [1] is a portion of the PCI device. A SF netdev has its own
dedicated queues (txq, rxq). A SF netdev supports E-Switch representation
offload similar to existing PF and VF representors. A SF shares PCI
level resources with other SFs and/or with its parent PCI function.

This patch set introduces SubFunction representor support for mlx5
PMD driver.

Version history:
 RFC:
 	initial version [2]
 V2:
    - support bonding representor probe with new pf#vf# devargs
    - adapt EAL api V2 [3] changes
    - update document
 V3:
    - support list of representor PF section for bonding device:
      example: representor=pf[0,1]vf[0-3]
    - add bonding information to shared PMD data
    - fix setting VF MAC through representor
    - fix bonding xstats, sum xstats from PF members.
 V4:
    - combine unexpected patch, thanks Slava
 V5:
    - support new ethdev ops api to return representor info
    - new api to encode and decode representor ID
    - new patch to allow BF2 HPF(-1) probe with sf-1
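
As an illustration of the V5 representor ID encoding, the sketch below
follows the 16-bit layout used by the MLX5_REPRESENTOR_* macros in patch 2
of this series (pf: 2 bits, type: 2 bits, vf/sf: 12 bits). The demo_* names
and the numeric type values are assumptions for the example; unlike the
macros, the sketch also masks the pf and type inputs:

```c
#include <stdint.h>

/* Hypothetical type values, assumed for illustration only. */
enum demo_repr_type {
	DEMO_REPR_NONE = 0,
	DEMO_REPR_VF = 1,
	DEMO_REPR_SF = 2,
};

/* Pack pf (bits 15:14), type (bits 13:12) and repr (bits 11:0). */
static uint16_t
demo_repr_id_encode(unsigned int pf, unsigned int type, int repr)
{
	return (uint16_t)(((pf & 0x3u) << 14) |
			  ((type & 0x3u) << 12) |
			  ((unsigned int)repr & 0xfffu));
}

/* Extract the PF index from an encoded ID. */
static unsigned int
demo_repr_id_pf(uint16_t id)
{
	return (id >> 14) & 0x3u;
}

/* Extract the representor type from an encoded ID. */
static unsigned int
demo_repr_id_type(uint16_t id)
{
	return (id >> 12) & 0x3u;
}

/* Extract the VF/SF index from an encoded ID. */
static unsigned int
demo_repr_id_repr(uint16_t id)
{
	return id & 0xfffu;
}
```

A representor index of -1 (the HPF case) is truncated to 0xfff by the
12-bit mask, so it occupies the last slot of the VF range.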

[1] SubFunction in kernel:
https://lore.kernel.org/netdev/20201112192424.2742-1-parav@nvidia.com/

[2] RFC:
http://patchwork.dpdk.org/project/dpdk/list/?series=14376

[3] V2:
http://patchwork.dpdk.org/project/dpdk/list/?series=14560

[4] V3:
http://patchwork.dpdk.org/project/dpdk/list/?series=14810

[5] V4:
http://patchwork.dpdk.org/project/dpdk/list/?series=14836


Xueming Li (9):
  common/mlx5: sub-function representor port name parsing
  net/mlx5: support representor of sub function
  net/mlx5: revert setting bonding representor to first PF
  net/mlx5: refactor bonding representor probe
  net/mlx5: support list value of representor PF
  net/mlx5: save bonding member ports information
  net/mlx5: fix setting VF default MAC through representor
  net/mlx5: improve xstats of bonding port
  net/mlx5: probe host PF representor with SubFunction

 doc/guides/nics/mlx5.rst                   |  62 +++-
 drivers/common/mlx5/linux/mlx5_common_os.c |  32 +-
 drivers/common/mlx5/linux/mlx5_nl.c        |   3 +
 drivers/common/mlx5/mlx5_common.h          |   2 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c    | 136 +++++--
 drivers/net/mlx5/linux/mlx5_os.c           | 395 ++++++++++++++-------
 drivers/net/mlx5/mlx5.c                    |  24 +-
 drivers/net/mlx5/mlx5.h                    |  35 +-
 drivers/net/mlx5/mlx5_defs.h               |   4 -
 drivers/net/mlx5/mlx5_ethdev.c             | 149 ++++++--
 drivers/net/mlx5/mlx5_mac.c                |  23 +-
 11 files changed, 652 insertions(+), 213 deletions(-)

-- 
2.25.1



* [dpdk-dev] [PATCH v5 1/9] common/mlx5: sub-function representor port name parsing
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (39 preceding siblings ...)
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 0/9] net/mlx5: support SubFunction representor Xueming Li
@ 2021-03-28 13:48   ` Xueming Li
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 2/9] net/mlx5: support representor of sub function Xueming Li
                     ` (25 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-03-28 13:48 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, xuemingl, Asaf Penso, Matan Azrad, Shahaf Shuler

This patch supports representor name parsing for SF.
In sysfs, the representor name is stored under the "phys_port_name" key.
Similar to a VF representor, the switch port name of an SF representor is
"pf<x>sf<y>".

For netlink messages, the new SF port name type is supported.

Examples:

pf0sf1
pf0sf[0-3]
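
A minimal standalone sketch of the parsing approach, assuming the same
sscanf() pattern as the diff below. The demo_* names are hypothetical and
the sketch only covers the pf<x>vf<y> and pf<x>sf<y> forms with an optional
c<n> controller prefix; it is not the driver function itself:

```c
#include <stdio.h>

enum demo_port_type {
	DEMO_PORT_UNKNOWN,
	DEMO_PORT_PFVF,
	DEMO_PORT_PFSF,
};

struct demo_port_info {
	int ctrl_num; /* Controller number, 0 if no "c<n>" prefix. */
	int pf_num;
	int port_num; /* VF or SF index. */
	enum demo_port_type type;
};

static void
demo_parse_port_name(const char *name, struct demo_port_info *out)
{
	char ctrl = 0, p1, p2, t1, t2, eol;
	int n;

	out->ctrl_num = 0;
	out->pf_num = 0;
	out->port_num = 0;
	out->type = DEMO_PORT_UNKNOWN;
	/* Optional controller prefix such as "c1". */
	n = sscanf(name, "%c%d", &ctrl, &out->ctrl_num);
	if (n == 2 && ctrl == 'c') {
		name++; /* Skip 'c'. */
		name += snprintf(NULL, 0, "%d", out->ctrl_num);
	} else {
		out->ctrl_num = 0;
	}
	/* Exactly 6 items means no trailing character after the index. */
	n = sscanf(name, "%c%c%d%c%c%d%c",
		   &p1, &p2, &out->pf_num, &t1, &t2, &out->port_num, &eol);
	if (n != 6 || p1 != 'p' || p2 != 'f')
		return;
	if (t1 == 'v' && t2 == 'f')
		out->type = DEMO_PORT_PFVF;
	else if (t1 == 's' && t2 == 'f')
		out->type = DEMO_PORT_PFSF;
}
```

The trailing "%c" is what rejects names with extra characters: a clean
match stops at 6 converted items, anything longer yields 7.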

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/common/mlx5/linux/mlx5_common_os.c | 32 +++++++++++++++-------
 drivers/common/mlx5/linux/mlx5_nl.c        |  3 ++
 drivers/common/mlx5/mlx5_common.h          |  2 ++
 drivers/net/mlx5/linux/mlx5_ethdev_os.c    |  3 ++
 4 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c
index 0edd78ea6d..5cf9576921 100644
--- a/drivers/common/mlx5/linux/mlx5_common_os.c
+++ b/drivers/common/mlx5/linux/mlx5_common_os.c
@@ -97,22 +97,34 @@ void
 mlx5_translate_port_name(const char *port_name_in,
 			 struct mlx5_switch_info *port_info_out)
 {
-	char pf_c1, pf_c2, vf_c1, vf_c2, eol;
+	char ctrl = 0, pf_c1, pf_c2, vf_c1, vf_c2, eol;
 	char *end;
 	int sc_items;
 
-	/*
-	 * Check for port-name as a string of the form pf0vf0
-	 * (support kernel ver >= 5.0 or OFED ver >= 4.6).
-	 */
+	sc_items = sscanf(port_name_in, "%c%d",
+			  &ctrl, &port_info_out->ctrl_num);
+	if (sc_items == 2 && ctrl == 'c') {
+		port_name_in++; /* 'c' */
+		port_name_in += snprintf(NULL, 0, "%d",
+					  port_info_out->ctrl_num);
+	}
+	/* Check for port-name as a string of the form pf0vf0 or pf0sf0 */
 	sc_items = sscanf(port_name_in, "%c%c%d%c%c%d%c",
 			  &pf_c1, &pf_c2, &port_info_out->pf_num,
 			  &vf_c1, &vf_c2, &port_info_out->port_name, &eol);
-	if (sc_items == 6 &&
-	    pf_c1 == 'p' && pf_c2 == 'f' &&
-	    vf_c1 == 'v' && vf_c2 == 'f') {
-		port_info_out->name_type = MLX5_PHYS_PORT_NAME_TYPE_PFVF;
-		return;
+	if (sc_items == 6 && pf_c1 == 'p' && pf_c2 == 'f') {
+		if (vf_c1 == 'v' && vf_c2 == 'f') {
+			/* Kernel ver >= 5.0 or OFED ver >= 4.6 */
+			port_info_out->name_type =
+					MLX5_PHYS_PORT_NAME_TYPE_PFVF;
+			return;
+		}
+		if (vf_c1 == 's' && vf_c2 == 'f') {
+			/* Kernel ver >= 5.11 or OFED ver >= 5.1 */
+			port_info_out->name_type =
+					MLX5_PHYS_PORT_NAME_TYPE_PFSF;
+			return;
+		}
 	}
 	/*
 	 * Check for port-name as a string of the form p0
diff --git a/drivers/common/mlx5/linux/mlx5_nl.c b/drivers/common/mlx5/linux/mlx5_nl.c
index ef7a521379..752c57b33d 100644
--- a/drivers/common/mlx5/linux/mlx5_nl.c
+++ b/drivers/common/mlx5/linux/mlx5_nl.c
@@ -746,6 +746,7 @@ mlx5_nl_mac_addr_sync(int nlsk_fd, unsigned int iface_idx,
 	int i;
 	int ret;
 
+	memset(macs, 0, n * sizeof(macs[0]));
 	ret = mlx5_nl_mac_addr_list(nlsk_fd, iface_idx, &macs, &macs_n);
 	if (ret)
 		return;
@@ -1158,6 +1159,8 @@ mlx5_nl_check_switch_info(bool num_vf_set,
 	case MLX5_PHYS_PORT_NAME_TYPE_PFHPF:
 		/* Fallthrough */
 	case MLX5_PHYS_PORT_NAME_TYPE_PFVF:
+		/* Fallthrough */
+	case MLX5_PHYS_PORT_NAME_TYPE_PFSF:
 		/* New representors naming schema. */
 		switch_info->representor = 1;
 		break;
diff --git a/drivers/common/mlx5/mlx5_common.h b/drivers/common/mlx5/mlx5_common.h
index 5028a05b49..8eda6749b4 100644
--- a/drivers/common/mlx5/mlx5_common.h
+++ b/drivers/common/mlx5/mlx5_common.h
@@ -151,6 +151,7 @@ enum mlx5_nl_phys_port_name_type {
 	MLX5_PHYS_PORT_NAME_TYPE_UPLINK, /* p0, kernel ver >= 5.0 */
 	MLX5_PHYS_PORT_NAME_TYPE_PFVF, /* pf0vf0, kernel ver >= 5.0 */
 	MLX5_PHYS_PORT_NAME_TYPE_PFHPF, /* pf0, kernel ver >= 5.7, HPF rep */
+	MLX5_PHYS_PORT_NAME_TYPE_PFSF, /* pf0sf0, kernel ver >= 5.11 */
 	MLX5_PHYS_PORT_NAME_TYPE_UNKNOWN, /* Unrecognized. */
 };
 
@@ -159,6 +160,7 @@ struct mlx5_switch_info {
 	uint32_t master:1; /**< Master device. */
 	uint32_t representor:1; /**< Representor device. */
 	enum mlx5_nl_phys_port_name_type name_type; /** < Port name type. */
+	int32_t ctrl_num; /**< Controller number (valid for c#pf#vf# format). */
 	int32_t pf_num; /**< PF number (valid for pfxvfx format only). */
 	int32_t port_name; /**< Representor port name. */
 	uint64_t switch_id; /**< Switch identifier. */
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 0e8de9439e..cb692b22f2 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1013,6 +1013,9 @@ mlx5_sysfs_check_switch_info(bool device_dir,
 		/* New representors naming schema. */
 		switch_info->representor = 1;
 		break;
+	default:
+		switch_info->master = device_dir;
+		break;
 	}
 }
 
-- 
2.25.1



* [dpdk-dev] [PATCH v5 2/9] net/mlx5: support representor of sub function
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (40 preceding siblings ...)
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 1/9] common/mlx5: sub-function representor port name parsing Xueming Li
@ 2021-03-28 13:48   ` Xueming Li
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 3/9] net/mlx5: revert setting bonding representor to first PF Xueming Li
                     ` (24 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-03-28 13:48 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, xuemingl, Asaf Penso, Matan Azrad, Shahaf Shuler

This patch adds support for SF representors. Similar to a VF representor,
the switch port name of an SF representor in the phys_port_name sysfs key is
"pf<x>sf<y>".

The device representor argument is "representor=sf[list]". List members
can be a mix of single instances and ranges. Example:
  representor=sf[0,2,4,8-12,-1]

To probe both VF and SF representors, separate them into 2
devices:
  -a <BDF>,representor=vf[list] -a <BDF>,representor=sf[list]
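
To show how such a list expands, here is a minimal sketch of a list parser
that accepts single values and ranges. It is a hypothetical helper written
for this example (demo_parse_repr_list is not an ethdev API) and does not
reproduce the behavior of rte_eth_devargs_parse():

```c
#include <stdlib.h>

/*
 * Expand a list such as "[0,2,4,8-12,-1]" into out[].
 * Returns the number of entries, or -1 on syntax error or overflow.
 * A leading '-' is a negative value, a '-' after a number is a range.
 */
static int
demo_parse_repr_list(const char *s, int *out, int max)
{
	int n = 0;

	if (*s == '[')
		s++;
	while (*s != '\0' && *s != ']') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;
		long v;

		if (end == s)
			return -1; /* Not a number. */
		if (*end == '-') {
			/* Range "lo-hi". */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo)
				return -1;
		}
		for (v = lo; v <= hi; v++) {
			if (n == max)
				return -1; /* Output buffer too small. */
			out[n++] = (int)v;
		}
		if (*end == ',')
			end++;
		else if (*end != ']' && *end != '\0')
			return -1;
		s = end;
	}
	return n;
}
```

With the example from the commit message, "[0,2,4,8-12,-1]" expands to the
nine entries 0, 2, 4, 8, 9, 10, 11, 12 and -1.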

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst                |  58 +++++++++--
 drivers/net/mlx5/linux/mlx5_ethdev_os.c |   2 +
 drivers/net/mlx5/linux/mlx5_os.c        | 127 +++++++++++++++++++-----
 drivers/net/mlx5/mlx5.c                 |   1 +
 drivers/net/mlx5/mlx5.h                 |   9 ++
 drivers/net/mlx5/mlx5_ethdev.c          | 100 +++++++++++++++++++
 6 files changed, 263 insertions(+), 34 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index a2cfc51b2a..2e2909d82d 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -931,14 +931,18 @@ Driver options
 - ``representor`` parameter [list]
 
   This parameter can be used to instantiate DPDK Ethernet devices from
-  existing port (or VF) representors configured on the device.
+  existing port (PF, VF or SF) representors configured on the device.
 
   It is a standard parameter whose format is described in
   :ref:`ethernet_device_standard_device_arguments`.
 
-  For instance, to probe port representors 0 through 2::
+  For instance, to probe VF port representors 0 through 2::
 
-    representor=[0-2]
+    representor=vf[0-2]
+
+  To probe SF port representors 0 through 2::
+
+    representor=sf[0-2]
 
 - ``max_dump_files_num`` parameter [int]
 
@@ -1351,15 +1355,15 @@ Quick Start Guide on OFED/EN
 Enable switchdev mode
 ---------------------
 
-Switchdev mode is a mode in E-Switch, that binds between representor and VF.
-Representor is a port in DPDK that is connected to a VF in such a way
-that assuming there are no offload flows, each packet that is sent from the VF
-will be received by the corresponding representor. While each packet that is
-sent to a representor will be received by the VF.
+Switchdev mode is a mode in E-Switch, that binds between representor and VF or SF.
+Representor is a port in DPDK that is connected to a VF or SF in such a way
+that assuming there are no offload flows, each packet that is sent from the VF or SF
+will be received by the corresponding representor. While each packet that is
+sent to a representor will be received by the VF or SF.
 This is very useful in case of SRIOV mode, where the first packet that is sent
-by the VF will be received by the DPDK application which will decide if this
+by the VF or SF will be received by the DPDK application which will decide if this
 flow should be offloaded to the E-Switch. After offloading the flow packet
-that the VF that are matching the flow will not be received any more by
+that the VF or SF that are matching the flow will not be received any more by
 the DPDK application.
 
 1. Enable SRIOV mode::
@@ -1386,6 +1390,40 @@ the DPDK application.
 
         echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
 
+SubFunction representor support
+-------------------------------
+SubFunction is a portion of the PCI device, a SF netdev has its own
+dedicated queues(txq, rxq). A SF netdev supports E-Switch representation
+offload similar to existing PF and VF representors. A SF shares PCI
+level resources with other SFs and/or with its parent PCI function.
+
+1. Configure SF feature::
+
+        mlxconfig -d <mst device> set PF_BAR2_SIZE=<0/1/2/3> PF_BAR2_ENABLE=1
+
+        Value of PF_BAR2_SIZE:
+
+            0: 8 SFs
+            1: 16 SFs
+            2: 32 SFs
+            3: 64 SFs
+
+2. Reset the FW::
+
+        mlxfwreset -d <mst device> reset
+
+3. Enable switchdev mode::
+
+        echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
+
+4. Create SF::
+
+        mlnx-sf -d <PCI_BDF> -a create
+
+5. Probe SF representor::
+
+        testpmd> port attach <PCI_BDF>,representor=sf0,dv_flow_en=1
+
 Performance tuning
 ------------------
 
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index cb692b22f2..2127fcfbfa 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1010,6 +1010,8 @@ mlx5_sysfs_check_switch_info(bool device_dir,
 	case MLX5_PHYS_PORT_NAME_TYPE_PFHPF:
 		/* Fallthrough */
 	case MLX5_PHYS_PORT_NAME_TYPE_PFVF:
+		/* Fallthrough */
+	case MLX5_PHYS_PORT_NAME_TYPE_PFSF:
 		/* New representors naming schema. */
 		switch_info->representor = 1;
 		break;
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 5740214950..aac923ea39 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -701,6 +701,8 @@ mlx5_queue_counter_id_prepare(struct rte_eth_dev *dev)
  *   Verbs device parameters (name, port, switch_info) to spawn.
  * @param config
  *   Device configuration parameters.
+ * @param eth_da
+ *   Device arguments.
  *
  * @return
  *   A valid Ethernet device object on success, NULL otherwise and rte_errno
@@ -712,7 +714,8 @@ mlx5_queue_counter_id_prepare(struct rte_eth_dev *dev)
 static struct rte_eth_dev *
 mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	       struct mlx5_dev_spawn_data *spawn,
-	       struct mlx5_dev_config *config)
+	       struct mlx5_dev_config *config,
+	       struct rte_eth_devargs *eth_da)
 {
 	const struct mlx5_switch_info *switch_info = &spawn->info;
 	struct mlx5_dev_ctx_shared *sh = NULL;
@@ -742,34 +745,82 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 
 	/* Determine if this port representor is supposed to be spawned. */
 	if (switch_info->representor && dpdk_dev->devargs) {
-		struct rte_eth_devargs eth_da;
-
-		err = rte_eth_devargs_parse(dpdk_dev->devargs->args, &eth_da);
-		if (err) {
-			rte_errno = -err;
-			DRV_LOG(ERR, "failed to process device arguments: %s",
-				strerror(rte_errno));
-			return NULL;
-		}
-		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
-			/* Representor not specified. */
+		switch (eth_da->type) {
+		case RTE_ETH_REPRESENTOR_SF:
+			if (switch_info->name_type !=
+					MLX5_PHYS_PORT_NAME_TYPE_PFSF) {
+				rte_errno = EBUSY;
+				return NULL;
+			}
+			break;
+		case RTE_ETH_REPRESENTOR_VF:
+			/* Allows HPF representor index -1 as exception. */
+			if (!(spawn->info.port_name == -1 &&
+			      switch_info->name_type ==
+					MLX5_PHYS_PORT_NAME_TYPE_PFHPF) &&
+			    switch_info->name_type !=
+					MLX5_PHYS_PORT_NAME_TYPE_PFVF) {
+				rte_errno = EBUSY;
+				return NULL;
+			}
+			break;
+		case RTE_ETH_REPRESENTOR_NONE:
 			rte_errno = EBUSY;
 			return NULL;
-		}
-		if (eth_da.type != RTE_ETH_REPRESENTOR_VF) {
+			break;
+		default:
 			rte_errno = ENOTSUP;
 			DRV_LOG(ERR, "unsupported representor type: %s",
 				dpdk_dev->devargs->args);
 			return NULL;
 		}
-		for (i = 0; i < eth_da.nb_representor_ports; ++i)
-			if (eth_da.representor_ports[i] ==
+		/* Check controller ID: */
+		for (i = 0; i < eth_da->nb_mh_controllers; ++i)
+			if (eth_da->mh_controllers[i] ==
+			    (uint16_t)switch_info->ctrl_num)
+				break;
+		if (eth_da->nb_mh_controllers &&
+		    i == eth_da->nb_mh_controllers) {
+			rte_errno = EBUSY;
+			return NULL;
+		}
+		/* Check SF/VF ID: */
+		for (i = 0; i < eth_da->nb_representor_ports; ++i)
+			if (eth_da->representor_ports[i] ==
 			    (uint16_t)switch_info->port_name)
 				break;
-		if (i == eth_da.nb_representor_ports) {
+		if (eth_da->type != RTE_ETH_REPRESENTOR_PF &&
+		    i == eth_da->nb_representor_ports) {
 			rte_errno = EBUSY;
 			return NULL;
 		}
+		/* Check PF ID. Check after repr port to avoid warning flood. */
+		if (spawn->pf_bond >= 0) {
+			for (i = 0; i < eth_da->nb_ports; ++i)
+				if (eth_da->ports[i] ==
+				    (uint16_t)switch_info->pf_num)
+					break;
+			if (eth_da->nb_ports && i == eth_da->nb_ports) {
+				/* For backward compatibility, bonding
+				 * representor syntax supported with limitation,
+				 * device iterator won't find it:
+				 *    <PF1_BDF>,representor=#
+				 */
+				if (switch_info->pf_num > 0 &&
+				    eth_da->ports[0] == 0) {
+					DRV_LOG(WARNING, "Representor on Bonding PF should use pf#vf# format: %s",
+						dpdk_dev->devargs->args);
+				} else {
+					rte_errno = EBUSY;
+					return NULL;
+				}
+			}
+		} else if (eth_da->nb_ports > 1 || eth_da->ports[0]) {
+			rte_errno = EINVAL;
+			DRV_LOG(ERR, "PF id not supported by non-bond device: %s",
+				dpdk_dev->devargs->args);
+			return NULL;
+		}
 	}
 	/* Build device name. */
 	if (spawn->pf_bond <  0) {
@@ -777,8 +828,11 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		if (!switch_info->representor)
 			strlcpy(name, dpdk_dev->name, sizeof(name));
 		else
-			snprintf(name, sizeof(name), "%s_representor_%u",
-				 dpdk_dev->name, switch_info->port_name);
+			snprintf(name, sizeof(name), "%s_representor_%s%u",
+				 dpdk_dev->name,
+				 switch_info->name_type ==
+				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
+				 switch_info->port_name);
 	} else {
 		/* Bonding device. */
 		if (!switch_info->representor)
@@ -786,9 +840,11 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 				 dpdk_dev->name,
 				 mlx5_os_get_dev_device_name(spawn->phys_dev));
 		else
-			snprintf(name, sizeof(name), "%s_%s_representor_%u",
+			snprintf(name, sizeof(name), "%s_%s_representor_%s%u",
 				 dpdk_dev->name,
 				 mlx5_os_get_dev_device_name(spawn->phys_dev),
+				 switch_info->name_type ==
+				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
 				 switch_info->port_name);
 	}
 	/* check if the device is already spawned */
@@ -1063,9 +1119,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	priv->vport_id = switch_info->representor ?
 			 switch_info->port_name + 1 : -1;
 #endif
-	/* representor_id field keeps the unmodified VF index. */
-	priv->representor_id = switch_info->representor ?
-			       switch_info->port_name : -1;
+	priv->representor_id = mlx5_representor_id_encode(switch_info);
 	/*
 	 * Look for sibling devices in order to reuse their switch domain
 	 * if any, otherwise allocate one.
@@ -1839,6 +1893,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_spawn_data *list = NULL;
 	struct mlx5_dev_config dev_config;
 	unsigned int dev_config_vf;
+	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
 	int ret;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
@@ -1849,6 +1904,27 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			strerror(rte_errno));
 		return -rte_errno;
 	}
+	if (pci_dev->device.devargs) {
+		/* Parse representor information from device argument. */
+		if (pci_dev->device.devargs->cls_str)
+			ret = rte_eth_devargs_parse
+				(pci_dev->device.devargs->cls_str, &eth_da);
+		if (ret) {
+			DRV_LOG(ERR, "failed to parse device arguments: %s",
+				pci_dev->device.devargs->cls_str);
+			return -rte_errno;
+		}
+		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
+			/* Support legacy device argument */
+			ret = rte_eth_devargs_parse
+				(pci_dev->device.devargs->args, &eth_da);
+			if (ret) {
+				DRV_LOG(ERR, "failed to parse device arguments: %s",
+					pci_dev->device.devargs->args);
+				return -rte_errno;
+			}
+		}
+	}
 	errno = 0;
 	ibv_list = mlx5_glue->get_device_list(&ret);
 	if (!ibv_list) {
@@ -2021,6 +2097,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 				case MLX5_PHYS_PORT_NAME_TYPE_PFHPF:
 					/* Fallthrough */
 				case MLX5_PHYS_PORT_NAME_TYPE_PFVF:
+					/* Fallthrough */
+				case MLX5_PHYS_PORT_NAME_TYPE_PFSF:
 					if (list[ns].info.pf_num == bd)
 						ns++;
 					break;
@@ -2198,7 +2276,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		dev_config.log_hp_size = MLX5_ARG_UNSET;
 		list[i].eth_dev = mlx5_dev_spawn(&pci_dev->device,
 						 &list[i],
-						 &dev_config);
+						 &dev_config,
+						 &eth_da);
 		if (!list[i].eth_dev) {
 			if (rte_errno != EBUSY && rte_errno != EEXIST)
 				break;
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index fb586317ca..22058a0ad5 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1454,6 +1454,7 @@ const struct eth_dev_ops mlx5_dev_ops = {
 	.xstats_get_names = mlx5_xstats_get_names,
 	.fw_version_get = mlx5_fw_version_get,
 	.dev_infos_get = mlx5_dev_infos_get,
+	.representor_info_get = mlx5_representor_info_get,
 	.read_clock = mlx5_txpp_read_clock,
 	.dev_supported_ptypes_get = mlx5_dev_supported_ptypes_get,
 	.vlan_filter_set = mlx5_vlan_filter_set,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 8e8727a6c5..33c6b39a1e 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1032,6 +1032,15 @@ int mlx5_flow_aso_age_mng_init(struct mlx5_dev_ctx_shared *sh);
 /* mlx5_ethdev.c */
 
 int mlx5_dev_configure(struct rte_eth_dev *dev);
+int mlx5_representor_info_get(struct rte_eth_dev *dev,
+			      struct rte_eth_representor_info *info);
+#define MLX5_REPRESENTOR_ID(pf, type, repr) \
+		(((pf) << 14) + ((type) << 12) + ((repr) & 0xfff))
+#define MLX5_REPRESENTOR_REPR(repr_id) \
+		((repr_id) & 0xfff)
+#define MLX5_REPRESENTOR_TYPE(repr_id) \
+		(((repr_id) >> 12) & 3)
+uint16_t mlx5_representor_id_encode(const struct mlx5_switch_info *info);
 int mlx5_fw_version_get(struct rte_eth_dev *dev, char *fw_ver,
 			size_t fw_size);
 int mlx5_dev_infos_get(struct rte_eth_dev *dev,
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 51b39ddde5..1ffb13cf2e 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -377,6 +377,106 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 	return 0;
 }
 
+/**
+ * Calculate representor ID from port switch info.
+ *
+ * Uint16 representor ID bits definition:
+ *   pf: 2
+ *   type: 2
+ *   vf/sf: 12
+ *
+ * @param info
+ *   Port switch info.
+ *
+ * @return
+ *   Encoded representor ID.
+ */
+uint16_t
+mlx5_representor_id_encode(const struct mlx5_switch_info *info)
+{
+	enum rte_eth_representor_type type = RTE_ETH_REPRESENTOR_VF;
+	uint16_t repr = info->port_name;
+
+	if (info->representor == 0)
+		return UINT16_MAX;
+	if (info->name_type == MLX5_PHYS_PORT_NAME_TYPE_PFSF)
+		type = RTE_ETH_REPRESENTOR_SF;
+	if (info->name_type == MLX5_PHYS_PORT_NAME_TYPE_PFHPF)
+		repr = UINT16_MAX;
+	return MLX5_REPRESENTOR_ID(info->pf_num, type, repr);
+}
+
+/**
+ * DPDK callback to get information about representor.
+ *
+ * Representor ID bits definition:
+ *   vf/sf: 12
+ *   type: 2
+ *   pf: 2
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param[out] info
+ *   Nullable info structure output buffer.
+ *
+ * @return
+ *   negative on error, or the number of representor ranges.
+ */
+int
+mlx5_representor_info_get(struct rte_eth_dev *dev,
+			  struct rte_eth_representor_info *info)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	int n_type = 3; /* Number of representor types, VF, HPF and SF. */
+	int n_pf = 2; /* Number of PFs. */
+	int i = 0, pf;
+
+	if (info == NULL)
+		goto out;
+	info->controller = 0;
+	info->pf = priv->pf_bond >= 0 ? priv->pf_bond : 0;
+	for (pf = 0; pf < n_pf; ++pf) {
+		/* VF range. */
+		info->ranges[i].type = RTE_ETH_REPRESENTOR_VF;
+		info->ranges[i].controller = 0;
+		info->ranges[i].pf = pf;
+		info->ranges[i].vf = 0;
+		info->ranges[i].id_base =
+			MLX5_REPRESENTOR_ID(pf, info->ranges[i].type, 0);
+		info->ranges[i].id_end =
+			MLX5_REPRESENTOR_ID(pf, info->ranges[i].type, -1);
+		snprintf(info->ranges[i].name,
+			 sizeof(info->ranges[i].name), "pf%dvf", pf);
+		i++;
+		/* HPF range. */
+		info->ranges[i].type = RTE_ETH_REPRESENTOR_VF;
+		info->ranges[i].controller = 0;
+		info->ranges[i].pf = pf;
+		info->ranges[i].vf = UINT16_MAX;
+		info->ranges[i].id_base =
+			MLX5_REPRESENTOR_ID(pf, info->ranges[i].type, -1);
+		info->ranges[i].id_end =
+			MLX5_REPRESENTOR_ID(pf, info->ranges[i].type, -1);
+		snprintf(info->ranges[i].name,
+			 sizeof(info->ranges[i].name), "pf%dvf", pf);
+		i++;
+		/* SF range. */
+		info->ranges[i].type = RTE_ETH_REPRESENTOR_SF;
+		info->ranges[i].controller = 0;
+		info->ranges[i].pf = pf;
+		info->ranges[i].vf = 0;
+		info->ranges[i].id_base =
+			MLX5_REPRESENTOR_ID(pf, info->ranges[i].type, 0);
+		info->ranges[i].id_end =
+			MLX5_REPRESENTOR_ID(pf, info->ranges[i].type, -1);
+		snprintf(info->ranges[i].name,
+			 sizeof(info->ranges[i].name), "pf%dsf", pf);
+		i++;
+	}
+out:
+	return n_type * n_pf;
+}
+
 /**
  * Get firmware version of a device.
  *
-- 
2.25.1



* [dpdk-dev] [PATCH v5 3/9] net/mlx5: revert setting bonding representor to first PF
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (41 preceding siblings ...)
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 2/9] net/mlx5: support representor of sub function Xueming Li
@ 2021-03-28 13:48   ` Xueming Li
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 4/9] net/mlx5: refactor bonding representor probe Xueming Li
                     ` (23 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-03-28 13:48 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, xuemingl, Asaf Penso, Matan Azrad, Shahaf Shuler

With kernel bonding, representors on the second PF are probed by
devargs:
	<primary_bdf>,representor=pf1vf<N>
There is no need to save the primary PF port ID and look it up when
probing sibling ports, so revert patch [1].

[1]:
commit e6818853c022 ("net/mlx5: set representor to first PF in bonding
mode")

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 20 ++------------------
 drivers/net/mlx5/mlx5.c          |  1 -
 drivers/net/mlx5/mlx5.h          |  1 -
 3 files changed, 2 insertions(+), 20 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index aac923ea39..0c56cae489 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -862,13 +862,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 			rte_errno = ENOMEM;
 			return NULL;
 		}
-		priv = eth_dev->data->dev_private;
-		if (priv->sh->bond_dev != UINT16_MAX)
-			/* For bonding port, use primary PCI device. */
-			eth_dev->device =
-				rte_eth_devices[priv->sh->bond_dev].device;
-		else
-			eth_dev->device = dpdk_dev;
+		eth_dev->device = dpdk_dev;
 		eth_dev->dev_ops = &mlx5_dev_sec_ops;
 		eth_dev->rx_descriptor_status = mlx5_rx_descriptor_status;
 		eth_dev->tx_descriptor_status = mlx5_tx_descriptor_status;
@@ -1485,17 +1479,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	eth_dev->data->dev_private = priv;
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
-	if (spawn->pf_bond < 0) {
-		eth_dev->device = dpdk_dev;
-	} else {
-		/* Use primary bond PCI as device. */
-		if (sh->bond_dev == UINT16_MAX) {
-			sh->bond_dev = eth_dev->data->port_id;
-			eth_dev->device = dpdk_dev;
-		} else {
-			eth_dev->device = rte_eth_devices[sh->bond_dev].device;
-		}
-	}
+	eth_dev->device = dpdk_dev;
 	eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;
 	/* Configure the first MAC address by default. */
 	if (mlx5_get_mac(eth_dev, &mac.addr_bytes)) {
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 22058a0ad5..aa8b50c642 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -917,7 +917,6 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn,
 		goto error;
 	}
 	sh->refcnt = 1;
-	sh->bond_dev = UINT16_MAX;
 	sh->max_port = spawn->max_port;
 	strncpy(sh->ibdev_name, mlx5_os_get_ctx_device_name(sh->ctx),
 		sizeof(sh->ibdev_name) - 1);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 33c6b39a1e..bee0696518 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -667,7 +667,6 @@ struct mlx5_flex_parser_profiles {
 struct mlx5_dev_ctx_shared {
 	LIST_ENTRY(mlx5_dev_ctx_shared) next;
 	uint32_t refcnt;
-	uint16_t bond_dev; /* Bond primary device id. */
 	uint32_t devx:1; /* Opened with DV. */
 	uint32_t flow_hit_aso_en:1; /* Flow Hit ASO is supported. */
 	uint32_t rq_ts_format:2; /* RQ timestamp formats supported. */
-- 
2.25.1


* [dpdk-dev] [PATCH v5 4/9] net/mlx5: refactor bonding representor probe
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (42 preceding siblings ...)
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 3/9] net/mlx5: revert setting bonding representor to first PF Xueming Li
@ 2021-03-28 13:48   ` Xueming Li
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 5/9] net/mlx5: support list value of representor PF Xueming Li
                     ` (22 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-03-28 13:48 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, xuemingl, Asaf Penso, Matan Azrad, Shahaf Shuler

To probe a representor on the 2nd PF of a kernel bonding device, the
PF1 BDF had to be specified in devargs:
  <PF1_BDF>,representor=0
When closing a bonding device, all representors have to be closed
together, which implies that all representors must use the primary PF
of the bonding device. So after probing a representor port on the 2nd
PF, locating the newly probed device by device argument failed, because
the filter used the 2nd PF as PCI address.

The current representor devargs use the PCI BDF in conflicting ways:
 - The PCI BDF specifies the representor owner PF.
 - The PCI BDF is used to locate the probed representor device.
 - The PMD uses the primary PCI BDF as PCI device.

To resolve these conflicts, a new representor syntax is introduced:
  <primary BDF>,representor=pfXvfY
All representors must use the primary PF as owner PCI device; the PMD
internally locates the owner PCI address from the "pfX" part of the
representor. To EAL, all representors are registered to the primary PCI
device and the 2nd PF is hidden, so all device searches are consistent.
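
The owner lookup can be sketched roughly as below. This is only an
illustration of the address comparison done in
mlx5_device_bond_pci_match() in this patch: struct pci_addr and
owner_pf_match() are hypothetical stand-ins, and the sketch assumes that
the PFs of one bonding device differ only in their PCI function number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_pci_addr; illustration only. */
struct pci_addr {
	uint32_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/*
 * Return true if "member" is the bonding member owned by PF index
 * "owner", counted from the primary PF address "primary".
 * Assumption: PFs of one bonding device differ only in function number.
 */
static bool
owner_pf_match(const struct pci_addr *primary, uint16_t owner,
	       const struct pci_addr *member)
{
	assert(primary != NULL && member != NULL);
	return primary->domain == member->domain &&
	       primary->bus == member->bus &&
	       primary->devid == member->devid &&
	       primary->function + owner == member->function;
}
```

With a primary PF at 03:00.0, "pf1" would then select the member at
03:00.1, while "pf0" selects the primary PF itself.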

As with VF representors, the HPF (host PF on BlueField) is probed with
the same syntax, for example: representor=pf1vf[0-3,-1]

This patch also adds the controller and PF index to the kernel bonding
representor port name:
	<BDF>_<ib_name>_representor_c<C>pf<X>vf<Y>
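
As a rough sketch of the naming, the helper below reuses the format
string from the snprintf() call that this patch adds to mlx5_dev_spawn()
for bonding representors, which also emits a controller index ("c%d")
before the PF index. bond_repr_name() and its parameters are
hypothetical and not part of the PMD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the PMD: format a bonding
 * representor port name and report whether it fit into the buffer.
 */
static bool
bond_repr_name(char *name, size_t size, const char *bdf,
	       const char *ibdev, int ctrl, int pf, bool is_sf,
	       unsigned int port)
{
	int len;

	assert(name != NULL && size > 0);
	len = snprintf(name, size, "%s_%s_representor_c%dpf%d%s%u",
		       bdf, ibdev, ctrl, pf, is_sf ? "sf" : "vf", port);
	/* snprintf() returns the untruncated length, negative on error. */
	return len >= 0 && (size_t)len < size;
}
```

The return value mirrors the overflow check in the patch, which only
logs a warning when the name does not fit.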

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst         |   4 +-
 drivers/net/mlx5/linux/mlx5_os.c | 249 +++++++++++++++++--------------
 drivers/net/mlx5/mlx5.c          |  20 +++
 drivers/net/mlx5/mlx5.h          |   3 +-
 drivers/net/mlx5/mlx5_defs.h     |   4 -
 drivers/net/mlx5/mlx5_ethdev.c   |  27 ----
 drivers/net/mlx5/mlx5_mac.c      |   6 +-
 7 files changed, 163 insertions(+), 150 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 2e2909d82d..92fe7c11e4 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -938,11 +938,11 @@ Driver options
 
   For instance, to probe VF port representors 0 through 2::
 
-    representor=vf[0-2]
+    <PCI_BDF>,representor=vf[0-2]
 
   To probe SF port representors 0 through 2::
 
-    representor=sf[0-2]
+    <PCI_BDF>,representor=sf[0-2]
 
 - ``max_dump_files_num`` parameter [int]
 
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 0c56cae489..bb4a8719f7 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -692,6 +692,71 @@ mlx5_queue_counter_id_prepare(struct rte_eth_dev *dev)
 			"available.", dev->data->port_id);
 }
 
+/**
+ * Check if representor spawn info matches devargs.
+ *
+ * @param spawn
+ *   Verbs device parameters (name, port, switch_info) to spawn.
+ * @param eth_da
+ *   Device devargs to probe.
+ *
+ * @return
+ *   Match result.
+ */
+static bool
+mlx5_representor_match(struct mlx5_dev_spawn_data *spawn,
+		       struct rte_eth_devargs *eth_da)
+{
+	struct mlx5_switch_info *switch_info = &spawn->info;
+	unsigned int p, f;
+	uint16_t id;
+	uint16_t repr_id = mlx5_representor_id_encode(switch_info);
+
+	switch (eth_da->type) {
+	case RTE_ETH_REPRESENTOR_SF:
+		if (switch_info->name_type != MLX5_PHYS_PORT_NAME_TYPE_PFSF) {
+			rte_errno = EBUSY;
+			return false;
+		}
+		break;
+	case RTE_ETH_REPRESENTOR_VF:
+		/* Allows HPF representor index -1 as exception. */
+		if (!(spawn->info.port_name == -1 &&
+		      switch_info->name_type ==
+				MLX5_PHYS_PORT_NAME_TYPE_PFHPF) &&
+		    switch_info->name_type != MLX5_PHYS_PORT_NAME_TYPE_PFVF) {
+			rte_errno = EBUSY;
+			return false;
+		}
+		break;
+	case RTE_ETH_REPRESENTOR_NONE:
+		rte_errno = EBUSY;
+		return false;
+	default:
+		rte_errno = ENOTSUP;
+		DRV_LOG(ERR, "unsupported representor type");
+		return false;
+	}
+	/* Check representor ID: */
+	for (p = 0; p < eth_da->nb_ports; ++p) {
+		if (spawn->pf_bond < 0) {
+			/* For non-LAG mode, allow and ignore pf. */
+			switch_info->pf_num = eth_da->ports[p];
+			repr_id = mlx5_representor_id_encode(switch_info);
+		}
+		for (f = 0; f < eth_da->nb_representor_ports; ++f) {
+			id = MLX5_REPRESENTOR_ID
+				(eth_da->ports[p], eth_da->type,
+				 eth_da->representor_ports[f]);
+			if (repr_id == id)
+				return true;
+		}
+	}
+	rte_errno = EBUSY;
+	return false;
+}
+
+
 /**
  * Spawn an Ethernet device from Verbs information.
  *
@@ -738,115 +803,44 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	char name[RTE_ETH_NAME_MAX_LEN];
 	int own_domain_id = 0;
 	uint16_t port_id;
-	unsigned int i;
 #ifdef HAVE_MLX5DV_DR_DEVX_PORT
 	struct mlx5dv_devx_port devx_port = { .comp_mask = 0 };
 #endif
 
 	/* Determine if this port representor is supposed to be spawned. */
-	if (switch_info->representor && dpdk_dev->devargs) {
-		switch (eth_da->type) {
-		case RTE_ETH_REPRESENTOR_SF:
-			if (switch_info->name_type !=
-					MLX5_PHYS_PORT_NAME_TYPE_PFSF) {
-				rte_errno = EBUSY;
-				return NULL;
-			}
-			break;
-		case RTE_ETH_REPRESENTOR_VF:
-			/* Allows HPF representor index -1 as exception. */
-			if (!(spawn->info.port_name == -1 &&
-			      switch_info->name_type ==
-					MLX5_PHYS_PORT_NAME_TYPE_PFHPF) &&
-			    switch_info->name_type !=
-					MLX5_PHYS_PORT_NAME_TYPE_PFVF) {
-				rte_errno = EBUSY;
-				return NULL;
-			}
-			break;
-		case RTE_ETH_REPRESENTOR_NONE:
-			rte_errno = EBUSY;
-			return NULL;
-			break;
-		default:
-			rte_errno = ENOTSUP;
-			DRV_LOG(ERR, "unsupported representor type: %s",
-				dpdk_dev->devargs->args);
-			return NULL;
-		}
-		/* Check controller ID: */
-		for (i = 0; i < eth_da->nb_mh_controllers; ++i)
-			if (eth_da->mh_controllers[i] ==
-			    (uint16_t)switch_info->ctrl_num)
-				break;
-		if (eth_da->nb_mh_controllers &&
-		    i == eth_da->nb_mh_controllers) {
-			rte_errno = EBUSY;
-			return NULL;
-		}
-		/* Check SF/VF ID: */
-		for (i = 0; i < eth_da->nb_representor_ports; ++i)
-			if (eth_da->representor_ports[i] ==
-			    (uint16_t)switch_info->port_name)
-				break;
-		if (eth_da->type != RTE_ETH_REPRESENTOR_PF &&
-		    i == eth_da->nb_representor_ports) {
-			rte_errno = EBUSY;
-			return NULL;
-		}
-		/* Check PF ID. Check after repr port to avoid warning flood. */
-		if (spawn->pf_bond >= 0) {
-			for (i = 0; i < eth_da->nb_ports; ++i)
-				if (eth_da->ports[i] ==
-				    (uint16_t)switch_info->pf_num)
-					break;
-			if (eth_da->nb_ports && i == eth_da->nb_ports) {
-				/* For backward compatibility, bonding
-				 * representor syntax supported with limitation,
-				 * device iterator won't find it:
-				 *    <PF1_BDF>,representor=#
-				 */
-				if (switch_info->pf_num > 0 &&
-				    eth_da->ports[0] == 0) {
-					DRV_LOG(WARNING, "Representor on Bonding PF should use pf#vf# format: %s",
-						dpdk_dev->devargs->args);
-				} else {
-					rte_errno = EBUSY;
-					return NULL;
-				}
-			}
-		} else if (eth_da->nb_ports > 1 || eth_da->ports[0]) {
-			rte_errno = EINVAL;
-			DRV_LOG(ERR, "PF id not supported by non-bond device: %s",
-				dpdk_dev->devargs->args);
-			return NULL;
-		}
-	}
+	if (switch_info->representor && dpdk_dev->devargs &&
+	    !mlx5_representor_match(spawn, eth_da))
+		return NULL;
 	/* Build device name. */
-	if (spawn->pf_bond <  0) {
+	if (spawn->pf_bond < 0) {
 		/* Single device. */
 		if (!switch_info->representor)
 			strlcpy(name, dpdk_dev->name, sizeof(name));
 		else
-			snprintf(name, sizeof(name), "%s_representor_%s%u",
+			err = snprintf(name, sizeof(name), "%s_representor_%s%u",
 				 dpdk_dev->name,
 				 switch_info->name_type ==
 				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
 				 switch_info->port_name);
 	} else {
 		/* Bonding device. */
-		if (!switch_info->representor)
-			snprintf(name, sizeof(name), "%s_%s",
+		if (!switch_info->representor) {
+			err = snprintf(name, sizeof(name), "%s_%s",
 				 dpdk_dev->name,
 				 mlx5_os_get_dev_device_name(spawn->phys_dev));
-		else
-			snprintf(name, sizeof(name), "%s_%s_representor_%s%u",
-				 dpdk_dev->name,
-				 mlx5_os_get_dev_device_name(spawn->phys_dev),
-				 switch_info->name_type ==
-				 MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
-				 switch_info->port_name);
+		} else {
+			err = snprintf(name, sizeof(name), "%s_%s_representor_c%dpf%d%s%u",
+				dpdk_dev->name,
+				mlx5_os_get_dev_device_name(spawn->phys_dev),
+				switch_info->ctrl_num,
+				switch_info->pf_num,
+				switch_info->name_type ==
+				MLX5_PHYS_PORT_NAME_TYPE_PFSF ? "sf" : "vf",
+				switch_info->port_name);
+		}
 	}
+	if (err >= (int)sizeof(name))
+		DRV_LOG(WARNING, "device name overflow %s", name);
 	/* check if the device is already spawned */
 	if (rte_eth_dev_get_port_by_name(name, &port_id) == 0) {
 		rte_errno = EEXIST;
@@ -1739,9 +1733,11 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
  * @param[in] ibv_dev
  *   Pointer to Infiniband device structure.
  * @param[in] pci_dev
- *   Pointer to PCI device structure to match PCI address.
+ *   Pointer to primary PCI address structure to match.
  * @param[in] nl_rdma
  *   Netlink RDMA group socket handle.
+ * @param[in] owner
+ *   Representor owner PF index.
  *
  * @return
  *   negative value if no bonding device found, otherwise
@@ -1749,8 +1745,8 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
  */
 static int
 mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
-			   const struct rte_pci_device *pci_dev,
-			   int nl_rdma)
+			   const struct rte_pci_addr *pci_dev,
+			   int nl_rdma, uint16_t owner)
 {
 	char ifname[IF_NAMESIZE + 1];
 	unsigned int ifindex;
@@ -1807,10 +1803,10 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 					 " for netdev \"%s\"", ifname);
 			continue;
 		}
-		if (pci_dev->addr.domain != pci_addr.domain ||
-		    pci_dev->addr.bus != pci_addr.bus ||
-		    pci_dev->addr.devid != pci_addr.devid ||
-		    pci_dev->addr.function != pci_addr.function)
+		if (pci_dev->domain != pci_addr.domain ||
+		    pci_dev->bus != pci_addr.bus ||
+		    pci_dev->devid != pci_addr.devid ||
+		    pci_dev->function + owner != pci_addr.function)
 			continue;
 		/* Slave interface PCI address match found. */
 		fclose(file);
@@ -1878,7 +1874,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_config dev_config;
 	unsigned int dev_config_vf;
 	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
-	int ret;
+	struct rte_pci_addr owner_pci = pci_dev->addr; /* Owner PF. */
+	int ret = -1;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
 		mlx5_pmd_socket_init();
@@ -1930,7 +1927,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 
 		DRV_LOG(DEBUG, "checking device \"%s\"", ibv_list[ret]->name);
 		bd = mlx5_device_bond_pci_match
-				(ibv_list[ret], pci_dev, nl_rdma);
+				(ibv_list[ret], &owner_pci, nl_rdma,
+				 eth_da.ports[0]);
 		if (bd >= 0) {
 			/*
 			 * Bonding device detected. Only one match is allowed,
@@ -1947,23 +1945,28 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 				ret = -rte_errno;
 				goto exit;
 			}
+			/* Amend owner pci address if owner PF ID specified. */
+			if (eth_da.nb_representor_ports)
+				owner_pci.function += eth_da.ports[0];
 			DRV_LOG(INFO, "PCI information matches for"
 				      " slave %d bonding device \"%s\"",
 				      bd, ibv_list[ret]->name);
 			ibv_match[nd++] = ibv_list[ret];
 			break;
+		} else {
+			/* Bonding device not found. */
+			if (mlx5_dev_to_pci_addr
+				(ibv_list[ret]->ibdev_path, &pci_addr))
+				continue;
+			if (owner_pci.domain != pci_addr.domain ||
+			    owner_pci.bus != pci_addr.bus ||
+			    owner_pci.devid != pci_addr.devid ||
+			    owner_pci.function != pci_addr.function)
+				continue;
+			DRV_LOG(INFO, "PCI information matches for device \"%s\"",
+				ibv_list[ret]->name);
+			ibv_match[nd++] = ibv_list[ret];
 		}
-		if (mlx5_dev_to_pci_addr
-			(ibv_list[ret]->ibdev_path, &pci_addr))
-			continue;
-		if (pci_dev->addr.domain != pci_addr.domain ||
-		    pci_dev->addr.bus != pci_addr.bus ||
-		    pci_dev->addr.devid != pci_addr.devid ||
-		    pci_dev->addr.function != pci_addr.function)
-			continue;
-		DRV_LOG(INFO, "PCI information matches for device \"%s\"",
-			ibv_list[ret]->name);
-		ibv_match[nd++] = ibv_list[ret];
 	}
 	ibv_match[nd] = NULL;
 	if (!nd) {
@@ -1971,8 +1974,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		DRV_LOG(WARNING,
 			"no Verbs device matches PCI device " PCI_PRI_FMT ","
 			" are kernel drivers loaded?",
-			pci_dev->addr.domain, pci_dev->addr.bus,
-			pci_dev->addr.devid, pci_dev->addr.function);
+			owner_pci.domain, owner_pci.bus,
+			owner_pci.devid, owner_pci.function);
 		rte_errno = ENOENT;
 		ret = -rte_errno;
 		goto exit;
@@ -2237,6 +2240,24 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		dev_config_vf = 0;
 		break;
 	}
+	if (eth_da.type != RTE_ETH_REPRESENTOR_NONE) {
+		/* Set devargs default values. */
+		if (eth_da.nb_mh_controllers == 0) {
+			eth_da.nb_mh_controllers = 1;
+			eth_da.mh_controllers[0] = 0;
+		}
+		if (eth_da.nb_ports == 0 && ns > 0) {
+			if (list[0].pf_bond >= 0 && list[0].info.representor)
+				DRV_LOG(WARNING, "Representor on Bonding device should use pf#vf# syntax: %s",
+					pci_dev->device.devargs->args);
+			eth_da.nb_ports = 1;
+			eth_da.ports[0] = list[0].info.pf_num;
+		}
+		if (eth_da.nb_representor_ports == 0) {
+			eth_da.nb_representor_ports = 1;
+			eth_da.representor_ports[0] = 0;
+		}
+	}
 	for (i = 0; i != ns; ++i) {
 		uint32_t restore;
 
@@ -2278,8 +2299,8 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		DRV_LOG(ERR,
 			"probe of PCI device " PCI_PRI_FMT " aborted after"
 			" encountering an error: %s",
-			pci_dev->addr.domain, pci_dev->addr.bus,
-			pci_dev->addr.devid, pci_dev->addr.function,
+			owner_pci.domain, owner_pci.bus,
+			owner_pci.devid, owner_pci.function,
 			strerror(rte_errno));
 		ret = -rte_errno;
 		/* Roll back. */
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index aa8b50c642..1a3043a4e7 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -355,6 +355,26 @@ static const struct mlx5_indexed_pool_config mlx5_ipool_cfg[] = {
 
 #define MLX5_FLOW_TABLE_HLIST_ARRAY_SIZE 4096
 
+/**
+ * Decide whether representor ID is an HPF (host PF) port on BF2.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   Non-zero if HPF, otherwise 0.
+ */
+bool
+mlx5_is_hpf(struct rte_eth_dev *dev)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	uint16_t repr = MLX5_REPRESENTOR_REPR(priv->representor_id);
+	int type = MLX5_REPRESENTOR_TYPE(priv->representor_id);
+
+	return priv->representor != 0 && type == RTE_ETH_REPRESENTOR_VF &&
+	       MLX5_REPRESENTOR_REPR(-1) == repr;
+}
+
 /**
  * Initialize the ASO aging management structure.
  *
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index bee0696518..34f4bd5dfc 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -923,7 +923,7 @@ struct mlx5_priv {
 	uint16_t vport_id; /* Associated VF vport index (if any). */
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
 	uint32_t vport_meta_mask; /* Used for vport index field match mask. */
-	int32_t representor_id; /* Port representor identifier. */
+	int32_t representor_id; /* -1 if not a representor. */
 	int32_t pf_bond; /* >=0 means PF index in bonding configuration. */
 	unsigned int if_index; /* Associated kernel network device index. */
 	uint32_t bond_ifindex; /**< Bond interface index. */
@@ -999,6 +999,7 @@ int mlx5_udp_tunnel_port_add(struct rte_eth_dev *dev,
 			      struct rte_eth_udp_tunnel *udp_tunnel);
 uint16_t mlx5_eth_find_next(uint16_t port_id, struct rte_pci_device *pci_dev);
 int mlx5_dev_close(struct rte_eth_dev *dev);
+bool mlx5_is_hpf(struct rte_eth_dev *dev);
 void mlx5_age_event_prepare(struct mlx5_dev_ctx_shared *sh);
 
 /* Macro to iterate over all valid ports for mlx5 driver. */
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index af29d93901..8f2807dcd9 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -48,10 +48,6 @@
 #define MLX5_PMD_SOFT_COUNTERS 1
 #endif
 
-/* Switch port ID parameters for bonding configurations. */
-#define MLX5_PORT_ID_BONDING_PF_MASK 0xf
-#define MLX5_PORT_ID_BONDING_PF_SHIFT 12
-
 /* Alarm timeout. */
 #define MLX5_ALARM_TIMEOUT_US 100000
 
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 1ffb13cf2e..130980d4d6 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -330,33 +330,6 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
 	if (priv->representor) {
 		uint16_t port_id;
 
-		if (priv->pf_bond >= 0) {
-			/*
-			 * Switch port ID is opaque value with driver defined
-			 * format. Push the PF index in bonding configurations
-			 * in upper four bits of port ID. If we get too many
-			 * representors (more than 4K) or PFs (more than 15)
-			 * this approach must be reconsidered.
-			 */
-			/* Switch port ID for VF representors: 0 - 0xFFE */
-			if ((info->switch_info.port_id != 0xffff &&
-				info->switch_info.port_id >=
-				((1 << MLX5_PORT_ID_BONDING_PF_SHIFT) - 1)) ||
-			    priv->pf_bond > MLX5_PORT_ID_BONDING_PF_MASK) {
-				DRV_LOG(ERR, "can't update switch port ID"
-					     " for bonding device");
-				MLX5_ASSERT(false);
-				return -ENODEV;
-			}
-			/*
-			 * Switch port ID for Host PF representor
-			 * (representor_id is -1) , set to 0xFFF
-			 */
-			if (info->switch_info.port_id == 0xffff)
-				info->switch_info.port_id = 0xfff;
-			info->switch_info.port_id |=
-				priv->pf_bond << MLX5_PORT_ID_BONDING_PF_SHIFT;
-		}
 		MLX5_ETH_FOREACH_DEV(port_id, priv->pci_dev) {
 			struct mlx5_priv *opriv =
 				rte_eth_devices[port_id].data->dev_private;
diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index 6ffcfcd97a..7b2be04889 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -159,7 +159,7 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 	 * Configuring the VF instead of its representor,
 	 * need to skip the special case of HPF on Bluefield.
 	 */
-	if (priv->representor && priv->representor_id >= 0) {
+	if (priv->representor && !mlx5_is_hpf(dev)) {
 		DRV_LOG(DEBUG, "VF represented by port %u setting primary MAC address",
 			dev->data->port_id);
 		RTE_ETH_FOREACH_DEV_SIBLING(port_id, dev->data->port_id) {
@@ -169,7 +169,9 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 				return mlx5_os_vf_mac_addr_modify
 				       (priv,
 					mlx5_ifindex(&rte_eth_devices[port_id]),
-					mac_addr, priv->representor_id);
+					mac_addr,
+					MLX5_REPRESENTOR_REPR
+						(priv->representor_id));
 			}
 		}
 		rte_errno = -ENOTSUP;
-- 
2.25.1


* [dpdk-dev] [PATCH v5 5/9] net/mlx5: support list value of representor PF
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (43 preceding siblings ...)
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 4/9] net/mlx5: refactor bonding representor probe Xueming Li
@ 2021-03-28 13:48   ` Xueming Li
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 6/9] net/mlx5: save bonding member ports information Xueming Li
                     ` (21 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-03-28 13:48 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, xuemingl, Asaf Penso, Matan Azrad, Shahaf Shuler

To probe representors from different kernel bonding PFs, two separate
devargs had to be specified:
    -a 03:00.0,representor=pf0vf[0-3] -a 03:00.0,representor=pf1vf[0-3]

This patch supports a range or list in the PF section of devargs, so the
above can be shortened to:
    -a 03:00.0,representor=pf[0-1]vf[0-3]
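
As an illustration of what the short form stands for, the sketch below
expands a PF list and a VF list into every (PF, VF) pair. struct
repr_args, struct repr_id and repr_expand() are hypothetical names used
only here; the patch itself iterates over the PF list in
mlx5_os_pci_probe() and probes once per PF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the parsed devargs; illustration only. */
struct repr_args {
	uint16_t pfs[2];	/* PF indexes, e.g. {0, 1} for pf[0-1]. */
	uint16_t nb_pfs;
	uint16_t vfs[4];	/* VF indexes, e.g. {0..3} for vf[0-3]. */
	uint16_t nb_vfs;
};

struct repr_id {
	uint16_t pf;
	uint16_t vf;
};

/*
 * Expand every (PF, VF) pair into "out", PF-major.
 * Returns the number of pairs, or 0 if "out" is too small.
 */
static size_t
repr_expand(const struct repr_args *args, struct repr_id *out, size_t max)
{
	size_t n = 0;
	uint16_t p, v;

	assert(args != NULL && out != NULL);
	if ((size_t)args->nb_pfs * args->nb_vfs > max)
		return 0;
	for (p = 0; p < args->nb_pfs; p++)
		for (v = 0; v < args->nb_vfs; v++) {
			out[n].pf = args->pfs[p];
			out[n].vf = args->vfs[v];
			n++;
		}
	return n;
}
```

Under these assumptions, pf[0-1]vf[0-3] yields eight pairs: the same
set that the two separate devargs above describe.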

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst         |   4 ++
 drivers/net/mlx5/linux/mlx5_os.c | 100 +++++++++++++++++++++----------
 2 files changed, 72 insertions(+), 32 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 92fe7c11e4..b39bc475ad 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -944,6 +944,10 @@ Driver options
 
     <PCI_BDF>,representor=sf[0-2]
 
+  To probe VF port representors 0 through 2 on both PFs of bonding device::
+
+    <Primary_PCI_BDF>,representor=pf[0,1]vf[0-2]
+
 - ``max_dump_files_num`` parameter [int]
 
   The maximum number of files per PMD entity that may be created for debug information.
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index bb4a8719f7..2c702cf614 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1829,21 +1829,25 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 }
 
 /**
- * DPDK callback to register a PCI device.
+ * Register a PCI device within bonding.
  *
- * This function spawns Ethernet devices out of a given PCI device.
+ * This function spawns Ethernet devices out of a given PCI device and
+ * bonding owner PF index.
  *
- * @param[in] pci_drv
- *   PCI driver structure (mlx5_driver).
  * @param[in] pci_dev
  *   PCI device information.
+ * @param[in] req_eth_da
+ *   Requested ethdev device argument.
+ * @param[in] owner_id
+ *   Requested owner PF port ID within bonding device, default to 0.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-int
-mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
-		  struct rte_pci_device *pci_dev)
+static int
+mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
+		     struct rte_eth_devargs *req_eth_da,
+		     uint16_t owner_id)
 {
 	struct ibv_device **ibv_list;
 	/*
@@ -1873,7 +1877,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	struct mlx5_dev_spawn_data *list = NULL;
 	struct mlx5_dev_config dev_config;
 	unsigned int dev_config_vf;
-	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
+	struct rte_eth_devargs eth_da = *req_eth_da;
 	struct rte_pci_addr owner_pci = pci_dev->addr; /* Owner PF. */
 	int ret = -1;
 
@@ -1885,27 +1889,6 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			strerror(rte_errno));
 		return -rte_errno;
 	}
-	if (pci_dev->device.devargs) {
-		/* Parse representor information from device argument. */
-		if (pci_dev->device.devargs->cls_str)
-			ret = rte_eth_devargs_parse
-				(pci_dev->device.devargs->cls_str, &eth_da);
-		if (ret) {
-			DRV_LOG(ERR, "failed to parse device arguments: %s",
-				pci_dev->device.devargs->cls_str);
-			return -rte_errno;
-		}
-		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
-			/* Support legacy device argument */
-			ret = rte_eth_devargs_parse
-				(pci_dev->device.devargs->args, &eth_da);
-			if (ret) {
-				DRV_LOG(ERR, "failed to parse device arguments: %s",
-					pci_dev->device.devargs->args);
-				return -rte_errno;
-			}
-		}
-	}
 	errno = 0;
 	ibv_list = mlx5_glue->get_device_list(&ret);
 	if (!ibv_list) {
@@ -1927,8 +1910,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 
 		DRV_LOG(DEBUG, "checking device \"%s\"", ibv_list[ret]->name);
 		bd = mlx5_device_bond_pci_match
-				(ibv_list[ret], &owner_pci, nl_rdma,
-				 eth_da.ports[0]);
+				(ibv_list[ret], &owner_pci, nl_rdma, owner_id);
 		if (bd >= 0) {
 			/*
 			 * Bonding device detected. Only one match is allowed,
@@ -1947,7 +1929,7 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 			}
 			/* Amend owner pci address if owner PF ID specified. */
 			if (eth_da.nb_representor_ports)
-				owner_pci.function += eth_da.ports[0];
+				owner_pci.function += owner_id;
 			DRV_LOG(INFO, "PCI information matches for"
 				      " slave %d bonding device \"%s\"",
 				      bd, ibv_list[ret]->name);
@@ -2335,6 +2317,60 @@ mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 	return ret;
 }
 
+/**
+ * DPDK callback to register a PCI device.
+ *
+ * This function spawns Ethernet devices out of a given PCI device.
+ *
+ * @param[in] pci_drv
+ *   PCI driver structure (mlx5_driver).
+ * @param[in] pci_dev
+ *   PCI device information.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+int
+mlx5_os_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
+		  struct rte_pci_device *pci_dev)
+{
+	struct rte_eth_devargs eth_da = { .type = RTE_ETH_REPRESENTOR_NONE };
+	int ret = 0;
+	uint16_t p;
+
+	if (pci_dev->device.devargs) {
+		/* Parse representor information from device argument. */
+		if (pci_dev->device.devargs->cls_str)
+			ret = rte_eth_devargs_parse
+				(pci_dev->device.devargs->cls_str, &eth_da);
+		if (ret) {
+			DRV_LOG(ERR, "failed to parse device arguments: %s",
+				pci_dev->device.devargs->cls_str);
+			return -rte_errno;
+		}
+		if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) {
+			/* Support legacy device argument */
+			ret = rte_eth_devargs_parse
+				(pci_dev->device.devargs->args, &eth_da);
+			if (ret) {
+				DRV_LOG(ERR, "failed to parse device arguments: %s",
+					pci_dev->device.devargs->args);
+				return -rte_errno;
+			}
+		}
+	}
+
+	if (eth_da.nb_ports > 0) {
+		/* Iterate over all PFs if devargs pf is a range: "pf[0-1]vf[...]". */
+		for (p = 0; p < eth_da.nb_ports; p++)
+			ret = mlx5_os_pci_probe_pf(pci_dev, &eth_da,
+						   eth_da.ports[p]);
+	} else {
+		ret = mlx5_os_pci_probe_pf(pci_dev, &eth_da, 0);
+	}
+	return ret;
+}
+
 static int
 mlx5_config_doorbell_mapping_env(const struct mlx5_dev_config *config)
 {
-- 
2.25.1


* [dpdk-dev] [PATCH v5 6/9] net/mlx5: save bonding member ports information
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (44 preceding siblings ...)
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 5/9] net/mlx5: support list value of representor PF Xueming Li
@ 2021-03-28 13:48   ` Xueming Li
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 7/9] net/mlx5: fix setting VF default MAC through representor Xueming Li
                     ` (20 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-03-28 13:48 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, xuemingl, Asaf Penso, Matan Azrad, Shahaf Shuler

Since the kernel bonding netdev doesn't provide a statistics counter
that reflects all member ports, the PMD has to sum the counters of each
member port itself.

As preparation, this patch collects bonding member port information
and saves it to the shared context data.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_ethdev_os.c |  4 +-
 drivers/net/mlx5/linux/mlx5_os.c        | 91 ++++++++++++++++---------
 drivers/net/mlx5/mlx5.c                 |  2 +
 drivers/net/mlx5/mlx5.h                 | 21 +++++-
 drivers/net/mlx5/mlx5_ethdev.c          |  5 +-
 5 files changed, 86 insertions(+), 37 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 2127fcfbfa..e7ec07e364 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -150,8 +150,8 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[MLX5_NAMESIZE])
 
 	MLX5_ASSERT(priv);
 	MLX5_ASSERT(priv->sh);
-	if (priv->bond_ifindex > 0) {
-		memcpy(ifname, priv->bond_name, MLX5_NAMESIZE);
+	if (priv->master && priv->sh->bond.ifindex > 0) {
+		memcpy(ifname, priv->sh->bond.ifname, MLX5_NAMESIZE);
 		return 0;
 	}
 	ifindex = mlx5_ifindex(dev);
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 2c702cf614..5bdc8caee5 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1457,19 +1457,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	 */
 	MLX5_ASSERT(spawn->ifindex);
 	priv->if_index = spawn->ifindex;
-	if (priv->pf_bond >= 0 && priv->master) {
-		/* Get bond interface info */
-		err = mlx5_sysfs_bond_info(priv->if_index,
-				     &priv->bond_ifindex,
-				     priv->bond_name);
-		if (err)
-			DRV_LOG(ERR, "unable to get bond info: %s",
-				strerror(rte_errno));
-		else
-			DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
-				priv->if_index, priv->bond_ifindex,
-				priv->bond_name);
-	}
 	eth_dev->data->dev_private = priv;
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
@@ -1738,6 +1725,8 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
  *   Netlink RDMA group socket handle.
  * @param[in] owner
 *   Representor owner PF index.
+ * @param[out] bond_info
+ *   Pointer to bonding information.
  *
  * @return
  *   negative value if no bonding device found, otherwise
@@ -1746,19 +1735,22 @@ mlx5_dev_spawn_data_cmp(const void *a, const void *b)
 static int
 mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 			   const struct rte_pci_addr *pci_dev,
-			   int nl_rdma, uint16_t owner)
+			   int nl_rdma, uint16_t owner,
+			   struct mlx5_bond_info *bond_info)
 {
 	char ifname[IF_NAMESIZE + 1];
 	unsigned int ifindex;
 	unsigned int np, i;
-	FILE *file = NULL;
+	FILE *bond_file = NULL, *file;
 	int pf = -1;
+	int ret;
 
 	/*
 	 * Try to get master device name. If something goes
 	 * wrong suppose the lack of kernel support and no
 	 * bonding devices.
 	 */
+	memset(bond_info, 0, sizeof(*bond_info));
 	if (nl_rdma < 0)
 		return -1;
 	if (!strstr(ibv_dev->name, "bond"))
@@ -1782,15 +1774,15 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		/* Try to read bonding slave names from sysfs. */
 		MKSTR(slaves,
 		      "/sys/class/net/%s/master/bonding/slaves", ifname);
-		file = fopen(slaves, "r");
-		if (file)
+		bond_file = fopen(slaves, "r");
+		if (bond_file)
 			break;
 	}
-	if (!file)
+	if (!bond_file)
 		return -1;
 	/* Use safe format to check maximal buffer length. */
 	MLX5_ASSERT(atol(RTE_STR(IF_NAMESIZE)) == IF_NAMESIZE);
-	while (fscanf(file, "%" RTE_STR(IF_NAMESIZE) "s", ifname) == 1) {
+	while (fscanf(bond_file, "%" RTE_STR(IF_NAMESIZE) "s", ifname) == 1) {
 		char tmp_str[IF_NAMESIZE + 32];
 		struct rte_pci_addr pci_addr;
 		struct mlx5_switch_info	info;
@@ -1803,13 +1795,7 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 					 " for netdev \"%s\"", ifname);
 			continue;
 		}
-		if (pci_dev->domain != pci_addr.domain ||
-		    pci_dev->bus != pci_addr.bus ||
-		    pci_dev->devid != pci_addr.devid ||
-		    pci_dev->function + owner != pci_addr.function)
-			continue;
 		/* Slave interface PCI address match found. */
-		fclose(file);
 		snprintf(tmp_str, sizeof(tmp_str),
 			 "/sys/class/net/%s/phys_port_name", ifname);
 		file = fopen(tmp_str, "rb");
@@ -1818,13 +1804,52 @@ mlx5_device_bond_pci_match(const struct ibv_device *ibv_dev,
 		info.name_type = MLX5_PHYS_PORT_NAME_TYPE_NOTSET;
 		if (fscanf(file, "%32s", tmp_str) == 1)
 			mlx5_translate_port_name(tmp_str, &info);
-		if (info.name_type == MLX5_PHYS_PORT_NAME_TYPE_LEGACY ||
-		    info.name_type == MLX5_PHYS_PORT_NAME_TYPE_UPLINK)
+		fclose(file);
+		/* Only process PF ports. */
+		if (info.name_type != MLX5_PHYS_PORT_NAME_TYPE_LEGACY &&
+		    info.name_type != MLX5_PHYS_PORT_NAME_TYPE_UPLINK)
+			continue;
+		/* Check max bonding member. */
+		if (info.port_name >= MLX5_BOND_MAX_PORTS) {
+			DRV_LOG(WARNING, "bonding index out of range, "
+				"please increase MLX5_BOND_MAX_PORTS: %s",
+				tmp_str);
+			break;
+		}
+		/* Match PCI address. */
+		if (pci_dev->domain == pci_addr.domain &&
+		    pci_dev->bus == pci_addr.bus &&
+		    pci_dev->devid == pci_addr.devid &&
+		    pci_dev->function + owner == pci_addr.function)
 			pf = info.port_name;
-		break;
-	}
-	if (file)
+		/* Get ifindex. */
+		snprintf(tmp_str, sizeof(tmp_str),
+			 "/sys/class/net/%s/ifindex", ifname);
+		file = fopen(tmp_str, "rb");
+		if (!file)
+			break;
+		ret = fscanf(file, "%u", &ifindex);
 		fclose(file);
+		if (ret != 1)
+			break;
+		/* Save bonding info. */
+		strncpy(bond_info->ports[info.port_name].ifname, ifname,
+			sizeof(bond_info->ports[0].ifname));
+		bond_info->ports[info.port_name].pci_addr = pci_addr;
+		bond_info->ports[info.port_name].ifindex = ifindex;
+		bond_info->n_port++;
+	}
+	if (pf >= 0) {
+		/* Get bond interface info */
+		ret = mlx5_sysfs_bond_info(ifindex, &bond_info->ifindex,
+					   bond_info->ifname);
+		if (ret)
+			DRV_LOG(ERR, "unable to get bond info: %s",
+				strerror(rte_errno));
+		else
+			DRV_LOG(INFO, "PF device %u, bond device %u(%s)",
+				ifindex, bond_info->ifindex, bond_info->ifname);
+	}
 	return pf;
 }
 
@@ -1879,6 +1904,7 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 	unsigned int dev_config_vf;
 	struct rte_eth_devargs eth_da = *req_eth_da;
 	struct rte_pci_addr owner_pci = pci_dev->addr; /* Owner PF. */
+	struct mlx5_bond_info bond_info;
 	int ret = -1;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
@@ -1910,7 +1936,8 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 
 		DRV_LOG(DEBUG, "checking device \"%s\"", ibv_list[ret]->name);
 		bd = mlx5_device_bond_pci_match
-				(ibv_list[ret], &owner_pci, nl_rdma, owner_id);
+				(ibv_list[ret], &owner_pci, nl_rdma, owner_id,
+				 &bond_info);
 		if (bd >= 0) {
 			/*
 			 * Bonding device detected. Only one match is allowed,
@@ -2019,6 +2046,7 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 		MLX5_ASSERT(nd == 1);
 		MLX5_ASSERT(np);
 		for (i = 1; i <= np; ++i) {
+			list[ns].bond_info = &bond_info;
 			list[ns].max_port = np;
 			list[ns].phys_port = i;
 			list[ns].phys_dev = ibv_match[0];
@@ -2109,6 +2137,7 @@ mlx5_os_pci_probe_pf(struct rte_pci_device *pci_dev,
 		 */
 		for (i = 0; i != nd; ++i) {
 			memset(&list[ns].info, 0, sizeof(list[ns].info));
+			list[ns].bond_info = NULL;
 			list[ns].max_port = 1;
 			list[ns].phys_port = 1;
 			list[ns].phys_dev = ibv_match[i];
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1a3043a4e7..303c25203a 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -928,6 +928,8 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn,
 		rte_errno  = ENOMEM;
 		goto exit;
 	}
+	if (spawn->bond_info)
+		sh->bond = *spawn->bond_info;
 	err = mlx5_os_open_device(spawn, config, sh);
 	if (!sh->ctx)
 		goto error;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 34f4bd5dfc..a1d2798373 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -115,6 +115,7 @@ struct mlx5_dev_spawn_data {
 	void *phys_dev; /**< Associated physical device. */
 	struct rte_eth_dev *eth_dev; /**< Associated Ethernet device. */
 	struct rte_pci_device *pci_dev; /**< Backend PCI device. */
+	struct mlx5_bond_info *bond_info;
 };
 
 /** Key string for IPC. */
@@ -660,6 +661,21 @@ struct mlx5_flex_parser_profiles {
 	void *obj;		/* Flex parser node object. */
 };
 
+/* Max member ports per bonding device. */
+#define MLX5_BOND_MAX_PORTS 2
+
+/* Bonding device information. */
+struct mlx5_bond_info {
+	int n_port; /* Number of bond member ports. */
+	uint32_t ifindex;
+	char ifname[MLX5_NAMESIZE + 1];
+	struct {
+		char ifname[MLX5_NAMESIZE + 1];
+		uint32_t ifindex;
+		struct rte_pci_addr pci_addr;
+	} ports[MLX5_BOND_MAX_PORTS];
+};
+
 /*
  * Shared Infiniband device context for Master/Representors
  * which belong to same IB device with multiple IB ports.
@@ -673,6 +689,7 @@ struct mlx5_dev_ctx_shared {
 	uint32_t sq_ts_format:2; /* SQ timestamp formats supported. */
 	uint32_t qp_ts_format:2; /* QP timestamp formats supported. */
 	uint32_t max_port; /* Maximal IB device port index. */
+	struct mlx5_bond_info bond; /* Bonding information. */
 	void *ctx; /* Verbs/DV/DevX context. */
 	void *pd; /* Protection Domain. */
 	uint32_t pdn; /* Protection Domain number. */
@@ -924,10 +941,8 @@ struct mlx5_priv {
 	uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG. */
 	uint32_t vport_meta_mask; /* Used for vport index field match mask. */
 	int32_t representor_id; /* -1 if not a representor. */
-	int32_t pf_bond; /* >=0 means PF index in bonding configuration. */
+	int32_t pf_bond; /* >=0, representor owner PF index in bonding. */
 	unsigned int if_index; /* Associated kernel network device index. */
-	uint32_t bond_ifindex; /**< Bond interface index. */
-	char bond_name[MLX5_NAMESIZE]; /**< Bond interface name. */
 	/* RX/TX queues. */
 	unsigned int rxqs_n; /* RX queues array size. */
 	unsigned int txqs_n; /* TX queues array size. */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 130980d4d6..4f97a69a20 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -42,7 +42,10 @@ mlx5_ifindex(const struct rte_eth_dev *dev)
 
 	MLX5_ASSERT(priv);
 	MLX5_ASSERT(priv->if_index);
-	ifindex = priv->bond_ifindex > 0 ? priv->bond_ifindex : priv->if_index;
+	if (priv->master && priv->sh->bond.ifindex > 0)
+		ifindex = priv->sh->bond.ifindex;
+	else
+		ifindex = priv->if_index;
 	if (!ifindex)
 		rte_errno = ENXIO;
 	return ifindex;
-- 
2.25.1



* [dpdk-dev] [PATCH v5 7/9] net/mlx5: fix setting VF default MAC through representor
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (45 preceding siblings ...)
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 6/9] net/mlx5: save bonding member ports information Xueming Li
@ 2021-03-28 13:48   ` Xueming Li
  2021-03-31  7:46     ` Raslan Darawsheh
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 8/9] net/mlx5: improve xstats of bonding port Xueming Li
                     ` (19 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-03-28 13:48 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, xuemingl, Asaf Penso, Matan Azrad, Shahaf Shuler

With kernel bonding, setting a VF MAC address through its representor
failed because the Netlink API requires the ifindex of the owner PF, not
the ifindex of the bonding device.

Use the owner PF ifindex to modify the VF default MAC when the device is
part of a bond.

Fixes: c21e5facf7d2 ("net/mlx5: use bond index for netdev operations")

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/mlx5_mac.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index 7b2be04889..a7946f7756 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -154,6 +154,7 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 {
 	uint16_t port_id;
 	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_priv *pf_priv;
 
 	/*
 	 * Configuring the VF instead of its representor,
@@ -162,17 +163,21 @@ mlx5_mac_addr_set(struct rte_eth_dev *dev, struct rte_ether_addr *mac_addr)
 	if (priv->representor && !mlx5_is_hpf(dev)) {
 		DRV_LOG(DEBUG, "VF represented by port %u setting primary MAC address",
 			dev->data->port_id);
+		if (priv->pf_bond >= 0) {
+			/* Bonding, get owner PF ifindex from shared data. */
+			return mlx5_os_vf_mac_addr_modify
+			       (priv,
+				priv->sh->bond.ports[priv->pf_bond].ifindex,
+				mac_addr,
+				MLX5_REPRESENTOR_REPR(priv->representor_id));
+		}
 		RTE_ETH_FOREACH_DEV_SIBLING(port_id, dev->data->port_id) {
-			priv = rte_eth_devices[port_id].data->dev_private;
-			if (priv->master == 1) {
-				priv = dev->data->dev_private;
+			pf_priv = rte_eth_devices[port_id].data->dev_private;
+			if (pf_priv->master == 1)
 				return mlx5_os_vf_mac_addr_modify
-				       (priv,
-					mlx5_ifindex(&rte_eth_devices[port_id]),
-					mac_addr,
+				       (priv, pf_priv->if_index, mac_addr,
 					MLX5_REPRESENTOR_REPR
 						(priv->representor_id));
-			}
 		}
 		rte_errno = -ENOTSUP;
 		return rte_errno;
-- 
2.25.1



* [dpdk-dev] [PATCH v5 8/9] net/mlx5: improve xstats of bonding port
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (46 preceding siblings ...)
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 7/9] net/mlx5: fix setting VF default MAC through representor Xueming Li
@ 2021-03-28 13:48   ` Xueming Li
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 9/9] net/mlx5: probe host PF representor with SubFunction Xueming Li
                     ` (18 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-03-28 13:48 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, xuemingl, Asaf Penso, Matan Azrad, Shahaf Shuler

With a kernel bonding device, xstats counters were read only from the
first PF member of the bond.

This patch reads the counters of all member PFs and sums them to produce
the bond xstats.
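
The summation can be sketched as follows. This is a simplified model, not
the driver code: the reader callback type, the caller-provided scratch
buffer and sketch_fake_read() (which only fabricates demonstration values)
are all assumptions made for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader: fills n counters for bond member port `pf`. */
typedef int (*sketch_read_fn)(int pf, uint64_t *out, size_t n);

/* Demonstration data source only: counter i of port pf is (pf+1)*(i+1). */
static int
sketch_fake_read(int pf, uint64_t *out, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		out[i] = (uint64_t)(pf + 1) * (i + 1);
	return 0;
}

/*
 * Zero the output table, then add the counters of every member port.
 * Returns 0 on success or the first non-zero reader error.
 */
static int
sketch_sum_member_counters(sketch_read_fn read_port, int n_port,
			   uint64_t *scratch, uint64_t *stats, size_t n)
{
	size_t i;
	int pf, ret;

	assert(read_port != NULL && scratch != NULL && stats != NULL);
	memset(stats, 0, n * sizeof(*stats));
	for (pf = 0; pf < n_port; pf++) {
		ret = read_port(pf, scratch, n);
		if (ret != 0)
			return ret; /* Stop at the first failing member. */
		for (i = 0; i < n; i++)
			stats[i] += scratch[i];
	}
	return 0;
}
```

Zeroing the table once before the loop is what makes accumulating with
`+=` safe for any number of member ports.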

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_ethdev_os.c | 127 +++++++++++++++++++-----
 1 file changed, 102 insertions(+), 25 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index e7ec07e364..e8aaa0d36a 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -169,10 +169,10 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[MLX5_NAMESIZE])
 }
 
 /**
- * Perform ifreq ioctl() on associated Ethernet device.
+ * Perform ifreq ioctl() on associated netdev ifname.
  *
- * @param[in] dev
- *   Pointer to Ethernet device.
+ * @param[in] ifname
+ *   Pointer to netdev name.
  * @param req
  *   Request number to pass to ioctl().
  * @param[out] ifr
@@ -182,7 +182,7 @@ mlx5_get_ifname(const struct rte_eth_dev *dev, char (*ifname)[MLX5_NAMESIZE])
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
+mlx5_ifreq_by_ifname(const char *ifname, int req, struct ifreq *ifr)
 {
 	int sock = socket(PF_INET, SOCK_DGRAM, IPPROTO_IP);
 	int ret = 0;
@@ -191,9 +191,7 @@ mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
 		rte_errno = errno;
 		return -rte_errno;
 	}
-	ret = mlx5_get_ifname(dev, &ifr->ifr_name);
-	if (ret)
-		goto error;
+	rte_strscpy(ifr->ifr_name, ifname, sizeof(ifr->ifr_name));
 	ret = ioctl(sock, req, ifr);
 	if (ret == -1) {
 		rte_errno = errno;
@@ -206,6 +204,31 @@ mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
 	return -rte_errno;
 }
 
+/**
+ * Perform ifreq ioctl() on associated Ethernet device.
+ *
+ * @param[in] dev
+ *   Pointer to Ethernet device.
+ * @param req
+ *   Request number to pass to ioctl().
+ * @param[out] ifr
+ *   Interface request structure output buffer.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+mlx5_ifreq(const struct rte_eth_dev *dev, int req, struct ifreq *ifr)
+{
+	char ifname[sizeof(ifr->ifr_name)];
+	int ret;
+
+	ret = mlx5_get_ifname(dev, &ifname);
+	if (ret)
+		return -rte_errno;
+	return mlx5_ifreq_by_ifname(ifname, req, ifr);
+}
+
 /**
  * Get device MTU.
  *
@@ -1243,6 +1266,8 @@ int mlx5_get_module_eeprom(struct rte_eth_dev *dev,
  *
  * @param dev
  *   Pointer to Ethernet device.
+ * @param[in] pf
+ *   PF index in case of bonding device, -1 otherwise
  * @param[out] stats
  *   Counters table output buffer.
  *
@@ -1250,8 +1275,8 @@ int mlx5_get_module_eeprom(struct rte_eth_dev *dev,
  *   0 on success and stats is filled, negative errno value otherwise and
  *   rte_errno is set.
  */
-int
-mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
+static int
+_mlx5_os_read_dev_counters(struct rte_eth_dev *dev, int pf, uint64_t *stats)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
 	struct mlx5_xstats_ctrl *xstats_ctrl = &priv->xstats_ctrl;
@@ -1265,7 +1290,11 @@ mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
 	et_stats->cmd = ETHTOOL_GSTATS;
 	et_stats->n_stats = xstats_ctrl->stats_n;
 	ifr.ifr_data = (caddr_t)et_stats;
-	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (pf >= 0)
+		ret = mlx5_ifreq_by_ifname(priv->sh->bond.ports[pf].ifname,
+					   SIOCETHTOOL, &ifr);
+	else
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
 	if (ret) {
 		DRV_LOG(WARNING,
 			"port %u unable to read statistic values from device",
@@ -1273,23 +1302,60 @@ mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
 		return ret;
 	}
 	for (i = 0; i != xstats_ctrl->mlx5_stats_n; ++i) {
-		if (xstats_ctrl->info[i].dev) {
-			ret = mlx5_os_read_dev_stat(priv,
-					    xstats_ctrl->info[i].ctr_name,
-					    &stats[i]);
-			/* return last xstats counter if fail to read. */
-			if (ret == 0)
-				xstats_ctrl->xstats[i] = stats[i];
-			else
-				stats[i] = xstats_ctrl->xstats[i];
-		} else {
-			stats[i] = (uint64_t)
-				et_stats->data[xstats_ctrl->dev_table_idx[i]];
-		}
+		if (xstats_ctrl->info[i].dev)
+			continue;
+		stats[i] += (uint64_t)
+			    et_stats->data[xstats_ctrl->dev_table_idx[i]];
 	}
 	return 0;
 }
 
+/**
+ * Read device counters.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ * @param[out] stats
+ *   Counters table output buffer.
+ *
+ * @return
+ *   0 on success and stats is filled, negative errno value otherwise and
+ *   rte_errno is set.
+ */
+int
+mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
+{
+	struct mlx5_priv *priv = dev->data->dev_private;
+	struct mlx5_xstats_ctrl *xstats_ctrl = &priv->xstats_ctrl;
+	int ret = 0, i;
+
+	memset(stats, 0, sizeof(*stats) * xstats_ctrl->mlx5_stats_n);
+	/* Read ifreq counters. */
+	if (priv->master && priv->pf_bond >= 0) {
+		/* Sum xstats from bonding device member ports. */
+		for (i = 0; i < priv->sh->bond.n_port; i++) {
+			ret = _mlx5_os_read_dev_counters(dev, i, stats);
+			if (ret)
+				return ret;
+		}
+	} else {
+		ret = _mlx5_os_read_dev_counters(dev, -1, stats);
+	}
+	/* Read IB counters. */
+	for (i = 0; i != xstats_ctrl->mlx5_stats_n; ++i) {
+		if (!xstats_ctrl->info[i].dev)
+			continue;
+		ret = mlx5_os_read_dev_stat(priv, xstats_ctrl->info[i].ctr_name,
+					    &stats[i]);
+		/* return last xstats counter if fail to read. */
+		if (ret == 0)
+			xstats_ctrl->xstats[i] = stats[i];
+		else
+			stats[i] = xstats_ctrl->xstats[i];
+	}
+	return ret;
+}
+
 /**
  * Query the number of statistics provided by ETHTOOL.
  *
@@ -1303,13 +1369,19 @@ mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats)
 int
 mlx5_os_get_stats_n(struct rte_eth_dev *dev)
 {
+	struct mlx5_priv *priv = dev->data->dev_private;
 	struct ethtool_drvinfo drvinfo;
 	struct ifreq ifr;
 	int ret;
 
 	drvinfo.cmd = ETHTOOL_GDRVINFO;
 	ifr.ifr_data = (caddr_t)&drvinfo;
-	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (priv->master && priv->pf_bond >= 0)
+		/* Bonding PF. */
+		ret = mlx5_ifreq_by_ifname(priv->sh->bond.ports[0].ifname,
+					   SIOCETHTOOL, &ifr);
+	else
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
 	if (ret) {
 		DRV_LOG(WARNING, "port %u unable to query number of statistics",
 			dev->data->port_id);
@@ -1480,7 +1552,12 @@ mlx5_os_stats_init(struct rte_eth_dev *dev)
 	strings->string_set = ETH_SS_STATS;
 	strings->len = dev_stats_n;
 	ifr.ifr_data = (caddr_t)strings;
-	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (priv->master && priv->pf_bond >= 0)
+		/* Bonding master. */
+		ret = mlx5_ifreq_by_ifname(priv->sh->bond.ports[0].ifname,
+					   SIOCETHTOOL, &ifr);
+	else
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
 	if (ret) {
 		DRV_LOG(WARNING, "port %u unable to get statistic names",
 			dev->data->port_id);
-- 
2.25.1



* [dpdk-dev] [PATCH v5 9/9] net/mlx5: probe host PF representor with SubFunction
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (47 preceding siblings ...)
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 8/9] net/mlx5: improve xstats of bonding port Xueming Li
@ 2021-03-28 13:48   ` Xueming Li
  2021-03-30  7:37     ` Slava Ovsiienko
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 0/5] eal: enable global device syntax by default Xueming Li
                     ` (17 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-03-28 13:48 UTC (permalink / raw)
  To: Viacheslav Ovsiienko
  Cc: dev, xuemingl, Asaf Penso, Matan Azrad, Shahaf Shuler

To simplify probing the BlueField HPF representor (vf[-1]), this patch
also allows probing it with the "sf" syntax: "sf[-1]".
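
A rough sketch of how the requested type can be carried into the encoded
representor ID for the HPF case. The field layout below (12-bit
representor number, 2-bit type, PF index above them) is an assumption
inferred from the REPR/TYPE accessor macros in mlx5.h; the enum and helper
names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local copy of the two representor types used here. */
enum sketch_repr_type { SKETCH_REPR_VF = 1, SKETCH_REPR_SF = 2 };

/* Assumed layout: pf in bits 14+, type in bits 12-13, repr in bits 0-11. */
#define SKETCH_REPR_ID(pf, type, repr) \
	((uint16_t)(((pf) << 14) | ((type) << 12) | ((repr) & 0xfff)))
#define SKETCH_REPR_REPR(id) ((id) & 0xfff)
#define SKETCH_REPR_TYPE(id) (((id) >> 12) & 3)

/*
 * Encode a representor ID. For an HPF port the type requested on the
 * command line (hpf_type) wins and the representor number becomes -1.
 */
static uint16_t
sketch_encode(uint16_t pf, enum sketch_repr_type port_type, int is_hpf,
	      enum sketch_repr_type hpf_type, uint16_t port)
{
	enum sketch_repr_type type = port_type;
	uint16_t repr = port;

	assert(pf < 4); /* Keep the result inside 16 bits. */
	if (is_hpf) {
		type = hpf_type;
		repr = UINT16_MAX;
	}
	return SKETCH_REPR_ID(pf, type, repr);
}
```

With this layout, "vf[-1]" and "sf[-1]" differ only in the type bits,
which is why the requested type has to be passed to the encoder.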

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 14 ++++++++++----
 drivers/net/mlx5/mlx5.h          |  3 ++-
 drivers/net/mlx5/mlx5_ethdev.c   | 25 +++++++++++++++++++++----
 3 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 5bdc8caee5..74f72188ff 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -710,11 +710,15 @@ mlx5_representor_match(struct mlx5_dev_spawn_data *spawn,
 	struct mlx5_switch_info *switch_info = &spawn->info;
 	unsigned int p, f;
 	uint16_t id;
-	uint16_t repr_id = mlx5_representor_id_encode(switch_info);
+	uint16_t repr_id = mlx5_representor_id_encode(switch_info,
+						      eth_da->type);
 
 	switch (eth_da->type) {
 	case RTE_ETH_REPRESENTOR_SF:
-		if (switch_info->name_type != MLX5_PHYS_PORT_NAME_TYPE_PFSF) {
+		if (!(spawn->info.port_name == -1 &&
+		      switch_info->name_type ==
+				MLX5_PHYS_PORT_NAME_TYPE_PFHPF) &&
+		    switch_info->name_type != MLX5_PHYS_PORT_NAME_TYPE_PFSF) {
 			rte_errno = EBUSY;
 			return false;
 		}
@@ -742,7 +746,8 @@ mlx5_representor_match(struct mlx5_dev_spawn_data *spawn,
 		if (spawn->pf_bond < 0) {
 			/* For non-LAG mode, allow and ignore pf. */
 			switch_info->pf_num = eth_da->ports[p];
-			repr_id = mlx5_representor_id_encode(switch_info);
+			repr_id = mlx5_representor_id_encode(switch_info,
+							     eth_da->type);
 		}
 		for (f = 0; f < eth_da->nb_representor_ports; ++f) {
 			id = MLX5_REPRESENTOR_ID
@@ -1107,7 +1112,8 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	priv->vport_id = switch_info->representor ?
 			 switch_info->port_name + 1 : -1;
 #endif
-	priv->representor_id = mlx5_representor_id_encode(switch_info);
+	priv->representor_id = mlx5_representor_id_encode(switch_info,
+							  eth_da->type);
 	/*
 	 * Look for sibling devices in order to reuse their switch domain
 	 * if any, otherwise allocate one.
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index a1d2798373..fa9e68ded9 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1055,7 +1055,8 @@ int mlx5_representor_info_get(struct rte_eth_dev *dev,
 		((repr_id) & 0xfff)
 #define MLX5_REPRESENTOR_TYPE(repr_id) \
 		(((repr_id) >> 12) & 3)
-uint16_t mlx5_representor_id_encode(const struct mlx5_switch_info *info);
+uint16_t mlx5_representor_id_encode(const struct mlx5_switch_info *info,
+				    enum rte_eth_representor_type hpf_type);
 int mlx5_fw_version_get(struct rte_eth_dev *dev, char *fw_ver,
 			size_t fw_size);
 int mlx5_dev_infos_get(struct rte_eth_dev *dev,
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 4f97a69a20..564d7132e0 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -363,12 +363,15 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
  *
  * @param info
  *   Port switch info.
+ * @param hpf_type
+ *   Use this type if port is HPF.
  *
  * @return
  *   Encoded representor ID.
  */
 uint16_t
-mlx5_representor_id_encode(const struct mlx5_switch_info *info)
+mlx5_representor_id_encode(const struct mlx5_switch_info *info,
+			   enum rte_eth_representor_type hpf_type)
 {
 	enum rte_eth_representor_type type = RTE_ETH_REPRESENTOR_VF;
 	uint16_t repr = info->port_name;
@@ -377,8 +380,10 @@ mlx5_representor_id_encode(const struct mlx5_switch_info *info)
 		return UINT16_MAX;
 	if (info->name_type == MLX5_PHYS_PORT_NAME_TYPE_PFSF)
 		type = RTE_ETH_REPRESENTOR_SF;
-	if (info->name_type == MLX5_PHYS_PORT_NAME_TYPE_PFHPF)
+	if (info->name_type == MLX5_PHYS_PORT_NAME_TYPE_PFHPF) {
+		type = hpf_type;
 		repr = UINT16_MAX;
+	}
 	return MLX5_REPRESENTOR_ID(info->pf_num, type, repr);
 }
 
@@ -403,7 +408,7 @@ mlx5_representor_info_get(struct rte_eth_dev *dev,
 			  struct rte_eth_representor_info *info)
 {
 	struct mlx5_priv *priv = dev->data->dev_private;
-	int n_type = 3; /* Number of representor types, VF, HPF and SF. */
+	int n_type = 4; /* Representor types, VF, HPF@VF, SF and HPF@SF. */
 	int n_pf = 2; /* Number of PFs. */
 	int i = 0, pf;
 
@@ -424,7 +429,7 @@ mlx5_representor_info_get(struct rte_eth_dev *dev,
 		snprintf(info->ranges[i].name,
 			 sizeof(info->ranges[i].name), "pf%dvf", pf);
 		i++;
-		/* HPF range. */
+		/* HPF range of VF type. */
 		info->ranges[i].type = RTE_ETH_REPRESENTOR_VF;
 		info->ranges[i].controller = 0;
 		info->ranges[i].pf = pf;
@@ -448,6 +453,18 @@ mlx5_representor_info_get(struct rte_eth_dev *dev,
 		snprintf(info->ranges[i].name,
 			 sizeof(info->ranges[i].name), "pf%dsf", pf);
 		i++;
+		/* HPF range of SF type. */
+		info->ranges[i].type = RTE_ETH_REPRESENTOR_SF;
+		info->ranges[i].controller = 0;
+		info->ranges[i].pf = pf;
+		info->ranges[i].vf = UINT16_MAX;
+		info->ranges[i].id_base =
+			MLX5_REPRESENTOR_ID(pf, info->ranges[i].type, -1);
+		info->ranges[i].id_end =
+			MLX5_REPRESENTOR_ID(pf, info->ranges[i].type, -1);
+		snprintf(info->ranges[i].name,
+			 sizeof(info->ranges[i].name), "pf%dsf", pf);
+		i++;
 	}
 out:
 	return n_type * n_pf;
-- 
2.25.1



* Re: [dpdk-dev] [PATCH v5 9/9] net/mlx5: probe host PF representor with SubFunction
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 9/9] net/mlx5: probe host PF representor with SubFunction Xueming Li
@ 2021-03-30  7:37     ` Slava Ovsiienko
  0 siblings, 0 replies; 118+ messages in thread
From: Slava Ovsiienko @ 2021-03-30  7:37 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: dev, Xueming(Steven) Li, Asaf Penso, Matan Azrad, Shahaf Shuler

> -----Original Message-----
> From: Xueming Li <xuemingl@nvidia.com>
> Sent: Sunday, March 28, 2021 16:48
> To: Slava Ovsiienko <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf Penso
> <asafp@nvidia.com>; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>
> Subject: [PATCH v5 9/9] net/mlx5: probe host PF representor with
> SubFunction
> 
> To simplify BlueField HPF representor(vf[-1]) probe, this patch allows probe it
> with "sf" syntax: "sf[-1]".
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>


* [dpdk-dev] [PATCH v3 0/5] eal: enable global device syntax by default
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (48 preceding siblings ...)
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 9/9] net/mlx5: probe host PF representor with SubFunction Xueming Li
@ 2021-03-30 12:15   ` Xueming Li
  2021-03-31  8:23     ` Gaëtan Rivet
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 1/5] devargs: unify scratch buffer storage Xueming Li
                     ` (16 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-03-30 12:15 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev, xuemingl, Asaf Penso

The new Global Device Syntax [1] identifies a device with a full
bus, class and driver description, for example:
 -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,...

This patchset fixes bugs and enables the global device syntax with
backward compatibility by:
- unifying devargs memory buffer cleanup
- parsing the device name through a bus callback
- trying the new global syntax first and falling back to legacy parsing.
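
The "try global syntax first, then fall back" idea can be sketched as
below. This is not the EAL parser: the struct, the function name and the
rule "every '/'-separated layer must start with bus=, class= or driver="
are assumptions chosen to keep the example small:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result: each pointer refers into the scratch buffer. */
struct sketch_layers {
	const char *bus; /* text after "bus=" */
	const char *cls; /* text after "class=" */
	const char *drv; /* text after "driver=" */
};

/*
 * Copy src into the scratch buffer and split it on '/'.
 * Returns 0 if every layer is recognized, -1 if the caller should fall
 * back to the legacy parser (out may be partially filled in that case).
 * src itself is never modified, so the fallback sees the original string.
 */
static int
sketch_parse_global(const char *src, char *buf, size_t buflen,
		    struct sketch_layers *out)
{
	size_t len;
	char *p;

	assert(src != NULL && buf != NULL && out != NULL);
	memset(out, 0, sizeof(*out));
	len = strlen(src);
	if (len >= buflen)
		return -1;
	memcpy(buf, src, len + 1);
	p = buf;
	while (p != NULL) {
		char *next = strchr(p, '/');

		if (next != NULL)
			*next++ = '\0';
		if (strncmp(p, "bus=", 4) == 0)
			out->bus = p + 4;
		else if (strncmp(p, "class=", 6) == 0)
			out->cls = p + 6;
		else if (strncmp(p, "driver=", 7) == 0)
			out->drv = p + 7;
		else
			return -1;
		p = next;
	}
	return 0;
}
```

A legacy string such as "0000:82:00.0,dv_flow_en=1" fails on its first
layer, so the caller can hand the untouched original string to the legacy
parser.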


History:

V1:
 - Initial version

V2:
 - add devargs.src as complete source dev string
 - change devargs.data to scratch buffer
 - add rte_devargs_free() to release scratch memory
 - change name policy to align with rte_eth_iterator_init()
 - remove PCI bus fix as name already resolved in rte_devargs_parse().
V3:
 - remove devargs.src
 - rename rte_devargs_free() to rte_devargs_reset()
 - add bus callback api to resolve devargs.

[1] Global Device Syntax:
https://www.dpdk.org/wp-content/uploads/sites/35/2018/10/am-07-DPDK-hotplug-20180905.pdf

[2] RFC:
http://patchwork.dpdk.org/project/dpdk/list/?series=14378

[3] V1:
http://patchwork.dpdk.org/project/dpdk/list/?series=14610

[4] V2:
http://patchwork.dpdk.org/project/dpdk/list/?series=14816


Xueming Li (5):
  devargs: unify scratch buffer storage
  devargs: fix memory leak on parsing error
  kvargs: add get by key function
  bus: add device arguments name parsing API
  devargs: parse global device syntax

 app/test-pmd/config.c                        |  3 +-
 app/test-pmd/testpmd.c                       |  5 +-
 drivers/bus/pci/pci_common.c                 |  1 +
 drivers/bus/pci/pci_params.c                 | 48 +++++++++++++++++
 drivers/bus/pci/private.h                    | 14 +++++
 drivers/bus/vdev/vdev.c                      | 10 ++--
 drivers/bus/vdev/vdev_params.c               | 43 +++++++++++++++
 drivers/bus/vdev/vdev_private.h              | 15 ++++++
 drivers/net/failsafe/failsafe_args.c         |  3 +-
 drivers/net/failsafe/failsafe_eal.c          |  2 +-
 examples/multi_process/hotplug_mp/commands.c |  6 +--
 lib/librte_eal/common/eal_common_dev.c       |  9 ++--
 lib/librte_eal/common/eal_common_devargs.c   | 57 ++++++++++++++------
 lib/librte_eal/common/hotplug_mp.c           |  6 +--
 lib/librte_eal/include/rte_bus.h             | 19 +++++++
 lib/librte_eal/include/rte_devargs.h         | 18 +++++--
 lib/librte_eal/rte_eal_exports.def           |  1 +
 lib/librte_eal/version.map                   |  1 +
 lib/librte_ethdev/rte_ethdev.c               |  8 +--
 lib/librte_kvargs/rte_kvargs.c               | 20 +++++++
 lib/librte_kvargs/rte_kvargs.h               | 21 ++++++++
 lib/librte_kvargs/version.map                |  1 +
 22 files changed, 263 insertions(+), 48 deletions(-)

-- 
2.25.1



* [dpdk-dev] [PATCH v3 1/5] devargs: unify scratch buffer storage
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (49 preceding siblings ...)
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 0/5] eal: enable global device syntax by default Xueming Li
@ 2021-03-30 12:15   ` Xueming Li
  2021-04-01  9:04     ` Kinsella, Ray
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 2/5] devargs: fix memory leak on parsing error Xueming Li
                     ` (15 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-03-30 12:15 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet
  Cc: dev, xuemingl, Asaf Penso, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Gaetan Rivet, Anatoly Burakov, Dmitry Kozlyuk,
	Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam,
	Ray Kinsella, Neil Horman, Ferruh Yigit, Andrew Rybchenko

In the current design, the legacy parser rte_devargs_parse() saves its
scratch buffer to devargs.args, while the new parser
rte_devargs_layers_parse() saves it to devargs.data. Code using devargs
has to know the difference and free memory accordingly, which is error
prone.

This patch unifies the scratch buffer in the data field and introduces
rte_devargs_reset() to wrap the memory cleanup logic.
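
The ownership rule can be illustrated with a reduced model. The struct and
both helpers below are hypothetical and much smaller than struct
rte_devargs; they only show the idea that one field owns the allocation
and a single reset function releases it:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reduced devargs: data owns the memory, args only points. */
struct sketch_devargs {
	const char *args; /* view into data, never freed directly */
	char *data;       /* the single owned scratch buffer */
};

/* Store a private copy of args; returns 0 on success, -1 on ENOMEM. */
static int
sketch_devargs_set(struct sketch_devargs *da, const char *args)
{
	size_t len;
	char *copy;

	assert(da != NULL);
	if (args == NULL)
		args = "";
	len = strlen(args) + 1;
	copy = malloc(len);
	if (copy == NULL)
		return -1;
	memcpy(copy, args, len);
	free(da->data); /* Drop any previous buffer. */
	da->data = copy;
	da->args = copy;
	return 0;
}

/* Release the scratch buffer; safe to call repeatedly. */
static void
sketch_devargs_reset(struct sketch_devargs *da)
{
	if (da == NULL)
		return;
	free(da->data);
	da->data = NULL;
	da->args = NULL;
}
```

Callers then never need to know which parser filled the structure: every
exit path calls the reset helper once.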

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 app/test-pmd/config.c                        |  3 +-
 app/test-pmd/testpmd.c                       |  5 +--
 drivers/bus/vdev/vdev.c                      |  9 +++---
 drivers/net/failsafe/failsafe_args.c         |  3 +-
 drivers/net/failsafe/failsafe_eal.c          |  2 +-
 examples/multi_process/hotplug_mp/commands.c |  6 ++--
 lib/librte_eal/common/eal_common_dev.c       |  9 +++---
 lib/librte_eal/common/eal_common_devargs.c   | 34 +++++++++++---------
 lib/librte_eal/common/hotplug_mp.c           |  6 ++--
 lib/librte_eal/include/rte_devargs.h         | 18 ++++++++---
 lib/librte_eal/rte_eal_exports.def           |  1 +
 lib/librte_eal/version.map                   |  1 +
 lib/librte_ethdev/rte_ethdev.c               |  8 ++---
 13 files changed, 59 insertions(+), 46 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index ef0b9784d0..d774610419 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -509,8 +509,6 @@ device_infos_display(const char *identifier)
 
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -558,6 +556,7 @@ device_infos_display(const char *identifier)
 			}
 		}
 	};
+	rte_devargs_reset(&da);
 }
 
 void
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 96d2e0fcec..d4be23f8f8 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3015,8 +3015,6 @@ detach_devargs(char *identifier)
 	memset(&da, 0, sizeof(da));
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -3025,6 +3023,7 @@ detach_devargs(char *identifier)
 			if (ports[port_id].port_status != RTE_PORT_STOPPED) {
 				printf("Port %u not stopped\n", port_id);
 				rte_eth_iterator_cleanup(&iterator);
+				rte_devargs_reset(&da);
 				return;
 			}
 			port_flow_flush(port_id);
@@ -3034,6 +3033,7 @@ detach_devargs(char *identifier)
 	if (rte_eal_hotplug_remove(da.bus->name, da.name) != 0) {
 		TESTPMD_LOG(ERR, "Failed to detach device %s(%s)\n",
 			    da.name, da.bus->name);
+		rte_devargs_reset(&da);
 		return;
 	}
 
@@ -3042,6 +3042,7 @@ detach_devargs(char *identifier)
 	printf("Device %s is detached\n", identifier);
 	printf("Now total ports is %d\n", nb_ports);
 	printf("Done\n");
+	rte_devargs_reset(&da);
 }
 
 void
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 9a673347ae..d075409942 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -245,13 +245,14 @@ alloc_devargs(const char *name, const char *args)
 
 	devargs->bus = &rte_vdev_bus;
 	if (args)
-		devargs->args = strdup(args);
+		devargs->data = strdup(args);
 	else
-		devargs->args = strdup("");
+		devargs->data = strdup("");
+	devargs->args = devargs->data;
 
 	ret = strlcpy(devargs->name, name, sizeof(devargs->name));
 	if (ret < 0 || ret >= (int)sizeof(devargs->name)) {
-		free(devargs->args);
+		rte_devargs_reset(devargs);
 		free(devargs);
 		return NULL;
 	}
@@ -305,7 +306,7 @@ insert_vdev(const char *name, const char *args,
 
 	return 0;
 fail:
-	free(devargs->args);
+	rte_devargs_reset(devargs);
 	free(devargs);
 	free(dev);
 	return ret;
diff --git a/drivers/net/failsafe/failsafe_args.c b/drivers/net/failsafe/failsafe_args.c
index 707490b94c..b203e02d9a 100644
--- a/drivers/net/failsafe/failsafe_args.c
+++ b/drivers/net/failsafe/failsafe_args.c
@@ -451,8 +451,7 @@ failsafe_args_free(struct rte_eth_dev *dev)
 		sdev->cmdline = NULL;
 		free(sdev->fd_str);
 		sdev->fd_str = NULL;
-		free(sdev->devargs.args);
-		sdev->devargs.args = NULL;
+		rte_devargs_reset(&sdev->devargs);
 	}
 }
 
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index b9fc508673..cb4a2abc02 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -79,7 +79,7 @@ fs_bus_init(struct rte_eth_dev *dev)
 					rte_eth_devices[pid].device->devargs;
 
 			/* Take control of probed device. */
-			free(da->args);
+			rte_devargs_reset(da);
 			memset(da, 0, sizeof(*da));
 			if (probed_da != NULL)
 				snprintf(devstr, sizeof(devstr), "%s,%s",
diff --git a/examples/multi_process/hotplug_mp/commands.c b/examples/multi_process/hotplug_mp/commands.c
index a8a39d07f7..48fd329583 100644
--- a/examples/multi_process/hotplug_mp/commands.c
+++ b/examples/multi_process/hotplug_mp/commands.c
@@ -121,8 +121,6 @@ static void cmd_dev_attach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -131,6 +129,7 @@ static void cmd_dev_attach_parsed(void *parsed_result,
 	else
 		cmdline_printf(cl, "failed to attached device %s\n",
 				da.name);
+	rte_devargs_reset(&da);
 }
 
 cmdline_parse_token_string_t cmd_dev_attach_attach =
@@ -168,8 +167,6 @@ static void cmd_dev_detach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -180,6 +177,7 @@ static void cmd_dev_detach_parsed(void *parsed_result,
 	else
 		cmdline_printf(cl, "failed to dettach device %s\n",
 			da.name);
+	rte_devargs_reset(&da);
 }
 
 cmdline_parse_token_string_t cmd_dev_detach_detach =
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 8a3bd3100a..148a23830a 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -185,10 +185,8 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
 	return ret;
 
 err_devarg:
-	if (rte_devargs_remove(da) != 0) {
-		free(da->args);
-		free(da);
-	}
+	if (rte_devargs_remove(da) != 0)
+		rte_devargs_reset(da);
 	return ret;
 }
 
@@ -586,7 +584,8 @@ rte_dev_iterator_init(struct rte_dev_iterator *it,
 	it->bus_str = NULL;
 	it->cls_str = NULL;
 
-	devargs.data = dev_str;
+	/* Setting data field implies no malloc in parsing. */
+	devargs.data = (void *)(intptr_t)dev_str;
 	if (rte_devargs_layers_parse(&devargs, dev_str))
 		goto get_out;
 
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index fcf3d9a3cc..48f85ee9c0 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -150,7 +150,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 	 * their parsing afterward.
 	 */
 	if (devargs->data != devstr) {
-		char *s = (void *)(intptr_t)(devargs->data);
+		char *s = devargs->data;
 
 		while ((s = strchr(s, '/'))) {
 			*s = '\0';
@@ -219,13 +219,14 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	da->bus = bus;
 	/* Parse eventual device arguments */
 	if (devname[i] == ',')
-		da->args = strdup(&devname[i + 1]);
+		da->data = strdup(&devname[i + 1]);
 	else
-		da->args = strdup("");
-	if (da->args == NULL) {
+		da->data = strdup("");
+	if (da->data == NULL) {
 		RTE_LOG(ERR, EAL, "not enough memory to parse arguments\n");
 		return -ENOMEM;
 	}
+	da->drv_str = da->data;
 	return 0;
 }
 
@@ -260,6 +261,16 @@ rte_devargs_parsef(struct rte_devargs *da, const char *format, ...)
 	return ret;
 }
 
+void
+rte_devargs_reset(struct rte_devargs *da)
+{
+	if (da == NULL)
+		return;
+	if (da->data)
+		free(da->data);
+	da->data = NULL;
+}
+
 int
 rte_devargs_insert(struct rte_devargs **da)
 {
@@ -276,15 +287,8 @@ rte_devargs_insert(struct rte_devargs **da)
 		if (strcmp(listed_da->bus->name, (*da)->bus->name) == 0 &&
 				strcmp(listed_da->name, (*da)->name) == 0) {
 			/* device already in devargs list, must be updated */
-			listed_da->type = (*da)->type;
-			listed_da->policy = (*da)->policy;
-			free(listed_da->args);
-			listed_da->args = (*da)->args;
-			listed_da->bus = (*da)->bus;
-			listed_da->cls = (*da)->cls;
-			listed_da->bus_str = (*da)->bus_str;
-			listed_da->cls_str = (*da)->cls_str;
-			listed_da->data = (*da)->data;
+			rte_devargs_reset(listed_da);
+			*listed_da = **da;
 			/* replace provided devargs with found one */
 			free(*da);
 			*da = listed_da;
@@ -326,7 +330,7 @@ rte_devargs_add(enum rte_devtype devtype, const char *devargs_str)
 
 fail:
 	if (devargs) {
-		free(devargs->args);
+		rte_devargs_reset(devargs);
 		free(devargs);
 	}
 
@@ -346,7 +350,7 @@ rte_devargs_remove(struct rte_devargs *devargs)
 		if (strcmp(d->bus->name, devargs->bus->name) == 0 &&
 		    strcmp(d->name, devargs->name) == 0) {
 			TAILQ_REMOVE(&devargs_list, d, next);
-			free(d->args);
+			rte_devargs_reset(d);
 			free(d);
 			return 0;
 		}
diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c
index ee791903b3..ae6010e8f8 100644
--- a/lib/librte_eal/common/hotplug_mp.c
+++ b/lib/librte_eal/common/hotplug_mp.c
@@ -95,6 +95,7 @@ __handle_secondary_request(void *param)
 
 	tmp_req = *req;
 
+	memset(&da, 0, sizeof(da));
 	if (req->t == EAL_DEV_REQ_TYPE_ATTACH) {
 		ret = local_dev_probe(req->devargs, &dev);
 		if (ret != 0) {
@@ -118,8 +119,6 @@ __handle_secondary_request(void *param)
 		ret = rte_devargs_parse(&da, req->devargs);
 		if (ret != 0)
 			goto finish;
-		free(da.args); /* we don't need those */
-		da.args = NULL;
 
 		ret = eal_dev_hotplug_request_to_secondary(&tmp_req);
 		if (ret != 0) {
@@ -176,6 +175,7 @@ __handle_secondary_request(void *param)
 	if (ret)
 		RTE_LOG(ERR, EAL, "failed to send response to secondary\n");
 
+	rte_devargs_reset(&da);
 	free(bundle->peer);
 	free(bundle);
 }
@@ -283,7 +283,7 @@ static void __handle_primary_request(void *param)
 
 		ret = local_dev_remove(dev);
 quit:
-		free(da->args);
+		rte_devargs_reset(da);
 		free(da);
 		break;
 	default:
diff --git a/lib/librte_eal/include/rte_devargs.h b/lib/librte_eal/include/rte_devargs.h
index 296f19324f..134b44a887 100644
--- a/lib/librte_eal/include/rte_devargs.h
+++ b/lib/librte_eal/include/rte_devargs.h
@@ -60,16 +60,16 @@ struct rte_devargs {
 	/** Name of the device. */
 	char name[RTE_DEV_NAME_MAX_LEN];
 	RTE_STD_C11
-	union {
-	/** Arguments string as given by user or "" for no argument. */
-		char *args;
+	union { /**< driver-related part of device string. */
+		const char *args; /**< legacy name. */
 		const char *drv_str;
 	};
 	struct rte_bus *bus; /**< bus handle. */
 	struct rte_class *cls; /**< class handle. */
 	const char *bus_str; /**< bus-related part of device string. */
 	const char *cls_str; /**< class-related part of device string. */
-	const char *data; /**< Device string storage. */
+	char *data;
+	/**< Raw string including bus, class and driver arguments. */
 };
 
 /**
@@ -145,6 +145,16 @@ rte_devargs_parsef(struct rte_devargs *da,
 		   const char *format, ...)
 __rte_format_printf(2, 0);
 
+/**
+ * Free resources in devargs.
+ *
+ * @param da
+ *   The devargs structure holding the device information.
+ */
+__rte_experimental
+void
+rte_devargs_reset(struct rte_devargs *da);
+
 /**
  * Insert an rte_devargs in the global list.
  *
diff --git a/lib/librte_eal/rte_eal_exports.def b/lib/librte_eal/rte_eal_exports.def
index 474cf123fa..53933b3269 100644
--- a/lib/librte_eal/rte_eal_exports.def
+++ b/lib/librte_eal/rte_eal_exports.def
@@ -30,6 +30,7 @@ EXPORTS
 	rte_devargs_parse
 	rte_devargs_parsef
 	rte_devargs_remove
+	rte_devargs_reset
 	rte_devargs_type_count
 	rte_dump_physmem_layout
 	rte_dump_stack
diff --git a/lib/librte_eal/version.map b/lib/librte_eal/version.map
index 48a2b55d57..38c2aea5be 100644
--- a/lib/librte_eal/version.map
+++ b/lib/librte_eal/version.map
@@ -415,6 +415,7 @@ EXPERIMENTAL {
 	rte_thread_tls_value_set;
 
 	# added in 21.05
+	rte_devargs_reset;
 	rte_version_minor;
 	rte_version_month;
 	rte_version_prefix;
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 3059aa55b3..e11a95558f 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -193,13 +193,14 @@ int
 rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 {
 	int ret;
-	struct rte_devargs devargs = {.args = NULL};
+	struct rte_devargs devargs;
 	const char *bus_param_key;
 	char *bus_str = NULL;
 	char *cls_str = NULL;
 	int str_size;
 
 	memset(iter, 0, sizeof(*iter));
+	memset(&devargs, 0, sizeof(devargs));
 
 	/*
 	 * The devargs string may use various syntaxes:
@@ -244,8 +245,6 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 		goto error;
 	}
 	iter->cls_str = cls_str;
-	free(devargs.args); /* allocated by rte_devargs_parse() */
-	devargs.args = NULL;
 
 	iter->bus = devargs.bus;
 	if (iter->bus->dev_iterate == NULL) {
@@ -278,13 +277,14 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 
 end:
 	iter->cls = rte_class_find_by_name("eth");
+	rte_devargs_reset(&devargs);
 	return 0;
 
 error:
 	if (ret == -ENOTSUP)
 		RTE_ETHDEV_LOG(ERR, "Bus %s does not support iterating.\n",
 				iter->bus->name);
-	free(devargs.args);
+	rte_devargs_reset(&devargs);
 	free(bus_str);
 	free(cls_str);
 	return ret;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v3 2/5] devargs: fix memory leak on parsing error
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (50 preceding siblings ...)
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 1/5] devargs: unify scratch buffer storage Xueming Li
@ 2021-03-30 12:15   ` Xueming Li
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 3/5] kvargs: add get by key function Xueming Li
                     ` (14 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-03-30 12:15 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet
  Cc: dev, xuemingl, Asaf Penso, gaetan.rivet, stable, Shreyansh Jain

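This patch fixes a memory leak in the error handling path of
rte_devargs_layers_parse(): when the function itself duplicated the
device string into devargs->data, that copy is now freed on failure.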

Fixes: 338327d731e6 ("devargs: add function to parse device layers")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_eal/common/eal_common_devargs.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index 48f85ee9c0..e40b91ea66 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -60,6 +60,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 	size_t nblayer;
 	size_t i = 0;
 	int ret = 0;
+	bool allocated_data = false;
 
 	/* Split each sub-lists. */
 	nblayer = devargs_layer_count(devstr);
@@ -81,6 +82,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 			ret = -ENOMEM;
 			goto get_out;
 		}
+		allocated_data = true;
 		s = devargs->data;
 	}
 
@@ -163,8 +165,14 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist)
 			rte_kvargs_free(layers[i].kvlist);
 	}
-	if (ret != 0)
+	if (ret != 0) {
+		if (allocated_data) {
+			/* Free duplicated data. */
+			free(devargs->data);
+			devargs->data = NULL;
+		}
 		rte_errno = -ret;
+	}
 	return ret;
 }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v3 3/5] kvargs: add get by key function
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (51 preceding siblings ...)
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 2/5] devargs: fix memory leak on parsing error Xueming Li
@ 2021-03-30 12:15   ` Xueming Li
  2021-04-01  9:06     ` Kinsella, Ray
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 4/5] bus: add device arguments name parsing API Xueming Li
                     ` (13 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-03-30 12:15 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet
  Cc: dev, xuemingl, Asaf Penso, Olivier Matz, Ray Kinsella, Neil Horman

Add a new function, rte_kvargs_get(), that returns the value of a
specific key from a kvargs list.
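
A rough, self-contained sketch of the lookup is shown here for
illustration. The types demo_pair and demo_kvlist are hypothetical
simplifications of the rte_kvargs structures, and the NULL-key case
handled by the patch is left out on purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_PAIRS 8

/* Hypothetical, simplified stand-ins for the rte_kvargs types. */
struct demo_pair {
	const char *key;
	const char *value;
};

struct demo_kvlist {
	unsigned int count;
	struct demo_pair pairs[DEMO_MAX_PAIRS];
};

/* Return the value of the first pair whose key matches, else NULL. */
static const char *
demo_kvargs_get(const struct demo_kvlist *kvlist, const char *key)
{
	unsigned int i;

	if (kvlist == NULL || key == NULL)
		return NULL;
	assert(kvlist->count <= DEMO_MAX_PAIRS);
	for (i = 0; i < kvlist->count; ++i) {
		if (kvlist->pairs[i].key != NULL &&
		    strcmp(kvlist->pairs[i].key, key) == 0)
			return kvlist->pairs[i].value;
	}
	return NULL;
}
```

The returned pointer refers to storage owned by the list, so the caller
must neither modify nor free it.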

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_kvargs/rte_kvargs.c | 20 ++++++++++++++++++++
 lib/librte_kvargs/rte_kvargs.h | 21 +++++++++++++++++++++
 lib/librte_kvargs/version.map  |  1 +
 3 files changed, 42 insertions(+)

diff --git a/lib/librte_kvargs/rte_kvargs.c b/lib/librte_kvargs/rte_kvargs.c
index ffae8914cf..40e7670ab3 100644
--- a/lib/librte_kvargs/rte_kvargs.c
+++ b/lib/librte_kvargs/rte_kvargs.c
@@ -203,6 +203,26 @@ rte_kvargs_free(struct rte_kvargs *kvlist)
 	free(kvlist);
 }
 
+/* Lookup a value in an rte_kvargs list by its key. */
+const char *
+rte_kvargs_get(const struct rte_kvargs *kvlist, const char *key)
+{
+	unsigned int i;
+
+	if (!kvlist)
+		return NULL;
+	for (i = 0; i < kvlist->count; ++i) {
+		/* Allows key to be NULL. */
+		if (!key && !kvlist->pairs[i].key)
+			return kvlist->pairs[i].value;
+		if (!key || !kvlist->pairs[i].key)
+			continue;
+		if (!strcmp(kvlist->pairs[i].key, key))
+			return kvlist->pairs[i].value;
+	}
+	return NULL;
+}
+
 /*
  * Parse the arguments "key=value,key=value,..." string and return
  * an allocated structure that contains a key/value list. Also
diff --git a/lib/librte_kvargs/rte_kvargs.h b/lib/librte_kvargs/rte_kvargs.h
index eff598e08b..cb3ea99850 100644
--- a/lib/librte_kvargs/rte_kvargs.h
+++ b/lib/librte_kvargs/rte_kvargs.h
@@ -114,6 +114,27 @@ struct rte_kvargs *rte_kvargs_parse_delim(const char *args,
  */
 void rte_kvargs_free(struct rte_kvargs *kvlist);
 
+/**
+ * Get the value associated with a given key.
+ *
+ * If the key is NULL, the value of the first key-less pair is returned.
+ * If multiple keys match, the value of the first one is returned.
+ *
+ * The memory returned is allocated as part of the rte_kvargs structure,
+ * it must never be modified.
+ *
+ * @param kvlist
+ *   A list of rte_kvargs 'key=value' pairs.
+ * @param key
+ *   The matching key.
+ *
+ * @return
+ *   NULL if no key matches the input, a value associated with a matching
+ *   key otherwise.
+ */
+__rte_experimental
+const char *rte_kvargs_get(const struct rte_kvargs *kvlist, const char *key);
+
 /**
  * Call a handler function for each key/value matching the key
  *
diff --git a/lib/librte_kvargs/version.map b/lib/librte_kvargs/version.map
index ed375bf4a3..e2bf792c60 100644
--- a/lib/librte_kvargs/version.map
+++ b/lib/librte_kvargs/version.map
@@ -12,6 +12,7 @@ DPDK_21 {
 EXPERIMENTAL {
 	global:
 
+	rte_kvargs_get;
 	rte_kvargs_parse_delim;
 	rte_kvargs_strcmp;
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v3 4/5] bus: add device arguments name parsing API
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (52 preceding siblings ...)
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 3/5] kvargs: add get by key function Xueming Li
@ 2021-03-30 12:15   ` Xueming Li
  2021-03-31 10:19     ` Thomas Monjalon
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 5/5] devargs: parse global device syntax Xueming Li
                     ` (12 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-03-30 12:15 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev, xuemingl, Asaf Penso

To use the Global Device Syntax as devargs, a device name is required
for device management.

In the legacy parsing API, the devargs name was extracted after the bus
name:
  bus:name,kv_params,,,

To parse the new Global Device Syntax, this patch introduces a new bus
API that parses the devargs and updates the name. Each bus driver may
choose a different key from the parameters, with unified formatting.
Examples:
 -a bus=pci,addr=83:00.0/class=eth/driver=mlx5,...
    name: 0000:83:00.0
 -a bus=vdev,name=pcap0/class=eth/driver=pcap,...
    name: pcap0
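
As an illustration of how a bus could derive the name from its own
layer, here is a minimal sketch. The helper demo_layer_value is
hypothetical and scans the string by hand instead of using rte_kvargs.
It also skips the PCI address canonicalisation that the PCI bus does
through rte_pci_addr_parse() and rte_pci_device_name().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of 'key' from the first layer of a device string
 * such as "bus=vdev,name=pcap0/class=eth" into 'out'.
 * Return 0 on success, -ENODEV if the key is absent from the layer,
 * -E2BIG if the value does not fit in 'out'.
 */
static int
demo_layer_value(const char *layer, const char *key, char *out, size_t len)
{
	size_t klen = strlen(key);
	const char *p = layer;

	/* A '/' ends the bus layer, a ',' separates key=value pairs. */
	while (*p != '\0' && *p != '/') {
		size_t n = strcspn(p, ",/"); /* length of this pair */

		if (n > klen && strncmp(p, key, klen) == 0 &&
		    p[klen] == '=') {
			size_t vlen = n - klen - 1;

			if (vlen >= len)
				return -E2BIG;
			memcpy(out, p + klen + 1, vlen);
			out[vlen] = '\0';
			return 0;
		}
		p += n;
		if (*p == ',')
			p++;
	}
	return -ENODEV;
}
```

With the vdev example above, looking up "name" yields "pcap0", while a
PCI string that only carries "addr" reports -ENODEV for "name".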

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 drivers/bus/pci/pci_common.c               |  1 +
 drivers/bus/pci/pci_params.c               | 48 ++++++++++++++++++++++
 drivers/bus/pci/private.h                  | 14 +++++++
 drivers/bus/vdev/vdev.c                    |  1 +
 drivers/bus/vdev/vdev_params.c             | 43 +++++++++++++++++++
 drivers/bus/vdev/vdev_private.h            | 15 +++++++
 lib/librte_eal/common/eal_common_devargs.c |  6 +++
 lib/librte_eal/include/rte_bus.h           | 19 +++++++++
 8 files changed, 147 insertions(+)

diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 9b8d769287..61d3f51452 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -760,6 +760,7 @@ struct rte_pci_bus rte_pci_bus = {
 		.dev_iterate = rte_pci_dev_iterate,
 		.hot_unplug_handler = pci_hot_unplug_handler,
 		.sigbus_handler = pci_sigbus_handler,
+		.devargs_parse = rte_pci_devargs_parse,
 	},
 	.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
 	.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/drivers/bus/pci/pci_params.c b/drivers/bus/pci/pci_params.c
index 3192e9c967..7ba9e2650f 100644
--- a/drivers/bus/pci/pci_params.c
+++ b/drivers/bus/pci/pci_params.c
@@ -8,6 +8,8 @@
 #include <rte_errno.h>
 #include <rte_kvargs.h>
 #include <rte_pci.h>
+#include <rte_devargs.h>
+#include <rte_debug.h>
 
 #include "private.h"
 
@@ -76,3 +78,49 @@ rte_pci_dev_iterate(const void *start,
 	rte_kvargs_free(kvargs);
 	return dev;
 }
+
+int
+rte_pci_devargs_parse(struct rte_devargs *da)
+{
+	struct rte_kvargs *kvargs;
+	const char *addr_str;
+	struct rte_pci_addr addr;
+	int ret;
+
+	if (da == NULL)
+		return 0;
+	RTE_ASSERT(da->bus_str != NULL);
+
+	kvargs = rte_kvargs_parse(da->bus_str, NULL);
+	if (kvargs == NULL) {
+		RTE_LOG(ERR, EAL, "cannot parse argument list: %s\n",
+			da->bus_str);
+		ret = -ENODEV;
+		goto out;
+	}
+
+	addr_str = rte_kvargs_get(kvargs, pci_params_keys[RTE_PCI_PARAM_ADDR]);
+	if (addr_str == NULL) {
+		RTE_LOG(ERR, EAL, "No PCI address specified using '%s=<id>' in: %s\n",
+			pci_params_keys[RTE_PCI_PARAM_ADDR], da->bus_str);
+		ret = -ENODEV;
+		goto out;
+	}
+
+	ret = rte_pci_addr_parse(addr_str, &addr);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL, "PCI address invalid: %s\n", da->bus_str);
+		ret = -EINVAL;
+		goto out;
+	}
+
+	rte_pci_device_name(&addr, da->name, sizeof(da->name));
+
+	/* TODO: class parse -> driver parse */
+out:
+	if (kvargs != NULL)
+		rte_kvargs_free(kvargs);
+	if (ret != 0)
+		rte_errno = -ret;
+	return ret;
+}
diff --git a/drivers/bus/pci/private.h b/drivers/bus/pci/private.h
index f566943f5e..8bc5140e97 100644
--- a/drivers/bus/pci/private.h
+++ b/drivers/bus/pci/private.h
@@ -267,4 +267,18 @@ rte_pci_dev_iterate(const void *start,
 		    const char *str,
 		    const struct rte_dev_iterator *it);
 
+/*
+ * Parse device arguments and update name.
+ *
+ * @param da
+ *   device arguments to parse.
+ *
+ * @return
+ *   0 on success.
+ *   -EINVAL: kvargs string is invalid and cannot be parsed.
+ *   -ENODEV: no key matching a device ID is found in the kv list.
+ */
+int
+rte_pci_devargs_parse(struct rte_devargs *da);
+
 #endif /* _PCI_PRIVATE_H_ */
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index d075409942..d6f651bff2 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -634,6 +634,7 @@ static struct rte_bus rte_vdev_bus = {
 	.dma_unmap = vdev_dma_unmap,
 	.get_iommu_class = vdev_get_iommu_class,
 	.dev_iterate = rte_vdev_dev_iterate,
+	.devargs_parse = rte_vdev_devargs_parse,
 };
 
 RTE_REGISTER_BUS(vdev, rte_vdev_bus);
diff --git a/drivers/bus/vdev/vdev_params.c b/drivers/bus/vdev/vdev_params.c
index 6f74704d1c..3e644ade95 100644
--- a/drivers/bus/vdev/vdev_params.c
+++ b/drivers/bus/vdev/vdev_params.c
@@ -8,6 +8,9 @@
 #include <rte_bus.h>
 #include <rte_kvargs.h>
 #include <rte_errno.h>
+#include <rte_devargs.h>
+#include <rte_string_fns.h>
+#include <rte_debug.h>
 
 #include "vdev_logs.h"
 #include "vdev_private.h"
@@ -64,3 +67,43 @@ rte_vdev_dev_iterate(const void *start,
 	rte_kvargs_free(kvargs);
 	return dev;
 }
+
+int
+rte_vdev_devargs_parse(struct rte_devargs *da)
+{
+	struct rte_kvargs *kvargs;
+	const char *name;
+	int ret = 0;
+
+	if (da == NULL)
+		return 0;
+	RTE_ASSERT(da->bus_str != NULL);
+
+	kvargs = rte_kvargs_parse(da->bus_str, NULL);
+	if (kvargs == NULL) {
+		RTE_LOG(ERR, EAL, "cannot parse argument list: %s\n",
+			da->bus_str);
+		ret = -EINVAL;
+		goto out;
+	}
+
+	name = rte_kvargs_get(kvargs, vdev_params_keys[RTE_VDEV_PARAM_NAME]);
+	if (name == NULL) {
+		RTE_LOG(ERR, EAL, "name not found: %s\n", da->bus_str);
+		ret = -ENODEV;
+		goto out;
+	}
+
+	if (rte_strscpy(da->name, name, sizeof(da->name)) < 0) {
+		RTE_LOG(ERR, EAL, "device name is too long: %s\n", name);
+		ret = -E2BIG;
+	}
+
+	/* TODO: class parse -> driver parse */
+out:
+	if (kvargs != NULL)
+		rte_kvargs_free(kvargs);
+	if (ret != 0)
+		rte_errno = -ret;
+	return ret;
+}
diff --git a/drivers/bus/vdev/vdev_private.h b/drivers/bus/vdev/vdev_private.h
index ba6dc48ff3..257682daac 100644
--- a/drivers/bus/vdev/vdev_private.h
+++ b/drivers/bus/vdev/vdev_private.h
@@ -19,6 +19,21 @@ rte_vdev_dev_iterate(const void *start,
 		     const char *str,
 		     const struct rte_dev_iterator *it);
 
+/*
+ * Parse device argument and update name.
+ *
+ * @param da
+ *   device argument to parse.
+ *
+ * @return
+ *   0 on success.
+ *   -EINVAL: kvargs string is invalid and cannot be parsed.
+ *   -ENODEV: no key matching a device ID is found in the kv list.
+ *   -E2BIG: device name is too long.
+ */
+int
+rte_vdev_devargs_parse(struct rte_devargs *da);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index e40b91ea66..b4dcb0099c 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -118,6 +118,8 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist == NULL)
 			continue;
 		kv = &layers[i].kvlist->pairs[0];
+		if (!kv->key)
+			continue;
 		if (strcmp(kv->key, "bus") == 0) {
 			bus = rte_bus_find_by_name(kv->value);
 			if (bus == NULL) {
@@ -160,6 +162,10 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		}
 	}
 
+	/* Parse device name, optional for iterator filter. */
+	if (bus && bus->devargs_parse)
+		bus->devargs_parse(devargs);
+
 get_out:
 	for (i = 0; i < RTE_DIM(layers); i++) {
 		if (layers[i].kvlist)
diff --git a/lib/librte_eal/include/rte_bus.h b/lib/librte_eal/include/rte_bus.h
index 80b154fb98..e006b514de 100644
--- a/lib/librte_eal/include/rte_bus.h
+++ b/lib/librte_eal/include/rte_bus.h
@@ -210,6 +210,24 @@ typedef int (*rte_bus_hot_unplug_handler_t)(struct rte_device *dev);
  */
 typedef int (*rte_bus_sigbus_handler_t)(const void *failure_addr);
 
+/**
+ * Parse device arguments, setting the device name in the devargs as a result.
+ * A bus could use the class and driver layers to resolve name furthermore.
+ *
+ * On error rte_errno is set.
+ *
+ * @param da
+ *	Pointer to the devargs to parse.
+ *	The 'bus_str' field must be set.
+ *
+ * @return
+ *	0 on successful parsing.
+ *	-EINVAL: on parsing error.
+ *	-ENODEV: if no key matching a device argument is specified.
+ *	-E2BIG: device name is too long.
+ */
+typedef int (*rte_bus_devargs_parse_t)(struct rte_devargs *da);
+
 /**
  * Bus scan policies
  */
@@ -267,6 +285,7 @@ struct rte_bus {
 				/**< handle hot-unplug failure on the bus */
 	rte_bus_sigbus_handler_t sigbus_handler;
 					/**< handle sigbus error on the bus */
+	rte_bus_devargs_parse_t devargs_parse; /**< Parse device arguments */
 
 };
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v3 5/5] devargs: parse global device syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (53 preceding siblings ...)
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 4/5] bus: add device arguments name parsing API Xueming Li
@ 2021-03-30 12:15   ` Xueming Li
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 0/5] eal: enable global device syntax by default Xueming Li
                     ` (11 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-03-30 12:15 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev, xuemingl, Asaf Penso

When parsing devargs, try the global device syntax first. Fall back to
the legacy syntax on error, or when the parsed string does not specify
both a bus and a class.

Example of new global device syntax:
 -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,dv_flow_en=1
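
The try-then-fall-back flow can be sketched as follows. This is a toy
model under stated assumptions, not the EAL parser: demo_dev and the
demo_parse_* helpers are hypothetical, the global parser only checks
for a leading "bus=" and a "/class=" layer, and the legacy parser only
understands the "bus:name,args" form.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result of parsing a device string. */
struct demo_dev {
	char bus[16];
	char name[32];
	bool global; /* true if parsed with the global syntax */
};

/* Global syntax: "bus=<bus>,.../class=<class>/...". */
static int
demo_parse_global(struct demo_dev *d, const char *s)
{
	if (strncmp(s, "bus=", 4) != 0 || strstr(s, "/class=") == NULL)
		return -1;
	if (sscanf(s, "bus=%15[^,/]", d->bus) != 1)
		return -1;
	d->global = true;
	return 0;
}

/* Legacy syntax, simplified to "<bus>:<name>[,args]". */
static int
demo_parse_legacy(struct demo_dev *d, const char *s)
{
	if (sscanf(s, "%15[^:]:%31[^,]", d->bus, d->name) != 2)
		return -1;
	d->global = false;
	return 0;
}

/* Try the global syntax first, then fall back to the legacy one. */
static int
demo_parse(struct demo_dev *d, const char *s)
{
	memset(d, 0, sizeof(*d));
	if (demo_parse_global(d, s) == 0)
		return 0;
	/* Discard any partial result before the second attempt. */
	memset(d, 0, sizeof(*d));
	return demo_parse_legacy(d, s);
}
```

Clearing the structure between the two attempts keeps a half-filled
result of the first parser from leaking into the second one.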

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
---
 lib/librte_eal/common/eal_common_devargs.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index b4dcb0099c..236e14824e 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -102,7 +102,6 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		layers[i].str = s;
 		layers[i].kvlist = rte_kvargs_parse_delim(s, NULL, "/");
 		if (layers[i].kvlist == NULL) {
-			RTE_LOG(ERR, EAL, "Could not parse %s\n", s);
 			ret = -EINVAL;
 			goto get_out;
 		}
@@ -199,6 +198,12 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	if (da == NULL)
 		return -EINVAL;
 
+	/* First parse according global device syntax. */
+	if (rte_devargs_layers_parse(da, dev) == 0 && da->bus && da->cls)
+		return 0;
+
+	/* Otherwise fallback to legacy syntax: */
+
 	/* Retrieve eventual bus info */
 	do {
 		devname = dev;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v5 0/9] net/mlx5: support SubFunction representor
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 0/9] net/mlx5: support SubFunction representor Xueming Li
@ 2021-03-31  7:20     ` Raslan Darawsheh
  2021-03-31  7:27       ` Xueming(Steven) Li
  0 siblings, 1 reply; 118+ messages in thread
From: Raslan Darawsheh @ 2021-03-31  7:20 UTC (permalink / raw)
  To: Xueming(Steven) Li, Slava Ovsiienko; +Cc: dev, Xueming(Steven) Li, Asaf Penso

Hi,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Xueming Li
> Sent: Sunday, March 28, 2021 4:48 PM
> To: Slava Ovsiienko <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf Penso
> <asafp@nvidia.com>
> Subject: [dpdk-dev] [PATCH v5 0/9] net/mlx5: support SubFunction
> representor
> 
> SubFunction [1] is a portion of the PCI device, a SF netdev has its own
> dedicated queues(txq, rxq). A SF netdev supports E-Switch representation
> offload similar to existing PF and VF representors. A SF shares PCI
> level resources with other SFs and/or with its parent PCI function.
> 
> This patch set introduces SubFunction representor support for mlx5
> PMD driver.
> 
> Version history:
>  RFC:
>  	initial version [2]
>  V2:
>     - support bonding representor probe with new pf#vf# devargs
>     - adapt EAL api V2 [3] changes
>     - update document
>  V3:
>     - support list of representor PF section for bonding device:
>       example: representor=pf[0,1]vf[0-3]
>     - add bonding information to shared PMD data
>     - fix setting VF MAC through representor
>     - fix bonding xstats, sum xstats from PF members.
>  V4:
>     - combine unexpected patch, thanks Slava
>  V5:
>     - support new ethdev ops api to return representor info
>     - new api to encode and decode representor ID
>     - new patch to allow BF2 HPF(-1) probe with sf-1
> 
> [1] SubFunction in kernel:
> https://lore.kernel.org/netdev/20201112192424.2742-1-parav@nvidia.com/
> 
> [2] RFC:
> http://patchwork.dpdk.org/project/dpdk/list/?series=14376
> 
> [3] V2:
> http://patchwork.dpdk.org/project/dpdk/list/?series=14560
> 
> [3] V3:
> http://patchwork.dpdk.org/project/dpdk/list/?series=14810
> 
> [3] V4:
> http://patchwork.dpdk.org/project/dpdk/list/?series=14836
> 
> 
> Xueming Li (9):
>   common/mlx5: sub-function representor port name parsing
>   net/mlx5: support representor of sub function
>   net/mlx5: revert setting bonding representor to first PF
>   net/mlx5: refactor bonding representor probe
>   net/mlx5: support list value of representor PF
>   net/mlx5: save bonding member ports information
>   net/mlx5: fix setting VF default MAC through representor
>   net/mlx5: improve xstats of bonding port
>   net/mlx5: probe host PF representor with SubFunction
> 
>  doc/guides/nics/mlx5.rst                   |  62 +++-
>  drivers/common/mlx5/linux/mlx5_common_os.c |  32 +-
>  drivers/common/mlx5/linux/mlx5_nl.c        |   3 +
>  drivers/common/mlx5/mlx5_common.h          |   2 +
>  drivers/net/mlx5/linux/mlx5_ethdev_os.c    | 136 +++++--
>  drivers/net/mlx5/linux/mlx5_os.c           | 395 ++++++++++++++-------
>  drivers/net/mlx5/mlx5.c                    |  24 +-
>  drivers/net/mlx5/mlx5.h                    |  35 +-
>  drivers/net/mlx5/mlx5_defs.h               |   4 -
>  drivers/net/mlx5/mlx5_ethdev.c             | 149 ++++++--
>  drivers/net/mlx5/mlx5_mac.c                |  23 +-
>  11 files changed, 652 insertions(+), 213 deletions(-)
> 
> --
> 2.25.1

Series applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v5 0/9] net/mlx5: support SubFunction representor
  2021-03-31  7:20     ` Raslan Darawsheh
@ 2021-03-31  7:27       ` Xueming(Steven) Li
  0 siblings, 0 replies; 118+ messages in thread
From: Xueming(Steven) Li @ 2021-03-31  7:27 UTC (permalink / raw)
  To: Raslan Darawsheh, Slava Ovsiienko; +Cc: dev, Asaf Penso

Thanks!

>-----Original Message-----
>From: Raslan Darawsheh <rasland@nvidia.com>
>Sent: Wednesday, March 31, 2021 3:21 PM
>To: Xueming(Steven) Li <xuemingl@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>
>Cc: dev@dpdk.org; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf Penso <asafp@nvidia.com>
>Subject: RE: [dpdk-dev] [PATCH v5 0/9] net/mlx5: support SubFunction representor
>
>Hi,
>
>> -----Original Message-----
>> From: dev <dev-bounces@dpdk.org> On Behalf Of Xueming Li
>> Sent: Sunday, March 28, 2021 4:48 PM
>> To: Slava Ovsiienko <viacheslavo@nvidia.com>
>> Cc: dev@dpdk.org; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf Penso
>> <asafp@nvidia.com>
>> Subject: [dpdk-dev] [PATCH v5 0/9] net/mlx5: support SubFunction
>> representor
>>
>> SubFunction [1] is a portion of the PCI device, a SF netdev has its
>> own dedicated queues(txq, rxq). A SF netdev supports E-Switch
>> representation offload similar to existing PF and VF representors. A
>> SF shares PCI level resources with other SFs and/or with its parent PCI function.
>>
>> This patch set introduces SubFunction representor support for mlx5 PMD
>> driver.
>>
>> Version history:
>>  RFC:
>>  	initial version [2]
>>  V2:
>>     - support bonding representor probe with new pf#vf# devargs
>>     - adapt EAL api V2 [3] changes
>>     - update document
>>  V3:
>>     - support list of representor PF section for bonding device:
>>       example: representor=pf[0,1]vf[0-3]
>>     - add bonding information to shared PMD data
>>     - fix setting VF MAC through representor
>>     - fix bonding xstats, sum xstats from PF members.
>>  V4:
>>     - combine unexpected patch, thanks Slava
>>  V5:
>>     - support new ethdev ops api to return representor info
>>     - new api to encode and decode representor ID
>>     - new patch to allow BF2 HPF(-1) probe with sf-1
>>
>> [1] SubFunction in kernel:
>> https://lore.kernel.org/netdev/20201112192424.2742-1-parav@nvidia.com/
>>
>> [2] RFC:
>> http://patchwork.dpdk.org/project/dpdk/list/?series=14376
>>
>> [3] V2:
>> http://patchwork.dpdk.org/project/dpdk/list/?series=14560
>>
>> [3] V3:
>> http://patchwork.dpdk.org/project/dpdk/list/?series=14810
>>
>> [3] V4:
>> http://patchwork.dpdk.org/project/dpdk/list/?series=14836
>>
>>
>> Xueming Li (9):
>>   common/mlx5: sub-function representor port name parsing
>>   net/mlx5: support representor of sub function
>>   net/mlx5: revert setting bonding representor to first PF
>>   net/mlx5: refactor bonding representor probe
>>   net/mlx5: support list value of representor PF
>>   net/mlx5: save bonding member ports information
>>   net/mlx5: fix setting VF default MAC through representor
>>   net/mlx5: improve xstats of bonding port
>>   net/mlx5: probe host PF representor with SubFunction
>>
>>  doc/guides/nics/mlx5.rst                   |  62 +++-
>>  drivers/common/mlx5/linux/mlx5_common_os.c |  32 +-
>>  drivers/common/mlx5/linux/mlx5_nl.c        |   3 +
>>  drivers/common/mlx5/mlx5_common.h          |   2 +
>>  drivers/net/mlx5/linux/mlx5_ethdev_os.c    | 136 +++++--
>>  drivers/net/mlx5/linux/mlx5_os.c           | 395 ++++++++++++++-------
>>  drivers/net/mlx5/mlx5.c                    |  24 +-
>>  drivers/net/mlx5/mlx5.h                    |  35 +-
>>  drivers/net/mlx5/mlx5_defs.h               |   4 -
>>  drivers/net/mlx5/mlx5_ethdev.c             | 149 ++++++--
>>  drivers/net/mlx5/mlx5_mac.c                |  23 +-
>>  11 files changed, 652 insertions(+), 213 deletions(-)
>>
>> --
>> 2.25.1
>
>Series applied to next-net-mlx,
>
>Kindest regards,
>Raslan Darawsheh

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v5 7/9] net/mlx5: fix setting VF default MAC through representor
  2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 7/9] net/mlx5: fix setting VF default MAC through representor Xueming Li
@ 2021-03-31  7:46     ` Raslan Darawsheh
  0 siblings, 0 replies; 118+ messages in thread
From: Raslan Darawsheh @ 2021-03-31  7:46 UTC (permalink / raw)
  To: Xueming(Steven) Li, Slava Ovsiienko
  Cc: dev, Xueming(Steven) Li, Asaf Penso, Matan Azrad, Shahaf Shuler

Hi,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Xueming Li
> Sent: Sunday, March 28, 2021 4:48 PM
> To: Slava Ovsiienko <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf Penso
> <asafp@nvidia.com>; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>
> Subject: [dpdk-dev] [PATCH v5 7/9] net/mlx5: fix setting VF default MAC
> through representor
> 
> With kernel bonding, there was an error when setting VF MAC address
> through representor. The Netlink api requires ifindex of owner PF, not
> bonding device ifindex.
> 
> Uses owner PF ifindex to modify VF default MAC in case of bonding
> device.
> 
> Fixes: c21e5facf7d2 ("net/mlx5: use bond index for netdev operations")
Missing Cc: stable@dpdk.org
Will be added during integration

Kindest regards
Raslan Darawsheh

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev]  [PATCH v3 0/5] eal: enable global device syntax by default
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 0/5] eal: enable global device syntax by default Xueming Li
@ 2021-03-31  8:23     ` Gaëtan Rivet
  0 siblings, 0 replies; 118+ messages in thread
From: Gaëtan Rivet @ 2021-03-31  8:23 UTC (permalink / raw)
  To: Xueming(Steven) Li, Thomas Monjalon, Gaetan Rivet; +Cc: dev, Asaf Penso

On Tue, Mar 30, 2021, at 14:15, Xueming Li wrote:
> The new Global Device Syntax [1] is used to identify a device with full
> bus, class and driver description, example:
>  -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,...
> 
> This patchset fixes bugs and enable global device syntax with
> backward compatibility by:
> - unify devargs memory buffer cleanup
> - parse name from bus callback 
> - try new global syntax parsing firstly and fallback to legacy parsing.
> 
> 
> History:
> 
> V1:
>  - Inital version
> 
> V2:
>  - add devargs.src as complete source dev string
>  - change devargs.data to scratch buffer
>  - add rte_devargs_free() to release scratch memory
>  - change name policy to align with rte_eth_iterator_init()
>  - remove PCI bus fix as name already resolved in rte_devargs_parse().
> V3:
>  - remove devargs.src
>  - rename rte_devargs_free() to rte_devargs_reset()
>  - add bus callback api to resolve devargs.
> 
> [1] Global Device Syntax:
> https://www.dpdk.org/wp-content/uploads/sites/35/2018/10/am-07-DPDK-hotplug-20180905.pdf
> 
> [2] RFC:
> http://patchwork.dpdk.org/project/dpdk/list/?series=14378
> 
> [3] V1:
> http://patchwork.dpdk.org/project/dpdk/list/?series=14610
> 
> [4] V2:
> http://patchwork.dpdk.org/project/dpdk/list/?series=14816
> 
> 
> Xueming Li (5):
>   devargs: unify scratch buffer storage
>   devargs: fix memory leak on parsing error
>   kvargs: add get by key function
>   bus: add device arguments name parsing API
>   devargs: parse global device syntax
> 
>  app/test-pmd/config.c                        |  3 +-
>  app/test-pmd/testpmd.c                       |  5 +-
>  drivers/bus/pci/pci_common.c                 |  1 +
>  drivers/bus/pci/pci_params.c                 | 48 +++++++++++++++++
>  drivers/bus/pci/private.h                    | 14 +++++
>  drivers/bus/vdev/vdev.c                      | 10 ++--
>  drivers/bus/vdev/vdev_params.c               | 43 +++++++++++++++
>  drivers/bus/vdev/vdev_private.h              | 15 ++++++
>  drivers/net/failsafe/failsafe_args.c         |  3 +-
>  drivers/net/failsafe/failsafe_eal.c          |  2 +-
>  examples/multi_process/hotplug_mp/commands.c |  6 +--
>  lib/librte_eal/common/eal_common_dev.c       |  9 ++--
>  lib/librte_eal/common/eal_common_devargs.c   | 57 ++++++++++++++------
>  lib/librte_eal/common/hotplug_mp.c           |  6 +--
>  lib/librte_eal/include/rte_bus.h             | 19 +++++++
>  lib/librte_eal/include/rte_devargs.h         | 18 +++++--
>  lib/librte_eal/rte_eal_exports.def           |  1 +
>  lib/librte_eal/version.map                   |  1 +
>  lib/librte_ethdev/rte_ethdev.c               |  8 +--
>  lib/librte_kvargs/rte_kvargs.c               | 20 +++++++
>  lib/librte_kvargs/rte_kvargs.h               | 21 ++++++++
>  lib/librte_kvargs/version.map                |  1 +
>  22 files changed, 263 insertions(+), 48 deletions(-)
> 
> -- 
> 2.25.1
> 
>

Hello,

For the series:
Reviewed-by: Gaetan Rivet <grive@u256.net>

-- 
Gaetan Rivet

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v3 4/5] bus: add device arguments name parsing API
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 4/5] bus: add device arguments name parsing API Xueming Li
@ 2021-03-31 10:19     ` Thomas Monjalon
  2021-04-01 15:13       ` Xueming(Steven) Li
  0 siblings, 1 reply; 118+ messages in thread
From: Thomas Monjalon @ 2021-03-31 10:19 UTC (permalink / raw)
  To: Xueming Li
  Cc: Gaetan Rivet, dev, Asaf Penso, david.marchand, ferruh.yigit,
	andrew.rybchenko, hemant.agrawal, stephen, rosen.xu,
	ajit.khaparde, jerinj

The commit log should start by explaining it is adding a callback
to the bus drivers for the new devargs syntax.

30/03/2021 14:15, Xueming Li:
> To use Global Device Syntax as devargs, name is required for device
> management.

Context is missing.
You mean the argument "name" for the vdev bus?

> 
> In legacy parsing API, devargs name was extracted after bus name:
>   bus:name,kv_params,,,
> 
> To parse new Global Device Syntax, this patch introduces new bus API to
> parse devargs and update name, different bus driver might choose
> different keys from parameters with unified formating, example:
>  -a bus=pci,addr=83:00.0/class=eth/driver=mlx5,...
>     name: 0000:03:00.0
>  -a bus=vdev,name=pcap0/class=eth/driver=pcap,...
>     name:pcap0

Only PCI and vdev buses are implemented.
What can be the plan for others?
We should track the progress somewhere, maybe with TODO comments.

This commit log could also state what is the status
of the global syntax support, talking about class and device drivers.

We could update this comment in ethdev:
     * A new syntax is in development (not yet supported):
     *   - bus=X,paramX=x/class=Y,paramY=y/driver=Z,paramZ=z
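For readers following along, the layered form in that comment can be illustrated with a small standalone sketch. This is not the EAL parser: struct dev_layers and split_layers() are hypothetical names, and the only assumption taken from the thread is that layers are separated by '/' and start with "bus=", "class=" or "driver=".

```c
#include <string.h>

/* Hypothetical container for the three layers; not a DPDK structure. */
struct dev_layers {
	char bus[64];
	char cls[64];
	char drv[64];
};

/*
 * Split "bus=.../class=.../driver=..." on '/' and store each layer
 * according to its leading key.
 * Returns 0 on success, -1 on an unknown or oversized layer.
 */
static int
split_layers(const char *devstr, struct dev_layers *out)
{
	const char *p = devstr;

	memset(out, 0, sizeof(*out));
	while (*p != '\0') {
		size_t len = strcspn(p, "/"); /* length of the current layer */
		char *dst;

		if (strncmp(p, "bus=", 4) == 0)
			dst = out->bus;
		else if (strncmp(p, "class=", 6) == 0)
			dst = out->cls;
		else if (strncmp(p, "driver=", 7) == 0)
			dst = out->drv;
		else
			return -1; /* not a layer of the new syntax */
		if (len >= sizeof(out->bus))
			return -1;
		memcpy(dst, p, len);
		dst[len] = '\0';
		p += len;
		if (*p == '/')
			p++; /* skip the layer separator */
	}
	return 0;
}
```

A legacy string such as "0000:83:00.0,dv_flow_en=1" makes this sketch return -1, which is the kind of failure that would trigger the fallback to legacy parsing described in the cover letter.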

[...]
> +int
> +rte_pci_devargs_parse(struct rte_devargs *da)
> +{
> +	struct rte_kvargs *kvargs;
> +	const char *addr_str;
> +	struct rte_pci_addr addr;
> +	int ret;
> +
> +	if (da == NULL)
> +		return 0;
> +	RTE_ASSERT(da->bus_str != NULL);
> +
> +	kvargs = rte_kvargs_parse(da->bus_str, NULL);
> +	if (kvargs == NULL) {
> +		RTE_LOG(ERR, EAL, "cannot parse argument list: %s\n",
> +			da->bus_str);
> +		ret = -ENODEV;
> +		goto out;
> +	}
> +
> +	addr_str = rte_kvargs_get(kvargs, pci_params_keys[RTE_PCI_PARAM_ADDR]);
> +	if (addr_str == NULL) {
> +		RTE_LOG(ERR, EAL, "No PCI address specified using '%s=<id>' in: %s\n",
> +			pci_params_keys[RTE_PCI_PARAM_ADDR], da->bus_str);
> +		ret = -ENODEV;
> +		goto out;
> +	}
> +
> +	ret = rte_pci_addr_parse(addr_str, &addr);
> +	if (ret != 0) {
> +		RTE_LOG(ERR, EAL, "PCI address invalid: %s\n", da->bus_str);
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	rte_pci_device_name(&addr, da->name, sizeof(da->name));
> +
> +	/* TODO: class parse -> driver parse */

Please could you give a longer explanation of what is missing?



^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/5] devargs: unify scratch buffer storage
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 1/5] devargs: unify scratch buffer storage Xueming Li
@ 2021-04-01  9:04     ` Kinsella, Ray
  0 siblings, 0 replies; 118+ messages in thread
From: Kinsella, Ray @ 2021-04-01  9:04 UTC (permalink / raw)
  To: Xueming Li, Thomas Monjalon, Gaetan Rivet
  Cc: dev, Asaf Penso, Wenzhuo Lu, Beilei Xing, Bernard Iremonger,
	Gaetan Rivet, Anatoly Burakov, Dmitry Kozlyuk,
	Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam,
	Neil Horman, Ferruh Yigit, Andrew Rybchenko



On 30/03/2021 13:15, Xueming Li wrote:
> In current design, legacy parser rte_devargs_parse() saved scratch
> buffer to devargs.args while new parser rte_devargs_layers_parse() saved
> to devargs.data. Code using devargs had to know the difference and
> cleaned up memory accordingly - error prone.
> 
> This patch unifies scratch buffer to data field, introduces
> rte_devargs_reset() function to wrap the memory clean up logic.
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> ---
>  app/test-pmd/config.c                        |  3 +-
>  app/test-pmd/testpmd.c                       |  5 +--
>  drivers/bus/vdev/vdev.c                      |  9 +++---
>  drivers/net/failsafe/failsafe_args.c         |  3 +-
>  drivers/net/failsafe/failsafe_eal.c          |  2 +-
>  examples/multi_process/hotplug_mp/commands.c |  6 ++--
>  lib/librte_eal/common/eal_common_dev.c       |  9 +++---
>  lib/librte_eal/common/eal_common_devargs.c   | 34 +++++++++++---------
>  lib/librte_eal/common/hotplug_mp.c           |  6 ++--
>  lib/librte_eal/include/rte_devargs.h         | 18 ++++++++---
>  lib/librte_eal/rte_eal_exports.def           |  1 +
>  lib/librte_eal/version.map                   |  1 +
>  lib/librte_ethdev/rte_ethdev.c               |  8 ++---
>  13 files changed, 59 insertions(+), 46 deletions(-)
> 

Acked-by: Ray Kinsella <mdr@ashroe.eu>

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/5] kvargs: add get by key function
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 3/5] kvargs: add get by key function Xueming Li
@ 2021-04-01  9:06     ` Kinsella, Ray
  2021-04-01  9:10       ` Xueming(Steven) Li
  0 siblings, 1 reply; 118+ messages in thread
From: Kinsella, Ray @ 2021-04-01  9:06 UTC (permalink / raw)
  To: Xueming Li, Thomas Monjalon, Gaetan Rivet
  Cc: dev, Asaf Penso, Olivier Matz, Neil Horman



On 30/03/2021 13:15, Xueming Li wrote:
> Adds a new function to get value of a specific key from kvargs list.
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> ---
>  lib/librte_kvargs/rte_kvargs.c | 20 ++++++++++++++++++++
>  lib/librte_kvargs/rte_kvargs.h | 21 +++++++++++++++++++++
>  lib/librte_kvargs/version.map  |  1 +
>  3 files changed, 42 insertions(+)
> 

[SNIP]

> diff --git a/lib/librte_kvargs/version.map b/lib/librte_kvargs/version.map
> index ed375bf4a3..e2bf792c60 100644
> --- a/lib/librte_kvargs/version.map
> +++ b/lib/librte_kvargs/version.map
> @@ -12,6 +12,7 @@ DPDK_21 {
>  EXPERIMENTAL {
>  	global:
>
Please separate rte_kvargs_get from the other symbols.
And add #21.05 in front, so we know when the symbol got added. 

> +	rte_kvargs_get;
>  	rte_kvargs_parse_delim;
>  	rte_kvargs_strcmp;
>  
> 

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/5] kvargs: add get by key function
  2021-04-01  9:06     ` Kinsella, Ray
@ 2021-04-01  9:10       ` Xueming(Steven) Li
  0 siblings, 0 replies; 118+ messages in thread
From: Xueming(Steven) Li @ 2021-04-01  9:10 UTC (permalink / raw)
  To: Kinsella, Ray, NBU-Contact-Thomas Monjalon, Gaetan Rivet
  Cc: dev, Asaf Penso, Olivier Matz, Neil Horman


>-----Original Message-----
>From: Kinsella, Ray <mdr@ashroe.eu>
>Sent: Thursday, April 1, 2021 5:06 PM
>To: Xueming(Steven) Li <xuemingl@nvidia.com>; NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
><gaetanr@nvidia.com>
>Cc: dev@dpdk.org; Asaf Penso <asafp@nvidia.com>; Olivier Matz <olivier.matz@6wind.com>; Neil Horman
><nhorman@tuxdriver.com>
>Subject: Re: [PATCH v3 3/5] kvargs: add get by key function
>
>
>
>On 30/03/2021 13:15, Xueming Li wrote:
>> Adds a new function to get value of a specific key from kvargs list.
>>
>> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
>> ---
>>  lib/librte_kvargs/rte_kvargs.c | 20 ++++++++++++++++++++
>> lib/librte_kvargs/rte_kvargs.h | 21 +++++++++++++++++++++
>> lib/librte_kvargs/version.map  |  1 +
>>  3 files changed, 42 insertions(+)
>>
>
>[SNIP]
>
>> diff --git a/lib/librte_kvargs/version.map
>> b/lib/librte_kvargs/version.map index ed375bf4a3..e2bf792c60 100644
>> --- a/lib/librte_kvargs/version.map
>> +++ b/lib/librte_kvargs/version.map
>> @@ -12,6 +12,7 @@ DPDK_21 {
>>  EXPERIMENTAL {
>>  	global:
>>
>Please separate rte_kvargs_get from the other symbols.
>And add #21.05 in front, so we know when the symbol got added.

Good point, thanks
>
>> +	rte_kvargs_get;
>>  	rte_kvargs_parse_delim;
>>  	rte_kvargs_strcmp;
>>
>>

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v3 4/5] bus: add device arguments name parsing API
  2021-03-31 10:19     ` Thomas Monjalon
@ 2021-04-01 15:13       ` Xueming(Steven) Li
  2021-04-08 23:49         ` Thomas Monjalon
  0 siblings, 1 reply; 118+ messages in thread
From: Xueming(Steven) Li @ 2021-04-01 15:13 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon
  Cc: Gaetan Rivet, dev, Asaf Penso, david.marchand, ferruh.yigit,
	andrew.rybchenko, hemant.agrawal, stephen, rosen.xu,
	ajit.khaparde, jerinj


>-----Original Message-----
>From: Thomas Monjalon <thomas@monjalon.net>
>Sent: Wednesday, March 31, 2021 6:20 PM
>To: Xueming(Steven) Li <xuemingl@nvidia.com>
>Cc: Gaetan Rivet <gaetanr@nvidia.com>; dev@dpdk.org; Asaf Penso <asafp@nvidia.com>; david.marchand@redhat.com;
>ferruh.yigit@intel.com; andrew.rybchenko@oktetlabs.ru; hemant.agrawal@nxp.com; stephen@networkplumber.org;
>rosen.xu@intel.com; ajit.khaparde@broadcom.com; jerinj@marvell.com
>Subject: Re: [dpdk-dev] [PATCH v3 4/5] bus: add device arguments name parsing API
>
>The commit log should start by explaining it is adding a callback to the bus drivers for the new devargs syntax.

OK.

>
>30/03/2021 14:15, Xueming Li:
>> To use Global Device Syntax as devargs, name is required for device
>> management.
>
>Context is missing.
>You mean the argument "name" for the vdev bus?

devargs.name. It is used by probe and by the device iterator: to locate a device from a devargs, devargs.name is compared
against the names of probed devices. This is not an issue for the legacy syntax, since the string after "bus:" is saved as the name.
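
To make the legacy rule concrete, here is a minimal sketch of how a name falls out of a "bus:name,kv_params" string. legacy_name() is a hypothetical helper written for this mail, not an EAL function, and it assumes the "bus:" prefix is always present.

```c
#include <string.h>

/*
 * Hypothetical helper, not a DPDK function: copy the name out of a legacy
 * "bus:name,key=value,..." string. Assumes the "bus:" prefix is present.
 * Returns 0 on success, -1 if the prefix is missing or the name does not fit.
 */
static int
legacy_name(const char *devstr, char *name, size_t size)
{
	const char *start = strchr(devstr, ':');
	size_t len;

	if (start == NULL)
		return -1;
	start++;                   /* skip the ':' after the bus name */
	len = strcspn(start, ","); /* the name ends at the first ',' */
	if (len == 0 || len >= size)
		return -1;
	memcpy(name, start, len);
	name[len] = '\0';
	return 0;
}
```

With the new syntax there is no "bus:" prefix to cut after, so a string like "bus=vdev,name=pcap0" yields no name here; that is the gap the bus callback is meant to fill.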

>
>>
>> In legacy parsing API, devargs name was extracted after bus name:
>>   bus:name,kv_params,,,
>>
>> To parse new Global Device Syntax, this patch introduces new bus API
>> to parse devargs and update name, different bus driver might choose
>> different keys from parameters with unified formating, example:
>>  -a bus=pci,addr=83:00.0/class=eth/driver=mlx5,...
>>     name: 0000:03:00.0
>>  -a bus=vdev,name=pcap0/class=eth/driver=pcap,...
>>     name:pcap0
>
>Only PCI and vdev buses are implemented.
>What can be the plan for others?
>We should track the progress somewhere, maybe with TODO comments.

Like the legacy parser, how about using "name" as the default name key, so the new syntax parser can resolve it by default?
The PCI bus then overrides this by using the "addr" key in the new bus API, while
vdev and other bus drivers simply use the default implementation, i.e. "name" as the key.
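
A rough sketch of that idea, under stated assumptions: kv_get() and resolve_name() are hypothetical helpers (the real code would go through rte_kvargs and a bus callback), and the PCI branch returns the raw "addr" value instead of normalizing it to a full domain:bus:device.function name as the patch does.

```c
#include <string.h>

/*
 * Hypothetical helper: copy the value of `key` from a "k1=v1,k2=v2" layer
 * string. Returns 0 if found, -1 if missing or too long.
 */
static int
kv_get(const char *layer, const char *key, char *val, size_t size)
{
	size_t klen = strlen(key);
	const char *p = layer;

	while (*p != '\0') {
		size_t len = strcspn(p, ","); /* length of the current pair */

		if (len > klen && strncmp(p, key, klen) == 0 &&
		    p[klen] == '=') {
			size_t vlen = len - klen - 1;

			if (vlen >= size)
				return -1;
			memcpy(val, p + klen + 1, vlen);
			val[vlen] = '\0';
			return 0;
		}
		p += len;
		if (*p == ',')
			p++; /* skip the pair separator */
	}
	return -1;
}

/*
 * Resolve the device name from the bus layer: "addr" for PCI, "name" for
 * every other bus (the proposed default).
 */
static int
resolve_name(const char *bus_layer, char *name, size_t size)
{
	char bus[32];
	const char *key;

	if (kv_get(bus_layer, "bus", bus, sizeof(bus)) != 0)
		return -1;
	key = strcmp(bus, "pci") == 0 ? "addr" : "name";
	return kv_get(bus_layer, key, name, size);
}
```

With this layout only buses that need a different key have to provide their own callback; everything else inherits the "name" default.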

>
>This commit log could also state what is the status of the global syntax support, talking about class and device drivers.

Yes

>
>We could update this comment in ethdev:
>     * A new syntax is in development (not yet supported):
>     *   - bus=X,paramX=x/class=Y,paramY=y/driver=Z,paramZ=z

Will do it in the next patch, which enables the new syntax.

>
>[...]
>> +int
>> +rte_pci_devargs_parse(struct rte_devargs *da) {
>> +	struct rte_kvargs *kvargs;
>> +	const char *addr_str;
>> +	struct rte_pci_addr addr;
>> +	int ret;
>> +
>> +	if (da == NULL)
>> +		return 0;
>> +	RTE_ASSERT(da->bus_str != NULL);
>> +
>> +	kvargs = rte_kvargs_parse(da->bus_str, NULL);
>> +	if (kvargs == NULL) {
>> +		RTE_LOG(ERR, EAL, "cannot parse argument list: %s\n",
>> +			da->bus_str);
>> +		ret = -ENODEV;
>> +		goto out;
>> +	}
>> +
>> +	addr_str = rte_kvargs_get(kvargs, pci_params_keys[RTE_PCI_PARAM_ADDR]);
>> +	if (addr_str == NULL) {
>> +		RTE_LOG(ERR, EAL, "No PCI address specified using '%s=<id>' in: %s\n",
>> +			pci_params_keys[RTE_PCI_PARAM_ADDR], da->bus_str);
>> +		ret = -ENODEV;
>> +		goto out;
>> +	}
>> +
>> +	ret = rte_pci_addr_parse(addr_str, &addr);
>> +	if (ret != 0) {
>> +		RTE_LOG(ERR, EAL, "PCI address invalid: %s\n", da->bus_str);
>> +		ret = -EINVAL;
>> +		goto out;
>> +	}
>> +
>> +	rte_pci_device_name(&addr, da->name, sizeof(da->name));
>> +
>> +	/* TODO: class parse -> driver parse */
>
>Please could you give a longer explanation of what is missing?

Just an option to let the class driver and the device driver override name parsing.
But if needed, this should happen in the new devargs parser, not here.

I'll remove this line and mention it in the commit log.
>


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax
  2021-01-18 15:26   ` [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax Xueming Li
@ 2021-04-05 10:54     ` Slava Ovsiienko
  2021-04-08 12:24       ` Raslan Darawsheh
  0 siblings, 1 reply; 118+ messages in thread
From: Slava Ovsiienko @ 2021-04-05 10:54 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: dev, Matan Azrad, Shahaf Shuler, NBU-Contact-Thomas Monjalon,
	Xueming(Steven) Li, Asaf Penso

> -----Original Message-----
> From: Xueming Li <xuemingl@nvidia.com>
> Sent: Monday, January 18, 2021 17:27
> To: Slava Ovsiienko <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>;
> Asaf Penso <asafp@nvidia.com>
> Subject: [PATCH v2 1/2] common/mlx5: support device global syntax
> 
> This patch supports new device global device syntax, resolve class type from
> "class" section if the devarg is global device syntax:
> bus=<bus>,k=v,,,/class=<cls>,k=v,,,/driver=<pmd>,k=v,,,,
> 
> To reuse class name of global device syntax, this patch also changes internal
> class name introduced by commit [1] to algin with RTE class name.
Typo: algin -> align

Beside this:
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>


> 
> [1]
> 8a41f4deccc3: common/mlx5: introduce layer for multiple class drivers
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> ---
>  drivers/common/mlx5/mlx5_common_pci.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/common/mlx5/mlx5_common_pci.c
> b/drivers/common/mlx5/mlx5_common_pci.c
> index 5208972bb6..c03bdbf4eb 100644
> --- a/drivers/common/mlx5/mlx5_common_pci.c
> +++ b/drivers/common/mlx5/mlx5_common_pci.c
> @@ -4,6 +4,7 @@
> 
>  #include <stdlib.h>
>  #include <rte_malloc.h>
> +#include <rte_class.h>
>  #include "mlx5_common_utils.h"
>  #include "mlx5_common_pci.h"
> 
> @@ -26,7 +27,7 @@ static const struct {
>  	unsigned int driver_class;
>  } mlx5_classes[] = {
>  	{ .name = "vdpa", .driver_class = MLX5_CLASS_VDPA },
> -	{ .name = "net", .driver_class = MLX5_CLASS_NET },
> +	{ .name = "eth", .driver_class = MLX5_CLASS_NET },
>  	{ .name = "regex", .driver_class = MLX5_CLASS_REGEX },  };
> 
> @@ -115,6 +116,9 @@ parse_class_options(const struct rte_devargs
> *devargs)
> 
>  	if (devargs == NULL)
>  		return 0;
> +	if (devargs->cls != NULL)
> +		/* support new global syntax */
> +		return class_name_to_value(devargs->cls->name);
>  	kvlist = rte_kvargs_parse(devargs->args, NULL);
>  	if (kvlist == NULL)
>  		return 0;
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support new global device syntax
  2021-01-18 15:26   ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: support new global device syntax Xueming Li
@ 2021-04-05 10:56     ` Slava Ovsiienko
  0 siblings, 0 replies; 118+ messages in thread
From: Slava Ovsiienko @ 2021-04-05 10:56 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: dev, Matan Azrad, Shahaf Shuler, NBU-Contact-Thomas Monjalon,
	Xueming(Steven) Li, Asaf Penso

> -----Original Message-----
> From: Xueming Li <xuemingl@nvidia.com>
> Sent: Monday, January 18, 2021 17:27
> To: Slava Ovsiienko <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>;
> Asaf Penso <asafp@nvidia.com>
> Subject: [PATCH v2 2/2] net/mlx5: support new global device syntax
> 
> This patch support new global device syntax like:
> 	bus=pci,addr=BB:DD.F/class=eth/driver=mlx5,devargs,..
> 
> In driver parameters check, ignores "driver" key which is part of new global
> device syntax instead of reporting error.
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

> ---
>  drivers/net/mlx5/mlx5.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> e245276fce..3b0e59ce1d 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -41,6 +41,9 @@
>  #include "mlx5_flow_os.h"
>  #include "rte_pmd_mlx5.h"
> 
> +/* Driver type key for new device global syntax. */ #define
> +MLX5_DRIVER_KEY "driver"
> +
>  /* Device parameter to enable RX completion queue compression. */
> #define MLX5_RXQ_CQE_COMP_EN "rxq_cqe_comp_en"
> 
> @@ -1597,7 +1600,7 @@ mlx5_args_check(const char *key, const char *val,
> void *opaque)
>  	signed long tmp;
> 
>  	/* No-op, port representors are processed in mlx5_dev_spawn(). */
> -	if (!strcmp(MLX5_REPRESENTOR, key))
> +	if (!strcmp(MLX5_DRIVER_KEY, key) ||
> !strcmp(MLX5_REPRESENTOR, key))
>  		return 0;
>  	errno = 0;
>  	tmp = strtol(val, NULL, 0);
> @@ -1749,6 +1752,7 @@ int
>  mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)  {
>  	const char **params = (const char *[]){
> +		MLX5_DRIVER_KEY,
>  		MLX5_RXQ_CQE_COMP_EN,
>  		MLX5_RXQ_PKT_PAD_EN,
>  		MLX5_RX_MPRQ_EN,
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax
  2021-04-05 10:54     ` Slava Ovsiienko
@ 2021-04-08 12:24       ` Raslan Darawsheh
  2021-04-08 14:04         ` Raslan Darawsheh
  0 siblings, 1 reply; 118+ messages in thread
From: Raslan Darawsheh @ 2021-04-08 12:24 UTC (permalink / raw)
  To: Slava Ovsiienko, Xueming(Steven) Li
  Cc: dev, Matan Azrad, Shahaf Shuler, NBU-Contact-Thomas Monjalon,
	Xueming(Steven) Li, Asaf Penso

Hi,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Slava Ovsiienko
> Sent: Monday, April 5, 2021 1:55 PM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>;
> Asaf Penso <asafp@nvidia.com>
> Subject: Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global
> syntax
> 
> > -----Original Message-----
> > From: Xueming Li <xuemingl@nvidia.com>
> > Sent: Monday, January 18, 2021 17:27
> > To: Slava Ovsiienko <viacheslavo@nvidia.com>
> > Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> > <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> > <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>;
> > Asaf Penso <asafp@nvidia.com>
> > Subject: [PATCH v2 1/2] common/mlx5: support device global syntax
> >
> > This patch supports new device global device syntax, resolve class type
> from
> > "class" section if the devarg is global device syntax:
> > bus=<bus>,k=v,,,/class=<cls>,k=v,,,/driver=<pmd>,k=v,,,,
> >
> > To reuse class name of global device syntax, this patch also changes internal
> > class name introduced by commit [1] to algin with RTE class name.
> Typo: algin -> align
Fixed during integration, 
> 
> Beside this:
> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> 
> 
Patch applied to next-net-mlx,

Kindest regards
Raslan Darawsheh

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax
  2021-04-08 12:24       ` Raslan Darawsheh
@ 2021-04-08 14:04         ` Raslan Darawsheh
  2021-04-08 14:08           ` Xueming(Steven) Li
  0 siblings, 1 reply; 118+ messages in thread
From: Raslan Darawsheh @ 2021-04-08 14:04 UTC (permalink / raw)
  To: Slava Ovsiienko, Xueming(Steven) Li
  Cc: dev, Matan Azrad, Shahaf Shuler, NBU-Contact-Thomas Monjalon,
	Xueming(Steven) Li, Asaf Penso, ferruh.yigit

Due to a dependency on this EAL series:
http://patches.dpdk.org/project/dpdk/list/?series=15979
we'll drop this patch from next-net-mlx and merge it once the change in the main tree is merged.

Kindest regards,
Raslan Darawsheh

> -----Original Message-----
> From: Raslan Darawsheh
> Sent: Thursday, April 8, 2021 3:24 PM
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; Xueming(Steven) Li
> <xuemingl@nvidia.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>;
> Asaf Penso <asafp@nvidia.com>
> Subject: RE: [PATCH v2 1/2] common/mlx5: support device global syntax
> 
> Hi,
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Slava Ovsiienko
> > Sent: Monday, April 5, 2021 1:55 PM
> > To: Xueming(Steven) Li <xuemingl@nvidia.com>
> > Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> > <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> > <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>;
> > Asaf Penso <asafp@nvidia.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device
> global
> > syntax
> >
> > > -----Original Message-----
> > > From: Xueming Li <xuemingl@nvidia.com>
> > > Sent: Monday, January 18, 2021 17:27
> > > To: Slava Ovsiienko <viacheslavo@nvidia.com>
> > > Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> > > <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> > > <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>;
> > > Asaf Penso <asafp@nvidia.com>
> > > Subject: [PATCH v2 1/2] common/mlx5: support device global syntax
> > >
> > > This patch supports new device global device syntax, resolve class type
> > from
> > > "class" section if the devarg is global device syntax:
> > > bus=<bus>,k=v,,,/class=<cls>,k=v,,,/driver=<pmd>,k=v,,,,
> > >
> > > To reuse class name of global device syntax, this patch also changes
> internal
> > > class name introduced by commit [1] to algin with RTE class name.
> > Typo: algin -> align
> Fixed during integration,
> >
> > Beside this:
> > Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> >
> >
> Patch applied to next-net-mlx,
> 
> Kindest regards
> Raslan Darawsheh

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax
  2021-04-08 14:04         ` Raslan Darawsheh
@ 2021-04-08 14:08           ` Xueming(Steven) Li
  2021-04-08 14:13             ` Raslan Darawsheh
  0 siblings, 1 reply; 118+ messages in thread
From: Xueming(Steven) Li @ 2021-04-08 14:08 UTC (permalink / raw)
  To: Raslan Darawsheh, Slava Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, NBU-Contact-Thomas Monjalon,
	Asaf Penso, ferruh.yigit

Hi Raslan,

Did you see anything broken? AFAIK, having it in the repo shouldn't hurt.
It's your decision :)

Thanks,
Xueming

> -----Original Message-----
> From: Raslan Darawsheh <rasland@nvidia.com>
> Sent: Thursday, April 8, 2021 10:04 PM
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf Penso <asafp@nvidia.com>; ferruh.yigit@intel.com
> Subject: RE: [PATCH v2 1/2] common/mlx5: support device global syntax
> 
> Due to dependency in eal:
> http://patches.dpdk.org/project/dpdk/list/?series=15979
> we'll drop from next-net-mlx,  and will merge once the change in main tree is merged.
> 
> Kindest regards,
> Raslan Darawsheh
> 
> > -----Original Message-----
> > From: Raslan Darawsheh
> > Sent: Thursday, April 8, 2021 3:24 PM
> > To: Slava Ovsiienko <viacheslavo@nvidia.com>; Xueming(Steven) Li
> > <xuemingl@nvidia.com>
> > Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> > <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> > <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf
> > Penso <asafp@nvidia.com>
> > Subject: RE: [PATCH v2 1/2] common/mlx5: support device global syntax
> >
> > Hi,
> >
> > > -----Original Message-----
> > > From: dev <dev-bounces@dpdk.org> On Behalf Of Slava Ovsiienko
> > > Sent: Monday, April 5, 2021 1:55 PM
> > > To: Xueming(Steven) Li <xuemingl@nvidia.com>
> > > Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> > > <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> > > <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>;
> > > Asaf Penso <asafp@nvidia.com>
> > > Subject: Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device
> > global
> > > syntax
> > >
> > > > -----Original Message-----
> > > > From: Xueming Li <xuemingl@nvidia.com>
> > > > Sent: Monday, January 18, 2021 17:27
> > > > To: Slava Ovsiienko <viacheslavo@nvidia.com>
> > > > Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> > > > <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> > > > <thomas@monjalon.net>; Xueming(Steven) Li <xuemingl@nvidia.com>;
> > > > Asaf Penso <asafp@nvidia.com>
> > > > Subject: [PATCH v2 1/2] common/mlx5: support device global syntax
> > > >
> > > > This patch supports new device global device syntax, resolve class
> > > > type
> > > from
> > > > "class" section if the devarg is global device syntax:
> > > > bus=<bus>,k=v,,,/class=<cls>,k=v,,,/driver=<pmd>,k=v,,,,
> > > >
> > > > To reuse class name of global device syntax, this patch also
> > > > changes
> > internal
> > > > class name introduced by commit [1] to algin with RTE class name.
> > > Typo: algin -> align
> > Fixed during integration,
> > >
> > > Beside this:
> > > Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> > >
> > >
> > Patch applied to next-net-mlx,
> >
> > Kindest regards
> > Raslan Darawsheh

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax
  2021-04-08 14:08           ` Xueming(Steven) Li
@ 2021-04-08 14:13             ` Raslan Darawsheh
  2021-04-19  9:29               ` Raslan Darawsheh
  0 siblings, 1 reply; 118+ messages in thread
From: Raslan Darawsheh @ 2021-04-08 14:13 UTC (permalink / raw)
  To: Xueming(Steven) Li, Slava Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, NBU-Contact-Thomas Monjalon,
	Asaf Penso, ferruh.yigit

Hi,

> -----Original Message-----
> From: Xueming(Steven) Li <xuemingl@nvidia.com>
> Sent: Thursday, April 8, 2021 5:08 PM
> To: Raslan Darawsheh <rasland@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Asaf Penso <asafp@nvidia.com>;
> ferruh.yigit@intel.com
> Subject: RE: [PATCH v2 1/2] common/mlx5: support device global syntax
> 
> Hi Raslan,
> 
> Did you see anything broken? AFAIK, having it in the repo shouldn't hurt.
> It's your decision :)
No, it doesn't hurt or break anything really.
But it has a logical dependency on the main tree, so I'll wait until that is merged and then take this one.

> 
> Thanks,
> Xueming
> 
Kindest regards
Raslan Darawsheh

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v3 4/5] bus: add device arguments name parsing API
  2021-04-01 15:13       ` Xueming(Steven) Li
@ 2021-04-08 23:49         ` Thomas Monjalon
  0 siblings, 0 replies; 118+ messages in thread
From: Thomas Monjalon @ 2021-04-08 23:49 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: dev, Gaetan Rivet, dev, Asaf Penso, david.marchand, ferruh.yigit,
	andrew.rybchenko, hemant.agrawal, stephen, rosen.xu,
	ajit.khaparde, jerinj

01/04/2021 17:13, Xueming(Steven) Li:
>From: Thomas Monjalon <thomas@monjalon.net>
> >30/03/2021 14:15, Xueming Li:
> >> To use Global Device Syntax as devargs, name is required for device
> >> management.
> >
> >Context is missing.
> >You mean the argument "name" for the vdev bus?
> 
> Devargs.name, it is used by probe and device iterator.

I think we could avoid having a name with the new syntax.
In my understanding, this is for compatibility with the legacy syntax.

> To locate a device from a devargs, devargs.name is compared
> against the names of probed devices.
> This is not an issue for the legacy syntax,
> since the string after "bus:" is saved as the name.

It would be interesting to explain where the name is parsed
for the legacy syntax: rte_devargs_parse
and for the new syntax: rte_devargs_layers_parse called
in rte_devargs_parse in the next patch.

> >> In legacy parsing API, devargs name was extracted after bus name:
> >>   bus:name,kv_params,,,
> >>
> >> To parse new Global Device Syntax, this patch introduces new bus API
> >> to parse devargs and update name, different bus driver might choose
> >> different keys from parameters with unified formatting, example:
> >>  -a bus=pci,addr=83:00.0/class=eth/driver=mlx5,...
> >>     name: 0000:83:00.0
> >>  -a bus=vdev,name=pcap0/class=eth/driver=pcap,...
> >>     name:pcap0
> >
> >Only PCI and vdev buses are implemented.
> >What can be the plan for others?
> >We should track the progress somewhere, maybe with TODO comments.
> 
> Like the legacy parser, how about using "name" as the default name key, so the new syntax parser can resolve it by default?
> Then the PCI bus overrides it by using the "addr" key in the new bus API,
> while vdev and other bus drivers simply use the default implementation, i.e. "name" as the key.

Yes, you mean if devargs_parse is not implemented by the bus driver,
the default is to parse the name property,
while the PCI implementation fills the devargs name with the addr property.
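The default described above can be sketched in plain C. This is only an illustration under stated assumptions: layer_get() and resolve_name() are hypothetical helpers, not DPDK APIs, and the bus check stands in for "the bus driver implements devargs_parse". Unlike the real PCI code, the sketch does not normalize the address through rte_pci_device_name().

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not a DPDK API: copy the value of `key` from a
 * "k=v,k=v" layer string into `out`. Returns 0 on success, -1 if the
 * key is absent or the value does not fit.
 */
static int
layer_get(const char *layer, const char *key, char *out, size_t outlen)
{
	size_t klen = strlen(key);
	const char *p = layer;

	assert(layer != NULL && out != NULL && outlen > 0);
	while (*p != '\0') {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (len > klen && strncmp(p, key, klen) == 0 &&
		    p[klen] == '=') {
			size_t vlen = len - klen - 1;

			if (vlen >= outlen)
				return -1;
			memcpy(out, p + klen + 1, vlen);
			out[vlen] = '\0';
			return 0;
		}
		p += len;
		if (*p == ',')
			p++;
	}
	return -1;
}

/*
 * Resolve the devargs name from the bus layer: the "addr" property when
 * the bus is PCI (standing in for a bus-specific callback), the "name"
 * property otherwise (the proposed default).
 */
static int
resolve_name(const char *bus_layer, char *name, size_t len)
{
	char bus[32];

	if (layer_get(bus_layer, "bus", bus, sizeof(bus)) != 0)
		return -1;
	if (strcmp(bus, "pci") == 0)
		return layer_get(bus_layer, "addr", name, len);
	return layer_get(bus_layer, "name", name, len);
}
```

With this sketch, "bus=vdev,name=pcap0" resolves through the default "name" key, while "bus=pci,addr=83:00.0" resolves through "addr".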

> >[...]
> >> +int
> >> +rte_pci_devargs_parse(struct rte_devargs *da) {
> >> +	struct rte_kvargs *kvargs;
> >> +	const char *addr_str;
> >> +	struct rte_pci_addr addr;
> >> +	int ret;
> >> +
> >> +	if (da == NULL)
> >> +		return 0;
> >> +	RTE_ASSERT(da->bus_str != NULL);
> >> +
> >> +	kvargs = rte_kvargs_parse(da->bus_str, NULL);
> >> +	if (kvargs == NULL) {
> >> +		RTE_LOG(ERR, EAL, "cannot parse argument list: %s\n",
> >> +			da->bus_str);
> >> +		ret = -ENODEV;
> >> +		goto out;
> >> +	}
> >> +
> >> +	addr_str = rte_kvargs_get(kvargs, pci_params_keys[RTE_PCI_PARAM_ADDR]);
> >> +	if (addr_str == NULL) {
> >> +		RTE_LOG(ERR, EAL, "No PCI address specified using '%s=<id>' in: %s\n",
> >> +			pci_params_keys[RTE_PCI_PARAM_ADDR], da->bus_str);
> >> +		ret = -ENODEV;
> >> +		goto out;
> >> +	}
> >> +
> >> +	ret = rte_pci_addr_parse(addr_str, &addr);
> >> +	if (ret != 0) {
> >> +		RTE_LOG(ERR, EAL, "PCI address invalid: %s\n", da->bus_str);
> >> +		ret = -EINVAL;
> >> +		goto out;
> >> +	}
> >> +
> >> +	rte_pci_device_name(&addr, da->name, sizeof(da->name));
> >> +
> >> +	/* TODO: class parse -> driver parse */
> >
> >Please could you give a longer explanation of what is missing?
> 
> Just an option to give class driver and device driver to override name parsing.
> But this should happen in new devargs parser, not here if needed.
> 
> I'll remove this line, and mention it in commit log.

What would be the benefit of overriding devargs name parsing?
I feel it would be better to not rely on the devargs name
with the new syntax if we have such need.



^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v4 0/5] eal: enable global device syntax by default
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (54 preceding siblings ...)
  2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 5/5] devargs: parse global device syntax Xueming Li
@ 2021-04-10 14:23   ` Xueming Li
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 1/5] devargs: unify scratch buffer storage Xueming Li
                     ` (10 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-04-10 14:23 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev, xuemingl, Asaf Penso

The new Global Device Syntax [1] is used to identify a device with full
bus, class and driver description, example:
 -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,...

This patchset fixes bugs and enables the global device syntax with
backward compatibility by:
- unifying devargs memory buffer cleanup
- parsing the name through a bus driver callback API
- trying the new global syntax parser first and falling back to legacy parsing.
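As a rough illustration of the "new syntax first, legacy as fallback" idea, the sketch below splits a device string into its bus, class and driver layers. It is not the code from this series: struct layers and split_layers() are hypothetical names, and the real parser is rte_devargs_layers_parse(). The function modifies its input, so a caller would work on a private copy and hand the untouched original to the legacy parser on failure.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the three layers; not the DPDK struct. */
struct layers {
	char *bus; /* "bus=..." part, or NULL */
	char *cls; /* "class=..." part, or NULL */
	char *drv; /* "driver=..." part, or NULL */
};

/*
 * Split `buf` in place on '/' and classify each part by its leading key.
 * Returns 0 if the string looks like the global syntax, -1 otherwise,
 * in which case the caller would fall back to the legacy
 * "bus:name,args" parser.
 */
static int
split_layers(char *buf, struct layers *out)
{
	char *s = buf;

	assert(buf != NULL && out != NULL);
	memset(out, 0, sizeof(*out));
	while (s != NULL) {
		char *next = strchr(s, '/');

		if (next != NULL)
			*next++ = '\0';
		if (strncmp(s, "bus=", 4) == 0)
			out->bus = s;
		else if (strncmp(s, "class=", 6) == 0)
			out->cls = s;
		else if (strncmp(s, "driver=", 7) == 0)
			out->drv = s;
		else
			return -1; /* unknown layer: not the global syntax */
		s = next;
	}
	return 0;
}
```

A legacy string such as "0000:82:00.0,dv_flow_en=1" has no recognized layer key, so the sketch rejects it and the legacy path would take over.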


History:

V1:
 - Initial version

V2:
 - add devargs.src as complete source dev string
 - change devargs.data to scratch buffer
 - add rte_devargs_free() to release scratch memory
 - change name policy to align with rte_eth_iterator_init()
 - remove PCI bus fix as name already resolved in rte_devargs_parse().
V3:
 - remove devargs.src
 - rename rte_devargs_free() to rte_devargs_reset()
 - add bus callback api to resolve devargs.
V4:
 - add RTE_DEVARGS_KEY_BUS/CLASS/DRIVER macros
 - parse "name" by default if there is no bus devargs parsing callback
 - Minor fixes suggested by Ray and Thomas


[1] Global Device Syntax:
https://www.dpdk.org/wp-content/uploads/sites/35/2018/10/am-07-DPDK-hotplug-20180905.pdf

[2] RFC:
http://patchwork.dpdk.org/project/dpdk/list/?series=14378

[3] V1:
http://patchwork.dpdk.org/project/dpdk/list/?series=14610

[4] V2:
http://patchwork.dpdk.org/project/dpdk/list/?series=14816

[5] V3:
http://patchwork.dpdk.org/project/dpdk/list/?series=15979


Xueming Li (5):
  devargs: unify scratch buffer storage
  devargs: fix memory leak on parsing error
  kvargs: add get by key function
  bus: add device arguments name parsing API
  devargs: parse global device syntax

 app/test-pmd/config.c                        |  3 +-
 app/test-pmd/testpmd.c                       |  5 +-
 doc/guides/rel_notes/release_21_05.rst       |  6 ++
 drivers/bus/pci/pci_common.c                 |  1 +
 drivers/bus/pci/pci_params.c                 | 47 ++++++++++
 drivers/bus/pci/private.h                    | 14 +++
 drivers/bus/vdev/vdev.c                      |  9 +-
 drivers/net/failsafe/failsafe_args.c         |  3 +-
 drivers/net/failsafe/failsafe_eal.c          |  2 +-
 examples/multi_process/hotplug_mp/commands.c |  6 +-
 lib/librte_eal/common/eal_common_dev.c       |  9 +-
 lib/librte_eal/common/eal_common_devargs.c   | 91 +++++++++++++++-----
 lib/librte_eal/common/hotplug_mp.c           |  6 +-
 lib/librte_eal/include/rte_bus.h             | 18 ++++
 lib/librte_eal/include/rte_devargs.h         | 22 ++++-
 lib/librte_eal/rte_eal_exports.def           |  1 +
 lib/librte_eal/version.map                   |  1 +
 lib/librte_ethdev/rte_ethdev.c               |  9 +-
 lib/librte_kvargs/rte_kvargs.c               | 20 +++++
 lib/librte_kvargs/rte_kvargs.h               | 21 +++++
 lib/librte_kvargs/version.map                |  3 +
 21 files changed, 245 insertions(+), 52 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v4 1/5] devargs: unify scratch buffer storage
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (55 preceding siblings ...)
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 0/5] eal: enable global device syntax by default Xueming Li
@ 2021-04-10 14:23   ` Xueming Li
  2021-04-10 19:59     ` Tal Shnaiderman
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 2/5] devargs: fix memory leak on parsing error Xueming Li
                     ` (9 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-04-10 14:23 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet
  Cc: dev, xuemingl, Asaf Penso, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Gaetan Rivet, Anatoly Burakov, Dmitry Kozlyuk,
	Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam,
	Ray Kinsella, Neil Horman, Ferruh Yigit, Andrew Rybchenko

In the current design, the legacy parser rte_devargs_parse() saves its
scratch buffer to devargs.args, while the new parser
rte_devargs_layers_parse() saves it to devargs.data. Code using devargs
has to know the difference and clean up memory accordingly, which is
error prone.

This patch unifies the scratch buffer in the data field and introduces
the rte_devargs_reset() function to wrap the memory cleanup logic.
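
The ownership rule can be illustrated with a reduced stand-in for the structure. This is a sketch only: struct devargs_sketch and both functions are hypothetical and merely mirror the idea that data is the single owning pointer while args is a view into it. Clearing args in the reset function is a choice of the sketch, not something this patch does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for struct rte_devargs, reduced to the fields discussed. */
struct devargs_sketch {
	const char *args; /* view into data; never freed directly */
	char *data;       /* owning scratch buffer */
};

/* Mirrors the idea of rte_devargs_reset(): release the single owner. */
static void
devargs_sketch_reset(struct devargs_sketch *da)
{
	if (da == NULL)
		return;
	free(da->data); /* free(NULL) is a no-op */
	da->data = NULL;
	da->args = NULL; /* sketch-only: avoid a dangling view */
}

/* Hypothetical parser: duplicate the argument string into data. */
static int
devargs_sketch_parse(struct devargs_sketch *da, const char *args)
{
	size_t len;

	assert(da != NULL && args != NULL);
	len = strlen(args) + 1;
	da->data = malloc(len);
	if (da->data == NULL)
		return -1;
	memcpy(da->data, args, len);
	da->args = da->data;
	return 0;
}
```

Callers then need one cleanup call on every exit path, and calling it twice or on a zeroed structure is harmless.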

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Reviewed-by: Gaetan Rivet <grive@u256.net>
---
 app/test-pmd/config.c                        |  3 +-
 app/test-pmd/testpmd.c                       |  5 +--
 drivers/bus/vdev/vdev.c                      |  9 +++---
 drivers/net/failsafe/failsafe_args.c         |  3 +-
 drivers/net/failsafe/failsafe_eal.c          |  2 +-
 examples/multi_process/hotplug_mp/commands.c |  6 ++--
 lib/librte_eal/common/eal_common_dev.c       |  9 +++---
 lib/librte_eal/common/eal_common_devargs.c   | 34 +++++++++++---------
 lib/librte_eal/common/hotplug_mp.c           |  6 ++--
 lib/librte_eal/include/rte_devargs.h         | 18 ++++++++---
 lib/librte_eal/rte_eal_exports.def           |  1 +
 lib/librte_eal/version.map                   |  1 +
 lib/librte_ethdev/rte_ethdev.c               |  8 ++---
 13 files changed, 59 insertions(+), 46 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index ef0b9784d0..d774610419 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -509,8 +509,6 @@ device_infos_display(const char *identifier)
 
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -558,6 +556,7 @@ device_infos_display(const char *identifier)
 			}
 		}
 	};
+	rte_devargs_reset(&da);
 }
 
 void
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 96d2e0fcec..d4be23f8f8 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3015,8 +3015,6 @@ detach_devargs(char *identifier)
 	memset(&da, 0, sizeof(da));
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -3025,6 +3023,7 @@ detach_devargs(char *identifier)
 			if (ports[port_id].port_status != RTE_PORT_STOPPED) {
 				printf("Port %u not stopped\n", port_id);
 				rte_eth_iterator_cleanup(&iterator);
+				rte_devargs_reset(&da);
 				return;
 			}
 			port_flow_flush(port_id);
@@ -3034,6 +3033,7 @@ detach_devargs(char *identifier)
 	if (rte_eal_hotplug_remove(da.bus->name, da.name) != 0) {
 		TESTPMD_LOG(ERR, "Failed to detach device %s(%s)\n",
 			    da.name, da.bus->name);
+		rte_devargs_reset(&da);
 		return;
 	}
 
@@ -3042,6 +3042,7 @@ detach_devargs(char *identifier)
 	printf("Device %s is detached\n", identifier);
 	printf("Now total ports is %d\n", nb_ports);
 	printf("Done\n");
+	rte_devargs_reset(&da);
 }
 
 void
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 9a673347ae..d075409942 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -245,13 +245,14 @@ alloc_devargs(const char *name, const char *args)
 
 	devargs->bus = &rte_vdev_bus;
 	if (args)
-		devargs->args = strdup(args);
+		devargs->data = strdup(args);
 	else
-		devargs->args = strdup("");
+		devargs->data = strdup("");
+	devargs->args = devargs->data;
 
 	ret = strlcpy(devargs->name, name, sizeof(devargs->name));
 	if (ret < 0 || ret >= (int)sizeof(devargs->name)) {
-		free(devargs->args);
+		rte_devargs_reset(devargs);
 		free(devargs);
 		return NULL;
 	}
@@ -305,7 +306,7 @@ insert_vdev(const char *name, const char *args,
 
 	return 0;
 fail:
-	free(devargs->args);
+	rte_devargs_reset(devargs);
 	free(devargs);
 	free(dev);
 	return ret;
diff --git a/drivers/net/failsafe/failsafe_args.c b/drivers/net/failsafe/failsafe_args.c
index 707490b94c..b203e02d9a 100644
--- a/drivers/net/failsafe/failsafe_args.c
+++ b/drivers/net/failsafe/failsafe_args.c
@@ -451,8 +451,7 @@ failsafe_args_free(struct rte_eth_dev *dev)
 		sdev->cmdline = NULL;
 		free(sdev->fd_str);
 		sdev->fd_str = NULL;
-		free(sdev->devargs.args);
-		sdev->devargs.args = NULL;
+		rte_devargs_reset(&sdev->devargs);
 	}
 }
 
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index b9fc508673..cb4a2abc02 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -79,7 +79,7 @@ fs_bus_init(struct rte_eth_dev *dev)
 					rte_eth_devices[pid].device->devargs;
 
 			/* Take control of probed device. */
-			free(da->args);
+			rte_devargs_reset(da);
 			memset(da, 0, sizeof(*da));
 			if (probed_da != NULL)
 				snprintf(devstr, sizeof(devstr), "%s,%s",
diff --git a/examples/multi_process/hotplug_mp/commands.c b/examples/multi_process/hotplug_mp/commands.c
index a8a39d07f7..48fd329583 100644
--- a/examples/multi_process/hotplug_mp/commands.c
+++ b/examples/multi_process/hotplug_mp/commands.c
@@ -121,8 +121,6 @@ static void cmd_dev_attach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -131,6 +129,7 @@ static void cmd_dev_attach_parsed(void *parsed_result,
 	else
 		cmdline_printf(cl, "failed to attached device %s\n",
 				da.name);
+	rte_devargs_reset(&da);
 }
 
 cmdline_parse_token_string_t cmd_dev_attach_attach =
@@ -168,8 +167,6 @@ static void cmd_dev_detach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -180,6 +177,7 @@ static void cmd_dev_detach_parsed(void *parsed_result,
 	else
 		cmdline_printf(cl, "failed to dettach device %s\n",
 			da.name);
+	rte_devargs_reset(&da);
 }
 
 cmdline_parse_token_string_t cmd_dev_detach_detach =
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 8a3bd3100a..148a23830a 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -185,10 +185,8 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
 	return ret;
 
 err_devarg:
-	if (rte_devargs_remove(da) != 0) {
-		free(da->args);
-		free(da);
-	}
+	if (rte_devargs_remove(da) != 0)
+		rte_devargs_reset(da);
 	return ret;
 }
 
@@ -586,7 +584,8 @@ rte_dev_iterator_init(struct rte_dev_iterator *it,
 	it->bus_str = NULL;
 	it->cls_str = NULL;
 
-	devargs.data = dev_str;
+	/* Setting data field implies no malloc in parsing. */
+	devargs.data = (void *)(intptr_t)dev_str;
 	if (rte_devargs_layers_parse(&devargs, dev_str))
 		goto get_out;
 
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index fcf3d9a3cc..48f85ee9c0 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -150,7 +150,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 	 * their parsing afterward.
 	 */
 	if (devargs->data != devstr) {
-		char *s = (void *)(intptr_t)(devargs->data);
+		char *s = devargs->data;
 
 		while ((s = strchr(s, '/'))) {
 			*s = '\0';
@@ -219,13 +219,14 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	da->bus = bus;
 	/* Parse eventual device arguments */
 	if (devname[i] == ',')
-		da->args = strdup(&devname[i + 1]);
+		da->data = strdup(&devname[i + 1]);
 	else
-		da->args = strdup("");
-	if (da->args == NULL) {
+		da->data = strdup("");
+	if (da->data == NULL) {
 		RTE_LOG(ERR, EAL, "not enough memory to parse arguments\n");
 		return -ENOMEM;
 	}
+	da->drv_str = da->data;
 	return 0;
 }
 
@@ -260,6 +261,16 @@ rte_devargs_parsef(struct rte_devargs *da, const char *format, ...)
 	return ret;
 }
 
+void
+rte_devargs_reset(struct rte_devargs *da)
+{
+	if (da == NULL)
+		return;
+	if (da->data)
+		free(da->data);
+	da->data = NULL;
+}
+
 int
 rte_devargs_insert(struct rte_devargs **da)
 {
@@ -276,15 +287,8 @@ rte_devargs_insert(struct rte_devargs **da)
 		if (strcmp(listed_da->bus->name, (*da)->bus->name) == 0 &&
 				strcmp(listed_da->name, (*da)->name) == 0) {
 			/* device already in devargs list, must be updated */
-			listed_da->type = (*da)->type;
-			listed_da->policy = (*da)->policy;
-			free(listed_da->args);
-			listed_da->args = (*da)->args;
-			listed_da->bus = (*da)->bus;
-			listed_da->cls = (*da)->cls;
-			listed_da->bus_str = (*da)->bus_str;
-			listed_da->cls_str = (*da)->cls_str;
-			listed_da->data = (*da)->data;
+			rte_devargs_reset(listed_da);
+			*listed_da = **da;
 			/* replace provided devargs with found one */
 			free(*da);
 			*da = listed_da;
@@ -326,7 +330,7 @@ rte_devargs_add(enum rte_devtype devtype, const char *devargs_str)
 
 fail:
 	if (devargs) {
-		free(devargs->args);
+		rte_devargs_reset(devargs);
 		free(devargs);
 	}
 
@@ -346,7 +350,7 @@ rte_devargs_remove(struct rte_devargs *devargs)
 		if (strcmp(d->bus->name, devargs->bus->name) == 0 &&
 		    strcmp(d->name, devargs->name) == 0) {
 			TAILQ_REMOVE(&devargs_list, d, next);
-			free(d->args);
+			rte_devargs_reset(d);
 			free(d);
 			return 0;
 		}
diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c
index ee791903b3..ae6010e8f8 100644
--- a/lib/librte_eal/common/hotplug_mp.c
+++ b/lib/librte_eal/common/hotplug_mp.c
@@ -95,6 +95,7 @@ __handle_secondary_request(void *param)
 
 	tmp_req = *req;
 
+	memset(&da, 0, sizeof(da));
 	if (req->t == EAL_DEV_REQ_TYPE_ATTACH) {
 		ret = local_dev_probe(req->devargs, &dev);
 		if (ret != 0) {
@@ -118,8 +119,6 @@ __handle_secondary_request(void *param)
 		ret = rte_devargs_parse(&da, req->devargs);
 		if (ret != 0)
 			goto finish;
-		free(da.args); /* we don't need those */
-		da.args = NULL;
 
 		ret = eal_dev_hotplug_request_to_secondary(&tmp_req);
 		if (ret != 0) {
@@ -176,6 +175,7 @@ __handle_secondary_request(void *param)
 	if (ret)
 		RTE_LOG(ERR, EAL, "failed to send response to secondary\n");
 
+	rte_devargs_reset(&da);
 	free(bundle->peer);
 	free(bundle);
 }
@@ -283,7 +283,7 @@ static void __handle_primary_request(void *param)
 
 		ret = local_dev_remove(dev);
 quit:
-		free(da->args);
+		rte_devargs_reset(da);
 		free(da);
 		break;
 	default:
diff --git a/lib/librte_eal/include/rte_devargs.h b/lib/librte_eal/include/rte_devargs.h
index 296f19324f..134b44a887 100644
--- a/lib/librte_eal/include/rte_devargs.h
+++ b/lib/librte_eal/include/rte_devargs.h
@@ -60,16 +60,16 @@ struct rte_devargs {
 	/** Name of the device. */
 	char name[RTE_DEV_NAME_MAX_LEN];
 	RTE_STD_C11
-	union {
-	/** Arguments string as given by user or "" for no argument. */
-		char *args;
+	union { /**< driver-related part of device string. */
+		const char *args; /**< legacy name. */
 		const char *drv_str;
 	};
 	struct rte_bus *bus; /**< bus handle. */
 	struct rte_class *cls; /**< class handle. */
 	const char *bus_str; /**< bus-related part of device string. */
 	const char *cls_str; /**< class-related part of device string. */
-	const char *data; /**< Device string storage. */
+	char *data;
+	/**< Raw string including bus, class and driver arguments. */
 };
 
 /**
@@ -145,6 +145,16 @@ rte_devargs_parsef(struct rte_devargs *da,
 		   const char *format, ...)
 __rte_format_printf(2, 0);
 
+/**
+ * Free resources in devargs.
+ *
+ * @param da
+ *   The devargs structure holding the device information.
+ */
+__rte_experimental
+void
+rte_devargs_reset(struct rte_devargs *da);
+
 /**
  * Insert an rte_devargs in the global list.
  *
diff --git a/lib/librte_eal/rte_eal_exports.def b/lib/librte_eal/rte_eal_exports.def
index c320077547..357de89ffc 100644
--- a/lib/librte_eal/rte_eal_exports.def
+++ b/lib/librte_eal/rte_eal_exports.def
@@ -30,6 +30,7 @@ EXPORTS
 	rte_devargs_parse
 	rte_devargs_parsef
 	rte_devargs_remove
+	rte_devargs_reset
 	rte_devargs_type_count
 	rte_dump_physmem_layout
 	rte_dump_stack
diff --git a/lib/librte_eal/version.map b/lib/librte_eal/version.map
index e23745ae6e..174c548297 100644
--- a/lib/librte_eal/version.map
+++ b/lib/librte_eal/version.map
@@ -411,6 +411,7 @@ EXPERIMENTAL {
 	rte_power_pause;
 
 	# added in 21.05
+	rte_devargs_reset;
 	rte_thread_key_create;
 	rte_thread_key_delete;
 	rte_thread_value_get;
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 3059aa55b3..e11a95558f 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -193,13 +193,14 @@ int
 rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 {
 	int ret;
-	struct rte_devargs devargs = {.args = NULL};
+	struct rte_devargs devargs;
 	const char *bus_param_key;
 	char *bus_str = NULL;
 	char *cls_str = NULL;
 	int str_size;
 
 	memset(iter, 0, sizeof(*iter));
+	memset(&devargs, 0, sizeof(devargs));
 
 	/*
 	 * The devargs string may use various syntaxes:
@@ -244,8 +245,6 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 		goto error;
 	}
 	iter->cls_str = cls_str;
-	free(devargs.args); /* allocated by rte_devargs_parse() */
-	devargs.args = NULL;
 
 	iter->bus = devargs.bus;
 	if (iter->bus->dev_iterate == NULL) {
@@ -278,13 +277,14 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 
 end:
 	iter->cls = rte_class_find_by_name("eth");
+	rte_devargs_reset(&devargs);
 	return 0;
 
 error:
 	if (ret == -ENOTSUP)
 		RTE_ETHDEV_LOG(ERR, "Bus %s does not support iterating.\n",
 				iter->bus->name);
-	free(devargs.args);
+	rte_devargs_reset(&devargs);
 	free(bus_str);
 	free(cls_str);
 	return ret;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v4 2/5] devargs: fix memory leak on parsing error
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (56 preceding siblings ...)
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 1/5] devargs: unify scratch buffer storage Xueming Li
@ 2021-04-10 14:23   ` Xueming Li
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 3/5] kvargs: add get by key function Xueming Li
                     ` (8 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-04-10 14:23 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet
  Cc: dev, xuemingl, Asaf Penso, gaetan.rivet, stable, Shreyansh Jain

This patch fixes a memory leak in rte_devargs_layers_parse(): the data
buffer duplicated by the parser was not freed when parsing failed.

Fixes: 338327d731e6 ("devargs: add function to parse device layers")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
---
 lib/librte_eal/common/eal_common_devargs.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index 48f85ee9c0..e40b91ea66 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -60,6 +60,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 	size_t nblayer;
 	size_t i = 0;
 	int ret = 0;
+	bool allocated_data = false;
 
 	/* Split each sub-lists. */
 	nblayer = devargs_layer_count(devstr);
@@ -81,6 +82,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 			ret = -ENOMEM;
 			goto get_out;
 		}
+		allocated_data = true;
 		s = devargs->data;
 	}
 
@@ -163,8 +165,14 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist)
 			rte_kvargs_free(layers[i].kvlist);
 	}
-	if (ret != 0)
+	if (ret != 0) {
+		if (allocated_data) {
+			/* Free duplicated data. */
+			free(devargs->data);
+			devargs->data = NULL;
+		}
 		rte_errno = -ret;
+	}
 	return ret;
 }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v4 3/5] kvargs: add get by key function
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (57 preceding siblings ...)
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 2/5] devargs: fix memory leak on parsing error Xueming Li
@ 2021-04-10 14:23   ` Xueming Li
  2021-04-12  6:52     ` Olivier Matz
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 4/5] bus: add device arguments name parsing API Xueming Li
                     ` (7 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-04-10 14:23 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet
  Cc: dev, xuemingl, Asaf Penso, Olivier Matz, Ray Kinsella, Neil Horman

Add a new function, rte_kvargs_get(), to get the value of a specific key
from a kvargs list.
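
The lookup rules can be tried outside DPDK with the stand-alone model below. It is a sketch under assumptions: struct pair, struct kvlist and kvlist_get() are illustrative stand-ins for the rte_kvargs types and for rte_kvargs_get(), with a fixed capacity of 8 pairs chosen only for the example. It follows the rules implemented in this patch: the first matching key wins, and a NULL key only matches a pair whose key is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the rte_kvargs types; names and sizes are illustrative. */
struct pair {
	const char *key;
	const char *value;
};

struct kvlist {
	unsigned int count;
	struct pair pairs[8];
};

/* First match wins; a NULL key only matches a pair with a NULL key. */
static const char *
kvlist_get(const struct kvlist *kv, const char *key)
{
	unsigned int i;

	if (kv == NULL)
		return NULL;
	assert(kv->count <= 8);
	for (i = 0; i < kv->count; i++) {
		const char *k = kv->pairs[i].key;

		if (key == NULL && k == NULL)
			return kv->pairs[i].value;
		if (key == NULL || k == NULL)
			continue;
		if (strcmp(k, key) == 0)
			return kv->pairs[i].value;
	}
	return NULL;
}
```

The returned pointer belongs to the list, so a caller would copy the value if it must outlive the list.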

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
---
 lib/librte_kvargs/rte_kvargs.c | 20 ++++++++++++++++++++
 lib/librte_kvargs/rte_kvargs.h | 21 +++++++++++++++++++++
 lib/librte_kvargs/version.map  |  3 +++
 3 files changed, 44 insertions(+)

diff --git a/lib/librte_kvargs/rte_kvargs.c b/lib/librte_kvargs/rte_kvargs.c
index ffae8914cf..40e7670ab3 100644
--- a/lib/librte_kvargs/rte_kvargs.c
+++ b/lib/librte_kvargs/rte_kvargs.c
@@ -203,6 +203,26 @@ rte_kvargs_free(struct rte_kvargs *kvlist)
 	free(kvlist);
 }
 
+/* Lookup a value in an rte_kvargs list by its key. */
+const char *
+rte_kvargs_get(const struct rte_kvargs *kvlist, const char *key)
+{
+	unsigned int i;
+
+	if (!kvlist)
+		return NULL;
+	for (i = 0; i < kvlist->count; ++i) {
+		/* Allows key to be NULL. */
+		if (!key && !kvlist->pairs[i].key)
+			return kvlist->pairs[i].value;
+		if (!key || !kvlist->pairs[i].key)
+			continue;
+		if (!strcmp(kvlist->pairs[i].key, key))
+			return kvlist->pairs[i].value;
+	}
+	return NULL;
+}
+
 /*
  * Parse the arguments "key=value,key=value,..." string and return
  * an allocated structure that contains a key/value list. Also
diff --git a/lib/librte_kvargs/rte_kvargs.h b/lib/librte_kvargs/rte_kvargs.h
index eff598e08b..cb3ea99850 100644
--- a/lib/librte_kvargs/rte_kvargs.h
+++ b/lib/librte_kvargs/rte_kvargs.h
@@ -114,6 +114,27 @@ struct rte_kvargs *rte_kvargs_parse_delim(const char *args,
  */
 void rte_kvargs_free(struct rte_kvargs *kvlist);
 
+/**
+ * Get the value associated with a given key.
+ *
+ * If the key is NULL, the value of the first pair without a key is returned.
+ * If multiple keys match, the value of the first one is returned.
+ *
+ * The memory returned is allocated as part of the rte_kvargs structure,
+ * it must never be modified.
+ *
+ * @param kvlist
+ *   A list of rte_kvargs pair of 'key=value'.
+ * @param key
+ *   The matching key.
+ *
+ * @return
+ *   NULL if no key matches the input, a value associated with a matching
+ *   key otherwise.
+ */
+__rte_experimental
+const char *rte_kvargs_get(const struct rte_kvargs *kvlist, const char *key);
+
 /**
  * Call a handler function for each key/value matching the key
  *
diff --git a/lib/librte_kvargs/version.map b/lib/librte_kvargs/version.map
index ed375bf4a3..fb9bed4f93 100644
--- a/lib/librte_kvargs/version.map
+++ b/lib/librte_kvargs/version.map
@@ -15,4 +15,7 @@ EXPERIMENTAL {
 	rte_kvargs_parse_delim;
 	rte_kvargs_strcmp;
 
+	# added in 21.05
+	rte_kvargs_get;
+
 };
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread
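
For readers who want to try the lookup semantics of the patch above without building DPDK, here is a minimal standalone sketch. The demo_pair and demo_kvlist types and the demo_* names are hypothetical stand-ins, not the real rte_kvargs definitions; only the control flow follows the v4 rte_kvargs_get() shown in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the rte_kvargs pair and list types. */
struct demo_pair {
	const char *key;   /* may be NULL in this sketch */
	const char *value;
};

struct demo_kvlist {
	unsigned int count;
	const struct demo_pair *pairs;
};

/* Same control flow as the v4 rte_kvargs_get() in the patch above. */
static const char *
demo_kvargs_get(const struct demo_kvlist *kvlist, const char *key)
{
	unsigned int i;

	if (kvlist == NULL)
		return NULL;
	for (i = 0; i < kvlist->count; ++i) {
		const char *k = kvlist->pairs[i].key;

		/* A NULL lookup key only matches a pair that has no key. */
		if (key == NULL && k == NULL)
			return kvlist->pairs[i].value;
		if (key == NULL || k == NULL)
			continue;
		if (strcmp(k, key) == 0)
			return kvlist->pairs[i].value;
	}
	return NULL;
}

/* Sample list, as if parsed from "bus=pci,addr=83:00.0,addr=84:00.0". */
static const struct demo_pair demo_pairs[] = {
	{ "bus", "pci" },
	{ "addr", "83:00.0" },
	{ "addr", "84:00.0" },
};
static const struct demo_kvlist demo_list = { 3, demo_pairs };

/* Usage: the first match wins and a missing key yields NULL. */
static void
demo_check(void)
{
	assert(strcmp(demo_kvargs_get(&demo_list, "addr"), "83:00.0") == 0);
	assert(demo_kvargs_get(&demo_list, "class") == NULL);
}
```

The returned pointer refers to storage owned by the list, so the caller should neither modify nor free it.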

* [dpdk-dev] [PATCH v4 4/5] bus: add device arguments name parsing API
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (58 preceding siblings ...)
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 3/5] kvargs: add get by key function Xueming Li
@ 2021-04-10 14:23   ` Xueming Li
  2021-04-12 21:16     ` Thomas Monjalon
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 5/5] devargs: parse global device syntax Xueming Li
                     ` (6 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-04-10 14:23 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev, xuemingl, Asaf Penso

For device probe and iteration, the devargs name is key information,
parsed by rte_devargs_parse(). In the legacy parser, the devargs name is
extracted after the bus name:
  bus:name,kv_arguments,,,
Example:
  pci:83:00.0,arguments,...
  vdev:pcap0,...

To stay compatible with the legacy parser, this patch introduces a new
bus driver API, devargs_parse, to parse devargs and update the devargs
name. If the bus driver does not implement devargs_parse, the new syntax
parser rte_devargs_layers_parse() resolves the devargs name from the
bus's "name" argument by default.

Different bus drivers might choose different keys from the arguments in
the unified format. The PCI bus implementation fills the devargs name
with the "addr" argument, for example:
 -a bus=pci,addr=83:00.0/class=eth/driver=mlx5,...
    name: 0000:83:00.0
 -a bus=vdev,name=pcap0/class=eth/driver=pcap,...
    name: pcap0

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
---
 drivers/bus/pci/pci_common.c               |  1 +
 drivers/bus/pci/pci_params.c               | 47 ++++++++++++++++++++++
 drivers/bus/pci/private.h                  | 14 +++++++
 lib/librte_eal/common/eal_common_devargs.c | 31 ++++++++++++++
 lib/librte_eal/include/rte_bus.h           | 18 +++++++++
 5 files changed, 111 insertions(+)

diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 9b8d769287..61d3f51452 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -760,6 +760,7 @@ struct rte_pci_bus rte_pci_bus = {
 		.dev_iterate = rte_pci_dev_iterate,
 		.hot_unplug_handler = pci_hot_unplug_handler,
 		.sigbus_handler = pci_sigbus_handler,
+		.devargs_parse = rte_pci_devargs_parse,
 	},
 	.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
 	.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/drivers/bus/pci/pci_params.c b/drivers/bus/pci/pci_params.c
index 3192e9c967..c0c282e948 100644
--- a/drivers/bus/pci/pci_params.c
+++ b/drivers/bus/pci/pci_params.c
@@ -8,6 +8,8 @@
 #include <rte_errno.h>
 #include <rte_kvargs.h>
 #include <rte_pci.h>
+#include <rte_devargs.h>
+#include <rte_debug.h>
 
 #include "private.h"
 
@@ -76,3 +78,48 @@ rte_pci_dev_iterate(const void *start,
 	rte_kvargs_free(kvargs);
 	return dev;
 }
+
+int
+rte_pci_devargs_parse(struct rte_devargs *da)
+{
+	struct rte_kvargs *kvargs;
+	const char *addr_str;
+	struct rte_pci_addr addr;
+	int ret;
+
+	if (da == NULL)
+		return 0;
+	RTE_ASSERT(da->bus_str != NULL);
+
+	kvargs = rte_kvargs_parse(da->bus_str, NULL);
+	if (kvargs == NULL) {
+		RTE_LOG(ERR, EAL, "cannot parse argument list: %s\n",
+			da->bus_str);
+		ret = -ENODEV;
+		goto out;
+	}
+
+	addr_str = rte_kvargs_get(kvargs, pci_params_keys[RTE_PCI_PARAM_ADDR]);
+	if (addr_str == NULL) {
+		RTE_LOG(ERR, EAL, "No PCI address specified using '%s=<id>' in: %s\n",
+			pci_params_keys[RTE_PCI_PARAM_ADDR], da->bus_str);
+		ret = -ENODEV;
+		goto out;
+	}
+
+	ret = rte_pci_addr_parse(addr_str, &addr);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL, "PCI address invalid: %s\n", da->bus_str);
+		ret = -EINVAL;
+		goto out;
+	}
+
+	rte_pci_device_name(&addr, da->name, sizeof(da->name));
+
+out:
+	if (kvargs != NULL)
+		rte_kvargs_free(kvargs);
+	if (ret != 0)
+		rte_errno = -ret;
+	return ret;
+}
diff --git a/drivers/bus/pci/private.h b/drivers/bus/pci/private.h
index f566943f5e..8bc5140e97 100644
--- a/drivers/bus/pci/private.h
+++ b/drivers/bus/pci/private.h
@@ -267,4 +267,18 @@ rte_pci_dev_iterate(const void *start,
 		    const char *str,
 		    const struct rte_dev_iterator *it);
 
+/*
+ * Parse device arguments and update name.
+ *
+ * @param da
+ *   device arguments to parse.
+ *
+ * @return
+ *   0 on success.
+ *   -EINVAL: the PCI address is invalid and cannot be parsed.
+ *   -ENODEV: the kvargs string cannot be parsed or has no address key.
+ */
+int
+rte_pci_devargs_parse(struct rte_devargs *da);
+
 #endif /* _PCI_PRIVATE_H_ */
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index e40b91ea66..3a62521e05 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -19,6 +19,7 @@
 #include <rte_kvargs.h>
 #include <rte_log.h>
 #include <rte_tailq.h>
+#include <rte_string_fns.h>
 #include "eal_private.h"
 
 /** user device double-linked queue type definition */
@@ -40,6 +41,28 @@ devargs_layer_count(const char *s)
 	return i;
 }
 
+/* Resolve devargs name from bus arguments. */
+static int
+devargs_bus_parse_default(struct rte_devargs *devargs,
+			  struct rte_kvargs *bus_args)
+{
+	const char *name;
+
+	/* Parse devargs name from bus key-value list. */
+	name = rte_kvargs_get(bus_args, "name");
+	if (name == NULL) {
+		RTE_LOG(INFO, EAL, "devargs name not found: %s\n",
+			devargs->data);
+		return 0;
+	}
+	if (rte_strscpy(devargs->name, name, sizeof(devargs->name)) < 0) {
+		RTE_LOG(ERR, EAL, "devargs name too long: %s\n",
+			devargs->data);
+		return -E2BIG;
+	}
+	return 0;
+}
+
 int
 rte_devargs_layers_parse(struct rte_devargs *devargs,
 			 const char *devstr)
@@ -118,6 +141,8 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist == NULL)
 			continue;
 		kv = &layers[i].kvlist->pairs[0];
+		if (kv->key == NULL)
+			continue;
 		if (strcmp(kv->key, "bus") == 0) {
 			bus = rte_bus_find_by_name(kv->value);
 			if (bus == NULL) {
@@ -160,6 +185,12 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		}
 	}
 
+	/* Resolve devarg's name. */
+	if (bus && bus->devargs_parse)
+		ret = bus->devargs_parse(devargs);
+	else if (layers[0].kvlist != NULL)
+		ret = devargs_bus_parse_default(devargs, layers[0].kvlist);
+
 get_out:
 	for (i = 0; i < RTE_DIM(layers); i++) {
 		if (layers[i].kvlist)
diff --git a/lib/librte_eal/include/rte_bus.h b/lib/librte_eal/include/rte_bus.h
index 80b154fb98..0a99f3b5a3 100644
--- a/lib/librte_eal/include/rte_bus.h
+++ b/lib/librte_eal/include/rte_bus.h
@@ -210,6 +210,23 @@ typedef int (*rte_bus_hot_unplug_handler_t)(struct rte_device *dev);
  */
 typedef int (*rte_bus_sigbus_handler_t)(const void *failure_addr);
 
+/**
+ * Parse device arguments, setting the device name in the devargs as a result.
+ *
+ * On error rte_errno is set.
+ *
+ * @param da
+ *	Pointer to the devargs to parse.
+ *	The 'bus_str' field must be set.
+ *
+ * @return
+ *	0 on successful parsing.
+ *	-EINVAL: on parsing error.
+ *	-ENODEV: if no key matching a device argument is specified.
+ *	-E2BIG: device name is too long.
+ */
+typedef int (*rte_bus_devargs_parse_t)(struct rte_devargs *da);
+
 /**
  * Bus scan policies
  */
@@ -267,6 +284,7 @@ struct rte_bus {
 				/**< handle hot-unplug failure on the bus */
 	rte_bus_sigbus_handler_t sigbus_handler;
 					/**< handle sigbus error on the bus */
+	rte_bus_devargs_parse_t devargs_parse; /**< Parse device arguments */
 
 };
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread
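
The name resolution described in the patch above (a bus callback when one exists, the "name" key otherwise) can be sketched outside DPDK as follows. Every demo_* name and the DEMO_NAME_LEN size are hypothetical, the bus layer is scanned with plain string functions instead of rte_kvargs, and the PCI branch only approximates what rte_pci_addr_parse() and rte_pci_device_name() do.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NAME_LEN 64

/* Copy the value of "key=" from a comma-separated layer string. */
static int
demo_layer_get(const char *layer, const char *key, char *out, size_t len)
{
	size_t klen = strlen(key);
	const char *p = layer;

	while (p != NULL && *p != '\0') {
		const char *end = strchr(p, ',');
		size_t plen = (end != NULL) ? (size_t)(end - p) : strlen(p);

		if (plen > klen && strncmp(p, key, klen) == 0 &&
		    p[klen] == '=') {
			size_t vlen = plen - klen - 1;

			if (vlen >= len)
				return -E2BIG;
			memcpy(out, p + klen + 1, vlen);
			out[vlen] = '\0';
			return 0;
		}
		p = (end != NULL) ? end + 1 : NULL;
	}
	return -ENODEV;
}

/* PCI-style callback: the name is the normalized "addr" value. */
static int
demo_pci_name(const char *bus_layer, char *name, size_t len)
{
	char addr[DEMO_NAME_LEN];
	unsigned int domain = 0, bus = 0, dev = 0, func = 0;
	int ret;

	ret = demo_layer_get(bus_layer, "addr", addr, sizeof(addr));
	if (ret != 0)
		return ret;
	/* Try "DDDD:BB:DD.F" first, then the short "BB:DD.F" form. */
	if (sscanf(addr, "%x:%x:%x.%x", &domain, &bus, &dev, &func) != 4) {
		domain = 0;
		if (sscanf(addr, "%x:%x.%x", &bus, &dev, &func) != 3)
			return -EINVAL;
	}
	snprintf(name, len, "%04x:%02x:%02x.%x", domain, bus, dev, func);
	return 0;
}

/* Default path: use the raw "name" value; a missing key is not an error. */
static int
demo_default_name(const char *bus_layer, char *name, size_t len)
{
	int ret = demo_layer_get(bus_layer, "name", name, len);

	return (ret == -ENODEV) ? 0 : ret;
}

/* Dispatch: PCI has its own callback here, other buses use the default. */
static int
demo_resolve_name(const char *bus_layer, char *name, size_t len)
{
	assert(bus_layer != NULL && name != NULL);
	if (strncmp(bus_layer, "bus=pci", 7) == 0 &&
	    (bus_layer[7] == ',' || bus_layer[7] == '\0'))
		return demo_pci_name(bus_layer, name, len);
	return demo_default_name(bus_layer, name, len);
}
```

With this sketch, "bus=pci,addr=83:00.0" resolves to the name 0000:83:00.0 and "bus=vdev,name=pcap0" resolves to pcap0.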

* [dpdk-dev] [PATCH v4 5/5] devargs: parse global device syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (59 preceding siblings ...)
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 4/5] bus: add device arguments name parsing API Xueming Li
@ 2021-04-10 14:23   ` Xueming Li
  2021-04-12 21:24     ` Thomas Monjalon
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 0/5] eal: enable global device syntax by default Xueming Li
                     ` (5 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-04-10 14:23 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet
  Cc: dev, xuemingl, Asaf Penso, Ferruh Yigit, Andrew Rybchenko

When parsing a devargs string, try the global device syntax first.
Fall back to the legacy syntax on error.

Example of new global device syntax:
 -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,dv_flow_en=1

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
---
 doc/guides/rel_notes/release_21_05.rst     |  6 ++++++
 lib/librte_eal/common/eal_common_devargs.c | 16 ++++++++++++----
 lib/librte_eal/include/rte_devargs.h       |  4 ++++
 lib/librte_ethdev/rte_ethdev.c             |  1 -
 4 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_05.rst b/doc/guides/rel_notes/release_21_05.rst
index 374d6d98ea..ff1459a4d1 100644
--- a/doc/guides/rel_notes/release_21_05.rst
+++ b/doc/guides/rel_notes/release_21_05.rst
@@ -131,6 +131,12 @@ New Features
   * Added command to display Rx queue used descriptor count.
     ``show port (port_id) rxq (queue_id) desc used count``
 
+* **Enabled new devargs parser.**
+
+  * Unified devargs storage buffer usage.
+  * Added new bus driver api to allow bus driver contribute to devargs parsing.
+  * Try new devargs syntax parser first, fallback to legacy syntax parser.
+
 
 Removed Items
 -------------
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index 3a62521e05..069d8f8499 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -125,7 +125,6 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		layers[i].str = s;
 		layers[i].kvlist = rte_kvargs_parse_delim(s, NULL, "/");
 		if (layers[i].kvlist == NULL) {
-			RTE_LOG(ERR, EAL, "Could not parse %s\n", s);
 			ret = -EINVAL;
 			goto get_out;
 		}
@@ -143,7 +142,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		kv = &layers[i].kvlist->pairs[0];
 		if (kv->key == NULL)
 			continue;
-		if (strcmp(kv->key, "bus") == 0) {
+		if (strcmp(kv->key, RTE_DEVARGS_KEY_BUS) == 0) {
 			bus = rte_bus_find_by_name(kv->value);
 			if (bus == NULL) {
 				RTE_LOG(ERR, EAL, "Could not find bus \"%s\"\n",
@@ -151,7 +150,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 				ret = -EFAULT;
 				goto get_out;
 			}
-		} else if (strcmp(kv->key, "class") == 0) {
+		} else if (strcmp(kv->key, RTE_DEVARGS_KEY_CLASS) == 0) {
 			cls = rte_class_find_by_name(kv->value);
 			if (cls == NULL) {
 				RTE_LOG(ERR, EAL, "Could not find class \"%s\"\n",
@@ -159,7 +158,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 				ret = -EFAULT;
 				goto get_out;
 			}
-		} else if (strcmp(kv->key, "driver") == 0) {
+		} else if (strcmp(kv->key, RTE_DEVARGS_KEY_DRIVER) == 0) {
 			/* Ignore */
 			continue;
 		}
@@ -224,6 +223,15 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	if (da == NULL)
 		return -EINVAL;
 
+	/* First parse according to the global device syntax. */
+	if (rte_devargs_layers_parse(da, dev) == 0) {
+		if (da->bus != NULL || da->cls != NULL)
+			return 0;
+		rte_devargs_reset(da);
+	}
+
+	/* Otherwise fall back to the legacy syntax: */
+
 	/* Retrieve eventual bus info */
 	do {
 		devname = dev;
diff --git a/lib/librte_eal/include/rte_devargs.h b/lib/librte_eal/include/rte_devargs.h
index 134b44a887..39e34ea02e 100644
--- a/lib/librte_eal/include/rte_devargs.h
+++ b/lib/librte_eal/include/rte_devargs.h
@@ -25,6 +25,10 @@ extern "C" {
 #include <rte_compat.h>
 #include <rte_bus.h>
 
+#define RTE_DEVARGS_KEY_BUS "bus"
+#define RTE_DEVARGS_KEY_CLASS "class"
+#define RTE_DEVARGS_KEY_DRIVER "driver"
+
 /**
  * Type of generic device
  */
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index e11a95558f..9e9b136438 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -207,7 +207,6 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 	 *   - 0000:08:00.0,representor=[1-3]
 	 *   - pci:0000:06:00.0,representor=[0,5]
 	 *   - class=eth,mac=00:11:22:33:44:55
-	 * A new syntax is in development (not yet supported):
 	 *   - bus=X,paramX=x/class=Y,paramY=y/driver=Z,paramZ=z
 	 */
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 118+ messages in thread
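
The "new syntax first, legacy as fallback" decision made in rte_devargs_parse() by the patch above can be approximated with the standalone sketch below. It is a simplification under stated assumptions: the demo_* names are hypothetical, and instead of running the real layer parser it only checks whether any '/'-separated layer starts with a "bus=" or "class=" key, which mirrors the (da->bus != NULL || da->cls != NULL) acceptance test in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum demo_syntax { DEMO_SYNTAX_GLOBAL, DEMO_SYNTAX_LEGACY };

/* True if the layer at 's' (length 'len') begins with "<key>=". */
static bool
demo_layer_has_key(const char *s, size_t len, const char *key)
{
	size_t klen = strlen(key);

	return len > klen && strncmp(s, key, klen) == 0 && s[klen] == '=';
}

/* Walk the '/'-separated layers and look for a bus or class layer. */
static bool
demo_try_global(const char *dev)
{
	bool found = false;
	const char *s = dev;

	while (*s != '\0') {
		size_t len = strcspn(s, "/");

		if (demo_layer_has_key(s, len, "bus") ||
		    demo_layer_has_key(s, len, "class"))
			found = true;
		s += len;
		if (*s == '/')
			s++;
	}
	return found;
}

/* Accept the global syntax when it identifies a bus or class. */
static enum demo_syntax
demo_devargs_classify(const char *dev)
{
	assert(dev != NULL);
	if (demo_try_global(dev))
		return DEMO_SYNTAX_GLOBAL;
	/* Nothing recognized: let the legacy parser handle the string. */
	return DEMO_SYNTAX_LEGACY;
}
```

Strings such as "0000:08:00.0,representor=[1-3]" or "pci:0000:06:00.0,representor=[0,5]" carry no bus or class key, so this sketch routes them to the legacy path.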

* Re: [dpdk-dev] [PATCH v4 1/5] devargs: unify scratch buffer storage
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 1/5] devargs: unify scratch buffer storage Xueming Li
@ 2021-04-10 19:59     ` Tal Shnaiderman
  2021-04-12 12:07       ` Xueming(Steven) Li
  0 siblings, 1 reply; 118+ messages in thread
From: Tal Shnaiderman @ 2021-04-10 19:59 UTC (permalink / raw)
  To: Xueming(Steven) Li, NBU-Contact-Thomas Monjalon, Gaetan Rivet
  Cc: dev, Xueming(Steven) Li, Asaf Penso, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Gaetan Rivet, Anatoly Burakov, Dmitry Kozlyuk,
	Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam,
	Ray Kinsella, Neil Horman, Ferruh Yigit, Andrew Rybchenko

> Subject: [dpdk-dev] [PATCH v4 1/5] devargs: unify scratch buffer storage
> 
> In current design, legacy parser rte_devargs_parse() saved scratch buffer to
> devargs.args while new parser rte_devargs_layers_parse() saved to
> devargs.data. Code using devargs had to know the difference and cleaned up
> memory accordingly - error prone.
> 
> This patch unifies scratch buffer to data field, introduces
> rte_devargs_reset() function to wrap the memory clean up logic.
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> Reviewed-by: Gaetan Rivet <grive@u256.net>
> ---
>  app/test-pmd/config.c                        |  3 +-
>  app/test-pmd/testpmd.c                       |  5 +--
>  drivers/bus/vdev/vdev.c                      |  9 +++---
>  drivers/net/failsafe/failsafe_args.c         |  3 +-
>  drivers/net/failsafe/failsafe_eal.c          |  2 +-
>  examples/multi_process/hotplug_mp/commands.c |  6 ++--
>  lib/librte_eal/common/eal_common_dev.c       |  9 +++---
>  lib/librte_eal/common/eal_common_devargs.c   | 34 +++++++++++---------
>  lib/librte_eal/common/hotplug_mp.c           |  6 ++--
>  lib/librte_eal/include/rte_devargs.h         | 18 ++++++++---
>  lib/librte_eal/rte_eal_exports.def           |  1 +

rte_eal_exports.def was merged into version.map and removed, so the modification above is unneeded.

>  lib/librte_eal/version.map                   |  1 +
>  lib/librte_ethdev/rte_ethdev.c               |  8 ++---
>  13 files changed, 59 insertions(+), 46 deletions(-)

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/5] kvargs: add get by key function
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 3/5] kvargs: add get by key function Xueming Li
@ 2021-04-12  6:52     ` Olivier Matz
  2021-04-12 12:07       ` Xueming(Steven) Li
  0 siblings, 1 reply; 118+ messages in thread
From: Olivier Matz @ 2021-04-12  6:52 UTC (permalink / raw)
  To: Xueming Li
  Cc: Thomas Monjalon, Gaetan Rivet, dev, Asaf Penso, Ray Kinsella,
	Neil Horman

Hi Xueming,

On Sat, Apr 10, 2021 at 02:23:55PM +0000, Xueming Li wrote:
> Adds a new function to get value of a specific key from kvargs list.
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Reviewed-by: Gaetan Rivet <grive@u256.net>
> ---
>  lib/librte_kvargs/rte_kvargs.c | 20 ++++++++++++++++++++
>  lib/librte_kvargs/rte_kvargs.h | 21 +++++++++++++++++++++
>  lib/librte_kvargs/version.map  |  3 +++
>  3 files changed, 44 insertions(+)
> 
> diff --git a/lib/librte_kvargs/rte_kvargs.c b/lib/librte_kvargs/rte_kvargs.c
> index ffae8914cf..40e7670ab3 100644
> --- a/lib/librte_kvargs/rte_kvargs.c
> +++ b/lib/librte_kvargs/rte_kvargs.c
> @@ -203,6 +203,26 @@ rte_kvargs_free(struct rte_kvargs *kvlist)
>  	free(kvlist);
>  }
>  
> +/* Lookup a value in an rte_kvargs list by its key. */
> +const char *
> +rte_kvargs_get(const struct rte_kvargs *kvlist, const char *key)
> +{
> +	unsigned int i;
> +
> +	if (!kvlist)
> +		return NULL;
> +	for (i = 0; i < kvlist->count; ++i) {
> +		/* Allows key to be NULL. */
> +		if (!key && !kvlist->pairs[i].key)
> +			return kvlist->pairs[i].value;

Is it possible that kvlist->pairs[i].key == NULL? In which case?


Thanks,
Olivier

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/5] kvargs: add get by key function
  2021-04-12  6:52     ` Olivier Matz
@ 2021-04-12 12:07       ` Xueming(Steven) Li
  2021-04-12 21:18         ` Thomas Monjalon
  0 siblings, 1 reply; 118+ messages in thread
From: Xueming(Steven) Li @ 2021-04-12 12:07 UTC (permalink / raw)
  To: Olivier Matz
  Cc: NBU-Contact-Thomas Monjalon, Gaetan Rivet, dev, Asaf Penso,
	Ray Kinsella, Neil Horman



> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Monday, April 12, 2021 2:53 PM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet <gaetanr@nvidia.com>; dev@dpdk.org; Asaf Penso
> <asafp@nvidia.com>; Ray Kinsella <mdr@ashroe.eu>; Neil Horman <nhorman@tuxdriver.com>
> Subject: Re: [PATCH v4 3/5] kvargs: add get by key function
> 
> Hi Xueming,
> 
> On Sat, Apr 10, 2021 at 02:23:55PM +0000, Xueming Li wrote:
> > Adds a new function to get value of a specific key from kvargs list.
> >
> > Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> > Reviewed-by: Gaetan Rivet <grive@u256.net>
> > ---
> >  lib/librte_kvargs/rte_kvargs.c | 20 ++++++++++++++++++++
> >  lib/librte_kvargs/rte_kvargs.h | 21 +++++++++++++++++++++
> >  lib/librte_kvargs/version.map  |  3 +++
> >  3 files changed, 44 insertions(+)
> >
> > diff --git a/lib/librte_kvargs/rte_kvargs.c b/lib/librte_kvargs/rte_kvargs.c
> > index ffae8914cf..40e7670ab3 100644
> > --- a/lib/librte_kvargs/rte_kvargs.c
> > +++ b/lib/librte_kvargs/rte_kvargs.c
> > @@ -203,6 +203,26 @@ rte_kvargs_free(struct rte_kvargs *kvlist)
> >  	free(kvlist);
> >  }
> >
> > +/* Lookup a value in an rte_kvargs list by its key. */
> > +const char *
> > +rte_kvargs_get(const struct rte_kvargs *kvlist, const char *key)
> > +{
> > +	unsigned int i;
> > +
> > +	if (!kvlist)
> > +		return NULL;
> > +	for (i = 0; i < kvlist->count; ++i) {
> > +		/* Allows key to be NULL. */
> > +		if (!key && !kvlist->pairs[i].key)
> > +			return kvlist->pairs[i].value;
> 
> Is it possible that kvlist->pairs[i].key == NULL? In which case?

Impossible, will remove this in next version, thanks.

> 
> 
> Thanks,
> Olivier

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v4 1/5] devargs: unify scratch buffer storage
  2021-04-10 19:59     ` Tal Shnaiderman
@ 2021-04-12 12:07       ` Xueming(Steven) Li
  0 siblings, 0 replies; 118+ messages in thread
From: Xueming(Steven) Li @ 2021-04-12 12:07 UTC (permalink / raw)
  To: Tal Shnaiderman, NBU-Contact-Thomas Monjalon, Gaetan Rivet
  Cc: dev, Asaf Penso, Wenzhuo Lu, Beilei Xing, Bernard Iremonger,
	Gaetan Rivet, Anatoly Burakov, Dmitry Kozlyuk,
	Narcisa Ana Maria Vasile, Dmitry Malloy, Pallavi Kadam,
	Ray Kinsella, Neil Horman, Ferruh Yigit, Andrew Rybchenko



> -----Original Message-----
> From: Tal Shnaiderman <talshn@nvidia.com>
> Sent: Sunday, April 11, 2021 4:00 AM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>; NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetanr@nvidia.com>
> Cc: dev@dpdk.org; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf Penso <asafp@nvidia.com>; Wenzhuo Lu
> <wenzhuo.lu@intel.com>; Beilei Xing <beilei.xing@intel.com>; Bernard Iremonger <bernard.iremonger@intel.com>; Gaetan Rivet
> <grive@u256.net>; Anatoly Burakov <anatoly.burakov@intel.com>; Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>; Narcisa Ana Maria
> Vasile <navasile@linux.microsoft.com>; Dmitry Malloy <dmitrym@microsoft.com>; Pallavi Kadam <pallavi.kadam@intel.com>; Ray
> Kinsella <mdr@ashroe.eu>; Neil Horman <nhorman@tuxdriver.com>; Ferruh Yigit <ferruh.yigit@intel.com>; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>
> Subject: RE: [dpdk-dev] [PATCH v4 1/5] devargs: unify scratch buffer storage
> 
> > Subject: [dpdk-dev] [PATCH v4 1/5] devargs: unify scratch buffer
> > storage
> >
> > In current design, legacy parser rte_devargs_parse() saved scratch
> > buffer to devargs.args while new parser rte_devargs_layers_parse()
> > saved to devargs.data. Code using devargs had to know the difference
> > and cleaned up memory accordingly - error prone.
> >
> > This patch unifies scratch buffer to data field, introduces
> > rte_devargs_reset() function to wrap the memory clean up logic.
> >
> > Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> > Acked-by: Ray Kinsella <mdr@ashroe.eu>
> > Reviewed-by: Gaetan Rivet <grive@u256.net>
> > ---
> >  app/test-pmd/config.c                        |  3 +-
> >  app/test-pmd/testpmd.c                       |  5 +--
> >  drivers/bus/vdev/vdev.c                      |  9 +++---
> >  drivers/net/failsafe/failsafe_args.c         |  3 +-
> >  drivers/net/failsafe/failsafe_eal.c          |  2 +-
> >  examples/multi_process/hotplug_mp/commands.c |  6 ++--
> >  lib/librte_eal/common/eal_common_dev.c       |  9 +++---
> >  lib/librte_eal/common/eal_common_devargs.c   | 34 +++++++++++---------
> >  lib/librte_eal/common/hotplug_mp.c           |  6 ++--
> >  lib/librte_eal/include/rte_devargs.h         | 18 ++++++++---
> >  lib/librte_eal/rte_eal_exports.def           |  1 +
> 
> rte_eal_exports.def was merged into version.map and removed, so the modification above is unneeded.

Thanks, I'll rebase my code :)

> 
> >  lib/librte_eal/version.map                   |  1 +
> >  lib/librte_ethdev/rte_ethdev.c               |  8 ++---
> >  13 files changed, 59 insertions(+), 46 deletions(-)

^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v4 4/5] bus: add device arguments name parsing API
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 4/5] bus: add device arguments name parsing API Xueming Li
@ 2021-04-12 21:16     ` Thomas Monjalon
  2021-04-12 23:37       ` Xueming(Steven) Li
  0 siblings, 1 reply; 118+ messages in thread
From: Thomas Monjalon @ 2021-04-12 21:16 UTC (permalink / raw)
  To: Xueming Li; +Cc: Gaetan Rivet, dev, Asaf Penso

10/04/2021 16:23, Xueming Li:
> +	/* Resolve devarg's name. */

s/devarg's name/devargs name/

> +	if (bus && bus->devargs_parse)

Please make the checks explicit with != NULL

> +		ret = bus->devargs_parse(devargs);
> +	else if (layers[0].kvlist != NULL)
> +		ret = devargs_bus_parse_default(devargs, layers[0].kvlist);
[...]
> +/**
> + * Parse device arguments, setting the device name in the devargs as a result.

It should be
"
Parse bus part of the device arguments.

The field name of the struct rte_devargs will be set.
"

> + *
> + * On error rte_errno is set.

This sentence can be moved below (into the @return section).

> + *
> + * @param da
> + *	Pointer to the devargs to parse.
> + *	The 'bus_str' field must be set.

Why "must"?
It should be optional, so this sentence should be removed.

> + *
> + * @return
> + *	0 on successful parsing.
> + *	-EINVAL: on parsing error.
> + *	-ENODEV: if no key matching a device argument is specified.
> + *	-E2BIG: device name is too long.
> + */
> +typedef int (*rte_bus_devargs_parse_t)(struct rte_devargs *da);

[...]
> +	rte_bus_devargs_parse_t devargs_parse; /**< Parse device arguments */

Should be "Parse bus devargs"




^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v4 3/5] kvargs: add get by key function
  2021-04-12 12:07       ` Xueming(Steven) Li
@ 2021-04-12 21:18         ` Thomas Monjalon
  0 siblings, 0 replies; 118+ messages in thread
From: Thomas Monjalon @ 2021-04-12 21:18 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: Olivier Matz, dev, Gaetan Rivet, Asaf Penso, Ray Kinsella

12/04/2021 14:07, Xueming(Steven) Li:
> From: Olivier Matz <olivier.matz@6wind.com>
> > > +	if (!kvlist)
> > > +		return NULL;
> > > +	for (i = 0; i < kvlist->count; ++i) {
> > > +		/* Allows key to be NULL. */
> > > +		if (!key && !kvlist->pairs[i].key)
> > > +			return kvlist->pairs[i].value;
> > 
> > Is it possible that kvlist->pairs[i].key == NULL? In which case?
> 
> Impossible, will remove this in next version, thanks.

Please do explicit checks against NULL
to make clear that they are pointers, not booleans.



^ permalink raw reply	[flat|nested] 118+ messages in thread
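
Taking the two review comments above together (no NULL-key special case, explicit comparisons against NULL), the lookup could look like the following sketch. This is an illustration only, not the code of the next revision: the sketch_pair and sketch_kvlist types are hypothetical stand-ins for the rte_kvargs structures, and the assumption that every stored key is non-NULL comes from the author's reply in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the rte_kvargs pair and list types. */
struct sketch_pair {
	const char *key;
	const char *value;
};

struct sketch_kvlist {
	unsigned int count;
	const struct sketch_pair *pairs;
};

static const char *
sketch_kvargs_get(const struct sketch_kvlist *kvlist, const char *key)
{
	unsigned int i;

	/* Explicit pointer checks instead of "!kvlist" or "!key". */
	if (kvlist == NULL || key == NULL)
		return NULL;
	for (i = 0; i < kvlist->count; ++i) {
		/* Assumption from the thread: stored keys are never NULL. */
		assert(kvlist->pairs[i].key != NULL);
		if (strcmp(kvlist->pairs[i].key, key) == 0)
			return kvlist->pairs[i].value;
	}
	return NULL;
}
```

Writing "== NULL" and "!= NULL" makes it obvious at the call site that the operands are pointers rather than booleans, which is the point of the review comment.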

* Re: [dpdk-dev] [PATCH v4 5/5] devargs: parse global device syntax
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 5/5] devargs: parse global device syntax Xueming Li
@ 2021-04-12 21:24     ` Thomas Monjalon
  2021-04-12 23:47       ` Xueming(Steven) Li
  0 siblings, 1 reply; 118+ messages in thread
From: Thomas Monjalon @ 2021-04-12 21:24 UTC (permalink / raw)
  To: Xueming Li
  Cc: Gaetan Rivet, dev, xuemingl, Asaf Penso, Ferruh Yigit, Andrew Rybchenko

10/04/2021 16:23, Xueming Li:
> --- a/doc/guides/rel_notes/release_21_05.rst
> +++ b/doc/guides/rel_notes/release_21_05.rst
> @@ -131,6 +131,12 @@ New Features
>    * Added command to display Rx queue used descriptor count.
>      ``show port (port_id) rxq (queue_id) desc used count``
>  
> +* **Enabled new devargs parser.**
> +
> +  * Unified devargs storage buffer usage.

I think this one can be skipped, it is internal handling.

> +  * Added new bus driver api to allow bus driver contribute to devargs parsing.
> +  * Try new devargs syntax parser first, fallback to legacy syntax parser.

Rewording:
"
* Enabled devargs syntax
  ``bus=X,paramX=x/class=Y,paramY=y/driver=Z,paramZ=z``
* Added bus-level parsing of the devargs syntax.
* Kept compatibility with the legacy syntax as parsing fallback.
"

Please move this block to the beginning of the release notes.



^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v4 4/5] bus: add device arguments name parsing API
  2021-04-12 21:16     ` Thomas Monjalon
@ 2021-04-12 23:37       ` Xueming(Steven) Li
  0 siblings, 0 replies; 118+ messages in thread
From: Xueming(Steven) Li @ 2021-04-12 23:37 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon; +Cc: Gaetan Rivet, dev, Asaf Penso



> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Tuesday, April 13, 2021 5:17 AM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: Gaetan Rivet <gaetanr@nvidia.com>; dev@dpdk.org; Asaf Penso <asafp@nvidia.com>
> Subject: Re: [dpdk-dev] [PATCH v4 4/5] bus: add device arguments name parsing API
> 
> 10/04/2021 16:23, Xueming Li:
> > +	/* Resolve devarg's name. */
> 
> s/devarg's name/devargs name/
> 
> > +	if (bus && bus->devargs_parse)
> 
> Please make the checks explicit with != NULL
> 
> > +		ret = bus->devargs_parse(devargs);
> > +	else if (layers[0].kvlist != NULL)
> > +		ret = devargs_bus_parse_default(devargs, layers[0].kvlist);
> [...]
> > +/**
> > + * Parse device arguments, setting the device name in the devargs as a result.
> 
> It should be
> "
> Parse bus part of the device arguments.
> 
> The field name of the struct rte_devargs will be set.
> "
> 
> > + *
> > + * On error rte_errno is set.
> 
> This sentence can be below  (in @return section).
> 
> > + *
> > + * @param da
> > + *	Pointer to the devargs to parse.
> > + *	The 'bus_str' field must be set.
> 
> Why "must"?
> It should be optional, so this sentence should be removed.
> 
> > + *
> > + * @return
> > + *	0 on successful parsing.
> > + *	-EINVAL: on parsing error.
> > + *	-ENODEV: if no key matching a device argument is specified.
> > + *	-E2BIG: device name is too long.
> > + */
> > +typedef int (*rte_bus_devargs_parse_t)(struct rte_devargs *da);
> 
> [...]
> > +	rte_bus_devargs_parse_t devargs_parse; /**< Parse device arguments */
> 
> Should be "Parse bus devargs"

Thanks, will fix all issues in next version.
> 
> 


^ permalink raw reply	[flat|nested] 118+ messages in thread

* Re: [dpdk-dev] [PATCH v4 5/5] devargs: parse global device syntax
  2021-04-12 21:24     ` Thomas Monjalon
@ 2021-04-12 23:47       ` Xueming(Steven) Li
  0 siblings, 0 replies; 118+ messages in thread
From: Xueming(Steven) Li @ 2021-04-12 23:47 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon
  Cc: Gaetan Rivet, dev, Asaf Penso, Ferruh Yigit, Andrew Rybchenko


> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Tuesday, April 13, 2021 5:25 AM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: Gaetan Rivet <gaetanr@nvidia.com>; dev@dpdk.org; Xueming(Steven) Li <xuemingl@nvidia.com>; Asaf Penso
> <asafp@nvidia.com>; Ferruh Yigit <ferruh.yigit@intel.com>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Subject: Re: [dpdk-dev] [PATCH v4 5/5] devargs: parse global device syntax
> 
> 10/04/2021 16:23, Xueming Li:
> > --- a/doc/guides/rel_notes/release_21_05.rst
> > +++ b/doc/guides/rel_notes/release_21_05.rst
> > @@ -131,6 +131,12 @@ New Features
> >    * Added command to display Rx queue used descriptor count.
> >      ``show port (port_id) rxq (queue_id) desc used count``
> >
> > +* **Enabled new devargs parser.**
> > +
> > +  * Unified devargs storage buffer usage.
> 
> I think this one can be skipped, it is internal handling.
> 
> > +  * Added new bus driver api to allow bus driver contribute to devargs parsing.
> > +  * Try new devargs syntax parser first, fallback to legacy syntax parser.
> 
> Rewording:
> "
> * Enabled devargs syntax
>   ``bus=X,paramX=x/class=Y,paramY=y/driver=Z,paramZ=z``
> * Added bus-level parsing of the devargs syntax.
> * Kept compatibility with the legacy syntax as parsing fallback.
> "
> 
> Please move this block at the beginning of the release notes.
> 
Thanks!

^ permalink raw reply	[flat|nested] 118+ messages in thread

* [dpdk-dev] [PATCH v5 0/5] eal: enable global device syntax by default
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (60 preceding siblings ...)
  2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 5/5] devargs: parse global device syntax Xueming Li
@ 2021-04-13  3:14   ` Xueming Li
  2021-04-14 19:49     ` Thomas Monjalon
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 1/5] devargs: unify scratch buffer storage Xueming Li
                     ` (4 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-04-13  3:14 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev, xuemingl, Asaf Penso

The new Global Device Syntax [1] is used to identify a device with full
bus, class and driver description, example:
 -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,...

This patchset fixes bugs and enables the global device syntax with
backward compatibility by:
- unifying devargs memory buffer cleanup
- parsing the device name through a bus driver callback API
- trying the new global syntax parser first and falling back to legacy
  parsing.
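
As a rough illustration of the layered format above, here is a minimal,
standalone sketch that splits a global syntax string into its bus, class
and driver layers. It is not the EAL implementation: struct layers and
split_layers() are hypothetical names used only for this example, and the
sketch ignores kvargs parsing inside each layer.

```c
#include <string.h>

/* Hypothetical holder for the three layers; not a DPDK structure. */
struct layers {
	const char *bus; /* "bus=..." layer, or NULL */
	const char *cls; /* "class=..." layer, or NULL */
	const char *drv; /* "driver=..." layer, or NULL */
};

/*
 * Split "bus=.../class=.../driver=..." in place on '/'.
 * Returns 0 on success, -1 if a layer starts with an unknown key.
 */
static int
split_layers(char *s, struct layers *out)
{
	memset(out, 0, sizeof(*out));
	while (s != NULL && *s != '\0') {
		char *next = strchr(s, '/');

		if (next != NULL)
			*next++ = '\0';
		if (strncmp(s, "bus=", 4) == 0)
			out->bus = s;
		else if (strncmp(s, "class=", 6) == 0)
			out->cls = s;
		else if (strncmp(s, "driver=", 7) == 0)
			out->drv = s;
		else
			return -1;
		s = next;
	}
	return 0;
}
```

A legacy string such as "0000:82:00.0,dv_flow_en=1" is rejected by this
sketch, which is the case where the real parser would fall back to the
legacy syntax.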


History:

V1:
 - Initial version

V2:
 - add devargs.src as complete source dev string
 - change devargs.data to scratch buffer
 - add rte_devargs_free() to release scratch memory
 - change name policy to align with rte_eth_iterator_init()
 - remove PCI bus fix as name already resolved in rte_devargs_parse().
V3:
 - remove devargs.src
 - rename rte_devargs_free() to rte_devargs_reset()
 - add bus callback api to resolve devargs.
V4:
 - add RTE_DEVARGS_KEY_BUS/CLASS/DRIVER macro
 - parsing "name" by default if no bus devargs parsing callback
 - Minor fixes suggested by Ray and Thomas
V5:
 - Update release notes
 - Remove NULL support of kvargs_get function
 - Rebased on latest code
 - Small updates according to review comments


[1] Global Device Syntax:
https://www.dpdk.org/wp-content/uploads/sites/35/2018/10/am-07-DPDK-hotplug-20180905.pdf

[2] RFC:
http://patchwork.dpdk.org/project/dpdk/list/?series=14378

[3] V1:
http://patchwork.dpdk.org/project/dpdk/list/?series=14610

[4] V2:
http://patchwork.dpdk.org/project/dpdk/list/?series=14816

[5] V3:
http://patchwork.dpdk.org/project/dpdk/list/?series=15979

[6] v4:
http://patchwork.dpdk.org/project/dpdk/list/?series=16267


Xueming Li (5):
  devargs: unify scratch buffer storage
  devargs: fix memory leak on parsing error
  kvargs: add get by key function
  bus: add device arguments name parsing API
  devargs: parse global device syntax

 app/test-pmd/config.c                        |  3 +-
 app/test-pmd/testpmd.c                       |  5 +-
 doc/guides/rel_notes/release_21_05.rst       |  7 ++
 drivers/bus/pci/pci_common.c                 |  1 +
 drivers/bus/pci/pci_params.c                 | 47 ++++++++++
 drivers/bus/pci/private.h                    | 14 +++
 drivers/bus/vdev/vdev.c                      |  9 +-
 drivers/net/failsafe/failsafe_args.c         |  3 +-
 drivers/net/failsafe/failsafe_eal.c          |  2 +-
 examples/multi_process/hotplug_mp/commands.c |  6 +-
 lib/librte_eal/common/eal_common_dev.c       |  9 +-
 lib/librte_eal/common/eal_common_devargs.c   | 91 +++++++++++++++-----
 lib/librte_eal/common/hotplug_mp.c           |  6 +-
 lib/librte_eal/include/rte_bus.h             | 17 ++++
 lib/librte_eal/include/rte_devargs.h         | 22 ++++-
 lib/librte_eal/version.map                   |  1 +
 lib/librte_ethdev/rte_ethdev.c               |  9 +-
 lib/librte_kvargs/rte_kvargs.c               | 15 ++++
 lib/librte_kvargs/rte_kvargs.h               | 20 +++++
 lib/librte_kvargs/version.map                |  3 +
 20 files changed, 238 insertions(+), 52 deletions(-)

-- 
2.25.1



* [dpdk-dev] [PATCH v5 1/5] devargs: unify scratch buffer storage
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (61 preceding siblings ...)
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 0/5] eal: enable global device syntax by default Xueming Li
@ 2021-04-13  3:14   ` Xueming Li
  2021-04-16  7:00     ` David Marchand
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 2/5] devargs: fix memory leak on parsing error Xueming Li
                     ` (3 subsequent siblings)
  66 siblings, 1 reply; 118+ messages in thread
From: Xueming Li @ 2021-04-13  3:14 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet
  Cc: dev, xuemingl, Asaf Penso, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Gaetan Rivet, Anatoly Burakov, Ray Kinsella,
	Neil Horman, Ferruh Yigit, Andrew Rybchenko

In the current design, the legacy parser rte_devargs_parse() saves its
scratch buffer to devargs.args, while the new parser
rte_devargs_layers_parse() saves it to devargs.data. Code using devargs
has to know the difference and clean up memory accordingly, which is
error prone.

This patch unifies the scratch buffer in the data field and introduces
rte_devargs_reset() to wrap the memory cleanup logic.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Reviewed-by: Gaetan Rivet <grive@u256.net>
---
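
The ownership rule described above (one owned buffer in data, args only
aliasing it) can be sketched outside of DPDK as follows. This is a
simplified model, not the EAL code: struct mock_devargs, mock_parse() and
mock_reset() are hypothetical stand-ins for the relevant rte_devargs
fields and for rte_devargs_reset().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the relevant rte_devargs fields. */
struct mock_devargs {
	const char *args; /* alias into data, never freed directly */
	char *data;       /* the only owned allocation */
};

/* Mirrors the legacy-parser change: allocate into data, alias args. */
static int
mock_parse(struct mock_devargs *da, const char *args)
{
	size_t len = strlen(args) + 1;

	da->data = malloc(len);
	if (da->data == NULL)
		return -1;
	memcpy(da->data, args, len);
	da->args = da->data;
	return 0;
}

/* Mirrors the reset helper: safe on NULL and on repeated calls. */
static void
mock_reset(struct mock_devargs *da)
{
	if (da == NULL)
		return;
	free(da->data);
	da->data = NULL;
}
```

Callers then follow a single pattern: parse, use, and reset on every exit
path, which is what the testpmd hunks in this patch do with
rte_devargs_reset().
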
 app/test-pmd/config.c                        |  3 +-
 app/test-pmd/testpmd.c                       |  5 +--
 drivers/bus/vdev/vdev.c                      |  9 +++---
 drivers/net/failsafe/failsafe_args.c         |  3 +-
 drivers/net/failsafe/failsafe_eal.c          |  2 +-
 examples/multi_process/hotplug_mp/commands.c |  6 ++--
 lib/librte_eal/common/eal_common_dev.c       |  9 +++---
 lib/librte_eal/common/eal_common_devargs.c   | 34 +++++++++++---------
 lib/librte_eal/common/hotplug_mp.c           |  6 ++--
 lib/librte_eal/include/rte_devargs.h         | 18 ++++++++---
 lib/librte_eal/version.map                   |  1 +
 lib/librte_ethdev/rte_ethdev.c               |  8 ++---
 12 files changed, 58 insertions(+), 46 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index a5e82b7a97..a8bd664097 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -510,8 +510,6 @@ device_infos_display(const char *identifier)
 
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -559,6 +557,7 @@ device_infos_display(const char *identifier)
 			}
 		}
 	};
+	rte_devargs_reset(&da);
 }
 
 void
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 96d2e0fcec..d4be23f8f8 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3015,8 +3015,6 @@ detach_devargs(char *identifier)
 	memset(&da, 0, sizeof(da));
 	if (rte_devargs_parsef(&da, "%s", identifier)) {
 		printf("cannot parse identifier\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -3025,6 +3023,7 @@ detach_devargs(char *identifier)
 			if (ports[port_id].port_status != RTE_PORT_STOPPED) {
 				printf("Port %u not stopped\n", port_id);
 				rte_eth_iterator_cleanup(&iterator);
+				rte_devargs_reset(&da);
 				return;
 			}
 			port_flow_flush(port_id);
@@ -3034,6 +3033,7 @@ detach_devargs(char *identifier)
 	if (rte_eal_hotplug_remove(da.bus->name, da.name) != 0) {
 		TESTPMD_LOG(ERR, "Failed to detach device %s(%s)\n",
 			    da.name, da.bus->name);
+		rte_devargs_reset(&da);
 		return;
 	}
 
@@ -3042,6 +3042,7 @@ detach_devargs(char *identifier)
 	printf("Device %s is detached\n", identifier);
 	printf("Now total ports is %d\n", nb_ports);
 	printf("Done\n");
+	rte_devargs_reset(&da);
 }
 
 void
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 9a673347ae..d075409942 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -245,13 +245,14 @@ alloc_devargs(const char *name, const char *args)
 
 	devargs->bus = &rte_vdev_bus;
 	if (args)
-		devargs->args = strdup(args);
+		devargs->data = strdup(args);
 	else
-		devargs->args = strdup("");
+		devargs->data = strdup("");
+	devargs->args = devargs->data;
 
 	ret = strlcpy(devargs->name, name, sizeof(devargs->name));
 	if (ret < 0 || ret >= (int)sizeof(devargs->name)) {
-		free(devargs->args);
+		rte_devargs_reset(devargs);
 		free(devargs);
 		return NULL;
 	}
@@ -305,7 +306,7 @@ insert_vdev(const char *name, const char *args,
 
 	return 0;
 fail:
-	free(devargs->args);
+	rte_devargs_reset(devargs);
 	free(devargs);
 	free(dev);
 	return ret;
diff --git a/drivers/net/failsafe/failsafe_args.c b/drivers/net/failsafe/failsafe_args.c
index 707490b94c..b203e02d9a 100644
--- a/drivers/net/failsafe/failsafe_args.c
+++ b/drivers/net/failsafe/failsafe_args.c
@@ -451,8 +451,7 @@ failsafe_args_free(struct rte_eth_dev *dev)
 		sdev->cmdline = NULL;
 		free(sdev->fd_str);
 		sdev->fd_str = NULL;
-		free(sdev->devargs.args);
-		sdev->devargs.args = NULL;
+		rte_devargs_reset(&sdev->devargs);
 	}
 }
 
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index b9fc508673..cb4a2abc02 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -79,7 +79,7 @@ fs_bus_init(struct rte_eth_dev *dev)
 					rte_eth_devices[pid].device->devargs;
 
 			/* Take control of probed device. */
-			free(da->args);
+			rte_devargs_reset(da);
 			memset(da, 0, sizeof(*da));
 			if (probed_da != NULL)
 				snprintf(devstr, sizeof(devstr), "%s,%s",
diff --git a/examples/multi_process/hotplug_mp/commands.c b/examples/multi_process/hotplug_mp/commands.c
index a8a39d07f7..48fd329583 100644
--- a/examples/multi_process/hotplug_mp/commands.c
+++ b/examples/multi_process/hotplug_mp/commands.c
@@ -121,8 +121,6 @@ static void cmd_dev_attach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -131,6 +129,7 @@ static void cmd_dev_attach_parsed(void *parsed_result,
 	else
 		cmdline_printf(cl, "failed to attached device %s\n",
 				da.name);
+	rte_devargs_reset(&da);
 }
 
 cmdline_parse_token_string_t cmd_dev_attach_attach =
@@ -168,8 +167,6 @@ static void cmd_dev_detach_parsed(void *parsed_result,
 
 	if (rte_devargs_parsef(&da, "%s", res->devargs)) {
 		cmdline_printf(cl, "cannot parse devargs\n");
-		if (da.args)
-			free(da.args);
 		return;
 	}
 
@@ -180,6 +177,7 @@ static void cmd_dev_detach_parsed(void *parsed_result,
 	else
 		cmdline_printf(cl, "failed to dettach device %s\n",
 			da.name);
+	rte_devargs_reset(&da);
 }
 
 cmdline_parse_token_string_t cmd_dev_detach_detach =
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 8a3bd3100a..148a23830a 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -185,10 +185,8 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev)
 	return ret;
 
 err_devarg:
-	if (rte_devargs_remove(da) != 0) {
-		free(da->args);
-		free(da);
-	}
+	if (rte_devargs_remove(da) != 0)
+		rte_devargs_reset(da);
 	return ret;
 }
 
@@ -586,7 +584,8 @@ rte_dev_iterator_init(struct rte_dev_iterator *it,
 	it->bus_str = NULL;
 	it->cls_str = NULL;
 
-	devargs.data = dev_str;
+	/* Setting data field implies no malloc in parsing. */
+	devargs.data = (void *)(intptr_t)dev_str;
 	if (rte_devargs_layers_parse(&devargs, dev_str))
 		goto get_out;
 
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index fcf3d9a3cc..48f85ee9c0 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -150,7 +150,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 	 * their parsing afterward.
 	 */
 	if (devargs->data != devstr) {
-		char *s = (void *)(intptr_t)(devargs->data);
+		char *s = devargs->data;
 
 		while ((s = strchr(s, '/'))) {
 			*s = '\0';
@@ -219,13 +219,14 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	da->bus = bus;
 	/* Parse eventual device arguments */
 	if (devname[i] == ',')
-		da->args = strdup(&devname[i + 1]);
+		da->data = strdup(&devname[i + 1]);
 	else
-		da->args = strdup("");
-	if (da->args == NULL) {
+		da->data = strdup("");
+	if (da->data == NULL) {
 		RTE_LOG(ERR, EAL, "not enough memory to parse arguments\n");
 		return -ENOMEM;
 	}
+	da->drv_str = da->data;
 	return 0;
 }
 
@@ -260,6 +261,16 @@ rte_devargs_parsef(struct rte_devargs *da, const char *format, ...)
 	return ret;
 }
 
+void
+rte_devargs_reset(struct rte_devargs *da)
+{
+	if (da == NULL)
+		return;
+	if (da->data)
+		free(da->data);
+	da->data = NULL;
+}
+
 int
 rte_devargs_insert(struct rte_devargs **da)
 {
@@ -276,15 +287,8 @@ rte_devargs_insert(struct rte_devargs **da)
 		if (strcmp(listed_da->bus->name, (*da)->bus->name) == 0 &&
 				strcmp(listed_da->name, (*da)->name) == 0) {
 			/* device already in devargs list, must be updated */
-			listed_da->type = (*da)->type;
-			listed_da->policy = (*da)->policy;
-			free(listed_da->args);
-			listed_da->args = (*da)->args;
-			listed_da->bus = (*da)->bus;
-			listed_da->cls = (*da)->cls;
-			listed_da->bus_str = (*da)->bus_str;
-			listed_da->cls_str = (*da)->cls_str;
-			listed_da->data = (*da)->data;
+			rte_devargs_reset(listed_da);
+			*listed_da = **da;
 			/* replace provided devargs with found one */
 			free(*da);
 			*da = listed_da;
@@ -326,7 +330,7 @@ rte_devargs_add(enum rte_devtype devtype, const char *devargs_str)
 
 fail:
 	if (devargs) {
-		free(devargs->args);
+		rte_devargs_reset(devargs);
 		free(devargs);
 	}
 
@@ -346,7 +350,7 @@ rte_devargs_remove(struct rte_devargs *devargs)
 		if (strcmp(d->bus->name, devargs->bus->name) == 0 &&
 		    strcmp(d->name, devargs->name) == 0) {
 			TAILQ_REMOVE(&devargs_list, d, next);
-			free(d->args);
+			rte_devargs_reset(d);
 			free(d);
 			return 0;
 		}
diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c
index ee791903b3..ae6010e8f8 100644
--- a/lib/librte_eal/common/hotplug_mp.c
+++ b/lib/librte_eal/common/hotplug_mp.c
@@ -95,6 +95,7 @@ __handle_secondary_request(void *param)
 
 	tmp_req = *req;
 
+	memset(&da, 0, sizeof(da));
 	if (req->t == EAL_DEV_REQ_TYPE_ATTACH) {
 		ret = local_dev_probe(req->devargs, &dev);
 		if (ret != 0) {
@@ -118,8 +119,6 @@ __handle_secondary_request(void *param)
 		ret = rte_devargs_parse(&da, req->devargs);
 		if (ret != 0)
 			goto finish;
-		free(da.args); /* we don't need those */
-		da.args = NULL;
 
 		ret = eal_dev_hotplug_request_to_secondary(&tmp_req);
 		if (ret != 0) {
@@ -176,6 +175,7 @@ __handle_secondary_request(void *param)
 	if (ret)
 		RTE_LOG(ERR, EAL, "failed to send response to secondary\n");
 
+	rte_devargs_reset(&da);
 	free(bundle->peer);
 	free(bundle);
 }
@@ -283,7 +283,7 @@ static void __handle_primary_request(void *param)
 
 		ret = local_dev_remove(dev);
 quit:
-		free(da->args);
+		rte_devargs_reset(da);
 		free(da);
 		break;
 	default:
diff --git a/lib/librte_eal/include/rte_devargs.h b/lib/librte_eal/include/rte_devargs.h
index 296f19324f..134b44a887 100644
--- a/lib/librte_eal/include/rte_devargs.h
+++ b/lib/librte_eal/include/rte_devargs.h
@@ -60,16 +60,16 @@ struct rte_devargs {
 	/** Name of the device. */
 	char name[RTE_DEV_NAME_MAX_LEN];
 	RTE_STD_C11
-	union {
-	/** Arguments string as given by user or "" for no argument. */
-		char *args;
+	union { /**< driver-related part of device string. */
+		const char *args; /**< legacy name. */
 		const char *drv_str;
 	};
 	struct rte_bus *bus; /**< bus handle. */
 	struct rte_class *cls; /**< class handle. */
 	const char *bus_str; /**< bus-related part of device string. */
 	const char *cls_str; /**< class-related part of device string. */
-	const char *data; /**< Device string storage. */
+	char *data;
+	/**< Raw string including bus, class and driver arguments. */
 };
 
 /**
@@ -145,6 +145,16 @@ rte_devargs_parsef(struct rte_devargs *da,
 		   const char *format, ...)
 __rte_format_printf(2, 0);
 
+/**
+ * Free resources in devargs.
+ *
+ * @param da
+ *   The devargs structure holding the device information.
+ */
+__rte_experimental
+void
+rte_devargs_reset(struct rte_devargs *da);
+
 /**
  * Insert an rte_devargs in the global list.
  *
diff --git a/lib/librte_eal/version.map b/lib/librte_eal/version.map
index e7217ae288..fe5c3dac98 100644
--- a/lib/librte_eal/version.map
+++ b/lib/librte_eal/version.map
@@ -410,6 +410,7 @@ EXPERIMENTAL {
 	rte_power_pause; # WINDOWS_NO_EXPORT
 
 	# added in 21.05
+	rte_devargs_reset;
 	rte_intr_callback_unregister_sync;
 	rte_log_list_types;
 	rte_thread_key_create;
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 6b5cfd696d..0419500fc3 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -193,13 +193,14 @@ int
 rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 {
 	int ret;
-	struct rte_devargs devargs = {.args = NULL};
+	struct rte_devargs devargs;
 	const char *bus_param_key;
 	char *bus_str = NULL;
 	char *cls_str = NULL;
 	int str_size;
 
 	memset(iter, 0, sizeof(*iter));
+	memset(&devargs, 0, sizeof(devargs));
 
 	/*
 	 * The devargs string may use various syntaxes:
@@ -244,8 +245,6 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 		goto error;
 	}
 	iter->cls_str = cls_str;
-	free(devargs.args); /* allocated by rte_devargs_parse() */
-	devargs.args = NULL;
 
 	iter->bus = devargs.bus;
 	if (iter->bus->dev_iterate == NULL) {
@@ -278,13 +277,14 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 
 end:
 	iter->cls = rte_class_find_by_name("eth");
+	rte_devargs_reset(&devargs);
 	return 0;
 
 error:
 	if (ret == -ENOTSUP)
 		RTE_ETHDEV_LOG(ERR, "Bus %s does not support iterating.\n",
 				iter->bus->name);
-	free(devargs.args);
+	rte_devargs_reset(&devargs);
 	free(bus_str);
 	free(cls_str);
 	return ret;
-- 
2.25.1



* [dpdk-dev] [PATCH v5 2/5] devargs: fix memory leak on parsing error
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (62 preceding siblings ...)
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 1/5] devargs: unify scratch buffer storage Xueming Li
@ 2021-04-13  3:14   ` Xueming Li
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 3/5] kvargs: add get by key function Xueming Li
                     ` (2 subsequent siblings)
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-04-13  3:14 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet
  Cc: dev, xuemingl, Asaf Penso, gaetan.rivet, stable, Shreyansh Jain

This patch fixes a memory leak in the parsing error path.

Fixes: 338327d731e6 ("devargs: add function to parse device layers")
Cc: gaetan.rivet@6wind.com
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
---
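
The idea of the fix, freeing on error only what this call itself
duplicated, can be sketched in isolation as below. This is an
illustration under assumptions: parse_with_cleanup() is a hypothetical
function, a non-NULL *data stands for a caller-provided buffer, and an
empty string stands in for a real parsing error.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: duplicates src into *data unless the caller
 * already supplied a buffer, and fails on an empty buffer.
 */
static int
parse_with_cleanup(char **data, const char *src)
{
	bool allocated_data = false;
	int ret = 0;

	if (*data == NULL) {
		size_t len = strlen(src) + 1;

		*data = malloc(len);
		if (*data == NULL)
			return -1;
		memcpy(*data, src, len);
		allocated_data = true;
	}
	if ((*data)[0] == '\0')
		ret = -1; /* stand-in for a real parsing error */

	if (ret != 0 && allocated_data) {
		/* Free only what this call duplicated. */
		free(*data);
		*data = NULL;
	}
	return ret;
}
```

A buffer supplied by the caller is left untouched on failure, so the
caller keeps sole responsibility for it.
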
 lib/librte_eal/common/eal_common_devargs.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index 48f85ee9c0..e40b91ea66 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -60,6 +60,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 	size_t nblayer;
 	size_t i = 0;
 	int ret = 0;
+	bool allocated_data = false;
 
 	/* Split each sub-lists. */
 	nblayer = devargs_layer_count(devstr);
@@ -81,6 +82,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 			ret = -ENOMEM;
 			goto get_out;
 		}
+		allocated_data = true;
 		s = devargs->data;
 	}
 
@@ -163,8 +165,14 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist)
 			rte_kvargs_free(layers[i].kvlist);
 	}
-	if (ret != 0)
+	if (ret != 0) {
+		if (allocated_data) {
+			/* Free duplicated data. */
+			free(devargs->data);
+			devargs->data = NULL;
+		}
 		rte_errno = -ret;
+	}
 	return ret;
 }
 
-- 
2.25.1



* [dpdk-dev] [PATCH v5 3/5] kvargs: add get by key function
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (63 preceding siblings ...)
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 2/5] devargs: fix memory leak on parsing error Xueming Li
@ 2021-04-13  3:14   ` Xueming Li
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 4/5] bus: add device arguments name parsing API Xueming Li
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 5/5] devargs: parse global device syntax Xueming Li
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-04-13  3:14 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet
  Cc: dev, xuemingl, Asaf Penso, Olivier Matz, Ray Kinsella, Neil Horman

Add a new function to get the value of a specific key from a kvargs list.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
---
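
For readers without a DPDK tree at hand, the first-match lookup semantics
can be tried with the standalone model below. struct kv_pair, struct
kv_list and kv_get() are hypothetical simplifications of the kvargs types
and only mimic the behavior documented in this patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of a parsed key/value list. */
struct kv_pair {
	const char *key;
	const char *value;
};

struct kv_list {
	unsigned int count;
	struct kv_pair pairs[8];
};

/* Return the value of the first pair matching key, or NULL. */
static const char *
kv_get(const struct kv_list *kvlist, const char *key)
{
	unsigned int i;

	if (kvlist == NULL || key == NULL)
		return NULL;
	for (i = 0; i < kvlist->count; ++i) {
		if (strcmp(kvlist->pairs[i].key, key) == 0)
			return kvlist->pairs[i].value;
	}
	return NULL;
}
```

With the pairs bus=pci,addr=82:00.0,addr=83:00.0, looking up "addr" in
this model returns the first value, "82:00.0".
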
 lib/librte_kvargs/rte_kvargs.c | 15 +++++++++++++++
 lib/librte_kvargs/rte_kvargs.h | 20 ++++++++++++++++++++
 lib/librte_kvargs/version.map  |  3 +++
 3 files changed, 38 insertions(+)

diff --git a/lib/librte_kvargs/rte_kvargs.c b/lib/librte_kvargs/rte_kvargs.c
index ffae8914cf..4cce8e953b 100644
--- a/lib/librte_kvargs/rte_kvargs.c
+++ b/lib/librte_kvargs/rte_kvargs.c
@@ -203,6 +203,21 @@ rte_kvargs_free(struct rte_kvargs *kvlist)
 	free(kvlist);
 }
 
+/* Lookup a value in an rte_kvargs list by its key. */
+const char *
+rte_kvargs_get(const struct rte_kvargs *kvlist, const char *key)
+{
+	unsigned int i;
+
+	if (kvlist == NULL || key == NULL)
+		return NULL;
+	for (i = 0; i < kvlist->count; ++i) {
+		if (strcmp(kvlist->pairs[i].key, key) == 0)
+			return kvlist->pairs[i].value;
+	}
+	return NULL;
+}
+
 /*
  * Parse the arguments "key=value,key=value,..." string and return
  * an allocated structure that contains a key/value list. Also
diff --git a/lib/librte_kvargs/rte_kvargs.h b/lib/librte_kvargs/rte_kvargs.h
index eff598e08b..ab8a8186f6 100644
--- a/lib/librte_kvargs/rte_kvargs.h
+++ b/lib/librte_kvargs/rte_kvargs.h
@@ -114,6 +114,26 @@ struct rte_kvargs *rte_kvargs_parse_delim(const char *args,
  */
 void rte_kvargs_free(struct rte_kvargs *kvlist);
 
+/**
+ * Get the value associated with a given key.
+ *
+ * If multiple key matches, the value of the first one is returned.
+ *
+ * The memory returned is allocated as part of the rte_kvargs structure,
+ * it must never be modified.
+ *
+ * @param kvlist
+ *   A list of rte_kvargs pair of 'key=value'.
+ * @param key
+ *   The matching key.
+ *
+ * @return
+ *   NULL if no key matches the input, a value associated with a matching
+ *   key otherwise.
+ */
+__rte_experimental
+const char *rte_kvargs_get(const struct rte_kvargs *kvlist, const char *key);
+
 /**
  * Call a handler function for each key/value matching the key
  *
diff --git a/lib/librte_kvargs/version.map b/lib/librte_kvargs/version.map
index ed375bf4a3..fb9bed4f93 100644
--- a/lib/librte_kvargs/version.map
+++ b/lib/librte_kvargs/version.map
@@ -15,4 +15,7 @@ EXPERIMENTAL {
 	rte_kvargs_parse_delim;
 	rte_kvargs_strcmp;
 
+	# added in 21.05
+	rte_kvargs_get;
+
 };
-- 
2.25.1



* [dpdk-dev] [PATCH v5 4/5] bus: add device arguments name parsing API
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (64 preceding siblings ...)
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 3/5] kvargs: add get by key function Xueming Li
@ 2021-04-13  3:14   ` Xueming Li
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 5/5] devargs: parse global device syntax Xueming Li
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-04-13  3:14 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet; +Cc: dev, xuemingl, Asaf Penso

For device probe and iterator, the devargs name is key information,
parsed by rte_devargs_parse(). In the legacy parser, the devargs name
is extracted after the bus name:
  bus:name,kv_arguments,,,
Example:
  pci:83:00.0,arguments,...
  vdev:pcap0,...

To be compatible with the legacy parser, this patch introduces a new
bus driver API, devargs_parse, to parse devargs and update the devargs
name. If devargs_parse is not implemented by the bus driver, the new
syntax parser rte_devargs_layers_parse() resolves the devargs name from
the bus's "name" argument by default.

Different bus drivers might choose different keys from the arguments
with a unified format. The PCI bus implementation fills the devargs
name with the "addr" argument, example:
 -a bus=pci,addr=83:00.0/class=eth/driver=mlx5,...
    name: 0000:83:00.0
 -a bus=vdev,name=pcap0/class=eth/driver=pcap,...
    name: pcap0

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
---
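
To show how a short "addr" value maps to the full device name in the PCI
example above, here is a standalone sketch that expands a BDF into the
domain-qualified form. It does not use rte_pci_addr_parse() or
rte_pci_device_name(); pci_name_from_addr() is a hypothetical helper and
its validation is deliberately simple.

```c
#include <stdio.h>

/*
 * Hypothetical helper: accept "BB:DD.F" or "DDDD:BB:DD.F" and write the
 * domain-qualified name into out.
 * Returns 0 on success, -1 on a malformed address or a short buffer.
 */
static int
pci_name_from_addr(const char *addr, char *out, size_t size)
{
	unsigned int dom = 0, bus = 0, dev = 0, fn = 0;
	char tail;
	int n;

	/* Try the full form first; %c detects trailing garbage. */
	if (sscanf(addr, "%x:%x:%x.%x%c", &dom, &bus, &dev, &fn, &tail) != 4) {
		dom = 0; /* short form implies domain 0 */
		if (sscanf(addr, "%x:%x.%x%c", &bus, &dev, &fn, &tail) != 3)
			return -1;
	}
	if (dom > 0xffff || bus > 0xff || dev > 0x1f || fn > 7)
		return -1;
	n = snprintf(out, size, "%04x:%02x:%02x.%x", dom, bus, dev, fn);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

In this sketch both "83:00.0" and "0000:83:00.0" resolve to the same
name, while a vdev-style name such as "pcap0" is rejected.
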
 drivers/bus/pci/pci_common.c               |  1 +
 drivers/bus/pci/pci_params.c               | 47 ++++++++++++++++++++++
 drivers/bus/pci/private.h                  | 14 +++++++
 lib/librte_eal/common/eal_common_devargs.c | 31 ++++++++++++++
 lib/librte_eal/include/rte_bus.h           | 17 ++++++++
 5 files changed, 110 insertions(+)

diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index ee7f966358..30630809bb 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -760,6 +760,7 @@ struct rte_pci_bus rte_pci_bus = {
 		.dev_iterate = rte_pci_dev_iterate,
 		.hot_unplug_handler = pci_hot_unplug_handler,
 		.sigbus_handler = pci_sigbus_handler,
+		.devargs_parse = rte_pci_devargs_parse,
 	},
 	.device_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.device_list),
 	.driver_list = TAILQ_HEAD_INITIALIZER(rte_pci_bus.driver_list),
diff --git a/drivers/bus/pci/pci_params.c b/drivers/bus/pci/pci_params.c
index 3192e9c967..c0c282e948 100644
--- a/drivers/bus/pci/pci_params.c
+++ b/drivers/bus/pci/pci_params.c
@@ -8,6 +8,8 @@
 #include <rte_errno.h>
 #include <rte_kvargs.h>
 #include <rte_pci.h>
+#include <rte_devargs.h>
+#include <rte_debug.h>
 
 #include "private.h"
 
@@ -76,3 +78,48 @@ rte_pci_dev_iterate(const void *start,
 	rte_kvargs_free(kvargs);
 	return dev;
 }
+
+int
+rte_pci_devargs_parse(struct rte_devargs *da)
+{
+	struct rte_kvargs *kvargs;
+	const char *addr_str;
+	struct rte_pci_addr addr;
+	int ret;
+
+	if (da == NULL)
+		return 0;
+	RTE_ASSERT(da->bus_str != NULL);
+
+	kvargs = rte_kvargs_parse(da->bus_str, NULL);
+	if (kvargs == NULL) {
+		RTE_LOG(ERR, EAL, "cannot parse argument list: %s\n",
+			da->bus_str);
+		ret = -ENODEV;
+		goto out;
+	}
+
+	addr_str = rte_kvargs_get(kvargs, pci_params_keys[RTE_PCI_PARAM_ADDR]);
+	if (addr_str == NULL) {
+		RTE_LOG(ERR, EAL, "No PCI address specified using '%s=<id>' in: %s\n",
+			pci_params_keys[RTE_PCI_PARAM_ADDR], da->bus_str);
+		ret = -ENODEV;
+		goto out;
+	}
+
+	ret = rte_pci_addr_parse(addr_str, &addr);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL, "PCI address invalid: %s\n", da->bus_str);
+		ret = -EINVAL;
+		goto out;
+	}
+
+	rte_pci_device_name(&addr, da->name, sizeof(da->name));
+
+out:
+	if (kvargs != NULL)
+		rte_kvargs_free(kvargs);
+	if (ret != 0)
+		rte_errno = -ret;
+	return ret;
+}
diff --git a/drivers/bus/pci/private.h b/drivers/bus/pci/private.h
index f566943f5e..8bc5140e97 100644
--- a/drivers/bus/pci/private.h
+++ b/drivers/bus/pci/private.h
@@ -267,4 +267,18 @@ rte_pci_dev_iterate(const void *start,
 		    const char *str,
 		    const struct rte_dev_iterator *it);
 
+/*
+ * Parse device arguments and update name.
+ *
+ * @param da
+ *   device arguments to parse.
+ *
+ * @return
+ *   0 on success.
+ *   -EINVAL: kvargs string is invalid and cannot be parsed.
+ *   -ENODEV: no key matching a device ID is found in the kv list.
+ */
+int
+rte_pci_devargs_parse(struct rte_devargs *da);
+
 #endif /* _PCI_PRIVATE_H_ */
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index e40b91ea66..2d87e63d2a 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -19,6 +19,7 @@
 #include <rte_kvargs.h>
 #include <rte_log.h>
 #include <rte_tailq.h>
+#include <rte_string_fns.h>
 #include "eal_private.h"
 
 /** user device double-linked queue type definition */
@@ -40,6 +41,28 @@ devargs_layer_count(const char *s)
 	return i;
 }
 
+/* Resolve devargs name from bus arguments. */
+static int
+devargs_bus_parse_default(struct rte_devargs *devargs,
+			  struct rte_kvargs *bus_args)
+{
+	const char *name;
+
+	/* Parse devargs name from bus key-value list. */
+	name = rte_kvargs_get(bus_args, "name");
+	if (name == NULL) {
+		RTE_LOG(INFO, EAL, "devargs name not found: %s\n",
+			devargs->data);
+		return 0;
+	}
+	if (rte_strscpy(devargs->name, name, sizeof(devargs->name)) < 0) {
+		RTE_LOG(ERR, EAL, "devargs name too long: %s\n",
+			devargs->data);
+		return -E2BIG;
+	}
+	return 0;
+}
+
 int
 rte_devargs_layers_parse(struct rte_devargs *devargs,
 			 const char *devstr)
@@ -118,6 +141,8 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		if (layers[i].kvlist == NULL)
 			continue;
 		kv = &layers[i].kvlist->pairs[0];
+		if (kv->key == NULL)
+			continue;
 		if (strcmp(kv->key, "bus") == 0) {
 			bus = rte_bus_find_by_name(kv->value);
 			if (bus == NULL) {
@@ -160,6 +185,12 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		}
 	}
 
+	/* Resolve devargs name. */
+	if (bus != NULL && bus->devargs_parse != NULL)
+		ret = bus->devargs_parse(devargs);
+	else if (layers[0].kvlist != NULL)
+		ret = devargs_bus_parse_default(devargs, layers[0].kvlist);
+
 get_out:
 	for (i = 0; i < RTE_DIM(layers); i++) {
 		if (layers[i].kvlist)
diff --git a/lib/librte_eal/include/rte_bus.h b/lib/librte_eal/include/rte_bus.h
index 80b154fb98..bfa8dc0200 100644
--- a/lib/librte_eal/include/rte_bus.h
+++ b/lib/librte_eal/include/rte_bus.h
@@ -210,6 +210,22 @@ typedef int (*rte_bus_hot_unplug_handler_t)(struct rte_device *dev);
  */
 typedef int (*rte_bus_sigbus_handler_t)(const void *failure_addr);
 
+/**
+ * Parse bus part of the device arguments.
+ *
+ * The field name of the struct rte_devargs will be set.
+ *
+ * @param da
+ *	Pointer to the devargs to parse.
+ *
+ * @return
+ *	0 on successful parsing, otherwise rte_errno is set.
+ *	-EINVAL: on parsing error.
+ *	-ENODEV: if no key matching a device argument is specified.
+ *	-E2BIG: device name is too long.
+ */
+typedef int (*rte_bus_devargs_parse_t)(struct rte_devargs *da);
+
 /**
  * Bus scan policies
  */
@@ -267,6 +283,7 @@ struct rte_bus {
 				/**< handle hot-unplug failure on the bus */
 	rte_bus_sigbus_handler_t sigbus_handler;
 					/**< handle sigbus error on the bus */
+	rte_bus_devargs_parse_t devargs_parse; /**< Parse bus devargs */
 
 };
 
-- 
2.25.1



* [dpdk-dev] [PATCH v5 5/5] devargs: parse global device syntax
  2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
                     ` (65 preceding siblings ...)
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 4/5] bus: add device arguments name parsing API Xueming Li
@ 2021-04-13  3:14   ` Xueming Li
  66 siblings, 0 replies; 118+ messages in thread
From: Xueming Li @ 2021-04-13  3:14 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet
  Cc: dev, xuemingl, Asaf Penso, Ferruh Yigit, Andrew Rybchenko

When parsing devargs, try the global device syntax first. Fall back to
the legacy syntax on error.

Example of new global device syntax:
 -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,dv_flow_en=1

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
---
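
The control flow of the fallback can be modelled in a few lines. The
sketch below is only an approximation of the decision made in
rte_devargs_parse(): struct parsed, parse_global() and parse_devargs()
are hypothetical, no bus or class lookup is done, and the legacy branch
is reduced to a flag.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result: which layers were seen, and which parser won. */
struct parsed {
	bool has_bus;
	bool has_cls;
	bool legacy;
};

/* Scan '/'-separated layers and record bus/class layers. */
static int
parse_global(const char *dev, struct parsed *out)
{
	const char *s = dev;

	while (s != NULL && *s != '\0') {
		if (strncmp(s, "bus=", 4) == 0)
			out->has_bus = true;
		else if (strncmp(s, "class=", 6) == 0)
			out->has_cls = true;
		s = strchr(s, '/');
		if (s != NULL)
			s++;
	}
	return 0;
}

/* Try the global syntax first, otherwise mark the string as legacy. */
static int
parse_devargs(const char *dev, struct parsed *out)
{
	memset(out, 0, sizeof(*out));
	if (parse_global(dev, out) == 0) {
		if (out->has_bus || out->has_cls)
			return 0;
		memset(out, 0, sizeof(*out)); /* discard partial state */
	}
	out->legacy = true;
	return 0;
}
```

A successful global parse that found neither a bus nor a class layer is
treated as "not global syntax", so its partial state is discarded before
the legacy path runs.
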
 doc/guides/rel_notes/release_21_05.rst     |  7 +++++++
 lib/librte_eal/common/eal_common_devargs.c | 16 ++++++++++++----
 lib/librte_eal/include/rte_devargs.h       |  4 ++++
 lib/librte_ethdev/rte_ethdev.c             |  1 -
 4 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_05.rst b/doc/guides/rel_notes/release_21_05.rst
index 113b37cddc..ff29c88749 100644
--- a/doc/guides/rel_notes/release_21_05.rst
+++ b/doc/guides/rel_notes/release_21_05.rst
@@ -55,6 +55,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Enabled new devargs parser.**
+
+  * Enabled devargs syntax
+    ``bus=X,paramX=x/class=Y,paramY=y/driver=Z,paramZ=z``
+  * Added bus-level parsing of the devargs syntax.
+  * Kept compatibility with the legacy syntax as parsing fallback.
+
 * **Added support for Marvell CN10K SoC drivers.**
 
   Added Marvell CN10K SoC support. Marvell CN10K SoC are based on Octeon 10
diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index 2d87e63d2a..a81ce973fe 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -125,7 +125,6 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		layers[i].str = s;
 		layers[i].kvlist = rte_kvargs_parse_delim(s, NULL, "/");
 		if (layers[i].kvlist == NULL) {
-			RTE_LOG(ERR, EAL, "Could not parse %s\n", s);
 			ret = -EINVAL;
 			goto get_out;
 		}
@@ -143,7 +142,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 		kv = &layers[i].kvlist->pairs[0];
 		if (kv->key == NULL)
 			continue;
-		if (strcmp(kv->key, "bus") == 0) {
+		if (strcmp(kv->key, RTE_DEVARGS_KEY_BUS) == 0) {
 			bus = rte_bus_find_by_name(kv->value);
 			if (bus == NULL) {
 				RTE_LOG(ERR, EAL, "Could not find bus \"%s\"\n",
@@ -151,7 +150,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 				ret = -EFAULT;
 				goto get_out;
 			}
-		} else if (strcmp(kv->key, "class") == 0) {
+		} else if (strcmp(kv->key, RTE_DEVARGS_KEY_CLASS) == 0) {
 			cls = rte_class_find_by_name(kv->value);
 			if (cls == NULL) {
 				RTE_LOG(ERR, EAL, "Could not find class \"%s\"\n",
@@ -159,7 +158,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs,
 				ret = -EFAULT;
 				goto get_out;
 			}
-		} else if (strcmp(kv->key, "driver") == 0) {
+		} else if (strcmp(kv->key, RTE_DEVARGS_KEY_DRIVER) == 0) {
 			/* Ignore */
 			continue;
 		}
@@ -224,6 +223,15 @@ rte_devargs_parse(struct rte_devargs *da, const char *dev)
 	if (da == NULL)
 		return -EINVAL;
 
+	/* First parse according global device syntax. */
+	if (rte_devargs_layers_parse(da, dev) == 0) {
+		if (da->bus != NULL || da->cls != NULL)
+			return 0;
+		rte_devargs_reset(da);
+	}
+
+	/* Otherwise fallback to legacy syntax: */
+
 	/* Retrieve eventual bus info */
 	do {
 		devname = dev;
diff --git a/lib/librte_eal/include/rte_devargs.h b/lib/librte_eal/include/rte_devargs.h
index 134b44a887..39e34ea02e 100644
--- a/lib/librte_eal/include/rte_devargs.h
+++ b/lib/librte_eal/include/rte_devargs.h
@@ -25,6 +25,10 @@ extern "C" {
 #include <rte_compat.h>
 #include <rte_bus.h>
 
+#define RTE_DEVARGS_KEY_BUS "bus"
+#define RTE_DEVARGS_KEY_CLASS "class"
+#define RTE_DEVARGS_KEY_DRIVER "driver"
+
 /**
  * Type of generic device
  */
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 0419500fc3..c417599f79 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -207,7 +207,6 @@ rte_eth_iterator_init(struct rte_dev_iterator *iter, const char *devargs_str)
 	 *   - 0000:08:00.0,representor=[1-3]
 	 *   - pci:0000:06:00.0,representor=[0,5]
 	 *   - class=eth,mac=00:11:22:33:44:55
-	 * A new syntax is in development (not yet supported):
 	 *   - bus=X,paramX=x/class=Y,paramY=y/driver=Z,paramZ=z
 	 */
 
-- 
2.25.1



* Re: [dpdk-dev] [PATCH v5 0/5] eal: enable global device syntax by default
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 0/5] eal: enable global device syntax by default Xueming Li
@ 2021-04-14 19:49     ` Thomas Monjalon
  2021-04-23 11:06       ` Kinsella, Ray
  0 siblings, 1 reply; 118+ messages in thread
From: Thomas Monjalon @ 2021-04-14 19:49 UTC (permalink / raw)
  To: Xueming Li; +Cc: Gaetan Rivet, dev, Asaf Penso, mdr, david.marchand

13/04/2021 05:14, Xueming Li:
> Xueming Li (5):
>   devargs: unify scratch buffer storage
>   devargs: fix memory leak on parsing error
>   kvargs: add get by key function
>   bus: add device arguments name parsing API
>   devargs: parse global device syntax

Patch 4 adds a new callback in rte_bus.
I thought about it the whole day and I don't see any good way
to merge it without breaking ABI compatibility.

Only the first 3 patches are applied for now, thanks.





* Re: [dpdk-dev] [PATCH v5 1/5] devargs: unify scratch buffer storage
  2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 1/5] devargs: unify scratch buffer storage Xueming Li
@ 2021-04-16  7:00     ` David Marchand
  2021-04-16 12:32       ` Aaron Conole
  0 siblings, 1 reply; 118+ messages in thread
From: David Marchand @ 2021-04-16  7:00 UTC (permalink / raw)
  To: Aaron Conole, dpdklab
  Cc: Thomas Monjalon, Gaetan Rivet, dev, Xueming(Steven) Li,
	Asaf Penso, Wenzhuo Lu, Beilei Xing, Bernard Iremonger,
	Gaetan Rivet, Anatoly Burakov, Ray Kinsella, Neil Horman,
	Ferruh Yigit, Andrew Rybchenko, Dodji Seketeli, ci

On Tue, Apr 13, 2021 at 5:15 AM Xueming Li <xuemingl@nvidia.com> wrote:
> diff --git a/lib/librte_eal/include/rte_devargs.h b/lib/librte_eal/include/rte_devargs.h
> index 296f19324f..134b44a887 100644
> --- a/lib/librte_eal/include/rte_devargs.h
> +++ b/lib/librte_eal/include/rte_devargs.h
> @@ -60,16 +60,16 @@ struct rte_devargs {
>         /** Name of the device. */
>         char name[RTE_DEV_NAME_MAX_LEN];
>         RTE_STD_C11
> -       union {
> -       /** Arguments string as given by user or "" for no argument. */
> -               char *args;
> +       union { /**< driver-related part of device string. */
> +               const char *args; /**< legacy name. */
>                 const char *drv_str;
>         };
>         struct rte_bus *bus; /**< bus handle. */
>         struct rte_class *cls; /**< class handle. */
>         const char *bus_str; /**< bus-related part of device string. */
>         const char *cls_str; /**< class-related part of device string. */
> -       const char *data; /**< Device string storage. */
> +       char *data;
> +       /**< Raw string including bus, class and driver arguments. */
>  };
>
>  /**

- Flagging this patch for info and its impact on UNH jobs.

This change is fine, but older libabigail versions could not deal with
such changes (anonymous union, changes of const fields).
This results in an ABI check failure in the UNH x86 job on Ubuntu
18.04 (and for some people not using recent libabigail).
I can see the ARM job passes fine, so I suppose it is using a more
recent libabigail (running Ubuntu 20.04 maybe?).

We either need to disable this x86 job or update its libabigail
package (maybe aligning with what we have for public CI which is
libabigail 1.8 manually compiled).


- For the longer term, what do you think of using/extending the .ci/
scripts for use by UNH jobs?


-- 
David Marchand



* Re: [dpdk-dev] [PATCH v5 1/5] devargs: unify scratch buffer storage
  2021-04-16  7:00     ` David Marchand
@ 2021-04-16 12:32       ` Aaron Conole
  2021-04-16 12:43         ` [dpdk-dev] [dpdklab] " Lincoln Lavoie
  0 siblings, 1 reply; 118+ messages in thread
From: Aaron Conole @ 2021-04-16 12:32 UTC (permalink / raw)
  To: David Marchand
  Cc: dpdklab, Thomas Monjalon, Gaetan Rivet, dev,
	Xueming\(Steven\) Li, Asaf Penso, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Gaetan Rivet, Anatoly Burakov, Ray Kinsella,
	Neil Horman, Ferruh Yigit, Andrew Rybchenko, Dodji Seketeli, ci,
	Owen Hilyard

David Marchand <david.marchand@redhat.com> writes:

> On Tue, Apr 13, 2021 at 5:15 AM Xueming Li <xuemingl@nvidia.com> wrote:
>> diff --git a/lib/librte_eal/include/rte_devargs.h b/lib/librte_eal/include/rte_devargs.h
>> index 296f19324f..134b44a887 100644
>> --- a/lib/librte_eal/include/rte_devargs.h
>> +++ b/lib/librte_eal/include/rte_devargs.h
>> @@ -60,16 +60,16 @@ struct rte_devargs {
>>         /** Name of the device. */
>>         char name[RTE_DEV_NAME_MAX_LEN];
>>         RTE_STD_C11
>> -       union {
>> -       /** Arguments string as given by user or "" for no argument. */
>> -               char *args;
>> +       union { /**< driver-related part of device string. */
>> +               const char *args; /**< legacy name. */
>>                 const char *drv_str;
>>         };
>>         struct rte_bus *bus; /**< bus handle. */
>>         struct rte_class *cls; /**< class handle. */
>>         const char *bus_str; /**< bus-related part of device string. */
>>         const char *cls_str; /**< class-related part of device string. */
>> -       const char *data; /**< Device string storage. */
>> +       char *data;
>> +       /**< Raw string including bus, class and driver arguments. */
>>  };
>>
>>  /**
>
> - Flagging this patch for info and its impact on UNH jobs.
>
> This change is fine, but older libabigail versions could not deal with
> such changes (anonymous union, changes of const fields).
> This results in an ABI check failure in the UNH x86 job on Ubuntu
> 18.04 (and for some people not using recent libabigail).
> I can see the ARM job passes fine, so I suppose it is using a more
> recent libabigail (running Ubuntu 20.04 maybe?).
>
> We either need to disable this x86 job or update its libabigail
> package (maybe aligning with what we have for public CI which is
> libabigail 1.8 manually compiled).
>
>
> - For the longer term, what do you think of using/extending the .ci/
> scripts for use by UNH jobs?

I think it would be great if we had some of the scripts shared as a
common resource.  That would also help us to look at fixes / changes
when needed.



* Re: [dpdk-dev] [dpdklab] Re: [PATCH v5 1/5] devargs: unify scratch buffer storage
  2021-04-16 12:32       ` Aaron Conole
@ 2021-04-16 12:43         ` Lincoln Lavoie
  2021-04-16 12:58           ` Thomas Monjalon
  0 siblings, 1 reply; 118+ messages in thread
From: Lincoln Lavoie @ 2021-04-16 12:43 UTC (permalink / raw)
  To: Aaron Conole
  Cc: David Marchand, dpdklab, Thomas Monjalon, Gaetan Rivet, dev,
	Xueming(Steven) Li, Asaf Penso, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Gaetan Rivet, Anatoly Burakov, Ray Kinsella,
	Neil Horman, Ferruh Yigit, Andrew Rybchenko, Dodji Seketeli, ci,
	Owen Hilyard

All of the UNH ABI testing is moving into containers, so it can be run on
top of each OS, alongside the other compile and unit testing. This is
actually ready now, but hasn't been pushed live this week because of the
backlog the DTS failure caused in the system.  The additional compile
jobs are already online now, it's just ABI that hasn't been pushed live.

This means the current ABI testing (what is reporting right now) is running
on 18.04 for x86 and 20.04 for aarch64.  The aarch64 one will continue
forward, because we're not going to move to emulated environments for
testing on that architecture.

This has two implications. First, the scripts for running ABI (and the
other tests) become part of the container definition, and at the last
meeting we talked about moving those definitions into the dpdk-ci repo,
which I think makes sense.  Second, there isn't an operating system to
"maintain", since it's what's inside the container images, which are
periodically rebuilt but pretty much treated as ephemeral.  This assumes
the container bases / distros have the updated libabigail version packaged
with them.

Cheers,
Lincoln

On Fri, Apr 16, 2021 at 8:32 AM Aaron Conole <aconole@redhat.com> wrote:

> David Marchand <david.marchand@redhat.com> writes:
>
> > On Tue, Apr 13, 2021 at 5:15 AM Xueming Li <xuemingl@nvidia.com> wrote:
> >> diff --git a/lib/librte_eal/include/rte_devargs.h
> b/lib/librte_eal/include/rte_devargs.h
> >> index 296f19324f..134b44a887 100644
> >> --- a/lib/librte_eal/include/rte_devargs.h
> >> +++ b/lib/librte_eal/include/rte_devargs.h
> >> @@ -60,16 +60,16 @@ struct rte_devargs {
> >>         /** Name of the device. */
> >>         char name[RTE_DEV_NAME_MAX_LEN];
> >>         RTE_STD_C11
> >> -       union {
> >> -       /** Arguments string as given by user or "" for no argument. */
> >> -               char *args;
> >> +       union { /**< driver-related part of device string. */
> >> +               const char *args; /**< legacy name. */
> >>                 const char *drv_str;
> >>         };
> >>         struct rte_bus *bus; /**< bus handle. */
> >>         struct rte_class *cls; /**< class handle. */
> >>         const char *bus_str; /**< bus-related part of device string. */
> >>         const char *cls_str; /**< class-related part of device string.
> */
> >> -       const char *data; /**< Device string storage. */
> >> +       char *data;
> >> +       /**< Raw string including bus, class and driver arguments. */
> >>  };
> >>
> >>  /**
> >
> > - Flagging this patch for info and its impact on UNH jobs.
> >
> > This change is fine, but older libabigail versions could not deal with
> > such changes (anonymous union, changes of const fields).
> > This results in an ABI check failure in the UNH x86 job on Ubuntu
> > 18.04 (and for some people not using recent libabigail).
> > I can see the ARM job passes fine, so I suppose it is using a more
> > recent libabigail (running Ubuntu 20.04 maybe?).
> >
> > We either need to disable this x86 job or update its libabigail
> > package (maybe aligning with what we have for public CI which is
> > libabigail 1.8 manually compiled).
> >
> >
> > - For the longer term, what do you think of using/extending the .ci/
> > scripts for use by UNH jobs?
>
> I think it would be great if we had some of the scripts shared as a
> common resource.  That would also help us to look at fixes / changes
> when needed.
>
>

-- 
*Lincoln Lavoie*
Principal Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
lylavoie@iol.unh.edu
https://www.iol.unh.edu
+1-603-674-2755 (m)


* Re: [dpdk-dev] [dpdklab] Re: [PATCH v5 1/5] devargs: unify scratch buffer storage
  2021-04-16 12:43         ` [dpdk-dev] [dpdklab] " Lincoln Lavoie
@ 2021-04-16 12:58           ` Thomas Monjalon
  2021-04-16 13:14             ` Lincoln Lavoie
  0 siblings, 1 reply; 118+ messages in thread
From: Thomas Monjalon @ 2021-04-16 12:58 UTC (permalink / raw)
  To: Lincoln Lavoie
  Cc: Aaron Conole, David Marchand, dpdklab, Gaetan Rivet, dev,
	Xueming(Steven) Li, Asaf Penso, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Gaetan Rivet, Anatoly Burakov, Ray Kinsella,
	Neil Horman, Ferruh Yigit, Andrew Rybchenko, Dodji Seketeli, ci,
	Owen Hilyard

16/04/2021 14:43, Lincoln Lavoie:
> All of the UNH ABI testing is moving info containers, so it can be run on
> top of each OS, alongside the other compile and unit testing. This is
> actually ready now, but hasn't been pushed live this week, because of the
> backlog in the system because of the DTS failure.  The additional compile
> jobs are already online now, it's just ABI that hasn't been pushed live.
> 
> This means the current ABI (what is reporting right now) is running on
> 18.04 for x86 and 20.04 for aarch64.  The aarch64 one will continue
> forward, because we're not going to moving to emulated environments for
> testing on that architecture.
> 
> This has two implications, first, the scripts for running ABI (and the
> other tests) become part of the container definition, and at the last
> meeting we talked about moving those definitions into the dpdk-ci repo,
> which I think makes sense.  Second, there isn't an operating system to
> "maintain" since it's what's inside the container images, which are
> periodically rebuilt, but pretty much treated as ephemeral.  Assuming the
> container bases / distros have the updated libabigail version packaged with
> them.

No, the version packaged in the OS is not recent enough.
Please check what is done in Travis and GitHub CI
in the shell function install_libabigail():
https://git.dpdk.org/dpdk/tree/.ci/linux-build.sh#n22


> On Fri, Apr 16, 2021 at 8:32 AM Aaron Conole <aconole@redhat.com> wrote:
> > David Marchand <david.marchand@redhat.com> writes:
> >
> > > On Tue, Apr 13, 2021 at 5:15 AM Xueming Li <xuemingl@nvidia.com> wrote:
> > >> diff --git a/lib/librte_eal/include/rte_devargs.h
> > b/lib/librte_eal/include/rte_devargs.h
> > >> index 296f19324f..134b44a887 100644
> > >> --- a/lib/librte_eal/include/rte_devargs.h
> > >> +++ b/lib/librte_eal/include/rte_devargs.h
> > >> @@ -60,16 +60,16 @@ struct rte_devargs {
> > >>         /** Name of the device. */
> > >>         char name[RTE_DEV_NAME_MAX_LEN];
> > >>         RTE_STD_C11
> > >> -       union {
> > >> -       /** Arguments string as given by user or "" for no argument. */
> > >> -               char *args;
> > >> +       union { /**< driver-related part of device string. */
> > >> +               const char *args; /**< legacy name. */
> > >>                 const char *drv_str;
> > >>         };
> > >>         struct rte_bus *bus; /**< bus handle. */
> > >>         struct rte_class *cls; /**< class handle. */
> > >>         const char *bus_str; /**< bus-related part of device string. */
> > >>         const char *cls_str; /**< class-related part of device string.
> > */
> > >> -       const char *data; /**< Device string storage. */
> > >> +       char *data;
> > >> +       /**< Raw string including bus, class and driver arguments. */
> > >>  };
> > >>
> > >>  /**
> > >
> > > - Flagging this patch for info and its impact on UNH jobs.
> > >
> > > This change is fine, but older libabigail versions could not deal with
> > > such changes (anonymous union, changes of const fields).
> > > This results in an ABI check failure in the UNH x86 job on Ubuntu
> > > 18.04 (and for some people not using recent libabigail).
> > > I can see the ARM job passes fine, so I suppose it is using a more
> > > recent libabigail (running Ubuntu 20.04 maybe?).
> > >
> > > We either need to disable this x86 job or update its libabigail
> > > package (maybe aligning with what we have for public CI which is
> > > libabigail 1.8 manually compiled).
> > >
> > >
> > > - For the longer term, what do you think of using/extending the .ci/
> > > scripts for use by UNH jobs?
> >
> > I think it would be great if we had some of the scripts shared as a
> > common resource.  That would also help us to look at fixes / changes
> > when needed.
> >
> >
> 
> 







* Re: [dpdk-dev] [dpdklab] Re: [PATCH v5 1/5] devargs: unify scratch buffer storage
  2021-04-16 12:58           ` Thomas Monjalon
@ 2021-04-16 13:14             ` Lincoln Lavoie
  0 siblings, 0 replies; 118+ messages in thread
From: Lincoln Lavoie @ 2021-04-16 13:14 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Aaron Conole, David Marchand, dpdklab, Gaetan Rivet, dev,
	Xueming(Steven) Li, Asaf Penso, Wenzhuo Lu, Beilei Xing,
	Bernard Iremonger, Gaetan Rivet, Anatoly Burakov, Ray Kinsella,
	Neil Horman, Ferruh Yigit, Andrew Rybchenko, Dodji Seketeli, ci,
	Owen Hilyard

We can look into that, but that now will need to be tested to work across
all the different OS distros in the containers.

For now, we can install the update on the Ubuntu 18.04 worker that is
running production and remake the reference cache.

Cheers,
Lincoln

On Fri, Apr 16, 2021 at 8:58 AM Thomas Monjalon <thomas@monjalon.net> wrote:

> 16/04/2021 14:43, Lincoln Lavoie:
> > All of the UNH ABI testing is moving info containers, so it can be run on
> > top of each OS, alongside the other compile and unit testing. This is
> > actually ready now, but hasn't been pushed live this week, because of the
> > backlog in the system because of the DTS failure.  The additional compile
> > jobs are already online now, it's just ABI that hasn't been pushed live.
> >
> > This means the current ABI (what is reporting right now) is running on
> > 18.04 for x86 and 20.04 for aarch64.  The aarch64 one will continue
> > forward, because we're not going to moving to emulated environments for
> > testing on that architecture.
> >
> > This has two implications, first, the scripts for running ABI (and the
> > other tests) become part of the container definition, and at the last
> > meeting we talked about moving those definitions into the dpdk-ci repo,
> > which I think makes sense.  Second, there isn't an operating system to
> > "maintain" since it's what's inside the container images, which are
> > periodically rebuilt, but pretty much treated as ephemeral.  Assuming the
> > container bases / distros have the updated libabigail version packaged
> with
> > them.
>
> No, the version packaged in the OS is not recent enough.
> Please check what is done in Travis and GitHub CI
> in the shell function install_libabigail():
> https://git.dpdk.org/dpdk/tree/.ci/linux-build.sh#n22
>
>
> > On Fri, Apr 16, 2021 at 8:32 AM Aaron Conole <aconole@redhat.com> wrote:
> > > David Marchand <david.marchand@redhat.com> writes:
> > >
> > > > On Tue, Apr 13, 2021 at 5:15 AM Xueming Li <xuemingl@nvidia.com>
> wrote:
> > > >> diff --git a/lib/librte_eal/include/rte_devargs.h
> > > b/lib/librte_eal/include/rte_devargs.h
> > > >> index 296f19324f..134b44a887 100644
> > > >> --- a/lib/librte_eal/include/rte_devargs.h
> > > >> +++ b/lib/librte_eal/include/rte_devargs.h
> > > >> @@ -60,16 +60,16 @@ struct rte_devargs {
> > > >>         /** Name of the device. */
> > > >>         char name[RTE_DEV_NAME_MAX_LEN];
> > > >>         RTE_STD_C11
> > > >> -       union {
> > > >> -       /** Arguments string as given by user or "" for no
> argument. */
> > > >> -               char *args;
> > > >> +       union { /**< driver-related part of device string. */
> > > >> +               const char *args; /**< legacy name. */
> > > >>                 const char *drv_str;
> > > >>         };
> > > >>         struct rte_bus *bus; /**< bus handle. */
> > > >>         struct rte_class *cls; /**< class handle. */
> > > >>         const char *bus_str; /**< bus-related part of device
> string. */
> > > >>         const char *cls_str; /**< class-related part of device
> string.
> > > */
> > > >> -       const char *data; /**< Device string storage. */
> > > >> +       char *data;
> > > >> +       /**< Raw string including bus, class and driver arguments.
> */
> > > >>  };
> > > >>
> > > >>  /**
> > > >
> > > > - Flagging this patch for info and its impact on UNH jobs.
> > > >
> > > > This change is fine, but older libabigail versions could not deal
> with
> > > > such changes (anonymous union, changes of const fields).
> > > > This results in an ABI check failure in the UNH x86 job on Ubuntu
> > > > 18.04 (and for some people not using recent libabigail).
> > > > I can see the ARM job passes fine, so I suppose it is using a more
> > > > recent libabigail (running Ubuntu 20.04 maybe?).
> > > >
> > > > We either need to disable this x86 job or update its libabigail
> > > > package (maybe aligning with what we have for public CI which is
> > > > libabigail 1.8 manually compiled).
> > > >
> > > >
> > > > - For the longer term, what do you think of using/extending the .ci/
> > > > scripts for use by UNH jobs?
> > >
> > > I think it would be great if we had some of the scripts shared as a
> > > common resource.  That would also help us to look at fixes / changes
> > > when needed.
> > >
> > >
> >
> >
>
>
>
>
>
>

-- 
*Lincoln Lavoie*
Principal Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
lylavoie@iol.unh.edu
https://www.iol.unh.edu
+1-603-674-2755 (m)


* Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax
  2021-04-08 14:13             ` Raslan Darawsheh
@ 2021-04-19  9:29               ` Raslan Darawsheh
  2021-04-19 10:36                 ` Raslan Darawsheh
  0 siblings, 1 reply; 118+ messages in thread
From: Raslan Darawsheh @ 2021-04-19  9:29 UTC (permalink / raw)
  To: Raslan Darawsheh, Xueming(Steven) Li, Slava Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, NBU-Contact-Thomas Monjalon,
	Asaf Penso, ferruh.yigit

Series applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Raslan Darawsheh
> Sent: Thursday, April 8, 2021 5:13 PM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>; Slava Ovsiienko
> <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Asaf Penso <asafp@nvidia.com>;
> ferruh.yigit@intel.com
> Subject: Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global
> syntax
> 
> Hi,
> 
> > -----Original Message-----
> > From: Xueming(Steven) Li <xuemingl@nvidia.com>
> > Sent: Thursday, April 8, 2021 5:08 PM
> > To: Raslan Darawsheh <rasland@nvidia.com>; Slava Ovsiienko
> > <viacheslavo@nvidia.com>
> > Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> > <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> > <thomas@monjalon.net>; Asaf Penso <asafp@nvidia.com>;
> > ferruh.yigit@intel.com
> > Subject: RE: [PATCH v2 1/2] common/mlx5: support device global syntax
> >
> > Hi Raslan,
> >
> > Did you see anything broken? AFAIK, having it in repo shouldn't hurt.
> > On your decision :)
> No, It doesn't hurt/ break anything really.
> But, the idea that it has some logical dependency in the main tree so I'll only
> wait till we'll have it merged then will take this one.
> 
> >
> > Thanks,
> > Xueming
> >
> Kindest regards
> Raslan Darawsheh


* Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax
  2021-04-19  9:29               ` Raslan Darawsheh
@ 2021-04-19 10:36                 ` Raslan Darawsheh
  0 siblings, 0 replies; 118+ messages in thread
From: Raslan Darawsheh @ 2021-04-19 10:36 UTC (permalink / raw)
  To: Xueming(Steven) Li, Slava Ovsiienko
  Cc: dev, Matan Azrad, Shahaf Shuler, NBU-Contact-Thomas Monjalon,
	Asaf Penso, ferruh.yigit

Sorry for all the confusion,
but since we are still missing part of the dependency, I am dropping this from next-net-mlx again.

Kindest regards,
Raslan Darawsheh

> -----Original Message-----
> From: Raslan Darawsheh
> Sent: Monday, April 19, 2021 12:30 PM
> To: Raslan Darawsheh <rasland@nvidia.com>; Xueming(Steven) Li
> <xuemingl@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Asaf Penso <asafp@nvidia.com>;
> ferruh.yigit@intel.com
> Subject: RE: [PATCH v2 1/2] common/mlx5: support device global syntax
> 
> Series applied to next-net-mlx,
> 
> Kindest regards,
> Raslan Darawsheh
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Raslan Darawsheh
> > Sent: Thursday, April 8, 2021 5:13 PM
> > To: Xueming(Steven) Li <xuemingl@nvidia.com>; Slava Ovsiienko
> > <viacheslavo@nvidia.com>
> > Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> > <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> > <thomas@monjalon.net>; Asaf Penso <asafp@nvidia.com>;
> > ferruh.yigit@intel.com
> > Subject: Re: [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device
> global
> > syntax
> >
> > Hi,
> >
> > > -----Original Message-----
> > > From: Xueming(Steven) Li <xuemingl@nvidia.com>
> > > Sent: Thursday, April 8, 2021 5:08 PM
> > > To: Raslan Darawsheh <rasland@nvidia.com>; Slava Ovsiienko
> > > <viacheslavo@nvidia.com>
> > > Cc: dev@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler
> > > <shahafs@nvidia.com>; NBU-Contact-Thomas Monjalon
> > > <thomas@monjalon.net>; Asaf Penso <asafp@nvidia.com>;
> > > ferruh.yigit@intel.com
> > > Subject: RE: [PATCH v2 1/2] common/mlx5: support device global syntax
> > >
> > > Hi Raslan,
> > >
> > > Did you see anything broken? AFAIK, having it in repo shouldn't hurt.
> > > On your decision :)
> > No, It doesn't hurt/ break anything really.
> > But, the idea that it has some logical dependency in the main tree so I'll
> only
> > wait till we'll have it merged then will take this one.
> >
> > >
> > > Thanks,
> > > Xueming
> > >
> > Kindest regards
> > Raslan Darawsheh


* Re: [dpdk-dev] [PATCH v5 0/5] eal: enable global device syntax by default
  2021-04-14 19:49     ` Thomas Monjalon
@ 2021-04-23 11:06       ` Kinsella, Ray
  2021-04-23 11:39         ` Gaëtan Rivet
  0 siblings, 1 reply; 118+ messages in thread
From: Kinsella, Ray @ 2021-04-23 11:06 UTC (permalink / raw)
  To: Thomas Monjalon, Xueming Li; +Cc: Gaetan Rivet, dev, Asaf Penso, david.marchand



On 14/04/2021 20:49, Thomas Monjalon wrote:
> 13/04/2021 05:14, Xueming Li:
>> Xueming Li (5):
>>   devargs: unify scratch buffer storage
>>   devargs: fix memory leak on parsing error
>>   kvargs: add get by key function
>>   bus: add device arguments name parsing API
>>   devargs: parse global device syntax
> 
> The patch 4 adds a new callback in rte_bus.
> I thought about it during the whole day and I don't see any good way
> to merge it without breaking the ABI compatibility.
> 
> Only first 3 patches are applied for now, thanks.
> 

I took a look, I don't immediately see the concern.

The new entry is at the end of the memory structure.
The callback is internal and hidden behind the symbol rte_devargs_layers_parse.

So it will only be triggered by an rte_devargs_layers_parse from the same version of DPDK that introduces the new callback.

Should be fine?
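
To illustrate the layout argument, here is a small standalone sketch. The
two structs are hypothetical stand-ins, not the real struct rte_bus; the
only point is that appending a member at the end leaves the offsets of the
existing members unchanged while the structure size grows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types, for illustration only. */
typedef int (*scan_t)(void);
typedef int (*devargs_parse_t)(const char *name);

/* Stand-in for the structure before the patch. */
struct bus_old {
	const char *name;
	scan_t scan;
};

/* Stand-in for the structure after the patch: one member appended. */
struct bus_new {
	const char *name;
	scan_t scan;
	devargs_parse_t devargs_parse; /* new entry at the end */
};

/* Compile-time check: existing members keep their offsets. */
static_assert(offsetof(struct bus_old, name) ==
	      offsetof(struct bus_new, name), "name moved");
static_assert(offsetof(struct bus_old, scan) ==
	      offsetof(struct bus_new, scan), "scan moved");

/* Run-time view of the same facts, usable from a test. */
int
existing_offsets_unchanged(void)
{
	return offsetof(struct bus_old, name) ==
	       offsetof(struct bus_new, name) &&
	       offsetof(struct bus_old, scan) ==
	       offsetof(struct bus_new, scan);
}

/* The size does change, which matters if old objects are read as new. */
int
size_grew(void)
{
	return sizeof(struct bus_new) > sizeof(struct bus_old);
}
```

Code that only touches the old members is unaffected. Reading the new
member is safe only when the object was built against the new definition.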


* Re: [dpdk-dev]  [PATCH v5 0/5] eal: enable global device syntax by default
  2021-04-23 11:06       ` Kinsella, Ray
@ 2021-04-23 11:39         ` Gaëtan Rivet
  2021-04-23 12:35           ` Kinsella, Ray
  0 siblings, 1 reply; 118+ messages in thread
From: Gaëtan Rivet @ 2021-04-23 11:39 UTC (permalink / raw)
  To: Kinsella, Ray, Thomas Monjalon, Xueming(Steven) Li
  Cc: Gaetan Rivet, dev, Asaf Penso, David Marchand

On Fri, Apr 23, 2021, at 13:06, Kinsella, Ray wrote:
> 
> 
> On 14/04/2021 20:49, Thomas Monjalon wrote:
> > 13/04/2021 05:14, Xueming Li:
> >> Xueming Li (5):
> >>   devargs: unify scratch buffer storage
> >>   devargs: fix memory leak on parsing error
> >>   kvargs: add get by key function
> >>   bus: add device arguments name parsing API
> >>   devargs: parse global device syntax
> > 
> > The patch 4 adds a new callback in rte_bus.
> > I thought about it during the whole day and I don't see any good way
> > to merge it without breaking the ABI compatibility.
> > 
> > Only first 3 patches are applied for now, thanks.
> > 
> 
> I took a look, I don't immediately see the concern.
> 
> The new entry is at the end of the memory structure.
> The call back is internal and hidden behind the symbol rte_devargs_layers_parse.
> 
> So will only be trigger by a rte_devargs_layers_parse of the same 
> version of DPDK that introduce the new callback.
> 
> Should be fine?
> 

It might have been an issue IMO with a structure exposed as an array, e.g. rte_eth_devices[].
But I thought this kind of ABI break was the kind that would be accepted between two LTS.

The only potential risk is in using a new version of librte_eal.so with an older librte_bus_xxx.so.
But I think it is fair to expect installations to be internally consistent.

Maybe we could have a runtime warning when loading mismatched versions
(if there isn't one already) -- each librte_*.so could have an internal version stamp and alignment could
be checked through a constructor in each lib?
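
A rough sketch of what I have in mind, under loud assumptions: the names
LIB_VERSION_STAMP, core_stamp and check_version_stamp() are made up for
illustration and nothing like this exists in DPDK today as far as I know.
It relies on the GCC/Clang constructor attribute, which is the same
mechanism RTE_INIT() wraps.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stamp, baked into each library at build time. */
#define LIB_VERSION_STAMP "21.05"

/* In a real setup this would be exported by the core library. */
static const char *core_stamp = LIB_VERSION_STAMP;

/* Warn and return -1 when a library stamp differs from the core stamp. */
int
check_version_stamp(const char *lib, const char *stamp)
{
	if (strcmp(stamp, core_stamp) == 0)
		return 0;
	fprintf(stderr, "warning: %s built for %s, core is %s\n",
		lib, stamp, core_stamp);
	return -1;
}

/* Each library would carry one such constructor, run at load time. */
__attribute__((constructor))
static void
bus_example_check(void)
{
	check_version_stamp("bus_example", LIB_VERSION_STAMP);
}
```

In a single binary the stamps trivially match; the check would only become
meaningful when each shared object is compiled with its own stamp.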

-- 
Gaetan Rivet


* Re: [dpdk-dev] [PATCH v5 0/5] eal: enable global device syntax by default
  2021-04-23 11:39         ` Gaëtan Rivet
@ 2021-04-23 12:35           ` Kinsella, Ray
  0 siblings, 0 replies; 118+ messages in thread
From: Kinsella, Ray @ 2021-04-23 12:35 UTC (permalink / raw)
  To: Gaëtan Rivet, Thomas Monjalon, Xueming(Steven) Li
  Cc: Gaetan Rivet, dev, Asaf Penso, David Marchand



On 23/04/2021 12:39, Gaëtan Rivet wrote:
> On Fri, Apr 23, 2021, at 13:06, Kinsella, Ray wrote:
>>
>>
>> On 14/04/2021 20:49, Thomas Monjalon wrote:
>>> 13/04/2021 05:14, Xueming Li:
>>>> Xueming Li (5):
>>>>   devargs: unify scratch buffer storage
>>>>   devargs: fix memory leak on parsing error
>>>>   kvargs: add get by key function
>>>>   bus: add device arguments name parsing API
>>>>   devargs: parse global device syntax
>>>
>>> Patch 4 adds a new callback in rte_bus.
>>> I thought about it the whole day and I don't see any good way
>>> to merge it without breaking ABI compatibility.
>>>
>>> Only the first 3 patches are applied for now, thanks.
>>>
>>
>> I took a look, I don't immediately see the concern.
>>
>> The new entry is at the end of the memory structure.
>> The callback is internal and hidden behind the symbol rte_devargs_layers_parse.
>>
>> So it will only be triggered by an rte_devargs_layers_parse from the same
>> version of DPDK that introduces the new callback.
>>
>> Should be fine?
>>
> 
> It might have been an issue IMO with a structure exposed as an array, e.g. rte_eth_devices[].
> But I thought this kind of ABI break was one that would be accepted between two LTS releases.

It very much depends on how it is done.
New fields are OK in some circumstances; at first glance I thought this one was OK.
 
> The only potential risk is in using a newer librte_eal.so with an older librte_bus_xxx.so.

We don't account for or consider that; it would be an irrational environment.

> But I think it is fair to expect installations to be internally consistent.
> 
> Maybe we could have a runtime warning when loading mismatched versions

Nope - that would be insanely complex. 

> (if there isn't one already) -- each librte_*.so could have an internal version stamp and alignment could
> be checked through a constructor in each lib?
> 

end of thread, other threads:[~2021-04-23 12:35 UTC | newest]

Thread overview: 118+ messages
2020-12-18 15:16 [dpdk-dev] [RFC 0/9] support global syntax Xueming Li
2020-12-18 15:16 ` [dpdk-dev] [RFC 1/9] devargs: fix data buffer storage type Xueming Li
2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 0/7] eal: support global syntax Xueming Li
2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 1/7] devargs: fix data buffer storage type Xueming Li
2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 2/7] devargs: fix memory leak on parsing error Xueming Li
2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 3/7] devargs: fix memory leak in legacy parser Xueming Li
2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 4/7] devargs: fix buffer data memory leak Xueming Li
2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 5/7] kvargs: add get by key function Xueming Li
2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 6/7] devargs: support new global device syntax Xueming Li
2021-01-08 14:54   ` [dpdk-dev] [PATCH v1 7/7] bus/pci: add new global device syntax support Xueming Li
2021-01-08 15:14   ` [dpdk-dev] [PATCH v1 0/2] mlx5: support global syntax Xueming Li
2021-01-08 15:14   ` [dpdk-dev] [PATCH v1 1/2] common/mlx5: support device " Xueming Li
2021-01-08 15:15   ` [dpdk-dev] [PATCH v1 2/2] net/mlx5: support new " Xueming Li
2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 0/9] net/mlx5: support SubFunction representor Xueming Li
2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 1/9] common/mlx5: update representor name parsing Xueming Li
2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 2/9] net/mlx5: support representor of sub function Xueming Li
2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 3/9] net/mlx5: revert setting bonding representor to first PF Xueming Li
2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 4/9] net/mlx5: refactor bonding representor probe Xueming Li
2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 5/9] net/mlx5: support representor from multiple PFs Xueming Li
2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 6/9] net/mlx5: save bonding member ports information Xueming Li
2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 7/9] " Xueming Li
2021-01-18 16:17     ` Slava Ovsiienko
2021-01-18 23:05       ` Xueming(Steven) Li
2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 8/9] net/mlx5: fix setting VF default MAC through representor Xueming Li
2021-01-18 11:29   ` [dpdk-dev] [PATCH v3 9/9] net/mlx5: improve bonding xstats Xueming Li
2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 0/5] eal: enable global device syntax Xueming Li
2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 1/5] devargs: fix memory leak on parsing error Xueming Li
2021-03-18  9:12     ` Thomas Monjalon
2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 2/5] devargs: refactor scratch buffer storage Xueming Li
2021-03-18  9:14     ` Thomas Monjalon
2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 3/5] kvargs: add get by key function Xueming Li
2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 4/5] devargs: parse name from global device syntax Xueming Li
2021-01-18 15:16   ` [dpdk-dev] [PATCH v2 5/5] devargs: enable global device syntax devargs Xueming Li
2021-01-18 15:26   ` [dpdk-dev] [PATCH v2 0/2] mlx5: support global device syntax Xueming Li
2021-01-18 15:26   ` [dpdk-dev] [PATCH v2 1/2] common/mlx5: support device global syntax Xueming Li
2021-04-05 10:54     ` Slava Ovsiienko
2021-04-08 12:24       ` Raslan Darawsheh
2021-04-08 14:04         ` Raslan Darawsheh
2021-04-08 14:08           ` Xueming(Steven) Li
2021-04-08 14:13             ` Raslan Darawsheh
2021-04-19  9:29               ` Raslan Darawsheh
2021-04-19 10:36                 ` Raslan Darawsheh
2021-01-18 15:26   ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: support new global device syntax Xueming Li
2021-04-05 10:56     ` Slava Ovsiienko
2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 0/8] net/mlx5: support SubFunction representor Xueming Li
2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 1/8] common/mlx5: update representor name parsing Xueming Li
2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 2/8] net/mlx5: support representor of sub function Xueming Li
2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 3/8] net/mlx5: revert setting bonding representor to first PF Xueming Li
2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 4/8] net/mlx5: refactor bonding representor probe Xueming Li
2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 5/8] net/mlx5: support representor from multiple PFs Xueming Li
2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 6/8] net/mlx5: save bonding member ports information Xueming Li
2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 7/8] net/mlx5: fix setting VF default MAC through representor Xueming Li
2021-01-19  7:28   ` [dpdk-dev] [PATCH v4 8/8] net/mlx5: improve bonding xstats Xueming Li
2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 0/9] net/mlx5: support SubFunction representor Xueming Li
2021-03-31  7:20     ` Raslan Darawsheh
2021-03-31  7:27       ` Xueming(Steven) Li
2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 1/9] common/mlx5: sub-function representor port name parsing Xueming Li
2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 2/9] net/mlx5: support representor of sub function Xueming Li
2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 3/9] net/mlx5: revert setting bonding representor to first PF Xueming Li
2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 4/9] net/mlx5: refactor bonding representor probe Xueming Li
2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 5/9] net/mlx5: support list value of representor PF Xueming Li
2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 6/9] net/mlx5: save bonding member ports information Xueming Li
2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 7/9] net/mlx5: fix setting VF default MAC through representor Xueming Li
2021-03-31  7:46     ` Raslan Darawsheh
2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 8/9] net/mlx5: improve xstats of bonding port Xueming Li
2021-03-28 13:48   ` [dpdk-dev] [PATCH v5 9/9] net/mlx5: probe host PF representor with SubFunction Xueming Li
2021-03-30  7:37     ` Slava Ovsiienko
2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 0/5] eal: enable global device syntax by default Xueming Li
2021-03-31  8:23     ` Gaëtan Rivet
2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 1/5] devargs: unify scratch buffer storage Xueming Li
2021-04-01  9:04     ` Kinsella, Ray
2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 2/5] devargs: fix memory leak on parsing error Xueming Li
2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 3/5] kvargs: add get by key function Xueming Li
2021-04-01  9:06     ` Kinsella, Ray
2021-04-01  9:10       ` Xueming(Steven) Li
2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 4/5] bus: add device arguments name parsing API Xueming Li
2021-03-31 10:19     ` Thomas Monjalon
2021-04-01 15:13       ` Xueming(Steven) Li
2021-04-08 23:49         ` Thomas Monjalon
2021-03-30 12:15   ` [dpdk-dev] [PATCH v3 5/5] devargs: parse global device syntax Xueming Li
2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 0/5] eal: enable global device syntax by default Xueming Li
2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 1/5] devargs: unify scratch buffer storage Xueming Li
2021-04-10 19:59     ` Tal Shnaiderman
2021-04-12 12:07       ` Xueming(Steven) Li
2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 2/5] devargs: fix memory leak on parsing error Xueming Li
2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 3/5] kvargs: add get by key function Xueming Li
2021-04-12  6:52     ` Olivier Matz
2021-04-12 12:07       ` Xueming(Steven) Li
2021-04-12 21:18         ` Thomas Monjalon
2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 4/5] bus: add device arguments name parsing API Xueming Li
2021-04-12 21:16     ` Thomas Monjalon
2021-04-12 23:37       ` Xueming(Steven) Li
2021-04-10 14:23   ` [dpdk-dev] [PATCH v4 5/5] devargs: parse global device syntax Xueming Li
2021-04-12 21:24     ` Thomas Monjalon
2021-04-12 23:47       ` Xueming(Steven) Li
2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 0/5] eal: enable global device syntax by default Xueming Li
2021-04-14 19:49     ` Thomas Monjalon
2021-04-23 11:06       ` Kinsella, Ray
2021-04-23 11:39         ` Gaëtan Rivet
2021-04-23 12:35           ` Kinsella, Ray
2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 1/5] devargs: unify scratch buffer storage Xueming Li
2021-04-16  7:00     ` David Marchand
2021-04-16 12:32       ` Aaron Conole
2021-04-16 12:43         ` [dpdk-dev] [dpdklab] " Lincoln Lavoie
2021-04-16 12:58           ` Thomas Monjalon
2021-04-16 13:14             ` Lincoln Lavoie
2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 2/5] devargs: fix memory leak on parsing error Xueming Li
2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 3/5] kvargs: add get by key function Xueming Li
2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 4/5] bus: add device arguments name parsing API Xueming Li
2021-04-13  3:14   ` [dpdk-dev] [PATCH v5 5/5] devargs: parse global device syntax Xueming Li
2020-12-18 15:16 ` [dpdk-dev] [RFC 2/9] devargs: fix memory leak on parsing error Xueming Li
2020-12-18 15:16 ` [dpdk-dev] [RFC 3/9] devargs: fix memory leak in legacy parser Xueming Li
2020-12-18 15:16 ` [dpdk-dev] [RFC 4/9] devargs: fix buffer data memory leak Xueming Li
2020-12-18 15:16 ` [dpdk-dev] [RFC 5/9] kvargs: add get by key function Xueming Li
2020-12-18 15:16 ` [dpdk-dev] [RFC 6/9] devargs: support new global device syntax Xueming Li
2020-12-18 15:16 ` [dpdk-dev] [RFC 7/9] bus/pci: add new global device syntax support Xueming Li
2020-12-18 15:16 ` [dpdk-dev] [RFC 8/9] common/mlx5: support device global syntax Xueming Li
2020-12-18 15:16 ` [dpdk-dev] [RFC 9/9] net/mlx5: support new " Xueming Li