DPDK patches and discussions
 help / color / mirror / Atom feed
* [PATCH 1/4] net/netvsc: scan all net devices under the PCI device
@ 2025-01-28  1:35 longli
  2025-01-28  1:35 ` [PATCH 2/4] net/netvsc: remove RTE device if all its net devices are removed longli
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: longli @ 2025-01-28  1:35 UTC (permalink / raw)
  To: Ferruh Yigit, Andrew Rybchenko, Wei Hu; +Cc: dev, Long Li, stable

From: Long Li <longli@microsoft.com>

The current code has the wrong assumption that a PCI device can have only
one Ethernet device. This is not correct as a PCI device can be multi
functional and have multiple Ethernet devices.

Fix this by scanning all the devices under a PCI device.

Fixes: a2a23a794b3a ("net/netvsc: support VF device hot add/remove")
Cc: stable@dpdk.org
Signed-off-by: Long Li <longli@microsoft.com>
---
 drivers/net/netvsc/hn_ethdev.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
index 1736cb5d07..f848157b49 100644
--- a/drivers/net/netvsc/hn_ethdev.c
+++ b/drivers/net/netvsc/hn_ethdev.c
@@ -570,7 +570,7 @@ static void netvsc_hotplug_retry(void *args)
 	struct rte_devargs *d = &hot_ctx->da;
 	char buf[256];
 
-	DIR *di;
+	DIR *di = NULL;
 	struct dirent *dir;
 	struct ifreq req;
 	struct rte_ether_addr eth_addr;
@@ -590,7 +590,9 @@ static void netvsc_hotplug_retry(void *args)
 	if (!di) {
 		PMD_DRV_LOG(DEBUG, "%s: can't open directory %s, "
 			    "retrying in 1 second", __func__, buf);
-		goto retry;
+		/* The device is still being initialized, retry after 1 second */
+		rte_eal_alarm_set(1000000, netvsc_hotplug_retry, hot_ctx);
+		return;
 	}
 
 	while ((dir = readdir(di))) {
@@ -614,10 +616,9 @@ static void netvsc_hotplug_retry(void *args)
 				    dir->d_name);
 			break;
 		}
-		if (req.ifr_hwaddr.sa_family != ARPHRD_ETHER) {
-			closedir(di);
-			goto free_hotadd_ctx;
-		}
+		if (req.ifr_hwaddr.sa_family != ARPHRD_ETHER)
+			continue;
+
 		memcpy(eth_addr.addr_bytes, req.ifr_hwaddr.sa_data,
 		       RTE_DIM(eth_addr.addr_bytes));
 
@@ -636,22 +637,16 @@ static void netvsc_hotplug_retry(void *args)
 				PMD_DRV_LOG(ERR,
 					    "Failed to add PCI device %s",
 					    d->name);
-				break;
 			}
+
+			break;
 		}
-		/* When the code reaches here, we either have already added
-		 * the device, or its MAC address did not match.
-		 */
-		closedir(di);
-		goto free_hotadd_ctx;
 	}
-	closedir(di);
-retry:
-	/* The device is still being initialized, retry after 1 second */
-	rte_eal_alarm_set(1000000, netvsc_hotplug_retry, hot_ctx);
-	return;
 
 free_hotadd_ctx:
+	if (di)
+		closedir(di);
+
 	rte_spinlock_lock(&hv->hotadd_lock);
 	LIST_REMOVE(hot_ctx, list);
 	rte_spinlock_unlock(&hv->hotadd_lock);
-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/4] net/netvsc: remove RTE device if all its net devices are removed
  2025-01-28  1:35 [PATCH 1/4] net/netvsc: scan all net devices under the PCI device longli
@ 2025-01-28  1:35 ` longli
  2025-01-28  1:35 ` [PATCH 3/4] net/netvsc: log error on failure to switch data path longli
  2025-01-28  1:35 ` [PATCH 4/4] net/netvsc: cache device parameters for hot plug events longli
  2 siblings, 0 replies; 10+ messages in thread
From: longli @ 2025-01-28  1:35 UTC (permalink / raw)
  To: Ferruh Yigit, Andrew Rybchenko, Wei Hu; +Cc: dev, Long Li, stable

From: Long Li <longli@microsoft.com>

An RTE device can have multiple Ethernet devices. On hot plug events, it
can't be removed until all its Ethernet devices have been removed.

Fixes: a2a23a794b3a ("net/netvsc: support VF device hot add/remove")
Cc: stable@dpdk.org
Signed-off-by: Long Li <longli@microsoft.com>
---
 drivers/net/netvsc/hn_vf.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/netvsc/hn_vf.c b/drivers/net/netvsc/hn_vf.c
index b664beaa5d..5d8058774d 100644
--- a/drivers/net/netvsc/hn_vf.c
+++ b/drivers/net/netvsc/hn_vf.c
@@ -102,6 +102,7 @@ static void hn_remove_delayed(void *args)
 	uint16_t port_id = hv->vf_ctx.vf_port;
 	struct rte_device *dev = rte_eth_devices[port_id].device;
 	int ret;
+	bool all_eth_removed;
 
 	/* Tell VSP to switch data path to synthetic */
 	hn_vf_remove(hv);
@@ -138,7 +139,17 @@ static void hn_remove_delayed(void *args)
 		PMD_DRV_LOG(ERR, "rte_eth_dev_close failed port_id=%u ret=%d",
 			    port_id, ret);
 
-	ret = rte_dev_remove(dev);
+	/* Remove the rte device when all its eth devices are removed */
+	all_eth_removed = true;
+	RTE_ETH_FOREACH_DEV_OF(port_id, dev) {
+		if (rte_eth_devices[port_id].state != RTE_ETH_DEV_UNUSED) {
+			all_eth_removed = false;
+			break;
+		}
+	}
+	if (all_eth_removed)
+		ret = rte_dev_remove(dev);
+
 	hv->vf_ctx.vf_state = vf_removed;
 
 	rte_rwlock_write_unlock(&hv->vf_lock);
-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 3/4] net/netvsc: log error on failure to switch data path
  2025-01-28  1:35 [PATCH 1/4] net/netvsc: scan all net devices under the PCI device longli
  2025-01-28  1:35 ` [PATCH 2/4] net/netvsc: remove RTE device if all its net devices are removed longli
@ 2025-01-28  1:35 ` longli
  2025-01-28  1:35 ` [PATCH 4/4] net/netvsc: cache device parameters for hot plug events longli
  2 siblings, 0 replies; 10+ messages in thread
From: longli @ 2025-01-28  1:35 UTC (permalink / raw)
  To: Ferruh Yigit, Andrew Rybchenko, Wei Hu; +Cc: dev, Long Li

From: Long Li <longli@microsoft.com>

There is no recovery code path if netvsc failed to switch data path.

Log an error in this case for troubleshooting.

Signed-off-by: Long Li <longli@microsoft.com>
---
 drivers/net/netvsc/hn_vf.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/netvsc/hn_vf.c b/drivers/net/netvsc/hn_vf.c
index 5d8058774d..4ff766ec8b 100644
--- a/drivers/net/netvsc/hn_vf.c
+++ b/drivers/net/netvsc/hn_vf.c
@@ -316,8 +316,9 @@ static void hn_vf_remove(struct hn_data *hv)
 	} else {
 		/* Stop incoming packets from arriving on VF */
 		ret = hn_nvs_set_datapath(hv, NVS_DATAPATH_SYNTHETIC);
-		if (ret == 0)
-			hv->vf_ctx.vf_vsc_switched = false;
+		if (ret)
+			PMD_DRV_LOG(ERR, "Failed to switch to synthetic");
+		hv->vf_ctx.vf_vsc_switched = false;
 	}
 	rte_rwlock_write_unlock(&hv->vf_lock);
 }
-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 4/4] net/netvsc: cache device parameters for hot plug events
  2025-01-28  1:35 [PATCH 1/4] net/netvsc: scan all net devices under the PCI device longli
  2025-01-28  1:35 ` [PATCH 2/4] net/netvsc: remove RTE device if all its net devices are removed longli
  2025-01-28  1:35 ` [PATCH 3/4] net/netvsc: log error on failure to switch data path longli
@ 2025-01-28  1:35 ` longli
  2025-01-28 21:00   ` Stephen Hemminger
  2025-01-28 21:01   ` Stephen Hemminger
  2 siblings, 2 replies; 10+ messages in thread
From: longli @ 2025-01-28  1:35 UTC (permalink / raw)
  To: Ferruh Yigit, Andrew Rybchenko, Wei Hu; +Cc: dev, Long Li

From: Long Li <longli@microsoft.com>

If a device is hot removed and hot plugged, it needs the same driver
parameters that are passed to EAL. However, during device removal, all
EAL driver parameters are freed as part of the cleanup.

Cache those driver parameters for future hot plug events. Because we don't
know which device will show up, cache all the PCI driver parameters.

Signed-off-by: Long Li <longli@microsoft.com>
---
 drivers/net/netvsc/hn_ethdev.c | 107 +++++++++++++++++++++++++++++----
 drivers/net/netvsc/hn_var.h    |   1 -
 drivers/net/netvsc/hn_vf.c     |   4 --
 3 files changed, 94 insertions(+), 18 deletions(-)

diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
index f848157b49..afbbccd822 100644
--- a/drivers/net/netvsc/hn_ethdev.c
+++ b/drivers/net/netvsc/hn_ethdev.c
@@ -95,6 +95,16 @@ static const uint8_t rss_default_key[NDIS_HASH_KEYSIZE_TOEPLITZ] = {
 	0x06, 0x3c, 0x25, 0xf3,	0xfc, 0x1f, 0xdc, 0x2a,
 };
 
+static rte_spinlock_t netvsc_lock = RTE_SPINLOCK_INITIALIZER;
+struct da_cache {
+	LIST_ENTRY(da_cache) list;
+	char name[RTE_DEV_NAME_MAX_LEN];
+	char *drv_str;
+};
+
+LIST_HEAD(da_cache_list, da_cache) da_cache_list;
+static unsigned int da_cache_usage;
+
 static struct rte_eth_dev *
 eth_dev_vmbus_allocate(struct rte_vmbus_device *dev, size_t private_data_size)
 {
@@ -623,16 +633,21 @@ static void netvsc_hotplug_retry(void *args)
 		       RTE_DIM(eth_addr.addr_bytes));
 
 		if (rte_is_same_ether_addr(&eth_addr, dev->data->mac_addrs)) {
+			struct da_cache *cache;
+
+			LIST_FOREACH(cache, &da_cache_list, list) {
+				if (strcmp(cache->name, d->name) == 0)
+					break;
+			}
+
 			PMD_DRV_LOG(NOTICE,
-				    "Found matching MAC address, adding device %s network name %s",
-				    d->name, dir->d_name);
+				    "Found matching MAC address, adding device %s network name %s args %s",
+				    d->name, dir->d_name, cache ? cache->drv_str : "");
 
 			/* If this device has been hot removed from this
 			 * parent device, restore its args.
 			 */
-			ret = rte_eal_hotplug_add(d->bus->name, d->name,
-						  hv->vf_devargs ?
-						  hv->vf_devargs : "");
+			ret = rte_eal_hotplug_add(d->bus->name, d->name, cache ? cache->drv_str : "");
 			if (ret) {
 				PMD_DRV_LOG(ERR,
 					    "Failed to add PCI device %s",
@@ -1409,9 +1424,6 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
 	ret_stop = hn_dev_stop(eth_dev);
 	hn_dev_close(eth_dev);
 
-	free(hv->vf_devargs);
-	hv->vf_devargs = NULL;
-
 	hn_detach(hv);
 	hn_chim_uninit(eth_dev);
 	rte_vmbus_chan_close(hv->channels[0]);
@@ -1423,6 +1435,61 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
 	return ret_stop;
 }
 
+static int populate_cache_list(void)
+{
+	int ret;
+	struct rte_devargs *da;
+
+	rte_spinlock_lock(&netvsc_lock);
+	da_cache_usage++;
+	if (da_cache_usage > 1) {
+		ret = 0;
+		goto out;
+	}
+
+	LIST_INIT(&da_cache_list);
+	RTE_EAL_DEVARGS_FOREACH("pci", da) {
+		struct da_cache *cache;
+
+		cache = rte_zmalloc("NETVSC-HOTADD", sizeof(*cache), rte_mem_page_size());
+		if (!cache) {
+			ret = -ENOMEM;
+			goto out;
+		}
+
+		strncpy(cache->name, da->name, sizeof(da->name));
+		cache->drv_str = strdup(da->drv_str);
+		if (!cache->drv_str) {
+			rte_free(cache);
+			ret = -ENOMEM;
+			goto out;
+		}
+
+		LIST_INSERT_HEAD(&da_cache_list, cache, list);
+	}
+out:
+	rte_spinlock_unlock(&netvsc_lock);
+	return ret;
+}
+
+static void remove_cache_list(void)
+{
+	struct da_cache *cache;
+
+	rte_spinlock_lock(&netvsc_lock);
+	da_cache_usage--;
+	if (da_cache_usage)
+		goto out;
+
+	LIST_FOREACH(cache, &da_cache_list, list) {
+		LIST_REMOVE(cache, list);
+		free(cache->drv_str);
+		rte_free(cache);
+	}
+out:
+	rte_spinlock_unlock(&netvsc_lock);
+}
+
 static int eth_hn_probe(struct rte_vmbus_driver *drv __rte_unused,
 			struct rte_vmbus_device *dev)
 {
@@ -1431,24 +1498,35 @@ static int eth_hn_probe(struct rte_vmbus_driver *drv __rte_unused,
 
 	PMD_INIT_FUNC_TRACE();
 
+	ret = populate_cache_list();
+	if (ret)
+		return ret;
+
 	ret = rte_dev_event_monitor_start();
 	if (ret) {
 		PMD_DRV_LOG(ERR, "Failed to start device event monitoring");
-		return ret;
+		goto fail;
 	}
 
 	eth_dev = eth_dev_vmbus_allocate(dev, sizeof(struct hn_data));
-	if (!eth_dev)
-		return -ENOMEM;
+	if (!eth_dev) {
+		rte_dev_event_monitor_stop();
+		ret = -ENOMEM;
+		goto fail;
+	}
 
 	ret = eth_hn_dev_init(eth_dev);
 	if (ret) {
 		eth_dev_vmbus_release(eth_dev);
 		rte_dev_event_monitor_stop();
-	} else {
-		rte_eth_dev_probing_finish(eth_dev);
+		goto fail;
 	}
 
+	rte_eth_dev_probing_finish(eth_dev);
+	return 0;
+
+fail:
+	remove_cache_list();
 	return ret;
 }
 
@@ -1469,6 +1547,9 @@ static int eth_hn_remove(struct rte_vmbus_device *dev)
 
 	eth_dev_vmbus_release(eth_dev);
 	rte_dev_event_monitor_stop();
+
+	remove_cache_list();
+
 	return 0;
 }
 
diff --git a/drivers/net/netvsc/hn_var.h b/drivers/net/netvsc/hn_var.h
index 0f638bc5fd..d2263e03e8 100644
--- a/drivers/net/netvsc/hn_var.h
+++ b/drivers/net/netvsc/hn_var.h
@@ -184,7 +184,6 @@ struct hn_data {
 
 	rte_spinlock_t	hotadd_lock;
 	LIST_HEAD(hotadd_list, hv_hotadd_context) hotadd_list;
-	char		*vf_devargs;
 };
 
 static inline struct vmbus_channel *
diff --git a/drivers/net/netvsc/hn_vf.c b/drivers/net/netvsc/hn_vf.c
index 4ff766ec8b..d922e685f4 100644
--- a/drivers/net/netvsc/hn_vf.c
+++ b/drivers/net/netvsc/hn_vf.c
@@ -130,10 +130,6 @@ static void hn_remove_delayed(void *args)
 		PMD_DRV_LOG(ERR, "rte_eth_dev_stop failed port_id=%u ret=%d",
 			    port_id, ret);
 
-	/* Record the device parameters for possible hotplug events */
-	if (dev->devargs && dev->devargs->args)
-		hv->vf_devargs = strdup(dev->devargs->args);
-
 	ret = rte_eth_dev_close(port_id);
 	if (ret)
 		PMD_DRV_LOG(ERR, "rte_eth_dev_close failed port_id=%u ret=%d",
-- 
2.34.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/4] net/netvsc: cache device parameters for hot plug events
  2025-01-28  1:35 ` [PATCH 4/4] net/netvsc: cache device parameters for hot plug events longli
@ 2025-01-28 21:00   ` Stephen Hemminger
  2025-01-29  0:10     ` [EXTERNAL] " Long Li
  2025-01-28 21:01   ` Stephen Hemminger
  1 sibling, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2025-01-28 21:00 UTC (permalink / raw)
  To: longli; +Cc: Ferruh Yigit, Andrew Rybchenko, Wei Hu, dev, Long Li

On Mon, 27 Jan 2025 17:35:06 -0800
longli@linuxonhyperv.com wrote:

> @@ -1409,9 +1424,6 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
>  	ret_stop = hn_dev_stop(eth_dev);
>  	hn_dev_close(eth_dev);
>  
> -	free(hv->vf_devargs);
> -	hv->vf_devargs = NULL;
> -
>  	hn_detach(hv);
>  	hn_chim_uninit(eth_dev);
>  	rte_vmbus_chan_close(hv->channels[0]);
> @@ -1423,6 +1435,61 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
>  	return ret_stop;
>  }
>  
> +static int populate_cache_list(void)
> +{
> +	int ret;
> +	struct rte_devargs *da;
> +
> +	rte_spinlock_lock(&netvsc_lock);
> +	da_cache_usage++;
> +	if (da_cache_usage > 1) {
> +		ret = 0;
> +		goto out;
> +	}
> +
> +	LIST_INIT(&da_cache_list);
> +	RTE_EAL_DEVARGS_FOREACH("pci", da) {
> +		struct da_cache *cache;
> +
> +		cache = rte_zmalloc("NETVSC-HOTADD", sizeof(*cache), rte_mem_page_size());
> +		if (!cache) {
> +			ret = -ENOMEM;
> +			goto out;
> +		}
> +
> +		strncpy(cache->name, da->name, sizeof(da->name));
> +		cache->drv_str = strdup(da->drv_str);
> +		if (!cache->drv_str) {
> +			rte_free(cache);
> +			ret = -ENOMEM;
> +			goto out;
> +		}
> +
> +		LIST_INSERT_HEAD(&da_cache_list, cache, list);
> +	}

Why do you need to cache entry to be page aligned, that seems unnecessary wasteful?
Why does it need to be huge pages? versus normal malloc?
The string is coming from malloc (strdup) so it can't be used by secondary process.

Since you are allocating a devargs cache entry, you could just as well use
flexible array where entry was like:
struct da_cache {
	LIST_ENTRY(da_cache) list;
	char name[RTE_DEV_NAME_MAX_LEN];
	char drv_str[];
};


	cache = malloc(sizeof(*cache) + strlen(da->drv_str) + 1);
...
	strcpy(cache->drv_str, da->drv_str);

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/4] net/netvsc: cache device parameters for hot plug events
  2025-01-28  1:35 ` [PATCH 4/4] net/netvsc: cache device parameters for hot plug events longli
  2025-01-28 21:00   ` Stephen Hemminger
@ 2025-01-28 21:01   ` Stephen Hemminger
  1 sibling, 0 replies; 10+ messages in thread
From: Stephen Hemminger @ 2025-01-28 21:01 UTC (permalink / raw)
  To: longli; +Cc: Ferruh Yigit, Andrew Rybchenko, Wei Hu, dev, Long Li

On Mon, 27 Jan 2025 17:35:06 -0800
longli@linuxonhyperv.com wrote:

> From: Long Li <longli@microsoft.com>
> 
> If a device is hot removed and hot plugged, it needs the same driver
> parameters that are passed to EAL. However, during device removal, all
> EAL driver parameters are freed as part of the cleanup.
> 
> Cache those driver parameters for future hot plug events. Because we don't
> know which device will show up, cache all the PCI driver parameters.
> 
> Signed-off-by: Long Li <longli@microsoft.com>

*Build Failed #1:
OS: OpenAnolis8.9-64
Target: x86_64-native-linuxapp-gcc
FAILED: drivers/libtmp_rte_net_netvsc.a.p/net_netvsc_hn_ethdev.c.o 
gcc -Idrivers/libtmp_rte_net_netvsc.a.p -Idrivers -I../drivers -Idrivers/net/netvsc -I../drivers/net/netvsc -Ilib/ethdev -I../lib/ethdev -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -I../kernel/linux -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/net -I../lib/net -Ilib/mbuf -I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring -Ilib/meter -I../lib/meter -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vdev -I../drivers/bus/vdev -Idrivers/bus/vmbus -I../drivers/bus/vmbus -I../drivers/bus/vmbus/linux -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -O3 -include rte_config.h -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-packed-not-aligned -Wno-missing-field-initializers -D_GNU_SOURCE -fPIC -march=native -mrtm -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation -DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.netvsc -MD -MQ drivers/libtmp_rte_net_netvsc.a.p/net_netvsc_hn_ethdev.c.o -MF drivers/libtmp_rte_net_netvsc.a.p/net_netvsc_hn_ethdev.c.o.d -o drivers/libtmp_rte_net_netvsc.a.p/net_netvsc_hn_ethdev.c.o -c ../drivers/net/netvsc/hn_ethdev.c
../drivers/net/netvsc/hn_ethdev.c: In function ‘populate_cache_list’:
../drivers/net/netvsc/hn_ethdev.c:1460:40: error: argument to ‘sizeof’ in ‘strncpy’ call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
   strncpy(cache->name, da->name, sizeof(da->name));
                                        ^
../drivers/net/netvsc/hn_ethdev.c: In function ‘eth_hn_probe’:
../drivers/net/netvsc/hn_ethdev.c:1502:5: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  if (ret)
     ^

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [EXTERNAL] Re: [PATCH 4/4] net/netvsc: cache device parameters for hot plug events
  2025-01-28 21:00   ` Stephen Hemminger
@ 2025-01-29  0:10     ` Long Li
  2025-01-29  0:31       ` Stephen Hemminger
  2025-01-30  3:59       ` Stephen Hemminger
  0 siblings, 2 replies; 10+ messages in thread
From: Long Li @ 2025-01-29  0:10 UTC (permalink / raw)
  To: Stephen Hemminger, longli; +Cc: Ferruh Yigit, Andrew Rybchenko, Wei Hu, dev

> Subject: [EXTERNAL] Re: [PATCH 4/4] net/netvsc: cache device parameters for
> hot plug events
> 
> On Mon, 27 Jan 2025 17:35:06 -0800
> longli@linuxonhyperv.com wrote:
> 
> > @@ -1409,9 +1424,6 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
> >  	ret_stop = hn_dev_stop(eth_dev);
> >  	hn_dev_close(eth_dev);
> >
> > -	free(hv->vf_devargs);
> > -	hv->vf_devargs = NULL;
> > -
> >  	hn_detach(hv);
> >  	hn_chim_uninit(eth_dev);
> >  	rte_vmbus_chan_close(hv->channels[0]);
> > @@ -1423,6 +1435,61 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
> >  	return ret_stop;
> >  }
> >
> > +static int populate_cache_list(void)
> > +{
> > +	int ret;
> > +	struct rte_devargs *da;
> > +
> > +	rte_spinlock_lock(&netvsc_lock);
> > +	da_cache_usage++;
> > +	if (da_cache_usage > 1) {
> > +		ret = 0;
> > +		goto out;
> > +	}
> > +
> > +	LIST_INIT(&da_cache_list);
> > +	RTE_EAL_DEVARGS_FOREACH("pci", da) {
> > +		struct da_cache *cache;
> > +
> > +		cache = rte_zmalloc("NETVSC-HOTADD", sizeof(*cache),
> rte_mem_page_size());
> > +		if (!cache) {
> > +			ret = -ENOMEM;
> > +			goto out;
> > +		}
> > +
> > +		strncpy(cache->name, da->name, sizeof(da->name));
> > +		cache->drv_str = strdup(da->drv_str);
> > +		if (!cache->drv_str) {
> > +			rte_free(cache);
> > +			ret = -ENOMEM;
> > +			goto out;
> > +		}
> > +
> > +		LIST_INSERT_HEAD(&da_cache_list, cache, list);
> > +	}
> 
> Why do you need to cache entry to be page aligned, that seems unnecessary
> wasteful?

You are right it doesn't need to. I'm removing the alignment.

> Why does it need to be huge pages? versus normal malloc?

I thought it would make it easier to debug memory leaks through EAL. But normal malloc is fine because this data is not shared between primary/secondary.

> The string is coming from malloc (strdup) so it can't be used by secondary process.
> 
> Since you are allocating a devargs cache entry, you could just as well use flexible
> array where entry was like:
> struct da_cache {
> 	LIST_ENTRY(da_cache) list;
> 	char name[RTE_DEV_NAME_MAX_LEN];
> 	char drv_str[];
> };

This is difficult as I don't know how big it is for drv_str[].

> 
> 
> 	cache = malloc(sizeof(*cache) + strlen(da->drv_str) + 1); ...
> 	strcpy(cache->drv_str, da->drv_str);

Another approach is to modify EAL to never delete driver arguments when a device is removed. i.e., It doesn't call rte_devargs_remove() on device removal, instead keep those devargs for the lifetime of the process. Do you think this is a better approach? This will save work if other drivers want to cache devargs list for device hot plug events.

Thanks,
Long

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [EXTERNAL] Re: [PATCH 4/4] net/netvsc: cache device parameters for hot plug events
  2025-01-29  0:10     ` [EXTERNAL] " Long Li
@ 2025-01-29  0:31       ` Stephen Hemminger
  2025-01-29  0:54         ` Long Li
  2025-01-30  3:59       ` Stephen Hemminger
  1 sibling, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2025-01-29  0:31 UTC (permalink / raw)
  To: Long Li; +Cc: longli, Ferruh Yigit, Andrew Rybchenko, Wei Hu, dev

On Wed, 29 Jan 2025 00:10:12 +0000
Long Li <longli@microsoft.com> wrote:

> > Subject: [EXTERNAL] Re: [PATCH 4/4] net/netvsc: cache device parameters for
> > hot plug events
> > 
> > On Mon, 27 Jan 2025 17:35:06 -0800
> > longli@linuxonhyperv.com wrote:
> >   
> > > @@ -1409,9 +1424,6 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
> > >  	ret_stop = hn_dev_stop(eth_dev);
> > >  	hn_dev_close(eth_dev);
> > >
> > > -	free(hv->vf_devargs);
> > > -	hv->vf_devargs = NULL;
> > > -
> > >  	hn_detach(hv);
> > >  	hn_chim_uninit(eth_dev);
> > >  	rte_vmbus_chan_close(hv->channels[0]);
> > > @@ -1423,6 +1435,61 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
> > >  	return ret_stop;
> > >  }
> > >
> > > +static int populate_cache_list(void)
> > > +{
> > > +	int ret;
> > > +	struct rte_devargs *da;
> > > +
> > > +	rte_spinlock_lock(&netvsc_lock);
> > > +	da_cache_usage++;
> > > +	if (da_cache_usage > 1) {
> > > +		ret = 0;
> > > +		goto out;
> > > +	}
> > > +
> > > +	LIST_INIT(&da_cache_list);
> > > +	RTE_EAL_DEVARGS_FOREACH("pci", da) {
> > > +		struct da_cache *cache;
> > > +
> > > +		cache = rte_zmalloc("NETVSC-HOTADD", sizeof(*cache),  
> > rte_mem_page_size());  
> > > +		if (!cache) {
> > > +			ret = -ENOMEM;
> > > +			goto out;
> > > +		}
> > > +
> > > +		strncpy(cache->name, da->name, sizeof(da->name));
> > > +		cache->drv_str = strdup(da->drv_str);
> > > +		if (!cache->drv_str) {
> > > +			rte_free(cache);
> > > +			ret = -ENOMEM;
> > > +			goto out;
> > > +		}
> > > +
> > > +		LIST_INSERT_HEAD(&da_cache_list, cache, list);
> > > +	}  
> > 
> > Why do you need to cache entry to be page aligned, that seems unnecessary
> > wasteful?  
> 
> You are right it doesn't need to. I'm removing the alignment.
> 
> > Why does it need to be huge pages? versus normal malloc?  
> 
> I thought it would make it easier to debug memory leaks through EAL. But normal malloc is fine because this data is not shared between primary/secondary.
> 
> > The string is coming from malloc (strdup) so it can't be used by secondary process.
> > 
> > Since you are allocating a devargs cache entry, you could just as well use flexible
> > array where entry was like:
> > struct da_cache {
> > 	LIST_ENTRY(da_cache) list;
> > 	char name[RTE_DEV_NAME_MAX_LEN];
> > 	char drv_str[];
> > };  
> 
> This is difficult as I don't know how big it is for drv_str[].

That is why the malloc adds extra space for drv_str, see below.
Flexible array allow one array at end of structure 
> > 
> > 
> > 	cache = malloc(sizeof(*cache) + strlen(da->drv_str) + 1); ...
> > 	strcpy(cache->drv_str, da->drv_str);  
> 
> Another approach is to modify EAL to never delete driver arguments when a device is removed. i.e., It doesn't call rte_devargs_remove() on device removal, instead keep those devargs for the lifetime of the process. Do you think this is a better approach? This will save work if other drivers want to cache devargs list for device hot plug events.

Not sure, right now I don't think it is possible to pre-add devargs for future hot plug


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [EXTERNAL] Re: [PATCH 4/4] net/netvsc: cache device parameters for hot plug events
  2025-01-29  0:31       ` Stephen Hemminger
@ 2025-01-29  0:54         ` Long Li
  0 siblings, 0 replies; 10+ messages in thread
From: Long Li @ 2025-01-29  0:54 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: longli, Ferruh Yigit, Andrew Rybchenko, Wei Hu, dev



> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, January 28, 2025 4:32 PM
> To: Long Li <longli@microsoft.com>
> Cc: longli@linuxonhyperv.com; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew
> Rybchenko <andrew.rybchenko@oktetlabs.ru>; Wei Hu <weh@microsoft.com>;
> dev@dpdk.org
> Subject: Re: [EXTERNAL] Re: [PATCH 4/4] net/netvsc: cache device parameters
> for hot plug events
> 
> On Wed, 29 Jan 2025 00:10:12 +0000
> Long Li <longli@microsoft.com> wrote:
> 
> > > Subject: [EXTERNAL] Re: [PATCH 4/4] net/netvsc: cache device
> > > parameters for hot plug events
> > >
> > > On Mon, 27 Jan 2025 17:35:06 -0800
> > > longli@linuxonhyperv.com wrote:
> > >
> > > > @@ -1409,9 +1424,6 @@ eth_hn_dev_uninit(struct rte_eth_dev *eth_dev)
> > > >  	ret_stop = hn_dev_stop(eth_dev);
> > > >  	hn_dev_close(eth_dev);
> > > >
> > > > -	free(hv->vf_devargs);
> > > > -	hv->vf_devargs = NULL;
> > > > -
> > > >  	hn_detach(hv);
> > > >  	hn_chim_uninit(eth_dev);
> > > >  	rte_vmbus_chan_close(hv->channels[0]);
> > > > @@ -1423,6 +1435,61 @@ eth_hn_dev_uninit(struct rte_eth_dev
> *eth_dev)
> > > >  	return ret_stop;
> > > >  }
> > > >
> > > > +static int populate_cache_list(void) {
> > > > +	int ret;
> > > > +	struct rte_devargs *da;
> > > > +
> > > > +	rte_spinlock_lock(&netvsc_lock);
> > > > +	da_cache_usage++;
> > > > +	if (da_cache_usage > 1) {
> > > > +		ret = 0;
> > > > +		goto out;
> > > > +	}
> > > > +
> > > > +	LIST_INIT(&da_cache_list);
> > > > +	RTE_EAL_DEVARGS_FOREACH("pci", da) {
> > > > +		struct da_cache *cache;
> > > > +
> > > > +		cache = rte_zmalloc("NETVSC-HOTADD", sizeof(*cache),
> > > rte_mem_page_size());
> > > > +		if (!cache) {
> > > > +			ret = -ENOMEM;
> > > > +			goto out;
> > > > +		}
> > > > +
> > > > +		strncpy(cache->name, da->name, sizeof(da->name));
> > > > +		cache->drv_str = strdup(da->drv_str);
> > > > +		if (!cache->drv_str) {
> > > > +			rte_free(cache);
> > > > +			ret = -ENOMEM;
> > > > +			goto out;
> > > > +		}
> > > > +
> > > > +		LIST_INSERT_HEAD(&da_cache_list, cache, list);
> > > > +	}
> > >
> > > Why do you need to cache entry to be page aligned, that seems
> > > unnecessary wasteful?
> >
> > You are right it doesn't need to. I'm removing the alignment.
> >
> > > Why does it need to be huge pages? versus normal malloc?
> >
> > I thought it would make it easier to debug memory leaks through EAL. But
> normal malloc is fine because this data is not shared between primary/secondary.
> >
> > > The string is coming from malloc (strdup) so it can't be used by secondary
> process.
> > >
> > > Since you are allocating a devargs cache entry, you could just as
> > > well use flexible array where entry was like:
> > > struct da_cache {
> > > 	LIST_ENTRY(da_cache) list;
> > > 	char name[RTE_DEV_NAME_MAX_LEN];
> > > 	char drv_str[];
> > > };
> >
> > This is difficult as I don't know how big it is for drv_str[].
> 
> That is why the malloc adds extra space for drv_str, see below.
> Flexible array allow one array at end of structure
> > >
> > >
> > > 	cache = malloc(sizeof(*cache) + strlen(da->drv_str) + 1); ...
> > > 	strcpy(cache->drv_str, da->drv_str);
> >

Thank you, got it.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [EXTERNAL] Re: [PATCH 4/4] net/netvsc: cache device parameters for hot plug events
  2025-01-29  0:10     ` [EXTERNAL] " Long Li
  2025-01-29  0:31       ` Stephen Hemminger
@ 2025-01-30  3:59       ` Stephen Hemminger
  1 sibling, 0 replies; 10+ messages in thread
From: Stephen Hemminger @ 2025-01-30  3:59 UTC (permalink / raw)
  To: Long Li; +Cc: longli, Ferruh Yigit, Andrew Rybchenko, Wei Hu, dev

On Wed, 29 Jan 2025 00:10:12 +0000
Long Li <longli@microsoft.com> wrote:

> Another approach is to modify EAL to never delete driver arguments when a device is removed. i.e., It doesn't call rte_devargs_remove() on device removal, instead keep those devargs for the lifetime of the process. Do you think this is a better approach? This will save work if other drivers want to cache devargs list for device hot plug events.

Agree, that having devargs be smart enough to maintain the list for future hotplug is sensible.
Do other drivers that support hotplug have the same issue, or does no one ever use devargs with things like failsafe.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-01-30  3:59 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-01-28  1:35 [PATCH 1/4] net/netvsc: scan all net devices under the PCI device longli
2025-01-28  1:35 ` [PATCH 2/4] net/netvsc: remove RTE device if all its net devices are removed longli
2025-01-28  1:35 ` [PATCH 3/4] net/netvsc: log error on failure to switch data path longli
2025-01-28  1:35 ` [PATCH 4/4] net/netvsc: cache device parameters for hot plug events longli
2025-01-28 21:00   ` Stephen Hemminger
2025-01-29  0:10     ` [EXTERNAL] " Long Li
2025-01-29  0:31       ` Stephen Hemminger
2025-01-29  0:54         ` Long Li
2025-01-30  3:59       ` Stephen Hemminger
2025-01-28 21:01   ` Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).