DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH 00/10] kni: Interface detach and link status fixes.
@ 2018-06-28 22:45 Dan Gora
  2018-06-29  1:54 ` Dan Gora
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-06-28 22:45 UTC (permalink / raw)
  Cc: dev, Dan Gora

Hi All,

The following patches are to fix a problem with detaching a KNI
interface using rte_kni_release and to add a new API function
to allow users to change the link status (up/down, speed, etc)
of the interface in the linux kernel.  In previous versions, it
was impossible to release a KNI interface without waiting for the
timeout in kni_net_process_request to expire if the interface is in
the UP state.  The solution to this issue was to separate the process
of releasing the netdev device from the linux kernel from actually
freeing the KNI interface in the kernel and in the RTE library by
introducing a new API function, rte_kni_free().

The last patch : 'kni: add API to set link status on kernel interface'
adds a new API function to allow the DPDK user to change the link
speed and status reported by the netdev in the linux kernel.

This resolves issues with allowing automatic network configuration
applciations such as NetworkManager to assign addresses and for user
space applications to be able to open sockets on these interfaces,
as some operations rely on the link status being up before they
work properly.

This last patch is included in this series because both new "features"
introduce new ioctls to the rte_kni kernel module, so the order in
which the patches get applied affects which number each new ioctl gets.
I thought it better to bundle them together to try to get them applied
as a series to avoid any issues with this.

Please have a look!

thanks
dan

Dan Gora (10):
  kni: remove unused variables from struct kni_dev
  kni: separate releasing netdev from freeing KNI interface
  kni: don't touch struct kni_dev after freeing
  kni: add rte_kni_free to KNI library
  kni: don't run rte_kni_handle_request after interface release
  kni: increase length of timeout for KNI responses
  kni: update kni test for rte_kni_free
  kni: add rte_kni_free to KNI example app
  kni: add rte_kni_free to KNI vdev driver
  kni: add API to set link status on kernel interface

 drivers/net/kni/rte_eth_kni.c                 |   6 +-
 examples/kni/main.c                           |   4 +-
 kernel/linux/kni/kni_dev.h                    |   5 +-
 kernel/linux/kni/kni_misc.c                   | 156 ++++++++++++++++--
 kernel/linux/kni/kni_net.c                    |   7 +-
 .../eal/include/exec-env/rte_kni_common.h     |  20 +++
 lib/librte_kni/rte_kni.c                      |  75 ++++++++-
 lib/librte_kni/rte_kni.h                      |  51 +++++-
 test/test/test_kni.c                          |  20 +++
 9 files changed, 313 insertions(+), 31 deletions(-)

-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH 00/10] kni: Interface detach and link status fixes.
  2018-06-28 22:45 [dpdk-dev] [PATCH 00/10] kni: Interface detach and link status fixes Dan Gora
@ 2018-06-29  1:54 ` Dan Gora
  2018-06-29  1:54   ` [dpdk-dev] [PATCH v2 01/10] kni: remove unused variables from struct kni_dev Dan Gora
                     ` (10 more replies)
  0 siblings, 11 replies; 45+ messages in thread
From: Dan Gora @ 2018-06-29  1:54 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, Dan Gora

Hi All,

The following patches are to fix a problem with detaching a KNI
interface using rte_kni_release and to add a new API function
to allow users to change the link status (up/down, speed, etc)
of the interface in the linux kernel.  In previous versions, it
was impossible to release a KNI interface without waiting for the
timeout in kni_net_process_request to expire if the interface is in
the UP state.  The solution to this issue was to separate the process
of releasing the netdev device from the linux kernel from actually
freeing the KNI interface in the kernel and in the RTE library by
introducing a new API function, rte_kni_free().

The last patch : 'kni: add API to set link status on kernel interface'
adds a new API function to allow the DPDK user to change the link
speed and status reported by the netdev in the linux kernel.

This resolves issues with allowing automatic network configuration
applciations such as NetworkManager to assign addresses and for user
space applications to be able to open sockets on these interfaces,
as some operations rely on the link status being up before they
work properly.

This last patch is included in this series because both new "features"
introduce new ioctls to the rte_kni kernel module, so the order in
which the patches get applied affects which number each new ioctl gets.
I thought it better to bundle them together to try to get them applied
as a series to avoid any issues with this.

*v2*

The first time I submitted these patches, they failed with
compilation errors in Patchwork, but they are failing due to
rte_kni_free being a missing symbol, which is only introduced here in
patch 4/10.  I'm not really sure how the automatically compilation
tool works in patchwork.  Does it apply all of the patches in the
series before compilation?  Does it compile after each patch?

I tried my best to break up these patches into small enough pieces
so that they could be reviewed fairly easily, but I'm not sure that
I did a good job.  If there are suggestions on how the changes could
be organized or split better, please let me know.

thanks
dan

v2:
  Re-submitted as a threaded series as per thomas@monjalon.net suggestions.
  cc'd Ferruh on cover letter.
  No code changes.

Dan Gora (10):
  kni: remove unused variables from struct kni_dev
  kni: separate releasing netdev from freeing KNI interface
  kni: don't touch struct kni_dev after freeing
  kni: add rte_kni_free to KNI library
  kni: don't run rte_kni_handle_request after interface release
  kni: increase length of timeout for KNI responses
  kni: update kni test for rte_kni_free
  kni: add rte_kni_free to KNI example app
  kni: add rte_kni_free to KNI vdev driver
  kni: add API to set link status on kernel interface

 drivers/net/kni/rte_eth_kni.c                 |   6 +-
 examples/kni/main.c                           |   4 +-
 kernel/linux/kni/kni_dev.h                    |   5 +-
 kernel/linux/kni/kni_misc.c                   | 156 ++++++++++++++++--
 kernel/linux/kni/kni_net.c                    |   7 +-
 .../eal/include/exec-env/rte_kni_common.h     |  20 +++
 lib/librte_kni/rte_kni.c                      |  75 ++++++++-
 lib/librte_kni/rte_kni.h                      |  51 +++++-
 test/test/test_kni.c                          |  20 +++
 9 files changed, 313 insertions(+), 31 deletions(-)

-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 01/10] kni: remove unused variables from struct kni_dev
  2018-06-29  1:54 ` Dan Gora
@ 2018-06-29  1:54   ` Dan Gora
  2018-08-29 10:29     ` Ferruh Yigit
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface Dan Gora
                     ` (9 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-06-29  1:54 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, Dan Gora

Remove the unused fields 'status' and 'synchro' from the struct
kni_dev.

Signed-off-by: Dan Gora <dg@adax.com>
---
 kernel/linux/kni/kni_dev.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/kernel/linux/kni/kni_dev.h b/kernel/linux/kni/kni_dev.h
index 6275ef27f..ed42989cc 100644
--- a/kernel/linux/kni/kni_dev.h
+++ b/kernel/linux/kni/kni_dev.h
@@ -37,7 +37,6 @@ struct kni_dev {
 	struct list_head list;
 
 	struct net_device_stats stats;
-	int status;
 	uint16_t group_id;           /* Group ID of a group of KNI devices */
 	uint32_t core_id;            /* Core ID to bind */
 	char name[RTE_KNI_NAMESIZE]; /* Network device name */
@@ -82,9 +81,6 @@ struct kni_dev {
 	/* mbuf size */
 	uint32_t mbuf_size;
 
-	/* synchro for request processing */
-	unsigned long synchro;
-
 	/* buffers */
 	void *pa[MBUF_BURST_SZ];
 	void *va[MBUF_BURST_SZ];
-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface
  2018-06-29  1:54 ` Dan Gora
  2018-06-29  1:54   ` [dpdk-dev] [PATCH v2 01/10] kni: remove unused variables from struct kni_dev Dan Gora
@ 2018-06-29  1:55   ` Dan Gora
  2018-08-29 10:59     ` Ferruh Yigit
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 03/10] kni: don't touch struct kni_dev after freeing Dan Gora
                     ` (8 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-06-29  1:55 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, Dan Gora

Currently the rte_kni kernel driver suffers from a problem where
when the interface is released, it generates a callback to the DPDK
application to change the interface state to Down.  However, after the
DPDK application handles the callback and generates a response back to
the kernel, the rte_kni driver cannot wake the thread which is asleep
waiting for the response, because it is holding the kni_link_lock
semaphore and it has already removed the 'struct kni_dev' from the
list of interfaces to poll for responses.

This means that if the KNI interface is in the Up state when
rte_kni_release() is called, it will always sleep for three seconds
until kni_net_release gives up waiting for a response from the DPDK
application.

To fix this, we must separate the step to release the kernel network
interface from the steps to remove the KNI interface from the list
of interfaces to poll.

When the kernel network interface is removed with unregister_netdev(),
if the interface is up, it will generate a callback to mark the
interface down, which calls kni_net_release().  kni_net_release() will
block waiting for the DPDK application to call rte_kni_handle_request()
to handle the callback, but it also needs the thread in the KNI driver
(either the per-dev thread for multi-thread or the per-driver thread)
to call kni_net_poll_resp() in order to wake the thread sleeping in
kni_net_release (actually kni_net_process_request()).

So now, KNI interfaces should be removed as such:

1) The user calls rte_kni_release().  This only unregisters the
netdev in the kernel, but touches nothing else.  This allows all the
threads to run which are necessary to handle the callback into the
DPDK application to mark the interface down.

2) The user stops the thread running rte_kni_handle_request().
After rte_kni_release() has been called, there will be no more
callbacks for that interface so it is not necessary.  It cannot be
running at the same time that rte_kni_free() frees all of the FIFOs
and DPDK memory for that KNI interface.

3) The user calls rte_kni_free().  This performs the RTE_KNI_IOCTL_FREE
ioctl which calls kni_ioctl_free().  This function removes the struct
kni_dev from the list of interfaces to poll (and kills the per-dev
kthread, if configured for multi-thread), then frees the memory in
the FIFOs.

Signed-off-by: Dan Gora <dg@adax.com>
---
 kernel/linux/kni/kni_misc.c                   | 69 +++++++++++++++----
 .../eal/include/exec-env/rte_kni_common.h     |  1 +
 2 files changed, 58 insertions(+), 12 deletions(-)

diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
index fa69f8e63..d889ffc4b 100644
--- a/kernel/linux/kni/kni_misc.c
+++ b/kernel/linux/kni/kni_misc.c
@@ -192,8 +192,6 @@ kni_dev_remove(struct kni_dev *dev)
 		free_netdev(dev->net_dev);
 	}
 
-	kni_net_release_fifo_phy(dev);
-
 	return 0;
 }
 
@@ -224,6 +222,7 @@ kni_release(struct inode *inode, struct file *file)
 		}
 
 		kni_dev_remove(dev);
+		kni_net_release_fifo_phy(dev);
 		list_del(&dev->list);
 	}
 	up_write(&knet->kni_list_lock);
@@ -263,7 +262,6 @@ kni_run_thread(struct kni_net *knet, struct kni_dev *kni, uint8_t force_bind)
 		kni->pthread = kthread_create(kni_thread_multiple,
 			(void *)kni, "kni_%s", kni->name);
 		if (IS_ERR(kni->pthread)) {
-			kni_dev_remove(kni);
 			return -ECANCELED;
 		}
 
@@ -278,7 +276,6 @@ kni_run_thread(struct kni_net *knet, struct kni_dev *kni, uint8_t force_bind)
 				(void *)knet, "kni_single");
 			if (IS_ERR(knet->kni_kthread)) {
 				mutex_unlock(&knet->kni_kthread_lock);
-				kni_dev_remove(kni);
 				return -ECANCELED;
 			}
 
@@ -460,15 +457,17 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 	if (ret) {
 		pr_err("error %i registering device \"%s\"\n",
 					ret, dev_info.name);
-		kni->net_dev = NULL;
-		kni_dev_remove(kni);
+		kni_net_release_fifo_phy(kni);
 		free_netdev(net_dev);
 		return -ENODEV;
 	}
 
 	ret = kni_run_thread(knet, kni, dev_info.force_bind);
-	if (ret != 0)
+	if (ret != 0) {
+		kni_dev_remove(kni);
+		kni_net_release_fifo_phy(kni);
 		return ret;
+	}
 
 	down_write(&knet->kni_list_lock);
 	list_add(&kni->list, &knet->kni_list_head);
@@ -495,6 +494,46 @@ kni_ioctl_release(struct net *net, uint32_t ioctl_num,
 		return -EIO;
 	}
 
+	/* Release the network device according to its name */
+	if (strlen(dev_info.name) == 0)
+		return ret;
+
+	down_read(&knet->kni_list_lock);
+	list_for_each_entry_safe(dev, n, &knet->kni_list_head, list) {
+		if (strncmp(dev->name, dev_info.name, RTE_KNI_NAMESIZE) != 0)
+			continue;
+
+		kni_dev_remove(dev);
+		up_read(&knet->kni_list_lock);
+		pr_info("Successfully released kni interface: %s\n",
+			dev_info.name);
+		return 0;
+	}
+	up_read(&knet->kni_list_lock);
+	pr_info("Error: failed to find kni interface: %s\n",
+		dev_info.name);
+
+	return ret;
+}
+
+static int
+kni_ioctl_free(struct net *net, uint32_t ioctl_num,
+		unsigned long ioctl_param)
+{
+	struct kni_net *knet = net_generic(net, kni_net_id);
+	int ret = -EINVAL;
+	struct kni_dev *dev, *n;
+	struct rte_kni_device_info dev_info;
+
+	if (_IOC_SIZE(ioctl_num) > sizeof(dev_info))
+		return -EINVAL;
+
+	ret = copy_from_user(&dev_info, (void *)ioctl_param, sizeof(dev_info));
+	if (ret) {
+		pr_err("copy_from_user in kni_ioctl_release");
+		return -EIO;
+	}
+
 	/* Release the network device according to its name */
 	if (strlen(dev_info.name) == 0)
 		return ret;
@@ -509,14 +548,17 @@ kni_ioctl_release(struct net *net, uint32_t ioctl_num,
 			dev->pthread = NULL;
 		}
 
-		kni_dev_remove(dev);
+		kni_net_release_fifo_phy(dev);
 		list_del(&dev->list);
-		ret = 0;
-		break;
+		up_write(&knet->kni_list_lock);
+		pr_info("Successfully freed kni interface: %s\n",
+			dev_info.name);
+		return 0;
 	}
 	up_write(&knet->kni_list_lock);
-	pr_info("%s release kni named %s\n",
-		(ret == 0 ? "Successfully" : "Unsuccessfully"), dev_info.name);
+
+	pr_info("Error: failed to find kni interface: %s\n",
+		dev_info.name);
 
 	return ret;
 }
@@ -542,6 +584,9 @@ kni_ioctl(struct inode *inode, uint32_t ioctl_num, unsigned long ioctl_param)
 	case _IOC_NR(RTE_KNI_IOCTL_RELEASE):
 		ret = kni_ioctl_release(net, ioctl_num, ioctl_param);
 		break;
+	case _IOC_NR(RTE_KNI_IOCTL_FREE):
+		ret = kni_ioctl_free(net, ioctl_num, ioctl_param);
+		break;
 	default:
 		pr_debug("IOCTL default\n");
 		break;
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index cfa9448bd..318a3f939 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -129,5 +129,6 @@ struct rte_kni_device_info {
 #define RTE_KNI_IOCTL_TEST    _IOWR(0, 1, int)
 #define RTE_KNI_IOCTL_CREATE  _IOWR(0, 2, struct rte_kni_device_info)
 #define RTE_KNI_IOCTL_RELEASE _IOWR(0, 3, struct rte_kni_device_info)
+#define RTE_KNI_IOCTL_FREE    _IOWR(0, 4, struct rte_kni_device_info)
 
 #endif /* _RTE_KNI_COMMON_H_ */
-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 03/10] kni: don't touch struct kni_dev after freeing
  2018-06-29  1:54 ` Dan Gora
  2018-06-29  1:54   ` [dpdk-dev] [PATCH v2 01/10] kni: remove unused variables from struct kni_dev Dan Gora
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface Dan Gora
@ 2018-06-29  1:55   ` Dan Gora
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 04/10] kni: add rte_kni_free to KNI library Dan Gora
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Dan Gora @ 2018-06-29  1:55 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, Dan Gora

Since the struct kni_dev is allocated as part of the netdev with
alloc_netdev, when free_netdev is called, this also frees the struct
kni_dev embedded in it.

This means that we cannot touch struct kni_dev after calling
free_netdev since that memory was already deallocated.

Separate unregistering the net_dev with unregister_netdev from freeing
the net_device and kni_dev structures into separate functions and
ensure that we do not touch anything in the kni_dev structure after
freeing it (ie don't call list_del(), etc...).

Signed-off-by: Dan Gora <dg@adax.com>
---
 kernel/linux/kni/kni_dev.h  |  1 +
 kernel/linux/kni/kni_misc.c | 28 +++++++++++++++++++++++++---
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/kernel/linux/kni/kni_dev.h b/kernel/linux/kni/kni_dev.h
index ed42989cc..f9aa90ff9 100644
--- a/kernel/linux/kni/kni_dev.h
+++ b/kernel/linux/kni/kni_dev.h
@@ -37,6 +37,7 @@ struct kni_dev {
 	struct list_head list;
 
 	struct net_device_stats stats;
+	uint16_t registered;         /* 0 if already released, 1 otherwise */
 	uint16_t group_id;           /* Group ID of a group of KNI devices */
 	uint32_t core_id;            /* Core ID to bind */
 	char name[RTE_KNI_NAMESIZE]; /* Network device name */
diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
index d889ffc4b..1c38cfa1a 100644
--- a/kernel/linux/kni/kni_misc.c
+++ b/kernel/linux/kni/kni_misc.c
@@ -184,17 +184,34 @@ kni_dev_remove(struct kni_dev *dev)
 			ixgbe_kni_remove(dev->pci_dev);
 		else if (pci_match_id(igb_pci_tbl, dev->pci_dev))
 			igb_kni_remove(dev->pci_dev);
+		dev->pci_dev = NULL;
 	}
 #endif
 
-	if (dev->net_dev) {
+	if (dev->registered) {
 		unregister_netdev(dev->net_dev);
-		free_netdev(dev->net_dev);
+		dev->registered = 0;
 	}
 
 	return 0;
 }
 
+static void
+kni_dev_free(struct kni_dev *dev)
+{
+	struct net_device *net_dev;
+	/*
+	 * Remember that struct kni_dev is part of the netdev
+	 * structure, so we free *both* with free_netdev.
+	 */
+	if (dev == NULL)
+		return;
+	net_dev = dev->net_dev;
+	dev->net_dev = NULL;
+	if (net_dev)
+		free_netdev(net_dev);
+}
+
 static int
 kni_release(struct inode *inode, struct file *file)
 {
@@ -224,6 +241,7 @@ kni_release(struct inode *inode, struct file *file)
 		kni_dev_remove(dev);
 		kni_net_release_fifo_phy(dev);
 		list_del(&dev->list);
+		kni_dev_free(dev);
 	}
 	up_write(&knet->kni_list_lock);
 
@@ -457,15 +475,18 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
 	if (ret) {
 		pr_err("error %i registering device \"%s\"\n",
 					ret, dev_info.name);
+		kni_dev_remove(kni);
 		kni_net_release_fifo_phy(kni);
-		free_netdev(net_dev);
+		kni_dev_free(kni);
 		return -ENODEV;
 	}
+	kni->registered = 1;
 
 	ret = kni_run_thread(knet, kni, dev_info.force_bind);
 	if (ret != 0) {
 		kni_dev_remove(kni);
 		kni_net_release_fifo_phy(kni);
+		kni_dev_free(kni);
 		return ret;
 	}
 
@@ -550,6 +571,7 @@ kni_ioctl_free(struct net *net, uint32_t ioctl_num,
 
 		kni_net_release_fifo_phy(dev);
 		list_del(&dev->list);
+		kni_dev_free(dev);
 		up_write(&knet->kni_list_lock);
 		pr_info("Successfully freed kni interface: %s\n",
 			dev_info.name);
-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 04/10] kni: add rte_kni_free to KNI library
  2018-06-29  1:54 ` Dan Gora
                     ` (2 preceding siblings ...)
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 03/10] kni: don't touch struct kni_dev after freeing Dan Gora
@ 2018-06-29  1:55   ` Dan Gora
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 05/10] kni: don't run rte_kni_handle_request after interface release Dan Gora
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Dan Gora @ 2018-06-29  1:55 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, Dan Gora

Add the new rte_kni_free() API function to the KNI library.

This function will be called by DPDK applications after
rte_kni_release() to free the KNI interface resources from the
kernel driver.

Signed-off-by: Dan Gora <dg@adax.com>
---
 lib/librte_kni/rte_kni.c | 32 +++++++++++++++++++++++++++++---
 lib/librte_kni/rte_kni.h | 31 +++++++++++++++++++++++++++----
 2 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 8a8f6c1cc..1d84c0b70 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -463,8 +463,6 @@ int
 rte_kni_release(struct rte_kni *kni)
 {
 	struct rte_kni_device_info dev_info;
-	uint32_t slot_id;
-	uint32_t retry = 5;
 
 	if (!kni || !kni->in_use)
 		return -1;
@@ -475,6 +473,34 @@ rte_kni_release(struct rte_kni *kni)
 		return -1;
 	}
 
+	kni->in_use = 0;
+	return 0;
+}
+
+int
+rte_kni_free(struct rte_kni *kni)
+{
+	uint32_t slot_id;
+	uint32_t retry = 5;
+	struct rte_kni_device_info dev_info;
+
+	if (!kni)
+		return -EINVAL;
+
+	/* Must call rte_kni_release() first */
+	if (kni->in_use)
+		return -EBUSY;
+
+	/*
+	 * Free the FIFOs in the kernel and remove it from the list
+	 * of devices to poll
+	 */
+	snprintf(dev_info.name, sizeof(dev_info.name), "%s", kni->name);
+	if (ioctl(kni_fd, RTE_KNI_IOCTL_FREE, &dev_info) < 0) {
+		RTE_LOG(ERR, KNI, "Fail to release kni device\n");
+		return -1;
+	}
+
 	/* mbufs in all fifo should be released, except request/response */
 
 	/* wait until all rxq packets processed by kernel */
@@ -497,7 +523,7 @@ rte_kni_release(struct rte_kni *kni)
 	if (slot_id > kni_memzone_pool.max_ifaces) {
 		RTE_LOG(ERR, KNI, "KNI pool: corrupted slot ID: %d, max: %d\n",
 			slot_id, kni_memzone_pool.max_ifaces);
-		return -1;
+		return -EINVAL;
 	}
 	kni_memzone_pool_release(&kni_memzone_pool.slots[slot_id]);
 
diff --git a/lib/librte_kni/rte_kni.h b/lib/librte_kni/rte_kni.h
index 99055e2c2..d1a95f898 100644
--- a/lib/librte_kni/rte_kni.h
+++ b/lib/librte_kni/rte_kni.h
@@ -112,11 +112,17 @@ struct rte_kni *rte_kni_alloc(struct rte_mempool *pktmbuf_pool,
 		const struct rte_kni_conf *conf, struct rte_kni_ops *ops);
 
 /**
- * Release KNI interface according to the context. It will also release the
- * paired KNI interface in kernel space. All processing on the specific KNI
- * context need to be stopped before calling this interface.
+ * Release specified KNI interface. This will stop data transfer to and from
+ * this interface and will remove the paired KNI interface in kernel space.
  *
- * rte_kni_release is thread safe.
+ * @note This function will trigger the kernel to remove the interface, which
+ * may trigger the RTE_KNI_REQ_CFG_NETWORK_IF KNI callback. This function will
+ * block until this callback is handled or times out. The user should ensure
+ * that rte_kni_handle_request() is called for this interface in a separate
+ * thread to handle this callback to avoid this delay.
+ *
+ * rte_kni_release() is thread safe, but should not be called from the same
+ * thread as rte_kni_handle_request().
  *
  * @param kni
  *  The pointer to the context of an existent KNI interface.
@@ -127,6 +133,23 @@ struct rte_kni *rte_kni_alloc(struct rte_mempool *pktmbuf_pool,
  */
 int rte_kni_release(struct rte_kni *kni);
 
+/**
+ * Free specified KNI interface. It will also free the KNI interface resources
+ * in kernel space. No KNI functions for this interface should be called after
+ * or at the same time as calling this function. rte_kni_release() must be
+ * called before this function to release the kernel interface.
+ *
+ * @param kni
+ *  The pointer to the context of an existent KNI interface.
+ *
+ * @return
+ *  - 0 indicates success.
+ *  - -EINVAL: Invalid kni structure.
+ *  - -EBUSY: KNI interface still in use.  Must call rte_kni_release().
+ */
+int __rte_experimental
+rte_kni_free(struct rte_kni *kni);
+
 /**
  * It is used to handle the request mbufs sent from kernel space.
  * Then analyzes it and calls the specific actions for the specific requests.
-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 05/10] kni: don't run rte_kni_handle_request after interface release
  2018-06-29  1:54 ` Dan Gora
                     ` (3 preceding siblings ...)
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 04/10] kni: add rte_kni_free to KNI library Dan Gora
@ 2018-06-29  1:55   ` Dan Gora
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 06/10] kni: increase length of timeout for KNI responses Dan Gora
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Dan Gora @ 2018-06-29  1:55 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, Dan Gora

Check to ensure that the KNI interface is still in use before accessing
the KNI interface FIFOs to kernel space.

This will help to ensure that the user does not access the KNI
interface after rte_kni_release() has been called.

Signed-off-by: Dan Gora <dg@adax.com>
---
 lib/librte_kni/rte_kni.c | 6 +++++-
 lib/librte_kni/rte_kni.h | 4 ++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 1d84c0b70..6ef0859bf 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -578,7 +578,11 @@ rte_kni_handle_request(struct rte_kni *kni)
 	unsigned ret;
 	struct rte_kni_request *req;
 
-	if (kni == NULL)
+	/*
+	 * Don't touch the req/resp fifos after
+	 * we've been released, we can be freed at any instant!
+	 */
+	if (kni == NULL || !kni->in_use)
 		return -1;
 
 	/* Get request mbuf */
diff --git a/lib/librte_kni/rte_kni.h b/lib/librte_kni/rte_kni.h
index d1a95f898..94516c38f 100644
--- a/lib/librte_kni/rte_kni.h
+++ b/lib/librte_kni/rte_kni.h
@@ -155,6 +155,10 @@ rte_kni_free(struct rte_kni *kni);
  * Then analyzes it and calls the specific actions for the specific requests.
  * Finally constructs the response mbuf and puts it back to the resp_q.
  *
+ * Thread Safety: This function should be called in a separate thread from the
+ * thread which calls rte_kni_release() for this KNI.  This function must not
+ * be called simultaneously with rte_kni_free().
+ *
  * @param kni
  *  The pointer to the context of an existent KNI interface.
  *
-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 06/10] kni: increase length of timeout for KNI responses
  2018-06-29  1:54 ` Dan Gora
                     ` (4 preceding siblings ...)
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 05/10] kni: don't run rte_kni_handle_request after interface release Dan Gora
@ 2018-06-29  1:55   ` Dan Gora
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 07/10] kni: update kni test for rte_kni_free Dan Gora
                     ` (4 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Dan Gora @ 2018-06-29  1:55 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, Dan Gora

Increase the timeout to receive a response for KNI events handled
through kni_net_process_request to 10 seconds.

Certain actions, such as calling rte_eth_dev_start() can take more
than 3 seconds to return, in addition to any additional time needed
for the DPDK application to call rte_kni_handle_request().

Signed-off-by: Dan Gora <dg@adax.com>
---
 kernel/linux/kni/kni_net.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
index 7fcfa106c..0850be434 100644
--- a/kernel/linux/kni/kni_net.c
+++ b/kernel/linux/kni/kni_net.c
@@ -25,7 +25,7 @@
 
 #define WD_TIMEOUT 5 /*jiffies */
 
-#define KNI_WAIT_RESPONSE_TIMEOUT 300 /* 3 seconds */
+#define KNI_WAIT_RESPONSE_TIMEOUT 10 /* 10 seconds */
 
 /* typedef for rx function */
 typedef void (*kni_net_rx_t)(struct kni_dev *kni);
@@ -101,7 +101,8 @@ kni_net_process_request(struct kni_dev *kni, struct rte_kni_request *req)
 	}
 
 	ret_val = wait_event_interruptible_timeout(kni->wq,
-			kni_fifo_count(kni->resp_q), 3 * HZ);
+			kni_fifo_count(kni->resp_q),
+			KNI_WAIT_RESPONSE_TIMEOUT * HZ);
 	if (signal_pending(current) || ret_val <= 0) {
 		ret = -ETIME;
 		goto fail;
-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 07/10] kni: update kni test for rte_kni_free
  2018-06-29  1:54 ` Dan Gora
                     ` (5 preceding siblings ...)
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 06/10] kni: increase length of timeout for KNI responses Dan Gora
@ 2018-06-29  1:55   ` Dan Gora
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 08/10] kni: add rte_kni_free to KNI example app Dan Gora
                     ` (3 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Dan Gora @ 2018-06-29  1:55 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, Dan Gora

Add support for testing rte_kni_free() function.

Signed-off-by: Dan Gora <dg@adax.com>
---
 test/test/test_kni.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/test/test/test_kni.c b/test/test/test_kni.c
index 56773c8a2..ec051c07e 100644
--- a/test/test/test_kni.c
+++ b/test/test/test_kni.c
@@ -427,6 +427,12 @@ test_kni_processing(uint16_t port_id, struct rte_mempool *mp)
 		goto fail_kni;
 	}
 
+	/* test of freeing an unreleased kni device */
+	if (rte_kni_free(kni) == 0) {
+		printf("should not be able to free an unreleased kni device\n");
+		return -1;
+	}
+
 	if (rte_kni_release(kni) < 0) {
 		printf("fail to release kni\n");
 		return -1;
@@ -439,6 +445,12 @@ test_kni_processing(uint16_t port_id, struct rte_mempool *mp)
 		return -1;
 	}
 
+	/* test of freeing a released kni device */
+	if (rte_kni_free(kni) != 0) {
+		printf("failed to free a released kni device\n");
+		return -1;
+	}
+
 	/* test of reusing memzone */
 	kni = rte_kni_alloc(mp, &conf, &ops);
 	if (!kni) {
@@ -598,6 +610,14 @@ test_kni(void)
 		goto fail;
 	}
 
+	/* test of freeing NULL kni context */
+	ret = rte_kni_free(NULL);
+	if (ret == 0) {
+		ret = -1;
+		printf("unexpectedly freed kni successfully\n");
+		goto fail;
+	}
+
 	/* test of handling request on NULL device pointer */
 	ret = rte_kni_handle_request(NULL);
 	if (ret == 0) {
-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 08/10] kni: add rte_kni_free to KNI example app
  2018-06-29  1:54 ` Dan Gora
                     ` (6 preceding siblings ...)
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 07/10] kni: update kni test for rte_kni_free Dan Gora
@ 2018-06-29  1:55   ` Dan Gora
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 09/10] kni: add rte_kni_free to KNI vdev driver Dan Gora
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Dan Gora @ 2018-06-29  1:55 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, Dan Gora

Add rte_kni_free to the KNI example application.

Signed-off-by: Dan Gora <dg@adax.com>
---
 examples/kni/main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/examples/kni/main.c b/examples/kni/main.c
index 4b162debb..75e23cd0f 100644
--- a/examples/kni/main.c
+++ b/examples/kni/main.c
@@ -883,7 +883,9 @@ kni_free_kni(uint16_t port_id)
 
 	for (i = 0; i < p[port_id]->nb_kni; i++) {
 		if (rte_kni_release(p[port_id]->kni[i]))
-			printf("Fail to release kni\n");
+			printf("Failed to release kni\n");
+		if (rte_kni_free(p[port_id]->kni[i]))
+			printf("Failed to free kni\n");
 		p[port_id]->kni[i] = NULL;
 	}
 	rte_eth_dev_stop(port_id);
-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 09/10] kni: add rte_kni_free to KNI vdev driver
  2018-06-29  1:54 ` Dan Gora
                     ` (7 preceding siblings ...)
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 08/10] kni: add rte_kni_free to KNI example app Dan Gora
@ 2018-06-29  1:55   ` Dan Gora
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface Dan Gora
  2018-07-20 11:36   ` [dpdk-dev] [PATCH 00/10] kni: Interface detach and link status fixes Ferruh Yigit
  10 siblings, 0 replies; 45+ messages in thread
From: Dan Gora @ 2018-06-29  1:55 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, Dan Gora

Add rte_kni_free to the virual KNI device.

Signed-off-by: Dan Gora <dg@adax.com>
---
 drivers/net/kni/rte_eth_kni.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index ab63ea427..24c6991e6 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -463,11 +463,13 @@ eth_kni_remove(struct rte_vdev_device *vdev)
 	if (eth_dev == NULL)
 		return -1;
 
-	eth_kni_dev_stop(eth_dev);
-
 	internals = eth_dev->data->dev_private;
 	rte_kni_release(internals->kni);
 
+	eth_kni_dev_stop(eth_dev);
+
+	rte_kni_free(internals->kni);
+
 	rte_free(internals);
 
 	rte_eth_dev_release_port(eth_dev);
-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-06-29  1:54 ` Dan Gora
                     ` (8 preceding siblings ...)
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 09/10] kni: add rte_kni_free to KNI vdev driver Dan Gora
@ 2018-06-29  1:55   ` Dan Gora
  2018-08-29 11:48     ` Ferruh Yigit
                       ` (3 more replies)
  2018-07-20 11:36   ` [dpdk-dev] [PATCH 00/10] kni: Interface detach and link status fixes Ferruh Yigit
  10 siblings, 4 replies; 45+ messages in thread
From: Dan Gora @ 2018-06-29  1:55 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, Dan Gora

Add a new API function to KNI, rte_kni_update_link() to allow DPDK
applications to update the link state for the KNI network interfaces
in the linux kernel.

Note that the default carrier state is set to off when the interface
is opened.

Signed-off-by: Dan Gora <dg@adax.com>
---
 kernel/linux/kni/kni_misc.c                   | 63 +++++++++++++++++++
 kernel/linux/kni/kni_net.c                    |  2 +
 .../eal/include/exec-env/rte_kni_common.h     | 19 ++++++
 lib/librte_kni/rte_kni.c                      | 37 ++++++++++-
 lib/librte_kni/rte_kni.h                      | 16 +++++
 5 files changed, 136 insertions(+), 1 deletion(-)

diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c
index 1c38cfa1a..b5784ad1b 100644
--- a/kernel/linux/kni/kni_misc.c
+++ b/kernel/linux/kni/kni_misc.c
@@ -585,6 +585,66 @@ kni_ioctl_free(struct net *net, uint32_t ioctl_num,
 	return ret;
 }
 
+static int
+kni_ioctl_linkstat(struct net *net, uint32_t ioctl_num,
+		unsigned long ioctl_param)
+{
+	struct kni_net *knet = net_generic(net, kni_net_id);
+	int ret = -EINVAL;
+	struct kni_dev *dev, *n;
+	struct rte_kni_link_info link_info;
+	struct net_device *netdev;
+	uint16_t link;
+
+	if (_IOC_SIZE(ioctl_num) > sizeof(link_info))
+		return -EINVAL;
+
+	ret = copy_from_user(&link_info, (void *)ioctl_param,
+			     sizeof(link_info));
+	if (ret) {
+		pr_err("copy_from_user in kni_ioctl_release");
+		return -EIO;
+	}
+
+	/* Release the network device according to its name */
+	if (strlen(link_info.name) == 0)
+		return ret;
+
+	down_read(&knet->kni_list_lock);
+	list_for_each_entry_safe(dev, n, &knet->kni_list_head, list) {
+		if (strncmp(dev->name, link_info.name, RTE_KNI_NAMESIZE) != 0)
+			continue;
+
+		netdev = dev->net_dev;
+		if (netdev == NULL) {
+			up_read(&knet->kni_list_lock);
+			return ret;
+		}
+
+		link = link_info.link_status;
+
+		if (!netif_carrier_ok(netdev) && link) {
+			pr_info("%s NIC Link is Up %d Mbps %s.\n",
+				netdev->name,
+				link_info.link_speed,
+				link_info.link_duplex ==
+					RTE_KNI_LINK_FULL_DUPLEX ?
+					"Full Duplex" : "Half Duplex");
+			netif_carrier_on(netdev);
+		} else if (netif_carrier_ok(netdev) && !link) {
+			pr_info("%s NIC Link is Down.\n",
+				netdev->name);
+			netif_carrier_off(netdev);
+		}
+
+		ret = 0;
+		break;
+	}
+	up_read(&knet->kni_list_lock);
+
+	return ret;
+}
+
 static int
 kni_ioctl(struct inode *inode, uint32_t ioctl_num, unsigned long ioctl_param)
 {
@@ -609,6 +669,9 @@ kni_ioctl(struct inode *inode, uint32_t ioctl_num, unsigned long ioctl_param)
 	case _IOC_NR(RTE_KNI_IOCTL_FREE):
 		ret = kni_ioctl_free(net, ioctl_num, ioctl_param);
 		break;
+	case _IOC_NR(RTE_KNI_IOCTL_LINKSTAT):
+		ret = kni_ioctl_linkstat(net, ioctl_num, ioctl_param);
+		break;
 	default:
 		pr_debug("IOCTL default\n");
 		break;
diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
index 0850be434..fea3ec7e7 100644
--- a/kernel/linux/kni/kni_net.c
+++ b/kernel/linux/kni/kni_net.c
@@ -134,6 +134,7 @@ kni_net_open(struct net_device *dev)
 	struct kni_dev *kni = netdev_priv(dev);
 
 	netif_start_queue(dev);
+	netif_carrier_off(dev);
 
 	memset(&req, 0, sizeof(req));
 	req.req_id = RTE_KNI_REQ_CFG_NETWORK_IF;
@@ -153,6 +154,7 @@ kni_net_release(struct net_device *dev)
 	struct kni_dev *kni = netdev_priv(dev);
 
 	netif_stop_queue(dev); /* can't transmit any more */
+	netif_carrier_off(dev);
 
 	memset(&req, 0, sizeof(req));
 	req.req_id = RTE_KNI_REQ_CFG_NETWORK_IF;
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index 318a3f939..f617d8026 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -124,11 +124,30 @@ struct rte_kni_device_info {
 	char mac_addr[6];
 };
 
+
+struct rte_kni_link_info {
+	char name[RTE_KNI_NAMESIZE];  /**< Network device name for KNI */
+	uint32_t link_speed;        /**< ETH_SPEED_NUM_ */
+
+#define RTE_KNI_LINK_HALF_DUPLEX 0 /**< Half-duplex connection. */
+#define RTE_KNI_LINK_FULL_DUPLEX 1 /**< Full-duplex connection. */
+	uint16_t link_duplex  : 1;  /**< RTE_KNI_LINK_[HALF/FULL]_DUPLEX */
+
+#define RTE_KNI_LINK_FIXED       0 /**< No autonegotiation. */
+#define RTE_KNI_LINK_AUTONEG     1 /**< Autonegotiated. */
+	uint16_t link_autoneg : 1;  /**< RTE_KNI_LINK_[AUTONEG/FIXED] */
+
+#define RTE_KNI_LINK_DOWN        0 /**< Link is down. */
+#define RTE_KNI_LINK_UP          1 /**< Link is up. */
+	uint16_t link_status  : 1;  /**< RTE_KNI_LINK_[DOWN/UP] */
+};
+
 #define KNI_DEVICE "kni"
 
 #define RTE_KNI_IOCTL_TEST    _IOWR(0, 1, int)
 #define RTE_KNI_IOCTL_CREATE  _IOWR(0, 2, struct rte_kni_device_info)
 #define RTE_KNI_IOCTL_RELEASE _IOWR(0, 3, struct rte_kni_device_info)
 #define RTE_KNI_IOCTL_FREE    _IOWR(0, 4, struct rte_kni_device_info)
+#define RTE_KNI_IOCTL_LINKSTAT _IOWR(0, 5, struct rte_kni_link_info)
 
 #endif /* _RTE_KNI_COMMON_H_ */
diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 6ef0859bf..aa3559306 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -13,7 +13,6 @@
 
 #include <rte_spinlock.h>
 #include <rte_string_fns.h>
-#include <rte_ethdev.h>
 #include <rte_malloc.h>
 #include <rte_log.h>
 #include <rte_kni.h>
@@ -817,6 +816,42 @@ rte_kni_unregister_handlers(struct rte_kni *kni)
 
 	return 0;
 }
+
+int
+rte_kni_update_link(struct rte_kni *kni, struct rte_eth_link *link)
+{
+	struct rte_kni_link_info link_info;
+
+	if (kni == NULL || !kni->in_use || link == NULL)
+		return -1;
+
+	snprintf(link_info.name, sizeof(link_info.name), "%s", kni->name);
+
+	link_info.link_speed = link->link_speed;
+	if (link->link_duplex == ETH_LINK_FULL_DUPLEX)
+		link_info.link_duplex = RTE_KNI_LINK_FULL_DUPLEX;
+	else
+		link_info.link_duplex = RTE_KNI_LINK_FULL_DUPLEX;
+
+	if (link->link_autoneg == ETH_LINK_FIXED)
+		link_info.link_autoneg = RTE_KNI_LINK_FIXED;
+	else
+		link_info.link_autoneg = RTE_KNI_LINK_AUTONEG;
+
+	if (link->link_status == ETH_LINK_UP)
+		link_info.link_status = RTE_KNI_LINK_UP;
+	else
+		link_info.link_status = RTE_KNI_LINK_DOWN;
+
+	if (ioctl(kni_fd, RTE_KNI_IOCTL_LINKSTAT, &link_info) < 0) {
+		RTE_LOG(ERR, KNI,
+			"Failed to update kni link info for dev '%s'.\n",
+			kni->name);
+		return -1;
+	}
+	return 0;
+}
+
 void
 rte_kni_close(void)
 {
diff --git a/lib/librte_kni/rte_kni.h b/lib/librte_kni/rte_kni.h
index 94516c38f..02d781a32 100644
--- a/lib/librte_kni/rte_kni.h
+++ b/lib/librte_kni/rte_kni.h
@@ -21,6 +21,7 @@
 #include <rte_memory.h>
 #include <rte_mempool.h>
 #include <rte_ether.h>
+#include <rte_ethdev.h>
 
 #include <exec-env/rte_kni_common.h>
 
@@ -255,6 +256,21 @@ int rte_kni_register_handlers(struct rte_kni *kni, struct rte_kni_ops *ops);
  */
 int rte_kni_unregister_handlers(struct rte_kni *kni);
 
+/**
+ *  Update link status info for KNI port.
+ *
+ *  Update the linkup/linkdown status of a KNI interface in the kernel.
+ *
+ *  @param kni
+ *   pointer to struct rte_kni.
+ *
+ *  @return
+ *   On success: 0
+ *   On failure: -1
+ */
+int __rte_experimental
+rte_kni_update_link(struct rte_kni *kni, struct rte_eth_link *link);
+
 /**
  *  Close KNI device.
  */
-- 
2.18.0.rc1.1.g6f333ff2f

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 00/10] kni: Interface detach and link status fixes.
  2018-06-29  1:54 ` Dan Gora
                     ` (9 preceding siblings ...)
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface Dan Gora
@ 2018-07-20 11:36   ` Ferruh Yigit
  10 siblings, 0 replies; 45+ messages in thread
From: Ferruh Yigit @ 2018-07-20 11:36 UTC (permalink / raw)
  To: Dan Gora; +Cc: dev

On 6/29/2018 2:54 AM, Dan Gora wrote:
> Hi All,
> 
> The following patches are to fix a problem with detaching a KNI
> interface using rte_kni_release and to add a new API function
> to allow users to change the link status (up/down, speed, etc)
> of the interface in the linux kernel.  In previous versions, it
> was impossible to release a KNI interface without waiting for the
> timeout in kni_net_process_request to expire if the interface is in
> the UP state.  The solution to this issue was to separate the process
> of releasing the netdev device from the linux kernel from actually
> freeing the KNI interface in the kernel and in the RTE library by
> introducing a new API function, rte_kni_free().
> 
> The last patch : 'kni: add API to set link status on kernel interface'
> adds a new API function to allow the DPDK user to change the link
> speed and status reported by the netdev in the linux kernel.
> 
> This resolves issues with allowing automatic network configuration
> applciations such as NetworkManager to assign addresses and for user
> space applications to be able to open sockets on these interfaces,
> as some operations rely on the link status being up before they
> work properly.
> 
> This last patch is included in this series because both new "features"
> introduce new ioctls to the rte_kni kernel module, so the order in
> which the patches get applied affects which number each new ioctl gets.
> I thought it better to bundle them together to try to get them applied
> as a series to avoid any issues with this.
> 
> *v2*
> 
> The first time I submitted these patches, they failed with
> compilation errors in Patchwork, but they are failing due to
> rte_kni_free being a missing symbol, which is only introduced here in
> patch 4/10.  I'm not really sure how the automatically compilation
> tool works in patchwork.  Does it apply all of the patches in the
> series before compilation?  Does it compile after each patch?
> 
> I tried my best to break up these patches into small enough pieces
> so that they could be reviewed fairly easily, but I'm not sure that
> I did a good job.  If there are suggestions on how the changes could
> be organized or split better, please let me know.
> 
> thanks
> dan
> 
> v2:
>   Re-submitted as a threaded series as per thomas@monjalon.net suggestions.
>   cc'd Ferruh on cover letter.
>   No code changes.
> 
> Dan Gora (10):
>   kni: remove unused variables from struct kni_dev
>   kni: separate releasing netdev from freeing KNI interface
>   kni: don't touch struct kni_dev after freeing
>   kni: add rte_kni_free to KNI library
>   kni: don't run rte_kni_handle_request after interface release
>   kni: increase length of timeout for KNI responses
>   kni: update kni test for rte_kni_free
>   kni: add rte_kni_free to KNI example app
>   kni: add rte_kni_free to KNI vdev driver
>   kni: add API to set link status on kernel interface

Hi Dan,

Thank you for the patches, but they were later for this release proposal
deadline and will be considered/reviewed for next release.

Thanks,
ferruh

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/10] kni: remove unused variables from struct kni_dev
  2018-06-29  1:54   ` [dpdk-dev] [PATCH v2 01/10] kni: remove unused variables from struct kni_dev Dan Gora
@ 2018-08-29 10:29     ` Ferruh Yigit
  0 siblings, 0 replies; 45+ messages in thread
From: Ferruh Yigit @ 2018-08-29 10:29 UTC (permalink / raw)
  To: Dan Gora; +Cc: dev

On 6/29/2018 2:54 AM, Dan Gora wrote:
> Remove the unused fields 'status' and 'synchro' from the struct
> kni_dev.
> 
> Signed-off-by: Dan Gora <dg@adax.com>

Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface Dan Gora
@ 2018-08-29 10:59     ` Ferruh Yigit
  2018-09-04  0:20       ` Dan Gora
  2018-09-04  0:36       ` Dan Gora
  0 siblings, 2 replies; 45+ messages in thread
From: Ferruh Yigit @ 2018-08-29 10:59 UTC (permalink / raw)
  To: Dan Gora; +Cc: dev

On 6/29/2018 2:55 AM, Dan Gora wrote:
> Currently the rte_kni kernel driver suffers from a problem where
> when the interface is released, it generates a callback to the DPDK
> application to change the interface state to Down.  However, after the
> DPDK application handles the callback and generates a response back to
> the kernel, the rte_kni driver cannot wake the thread which is asleep
> waiting for the response, because it is holding the kni_link_lock
> semaphore and it has already removed the 'struct kni_dev' from the
> list of interfaces to poll for responses.
> 
> This means that if the KNI interface is in the Up state when
> rte_kni_release() is called, it will always sleep for three seconds
> until kni_net_release gives up waiting for a response from the DPDK
> application.
> 
> To fix this, we must separate the step to release the kernel network
> interface from the steps to remove the KNI interface from the list
> of interfaces to poll.
> 
> When the kernel network interface is removed with unregister_netdev(),
> if the interface is up, it will generate a callback to mark the
> interface down, which calls kni_net_release().  kni_net_release() will
> block waiting for the DPDK application to call rte_kni_handle_request()
> to handle the callback, but it also needs the thread in the KNI driver
> (either the per-dev thread for multi-thread or the per-driver thread)
> to call kni_net_poll_resp() in order to wake the thread sleeping in
> kni_net_release (actually kni_net_process_request()).
> 
> So now, KNI interfaces should be removed as such:
> 
> 1) The user calls rte_kni_release().  This only unregisters the
> netdev in the kernel, but touches nothing else.  This allows all the
> threads to run which are necessary to handle the callback into the
> DPDK application to mark the interface down.
> 
> 2) The user stops the thread running rte_kni_handle_request().
> After rte_kni_release() has been called, there will be no more
> callbacks for that interface so it is not necessary.  It cannot be
> running at the same time that rte_kni_free() frees all of the FIFOs
> and DPDK memory for that KNI interface.
> 
> 3) The user calls rte_kni_free().  This performs the RTE_KNI_IOCTL_FREE
> ioctl which calls kni_ioctl_free().  This function removes the struct
> kni_dev from the list of interfaces to poll (and kills the per-dev
> kthread, if configured for multi-thread), then frees the memory in
> the FIFOs.
> 
> Signed-off-by: Dan Gora <dg@adax.com>


You are right, that problem exits.
Although I don't see problem related to holding the kni_list_lock, polling
thread terminated before unregister interface cause the problem.

And it has a reason to terminate polling thread first, because it uses device
resources.

Separating unregister and free steps looks good, but I am not sure if this
should be reflected to the user, with a new ioctl and API.
When user done with interface it calls rte_kni_release() to release them, does
user really need a rte_kni_free() API or need to know the difference of two, is
there any action to take in userspace between these two APIs? I think no.

What about keeping single rte_kni_release() API and solve the issue internally
in KNI?

Previously it was doing:
- Stop threads (also there is another single/multi thread error [1])
- kni_dev_remove()
	- unregister and free netdev() [2]
	- kni_net_release_fifo_phy() [3]

Instead internally can we do:
a- Unregister kernel interfaces, rte_kni_unregister()?
b- stop threads
c- kni_net_release_fifo_phy
d- free netdev

The challenge I can see is some time required between a) and b) to let userspace
app to response, we need a way to know response received before stopping the thread.

Another thing is there are two release path, kni_release() and
kni_ioctl_release() both should be fixed.



[1]
If multi thread enabled they have been stopped, but if single thread used it has
not been stopped that is why you don't see the 3 seconds delay for default
single thread case, but not stopping the polling thread but removing the
interface is wrong.

[2]
unregistering netdev will trigger a userspace request but response won't be read
because polling thread also polls the response queue, and that thread is already
stopped at this stage.

[3]
This is also wrong as you have pointed in later patch in your series,
kni_net_release_fifo_phy() moves packets from rxq/alloq queue to free queue,
queues are still allocated but the references kept in kernel may be invalid at
this stage because of free netdev()

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface Dan Gora
@ 2018-08-29 11:48     ` Ferruh Yigit
  2018-08-29 21:10       ` Dan Gora
  2018-08-29 15:54     ` Stephen Hemminger
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 45+ messages in thread
From: Ferruh Yigit @ 2018-08-29 11:48 UTC (permalink / raw)
  To: Dan Gora; +Cc: dev

On 6/29/2018 2:55 AM, Dan Gora wrote:
> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
> applications to update the link state for the KNI network interfaces
> in the linux kernel.
> 
> Note that the default carrier state is set to off when the interface
> is opened.

Why set carrier off when interface opened? Although I don't see any difference
in interface state with or without this call.

> 
> Signed-off-by: Dan Gora <dg@adax.com>

Overall looks good to me.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface Dan Gora
  2018-08-29 11:48     ` Ferruh Yigit
@ 2018-08-29 15:54     ` Stephen Hemminger
  2018-08-29 21:02       ` Dan Gora
  2018-09-11 23:14     ` [dpdk-dev] [PATCH 0/2] " Dan Gora
  2018-09-11 23:14     ` [dpdk-dev] [PATCH 1/2] " Dan Gora
  3 siblings, 1 reply; 45+ messages in thread
From: Stephen Hemminger @ 2018-08-29 15:54 UTC (permalink / raw)
  To: Dan Gora; +Cc: ferruh.yigit, dev

On Thu, 28 Jun 2018 18:55:08 -0700
Dan Gora <dg@adax.com> wrote:

> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
> applications to update the link state for the KNI network interfaces
> in the linux kernel.
> 
> Note that the default carrier state is set to off when the interface
> is opened.
> 
> Signed-off-by: Dan Gora <dg@adax.com>

Do you really need a special ioctl for this?
There is already ability to set link state via sysfs or netlink.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-29 15:54     ` Stephen Hemminger
@ 2018-08-29 21:02       ` Dan Gora
  2018-08-29 22:00         ` Stephen Hemminger
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-08-29 21:02 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Ferruh Yigit, dev

On Wed, Aug 29, 2018 at 12:54 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Thu, 28 Jun 2018 18:55:08 -0700
> Dan Gora <dg@adax.com> wrote:
>
>> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
>> applications to update the link state for the KNI network interfaces
>> in the linux kernel.
>>
>> Note that the default carrier state is set to off when the interface
>> is opened.
>>
>> Signed-off-by: Dan Gora <dg@adax.com>
>
> Do you really need a special ioctl for this?
> There is already ability to set link state via sysfs or netlink.

I think yes.. AFAIK sysfs does not constitute a stable API; it's only
available for Linux (yes, I know KNI is linux-only currently, but
there's not really any technical reason why it can't work on BSD) and
there are already callbacks to change the MTU and MAC addresses which
could also be done via netlink.  IMHO having the kernel have an
accurate view of the link state is more important than the ability to
change the MAC address of the interface...

In our application we want the linux kernel/"normal" userspace to be
able to use the DPDK controlled interfaces like any other interface.
We need to be able to assign IP addresses to them, have them
participate in routing, etc.  Since they are controlled via our DPDK
application, there is no way for the kernel to know when the cable is
connected/removed since that information is only communicated to the
DPDK application.

The other option, which I toyed with but decided against, would be to
have a polling thread in the KNI module to call a callback into the
DPDK application to poll the link status.  However that would still
possibly leave a time period when the link is down, but the kernel
does not know about it.  I decided that it would probably be best to
just have a way for the DPDK application to inform the linux kernel
(via the KNI module) that the link was down.

It's important for the linux kernel to know about the link status if
the interface is going to be treated like any other.  Things like
assigning IP addresses and adding the interfaces to the routing table
happen automatically when the link is marked "up".  If the link is not
marked "up", or is "up" when it should be "down", then the kernel
cannot configure that interface correctly, or will use it when it
should not be.

thanks
dan

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-29 11:48     ` Ferruh Yigit
@ 2018-08-29 21:10       ` Dan Gora
  2018-08-29 22:01         ` Stephen Hemminger
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-08-29 21:10 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev

On Wed, Aug 29, 2018 at 8:48 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> On 6/29/2018 2:55 AM, Dan Gora wrote:
>> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
>> applications to update the link state for the KNI network interfaces
>> in the linux kernel.
>>
>> Note that the default carrier state is set to off when the interface
>> is opened.
>
> Why set carrier off when interface opened?

A couple of reasons:

1) That's the way every other Ethernet driver in the linux kernel does
it that I've seen.

2) The DPDK application may not actually be ready for the interface to
be used when it is first created.  Things like NetworkManager, etc
will gladly go trying to assign IP addresses to those interfaces, add
them to the routing table, etc as soon as the interface is marked
"up".  By making the default be "down", this allows the application to
finish any initialization on the DPDK side of the interface before
allowing it to be used by the kernel.

> Although I don't see any difference
> in interface state with or without this call.

Previously in the 'ip addr' output, the 'state' would be 'UNKNOWN'
when the interface was created.  After this patch the 'state' in 'ip
addr' is 'DOWN'.

thanks
dan

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-29 21:02       ` Dan Gora
@ 2018-08-29 22:00         ` Stephen Hemminger
  2018-08-29 22:12           ` Dan Gora
  0 siblings, 1 reply; 45+ messages in thread
From: Stephen Hemminger @ 2018-08-29 22:00 UTC (permalink / raw)
  To: Dan Gora; +Cc: Ferruh Yigit, dev

On Wed, 29 Aug 2018 18:02:06 -0300
Dan Gora <dg@adax.com> wrote:

> On Wed, Aug 29, 2018 at 12:54 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> > On Thu, 28 Jun 2018 18:55:08 -0700
> > Dan Gora <dg@adax.com> wrote:
> >  
> >> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
> >> applications to update the link state for the KNI network interfaces
> >> in the linux kernel.
> >>
> >> Note that the default carrier state is set to off when the interface
> >> is opened.
> >>
> >> Signed-off-by: Dan Gora <dg@adax.com>  
> >
> > Do you really need a special ioctl for this?
> > There is already ability to set link state via sysfs or netlink.  
> 
> I think yes.. AFAIK sysfs does not constitute a stable API; 

It is a stable API on Linux.

> it's only
> available for Linux (yes, I know KNI is linux-only currently, but
> there's not really any technical reason why it can't work on BSD) and
> there are already callbacks to change the MTU and MAC addresses which
> could also be done via netlink.  IMHO having the kernel have an
> accurate view of the link state is more important than the ability to
> change the MAC address of the interface...

The device model on BSD is significantly different than Linux.
Doing KNI on BSD is going to be a full rewrite of the driver anyway;
I won't worry about sysfs, dependency.

The important part is that if KNI is ever going to be supportable
it needs to be upstream in Linux, not a bolt on out of tree driver.
Most Enterprise distributions will not support out of tree drivers
for good reasons.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-29 21:10       ` Dan Gora
@ 2018-08-29 22:01         ` Stephen Hemminger
  0 siblings, 0 replies; 45+ messages in thread
From: Stephen Hemminger @ 2018-08-29 22:01 UTC (permalink / raw)
  To: Dan Gora; +Cc: Ferruh Yigit, dev

On Wed, 29 Aug 2018 18:10:35 -0300
Dan Gora <dg@adax.com> wrote:

> On Wed, Aug 29, 2018 at 8:48 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> > On 6/29/2018 2:55 AM, Dan Gora wrote:  
> >> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
> >> applications to update the link state for the KNI network interfaces
> >> in the linux kernel.
> >>
> >> Note that the default carrier state is set to off when the interface
> >> is opened.  
> >
> > Why set carrier off when interface opened?  
> 
> A couple of reasons:
> 
> 1) That's the way every other Ethernet driver in the linux kernel does
> it that I've seen.
> 
> 2) The DPDK application may not actually be ready for the interface to
> be used when it is first created.  Things like NetworkManager, etc
> will gladly go trying to assign IP addresses to those interfaces, add
> them to the routing table, etc as soon as the interface is marked
> "up".  By making the default be "down", this allows the application to
> finish any initialization on the DPDK side of the interface before
> allowing it to be used by the kernel.
> 
> > Although I don't see any difference
> > in interface state with or without this call.  
> 
> Previously in the 'ip addr' output, the 'state' would be 'UNKNOWN'
> when the interface was created.  After this patch the 'state' in 'ip
> addr' is 'DOWN'.
> 
> thanks
> dan

There is also a better (richer) API for link status on linux via
the operstate functions. Those might better match the semantics of
a tunnellish interface like KNI.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-29 22:00         ` Stephen Hemminger
@ 2018-08-29 22:12           ` Dan Gora
  2018-08-29 22:41             ` Dan Gora
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-08-29 22:12 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Ferruh Yigit, dev

On Wed, Aug 29, 2018 at 7:00 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
>> >> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
>> >> applications to update the link state for the KNI network interfaces
>> >> in the linux kernel.
>> >>
>> >> Note that the default carrier state is set to off when the interface
>> >> is opened.
>> >>
>> >> Signed-off-by: Dan Gora <dg@adax.com>
>> >
>> > Do you really need a special ioctl for this?
>> > There is already ability to set link state via sysfs or netlink.
>>
>> I think yes.. AFAIK sysfs does not constitute a stable API;
>
> It is a stable API on Linux.

Ok, I didn't know this...

Still it seems better to me to be able to call
rte_kni_update_link(kni, link); than 'open ("/sys/whatever/where ever
it may be this kernel version/link/"); write(fd, "1"); close(fd); or
whatever...

But I guess if it is actually a stable API, we can hide all of that in
'rte_kni_update_link() and just do away with the ioctl!

I'm actually kind of shocked that I'm the only one who has run into
this.. I would have thought that having an accurate link status would
have been important for whoever used KNI.

>
>> it's only
>> available for Linux (yes, I know KNI is linux-only currently, but
>> there's not really any technical reason why it can't work on BSD) and
>> there are already callbacks to change the MTU and MAC addresses which
>> could also be done via netlink.  IMHO having the kernel have an
>> accurate view of the link state is more important than the ability to
>> change the MAC address of the interface...
>
> The device model on BSD is significantly different than Linux.
> Doing KNI on BSD is going to be a full rewrite of the driver anyway;
> I won't worry about sysfs, dependency.
>
> The important part is that if KNI is ever going to be supportable
> it needs to be upstream in Linux, not a bolt on out of tree driver.
> Most Enterprise distributions will not support out of tree drivers
> for good reasons.

Agreed there.. I was really torn between using KNI or the TAP
interface.  KNI seems cleaner, and at least at the time that I started
working on this, seemed like the way to interface to the kernel moving
forward.  The TAP interface stuff didn't seem like it was necessarily
going to be supported moving forward and the KNI was supposed to be
the "high performance" method to interface to the kernel.  But having
to build and install the rte_kni module on every system that we
install our software on is a major pain.

d

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-29 22:12           ` Dan Gora
@ 2018-08-29 22:41             ` Dan Gora
  2018-08-29 23:10               ` Stephen Hemminger
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-08-29 22:41 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Ferruh Yigit, dev

On Wed, Aug 29, 2018 at 7:12 PM, Dan Gora <dg@adax.com> wrote:
> On Wed, Aug 29, 2018 at 7:00 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>>> >> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
>>> >> applications to update the link state for the KNI network interfaces
>>> >> in the linux kernel.
>>> >>
>>> >> Note that the default carrier state is set to off when the interface
>>> >> is opened.
>>> >>
>>> >> Signed-off-by: Dan Gora <dg@adax.com>
>>> >
>>> > Do you really need a special ioctl for this?
>>> > There is already ability to set link state via sysfs or netlink.
>>>
>>> I think yes.. AFAIK sysfs does not constitute a stable API;
>>
>> It is a stable API on Linux.
>

Actually this does not seem to be completely true...

>From Documentation/admin-guide/sysfs-rules.rst:

Rules on how to access information in sysfs
===========================================

The kernel-exported sysfs exports internal kernel implementation details
and depends on internal kernel structures and layout. It is agreed upon
by the kernel developers that the Linux kernel does not provide a stable
internal API. Therefore, there are aspects of the sysfs interface that
may not be stable across kernel releases.

<snip>

- devices are only "devices"
    There is no such thing like class-, bus-, physical devices,
    interfaces, and such that you can rely on in userspace. Everything is
    just simply a "device". Class-, bus-, physical, ... types are just
    kernel implementation details which should not be expected by
    applications that look for devices in sysfs.

    The properties of a device are:

    - devpath (``/devices/pci0000:00/0000:00:1d.1/usb2/2-2/2-2:1.0``)
<snip>

    - kernel name (``sda``, ``tty``, ``0000:00:1f.2``, ...)
<snip>

    - subsystem (``block``, ``tty``, ``pci``, ...)
<snip>

    - driver (``tg3``, ``ata_piix``, ``uhci_hcd``)
<snip>

    - attributes
<snip>

    Everything else is just a kernel driver-core implementation detail
    that should not be assumed to be stable across kernel releases.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-29 22:41             ` Dan Gora
@ 2018-08-29 23:10               ` Stephen Hemminger
  2018-08-30  9:49                 ` Igor Ryzhov
  0 siblings, 1 reply; 45+ messages in thread
From: Stephen Hemminger @ 2018-08-29 23:10 UTC (permalink / raw)
  To: Dan Gora; +Cc: Ferruh Yigit, dev

On Wed, 29 Aug 2018 19:41:23 -0300
Dan Gora <dg@adax.com> wrote:

> On Wed, Aug 29, 2018 at 7:12 PM, Dan Gora <dg@adax.com> wrote:
> > On Wed, Aug 29, 2018 at 7:00 PM, Stephen Hemminger
> > <stephen@networkplumber.org> wrote:  
> >>> >> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
> >>> >> applications to update the link state for the KNI network interfaces
> >>> >> in the linux kernel.
> >>> >>
> >>> >> Note that the default carrier state is set to off when the interface
> >>> >> is opened.
> >>> >>
> >>> >> Signed-off-by: Dan Gora <dg@adax.com>  
> >>> >
> >>> > Do you really need a special ioctl for this?
> >>> > There is already ability to set link state via sysfs or netlink.  
> >>>
> >>> I think yes.. AFAIK sysfs does not constitute a stable API;  
> >>
> >> It is a stable API on Linux.  
> >  
> 
> Actually this does not seem to be completely true...
> 
> From Documentation/admin-guide/sysfs-rules.rst:
> 
> Rules on how to access information in sysfs
> ===========================================
> 
> The kernel-exported sysfs exports internal kernel implementation details
> and depends on internal kernel structures and layout. It is agreed upon
> by the kernel developers that the Linux kernel does not provide a stable
> internal API. Therefore, there are aspects of the sysfs interface that
> may not be stable across kernel releases.
> 
> <snip>
> 
> - devices are only "devices"
>     There is no such thing like class-, bus-, physical devices,
>     interfaces, and such that you can rely on in userspace. Everything is
>     just simply a "device". Class-, bus-, physical, ... types are just
>     kernel implementation details which should not be expected by
>     applications that look for devices in sysfs.
> 
>     The properties of a device are:
> 
>     - devpath (``/devices/pci0000:00/0000:00:1d.1/usb2/2-2/2-2:1.0``)
> <snip>
> 
>     - kernel name (``sda``, ``tty``, ``0000:00:1f.2``, ...)
> <snip>
> 
>     - subsystem (``block``, ``tty``, ``pci``, ...)
> <snip>
> 
>     - driver (``tg3``, ``ata_piix``, ``uhci_hcd``)
> <snip>
> 
>     - attributes
> <snip>
> 
>     Everything else is just a kernel driver-core implementation detail
>     that should not be assumed to be stable across kernel releases.

Network device sysfs is stable. No one ever got around to putting it in documentation
I wouldn't worry, once anything in /sys/class/net is added it is not going to change without major breakage in many many tools.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-29 23:10               ` Stephen Hemminger
@ 2018-08-30  9:49                 ` Igor Ryzhov
  2018-08-30 10:32                   ` Igor Ryzhov
  2018-08-30 21:41                   ` Dan Gora
  0 siblings, 2 replies; 45+ messages in thread
From: Igor Ryzhov @ 2018-08-30  9:49 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Dan Gora, Ferruh Yigit, dev

Hi Dan,

We use KNI device exactly the same way you described – with IP addresses,
routing, etc.
And we also faced the same problem of having the actual link status in
Linux kernel.

There is a special callback for link state management in net_device_ops for
soft-devices like KNI called ndo_change_carrier.
Current KNI driver implements it already, you just need to write to
/sys/class/net/<iface>/carrier
to change link status.

Right now we implement it on application side, but I think it'll be good to
have this in rte_kni API.

Here is our implementation:

static int
linux_set_carrier(const char *name, int status)
{
char path[64];
const char *carrier = status ? "1" : "0";
int fd, ret;

sprintf(path, "/sys/devices/virtual/net/%s/carrier", name);
fd = open(path, O_WRONLY);
if (fd == -1) {
return -errno;
}

ret = write(fd, carrier, 2);
if (ret == -1) {
close(fd);
return -errno;
}

close(fd);

return 0;
}

Best regards,
Igor

On Thu, Aug 30, 2018 at 2:10 AM, Stephen Hemminger <
stephen@networkplumber.org> wrote:

> On Wed, 29 Aug 2018 19:41:23 -0300
> Dan Gora <dg@adax.com> wrote:
>
> > On Wed, Aug 29, 2018 at 7:12 PM, Dan Gora <dg@adax.com> wrote:
> > > On Wed, Aug 29, 2018 at 7:00 PM, Stephen Hemminger
> > > <stephen@networkplumber.org> wrote:
> > >>> >> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
> > >>> >> applications to update the link state for the KNI network
> interfaces
> > >>> >> in the linux kernel.
> > >>> >>
> > >>> >> Note that the default carrier state is set to off when the
> interface
> > >>> >> is opened.
> > >>> >>
> > >>> >> Signed-off-by: Dan Gora <dg@adax.com>
> > >>> >
> > >>> > Do you really need a special ioctl for this?
> > >>> > There is already ability to set link state via sysfs or netlink.
> > >>>
> > >>> I think yes.. AFAIK sysfs does not constitute a stable API;
> > >>
> > >> It is a stable API on Linux.
> > >
> >
> > Actually this does not seem to be completely true...
> >
> > From Documentation/admin-guide/sysfs-rules.rst:
> >
> > Rules on how to access information in sysfs
> > ===========================================
> >
> > The kernel-exported sysfs exports internal kernel implementation details
> > and depends on internal kernel structures and layout. It is agreed upon
> > by the kernel developers that the Linux kernel does not provide a stable
> > internal API. Therefore, there are aspects of the sysfs interface that
> > may not be stable across kernel releases.
> >
> > <snip>
> >
> > - devices are only "devices"
> >     There is no such thing like class-, bus-, physical devices,
> >     interfaces, and such that you can rely on in userspace. Everything is
> >     just simply a "device". Class-, bus-, physical, ... types are just
> >     kernel implementation details which should not be expected by
> >     applications that look for devices in sysfs.
> >
> >     The properties of a device are:
> >
> >     - devpath (``/devices/pci0000:00/0000:00:1d.1/usb2/2-2/2-2:1.0``)
> > <snip>
> >
> >     - kernel name (``sda``, ``tty``, ``0000:00:1f.2``, ...)
> > <snip>
> >
> >     - subsystem (``block``, ``tty``, ``pci``, ...)
> > <snip>
> >
> >     - driver (``tg3``, ``ata_piix``, ``uhci_hcd``)
> > <snip>
> >
> >     - attributes
> > <snip>
> >
> >     Everything else is just a kernel driver-core implementation detail
> >     that should not be assumed to be stable across kernel releases.
>
> Network device sysfs is stable. No one ever got around to putting it in
> documentation
> I wouldn't worry, once anything in /sys/class/net is added it is not going
> to change without major breakage in many many tools.
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-30  9:49                 ` Igor Ryzhov
@ 2018-08-30 10:32                   ` Igor Ryzhov
  2018-08-30 21:41                   ` Dan Gora
  1 sibling, 0 replies; 45+ messages in thread
From: Igor Ryzhov @ 2018-08-30 10:32 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Dan Gora, Ferruh Yigit, dev

Hi again,

Forgot to mention netif_carrier_off changes. Your changes are necessary for
correct link status management,
but netif_carrier_off call should also be added to kni_ioctl_create before
the register_netdev call.
That's needed for correct link status even before first call to ndo_open.

Best regards,
Igor

On Thu, Aug 30, 2018 at 12:49 PM, Igor Ryzhov <iryzhov@nfware.com> wrote:

> Hi Dan,
>
> We use KNI device exactly the same way you described – with IP addresses,
> routing, etc.
> And we also faced the same problem of having the actual link status in
> Linux kernel.
>
> There is a special callback for link state management in net_device_ops
> for soft-devices like KNI called ndo_change_carrier.
> Current KNI driver implements it already, you just need to write to
> /sys/class/net/<iface>/carrier to change link status.
>
> Right now we implement it on application side, but I think it'll be good
> to have this in rte_kni API.
>
> Here is our implementation:
>
> static int
> linux_set_carrier(const char *name, int status)
> {
> char path[64];
> const char *carrier = status ? "1" : "0";
> int fd, ret;
>
> sprintf(path, "/sys/devices/virtual/net/%s/carrier", name);
> fd = open(path, O_WRONLY);
> if (fd == -1) {
> return -errno;
> }
>
> ret = write(fd, carrier, 2);
> if (ret == -1) {
> close(fd);
> return -errno;
> }
>
> close(fd);
>
> return 0;
> }
>
> Best regards,
> Igor
>
> On Thu, Aug 30, 2018 at 2:10 AM, Stephen Hemminger <
> stephen@networkplumber.org> wrote:
>
>> On Wed, 29 Aug 2018 19:41:23 -0300
>> Dan Gora <dg@adax.com> wrote:
>>
>> > On Wed, Aug 29, 2018 at 7:12 PM, Dan Gora <dg@adax.com> wrote:
>> > > On Wed, Aug 29, 2018 at 7:00 PM, Stephen Hemminger
>> > > <stephen@networkplumber.org> wrote:
>> > >>> >> Add a new API function to KNI, rte_kni_update_link() to allow
>> DPDK
>> > >>> >> applications to update the link state for the KNI network
>> interfaces
>> > >>> >> in the linux kernel.
>> > >>> >>
>> > >>> >> Note that the default carrier state is set to off when the
>> interface
>> > >>> >> is opened.
>> > >>> >>
>> > >>> >> Signed-off-by: Dan Gora <dg@adax.com>
>> > >>> >
>> > >>> > Do you really need a special ioctl for this?
>> > >>> > There is already ability to set link state via sysfs or netlink.
>> > >>>
>> > >>> I think yes.. AFAIK sysfs does not constitute a stable API;
>> > >>
>> > >> It is a stable API on Linux.
>> > >
>> >
>> > Actually this does not seem to be completely true...
>> >
>> > From Documentation/admin-guide/sysfs-rules.rst:
>> >
>> > Rules on how to access information in sysfs
>> > ===========================================
>> >
>> > The kernel-exported sysfs exports internal kernel implementation details
>> > and depends on internal kernel structures and layout. It is agreed upon
>> > by the kernel developers that the Linux kernel does not provide a stable
>> > internal API. Therefore, there are aspects of the sysfs interface that
>> > may not be stable across kernel releases.
>> >
>> > <snip>
>> >
>> > - devices are only "devices"
>> >     There is no such thing like class-, bus-, physical devices,
>> >     interfaces, and such that you can rely on in userspace. Everything
>> is
>> >     just simply a "device". Class-, bus-, physical, ... types are just
>> >     kernel implementation details which should not be expected by
>> >     applications that look for devices in sysfs.
>> >
>> >     The properties of a device are:
>> >
>> >     - devpath (``/devices/pci0000:00/0000:00:1d.1/usb2/2-2/2-2:1.0``)
>> > <snip>
>> >
>> >     - kernel name (``sda``, ``tty``, ``0000:00:1f.2``, ...)
>> > <snip>
>> >
>> >     - subsystem (``block``, ``tty``, ``pci``, ...)
>> > <snip>
>> >
>> >     - driver (``tg3``, ``ata_piix``, ``uhci_hcd``)
>> > <snip>
>> >
>> >     - attributes
>> > <snip>
>> >
>> >     Everything else is just a kernel driver-core implementation detail
>> >     that should not be assumed to be stable across kernel releases.
>>
>> Network device sysfs is stable. No one ever got around to putting it in
>> documentation
>> I wouldn't worry, once anything in /sys/class/net is added it is not
>> going to change without major breakage in many many tools.
>>
>
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-30  9:49                 ` Igor Ryzhov
  2018-08-30 10:32                   ` Igor Ryzhov
@ 2018-08-30 21:41                   ` Dan Gora
  2018-08-30 22:09                     ` Stephen Hemminger
  1 sibling, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-08-30 21:41 UTC (permalink / raw)
  To: Igor Ryzhov; +Cc: Stephen Hemminger, Ferruh Yigit, dev

Hi All,

So I'm a little unclear as to what to do here moving forward..

Do we all agree at least that there should be some function in DPDK in
order to set the link state for KNI interfaces?

I'm a strong yes on this point.  I don't think that every KNI user
should have to implement something like this themselves if we can
provide a simple interface to do it.  It seems like a natural
extension of the KNI functionality IMHO.

If so, should we use a new ioctl as in my patch, or just use the
"write to /sys.../carrier" method as shown below?

I'm kind of ambivalent about this point.  The ioctl method has two
minor advantages to my mind:

1) It's already done :)

2) You can set the link speed and duplex as well and get a pretty
message in the syslog when the link status changes:

[ 2100.016079] rte_kni: adax0 NIC Link is Up 10000 Mbps Full Duplex.
[ 2100.016094] IPv6: ADDRCONF(NETDEV_CHANGE): adax0: link becomes ready
[ 2262.532126] IPv6: ADDRCONF(NETDEV_UP): adax1: link is not ready
[ 2263.432148] rte_kni: adax1 NIC Link is Up 10000 Mbps Full Duplex.
[ 2263.432170] IPv6: ADDRCONF(NETDEV_CHANGE): adax1: link becomes ready

On the other hand, the "write to /sys" method is a bit more simple and
confines the changes to the user space library.  If we're confident
that the /sys ABI is stable and not going to be changed going forward
it seems like a valid alternative.

I'm willing to do it either way.  I'll defer to the judgement of the
community...

thanks
dan


On Thu, Aug 30, 2018 at 6:49 AM, Igor Ryzhov <iryzhov@nfware.com> wrote:
> Hi Dan,
>
> We use KNI device exactly the same way you described – with IP addresses,
> routing, etc.
> And we also faced the same problem of having the actual link status in Linux
> kernel.
>
> There is a special callback for link state management in net_device_ops for
> soft-devices like KNI called ndo_change_carrier.
> Current KNI driver implements it already, you just need to write to
> /sys/class/net/<iface>/carrier to change link status.
>
> Right now we implement it on application side, but I think it'll be good to
> have this in rte_kni API.
>
> Here is our implementation:
>
> static int
> linux_set_carrier(const char *name, int status)
> {
> char path[64];
> const char *carrier = status ? "1" : "0";
> int fd, ret;
>
> sprintf(path, "/sys/devices/virtual/net/%s/carrier", name);
> fd = open(path, O_WRONLY);
> if (fd == -1) {
> return -errno;
> }
>
> ret = write(fd, carrier, 2);
> if (ret == -1) {
> close(fd);
> return -errno;
> }
>
> close(fd);
>
> return 0;
> }
>
> Best regards,
> Igor

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-30 21:41                   ` Dan Gora
@ 2018-08-30 22:09                     ` Stephen Hemminger
  2018-08-30 22:11                       ` Dan Gora
  0 siblings, 1 reply; 45+ messages in thread
From: Stephen Hemminger @ 2018-08-30 22:09 UTC (permalink / raw)
  To: Dan Gora; +Cc: Igor Ryzhov, Ferruh Yigit, dev

On Thu, 30 Aug 2018 18:41:14 -0300
Dan Gora <dg@adax.com> wrote:

> On the other hand, the "write to /sys" method is a bit more simple and
> confines the changes to the user space library.  If we're confident
> that the /sys ABI is stable and not going to be changed going forward
> it seems like a valid alternative.

See Documentation/ABI/testing/sysfs-class-net

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-30 22:09                     ` Stephen Hemminger
@ 2018-08-30 22:11                       ` Dan Gora
  2018-09-04  0:47                         ` Dan Gora
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-08-30 22:11 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Igor Ryzhov, Ferruh Yigit, dev

On Thu, Aug 30, 2018 at 7:09 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Thu, 30 Aug 2018 18:41:14 -0300
> Dan Gora <dg@adax.com> wrote:
>
>> On the other hand, the "write to /sys" method is a bit more simple and
>> confines the changes to the user space library.  If we're confident
>> that the /sys ABI is stable and not going to be changed going forward
>> it seems like a valid alternative.
>
> See Documentation/ABI/testing/sysfs-class-net

yeah, but it's in the 'testing' directory :)

>From Documentation/ABI/README:

testing/

This directory documents interfaces that are felt to be stable,
as the main development of this interface has been completed.
The interface can be changed to add new features, but the
current interface will not break by doing this, unless grave
errors or security problems are found in them.  Userspace
programs can start to rely on these interfaces, but they must be
aware of changes that can occur before these interfaces move to
be marked stable.  Programs that use these interfaces are
strongly encouraged to add their name to the description of
these interfaces, so that the kernel developers can easily
notify them if any changes occur (see the description of the
layout of the files below for details on how to do this.)

Like I said, I'm ok with using this if that's what everyone wants to do.

d

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface
  2018-08-29 10:59     ` Ferruh Yigit
@ 2018-09-04  0:20       ` Dan Gora
  2018-09-04  0:36       ` Dan Gora
  1 sibling, 0 replies; 45+ messages in thread
From: Dan Gora @ 2018-09-04  0:20 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev

Hi Ferruh,

thanks for your feedback.. I apologize for the delay getting back to
you on this..

On Wed, Aug 29, 2018 at 7:59 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> You are right, that problem exits.
> Although I don't see problem related to holding the kni_list_lock, polling
> thread terminated before unregister interface cause the problem.

Actually no, the polling thread is not terminated when the deadlock occurs..

The code path is this:  Note that this only occurs when the interface
is in the UP state!

DPDK app calls
 rte_kni_release()->
  ioctl (RTE_KNI_IOCTL_RELEASE);

Down in the kernel in the KNI driver:
 kni_ioctl (RTE_KNI_IOCTL_RELEASE) ->
  kni_ioctl_release()->
   down_write(kni_list_lock);
   -- This prevents the polling thread from running because it does a
   -- "down_read(kni_list_lock)".
   kni_dev_remove()->
    unregister_netdev() - This generates the callback into the DPDK application.
     ndo_stop - kni_net_release() ->
      kni_net_process_request() ->
       wait_event_interruptible_timeout(3 seconds);

Back in the DPDK app (Note this has to be in a different thread than
the one which called rte_kni_release()!)
  rte_kni_handle_request() - RTE_KNI_REQ_CFG_NETWORK_IF ->
   ** Application callback for config_network_if **
  kni_fifo_put the response back to the kernel in the resp_q fifo.

This is the deadlock... The polling thread cannot run because
kni_ioctl_release() is still holding the kni_list_lock, so
kni_net_process_request() will not see the response back from the DPDK
app in the resp_q fifo.  Only when wait_event_interruptible_timeout
times out does kni_net_process_request see the response back from the
DPDK app, which causes kni_dev_remove() to return, allowing
kni_ioctl_release to drop the kni_list_lock.

> And it has a reason to terminate polling thread first, because it uses device
> resources.
>
> Separating unregister and free steps looks good, but I am not sure if this
> should be reflected to the user, with a new ioctl and API.
> When user done with interface it calls rte_kni_release() to release them, does
> user really need a rte_kni_free() API or need to know the difference of two, is
> there any action to take in userspace between these two APIs? I think no.
>
> What about keeping single rte_kni_release() API and solve the issue internally
> in KNI?
>
> Previously it was doing:
> - Stop threads (also there is another single/multi thread error [1])
> - kni_dev_remove()
>         - unregister and free netdev() [2]
>         - kni_net_release_fifo_phy() [3]
>
> Instead internally can we do:
> a- Unregister kernel interfaces, rte_kni_unregister()?
> b- stop threads
> c- kni_net_release_fifo_phy
> d- free netdev
>
> The challenge I can see is some time required between a) and b) to let userspace
> app to response, we need a way to know response received before stopping the thread.
>
> Another thing is there are two release path, kni_release() and
> kni_ioctl_release() both should be fixed.
>
>
>
> [1]
> If multi thread enabled they have been stopped, but if single thread used it has
> not been stopped that is why you don't see the 3 seconds delay for default
> single thread case, but not stopping the polling thread but removing the
> interface is wrong.


You see the problem in the single thread case as well if the interface
is in the 'UP' state when it is removed.  That is what triggers the
callback to the DPDK application to mark the interface 'down', as
shown above...

>
> [2]
> unregistering netdev will trigger a userspace request but response won't be read
> because polling thread also polls the response queue, and that thread is already
> stopped at this stage.
>

Not exactly.. The polling thread is not dead, just locked out because
kni_ioctl_release() is holding the kni_list_lock semaphore.


> [3]
> This is also wrong as you have pointed in later patch in your series,
> kni_net_release_fifo_phy() moves packets from rxq/alloq queue to free queue,
> queues are still allocated but the references kept in kernel may be invalid at
> this stage because of free netdev()

It's worse than that actually because you cannot touch 'dev' at all
after the free_netdev() because the 'struct kni_dev' is embedded in
the 'struct net_dev' allocated with alloc_netdev(), so you cannot call
kni_net_release_fifo_phy() after free_netdev:

kni_dev_remove()
{
    if (dev->net_dev) {
        unregister_netdev(dev->net_dev);
        free_netdev(dev->net_dev);
    }
    kni_net_release_fifo_phy(dev);
}

Because 'dev' has already been freed.

Similarly in kni_ioctl_release() you cannot do:

kni_dev_remove(dev);
list_del(&dev->list);

Because currently kni_dev_remove() calls free_netdevk(), which frees
the memory pointed to by 'dev'.

Maybe this all can be resolved more easily by allocating the 'struct
kni_dev' in separate memory rather than embedding it in the net_dev,
that way we can drop the 'kni_list_lock' before calling
kni_dev_remove() and we can still touch 'dev' after calling
free_netdev(), so we can then grab the lock again and remove it from
the list...

I'll noodle this a bit more and see if I can get it all to work.. I
just wanted to get back to you about how to reproduce the lockup.

thanks
dan

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface
  2018-08-29 10:59     ` Ferruh Yigit
  2018-09-04  0:20       ` Dan Gora
@ 2018-09-04  0:36       ` Dan Gora
  2018-10-10 17:24         ` Ferruh Yigit
  1 sibling, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-09-04  0:36 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev

Hi Ferruh,

I remembered now the motivation behind separating rte_kni_release()
and rte_kni_free().

The problem is that the DPDK thread which calls rte_kni_release()
_cannot_ be the same thread which handles callbacks from the KNI
driver via rte_kni_handle_request().  This is because the thread which
calls rte_kni_release() will be stuck down in
ioctl(RTE_KNI_IOCTL_RELEASE) when the kernel calls the
RTE_KNI_REQ_CFG_NETWORK_IF callback to the DPDK application.  Since
that thread cannot call rte_kni_handle_request(), the callback would
then just timeout unless some other thread calls
rte_kni_handle_request().

So then you are in a bit of a chicken and egg situation.  You _have_
to have a separate thread calling rte_kni_handle_request periodically,
but that thread also _cannot_ run after rte_kni_release returns
(actually it's worse than that because it's actually after the
ioctl(RTE_KNI_IOCTL_RELEASE) returns and the fifos are freed).

So in order to resolve this, I separated the release from the freeing
stages. This allows the DPDK application to keep the
rte_kni_handle_request() thread running while rte_kni_release() is
called so that it can handle the interface state callback, then kill
that thread so that it cannot touch any 'struct rte_kni' resources,
then free the struct rte_kni resources.


thanks
dan

On Wed, Aug 29, 2018 at 7:59 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:

>> When the kernel network interface is removed with unregister_netdev(),
>> if the interface is up, it will generate a callback to mark the
>> interface down, which calls kni_net_release().  kni_net_release() will
>> block waiting for the DPDK application to call rte_kni_handle_request()
>> to handle the callback, but it also needs the thread in the KNI driver
>> (either the per-dev thread for multi-thread or the per-driver thread)
>> to call kni_net_poll_resp() in order to wake the thread sleeping in
>> kni_net_release (actually kni_net_process_request()).
>>
>> So now, KNI interfaces should be removed as such:
>>
>> 1) The user calls rte_kni_release().  This only unregisters the
>> netdev in the kernel, but touches nothing else.  This allows all the
>> threads to run which are necessary to handle the callback into the
>> DPDK application to mark the interface down.
>>
>> 2) The user stops the thread running rte_kni_handle_request().
>> After rte_kni_release() has been called, there will be no more
>> callbacks for that interface so it is not necessary.  It cannot be
>> running at the same time that rte_kni_free() frees all of the FIFOs
>> and DPDK memory for that KNI interface.
>>
>> 3) The user calls rte_kni_free().  This performs the RTE_KNI_IOCTL_FREE
>> ioctl which calls kni_ioctl_free().  This function removes the struct
>> kni_dev from the list of interfaces to poll (and kills the per-dev
>> kthread, if configured for multi-thread), then frees the memory in
>> the FIFOs.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-08-30 22:11                       ` Dan Gora
@ 2018-09-04  0:47                         ` Dan Gora
  2018-09-05 12:57                           ` Stephen Hemminger
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-09-04  0:47 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Igor Ryzhov, Ferruh Yigit, dev

Hi All,

One other problem with using the sysfs method to change the link state
rather than this ioctl method.  The sysfs/netdev method to change the
carrier was only introduced in kernel 3.9.  For older kernels, we
would just be out of luck.  The ioctl method will work with any kernel
version (2.6+).  It's not clear if this is a problem for DPDK apps or
not.

thanks
dan


On Thu, Aug 30, 2018 at 7:11 PM, Dan Gora <dg@adax.com> wrote:
> On Thu, Aug 30, 2018 at 7:09 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>> On Thu, 30 Aug 2018 18:41:14 -0300
>> Dan Gora <dg@adax.com> wrote:
>>
>>> On the other hand, the "write to /sys" method is a bit more simple and
>>> confines the changes to the user space library.  If we're confident
>>> that the /sys ABI is stable and not going to be changed going forward
>>> it seems like a valid alternative.
>>
>> See Documentation/ABI/testing/sysfs-class-net
>
> yeah, but it's in the 'testing' directory :)
>
> From Documentation/ABI/README:
>
> testing/
>
> This directory documents interfaces that are felt to be stable,
> as the main development of this interface has been completed.
> The interface can be changed to add new features, but the
> current interface will not break by doing this, unless grave
> errors or security problems are found in them.  Userspace
> programs can start to rely on these interfaces, but they must be
> aware of changes that can occur before these interfaces move to
> be marked stable.  Programs that use these interfaces are
> strongly encouraged to add their name to the description of
> these interfaces, so that the kernel developers can easily
> notify them if any changes occur (see the description of the
> layout of the files below for details on how to do this.)
>
> Like I said, I'm ok with using this if that's what everyone wants to do.
>
> d



-- 
Dan Gora
Software Engineer

Adax, Inc.
Rua Dona Maria Alves, 1070 Casa 5
Centro
Ubatuba, SP
CEP 11680-000
Brasil

Tel: +55 (12) 3833-1021  (Brazil and outside of US)
    : +1 (510) 859-4801  (Inside of US)
    : dan_gora (Skype)

email: dg@adax.com

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-09-04  0:47                         ` Dan Gora
@ 2018-09-05 12:57                           ` Stephen Hemminger
  2018-09-11 21:45                             ` Dan Gora
  0 siblings, 1 reply; 45+ messages in thread
From: Stephen Hemminger @ 2018-09-05 12:57 UTC (permalink / raw)
  To: Dan Gora; +Cc: Igor Ryzhov, Ferruh Yigit, dev

On Mon, 3 Sep 2018 21:47:22 -0300
Dan Gora <dg@adax.com> wrote:

> Hi All,
> 
> One other problem with using the sysfs method to change the link state
> rather than this ioctl method.  The sysfs/netdev method to change the
> carrier was only introduced in kernel 3.9.  For older kernels, we
> would just be out of luck.  The ioctl method will work with any kernel
> version (2.6+).  It's not clear if this is a problem for DPDK apps or
> not.
> 
> thanks
> dan
> 
> 
> On Thu, Aug 30, 2018 at 7:11 PM, Dan Gora <dg@adax.com> wrote:
> > On Thu, Aug 30, 2018 at 7:09 PM, Stephen Hemminger
> > <stephen@networkplumber.org> wrote:  
> >> On Thu, 30 Aug 2018 18:41:14 -0300
> >> Dan Gora <dg@adax.com> wrote:
> >>  
> >>> On the other hand, the "write to /sys" method is a bit more simple and
> >>> confines the changes to the user space library.  If we're confident
> >>> that the /sys ABI is stable and not going to be changed going forward
> >>> it seems like a valid alternative.  
> >>
> >> See Documentation/ABI/testing/sysfs-class-net  
> >
> > yeah, but it's in the 'testing' directory :)
> >
> > From Documentation/ABI/README:
> >
> > testing/
> >
> > This directory documents interfaces that are felt to be stable,
> > as the main development of this interface has been completed.
> > The interface can be changed to add new features, but the
> > current interface will not break by doing this, unless grave
> > errors or security problems are found in them.  Userspace
> > programs can start to rely on these interfaces, but they must be
> > aware of changes that can occur before these interfaces move to
> > be marked stable.  Programs that use these interfaces are
> > strongly encouraged to add their name to the description of
> > these interfaces, so that the kernel developers can easily
> > notify them if any changes occur (see the description of the
> > layout of the files below for details on how to do this.)
> >
> > Like I said, I'm ok with using this if that's what everyone wants to do.
> >
> > d  
> 
> 
> 

Linux 3.9 is no longer supported. Currently, upstream Linux kernel is supported
from 3.16 on. If someone is on a kernel that old, they aren't going to get
any security fixes.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-09-05 12:57                           ` Stephen Hemminger
@ 2018-09-11 21:45                             ` Dan Gora
  2018-09-11 21:52                               ` Stephen Hemminger
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-09-11 21:45 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Igor Ryzhov, Ferruh Yigit, dev

Hi All,

So I implemented the "write to /sys/devices/virtual/net/*/carrier"
method to change the link status, but there is one more minor thing
that I wanted to point out about this approach.  The problem is that
you cannot read or write the '/sys/devices/virtual/net/*/carrier' file
while the interface is marked 'down'.  This means that link status
changes can only be performed by the DPDK application while the
interface is in the "up" state.  With the ioctl method, you can change
the carrier state at pretty much any time.

Is this a problem?  It's not a huge one, I guess, but it is something
else to consider.

Please let me know what you all think.

thanks
dan

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-09-11 21:45                             ` Dan Gora
@ 2018-09-11 21:52                               ` Stephen Hemminger
  2018-09-11 22:07                                 ` Dan Gora
  0 siblings, 1 reply; 45+ messages in thread
From: Stephen Hemminger @ 2018-09-11 21:52 UTC (permalink / raw)
  To: Dan Gora; +Cc: Igor Ryzhov, Ferruh Yigit, dev

On Tue, 11 Sep 2018 18:45:49 -0300
Dan Gora <dg@adax.com> wrote:

> Hi All,
> 
> So I implemented the "write to /sys/devices/virtual/net/*/carrier"
> method to change the link status, but there is one more minor thing
> that I wanted to point out about this approach.  The problem is that
> you cannot read or write the '/sys/devices/virtual/net/*/carrier' file
> while the interface is marked 'down'.  This means that link status
> changes can only be performed by the DPDK application while the
> interface is in the "up" state.  With the ioctl method, you can change
> the carrier state at pretty much any time.
> 
> Is this a problem?  It's not a huge one, I guess, but it is something
> else to consider.
> 
> Please let me know what you all think.
> 
> thanks
> dan

The carrier state has no meaning when device is down, at least for physical
devices. Because often the PHY is powered off when the device is marked down.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-09-11 21:52                               ` Stephen Hemminger
@ 2018-09-11 22:07                                 ` Dan Gora
  2018-09-11 23:14                                   ` Stephen Hemminger
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-09-11 22:07 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Igor Ryzhov, Ferruh Yigit, dev

On Tue, Sep 11, 2018 at 6:52 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> The carrier state has no meaning when device is down, at least for physical
> devices. Because often the PHY is powered off when the device is marked down.

The thing that caught my attention is that when you mark a kernel
ethernet device 'down', you get a message that the link is down in the
syslog.

snappy:root:bash 2645 => ip link set down dev eth0
Sep 11 18:32:48 snappy kernel: e1000e: eth0 NIC Link is Down

With this method, that's not possible because you cannot change the
link state from the callback from kni_net_release.

The carrier state doesn't have any meaning from a data transfer point
of view, but it's often useful for being able to diagnose connectivity
issues (is my cable plugged in or not).

I'm still not really clear what the objection really is to the ioctl
method.  Is it just the number of changes?  That the kernel driver has
to change as well?  Just that there is another way to do it?

thanks
dan

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH 0/2] kni: add API to set link status on kernel interface
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface Dan Gora
  2018-08-29 11:48     ` Ferruh Yigit
  2018-08-29 15:54     ` Stephen Hemminger
@ 2018-09-11 23:14     ` Dan Gora
  2018-09-11 23:14     ` [dpdk-dev] [PATCH 1/2] " Dan Gora
  3 siblings, 0 replies; 45+ messages in thread
From: Dan Gora @ 2018-09-11 23:14 UTC (permalink / raw)
  To: dev; +Cc: Igor Ryzhov, Stephen Hemminger, Ferruh Yigit, Dan Gora

Hi All,

The following patches are to add support for DPDK applications to be
able to change the carrier state of Linux network interfaces in the
KNI kernel module.

The carrier state is changed by writing to the Linux /sys file:
/sys/devices/virtual/net/<name>/carrier, where <name> is the KNI
interface name.

These patches supercede:
'[PATCH v2 10/10] kni: add API to set link status on kernel interface'
Message-Id: <20180629015508.26599-11-dg@adax.com>

Dan Gora (2):
  kni: add API to set link status on kernel interface
  kni: set default carrier state to 'off'

 kernel/linux/kni/kni_misc.c |  2 ++
 kernel/linux/kni/kni_net.c  |  2 ++
 lib/librte_kni/rte_kni.c    | 57 +++++++++++++++++++++++++++++++++++++
 lib/librte_kni/rte_kni.h    | 18 ++++++++++++
 4 files changed, 79 insertions(+)

-- 
2.19.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [dpdk-dev] [PATCH 1/2] kni: add API to set link status on kernel interface
  2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface Dan Gora
                       ` (2 preceding siblings ...)
  2018-09-11 23:14     ` [dpdk-dev] [PATCH 0/2] " Dan Gora
@ 2018-09-11 23:14     ` Dan Gora
  2018-09-11 23:18       ` Dan Gora
  3 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-09-11 23:14 UTC (permalink / raw)
  To: dev; +Cc: Igor Ryzhov, Stephen Hemminger, Ferruh Yigit, Dan Gora

Add a new API function to KNI, rte_kni_update_link() to allow DPDK
applications to update the link status for KNI network interfaces in
the linux kernel.

Signed-off-by: Dan Gora <dg@adax.com>
---
 lib/librte_kni/rte_kni.c | 57 ++++++++++++++++++++++++++++++++++++++++
 lib/librte_kni/rte_kni.h | 18 +++++++++++++
 2 files changed, 75 insertions(+)

diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 65f6a2b03..afe1d7410 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -790,6 +790,63 @@ rte_kni_unregister_handlers(struct rte_kni *kni)
 
 	return 0;
 }
+
+int
+rte_kni_update_link(struct rte_kni *kni, struct rte_eth_link *link)
+{
+	char path[64];
+	char carrier[2];
+	const char *new_carrier;
+	int fd, ret;
+
+	if (kni == NULL || link == NULL)
+		return -1;
+
+	snprintf(path, sizeof(path), "/sys/devices/virtual/net/%s/carrier",
+		kni->name);
+
+	fd = open(path, O_RDWR);
+	if (fd == -1) {
+		RTE_LOG(ERR, KNI, "Failed to open file: %s.\n", path);
+		return -1;
+	}
+
+	ret = read(fd, carrier, 2);
+	if (ret < 1) {
+		/* Cannot read carrier until interface is marked
+		 * 'up', so don't log an error.
+		 */
+		close(fd);
+		return -1;
+	}
+
+	new_carrier = (link->link_status == ETH_LINK_UP) ? "1" : "0";
+	ret = write(fd, new_carrier, 1);
+	if (ret < 1) {
+		RTE_LOG(ERR, KNI, "Failed to write file: %s.\n", path);
+		close(fd);
+		return -1;
+	}
+
+	if (strncmp(carrier, new_carrier, 1)) {
+		if (link->link_status == ETH_LINK_UP) {
+			RTE_LOG(INFO, KNI, "%s NIC Link is Up %d Mbps %s %s.\n",
+				kni->name,
+				link->link_speed,
+				link->link_autoneg ? "(Fixed)" : "(AutoNeg)",
+				link->link_duplex ?
+					"Full Duplex" : "Half Duplex");
+		} else {
+			RTE_LOG(INFO, KNI, "%s NIC Link is Down.\n",
+				kni->name);
+		}
+	}
+
+	close(fd);
+
+	return 0;
+}
+
 void
 rte_kni_close(void)
 {
diff --git a/lib/librte_kni/rte_kni.h b/lib/librte_kni/rte_kni.h
index 99055e2c2..4118ae97a 100644
--- a/lib/librte_kni/rte_kni.h
+++ b/lib/librte_kni/rte_kni.h
@@ -21,6 +21,7 @@
 #include <rte_memory.h>
 #include <rte_mempool.h>
 #include <rte_ether.h>
+#include <rte_ethdev.h>
 
 #include <exec-env/rte_kni_common.h>
 
@@ -228,6 +229,23 @@ int rte_kni_register_handlers(struct rte_kni *kni, struct rte_kni_ops *ops);
  */
 int rte_kni_unregister_handlers(struct rte_kni *kni);
 
+/**
+ * Update link status info for KNI port.
+ *
+ * Update the linkup/linkdown status of a KNI interface in the kernel.
+ *
+ * @param kni
+ *  pointer to struct rte_kni.
+ * @param link
+ *  pointer to struct rte_eth_link containing new interface status.
+ *
+ * @return
+ *  On success: 0
+ *  On failure: -1
+ */
+int __rte_experimental
+rte_kni_update_link(struct rte_kni *kni, struct rte_eth_link *link);
+
 /**
  *  Close KNI device.
  */
-- 
2.19.0

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-09-11 22:07                                 ` Dan Gora
@ 2018-09-11 23:14                                   ` Stephen Hemminger
  2018-09-12  4:02                                     ` Jason Wang
  0 siblings, 1 reply; 45+ messages in thread
From: Stephen Hemminger @ 2018-09-11 23:14 UTC (permalink / raw)
  To: Dan Gora; +Cc: Igor Ryzhov, Ferruh Yigit, dev

On Tue, 11 Sep 2018 19:07:47 -0300
Dan Gora <dg@adax.com> wrote:

> On Tue, Sep 11, 2018 at 6:52 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> > The carrier state has no meaning when device is down, at least for physical
> > devices. Because often the PHY is powered off when the device is marked down.  
> 
> The thing that caught my attention is that when you mark a kernel
> ethernet device 'down', you get a message that the link is down in the
> syslog.
> 
> snappy:root:bash 2645 => ip link set down dev eth0
> Sep 11 18:32:48 snappy kernel: e1000e: eth0 NIC Link is Down
> 
> With this method, that's not possible because you cannot change the
> link state from the callback from kni_net_release.
> 
> The carrier state doesn't have any meaning from a data transfer point
> of view, but it's often useful for being able to diagnose connectivity
> issues (is my cable plugged in or not).
> 
> I'm still not really clear what the objection really is to the ioctl
> method.  Is it just the number of changes?  That the kernel driver has
> to change as well?  Just that there is another way to do it?
> 
> thanks
> dan

I want to see KNI as part of the standard Linux kernel at some future date.
Having KNI as an out of tree driver means it is doomed to chasing tail lights
for the Linux kernel ABI instability and also problems with Linux distributions.

One of the barriers to entry for Linux drivers is introducing new ioctl's.
Ioctl's have issues with being device specific and also 32/64 compatiablity.
If KNI has ioctl's it makes it harder to get merged some day.

I freely admit that this is forcing KNI to respond to something that is not
there yet, so if it is too hard, then doing it with ioctl is going to be
necessary.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH 1/2] kni: add API to set link status on kernel interface
  2018-09-11 23:14     ` [dpdk-dev] [PATCH 1/2] " Dan Gora
@ 2018-09-11 23:18       ` Dan Gora
  0 siblings, 0 replies; 45+ messages in thread
From: Dan Gora @ 2018-09-11 23:18 UTC (permalink / raw)
  To: dev; +Cc: Igor Ryzhov, Stephen Hemminger, Ferruh Yigit, Dan Gora

Sorry all.. I totally goofed up with git-send-email..

Please ignore this message.

On Tue, Sep 11, 2018 at 8:14 PM, Dan Gora <dg@adax.com> wrote:
> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
> applications to update the link status for KNI network interfaces in
> the linux kernel.
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface
  2018-09-11 23:14                                   ` Stephen Hemminger
@ 2018-09-12  4:02                                     ` Jason Wang
  0 siblings, 0 replies; 45+ messages in thread
From: Jason Wang @ 2018-09-12  4:02 UTC (permalink / raw)
  To: Stephen Hemminger, Dan Gora; +Cc: Igor Ryzhov, Ferruh Yigit, dev



On 2018年09月12日 07:14, Stephen Hemminger wrote:
> On Tue, 11 Sep 2018 19:07:47 -0300
> Dan Gora <dg@adax.com> wrote:
>
>> On Tue, Sep 11, 2018 at 6:52 PM, Stephen Hemminger
>> <stephen@networkplumber.org> wrote:
>>> The carrier state has no meaning when device is down, at least for physical
>>> devices. Because often the PHY is powered off when the device is marked down.
>> The thing that caught my attention is that when you mark a kernel
>> ethernet device 'down', you get a message that the link is down in the
>> syslog.
>>
>> snappy:root:bash 2645 => ip link set down dev eth0
>> Sep 11 18:32:48 snappy kernel: e1000e: eth0 NIC Link is Down
>>
>> With this method, that's not possible because you cannot change the
>> link state from the callback from kni_net_release.
>>
>> The carrier state doesn't have any meaning from a data transfer point
>> of view, but it's often useful for being able to diagnose connectivity
>> issues (is my cable plugged in or not).
>>
>> I'm still not really clear what the objection really is to the ioctl
>> method.  Is it just the number of changes?  That the kernel driver has
>> to change as well?  Just that there is another way to do it?
>>
>> thanks
>> dan
> I want to see KNI as part of the standard Linux kernel at some future date.
> Having KNI as an out of tree driver means it is doomed to chasing tail lights
> for the Linux kernel ABI instability and also problems with Linux distributions.

Why not use vhost_net instead? KNI duplicates its function.

Thanks

>
> One of the barriers to entry for Linux drivers is introducing new ioctl's.
> Ioctl's have issues with being device specific and also 32/64 compatiablity.
> If KNI has ioctl's it makes it harder to get merged some day.
>
> I freely admit that this is forcing KNI to respond to something that is not
> there yet, so if it is too hard, then doing it with ioctl is going to be
> necessary.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface
  2018-09-04  0:36       ` Dan Gora
@ 2018-10-10 17:24         ` Ferruh Yigit
  2018-10-10 18:18           ` Dan Gora
  0 siblings, 1 reply; 45+ messages in thread
From: Ferruh Yigit @ 2018-10-10 17:24 UTC (permalink / raw)
  To: Dan Gora; +Cc: dev

On 9/4/2018 1:36 AM, Dan Gora wrote:
> Hi Ferruh,
> 
> I remembered now the motivation behind separating rte_kni_release()
> and rte_kni_free().
> 
> The problem is that the DPDK thread which calls rte_kni_release()
> _cannot_ be the same thread which handles callbacks from the KNI
> driver via rte_kni_handle_request().  This is because the thread which
> calls rte_kni_release() will be stuck down in
> ioctl(RTE_KNI_IOCTL_RELEASE) when the kernel calls the
> RTE_KNI_REQ_CFG_NETWORK_IF callback to the DPDK application.  Since
> that thread cannot call rte_kni_handle_request(), the callback would
> then just timeout unless some other thread calls
> rte_kni_handle_request().
> 
> So then you are in a bit of a chicken and egg situation.  You _have_
> to have a separate thread calling rte_kni_handle_request periodically,
> but that thread also _cannot_ run after rte_kni_release returns
> (actually it's worse than that because it's actually after the
> ioctl(RTE_KNI_IOCTL_RELEASE) returns and the fifos are freed).

I see, so we have problem in both end, -userspace side and kernel side.

Agreed that separating release & free may help, but I am not sure about adding a
new API for KNI.

Very simply, what about prevent kni_net_release() send callback to userspace?
This is already not working and removing it resolves the issues you mentioned.
Sample application calls rte_eth_dev_stop() after release itself, so behavior
will be same.

But the issues in kernel you mentioned, using `dev` after free_netdev() called
should be addressed.

> 
> So in order to resolve this, I separated the release from the freeing
> stages. This allows the DPDK application to keep the
> rte_kni_handle_request() thread running while rte_kni_release() is
> called so that it can handle the interface state callback, then kill
> that thread so that it cannot touch any 'struct rte_kni' resources,
> then free the struct rte_kni resources.
> 
> 
> thanks
> dan
> 
> On Wed, Aug 29, 2018 at 7:59 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> 
>>> When the kernel network interface is removed with unregister_netdev(),
>>> if the interface is up, it will generate a callback to mark the
>>> interface down, which calls kni_net_release().  kni_net_release() will
>>> block waiting for the DPDK application to call rte_kni_handle_request()
>>> to handle the callback, but it also needs the thread in the KNI driver
>>> (either the per-dev thread for multi-thread or the per-driver thread)
>>> to call kni_net_poll_resp() in order to wake the thread sleeping in
>>> kni_net_release (actually kni_net_process_request()).
>>>
>>> So now, KNI interfaces should be removed as such:
>>>
>>> 1) The user calls rte_kni_release().  This only unregisters the
>>> netdev in the kernel, but touches nothing else.  This allows all the
>>> threads to run which are necessary to handle the callback into the
>>> DPDK application to mark the interface down.
>>>
>>> 2) The user stops the thread running rte_kni_handle_request().
>>> After rte_kni_release() has been called, there will be no more
>>> callbacks for that interface so it is not necessary.  It cannot be
>>> running at the same time that rte_kni_free() frees all of the FIFOs
>>> and DPDK memory for that KNI interface.
>>>
>>> 3) The user calls rte_kni_free().  This performs the RTE_KNI_IOCTL_FREE
>>> ioctl which calls kni_ioctl_free().  This function removes the struct
>>> kni_dev from the list of interfaces to poll (and kills the per-dev
>>> kthread, if configured for multi-thread), then frees the memory in
>>> the FIFOs.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface
  2018-10-10 17:24         ` Ferruh Yigit
@ 2018-10-10 18:18           ` Dan Gora
  2018-10-10 22:51             ` Ferruh Yigit
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Gora @ 2018-10-10 18:18 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev

On Wed, Oct 10, 2018 at 2:25 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>
> On 9/4/2018 1:36 AM, Dan Gora wrote:
> > Hi Ferruh,
> >
> > I remembered now the motivation behind separating rte_kni_release()
> > and rte_kni_free().
> >
> > The problem is that the DPDK thread which calls rte_kni_release()
> > _cannot_ be the same thread which handles callbacks from the KNI
> > driver via rte_kni_handle_request().  This is because the thread which
> > calls rte_kni_release() will be stuck down in
> > ioctl(RTE_KNI_IOCTL_RELEASE) when the kernel calls the
> > RTE_KNI_REQ_CFG_NETWORK_IF callback to the DPDK application.  Since
> > that thread cannot call rte_kni_handle_request(), the callback would
> > then just timeout unless some other thread calls
> > rte_kni_handle_request().
> >
> > So then you are in a bit of a chicken and egg situation.  You _have_
> > to have a separate thread calling rte_kni_handle_request periodically,
> > but that thread also _cannot_ run after rte_kni_release returns
> > (actually it's worse than that because it's actually after the
> > ioctl(RTE_KNI_IOCTL_RELEASE) returns and the fifos are freed).
>
> I see, so we have problem in both end, -userspace side and kernel side.
>
> Agreed that separating release & free may help, but I am not sure about adding a
> new API for KNI.
>
> Very simply, what about prevent kni_net_release() send callback to userspace?

No, because how is the DPDK application going to know when the user
does 'ip link set down dev <kniX>'?   It's important for the DPDK
application to know when the KNI interface is marked down.

> This is already not working and removing it resolves the issues you mentioned.

Huh?  How is it not working?  Of course it works.

> Sample application calls rte_eth_dev_stop() after release itself, so behavior
> will be same.

Huh?

> But the issues in kernel you mentioned, using `dev` after free_netdev() called
> should be addressed.

Yes, that's why I fixed them in the patches that I sent.

d

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface
  2018-10-10 18:18           ` Dan Gora
@ 2018-10-10 22:51             ` Ferruh Yigit
  2018-10-10 23:38               ` Dan Gora
  0 siblings, 1 reply; 45+ messages in thread
From: Ferruh Yigit @ 2018-10-10 22:51 UTC (permalink / raw)
  To: Dan Gora; +Cc: dev

On 10/10/2018 7:18 PM, Dan Gora wrote:
> On Wed, Oct 10, 2018 at 2:25 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>>
>> On 9/4/2018 1:36 AM, Dan Gora wrote:
>>> Hi Ferruh,
>>>
>>> I remembered now the motivation behind separating rte_kni_release()
>>> and rte_kni_free().
>>>
>>> The problem is that the DPDK thread which calls rte_kni_release()
>>> _cannot_ be the same thread which handles callbacks from the KNI
>>> driver via rte_kni_handle_request().  This is because the thread which
>>> calls rte_kni_release() will be stuck down in
>>> ioctl(RTE_KNI_IOCTL_RELEASE) when the kernel calls the
>>> RTE_KNI_REQ_CFG_NETWORK_IF callback to the DPDK application.  Since
>>> that thread cannot call rte_kni_handle_request(), the callback would
>>> then just timeout unless some other thread calls
>>> rte_kni_handle_request().
>>>
>>> So then you are in a bit of a chicken and egg situation.  You _have_
>>> to have a separate thread calling rte_kni_handle_request periodically,
>>> but that thread also _cannot_ run after rte_kni_release returns
>>> (actually it's worse than that because it's actually after the
>>> ioctl(RTE_KNI_IOCTL_RELEASE) returns and the fifos are freed).
>>
>> I see, so we have problem in both end, -userspace side and kernel side.
>>
>> Agreed that separating release & free may help, but I am not sure about adding a
>> new API for KNI.
>>
>> Very simply, what about prevent kni_net_release() send callback to userspace?
> 
> No, because how is the DPDK application going to know when the user
> does 'ip link set down dev <kniX>'?   It's important for the DPDK
> application to know when the KNI interface is marked down.

I mean kni_net_release() called because of unregister_netdev(),

it is possible to set a flag in kni_dev_remove(), before unregister_netdev(),
and in kni_net_release() don't call kni_net_process_request() if flag is set.

Looks like it can work and only a few lines of code, what do you think?

> 
>> This is already not working and removing it resolves the issues you mentioned.
> 
> Huh?  How is it not working?  Of course it works.

The kni_net_release() called because of unregister_netdev() is not working, as
you explained in userspace the thread handles request already terminated, even
if not in kernel side response not received and timed off because of lock...

> 
>> Sample application calls rte_eth_dev_stop() after release itself, so behavior
>> will be same.
> 
> Huh?

in kni sample app, in kni_free_kni() rte_eth_dev_stop() is called after
rte_kni_release().
So if you prevent kni_net_release() called because of unregister_netdev() to
send callback it won't be problem because of existing rte_eth_dev_stop()

> 
>> But the issues in kernel you mentioned, using `dev` after free_netdev() called
>> should be addressed.
> 
> Yes, that's why I fixed them in the patches that I sent.
> 
> d
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface
  2018-10-10 22:51             ` Ferruh Yigit
@ 2018-10-10 23:38               ` Dan Gora
  0 siblings, 0 replies; 45+ messages in thread
From: Dan Gora @ 2018-10-10 23:38 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev

I'll have a look at this.. Maybe it will work, I'll have to try it and
test it to see if there are any gotchas that are not obvious.

d

On Wed, Oct 10, 2018 at 7:51 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >> Very simply, what about prevent kni_net_release() send callback to userspace?
> >
> > No, because how is the DPDK application going to know when the user
> > does 'ip link set down dev <kniX>'?   It's important for the DPDK
> > application to know when the KNI interface is marked down.
>
> I mean kni_net_release() called because of unregister_netdev(),
>
> it is possible to set a flag in kni_dev_remove(), before unregister_netdev(),
> and in kni_net_release() don't call kni_net_process_request() if flag is set.
>
> Looks like it can work and only a few lines of code, what do you think?
>
> >
> >> This is already not working and removing it resolves the issues you mentioned.
> >
> > Huh?  How is it not working?  Of course it works.
>
> The kni_net_release() called because of unregister_netdev() is not working, as
> you explained in userspace the thread handles request already terminated, even
> if not in kernel side response not received and timed off because of lock...
>
> >
> >> Sample application calls rte_eth_dev_stop() after release itself, so behavior
> >> will be same.
> >
> > Huh?
>
> in kni sample app, in kni_free_kni() rte_eth_dev_stop() is called after
> rte_kni_release().
> So if you prevent kni_net_release() called because of unregister_netdev() to
> send callback it won't be problem because of existing rte_eth_dev_stop()
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2018-10-10 23:38 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-28 22:45 [dpdk-dev] [PATCH 00/10] kni: Interface detach and link status fixes Dan Gora
2018-06-29  1:54 ` Dan Gora
2018-06-29  1:54   ` [dpdk-dev] [PATCH v2 01/10] kni: remove unused variables from struct kni_dev Dan Gora
2018-08-29 10:29     ` Ferruh Yigit
2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface Dan Gora
2018-08-29 10:59     ` Ferruh Yigit
2018-09-04  0:20       ` Dan Gora
2018-09-04  0:36       ` Dan Gora
2018-10-10 17:24         ` Ferruh Yigit
2018-10-10 18:18           ` Dan Gora
2018-10-10 22:51             ` Ferruh Yigit
2018-10-10 23:38               ` Dan Gora
2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 03/10] kni: don't touch struct kni_dev after freeing Dan Gora
2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 04/10] kni: add rte_kni_free to KNI library Dan Gora
2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 05/10] kni: don't run rte_kni_handle_request after interface release Dan Gora
2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 06/10] kni: increase length of timeout for KNI responses Dan Gora
2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 07/10] kni: update kni test for rte_kni_free Dan Gora
2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 08/10] kni: add rte_kni_free to KNI example app Dan Gora
2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 09/10] kni: add rte_kni_free to KNI vdev driver Dan Gora
2018-06-29  1:55   ` [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface Dan Gora
2018-08-29 11:48     ` Ferruh Yigit
2018-08-29 21:10       ` Dan Gora
2018-08-29 22:01         ` Stephen Hemminger
2018-08-29 15:54     ` Stephen Hemminger
2018-08-29 21:02       ` Dan Gora
2018-08-29 22:00         ` Stephen Hemminger
2018-08-29 22:12           ` Dan Gora
2018-08-29 22:41             ` Dan Gora
2018-08-29 23:10               ` Stephen Hemminger
2018-08-30  9:49                 ` Igor Ryzhov
2018-08-30 10:32                   ` Igor Ryzhov
2018-08-30 21:41                   ` Dan Gora
2018-08-30 22:09                     ` Stephen Hemminger
2018-08-30 22:11                       ` Dan Gora
2018-09-04  0:47                         ` Dan Gora
2018-09-05 12:57                           ` Stephen Hemminger
2018-09-11 21:45                             ` Dan Gora
2018-09-11 21:52                               ` Stephen Hemminger
2018-09-11 22:07                                 ` Dan Gora
2018-09-11 23:14                                   ` Stephen Hemminger
2018-09-12  4:02                                     ` Jason Wang
2018-09-11 23:14     ` [dpdk-dev] [PATCH 0/2] " Dan Gora
2018-09-11 23:14     ` [dpdk-dev] [PATCH 1/2] " Dan Gora
2018-09-11 23:18       ` Dan Gora
2018-07-20 11:36   ` [dpdk-dev] [PATCH 00/10] kni: Interface detach and link status fixes Ferruh Yigit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).