DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH] vhost: fix coredump on port deletion
@ 2021-08-07  8:25 Gaoxiang Liu
  2021-08-07 23:12 ` [dpdk-dev] [PATCH v2] " Gaoxiang Liu
  0 siblings, 1 reply; 19+ messages in thread
From: Gaoxiang Liu @ 2021-08-07  8:25 UTC (permalink / raw)
  To: Maxime Coquelin, Chenbo Xia; +Cc: dev, liugaoxiang, Gaoxiang Liu

The rte_vhost_driver_unregister() and vhost_user_read_cb()
can be called at the same time by 2 threads.
Eg thread1 calls rte_vhost_driver_unregister() and frees memory of conn,
then thread2 calls vhost_user_read_cb() and access invalid memory of
conn.

The fix is to move the "fdset_try_del" in front of free memory of conn ,
then avoid the race condition.

Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---
 lib/vhost/socket.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 5d0d728d5..bc7278e8a 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -1024,6 +1024,19 @@ rte_vhost_driver_unregister(const char *path)
 	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
 		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
 
+		if (vsocket->is_server) {
+			/*
+			 * If r/wcb is executing, release vhost_user's
+			 * mutex lock, and try again since the r/wcb
+			 * may use the mutex lock.
+			 */
+			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
+				pthread_mutex_unlock(&vhost_user.mutex);
+				goto again;
+			}
+		 } else if (vsocket->reconnect)
+			vhost_user_remove_reconnect(vsocket);
+
 		if (!strcmp(vsocket->path, path)) {
 			pthread_mutex_lock(&vsocket->conn_mutex);
 			for (conn = TAILQ_FIRST(&vsocket->conn_list);
@@ -1056,21 +1069,8 @@ rte_vhost_driver_unregister(const char *path)
 			pthread_mutex_unlock(&vsocket->conn_mutex);
 
 			if (vsocket->is_server) {
-				/*
-				 * If r/wcb is executing, release vhost_user's
-				 * mutex lock, and try again since the r/wcb
-				 * may use the mutex lock.
-				 */
-				if (fdset_try_del(&vhost_user.fdset,
-						vsocket->socket_fd) == -1) {
-					pthread_mutex_unlock(&vhost_user.mutex);
-					goto again;
-				}
-
 				close(vsocket->socket_fd);
 				unlink(path);
-			} else if (vsocket->reconnect) {
-				vhost_user_remove_reconnect(vsocket);
 			}
 
 			pthread_mutex_destroy(&vsocket->conn_mutex);
-- 
2.32.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v2] vhost: fix coredump on port deletion
  2021-08-07  8:25 [dpdk-dev] [PATCH] vhost: fix coredump on port deletion Gaoxiang Liu
@ 2021-08-07 23:12 ` Gaoxiang Liu
  2021-08-13 14:02   ` [dpdk-dev] [PATCH] vhost: fix crash on port deletion The rte_vhost_driver_unregister() and vhost_user_read_cb() can be called at the same time by 2 threads. Eg thread1 calls rte_vhost_driver_unregister() and frees memory of "conn". Because socket fd has not been deleted from poll waiting fds, "vhost-events" thread calls fdset_event_dispatch, then calls vhost_user_read_cb(), and accesses invalid memory of "conn" Gaoxiang Liu
  2021-08-13 14:22   ` [dpdk-dev] [PATCH] vhost: fix crash on port deletion Gaoxiang Liu
  0 siblings, 2 replies; 19+ messages in thread
From: Gaoxiang Liu @ 2021-08-07 23:12 UTC (permalink / raw)
  To: Maxime Coquelin, Chenbo Xia; +Cc: dev, liugaoxiang, Gaoxiang Liu

The rte_vhost_driver_unregister() and vhost_user_read_cb()
can be called at the same time by 2 threads.
Eg thread1 calls rte_vhost_driver_unregister() and frees memory of conn,
then thread2 calls vhost_user_read_cb() and access invalid memory of
conn.

The fix is to move the "fdset_try_del" in front of free memory of conn ,
then avoid the race condition.

Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")

v2:
fix coding style issues

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---
 lib/vhost/socket.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 5d0d728d5..2eb8fcadd 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -1024,6 +1024,20 @@ rte_vhost_driver_unregister(const char *path)
 	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
 		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
 
+		if (vsocket->is_server) {
+			/*
+			 * If r/wcb is executing, release vhost_user's
+			 * mutex lock, and try again since the r/wcb
+			 * may use the mutex lock.
+			 */
+			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
+				pthread_mutex_unlock(&vhost_user.mutex);
+				goto again;
+			}
+		} else if (vsocket->reconnect) {
+			vhost_user_remove_reconnect(vsocket);
+		}
+
 		if (!strcmp(vsocket->path, path)) {
 			pthread_mutex_lock(&vsocket->conn_mutex);
 			for (conn = TAILQ_FIRST(&vsocket->conn_list);
@@ -1056,21 +1070,8 @@ rte_vhost_driver_unregister(const char *path)
 			pthread_mutex_unlock(&vsocket->conn_mutex);
 
 			if (vsocket->is_server) {
-				/*
-				 * If r/wcb is executing, release vhost_user's
-				 * mutex lock, and try again since the r/wcb
-				 * may use the mutex lock.
-				 */
-				if (fdset_try_del(&vhost_user.fdset,
-						vsocket->socket_fd) == -1) {
-					pthread_mutex_unlock(&vhost_user.mutex);
-					goto again;
-				}
-
 				close(vsocket->socket_fd);
 				unlink(path);
-			} else if (vsocket->reconnect) {
-				vhost_user_remove_reconnect(vsocket);
 			}
 
 			pthread_mutex_destroy(&vsocket->conn_mutex);
-- 
2.32.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH] vhost: fix crash on port deletion The rte_vhost_driver_unregister() and vhost_user_read_cb() can be called at the same time by 2 threads. Eg thread1 calls rte_vhost_driver_unregister() and frees memory of "conn". Because socket fd has not been deleted from poll waiting fds, "vhost-events" thread calls fdset_event_dispatch, then calls vhost_user_read_cb(), and accesses invalid memory of "conn".
  2021-08-07 23:12 ` [dpdk-dev] [PATCH v2] " Gaoxiang Liu
@ 2021-08-13 14:02   ` Gaoxiang Liu
  2021-08-13 14:22   ` [dpdk-dev] [PATCH] vhost: fix crash on port deletion Gaoxiang Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Gaoxiang Liu @ 2021-08-13 14:02 UTC (permalink / raw)
  To: maxime.coquelin, chenbo.xia; +Cc: dev, liugaoxiang, Gaoxiang Liu

The fix is to move the "fdset_try_del" in front of free memory of conn,
then avoid the race condition.

The core trace is:
Program terminated with signal 11, Segmentation fault.

Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")

v2:
fix coding style issues

v3:
add detailed log

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---
 lib/vhost/socket.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 5d0d728d5..2eb8fcadd 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -1024,6 +1024,20 @@ rte_vhost_driver_unregister(const char *path)
 	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
 		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
 
+		if (vsocket->is_server) {
+			/*
+			 * If r/wcb is executing, release vhost_user's
+			 * mutex lock, and try again since the r/wcb
+			 * may use the mutex lock.
+			 */
+			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
+				pthread_mutex_unlock(&vhost_user.mutex);
+				goto again;
+			}
+		} else if (vsocket->reconnect) {
+			vhost_user_remove_reconnect(vsocket);
+		}
+
 		if (!strcmp(vsocket->path, path)) {
 			pthread_mutex_lock(&vsocket->conn_mutex);
 			for (conn = TAILQ_FIRST(&vsocket->conn_list);
@@ -1056,21 +1070,8 @@ rte_vhost_driver_unregister(const char *path)
 			pthread_mutex_unlock(&vsocket->conn_mutex);
 
 			if (vsocket->is_server) {
-				/*
-				 * If r/wcb is executing, release vhost_user's
-				 * mutex lock, and try again since the r/wcb
-				 * may use the mutex lock.
-				 */
-				if (fdset_try_del(&vhost_user.fdset,
-						vsocket->socket_fd) == -1) {
-					pthread_mutex_unlock(&vhost_user.mutex);
-					goto again;
-				}
-
 				close(vsocket->socket_fd);
 				unlink(path);
-			} else if (vsocket->reconnect) {
-				vhost_user_remove_reconnect(vsocket);
 			}
 
 			pthread_mutex_destroy(&vsocket->conn_mutex);
-- 
2.32.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH] vhost: fix crash on port deletion
  2021-08-07 23:12 ` [dpdk-dev] [PATCH v2] " Gaoxiang Liu
  2021-08-13 14:02   ` [dpdk-dev] [PATCH] vhost: fix crash on port deletion The rte_vhost_driver_unregister() and vhost_user_read_cb() can be called at the same time by 2 threads. Eg thread1 calls rte_vhost_driver_unregister() and frees memory of "conn". Because socket fd has not been deleted from poll waiting fds, "vhost-events" thread calls fdset_event_dispatch, then calls vhost_user_read_cb(), and accesses invalid memory of "conn" Gaoxiang Liu
@ 2021-08-13 14:22   ` Gaoxiang Liu
  2021-08-16  6:44     ` Xia, Chenbo
  2021-08-18 16:08     ` [dpdk-dev] [PATCH v4] " Gaoxiang Liu
  1 sibling, 2 replies; 19+ messages in thread
From: Gaoxiang Liu @ 2021-08-13 14:22 UTC (permalink / raw)
  To: maxime.coquelin, chenbo.xia; +Cc: dev, liugaoxiang, Gaoxiang Liu

The rte_vhost_driver_unregister() and vhost_user_read_cb()
can be called at the same time by 2 threads.
Eg thread1 calls rte_vhost_driver_unregister() and frees memory of
"conn".
Because socket fd has not been deleted from poll waiting fds,
"vhost-events" thread calls fdset_event_dispatch, then calls
vhost_user_read_cb(),
and accesses invalid memory of "conn".

The fix is to move the "fdset_try_del" in front of free memory of conn,
then avoid the race condition.

The core trace is:
Program terminated with signal 11, Segmentation fault.

Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")

v2:
fix coding style issues

v3:
add detailed log

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---
 lib/vhost/socket.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 5d0d728d5..2eb8fcadd 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -1024,6 +1024,20 @@ rte_vhost_driver_unregister(const char *path)
 	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
 		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
 
+		if (vsocket->is_server) {
+			/*
+			 * If r/wcb is executing, release vhost_user's
+			 * mutex lock, and try again since the r/wcb
+			 * may use the mutex lock.
+			 */
+			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
+				pthread_mutex_unlock(&vhost_user.mutex);
+				goto again;
+			}
+		} else if (vsocket->reconnect) {
+			vhost_user_remove_reconnect(vsocket);
+		}
+
 		if (!strcmp(vsocket->path, path)) {
 			pthread_mutex_lock(&vsocket->conn_mutex);
 			for (conn = TAILQ_FIRST(&vsocket->conn_list);
@@ -1056,21 +1070,8 @@ rte_vhost_driver_unregister(const char *path)
 			pthread_mutex_unlock(&vsocket->conn_mutex);
 
 			if (vsocket->is_server) {
-				/*
-				 * If r/wcb is executing, release vhost_user's
-				 * mutex lock, and try again since the r/wcb
-				 * may use the mutex lock.
-				 */
-				if (fdset_try_del(&vhost_user.fdset,
-						vsocket->socket_fd) == -1) {
-					pthread_mutex_unlock(&vhost_user.mutex);
-					goto again;
-				}
-
 				close(vsocket->socket_fd);
 				unlink(path);
-			} else if (vsocket->reconnect) {
-				vhost_user_remove_reconnect(vsocket);
 			}
 
 			pthread_mutex_destroy(&vsocket->conn_mutex);
-- 
2.32.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix crash on port deletion
  2021-08-13 14:22   ` [dpdk-dev] [PATCH] vhost: fix crash on port deletion Gaoxiang Liu
@ 2021-08-16  6:44     ` Xia, Chenbo
  2021-08-20 15:53       ` Gaoxiang Liu
  2021-08-18 16:08     ` [dpdk-dev] [PATCH v4] " Gaoxiang Liu
  1 sibling, 1 reply; 19+ messages in thread
From: Xia, Chenbo @ 2021-08-16  6:44 UTC (permalink / raw)
  To: Gaoxiang Liu, maxime.coquelin; +Cc: dev, liugaoxiang

Hi Gaoxiang,

> -----Original Message-----
> From: Gaoxiang Liu <gaoxiangliu0@163.com>
> Sent: Friday, August 13, 2021 10:23 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; liugaoxiang@huawei.com; Gaoxiang Liu <gaoxiangliu0@163.com>
> Subject: [PATCH] vhost: fix crash on port deletion
> 
> The rte_vhost_driver_unregister() and vhost_user_read_cb()
> can be called at the same time by 2 threads.
> Eg thread1 calls rte_vhost_driver_unregister() and frees memory of
> "conn".
> Because socket fd has not been deleted from poll waiting fds,
> "vhost-events" thread calls fdset_event_dispatch, then calls
> vhost_user_read_cb(),
> and accesses invalid memory of "conn".
> 
> The fix is to move the "fdset_try_del" in front of free memory of conn,
> then avoid the race condition.

I'm a bit confused here. 

First, are you talking about vhost as server or client? I guess it's server but
you'd better clarify it's a bug of both mode or one mode.

Based on we are talking about vhost as server:
we both know there are two types of socket fd. One for new connections coming,
let's call it listen fd. The other is one per connection for send/recv messages,
let's call it conn fd. And vhost_user_read_cb is called when conn fd gets some
messages. The original code first delete the conn fd from fd_set so that
vhost_user_read_cb will not be called, then free the 'conn'. This seems correct
to me.

You are moving the code of deleting the listen fd here. It's for trigger of callback
vhost_user_server_new_connection but not vhost_user_read_cb. So I don't understand
your point. Do I miss something here?

BTW, note two things:
1. please add correct prefix when 'git send-email'. E.g., for this patch, you should
add --subject-prefix="PATCH v3"
2. the version change description should be after signed-off tag, for example:

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---
V2:
XXX
V3:
XXX
---
 lib/vhost/socket.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

Thanks,
Chenbo

> 
> The core trace is:
> Program terminated with signal 11, Segmentation fault.
> 
> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
> 
> v2:
> fix coding style issues
> 
> v3:
> add detailed log
> 
> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
> ---
>  lib/vhost/socket.c | 27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
> index 5d0d728d5..2eb8fcadd 100644
> --- a/lib/vhost/socket.c
> +++ b/lib/vhost/socket.c
> @@ -1024,6 +1024,20 @@ rte_vhost_driver_unregister(const char *path)
>  	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
>  		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
> 
> +		if (vsocket->is_server) {
> +			/*
> +			 * If r/wcb is executing, release vhost_user's
> +			 * mutex lock, and try again since the r/wcb
> +			 * may use the mutex lock.
> +			 */
> +			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) ==
> -1) {
> +				pthread_mutex_unlock(&vhost_user.mutex);
> +				goto again;
> +			}
> +		} else if (vsocket->reconnect) {
> +			vhost_user_remove_reconnect(vsocket);
> +		}
> +
>  		if (!strcmp(vsocket->path, path)) {
>  			pthread_mutex_lock(&vsocket->conn_mutex);
>  			for (conn = TAILQ_FIRST(&vsocket->conn_list);
> @@ -1056,21 +1070,8 @@ rte_vhost_driver_unregister(const char *path)
>  			pthread_mutex_unlock(&vsocket->conn_mutex);
> 
>  			if (vsocket->is_server) {
> -				/*
> -				 * If r/wcb is executing, release vhost_user's
> -				 * mutex lock, and try again since the r/wcb
> -				 * may use the mutex lock.
> -				 */
> -				if (fdset_try_del(&vhost_user.fdset,
> -						vsocket->socket_fd) == -1) {
> -					pthread_mutex_unlock(&vhost_user.mutex);
> -					goto again;
> -				}
> -
>  				close(vsocket->socket_fd);
>  				unlink(path);
> -			} else if (vsocket->reconnect) {
> -				vhost_user_remove_reconnect(vsocket);
>  			}
> 
>  			pthread_mutex_destroy(&vsocket->conn_mutex);
> --
> 2.32.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v4] vhost: fix crash on port deletion
  2021-08-13 14:22   ` [dpdk-dev] [PATCH] vhost: fix crash on port deletion Gaoxiang Liu
  2021-08-16  6:44     ` Xia, Chenbo
@ 2021-08-18 16:08     ` Gaoxiang Liu
  2021-08-20 15:46       ` [dpdk-dev] [PATCH v5] " Gaoxiang Liu
  1 sibling, 1 reply; 19+ messages in thread
From: Gaoxiang Liu @ 2021-08-18 16:08 UTC (permalink / raw)
  To: maxime.coquelin, chenbo.xia; +Cc: dev, liugaoxiang, Gaoxiang Liu

The rte_vhost_driver_unregister() and vhost_user_server_new_connection()
can be called at the same time by 2 threads.
Eg thread1 calls rte_vhost_driver_unregister() and frees the memory of
conn_list.
"vhost-events" thread calls fdset_event_dispatch,
then calls vhost_user_server_new_connection().
A new conn fd is added in fdset in vhost_user_server_new_connection(),
then "vhost-events" thread calls vhost_user_read_cb().
when thread1 frees the memory of vsocket, vhost_user_read_cb()
will access invalid memory of socket.

When vhostuser port is created as a client, the issue also exists.

The fix is to move the "fdset_try_del" in front of free memory of conn,
then avoid the race condition.

The core trace is:
Program terminated with signal 11, Segmentation fault.

Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>

v2:
* Fix coding style issues

v3:
* Add detailed log

v4:
* Add the reason, when vhostuser port is created as a server.
---
 lib/vhost/socket.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 5d0d728d5..2eb8fcadd 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -1024,6 +1024,20 @@ rte_vhost_driver_unregister(const char *path)
 	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
 		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
 
+		if (vsocket->is_server) {
+			/*
+			 * If r/wcb is executing, release vhost_user's
+			 * mutex lock, and try again since the r/wcb
+			 * may use the mutex lock.
+			 */
+			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
+				pthread_mutex_unlock(&vhost_user.mutex);
+				goto again;
+			}
+		} else if (vsocket->reconnect) {
+			vhost_user_remove_reconnect(vsocket);
+		}
+
 		if (!strcmp(vsocket->path, path)) {
 			pthread_mutex_lock(&vsocket->conn_mutex);
 			for (conn = TAILQ_FIRST(&vsocket->conn_list);
@@ -1056,21 +1070,8 @@ rte_vhost_driver_unregister(const char *path)
 			pthread_mutex_unlock(&vsocket->conn_mutex);
 
 			if (vsocket->is_server) {
-				/*
-				 * If r/wcb is executing, release vhost_user's
-				 * mutex lock, and try again since the r/wcb
-				 * may use the mutex lock.
-				 */
-				if (fdset_try_del(&vhost_user.fdset,
-						vsocket->socket_fd) == -1) {
-					pthread_mutex_unlock(&vhost_user.mutex);
-					goto again;
-				}
-
 				close(vsocket->socket_fd);
 				unlink(path);
-			} else if (vsocket->reconnect) {
-				vhost_user_remove_reconnect(vsocket);
 			}
 
 			pthread_mutex_destroy(&vsocket->conn_mutex);
-- 
2.32.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v5] vhost: fix crash on port deletion
  2021-08-18 16:08     ` [dpdk-dev] [PATCH v4] " Gaoxiang Liu
@ 2021-08-20 15:46       ` Gaoxiang Liu
  2021-08-26  8:37         ` Xia, Chenbo
  2021-08-27 14:19         ` [dpdk-dev] [PATCH v6] " Gaoxiang Liu
  0 siblings, 2 replies; 19+ messages in thread
From: Gaoxiang Liu @ 2021-08-20 15:46 UTC (permalink / raw)
  To: maxime.coquelin, chenbo.xia; +Cc: dev, liugaoxiang, Gaoxiang Liu

The rte_vhost_driver_unregister() and vhost_user_read_cb()
can be called at the same time by 2 threads.
when memory of vsocket is freed in rte_vhost_driver_unregister(),
the invalid memory of vsocket is accessd in vhost_user_read_cb().
It's a bug of both mode for vhost as server or client.

Eg vhostuser port is created as server.
Thread1 calls rte_vhost_driver_unregister().
Before the listen fd is deleted from poll waiting fds,
"vhost-events" thread then calls vhost_user_server_new_connection(),
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
access invalid memory of socket while thread1 frees the memory of
vsocket.

Eg vhostuser port is created as client.
Thread1 calls rte_vhost_driver_unregister().
Before vsocket of reconn is deleted from reconn list,
"vhost_reconn" thread then calls vhost_user_add_connection()
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
access invalid memory of socket while thread1 frees the memory of
vsocket.

The fix is to move the "fdset_try_del" in front of free memory of conn,
then avoid the race condition.

The core trace is:
Program terminated with signal 11, Segmentation fault.

Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>

v2:
* Fix coding style issues.

v3:
* Add detailed log.

v4:
* Add the reason when vhostuser port is created as server.

v5:
* Add detailed log when vhostuser port is created as client.
---
 lib/vhost/socket.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 5d0d728d5..2eb8fcadd 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -1024,6 +1024,20 @@ rte_vhost_driver_unregister(const char *path)
 	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
 		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
 
+		if (vsocket->is_server) {
+			/*
+			 * If r/wcb is executing, release vhost_user's
+			 * mutex lock, and try again since the r/wcb
+			 * may use the mutex lock.
+			 */
+			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
+				pthread_mutex_unlock(&vhost_user.mutex);
+				goto again;
+			}
+		} else if (vsocket->reconnect) {
+			vhost_user_remove_reconnect(vsocket);
+		}
+
 		if (!strcmp(vsocket->path, path)) {
 			pthread_mutex_lock(&vsocket->conn_mutex);
 			for (conn = TAILQ_FIRST(&vsocket->conn_list);
@@ -1056,21 +1070,8 @@ rte_vhost_driver_unregister(const char *path)
 			pthread_mutex_unlock(&vsocket->conn_mutex);
 
 			if (vsocket->is_server) {
-				/*
-				 * If r/wcb is executing, release vhost_user's
-				 * mutex lock, and try again since the r/wcb
-				 * may use the mutex lock.
-				 */
-				if (fdset_try_del(&vhost_user.fdset,
-						vsocket->socket_fd) == -1) {
-					pthread_mutex_unlock(&vhost_user.mutex);
-					goto again;
-				}
-
 				close(vsocket->socket_fd);
 				unlink(path);
-			} else if (vsocket->reconnect) {
-				vhost_user_remove_reconnect(vsocket);
 			}
 
 			pthread_mutex_destroy(&vsocket->conn_mutex);
-- 
2.32.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: fix crash on port deletion
  2021-08-16  6:44     ` Xia, Chenbo
@ 2021-08-20 15:53       ` Gaoxiang Liu
  0 siblings, 0 replies; 19+ messages in thread
From: Gaoxiang Liu @ 2021-08-20 15:53 UTC (permalink / raw)
  To: Xia, Chenbo; +Cc: maxime.coquelin, dev, liugaoxiang

Hi Chenbo,

I add more detailed reason as below in [PATCH  v5]
The rte_vhost_driver_unregister() and vhost_user_read_cb()
can be called at the same time by 2 threads.
when memory of vsocket is freed in rte_vhost_driver_unregister(),
the invalid memory of vsocket is accessd in vhost_user_read_cb().
It's a bug of both mode for vhost as server or client.

Eg vhostuser port is created as server.
Thread1 calls rte_vhost_driver_unregister().
Before the listen fd is deleted from poll waiting fds,
"vhost-events" thread then calls vhost_user_server_new_connection(),
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
access invalid memory of socket while thread1 frees the memory of vsocket.

Eg vhostuser port is created as client.
Thread1 calls rte_vhost_driver_unregister().
Before vsocket of reconn is deleted from reconn list,
"vhost_reconn" thread then calls vhost_user_add_connection()
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
access invalid memory of socket while thread1 frees the memory of vsocket.



On 08/16/2021 14:44, Xia, Chenbo wrote:
Hi Gaoxiang,

> -----Original Message-----
> From: Gaoxiang Liu <gaoxiangliu0@163.com>
> Sent: Friday, August 13, 2021 10:23 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; liugaoxiang@huawei.com; Gaoxiang Liu <gaoxiangliu0@163.com>
> Subject: [PATCH] vhost: fix crash on port deletion
>
> The rte_vhost_driver_unregister() and vhost_user_read_cb()
> can be called at the same time by 2 threads.
> Eg thread1 calls rte_vhost_driver_unregister() and frees memory of
> "conn".
> Because socket fd has not been deleted from poll waiting fds,
> "vhost-events" thread calls fdset_event_dispatch, then calls
> vhost_user_read_cb(),
> and accesses invalid memory of "conn".
>
> The fix is to move the "fdset_try_del" in front of free memory of conn,
> then avoid the race condition.

I'm a bit confused here.

First, are you talking about vhost as server or client? I guess it's server but
you'd better clarify it's a bug of both mode or one mode.

Based on we are talking about vhost as server:
we both know there are two types of socket fd. One for new connections coming,
let's call it listen fd. The other is one per connection for send/recv messages,
let's call it conn fd. And vhost_user_read_cb is called when conn fd gets some
messages. The original code first delete the conn fd from fd_set so that
vhost_user_read_cb will not be called, then free the 'conn'. This seems correct
to me.

You are moving the code of deleting the listen fd here. It's for trigger of callback
vhost_user_server_new_connection but not vhost_user_read_cb. So I don't understand
your point. Do I miss something here?

BTW, note two things:
1. please add correct prefix when 'git send-email'. E.g., for this patch, you should
add --subject-prefix="PATCH v3"
2. the version change description should be after signed-off tag, for example:

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---
V2:
XXX
V3:
XXX
---
lib/vhost/socket.c | 27 ++++++++++++++-------------
1 file changed, 14 insertions(+), 13 deletions(-)

Thanks,
Chenbo

>
> The core trace is:
> Program terminated with signal 11, Segmentation fault.
>
> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
>
> v2:
> fix coding style issues
>
> v3:
> add detailed log
>
> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
> ---
>  lib/vhost/socket.c | 27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)
>
> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
> index 5d0d728d5..2eb8fcadd 100644
> --- a/lib/vhost/socket.c
> +++ b/lib/vhost/socket.c
> @@ -1024,6 +1024,20 @@ rte_vhost_driver_unregister(const char *path)
>       for (i = 0; i < vhost_user.vsocket_cnt; i++) {
>            struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
>
> +          if (vsocket->is_server) {
> +               /*
> +                * If r/wcb is executing, release vhost_user's
> +                * mutex lock, and try again since the r/wcb
> +                * may use the mutex lock.
> +                */
> +               if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) ==
> -1) {
> +                    pthread_mutex_unlock(&vhost_user.mutex);
> +                    goto again;
> +               }
> +          } else if (vsocket->reconnect) {
> +               vhost_user_remove_reconnect(vsocket);
> +          }
> +
>            if (!strcmp(vsocket->path, path)) {
>                 pthread_mutex_lock(&vsocket->conn_mutex);
>                 for (conn = TAILQ_FIRST(&vsocket->conn_list);
> @@ -1056,21 +1070,8 @@ rte_vhost_driver_unregister(const char *path)
>                 pthread_mutex_unlock(&vsocket->conn_mutex);
>
>                 if (vsocket->is_server) {
> -                    /*
> -                     * If r/wcb is executing, release vhost_user's
> -                     * mutex lock, and try again since the r/wcb
> -                     * may use the mutex lock.
> -                     */
> -                    if (fdset_try_del(&vhost_user.fdset,
> -                              vsocket->socket_fd) == -1) {
> -                         pthread_mutex_unlock(&vhost_user.mutex);
> -                         goto again;
> -                    }
> -
>                      close(vsocket->socket_fd);
>                      unlink(path);
> -               } else if (vsocket->reconnect) {
> -                    vhost_user_remove_reconnect(vsocket);
>                 }
>
>                 pthread_mutex_destroy(&vsocket->conn_mutex);
> --
> 2.32.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v5] vhost: fix crash on port deletion
  2021-08-20 15:46       ` [dpdk-dev] [PATCH v5] " Gaoxiang Liu
@ 2021-08-26  8:37         ` Xia, Chenbo
  2021-08-27 14:19         ` [dpdk-dev] [PATCH v6] " Gaoxiang Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Xia, Chenbo @ 2021-08-26  8:37 UTC (permalink / raw)
  To: Gaoxiang Liu, maxime.coquelin; +Cc: dev, liugaoxiang

Hi Gaoxiang,

> -----Original Message-----
> From: Gaoxiang Liu <gaoxiangliu0@163.com>
> Sent: Friday, August 20, 2021 11:46 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; liugaoxiang@huawei.com; Gaoxiang Liu <gaoxiangliu0@163.com>
> Subject: [PATCH v5] vhost: fix crash on port deletion
> 
> The rte_vhost_driver_unregister() and vhost_user_read_cb()
> can be called at the same time by 2 threads.
> when memory of vsocket is freed in rte_vhost_driver_unregister(),
> the invalid memory of vsocket is accessd in vhost_user_read_cb().
> It's a bug of both mode for vhost as server or client.
> 
> Eg vhostuser port is created as server.
> Thread1 calls rte_vhost_driver_unregister().
> Before the listen fd is deleted from poll waiting fds,
> "vhost-events" thread then calls vhost_user_server_new_connection(),
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> access invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> Eg vhostuser port is created as client.
> Thread1 calls rte_vhost_driver_unregister().
> Before vsocket of reconn is deleted from reconn list,
> "vhost_reconn" thread then calls vhost_user_add_connection()
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> access invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> The fix is to move the "fdset_try_del" in front of free memory of conn,
> then avoid the race condition.
> 
> The core trace is:
> Program terminated with signal 11, Segmentation fault.

Thanks for the detailed commit log. It looks better now and easier to understand.

And please make the some words more formal. Like 'Eg' -> 'E.g.,'

> 
> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
> 
> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>

Please add a new line with " --- " (as I showed you in v4). It should look like:


Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---

v2:
* Fix coding style issues.

If you don’t do this, the description will be shown in the commit log, which is
not wanted.

> 
> v2:
> * Fix coding style issues.
> 
> v3:
> * Add detailed log.
> 
> v4:
> * Add the reason when vhostuser port is created as server.
> 
> v5:
> * Add detailed log when vhostuser port is created as client.
> ---
>  lib/vhost/socket.c | 27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
> index 5d0d728d5..2eb8fcadd 100644
> --- a/lib/vhost/socket.c
> +++ b/lib/vhost/socket.c
> @@ -1024,6 +1024,20 @@ rte_vhost_driver_unregister(const char *path)
>  	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
>  		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
> 
> +		if (vsocket->is_server) {
> +			/*
> +			 * If r/wcb is executing, release vhost_user's
> +			 * mutex lock, and try again since the r/wcb
> +			 * may use the mutex lock.
> +			 */
> +			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) ==
> -1) {
> +				pthread_mutex_unlock(&vhost_user.mutex);
> +				goto again;
> +			}
> +		} else if (vsocket->reconnect) {
> +			vhost_user_remove_reconnect(vsocket);
> +		}
> +
>  		if (!strcmp(vsocket->path, path)) {

You should first check the param 'path' before doing anything, so please move the
strcmp too.

>  			pthread_mutex_lock(&vsocket->conn_mutex);
>  			for (conn = TAILQ_FIRST(&vsocket->conn_list);
> @@ -1056,21 +1070,8 @@ rte_vhost_driver_unregister(const char *path)
>  			pthread_mutex_unlock(&vsocket->conn_mutex);
> 
>  			if (vsocket->is_server) {
> -				/*
> -				 * If r/wcb is executing, release vhost_user's
> -				 * mutex lock, and try again since the r/wcb
> -				 * may use the mutex lock.
> -				 */
> -				if (fdset_try_del(&vhost_user.fdset,
> -						vsocket->socket_fd) == -1) {
> -					pthread_mutex_unlock(&vhost_user.mutex);
> -					goto again;
> -				}
> -
>  				close(vsocket->socket_fd);
>  				unlink(path);

Any reason why we don't move this?

Thanks,
Chenbo

> -			} else if (vsocket->reconnect) {
> -				vhost_user_remove_reconnect(vsocket);
>  			}
> 
>  			pthread_mutex_destroy(&vsocket->conn_mutex);
> --
> 2.32.0
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v6] vhost: fix crash on port deletion
  2021-08-20 15:46       ` [dpdk-dev] [PATCH v5] " Gaoxiang Liu
  2021-08-26  8:37         ` Xia, Chenbo
@ 2021-08-27 14:19         ` Gaoxiang Liu
  2021-08-31  5:37           ` Xia, Chenbo
  2021-09-02 15:45           ` [dpdk-dev] [PATCH v7] " Gaoxiang Liu
  1 sibling, 2 replies; 19+ messages in thread
From: Gaoxiang Liu @ 2021-08-27 14:19 UTC (permalink / raw)
  To: maxime.coquelin, chenbo.xia; +Cc: dev, liugaoxiang, Gaoxiang Liu

The rte_vhost_driver_unregister() and vhost_user_read_cb()
can be called at the same time by 2 threads.
when memory of vsocket is freed in rte_vhost_driver_unregister(),
the invalid memory of vsocket is accessed in vhost_user_read_cb().
It's a bug of both mode for vhost as server or client.

E.g.,vhostuser port is created as server.
Thread1 calls rte_vhost_driver_unregister().
Before the listen fd is deleted from poll waiting fds,
"vhost-events" thread then calls vhost_user_server_new_connection(),
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
accesses invalid memory of socket while thread1 frees the memory of
vsocket.

E.g.,vhostuser port is created as client.
Thread1 calls rte_vhost_driver_unregister().
Before vsocket of reconn is deleted from reconn list,
"vhost_reconn" thread then calls vhost_user_add_connection()
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
accesses invalid memory of socket while thread1 frees the memory of
vsocket.

The fix is to move the "fdset_try_del" in front of free memory of conn,
then avoid the race condition.

The core trace is:
Program terminated with signal 11, Segmentation fault.

Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---

v2:
* Fix coding style issues.

v3:
* Add detailed log.

v4:
* Add the reason, when vhostuser port is created as server.

v5:
* Add detailed log when vhostuser port is created as client

v6:
* Add 'path' check before deleting listen fd
* Fix spelling issues
---
 lib/vhost/socket.c | 108 ++++++++++++++++++++++-----------------------
 1 file changed, 54 insertions(+), 54 deletions(-)

diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 5d0d728d5..27d5e8695 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -1023,66 +1023,66 @@ rte_vhost_driver_unregister(const char *path)
 
 	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
 		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
+		if (strcmp(vsocket->path, path)) {
+			continue;
+		}
 
-		if (!strcmp(vsocket->path, path)) {
-			pthread_mutex_lock(&vsocket->conn_mutex);
-			for (conn = TAILQ_FIRST(&vsocket->conn_list);
-			     conn != NULL;
-			     conn = next) {
-				next = TAILQ_NEXT(conn, next);
-
-				/*
-				 * If r/wcb is executing, release vsocket's
-				 * conn_mutex and vhost_user's mutex locks, and
-				 * try again since the r/wcb may use the
-				 * conn_mutex and mutex locks.
-				 */
-				if (fdset_try_del(&vhost_user.fdset,
-						  conn->connfd) == -1) {
-					pthread_mutex_unlock(
-							&vsocket->conn_mutex);
-					pthread_mutex_unlock(&vhost_user.mutex);
-					goto again;
-				}
-
-				VHOST_LOG_CONFIG(INFO,
-					"free connfd = %d for device '%s'\n",
-					conn->connfd, path);
-				close(conn->connfd);
-				vhost_destroy_device(conn->vid);
-				TAILQ_REMOVE(&vsocket->conn_list, conn, next);
-				free(conn);
-			}
-			pthread_mutex_unlock(&vsocket->conn_mutex);
-
-			if (vsocket->is_server) {
-				/*
-				 * If r/wcb is executing, release vhost_user's
-				 * mutex lock, and try again since the r/wcb
-				 * may use the mutex lock.
-				 */
-				if (fdset_try_del(&vhost_user.fdset,
-						vsocket->socket_fd) == -1) {
-					pthread_mutex_unlock(&vhost_user.mutex);
-					goto again;
-				}
-
-				close(vsocket->socket_fd);
-				unlink(path);
-			} else if (vsocket->reconnect) {
-				vhost_user_remove_reconnect(vsocket);
+		if (vsocket->is_server) {
+			/*
+			 * If r/wcb is executing, release vhost_user's
+			 * mutex lock, and try again since the r/wcb
+			 * may use the mutex lock.
+			 */
+			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
+				pthread_mutex_unlock(&vhost_user.mutex);
+				goto again;
 			}
+		} else if (vsocket->reconnect) {
+			vhost_user_remove_reconnect(vsocket);
+		}
 
-			pthread_mutex_destroy(&vsocket->conn_mutex);
-			vhost_user_socket_mem_free(vsocket);
+		pthread_mutex_lock(&vsocket->conn_mutex);
+		for (conn = TAILQ_FIRST(&vsocket->conn_list);
+			 conn != NULL;
+			 conn = next) {
+			next = TAILQ_NEXT(conn, next);
 
-			count = --vhost_user.vsocket_cnt;
-			vhost_user.vsockets[i] = vhost_user.vsockets[count];
-			vhost_user.vsockets[count] = NULL;
-			pthread_mutex_unlock(&vhost_user.mutex);
+			/*
+			 * If r/wcb is executing, release vsocket's
+			 * conn_mutex and vhost_user's mutex locks, and
+			 * try again since the r/wcb may use the
+			 * conn_mutex and mutex locks.
+			 */
+			if (fdset_try_del(&vhost_user.fdset,
+					  conn->connfd) == -1) {
+				pthread_mutex_unlock(&vsocket->conn_mutex);
+				pthread_mutex_unlock(&vhost_user.mutex);
+				goto again;
+			}
+
+			VHOST_LOG_CONFIG(INFO,
+				"free connfd = %d for device '%s'\n",
+				conn->connfd, path);
+			close(conn->connfd);
+			vhost_destroy_device(conn->vid);
+			TAILQ_REMOVE(&vsocket->conn_list, conn, next);
+			free(conn);
+		}
+		pthread_mutex_unlock(&vsocket->conn_mutex);
 
-			return 0;
+		if (vsocket->is_server) {
+			close(vsocket->socket_fd);
+			unlink(path);
 		}
+
+		pthread_mutex_destroy(&vsocket->conn_mutex);
+		vhost_user_socket_mem_free(vsocket);
+
+		count = --vhost_user.vsocket_cnt;
+		vhost_user.vsockets[i] = vhost_user.vsockets[count];
+		vhost_user.vsockets[count] = NULL;
+		pthread_mutex_unlock(&vhost_user.mutex);
+		return 0;
 	}
 	pthread_mutex_unlock(&vhost_user.mutex);
 
-- 
2.32.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v6] vhost: fix crash on port deletion
  2021-08-27 14:19         ` [dpdk-dev] [PATCH v6] " Gaoxiang Liu
@ 2021-08-31  5:37           ` Xia, Chenbo
  2021-09-02 15:38             ` Gaoxiang Liu
  2021-09-02 15:45           ` [dpdk-dev] [PATCH v7] " Gaoxiang Liu
  1 sibling, 1 reply; 19+ messages in thread
From: Xia, Chenbo @ 2021-08-31  5:37 UTC (permalink / raw)
  To: Gaoxiang Liu, maxime.coquelin; +Cc: dev, liugaoxiang

Hi Gaoxiang,

> -----Original Message-----
> From: Gaoxiang Liu <gaoxiangliu0@163.com>
> Sent: Friday, August 27, 2021 10:19 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; liugaoxiang@huawei.com; Gaoxiang Liu <gaoxiangliu0@163.com>
> Subject: [PATCH v6] vhost: fix crash on port deletion
> 
> The rte_vhost_driver_unregister() and vhost_user_read_cb()
> can be called at the same time by 2 threads.
> when memory of vsocket is freed in rte_vhost_driver_unregister(),
> the invalid memory of vsocket is accessed in vhost_user_read_cb().
> It's a bug of both mode for vhost as server or client.
> 
> E.g.,vhostuser port is created as server.

Put a space after ','

> Thread1 calls rte_vhost_driver_unregister().
> Before the listen fd is deleted from poll waiting fds,
> "vhost-events" thread then calls vhost_user_server_new_connection(),
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> accesses invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> E.g.,vhostuser port is created as client.

Same here.

> Thread1 calls rte_vhost_driver_unregister().
> Before vsocket of reconn is deleted from reconn list,
> "vhost_reconn" thread then calls vhost_user_add_connection()
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> accesses invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> The fix is to move the "fdset_try_del" in front of free memory of conn,
> then avoid the race condition.
> 
> The core trace is:
> Program terminated with signal 11, Segmentation fault.
> 
> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
> 
> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
> ---
> 
> v2:
> * Fix coding style issues.
> 
> v3:
> * Add detailed log.
> 
> v4:
> * Add the reason, when vhostuser port is created as server.
> 
> v5:
> * Add detailed log when vhostuser port is created as client
> 
> v6:
> * Add 'path' check before deleting listen fd
> * Fix spelling issues
> ---
>  lib/vhost/socket.c | 108 ++++++++++++++++++++++-----------------------
>  1 file changed, 54 insertions(+), 54 deletions(-)
> 
> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
> index 5d0d728d5..27d5e8695 100644
> --- a/lib/vhost/socket.c
> +++ b/lib/vhost/socket.c
> @@ -1023,66 +1023,66 @@ rte_vhost_driver_unregister(const char *path)
> 
>  	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
>  		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
> +		if (strcmp(vsocket->path, path)) {
> +			continue;
> +		}

braces {} are not necessary for single statement blocks

> 
> -		if (!strcmp(vsocket->path, path)) {
> -			pthread_mutex_lock(&vsocket->conn_mutex);
> -			for (conn = TAILQ_FIRST(&vsocket->conn_list);
> -			     conn != NULL;
> -			     conn = next) {
> -				next = TAILQ_NEXT(conn, next);
> -
> -				/*
> -				 * If r/wcb is executing, release vsocket's
> -				 * conn_mutex and vhost_user's mutex locks, and
> -				 * try again since the r/wcb may use the
> -				 * conn_mutex and mutex locks.
> -				 */
> -				if (fdset_try_del(&vhost_user.fdset,
> -						  conn->connfd) == -1) {
> -					pthread_mutex_unlock(
> -							&vsocket->conn_mutex);
> -					pthread_mutex_unlock(&vhost_user.mutex);
> -					goto again;
> -				}
> -
> -				VHOST_LOG_CONFIG(INFO,
> -					"free connfd = %d for device '%s'\n",
> -					conn->connfd, path);
> -				close(conn->connfd);
> -				vhost_destroy_device(conn->vid);
> -				TAILQ_REMOVE(&vsocket->conn_list, conn, next);
> -				free(conn);
> -			}
> -			pthread_mutex_unlock(&vsocket->conn_mutex);
> -
> -			if (vsocket->is_server) {
> -				/*
> -				 * If r/wcb is executing, release vhost_user's
> -				 * mutex lock, and try again since the r/wcb
> -				 * may use the mutex lock.
> -				 */
> -				if (fdset_try_del(&vhost_user.fdset,
> -						vsocket->socket_fd) == -1) {
> -					pthread_mutex_unlock(&vhost_user.mutex);
> -					goto again;
> -				}
> -
> -				close(vsocket->socket_fd);
> -				unlink(path);
> -			} else if (vsocket->reconnect) {
> -				vhost_user_remove_reconnect(vsocket);
> +		if (vsocket->is_server) {
> +			/*
> +			 * If r/wcb is executing, release vhost_user's
> +			 * mutex lock, and try again since the r/wcb
> +			 * may use the mutex lock.
> +			 */
> +			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) ==
> -1) {
> +				pthread_mutex_unlock(&vhost_user.mutex);
> +				goto again;
>  			}
> +		} else if (vsocket->reconnect) {
> +			vhost_user_remove_reconnect(vsocket);
> +		}
> 
> -			pthread_mutex_destroy(&vsocket->conn_mutex);
> -			vhost_user_socket_mem_free(vsocket);
> +		pthread_mutex_lock(&vsocket->conn_mutex);
> +		for (conn = TAILQ_FIRST(&vsocket->conn_list);
> +			 conn != NULL;
> +			 conn = next) {
> +			next = TAILQ_NEXT(conn, next);
> 
> -			count = --vhost_user.vsocket_cnt;
> -			vhost_user.vsockets[i] = vhost_user.vsockets[count];
> -			vhost_user.vsockets[count] = NULL;
> -			pthread_mutex_unlock(&vhost_user.mutex);
> +			/*
> +			 * If r/wcb is executing, release vsocket's
> +			 * conn_mutex and vhost_user's mutex locks, and
> +			 * try again since the r/wcb may use the
> +			 * conn_mutex and mutex locks.
> +			 */
> +			if (fdset_try_del(&vhost_user.fdset,
> +					  conn->connfd) == -1) {
> +				pthread_mutex_unlock(&vsocket->conn_mutex);
> +				pthread_mutex_unlock(&vhost_user.mutex);
> +				goto again;
> +			}
> +
> +			VHOST_LOG_CONFIG(INFO,
> +				"free connfd = %d for device '%s'\n",
> +				conn->connfd, path);
> +			close(conn->connfd);
> +			vhost_destroy_device(conn->vid);
> +			TAILQ_REMOVE(&vsocket->conn_list, conn, next);
> +			free(conn);
> +		}
> +		pthread_mutex_unlock(&vsocket->conn_mutex);
> 
> -			return 0;
> +		if (vsocket->is_server) {
> +			close(vsocket->socket_fd);
> +			unlink(path);
>  		}

I think you miss my comment in V5 of asking why this is not moved up after
fdset_try_del server socket fd.

Thanks,
Chenbo

> +
> +		pthread_mutex_destroy(&vsocket->conn_mutex);
> +		vhost_user_socket_mem_free(vsocket);
> +
> +		count = --vhost_user.vsocket_cnt;
> +		vhost_user.vsockets[i] = vhost_user.vsockets[count];
> +		vhost_user.vsockets[count] = NULL;
> +		pthread_mutex_unlock(&vhost_user.mutex);
> +		return 0;
>  	}
>  	pthread_mutex_unlock(&vhost_user.mutex);
> 
> --
> 2.32.0
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v6] vhost: fix crash on port deletion
  2021-08-31  5:37           ` Xia, Chenbo
@ 2021-09-02 15:38             ` Gaoxiang Liu
  2021-09-06  3:18               ` Xia, Chenbo
  0 siblings, 1 reply; 19+ messages in thread
From: Gaoxiang Liu @ 2021-09-02 15:38 UTC (permalink / raw)
  To: Xia, Chenbo; +Cc: maxime.coquelin, dev, liugaoxiang




Hi chenbo,
why this is not moved up?
>> +		if (vsocket->is_server) {
>> +			close(vsocket->socket_fd);
>> +			unlink(path);
>>  		}
==>Because if this is moved up, and if deleting conn fd from fdsets failed,
it will arrive the "again" label, then close vsocket->socket_fd and uplink "path" again.
it's not necessary.
And closing socket_fd does not  trigger vhost_user_server_new_connection.


Thanks.
Gaoxiang


At 2021-08-31 13:37:38, "Xia, Chenbo" <chenbo.xia@intel.com> wrote:
>Hi Gaoxiang,
>
>> -----Original Message-----
>> From: Gaoxiang Liu <gaoxiangliu0@163.com>
>> Sent: Friday, August 27, 2021 10:19 PM
>> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>> Cc: dev@dpdk.org; liugaoxiang@huawei.com; Gaoxiang Liu <gaoxiangliu0@163.com>
>> Subject: [PATCH v6] vhost: fix crash on port deletion
>> 
>> The rte_vhost_driver_unregister() and vhost_user_read_cb()
>> can be called at the same time by 2 threads.
>> when memory of vsocket is freed in rte_vhost_driver_unregister(),
>> the invalid memory of vsocket is accessed in vhost_user_read_cb().
>> It's a bug of both mode for vhost as server or client.
>> 
>> E.g.,vhostuser port is created as server.
>
>Put a space after ','
>
>> Thread1 calls rte_vhost_driver_unregister().
>> Before the listen fd is deleted from poll waiting fds,
>> "vhost-events" thread then calls vhost_user_server_new_connection(),
>> then a new conn fd is added in fdset when trying to reconnect.
>> "vhost-events" thread then calls vhost_user_read_cb() and
>> accesses invalid memory of socket while thread1 frees the memory of
>> vsocket.
>> 
>> E.g.,vhostuser port is created as client.
>
>Same here.
>
>> Thread1 calls rte_vhost_driver_unregister().
>> Before vsocket of reconn is deleted from reconn list,
>> "vhost_reconn" thread then calls vhost_user_add_connection()
>> then a new conn fd is added in fdset when trying to reconnect.
>> "vhost-events" thread then calls vhost_user_read_cb() and
>> accesses invalid memory of socket while thread1 frees the memory of
>> vsocket.
>> 
>> The fix is to move the "fdset_try_del" in front of free memory of conn,
>> then avoid the race condition.
>> 
>> The core trace is:
>> Program terminated with signal 11, Segmentation fault.
>> 
>> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
>> 
>> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
>> ---
>> 
>> v2:
>> * Fix coding style issues.
>> 
>> v3:
>> * Add detailed log.
>> 
>> v4:
>> * Add the reason, when vhostuser port is created as server.
>> 
>> v5:
>> * Add detailed log when vhostuser port is created as client
>> 
>> v6:
>> * Add 'path' check before deleting listen fd
>> * Fix spelling issues
>> ---
>>  lib/vhost/socket.c | 108 ++++++++++++++++++++++-----------------------
>>  1 file changed, 54 insertions(+), 54 deletions(-)
>> 
>> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
>> index 5d0d728d5..27d5e8695 100644
>> --- a/lib/vhost/socket.c
>> +++ b/lib/vhost/socket.c
>> @@ -1023,66 +1023,66 @@ rte_vhost_driver_unregister(const char *path)
>> 
>>  	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
>>  		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
>> +		if (strcmp(vsocket->path, path)) {
>> +			continue;
>> +		}
>
>braces {} are not necessary for single statement blocks
>
>> 
>> -		if (!strcmp(vsocket->path, path)) {
>> -			pthread_mutex_lock(&vsocket->conn_mutex);
>> -			for (conn = TAILQ_FIRST(&vsocket->conn_list);
>> -			     conn != NULL;
>> -			     conn = next) {
>> -				next = TAILQ_NEXT(conn, next);
>> -
>> -				/*
>> -				 * If r/wcb is executing, release vsocket's
>> -				 * conn_mutex and vhost_user's mutex locks, and
>> -				 * try again since the r/wcb may use the
>> -				 * conn_mutex and mutex locks.
>> -				 */
>> -				if (fdset_try_del(&vhost_user.fdset,
>> -						  conn->connfd) == -1) {
>> -					pthread_mutex_unlock(
>> -							&vsocket->conn_mutex);
>> -					pthread_mutex_unlock(&vhost_user.mutex);
>> -					goto again;
>> -				}
>> -
>> -				VHOST_LOG_CONFIG(INFO,
>> -					"free connfd = %d for device '%s'\n",
>> -					conn->connfd, path);
>> -				close(conn->connfd);
>> -				vhost_destroy_device(conn->vid);
>> -				TAILQ_REMOVE(&vsocket->conn_list, conn, next);
>> -				free(conn);
>> -			}
>> -			pthread_mutex_unlock(&vsocket->conn_mutex);
>> -
>> -			if (vsocket->is_server) {
>> -				/*
>> -				 * If r/wcb is executing, release vhost_user's
>> -				 * mutex lock, and try again since the r/wcb
>> -				 * may use the mutex lock.
>> -				 */
>> -				if (fdset_try_del(&vhost_user.fdset,
>> -						vsocket->socket_fd) == -1) {
>> -					pthread_mutex_unlock(&vhost_user.mutex);
>> -					goto again;
>> -				}
>> -
>> -				close(vsocket->socket_fd);
>> -				unlink(path);
>> -			} else if (vsocket->reconnect) {
>> -				vhost_user_remove_reconnect(vsocket);
>> +		if (vsocket->is_server) {
>> +			/*
>> +			 * If r/wcb is executing, release vhost_user's
>> +			 * mutex lock, and try again since the r/wcb
>> +			 * may use the mutex lock.
>> +			 */
>> +			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) ==
>> -1) {
>> +				pthread_mutex_unlock(&vhost_user.mutex);
>> +				goto again;
>>  			}
>> +		} else if (vsocket->reconnect) {
>> +			vhost_user_remove_reconnect(vsocket);
>> +		}
>> 
>> -			pthread_mutex_destroy(&vsocket->conn_mutex);
>> -			vhost_user_socket_mem_free(vsocket);
>> +		pthread_mutex_lock(&vsocket->conn_mutex);
>> +		for (conn = TAILQ_FIRST(&vsocket->conn_list);
>> +			 conn != NULL;
>> +			 conn = next) {
>> +			next = TAILQ_NEXT(conn, next);
>> 
>> -			count = --vhost_user.vsocket_cnt;
>> -			vhost_user.vsockets[i] = vhost_user.vsockets[count];
>> -			vhost_user.vsockets[count] = NULL;
>> -			pthread_mutex_unlock(&vhost_user.mutex);
>> +			/*
>> +			 * If r/wcb is executing, release vsocket's
>> +			 * conn_mutex and vhost_user's mutex locks, and
>> +			 * try again since the r/wcb may use the
>> +			 * conn_mutex and mutex locks.
>> +			 */
>> +			if (fdset_try_del(&vhost_user.fdset,
>> +					  conn->connfd) == -1) {
>> +				pthread_mutex_unlock(&vsocket->conn_mutex);
>> +				pthread_mutex_unlock(&vhost_user.mutex);
>> +				goto again;
>> +			}
>> +
>> +			VHOST_LOG_CONFIG(INFO,
>> +				"free connfd = %d for device '%s'\n",
>> +				conn->connfd, path);
>> +			close(conn->connfd);
>> +			vhost_destroy_device(conn->vid);
>> +			TAILQ_REMOVE(&vsocket->conn_list, conn, next);
>> +			free(conn);
>> +		}
>> +		pthread_mutex_unlock(&vsocket->conn_mutex);
>> 
>> -			return 0;
>> +		if (vsocket->is_server) {
>> +			close(vsocket->socket_fd);
>> +			unlink(path);
>>  		}
>
>I think you miss my comment in V5 of asking why this is not moved up after
>fdset_try_del server socket fd.
>
>Thanks,
>Chenbo
>
>> +
>> +		pthread_mutex_destroy(&vsocket->conn_mutex);
>> +		vhost_user_socket_mem_free(vsocket);
>> +
>> +		count = --vhost_user.vsocket_cnt;
>> +		vhost_user.vsockets[i] = vhost_user.vsockets[count];
>> +		vhost_user.vsockets[count] = NULL;
>> +		pthread_mutex_unlock(&vhost_user.mutex);
>> +		return 0;
>>  	}
>>  	pthread_mutex_unlock(&vhost_user.mutex);
>> 
>> --
>> 2.32.0
>> 
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v7] vhost: fix crash on port deletion
  2021-08-27 14:19         ` [dpdk-dev] [PATCH v6] " Gaoxiang Liu
  2021-08-31  5:37           ` Xia, Chenbo
@ 2021-09-02 15:45           ` Gaoxiang Liu
  2021-09-06  3:24             ` Xia, Chenbo
                               ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: Gaoxiang Liu @ 2021-09-02 15:45 UTC (permalink / raw)
  To: maxime.coquelin, chenbo.xia; +Cc: dev, liugaoxiang, Gaoxiang Liu

The rte_vhost_driver_unregister() and vhost_user_read_cb()
can be called at the same time by 2 threads.
when memory of vsocket is freed in rte_vhost_driver_unregister(),
the invalid memory of vsocket is accessed in vhost_user_read_cb().
It's a bug of both mode for vhost as server or client.

E.g., vhostuser port is created as server.
Thread1 calls rte_vhost_driver_unregister().
Before the listen fd is deleted from poll waiting fds,
"vhost-events" thread then calls vhost_user_server_new_connection(),
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
accesses invalid memory of socket while thread1 frees the memory of
vsocket.

E.g., vhostuser port is created as client.
Thread1 calls rte_vhost_driver_unregister().
Before vsocket of reconn is deleted from reconn list,
"vhost_reconn" thread then calls vhost_user_add_connection()
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
accesses invalid memory of socket while thread1 frees the memory of
vsocket.

The fix is to move the "fdset_try_del" in front of free memory of conn,
then avoid the race condition.

The core trace is:
Program terminated with signal 11, Segmentation fault.

Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
---

v2:
* Fix coding style issues.

v3:
* Add detailed log.

v4:
* Add the reason, when vhostuser port is created as server.

v5:
* Add detailed log when vhostuser port is created as client

v6:
* Add 'path' check before deleting listen fd
* Fix spelling issues

v7:
* Fix coding style issues.
---
 lib/vhost/socket.c | 107 ++++++++++++++++++++++-----------------------
 1 file changed, 53 insertions(+), 54 deletions(-)

diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 5d0d728d5..d6f9414c4 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -1023,66 +1023,65 @@ rte_vhost_driver_unregister(const char *path)
 
 	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
 		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
+		if (strcmp(vsocket->path, path))
+			continue;
 
-		if (!strcmp(vsocket->path, path)) {
-			pthread_mutex_lock(&vsocket->conn_mutex);
-			for (conn = TAILQ_FIRST(&vsocket->conn_list);
-			     conn != NULL;
-			     conn = next) {
-				next = TAILQ_NEXT(conn, next);
-
-				/*
-				 * If r/wcb is executing, release vsocket's
-				 * conn_mutex and vhost_user's mutex locks, and
-				 * try again since the r/wcb may use the
-				 * conn_mutex and mutex locks.
-				 */
-				if (fdset_try_del(&vhost_user.fdset,
-						  conn->connfd) == -1) {
-					pthread_mutex_unlock(
-							&vsocket->conn_mutex);
-					pthread_mutex_unlock(&vhost_user.mutex);
-					goto again;
-				}
-
-				VHOST_LOG_CONFIG(INFO,
-					"free connfd = %d for device '%s'\n",
-					conn->connfd, path);
-				close(conn->connfd);
-				vhost_destroy_device(conn->vid);
-				TAILQ_REMOVE(&vsocket->conn_list, conn, next);
-				free(conn);
-			}
-			pthread_mutex_unlock(&vsocket->conn_mutex);
-
-			if (vsocket->is_server) {
-				/*
-				 * If r/wcb is executing, release vhost_user's
-				 * mutex lock, and try again since the r/wcb
-				 * may use the mutex lock.
-				 */
-				if (fdset_try_del(&vhost_user.fdset,
-						vsocket->socket_fd) == -1) {
-					pthread_mutex_unlock(&vhost_user.mutex);
-					goto again;
-				}
-
-				close(vsocket->socket_fd);
-				unlink(path);
-			} else if (vsocket->reconnect) {
-				vhost_user_remove_reconnect(vsocket);
+		if (vsocket->is_server) {
+			/*
+			 * If r/wcb is executing, release vhost_user's
+			 * mutex lock, and try again since the r/wcb
+			 * may use the mutex lock.
+			 */
+			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
+				pthread_mutex_unlock(&vhost_user.mutex);
+				goto again;
 			}
+		} else if (vsocket->reconnect) {
+			vhost_user_remove_reconnect(vsocket);
+		}
 
-			pthread_mutex_destroy(&vsocket->conn_mutex);
-			vhost_user_socket_mem_free(vsocket);
+		pthread_mutex_lock(&vsocket->conn_mutex);
+		for (conn = TAILQ_FIRST(&vsocket->conn_list);
+			 conn != NULL;
+			 conn = next) {
+			next = TAILQ_NEXT(conn, next);
 
-			count = --vhost_user.vsocket_cnt;
-			vhost_user.vsockets[i] = vhost_user.vsockets[count];
-			vhost_user.vsockets[count] = NULL;
-			pthread_mutex_unlock(&vhost_user.mutex);
+			/*
+			 * If r/wcb is executing, release vsocket's
+			 * conn_mutex and vhost_user's mutex locks, and
+			 * try again since the r/wcb may use the
+			 * conn_mutex and mutex locks.
+			 */
+			if (fdset_try_del(&vhost_user.fdset,
+					  conn->connfd) == -1) {
+				pthread_mutex_unlock(&vsocket->conn_mutex);
+				pthread_mutex_unlock(&vhost_user.mutex);
+				goto again;
+			}
 
-			return 0;
+			VHOST_LOG_CONFIG(INFO,
+				"free connfd = %d for device '%s'\n",
+				conn->connfd, path);
+			close(conn->connfd);
+			vhost_destroy_device(conn->vid);
+			TAILQ_REMOVE(&vsocket->conn_list, conn, next);
+			free(conn);
+		}
+		pthread_mutex_unlock(&vsocket->conn_mutex);
+
+		if (vsocket->is_server) {
+			close(vsocket->socket_fd);
+			unlink(path);
 		}
+
+		pthread_mutex_destroy(&vsocket->conn_mutex);
+		vhost_user_socket_mem_free(vsocket);
+
+		count = --vhost_user.vsocket_cnt;
+		vhost_user.vsockets[i] = vhost_user.vsockets[count];
+		vhost_user.vsockets[count] = NULL;
+		pthread_mutex_unlock(&vhost_user.mutex);
+		return 0;
 	}
 	pthread_mutex_unlock(&vhost_user.mutex);
 
-- 
2.32.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v6] vhost: fix crash on port deletion
  2021-09-02 15:38             ` Gaoxiang Liu
@ 2021-09-06  3:18               ` Xia, Chenbo
  2021-09-06  3:32                 ` Xia, Chenbo
  2021-09-06  3:54                 ` Gaoxiang Liu
  0 siblings, 2 replies; 19+ messages in thread
From: Xia, Chenbo @ 2021-09-06  3:18 UTC (permalink / raw)
  To: Gaoxiang Liu; +Cc: maxime.coquelin, dev, liugaoxiang

Hi Gaoxiang,

>
>
>From: Gaoxiang Liu <gaoxiangliu0@163.com> 
>Sent: Thursday, September 2, 2021 11:38 PM
>To: Xia, Chenbo <chenbo.xia@intel.com>
>Cc: maxime.coquelin@redhat.com; dev@dpdk.org; liugaoxiang@huawei.com
>Subject: Re:RE: [PATCH v6] vhost: fix crash on port deletion
>
>
>Hi chenbo,
>why this is not moved up?
>>> +		if (vsocket->is_server) {
>>> +			close(vsocket->socket_fd);
>>> +			unlink(path);
>>>  		}
>==>Because if this is moved up, and if deleting conn fd from fdsets failed,
>it will arrive the "again" label, then close vsocket->socket_fd and uplink "path" again.
>it's not necessary.
>And closing socket_fd does not  trigger vhost_user_server_new_connection.

But same issue happens when you deleted 'vsocket->socket_fd' but failed to delete one
of the conn_fd: you will goto again and try to delete socket_fd again and then loop
forever. So anyway you need to prevent this from happening.

Thanks,
Chenbo

>
>Thanks.
>Gaoxiang

At 2021-08-31 13:37:38, "Xia, Chenbo" <mailto:chenbo.xia@intel.com> wrote:
>Hi Gaoxiang,
>
>> -----Original Message-----
>> From: Gaoxiang Liu <mailto:gaoxiangliu0@163.com>
>> Sent: Friday, August 27, 2021 10:19 PM
>> To: mailto:maxime.coquelin@redhat.com; Xia, Chenbo <mailto:chenbo.xia@intel.com>
>> Cc: mailto:dev@dpdk.org; mailto:liugaoxiang@huawei.com; Gaoxiang Liu <mailto:gaoxiangliu0@163.com>
>> Subject: [PATCH v6] vhost: fix crash on port deletion
>> 
>> The rte_vhost_driver_unregister() and vhost_user_read_cb()
>> can be called at the same time by 2 threads.
>> when memory of vsocket is freed in rte_vhost_driver_unregister(),
>> the invalid memory of vsocket is accessed in vhost_user_read_cb().
>> It's a bug of both mode for vhost as server or client.
>> 
>> E.g.,vhostuser port is created as server.
>
>Put a space after ','
>
>> Thread1 calls rte_vhost_driver_unregister().
>> Before the listen fd is deleted from poll waiting fds,
>> "vhost-events" thread then calls vhost_user_server_new_connection(),
>> then a new conn fd is added in fdset when trying to reconnect.
>> "vhost-events" thread then calls vhost_user_read_cb() and
>> accesses invalid memory of socket while thread1 frees the memory of
>> vsocket.
>> 
>> E.g.,vhostuser port is created as client.
>
>Same here.
>
>> Thread1 calls rte_vhost_driver_unregister().
>> Before vsocket of reconn is deleted from reconn list,
>> "vhost_reconn" thread then calls vhost_user_add_connection()
>> then a new conn fd is added in fdset when trying to reconnect.
>> "vhost-events" thread then calls vhost_user_read_cb() and
>> accesses invalid memory of socket while thread1 frees the memory of
>> vsocket.
>> 
>> The fix is to move the "fdset_try_del" in front of free memory of conn,
>> then avoid the race condition.
>> 
>> The core trace is:
>> Program terminated with signal 11, Segmentation fault.
>> 
>> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
>> 
>> Signed-off-by: Gaoxiang Liu <mailto:liugaoxiang@huawei.com>
>> ---
>> 
>> v2:
>> * Fix coding style issues.
>> 
>> v3:
>> * Add detailed log.
>> 
>> v4:
>> * Add the reason, when vhostuser port is created as server.
>> 
>> v5:
>> * Add detailed log when vhostuser port is created as client
>> 
>> v6:
>> * Add 'path' check before deleting listen fd
>> * Fix spelling issues
>> ---
>>  lib/vhost/socket.c | 108 ++++++++++++++++++++++-----------------------
>>  1 file changed, 54 insertions(+), 54 deletions(-)
>> 
>> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
>> index 5d0d728d5..27d5e8695 100644
>> --- a/lib/vhost/socket.c
>> +++ b/lib/vhost/socket.c
>> @@ -1023,66 +1023,66 @@ rte_vhost_driver_unregister(const char *path)
>> 
>>  	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
>>  		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
>> +		if (strcmp(vsocket->path, path)) {
>> +			continue;
>> +		}
>
>braces {} are not necessary for single statement blocks
>
>> 
>> -		if (!strcmp(vsocket->path, path)) {
>> -			pthread_mutex_lock(&vsocket->conn_mutex);
>> -			for (conn = TAILQ_FIRST(&vsocket->conn_list);
>> -			     conn != NULL;
>> -			     conn = next) {
>> -				next = TAILQ_NEXT(conn, next);
>> -
>> -				/*
>> -				 * If r/wcb is executing, release vsocket's
>> -				 * conn_mutex and vhost_user's mutex locks, and
>> -				 * try again since the r/wcb may use the
>> -				 * conn_mutex and mutex locks.
>> -				 */
>> -				if (fdset_try_del(&vhost_user.fdset,
>> -						  conn->connfd) == -1) {
>> -					pthread_mutex_unlock(
>> -							&vsocket->conn_mutex);
>> -					pthread_mutex_unlock(&vhost_user.mutex);
>> -					goto again;
>> -				}
>> -
>> -				VHOST_LOG_CONFIG(INFO,
>> -					"free connfd = %d for device '%s'\n",
>> -					conn->connfd, path);
>> -				close(conn->connfd);
>> -				vhost_destroy_device(conn->vid);
>> -				TAILQ_REMOVE(&vsocket->conn_list, conn, next);
>> -				free(conn);
>> -			}
>> -			pthread_mutex_unlock(&vsocket->conn_mutex);
>> -
>> -			if (vsocket->is_server) {
>> -				/*
>> -				 * If r/wcb is executing, release vhost_user's
>> -				 * mutex lock, and try again since the r/wcb
>> -				 * may use the mutex lock.
>> -				 */
>> -				if (fdset_try_del(&vhost_user.fdset,
>> -						vsocket->socket_fd) == -1) {
>> -					pthread_mutex_unlock(&vhost_user.mutex);
>> -					goto again;
>> -				}
>> -
>> -				close(vsocket->socket_fd);
>> -				unlink(path);
>> -			} else if (vsocket->reconnect) {
>> -				vhost_user_remove_reconnect(vsocket);
>> +		if (vsocket->is_server) {
>> +			/*
>> +			 * If r/wcb is executing, release vhost_user's
>> +			 * mutex lock, and try again since the r/wcb
>> +			 * may use the mutex lock.
>> +			 */
>> +			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) ==
>> -1) {
>> +				pthread_mutex_unlock(&vhost_user.mutex);
>> +				goto again;
>>  			}
>> +		} else if (vsocket->reconnect) {
>> +			vhost_user_remove_reconnect(vsocket);
>> +		}
>> 
>> -			pthread_mutex_destroy(&vsocket->conn_mutex);
>> -			vhost_user_socket_mem_free(vsocket);
>> +		pthread_mutex_lock(&vsocket->conn_mutex);
>> +		for (conn = TAILQ_FIRST(&vsocket->conn_list);
>> +			 conn != NULL;
>> +			 conn = next) {
>> +			next = TAILQ_NEXT(conn, next);
>> 
>> -			count = --vhost_user.vsocket_cnt;
>> -			vhost_user.vsockets[i] = vhost_user.vsockets[count];
>> -			vhost_user.vsockets[count] = NULL;
>> -			pthread_mutex_unlock(&vhost_user.mutex);
>> +			/*
>> +			 * If r/wcb is executing, release vsocket's
>> +			 * conn_mutex and vhost_user's mutex locks, and
>> +			 * try again since the r/wcb may use the
>> +			 * conn_mutex and mutex locks.
>> +			 */
>> +			if (fdset_try_del(&vhost_user.fdset,
>> +					  conn->connfd) == -1) {
>> +				pthread_mutex_unlock(&vsocket->conn_mutex);
>> +				pthread_mutex_unlock(&vhost_user.mutex);
>> +				goto again;
>> +			}
>> +
>> +			VHOST_LOG_CONFIG(INFO,
>> +				"free connfd = %d for device '%s'\n",
>> +				conn->connfd, path);
>> +			close(conn->connfd);
>> +			vhost_destroy_device(conn->vid);
>> +			TAILQ_REMOVE(&vsocket->conn_list, conn, next);
>> +			free(conn);
>> +		}
>> +		pthread_mutex_unlock(&vsocket->conn_mutex);
>> 
>> -			return 0;
>> +		if (vsocket->is_server) {
>> +			close(vsocket->socket_fd);
>> +			unlink(path);
>>  		}
>
>I think you miss my comment in V5 of asking why this is not moved up after
>fdset_try_del server socket fd.
>
>Thanks,
>Chenbo
>
>> +
>> +		pthread_mutex_destroy(&vsocket->conn_mutex);
>> +		vhost_user_socket_mem_free(vsocket);
>> +
>> +		count = --vhost_user.vsocket_cnt;
>> +		vhost_user.vsockets[i] = vhost_user.vsockets[count];
>> +		vhost_user.vsockets[count] = NULL;
>> +		pthread_mutex_unlock(&vhost_user.mutex);
>> +		return 0;
>>  	}
>>  	pthread_mutex_unlock(&vhost_user.mutex);
>> 
>> --
>> 2.32.0
>> 
>

 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v7] vhost: fix crash on port deletion
  2021-09-02 15:45           ` [dpdk-dev] [PATCH v7] " Gaoxiang Liu
@ 2021-09-06  3:24             ` Xia, Chenbo
  2021-09-06  5:19             ` Xia, Chenbo
  2021-09-14 11:29             ` Maxime Coquelin
  2 siblings, 0 replies; 19+ messages in thread
From: Xia, Chenbo @ 2021-09-06  3:24 UTC (permalink / raw)
  To: Gaoxiang Liu, maxime.coquelin; +Cc: dev, liugaoxiang

Hi,

> -----Original Message-----
> From: Gaoxiang Liu <gaoxiangliu0@163.com>
> Sent: Thursday, September 2, 2021 11:46 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; liugaoxiang@huawei.com; Gaoxiang Liu <gaoxiangliu0@163.com>
> Subject: [PATCH v7] vhost: fix crash on port deletion
> 
> The rte_vhost_driver_unregister() and vhost_user_read_cb()
> can be called at the same time by 2 threads.
> when memory of vsocket is freed in rte_vhost_driver_unregister(),
> the invalid memory of vsocket is accessed in vhost_user_read_cb().
> It's a bug of both mode for vhost as server or client.
> 
> E.g., vhostuser port is created as server.
> Thread1 calls rte_vhost_driver_unregister().
> Before the listen fd is deleted from poll waiting fds,
> "vhost-events" thread then calls vhost_user_server_new_connection(),
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> accesses invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> E.g., vhostuser port is created as client.
> Thread1 calls rte_vhost_driver_unregister().
> Before vsocket of reconn is deleted from reconn list,
> "vhost_reconn" thread then calls vhost_user_add_connection()
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> accesses invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> The fix is to move the "fdset_try_del" in front of free memory of conn,
> then avoid the race condition.
> 
> The core trace is:
> Program terminated with signal 11, Segmentation fault.
> 
> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")

Please check comment/reply in v6. And a suggestion:

Wait for comment/problem solved in old version before new version.
It can save everyone's effort of tracking all versions.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v6] vhost: fix crash on port deletion
  2021-09-06  3:18               ` Xia, Chenbo
@ 2021-09-06  3:32                 ` Xia, Chenbo
  2021-09-06  3:54                 ` Gaoxiang Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Xia, Chenbo @ 2021-09-06  3:32 UTC (permalink / raw)
  To: Xia, Chenbo, Gaoxiang Liu; +Cc: maxime.coquelin, dev, liugaoxiang

Hi Gaoxiang,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Xia, Chenbo
> Sent: Monday, September 6, 2021 11:18 AM
> To: Gaoxiang Liu <gaoxiangliu0@163.com>
> Cc: maxime.coquelin@redhat.com; dev@dpdk.org; liugaoxiang@huawei.com
> Subject: Re: [dpdk-dev] [PATCH v6] vhost: fix crash on port deletion
> 
> Hi Gaoxiang,
> 
> >
> >
> >From: Gaoxiang Liu <gaoxiangliu0@163.com>
> >Sent: Thursday, September 2, 2021 11:38 PM
> >To: Xia, Chenbo <chenbo.xia@intel.com>
> >Cc: maxime.coquelin@redhat.com; dev@dpdk.org; liugaoxiang@huawei.com
> >Subject: Re:RE: [PATCH v6] vhost: fix crash on port deletion
> >
> >
> >Hi chenbo,
> >why this is not moved up?
> >>> +		if (vsocket->is_server) {
> >>> +			close(vsocket->socket_fd);
> >>> +			unlink(path);
> >>>  		}
> >==>Because if this is moved up, and if deleting conn fd from fdsets failed,
> >it will arrive the "again" label, then close vsocket->socket_fd and uplink
> "path" again.
> >it's not necessary.
> >And closing socket_fd does not  trigger vhost_user_server_new_connection.
> 
> But same issue happens when you deleted 'vsocket->socket_fd' but failed to
> delete one
> of the conn_fd: you will goto again and try to delete socket_fd again and then
> loop
> forever. So anyway you need to prevent this from happening.

Please ignore this. I thought delete socket_fd again will return -1.
So I am fine with putting it later otherwise we have to introduce a flag to know if
Socket_fd deletion happened once.

Thanks,
Chenbo

> 
> Thanks,
> Chenbo
> 
> >
> >Thanks.
> >Gaoxiang
> 
> At 2021-08-31 13:37:38, "Xia, Chenbo" <mailto:chenbo.xia@intel.com> wrote:
> >Hi Gaoxiang,
> >
> >> -----Original Message-----
> >> From: Gaoxiang Liu <mailto:gaoxiangliu0@163.com>
> >> Sent: Friday, August 27, 2021 10:19 PM
> >> To: mailto:maxime.coquelin@redhat.com; Xia, Chenbo
> <mailto:chenbo.xia@intel.com>
> >> Cc: mailto:dev@dpdk.org; mailto:liugaoxiang@huawei.com; Gaoxiang Liu
> <mailto:gaoxiangliu0@163.com>
> >> Subject: [PATCH v6] vhost: fix crash on port deletion
> >>
> >> The rte_vhost_driver_unregister() and vhost_user_read_cb()
> >> can be called at the same time by 2 threads.
> >> when memory of vsocket is freed in rte_vhost_driver_unregister(),
> >> the invalid memory of vsocket is accessed in vhost_user_read_cb().
> >> It's a bug of both mode for vhost as server or client.
> >>
> >> E.g.,vhostuser port is created as server.
> >
> >Put a space after ','
> >
> >> Thread1 calls rte_vhost_driver_unregister().
> >> Before the listen fd is deleted from poll waiting fds,
> >> "vhost-events" thread then calls vhost_user_server_new_connection(),
> >> then a new conn fd is added in fdset when trying to reconnect.
> >> "vhost-events" thread then calls vhost_user_read_cb() and
> >> accesses invalid memory of socket while thread1 frees the memory of
> >> vsocket.
> >>
> >> E.g.,vhostuser port is created as client.
> >
> >Same here.
> >
> >> Thread1 calls rte_vhost_driver_unregister().
> >> Before vsocket of reconn is deleted from reconn list,
> >> "vhost_reconn" thread then calls vhost_user_add_connection()
> >> then a new conn fd is added in fdset when trying to reconnect.
> >> "vhost-events" thread then calls vhost_user_read_cb() and
> >> accesses invalid memory of socket while thread1 frees the memory of
> >> vsocket.
> >>
> >> The fix is to move the "fdset_try_del" in front of free memory of conn,
> >> then avoid the race condition.
> >>
> >> The core trace is:
> >> Program terminated with signal 11, Segmentation fault.
> >>
> >> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
> >>
> >> Signed-off-by: Gaoxiang Liu <mailto:liugaoxiang@huawei.com>
> >> ---
> >>
> >> v2:
> >> * Fix coding style issues.
> >>
> >> v3:
> >> * Add detailed log.
> >>
> >> v4:
> >> * Add the reason, when vhostuser port is created as server.
> >>
> >> v5:
> >> * Add detailed log when vhostuser port is created as client
> >>
> >> v6:
> >> * Add 'path' check before deleting listen fd
> >> * Fix spelling issues
> >> ---
> >>  lib/vhost/socket.c | 108 ++++++++++++++++++++++-----------------------
> >>  1 file changed, 54 insertions(+), 54 deletions(-)
> >>
> >> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
> >> index 5d0d728d5..27d5e8695 100644
> >> --- a/lib/vhost/socket.c
> >> +++ b/lib/vhost/socket.c
> >> @@ -1023,66 +1023,66 @@ rte_vhost_driver_unregister(const char *path)
> >>
> >>  	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
> >>  		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
> >> +		if (strcmp(vsocket->path, path)) {
> >> +			continue;
> >> +		}
> >
> >braces {} are not necessary for single statement blocks
> >
> >>
> >> -		if (!strcmp(vsocket->path, path)) {
> >> -			pthread_mutex_lock(&vsocket->conn_mutex);
> >> -			for (conn = TAILQ_FIRST(&vsocket->conn_list);
> >> -			     conn != NULL;
> >> -			     conn = next) {
> >> -				next = TAILQ_NEXT(conn, next);
> >> -
> >> -				/*
> >> -				 * If r/wcb is executing, release vsocket's
> >> -				 * conn_mutex and vhost_user's mutex locks, and
> >> -				 * try again since the r/wcb may use the
> >> -				 * conn_mutex and mutex locks.
> >> -				 */
> >> -				if (fdset_try_del(&vhost_user.fdset,
> >> -						  conn->connfd) == -1) {
> >> -					pthread_mutex_unlock(
> >> -							&vsocket->conn_mutex);
> >> -					pthread_mutex_unlock(&vhost_user.mutex);
> >> -					goto again;
> >> -				}
> >> -
> >> -				VHOST_LOG_CONFIG(INFO,
> >> -					"free connfd = %d for device '%s'\n",
> >> -					conn->connfd, path);
> >> -				close(conn->connfd);
> >> -				vhost_destroy_device(conn->vid);
> >> -				TAILQ_REMOVE(&vsocket->conn_list, conn, next);
> >> -				free(conn);
> >> -			}
> >> -			pthread_mutex_unlock(&vsocket->conn_mutex);
> >> -
> >> -			if (vsocket->is_server) {
> >> -				/*
> >> -				 * If r/wcb is executing, release vhost_user's
> >> -				 * mutex lock, and try again since the r/wcb
> >> -				 * may use the mutex lock.
> >> -				 */
> >> -				if (fdset_try_del(&vhost_user.fdset,
> >> -						vsocket->socket_fd) == -1) {
> >> -					pthread_mutex_unlock(&vhost_user.mutex);
> >> -					goto again;
> >> -				}
> >> -
> >> -				close(vsocket->socket_fd);
> >> -				unlink(path);
> >> -			} else if (vsocket->reconnect) {
> >> -				vhost_user_remove_reconnect(vsocket);
> >> +		if (vsocket->is_server) {
> >> +			/*
> >> +			 * If r/wcb is executing, release vhost_user's
> >> +			 * mutex lock, and try again since the r/wcb
> >> +			 * may use the mutex lock.
> >> +			 */
> >> +			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) ==
> >> -1) {
> >> +				pthread_mutex_unlock(&vhost_user.mutex);
> >> +				goto again;
> >>  			}
> >> +		} else if (vsocket->reconnect) {
> >> +			vhost_user_remove_reconnect(vsocket);
> >> +		}
> >>
> >> -			pthread_mutex_destroy(&vsocket->conn_mutex);
> >> -			vhost_user_socket_mem_free(vsocket);
> >> +		pthread_mutex_lock(&vsocket->conn_mutex);
> >> +		for (conn = TAILQ_FIRST(&vsocket->conn_list);
> >> +			 conn != NULL;
> >> +			 conn = next) {
> >> +			next = TAILQ_NEXT(conn, next);
> >>
> >> -			count = --vhost_user.vsocket_cnt;
> >> -			vhost_user.vsockets[i] = vhost_user.vsockets[count];
> >> -			vhost_user.vsockets[count] = NULL;
> >> -			pthread_mutex_unlock(&vhost_user.mutex);
> >> +			/*
> >> +			 * If r/wcb is executing, release vsocket's
> >> +			 * conn_mutex and vhost_user's mutex locks, and
> >> +			 * try again since the r/wcb may use the
> >> +			 * conn_mutex and mutex locks.
> >> +			 */
> >> +			if (fdset_try_del(&vhost_user.fdset,
> >> +					  conn->connfd) == -1) {
> >> +				pthread_mutex_unlock(&vsocket->conn_mutex);
> >> +				pthread_mutex_unlock(&vhost_user.mutex);
> >> +				goto again;
> >> +			}
> >> +
> >> +			VHOST_LOG_CONFIG(INFO,
> >> +				"free connfd = %d for device '%s'\n",
> >> +				conn->connfd, path);
> >> +			close(conn->connfd);
> >> +			vhost_destroy_device(conn->vid);
> >> +			TAILQ_REMOVE(&vsocket->conn_list, conn, next);
> >> +			free(conn);
> >> +		}
> >> +		pthread_mutex_unlock(&vsocket->conn_mutex);
> >>
> >> -			return 0;
> >> +		if (vsocket->is_server) {
> >> +			close(vsocket->socket_fd);
> >> +			unlink(path);
> >>  		}
> >
> >I think you miss my comment in V5 of asking why this is not moved up after
> >fdset_try_del server socket fd.
> >
> >Thanks,
> >Chenbo
> >
> >> +
> >> +		pthread_mutex_destroy(&vsocket->conn_mutex);
> >> +		vhost_user_socket_mem_free(vsocket);
> >> +
> >> +		count = --vhost_user.vsocket_cnt;
> >> +		vhost_user.vsockets[i] = vhost_user.vsockets[count];
> >> +		vhost_user.vsockets[count] = NULL;
> >> +		pthread_mutex_unlock(&vhost_user.mutex);
> >> +		return 0;
> >>  	}
> >>  	pthread_mutex_unlock(&vhost_user.mutex);
> >>
> >> --
> >> 2.32.0
> >>
> >
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v6] vhost: fix crash on port deletion
  2021-09-06  3:18               ` Xia, Chenbo
  2021-09-06  3:32                 ` Xia, Chenbo
@ 2021-09-06  3:54                 ` Gaoxiang Liu
  1 sibling, 0 replies; 19+ messages in thread
From: Gaoxiang Liu @ 2021-09-06  3:54 UTC (permalink / raw)
  To: Xia, Chenbo; +Cc: maxime.coquelin, dev, liugaoxiang

Hi Chenbo,

But same issue happens when you deleted 'vsocket->socket_fd' but failed to delete one
of the conn_fd: you will goto again and try to delete socket_fd again and then loop
forever. So anyway you need to prevent this from happening.

==>
It will not happen, because fdset_try_del only returns -1 when the fd exists and is busy.
E.g., when thread1 deleted 'vsocket->socket_fd' but failed to delete one of the conn_fd,
it will go to again and try to delete socket again, and it will not fail, because 'vsocket->socket_fd' has been deleted and fdset_try_del returns 0.
Thread1 will continue to delete the left conn_fd.

Thanks.
Gaoxiang








On 09/06/2021 11:18, Xia, Chenbo wrote:
Hi Gaoxiang,

>
>
>From: Gaoxiang Liu <gaoxiangliu0@163.com>
>Sent: Thursday, September 2, 2021 11:38 PM
>To: Xia, Chenbo <chenbo.xia@intel.com>
>Cc: maxime.coquelin@redhat.com; dev@dpdk.org; liugaoxiang@huawei.com
>Subject: Re:RE: [PATCH v6] vhost: fix crash on port deletion
>
>
>Hi chenbo,
>why this is not moved up?
>>> +          if (vsocket->is_server) {
>>> +               close(vsocket->socket_fd);
>>> +               unlink(path);
>>>            }
>==>Because if this is moved up, and if deleting conn fd from fdsets failed,
>it will arrive the "again" label, then close vsocket->socket_fd and uplink "path" again.
>it's not necessary.
>And closing socket_fd does not  trigger vhost_user_server_new_connection.

But same issue happens when you deleted 'vsocket->socket_fd' but failed to delete one
of the conn_fd: you will goto again and try to delete socket_fd again and then loop
forever. So anyway you need to prevent this from happening.

Thanks,
Chenbo

>
>Thanks.
>Gaoxiang

At 2021-08-31 13:37:38, "Xia, Chenbo" <mailto:chenbo.xia@intel.com> wrote:
>Hi Gaoxiang,
>
>> -----Original Message-----
>> From: Gaoxiang Liu <mailto:gaoxiangliu0@163.com>
>> Sent: Friday, August 27, 2021 10:19 PM
>> To: mailto:maxime.coquelin@redhat.com; Xia, Chenbo <mailto:chenbo.xia@intel.com>
>> Cc: mailto:dev@dpdk.org; mailto:liugaoxiang@huawei.com; Gaoxiang Liu <mailto:gaoxiangliu0@163.com>
>> Subject: [PATCH v6] vhost: fix crash on port deletion
>>
>> The rte_vhost_driver_unregister() and vhost_user_read_cb()
>> can be called at the same time by 2 threads.
>> when memory of vsocket is freed in rte_vhost_driver_unregister(),
>> the invalid memory of vsocket is accessed in vhost_user_read_cb().
>> It's a bug of both mode for vhost as server or client.
>>
>> E.g.,vhostuser port is created as server.
>
>Put a space after ','
>
>> Thread1 calls rte_vhost_driver_unregister().
>> Before the listen fd is deleted from poll waiting fds,
>> "vhost-events" thread then calls vhost_user_server_new_connection(),
>> then a new conn fd is added in fdset when trying to reconnect.
>> "vhost-events" thread then calls vhost_user_read_cb() and
>> accesses invalid memory of socket while thread1 frees the memory of
>> vsocket.
>>
>> E.g.,vhostuser port is created as client.
>
>Same here.
>
>> Thread1 calls rte_vhost_driver_unregister().
>> Before vsocket of reconn is deleted from reconn list,
>> "vhost_reconn" thread then calls vhost_user_add_connection()
>> then a new conn fd is added in fdset when trying to reconnect.
>> "vhost-events" thread then calls vhost_user_read_cb() and
>> accesses invalid memory of socket while thread1 frees the memory of
>> vsocket.
>>
>> The fix is to move the "fdset_try_del" in front of free memory of conn,
>> then avoid the race condition.
>>
>> The core trace is:
>> Program terminated with signal 11, Segmentation fault.
>>
>> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
>>
>> Signed-off-by: Gaoxiang Liu <mailto:liugaoxiang@huawei.com>
>> ---
>>
>> v2:
>> * Fix coding style issues.
>>
>> v3:
>> * Add detailed log.
>>
>> v4:
>> * Add the reason, when vhostuser port is created as server.
>>
>> v5:
>> * Add detailed log when vhostuser port is created as client
>>
>> v6:
>> * Add 'path' check before deleting listen fd
>> * Fix spelling issues
>> ---
>>  lib/vhost/socket.c | 108 ++++++++++++++++++++++-----------------------
>>  1 file changed, 54 insertions(+), 54 deletions(-)
>>
>> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
>> index 5d0d728d5..27d5e8695 100644
>> --- a/lib/vhost/socket.c
>> +++ b/lib/vhost/socket.c
>> @@ -1023,66 +1023,66 @@ rte_vhost_driver_unregister(const char *path)
>>
>>       for (i = 0; i < vhost_user.vsocket_cnt; i++) {
>>            struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
>> +          if (strcmp(vsocket->path, path)) {
>> +               continue;
>> +          }
>
>braces {} are not necessary for single statement blocks
>
>>
>> -          if (!strcmp(vsocket->path, path)) {
>> -               pthread_mutex_lock(&vsocket->conn_mutex);
>> -               for (conn = TAILQ_FIRST(&vsocket->conn_list);
>> -                    conn != NULL;
>> -                    conn = next) {
>> -                    next = TAILQ_NEXT(conn, next);
>> -
>> -                    /*
>> -                     * If r/wcb is executing, release vsocket's
>> -                     * conn_mutex and vhost_user's mutex locks, and
>> -                     * try again since the r/wcb may use the
>> -                     * conn_mutex and mutex locks.
>> -                     */
>> -                    if (fdset_try_del(&vhost_user.fdset,
>> -                                conn->connfd) == -1) {
>> -                         pthread_mutex_unlock(
>> -                                   &vsocket->conn_mutex);
>> -                         pthread_mutex_unlock(&vhost_user.mutex);
>> -                         goto again;
>> -                    }
>> -
>> -                    VHOST_LOG_CONFIG(INFO,
>> -                         "free connfd = %d for device '%s'\n",
>> -                         conn->connfd, path);
>> -                    close(conn->connfd);
>> -                    vhost_destroy_device(conn->vid);
>> -                    TAILQ_REMOVE(&vsocket->conn_list, conn, next);
>> -                    free(conn);
>> -               }
>> -               pthread_mutex_unlock(&vsocket->conn_mutex);
>> -
>> -               if (vsocket->is_server) {
>> -                    /*
>> -                     * If r/wcb is executing, release vhost_user's
>> -                     * mutex lock, and try again since the r/wcb
>> -                     * may use the mutex lock.
>> -                     */
>> -                    if (fdset_try_del(&vhost_user.fdset,
>> -                              vsocket->socket_fd) == -1) {
>> -                         pthread_mutex_unlock(&vhost_user.mutex);
>> -                         goto again;
>> -                    }
>> -
>> -                    close(vsocket->socket_fd);
>> -                    unlink(path);
>> -               } else if (vsocket->reconnect) {
>> -                    vhost_user_remove_reconnect(vsocket);
>> +          if (vsocket->is_server) {
>> +               /*
>> +                * If r/wcb is executing, release vhost_user's
>> +                * mutex lock, and try again since the r/wcb
>> +                * may use the mutex lock.
>> +                */
>> +               if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) ==
>> -1) {
>> +                    pthread_mutex_unlock(&vhost_user.mutex);
>> +                    goto again;
>>                 }
>> +          } else if (vsocket->reconnect) {
>> +               vhost_user_remove_reconnect(vsocket);
>> +          }
>>
>> -               pthread_mutex_destroy(&vsocket->conn_mutex);
>> -               vhost_user_socket_mem_free(vsocket);
>> +          pthread_mutex_lock(&vsocket->conn_mutex);
>> +          for (conn = TAILQ_FIRST(&vsocket->conn_list);
>> +                conn != NULL;
>> +                conn = next) {
>> +               next = TAILQ_NEXT(conn, next);
>>
>> -               count = --vhost_user.vsocket_cnt;
>> -               vhost_user.vsockets[i] = vhost_user.vsockets[count];
>> -               vhost_user.vsockets[count] = NULL;
>> -               pthread_mutex_unlock(&vhost_user.mutex);
>> +               /*
>> +                * If r/wcb is executing, release vsocket's
>> +                * conn_mutex and vhost_user's mutex locks, and
>> +                * try again since the r/wcb may use the
>> +                * conn_mutex and mutex locks.
>> +                */
>> +               if (fdset_try_del(&vhost_user.fdset,
>> +                           conn->connfd) == -1) {
>> +                    pthread_mutex_unlock(&vsocket->conn_mutex);
>> +                    pthread_mutex_unlock(&vhost_user.mutex);
>> +                    goto again;
>> +               }
>> +
>> +               VHOST_LOG_CONFIG(INFO,
>> +                    "free connfd = %d for device '%s'\n",
>> +                    conn->connfd, path);
>> +               close(conn->connfd);
>> +               vhost_destroy_device(conn->vid);
>> +               TAILQ_REMOVE(&vsocket->conn_list, conn, next);
>> +               free(conn);
>> +          }
>> +          pthread_mutex_unlock(&vsocket->conn_mutex);
>>
>> -               return 0;
>> +          if (vsocket->is_server) {
>> +               close(vsocket->socket_fd);
>> +               unlink(path);
>>            }
>
>I think you miss my comment in V5 of asking why this is not moved up after
>fdset_try_del server socket fd.
>
>Thanks,
>Chenbo
>
>> +
>> +          pthread_mutex_destroy(&vsocket->conn_mutex);
>> +          vhost_user_socket_mem_free(vsocket);
>> +
>> +          count = --vhost_user.vsocket_cnt;
>> +          vhost_user.vsockets[i] = vhost_user.vsockets[count];
>> +          vhost_user.vsockets[count] = NULL;
>> +          pthread_mutex_unlock(&vhost_user.mutex);
>> +          return 0;
>>       }
>>       pthread_mutex_unlock(&vhost_user.mutex);
>>
>> --
>> 2.32.0
>>
>

 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v7] vhost: fix crash on port deletion
  2021-09-02 15:45           ` [dpdk-dev] [PATCH v7] " Gaoxiang Liu
  2021-09-06  3:24             ` Xia, Chenbo
@ 2021-09-06  5:19             ` Xia, Chenbo
  2021-09-14 11:29             ` Maxime Coquelin
  2 siblings, 0 replies; 19+ messages in thread
From: Xia, Chenbo @ 2021-09-06  5:19 UTC (permalink / raw)
  To: Gaoxiang Liu, maxime.coquelin; +Cc: dev, liugaoxiang

> -----Original Message-----
> From: Gaoxiang Liu <gaoxiangliu0@163.com>
> Sent: Thursday, September 2, 2021 11:46 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; liugaoxiang@huawei.com; Gaoxiang Liu <gaoxiangliu0@163.com>
> Subject: [PATCH v7] vhost: fix crash on port deletion
> 
> The rte_vhost_driver_unregister() and vhost_user_read_cb()
> can be called at the same time by 2 threads.
> when memory of vsocket is freed in rte_vhost_driver_unregister(),
> the invalid memory of vsocket is accessed in vhost_user_read_cb().
> It's a bug of both mode for vhost as server or client.
> 
> E.g., vhostuser port is created as server.
> Thread1 calls rte_vhost_driver_unregister().
> Before the listen fd is deleted from poll waiting fds,
> "vhost-events" thread then calls vhost_user_server_new_connection(),
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> accesses invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> E.g., vhostuser port is created as client.
> Thread1 calls rte_vhost_driver_unregister().
> Before vsocket of reconn is deleted from reconn list,
> "vhost_reconn" thread then calls vhost_user_add_connection()
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> accesses invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> The fix is to move the "fdset_try_del" in front of free memory of conn,
> then avoid the race condition.
> 
> The core trace is:
> Program terminated with signal 11, Segmentation fault.
> 
> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
> 
> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>

@Maxime, I noticed the author and sob tag are using different emails. You
may need to change the author email when applying.

For this patch:

Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v7] vhost: fix crash on port deletion
  2021-09-02 15:45           ` [dpdk-dev] [PATCH v7] " Gaoxiang Liu
  2021-09-06  3:24             ` Xia, Chenbo
  2021-09-06  5:19             ` Xia, Chenbo
@ 2021-09-14 11:29             ` Maxime Coquelin
  2 siblings, 0 replies; 19+ messages in thread
From: Maxime Coquelin @ 2021-09-14 11:29 UTC (permalink / raw)
  To: Gaoxiang Liu, chenbo.xia; +Cc: dev, liugaoxiang



On 9/2/21 5:45 PM, Gaoxiang Liu wrote:
> The rte_vhost_driver_unregister() and vhost_user_read_cb()
> can be called at the same time by 2 threads.
> when memory of vsocket is freed in rte_vhost_driver_unregister(),
> the invalid memory of vsocket is accessed in vhost_user_read_cb().
> It's a bug of both mode for vhost as server or client.
> 
> E.g., vhostuser port is created as server.
> Thread1 calls rte_vhost_driver_unregister().
> Before the listen fd is deleted from poll waiting fds,
> "vhost-events" thread then calls vhost_user_server_new_connection(),
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> accesses invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> E.g., vhostuser port is created as client.
> Thread1 calls rte_vhost_driver_unregister().
> Before vsocket of reconn is deleted from reconn list,
> "vhost_reconn" thread then calls vhost_user_add_connection()
> then a new conn fd is added in fdset when trying to reconnect.
> "vhost-events" thread then calls vhost_user_read_cb() and
> accesses invalid memory of socket while thread1 frees the memory of
> vsocket.
> 
> The fix is to move the "fdset_try_del" in front of free memory of conn,
> then avoid the race condition.
> 
> The core trace is:
> Program terminated with signal 11, Segmentation fault.
> 
> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
> 
> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
> ---
> 
> v2:
> * Fix coding style issues.
> 
> v3:
> * Add detailed log.
> 
> v4:
> * Add the reason, when vhostuser port is created as server.
> 
> v5:
> * Add detailed log when vhostuser port is created as client
> 
> v6:
> * Add 'path' check before deleting listen fd
> * Fix spelling issues
> 
> v7:
> * Fix coding style issues.
> ---
>  lib/vhost/socket.c | 107 ++++++++++++++++++++++-----------------------
>  1 file changed, 53 insertions(+), 54 deletions(-)
> 


Applied to dpdk-next-virtio/main.

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-09-14 11:29 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-07  8:25 [dpdk-dev] [PATCH] vhost: fix coredump on port deletion Gaoxiang Liu
2021-08-07 23:12 ` [dpdk-dev] [PATCH v2] " Gaoxiang Liu
2021-08-13 14:02   ` [dpdk-dev] [PATCH] vhost: fix crash on port deletion The rte_vhost_driver_unregister() and vhost_user_read_cb() can be called at the same time by 2 threads. Eg thread1 calls rte_vhost_driver_unregister() and frees memory of "conn". Because socket fd has not been deleted from poll waiting fds, "vhost-events" thread calls fdset_event_dispatch, then calls vhost_user_read_cb(), and accesses invalid memory of "conn" Gaoxiang Liu
2021-08-13 14:22   ` [dpdk-dev] [PATCH] vhost: fix crash on port deletion Gaoxiang Liu
2021-08-16  6:44     ` Xia, Chenbo
2021-08-20 15:53       ` Gaoxiang Liu
2021-08-18 16:08     ` [dpdk-dev] [PATCH v4] " Gaoxiang Liu
2021-08-20 15:46       ` [dpdk-dev] [PATCH v5] " Gaoxiang Liu
2021-08-26  8:37         ` Xia, Chenbo
2021-08-27 14:19         ` [dpdk-dev] [PATCH v6] " Gaoxiang Liu
2021-08-31  5:37           ` Xia, Chenbo
2021-09-02 15:38             ` Gaoxiang Liu
2021-09-06  3:18               ` Xia, Chenbo
2021-09-06  3:32                 ` Xia, Chenbo
2021-09-06  3:54                 ` Gaoxiang Liu
2021-09-02 15:45           ` [dpdk-dev] [PATCH v7] " Gaoxiang Liu
2021-09-06  3:24             ` Xia, Chenbo
2021-09-06  5:19             ` Xia, Chenbo
2021-09-14 11:29             ` Maxime Coquelin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).