DPDK patches and discussions
From: "Gaoxiang Liu" <gaoxiangliu0@163.com>
To: "Xia, Chenbo" <chenbo.xia@intel.com>
Cc: "maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
	 "dev@dpdk.org" <dev@dpdk.org>,
	 "liugaoxiang@huawei.com" <liugaoxiang@huawei.com>
Subject: Re: [dpdk-dev] [PATCH v6] vhost: fix crash on port deletion
Date: Mon, 6 Sep 2021 11:54:40 +0800 (GMT+08:00)	[thread overview]
Message-ID: <65ad2c8b.7034.17bb93e6a16.Coremail.gaoxiangliu0@163.com> (raw)
In-Reply-To: <MN2PR11MB4063682914601772ECA45C999CD29@MN2PR11MB4063.namprd11.prod.outlook.com>

Hi Chenbo,

But the same issue happens when you have deleted 'vsocket->socket_fd' but failed to delete one
of the conn_fds: you will go to 'again', try to delete socket_fd again, and then loop
forever. So anyway you need to prevent this from happening.

==>
It will not happen, because fdset_try_del() returns -1 only when the fd exists and is busy.
E.g., when thread1 has deleted 'vsocket->socket_fd' but failed to delete one of the conn_fds,
it will go to 'again' and try to delete socket_fd again; this will not fail, because 'vsocket->socket_fd' has already been deleted and fdset_try_del() returns 0.
Thread1 will then continue deleting the remaining conn_fds.
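
To illustrate, here is a minimal, self-contained sketch (not the DPDK code; try_del(), struct entry
and the fds array are made up purely to model the fdset_try_del() behaviour described above) showing
why the second pass after 'again' cannot spin forever on an fd that was already removed:

/*
 * Illustrative model only: try_del() returns -1 only when the fd is
 * still present in the set AND marked busy (its r/wcb is executing);
 * an fd that was already removed yields 0, so retrying after the
 * "again" label never blocks again on an fd deleted in a previous pass.
 */
#include <stdbool.h>
#include <stdio.h>

struct entry { int fd; bool busy; };

static struct entry fds[8];
static int num_fds;

/* Hypothetical stand-in for fdset_try_del(). */
static int try_del(int fd)
{
	for (int i = 0; i < num_fds; i++) {
		if (fds[i].fd != fd)
			continue;
		if (fds[i].busy)
			return -1;               /* caller drops locks and retries */
		fds[i] = fds[--num_fds];         /* remove the entry */
		return 0;
	}
	return 0;                                /* already deleted: success */
}

int main(void)
{
	fds[num_fds++] = (struct entry){ .fd = 10, .busy = false }; /* socket_fd   */
	fds[num_fds++] = (struct entry){ .fd = 11, .busy = true };  /* busy connfd */

	printf("socket_fd:       %d\n", try_del(10)); /* 0  -> removed      */
	printf("busy connfd:     %d\n", try_del(11)); /* -1 -> goto again   */
	printf("socket_fd again: %d\n", try_del(10)); /* 0  -> already gone */
	fds[0].busy = false;            /* the r/wcb finished in the meantime */
	printf("connfd again:    %d\n", try_del(11)); /* 0  -> removed      */
	return 0;
}

So on the second pass only the fds that were still busy are retried, which is why the loop
terminates once their callbacks return.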

Thanks.
Gaoxiang

On 09/06/2021 11:18, Xia, Chenbo wrote:
Hi Gaoxiang,

>
>
>From: Gaoxiang Liu <gaoxiangliu0@163.com>
>Sent: Thursday, September 2, 2021 11:38 PM
>To: Xia, Chenbo <chenbo.xia@intel.com>
>Cc: maxime.coquelin@redhat.com; dev@dpdk.org; liugaoxiang@huawei.com
>Subject: Re:RE: [PATCH v6] vhost: fix crash on port deletion
>
>
>Hi Chenbo,
>Why is this not moved up?
>>> +          if (vsocket->is_server) {
>>> +               close(vsocket->socket_fd);
>>> +               unlink(path);
>>>            }
>==>Because if this is moved up, and deleting a conn fd from the fdset fails,
>execution will reach the "again" label, then close vsocket->socket_fd and unlink "path" again.
>That is not necessary.
>And closing socket_fd does not trigger vhost_user_server_new_connection.

But the same issue happens when you have deleted 'vsocket->socket_fd' but failed to delete one
of the conn_fds: you will go to 'again', try to delete socket_fd again, and then loop
forever. So anyway you need to prevent this from happening.

Thanks,
Chenbo

>
>Thanks.
>Gaoxiang

At 2021-08-31 13:37:38, "Xia, Chenbo" <chenbo.xia@intel.com> wrote:
>Hi Gaoxiang,
>
>> -----Original Message-----
>> From: Gaoxiang Liu <gaoxiangliu0@163.com>
>> Sent: Friday, August 27, 2021 10:19 PM
>> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>> Cc: dev@dpdk.org; liugaoxiang@huawei.com; Gaoxiang Liu <gaoxiangliu0@163.com>
>> Subject: [PATCH v6] vhost: fix crash on port deletion
>>
>> The rte_vhost_driver_unregister() and vhost_user_read_cb()
>> can be called at the same time by 2 threads.
>> When the memory of vsocket is freed in rte_vhost_driver_unregister(),
>> the invalid memory of vsocket is accessed in vhost_user_read_cb().
>> It's a bug in both server and client mode of vhost.
>>
>> E.g.,vhostuser port is created as server.
>
>Put a space after ','
>
>> Thread1 calls rte_vhost_driver_unregister().
>> Before the listen fd is deleted from the poll waiting fds,
>> the "vhost-events" thread calls vhost_user_server_new_connection(),
>> and a new conn fd is added to the fdset when trying to reconnect.
>> The "vhost-events" thread then calls vhost_user_read_cb() and
>> accesses the invalid memory of vsocket while thread1 frees the memory of
>> vsocket.
>>
>> E.g.,vhostuser port is created as client.
>
>Same here.
>
>> Thread1 calls rte_vhost_driver_unregister().
>> Before the vsocket of reconn is deleted from the reconn list,
>> the "vhost_reconn" thread calls vhost_user_add_connection(),
>> and a new conn fd is added to the fdset when trying to reconnect.
>> The "vhost-events" thread then calls vhost_user_read_cb() and
>> accesses the invalid memory of vsocket while thread1 frees the memory of
>> vsocket.
>>
>> The fix is to move the "fdset_try_del" before freeing the memory of conn,
>> thus avoiding the race condition.
>>
>> The core trace is:
>> Program terminated with signal 11, Segmentation fault.
>>
>> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
>>
>> Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
>> ---
>>
>> v2:
>> * Fix coding style issues.
>>
>> v3:
>> * Add detailed log.
>>
>> v4:
>> * Add the reason, when vhostuser port is created as server.
>>
>> v5:
>> * Add detailed log when vhostuser port is created as client
>>
>> v6:
>> * Add 'path' check before deleting listen fd
>> * Fix spelling issues
>> ---
>>  lib/vhost/socket.c | 108 ++++++++++++++++++++++-----------------------
>>  1 file changed, 54 insertions(+), 54 deletions(-)
>>
>> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
>> index 5d0d728d5..27d5e8695 100644
>> --- a/lib/vhost/socket.c
>> +++ b/lib/vhost/socket.c
>> @@ -1023,66 +1023,66 @@ rte_vhost_driver_unregister(const char *path)
>>
>>       for (i = 0; i < vhost_user.vsocket_cnt; i++) {
>>            struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
>> +          if (strcmp(vsocket->path, path)) {
>> +               continue;
>> +          }
>
>braces {} are not necessary for single statement blocks
>
>>
>> -          if (!strcmp(vsocket->path, path)) {
>> -               pthread_mutex_lock(&vsocket->conn_mutex);
>> -               for (conn = TAILQ_FIRST(&vsocket->conn_list);
>> -                    conn != NULL;
>> -                    conn = next) {
>> -                    next = TAILQ_NEXT(conn, next);
>> -
>> -                    /*
>> -                     * If r/wcb is executing, release vsocket's
>> -                     * conn_mutex and vhost_user's mutex locks, and
>> -                     * try again since the r/wcb may use the
>> -                     * conn_mutex and mutex locks.
>> -                     */
>> -                    if (fdset_try_del(&vhost_user.fdset,
>> -                                conn->connfd) == -1) {
>> -                         pthread_mutex_unlock(
>> -                                   &vsocket->conn_mutex);
>> -                         pthread_mutex_unlock(&vhost_user.mutex);
>> -                         goto again;
>> -                    }
>> -
>> -                    VHOST_LOG_CONFIG(INFO,
>> -                         "free connfd = %d for device '%s'\n",
>> -                         conn->connfd, path);
>> -                    close(conn->connfd);
>> -                    vhost_destroy_device(conn->vid);
>> -                    TAILQ_REMOVE(&vsocket->conn_list, conn, next);
>> -                    free(conn);
>> -               }
>> -               pthread_mutex_unlock(&vsocket->conn_mutex);
>> -
>> -               if (vsocket->is_server) {
>> -                    /*
>> -                     * If r/wcb is executing, release vhost_user's
>> -                     * mutex lock, and try again since the r/wcb
>> -                     * may use the mutex lock.
>> -                     */
>> -                    if (fdset_try_del(&vhost_user.fdset,
>> -                              vsocket->socket_fd) == -1) {
>> -                         pthread_mutex_unlock(&vhost_user.mutex);
>> -                         goto again;
>> -                    }
>> -
>> -                    close(vsocket->socket_fd);
>> -                    unlink(path);
>> -               } else if (vsocket->reconnect) {
>> -                    vhost_user_remove_reconnect(vsocket);
>> +          if (vsocket->is_server) {
>> +               /*
>> +                * If r/wcb is executing, release vhost_user's
>> +                * mutex lock, and try again since the r/wcb
>> +                * may use the mutex lock.
>> +                */
>> +               if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) == -1) {
>> +                    pthread_mutex_unlock(&vhost_user.mutex);
>> +                    goto again;
>>                 }
>> +          } else if (vsocket->reconnect) {
>> +               vhost_user_remove_reconnect(vsocket);
>> +          }
>>
>> -               pthread_mutex_destroy(&vsocket->conn_mutex);
>> -               vhost_user_socket_mem_free(vsocket);
>> +          pthread_mutex_lock(&vsocket->conn_mutex);
>> +          for (conn = TAILQ_FIRST(&vsocket->conn_list);
>> +                conn != NULL;
>> +                conn = next) {
>> +               next = TAILQ_NEXT(conn, next);
>>
>> -               count = --vhost_user.vsocket_cnt;
>> -               vhost_user.vsockets[i] = vhost_user.vsockets[count];
>> -               vhost_user.vsockets[count] = NULL;
>> -               pthread_mutex_unlock(&vhost_user.mutex);
>> +               /*
>> +                * If r/wcb is executing, release vsocket's
>> +                * conn_mutex and vhost_user's mutex locks, and
>> +                * try again since the r/wcb may use the
>> +                * conn_mutex and mutex locks.
>> +                */
>> +               if (fdset_try_del(&vhost_user.fdset,
>> +                           conn->connfd) == -1) {
>> +                    pthread_mutex_unlock(&vsocket->conn_mutex);
>> +                    pthread_mutex_unlock(&vhost_user.mutex);
>> +                    goto again;
>> +               }
>> +
>> +               VHOST_LOG_CONFIG(INFO,
>> +                    "free connfd = %d for device '%s'\n",
>> +                    conn->connfd, path);
>> +               close(conn->connfd);
>> +               vhost_destroy_device(conn->vid);
>> +               TAILQ_REMOVE(&vsocket->conn_list, conn, next);
>> +               free(conn);
>> +          }
>> +          pthread_mutex_unlock(&vsocket->conn_mutex);
>>
>> -               return 0;
>> +          if (vsocket->is_server) {
>> +               close(vsocket->socket_fd);
>> +               unlink(path);
>>            }
>
>I think you miss my comment in V5 of asking why this is not moved up after
>fdset_try_del server socket fd.
>
>Thanks,
>Chenbo
>
>> +
>> +          pthread_mutex_destroy(&vsocket->conn_mutex);
>> +          vhost_user_socket_mem_free(vsocket);
>> +
>> +          count = --vhost_user.vsocket_cnt;
>> +          vhost_user.vsockets[i] = vhost_user.vsockets[count];
>> +          vhost_user.vsockets[count] = NULL;
>> +          pthread_mutex_unlock(&vhost_user.mutex);
>> +          return 0;
>>       }
>>       pthread_mutex_unlock(&vhost_user.mutex);
>>
>> --
>> 2.32.0
>>
>

 

