DPDK patches and discussions
 help / color / mirror / Atom feed
From: Tetsuya Mukawa <mukawa@igel.co.jp>
To: "Xie, Huawei" <huawei.xie@intel.com>
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2 11/11] lib/librte_vhost: support dynamically registering vhost server
Date: Mon, 16 Feb 2015 17:17:09 +0900	[thread overview]
Message-ID: <54E1A785.1070307@igel.co.jp> (raw)
In-Reply-To: <1423717649-11818-12-git-send-email-huawei.xie@intel.com>

On 2015/02/12 14:07, Huawei Xie wrote:
> * support calling rte_vhost_driver_register after rte_vhost_driver_session_start
> * add mutext to protect fdset from concurrent access
> * add busy flag in fdentry. this flag is set before cb and cleared after cb is finished.
>
> mutex lock scenario in vhost:
>
> * event_dispatch(in rte_vhost_driver_session_start) runs in a seperate thread, infinitely
> processing vhost messages through cb(callback).
> * event_dispatch acquires the lock, get the cb and its context, mark the busy flag,
> and releases the mutex.
> * vserver_new_vq_conn cb calls fdset_add, which acquires the mutex and add new fd into fdset.
> * vserver_message_handler cb frees data context, marks remove flag to request to delete
> connfd(connection fd) from fdset.
> * after cb returns, event_dispatch
>   1. clears busy flag.
>   2. if there is remove request, call fdset_del, which acquires mutex, checks busy flag, and
> removes connfd from fdset.
> * rte_vhost_driver_unregister(not implemented) runs in another thread, acquires the mutex,
> calls fdset_del to remove fd(listenerfd) from fdset. Then it could free data context.
>
> The above steps ensures fd data context isn't freed when cb is using.
>
> VM(s) should have been shutdown before rte_vhost_driver_unregister.
>
> Signed-off-by: Huawei Xie <huawei.xie@intel.com>
> ---
>  lib/librte_vhost/vhost_user/fd_man.c         | 63 +++++++++++++++++++++++++---
>  lib/librte_vhost/vhost_user/fd_man.h         |  5 ++-
>  lib/librte_vhost/vhost_user/vhost-net-user.c | 34 +++++++++------
>  3 files changed, 82 insertions(+), 20 deletions(-)
>
> diff --git a/lib/librte_vhost/vhost_user/fd_man.c b/lib/librte_vhost/vhost_user/fd_man.c
> index 929fbc3..63ac4df 100644
> --- a/lib/librte_vhost/vhost_user/fd_man.c
> +++ b/lib/librte_vhost/vhost_user/fd_man.c
> @@ -40,6 +40,7 @@
>  #include <sys/types.h>
>  #include <unistd.h>
>  
> +#include <rte_common.h>
>  #include <rte_log.h>
>  
>  #include "fd_man.h"
> @@ -145,6 +146,8 @@ fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat)
>  	if (pfdset == NULL || fd == -1)
>  		return -1;
>  
> +	pthread_mutex_lock(&pfdset->fd_mutex);
> +
>  	/* Find a free slot in the list. */
>  	i = fdset_find_free_slot(pfdset);
>  	if (i == -1)
> @@ -153,6 +156,8 @@ fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat)
>  	fdset_add_fd(pfdset, i, fd, rcb, wcb, dat);
>  	pfdset->num++;
>  
> +	pthread_mutex_unlock(&pfdset->fd_mutex);
> +
>  	return 0;
>  }
>  
> @@ -164,17 +169,36 @@ fdset_del(struct fdset *pfdset, int fd)
>  {
>  	int i;
>  
> +	if (pfdset == NULL || fd == -1)
> +		return;
> +
> +again:
> +	pthread_mutex_lock(&pfdset->fd_mutex);
> +
>  	i = fdset_find_fd(pfdset, fd);
>  	if (i != -1 && fd != -1) {
> +		/* busy indicates r/wcb is executing! */
> +		if (pfdset->fd[i].busy == 1) {
> +			pthread_mutex_unlock(&pfdset->fd_mutex);
> +			goto again;
> +		}
> +
>  		pfdset->fd[i].fd = -1;
>  		pfdset->fd[i].rcb = pfdset->fd[i].wcb = NULL;
>  		pfdset->num--;
>  	}
> +
> +	pthread_mutex_unlock(&pfdset->fd_mutex);
>  }
>  
>  /**
>   * This functions runs in infinite blocking loop until there is no fd in
>   * pfdset. It calls corresponding r/w handler if there is event on the fd.
> + *
> + * Before the callback is called, we set the flag to busy status; If other
> + * thread(now rte_vhost_driver_unregister) calls fdset_del concurrently, it
> + * will wait until the flag is reset to zero(which indicates the callback is
> + * finished), then it could free the context after fdset_del.
>   */
>  void
>  fdset_event_dispatch(struct fdset *pfdset)
> @@ -183,6 +207,10 @@ fdset_event_dispatch(struct fdset *pfdset)
>  	int i, maxfds;
>  	struct fdentry *pfdentry;
>  	int num = MAX_FDS;
> +	fd_cb rcb, wcb;
> +	void *dat;
> +	int fd;
> +	int remove1, remove2;
>  
>  	if (pfdset == NULL)
>  		return;
> @@ -190,18 +218,41 @@ fdset_event_dispatch(struct fdset *pfdset)
>  	while (1) {
>  		FD_ZERO(&rfds);
>  		FD_ZERO(&wfds);
> +		pthread_mutex_lock(&pfdset->fd_mutex);
> +
>  		maxfds = fdset_fill(&rfds, &wfds, pfdset);
> -		if (maxfds == -1)
> -			return;
> +		if (maxfds == -1) {
> +			pthread_mutex_unlock(&pfdset->fd_mutex);
> +			sleep(1);
> +			continue;
> +		}
> +
> +		pthread_mutex_unlock(&pfdset->fd_mutex);
>  
>  		select(maxfds + 1, &rfds, &wfds, NULL, NULL);
>  
>  		for (i = 0; i < num; i++) {
> +			remove1 = remove2 = 0;
> +			pthread_mutex_lock(&pfdset->fd_mutex);
>  			pfdentry = &pfdset->fd[i];
> -			if (pfdentry->fd >= 0 && FD_ISSET(pfdentry->fd, &rfds) && pfdentry->rcb)
> -				pfdentry->rcb(pfdentry->fd, pfdentry->dat);
> -			if (pfdentry->fd >= 0 && FD_ISSET(pfdentry->fd, &wfds) && pfdentry->wcb)
> -				pfdentry->wcb(pfdentry->fd, pfdentry->dat);
> +			fd = pfdentry->fd;
> +			rcb = pfdentry->rcb;
> +			wcb = pfdentry->wcb;
> +			dat = pfdentry->dat;
> +			pfdentry->busy = 1;
> +			pthread_mutex_unlock(&pfdset->fd_mutex);
> +			if (fd >= 0 && FD_ISSET(fd, &rfds) && rcb)
> +				rcb(fd, dat, &remove1);
> +			if (fd >= 0 && FD_ISSET(fd, &wfds) && wcb)
> +				wcb(fd, dat, &remove2);

Hi Xie,

Should we add pthread_mutex_lock() before accessing pfdentry->busy?


> +			pfdentry->busy = 0;

Should we add pthread_mutex_unlock()?

Thanks,
Tetsuya


> +			/*
> +			 * fdset_del needs to check busy flag.
> +			 * We don't allow fdset_del to be called in callback
> +			 * directly.
> +			 */
> +			if (remove1 || remove2)
> +				fdset_del(pfdset, fd);
>  		}
>  	}
>  }
> diff --git a/lib/librte_vhost/vhost_user/fd_man.h b/lib/librte_vhost/vhost_user/fd_man.h
> index 26b4619..74ecde2 100644
> --- a/lib/librte_vhost/vhost_user/fd_man.h
> +++ b/lib/librte_vhost/vhost_user/fd_man.h
> @@ -34,20 +34,23 @@
>  #ifndef _FD_MAN_H_
>  #define _FD_MAN_H_
>  #include <stdint.h>
> +#include <pthread.h>
>  
>  #define MAX_FDS 1024
>  
> -typedef void (*fd_cb)(int fd, void *dat);
> +typedef void (*fd_cb)(int fd, void *dat, int *remove);
>  
>  struct fdentry {
>  	int fd;		/* -1 indicates this entry is empty */
>  	fd_cb rcb;	/* callback when this fd is readable. */
>  	fd_cb wcb;	/* callback when this fd is writeable.*/
>  	void *dat;	/* fd context */
> +	int busy;	/* whether this entry is being used in cb. */
>  };
>  
>  struct fdset {
>  	struct fdentry fd[MAX_FDS];
> +	pthread_mutex_t fd_mutex;
>  	int num;	/* current fd number of this fdset */
>  };
>  
> diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
> index 634a498..3aa9436 100644
> --- a/lib/librte_vhost/vhost_user/vhost-net-user.c
> +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
> @@ -41,6 +41,7 @@
>  #include <sys/socket.h>
>  #include <sys/un.h>
>  #include <errno.h>
> +#include <pthread.h>
>  
>  #include <rte_log.h>
>  #include <rte_virtio_net.h>
> @@ -51,8 +52,9 @@
>  #include "virtio-net-user.h"
>  
>  #define MAX_VIRTIO_BACKLOG 128
> -static void vserver_new_vq_conn(int fd, void *data);
> -static void vserver_message_handler(int fd, void *dat);
> +
> +static void vserver_new_vq_conn(int fd, void *data, int *remove);
> +static void vserver_message_handler(int fd, void *dat, int *remove);
>  struct vhost_net_device_ops const *ops;
>  
>  struct connfd_ctx {
> @@ -61,10 +63,18 @@ struct connfd_ctx {
>  };
>  
>  #define MAX_VHOST_SERVER 1024
> -static struct {
> +struct _vhost_server {
>  	struct vhost_server *server[MAX_VHOST_SERVER];
> -	struct fdset fdset;	/**< The fd list this vhost server manages. */
> -} g_vhost_server;
> +	struct fdset fdset;
> +};
> +
> +static struct _vhost_server g_vhost_server = {
> +	.fdset = {
> +		.fd = { [0 ... MAX_FDS - 1] = {-1, NULL, NULL, NULL, 0} },
> +		.fd_mutex = PTHREAD_MUTEX_INITIALIZER,
> +		.num = 0
> +	},
> +};
>  
>  static int vserver_idx;
>  
> @@ -261,7 +271,7 @@ send_vhost_message(int sockfd, struct VhostUserMsg *msg)
>  
>  /* call back when there is new virtio connection.  */
>  static void
> -vserver_new_vq_conn(int fd, void *dat)
> +vserver_new_vq_conn(int fd, void *dat, __rte_unused int *remove)
>  {
>  	struct vhost_server *vserver = (struct vhost_server *)dat;
>  	int conn_fd;
> @@ -304,7 +314,7 @@ vserver_new_vq_conn(int fd, void *dat)
>  
>  /* callback when there is message on the connfd */
>  static void
> -vserver_message_handler(int connfd, void *dat)
> +vserver_message_handler(int connfd, void *dat, int *remove)
>  {
>  	struct vhost_device_ctx ctx;
>  	struct connfd_ctx *cfd_ctx = (struct connfd_ctx *)dat;
> @@ -319,7 +329,7 @@ vserver_message_handler(int connfd, void *dat)
>  			"vhost read message failed\n");
>  
>  		close(connfd);
> -		fdset_del(&g_vhost_server.fdset, connfd);
> +		*remove = 1;
>  		free(cfd_ctx);
>  		user_destroy_device(ctx);
>  		ops->destroy_device(ctx);
> @@ -330,7 +340,7 @@ vserver_message_handler(int connfd, void *dat)
>  			"vhost peer closed\n");
>  
>  		close(connfd);
> -		fdset_del(&g_vhost_server.fdset, connfd);
> +		*remove = 1;
>  		free(cfd_ctx);
>  		user_destroy_device(ctx);
>  		ops->destroy_device(ctx);
> @@ -342,7 +352,7 @@ vserver_message_handler(int connfd, void *dat)
>  			"vhost read incorrect message\n");
>  
>  		close(connfd);
> -		fdset_del(&g_vhost_server.fdset, connfd);
> +		*remove = 1;
>  		free(cfd_ctx);
>  		user_destroy_device(ctx);
>  		ops->destroy_device(ctx);
> @@ -426,10 +436,8 @@ rte_vhost_driver_register(const char *path)
>  {
>  	struct vhost_server *vserver;
>  
> -	if (vserver_idx == 0) {
> -		fdset_init(&g_vhost_server.fdset);
> +	if (vserver_idx == 0)
>  		ops = get_virtio_net_callbacks();
> -	}
>  	if (vserver_idx == MAX_VHOST_SERVER)
>  		return -1;
>  

  reply	other threads:[~2015-02-16  8:17 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-12  5:07 [dpdk-dev] [PATCH v2 00/11] qemu vhost-user support Huawei Xie
2015-02-12  5:07 ` [dpdk-dev] [PATCH v2 01/11] lib/librte_vhost: enable VIRTIO_NET_F_CTRL_RX VIRTIO_NET_F_CTRL_RX is dependant on VIRTIO_NET_F_CTRL_VQ. Observed that virtio-net driver in guest would crash with only CTRL_RX enabled Huawei Xie
2015-02-16  8:15   ` Tetsuya Mukawa
2015-02-12  5:07 ` [dpdk-dev] [PATCH v2 02/11] lib/librte_vhost: create vhost_cuse directory and move vhost-net-cdev.c into vhost_cuse Huawei Xie
2015-02-12  5:07 ` [dpdk-dev] [PATCH v2 03/11] lib/librte_vhost: rename vhost-net-cdev.h to vhost-net.h Huawei Xie
2015-02-12  5:07 ` [dpdk-dev] [PATCH v2 04/11] lib/librte_vhost: move fd copying(from qemu process into vhost process) to eventfd_copy.c Huawei Xie
2015-02-12  5:07 ` [dpdk-dev] [PATCH v2 05/11] lib/librte_vhost: copy host_memory_map from virtio-net.c to a new file virtio-net-cdev.c Huawei Xie
2015-02-12  5:07 ` [dpdk-dev] [PATCH v2 06/11] lib/librte_vhost: make host_memory_map a more generic function Huawei Xie
2015-02-12  5:07 ` [dpdk-dev] [PATCH v2 07/11] lib/librte_vhost: implement cuse_set_memory_table Huawei Xie
2015-02-12  5:07 ` [dpdk-dev] [PATCH v2 08/11] lib/librte_vhost: add select based event driven processing Huawei Xie
2015-02-16 17:10   ` Ananyev, Konstantin
2015-02-12  5:07 ` [dpdk-dev] [PATCH v2 09/11] lib/librte_vhost: vhost user support Huawei Xie
2015-02-12  8:26   ` Linhaifeng
2015-02-12  9:28     ` Xie, Huawei
2015-02-12 10:19       ` Linhaifeng
2015-02-12  5:07 ` [dpdk-dev] [PATCH v2 10/11] lib/librte_vhost: support dev->ifname for vhost-user Huawei Xie
2015-02-12  5:07 ` [dpdk-dev] [PATCH v2 11/11] lib/librte_vhost: support dynamically registering vhost server Huawei Xie
2015-02-16  8:17   ` Tetsuya Mukawa [this message]
2015-02-16 17:11   ` Ananyev, Konstantin
2015-02-12  5:26 ` [dpdk-dev] [PATCH v2 00/11] qemu vhost-user support Xie, Huawei
2015-02-16  8:19 ` Tetsuya Mukawa
2015-02-22 18:20   ` Thomas Monjalon
2015-02-23 13:53     ` Czesnowicz, Przemyslaw
2015-02-23 14:08       ` Thomas Monjalon
2015-02-23 14:15         ` Czesnowicz, Przemyslaw
2015-02-23  2:50 ` Tetsuya Mukawa
2015-02-23 17:36 ` [dpdk-dev] [PATCH v3 " Przemyslaw Czesnowicz
2015-02-23 17:36   ` [dpdk-dev] [PATCH v3 01/11] lib/librte_vhost: enable VIRTIO_NET_F_CTRL_RX VIRTIO_NET_F_CTRL_RX is dependant on VIRTIO_NET_F_CTRL_VQ. Observed that virtio-net driver in guest would crash with only CTRL_RX enabled Przemyslaw Czesnowicz
2015-02-23 17:36   ` [dpdk-dev] [PATCH v3 02/11] lib/librte_vhost: create vhost_cuse directory and move vhost-net-cdev.c into vhost_cuse Przemyslaw Czesnowicz
2015-02-23 17:36   ` [dpdk-dev] [PATCH v3 03/11] lib/librte_vhost: rename vhost-net-cdev.h to vhost-net.h Przemyslaw Czesnowicz
2015-02-23 17:36   ` [dpdk-dev] [PATCH v3 04/11] lib/librte_vhost: move fd copying(from qemu process into vhost process) to eventfd_copy.c Przemyslaw Czesnowicz
2015-02-23 17:36   ` [dpdk-dev] [PATCH v3 05/11] lib/librte_vhost: copy host_memory_map from virtio-net.c to a new file virtio-net-cdev.c Przemyslaw Czesnowicz
2015-02-23 17:36   ` [dpdk-dev] [PATCH v3 06/11] lib/librte_vhost: make host_memory_map a more generic function Przemyslaw Czesnowicz
2015-02-23 17:36   ` [dpdk-dev] [PATCH v3 07/11] lib/librte_vhost: implement cuse_set_memory_table Przemyslaw Czesnowicz
2015-02-23 17:36   ` [dpdk-dev] [PATCH v3 08/11] lib/librte_vhost: add select based event driven processing Przemyslaw Czesnowicz
2015-02-23 17:36   ` [dpdk-dev] [PATCH v3 09/11] lib/librte_vhost: vhost user support Przemyslaw Czesnowicz
2015-02-27  9:42     ` Xie, Huawei
2015-02-23 17:36   ` [dpdk-dev] [PATCH v3 10/11] lib/librte_vhost: support dev->ifname for vhost-user Przemyslaw Czesnowicz
2015-02-23 17:36   ` [dpdk-dev] [PATCH v3 11/11] lib/librte_vhost: support dynamically registering vhost server Przemyslaw Czesnowicz
2015-02-24  0:41   ` [dpdk-dev] [PATCH v3 00/11] qemu vhost-user support Thomas Monjalon
2015-02-25  5:55   ` Xie, Huawei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54E1A785.1070307@igel.co.jp \
    --to=mukawa@igel.co.jp \
    --cc=dev@dpdk.org \
    --cc=huawei.xie@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).