From: zhiyong.yang@intel.com
To: dev@dpdk.org
Cc: Zhiyong Yang, maxime.coquelin@redhat.com, jianfeng.tan@intel.com, tiwei.bie@intel.com, zhihong.wang@intel.com, dong1.wang@intel.com, thomas@monjalon.net
Date: Fri, 6 Apr 2018 17:25:54 +0800
Message-Id: <20180406092554.9842-1-zhiyong.yang@intel.com>
In-Reply-To: <20180406001855.54062-1-zhiyong.yang@intel.com>
References: <20180406001855.54062-1-zhiyong.yang@intel.com>
Subject: [dpdk-dev] [PATCH v7] net/virtio-user: add support for server mode

In a container environment, if the vhost-user backend restarts, it has
no way to reconnect to virtio-user. To address this, add support for
server mode. In this mode, virtio-user creates the socket file and the
backend connects to it, so a restarted backend can reconnect to
virtio-user and resume communication.

In the current implementation, LSC (link status change) is enabled on
the virtio-user side so that it can accept the incoming connection.
Server mode virtio-user works only with vhost-user.

The release notes are updated in this patch.

Signed-off-by: Zhiyong Yang
---
Cc: maxime.coquelin@redhat.com
Cc: jianfeng.tan@intel.com
Cc: tiwei.bie@intel.com
Cc: zhihong.wang@intel.com
Cc: dong1.wang@intel.com
Cc: thomas@monjalon.net

Changes in V7:
1. Avoid misusing vhost-kernel in server mode virtio-user.
2. Move the definition of is_vhost_user_by_type() before
   virtio_user_start_device() so that it can be called there.
3. Add code comments stating the feature negotiation limitation.

Changes in V6:
1. Fix the wrong link status reported in server mode.
2. Fix some code style issues.

Changes in V5:
1. Support server mode virtio-user startup in non-blocking mode.
2. Rebase on top of dpdk-next-virtio.

Changes in V4:
1. No longer create a new pthread; use the librte_eal interrupt thread
   instead.

Changes in V3:
1. Use the EAL epoll mechanism instead of vhost events, and drop the
   export of the vhost event APIs.
2. Rebase on top of dpdk-next-virtio.

Changes in V2:
1. Split patches 1/5 and 2/5 out of the v1 patchset to fix existing
   issues that are not strongly related to server mode support.
2. Move the fdset-related functions from librte_vhost to librte_eal,
   exposed as new APIs.
3. Add the release note in patch 5/5.
4. Squash the data structure change patch into 4/5, as Maxime suggested.

 doc/guides/rel_notes/release_18_05.rst           |   6 ++
 drivers/net/virtio/virtio_user/vhost_user.c      |  45 ++++++++--
 drivers/net/virtio/virtio_user/virtio_user_dev.c | 101 ++++++++++++++++-------
 drivers/net/virtio/virtio_user/virtio_user_dev.h |   3 +
 drivers/net/virtio/virtio_user_ethdev.c          | 101 ++++++++++++++++++++---
 5 files changed, 209 insertions(+), 47 deletions(-)

diff --git a/doc/guides/rel_notes/release_18_05.rst b/doc/guides/rel_notes/release_18_05.rst
index 9cc77f893..f8897b2e9 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -58,6 +58,12 @@ New Features
   * Added support for NVGRE, VXLAN and GENEVE filters in flow API.
   * Added support for DROP action in flow API.
 
+* **Added support for virtio-user server mode.**
+  In a container environment if the vhost-user backend restarts, there's no way
+  for it to reconnect to virtio-user. To address this, support for server mode
+  is added. In this mode the socket file is created by virtio-user, which the
+  backend connects to. This means that if the backend restarts, it can reconnect
+  to virtio-user and continue communications.
 
 API Changes
 -----------
diff --git a/drivers/net/virtio/virtio_user/vhost_user.c b/drivers/net/virtio/virtio_user/vhost_user.c
index 91c6449bb..a6df97a00 100644
--- a/drivers/net/virtio/virtio_user/vhost_user.c
+++ b/drivers/net/virtio/virtio_user/vhost_user.c
@@ -378,6 +378,30 @@ vhost_user_sock(struct virtio_user_dev *dev,
 	return 0;
 }
 
+#define MAX_VIRTIO_USER_BACKLOG 1
+static int
+virtio_user_start_server(struct virtio_user_dev *dev, struct sockaddr_un *un)
+{
+	int ret;
+	int flag;
+	int fd = dev->listenfd;
+
+	ret = bind(fd, (struct sockaddr *)un, sizeof(*un));
+	if (ret < 0) {
+		PMD_DRV_LOG(ERR, "failed to bind to %s: %s; remove it and try again\n",
+			    dev->path, strerror(errno));
+		return -1;
+	}
+	ret = listen(fd, MAX_VIRTIO_USER_BACKLOG);
+	if (ret < 0)
+		return -1;
+
+	flag = fcntl(fd, F_GETFL);
+	fcntl(fd, F_SETFL, flag | O_NONBLOCK);
+
+	return 0;
+}
+
 /**
  * Set up environment to talk with a vhost user backend.
  *
@@ -405,13 +429,24 @@ vhost_user_setup(struct virtio_user_dev *dev)
 	memset(&un, 0, sizeof(un));
 	un.sun_family = AF_UNIX;
 	snprintf(un.sun_path, sizeof(un.sun_path), "%s", dev->path);
-	if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
-		PMD_DRV_LOG(ERR, "connect error, %s", strerror(errno));
-		close(fd);
-		return -1;
+
+	if (dev->is_server) {
+		dev->listenfd = fd;
+		if (virtio_user_start_server(dev, &un) < 0) {
+			PMD_DRV_LOG(ERR, "virtio-user startup fails in server mode");
+			close(fd);
+			return -1;
+		}
+		dev->vhostfd = -1;
+	} else {
+		if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
+			PMD_DRV_LOG(ERR, "connect error, %s", strerror(errno));
+			close(fd);
+			return -1;
+		}
+		dev->vhostfd = fd;
 	}
 
-	dev->vhostfd = fd;
 	return 0;
 }
 
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c b/drivers/net/virtio/virtio_user/virtio_user_dev.c
index f90fee9e5..38b8bc90d 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
@@ -93,12 +93,26 @@ virtio_user_queue_setup(struct virtio_user_dev *dev,
 	return 0;
 }
 
+int
+is_vhost_user_by_type(const char *path)
+{
+	struct stat sb;
+
+	if (stat(path, &sb) == -1)
+		return 0;
+
+	return S_ISSOCK(sb.st_mode);
+}
+
 int
 virtio_user_start_device(struct virtio_user_dev *dev)
 {
 	uint64_t features;
 	int ret;
 
+	if (is_vhost_user_by_type(dev->path) && dev->vhostfd < 0)
+		return -1;
+
 	/* Do not check return as already done in init, or reset in stop */
 	dev->ops->send_request(dev, VHOST_USER_SET_OWNER, NULL);
 
@@ -174,17 +188,6 @@ parse_mac(struct virtio_user_dev *dev, const char *mac)
 	}
 }
 
-int
-is_vhost_user_by_type(const char *path)
-{
-	struct stat sb;
-
-	if (stat(path, &sb) == -1)
-		return 0;
-
-	return S_ISSOCK(sb.st_mode);
-}
-
 static int
 virtio_user_dev_init_notify(struct virtio_user_dev *dev)
 {
@@ -254,6 +257,8 @@ virtio_user_fill_intr_handle(struct virtio_user_dev *dev)
 	eth_dev->intr_handle->fd = -1;
 	if (dev->vhostfd >= 0)
 		eth_dev->intr_handle->fd =
			dev->vhostfd;
+	else if (dev->is_server)
+		eth_dev->intr_handle->fd = dev->listenfd;
 
 	return 0;
 }
@@ -267,21 +272,32 @@ virtio_user_dev_setup(struct virtio_user_dev *dev)
 	dev->vhostfds = NULL;
 	dev->tapfds = NULL;
 
-	if (is_vhost_user_by_type(dev->path)) {
-		dev->ops = &ops_user;
-	} else {
-		dev->ops = &ops_kernel;
-
-		dev->vhostfds = malloc(dev->max_queue_pairs * sizeof(int));
-		dev->tapfds = malloc(dev->max_queue_pairs * sizeof(int));
-		if (!dev->vhostfds || !dev->tapfds) {
-			PMD_INIT_LOG(ERR, "Failed to malloc");
+	if (dev->is_server) {
+		if (access(dev->path, F_OK) == 0 &&
+		    !is_vhost_user_by_type(dev->path)) {
+			PMD_DRV_LOG(ERR, "Server mode doesn't support vhost-kernel!");
 			return -1;
 		}
-
-		for (q = 0; q < dev->max_queue_pairs; ++q) {
-			dev->vhostfds[q] = -1;
-			dev->tapfds[q] = -1;
+		dev->ops = &ops_user;
+	} else {
+		if (is_vhost_user_by_type(dev->path)) {
+			dev->ops = &ops_user;
+		} else {
+			dev->ops = &ops_kernel;
+
+			dev->vhostfds = malloc(dev->max_queue_pairs *
+					       sizeof(int));
+			dev->tapfds = malloc(dev->max_queue_pairs *
+					     sizeof(int));
+			if (!dev->vhostfds || !dev->tapfds) {
+				PMD_INIT_LOG(ERR, "Failed to malloc");
+				return -1;
+			}
+
+			for (q = 0; q < dev->max_queue_pairs; ++q) {
+				dev->vhostfds[q] = -1;
+				dev->tapfds[q] = -1;
+			}
 		}
 	}
 
@@ -337,16 +353,29 @@ virtio_user_dev_init(struct virtio_user_dev *dev, char *path, int queues,
 		return -1;
 	}
 
-	if (dev->ops->send_request(dev, VHOST_USER_SET_OWNER, NULL) < 0) {
-		PMD_INIT_LOG(ERR, "set_owner fails: %s", strerror(errno));
-		return -1;
-	}
+	if (dev->vhostfd >= 0) {
+		if (dev->ops->send_request(dev, VHOST_USER_SET_OWNER,
+					   NULL) < 0) {
+			PMD_INIT_LOG(ERR, "set_owner fails: %s",
+				     strerror(errno));
+			return -1;
+		}
 
-	if (dev->ops->send_request(dev, VHOST_USER_GET_FEATURES,
-			    &dev->device_features) < 0) {
-		PMD_INIT_LOG(ERR, "get_features failed: %s", strerror(errno));
-		return -1;
+		if (dev->ops->send_request(dev, VHOST_USER_GET_FEATURES,
+					   &dev->device_features) < 0) {
+			PMD_INIT_LOG(ERR, "get_features
failed: %s",
+				     strerror(errno));
+			return -1;
+		}
+	} else {
+		/* We just pretend vhost-user can support all these features.
+		 * Note that this could be problematic if some feature is
+		 * negotiated but not supported by the vhost-user which comes
+		 * later.
+		 */
+		dev->device_features = VIRTIO_USER_SUPPORTED_FEATURES;
 	}
+
 	if (dev->mac_specified)
 		dev->device_features |= (1ull << VIRTIO_NET_F_MAC);
 
@@ -388,6 +417,11 @@ virtio_user_dev_uninit(struct virtio_user_dev *dev)
 
 	close(dev->vhostfd);
 
+	if (dev->is_server && dev->listenfd >= 0) {
+		close(dev->listenfd);
+		dev->listenfd = -1;
+	}
+
 	if (dev->vhostfds) {
 		for (i = 0; i < dev->max_queue_pairs; ++i)
 			close(dev->vhostfds[i]);
@@ -396,6 +430,9 @@ virtio_user_dev_uninit(struct virtio_user_dev *dev)
 	}
 
 	free(dev->ifname);
+
+	if (dev->is_server)
+		unlink(dev->path);
 }
 
 static uint8_t
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.h b/drivers/net/virtio/virtio_user/virtio_user_dev.h
index 5f8755771..ade727e46 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.h
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.h
@@ -6,6 +6,7 @@
 #define _VIRTIO_USER_DEV_H
 
 #include
+#include
 #include "../virtio_pci.h"
 #include "../virtio_ring.h"
 #include "vhost.h"
@@ -13,6 +14,8 @@ struct virtio_user_dev {
 	/* for vhost_user backend */
 	int vhostfd;
+	int listenfd; /* listening fd */
+	bool is_server; /* server or client mode */
 
 	/* for vhost_kernel backend */
 	char *ifname;
diff --git a/drivers/net/virtio/virtio_user_ethdev.c b/drivers/net/virtio/virtio_user_ethdev.c
index 263649006..4e7b3c34f 100644
--- a/drivers/net/virtio/virtio_user_ethdev.c
+++ b/drivers/net/virtio/virtio_user_ethdev.c
@@ -24,15 +24,73 @@
 #define virtio_user_get_dev(hw) \
 	((struct virtio_user_dev *)(hw)->virtio_user_dev)
 
+static int
+virtio_user_server_reconnect(struct virtio_user_dev *dev)
+{
+	int ret;
+	int flag;
+	int connectfd;
+	struct rte_eth_dev *eth_dev = &rte_eth_devices[dev->port_id];
+
+	connectfd = accept(dev->listenfd, NULL, NULL);
+	if (connectfd < 0)
+		return -1;
+
+	dev->vhostfd = connectfd;
+	flag = fcntl(connectfd, F_GETFL);
+	fcntl(connectfd, F_SETFL, flag | O_NONBLOCK);
+
+	ret = virtio_user_start_device(dev);
+	if (ret < 0)
+		return -1;
+
+	if (eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC) {
+		if (rte_intr_disable(eth_dev->intr_handle) < 0) {
+			PMD_DRV_LOG(ERR, "interrupt disable failed");
+			return -1;
+		}
+		rte_intr_callback_unregister(eth_dev->intr_handle,
+					     virtio_interrupt_handler,
+					     eth_dev);
+		eth_dev->intr_handle->fd = connectfd;
+		rte_intr_callback_register(eth_dev->intr_handle,
+					   virtio_interrupt_handler, eth_dev);
+
+		if (rte_intr_enable(eth_dev->intr_handle) < 0) {
+			PMD_DRV_LOG(ERR, "interrupt enable failed");
+			return -1;
+		}
+	}
+	PMD_INIT_LOG(NOTICE, "server mode virtio-user reconnection succeeds!");
+	return 0;
+}
+
 static void
 virtio_user_delayed_handler(void *param)
 {
 	struct virtio_hw *hw = (struct virtio_hw *)param;
-	struct rte_eth_dev *dev = &rte_eth_devices[hw->port_id];
+	struct rte_eth_dev *eth_dev = &rte_eth_devices[hw->port_id];
+	struct virtio_user_dev *dev = virtio_user_get_dev(hw);
 
-	rte_intr_callback_unregister(dev->intr_handle,
-				     virtio_interrupt_handler,
-				     dev);
+	if (rte_intr_disable(eth_dev->intr_handle) < 0) {
+		PMD_DRV_LOG(ERR, "interrupt disable failed");
+		return;
+	}
+	rte_intr_callback_unregister(eth_dev->intr_handle,
+				     virtio_interrupt_handler, eth_dev);
+	if (dev->is_server) {
+		if (dev->vhostfd >= 0) {
+			close(dev->vhostfd);
+			dev->vhostfd = -1;
+		}
+		eth_dev->intr_handle->fd = dev->listenfd;
+		rte_intr_callback_register(eth_dev->intr_handle,
+					   virtio_interrupt_handler, eth_dev);
+		if (rte_intr_enable(eth_dev->intr_handle) < 0) {
+			PMD_DRV_LOG(ERR, "interrupt enable failed");
+			return;
+		}
+	}
 }
 
 static void
@@ -67,12 +125,10 @@ virtio_user_read_dev_config(struct virtio_hw *hw, size_t offset,
 			dev->status &= (~VIRTIO_NET_S_LINK_UP);
 			PMD_DRV_LOG(ERR, "virtio-user port %u is down",
 				    hw->port_id);
-			/* Only client mode is available now.
Once the
-			 * connection is broken, it can never be up
-			 * again. Besides, this function could be called
-			 * in the process of interrupt handling,
-			 * callback cannot be unregistered here, set an
-			 * alarm to do it.
+
+			/* This function could be called in the process
+			 * of interrupt handling, callback cannot be
+			 * unregistered here, set an alarm to do it.
 			 */
 			rte_eal_alarm_set(1,
 					  virtio_user_delayed_handler,
@@ -85,7 +141,12 @@ virtio_user_read_dev_config(struct virtio_hw *hw, size_t offset,
 				PMD_DRV_LOG(ERR, "error clearing O_NONBLOCK flag");
 				return;
 			}
+		} else if (dev->is_server) {
+			dev->status &= (~VIRTIO_NET_S_LINK_UP);
+			if (virtio_user_server_reconnect(dev) >= 0)
+				dev->status |= VIRTIO_NET_S_LINK_UP;
 		}
+
 		*(uint16_t *)dst = dev->status;
 	}
 
@@ -278,12 +339,15 @@ static const char *valid_args[] = {
 	VIRTIO_USER_ARG_QUEUE_SIZE,
 #define VIRTIO_USER_ARG_INTERFACE_NAME "iface"
 	VIRTIO_USER_ARG_INTERFACE_NAME,
+#define VIRTIO_USER_ARG_SERVER_MODE "server"
+	VIRTIO_USER_ARG_SERVER_MODE,
 	NULL
 };
 
 #define VIRTIO_USER_DEF_CQ_EN 0
 #define VIRTIO_USER_DEF_Q_NUM 1
 #define VIRTIO_USER_DEF_Q_SZ 256
+#define VIRTIO_USER_DEF_SERVER_MODE 0
 
 static int
 get_string_arg(const char *key __rte_unused,
@@ -378,6 +442,7 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
 	uint64_t queues = VIRTIO_USER_DEF_Q_NUM;
 	uint64_t cq = VIRTIO_USER_DEF_CQ_EN;
 	uint64_t queue_size = VIRTIO_USER_DEF_Q_SZ;
+	uint64_t server_mode = VIRTIO_USER_DEF_SERVER_MODE;
 	char *path = NULL;
 	char *ifname = NULL;
 	char *mac_addr = NULL;
@@ -445,6 +510,15 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
 		}
 	}
 
+	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_SERVER_MODE) == 1) {
+		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_SERVER_MODE,
+				       &get_integer_arg, &server_mode) < 0) {
+			PMD_INIT_LOG(ERR, "error to parse %s",
+				     VIRTIO_USER_ARG_SERVER_MODE);
+			goto end;
+		}
+	}
+
 	if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_CQ_NUM) == 1) {
 		if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_CQ_NUM,
 				       &get_integer_arg, &cq) < 0) {
@@
-469,6 +543,8 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
 	}
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		struct virtio_user_dev *vu_dev;
+
 		eth_dev = virtio_user_eth_dev_alloc(dev);
 		if (!eth_dev) {
 			PMD_INIT_LOG(ERR, "virtio_user fails to alloc device");
@@ -476,6 +552,11 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
 		}
 
 		hw = eth_dev->data->dev_private;
+		vu_dev = virtio_user_get_dev(hw);
+		if (server_mode == 1)
+			vu_dev->is_server = true;
+		else
+			vu_dev->is_server = false;
 		if (virtio_user_dev_init(hw->virtio_user_dev, path, queues, cq,
 				 queue_size, mac_addr, &ifname) < 0) {
 			PMD_INIT_LOG(ERR, "virtio_user_dev_init fails");
-- 
2.14.3
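
For readers who want to try the socket side of server mode outside of
DPDK, here is a minimal standalone sketch, assuming plain POSIX unix
sockets. It is not part of the patch. The helper names set_nonblocking(),
server_listen() and server_accept() are hypothetical; they only mirror
what the patch does: bind and listen on the socket path, switch the
descriptors to non-blocking mode (reading the flags with F_GETFL), and
accept a backend whenever one connects.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to the file status flags of fd.
 * The flags are read with F_GETFL; F_GETFD would return the descriptor
 * flags (such as FD_CLOEXEC) instead.
 */
static int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: create a unix stream socket, bind it to path and
 * start listening. Returns a non-blocking listening fd, or -1 on error.
 */
static int
server_listen(const char *path)
{
	struct sockaddr_un un;
	int fd = socket(AF_UNIX, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	memset(&un, 0, sizeof(un));
	un.sun_family = AF_UNIX;
	snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);

	/* bind() fails with EADDRINUSE if a stale socket file exists. */
	if (bind(fd, (struct sockaddr *)&un, sizeof(un)) < 0 ||
	    listen(fd, 1) < 0 || set_nonblocking(fd) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: accept one pending backend connection.
 * Returns a non-blocking connection fd, or -1 when no connection is
 * pending (errno is EAGAIN or EWOULDBLOCK) or on error.
 */
static int
server_accept(int listenfd)
{
	int fd = accept(listenfd, NULL, NULL);

	if (fd < 0)
		return -1;
	if (set_nonblocking(fd) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

When the backend goes away, the caller would close the connection fd and
call server_accept() again on the same listening fd once it becomes
readable, then unlink() the socket path at teardown, as the patch does in
virtio_user_dev_uninit().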